on short strings that request captures, don't run the DFA

Currently, if a caller requests captures, we always run the DFA first to determine the *extent* of the match, and then run either the Pike VM or the bounded backtracker on only that extent to fill in the capture locations. This works well when searching long strings because the DFA can save the NFA engines from doing a lot of work. But on short strings, the DFA probably doesn't pay for itself, so we should just run one of the NFA engines directly if the string is short enough. (More precisely, when the *match* is roughly the same length as the entire string, the DFA isn't helping us at all, since the NFA engine will still run over the full length of the match. Obviously, we can't know this up front, so we use "short strings" as a likely predictor of that case.)

We should do some experimentation to determine where this boundary lies, so that we can come up with a heuristic for when the string is "short enough."
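
To make the proposal concrete, here is a minimal sketch of what such a heuristic could look like. Everything in it is hypothetical: the `CaptureStrategy` enum, the `choose_capture_strategy` function and the `SHORT_HAYSTACK_LIMIT` value are illustrative names and do not correspond to anything in the crate today. In particular, the 128-byte cutoff is a placeholder, not a measured number; finding the right value is exactly the experimentation described above.

```rust
/// Hypothetical: how to execute a search when the caller asked for captures.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CaptureStrategy {
    /// Run the DFA to find the extent of the match, then run an NFA engine
    /// (Pike VM or bounded backtracker) on only that extent.
    DfaThenNfa,
    /// Skip the DFA and run an NFA engine over the whole haystack.
    NfaOnly,
}

/// Hypothetical cutoff in bytes. This is a placeholder, not a benchmarked
/// value; the real boundary would have to come from experiments.
const SHORT_HAYSTACK_LIMIT: usize = 128;

/// Pick a strategy using haystack length as a cheap predictor of the case
/// where the match spans most of the haystack.
fn choose_capture_strategy(haystack_len: usize, limit: usize) -> CaptureStrategy {
    if haystack_len <= limit {
        // Short haystack: the DFA pass is unlikely to pay for itself.
        CaptureStrategy::NfaOnly
    } else {
        // Long haystack: let the DFA narrow down the span first.
        CaptureStrategy::DfaThenNfa
    }
}
```

With this sketch, `choose_capture_strategy(haystack.len(), SHORT_HAYSTACK_LIMIT)` would be consulted once per search, before any engine runs. Taking the limit as a parameter rather than hard-coding it makes it easy to sweep different cutoffs while benchmarking.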

on short strings that request captures, don't run the DFA #348
