Closed
Lately I've been thinking a lot about providing a "multi regex" similar to RE2's "regex set" functionality. The problem it solves is, "I have multiple regexes that I want to run over some large search text once, and I want to see every match." The poor man's way of doing this is to combine them into a single regex of alternations, e.g., `re1|re2|re3|...`. There are two problems with that, though:
- The current search machinery reports only non-overlapping matches. That is, once one alternative in a regex has matched some text, no other alternative in the same regex can report a match that overlaps it.
- To check which expressions matched, one adds capture groups and then inspects them after a match. Requiring capture groups for this is bad because it incurs a performance penalty and simply isn't needed.
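To make the first problem concrete, here is a rough sketch that uses only the standard library. Plain string literals stand in for real regexes, and both `alternation_matches` and `independent_matches` are hypothetical helpers written for this illustration, not part of any existing API. The first simulates how a single alternation is searched (at each position the first alternative that matches wins, and the search resumes after it); the second searches each pattern on its own.

```rust
/// Simulates searching `p0|p1|...` as one alternation over literals.
/// Returns (pattern index, start, end) triples; matches never overlap.
fn alternation_matches(pats: &[&str], text: &str) -> Vec<(usize, usize, usize)> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < text.len() {
        // Only slice at valid UTF-8 boundaries.
        if !text.is_char_boundary(pos) {
            pos += 1;
            continue;
        }
        let rest = &text[pos..];
        // The first alternative that matches at this position wins.
        match pats.iter().position(|p| !p.is_empty() && rest.starts_with(*p)) {
            Some(i) => {
                out.push((i, pos, pos + pats[i].len()));
                pos += pats[i].len(); // resume after the match
            }
            None => pos += 1,
        }
    }
    out
}

/// Searches every pattern independently, so matches of different
/// patterns may overlap. Results are ordered by start, then pattern index.
fn independent_matches(pats: &[&str], text: &str) -> Vec<(usize, usize, usize)> {
    let mut out = Vec::new();
    for (i, p) in pats.iter().enumerate().filter(|(_, p)| !p.is_empty()) {
        for (start, m) in text.match_indices(*p) {
            out.push((i, start, start + m.len()));
        }
    }
    out.sort_by_key(|&(i, start, _)| (start, i));
    out
}

fn main() {
    let pats = ["foo", "foobar", "bar"];
    // Pattern 1 ("foobar") is never reported by the alternation.
    println!("{:?}", alternation_matches(&pats, "foobar")); // [(0, 0, 3), (2, 3, 6)]
    println!("{:?}", independent_matches(&pats, "foobar")); // [(0, 0, 3), (1, 0, 6), (2, 3, 6)]
}
```

In this simulation `foobar` is present in the text but never shows up in the alternation's results, because `foo` consumes its prefix first.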
We can start relatively simple by providing an API that answers these three questions:
- Do any of the given regexes match anywhere? (analogous to `is_match`)
- If so, which of those regexes match? Where do they match? (analogous to `find`)
- Can you show me all matches? (analogous to `find_iter`)
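One possible shape for such an API is sketched below. Everything here is an assumption made for illustration: the names `MultiMatcher` and `SetMatch`, the method signatures, and the return types are not specified above, and plain string literals stand in for compiled regexes so the sketch runs with only the standard library.

```rust
/// One match of one pattern (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SetMatch {
    pattern: usize, // index of the pattern that matched
    start: usize,   // byte offset where the match begins
    end: usize,     // byte offset just past the match
}

/// Hypothetical multi-pattern matcher; literals stand in for regexes.
struct MultiMatcher {
    patterns: Vec<String>,
}

impl MultiMatcher {
    fn new(patterns: &[&str]) -> Self {
        MultiMatcher { patterns: patterns.iter().map(|p| p.to_string()).collect() }
    }

    /// Do any of the patterns match anywhere?
    fn is_match(&self, text: &str) -> bool {
        self.patterns.iter().any(|p| text.contains(p.as_str()))
    }

    /// Which patterns match, and where is each one's leftmost match?
    fn find(&self, text: &str) -> Vec<SetMatch> {
        self.patterns
            .iter()
            .enumerate()
            .filter_map(|(i, p)| {
                text.find(p.as_str())
                    .map(|s| SetMatch { pattern: i, start: s, end: s + p.len() })
            })
            .collect()
    }

    /// All matches of every pattern, ordered by start offset.
    /// Matches of different patterns may overlap.
    fn find_iter(&self, text: &str) -> std::vec::IntoIter<SetMatch> {
        let mut all = Vec::new();
        for (i, p) in self.patterns.iter().enumerate() {
            for (s, m) in text.match_indices(p.as_str()) {
                all.push(SetMatch { pattern: i, start: s, end: s + m.len() });
            }
        }
        all.sort_by_key(|m| (m.start, m.pattern));
        all.into_iter()
    }
}

fn main() {
    let set = MultiMatcher::new(&["foo", "foobar", "bar"]);
    let text = "foobar baz";
    println!("any match: {}", set.is_match(text));
    for m in set.find_iter(text) {
        println!("{:?}", m);
    }
}
```

This naive version scans the text once per pattern, which is exactly what a real implementation would want to avoid; it is only meant to show what the three questions could look like as methods.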
Adding capture groups to this API seems possible, but is tricky, so I suggest doing that after an initial implementation is done.