Closed
Description
While trying to implement a lexer, I noticed that RegexSet::matches
exhibits very bad performance when used on long strings.
The following test case runs in 1 minute, 16 seconds (cargo build --release):
extern crate regex;
fn main() {
let mut test_input = String::from("42 ");
for _ in 0 .. 100_000 { test_input.push_str("XXXXXXXXXXXXXXXXXXXX"); } // Delete me
let set = regex::RegexSet::new(&[r"^[0-9]+", r"^THIS_SHOULD_NEVER_MATCH"]).unwrap();
for _ in 0 .. 10_000 { println!("{:?}", set.matches(&test_input)); }
}
Without the line marked with // Delete me
, the program runs in about 50 ms.
If I understand correctly, matches
should try to match only at the very beginning of the string, thanks to the ^ anchor. However, it would seem that matches
traverses the whole string on each call.