Commit aa4e4c7

automata: fix unintended panic in max_haystack_len
This fixes a bug where the bounded backtracker's `max_haystack_len` could panic if its bitset capacity ended up being smaller than the total number of NFA states. Under a default configuration this seems unlikely to happen due to the default limits on the size of a compiled regex. But if the compiled regex size limit is increased to a large number, then the likelihood of this panicking increases. Of course, one can provoke this even more easily by just setting the visited capacity to a small number. Indeed, this is how we provoke it in a regression test.
1 parent 27a2538 commit aa4e4c7

2 files changed: +30 -3 lines changed

regex-automata/src/meta/wrappers.rs (+4 -1)

@@ -212,7 +212,10 @@ impl BoundedBacktrackerEngine {
                 .configure(backtrack_config)
                 .build_from_nfa(nfa.clone())
                 .map_err(BuildError::nfa)?;
-            debug!("BoundedBacktracker built");
+            debug!(
+                "BoundedBacktracker built (max haystack length: {:?})",
+                engine.max_haystack_len()
+            );
             Ok(Some(BoundedBacktrackerEngine(engine)))
         }
         #[cfg(not(feature = "nfa-backtrack"))]

regex-automata/src/nfa/thompson/backtrack.rs (+26 -2)

@@ -820,8 +820,11 @@ impl BoundedBacktracker {
         // bytes to the capacity in bits.
         let capacity = 8 * self.get_config().get_visited_capacity();
         let blocks = div_ceil(capacity, Visited::BLOCK_SIZE);
-        let real_capacity = blocks * Visited::BLOCK_SIZE;
-        (real_capacity / self.nfa.states().len()) - 1
+        let real_capacity = blocks.saturating_mul(Visited::BLOCK_SIZE);
+        // It's possible for `real_capacity` to be smaller than the number of
+        // NFA states for particularly large regexes, so we saturate towards
+        // zero.
+        (real_capacity / self.nfa.states().len()).saturating_sub(1)
     }
 }
 
@@ -1882,3 +1885,24 @@ fn div_ceil(lhs: usize, rhs: usize) -> usize {
         (lhs / rhs) + 1
     }
 }
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    // This is a regression test for the maximum haystack length computation.
+    // Previously, it assumed that the total capacity of the backtracker's
+    // bitset would always be greater than the number of NFA states. But there
+    // is of course no guarantee that this is true. This regression test
+    // ensures that not only does `max_haystack_len` not panic, but that it
+    // should return `0`.
+    #[cfg(feature = "syntax")]
+    #[test]
+    fn max_haystack_len_overflow() {
+        let re = BoundedBacktracker::builder()
+            .configure(BoundedBacktracker::config().visited_capacity(10))
+            .build(r"[0-9A-Za-z]{100}")
+            .unwrap();
+        assert_eq!(0, re.max_haystack_len());
+    }
+}
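
The arithmetic changed in `max_haystack_len` above can be reproduced in isolation. The following is a minimal sketch, not the crate's code: the function names `max_haystack_len_old` and `max_haystack_len_new` are hypothetical, the `BLOCK_SIZE` constant stands in for `Visited::BLOCK_SIZE` and is assumed to equal the bit width of `usize`, the body of `div_ceil` is reconstructed from the fragment visible in the diff, and the state counts passed in are made-up numbers rather than the real NFA size of any regex. The pre-fix version uses `checked_sub` so that the underflow shows up as `None` instead of a panic.

```rust
/// Stand-in for `Visited::BLOCK_SIZE`. Assumption: one block is one `usize`,
/// so the block size in bits is the bit width of `usize`.
const BLOCK_SIZE: usize = usize::BITS as usize;

/// Ceiling division. The body is a reconstruction based on the `(lhs / rhs) + 1`
/// branch visible in the diff context.
fn div_ceil(lhs: usize, rhs: usize) -> usize {
    if lhs % rhs == 0 {
        lhs / rhs
    } else {
        (lhs / rhs) + 1
    }
}

/// Pre-fix arithmetic. `checked_sub` is used here (instead of `- 1`) so the
/// underflow case is reported as `None` rather than panicking.
/// `nfa_states` is assumed to be nonzero.
fn max_haystack_len_old(visited_capacity_bytes: usize, nfa_states: usize) -> Option<usize> {
    let capacity = 8 * visited_capacity_bytes; // bytes -> bits
    let blocks = div_ceil(capacity, BLOCK_SIZE);
    let real_capacity = blocks * BLOCK_SIZE;
    (real_capacity / nfa_states).checked_sub(1)
}

/// Post-fix arithmetic: both the multiplication and the subtraction saturate,
/// so a bitset smaller than the state count yields 0 instead of underflowing.
/// `nfa_states` is assumed to be nonzero.
fn max_haystack_len_new(visited_capacity_bytes: usize, nfa_states: usize) -> usize {
    let capacity = 8 * visited_capacity_bytes; // bytes -> bits
    let blocks = div_ceil(capacity, BLOCK_SIZE);
    let real_capacity = blocks.saturating_mul(BLOCK_SIZE);
    (real_capacity / nfa_states).saturating_sub(1)
}
```

With a 10-byte visited capacity and a hypothetical 300 states, the quotient is zero, so the old form has nothing to subtract from while the new form returns `0`. When the bitset comfortably exceeds the state count, both forms agree.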
