fuzz: compiling '\P{any}' panics by tripping an assertion in the compiler #722

Closed
@BurntSushi

Specifically, this one:

assert!(!ranges.is_empty());

Normally, regexes like [^\w\W] with empty classes are banned at translation time. But it looks like \P{any} (which is empty) slipped through. So we should just improve the ban to cover that case.
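To make the ban concrete, here is a minimal sketch of a translation-time check of this kind. It is not the regex crate's actual code: `ByteRange`, `negate` and `check_class` are hypothetical names, and the class is modeled over bytes rather than Unicode scalar values to keep the example short. Negating a class that already covers everything yields no ranges, which is exactly the situation the assertion above guards against.

```rust
/// An inclusive range of bytes, e.g. `b'a'..=b'z'`. (Hypothetical type.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ByteRange {
    pub start: u8,
    pub end: u8,
}

/// Negate a class given as sorted, non-overlapping byte ranges.
pub fn negate(ranges: &[ByteRange]) -> Vec<ByteRange> {
    let mut out = Vec::new();
    // Lowest byte not yet covered. A u16 lets us represent "one past 0xFF".
    let mut next: u16 = 0;
    for r in ranges {
        if u16::from(r.start) > next {
            // There is a gap before this range; the gap belongs to the negation.
            out.push(ByteRange { start: next as u8, end: r.start - 1 });
        }
        next = u16::from(r.end) + 1;
    }
    if next <= 0xFF {
        out.push(ByteRange { start: next as u8, end: 0xFF });
    }
    out
}

/// Hypothetical translation-time check: reject a class that matches nothing,
/// so that later stages never see an empty list of ranges.
pub fn check_class(ranges: Vec<ByteRange>) -> Result<Vec<ByteRange>, String> {
    if ranges.is_empty() {
        Err("empty character classes are not allowed".to_string())
    } else {
        Ok(ranges)
    }
}
```

With this sketch, `check_class(negate(&[ByteRange { start: 0, end: 0xFF }]))` returns an error, which is the byte-level analogue of rejecting \P{any} before it reaches the compiler.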

However, empty character classes are occasionally useful for injecting a "fail" sub-pattern into a regex, typically when the regex is generated programmatically. Indeed, the NFA compiler in regex-automata handles this case fine:

$ regex-cli debug nfa thompson '\P{any}' -B
      parse time:  48.809µs
  translate time:  17.48µs
compile nfa time:  18.638µs
   pattern count:  1

thompson::NFA(
>000000: alt(2, 1)
 000001: \x00-\xff => 0
^000002: sparse()
 000003: MATCH(0)
)

Here, it's impossible to ever move past state 2. Arguably, an explicit "fail" instruction would be nicer, but an empty sparse instruction (a state with no outgoing transitions) serves the purpose just as well.

So once #656 is done, we should be able to relax this restriction.
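To illustrate why the empty sparse state acts as a "fail" instruction, here is a small simulation of the four states in the NFA dump above. This is a sketch under simplifying assumptions, not regex-automata's implementation: the `State` enum, `nfa_for_not_any`, `closure` and `is_match` are hypothetical names, and the simulation only reports whether a match state is ever reached.

```rust
/// A simplified NFA state, loosely modeled on the dump above. (Hypothetical type.)
#[derive(Clone, Debug)]
pub enum State {
    /// Try each alternative without consuming input.
    Alt(Vec<usize>),
    /// Consume one byte in `start..=end`, then go to `next`.
    Range { start: u8, end: u8, next: usize },
    /// Consume one byte matching any `(lo, hi, next)` transition.
    /// With no transitions, this state is a dead end.
    Sparse(Vec<(u8, u8, usize)>),
    /// Report a match.
    Match,
}

/// The four states from the dump: 0 = alt(2, 1), 1 = \x00-\xff => 0,
/// 2 = sparse(), 3 = MATCH.
pub fn nfa_for_not_any() -> Vec<State> {
    vec![
        State::Alt(vec![2, 1]),
        State::Range { start: 0x00, end: 0xFF, next: 0 },
        State::Sparse(vec![]),
        State::Match,
    ]
}

/// Add `id` and everything reachable from it without consuming input to `set`.
fn closure(nfa: &[State], id: usize, set: &mut Vec<usize>) {
    if set.contains(&id) {
        return;
    }
    set.push(id);
    if let State::Alt(alts) = &nfa[id] {
        for &alt in alts {
            closure(nfa, alt, set);
        }
    }
}

/// Returns true if a match state is reachable at any point while reading `input`.
pub fn is_match(nfa: &[State], start: usize, input: &[u8]) -> bool {
    let has_match = |set: &[usize]| set.iter().any(|&id| matches!(nfa[id], State::Match));
    let mut current = Vec::new();
    closure(nfa, start, &mut current);
    for &b in input {
        if has_match(&current) {
            return true;
        }
        let mut next = Vec::new();
        for &id in &current {
            match &nfa[id] {
                State::Range { start: lo, end: hi, next: to } if *lo <= b && b <= *hi => {
                    closure(nfa, *to, &mut next);
                }
                State::Sparse(transitions) => {
                    // An empty list means no byte can ever leave this state.
                    for &(lo, hi, to) in transitions {
                        if lo <= b && b <= hi {
                            closure(nfa, to, &mut next);
                        }
                    }
                }
                _ => {}
            }
        }
        current = next;
    }
    has_match(&current)
}
```

Starting from either state 0 or state 2, `is_match(&nfa_for_not_any(), start, input)` is false for every input, because nothing ever transitions into state 3. The pattern compiles, but it can never match.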

This bug was found by OSS-Fuzz.
