Undefined behaviour in re2c lexers caused by unspecified default rule. #17523

Open
@skvadrik

Description

Hi! [re2c author here] It came to my attention in this nixos thread that re2c emits some serious warnings for the current lexer code, e.g. this one. When running re2c with -W (or with any version >= 4.0), you should see this:

sapi/phpdbg/phpdbg_lexer.l:80:20: warning: escape has no effect: '\.' [-Wuseless-escape]
sapi/phpdbg/phpdbg_lexer.l:64:0: warning: control flow in condition 'NORMAL' is undefined for strings that match
        '\x22 [\x0\xA]'
        '\x27 [\x0\xA]'
        '\x22 \x22 [\x0\x9-\xA\xD\x20\x23]'
...

In particular, the -Wundefined-control-flow warning indicates a really serious issue: some input strings match no rule in the given condition, so the lexer's control flow for them is undefined. It is documented here. The fix should be simple: add a default rule <*> * { /* error handling / abort / etc. */ }. Ideally, also simplify overly complex constructs like GENERIC_ID, which make it hard to understand what's going on.
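
To illustrate the fix, here is a minimal sketch of a condition-based re2c block with a default rule. This is not the actual phpdbg lexer: the second condition `NUMBER` and the token names `T_WORD`, `T_NUMBER` and `T_ERROR` are hypothetical, and the surrounding C function and the usual re2c configuration (`YYCTYPE`, `YYCURSOR`, `YYGETCONDITION`, `YYSETCONDITION`) are assumed to be defined elsewhere.

```c
/*!re2c
    // Hypothetical rules: each condition accepts only a narrow set of inputs.
    <NORMAL> [a-z]+ { return T_WORD; }
    <NORMAL> [0-9]  :=> NUMBER
    <NUMBER> [0-9]+ { return T_NUMBER; }

    // Default rule: `<*>` applies it to every condition, and `*` matches any
    // single code unit with the lowest priority, wherever it is placed.
    // Input that no other rule matches ends up here instead of falling
    // into undefined control flow.
    <*> * { return T_ERROR; }
*/
```

With the default rule present, every input has a defined outcome in every condition, so -Wundefined-control-flow should no longer have anything to report for the block. What the action does (return an error token, abort, skip the character) is a design choice for each lexer.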

Some other warnings report unreachable rules, which is also not good (there's a rule that you think is doing something, but it isn't).

I'm happy to help with further investigation and with fixing these issues. I do recommend enabling the warnings for all lexers from now on: new bugs keep creeping in as the lexer code changes.

PHP Version

HEAD

Operating System

NixOS
