oss-fuzz 69058: TokenError #1787

Closed
@nedbat

The OSS-Fuzz link seems to be private, so I'm copying the details here: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=69058

Project: coveragepy
Fuzzing Engine: libFuzzer
Fuzz Target: fuzz_parse
Job Type: libfuzzer_asan_coveragepy
Platform Id: linux

Crash Type: Uncaught exception
Crash Address: 
Crash State:
  _removeHandlerRef
  _tokenize
  generate_tokens

This is the claimed stack trace:

 	 === Uncaught Python exception: ===
	TokenError: ('EOF in multi-line string', (2, 0))
	Traceback (most recent call last):
	  File "fuzz_parse.py", line 33, in TestOneInput
	  File "coverage/parser.py", line 265, in parse_source
	  File "coverage/parser.py", line 143, in _raw_parse
	  File "coverage/phystokens.py", line 179, in generate_tokens
	  File "tokenize.py", line 461, in _tokenize
	TokenError: ('EOF in multi-line string', (2, 0))

The provided test case is an 8-byte file:

% hexdump -C /dwn/clusterfuzz-testcase-minimized-fuzz_parse-5820066691088384
00000000  ff 8d a7 dc 0a 27 27 a7                           |.....''.|
00000008
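
The same 8 bytes can become different text depending on how they are decoded. This sketch shows three plain-Python decodings of the test case. Which one, if any, matches what the fuzz harness does is not known here, so treat the choice of decodings as an assumption made for illustration.

```python
# The 8-byte test case from the hexdump above.
raw = bytes.fromhex("ff8da7dc0a2727a7")

# Latin-1 maps each byte to the code point with the same value,
# which is what a str literal full of \xNN escapes also produces.
latin1_text = raw.decode("latin-1")

# Strict UTF-8 rejects these bytes (0xFF is never valid in UTF-8).
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Lenient UTF-8 substitutes U+FFFD for each undecodable byte.
replaced_text = raw.decode("utf-8", errors="replace")

print(repr(latin1_text))
print(utf8_ok)
print(repr(replaced_text))
```

If the harness builds its text some other way, the parser sees a different string than a byte-for-byte literal would give it.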

I've tried to reproduce this problem, and cannot:

from coverage.parser import PythonParser
parser = PythonParser(text="\xFF\x8D\xA7\xDC\n''\xA7")
parser.parse_source()

produces:

Traceback (most recent call last):
  File "/Users/ned/coverage/trunk/coverage/parser.py", line 265, in parse_source
    self._ast_root = ast_parse(self.text)
                     ^^^^^^^^^^^^^^^^^^^^
  File "/Users/ned/coverage/trunk/coverage/misc.py", line 381, in ast_parse
    return ast.parse(text)
           ^^^^^^^^^^^^^^^
  File "/usr/local/pyenv/pyenv/versions/3.11.9/lib/python3.11/ast.py", line 50, in parse
    return compile(source, filename, mode, flags,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<unknown>", line 1
    ÿ�§Ü
     ^
SyntaxError: invalid non-printable character U+008D

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/ned/coverage/trunk/fuzz.py", line 3, in <module>
    parser.parse_source()
  File "/Users/ned/coverage/trunk/coverage/parser.py", line 268, in parse_source
    raise NotPython(
coverage.exceptions.NotPython: Couldn't parse '<code>' as Python source: 'invalid non-printable character U+008D' at line 1

Somehow OSS-Fuzz gets a TokenError, but my local run does not: it raises NotPython (caused by a SyntaxError) instead. I don't understand how they are getting their error.
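
For reference, the standard-library tokenizer raises TokenError when a triple-quoted string is opened and never closed. This is a minimal sketch using made-up sample inputs and a hypothetical helper, `token_error_for`. It is not the fuzzer's actual input, and the exact message and position in the exception may vary between Python versions.

```python
import io
import tokenize


def token_error_for(text):
    """Return the TokenError raised while tokenizing text, or None.

    Hypothetical helper for illustration; not part of coverage.py.
    """
    try:
        # Exhaust the generator so the whole input gets tokenized.
        for _ in tokenize.generate_tokens(io.StringIO(text).readline):
            pass
    except tokenize.TokenError as exc:
        return exc
    return None


# A triple-quoted string that opens on line 2 and is never closed.
unterminated = token_error_for("x = 1\n'''abc\n")

# The same string, properly closed, tokenizes without error.
closed = token_error_for("x = 1\n'''abc'''\n")

print(type(unterminated).__name__, unterminated.args if unterminated else None)
print(closed)
```

The string in my local attempt has only two quote characters on line 2, so it does not obviously open a triple-quoted string. That may mean the text the harness builds from the 8 bytes is not the text I tested with, but that is a guess.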

Labels: bug