Extend lexer to cover non-ASCII unicode cases

There are a few parts to this:
- Implement the PEP 3131 rule for identifier tokenization: when the lexer leaves the ASCII range, switch to the XID_Start/XID_Continue character classes and NFKC-normalize the resulting identifier. This is expensive, but it will be a cold path for ASCII-only input.
- Copy the rules from rustboot's handling of character and string literals: handle unicode escapes in those contexts, and accept general UTF-8 input there without normalizing it.
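
The identifier rule in the first item can be sketched roughly as follows. This is only an illustration in Python, not the lexer's actual code: the names `is_ident_start`, `is_ident_continue`, and `scan_identifier` are hypothetical, and `str.isidentifier()` is used as a stand-in for the XID_Start/XID_Continue tables because Python's own identifier rule follows PEP 3131. Treating `_` as a valid start character is also an assumption.

```python
import unicodedata


def is_ident_start(ch):
    """Approximate XID_Start (plus '_'), with an ASCII fast path."""
    if ch < "\x80":
        # Hot path: plain ASCII letters and underscore.
        return ch.isalpha() or ch == "_"
    # Cold path: defer to Python's PEP 3131 identifier rule.
    return ch.isidentifier()


def is_ident_continue(ch):
    """Approximate XID_Continue, with an ASCII fast path."""
    if ch < "\x80":
        return ch.isalnum() or ch == "_"
    # Prefix a known start character so only the "continue" property is tested.
    return ("a" + ch).isidentifier()


def scan_identifier(src, pos):
    """Scan one identifier starting at src[pos].

    Returns (normalized_text, end_pos), or None if no identifier starts here.
    """
    if pos >= len(src) or not is_ident_start(src[pos]):
        return None
    ascii_only = src[pos] < "\x80"
    end = pos + 1
    while end < len(src) and is_ident_continue(src[end]):
        if src[end] >= "\x80":
            ascii_only = False
        end += 1
    text = src[pos:end]
    if not ascii_only:
        # Only pay for NFKC normalization once we have left the ASCII range.
        text = unicodedata.normalize("NFKC", text)
    return text, end
```

With this sketch, an ASCII identifier never reaches the normalization step, while an identifier containing a compatibility character such as the U+FB01 ligature is folded to its NFKC form before it becomes a token.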


Extend lexer to cover non-ASCII unicode cases #242

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Extend lexer to cover non-ASCII unicode cases #242

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions