Proc macros operate on tokens, including string/character/byte-string/byte literal tokens, which they can get from various sources.
- Source 1: Lexer.
  This is the most reliable source: the token is passed to a macro precisely as it was written in source code. `"C"` will be passed as `"C"`, but the same C in escaped form `"\x43"` will be passed as `"\x43"`. Proc macros can observe the difference because `ToString` (the only way to get the literal contents in the proc macro API) also prints the literal precisely.
- Source 2: Proc macro API.
  `Literal::string(s: &str)` will make you a string literal containing data `s`, approximately. The precise token (returned by `ToString`) will contain:
  - `escape_debug(s)` for string literals (`Literal::string`)
  - `escape_unicode(s)` for character literals (`Literal::character`)
  - `escape_default(s)` for byte string literals (`Literal::byte_string`)
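The three escaping functions behave quite differently. A standalone sketch using the identically-named `std` escape methods (an illustration of the escaping behavior, not actual proc macro code):

```rust
fn main() {
    // escape_debug: escapes quotes, backslashes and control characters,
    // but keeps printable non-ASCII characters as-is.
    assert_eq!("héllo\n".escape_debug().to_string(), "héllo\\n");

    // escape_unicode: escapes *every* character as \u{...},
    // even plain ASCII like 'C'.
    assert_eq!('C'.escape_unicode().to_string(), "\\u{43}");

    // escape_default (byte version): printable ASCII stays as-is,
    // everything else becomes \xNN.
    let escaped: String = b"h\xff"
        .iter()
        .map(|&b| std::ascii::escape_default(b).to_string())
        .collect();
    assert_eq!(escaped, "h\\xff");
}
```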
- Source 3: Recovered from non-attribute AST.
  The AST goes through pretty-printing first and is then re-tokenized. The precise token (returned by `ToString`) will contain:
  - precise `s` for raw AST strings
  - `escape_debug(s)` for non-raw AST strings
  - `escape_default(s)` for AST characters, bytes and byte strings (both raw and non-raw)
- Source 4: Recovered from attribute AST.
  Just an ad-hoc recovery without pretty-printing. The precise token (returned by `ToString`) will contain:
  - precise `s` for raw AST strings
  - `escape_default(s)` for non-raw AST strings, and for AST characters, bytes and byte strings (both raw and non-raw)
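For non-raw strings, the only difference between Sources 3 and 4 is `escape_debug` vs `escape_default`. A quick comparison using the corresponding `std` string methods (again an illustration, not proc macro code):

```rust
fn main() {
    let s = "héllo";
    // escape_debug (Source 3) keeps printable non-ASCII characters intact...
    assert_eq!(s.escape_debug().to_string(), "héllo");
    // ...while escape_default (Source 4) escapes them as \u{...}.
    assert_eq!(s.escape_default().to_string(), "h\\u{e9}llo");
}
```

So the same string literal can round-trip differently through a proc macro depending only on whether it arrived via an attribute or not.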
EDIT: Doc comments also go through `escape_debug` when converted to `#[doc = "content"]` tokens for proc macros.
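For example, a comment body containing quotes gets them escaped; a sketch using `str::escape_debug` to mimic what happens to the body:

```rust
fn main() {
    // A doc comment like `/// say "hi"` becomes #[doc = "say \"hi\""]:
    // the body is passed through escape_debug, which escapes the quotes.
    let body = r#"say "hi""#;
    assert_eq!(body.escape_debug().to_string(), r#"say \"hi\""#);
}
```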
It would be nice to:
- Figure out what escaping we actually want (perhaps none?) and document the motivation behind the escaping choices.
- Get rid of the escaping differences between token sources, so that at least literals of the same kind are escaped identically.