Emoji in label/lifetime recovered as character literal (rather than identifier) #108019

Closed
@izik1

Code

fn bar() {
    '🐱 loop {
        break
    }
}

Current output

error[E0762]: unterminated character literal
 --> src/lib.rs:2:5
  |
2 |     '🐱 loop {
  |     ^^^^^^^^^^

Desired output

error: identifiers cannot contain emoji
 --> src/lib.rs:2:5
  |
2 |     '🐱: loop {
  |      ^^

or something similar to the error for:

fn bar() {
    let 🐱 = ();
}

error: identifiers cannot contain emoji: `🐱`
 --> src/lib.rs:2:9
  |
2 |     let 🐱 = ();
  |         ^^

Perhaps with a "= help: did you mean to use a character literal?" note when applicable.

Rationale and extra context

I feel the rationale is self-explanatory; however, if it turns out not to be, I can provide one on request.

Other cases

Small aside: I originally wrote this all for 🥺, but that is bizarrely not recognized in idents at all (it gives an error: unknown start of token: \u{1f97a}). After realizing that some emoji are handled better, I decided to use 🐱. I specifically avoided 🦀 because it has extra-special handling ("Ferris cannot be used as an identifier").

Another case is, as mentioned prior, in lifetime names (as far as I'm aware, this is the same underlying cause: the emoji causes the token to be a character literal):

fn foo<'🐱>() -> &'🐱 () {
   &()
}

which gives 2 errors:

error: character literal may only contain one codepoint
 --> src/lib.rs:1:8
  |
1 | fn foo<'🐱>() -> &'🐱 () {
  |        ^^^^^^^^^^^^
  |
help: if you meant to write a `str` literal, use double quotes
  |
1 | fn foo<"🐱>() -> &"🐱 () {
  |        ~~~~~~~~~~~~

error: expected one of `#`, `>`, `const`, identifier, or lifetime, found `'🐱>() -> &'`
 --> src/lib.rs:1:8
  |
1 | fn foo<'🐱>() -> &'🐱 () {
  |        ^^^^^^^^^^^^ expected one of `#`, `>`, `const`, identifier, or lifetime
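
As a rough illustration of the suspected cause (this is not rustc's actual lexer, and every name below is hypothetical), here is a toy classifier for what follows a `'`. It assumes the decision is made by peeking at the next one or two characters: an identifier-start character means a lifetime/label, anything else falls through to a character literal. The `recover_emoji` flag sketches the requested behaviour, where an emoji is tentatively accepted as part of the name so that a later stage can report "identifiers cannot contain emoji" instead. The emoji check is a crude range test, assumed only for the example.

```rust
// Toy sketch only: NOT rustc's lexer. All names here are hypothetical.

#[derive(Debug, PartialEq)]
enum QuoteToken {
    /// `'name` not immediately closed by another quote.
    Lifetime,
    /// `'x'`, or something the lexer will try to treat as one.
    CharLiteral,
}

/// Hypothetical stand-in for "this character may start an identifier".
fn is_ident_start(c: char) -> bool {
    c == '_' || c.is_alphabetic()
}

/// Crude emoji test covering only a couple of Unicode blocks (an assumption
/// made for illustration, not a complete definition of "emoji").
fn is_emoji_like(c: char) -> bool {
    matches!(c as u32, 0x1F300..=0x1FAFF | 0x2600..=0x27BF)
}

/// Classify the token that begins at a single quote.
/// With `recover_emoji == false`, an emoji after the quote falls through to
/// the character-literal path; with `true`, it is treated as a lifetime/label
/// so that an emoji-specific diagnostic could be emitted later.
fn classify_after_quote(src: &str, recover_emoji: bool) -> QuoteToken {
    let mut chars = src.chars();
    assert_eq!(chars.next(), Some('\''), "input must start with a single quote");
    let first = chars.next();
    let second = chars.next();
    match (first, second) {
        // `'a'` is a char literal even though `a` could start an identifier.
        (Some(_), Some('\'')) => QuoteToken::CharLiteral,
        (Some(c), _) if is_ident_start(c) => QuoteToken::Lifetime,
        (Some(c), _) if recover_emoji && is_emoji_like(c) => QuoteToken::Lifetime,
        _ => QuoteToken::CharLiteral,
    }
}
```

Under these assumptions, `classify_after_quote("'🐱 loop {", false)` yields `CharLiteral` (matching the "unterminated character literal" error above), while passing `true` yields `Lifetime`. An input such as `'a🐱: loop {}` is classified as `Lifetime` either way, because the character right after the quote is an ordinary identifier start.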

The following sample also has very different output (and probably closer to the expected output, although it's not without its own weirdness):

fn bar() {
    'a🐱: loop {}
}

error: malformed loop label
 --> src/lib.rs:6:7
  |
6 |     'a🐱: loop {}
  |       ^^ help: use the correct loop label format: `'🐱`

error: expected `while`, `for`, `loop` or `{` after a label
 --> src/lib.rs:6:7
  |
6 |     'a🐱: loop {}
  |       ^^ expected `while`, `for`, `loop` or `{` after a label
  |
help: consider removing the label
  |
6 -     'a🐱: loop {}
6 +     🐱: loop {}
  |

error: labeled expression must be followed by `:`
 --> src/lib.rs:6:7
  |
6 |     'a🐱: loop {}
  |     ---^^^^^^^^^^
  |     | |
  |     | help: add `:` after the label
  |     the label
  |
  = note: labels are used before loops and blocks, allowing e.g., `break 'label` to them

error: identifiers cannot contain emoji: `🐱`
 --> src/lib.rs:6:7
  |
6 |     'a🐱: loop {}
  |       ^^

warning: unused label
 --> src/lib.rs:6:7
  |
6 |     'a🐱: loop {}
  |       ^^
  |
  = note: `#[warn(unused_labels)]` on by default

warning: `playground` (lib) generated 1 warning
error: could not compile `playground` due to 4 previous errors; 1 warning emitted

Anything else?

No response

Labels

A-diagnostics: Area: Messages for errors, warnings, and lints
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
