Parse error recovery is observable by macros in several cases

## Introduction

In multiple cases, macro fragment specifiers like `expr` and `stmt` match more token streams than they should as a consequence of the parser trying to recover from obviously invalid Rust code to provide better immediate and subsequent error messages to the user.

## Why Is This a Concern?

The user should be allowed to assume that a fragment specifier only matches valid Rust code. Anything else would mean the fragment specifier does not live up to its name and, as a result, would render it useless (to exaggerate).

One use of macros is defining embedded / internal domain-specific languages (DSLs). Part of that is defining new syntax which might not necessarily be valid Rust syntax. Declarative macros allow users to create several macro rules / matchers, enabling relatively fine-grained matching on tokens. Obviously, when writing those rules, macro authors need to know what a given fragment specifier accepts in order to confidently determine which specific rule applies to a given input. If the grammar used by a fragment specifier is actually larger than promised and basically *unknown* (implementation-defined, to be precise), this becomes an impossible task.
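To make the rule-selection point concrete, here is a minimal sketch. The macro name `which_rule` and the labels it returns are made up for illustration. Rules are tried from top to bottom, and a rule whose matcher does not fit the input is skipped in favor of the next one. The sketch only uses inputs that take this well-behaved path: the `expr` fragment stops cleanly in front of the unknown token, so the DSL rules get their turn.

```rust
// Hypothetical macro for illustration: each rule expands to a label
// naming the rule that matched.
macro_rules! which_rule {
    // Tried first: anything that is a complete Rust expression.
    ($e:expr) => { "expr" };

    // DSL rules, only reached if the `expr` rule does not match.
    ($a:literal AND $b:literal) => { "and-dsl" };
    (NOT $a:literal) => { "not-dsl" };
}

fn main() {
    // A plain Rust expression is claimed by the first rule.
    assert_eq!(which_rule!(0 + 1), "expr");

    // `0` alone is an expression, but `AND 1` is left over, so the
    // first rule is skipped and the DSL rule matches.
    assert_eq!(which_rule!(0 AND 1), "and-dsl");

    // `NOT` alone parses as a path expression, `1` is left over, so
    // matching falls through to the last rule.
    assert_eq!(which_rule!(NOT 1), "not-dsl");

    println!("all rules selected as expected");
}
```

The design relies on the `expr` fragment rejecting the input quietly. If parsing the fragment instead emits a diagnostic, as happens with the lowercase spellings shown in the examples further down, the later rules are never tried.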

Not only that. If we don't do anything, the grammars matched by fragment specifiers *will keep changing* over time as more and more recovery code gets added. This breaks Rust's backward compatibility guarantees! Macro calls that compile with one version of the compiler might no longer compile with a future version. In fact, backward compatibility has already been broken multiple times in the past without notice by (some) PRs introducing more error recovery.

## Examples

There might be **many more cases** than listed below, but finding them takes a lot of time spent experimenting and reading through the parser. I'll try to extend the list over time.

### Expressions

```rust
macro_rules! check {
    ($e:expr) => {};

    // or `===`, `!==`, `<>`, `<=>`, `or` instead of `and`
    ($a:literal and $b:literal) => {};
    ($a:literal AND $b:literal) => {};

    (not $a:literal) => {};
    (NOT $a:literal) => {};
}

check! { 0 AND 1 } // passes (all good)
check! { 0 and 1 } // ⚠️ FAILS but should pass! "`and` is not a logical operator"

check! { NOT 1 } // passes (all good)
check! { not 1 } // ⚠️ FAILS but should pass! "unexpected `1` after identifier"
```

<details>
<summary><b>stderr</b></summary>

```
error: `and` is not a logical operator
  --> src/lib.rs:13:12
   |
13 | check! { 0 and 1 } // ⚠️ FAILS but should pass! "`and` is not a logical operator"
   |            ^^^ help: use `&&` to perform logical conjunction
   |
   = note: unlike in e.g., Python and PHP, `&&` and `||` are used for logical operators

error: unexpected `1` after identifier
  --> src/lib.rs:16:14
   |
16 | check! { not 1 } // ⚠️ FAILS but should pass! "unexpected `1` after identifier"
   |          ----^
   |          |
   |          help: use `!` to perform bitwise not

```

</details>

### Statements

```rust
macro_rules! check {
    ($s:stmt) => {};

    // or `auto` instead of `var`
    (var $i:ident) => {};
    (VAR $i:ident) => {};
}

check! { VAR x } // passes (all good)
check! { var x } // ⚠️ FAILS but should pass! "invalid variable declaration" (+ ICE, #103529)
```

<details>
<summary><b>stderr</b></summary>

```
error: invalid variable declaration
  --> src/lib.rs:10:10
   |
10 | check! { var x } // ⚠️ FAILS but should pass! "invalid variable declaration" (+ ICE, #103529)
   |          ^^^
   |
help: write `let` instead of `var` to introduce a new variable
   |
10 | check! { let x } // ⚠️ FAILS but should pass! "invalid variable declaration" (+ ICE, #103529)
   |          ~~~
```

</details>

### Other Fragments (e.g. Items, Types)

\[no known cases at the time of this writing (active search ongoing)\]

## Editorial Notes

<details>
<summary>editorial notes</summary>

I used to list some more cases above (which some of you might have noticed) but I've since removed them as they've turned out to be incorrect. Here they are:

```rust
macro_rules! check {
    ($e:expr) => {};

    ($a:literal++) => {};
    ($a:literal@@) => {};

    ($n:ident : $e:expr) => {};
}

check! { 1@@ } // passes (all good)
check! { 1++ } // FAILS but would fail ANYWAY! "Rust has no postfix increment operator"
// ^ without the recovery code, we would try to parse a binary expression (operator `+`) and fail at the 2nd `+`

check! { struct : loop {} } // passes (all good)
check! { main : loop {} } // FAILS but would fail ANYWAY! "malformed loop label"
// ^ without the recovery code, we would try to parse type ascription and fail at `loop {}` which isn't a type
```

</details>

---

Related issue: #90256 (concerning procedural macros).

@rustbot label A-macros A-diagnostics A-parser T-compiler

