Description
Introduction
In multiple cases, macro fragment specifiers like expr
and stmt
match more token streams than they should as a consequence of the parser trying to recover from obviously invalid Rust code to provide better immediate and subsequent error messages to the user.
Why Is This a Concern?
The user should be allowed to assume that a fragment specifier only matches valid Rust code, anything else would make the fragment specifier not live up to its name and as a result render it useless (to exaggerate).
One use of macros is the ability to define embedded / internal domain-specific languages (DSLs). Part of that is defining new syntax which might not necessarily be valid Rust syntax. Declarative macros allow users to create several macro rules / matchers enabling relatively fine-grained matching on tokens. Obviously, when writing those rules, macro authors need to know what a given fragment specifier accepts in order to confidently determine which specific rule applies for a given input. If the grammar used by a fragment specifier is actually larger than promised and basically unknown (implementation-defined to be precise), this becomes an impossible task.
Not only that. If we don't do anything, the grammars matched by fragment specifiers will keep changing over time as more and more recovery code gets added. This breaks Rust's backward compatibility guarantees! Macro calls that used to compile at some fixed point in time might potentially no longer compile in a future version of the compiler. In fact, backward compatibility has already been broken multiple times in the past without notice by (some) PRs introducing more error recovery.
Examples
There might be many more cases than listed below but it takes a lot of time experimenting and looking through the parser. I'll try to extend the list over time.
Expressions
macro_rules! check {
($e:expr) => {};
// or `===`, `!==`, `<>`, `<=>`, `or` instead of `and`
($a:literal and $b:literal) => {};
($a:literal AND $b:literal) => {};
(not $a:literal) => {};
(NOT $a:literal) => {};
}
check! { 0 AND 1 } // passes (all good)
check! { 0 and 1 } // ⚠️ FAILS but should pass! "`and` is not a logical operator"
check! { NOT 1 } // passes (all good)
check! { not 1 } // ⚠️ FAILS but should pass! "unexpected `1` after identifier"
stderr
error: `and` is not a logical operator
--> src/lib.rs:13:12
|
13 | check! { 0 and 1 } // ⚠️ FAILS but should pass! "invalid comparison operator"
| ^^^ help: use `&&` to perform logical conjunction
|
= note: unlike in e.g., Python and PHP, `&&` and `||` are used for logical operators
error: unexpected `1` after identifier
--> src/lib.rs:16:14
|
16 | check! { not 1 } // ⚠️ FAILS but should pass! "unexpected `1` after identifier"
| ----^
| |
| help: use `!` to perform bitwise not
Statements
macro_rules! check {
($s:stmt) => {};
// or `auto` instead of `var`
(var $i:ident) => {};
(VAR $i:ident) => {};
}
check! { VAR x } // passes (all good)
check! { var x } // ⚠️ FAILS but should pass! "invalid variable declaration" (+ ICE, #103529)
stderr
error: invalid variable declaration
--> src/lib.rs:10:10
|
10 | check! { var x } // ⚠️ FAILS but should pass! "invalid variable declaration" (+ ICE, #103529)
| ^^^
|
help: write `let` instead of `var` to introduce a new variable
|
10 | check! { let x } // ⚠️ FAILS but should pass! "invalid variable declaration" (+ ICE, #103529)
| ~~~
Other Fragments (e.g. Items, Types)
[no known cases at the time of this writing (active search ongoing)]
Editorial Notes
editorial notes
I used to list some more cases above (which some of you might have noticed) but I've since removed them as they've turned out to be incorrect. Here they are:
macro_rules! check {
($e:expr) => {};
($a:literal++) => {};
($a:literal@@) => {};
($n:ident : $e:expr) => {};
}
check! { 1@@ } // passes (all good)
check! { 1++ } // FAILS but would fail ANYWAY! "Rust has no postfix increment operator"
// ^ without the recovery code, we would try to parse a binary expression (operator `+`) and fail at the 2nd `+`
check! { struct : loop {} } // passes (all good)
check! { main : loop {} } // FAILS but would fail ANYWAY! "malformed loop label"
// ^ without the recovery code, we would try to parse type ascription and fail at `loop {}` which isn't a type
Related issue: #90256 (concerning procedural macros).
@rustbot label A-macros A-diagnostics A-parser T-compiler