@@ -25,9 +25,9 @@ match input_2 {
25
25
# }
26
26
~~~~
27
27
28
- This code could become tiresome if repeated many times. However, there is no
29
- straightforward way to rewrite it without the repeated code, using functions
30
- alone. There is a solution, though: defining a macro to solve the problem . Macros are
28
+ This code could become tiresome if repeated many times. However, no function
29
+ can capture its functionality to make it possible to rewrite the repetition
30
+ away. Rust's macro system can eliminate the repetition. Macros are
31
31
lightweight custom syntax extensions, themselves defined using the
32
32
`macro_rules!` syntax extension. The following `early_return` macro captures
33
33
the pattern in the above code:
@@ -65,7 +65,7 @@ macro. It appears on the left-hand side of the `=>` in a macro definition. It
65
65
conforms to the following rules:
66
66
67
67
1. It must be surrounded by parentheses.
68
- 2. `$` has special meaning.
68
+ 2. `$` has special meaning (described below).
69
69
3. The `()`s, `[]`s, and `{}`s it contains must balance. For example, `([)` is
70
70
forbidden.
71
71
@@ -118,10 +118,11 @@ expression, `() => (let $x=$val)` is a macro that expands to a statement, and
118
118
`() => (1,2,3)` is a macro that expands to a syntax error).
119
119
120
120
Except for permissibility of `$name` (and `$(...)*`, discussed below), the
121
- right-hand side of a macro definition follows the same rules as ordinary
122
- Rust syntax. In particular, macro invocations (including invocations of the
123
- macro currently being defined) are permitted in expression, statement, and
124
- item locations.
121
+ right-hand side of a macro definition is ordinary Rust syntax. In particular,
122
+ macro invocations (including invocations of the macro currently being defined)
123
+ are permitted in expression, statement, and item locations. However, nothing
124
+ else about the code is examined or executed by the macro system; execution
125
+ still has to wait until runtime.
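To illustrate that the right-hand side is only transcribed, never executed, by the expander, here is a minimal sketch in current stable Rust syntax (an assumption; this tutorial otherwise uses an older dialect). The names `twice!` and `count_twice` are hypothetical. The macro pastes its argument expression twice, so any side effect in the argument happens twice when the program runs.

```rust
use std::cell::Cell;

// Hypothetical macro: transcribes the argument expression twice.
// The expander copies the syntax; it does not evaluate it.
macro_rules! twice {
    ($e:expr) => { $e + $e };
}

// Returns (sum, number of calls) to show that the argument ran twice at runtime.
fn count_twice() -> (u32, u32) {
    let calls = Cell::new(0u32);
    // Each call bumps the counter and returns its new value.
    let next = || {
        calls.set(calls.get() + 1);
        calls.get()
    };
    // Expands to `next() + next()`, so `next` is called twice.
    let sum = twice!(next());
    (sum, calls.get())
}

fn main() {
    let (sum, calls) = count_twice();
    println!("sum = {}, calls = {}", sum, calls);
}
```

If the macro system evaluated its argument before transcription, the counter would be bumped only once; instead the evaluation is left entirely to the running program.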
125
126
126
127
## Interpolation location
127
128
@@ -199,7 +200,196 @@ parsing `e`. Changing the invocation syntax to require a distinctive token in
199
200
front can solve the problem. In the above example, `$(T $t:ty)* E $e:exp`
200
201
solves the problem.
201
202
202
- ## A final note
203
+ # Macro argument pattern matching
204
+
205
+ ## Motivation
206
+
207
+ Now consider code like the following:
208
+
209
+ ~~~~
210
+ # enum t1 { good_1(t2, uint), bad_1 };
211
+ # pub struct t2 { body: t3 }
212
+ # enum t3 { good_2(uint), bad_2};
213
+ # fn f(x: t1) -> uint {
214
+ match x {
215
+ good_1(g1, val) => {
216
+ match g1.body {
217
+ good_2(result) => {
218
+ // complicated stuff goes here
219
+ return result + val;
220
+ },
221
+ _ => fail ~"Didn't get good_2"
222
+ }
223
+ }
224
+ _ => return 0 // default value
225
+ }
226
+ # }
227
+ ~~~~
228
+
229
+ All the complicated stuff is deeply indented, and the error-handling code is
230
+ separated from matches that fail. We'd like to write a macro that performs
231
+ a match, but with a syntax that suits the problem better. The following macro
232
+ can solve the problem:
233
+
234
+ ~~~~
235
+ macro_rules! biased_match (
236
+ // special case: `let (x) = ...` is illegal, so use `let x = ...` instead
237
+ ( ($e:expr) ~ ($p:pat) else $err:stmt ;
238
+ binds $bind_res:ident
239
+ ) => (
240
+ let $bind_res = match $e {
241
+ $p => ( $bind_res ),
242
+ _ => { $err }
243
+ };
244
+ );
245
+ // more than one name; use a tuple
246
+ ( ($e:expr) ~ ($p:pat) else $err:stmt ;
247
+ binds $( $bind_res:ident ),*
248
+ ) => (
249
+ let ( $( $bind_res ),* ) = match $e {
250
+ $p => ( $( $bind_res ),* ),
251
+ _ => { $err }
252
+ };
253
+ )
254
+ )
255
+
256
+ # enum t1 { good_1(t2, uint), bad_1 };
257
+ # pub struct t2 { body: t3 }
258
+ # enum t3 { good_2(uint), bad_2};
259
+ # fn f(x: t1) -> uint {
260
+ biased_match!((x) ~ (good_1(g1, val)) else { return 0 };
261
+ binds g1, val )
262
+ biased_match!((g1.body) ~ (good_2(result) )
263
+ else { fail ~"Didn't get good_2" };
264
+ binds result )
265
+ // complicated stuff goes here
266
+ return result + val;
267
+ # }
268
+ ~~~~
269
+
270
+ This solves the indentation problem. But if we have a lot of chained matches
271
+ like this, we might prefer to write a single macro invocation. The input
272
+ pattern we want is clear:
273
+ ~~~~
274
+ # macro_rules! b(
275
+ ( $( ($e:expr) ~ ($p:pat) else $err:stmt ; )*
276
+ binds $( $bind_res:ident ),*
277
+ )
278
+ # => (0))
279
+ ~~~~
280
+
281
+ However, it's not possible to directly expand to nested match statements. But
282
+ there is a solution.
283
+
284
+ ## The recursive approach to macro writing
285
+
286
+ A macro may accept multiple different input grammars. The first one to
287
+ successfully match the actual argument to a macro invocation is the one that
288
+ "wins".
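As a small illustration of the "first match wins" rule, the sketch below uses current stable Rust syntax (an assumption, since this tutorial uses an older dialect) and a hypothetical macro named `describe!`. The token `zero` would also satisfy the second rule, but the first rule is tried first and is the one that gets transcribed.

```rust
// Hypothetical macro with three input grammars, tried from top to bottom.
macro_rules! describe {
    // Matches only the literal token `zero`.
    (zero) => { "the literal token `zero`" };
    // Matches any single identifier (including `zero`, if it got this far).
    ($name:ident) => { "some identifier" };
    // Fallback: matches any sequence of token trees.
    ($($other:tt)*) => { "something else" };
}

fn main() {
    println!("{}", describe!(zero));
    println!("{}", describe!(one));
    println!("{}", describe!(1 + 2));
}
```

Reordering the first two rules would make the `zero` rule unreachable, which is why more specific grammars usually go first.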
289
+
290
+
291
+ In the case of the example above, we want to write a recursive macro to
292
+ process the semicolon-terminated lines, one-by-one. So, we want the following
293
+ input patterns:
294
+
295
+ ~~~~
296
+ # macro_rules! b(
297
+ ( binds $( $bind_res:ident ),* )
298
+ # => (0))
299
+ ~~~~
300
+ ...and:
301
+
302
+ ~~~~
303
+ # macro_rules! b(
304
+ ( ($e :expr) ~ ($p :pat) else $err :stmt ;
305
+ $( ($e_rest:expr) ~ ($p_rest:pat) else $err_rest:stmt ; )*
306
+ binds $( $bind_res:ident ),*
307
+ )
308
+ # => (0))
309
+ ~~~~
310
+
311
+ The resulting macro looks like this. Note that the separation into
312
+ `biased_match!` and `biased_match_rec!` occurs only because we have an outer
313
+ piece of syntax (the `let`) which we only want to transcribe once.
314
+
315
+ ~~~~
316
+
317
+ macro_rules! biased_match_rec (
318
+ // Handle the first layer
319
+ ( ($e :expr) ~ ($p :pat) else $err :stmt ;
320
+ $( ($e_rest:expr) ~ ($p_rest:pat) else $err_rest:stmt ; )*
321
+ binds $( $bind_res:ident ),*
322
+ ) => (
323
+ match $e {
324
+ $p => {
325
+ // Recursively handle the next layer
326
+ biased_match_rec!($( ($e_rest) ~ ($p_rest) else $err_rest ; )*
327
+ binds $( $bind_res ),*
328
+ )
329
+ }
330
+ _ => { $err }
331
+ }
332
+ );
333
+ ( binds $( $bind_res:ident ),* ) => ( ($( $bind_res ),*) )
334
+ )
335
+
336
+ // Wrap the whole thing in a `let`.
337
+ macro_rules! biased_match (
338
+ // special case: `let (x) = ...` is illegal, so use `let x = ...` instead
339
+ ( $( ($e:expr) ~ ($p:pat) else $err:stmt ; )*
340
+ binds $bind_res:ident
341
+ ) => (
342
+ let $bind_res = biased_match_rec!(
343
+ $( ($e) ~ ($p) else $err ; )*
344
+ binds $bind_res
345
+ );
346
+ );
347
+ // more than one name: use a tuple
348
+ ( $( ($e:expr) ~ ($p:pat) else $err:stmt ; )*
349
+ binds $( $bind_res:ident ),*
350
+ ) => (
351
+ let ( $( $bind_res ),* ) = biased_match_rec!(
352
+ $( ($e) ~ ($p) else $err ; )*
353
+ binds $( $bind_res ),*
354
+ );
355
+ )
356
+ )
357
+
358
+
359
+ # enum t1 { good_1(t2, uint), bad_1 };
360
+ # pub struct t2 { body: t3 }
361
+ # enum t3 { good_2(uint), bad_2};
362
+ # fn f(x: t1) -> uint {
363
+ biased_match!(
364
+ (x) ~ (good_1(g1, val)) else { return 0 };
365
+ (g1.body) ~ (good_2(result) ) else { fail ~"Didn't get good_2" };
366
+ binds val, result )
367
+ // complicated stuff goes here
368
+ return result + val;
369
+ # }
370
+ ~~~~
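The same structure as `biased_match_rec!` (one rule that peels off the first layer and recurses on the rest, plus a base-case rule) can be seen in a much smaller sketch. It is written in current stable Rust syntax, which is an assumption relative to the dialect used above, and `max_of!` is a hypothetical macro introduced only for illustration.

```rust
// Hypothetical recursive macro: computes the maximum of one or more expressions.
macro_rules! max_of {
    // Base case: a single expression is its own maximum.
    ($x:expr) => { $x };
    // Recursive case: handle the first expression, then recurse on the rest.
    ($x:expr, $($rest:expr),+) => {{
        let first = $x;
        // The recursive invocation is transcribed as syntax here and
        // expanded by the compiler in a later expansion step.
        let others = max_of!($($rest),+);
        if first > others { first } else { others }
    }};
}

fn main() {
    println!("{}", max_of!(3, 9, 4));
    println!("{}", max_of!(7));
}
```

Each expansion step consumes one argument, so the nesting depth of the generated code equals the number of arguments, just as the nested `match` expressions above mirror the number of semicolon-terminated lines.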
371
+
372
+ This technique is applicable in many cases where transcribing a result "all
373
+ at once" is not possible. It resembles ordinary functional programming in some
374
+ respects, but it is important to recognize the differences.
375
+
376
+ The first difference is important, but also easy to forget: the transcription
377
+ (right-hand) side of a `macro_rules!` rule is literal syntax, which can only
378
+ be executed at run-time. If a piece of transcription syntax does not itself
379
+ appear inside another macro invocation, it will become part of the final
380
+ program. If it is inside a macro invocation (for example, the recursive
381
+ invocation of `biased_match_rec!`), it does have the opportunity to affect
382
+ transcription, but only through the process of attempted pattern matching.
383
+
384
+ The second difference is related: the evaluation order of macros feels
385
+ "backwards" compared to ordinary programming. Given an invocation
386
+ `m1!(m2!())`, the expander first expands `m1!`, giving it as input the literal
387
+ syntax `m2!()`. If it transcribes its argument unchanged into an appropriate
388
+ position (in particular, not as an argument to yet another macro invocation),
389
+ the expander will then proceed to evaluate `m2!()` (along with any other macro
390
+ invocations `m1!(m2!())` produced).
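The outside-in expansion order can be observed directly. The sketch below is in current stable Rust syntax (an assumption; the tutorial's dialect is older), and the macros `m2!`, `m1_inspect!`, and `m1_pass!` are hypothetical names. The first shows that the outer macro receives the tokens `m2!()` rather than their expansion; the second shows that a transcribed `m2!()` is expanded afterwards.

```rust
// Hypothetical inner macro: expands to the literal 2.
macro_rules! m2 {
    () => { 2 };
}

// Hypothetical outer macro that pattern-matches on its raw argument tokens.
macro_rules! m1_inspect {
    // Matches only if the argument is literally the tokens `m2!()`.
    (m2!()) => { "m1 saw the unexpanded tokens m2!()" };
    ($e:expr) => { "m1 saw some other expression" };
}

// Hypothetical outer macro that transcribes its argument into expression
// position, where the expander then expands it.
macro_rules! m1_pass {
    ($e:expr) => { $e * 10 };
}

fn main() {
    println!("{}", m1_inspect!(m2!()));
    println!("{}", m1_inspect!(2));
    println!("{}", m1_pass!(m2!()));
}
```

If `m2!()` were expanded first, `m1_inspect!` would receive `2` and its first rule could never match.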
391
+
392
+ # A final note
203
393
204
394
Macros, as currently implemented, are not for the faint of heart. Even
205
395
ordinary syntax errors can be more difficult to debug when they occur inside a
@@ -208,3 +398,4 @@ tricky. Invoking the `log_syntax!` macro can help elucidate intermediate
208
398
states, invoking `trace_macros!(true)` will automatically print those
209
399
intermediate states out, and passing the flag `--pretty expanded` as a
210
400
command-line argument to the compiler will show the result of expansion.
401
+