@@ -512,7 +512,7 @@ of St. Andrews (St. Andrews, Fife, UK).
Additional specific influences can be seen from the following languages:
@itemize
@item The structural algebraic types and compilation manager of SML.
- @item The syntax-extension systems of Camlp4 and the Common Lisp readtable.
+ @c @item The syntax-extension systems of Camlp4 and the Common Lisp readtable.
@item The deterministic destructor system of C++.
@end itemize

@@ -599,12 +599,12 @@ U+0009 (tab, @code{'\t'}), U+000A (LF, @code{'\n'}), U+000D (CR, @code{'\r'}).
A @dfn{single-line comment} is any sequence of Unicode characters beginning
with U+002F U+002F (@code{"//"}) and extending to the next U+000A character,
@emph{excluding} cases in which such a sequence occurs within a string literal
- token or a syntactic extension token.
+ token.

A @dfn{multi-line comment} is any sequence of Unicode characters beginning
with U+002F U+002A (@code{"/*"}) and ending with U+002A U+002F (@code{"*/"}),
@emph{excluding} cases in which such a sequence occurs within a string literal
- token or a syntactic extension token. Multi-line comments may be nested.
+ token. Multi-line comments may be nested.

@node Ref.Lex.Ident
@subsection Ref.Lex.Ident
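
The comment rules in the hunk above can be sketched as a small scanner. This is an illustrative sketch in present-day Rust, not the compiler's lexer. The name `strip_comments` is hypothetical, the sketch assumes the terminating U+000A of a single-line comment is kept rather than consumed, and it ignores character literals.

```rust
/// Removes single-line and nested multi-line comments from `src`,
/// leaving the contents of string literals untouched.
/// Assumption: the LF that ends a `//` comment is preserved.
fn strip_comments(src: &str) -> String {
    let chars: Vec<char> = src.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    let mut depth = 0usize; // nesting depth of /* ... */ comments
    let mut in_string = false;

    while i < chars.len() {
        let c = chars[i];
        let next = chars.get(i + 1).copied();

        if depth > 0 {
            // Inside a multi-line comment: only track nesting.
            if c == '/' && next == Some('*') {
                depth += 1;
                i += 2;
            } else if c == '*' && next == Some('/') {
                depth -= 1;
                i += 2;
            } else {
                i += 1;
            }
        } else if in_string {
            out.push(c);
            if c == '\\' {
                // Copy the escaped character so that \" does not end the string.
                if let Some(n) = next {
                    out.push(n);
                    i += 1;
                }
            } else if c == '"' {
                in_string = false;
            }
            i += 1;
        } else if c == '"' {
            in_string = true;
            out.push(c);
            i += 1;
        } else if c == '/' && next == Some('/') {
            // Single-line comment: skip up to, but not including, the next LF.
            while i < chars.len() && chars[i] != '\n' {
                i += 1;
            }
        } else if c == '/' && next == Some('*') {
            depth = 1;
            i += 2;
        } else {
            out.push(c);
            i += 1;
        }
    }
    out
}
```

Under these assumptions, a `//` or `/*` sequence inside a string literal is copied through unchanged, and a nested `/* a /* b */ c */` is removed as a single comment.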
@@ -876,11 +876,11 @@ escaped in order to denote @emph{itself}.
@c * Ref.Lex.Syntax:: Syntactic extension tokens.

Syntactic extensions are marked with the @emph{pound} sigil U+0023 (@code{#}),
- followed by a qualified name of a compile-time imported module item, an
- optional parenthesized list of @emph{parsed expressions}, and an optional
- brace-enclosed region of free-form text (with brace-matching and
- brace-escaping used to determine the limit of the
- region). @xref{Ref.Comp.Syntax}.
+ followed by an identifier, one of @code{fmt}, @code{env},
+ @code{concat_idents}, @code{ident_to_str}, @code{log_syntax}, @code{macro}, or
+ the name of a user-defined macro. This is followed by a vector literal. (Its
+ value will be interpreted syntactically; in particular, it need not be
+ well-typed.)

@emph{TODO: formalize those terms more}.

@@ -1040,7 +1040,6 @@ Compilation Manager, a @emph{unit} in the Owens and Flatt module system, or a
@itemize
@item Metadata about the crate, such as author, name, version, and copyright.
@item The source-file and directory modules that make up the crate.
- @item The set of syntax extensions to enable for the crate.
@item Any external crates or native modules that the crate imports to its top level.
@item The organization of the crate's internal namespace.
@item The set of names exported from the crate.
@@ -1087,11 +1086,13 @@ or Mach-O. The loadable object contains extensive DWARF metadata, describing:
derived from the same @code{use} directives that guided compile-time imports.
@end itemize

- The @code{syntax} directives of a crate are similar to the @code{use}
- directives, except they govern the syntax extension namespace (accessed
- through the syntax-extension sigil @code{#}, @pxref{Ref.Comp.Syntax})
- available only at compile time. A @code{syntax} directive also makes its
- extension available to all subsequent directives in the crate file.
+ @c This might come along sometime in the future.
+
+ @c The @code{syntax} directives of a crate are similar to the @code{use}
+ @c directives, except they govern the syntax extension namespace (accessed
+ @c through the syntax-extension sigil @code{#}, @pxref{Ref.Comp.Syntax})
+ @c available only at compile time. A @code{syntax} directive also makes its
+ @c extension available to all subsequent directives in the crate file.

An example of a crate:

@@ -1105,9 +1106,6 @@ meta (author = "Jane Doe",
// Import a module.
use std (ver = "1.0");

- // Activate a syntax-extension.
- syntax re;
-
// Define some modules.
mod foo = "foo.rs";
mod bar @{
@@ -1124,8 +1122,8 @@ mod bar @{

In a crate, a @code{meta} directive associates free form key-value metadata
with the crate. This metadata can, in turn, be used in providing partial
- matching parameters to syntax-extension loading and crate importing
- directives, denoted by @code{syntax} and @code{use} keywords respectively.
+ matching parameters to crate importing directives, denoted by the @code{use}
+ keyword.

Alternatively, metadata can serve as a simple form of documentation.

@@ -1134,49 +1132,76 @@ Alternatively, metadata can serve as a simple form of documentation.
@c * Ref.Comp.Syntax:: Syntax extension.
@cindex Syntax extension

+ @c , statement or item
Rust provides a notation for @dfn{syntax extension}. The notation is a marked
- syntactic form that can appear as an expression, statement or item in the body
- of a Rust program, or as a directive in a Rust crate, and which causes the
- text enclosed within the marked form to be translated through a named
- extension function loaded into the compiler at compile-time.
-
- The compile-time extension function must return a value of the corresponding
- Rust AST type, either an expression node, a statement node or an item
- node. @footnote{The syntax-extension system is analogous to the extensible
- reader system provided by Lisp @emph{readtables}, or the Camlp4 system of
- Objective Caml.} @xref{Ref.Lex.Syntax}.
-
- A syntax extension is enabled by a @code{syntax} directive, which must occur
- in a crate file. When the Rust compiler encounters a @code{syntax} directive
- in a crate file, it immediately loads the named syntax extension, and makes it
- available for all subsequent crate directives within the enclosing block scope
- of the crate file, and all Rust source files referenced as modules from the
- enclosing block scope of the crate file.
-
- For example, this extension might provide a syntax for regular
- expression literals:
+ syntactic form that can appear as an expression in the body of a Rust
+ program. Syntax extensions make use of bracketed lists, which are
+ syntactically vector literals, but which have no run-time semantics. After
+ parsing, the notation is translated into Rust expressions. The name of the
+ extension determines the translation performed. The name may be one of the
+ built-in extensions listed below, or a user-defined extension, defined using
+ @code{macro}.

- @example
- // In a crate file:
+ @itemize
+ @item @code{fmt} expands into code to produce a formatted string, similar to
+ @code{printf} from C.
+ @item @code{env} expands into a string literal containing the value, at
+ compile-time, of the environment variable named by its argument.
+ @item @code{concat_idents} expands into an identifier which is the
+ concatenation of its arguments.
+ @item @code{ident_to_str} expands into a string literal containing the name of
+ its argument (which must be a literal).
+ @item @code{log_syntax} causes the compiler to pretty-print its arguments.
+ @end itemize

- // Requests the 're' syntax extension from the compilation environment.
- syntax re;
+ Finally, @code{macro} is used to define a new macro. A macro can abstract over
+ second-class Rust concepts that are present in syntax. The arguments to
+ @code{macro} are a bracketed list of pairs (two-element lists). The pairs
+ consist of an invocation and the syntax to expand into. An example:

- // Also declares an import dependency on the module 're'.
- use re;
+ @example
+ #macro[[#apply[fn, [args, ...]], fn(args, ...)]];
+ @end example

- // Reference to a Rust source file as a module in the crate.
- mod foo = "foo.rs";
+ In this case, the invocation @code{#apply[sum, [5, 8, 6]]} expands to
+ @code{sum(5,8,6)}. If @code{...} follows an expression (which need not be as
+ simple as a single identifier) in the input syntax, the matcher will expect an
+ arbitrary number of occurrences of the thing preceding it, and bind syntax to
+ the identifiers it contains. If it follows an expression in the output syntax,
+ it will transcribe that expression repeatedly, according to the identifiers
+ (bound to syntax) that it contains.

- @dots{}
+ The behavior of @code{...} is known as Macro By Example. It allows you to
+ write a macro with arbitrary repetition by specifying only one case of that
+ repetition, and following it by @code{...}, both where the repeated input is
+ matched, and where the repeated output must be transcribed. A more
+ sophisticated example:

- // In the source file "foo.rs", use the #re syntax extension and
- // the re module at run-time.
- let s: str = get_string();
- let pattern: regex = #re.pat@{ aa+b? @};
- let matched: bool = re.match(pattern, s);
+ @example
+ #macro[[#zip_literals[[x, ...], [y, ...]],
+         [[x, y], ...]]];
+ #macro[[#unzip_literals[[x, y], ...],
+         [[x, ...], [y, ...]]]];
@end example

+ In this case, @code{#zip_literals[[1,2,3], [1,2,3]]} expands to
+ @code{[[1,1],[2,2],[3,3]]}, and @code{#unzip_literals[[1,1], [2,2], [3,3]]}
+ expands to @code{[[1,2,3],[1,2,3]]}.
+
+ Macro expansion takes place outside-in: that is,
+ @code{#unzip_literals[#zip_literals[[1,2,3],[1,2,3]]]} will fail because
+ @code{unzip_literals} expects a list, not a macro invocation, as an
+ argument.
+
+ @c
+ The macro system currently has some limitations. It's not possible to
+ destructure anything other than vector literals (therefore, the arguments to
+ complicated macros will tend to be an ocean of square brackets). Macro
+ invocations and @code{...} can only appear in expression positions. Finally,
+ macro expansion is currently unhygienic. That is, name collisions between
+ macro-generated and user-written code can cause unintentional capture.
+
+
@page
@node Ref.Mem
@section Ref.Mem
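
For readers more familiar with current Rust, the `...` repetition described in the last hunk corresponds roughly to `$(...),*` in `macro_rules!`. The sketch below restates the three examples in that modern syntax. It is an analogue only, not the `#macro` implementation this commit documents, and `sum` is a hypothetical helper added so that `apply!` has something to call.

```rust
// Analogue of `#apply[fn, [args, ...]]`: `$($args:expr),*` matches any
// number of comma-separated expressions, and `$($args),*` transcribes them.
macro_rules! apply {
    ($f:expr, [$($args:expr),*]) => { $f($($args),*) };
}

// Analogue of `#zip_literals`: `x` and `y` repeat in lockstep, so both
// input lists must have the same length.
macro_rules! zip_literals {
    ([$($x:expr),*], [$($y:expr),*]) => { [$([$x, $y]),*] };
}

// Analogue of `#unzip_literals`: one repetition in the input feeds two
// separate repetitions in the output.
macro_rules! unzip_literals {
    ($([$x:expr, $y:expr]),*) => { [[$($x),*], [$($y),*]] };
}

// Hypothetical helper, used only to exercise `apply!`.
fn sum(a: i32, b: i32, c: i32) -> i32 {
    a + b + c
}
```

With these definitions, `apply!(sum, [5, 8, 6])` expands to `sum(5, 8, 6)`, and `zip_literals!([1, 2, 3], [1, 2, 3])` expands to `[[1, 1], [2, 2], [3, 3]]`. As in the text above, an argument is matched before any macro invocation inside it is expanded, so `unzip_literals!(zip_literals!(...))` does not match the bracketed-pair pattern.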