Using character class in regex! macro produces numerous warnings on deprecation of \uABCD #19879

Closed
@nodakai

#![feature(phase)]

extern crate regex;
#[phase(plugin)]
extern crate regex_macros;

fn main() {
    regex!(r"\d");
}
<quote expansion>:1:3: 1:8 warning: \U00ABCD12 and \uABCD escapes are deprecated
<quote expansion>:1 '\u0660'
                      ^~~~~
<quote expansion>:1:3: 1:8 help: use \u{ABCD12} escapes instead
<quote expansion>:1 '\u0660'
                      ^~~~~
<quote expansion>:1:3: 1:8 warning: \U00ABCD12 and \uABCD escapes are deprecated
<quote expansion>:1 '\u0669'
                      ^~~~~
<quote expansion>:1:3: 1:8 help: use \u{ABCD12} escapes instead
<quote expansion>:1 '\u0669'
                      ^~~~~
...
<quote expansion>:3:7: 3:12 warning: \U00ABCD12 and \uABCD escapes are deprecated
<quote expansion>:3     '\u0660' ...'\u0669' => true,
                          ^~~~~                     
<quote expansion>:3:7: 3:12 help: use \u{ABCD12} escapes instead
<quote expansion>:3     '\u0660' ...'\u0669' => true,
                          ^~~~~                     
<quote expansion>:3:19: 3:24 warning: \U00ABCD12 and \uABCD escapes are deprecated
<quote expansion>:3     '\u0660' ...'\u0669' => true,
                                      ^~~~~         
<quote expansion>:3:19: 3:24 help: use \u{ABCD12} escapes instead
<quote expansion>:3     '\u0660' ...'\u0669' => true,
...

In this case, libunicode::regex::PERLD, which is essentially Nd_table, is expanded into a sequence of old-style Unicode literals (\u0660 etc.), which produces numerous warnings like the ones above.
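
For comparison, here is a minimal sketch of what a warning-free version of one generated match arm could look like, using the \u{...} escape form that the help message recommends. It is written in current Rust syntax (`..=` rather than the `...` shown in the expansion above), and the function name `is_arabic_indic_digit` is hypothetical; it is not part of libregex_macros.

```rust
/// Hypothetical stand-in for one arm of the generated class matcher.
/// U+0660..U+0669 are the ARABIC-INDIC DIGIT ZERO..NINE code points,
/// the same range that appears in the warnings above.
fn is_arabic_indic_digit(c: char) -> bool {
    match c {
        // New-style escapes: braces around the hex code point.
        '\u{660}'..='\u{669}' => true,
        _ => false,
    }
}
```

Calling `is_arabic_indic_digit('\u{663}')` would take the first arm, while an ASCII digit such as `'3'` falls through to the default arm, since this sketch covers only a single range of the table.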

Looking at the latter half of the warnings, we can see that they come from fn match_class() in libregex_macros:

    fn match_class(&self, casei: bool, ranges: &[(char, char)]) -> P<ast::Expr> {
        let mut arms = ranges.iter().map(|&(mut start, mut end)| {
            if casei {
                start = start.to_uppercase();
                end = end.to_uppercase();
            }
            let pat = self.cx.pat(self.sp, ast::PatRange(quote_expr!(self.cx, $start),
                                                         quote_expr!(self.cx, $end)));
            self.cx.arm(self.sp, vec!(pat), quote_expr!(self.cx, true))
        }).collect::<Vec<ast::Arm>>();

So the problem is probably that the built-in quote_expr! macro outputs old-style Unicode literals (or some internal data structure indistinguishable from them?).
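
To make the suspected difference concrete, the following sketch contrasts the two ways a char could be rendered as a literal. Both helper functions are hypothetical and written in current Rust; they only illustrate the two escape formats and do not reflect how quote_expr! is actually implemented.

```rust
/// Renders a char literal in the deprecated style, e.g. '\u0660'.
/// Hypothetical helper; only meaningful for code points up to U+FFFF.
fn old_style_literal(c: char) -> String {
    format!("'\\u{:04x}'", c as u32)
}

/// Renders a char literal in the brace style, e.g. '\u{660}'.
/// Hypothetical helper for illustration.
fn new_style_literal(c: char) -> String {
    format!("'\\u{{{:x}}}'", c as u32)
}

/// Same brace style, built on the standard library's `char::escape_unicode`.
fn std_style_literal(c: char) -> String {
    let escaped: String = c.escape_unicode().collect();
    format!("'{}'", escaped)
}
```

If the literal printer behind the quote expansion behaves like `old_style_literal`, every range boundary would trigger the deprecation warning; switching to the brace form would presumably silence it.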
