
span information is mistaken in the presence of combining characters #3260

Closed
@msullivan


rustc reckons spans in terms of Unicode code points, which causes problems when faced with combining characters. When the following code is compiled:

fn main() {
    let s = ~"ZͨA͑ͦ͒͋ͤ͑̚L̄͑͋Ĝͨͥ̿͒̽̈́Oͥ͛ͭ!̏"; while true { break; }
    io::println(s);
    log(error, s);
}

rustc will produce the following screwed up diagnostic:

./nubs/utf8-combining.rs:2:45: 2:66 warning: denote infinite loops with loop { ... }
./nubs/utf8-combining.rs:2     let s = ~"ZͨA͑ͦ͒͋ͤ͑̚L̄͑͋Ĝͨͥ̿͒̽̈́Oͥ͛ͭ!̏"; while true { break; }
                                                                        ^~~~~~~~~~~~~~~~~~~~~

emacs tells me that the while loop starts in column 23, not the column 45 that rustc reports. (Interestingly, emacs seems to split up the combining characters when it displays them, presumably for easier editing, but doesn't advance the column numbers.)
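To make the mismatch concrete, here is a minimal sketch in present-day Rust (not the 2012 dialect used above) that computes a column two ways: by counting code points, and by counting only code points that are not combining marks. The function names are hypothetical. The combining-mark check is a deliberate simplification that covers only the Combining Diacritical Marks block (U+0300 to U+036F); a real fix would more likely use grapheme clusters or a display-width table. This is not a description of how rustc computes or later fixed its spans.

```rust
/// Rough, assumed check: true only for the Combining Diacritical Marks
/// block (U+0300..=U+036F). Other combining ranges are ignored here.
fn is_combining_mark(c: char) -> bool {
    matches!(c, '\u{0300}'..='\u{036F}')
}

/// 0-based column obtained by counting every code point before `byte_offset`.
/// `byte_offset` must fall on a char boundary, otherwise slicing panics.
fn code_point_column(line: &str, byte_offset: usize) -> usize {
    line[..byte_offset].chars().count()
}

/// 0-based column obtained by skipping combining marks, which approximates
/// what an editor shows when marks are drawn on top of their base character.
fn display_column(line: &str, byte_offset: usize) -> usize {
    line[..byte_offset]
        .chars()
        .filter(|c| !is_combining_mark(*c))
        .count()
}

fn main() {
    // 'a' + one mark, 'b' + two marks, a space, then the token of interest.
    let line = "a\u{0301}b\u{0308}\u{0304} x";
    let offset = line.find('x').expect("token not found");

    println!("byte offset:       {}", offset);
    println!("code point column: {}", code_point_column(line, offset));
    println!("display column:    {}", display_column(line, offset));
}
```

On this small input the two columns differ by exactly the number of combining marks that precede the token, which is the same kind of gap as the one between the column rustc reports and the column emacs shows.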


Labels: A-Unicode (Area: Unicode), A-diagnostics (Area: Messages for errors, warnings, and lints)
