bytes!() should encode to ASCII instead of UTF-8 #13955

Closed
@SimonSapin

Update: The original title was 'bytes!() should encode to "Latin-1" instead of UTF-8', but Latin-1 turned out not to be a good idea. See the discussion below.


Currently, the bytes!() macro encodes character and string arguments as UTF-8. This gives surprising results, such as bytes!("\xFF") == [0xC3, 0xBF] instead of [0xFF].
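The two-byte result can be reproduced with a small sketch. It does not use bytes!() itself; it assumes a present-day Rust toolchain and relies only on the standard char::encode_utf8. The helper name utf8_bytes is hypothetical and exists only for illustration.

```rust
/// Hypothetical helper (not part of any library): returns the UTF-8
/// encoding of a single code point as a vector of bytes.
fn utf8_bytes(c: char) -> Vec<u8> {
    // A code point needs at most 4 bytes in UTF-8.
    let mut buf = [0u8; 4];
    c.encode_utf8(&mut buf).as_bytes().to_vec()
}
```

Calling utf8_bytes('\u{FF}') yields the two bytes 0xC3, 0xBF, while an ASCII character such as 'A' yields the single byte 0x41. Only code points up to U+007F map to a byte with the same numerical value under UTF-8.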

This macro should not assume UTF-8, since its typical use case is working with bytes in an encoding that is ASCII-compatible but is not necessarily UTF-8.

Instead, it should map code points in the U+0000 .. U+00FF range to bytes with the same numerical value, and trigger a compile-time error for other code points. (This encoding is sometimes known as "Latin-1", although the official definition in ISO/IEC 8859-1:1998 leaves some bytes unmapped.)
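The proposed mapping could look roughly like the sketch below. This is an ordinary runtime function, not the macro, so it can only return an error where the proposal asks for a compile-time one. The name encode_low_code_points and the choice of returning the offending character as the error are assumptions made for illustration.

```rust
/// Hypothetical helper sketching the proposed mapping: every code point in
/// U+0000..=U+00FF becomes the byte with the same numerical value.
/// Any other code point is rejected and returned as the error value.
fn encode_low_code_points(s: &str) -> Result<Vec<u8>, char> {
    s.chars()
        .map(|c| {
            let n = c as u32;
            if n <= 0xFF {
                Ok(n as u8) // lossless: n fits in one byte
            } else {
                Err(c) // outside the range: report the offending character
            }
        })
        .collect() // stops at the first Err
}
```

With this sketch, encode_low_code_points("\u{FF}") gives Ok(vec![0xFF]), and a string containing U+0100 gives Err('\u{100}'). Lowering the bound from 0xFF to 0x7F would give the ASCII-only variant named in the current title.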
