Description
Update: The original title was bytes!() should encode to "Latin-1" instead of UTF-8, but Latin-1 turned out not to be such a good idea. See the discussion below.
Currently, the bytes!() macro encodes character and string arguments as UTF-8. This gives surprising behavior such as bytes!("\xFF") == [0xC3, 0xBF] instead of [0xFF].
This macro should not assume UTF-8, since its typical use case is working with bytes in an encoding that is ASCII-compatible but is not necessarily UTF-8.
Instead, it should map code points in the U+0000 .. U+00FF range to bytes with the same numerical value, and trigger a compile-time error for other code points. (This encoding is sometimes known as "Latin-1", although the official definition of ISO/IEC 8859-1:1998 leaves some bytes unmapped.)
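To make the proposed mapping concrete, here is a minimal sketch of it as an ordinary function. This is only an illustration: latin1_bytes is a hypothetical name, not an existing API, and it reports out-of-range code points at run time, whereas the proposal is for the macro to reject them at compile time.

```rust
/// Hypothetical helper (not part of any existing API): maps each code point
/// in U+0000..=U+00FF to the byte with the same numerical value.
/// On failure, returns the first character outside that range.
fn latin1_bytes(s: &str) -> Result<Vec<u8>, char> {
    s.chars()
        .map(|c| {
            let code_point = c as u32;
            if code_point <= 0xFF {
                // The code point fits in one byte, so the cast is lossless.
                Ok(code_point as u8)
            } else {
                // The macro would emit a compile-time error here instead.
                Err(c)
            }
        })
        .collect()
}
```

With this sketch, latin1_bytes("\u{FF}") gives Ok(vec![0xFF]), while the UTF-8 encoding of the same string, "\u{FF}".as_bytes(), is [0xC3, 0xBF]. A string containing U+0100 or above yields Err with the offending character.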