Closed
Description
The regex engine does not correctly handle characters (graphemes) that consist of multiple code points.
For example, the letter 'ä' has two representations that should both be matched by the regex '.'; however, only the precomposed single-code-point form (U+00E4) is.
Bash | Rust | Codepoints
echo $'\x61\xcc\x88' | "a\u{308}" | U+0061 U+0308
echo $'\xc3\xa4' | "\u{e4}" | U+00e4
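
The two representations in the table can be checked with the standard library alone. This is a minimal sketch, not code from the regex engine: the helper `code_points` is a hypothetical name introduced here, and the sketch only shows that the two strings differ in their number of code points, which is presumably why a code-point-based '.' treats them differently.

```rust
/// Hypothetical helper: formats each Unicode scalar value of `s` as "U+XXXX".
fn code_points(s: &str) -> Vec<String> {
    s.chars().map(|c| format!("U+{:04X}", c as u32)).collect()
}

fn main() {
    // Decomposed form: 'a' followed by COMBINING DIAERESIS.
    let decomposed = "a\u{308}";
    // Precomposed form: LATIN SMALL LETTER A WITH DIAERESIS.
    let precomposed = "\u{e4}";

    // The UTF-8 bytes correspond to the Bash column of the table.
    assert_eq!(decomposed.as_bytes(), b"\x61\xcc\x88");
    assert_eq!(precomposed.as_bytes(), b"\xc3\xa4");

    // Both render as one grapheme, but they differ in code point count.
    assert_eq!(decomposed.chars().count(), 2);
    assert_eq!(precomposed.chars().count(), 1);

    println!("{:?}", code_points(decomposed));
    println!("{:?}", code_points(precomposed));
}
```

Since the two strings are not equal byte for byte, an engine that matches one code point at a time would need normalization or grapheme-aware matching to treat them the same.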