Chinese numerals are not recognized by char::is_numeric #84056

Closed as not planned
@wooster0

Description

I tried this code:

fn main() {
    assert!('一'.is_numeric());
}

I expected it to evaluate to true.

Instead, it evaluated to false.

I would expect at least 零/〇、一、二、三、四、五、六、七、八、九 (0-9) to be recognized. In other numeral systems, such as Arabic numerals, any number above 9 needs more than one character, so it can't fit into a single char and can't be recognized. Chinese numerals, however, have many numbers beyond 0-9 that are written with a single character, for example 十 (10), which still fits in a char. I'm not sure whether these should be recognized, but perhaps they should.
There are also financial numerals and many others; see https://en.wikipedia.org/wiki/Chinese_numerals#Standard_numbers for a comprehensive list.

I've been told that the numerals are covered in the UnicodeData.txt file mentioned in the docs of char::is_numeric, but that they are listed in the Lo category, which stands for Other Letter, so Rust doesn't consider them numeric. That doesn't make sense to me, because they are clearly numerals and not letters. Rust should probably either recognize (some parts of) this category as numerals, or the numerals should be added manually.
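
To illustrate the "added manually" option, here is a minimal sketch. It assumes two hypothetical helpers, `cjk_digit_value` and `is_numeric_with_cjk`, neither of which exists in the standard library, and it covers only the basic numerals 0-9 listed above, not 十 or the financial numerals.

```rust
/// Hypothetical helper (not part of std): maps the basic Chinese numerals
/// 零/〇 through 九 to their values 0-9. Any other character yields None.
fn cjk_digit_value(c: char) -> Option<u32> {
    match c {
        '零' | '〇' => Some(0),
        '一' => Some(1),
        '二' => Some(2),
        '三' => Some(3),
        '四' => Some(4),
        '五' => Some(5),
        '六' => Some(6),
        '七' => Some(7),
        '八' => Some(8),
        '九' => Some(9),
        _ => None,
    }
}

/// Hypothetical wrapper (not part of std): accepts everything that
/// char::is_numeric accepts, plus the basic Chinese numerals above.
fn is_numeric_with_cjk(c: char) -> bool {
    c.is_numeric() || cjk_digit_value(c).is_some()
}

fn main() {
    // The behavior reported in this issue: the standard method rejects '一'.
    assert!(!'一'.is_numeric());

    // The manual lookup recognizes it, and ordinary digits still work.
    assert!(is_numeric_with_cjk('一'));
    assert!(is_numeric_with_cjk('7'));
    assert!(!is_numeric_with_cjk('K'));

    // Only 0-9 are covered in this sketch; 十 (10) is deliberately left out.
    assert_eq!(cjk_digit_value('九'), Some(9));
    assert_eq!(cjk_digit_value('十'), None);
}
```

A lookup like this avoids treating the whole Lo category as numeric, at the cost of maintaining the list by hand.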

Adding support for this would in turn also mean support for numerals of other East Asian languages, like Japanese and Hokkien.

Meta

This happens on the stable 1.51.0 channel and all others.

Metadata

Labels

A-Unicode (Area: Unicode)
C-bug (Category: This is a bug.)
T-libs (Relevant to the library team, which will review and decide on the PR/issue.)
T-libs-api (Relevant to the library API team, which will review and decide on the PR/issue.)
