Closed
Description
For Unicode Character Categories see http://www.fileformat.info/info/unicode/category/index.htm
Haskell implements the type "GeneralCategory" and a function, "generalCategory", that returns the "GeneralCategory" of a character.
Their implementation goes like this:
- the script https://github.com/ghc/packages-base/blob/master/cbits/ubconfc
- takes the Unicode Character Database (UCD) http://www.unicode.org/Public/6.0.0/ucd/UnicodeData.txt
- and generates https://github.com/ghc/packages-base/blob/master/cbits/WCsubst.c
I propose writing a Python script that does something similar.
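As a rough illustration of what such a script might do, here is a minimal sketch, not the actual ubconfc logic. It assumes the semicolon-separated UnicodeData.txt layout (field 0 is the code point in hex, field 2 is the general category) and the "<..., First>" / "<..., Last>" convention for large ranges. The names `parse_ranges` and `general_category` are hypothetical, and the sample lines stand in for the full file so the example runs without a download.

```python
import bisect

# A few lines in the UnicodeData.txt format, used here instead of the full file.
SAMPLE = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
0042;LATIN CAPITAL LETTER B;Lu;0;L;;;;;N;;;;0062;
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;
4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;
"""


def parse_ranges(text):
    """Return sorted (first, last, category) tuples, merging adjacent runs."""
    ranges = []
    pending_first = None  # start of an open "<..., First>" range
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        code, name, category = int(fields[0], 16), fields[1], fields[2]
        if name.endswith(", First>"):
            pending_first = code
            continue
        first = code
        if name.endswith(", Last>") and pending_first is not None:
            first = pending_first
            pending_first = None
        # Extend the previous range when it is contiguous and has the same category.
        if ranges and ranges[-1][2] == category and ranges[-1][1] + 1 == first:
            ranges[-1] = (ranges[-1][0], code, category)
        else:
            ranges.append((first, code, category))
    return ranges


def general_category(ranges, code_point):
    """Binary-search the range table; code points not listed are reported as 'Cn'."""
    starts = [r[0] for r in ranges]
    i = bisect.bisect_right(starts, code_point) - 1
    if i >= 0 and ranges[i][0] <= code_point <= ranges[i][1]:
        return ranges[i][2]
    return "Cn"


RANGES = parse_ranges(SAMPLE)
```

A real generator would then print `RANGES` as a static table in the target language; the compact range form keeps that table small, and the lookup stays a binary search.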
Having such a type and function in Rust would let us implement the functions in the "char" module correctly. For how Haskell does this, see http://haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/src/Data-Char.html
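To make the idea concrete, here is a small sketch of character predicates defined purely in terms of the general category, in the spirit of Data.Char. It is written in Python to match the proposed script, and it uses the standard library's `unicodedata.category` as a stand-in for the generated table. The function names are hypothetical and are not the Rust "char" module API.

```python
import unicodedata

# Two-letter general category abbreviations, grouped by major class.
LETTER = {"Lu", "Ll", "Lt", "Lm", "Lo"}
MARK = {"Mn", "Mc", "Me"}
PUNCTUATION = {"Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po"}
SYMBOL = {"Sm", "Sc", "Sk", "So"}
SEPARATOR = {"Zs", "Zl", "Zp"}


def is_letter(ch):
    return unicodedata.category(ch) in LETTER


def is_mark(ch):
    return unicodedata.category(ch) in MARK


def is_punctuation(ch):
    return unicodedata.category(ch) in PUNCTUATION


def is_symbol(ch):
    return unicodedata.category(ch) in SYMBOL


def is_separator(ch):
    return unicodedata.category(ch) in SEPARATOR
```

Once a category lookup exists, each predicate is a single set-membership test, so correctness for non-ASCII input follows from the table rather than from hand-written ranges.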