
Tracking Issue for aligning Unicode version to 15.0 (year 2022) #101840

Closed
@crlf0710

Unicode is released on a yearly basis, so the Rust project updates the data files it uses after each Unicode release. (Keep in mind that new dependencies might be added over time.)
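
As a related illustration, the Unicode version behind the standard library's own character tables can be read from `char::UNICODE_VERSION`. The following is a minimal sketch, not part of the tracking work itself; `GOAL` and `meets_goal` are hypothetical names introduced here, and the value printed depends on the toolchain used.

```rust
/// Hypothetical goal for this tracking issue: Unicode 15.0.0.
const GOAL: (u8, u8, u8) = (15, 0, 0);

/// Returns true when `version` is at least `goal`.
/// Tuples compare lexicographically, which matches major.minor.update ordering.
fn meets_goal(version: (u8, u8, u8), goal: (u8, u8, u8)) -> bool {
    version >= goal
}

fn main() {
    // `char::UNICODE_VERSION` is the Unicode version of the standard library's tables.
    let (major, minor, update) = char::UNICODE_VERSION;
    println!("std tables: Unicode {major}.{minor}.{update}");
    println!("meets goal: {}", meets_goal(char::UNICODE_VERSION, GOAL));
}
```

Several of the crates listed under Steps are understood to export a similar `UNICODE_VERSION` constant, but that is an assumption to verify against each crate's documentation.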

About tracking issues

Tracking issues are used to record the overall progress of implementation.
They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is, however, not meant for large-scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

Goal: Unicode 15.0 (year 2022)

Unicode version dependent crates:

Libraries
Compiler
Language integrated:
  • unicode-xid (Decide whether it's a valid identifier)
    Current: 0.2.2 (Unicode 13)
    Goal: 0.2.4 (Unicode 15)
  • unicode-normalization (Preprocess identifiers for equality)
    Current: 0.1.13 (Unicode 9)
    Goal: 0.1.22 (Unicode 15)
  • unicode-security (Decide whether lints against unwanted usages should be triggered)
    Current: 0.0.5 (Unicode 13)
    Goal: 0.1.0 (Unicode 15)
  • unicode-script (Used by unicode-security for script detection)
    Current: 0.5.3 (Unicode 13)
    Goal: 0.5.5 (Unicode 15)
Diagnostics:
  • unicode-width (used by rustc-parse, rustc-errors and many others)
    Current: 0.1.8 (Unicode 13)
    Goal: 0.1.10 (Unicode 15)
  • unicode-properties (used by rustc-lexer)
    Current: 0.1.0 (Unicode 15)
    Goal: 0.1.0 (Unicode 15)
  • Removed: unic-char-property (used by unic-emoji-char, then rustc-lexer)
    Current: 0.9.0 (Unclear, No release in 2 years)
    Goal: Need a new release (Will be replaced by unicode-properties in Update lexer emoji diagnostics to Unicode 15.0 #114193)
  • Removed: unic-char-range (used by unic-emoji-char, then rustc-lexer)
    Current: 0.9.0 (Unclear, No release in 2 years)
    Goal: Need a new release (Will be replaced by unicode-properties in Update lexer emoji diagnostics to Unicode 15.0 #114193)
  • Removed: unic-common (used by unic-emoji-char, then rustc-lexer)
    Current: 0.9.0 (Unclear, No release in 2 years)
    Goal: Need a new release (Will be replaced by unicode-properties in Update lexer emoji diagnostics to Unicode 15.0 #114193)
  • Removed: unic-ucd-version (used by unic-emoji-char, then rustc-lexer)
    Current: 0.9.0 (Unclear, No release in 2 years)
    Goal: Need a new release (Will be replaced by unicode-properties in Update lexer emoji diagnostics to Unicode 15.0 #114193)
  • Removed: unic-emoji-char (used by rustc-lexer)
    Current: 0.9.0 (Unclear, No release in 2 years)
    Goal: Need a new release (Will be replaced by unicode-properties in Update lexer emoji diagnostics to Unicode 15.0 #114193)
Dev-Tools:
Dependency crates:
  • unicode-bidi (used by idna then url then [ammonia, cargo, cargo-test-support, clippy_lints, crates-io, git2, git2-curl, rustc-workspace-hack])
    Previously: 0.3.4 (Unicode 10)
    Goal: >=0.3.10 (Unicode 15)
  • unicode-segmentation (used by rustfmt)
    Previously: 1.9.0 (Unicode 14)
    Goal: >=1.10.0 (Unicode 15)
  • unicode-properties (used by rustfmt)
    Mentioned above in compiler diagnostics section
  • Removed: unicode_categories (used by rustfmt)
    Current: 0.1.1 (Unclear, No release in 6 years)
    Goal: Need a new release (Will be replaced by unicode-properties in Update Unicode data to 15.0 rustfmt#5864)
  • unicase (used by pulldown-cmark, then [rustdoc, clippy-lints, mdbook])
    Current: 2.6.0 (Unclear, No release in 3 years)
    Goal: >=2.7.0 (Unicode 15)
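
The Current/Goal pairs above can be checked mechanically. The sketch below is an illustration only: `parse_version` and `reaches_goal` are hypothetical helpers, the rows are copied from a few entries in the list, and the comparison is a plain numeric `major.minor.patch` ordering rather than Cargo's full version-requirement semantics.

```rust
/// Parses a plain `major.minor.patch` version string (no pre-release tags).
fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
    let mut parts = s.trim().split('.').map(|p| p.parse::<u64>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Returns Some(true) when `current` is at least `goal`.
/// A leading `>=` in the goal (as used in the list) is accepted and ignored.
fn reaches_goal(current: &str, goal: &str) -> Option<bool> {
    let goal = goal.trim().trim_start_matches(">=");
    Some(parse_version(current)? >= parse_version(goal)?)
}

fn main() {
    // (crate, current, goal) rows copied from the list above.
    let rows = [
        ("unicode-xid", "0.2.2", "0.2.4"),
        ("unicode-normalization", "0.1.13", "0.1.22"),
        ("unicode-width", "0.1.8", "0.1.10"),
        ("unicode-properties", "0.1.0", "0.1.0"),
        ("unicase", "2.6.0", ">=2.7.0"),
    ];
    for (name, current, goal) in rows {
        match reaches_goal(current, goal) {
            Some(true) => println!("{name}: {current} reaches goal {goal}"),
            Some(false) => println!("{name}: {current} is behind goal {goal}"),
            None => println!("{name}: could not parse versions"),
        }
    }
}
```

Comparing parsed numbers rather than strings matters here: as text, "0.1.8" sorts after "0.1.10", which would wrongly mark unicode-width as up to date.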

Unicode version independent crates (ignorable for now, just for future reference):

  • unicode-bdd (in-tree maintenance tool): Unicode version independent
  • ucd-parse: Unicode version independent (used by unicode-bdd tool)
  • ucd-trie: Unicode version independent (used by handlebars, then mdbook)
  • unic-langid, unic-langid-impl, unic-langid-macros, unic-langid-macros-impl: Not really Unicode version independent, but we only use the Unicode version independent part. They are outdated (current: 0.9.1, based on CLDR 37, Spring 2020, ~= Unicode 13); it would be nice to move to a newer release.

Labels: A-Unicode (Area: Unicode), C-tracking-issue (Category: An issue tracking the progress of sth. like the implementation of an RFC)
