Skip to content

Update Unicode tables to 14.0.0 #7502

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

colinodell
Copy link
Contributor

These changes were created by:

  1. Checking out the latest master of this repository
  2. Downloading and extracting https://www.unicode.org/Public/zipped/14.0.0/UCD.zip
  3. Running the included ./ucgendat.php script

All tests seem to pass locally, but I'd appreciate a second set of eyes to ensure everything looks good.

Copy link
Member

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to merge this into 8.1.

@nikic nikic closed this in fe36b81 Sep 20, 2021
Ayesh added a commit to Ayesh/php-src that referenced this pull request Jun 26, 2024
Updates UCD to Unicode 15.1 (released 2023 Sept). The upcoming
Unicode 16 version will be released roughly on 2024 Sept.

Previously: 0fdffc1, php#7502

UCD 15.1 `DerivedNormalizationProps` contains multiple properties in
the same line, which breaks the parser. This also updates the
`ucgendat.php` script to allow 2 or three fields in each line, and to
look for the `Cased` and `Case_Ignorable` properties in either of the
fields to mimic the previous behavior.
Ayesh added a commit to Ayesh/php-src that referenced this pull request Jun 26, 2024
Updates UCD to Unicode 15.1 (released 2023 Sept). The upcoming
Unicode 16 version will be released roughly on 2024 Sept.

Previously: 0fdffc1, php#7502

UCD 15.1 `DerivedNormalizationProps` contains multiple properties in
the same line, which breaks the parser. This also updates the
`ucgendat.php` script to allow 2 or three fields in each line, and to
look for the `Cased` and `Case_Ignorable` properties in either of the
fields to mimic the previous behavior.
Ayesh added a commit to Ayesh/php-src that referenced this pull request Jun 27, 2024
Updates UCD to Unicode 15.1 (released 2023 Sept). The upcoming
Unicode 16 version will be released roughly on 2024 Sept.

Previously: 0fdffc1, php#7502

UCD 15.1 `DerivedNormalizationProps` contains multiple properties in
the same line, which breaks the parser. This also updates the
`ucgendat.php` script to allow 2 or three fields in each line, and to
look for the `Cased` and `Case_Ignorable` properties in either of the
fields to mimic the previous behavior.
Ayesh added a commit to Ayesh/php-src that referenced this pull request Jun 29, 2024
Updates UCD to Unicode 15.1 (released 2023 Sept). The upcoming
Unicode 16 version will be released roughly on 2024 Sept.

Previously: 0fdffc1, php#7502

UCD 15.1 `DerivedNormalizationProps` contains multiple properties in
the same line, which breaks the parser. This also updates the
`ucgendat.php` script to allow 2 or three fields in each line, and to
look for the `Cased` and `Case_Ignorable` properties in either of the
fields to mimic the previous behavior.
alexdowad pushed a commit that referenced this pull request Jun 29, 2024
Updates UCD to Unicode 15.1 (released 2023 Sept). The upcoming
Unicode 16 version will be released roughly on 2024 Sept.

Previously: 0fdffc1, #7502

UCD 15.1 `DerivedNormalizationProps` contains multiple properties in
the same line, which breaks the parser. This also updates the
`ucgendat.php` script to allow 2 or three fields in each line, and to
look for the `Cased` and `Case_Ignorable` properties in either of the
fields to mimic the previous behavior.
Ayesh added a commit to Ayesh/php-src that referenced this pull request Sep 15, 2024
Updates UCD to Unicode 16.0 (released 2024 Sept).

Previously: 0fdffc1, php#7502, php#14680

Unicode 16 adds several new character sets and case folding rules.
However, the existing ucgendat script can still parse them.

This also adds a couple test cases to make sure the new rules for
East Asian Wide characters and case folding work correctly. These
tests fail on Unicode 15.1 and older because those verisons do not
contain those rules.
Ayesh added a commit to Ayesh/php-src that referenced this pull request Sep 16, 2024
Updates UCD to Unicode 16.0 (released 2024 Sept).

Previously: 0fdffc1, php#7502, php#14680

Unicode 16 adds several new character sets and case folding rules.
However, the existing ucgendat script can still parse them.

This also adds a couple test cases to make sure the new rules for
East Asian Wide characters and case folding work correctly. These
tests fail on Unicode 15.1 and older because those verisons do not
contain those rules.
alexdowad pushed a commit that referenced this pull request Sep 17, 2024
Updates UCD to Unicode 16.0 (released 2024 Sept).

Previously: 0fdffc1, #7502, #14680

Unicode 16 adds several new character sets and case folding rules.
However, the existing ucgendat script can still parse them.

This also adds a couple test cases to make sure the new rules for
East Asian Wide characters and case folding work correctly. These
tests fail on Unicode 15.1 and older because those verisons do not
contain those rules.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants