Description
Preconditions and environment
- Magento version 2.4-develop
- related user documentation:
Steps to reproduce
- set up a new product whose name includes the characters ſ (long s), þ (thorn) and ð (eth)
- save to get a slugified URL key generated for SEO
Expected result
In the generated URL key,
- ſ becomes s
- þ becomes th
- ð becomes d, although dh and even th would also be acceptable
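The expected behaviour can be illustrated with a minimal sketch. It is written in Python for brevity and is not Magento's PHP code: the `EXPECTED` table and the `slugify` helper are hypothetical names, and the table covers only the three characters from this report.

```python
import re

# Hypothetical mapping for the three characters in this report only.
# The values follow the expected result, not Magento's actual table.
EXPECTED = {
    "ſ": "s",   # long s
    "þ": "th",  # thorn
    "ð": "d",   # eth ("dh" or "th" would also be acceptable)
}


def slugify(name: str) -> str:
    """Lowercase, transliterate known characters, join the rest with '-'."""
    lowered = name.lower()
    ascii_text = "".join(EXPECTED.get(ch, ch) for ch in lowered)
    # Any run of characters outside a-z and 0-9 collapses into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text).strip("-")


print(slugify("Congreſs þing ðat"))
```

A complete fix would not hard-code such a table; it would take the mappings from a maintained source such as the Unicode data.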
Actual result
- ſ becomes z
- þ becomes p
- ð is removed
Additional information
These are just some mistakes I easily spotted by looking at the file. I’m pretty sure there are also errors (or questionable choices) in the romanisation of Cyrillic, Greek, Hebrew and Devanagari. The selection of fewer than 500 characters to be transliterated seems arbitrary, which is why people have created modules to properly support languages like Romanian and Vietnamese.
Just for Japanese, magento2-jp already introduces the use of PHP’s Transliterator, which is the right tool for the job. Its data comes from ICU, which in turn uses CLDR data, both maintained by Unicode, i.e. it is as reliable as it gets (and will still be improved in the future).
If Transliterator is not to be used for some reason, Magento should at least use the Unicode data for Latin-ASCII and …-Latn.
PS: Ideally, Magento would support setting a language for a store view, which would then be respected for cases like German umlauts (ä → ae) that deviate from the script default (a) – CLDR offers de-ASCII for that; also see #23292. Administrators should also be able to opt into UTF-8 percent encoding in all cases, but let’s keep this a bug report and not a feature request.
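To make the language-specific idea concrete, here is a rough Python sketch, not Magento or ICU code. `LANGUAGE_OVERRIDES` and `to_ascii` are hypothetical names, the override table is an assumption limited to German umlauts, and stripping combining marks stands in for the script default.

```python
import unicodedata
from typing import Optional

# Hypothetical per-language overrides (assumption: German umlauts only).
LANGUAGE_OVERRIDES = {
    "de": {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"},
}


def to_ascii(text: str, language: Optional[str] = None) -> str:
    """Apply language overrides first, then fall back to the script default."""
    overrides = LANGUAGE_OVERRIDES.get(language, {})
    out = []
    # NFC ensures that "a" + combining diaeresis is seen as the single "ä".
    for ch in unicodedata.normalize("NFC", text):
        if ch in overrides:
            out.append(overrides[ch])
            continue
        # Script default: decompose and drop combining marks (ä -> a).
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


print(to_ascii("Käse"))        # script default
print(to_ascii("Käse", "de"))  # German store view
```

With the language set to `"de"` the umlaut expands to two letters; without it the diacritic is simply dropped.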
PPS: This won’t cover stuff like ½″, which would ideally become half-inch but at best will be 1-2, or 0.5 cm, which would better become 5mm than 0-5-cm.
Release note
No response
Triage and priority
- Severity: S0 - Affects critical data or functionality and leaves users without workaround.
- Severity: S1 - Affects critical data or functionality and forces users to employ a workaround.
- Severity: S2 - Affects non-critical data or functionality and forces users to employ a workaround.
- Severity: S3 - Affects non-critical data or functionality and does not force users to employ a workaround.
- Severity: S4 - Affects aesthetics, professional look and feel, “quality” or “usability”.