
[Proposal] Use ICU message for i18n & l10n #23863

Open
@wxiaoguang


To avoid reinventing the wheel, it would be better to use the ICU message format for i18n/l10n.

Steps:

  1. Fix the buggy ini package
  2. Clean up all translation strings
  3. Introduce an ICU message parser
  4. Convert legacy plural-related strings to ICU format
  5. Translate on Crowdin, which supports ICU message syntax: https://support.crowdin.com/icu-message-syntax/
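To make steps 3 and 4 more concrete, here is a minimal sketch of what a converted plural string and a tiny selector for it might look like. It is not a real ICU parser: it handles a single flat `plural` argument only (no nesting, no apostrophe escaping, no `offset:`), and the names `parsePluralCases` and `formatPlural` are hypothetical, not part of Gitea or any library. The CLDR category is passed in by the caller here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePluralCases splits one flat ICU plural argument, for example
// "{count, plural, one {# pull request} other {# pull requests}}",
// into its selector -> text pairs. Nested arguments, apostrophe escaping
// and "offset:" are intentionally not handled in this sketch.
func parsePluralCases(msg string) (map[string]string, error) {
	msg = strings.TrimSpace(msg)
	if len(msg) < 2 || msg[0] != '{' || msg[len(msg)-1] != '}' {
		return nil, fmt.Errorf("not an ICU argument: %q", msg)
	}
	parts := strings.SplitN(msg[1:len(msg)-1], ",", 3)
	if len(parts) != 3 || strings.TrimSpace(parts[1]) != "plural" {
		return nil, fmt.Errorf("not a plural argument: %q", msg)
	}
	cases := map[string]string{}
	rest := strings.TrimSpace(parts[2])
	for rest != "" {
		open := strings.Index(rest, "{")
		end := strings.Index(rest, "}")
		if open < 0 || end < open {
			return nil, fmt.Errorf("malformed case near %q", rest)
		}
		cases[strings.TrimSpace(rest[:open])] = rest[open+1 : end]
		rest = strings.TrimSpace(rest[end+1:])
	}
	if _, ok := cases["other"]; !ok {
		return nil, fmt.Errorf("plural argument needs an \"other\" case")
	}
	return cases, nil
}

// formatPlural renders msg for n. The CLDR category ("one", "few", ...) is
// supplied by the caller; an exact "=N" selector wins over the category, and
// "other" is the fallback, as in ICU.
func formatPlural(msg, category string, n int) (string, error) {
	cases, err := parsePluralCases(msg)
	if err != nil {
		return "", err
	}
	text, ok := cases["="+strconv.Itoa(n)]
	if !ok {
		text, ok = cases[category]
	}
	if !ok {
		text = cases["other"]
	}
	return strings.ReplaceAll(text, "#", strconv.Itoa(n)), nil
}

func main() {
	msg := "{count, plural, =0 {no pull requests} one {# pull request} other {# pull requests}}"
	for _, c := range []struct {
		category string
		n        int
	}{{"other", 0}, {"one", 1}, {"other", 5}} {
		out, err := formatPlural(msg, c.category, c.n)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

A real implementation would presumably rely on an existing ICU message parser and derive the category from CLDR data rather than taking it as a parameter.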

The description below is outdated. The old idea was to use a customized message format: a simple syntax similar to ICU message, but one that Crowdin does not support, so Crowdin cannot help check for mistakes.

The design of the official Go packages seems clear and should resolve Gitea's i18n/l10n problems fundamentally.

https://pkg.go.dev/golang.org/x/text/message

https://pkg.go.dev/golang.org/x/text/feature/plural

https://github.com/unicode-org/cldr/blob/main/common/supplemental/ordinals.xml

https://github.com/unicode-org/cldr/blob/main/common/supplemental/plurals.xml

I think a translator-friendly syntax is very important: there are already a lot of broken translations, and if we make the system more complex, there will be more errors.

The syntax should also be designed to work in the frontend (JS/Vue).

As a first step, we should refactor the locale package to make it stable (see the problems).

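A brief idea of how to maintain the translation strings. Each comment gives the number of plural forms and the CLDR categories they cover, followed by the corresponding syntax: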

<!-- 1: other -->  {%d $[text]}

<!-- 2: one,other --> {%d $[text,texts]}

<!-- 3: zero,one,other --> {%d $zero[0,1,o]}
<!-- 3: one,two,other --> {%d $two[1,2,o]}
<!-- 3: one,few,other --> {%d $few[1,f,o]}
<!-- 3: one,many,other --> {%d $many[1,m,o]}

<!-- 4: one,two,few,other --> {%d $two-few[1,2,f,o]}
<!-- 4: one,two,many,other --> {%d $two-many[1,2,m,o]}
<!-- 4: one,few,many,other --> {%d $few-many[1,f,m,o]}

<!-- 5: one,two,few,many,other --> {%d $[1,2,f,m,o]}

<!-- 6: zero,one,two,few,many,other --> {%d $[0,1,2,f,m,o]}
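The table above can be read as a lookup from (marker, number of forms) to an ordered list of CLDR categories. A minimal sketch of that lookup, assuming the marker is empty whenever the number of forms is unambiguous; `formTable` and `categoriesFor` are hypothetical names used only for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// formTable maps "marker/formCount" to the ordered CLDR categories that the
// forms inside "$marker[...]" stand for, mirroring the table in the proposal.
var formTable = map[string]string{
	"/1":         "other",
	"/2":         "one,other",
	"zero/3":     "zero,one,other",
	"two/3":      "one,two,other",
	"few/3":      "one,few,other",
	"many/3":     "one,many,other",
	"two-few/4":  "one,two,few,other",
	"two-many/4": "one,two,many,other",
	"few-many/4": "one,few,many,other",
	"/5":         "one,two,few,many,other",
	"/6":         "zero,one,two,few,many,other",
}

// categoriesFor returns the categories for a marker and a number of forms,
// or an error if the combination does not appear in the table.
func categoriesFor(marker string, forms int) ([]string, error) {
	cats, ok := formTable[fmt.Sprintf("%s/%d", marker, forms)]
	if !ok {
		return nil, fmt.Errorf("no plural layout for marker %q with %d forms", marker, forms)
	}
	return strings.Split(cats, ","), nil
}

func main() {
	for _, q := range []struct {
		marker string
		forms  int
	}{{"", 2}, {"zero", 3}, {"two-few", 4}, {"zero", 4}} {
		cats, err := categoriesFor(q.marker, q.forms)
		fmt.Println(q.marker, q.forms, cats, err)
	}
}
```

A lookup like this would also let a checker reject strings whose marker and number of forms do not match, which is one of the mistakes translators could otherwise make silently.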

Then use the syntax to support different languages:

en: msg = there are {%d $[pull request, pull requests]}
lv: msg = there are {%d $zero[for 0 pull request, pull request, pull requests]}
ar: msg = there are {%d $[for 0, for 1, for 2, few, many, other]}
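To show how the three lines above would behave at runtime, here is a rough sketch of form selection for those locales. The integer plural rules follow the CLDR cardinal rules for English, Latvian and Arabic as I understand them; fractions are ignored, and `pluralCategory`, `localeCategories` and `render` are hypothetical helpers rather than existing Gitea code.

```go
package main

import "fmt"

// pluralCategory returns the CLDR cardinal category for a non-negative
// integer n. Only the three locales from the examples are covered; any
// other locale falls through to "other".
func pluralCategory(lang string, n int) string {
	switch lang {
	case "en":
		if n == 1 {
			return "one"
		}
	case "lv":
		switch {
		case n%10 == 0 || (n%100 >= 11 && n%100 <= 19):
			return "zero"
		case n%10 == 1 && n%100 != 11:
			return "one"
		}
	case "ar":
		switch {
		case n == 0:
			return "zero"
		case n == 1:
			return "one"
		case n == 2:
			return "two"
		case n%100 >= 3 && n%100 <= 10:
			return "few"
		case n%100 >= 11:
			return "many"
		}
	}
	return "other"
}

// localeCategories lists, per locale, the order in which the forms are
// written inside the brackets.
var localeCategories = map[string][]string{
	"en": {"one", "other"},
	"lv": {"zero", "one", "other"},
	"ar": {"zero", "one", "two", "few", "many", "other"},
}

// render picks the form matching n and prefixes it with the number.
// forms must be non-empty; the last form is treated as "other".
func render(lang string, forms []string, n int) string {
	cat := pluralCategory(lang, n)
	for i, c := range localeCategories[lang] {
		if c == cat && i < len(forms) {
			return fmt.Sprintf("%d %s", n, forms[i])
		}
	}
	return fmt.Sprintf("%d %s", n, forms[len(forms)-1])
}

func main() {
	en := []string{"pull request", "pull requests"}
	lv := []string{"for 0 pull request", "pull request", "pull requests"}
	for _, n := range []int{0, 1, 2, 11, 21} {
		fmt.Println("en:", render("en", en, n), "| lv:", render("lv", lv, n))
	}
}
```

Note that Latvian uses its "zero" form for 10 and 11 through 19 as well as for 0, so category names should not be read as literal numbers.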

Another possible approach is to define all concepts ahead of time:

en: NumPR = {%d $[pull request, pull requests]}
lv: NumPR = {%d $zero[for 0 pull request, pull request, pull requests]}
ar: NumPR = {%d $[for 0, for 1, for 2, few, many, other]}

Then NumPR could be reused:

en: msg = there are {$NumPR}
lv: msg = there are {$NumPR}
ar: msg = there are {$NumPR}
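Reusing a concept amounts to substituting the `{$Name}` reference with the locale's definition before the plural handling runs. A minimal sketch, assuming concept names are plain ASCII identifiers and that unknown references are left untouched; `conceptRef` and `expandConcepts` are hypothetical names:

```go
package main

import (
	"fmt"
	"regexp"
)

// conceptRef matches references such as {$NumPR}. It does not match plural
// specs like {%d $[...]}, because the "$" must directly follow the "{".
var conceptRef = regexp.MustCompile(`\{\$([A-Za-z][A-Za-z0-9_]*)\}`)

// expandConcepts replaces every {$Name} in msg with the definition of Name
// from the locale's concept table. Unknown names are left as-is so that a
// checker can report them later.
func expandConcepts(msg string, concepts map[string]string) string {
	return conceptRef.ReplaceAllStringFunc(msg, func(ref string) string {
		name := ref[2 : len(ref)-1] // strip "{$" and "}"
		if def, ok := concepts[name]; ok {
			return def
		}
		return ref
	})
}

func main() {
	lv := map[string]string{
		"NumPR": "{%d $zero[for 0 pull request, pull request, pull requests]}",
	}
	fmt.Println(expandConcepts("there are {$NumPR}", lv))
	fmt.Println(expandConcepts("there are {$NumIssue}", lv))
}
```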

If we only need to support one %d, the syntax might be simplified, e.g.:

en: msg = there are %d $[pull request, pull requests]
lv: msg = there are %d $zero[for 0 pull request, pull request, pull requests]
ar: msg = there are %d $[for 0, for 1, for 2, few, many, other]
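The simplified form is easy to recognize with a single regular expression. The following sketch splits such a message into a `fmt`-style template, the marker, and the list of forms. It assumes at most one `%d` spec per message, no commas or `]` inside a form, and no other `%` in the text; `pluralSpec` and `parseSimplified` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pluralSpec matches "%d $marker[form, form, ...]"; the marker may be empty.
var pluralSpec = regexp.MustCompile(`%d \$([a-z-]*)\[([^\]]*)\]`)

// parseSimplified splits a message in the simplified syntax into a
// fmt-style template ("... %d %s ..."), the marker, and the plural forms.
// ok is false when the message contains no plural spec.
func parseSimplified(msg string) (template, marker string, forms []string, ok bool) {
	m := pluralSpec.FindStringSubmatch(msg)
	if m == nil {
		return msg, "", nil, false
	}
	for _, f := range strings.Split(m[2], ",") {
		forms = append(forms, strings.TrimSpace(f))
	}
	// Replace only the first spec, matching the "one %d" assumption.
	template = strings.Replace(msg, m[0], "%d %s", 1)
	return template, m[1], forms, true
}

func main() {
	tmpl, marker, forms, ok := parseSimplified(
		"there are %d $zero[for 0 pull request, pull request, pull requests]")
	fmt.Printf("%q %q %q %v\n", tmpl, marker, forms, ok)
	// Choosing the form is left to the caller; the last ("other") form is used here.
	fmt.Printf(tmpl+"\n", 5, forms[len(forms)-1])
}
```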

Labels: modifies/translation, type/feature, type/proposal