Internationalisation Deep Pattern
Beyond 'wrap every string in `t()`' — the substance of locale-correct UI: plural rules, gender, ICU formatting, RTL, dates, numbers, currency, sorting, search.
TL;DR (human)
Real intl is harder than key extraction. Each language has plural rules; some have gender; numbers / dates / currencies format differently by locale; right-to-left languages flip layout. Use ICU MessageFormat for messages; Intl.* APIs for formatting; CLDR data for everything locale-specific. Test in pseudo-locales + at least one RTL.
For agents
Beyond key extraction
The intl-pattern.md sibling doc covers the basic discipline: every string keyed, useT() everywhere. This doc covers what comes after.
ICU MessageFormat
Plain interpolation is insufficient for plurals and gender:
// ✗ wrong — doesn't pluralise; word order baked in
t("results", { count }) // "{count} result(s)"
// ✓ ICU MessageFormat
t("results", { count })
// "results": "{count, plural, =0 {No results} one {# result} other {# results}}"
ICU handles:
- plural: =0, one, two, few, many, other; which categories apply depends on locale rules (CLDR).
- select: branching on a value (gender, status).
- selectordinal: ordinal numbers (1st, 2nd, 3rd).
- number, date, time: format with locale rules.
Library: formatjs/intl-messageformat, messageformat, i18next with the icu plugin.
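The selectordinal case listed above can be approximated without a message library. The sketch below uses the built-in Intl.PluralRules with type "ordinal"; the suffix table and the formatOrdinal helper are illustrative assumptions for English only, and in a real catalog the suffixes would live in the translated message.

```typescript
// CLDR ordinal categories for English, resolved by the runtime.
const ordinalRules = new Intl.PluralRules("en-US", { type: "ordinal" });

// Hypothetical English-only suffix table, keyed by ordinal category.
const suffixes: Record<string, string> = {
  one: "st",
  two: "nd",
  few: "rd",
  other: "th",
};

// formatOrdinal is an illustrative helper, not part of any library.
function formatOrdinal(n: number): string {
  const category = ordinalRules.select(n);
  return `${n}${suffixes[category] ?? "th"}`;
}

console.log([1, 2, 3, 4, 11, 21, 22, 103].map(formatOrdinal).join(", "));
// 1st, 2nd, 3rd, 4th, 11th, 21st, 22nd, 103rd
```

An ICU selectordinal message does the same lookup, but keeps the per-category text with the translators instead of in code.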
Plural rules differ wildly:
- English: 2 forms (one / other).
- Russian: 4 forms (one / few / many / other).
- Arabic: 6 forms (zero / one / two / few / many / other).
- Japanese, Chinese: 1 form.
Hard-coding "one" / "other" breaks Russian. Use CLDR-derived rules.
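To see the difference between locales concretely, the runtime can be asked which category a number falls into. This is a minimal sketch using the built-in Intl.PluralRules; the filesMessages table and the formatFiles helper are hypothetical stand-ins for a real message catalog.

```typescript
// Which CLDR plural category does n fall into for this locale?
function pluralCategory(locale: string, n: number): string {
  return new Intl.PluralRules(locale).select(n);
}

// Hypothetical catalog: one complete string per category, per locale.
const filesMessages: Record<string, Record<string, string>> = {
  en: { one: "# file", other: "# files" },
  ru: { one: "# файл", few: "# файла", many: "# файлов", other: "# файла" },
};

// Illustrative helper: pick the form for n, falling back to "other".
function formatFiles(locale: string, n: number): string {
  const forms = filesMessages[locale] ?? {};
  const template = forms[pluralCategory(locale, n)] ?? forms["other"] ?? "#";
  return template.replace("#", String(n));
}

console.log([1, 2, 5, 21].map((n) => pluralCategory("ru", n)).join(" "));
// one few many one
console.log(formatFiles("ru", 5)); // 5 файлов
console.log(formatFiles("en", 5)); // 5 files
```

Note that 21 is "one" in Russian: a check like n === 1 gets this wrong, which is why the category has to come from CLDR data.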
Gender
Some languages mark gender:
"welcome": "{gender, select, female {Bienvenida} male {Bienvenido} other {Bienvenidos}}, {name}"
Gendered translations need:
- The user's gender (or "prefer not to say" → use neutral form).
- A neutral fallback for languages that don't have gendered forms.
Avoid generating sentences from glued fragments — gender + plural agreement requires the whole sentence at once.
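The whole-sentence rule can be sketched without any library. The welcome table and formatWelcome helper below are hypothetical; the neutral Spanish wording in the "other" branch is an illustrative choice, not a string taken from a real catalog.

```typescript
type Gender = "female" | "male" | "other";

// Hypothetical catalog entry: every branch is a complete sentence,
// so translators can change agreement and word order freely.
const welcome: Record<Gender, string> = {
  female: "Bienvenida, {name}",
  male: "Bienvenido, {name}",
  other: "Te damos la bienvenida, {name}",
};

// Unknown or undisclosed gender falls through to the neutral branch.
function formatWelcome(name: string, gender?: string): string {
  const branch: Gender =
    gender === "female" || gender === "male" ? gender : "other";
  return welcome[branch].replace("{name}", name);
}

console.log(formatWelcome("Ana", "female")); // Bienvenida, Ana
console.log(formatWelcome("Sam"));           // Te damos la bienvenida, Sam
```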
Number formatting
// Locale-aware decimal separator, thousands grouping
new Intl.NumberFormat("en-US").format(1234567.89); // "1,234,567.89"
new Intl.NumberFormat("de-DE").format(1234567.89); // "1.234.567,89"
new Intl.NumberFormat("hi-IN").format(1234567.89); // "12,34,567.89" (Indian numbering)
new Intl.NumberFormat("ar-EG").format(1234567.89); // "١٬٢٣٤٬٥٦٧٫٨٩" (Arabic digits)
// Percentages
new Intl.NumberFormat("en-US", { style: "percent" }).format(0.42); // "42%"
// Compact
new Intl.NumberFormat("en-US", { notation: "compact" }).format(12345); // "12K"
Currency
new Intl.NumberFormat("en-US", { style: "currency", currency: "USD" }).format(99.95); // "$99.95"
new Intl.NumberFormat("ja-JP", { style: "currency", currency: "JPY" }).format(99.95); // "￥100" (rounded; no decimals; ja-JP uses the full-width yen sign)
new Intl.NumberFormat("de-DE", { style: "currency", currency: "EUR" }).format(99.95); // "99,95 €"
The currency code (USD / JPY / EUR) is part of the data, not derived from locale. A user in Germany might view US dollars.
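One way to keep the two concerns apart is to pass locale and currency as separate arguments everywhere. The formatPrice and fractionDigits helpers below are hypothetical names; the sketch relies only on the built-in Intl.NumberFormat.

```typescript
// Locale controls presentation; the currency code travels with the price data.
function formatPrice(locale: string, amount: number, currency: string): string {
  return new Intl.NumberFormat(locale, { style: "currency", currency }).format(amount);
}

// Decimal places the runtime uses for a currency (0 for JPY, 2 for USD).
function fractionDigits(currency: string): number {
  const options = new Intl.NumberFormat("en", {
    style: "currency",
    currency,
  }).resolvedOptions();
  return options.maximumFractionDigits ?? 0;
}

// Same USD price, rendered for three different users.
for (const locale of ["en-US", "de-DE", "ja-JP"]) {
  console.log(locale, formatPrice(locale, 99.95, "USD"));
}
console.log(fractionDigits("JPY"), fractionDigits("USD")); // 0 2
```

The exact spacing and symbol placement come from the runtime's CLDR data, so snapshot tests should avoid hardcoding whitespace characters around the symbol.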
Date and time
new Intl.DateTimeFormat("en-US").format(new Date()); // "10/14/2024"
new Intl.DateTimeFormat("en-GB").format(new Date()); // "14/10/2024"
new Intl.DateTimeFormat("ja-JP").format(new Date()); // "2024/10/14"
new Intl.DateTimeFormat("ar-EG").format(new Date()); // arabic-indic digits
// Relative time
new Intl.RelativeTimeFormat("en-US").format(-1, "day"); // "1 day ago"
new Intl.RelativeTimeFormat("es-ES").format(-1, "day"); // "hace 1 día"
Timezone discipline:
- Server stores UTC (ISO-8601 with offset).
- Client renders in user's locale + timezone.
- For "5 days from now" calculations: use the user's timezone (a date in Tokyo is not the same date in LA).
Libraries: native Intl.* is usually enough; date-fns + date-fns-tz or Luxon for richer manipulation.
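The point about Tokyo and LA can be checked with native Intl alone. In this sketch, localDate is a hypothetical helper that reads the calendar date of a stored UTC instant in a given IANA time zone; the sample timestamp is made up for illustration.

```typescript
// Calendar date (YYYY-MM-DD) of a UTC instant as seen in a given IANA time zone.
function localDate(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(instant);
  const get = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";
  return `${get("year")}-${get("month")}-${get("day")}`;
}

// What the server stores: a single unambiguous UTC instant.
const stored = new Date("2024-10-14T23:30:00Z");

console.log(localDate(stored, "Asia/Tokyo"));          // 2024-10-15
console.log(localDate(stored, "America/Los_Angeles")); // 2024-10-14
```

Any "N days from now" arithmetic should start from the date computed this way for the user's zone, not from the UTC date.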
Right-to-left (RTL)
Arabic, Hebrew, Persian, Urdu read right-to-left.
CSS:
- dir="rtl" on \<html\> or per-region.
- Logical properties: margin-inline-start (not margin-left), padding-inline-end (not padding-right).
- Icons that imply direction (arrows, chevrons) mirror.
- Text alignment: text-align: start (not text-align: left).
Mixed-direction content (English text in Arabic UI): use \<bdi\> and dir="auto".
Layouts that look fine in LTR can be broken in RTL:
- Asymmetric padding.
- Custom dropdowns with hardcoded positioning.
- Carousels with directional swipe.
Test in at least one RTL locale before shipping.
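Setting dir requires knowing the direction of the active locale. A minimal sketch, assuming a hand-maintained list limited to the four languages named in this section (it is not a complete registry); directionFor is a hypothetical helper built on the standard Intl.Locale.

```typescript
// Assumption: only the languages named in this section; extend per shipped locales.
const RTL_LANGUAGES = new Set(["ar", "he", "fa", "ur"]);

// Returns the value to put in the dir attribute for a BCP 47 tag.
function directionFor(localeTag: string): "rtl" | "ltr" {
  const { language } = new Intl.Locale(localeTag);
  return RTL_LANGUAGES.has(language) ? "rtl" : "ltr";
}

console.log(directionFor("ar-EG")); // rtl
console.log(directionFor("en-US")); // ltr

// In a browser, this would typically feed the root element:
// document.documentElement.dir = directionFor(activeLocale);
```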
Locale identifiers (BCP 47)
| Format | Meaning |
|---|---|
| en | English (any region) |
| en-US | English, United States |
| en-GB | English, United Kingdom |
| pt-BR | Portuguese, Brazil |
| pt-PT | Portuguese, Portugal |
| zh-Hant | Traditional Chinese |
| zh-Hans | Simplified Chinese |
| ar-EG | Arabic, Egypt |
User locale → fallback chain: pt-BR → pt → default (en).
Implement: locale = user setting + browser hint + URL param, with explicit precedence.
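The fallback chain and the precedence rule can be sketched as two small functions. All names here are hypothetical, and the precedence order (URL param, then user setting, then browser hint) is an assumption chosen for illustration; the text above only requires that the order be explicit.

```typescript
// "pt-BR" -> ["pt-BR", "pt"]: drop trailing subtags one at a time.
function parentChain(locale: string): string[] {
  const subtags = locale.split("-");
  return subtags.map((_, i) => subtags.slice(0, subtags.length - i).join("-"));
}

// Full chain, ending in the default locale.
function fallbackChain(locale: string, defaultLocale = "en"): string[] {
  const chain = parentChain(locale);
  return chain.includes(defaultLocale) ? chain : [...chain, defaultLocale];
}

interface LocaleSources {
  urlParam?: string;
  userSetting?: string;
  browserHint?: string;
}

// Assumed precedence: URL param > user setting > browser hint > default.
function resolveLocale(
  sources: LocaleSources,
  supported: string[],
  defaultLocale = "en",
): string {
  const candidates = [sources.urlParam, sources.userSetting, sources.browserHint];
  for (const candidate of candidates) {
    if (!candidate) continue;
    const match = parentChain(candidate).find((tag) => supported.includes(tag));
    if (match) return match;
  }
  return defaultLocale;
}

console.log(fallbackChain("pt-BR").join(" → ")); // pt-BR → pt → en
console.log(resolveLocale({ userSetting: "pt-BR", browserHint: "de-DE" }, ["en", "pt", "de"])); // pt
```

A candidate that matches nothing in the supported list is skipped rather than forcing the default, so a lower-precedence source still gets a chance.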
Sort + search
Locale-aware string comparison:
"ä".localeCompare("z", "de"); // -1 (ä before z in German)
"ä".localeCompare("z", "sv"); // 1 (ä after z in Swedish)
Intl.Collator for batch sorting. Locale-aware sort matters for:
- User-facing lists (sort by name).
- Search match scoring.
- Autocomplete ranking.
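For lists, a single Intl.Collator instance is reused across comparisons. The sketch below shows both uses named above: sorting, and accent- and case-insensitive matching for search. The sortFor and looselyEqual helpers and the sample names are illustrative.

```typescript
const sampleNames = ["Zebra", "Äpfel", "Apfel"];

// Sort a copy of the list with the collation rules of the given locale.
function sortFor(locale: string, items: string[]): string[] {
  const collator = new Intl.Collator(locale);
  return [...items].sort(collator.compare);
}

// sensitivity "base" ignores case and accents, which suits search matching.
function looselyEqual(locale: string, a: string, b: string): boolean {
  return new Intl.Collator(locale, { sensitivity: "base" }).compare(a, b) === 0;
}

console.log(sortFor("de", sampleNames).join(", ")); // Apfel, Äpfel, Zebra
console.log(sortFor("sv", sampleNames).join(", ")); // Apfel, Zebra, Äpfel
console.log(looselyEqual("en", "resume", "Résumé")); // true
```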
Pluralisation of intl keys themselves
Avoid:
t("invite-button") // "Invite"
t("invite-buttons") // "Invites"
Two keys, two translations, two slots to drift.
Instead:
t("invite", { count }) // ICU plural handles it
One key, one translation, plurals correct in every locale.
Translation workflow
Three actors:
- Developer: adds keys to source locale (typically en).
- Translator: receives keys; produces target locales.
- Translation management (TMS): platform (Phrase, Crowdin, Lokalise) that syncs keys, manages translator work, returns completed translations.
CI checks:
- Every source-locale key exists in every shipped locale (or has documented fallback).
- No orphan keys (in target but not source).
- No untranslated keys (in source but not target, beyond fallback policy).
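The checks above reduce to set differences over catalog keys. A minimal sketch, assuming each locale is a flat key-to-string map; Catalog, ParityReport, and checkParity are hypothetical names, and fallback policy is left out.

```typescript
type Catalog = Record<string, string>;

interface ParityReport {
  missing: string[]; // in source, absent from target
  orphan: string[];  // in target, absent from source
}

// Compare one target locale against the source locale.
function checkParity(source: Catalog, target: Catalog): ParityReport {
  const sourceKeys = new Set(Object.keys(source));
  const targetKeys = new Set(Object.keys(target));
  return {
    missing: Object.keys(source).filter((k) => !targetKeys.has(k)).sort(),
    orphan: Object.keys(target).filter((k) => !sourceKeys.has(k)).sort(),
  };
}

const en: Catalog = { save: "Save", cancel: "Cancel", results: "{count} results" };
const de: Catalog = { save: "Speichern", cancel: "Abbrechen", legacy: "Alt" };

console.log(checkParity(en, de)); // { missing: [ 'results' ], orphan: [ 'legacy' ] }
```

A CI gate would fail the build when either list is non-empty for a shipped locale, after subtracting any keys covered by the documented fallback policy.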
Pseudo-locale for testing
A qa / pseudo locale transforms strings:
Save → [!! Šåvé !!]
Loading… → [!! Łõåðîñğ… (~30% longer) !!]
Welcome to Acme → [!! Wélçömé tö Áçmé !!]
Run the app in pseudo-locale:
- Hardcoded strings stand out (not transformed).
- Length-sensitive layouts show their breakage.
- Missing keys obvious (no [!! ... !!] wrap).
CI screenshots in pseudo-locale catch drift before release.
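A pseudo-locale transform can be very small. The sketch below is an assumption-laden illustration: the accent map is partial, length padding is omitted, and only simple {name} placeholders are protected (nested ICU plural or select bodies would need a real parser). pseudoLocalize and ACCENTS are hypothetical names.

```typescript
// Partial, hand-picked accent map; unmapped characters pass through unchanged.
const ACCENTS: Record<string, string> = {
  a: "å", c: "ç", e: "é", i: "î", o: "ö", u: "ü", n: "ñ",
  A: "Á", E: "É", O: "Ö", S: "Š", L: "Ł",
};

function pseudoLocalize(message: string): string {
  const transformed = message
    // The capturing group keeps placeholders in the result at odd indices.
    .split(/(\{[^{}]*\})/)
    .map((chunk, i) =>
      i % 2 === 1
        ? chunk // leave {placeholders} untouched
        : Array.from(chunk, (ch) => ACCENTS[ch] ?? ch).join(""),
    )
    .join("");
  return `[!! ${transformed} !!]`;
}

console.log(pseudoLocalize("Save"));            // [!! Šåvé !!]
console.log(pseudoLocalize("Welcome to Acme")); // [!! Wélçömé tö Áçmé !!]
console.log(pseudoLocalize("Hi {name}"));       // [!! Hî {name} !!]
```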
Currency + region pairing
A pricing page shows different prices per region. Two concerns:
- Display currency: format per user locale, regardless of price source.
- Tax / VAT: per region; show inclusive vs exclusive per regulatory norm.
Do not derive the product's currency from the user's locale. A Japanese user viewing USD pricing still sees USD, formatted per ja-JP conventions; don't auto-convert unless you mean to.
Domain-specific localisation
Things that are NOT translated:
- Brand product name (per whitelabel-pattern.md brand-token allowlist).
- Code identifiers, file paths, URLs.
- Author / contributor names.
- Third-party brand names (Slack, GitHub).
Things that ARE translated:
- Generic terms ("workspace", "user", "settings").
- Status labels ("Running", "Failed").
- Error messages.
- Help text.
Common failure modes
- Plain interpolation for plurals. "1 result(s)" is broken in any non-English locale. → ICU MessageFormat.
- Date / number raw. formatDate(d) returns ISO. Users see machine format. → Intl.*.
- Locale derived from currency. User in Brazil viewing USD; UI assumes pt-BR formatting for $. → Locale and currency independent.
- Hardcoded margin-left. RTL breaks. → Logical properties.
- String concat for sentences. t("hello") + " " + name + "!" bakes in a word-order assumption. → ICU.
- No RTL test. Bidi bugs ship. → At least one RTL in CI snapshots.
- Mixed-language fragments. Welcome to {productName}, ${userName}! has ambiguous direction. → \<bdi\> / dir="auto".
- CLDR not bundled. Locale features missing at runtime. → Include CLDR data for shipped locales (bundle size cost; trade-off).
Tooling stack (typical)
| Concern | Tool |
|---|---|
| Runtime formatting | Native Intl.* (broad browser support) |
| Message formatting | formatjs, i18next, lingui, react-intl |
| Plural / gender data | CLDR (bundled by libraries above) |
| TMS platform | Phrase, Crowdin, Lokalise, Tolgee |
| Static extraction | i18next-parser, formatjs CLI, lingui extract |
| Coverage / parity | i18next-locize, in-house gate |
| Date manipulation | date-fns + date-fns-tz, Luxon, native Intl.DateTimeFormat |
| Pseudo-locale | pseudo-loc, in-house |
Adoption path
- Day 0: useT() for all strings; en only; parity gate disabled.
- Month 1: add ICU MessageFormat for plurals.
- Month 2: add Intl.* for date / number / currency.
- Quarter 1: first non-en locale; parity gate; pseudo-locale in CI.
- Quarter 2: TMS workflow with external translators.
- Quarter 3: RTL locale; bidi audit on changed screens.
- Mature: localised search, sort, region-aware features.
See also
- intl-pattern.md: the basic discipline.
- whitelabel-pattern.md: product name as a brand token.
- accessibility-deep-pattern.md: lang attribute; reading order; bidi.
- universal.md: Rule 3 (intl every string), Rule 8 (human verbs).