Why Fluent Matters for Localization

In case you don’t know what Fluent is, it’s a localization system designed and developed by Mozilla to overcome the limitations of the existing localization technologies. If you have been around Mozilla Localization for a while, and you’re wondering what happened to L20n, you can read this explanation about the relation between these two projects.

With Firefox 58 we started moving Firefox Preferences to Fluent, and today we’re migrating the last pane (Firefox Account – Sync) in Firefox Nightly (61). The work is not done yet: there are still edge cases to migrate in the existing panes, plus subdialogs, but we’re on track. If you’re interested in the details, you can read the full journey in two blog posts from Zibi (2017 and 2018), covering not only Fluent, but also the huge amount of work done on the Gecko platform to improve multilingual support.

At this point, you might be wondering: do we really need another localization system? What’s wrong with what we have?

The truth is that there is a lot wrong with the current systems. In Gecko alone, we support 4 different file formats to localize content: .dtd, .properties, .ini, .inc. And since none of them supports plural forms, we built hacks on top of .properties to support pluralization.

Here are a few practical examples of why Fluent is a huge improvement over existing technologies, and will allow us to improve the quality of the localizations we ship.

DTDs and Concatenations

You want to localize this simple fragment of XUL code without using JavaScript.


 Please sign in to reconnect 

This turns into 2 separate strings in a DTD file, and a long localization comment:




Why the empty string at the end? Because, while English doesn’t need it, other languages might need to change the structure of the sentence, adding content after the email address. On top of that, some localization tools don’t support empty strings correctly, not allowing localizers to mark an empty translation as a “translated” string.

In Fluent, this is simply:

sync-signedin-login-failure = Please sign in to reconnect { $email }

One single string, full visibility on the context, flexibility to move around the email address.
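To make the difference concrete, here is a minimal Python sketch of the "single string with a placeholder" idea. It is not Fluent and not Gecko code: the `format_message` helper and the `x-reordered` locale are hypothetical, and the reordered sentence is an invented illustration rather than a real translation.

```python
# Hypothetical message catalog: one complete sentence per locale,
# with a named placeholder the localizer can move freely.
PATTERNS = {
    "en-US": "Please sign in to reconnect {email}",
    # Invented example of a locale that needs text after the address.
    "x-reordered": "{email}: please sign in to reconnect",
}


def format_message(locale: str, **args: str) -> str:
    """Fill the named placeholders of the pattern for the given locale."""
    return PATTERNS[locale].format(**args)


print(format_message("en-US", email="user@example.com"))
print(format_message("x-reordered", email="user@example.com"))
```

Because the sentence is never split around the variable, no empty trailing string is needed for languages that put content after the email address.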

Plural Forms

Plural forms are supported in Gecko only for .properties files. Fluent supports plural forms natively, and with a lot of additional flexibility.

First of all, if you’re not familiar with the complexity of plurals across languages (limiting the discussion to cardinal integer numbers):

  • English, like many other European languages, only has 2 plural forms: n=1 uses one form (“1 page”), all other numbers (n!=1) use a different form (“2 pages”). Sadly, this makes a lot of people think about plurals in terms of “1 vs many”, while that’s not really the case for most languages.
  • French still has 2 plural forms, but uses the same form for both 0 and 1.
  • Other languages can have only one form (e.g. Chinese), or up to 6 different plural forms (e.g. Arabic). Fluent uses the CLDR categories (zero, one, two, few, many, other) to match a number to the correct plural form. For example, in Russian 1 and 21 will use the form “one”, but 11 will use “many”.
  • The behavior might change depending on whether the actual number is present. For example, Turkic languages don’t need to pluralize a noun after a number (“1 page”), but need plural forms in sentences referring to one or more elements (“this” vs “them”).
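The first three points above can be sketched in a few lines of Python. This is a simplified illustration of the CLDR cardinal rules as I understand them, limited to non-negative integers and to four locales; the `plural_category` function is a hypothetical helper, not an API from Fluent or Gecko. Under these rules, integer 11 in Russian falls into the “many” category.

```python
def plural_category(locale: str, n: int) -> str:
    """Return the CLDR cardinal category for a non-negative integer.

    Simplified sketch: integers only, four locales only.
    """
    if locale == "en":
        return "one" if n == 1 else "other"
    if locale == "fr":
        # French uses the same form for 0 and 1.
        return "one" if n in (0, 1) else "other"
    if locale == "ru":
        last, last_two = n % 10, n % 100
        if last == 1 and last_two != 11:
            return "one"
        if 2 <= last <= 4 and not 12 <= last_two <= 14:
            return "few"
        # Remaining integers: 0, 5-9, 11-14 and so on.
        return "many"
    if locale == "zh":
        # Chinese has a single form.
        return "other"
    raise ValueError(f"no rules sketched for {locale!r}")


for number in (0, 1, 2, 11, 21):
    print(number, plural_category("ru", number))
```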

Consider for example this use case: in Firefox, the button to set the home page changes from “Use Current Page” to “Use Current Pages”, depending on the number of open tabs.

If you want to use a proper plural form, you need to add the number of tabs to the string. In .properties, it would look like this (plural forms are separated by a semicolon):

use-current-pages = Use Current Page;Use Current #1 Pages

This will force languages to create all plural forms for their locales, even if they might not be needed. If your language has 6 forms, you need to provide all 6 forms, even if they’re all identical. Fun, isn’t it? Note that this is not just a limitation of the plural system used in .properties, the same happens in GetText (.po files).
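As a rough illustration of why every form has to be present, here is a Python sketch of the semicolon-based approach. It is an assumption-laden simplification, not Gecko's actual implementation: `english_plural_index` and `get_plural_form` are hypothetical names, and only the English rule is modeled.

```python
def english_plural_index(n: int) -> int:
    """Index of the form to use in English: 0 for n == 1, 1 otherwise."""
    return 0 if n == 1 else 1


def get_plural_form(n: int, forms: str) -> str:
    """Pick a form from a semicolon-separated string and replace #1 with n."""
    variants = forms.split(";")
    # The index comes from the locale's rule, so the string must contain
    # one entry per form, even when several entries are identical.
    chosen = variants[english_plural_index(n)]
    return chosen.replace("#1", str(n))


source = "Use Current Page;Use Current #1 Pages"
print(get_plural_form(1, source))
print(get_plural_form(3, source))
```

If a locale's rule returns an index that the string does not provide, the lookup fails, which is why localizers have to repeat identical forms.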

Here’s how Fluent improves things: first, you don’t need to add all plural forms; you can rely on the fallback to the default value (indicated by *), without raising any error:

             
use-current-pages =
     .label =
         { $tabCount ->
            *[other] Использовать текущие страницы
         }

More importantly, you can match a specific number (1 in this case):

  
use-current-pages =
     .label =
         { $tabCount ->
             [1] Использовать текущую страницу
            *[other] Использовать текущие страницы
         }

In Russian, the “[one]” form would also be used for 21 tabs, while here it’s only used for 1 tab.
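The selection logic can be sketched in Python as follows. This is my simplified reading of the behavior, not Fluent's resolver: `select_variant` is a hypothetical function, variants are checked in source order, a numeric key matches the exact number, a named key matches the plural category, and the default is used when nothing matches. The plural category is passed in by the caller to keep the sketch short.

```python
from typing import List, Tuple, Union

Variant = Tuple[Union[int, str], str]


def select_variant(variants: List[Variant], default: str,
                   n: int, category: str) -> str:
    """Return the first variant whose key matches n or its plural category."""
    for key, text in variants:
        if key == n or key == category:
            return text
    # No variant matched: fall back to the default (the one marked with *).
    return default


singular = "Использовать текущую страницу"
plural = "Использовать текущие страницы"

# Mirrors a message with a [1] variant and an *[other] default.
variants = [(1, singular), ("other", plural)]

print(select_variant(variants, plural, n=1, category="one"))
print(select_variant(variants, plural, n=21, category="one"))
```

With a [1] key, 21 tabs do not match even though 21 belongs to the “one” category in Russian, so the default is used.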

Let’s assume the English message looks like this:

use-current-pages =
     .label =
         { $tabCount ->
             [1] Use Current Page
            *[other] Use Current Pages
         }

Do you need to expose the number in your message and treat it like a standard plural form? You can:

use-current-pages =
     .label =
         { $tabCount ->
             [one] Use { $tabCount } Page
            *[other] Use { $tabCount } Pages
         }

Do you only need one form? Again, you can simplify it into:

use-current-pages =
     .label = Use { $tabCount } Pages

or even:

use-current-pages =
     .label = Use Current Pages

Variants

This is one of the most exciting changes introduced to the localization paradigm.

Consider this example: “Firefox Account” is a special brand within Firefox. While “Firefox” itself should not be localized or declined, “account” can be localized and moved. In Italian it’s “Account Firefox”; in Spanish, “Cuenta de Firefox”.

A special entity is defined in order to be reused in other strings:


For example:


In Italian this results in “Connetti &syncBrand.fxAccount.label;”. It’s not natural, and it looks wrong, because we don’t capitalize nouns in the middle of a sentence.

My only option to improve the translation, and make it sound more natural, would have been to drop the entity and just add the translated name. That defies the entire concept of having a central definition for the brand.

Here’s what I can do in Fluent. The brand is defined as a term, a special type of message that can only be referenced from other strings (not code), and can have additional attributes.

-fxaccount-brand-name = Firefox Account
sync-signedout-account-title = Connect with a { -fxaccount-brand-name }

And now in Italian I can do:

-fxaccount-brand-name =
    {
        [lowercase] account Firefox
       *[uppercase] Account Firefox
    }
sync-signedout-account-title = Connetti il tuo { -fxaccount-brand-name[lowercase] }

While uppercase vs lowercase is a trivial example, variants can have a much deeper impact on localization quality for complex languages that use declensions, where the word “account” changes based on its role within the sentence (nominative, accusative, etc.).
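Conceptually, a term with variants behaves like a small lookup table with a default. The Python sketch below only illustrates that idea using the Italian values above; `resolve_term` and the constant names are hypothetical, and this is not how Fluent is implemented.

```python
from typing import Optional

# Variants of the brand term, keyed by name, as in the Italian example.
TERM_VARIANTS = {
    "lowercase": "account Firefox",
    "uppercase": "Account Firefox",
}
DEFAULT_VARIANT = "uppercase"  # the variant marked with * in the term


def resolve_term(variant: Optional[str] = None) -> str:
    """Return the requested variant, or the default if it is not defined."""
    return TERM_VARIANTS.get(variant, TERM_VARIANTS[DEFAULT_VARIANT])


print(f"Connetti il tuo {resolve_term('lowercase')}")
print(resolve_term())
```

The brand is still defined in one place, and each message asks for the form it needs. A language with declensions could use case names as keys in the same way.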

This is only the tip of the iceberg: there’s more you can do with Fluent, and the new localization API will allow us to drastically improve the experience for non-English users in Firefox. Here are some additional links if you want to learn more about Fluent:

