Why Fluent Matters for Localization

In case you don’t know what Fluent is, it’s a localization system designed and developed by Mozilla to overcome the limitations of the existing localization technologies. If you have been around Mozilla Localization for a while, and you’re wondering what happened to L20n, you can read this explanation about the relation between these two projects.

With Firefox 58 we started moving Firefox Preferences to Fluent, and today we’re migrating the last pane (Firefox Account – Sync) in Firefox Nightly (61). The work is not done yet, there are still edge cases to migrate in the existing panes, and subdialogs, but we’re on track. If you’re interested in the details, you can read the full journey in two blog posts from Zibi (2017 and 2018), covering not only Fluent, but also the huge amount of work done on the Gecko platform to improve multilingual support.

At this point, you might be wondering: do we really need another localization system? What’s wrong with what we have?

The truth is that there is a lot wrong with the current systems. In Gecko alone, we support 4 different file formats to localize content: .dtd, .properties, .ini, .inc. And since none of them support plural forms, we built hacks on top of .properties to support pluralization.

Here are a few practical examples of why Fluent is a huge improvement over existing technologies, and will allow us to improve the quality of the localizations we ship.

DTDs and Concatenations

You want to localize this simple fragment of XUL code without using JavaScript.

<description flex="1">
 Please sign in to reconnect <label class="fxaEmailAddress"></label>
</description>

This turns into 2 separate strings in a DTD file, and a long localization comment:

<!-- LOCALIZATION NOTE (signedInLoginFailure.beforename.label, 
signedInLoginFailure.aftername.label): 
these two string are used respectively before and after the account 
email address. Localizers can use one of them, or both, to better 
adapt this sentence to their language. -->
<!ENTITY signedInLoginFailure.beforename.label "Please sign in to reconnect">
<!ENTITY signedInLoginFailure.aftername.label "">

Why the empty string at the end? Because, while English doesn’t need it, other languages might need to change the structure of the sentence, adding content after the email address. On top of that, some localization tools don’t support empty strings correctly, not allowing localizers to mark an empty translation as a “translated” string.

In Fluent, this is simply:

sync-signedin-login-failure = Please sign in to reconnect { $email }

One single string, full visibility on the context, flexibility to move around the email address.
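To make the difference concrete, here is a minimal Python sketch. It is not how Gecko or Fluent are implemented: `assemble_dtd` and `format_message` are hypothetical helpers, and the reordered pattern in the last line is an invented example of a language that needs the email address somewhere else in the sentence.

```python
import re


def assemble_dtd(before: str, email: str, after: str) -> str:
    """Mimic the DTD approach: two fixed fragments wrapped around the email."""
    return " ".join(part for part in (before, email, after) if part)


def format_message(pattern: str, args: dict) -> str:
    """Replace each `{ $name }` placeable with the matching argument."""
    def substitute(match):
        return str(args[match.group(1)])

    return re.sub(r"\{\s*\$([A-Za-z][\w-]*)\s*\}", substitute, pattern)


email = "user@example.com"

# DTD style: the markup decides where the email goes, the localizer can
# only fill in the text before and after it.
print(assemble_dtd("Please sign in to reconnect", email, ""))

# Single-string style: the localizer decides where the placeable goes.
print(format_message("Please sign in to reconnect { $email }", {"email": email}))
print(format_message("{ $email }: please sign in to reconnect", {"email": email}))
```

With a single pattern the localizer sees the whole sentence, and moving the placeable requires no change in the code that displays the string.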

Plural Forms

Plural forms are supported in Gecko only for .properties files. Fluent supports plural forms natively, and with a lot of additional flexibility.

First of all, if you’re not familiar with the complexity of plurals across languages (limiting the discussion to cardinal integer numbers):

  • English, like many other European languages, only has 2 plural forms: n=1 uses one form (“1 page”), all other numbers (n!=1) use a different form (“2 pages”). Sadly, this makes a lot of people think about plural in terms of “1 vs many”, while that’s not really the case for most languages.
  • French still has 2 plural forms, but uses the same form for both 0 and 1.
  • Other languages can only have one form (e.g. Chinese), or have up to 6 different plural forms (e.g. Arabic). Fluent uses the CLDR categories (zero, one, two, few, many, other) to match a number to the correct plural form. For example, in Russian 1 and 21 will use the form “one”, but 11 will use “many”.
  • The behavior might change if the actual number is present or not. For example, Turkic languages don’t need to pluralize a noun after a number (“1 page”), but need plural forms in sentences referring to one or more elements (“this” vs “them”).
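The categories above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the CLDR data itself: `plural_category` is a hypothetical helper, it only handles non-negative integers for five locales, and the French branch ignores the extra category that recent CLDR releases define for multiples of one million.

```python
def plural_category(locale: str, n: int) -> str:
    """Return the CLDR cardinal category for a non-negative integer."""
    if locale == "zh":
        return "other"  # a single form for every number
    if locale == "en":
        return "one" if n == 1 else "other"
    if locale == "fr":
        return "one" if n in (0, 1) else "other"
    if locale == "ru":
        if n % 10 == 1 and n % 100 != 11:
            return "one"
        if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
            return "few"
        return "many"
    if locale == "ar":
        if n in (0, 1, 2):
            return ("zero", "one", "two")[n]
        if 3 <= n % 100 <= 10:
            return "few"
        if 11 <= n % 100 <= 99:
            return "many"
        return "other"
    raise ValueError(f"locale not covered by this sketch: {locale}")


# Print one row per locale to compare how the same numbers are classified.
for locale in ("en", "fr", "ru", "ar", "zh"):
    row = ", ".join(f"{n}={plural_category(locale, n)}" for n in (0, 1, 2, 11, 21))
    print(f"{locale}: {row}")
```

Note how 21 lands in “one” for Russian while 11 does not, which is exactly why “1 vs many” is the wrong mental model.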

Consider for example this use case: in Firefox, the button to set the home page changes from “Use Current Page” to “Use Current Pages”, depending on the number of open tabs.

If you want to use a proper plural form, you need to add the number of tabs to the string. In .properties, it would look like this (plural forms are separated by a semicolon):

use-current-pages = Use Current Page;Use Current #1 Pages

This will force languages to create all plural forms for their locales, even if they might not be needed. If your language has 6 forms, you need to provide all 6 forms, even if they’re all identical. Fun, isn’t it? Note that this is not just a limitation of the plural system used in .properties, the same happens in GetText (.po files).
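As a rough illustration of why every form has to be present, here is a Python sketch of a semicolon-separated lookup. It is only loosely modelled on the .properties convention described above: `pick_form` and the two rule functions are hypothetical names, the three-form rule is a Slavic-style example, and raising an error on a missing form is a choice made for this sketch, not a statement about what Gecko does.

```python
def english_rule(n: int) -> int:
    """Index of the form to use in a two-form (English-like) locale."""
    return 0 if n == 1 else 1


def three_form_rule(n: int) -> int:
    """Index of the form to use in a three-form (Slavic-like) locale."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1
    return 2


def pick_form(n: int, forms: str, rule, expected_forms: int) -> str:
    """Split on semicolons, pick the form by index, replace the #1 placeholder."""
    words = forms.split(";")
    if len(words) != expected_forms:
        raise ValueError(f"expected {expected_forms} forms, found {len(words)}")
    return words[rule(n)].replace("#1", str(n))


english = "Use Current Page;Use Current #1 Pages"
print(pick_form(1, english, english_rule, 2))
print(pick_form(3, english, english_rule, 2))

# A three-form locale that needs only one wording still has to repeat it.
identical = ";".join(["Label"] * 3)
print(pick_form(21, identical, three_form_rule, 3))

# Providing a single form is rejected in this sketch.
try:
    pick_form(5, "Label", three_form_rule, 3)
except ValueError as error:
    print(error)
```

The lookup is purely positional, so the string carries no information about which form is which: the localizer has to know the rule for the locale and supply the forms in the right order.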

Here’s how Fluent improves things: first, you don’t need to add all plural forms, you can rely on the fallback to the default value (indicated by *), without raising any error:

use-current-pages =
     .label =
         { $tabCount ->
            *[other] Использовать текущие страницы
         }

More important, you can match a specific number (1 in this case):

use-current-pages =
     .label =
         { $tabCount ->
             [1] Использовать текущую страницу
            *[other] Использовать текущие страницы
         }

In Russian, the “[one]” form would also be used for 21 tabs, while the exact match “[1]” is only used for 1 tab.
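The difference between matching the number 1 and matching the category “one” can be sketched in Python. This is a simplified illustration of the idea, not Fluent’s resolver: `select_variant` is a hypothetical helper that checks for an exact number first, then for the plural category, then falls back to the default. The two Russian labels are the ones used in the messages above.

```python
SINGULAR = "Использовать текущую страницу"
PLURAL = "Использовать текущие страницы"


def russian_category(n: int) -> str:
    """CLDR cardinal category for a non-negative integer in Russian."""
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"
    return "many"


def select_variant(variants: dict, default_key: str, n: int) -> str:
    """Pick a variant: exact number first, then plural category, then default."""
    if str(n) in variants:
        return variants[str(n)]
    category = russian_category(n)
    if category in variants:
        return variants[category]
    return variants[default_key]


with_exact_match = {"1": SINGULAR, "other": PLURAL}
with_category = {"one": SINGULAR, "other": PLURAL}

# For each tab count, show whether each message picks the singular label.
for n in (1, 21, 5):
    exact = select_variant(with_exact_match, "other", n) == SINGULAR
    by_category = select_variant(with_category, "other", n) == SINGULAR
    print(f"tabs={n} singular with [1]: {exact}, singular with [one]: {by_category}")
```

Because the button label does not display the number, the exact match is the right choice here: 21 open tabs should get the plural label, even though 21 belongs to the “one” category in Russian.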

Let’s assume the English message looks like this:

use-current-pages =
     .label =
         { $tabCount ->
             [1] Use Current Page
            *[other] Use Current Pages
         }

Do you need to expose the number in your message and treat it like a standard plural form? You can:

use-current-pages =
     .label =
         { $tabCount ->
             [one] Use { $tabCount } Page
            *[other] Use { $tabCount } Pages
         }

Do you only need one form? Again, you can simplify it into:

use-current-pages =
     .label = Use { $tabCount } Pages

or even:

use-current-pages =
     .label = Use Current Pages

Variants

This is one of the most exciting changes introduced to the localization paradigm.

Consider this example: “Firefox Account” is a special brand within Firefox. While “Firefox” itself should not be localized or declined, “account” can be localized and moved. In Italian it’s “Account Firefox”, “Cuenta de Firefox” in Spanish.

A special entity is defined in order to be reused in other strings:

<!ENTITY syncBrand.fxAccount.label "Firefox Account">

For example:

<!ENTITY signedOut.accountBox.title "Connect with a &syncBrand.fxAccount.label;">

In Italian this results in “Connetti &syncBrand.fxAccount.label;”. It’s not natural, and it looks wrong, because we don’t capitalize nouns in the middle of a sentence.

My only option to improve the translation, and make it sound more natural, would have been to drop the entity and just add the translated name. That defies the entire concept of having a central definition for the brand.

Here’s what I can do in Fluent. The brand is defined as a term, a special type of message that can only be referenced from other strings (not code), and can have additional attributes.

-fxaccount-brand-name = Firefox Account
sync-signedout-account-title = Connect with a { -fxaccount-brand-name }

And now in Italian I can do:

-fxaccount-brand-name =
    {
        [lowercase] account Firefox
       *[uppercase] Account Firefox
    }
sync-signedout-account-title = Connetti il tuo { -fxaccount-brand-name[lowercase] }

While uppercase vs lowercase is a trivial example, variants can have a much deeper impact on localization quality for complex languages that use declensions, where the word “account” changes based on its role within the sentence (nominative, accusative, etc.).
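Conceptually, a term with variants behaves like a small lookup table with a default. The Python sketch below illustrates that idea only: `resolve_term` and the dictionary layout are hypothetical and do not reflect how Fluent stores or resolves terms.

```python
from typing import Optional

# The Italian brand term from the example above, as a plain dictionary.
fxaccount_brand_name = {
    "default": "uppercase",
    "variants": {
        "lowercase": "account Firefox",
        "uppercase": "Account Firefox",
    },
}


def resolve_term(term: dict, variant: Optional[str] = None) -> str:
    """Return the requested variant, or the default one if it is missing."""
    variants = term["variants"]
    return variants.get(variant, variants[term["default"]])


# The title asks for the lowercase variant explicitly.
title = "Connetti il tuo " + resolve_term(fxaccount_brand_name, "lowercase")
print(title)

# No variant requested, or an unknown one: the default is used.
print(resolve_term(fxaccount_brand_name))
print(resolve_term(fxaccount_brand_name, "genitive"))
```

The brand still has a single central definition, and each locale decides which variants it needs without any change to the code.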

This is only the tip of the iceberg: there’s more you can do with Fluent, and the new localization API will allow us to drastically improve the experience for non-English users in Firefox. Here are some additional links if you want to learn more about Fluent:

Localizing BBCodeXtra

If you never noticed the menu item in this blog, I’m the developer of a small add-on for Firefox called BBCodeXtra: it’s an extension, started about 10 years ago, that makes posting on forums and other places (e.g. GitHub) a little less painful.

This extension, currently at version 0.4.1, is localized in 16 languages: de, es-ES, fi, fr, he, it, ja, ko, pl, pt-BR, pt-PT, ru, sk, sr, th, zh-TW.

For the first time in years, I’m going to release a new version that includes new features, and therefore new strings. While I obviously love localization and localizers, I don’t intend to work with localization platforms for an add-on with a very limited set of strings and infrequent updates.

The new version (v0.5.0) will be released with all the existing languages enabled, but the new strings will be left in English (excluding Italian). Starting from the next version, I will drop locales that go below 60% unchanged strings (currently it’s about 70% for all locales).

If you want to contribute updating your localization:

  • Source code is hosted on GitHub. Localizations are stored in /extensions/locale. If you’re already familiar with GitHub, great. If you’re not, you can try with this tutorial or try using the online editor available on GitHub.
  • If you want to receive an email before the next release, in case I have to add new strings and you want to localize them before release, send me an email at flod(at)lodolo(dot).net and I’ll organize a mailing list.

Summit 2013 planning assembly: a wonderful begin

Note: this is a guest post from Iacopo Benesperi, a fellow Mozillian from the Italian community.

This weekend the Summit 2013 planning assembly took place in Mozilla’s Paris office: a gathering of about 65 people from all around the world, representing all areas of the Mozilla project and including both paid staff and volunteers, aimed at planning and shaping the next Summit, which will take place the first weekend of October in Brussels, Toronto and Santa Clara.

TL;DR: it’s been a great assembly. If we manage to accomplish at the next Summit half of the things we discussed this weekend, it will have been the best Mozilla event ever.

The aim of the assembly was not to define a schedule for the event and settle everything, but to identify the important topics that animate the Mozilla project these days, start discussing them, and shape them so that we can come up with a good Summit format to address them and propose solutions. To do this, all members of the planning committee interviewed fellow Mozillians over the last month, to get a wider view of the temperature of the project these days and to act as representatives for the comments expressed.

I went to Paris without a clear idea of what we would accomplish there, but I’m impressed with the result we achieved.
First of all, this assembly was facilitated by people from unconference.net, who proposed a peculiar way to run it. I was a bit skeptical about the proposed method, but it turned out that some of their techniques are really great (like the unpanel) and we will definitely adopt them for the next Summit, while some others still look like rubbish (I may still be proved wrong).
The second important fact is that we talked little about technology and a lot about Mozilla, its community, its communication (internal and external) and the interactions between its components and people. On one hand, as Gandalf pointed out, this is a sign that we implicitly trust our technology, and trust that it will be discussed at the Summit, because it is a big part of what Mozilla is about. On the other hand, it’s a sign of a general awareness, not only among community members but also (finally) among employees, paid staff and the board of directors, that we have communication problems between the different parts of the project, especially between paid staff and volunteers, and that the time is now ripe to address and try to solve them. What I’m talking about is not only communication to get things done but also communication related to the decision-making process.

So, it will be interesting to experiment with discussions across different time zones and locations. I will probably post more about the assembly and the planning for the Summit in the next few days, once ideas and thoughts have settled down a bit and I’ve had the time to read all the ideas and documentation we produced during these two days. What I felt was important to communicate immediately is that the next Summit will be a wonderful occasion to talk not only about our technologies but also about who we are, what we want to do and where we want to go. It will be an occasion for the community to teach and mentor its newest members and, more importantly, all the new employees, so that they understand and feel the power and importance of our community. In general, it will be an occasion to have our voice finally heard and taken into consideration, not only in the tasks at hand but also in building the new policies and guidelines that will drive the whole project in the future.

I’m sure that in the coming months we’ll try to provide some initial information and documentation about what has been discussed and decided so far, so that you can arrive at the Summit prepared to contribute to the conversation, and so that we can get the most out of the Summit and make it really matter for our future.

As I said at the beginning: if we manage to discuss and propose solutions to half of the problems and concerns raised during these two days, we will have had the best Mozilla event ever; one that will have strengthened our project and made it more mature.

Once upon a time there was a string freeze… pt.2

Since it probably looks like my favorite hobby is whining without a reason, let’s check what happened so far (always an optimist…) in this cycle.

Broken strings in Mozilla Beta

  • Bug 797036 – Update updater strings and icon
  • Bug 803344 – poor discoverability of the enable/disable menu item for Social API

Landing strings in Beta means that we did something wrong earlier (haste to push forward features that probably weren’t ready, “we need this in ESR”, etc.).

Broken strings in Mozilla Aurora

Obviously the two changesets landed on beta, plus:

  • Bug 795691 – b2g fixes for the web console actors
  • Bug 800373 – Change marketplace strings to ‘Firefox Marketplace’

Add to that several bugs adding or removing strings in both beta and aurora (e.g. Bug 803630 or Bug 760951) and you’ll get the picture.

Bug 797036 is a good example of how badly we have been working on the l10n side lately:

  1. changes land on central on Oct 02 16:34:08 (end of cycle is only 6 days ahead)
  2. the day after, I wrote a comment in the bug about the bad review (that’s pure luck: I don’t work on localization every day, and there are very few localizers doing this kind of check on central)
  3. nobody reacts, bad strings move to aurora and we need to break string freeze

For starters, a better review process could have avoided all this.