With many new faces joining Mozilla as either staff or volunteer localizers, most newcomers are only familiar with the current, more streamlined localization infrastructure.
I thought it might be interesting to take a look back at the technical evolution of Mozilla’s localization systems. Having personally navigated every version — first as a community localizer from 2004 to 2013, and later as staff — I’ll share my perspective. That said, I might not have all the details exactly right (or I may have removed some for the sake of my sanity), so feel free to point out any inaccuracies.
Early days: Centralized version control
Back in the early 2000s, smartphones weren’t a thing, Windows XP was an acceptable operating system (especially in comparison to Windows Me), and distributed version control systems weren’t as common. Let’s be honest, centralized version control systems were not fun: every commit meant interacting directly with the server. You had to remember to update your local copy, commit your changes, and then hope no one else had committed in the meantime; otherwise, you were stuck resolving conflicts.
Given the high technical barriers, localizers at that time were primarily technical users, not discouraged by command line tools or by crappy text editors (with their encoding issues, BOMs, and other amenities).
To make things more complicated, localizers had to deal with 2 different systems:
- CVS (Concurrent Versions System) was used for products like Mozilla Suite, Phoenix/Firefox, etc. To increase confusion, it used branch names that followed the Gecko versions (e.g. MOZILLA_1_8_BRANCH), and those didn’t map at all to product versions. Truth be told, the whole release cadence and cycle felt like complete chaos back then, at least as a volunteer.
- SVN (Subversion) was used to localize mozilla.org, addons.mozilla.org (AMO), and other web projects.
With time, desktop and web-based applications emerged to support localizers, hiding some of the complexity of version control systems and providing translation management features:
- Mozilla Translator (a local Java application. Yes kids, Java).
- Narro.
- Pootle.
- Verbatim: a customized Pootle instance run by Mozilla, used to localize web projects like addons.mozilla.org. This was shut down in 2015 and projects transitioned to Pontoon.
- Pontoon (here’s the first idea and repository, if you’re curious).
- Aisle, an internal experiment based on C9 that never got past the initial tests.
This proliferation of new tools led to a couple of key principles that are still valid to this day:
- The repository, not the TMS (Translation Management System), is the source of truth.
- TMSs need to support bidirectional synchronization between their internal data storage and the repository, i.e. they need to read updated translated content from the repository and store it internally (establishing a conflict resolution policy), not just write updates.
This might look trivial, but it’s an outlier in the localization industry, where the tool is the source of truth, and synchronization only happens in one direction (from the TMS to the repository).
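To make the second principle more concrete, here is a minimal sketch of a bidirectional sync between a repository and a TMS. It is not how Pontoon or any other tool actually implements it: the function name, the data shapes (plain dictionaries of string ID to translation), and the "repository wins on conflict" policy are all assumptions made for illustration.

```python
def sync(repo, tms, last_synced):
    """Three-way sync sketch between repository and TMS translations.

    repo, tms, last_synced: dicts mapping a string ID to its translation.
    last_synced is the state both sides agreed on after the previous sync.
    Returns the merged state, to be written to both the repository and
    the TMS internal storage.
    """
    merged = {}
    for key in repo.keys() | tms.keys() | last_synced.keys():
        base = last_synced.get(key)
        in_repo = repo.get(key)
        in_tms = tms.get(key)
        repo_changed = in_repo != base
        # Hypothetical conflict policy: the repository is the source of
        # truth, so its value wins whenever it changed since the last
        # sync. Otherwise the TMS value is kept (changed or not).
        value = in_repo if repo_changed else in_tms
        # None means the string was removed on the winning side.
        if value is not None:
            merged[key] = value
    return merged


# Example with made-up string IDs and translations.
last_synced = {"title": "Ciao", "save": "Salva", "quit": "Esci"}
repo = {"title": "Salve", "save": "Salva", "quit": "Chiudi"}
tms = {"title": "Ciao", "save": "Registra", "quit": "Termina", "open": "Apri"}
merged = sync(repo, tms, last_synced)
```

In this example "title" changed only in the repository, "save" and "open" changed only in the TMS, and "quit" changed on both sides, so the repository value is the one that survives. A tool that only writes to the repository would have silently overwritten both repository changes.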
The shift to Mercurial
At the end of 2007, Mozilla made the decision to transition from CVS to Mercurial, this time opting for a distributed version control system. For localization, this meant making the move to Mercurial as well, though it took a few more months of work. This marked the beginning of a new era where the infrastructure quickly started becoming more complex.
As code development was happening in mozilla-central, localization was supposed to be stored in a matching l10n-central repository. But here’s the catch: instead of one repository, the decision was to use one repository per locale, each one including the localization for all shipping projects (Firefox, Firefox for Android, etc.). I’m not sure how many repositories that meant at the time (based on the dependencies of this bug, probably around 30), but as of today, there are 156 l10n-central repositories, while Firefox Nightly only ships in 111 locales (a few of them added recently).
The next massive change was the adoption of the rapid release cycle in 2011:
- 3 new sets of repositories had to be created for the corresponding Firefox versions: l10n/mozilla-aurora, l10n/mozilla-beta, l10n/mozilla-release.
- Localizers working against Nightly in l10n-central would need to manually move their updates to l10n/mozilla-aurora, which was becoming the main target for localization.
- At the end of the cycle (“merge day”), someone in the localization team would manually move content from Aurora to Beta, overwriting any changes.
- In order to allow localizers to make small fixes to Beta, 2 separate projects were set up in Pontoon (one working against Aurora, one against Beta), and it was up to localizers to keep them in sync, given that content in Beta would be overwritten on merge day.
If you’re still trying to keep count, we’re now at about 600 Mercurial repositories to localize a project like Firefox (and a few hundred more added later for Firefox OS, one for each locale and version, but that’s a whole different story).
I won’t go into the fine details, but at this point localizers were also supposed to “sign off” on the version of their localization that they wanted to ship. Over time, this was done by:
- Calling out which changeset you wanted to ship in an email thread.
- Later, requesting sign-off in a web app called Elmo (because it was hosted on l10n.mozilla.org, (e)l.m.o., got it?). Someone in the localization team had to manually go through each request, check the diff from the previous sign-off to ensure that it would not break Firefox, and either accept or reject it. For context, at the time DTDs were still heavily in use for localization, and a broken translation could easily brick the browser (yellow screen of death).
- With the drop of Aurora in 2017, the localization team started reviewing and managing sign-offs in Elmo without waiting for localizers to make a request. Yay for localizers, one less thing to do.
- In 2020, partly because of the lay-offs that impacted the team, we completely dropped the sign-off process and decommissioned Elmo, automatically taking the latest changeset in each l10n repository.
The new kid on the block: GitHub
In 2015 we started migrating repositories from SVN to GitHub. At the time, that meant mostly web projects, managed by Pascal Chevrel and me, with the notable exception of Firefox for iOS. That part of localization had a whole infrastructure of its own: a web dashboard to track progress, a tool called langchecker to update files and identify errors, and even a file format called dotlang (.lang) that was used for a while to localize mozilla.org (we switched to Fluent in 2019).
The move to GitHub removed a lot of bureaucracy, as the team could create new repositories and grant access to localizers without going through an external team, as was the case for Mercurial. Still today, GitHub is the go-to choice for new projects, although the introduction of SAML single sign-on created a significant hurdle when it comes to adding external contributors to a project.
Introduction of cross-channel for Firefox
Remember the 600 repositories? Still there… Also, the most observant among you might wonder: didn’t Mozilla have another version of Firefox (Extended Support Release, or ESR)? You’re correct, but the compromise there was that ESR would be string-frozen, so we didn’t need another ~150 repositories: we used the content from mozilla-release at the time of launch, and that’s it, no more updates.
In 2017, the Aurora channel was “removed”, leaving Nightly (based on mozilla-central), Developer Edition and Beta (based on mozilla-beta), Release (based on mozilla-release) and ESR. I use quotes, because “aurora” is still technically the internal channel name for Dev Edition.
That was a challenge, as Aurora represented the main target for localization. That change forced us to move all locales to work on Nightly around April 2017.
Later in the year, Axel Hecht came up with a core concept that still supports how we localize Firefox nowadays: cross-channel. What if, instead of having to extract strings from 4 huge code repositories, we create a tool that generates a superset of the strings shipping in all supported versions (channels) of Firefox, and put them in a nimble, string-only repository? That’s exactly what cross-channel did, allowing us to drop ~300 repositories (plus ~150 already dropped because of the removal of Aurora). It also gave us the opportunity to support localization updates in release and ESR. At this point, localization for any shipping version of Firefox comes out of a single repository for each locale (e.g. l10n-central/fr for French).
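The "superset" idea can be sketched in a few lines. This is only an illustration of the concept, not the actual cross-channel code: the function name, the string IDs, and the rule that the newest channel wins when the same ID appears with different text are all assumptions.

```python
def build_superset(channels):
    """Merge the source strings of several channels into one superset.

    channels: list of (channel_name, strings) tuples, ordered from the
    newest channel to the oldest. Each strings value is a dict mapping
    a string ID to its source text.
    """
    superset = {}
    for _name, strings in channels:
        for string_id, text in strings.items():
            # Assumption: if an ID exists in more than one channel, keep
            # the text from the newest channel (the first one seen).
            superset.setdefault(string_id, text)
    return superset


# Hypothetical string IDs, for illustration only.
channels = [
    ("nightly", {"tab-new": "New Tab", "tab-close": "Close Tab"}),
    ("beta", {"tab-close": "Close Tab", "tab-pin": "Pin Tab"}),
    ("release", {"tab-pin": "Pin Tab", "tab-legacy": "Old Feature"}),
]
superset = build_superset(channels)
```

A localizer working on the superset translates "tab-legacy" even though it no longer exists in Nightly, which is what makes it possible to keep updating localization for release and ESR from a single repository.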
In hindsight, cross-channel was overly complex: it would not only create the superset content, but it would also replay the Mercurial history of the commit introducing the change. The content would land in the cross-channel repository with a reference to the original changeset (example), making it possible to annotate the file via Mercurial’s web interface. In order to do that, the code hooked directly into Mercurial internals, and it would break frequently thanks to the complexity of Mozilla’s repositories. In 2021 the code was changed to stop replaying history and only merge content.
At this point, in late 2017, Firefox localization relied on ~150 l10n repositories, and 2 source repositories for cross-channel: one used as a quarantine, the other, called gecko-strings, connected to Pontoon to expose strings for community localization.
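For anyone who lost count along the way, the repository math so far can be written down as a quick back-of-the-envelope calculation. The figure of 150 locales is the rounded "~150" used throughout this post, not an exact count, and the variable names are made up for the example.

```python
# Approximate number of locales, per the "~150" figures in the text.
LOCALES = 150

# 2011, rapid release: one repository per locale for each channel.
channels_2011 = ["central", "aurora", "beta", "release"]
repos_2011 = LOCALES * len(channels_2011)

# 2017, Aurora removed: one set of per-locale repositories goes away.
repos_after_aurora = repos_2011 - LOCALES

# Late 2017, cross-channel: the beta and release sets are dropped too.
repos_after_cross_channel = repos_after_aurora - 2 * LOCALES

# Plus the 2 source repositories used by cross-channel itself.
total_late_2017 = repos_after_cross_channel + 2

print(repos_2011, repos_after_aurora, repos_after_cross_channel, total_late_2017)
```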
Current Firefox infrastructure
Fast-forward to 2024, with Mozilla’s decision to move development to Git, we had an opportunity to simplify things even further, and rethink some of the initial choices:
- Instead of 2 repositories for cross-channel, we decided to use only one repository with 2 branches. The cross-channel code was completely rewritten by Eemeli Aro and now runs as a GitHub workflow.
- Instead of ~150 repositories, we now have a single l10n repository, covering all supported versions of Firefox as l10n-central used to do. All locales, except for Italian and Japanese, are localized through Pontoon.
Thunderbird has adopted a similar structure, with their own 3 repositories.
The team completed the migration to Git in June, ahead of the rest of the organization, and all current versions of Firefox ship from the firefox-l10n repository (including ESR 115 and ESR 128).
Conclusions
So, this was the not-so-short story of how Mozilla’s localization infrastructure has evolved over time, with a focus on Firefox. Looking back, it’s remarkable to see how far we’ve come. Today, we’re in a much better place, also considering the constant effort to improve Pontoon and other tools used by the community.
As I approach one of my many anniversaries — I have one for when I started as a volunteer (January 2004), when I became a member of staff as a contractor (April 2013), one “official” when I became an employee (November 2018) — it’s humbling to think about what a small team has accomplished over the past 22 years. These milestones remind me of the incredible contributions of so many brilliant individuals at Mozilla, whose passion helped build the foundations we stand on today.
It’s also bittersweet to go back and read emails from over 15 years ago, remembering just how pivotal the community was in shaping Firefox into what it is today. The dedication of volunteers and localizers helped make Firefox a truly global browser, and their impact is still felt — and sometimes missed — today.