Meta has unveiled NLLB-200, an AI model that can translate between 200 languages and, according to the company, improves translation quality by an average of 44 percent over previous research.
For a long time, translation apps have been fairly adept at the most common languages. Even if they do not provide an exact translation, it is usually close enough for the native speaker to understand.
However, hundreds of millions of people in multilingual regions such as Africa and Asia continue to be underserved by translation services.
In a press release, Meta wrote:
“To help people connect better today and be part of the metaverse of tomorrow, our AI researchers created No Language Left Behind (NLLB), an effort to develop high-quality machine translation capabilities for most of the world’s languages.
Today, we’re announcing an important breakthrough in NLLB: We’ve built a single AI model called NLLB-200, which translates 200 different languages with results far more accurate than what previous technology could accomplish.”
The metaverse aims to be boundless, and for that to work, translation services will need to deliver accurate translations quickly.
“As the metaverse takes shape,” the company explained, “the ability to build technologies that work well in a wider range of languages will help to democratise access to immersive experiences in virtual worlds.”
According to Meta, NLLB-200 scored an average of 44 percent higher for translation “quality” than previous AI research. For some African and Indian languages, NLLB-200’s translations were more than 70 percent more accurate.
Meta created the FLORES-200 dataset to evaluate and improve NLLB-200. The dataset allows researchers to assess NLLB-200’s performance “in 40,000 different language directions.”
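The “40,000 language directions” figure is roughly what you get by counting every ordered source-to-target pair among 200 languages. The short Python sketch below works through that arithmetic. It assumes exactly 200 languages and that each ordered pair (for example, Zulu to Hindi and Hindi to Zulu) counts as a separate direction. The function name is illustrative only, and Meta’s exact count may differ.

```python
from itertools import permutations


def count_directions(num_languages: int) -> int:
    """Count ordered (source, target) pairs where source != target."""
    return num_languages * (num_languages - 1)


# Sanity check on a tiny example: 3 languages give 6 directions.
small = ["en", "fr", "zu"]
assert count_directions(len(small)) == len(list(permutations(small, 2)))

# Assumption: exactly 200 languages, every ordered pair is one direction.
print(count_directions(200))  # 39800
```

Under those assumptions the total is 39,800, which is consistent with the rounded figure of 40,000.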
Both NLLB-200 and FLORES-200 are being made available to developers so they can build on Meta’s work and improve their own translation tools.
Meta is offering a grant pool of up to $200,000 to researchers and nonprofit organisations that want to apply NLLB-200 to impactful work in areas such as sustainability, food security, gender-based violence, education, or other fields that support the UN Sustainable Development Goals.
However, not everyone is fully convinced by Meta’s latest breakthrough.
“It’s worth bearing in mind, despite the hype, that these models are not the cure-all that they may first appear. The models that Meta uses are massive, unwieldy beasts. So, when you get into the minutiae of individualised use-cases, they can easily find themselves out of their depth – overgeneralised and incapable of performing the specific tasks required of them,” commented Victor Botev, CTO at Iris.ai.
“Another point to note is that the validity of these measurements has yet to be scientifically proven and verified by their peers. The datasets for different languages are too small, as shown by the challenge in creating them in the first place, and the metric they’re using, BLEU, is not particularly applicable.”
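For readers unfamiliar with BLEU, the metric scores a machine translation by how many word sequences (n-grams) it shares with a reference translation. The Python sketch below is a deliberately simplified, sentence-level version that assumes whitespace tokenisation, a single reference, n-grams up to length two, and no smoothing. Standard BLEU is computed over a whole corpus with n-grams up to length four, so treat this as an illustration of the idea rather than the scoring used by Meta. The function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate: str, reference: str, max_n: int = 2) -> float:
    """Simplified sentence-level BLEU against one reference (illustrative)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped overlap: each candidate n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty: penalise candidates shorter than the reference.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
print(simple_bleu("the cat sat on the mat", reference))        # 1.0
print(simple_bleu("the feline rested on the rug", reference))  # about 0.32
```

In this toy example a reasonable paraphrase scores about 0.32 simply because it uses different words from the reference. That is one commonly cited limitation of overlap-based metrics, though not necessarily the specific objection Botev had in mind.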