An artificial intelligence from Meta can recognize more than 4,000 languages

Meta has created an artificial intelligence language model that can recognize more than 4,000 spoken languages and produce audio in 1,100 of them.


Massively Multilingual Speech (MMS) is open source, and Meta is working with scholars and other stakeholders in the field to develop it and release it worldwide.


For Meta, MMS is “a small contribution to preserve the incredible linguistic diversity of the world.”

The AI used an unconventional mechanism to collect audio data: leveraging religious texts.

“We use religious texts, such as the Bible, that have been translated into many different languages and whose translations have been extensively studied for text-based language translation research,” the company said. “These translations have publicly available audio recordings of people reading these texts in different languages.”

This approach yielded thousands of hours of audio with transcriptions attached, ready for machine-learning training.

The company claims that this choice of training data does not bias the model toward religious language, in part because the model is trained with the connectionist temporal classification (CTC) approach, which constrains its output to a transcription of the audio it hears.
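To make the CTC idea concrete, here is a minimal, illustrative sketch (not Meta's code) of CTC-style greedy decoding: the model emits one label per audio frame, including a special "blank" symbol, and decoding collapses consecutive repeats and drops blanks, so no frame-level alignment between audio and transcript is needed. The blank symbol and labels below are hypothetical.

```python
BLANK = "-"  # hypothetical blank symbol used by the toy decoder

def ctc_collapse(frame_labels):
    """Collapse repeated frame labels, then remove blanks (CTC decoding rule)."""
    out = []
    prev = None
    for label in frame_labels:
        if label != prev and label != BLANK:
            out.append(label)  # keep a label only when it starts a new run
        prev = label
    return "".join(out)

# Frame-by-frame output "hh-e-ll-lo-" decodes to the word "hello":
# the blank between the two l-runs is what keeps the double "l".
print(ctc_collapse(list("hh-e-ll-lo-")))  # hello
```

In practice the frame labels come from taking the most probable symbol at each time step of the network's output; the blank symbol is what lets CTC handle audio that is much longer than its transcript.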

And to speak

To learn from these thousands of languages, Meta used wav2vec 2.0, the company’s “self-supervised speech representation learning” model, which can train on unlabeled data.
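The core trick of this kind of self-supervised pretraining is to hide spans of the audio's latent frames and train the model to recover them, which requires no transcripts at all. The toy function below only illustrates the span-masking step; the span length, masking probability, and function name are assumptions for the sketch, not values from Meta's system.

```python
import random

def mask_spans(num_frames, span_len=3, mask_prob=0.2, seed=0):
    """Toy span masking: each frame may start a masked span of span_len
    frames, loosely mirroring the masking step of self-supervised
    speech pretraining. Returns the set of masked frame indices."""
    rng = random.Random(seed)  # seeded for reproducibility
    masked = set()
    for start in range(num_frames):
        if rng.random() < mask_prob:
            masked.update(range(start, min(start + span_len, num_frames)))
    return masked

# Mask spans over a 20-frame sequence; the model would then be trained
# to identify the true content of the masked frames among distractors.
print(sorted(mask_spans(20)))
```

During pretraining, the model predicts the quantized representation of each masked frame from the surrounding unmasked context, which is what lets it learn from raw, untranscribed speech.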

“Our results show that the Massively Multilingual Speech models perform well compared to existing models and cover 10 times more languages,” the company says, citing the models’ lower error rates and broader language coverage.

“We envision a world where technology has the opposite effect, encouraging people to keep their languages alive, since they can access information and use technology by speaking their preferred language,” the company concludes.