Google gets more multilingual, but will it get the nuance?

May 11, 2022, 12:42 PM | Updated: 1:37 pm


FILE - A student colors in a fox during a Quechua Indigenous language class focusing on animal names at a public primary school in Licapa, Peru, Wednesday, Sept. 1, 2021. About 10 million people speak Quechua, but trying to automatically translate emails and text messages into the most widely spoken Indigenous language family in the Americas was nearly impossible before Google introduced it into its digital translation service Wednesday, May 11, 2022. The internet giant says new artificial intelligence technology is enabling it to vastly expand Google Translate’s repertoire of the world’s languages, adding 24 more this week including Quechua and other Indigenous South American languages such as Guarani and Aymara. (AP Photo/Martin Mejia, File)


LIMA, Peru (AP) — About 10 million people speak Quechua, but trying to automatically translate emails and text messages into the most widely spoken Indigenous language family in the Americas was long all but impossible.

That changed on Wednesday, when Google added Quechua and a variety of other languages to its digital translation service.

The internet giant says new artificial intelligence technology is enabling it to vastly expand Google Translate’s repertoire of the world’s languages. It added 24 of them this week, including Quechua and other Indigenous South American languages such as Guarani and Aymara. It is also adding a number of widely spoken African and South Asian languages that have long been missing from popular tech products.

“We looked at languages with very large, underserved populations,” Google research scientist Isaac Caswell told reporters.

The news from the California company’s annual I/O technology showcase may be celebrated in many corners of the world. But it will also likely draw criticism from those frustrated by previous tech products that failed to understand the nuances of their language or culture.

Quechua was the lingua franca of the Inca Empire, which stretched from what is now southern Colombia to central Chile. Its status began to decline following the Spanish conquest of Peru more than 400 years ago.

Adding it to the languages recognized by Google is a big victory for Quechua language activists like Luis Illaccanqui, a Peruvian who created the website Qichwa 2.0, which includes dictionaries and resources for learning the language.

“It will help put Quechua and Spanish on the same status,” said Illaccanqui, who was not involved in Google’s project.

Illaccanqui, whose last name in Quechua means “you are the lightning bolt,” said the translator will also help keep the language alive with a new generation of young people and teenagers, “who speak Quechua and Spanish at the same time and are fascinated by social networks.”

Caswell called the news a “very big technological step forward” because until recently, it was not possible to add languages if researchers couldn’t find a big enough trove of online text — such as digital books, newspapers or social media posts — for their AI systems to learn from.

U.S. tech giants don’t have a great track record of making their language technology work well outside the wealthiest markets, a problem that’s also made it harder for them to detect dangerous misinformation on their platforms. Until this week, Google Translate was offered in European languages like Frisian, Maltese, Icelandic and Corsican — each with fewer than 1 million speakers — but not East African languages like Oromo and Tigrinya, which have millions of speakers.

The new languages will roll out on Google’s Android system this week and on Apple devices later this month. They won’t yet be understood by Google’s voice assistant, which limits them to text-to-text translations for now. Google said it is working on adding speech recognition and other capabilities, such as being able to translate a sign by pointing a camera at it.

That will be important for languages like Quechua that are primarily spoken rather than written, especially in the health field, because many Peruvian doctors and nurses who only speak Spanish work in rural areas and “are unable to understand patients who speak mostly Quechua,” said Illaccanqui.

“The next frontier, or challenge, is to work on speech,” said Arturo Oncevay, a Peruvian machine translation researcher at the University of Edinburgh who co-founded a research group to improve Indigenous language technology across the Americas. “The native languages of the Americas are traditionally oral.”

In its announcement, Google cautioned that the quality of translations in the newly added languages “still lags far behind” other languages it supports, such as English, Spanish and German, and noted that the models “will make mistakes and exhibit their own biases.” But the company only added languages if its AI systems met a certain threshold of proficiency, Caswell said.

“If there’s a significant number of cases where it’s very wrong, then we would not include it,” he said. “Even if 90% of the translations are perfect, but 10% are nonsense, that’s a little bit too much for us.”

Google said its products now support 133 languages. The latest 24 are the largest single batch to be added since Google incorporated 16 new languages in 2010. What made the expansion possible is what Google is calling a “Zero-Shot Machine Translation” model, an AI model that learns to translate into another language without ever seeing an example of it.

Facebook and Instagram parent company Meta introduced a similar concept called the Universal Speech Translator last year.

“At a high level, the way you can imagine it working is you have a single gigantic neural model and it is trained on 100 different languages,” Caswell said of the Google model.

He said the new group ranges from smaller languages like Mizo, spoken in northeastern India by about 800,000 people, to more widely spoken languages like Lingala, spoken by around 45 million people across Central Africa.

It was more than 15 years ago, in 2006, that Microsoft got some positive attention in South America with a software feature translating familiar Microsoft menus and commands into Quechua. But that was before the current wave of AI advancements in real-time translation.

Harvard University language scholar Américo Mendoza, who speaks Quechua, said getting Google’s attention brings some needed visibility to the language in places like Peru, where Quechua speakers are still lacking in many public services. The survival of many of these languages “will depend on their use in digital contexts,” he said.

The new languages added are: Assamese, Aymara, Bambara, Bhojpuri, Dhivehi, Dogri, Ewe, Guarani, Ilocano, Konkani, Krio, Lingala, Luganda, Maithili, Meiteilon (Manipuri), Mizo, Oromo, Quechua, Sanskrit, Sepedi, Sorani Kurdish, Tigrinya, Tsonga and Twi.

O’Brien reported from Providence, Rhode Island.

Copyright © The Associated Press. All rights reserved. This material may not be published, broadcast, rewritten or redistributed.


FILE - Books written in the Quechua Indigenous language sit behind a student during a class on medicinal plants, at a public primary school in Licapa, Peru, Wednesday, Sept. 1, 2021. (AP Photo/Martin Mejia)

FILE - Teacher Carmen Cazorla writes in the Quechua Indigenous language during a class on medicinal plants at a public primary school in Licapa, Peru, Wednesday, Sept. 1, 2021. (AP Photo/Martin Mejia)
