Google Translate's AI reaches the stage of acquiring its own "intermediate language" and even translating between language pairs it was never trained on


By Per Egevad

"Google Translate", provided by Google, marked its 10th anniversary in 2016. The service is built on an enormous corpus of data, and in September 2016 the translation algorithm was changed for some languages with the introduction of "Google Neural Machine Translation (GNMT)", which uses neural networks, an artificial intelligence technique, to perform natural language processing closer to that of humans.

It now appears that the artificial intelligence (AI) in this evolved Google Translate has developed a translation method of its own. It has been announced that the AI internally creates something like an "intermediate language" for translation, and that it has acquired the ability to perform "Zero-Shot Translation": producing a reasonable translation even for language pairs it has never been trained on.

Research Blog: Zero-Shot Translation with Google's Multilingual Neural Machine Translation System
https://research.googleblog.com/2016/11/zero-shot-translation-with-googles.html

GNMT learns translation from a huge corpus using deep learning, enabling more natural translation across multiple languages. During training it learns to translate from one language to another, so one would naturally expect it to be unable to handle language pairs it has not yet learned. However, according to a paper published by the researchers developing the AI for Google Translate, the AI appears to create its own "intermediate language (interlingua)" for processing languages during training. This intermediate language seems to hold information closer to abstract concepts, rather than mapping words one-to-one.

The actual experiment, using English, Japanese, and Korean, went as follows. First, the AI is trained to translate from English to Korean.


Next, it learns the reverse direction, Korean to English. It has now learned both directions between English and Korean.


Then it learns English-Japanese translation. First, English to Japanese...


Likewise, it learns Japanese → English.


At this point, Google's AI researchers wondered whether the system could also translate between Japanese and Korean. When they actually tried it, the accuracy was lower, but the system reportedly succeeded in producing a reasonable translation.
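The training setup described above can be sketched in a few lines of Python. This is only an illustration, not Google's code: it lists which translation directions were trained and which remain zero-shot. The linked paper describes telling one shared model which language to produce by prepending a token to the input sentence; the helper names and the exact "<2xx>" token spelling used here are assumptions made for illustration.

```python
from itertools import permutations

LANGUAGES = ["en", "ja", "ko"]

# Directions the article describes as trained: English<->Korean and English<->Japanese.
TRAINED_DIRECTIONS = {("en", "ko"), ("ko", "en"), ("en", "ja"), ("ja", "en")}


def zero_shot_directions(languages, trained):
    """Return the (source, target) pairs that never appeared in training."""
    all_directions = set(permutations(languages, 2))
    return sorted(all_directions - set(trained))


def tag_for_target(sentence, target_lang):
    """Prepend a target-language token to the source sentence.

    Assumption: the "<2xx>" spelling is illustrative, not taken from the article.
    """
    return f"<2{target_lang}> {sentence}"


if __name__ == "__main__":
    for src, tgt in zero_shot_directions(LANGUAGES, TRAINED_DIRECTIONS):
        print(f"zero-shot direction: {src} -> {tgt}")
    print(tag_for_target("Hello", "ko"))
```

With three languages and four trained directions, the only directions left over are Japanese → Korean and Korean → Japanese, which is exactly the pair the researchers tested.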


Previously, when Google Translate translated between two languages other than English, it was said to work by translating through English first. In that sense English was the traditional "intermediate language", but GNMT is said to perform "Zero-Shot Translation" without going through English.
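The older approach of translating through English can be sketched as a two-hop pipeline. The lookup tables below are toy stand-ins for trained models, and every name in the sketch is hypothetical; it only shows the shape of pivot translation, not how Google Translate was implemented.

```python
# Toy lookup tables standing in for trained translation models (hypothetical data).
TOY_MODELS = {
    ("ja", "en"): {"猫": "cat"},
    ("en", "ko"): {"cat": "고양이"},
}


def translate(text, src, tgt):
    """Translate one word with the toy table for a single direction."""
    return TOY_MODELS[(src, tgt)][text]


def pivot_translate(text, src, tgt, pivot="en"):
    """Translate src -> pivot -> tgt and report the hops that were taken."""
    hops = [(src, pivot), (pivot, tgt)]
    for hop_src, hop_tgt in hops:
        text = translate(text, hop_src, hop_tgt)
    return text, hops


if __name__ == "__main__":
    result, hops = pivot_translate("猫", "ja", "ko")
    print(result, hops)
```

A zero-shot system would instead make a single Japanese → Korean request to one shared model, with no English text produced along the way.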

Of course, translation was possible in both directions between the two languages.


This appears to be a very significant development for thinking about translation systems. Machine translation has so far relied on some core language, but GNMT is thought to hold internally, as its pivot, "something" that is neither English, Japanese, nor Korean.

The figure below shows how this works. (A) is a visualization of the concepts obtained from 74 sentences in three languages (Japanese, English, and Korean), in which sentences with the same content across the three languages appear as color-coded "clusters", one color per meaning. (B) is a close-up of the cluster for the sentence "The stratosphere extends from about 10 km to about 50 km in altitude." in the three languages, and (C) shows that cluster color-coded by language.


From the figure above, we can see that GNMT's AI places the meanings and concepts of words and sentences from each language close to one another. Much about how this mechanism is actually realized apparently remains unknown, but it seems fair to think that the AI's translation ability is moving much closer to that of the human brain than before.
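The idea that sentences with the same meaning sit "in a nearby place" regardless of language can be illustrated with cosine similarity. The vectors below are invented for this sketch and are not taken from GNMT or the paper; the function names are also hypothetical.

```python
import math

# Invented 3-dimensional "sentence vectors", keyed by (language, meaning).
EMBEDDINGS = {
    ("en", "stratosphere"): [0.90, 0.10, 0.00],
    ("ja", "stratosphere"): [0.80, 0.20, 0.10],
    ("ko", "stratosphere"): [0.85, 0.15, 0.05],
    ("en", "greeting"): [0.10, 0.90, 0.20],
    ("ja", "greeting"): [0.00, 0.80, 0.30],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbor(key, embeddings):
    """Return the other key whose vector is most similar to embeddings[key]."""
    others = [k for k in embeddings if k != key]
    return max(others, key=lambda k: cosine_similarity(embeddings[key], embeddings[k]))


if __name__ == "__main__":
    for key in EMBEDDINGS:
        print(key, "->", nearest_neighbor(key, EMBEDDINGS))
```

In this toy data, each sentence's nearest neighbor shares its meaning rather than its language, which is the pattern the clusters in the figure are said to show.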

The published paper is available at the following link.

[1611.04558] Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation
https://arxiv.org/abs/1611.04558

in Software, Web Service, Posted by darkhorse_log