New translation technique uses back-translation to develop bilingual AI


Artificial intelligence (AI) is expected to reach a level of translation that exceeds amateur human translators within 10 years. At this stage, however, teaching a neural network to translate requires training data consisting of millions of translated sentences. Against this backdrop, a new technique has been announced that lets a neural network learn to translate without any bilingual documents.

[1711.00043] Unsupervised Machine Translation Using Monolingual Corpora Only

[1710.11041] Unsupervised Neural Machine Translation

Artificial intelligence goes bilingual - without a dictionary | Science | AAAS

Existing AI translation learns by having the computer make guesses over a large number of documents, receive the correct answers, and adjust its process accordingly. For that reason, translation works well between languages with plenty of text, such as English and French, but accurate translation is difficult for minor languages and for languages that have few bilingual documents.

Until now, machine learning has required humans to "teach" the AI first, as described above, but the newly announced method takes a different approach. With this technique, the AI builds its own dictionary without anyone telling it "your guess is correct." Many languages share similarities in how words are used, for example the words "table" and "chair" are often used together, so mapping words based on such co-occurrences makes it possible to build a dictionary. Overlaying the maps of the different languages then completes a dictionary for translation.
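As a rough illustration of the co-occurrence idea above, the following minimal Python sketch counts how often pairs of words appear in the same sentence. The function name `cooccurrence_counts` and the toy corpus are assumptions made for illustration only. The sketch covers only the raw counting step in one language, not the overlaying of maps across languages or the methods used in the papers.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count how often each unordered word pair shares a sentence."""
    counts = Counter()
    for sentence in sentences:
        # Use each distinct word once per sentence, in sorted order,
        # so that every pair has a single canonical key (a, b) with a < b.
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):
            counts[(a, b)] += 1
    return counts


# Toy corpus, invented for this example.
corpus = [
    "the table and the chair",
    "a chair next to a table",
    "the dog chased the cat",
]
counts = cooccurrence_counts(corpus)
print(counts[("chair", "table")])  # pairs seen together get higher counts
print(counts[("chair", "dog")])    # pairs never seen together stay at zero
```

In a real system the counts would come from a very large monolingual corpus, and each word would be described by its relationships to many other words rather than by a single pair count.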


Two new papers on this technique have been published. One is by Mikel Artetxe, a computer scientist at the University of the Basque Country in Spain, and colleagues; the other is by Facebook computer scientist Guillaume Lample and colleagues.

Both studies use similar methods based on back-translation and denoising. In back-translation, a sentence is roughly translated into another language and then translated back into the original language. If the back-translated sentence does not match the original, the neural network is adjusted so that the two will be closer the next time the same sentence is translated.
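The round-trip check described above can be sketched in a few lines of Python. This is only a toy: the word-by-word `translate` function and the two dictionaries are hypothetical stand-ins for neural translation models, and the sketch only measures the mismatch. The adjustment step that uses this mismatch to update a network is not shown.

```python
def translate(sentence, dictionary):
    """Toy word-by-word translation; unknown words pass through unchanged."""
    return " ".join(dictionary.get(word, word) for word in sentence.split())


def round_trip_mismatch(sentence, forward, backward):
    """Translate forward, then back, and count the words that changed."""
    restored = translate(translate(sentence, forward), backward)
    # Word-by-word translation keeps the sentence length, so zip is safe here.
    pairs = zip(sentence.split(), restored.split())
    return restored, sum(1 for original, back in pairs if original != back)


# Hypothetical, deliberately imperfect toy dictionaries (not from the papers).
EN_TO_FR = {"the": "le", "cat": "chat", "sleeps": "dort"}
FR_TO_EN = {"le": "the", "chat": "cat", "dort": "sleep"}  # "sleep" is a wrong guess

restored, mismatches = round_trip_mismatch("the cat sleeps", EN_TO_FR, FR_TO_EN)
print(restored, mismatches)
```

A nonzero mismatch is the signal that, in a trained system, would drive the adjustment so that the round trip lands closer to the original sentence next time.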

Denoising is similar to back-translation, but instead of passing through another language, noise is added to a sentence by removing and rearranging words, and the network tries to reproduce the original sentence. By performing back-translation and denoising in this way, the neural network learns the deeper structure of sentences. Both teams' systems encode a sentence in one language into an abstract representation before converting it into the other language; the two differ in that the intermediate representation in the Facebook team's system is more abstract. Both research teams have read each other's papers and are working to further develop their own techniques.
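The noise step in denoising, removing some words and slightly rearranging the rest, can be sketched as follows. The function name `add_noise`, the default values `drop_prob=0.1` and `max_shift=3`, and the way the local shuffle is implemented are assumptions chosen for illustration. The sketch only produces the corrupted input; training a network to restore the original sentence from it is not shown.

```python
import random


def add_noise(words, drop_prob=0.1, max_shift=3, rng=None):
    """Return a noisy copy of `words`: some dropped, the rest locally shuffled."""
    rng = rng or random.Random()
    # Drop each word independently with probability drop_prob.
    kept = [w for w in words if rng.random() >= drop_prob]
    if not kept and words:
        kept = [rng.choice(words)]  # never return an empty sentence
    # Local shuffle: add a random offset in [0, max_shift + 1) to each index
    # and sort by the result, so no word moves more than max_shift positions.
    keys = [i + rng.random() * (max_shift + 1) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=keys.__getitem__)
    return [kept[i] for i in order]


sentence = "the quick brown fox jumps over the lazy dog".split()
noisy = add_noise(sentence, rng=random.Random(0))
print(" ".join(noisy))
```

During training, the pair (`noisy`, `sentence`) would play the role of input and target, so the network has to rely on sentence structure rather than simply copying words.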


The English-to-French translation accuracy of the techniques developed in the two studies is not as high as that of Google Translate at the time of writing, but it is said that accuracy can easily be improved by adding some bilingual documents to the training. In addition, this technique is expected to be useful not only for languages with few bilingual documents but also for languages such as English that have abundant bilingual documents, for example in translating new slang and medical terminology.

in Note, Posted by darkhorse_log