The latest version of GPT-2, the much-discussed text-generation AI once deemed "too dangerous" to release, is now available, and a demo site where you can actually try it has also appeared


GPT-2, a text-generation AI developed by OpenAI, a non-profit organization that researches artificial intelligence, can automatically generate high-quality text with ease. The development team feared it was "too dangerous", and release of the full model was postponed. Models with progressively more parameters were released in stages in February, May, and August 2019, and the latest version, with 1.5 billion parameters, has now finally been released.

GPT-2: 1.5B Release

OpenAI judged that GPT-2 risked being exploited because it can generate very sophisticated text, and refrained from publishing all four models it had developed at once. Instead, it released models with progressively more parameters in stages in February, May, and August 2019. According to OpenAI, the 1.5-billion-parameter GPT-2 released this time is the final model in the staged release.

A research team at Cornell University, working with OpenAI, surveyed human subjects to rate how credible the text generated by GPT-2 is, using a credibility score with a maximum of 10. The medium model released in May 2019 (355 million parameters) scored 6.07, and the large model released in August (774 million parameters) scored 6.72. The latest model released this time scored 6.91, which is clearly higher than the medium model and slightly higher than the large model.
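To make the size of those differences concrete, the short Python sketch below computes the gain in score between successive models. The scores and parameter counts are the ones quoted above; the variable and function names are only illustrative.

```python
# Scores (out of 10) quoted in the article, keyed by model size.
SCORES = [
    ("355M", 6.07),   # medium model, May 2019
    ("774M", 6.72),   # large model, August 2019
    ("1.5B", 6.91),   # latest model
]


def successive_gains(scores):
    """Return (from_model, to_model, gain) for each consecutive pair."""
    gains = []
    for (prev_name, prev_score), (name, score) in zip(scores, scores[1:]):
        # Round to two decimals to hide floating-point noise.
        gains.append((prev_name, name, round(score - prev_score, 2)))
    return gains


if __name__ == "__main__":
    for prev_name, name, gain in successive_gains(SCORES):
        print(f"{prev_name} -> {name}: +{gain}")
```

The step from the medium to the large model is worth 0.65 points, while roughly doubling the parameters again adds only 0.19.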



With this release, OpenAI also points out the dangers of GPT-2. First, OpenAI admits that "GPT-2 can be fine-tuned for misuse." According to the Center on Terrorism, Extremism, and Counterterrorism (CTEC) at the Middlebury Institute of International Studies, GPT-2 can be fine-tuned on four ideologies, including jihadist Islamism and anarchism, to generate propaganda text.

Meanwhile, OpenAI has developed a model that detects text generated by GPT-2, and it can reportedly already detect text from the latest GPT-2 model with an accuracy of nearly 95%. However, OpenAI believes this is not accurate enough for the detector to be used on its own, and that it needs to be combined with metadata-based approaches and human judgment.
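As a rough illustration of why a detection rate of about 95% may not be enough on its own, the Python sketch below estimates how many generated texts would slip through, and shows one possible way to combine a detector score with a metadata signal and human review. The article does not describe how OpenAI combines these signals, so the function names, thresholds, and the metadata flag are all hypothetical.

```python
def expected_missed(n_generated, detection_rate):
    """Expected number of generated texts that the detector fails to flag."""
    return round(n_generated * (1.0 - detection_rate))


def triage(detector_score, suspicious_metadata, flag_at=0.9, review_at=0.5):
    """Route a text using a detector score in [0, 1] plus a metadata signal.

    The thresholds are hypothetical and chosen only for illustration.
    """
    if detector_score >= flag_at:
        return "flag"            # detector is confident on its own
    if detector_score >= review_at or suspicious_metadata:
        return "human_review"    # uncertain case: ask a person
    return "pass"


if __name__ == "__main__":
    # With a 95% detection rate, 500 of 10,000 generated texts go unflagged.
    print(expected_missed(10_000, 0.95))
    print(triage(0.6, suspicious_metadata=False))
```

The point of the sketch is only that borderline scores and outside signals give a second chance to catch texts the detector alone would miss.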

Although there was great concern that GPT-2 might be misused, as of the release on November 5, 2019, no evidence has been found that the GPT-2 models released so far have actually been abused. However, OpenAI notes that the potential for misuse of text generation tools such as GPT-2 grows as their output becomes more convincing, and acknowledges that even as the developer it cannot be aware of every threat.


In addition, a website called "Text Synth" has appeared that lets you generate text with GPT-2, so you can check for yourself how good GPT-2 is.

Text synth

In Text Synth, GPT-2 automatically generates text following the text entered by the user. Enter the text “A famed novelist well-known for” and click “Complete Text” at the bottom.

Although it is slow, a continuation of the entered text is then generated automatically at the bottom of the screen.

This is the final output text.

"A famed novelist well-known for its literary realism, the author was an admirer of his own work as well as those of some of his peers. He would read his poems, especially the great French one, Les Prouvés de l'Oeil (or 'Tales of the Old Road') in the course of his morning walk. He was fond of the novel of his friend and fellow writer, Jean Genet, which he wrote during a stay in Genet's home. And in his youth he had been very close to one of his favorite writers, Louis Aragon, whom he read in his youth as the young man began to become more and more interested in the arts." Although the context is a bit confused, the generated passage does make sense.

In Text Synth, the output varies from trial to trial even if you enter the same text. When I entered "A famed novelist well-known for" again and generated text once more, this time it was followed by "his love of the novel, but who is also considered one of the country's leading poets ...".
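One common reason a language model gives different continuations for the same prompt is that the next word is drawn at random from a probability distribution instead of always taking the single most likely word. The Python sketch below illustrates this with a toy distribution. The article does not state which sampling settings Text Synth uses, so the temperature, the top-k cutoff, and the toy scores here are assumptions made purely for illustration.

```python
import math
import random

# Hypothetical scores (logits) for the word that follows the prompt.
TOY_LOGITS = {"his": 2.0, "its": 1.5, "her": 1.2, "the": 0.3, "banana": -3.0}


def greedy_next(logits):
    """Always pick the highest-scoring word: same output on every run."""
    return max(logits, key=logits.get)


def sample_next(logits, temperature=1.0, top_k=None, rng=random):
    """Draw one word at random, weighted by softmax(logit / temperature)."""
    items = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        items = items[:top_k]          # keep only the k most likely words
    max_logit = items[0][1]            # subtract the max for numerical stability
    weights = [math.exp((score - max_logit) / temperature) for _, score in items]
    words = [word for word, _ in items]
    return rng.choices(words, weights=weights, k=1)[0]


if __name__ == "__main__":
    print("greedy :", [greedy_next(TOY_LOGITS) for _ in range(5)])
    print("sampled:", [sample_next(TOY_LOGITS, top_k=3) for _ in range(5)])
```

Greedy selection repeats the same word every time, while sampling can return "his", "its", or "her" on different runs, which is why repeated trials can diverge from the very first generated word.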

If you enter a different sentence, a new continuation is generated automatically. The generated text certainly reads as well-formed prose, but the context feels somewhat off.

Clicking "Select an example" brings up a set of ready-made example prompts, so you can pick one of the candidates and generate text from it.

The GPT-2 code base is available at the following link.

GitHub - openai/gpt-2: Code for the paper "Language Models are Unsupervised Multitask Learners"

