'GPT-Neo' aims to be an open-source language model with performance close to 'GPT-3'

OpenAI's language model 'GPT-3' can generate remarkably natural text, but it is neither open source nor open access, and it is covered by an exclusive license agreement with Microsoft, so it cannot be used freely. In response, there are efforts to create an 'open source GPT-3'. One of them is 'GPT-Neo'.

EleutherAI - GPT-Neo

AI Weekly: Meet the people trying to replicate and open-source OpenAI's GPT-3 | VentureBeat

GPT-Neo is being developed by EleutherAI, a grassroots group of researchers. The project reportedly grew out of a half-joking exchange on Discord among founding members Connor Leahy, Leo Gao, and Sid Black.

Before GPT-Neo, Leahy had reportedly tried on his own to replicate GPT-2 using Google's TensorFlow Research Cloud (TFRC). That code became the basis of GPT-Neo.

However, the TPUs available through TFRC were not sufficient for replicating GPT-3. The shortfall was covered by CoreWeave, a cryptocurrency miner that also provides cloud services for CGI rendering and machine learning. According to Leahy, CoreWeave only provides hardware resources, and GPT-Neo remains open source.

Because it has been pointed out that a language model may amplify any bias present in its training dataset, the group set a strict editorial policy to exclude datasets containing unacceptable negative bias. The curation was overseen by 10 EleutherAI members, including Leahy, Gao, and Black.

The completed corpus (a text database for language research), 'The Pile', is 835GB in size. It combines 22 smaller datasets, which is intended to ensure broad generalization ability.

EleutherAI expects GPT-Neo to perform comparably to GPT-3 at the same number of parameters. In the future, it reportedly plans to produce a lighter final model with roughly one order of magnitude fewer parameters.

Although EleutherAI does not plan to offer a commercial API for GPT-Neo, CoreWeave and third parties are expected to provide services that will make GPT-Neo available to general users.

in Software, Posted by logc_nt