Microsoft's GPT-4 paper turns out to contain a large amount of commented-out content
On March 22, 2023, Microsoft published a 154-page paper on GPT-4, OpenAI's large language model (LLM), titled 'Sparks of Artificial General Intelligence: Early experiments with GPT-4'. However, a comparison of the published paper with its original source has shown that various parts of the content were commented out.
You might know that MSFT has released a 154-page paper ( https://t.co/7ptdkaWjgb ) on #OpenAI #GPT4 , but do you know they also commented out many parts from the original version?
A thread of hidden information from their latex source code
[1/n] pic.twitter.com/fYszk9365h —DV (@DV2559106965076) March 23, 2023
A Twitter user named DV (@DV2559106965076) pointed out that parts of the paper had been commented out. At the time of writing, this thread makes up the account's only tweets, and it is completely unknown who is behind it.
DV and colleagues examined the LaTeX source code on arXiv, where the paper was posted, and noticed that a lot of information had been commented out of the paper.
We inspect their latex source code from arxiv ( https://t.co/2mkXJRcoUZ ) and found a LOT of interesting information commented out from the main paper.
—DV (@DV2559106965076) March 23, 2023
[2/n]
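The kind of inspection DV describes can be sketched in a few lines of Python. This is only an illustrative sketch, not the method DV actually used: the function name find_commented_text and the sample text are hypothetical, and the script assumes the simplest LaTeX rule, namely that an unescaped '%' starts a comment running to the end of the line. It does not handle verbatim environments or a '%' preceded by an even number of backslashes.

```python
import re

# A '%' starts a LaTeX comment unless it is escaped as '\%'.
# Simplification (assumption): '\\%' and verbatim environments are not handled.
COMMENT_START = re.compile(r"(?<!\\)%")


def find_commented_text(tex_source):
    """Return (line_number, comment_text) pairs for every non-empty comment."""
    found = []
    for number, line in enumerate(tex_source.splitlines(), start=1):
        match = COMMENT_START.search(line)
        if match is None:
            continue
        comment = line[match.end():].strip()
        if comment:
            found.append((number, comment))
    return found


# Hypothetical sample text, not taken from the actual paper.
sample = "\n".join([
    r"\title{Visible title}",
    r"% \title{Hidden draft title}",
    r"Accuracy improved by 5\% overall. % TODO: double-check",
    r"\section{Results}",
])

for number, comment in find_commented_text(sample):
    print(number, comment)
```

Run against the .tex files of a downloaded arXiv source archive, a script along these lines would list every commented-out passage together with its line number.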
According to DV, there had been rumors that GPT-4's internal name was 'DV-3', and the source does in fact list 'Davinci 3' as the third author of the paper, with the note 'The affiliation is unclear', although this entry is commented out.
There were rumors that GPT-4 got an internal name DV-3.
—DV (@DV2559106965076) March 23, 2023
[3/n] pic.twitter.com/R26UPhM9Gf
In addition, the source describes GPT-4 as 'text-only', even though the model is known to be multimodal (able to handle more than one type of input, such as text and images), and the authors do not appear to know how much it cost to train. “Interestingly, Microsoft researchers don't seem to know much about GPT-4,” DV said.
Interestingly, these poor MSFT researchers didn't know too much about GPT-4 (than us?)
—DV (@DV2559106965076) March 23, 2023
1. They have no idea how much exactly does it cost to train the model.
2. They seem to refer to this model as text-only, contradicting to the known fact that GPT-4 is multi-modal.
[4/n] pic.twitter.com/Q5JRH5MrIW
Sections 7.3 'Toxic Content' and 7.4 'Toxicity: Generation & Detection', in the latter half of Chapter 7 'Discriminative Capabilities', have been removed from the paper entirely. These sections discussed how GPT-4 'generates toxic content without any prompting'. “Fortunately, GPT-4 is also better at detecting harmful content than existing LLMs,” DV said.
We found they've commented out two sections on toxicity entirely.
—DV (@DV2559106965076) March 23, 2023
An excerpt from these hidden sections: 'the model generates toxic content without any prompting'.
But luckily, GPT-4 is also better at detecting toxic language compared to all known LLMs.
[5/n] pic.twitter.com/t3bDnq232v
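DV concluded the thread by saying that there is still far more information to dig out of this document, but that they were worried about the unknown alignment procedures OpenAI had taken to reduce the harmfulness of this powerful AI model, and about the extent to which the model is safe for public access.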
There are way more information to dig out from this document, but we were worried about the unknown alignment procedures that OpenAI had taken to reduce the harmfulness of this powerful AI model, and, to what extent, is this model safe for public access.
—DV (@DV2559106965076) March 23, 2023
[6/n]
The case has drawn many comments on the social news site Hacker News.
Microsoft's paper on OpenAI's GPT-4 had hidden information | Hacker News
https://news.ycombinator.com/item?id=35281527
According to zamnos, the original title of the paper was 'First Contact With an AGI System'.
tysam_and, on the other hand, calls the thread clickbait (a sensational title designed to attract clicks) built on superficial information, and urges readers not to visit the tweets.
In addition, max_expectation mentions that their boss used a tool to clean up the LaTeX code before posting to arXiv in order to avoid this kind of situation.
GitHub - google-research/arxiv-latex-cleaner: arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
https://github.com/google-research/arxiv-latex-cleaner
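As a rough illustration of what cleaning comments out of LaTeX source involves, here is a minimal Python sketch. It is not the implementation of arxiv-latex-cleaner; the function name strip_comments is hypothetical, and the sketch assumes that an unescaped '%' always starts a comment, so verbatim environments and a '%' preceded by an even number of backslashes are not handled.

```python
import re

# Matches from an unescaped '%' to the end of the line.
TRAILING_COMMENT = re.compile(r"(?<!\\)%.*")


def strip_comments(tex_source):
    """Remove LaTeX comments: drop whole-line comments, trim trailing ones."""
    cleaned = []
    for line in tex_source.splitlines():
        if line.lstrip().startswith("%"):
            continue  # the whole line is a comment, so drop it entirely
        # Keep a bare '%' so LaTeX still ignores the line break, as before.
        cleaned.append(TRAILING_COMMENT.sub("%", line))
    return "\n".join(cleaned)


# Hypothetical sample text, not taken from the actual paper.
draft = "\n".join([
    r"\title{Visible title} % was: Hidden draft title",
    r"% \author{Hidden author}",
    r"Accuracy improved by 5\% overall.",
])

print(strip_comments(draft))
```

Leaving a bare '%' where a trailing comment used to be is a deliberate choice in this sketch: a '%' at the end of a line also suppresses the line break that follows it, so deleting it outright could change the spacing of the typeset output.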
in Web Service, Posted by logc_nt