The GPT-4 paper published by Microsoft contained a lot of commented-out content



On March 22, 2023, Microsoft published 'Sparks of Artificial General Intelligence: Early experiments with GPT-4', a 154-page paper on GPT-4, OpenAI's large language model (LLM). However, a comparison of the published paper with its original source has revealed that various content was commented out.



A Twitter user named DV (@DV2559106965076) pointed out the commented-out content in the paper. At the time of writing, this thread is the account's only activity, and the user's identity is completely unknown.

DV and colleagues examined the LaTeX source code on arXiv, where the paper was posted, and noticed that a lot of information had been commented out of the paper.
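As a rough illustration of how commented-out text can be spotted in LaTeX source, the following Python sketch lists every comment in a source string. In LaTeX, everything after an unescaped '%' on a line is a comment. The function name find_comments and the sample text are made up for this example and are not taken from the paper or from DV's thread, and the sketch does not handle special cases such as verbatim environments.

```python
import re

# A '%' starts a LaTeX comment unless it is escaped, i.e. preceded by an
# odd number of backslashes (as in "50\%"). The pattern allows only an
# even number of backslashes directly before the '%'.
COMMENT_RE = re.compile(r"(?<!\\)(?:\\\\)*%(.*)$")


def find_comments(tex_source):
    """Return (line_number, comment_text) pairs for non-empty comments.

    Simplified sketch: verbatim environments and similar special cases
    are not handled.
    """
    found = []
    for lineno, line in enumerate(tex_source.splitlines(), start=1):
        match = COMMENT_RE.search(line)
        if match and match.group(1).strip():
            found.append((lineno, match.group(1).strip()))
    return found


# Hypothetical sample source, invented for illustration only.
sample = "\n".join([
    r"\author{A. Author}",
    r"% \author{Hidden Author}",
    r"Accuracy rose by 50\% overall. % TODO: verify number",
])

for lineno, text in find_comments(sample):
    print(f"line {lineno}: {text}")
```

Run over the .tex files of a downloaded arXiv source archive, a scan like this would surface both fully commented-out lines and trailing remarks, while leaving escaped percent signs in the body text alone.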



According to DV, GPT-4 was rumored to have the internal name 'DV-3', and in fact the third author of the paper was listed as 'Davinci 3', but the entry is commented out, with the affiliation noted as unclear.



In addition, GPT-4, which is actually multimodal and can process combinations of text, voice, images, mathematical expressions, and so on, is described as 'text-only', and the authors do not appear to know the cost of training. "Interestingly, Microsoft researchers don't seem to know much about GPT-4," DV said.



Sections 7.3 'Toxic Content' and 7.4 'Toxicity: Generation & Detection', in the latter half of Chapter 7 'Discriminative Capabilities', have disappeared from the paper entirely. This part discussed how GPT-4 generates harmful content without being prompted to. "Fortunately, GPT-4 is better at detecting harmful content than existing LLMs," DV said.



“There is still more information to be mined from this material,” DV said, concluding the thread with a remark that begins “I was worried about whether or not ...”



The matter has drawn many comments on the social news site Hacker News.

Microsoft's paper on OpenAI's GPT-4 had hidden information | Hacker News
https://news.ycombinator.com/item?id=35281527



According to zamnos, the original title of the paper was 'First Contact With an AGI System'.

tysam_and argues that the thread is clickbait (a sensational title intended to attract clicks) built on superficial information, and urges readers not to visit the tweet.

In addition, max_expectation notes that their boss used a tool to clean up the LaTeX code before posting to arXiv, precisely to avoid this kind of situation.

GitHub - google-research/arxiv-latex-cleaner: arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
https://github.com/google-research/arxiv-latex-cleaner

in Web Service, Posted by logc_nt