Jan 13, 2026 23:32:00

It has been demonstrated that multiple AI models can reproduce 70 to 90 percent of books such as 'Harry Potter'

A research team from Stanford University and Yale University has reported that they have succeeded in getting multiple AIs to reproduce most of the text contained in books such as 'Harry Potter and the Philosopher's Stone' in its original form.

[2601.02671] Extracting books from production language models
https://arxiv.org/abs/2601.02671

Researchers extract up to 96% of Harry Potter word-for-word from leading AI models
https://the-decoder.com/researchers-extract-up-to-96-of-harry-potter-word-for-word-from-leading-ai-models/

Boffins probe commercial AI models, find Harry Potter • The Register
https://www.theregister.com/2026/01/09/boffins_probe_commercial_ai_models/

From mid-August to mid-September 2025, the research team tested four models, Anthropic's 'Claude 3.7 Sonnet,' OpenAI's 'GPT-4.1,' Google's 'Gemini 2.5 Pro,' and xAI's 'Grok 3,' to see whether they could be made to output the contents of books.

The graph below shows the results when only passages of at least 100 continuous words in the output are counted. Red is Claude 3.7 Sonnet, blue is GPT-4.1, yellow is Grok 3, and green is Gemini 2.5 Pro. The higher the bar, the closer the output is to the original work. The works tested are 'The Great Gatsby,' 'Harry Potter and the Philosopher's Stone,' '1984,' 'Frankenstein,' 'The Hobbit,' 'Harry Potter and the Goblet of Fire,' 'The Hunger Games,' 'The Catcher in the Rye,' 'A Song of Ice and Fire,' 'Beloved,' 'The Da Vinci Code,' and 'Catch-22.'
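As a rough illustration of this kind of measurement, the sketch below counts how much of a reference text is covered by long matching runs in a model's output. It is a simplification and not the researchers' actual metric: it only counts exact word-for-word runs found by Python's standard difflib module, and the function name long_block_coverage and the min_words parameter are names made up for this example.

```python
from difflib import SequenceMatcher
from typing import Tuple


def long_block_coverage(reference: str, output: str, min_words: int = 100) -> Tuple[float, int]:
    """Return (fraction of reference covered, longest block in words).

    Only exact word-for-word runs of at least `min_words` words are counted.
    This is an assumed, simplified stand-in for the study's measurement.
    """
    ref_words = reference.split()
    out_words = output.split()
    if not ref_words:
        return 0.0, 0

    # autojunk=False stops difflib from ignoring very frequent words on long inputs.
    matcher = SequenceMatcher(None, ref_words, out_words, autojunk=False)

    # get_matching_blocks() returns non-overlapping exact matches; its final
    # zero-length sentinel entry is dropped by the size filter.
    sizes = [b.size for b in matcher.get_matching_blocks() if b.size >= min_words]

    return sum(sizes) / len(ref_words), max(sizes, default=0)
```

With the threshold set to 100 words, short coincidental overlaps such as common phrases contribute nothing, so only long reproduced passages raise the score.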

The procedure had two phases. In Phase 1, the researchers tried to get the AI to output a short phrase from a book. In Phase 2, they showed it the beginning of the book and instructed it to 'continue writing the sentence exactly as it appears in the original work.' After Phase 1, the researchers kept asking the AI to continue its output until one of three things happened: it refused the request, it output an ending such as 'The end,' or it reached the request limit. They did not provide any information about the original text beyond the beginning.
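The three stopping conditions described above can be sketched as a simple loop. This is only an illustration of the control flow, not the researchers' code: next_reply and is_refusal are hypothetical callables standing in for a model call and a refusal check, and the ending marker and default request limit are assumptions.

```python
from typing import Callable, List, Tuple

END_MARKERS = ("the end",)  # assumed ending marker, for illustration only


def run_continuation_loop(
    next_reply: Callable[[], str],      # hypothetical: returns the model's next chunk
    is_refusal: Callable[[str], bool],  # hypothetical: detects a refusal
    max_requests: int = 50,             # assumed request limit
) -> Tuple[List[str], str]:
    """Collect chunks until a refusal, an ending marker, or the request limit."""
    chunks: List[str] = []
    for _ in range(max_requests):
        reply = next_reply()
        if is_refusal(reply):
            return chunks, "refused"
        chunks.append(reply)
        # Compare case-insensitively and ignore trailing punctuation.
        if reply.strip().lower().rstrip(".!").endswith(END_MARKERS):
            return chunks, "ending"
    return chunks, "request_limit"
```

The returned reason string records which of the three conditions ended the run, which makes it possible to tell a refusal apart from a run that simply reached the end of the book.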

Claude 3.7 Sonnet reproduced 95.8% of 'Harry Potter and the Philosopher's Stone.' Gemini 2.5 Pro and Grok 3 reproduced less, but still managed over 70%. Gemini 2.5 Pro also produced the longest continuous block in the test, at 9,070 words.

Claude 3.7 Sonnet and GPT-4.1 strongly resisted producing the output. GPT-4.1 in particular refused to continue as early as Phase 1, which resulted in a significantly lower reproduction rate. Claude 3.7 Sonnet required a jailbreak, which involved adjusting the prompts, while Gemini 2.5 Pro and Grok 3 complied without any noticeable resistance.

A low figure does not mean that the book was absent from the training data. The research team reported that the output of GPT-4.1 included scenes featuring characters and items from 'A Song of Ice and Fire,' such as Ser Waymar, the Heretics, and the Stark family's sword, Ice.


Jan 13, 2026 23:32:00 in AI, Posted by logc_nt