Research suggests that AI models need sleep.



Advanced AI models search through vast amounts of data in response to complex user instructions and carry out long-running tasks. New research finds that such prolonged processing degrades a model's performance, and suggests that models need 'sleep'-like periods in which they temporarily stop taking in new information and consolidate what they have already processed.

[2605.26099] Language Models Need Sleep

https://arxiv.org/abs/2605.26099



Typical AI models, known as large language models (LLMs), generate each response by referring back to the content of the conversation so far. As a result, the computational cost grows as the conversation gets longer.
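As a rough illustration of why longer conversations cost more, the sketch below counts token-to-token comparisons under the assumption of standard causal self-attention, where every token attends to itself and to all earlier tokens. The article does not specify the attention mechanism, and the function name `attention_pairs` is made up for this example.

```python
def attention_pairs(num_tokens: int) -> int:
    """Count (query, key) pairs for causal self-attention.

    Assumption: token i attends to tokens 1..i, so the total is
    1 + 2 + ... + n = n * (n + 1) / 2.
    """
    return num_tokens * (num_tokens + 1) // 2


if __name__ == "__main__":
    for n in (1_000, 2_000, 4_000):
        print(n, attention_pairs(n))
```

Under this assumption, doubling the conversation length roughly quadruples the number of comparisons, which is the kind of growth that compressed memory schemes try to avoid.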

To overcome this, recent models reduce costs by combining two kinds of memory: one that looks up recent records and another that compresses and stores large amounts of older data. This resembles the split between working memory and long-term memory in humans. However, research by Sun Yun Li and colleagues at Carnegie Mellon University found that the information held in the AI's long-term memory is not accurate enough, so output quality declines as inference steps are repeated.



Citing the hypothesis that animals consolidate information into long-term memory during sleep, the researchers proposed a sleep-inspired method for 'transferring context window memory to persistent weights.'

In the proposed method, the model enters a 'sleep' state when its context window fills up during inference. While asleep, it repeatedly reprocesses past information and integrates it into its weights, which serve as the AI's long-term memory. Once integration is complete, the context window is cleared and the model resumes operation with the updated weights.
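The cycle described above (fill the context, replay it into the weights, clear the context, resume) can be sketched with a toy model. This is not the paper's implementation: the class `SleepyModel`, its parameters, and the simple update rule that nudges a stored value toward each replayed value are all assumptions made for illustration, and a Python dict stands in for real network weights.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SleepyModel:
    context_limit: int = 4      # hypothetical context window size, in items
    replay_passes: int = 4      # times the context is replayed per sleep
    learning_rate: float = 0.5  # step size for the toy weight update
    context: list = field(default_factory=list)  # short-term memory
    weights: dict = field(default_factory=dict)  # stand-in for weight memory
    sleep_count: int = 0

    def observe(self, key: str, value: float) -> None:
        """Add one item to the context; sleep once the window is full."""
        self.context.append((key, value))
        if len(self.context) >= self.context_limit:
            self.sleep()

    def sleep(self) -> None:
        """Replay the context repeatedly into the weights, then clear it."""
        for _ in range(self.replay_passes):
            for key, value in self.context:
                old = self.weights.get(key, 0.0)
                # Move the stored value a fraction of the way toward the target.
                self.weights[key] = old + self.learning_rate * (value - old)
        self.context.clear()
        self.sleep_count += 1

    def recall(self, key: str) -> Optional[float]:
        """Prefer the exact value in context; otherwise fall back to weights."""
        for k, v in reversed(self.context):
            if k == key:
                return v
        return self.weights.get(key)


model = SleepyModel()
for i, name in enumerate(["a", "b", "c", "d", "e"]):
    model.observe(name, float(i + 1))

print(model.sleep_count, model.context, model.recall("a"), model.recall("e"))
```

In this toy, the item "a" is no longer in the context after the first sleep, yet an approximation of it can still be recalled from the weights. More replay passes bring the stored value closer to the original, a loose analogy to the repeated reprocessing described in the article.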



When the 'sleep' process was actually implemented, the benefit of sleep turned out to grow with the complexity of the task, and accuracy improved as sleep was repeated.

'While existing large language models immediately delete past information from memory, sleep processing converts it into weights before deleting it, allowing for deeper inference about past contexts that existing models can no longer attend to,' the researchers suggest.

in AI, Science, Posted by log1p_kr