What is Microsoft's 'Memora' long-term memory architecture that prevents AI agents from 'forgetting'?



On June 29, 2026, Microsoft Research released 'Memora: A Harmonic Memory Representation Balancing Abstraction and Specificity,' a long-term memory architecture for AI agents. This memory system allows AI agents to efficiently store long-term conversation and work histories and retrieve only the necessary information.

Memora: A Harmonic Memory Representation Balancing Abstraction and Specificity - Microsoft Research

https://www.microsoft.com/en-us/research/blog/memora-a-harmonic-memory-representation-balancing-abstraction-and-specificity/



Many AI agents are designed so that their memory starts from scratch each time they begin a conversation. In large-scale projects, this means they have to reread long histories or search external databases for relevant information every time. As the volume of conversations and documents grows, the number of tokens required for processing grows with it, and summarizing the information can cause small details such as numbers and conditions to be lost. For AI agents that work over the long term, memory management is a major bottleneck.

Existing approaches to improving AI memory include 'Mem0,' which extracts individual facts from conversations; 'Retrieval-Augmented Generation (RAG),' which retrieves documents and conversation fragments to use in answers; and 'Zep' and 'GraphRAG,' which organize relationships between people and things as graphs.

However, trying to retain detailed information produces too many memory fragments, making it difficult to understand the overall flow. On the other hand, summarizing more aggressively makes searching easier, but details such as dates, exceptions, and the process leading to agreements tend to be lost. Memora is presented as a technology that balances the preservation of detailed information with ease of organization.

Memora's distinguishing feature is its design that separates 'what to save' from 'how to search.' The saved content is called 'memory value,' and it holds detailed information such as the flow of a conversation or project details. On the other hand, 'primary abstraction' and 'cue anchors' are provided as entry points for searching. The primary abstraction is a short phrase that represents the subject of the memory, while cue anchors are clues that lead you back to the memory from related people's names, plans, or topics.



For example, if the information to be remembered is 'Dave and Sarah agreed to postpone the Project Orion prototype to April 1st, the pilot launch to May 2nd, and the Minimum Viable Product (MVP) to May 30th,' Memora saves the details as a memory value. For searching, it creates a subject (the primary abstraction) such as 'Project Orion update schedule agreed upon by Dave and Sarah,' and also creates separate entry points (cue anchors) such as 'Dave,' 'Project Orion,' 'Prototype schedule,' and 'Pilot launch schedule.'
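The separation between what is saved and how it is found can be illustrated with a small Python sketch. This is not Memora's actual code or API: the class names `MemoryEntry` and `MemoryStore`, the `add` and `lookup` methods, and the exact-match, case-insensitive index are all assumptions made for illustration, using the Project Orion example above.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    """One stored memory: detailed content plus separate entry points for search."""
    primary_abstraction: str  # short phrase naming the subject of the memory
    memory_value: str         # the detailed content that is actually saved
    cue_anchors: list[str] = field(default_factory=list)  # clues leading back here


class MemoryStore:
    """Toy store with an exact-match cue index (hypothetical, not Memora's API)."""

    def __init__(self) -> None:
        self.entries: list[MemoryEntry] = []
        self._by_cue: dict[str, list[int]] = defaultdict(list)

    def add(self, entry: MemoryEntry) -> None:
        idx = len(self.entries)
        self.entries.append(entry)
        # Index each distinct cue once, ignoring case.
        for cue in {c.lower() for c in entry.cue_anchors}:
            self._by_cue[cue].append(idx)

    def lookup(self, cue: str) -> list[MemoryEntry]:
        """Return every memory reachable from the given cue anchor."""
        return [self.entries[i] for i in self._by_cue.get(cue.lower(), [])]


store = MemoryStore()
store.add(MemoryEntry(
    primary_abstraction="Project Orion update schedule agreed upon by Dave and Sarah",
    memory_value=(
        "Dave and Sarah agreed to postpone the Project Orion prototype to April 1st, "
        "the pilot launch to May 2nd, and the MVP to May 30th."
    ),
    cue_anchors=["Dave", "Project Orion", "Prototype schedule", "Pilot launch schedule"],
))

# Different clues lead back to the same detailed memory value.
for cue in ("Dave", "pilot launch schedule"):
    for hit in store.lookup(cue):
        print(cue, "->", hit.primary_abstraction)
```

In this sketch the dates survive untouched in `memory_value`, while the short subject and the cues serve only as ways in. A real system would presumably match cues semantically rather than by exact string, which this toy index does not attempt.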

When people recall past events, they may recall them by date, or by names of people or projects. Memora makes it easier to access related information that would be difficult to find through simple keyword searches or searches for similar sentences, by allowing users to return to the same memory using multiple clues.

Furthermore, rather than simply returning search results all at once, Memora features a 'policy-guided retriever' that progressively revises the search strategy. The AI agent expands its clues if the initial search returns insufficient information, searches for memories that are related but not superficially similar, and stops the search once enough information has been gathered.
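The search, expand, and stop cycle described above can be sketched as a simple loop. This is only an illustration under stated assumptions, not Memora's retriever: the names `Memory`, `search`, and `iterative_retrieve` are hypothetical, the sufficiency check is a caller-supplied function standing in for whatever policy Memora actually uses, and the 'Pilot launch checklist' and 'Team lunch' memories are invented sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    abstraction: str      # short subject of the memory
    value: str            # detailed content
    cues: frozenset[str]  # lowercase cue anchors


def search(memories, cues):
    """Return memories that share at least one cue with the current cue set."""
    return [m for m in memories if m.cues & cues]


def iterative_retrieve(memories, start_cues, is_sufficient, max_steps=3):
    """Search, check sufficiency, expand cues from the hits, and repeat."""
    cues = frozenset(c.lower() for c in start_cues)
    found = []
    step = 0
    for step in range(1, max_steps + 1):
        for m in search(memories, cues):
            if m not in found:
                found.append(m)
        if is_sufficient(found):
            break  # enough information gathered: stop searching
        # Expand the clues with the cues of everything found so far.
        expanded = cues.union(*(m.cues for m in found))
        if expanded == cues:
            break  # no new clues to follow
        cues = expanded
    return found, step


# Invented sample data for illustration only.
memories = [
    Memory("Project Orion update schedule agreed upon by Dave and Sarah",
           "Prototype to April 1st, pilot launch to May 2nd, MVP to May 30th.",
           frozenset({"dave", "project orion", "pilot launch schedule"})),
    Memory("Pilot launch checklist",
           "Checklist items to finish before the pilot launch.",
           frozenset({"pilot launch schedule", "checklist"})),
    Memory("Team lunch", "Lunch booked for Friday.", frozenset({"lunch"})),
]

found, steps = iterative_retrieve(
    memories,
    start_cues=["Dave"],
    is_sufficient=lambda ms: any("checklist" in m.value.lower() for m in ms),
)
print(steps, [m.abstraction for m in found])
```

Starting from the single clue 'Dave', the first pass reaches only the schedule memory. Its cues are then added to the query, which lets the second pass reach the checklist memory even though that memory never mentions Dave, and the loop stops there.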

The results on the 'LoCoMo' benchmark, which handles long conversations, are as follows. 'Memora(S)' is a simple method that searches for semantically similar memories, while 'Memora(P)' is a method that uses the 'policy-guided retriever.' Memora(P) achieved an LLM-as-a-Judge score of 86.3%, a metric that checks whether the generated answer semantically matches the correct answer. Both methods also outperformed methods that read the entire context, as well as methods such as RAG and Mem0. Furthermore, they are reported to have operated with up to 98% fewer context tokens than methods that read the entire context. However, while Memora(P) is more powerful, it takes longer to process than Memora(S) because it searches in multiple steps.



At the time of writing, Memora has not been added to existing products such as Microsoft 365 Copilot; it remains at the stage of research results and code released by Microsoft Research.

Microsoft Research also mentioned future directions such as 'MemLoop,' which uses memory retrieval results and task failures to improve the memory system; 'Deferred Memory,' which delays memory creation until sufficient context is gathered; and 'Group Memory,' which manages sources and access scope while sharing knowledge among teams and multiple agents. Microsoft Research stated, 'We released Memora as a foundation for AI agents to become entities that collaborate over the long term and build knowledge, rather than being entities that forget everything every time.'

in AI, Posted by log1d_ts