Can AI detect signs of cognitive decline better than human doctors?



Dementia has a devastating impact on patients and their families worldwide, making it crucial for doctors to detect early signs of dementia and provide appropriate prevention and treatment. Researchers at Massachusetts General Hospital have developed an AI tool that scans doctor records to identify signs of cognitive decline.

An autonomous agentic workflow for clinical detection of cognitive concerns using large language models | npj Digital Medicine
https://www.nature.com/articles/s41746-025-02324-4



Can AI detect cognitive decline better than a doctor? New study reveals surprising accuracy | Live Science
https://www.livescience.com/health/ageing/can-ai-detect-cognitive-decline-better-than-a-doctor-new-study-reveals-surprising-accuracy

The research team developed an AI tool that scans physicians' notes and flags patients whose cognitive function warrants closer attention. Rather than a single model, the tool is a workflow of five collaborating AI 'agents' that review and refine each other's output.
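The paper's prompts and orchestration code are not reproduced here, but the propose-and-review idea behind such an agentic workflow can be sketched roughly as follows. The keyword-based "model", the `classify`/`critique` helpers, and the review loop are hypothetical stand-ins, not the authors' implementation (which is built on Llama 3.1):

```python
# Hypothetical sketch of an agentic review loop: one agent proposes a label,
# reviewer agents then accept or overturn it. The keyword check below is a
# toy stand-in for a large language model.

COGNITIVE_TERMS = ("memory loss", "confusion", "disorientation")

def classify(note: str) -> bool:
    """Proposer agent: flag the note if it mentions a cognitive concern."""
    return any(term in note.lower() for term in COGNITIVE_TERMS)

def critique(note: str, label: bool) -> bool:
    """Reviewer agent: re-examine the note and confirm or revise the label.
    A real reviewer would re-read the note under a different prompt; here we
    simply re-run the same check as a placeholder."""
    return classify(note)

def review_pipeline(note: str, n_reviewers: int = 4) -> bool:
    """One proposer plus four reviewers, echoing the five-agent setup."""
    label = classify(note)
    for _ in range(n_reviewers):
        label = critique(note, label)
    return label

print(review_pipeline("Patient reports worsening memory loss."))  # True
```

The point of the structure, as described in the article, is that no single model's judgment is final: each flag survives only if the downstream agents agree.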

The AI tool was built on Meta's Llama 3.1 and trained on three years' worth of medical records, including patient visit notes, progress reports, and discharge summaries. The data came from real hospital records that clinicians had reviewed for cognitive concerns.

First, the AI was presented with a balanced set of medical records, half with and half without cognitive concerns, and tuned until it matched clinician-assigned labels with 91% accuracy. The tool was then tested on a separate dataset to assess how well it flagged cognitive concerns. To better reflect real-world care, only about one-third of the notes in this second dataset had been flagged by clinicians as showing cognitive concerns.



In testing, the AI tool's sensitivity dropped to about 62%, meaning it missed nearly 4 out of 10 visit notes that clinicians had flagged as 'worrisome for cognitive decline.' On its own, this looks like a failure. But when the research team reexamined the visit notes and the AI tool's outputs, they found that the AI and the human clinicians had simply classified some cases differently.
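Sensitivity here is the share of clinician-flagged notes that the AI also flagged. As a small illustration with made-up numbers (not the study's data), catching only 8 of 13 flagged notes yields a sensitivity of about 62%:

```python
# Sensitivity (recall): of the notes clinicians flagged, what fraction did
# the AI also flag? The labels below are invented for illustration.
def sensitivity(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives
    return tp / (tp + fn)

y_true = [1] * 13 + [0] * 20           # 13 notes flagged by clinicians
y_pred = [1] * 8 + [0] * 5 + [0] * 20  # the AI catches only 8 of them
print(round(sensitivity(y_true, y_pred), 3))  # → 0.615, i.e. about 62%
```

Note that sensitivity says nothing about false alarms; a tool can score high simply by flagging everything, which is why the disagreement review described below matters.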

The researchers then had clinicians re-review the 16 consultation notes where the AI tool and humans disagreed, without revealing which label had come from the AI and which from a human.

After review, clinicians favored the AI tool's flagging over the original human judgment in seven cases (about 44%). 'This is one of the most surprising findings,' said study co-author Hussein Estiri, an associate professor of neurology at Massachusetts General Hospital.

According to Estiri, the AI sometimes declined to flag memory problems, confusion, or changes in a patient's thinking even when they were directly documented in the medical records. In other words, the AI applied the diagnostic criteria more conservatively than clinicians did, which produced different judgments in some cases.

'The goal of AI tools is not to replace clinical judgment, but to act as an adjunct to screening,' said Lydia Moura, co-author of the paper and associate professor of neurology at Massachusetts General Hospital. In settings with clinician shortages, such tools could help determine which patients should receive follow-up care.



Note that the AI tool was trained on data from a single hospital network, so its accuracy may not carry over to settings where documentation practices differ.

According to Estiri, the AI tool is intended to work quietly in the background during a doctor's routine visit, flagging potential concerns. 'The concept is not that the doctor sits down and uses an AI tool, but rather that the AI system is part of the clinical record, providing insight into what the doctor is seeing and why,' Estiri said.

in AI, Science, Posted by log1h_ik