Microsoft AI Outperforms Doctors in Diagnosing Complex Cases

Microsoft has unveiled a new artificial intelligence tool that outperforms experienced doctors in diagnosing complex medical cases.
The AI system, called the Microsoft AI Diagnostic Orchestrator (MAI-DxO), correctly diagnosed 85.5% of difficult cases sourced from the New England Journal of Medicine, compared with an accuracy of just 20% for a group of 21 physicians from the US and UK.
Although not yet available for clinical use, the tool marks a significant advance in medical AI. “We’re taking a big step towards medical superintelligence,” said Mustafa Suleyman, CEO of Microsoft AI, in a LinkedIn post.
MAI-DxO was tested in combination with several prominent AI models, including GPT, Llama, Claude, Gemini, Grok, and DeepSeek. The system's strongest diagnostic performance came when it was paired with OpenAI’s o3 model.
Mimicking the diagnostic approach of human doctors, MAI-DxO analyzes patient symptoms, asks follow-up questions, and recommends tests, all while aiming to minimize unnecessary diagnostics that often lead to healthcare overspending.
While acknowledging the AI's superior performance in these trials, Microsoft noted that in actual clinical settings, doctors typically have access to second opinions, references, and tools that weren’t factored into the study.
The benchmark used in the study consisted of 304 challenging, real-world cases from recent issues of The New England Journal of Medicine. Microsoft said this approach goes beyond earlier assessments that relied on the USMLE multiple-choice format, which the company said “favors memorization over deep understanding.”
Unlike past tests, the new diagnostic benchmark requires “sequential diagnosis, a cornerstone of real-world medical decision making,” Microsoft wrote in a blog post.
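To make the idea of sequential diagnosis concrete, here is a minimal sketch of how such an evaluation loop might look. It is not Microsoft's implementation: the case data, the prices, and every name in it (Case, Episode, run_episode, toy_policy) are invented for illustration. The agent starts with only the presenting complaint, chooses one action per turn (ask a question, order a test, or commit to a diagnosis), and the loop tracks what the ordered tests cost.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """A toy case. Findings stay hidden until the agent asks or orders a test."""
    presenting: str
    answers: dict  # question -> answer (free of charge in this sketch)
    tests: dict    # test name -> (result, made-up price in dollars)
    truth: str     # reference diagnosis used for scoring


@dataclass
class Episode:
    log: list = field(default_factory=list)  # actions in the order taken
    cost: float = 0.0                        # running total for ordered tests


def run_episode(case, policy, max_steps=10):
    """Let `policy` pick one action per turn until it commits to a diagnosis."""
    ep = Episode()
    known = {"presenting": case.presenting}
    for _ in range(max_steps):
        kind, arg = policy(known)
        ep.log.append((kind, arg))
        if kind == "ask":
            known[arg] = case.answers.get(arg, "not available")
        elif kind == "test":
            result, price = case.tests.get(arg, ("not available", 0.0))
            known[arg] = result
            ep.cost += price
        elif kind == "diagnose":
            return arg == case.truth, ep
        else:
            raise ValueError(f"unknown action: {kind}")
    return False, ep  # ran out of turns without committing


def toy_policy(known):
    """Hypothetical rule-based stand-in for a model-driven agent."""
    if "travel history" not in known:
        return ("ask", "travel history")
    if "blood smear" not in known:
        return ("test", "blood smear")  # cheap, targeted test first
    if known["blood smear"] == "parasites seen":
        return ("diagnose", "malaria")
    return ("diagnose", "unknown")


# Entirely invented example case and prices.
case = Case(
    presenting="fever",
    answers={"travel history": "recent tropical travel"},
    tests={
        "blood smear": ("parasites seen", 50.0),
        "full-body CT": ("normal", 1500.0),
    },
    truth="malaria",
)

correct, ep = run_episode(case, toy_policy)
print(correct, ep.cost, ep.log)
```

In this toy run the policy reaches the reference diagnosis without ordering the expensive scan, which is the kind of trade-off between accuracy and testing cost that a sequential benchmark can measure and a multiple-choice exam cannot.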
Microsoft plans to further develop MAI-DxO by evaluating it under more routine conditions and conducting clinical testing for safety and accuracy. Regulatory approval would be necessary before the tool could be deployed in medical settings.
“This is a proof-of-concept showing that [large language model] systems can master medicine’s most intricate diagnostic challenges by following the same step-by-step reasoning and debate process that expert physicians use every day,” said Bay Gross, Microsoft AI’s vice president of health.
Microsoft reiterated that its goal is not to replace physicians, but to enhance productivity and care. The company sees AI playing a key role in automating repetitive tasks, supporting diagnoses, and tailoring treatment strategies.
A detailed research paper on MAI-DxO has been prepared but has yet to be peer-reviewed or published in a scientific journal.