Artificial intelligence is revolutionizing healthcare, but it’s not replacing your doctor
The next time you undergo a blood test, X-ray, mammogram, or colonoscopy, an AI algorithm will likely interpret the results first, even before your doctor sees them.
In just a few years, AI has spread rapidly to hospitals and clinics around the world. The U.S. Food and Drug Administration (FDA) has approved more than 1,000 health-related AI tools, and more than two out of three physicians say they use AI to some extent, according to a recent survey by the American Medical Association.

The potential is enormous. AI, particularly in the form of AI agents capable of thinking, adapting, and acting autonomously, can ease physicians’ workloads by drafting patient notes and chart summaries, supporting precision medicine with more targeted treatments, and flagging subtle abnormalities in scans and slides that the human eye might miss. AI can accelerate drug discovery and the identification of therapeutic targets through novel approaches such as protein structure prediction and AI-based protein design, the work recognized by last year’s Nobel Prize in Chemistry. AI can give patients faster, more personalized support by scheduling appointments, answering questions, and predicting side effects. It can also help match candidates to clinical trials, monitor health data in real time, and alert doctors and patients early enough to prevent complications and improve outcomes.
But the promise of AI in medicine will only be realized if it is built and used responsibly.
Today’s AI algorithms are powerful tools that can recognize patterns, make predictions, and even reach decisions. But they are not infallible or omniscient. Nor are they close to matching human intelligence, despite what some advocates of so-called artificial general intelligence (AGI) suggest. A number of recent studies highlight both the potential and the risks, showing how medical AI tools can misdiagnose patients and how reliance on AI can erode doctors’ own skills.
A team at Duke University (including one of us) tested an FDA-approved AI tool designed to detect microscopic swelling and hemorrhages in brain MRIs of Alzheimer’s patients. The tool improved radiologists’ ability to spot these tiny specks, but it also generated false alarms, often mistaking harmless blurring in the images for serious findings. We concluded that the tool is useful, but that radiologists should read the MRIs carefully first and treat the AI as a second opinion, not the other way around.
These findings are not limited to the tool we studied. Few hospitals independently evaluate the AI tools they use. Many assume that once a tool is FDA-approved, it will work in their local environment, which is not necessarily true. AI tools perform differently across patient populations, each with its own vulnerabilities. Health systems should therefore run a thorough quality check before deploying any AI tool, to confirm that it works in their local environment, and then train clinicians in its proper use. Moreover, AI algorithms and the ways humans interact with them change over time, which is why Robert Califf, the former FDA Commissioner, has urged continued monitoring of medical AI tools after they reach the market to ensure their reliability and safety in the real world.
In another recent study, gastroenterologists in Europe were given a new AI-powered system for detecting polyps during colonoscopies. Using this tool, they initially detected more polyps, the small growths that can develop into cancer, suggesting that the AI was helping them spot lesions they might otherwise have missed. But when the doctors returned to performing colonoscopies without the AI system, they detected fewer precancerous polyps than they had before they started using it. Although the exact reason is unclear, the study authors believe the doctors may have become so reliant on the AI that, in its absence, they were less focused and less able to identify these polyps. Another study supports this phenomenon of deskilling, showing that overreliance on computer aids can make the human eye less likely to scan the periphery of an image. The very tool that was supposed to improve medical practice may have weakened it.
If used uncritically, AI not only spreads misinformation but also weakens our ability to verify its accuracy. It’s the Google Maps effect: drivers who once navigated from memory often lack basic geographic awareness because they have grown accustomed to unthinkingly following the voice in their cars. Earlier this year, a researcher surveyed more than 600 people of various ages and educational backgrounds and found that the more heavily a person relied on AI tools, the weaker their critical-thinking skills tended to be. This is known as “cognitive offloading,” and we are only beginning to understand how it applies to doctors’ use of AI.
All of this confirms that AI in medicine, as in any other field, works best when it augments the work of humans. The future of medicine isn’t about replacing healthcare providers with algorithms; it’s about designing tools that improve human judgment and enhance what we can accomplish. Physicians and other providers must be able to identify when AI is wrong and maintain the ability to operate without AI when necessary. The way to achieve this is to build medical AI tools responsibly.
We need tools built on a different model: tools that prompt providers to reconsider, weigh alternatives, and stay actively engaged. This approach is known as “intelligent choice architecture” (ICA). With ICA, AI systems are designed to support sound judgment rather than replace it. Instead of declaring, “There is an abnormality here,” an ICA tool might highlight a specific area and prompt the clinician to examine it carefully. ICA strengthens the skills on which medicine relies: clinical reasoning, critical thinking, and human judgment.
Apollo Hospitals, India’s largest private healthcare system, recently began using an AI-powered cardiac risk assessment tool built along ICA lines to guide doctors in preventing heart attacks. A previous AI tool produced a single heart attack risk score for each patient. The new system offers a more personalized analysis of what that score means for the patient and which factors contributed to it, so the patient knows which risks to address. It’s the kind of gentle guidance that helps doctors succeed without compromising their autonomy.
There’s a tendency to overhype AI as if it has all the answers. In medicine, we must temper these expectations to save lives. We must train medical students to work with and without AI tools, and treat AI as a second opinion or assistant rather than an expert with all the right answers. The future lies in the collaboration of humans and AI agents.
We’ve already added tools to medicine without diminishing doctors’ skills. A stethoscope amplifies the ear without replacing it. Blood tests provide new diagnostic information without replacing medical history or physical exams. We must apply the same standard to artificial intelligence. If a new product makes doctors less accurate or decisive, it is either not ready for practical use or is being used incorrectly.
For any new medical AI, we must ask: Does it make doctors more thoughtful or less? Does it encourage reconsideration or invite automatic assent? If we commit to designing systems that enhance our abilities rather than replace them, we will have the best of both worlds, combining the tremendous potential of AI with the critical thinking, empathy, and real-world judgment that only humans possess.
