The AI tools shaping patient care may be operating outside regulatory oversight. MIT researchers say it's time to change that.

Every day, across thousands of American hospitals, artificial intelligence quietly shapes decisions that determine patient outcomes. An algorithm flags a patient as high risk for sepsis; a risk score informs whether a woman receives additional cancer screening; a deterioration model triggers an alert that sends a care team to a bedside. These tools are embedded in the workflows of nearly two-thirds of US hospitals, integrated into the electronic health record systems clinicians rely on daily. But many have never been reviewed by the FDA.

A new viewpoint in The Lancet Digital Health, co-authored by researchers at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) and Jameel Clinic, traces how this problem took root, why it carries serious consequences, and what genuine transparency would require to fix it.

The argument, the scientists say, is not that AI has no place in clinical decision-making. It is that a $4 billion market of clinical decision support tools operates largely beyond public accountability, leaving patients and providers often unable to know whether the tools influencing their care have been validated, by whom, or for which populations they work as intended.

"The current situation leaves clinical decision software (CDS) tool developers and users with undesirable uncertainty," the authors write, "and patients and providers using tools with unknown validation, outcomes, and regulatory status." That uncertainty, they argue, stems from well-intentioned policy that moved faster than enforcement could follow.

CDS refers broadly to any tool that provides health professionals with information intended to improve care, a category ranging from simple reminders and order sets to sophisticated AI-driven prediction models.
The FDA's definition of a medical device is broad enough to encompass many of them, but a decade of regulatory shifts has produced a landscape where the same tool might face rigorous premarket scrutiny if sold by a commercial device manufacturer, and none at all if bundled into an electronic health record system.

The pivotal moment came with the 2016 21st Century Cures Act, which carved out certain CDS software from device regulation, provided the tools met four criteria centered on transparency and clinician autonomy. "The FDA's 2019 draft guidance interpreted those criteria relatively permissively, and health systems responded by adopting a wave of new AI tools during a period when they believed they were operating within the rules," says Regina Barzilay, professor at MIT CSAIL and co-author of the viewpoint. "When the FDA's 2022 final guidance significantly tightened that interpretation, those tools were already embedded in clinical practice, and enforcement remained minimal. The effect was a kind of regulatory whiplash: the goalposts moved, the tools stayed, and the gap between policy and practice quietly widened."

The paper draws on a 2023 survey of more than 2,400 hospitals, which found that 65 percent reported using AI or predictive models, primarily to identify high-risk patients. The researchers point to several widely used tools to illustrate the uneven terrain. The Epic Sepsis Model, deployed broadly across the country through the Epic electronic health record system, is not FDA-cleared. The Sepsis ImmunoScore, which serves the same clinical function, obtained FDA clearance in 2024 through the agency's De Novo pathway. The Epic Deterioration Index, the most widely accessible inpatient deterioration model in the country through its integration with Epic, similarly lacks clearance, while two competing models, PeraTrend and eCART, have obtained it.

The Tyrer-Cuzick breast cancer risk model offers perhaps the most pointed example.
It is embedded in national clinical guidelines from the National Comprehensive Cancer Network and used by some insurers to determine whether a woman qualifies for MRI screening or chemoprevention. Studies have raised concerns about its predictive accuracy, reporting area-under-the-curve values averaging around 0.60, and research has documented that the model underestimates cancer risk for Black women. Despite the high stakes of the decisions it informs, it has not been subject to FDA review.

"Why should a CDS tool developed by a large interstate health system and used on thousands of patients be exempt from regulation," the authors ask, "whereas an identical CDS tool developed and sold by a device manufacturer is subject to premarket clearance?"

The question points to a deeper structural problem. The current framework places the heaviest regulatory burden not on the tools with the widest reach, but on those sold through conventional commercial channels. An AI model developed internally by a hospital system, or embedded in software hospitals already use, can reach millions of patients without triggering the scrutiny a smaller, commercially sold tool would face. The authors acknowledge that the FDA's low-intensity enforcement reflects genuine resource constraints and competing priorities, including the rapid emergence of generative AI tools that existing guidance was not designed to address. But the effect, they argue, is a market in which patient safety protections are applied inconsistently and often inversely to clinical impact.

The landscape shifted again in January 2026, when the FDA issued revised final guidance superseding the 2022 version. The update introduced enforcement discretion for CDS tools that produce a single clinically appropriate recommendation, expanded its illustrative examples of what counts as a regulated device, and removed language from the 2022 version that had explicitly identified risk prediction tools as CDS.
It was a substantive response to years of industry pushback, but it moved in more than one direction. Under the new enforcement discretion policy, a tool like the Tyrer-Cuzick model is arguably less likely to face active FDA oversight than before. The authors' concern is not that outcome in isolation. Whether a tool benefits from enforcement discretion or is not a medical device at all, it is still being used at scale to shape patient care, and in each of those categories, the mechanisms for validating performance, surfacing population-specific failures, and informing the patients and clinicians involved are largely absent.

A full regulatory crackdown, the researchers argue, would be neither realistic nor desirable. It would overwhelm an already resource-constrained agency, risk disrupting patient care where no regulated alternatives exist, and could create a chilling effect on CDS innovation at a moment when the technology holds genuine promise. The stakes are too high, however, for the status quo to continue. Instead, they propose a pragmatic framework of three mutually reinforcing steps: increased public disclosure, structured dialogue between industry and regulators, and updated FDA guidance.

The first is a public registry in which health systems disclose which CDS tools they are using and how those tools were validated, without triggering regulatory consequences for disclosure. There is currently no comprehensive picture of the CDS landscape, either for regulators prioritizing oversight or for patients and providers trying to understand the tools shaping their care. A registry would not replace FDA review, but it would create the visibility necessary for risk-based oversight to function.

The second is structured, non-binding dialogue between the FDA, developers, and clinicians. The current environment discourages developers from engaging with regulators proactively, because doing so can invite scrutiny they are not yet equipped to navigate.
Formal but consequence-free consultation pathways would allow the FDA to better understand the tools being deployed and give developers a clearer sense of where their products stand.

The third is a revision of the 2022 guidance itself, updated to reflect new technologies and to resolve interpretive disputes that have left developers and health systems uncertain about their obligations. On this front, the authors have in a sense gotten what they asked for. The FDA's January 2026 update addressed several of the ambiguities the viewpoint identified, but also narrowed the scope of active oversight in places where the authors argue it is most needed.

Taken together, they argue, these steps represent a pragmatic path between the extremes of enforcement overreach and continued ambiguity. The goal is not to slow the adoption of AI in clinical settings but to make that adoption visible, accountable, and ultimately safer. Patients interacting with AI-informed care deserve to know whether the tools involved have been rigorously evaluated, and for patients in populations historically underserved by medical research, that question carries particular weight.

"The guidance is evolving, but not necessarily in the direction patient safety requires," says Maëlle-Marie Corso. "Getting CDS guidance right matters more than ever as AI tools proliferate in clinical settings, but drawing the line between what the FDA should and should not scrutinize is genuinely hard, and wherever that line ends up, it will shape patient care for years."

"The FDA itself is just beginning to use AI," says co-author Paul Kim, who is principal of Kendall Square Policy Strategies.
"Only dialogue with stakeholders, including innovators like MIT, can ensure the Agency focuses on meeting patient needs while managing critical risks."

The viewpoint was authored by Maëlle-Marie Corso and MIT professor Regina Barzilay of CSAIL and the Jameel Clinic, alongside colleagues from Kendall Square Policy Strategies, Mayo Clinic's Center for Digital Health, and Massachusetts General Hospital. It was funded by the MIT Abdul Latif Jameel Clinic for Machine Learning in Health.

Reported by csail.mit.edu · Wednesday, June 10, 2026