Munib Mesinovic, a Postdoctoral Research Assistant in the AI4DH lab in the Department of Engineering Science, has been awarded funding as principal investigator for CRITICAL-MM, a new open benchmark platform for evaluating large AI models in intensive care settings.

The project is funded through the Open Multimodal AI Benchmark (OMAIB) call, part of the UK Open Multimodal AI Network (UKOMAIN), an EPSRC Network+ led by the University of Sheffield. The call was highly competitive: of 35 expressions of interest submitted, 12 were invited to submit full proposals, and only 4 projects were ultimately funded. Munib is the only postdoctoral researcher acting as principal investigator among the funded projects in this round.

CRITICAL-MM (Cross-modal Reasoning with Integrated Clinical Assessment for Large Multimodal Models) addresses the question of how AI systems should be evaluated for clinical use. Rather than measuring only technical performance, the benchmark is designed to assess whether models can be trusted by clinicians, integrated into healthcare systems, and understood by patients. Drawing on nearly half a million admissions across publicly accessible intensive and emergency care datasets, the platform will enable rigorous evaluation of AI models on real-world ICU tasks. A second evaluation tier involves participatory workshops in which clinicians and ethicists assess trust, interpretability, and ethical acceptability using co-designed rubrics.