Databricks for Good and Virtue Foundation: Partnering to Connect Medical Volunteers to Critical Health Services in 72 Countries

How an AI-focused Databricks pro bono project is helping the Virtue Foundation deliver healthcare where it is most needed by unifying millions of records to better match clinician skills to volunteer opportunities. By building scalable AI pipelines, we are creating one of the most advanced medical datasets in the world.

Friday, May 22, 2026

Introduction

Virtue Foundation is a nonprofit focused on global health delivery and creating an efficient marketplace for global philanthropic healthcare. To date, they’ve delivered care to over 50,000 patients with a special focus on Ghana and Mongolia. The backbone of this marketplace is the curation of global healthcare facility data through VF Match, a platform that connects medical professionals to volunteer opportunities in 72 low and low-middle income countries. Databricks for Good has been partnering closely with Virtue Foundation since 2024 to leverage AI to aggregate data across these countries and make it actionable.

An initial proof of concept demonstrated that LLMs could extract structured information from disparate web data sources to create a map of healthcare infrastructure and, most importantly, the gaps in services in under-resourced areas. However, scaling this functionality and moving it into production posed many challenges. Since that first iteration, we’ve built a Databricks-based platform that has transformed the POC into a production-grade system aggregating data from thousands of healthcare facilities and non-profits across the globe.

In this article, we walk through how we improved on our earlier work to further enable Virtue Foundation to match their community of medical volunteers with critical needs in these countries.

Building the Foundation: 72 Countries of Healthcare Data

The core of VF Match is the Foundational Data Refresh (FDR): a comprehensive healthcare facility and nonprofit dataset built from the ground up from various web-based sources. We systematically ingest and refresh data from 72 low and low-middle income countries across the globe.

Two complementary data sources power this refresh:

Overture Maps: An open-source geospatial dataset by Meta and Microsoft, providing authoritative locations for healthcare facilities.

Bright Data: Industrial web-scraping infrastructure that captures real-time information from across the internet.

The heart of FDR is an information extraction pipeline powered by OpenAI’s GPT models. Processing more than 25 million web pages through LLMs with production guarantees required rethinking traditional LLM inference pipelines. Rather than attempting one-shot extraction, our pipeline breaks the task into targeted steps: classifying medical relevance, identifying organization type (either a medical facility or NGO), and extracting specialties, equipment, and procedures.
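The staged approach described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production pipeline: the function names, prompts, and the `FacilityRecord` type are hypothetical, and the model is passed in as a plain callable so the sketch does not depend on any particular client library.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Optional

# Assumption: the model is any function mapping a prompt string to a reply string.
# In a real deployment this would wrap a hosted LLM call.
LLM = Callable[[str], str]

# Hypothetical prompts, one per targeted step.
RELEVANCE_PROMPT = "Answer yes or no: is this page about a healthcare provider or health nonprofit?\n\n"
ORG_TYPE_PROMPT = "Answer with one word, facility or ngo: what kind of organization is this?\n\n"
EXTRACT_PROMPT = "Return JSON with keys specialties, equipment, procedures (lists of strings).\n\n"


@dataclass
class FacilityRecord:
    """Hypothetical output record for one web page."""
    org_type: str
    specialties: list = field(default_factory=list)
    equipment: list = field(default_factory=list)
    procedures: list = field(default_factory=list)


def is_medically_relevant(page_text: str, llm: LLM) -> bool:
    # Step 1: a cheap yes/no filter so later steps only see relevant pages.
    return llm(RELEVANCE_PROMPT + page_text).strip().lower().startswith("yes")


def classify_org_type(page_text: str, llm: LLM) -> str:
    # Step 2: constrain the answer to a closed set of labels.
    label = llm(ORG_TYPE_PROMPT + page_text).strip().lower()
    return label if label in {"facility", "ngo"} else "unknown"


def extract_details(page_text: str, llm: LLM) -> dict:
    # Step 3: structured extraction; malformed replies degrade to empty lists.
    try:
        data = json.loads(llm(EXTRACT_PROMPT + page_text))
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    details = {}
    for key in ("specialties", "equipment", "procedures"):
        value = data.get(key)
        details[key] = [str(v) for v in value] if isinstance(value, list) else []
    return details


def process_page(page_text: str, llm: LLM) -> Optional[FacilityRecord]:
    # Run the steps in order and stop early when a page is filtered out.
    if not is_medically_relevant(page_text, llm):
        return None
    org_type = classify_org_type(page_text, llm)
    if org_type == "unknown":
        return None
    return FacilityRecord(org_type=org_type, **extract_details(page_text, llm))
```

Splitting the work this way means each step has a narrow, checkable output (a yes/no, a label from a closed set, a JSON object), so a bad reply at one step can be caught and handled without discarding or corrupting the rest of the record.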
