Biohub, the nonprofit biomedical research organization co-founded by Mark Zuckerberg and Priscilla Chan, just dropped what might be the most ambitious open-source toolkit in the history of protein science. The suite includes three AI tools designed to map, predict, and design proteins at a scale that was unthinkable a few years ago.

The centerpiece is the ESM Atlas, which covers 6.8 billion proteins. To put that in perspective, the human genome contains roughly 20,000 protein-coding genes. The atlas holds about 340,000 times that number, a gap of more than five orders of magnitude and far beyond what any individual lab could catalog in a lifetime.

What Biohub actually built

The release includes three distinct tools working in concert. ESMFold2 handles structure prediction and protein design, essentially letting researchers model how proteins fold and engineer new ones from scratch. ESMC is a protein language model trained on billions of sequences, treating amino acid chains the way GPT treats words. And the ESM Atlas ties it all together as a comprehensive database spanning those 6.8 billion proteins.
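To make the language-model analogy concrete, here is a minimal sketch of how a protein sequence can be turned into the integer tokens a model consumes, one token per amino acid. This is a toy illustration, not Biohub's code: the vocabulary, the special tokens, and the function names are all assumptions made for this example, and ESMC's real tokenizer may differ.

```python
# Toy tokenizer for illustration only; NOT the ESMC vocabulary or API.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes
SPECIAL_TOKENS = ["<pad>", "<unk>"]   # hypothetical special tokens

# Assign each token an integer ID: special tokens first, then residues.
VOCAB = {tok: i for i, tok in enumerate(SPECIAL_TOKENS + list(AMINO_ACIDS))}
ID_TO_TOKEN = {i: tok for tok, i in VOCAB.items()}


def tokenize(sequence: str) -> list[int]:
    """Map each residue to an integer ID; unrecognized letters become <unk>."""
    return [VOCAB.get(residue, VOCAB["<unk>"]) for residue in sequence.upper()]


def detokenize(token_ids: list[int]) -> str:
    """Map integer IDs back to a string of tokens."""
    return "".join(ID_TO_TOKEN[i] for i in token_ids)


if __name__ == "__main__":
    ids = tokenize("MKTAYIAK")
    print(ids)
    print(detokenize(ids))
```

Where a text model works with a vocabulary of tens of thousands of word pieces, a protein model at this level needs only about 20 residue tokens plus a few special ones. The patterns it learns come from the order of those tokens across billions of sequences.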

The practical payoff is already showing up in the lab. Biohub says the models can design functional binders with therapeutic-level affinity, meaning the AI-designed proteins bind their targets tightly enough to work as potential drugs. According to Biohub, these results have been validated through laboratory testing, not just computational prediction.