The headline last week was Andrej Karpathy joining Anthropic. The detail that matters more is what he's actually doing there.
Karpathy is not joining a product team. He's not doing evals or safety research or fine-tuning. He joined Anthropic's pretraining operation, and specifically, he's been tasked with building a new internal team focused on using Claude to accelerate pretraining research itself. The model helping to train the next version of the model. That's the recursive loop Anthropic just staffed up for, and to run it they chose the person who literally taught a generation of engineers how transformers work.
I find this genuinely interesting from where I sit. Pretraining is the foundational phase: the massive compute runs where the model first learns everything it knows before any fine-tuning or alignment work touches it. It's expensive, slow, and historically the part of the pipeline least amenable to automation. You can't easily use the model being trained to improve its own pretraining, because that model doesn't exist yet during the run. What Karpathy appears to be building instead is a research acceleration layer: using Claude to generate hypotheses, run experiments, and analyze results faster than a human team could. Not the training itself but the science around it.







