Vitalik Buterin wants to cut the cloud out of his AI workflow entirely. In an April 2, 2026 blog post, the Ethereum co-founder laid out a detailed update on his local large language model setup, one designed to keep every token of inference running on hardware he physically controls.
The bigger pitch buried in the technical walkthrough: Ethereum needs its own fine-tuned AI models, purpose-built for tasks like verifying transactions and auditing smart contracts.
The hardware and the stack
Buterin’s setup runs Qwen3.5:35B, an open-weight model, locally on a laptop with an Nvidia 5090 GPU. The reported throughput is high for consumer-grade hardware: up to 90 tokens per second.
He also tested alternative hardware. An AMD Ryzen AI Max Pro with 128 GB of unified memory hit 51 tokens per second. A DGX Spark managed 60 tokens per second.
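To put those throughput figures in perspective, the short sketch below converts each reported rate into the time needed to generate a response of a given length. The 1,000-token response length is a hypothetical value chosen for illustration, not a number from Buterin's post. The calculation also assumes a steady decode rate and ignores prompt processing, so the 90 tokens-per-second figure (reported as "up to") gives a best-case time.

```python
# Throughput figures (tokens per second) as reported in the article.
REPORTED_TOKENS_PER_SECOND = {
    "Nvidia 5090 laptop": 90,  # reported as "up to", so this is a best case
    "DGX Spark": 60,
    "AMD Ryzen AI Max Pro (128 GB)": 51,
}

# Hypothetical response length, chosen only for illustration.
RESPONSE_TOKENS = 1_000


def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to emit `tokens` at a steady decode rate, ignoring prompt processing."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


def compare(tokens: int = RESPONSE_TOKENS) -> dict[str, float]:
    """Map each machine to its generation time in seconds, rounded to 0.1 s."""
    return {
        name: round(generation_seconds(tokens, tps), 1)
        for name, tps in REPORTED_TOKENS_PER_SECOND.items()
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds} s for {RESPONSE_TOKENS} tokens")
```

Under these assumptions the gap between the fastest and slowest machine is a matter of seconds per response, which is one rough way to read the numbers rather than a benchmark in its own right.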











