Every few weeks someone asks me the same question: "Should I buy a Mac mini M4 to run AI locally?" And every time, I give the same answer: that's the wrong question to lead with. The right question is: which task, at what quality, on how much memory? Hardware is the last decision, not the first.
I've been chasing the same goal a lot of practitioners have: becoming self-sufficient on local AI so I'm less dependent on cloud LLM subscriptions, without sacrificing output quality. My current Windows machine has no usable GPU, which makes tools like Ollama and LM Studio frustrating at best. The Mac mini M4 is an obvious candidate. But "is it good?" is meaningless until you define what you're asking it to do. So let's do this the way we'd plan any piece of infrastructure: start from the workload and work backward to the spec.
The One Constraint That Governs Everything: Unified Memory
On Apple Silicon, the instinct from the PC world ("I need a bigger GPU") leads you astray. The Mac mini M4 doesn't have a discrete GPU with its own VRAM. It has unified memory, a single pool shared by the CPU and GPU. For local inference, this is actually a strength: there's no copying model weights across a PCIe bus, and nearly the whole memory pool is available to the model, minus whatever macOS and your other apps are using.
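To make "how much memory?" concrete, here is a rough back-of-the-envelope sketch in Python. The only hard arithmetic is that the weights take roughly parameters × bits per weight ÷ 8 bytes. Everything else is my own assumption for illustration: the function names are hypothetical, `reserved_gib` is a guess at what macOS and other apps keep for themselves, and `runtime_overhead` is a loose allowance for context cache and runtime buffers. Treat the result as a first-pass filter, not a guarantee.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_unified_memory(
    params_billion: float,
    bits_per_weight: float,
    total_memory_gib: float,
    reserved_gib: float = 4.0,      # assumption: memory kept by macOS and other apps
    runtime_overhead: float = 0.2,  # assumption: extra 20% for context cache and buffers
) -> bool:
    """Rough check: do weights plus overhead fit in what is left of the shared pool?"""
    needed = weights_gib(params_billion, bits_per_weight) * (1 + runtime_overhead)
    budget = total_memory_gib - reserved_gib
    return needed <= budget


if __name__ == "__main__":
    # Compare a few model sizes and quantization levels against a 16 GB machine.
    for params, bits in [(8, 4), (8, 16), (70, 4)]:
        size = weights_gib(params, bits)
        ok = fits_in_unified_memory(params, bits, total_memory_gib=16)
        print(f"{params}B at {bits}-bit: ~{size:.1f} GiB of weights, fits in 16 GB: {ok}")
```

Under these assumptions, an 8B model at 4-bit quantization fits comfortably in 16 GB, while the same model at 16-bit or a 70B model at 4-bit does not. That is the sense in which memory, not "GPU size", is the number to plan around.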







