Deploying Gemma 4 26B on Proxmox: IaC Setup with Terraform, Ansible & AMD iGPU

Originally published at woitzik.dev

Running large language models (LLMs) like Gemma 4 26B locally is usually associated with expensive, high-end Nvidia GPU hardware. But what if you want to run one in a home lab or a constrained edge environment, with the whole setup defined as Infrastructure as Code (IaC)?

In this guide, I will show you how to automate a complete local AI stack on Proxmox VE, using Terraform for the infrastructure and Ansible for provisioning. We will cover the quirks of the Proxmox Terraform provider, setting up Ollama, and deploying Open WebUI as our frontend.

As a bonus, I will show you how to enable hardware acceleration by passing through an unsupported AMD iGPU to the LXC container.
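To make the end goal concrete, here is a minimal smoke-test sketch for the finished stack, written in Python with only the standard library. It assumes Ollama is reachable on its default port 11434 and uses the `/api/generate` endpoint in non-streaming mode. The helper names, host address, and model tag are placeholders for illustration and are not taken from the repository.

```python
import json
import urllib.request

OLLAMA_PORT = 11434  # Ollama's default listening port (assumed unchanged)


def build_generate_request(host: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"http://{host}:{OLLAMA_PORT}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(host: str, model: str, prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text.

    The generous default timeout allows for the first call, which may
    have to load the model into memory before answering.
    """
    request = build_generate_request(host, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # With "stream": False, Ollama returns a single JSON object whose
        # "response" field holds the full completion.
        return json.load(response)["response"]
```

Calling `ask("<container-ip>", "<model-tag>", "Reply with one word.")` from any machine that can reach the container should return a short completion. Substitute the container's address and whichever model tag you pulled with Ollama.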

View the complete Proxmox IaC source code on GitHub 🐙
