Every agent framework I tried assumed one paid frontier model. I wanted the opposite: an orchestrator that treats free and local models as first-class, and gets smarter over time without me paying per token. That idea turned into FreePalp, and the core trick is worth sharing on its own.

The problem with cheap models

A small or free model (Llama-3.1-8B, say, or a local Ollama model) is fast and costs nothing, but it tends to fail the hard tasks: multi-file edits, strict output formats, tool-use discipline. The usual answer is "just use a bigger model." That's expensive and gives up on the free tier entirely.

The trick: corrections accumulate

FreePalp runs a two-tier critic. Cheap deterministic checks run first (did the promised file actually get written? did the model leak a tool call as text? did it slip identity?); only then does an LLM critic weigh in. When a cheap model fails and a stronger model succeeds on retry, FreePalp doesn't throw that success away. It distills the working procedure into a reusable SKILL.md, the same format Claude Code uses, capturing the steps, the tools involved, and the one lesson that fixed it.
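
To make the flow concrete, here is a minimal sketch of the idea in Python. It is not FreePalp's actual code: `Attempt`, `critique`, `distill_skill`, `run_with_escalation`, the regex for a leaked tool call, and the section layout of the generated file are all assumptions made for illustration. It covers only two of the deterministic checks (the identity check is left out because its rules aren't spelled out here), and it uses the cheap model's failure reason as a stand-in for the lesson. The only part taken from the SKILL.md convention is the frontmatter with `name` and `description`.

```python
import json
import re
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Attempt:
    """One model's attempt at a task (hypothetical shape, not FreePalp's)."""
    model: str
    output: str
    promised_files: list = field(default_factory=list)
    steps: list = field(default_factory=list)
    tools_used: list = field(default_factory=list)


# Assumed signature of a tool call that leaked into plain text.
# Real leak formats vary by model, so treat this pattern as a placeholder.
LEAKED_TOOL_CALL = re.compile(r'\{\s*"tool"\s*:\s*"[^"]+"')


def check_files_written(attempt, workdir):
    """Tier 1: did every promised file actually land on disk?"""
    missing = [f for f in attempt.promised_files if not (workdir / f).exists()]
    return f"promised files not written: {missing}" if missing else None


def check_no_leaked_tool_call(attempt, workdir):
    """Tier 1: did the model print a tool call instead of making it?"""
    if LEAKED_TOOL_CALL.search(attempt.output):
        return "tool call leaked as plain text"
    return None


CHEAP_CHECKS = [check_files_written, check_no_leaked_tool_call]


def critique(attempt, workdir, llm_critic):
    """Return a problem description, or None if the attempt passes."""
    for check in CHEAP_CHECKS:
        problem = check(attempt, workdir)
        if problem:
            return problem  # stop here: the LLM critic is never called
    return llm_critic(attempt)  # tier 2, reached only if tier 1 is clean


def distill_skill(name, description, attempt, lesson, skills_dir):
    """Write <skills_dir>/<name>/SKILL.md from a successful attempt."""
    lines = ["---", f"name: {name}",
             f"description: {json.dumps(description)}",  # quoted for YAML
             "---", "", "## Steps"]
    lines += [f"{i}. {step}" for i, step in enumerate(attempt.steps, 1)]
    lines += ["", "## Tools"] + [f"- {tool}" for tool in attempt.tools_used]
    lines += ["", "## Lesson", lesson, ""]
    path = skills_dir / name / "SKILL.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(lines), encoding="utf-8")
    return path


def run_with_escalation(task, cheap_model, strong_model, workdir,
                        llm_critic, skills_dir, skill_name):
    """Try the cheap model; on failure retry with the strong one and distill."""
    first = cheap_model(task, workdir)
    problem = critique(first, workdir, llm_critic)
    if problem is None:
        return first, None  # cheap model was good enough, nothing to learn
    retry = strong_model(task, workdir)
    if critique(retry, workdir, llm_critic) is not None:
        return retry, None  # both failed, so there is no success to distill
    skill_path = distill_skill(
        name=skill_name,
        description=f"Procedure for: {task}",
        attempt=retry,
        lesson=f"A cheaper model failed here with: {problem}",
        skills_dir=skills_dir,
    )
    return retry, skill_path
```

Here `cheap_model` and `strong_model` are any callables that take `(task, workdir)` and return an `Attempt`, and `llm_critic` takes an `Attempt` and returns a problem string or `None`. The ordering in `critique` is the point: a failure caught by a deterministic check costs nothing, and the LLM critic only sees attempts that already passed those checks.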