I spent three days building a regex monster to parse customer emails. It had 47 patterns, each one more fragile than the last. A single missing space would break the whole thing. By day four, I wanted to throw my laptop out the window.

That’s when I decided to try something completely different: let a large language model do the heavy lifting.

Here’s the story of how I went from regex hell to a clean, maintainable data-extraction pipeline using LLMs, and why I won’t go back to hand-crafted patterns for unstructured text.

The Problem: Messy, Human-Written Text

I was building an internal tool to process support tickets. Customers would write things like: