We all know the feeling. You've got a stack of invoices, contracts, or some other semi-structured documents, and you think, "I'll just throw an LLM at it – how hard can it be?"
Hard. Very hard. At least, that was my experience last month.
I was building a system to extract key fields from PDF invoices: vendor name, total amount, invoice date, and line items. It seemed straightforward. I'd used GPT-4 before, and it's great at understanding natural language, so I assumed structured extraction would be easy. How wrong I was.
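To make the target concrete, the four fields could be modelled roughly like this. This is only a sketch: the class names, the field names, and the `missing_fields` helper are hypothetical and not taken from the actual system, and the shape of a line item is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import List, Optional


@dataclass
class LineItem:
    # Hypothetical shape; real invoices may also carry quantity, unit price, etc.
    description: str
    amount: Decimal


@dataclass
class Invoice:
    # Scalar fields are optional because extraction can fail on any one of them.
    vendor_name: Optional[str] = None
    total_amount: Optional[Decimal] = None
    invoice_date: Optional[date] = None
    line_items: List[LineItem] = field(default_factory=list)


def missing_fields(invoice: Invoice) -> List[str]:
    """Return the names of the fields that were not populated."""
    missing = [
        name
        for name in ("vendor_name", "total_amount", "invoice_date")
        if getattr(invoice, name) is None
    ]
    if not invoice.line_items:
        missing.append("line_items")
    return missing
```

Writing the target down as a type, with a check for what is missing, gives you something to measure each extraction attempt against instead of eyeballing the model's output.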
My First Attempt: The Naive Prompt
I wrote a simple system prompt:






