If you've asked ChatGPT, Claude, or any AI SQL assistant to generate a query and gotten back something that looked plausible but was subtly wrong, you're not alone. The frustrating part is that the query often runs. The database returns rows, the numbers look reasonable, and you ship it. Three days later, someone points out the totals are off by 20%.
The problem isn't the AI. The problem is the prompt.
Text-to-SQL works remarkably well when you give the model what it actually needs. According to AWS's benchmarking, GPT-4-class models achieve a 94% first-try success rate on ad-hoc analytics queries when the schema and foreign key constraints are properly provided. Without that context? You're closer to 60%. The difference is entirely in how you prompt.
This guide covers the practical techniques that separate accurate AI-generated SQL from the kind that silently lies to you: schema context, few-shot examples, business term definitions, and chain-of-thought decomposition.
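To make those four ingredients concrete, here is a minimal sketch of how they might be assembled into a single prompt. Everything in it is an assumption for illustration: `build_sql_prompt`, the section labels, and the `customers`/`orders` schema are hypothetical and do not come from any particular tool or library. The resulting string would be sent to whichever model you use.

```python
# Illustrative sketch only: the function name, section labels, and sample
# schema are hypothetical, not part of any real library or product.

def build_sql_prompt(question, schema_ddl, glossary=None, examples=None):
    """Assemble a text-to-SQL prompt from schema, terms, examples, and a question."""
    parts = ["You are a SQL assistant. Use only the tables and columns below."]

    # Schema context: the DDL, including foreign keys, so joins are not guessed.
    parts.append("### Schema\n" + schema_ddl.strip())

    # Business term definitions: what words like "net revenue" mean here.
    if glossary:
        terms = "\n".join(f"- {term}: {meaning}" for term, meaning in glossary.items())
        parts.append("### Business terms\n" + terms)

    # Few-shot examples: known-good question/query pairs for this schema.
    if examples:
        shots = "\n\n".join(f"Q: {q}\nSQL: {sql}" for q, sql in examples)
        parts.append("### Examples\n" + shots)

    parts.append("### Question\n" + question.strip())

    # Chain-of-thought decomposition: ask for the plan before the query.
    parts.append("First list the tables and joins you need, then write the final query.")
    return "\n\n".join(parts)


SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
  id INTEGER PRIMARY KEY,
  customer_id INTEGER REFERENCES customers(id),
  total_cents INTEGER,
  status TEXT
);
"""

prompt = build_sql_prompt(
    "What was net revenue last month?",
    SCHEMA,
    glossary={"net revenue": "sum of total_cents for orders with status = 'paid'"},
    examples=[("How many customers are there?", "SELECT COUNT(*) FROM customers;")],
)
print(prompt)
```

Keeping each kind of context in its own labeled section is a design choice, not a requirement. It makes it easy to see which piece was missing when a generated query comes back wrong.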
Why AI Gets SQL Wrong (It's Not the Model's Fault)
