Jochen Deister is the founder and VP of innovation and strategy at Privacy Solutions.

If you have used an AI tool, you have probably had this experience. You query AI about something in your domain, and it gives you an answer that is well-structured, clearly reasoned and properly sourced. It sounds right. But you have a nagging sense that you should verify it, and then you realize that verifying it would take almost as long as doing the work without AI. So you don’t. You move on and hope the answer was correct. It sounded plausible after all, didn’t it?

That moment, the one where you choose to trust the AI’s output because checking it would cut into your productivity gains, is worth examining. Because there is a gap between the answer you get and the model that produces it. And if you don’t reckon with this discrepancy, it can put your business at risk.

How LLMs Actually Learn

AI training pipelines are robust. They filter aggressively for text quality, removing spam, duplicates and poorly written content. Some use classifier models to identify educational or high-value material. By the time data reaches the model, it has passed through multiple rounds of curation.

But I believe this is the critical distinction: These filters assess linguistic quality, not domain-specific accuracy, because how could they “know” what’s right and what’s not? LLMs are notorious for hallucinating and producing believable but incorrect or misleading text.

You may be willing to accept a reasonable amount of risk in certain AI applications, such as in marketing or customer service, but not in others, such as in legal or healthcare. Yet I am seeing companies indiscriminately incorporating AI into high-risk applications without verifying their training data. This is where I am most concerned about the domino effect of inaccurate information leading to more inaccurate information.

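
The difference between linguistic quality and accuracy can be made concrete with a small Python sketch. This is a toy illustration under stated assumptions, not a description of how any real training pipeline works: the function `quality_score`, its thresholds and the sample sentences are all invented for this example. It scores text only on surface features, so a true statement and a false one that are equally well written receive the same score.

```python
import re


def quality_score(text: str) -> float:
    """Toy surface-level quality score between 0 and 1 (hypothetical heuristic)."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    if not words:
        return 0.0
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    avg_len = len(words) / len(sentences)

    # Reward sentences of moderate length (assumed range: 8 to 30 words).
    length_ok = 1.0 if 8 <= avg_len <= 30 else 0.5
    # Reward an initial capital letter and terminal punctuation.
    well_formed = 1.0 if text[0].isupper() and text.rstrip()[-1] in ".!?" else 0.5
    # Penalize text that is mostly all-caps words, a crude spam signal.
    caps_ratio = sum(w.isupper() and len(w) > 1 for w in words) / len(words)
    not_spam = 1.0 if caps_ratio < 0.3 else 0.0

    return (length_ok + well_formed + not_spam) / 3


# Sample inputs: the second sentence is deliberately false.
accurate = "Water boils at 100 degrees Celsius at standard sea-level pressure."
inaccurate = "Water boils at 50 degrees Celsius at standard sea-level pressure."
spam = "BUY NOW!!! CHEAP CHEAP CHEAP"

for label, text in [("accurate", accurate), ("inaccurate", inaccurate), ("spam", spam)]:
    print(label, quality_score(text))
```

In this sketch the spam line is scored lower than the two sentences about boiling water, while the false sentence scores exactly as high as the true one. Nothing in the function looks at whether the claim is correct.
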
The AI training pipeline is excellent at identifying well-written text, but it has no mechanism for verifying whether well-written text is true. A well-crafted regulatory guidance document may score highly on every quality metric, regardless of whether its legal analysis is correct. A polished law firm blog post written for marketing purposes can pass every filter, even if it simplifies the law to the point of distortion.

I see an even deeper problem. AI is trained on material written by human beings whose understanding was shaped by their knowledge, their perspective and the state of their field at the time they wrote. A regulatory guidance document reflects its authors’ interpretation on that particular day. An academic paper reflects the available case law at the time of publication. None of these is ground truth. Each is a snapshot of one person’s or one organization’s understanding at a moment in time. Plus, the authors might have used generative AI to write it.

Once these materials enter the training pipeline, that context is stripped away. The AI model doesn’t know if a source represents one authority’s contested interpretation or if it was written before a landmark ruling changed the legal landscape. It sees text, identifies patterns and learns what is commonly stated. When the commonly stated position happens to be correct, the model is correct. When it is not, the model reproduces the error with the same confidence it brings to everything else.

A Concrete Example

Earlier this year, I published an analysis of an official questionnaire issued by a German data protection authority, one of the agencies that enforces the GDPR. The document was designed to help organizations assess legitimate interest as a legal basis for processing personal data. It contained 38 questions, and I identified at least 19 with significant legal errors.

These were not matters of interpretation.
The authority confused EU Charter of Fundamental Rights protections with internal market freedoms from a different treaty altogether, one that governs trade between member states, not the rights of individuals. They formulated the central legal test backward. They reframed absolute legal prohibitions as discretionary balancing factors.

The document was published and praised by commentators who seemed not to have read it but might have wanted to be seen next to an official document in a LinkedIn feed.

Every version of this document, and every commentary that treated it as settled guidance, is now likely to enter LLM training data as high-authority content. The source is a government regulator, so the quality filters are inclined to treat it favorably. The errors risk being reinforced by each uncritical repetition, which only encourages the AI model to keep reproducing them. They form the consensus view, not an outlier position, because they are supported by the available training material.

The citations are real. The legal provisions exist. The structure is professional. Only the substance is wrong.

Beyond Law

This is not only a legal sector problem. The same mechanism can apply in every expert domain where the training material reflects human interpretation rather than verified data.

Medical guidelines get updated, but the previous versions, written by physicians whose understanding reflected the evidence available to them at the time, persist in training data and continue to shape model outputs. Financial regulatory positions get overturned by courts, but the superseded interpretation, once someone’s good-faith analysis, still outnumbers the correction.
Engineering standards evolve, but the internet preserves the old consensus more reliably than the new one.

In each case, the model reflects the aggregate of what humans have written, not the current state of knowledge.

What Technology Leaders Should Ask

When evaluating AI for any domain where accuracy has consequences, I think one question matters more than all others: What is the verified source of truth behind this tool?

A good answer describes a curated knowledge layer built and maintained by domain practitioners, anchored to primary sources and updated when the underlying reality changes. A bad answer talks about the model, the guardrails or the size of the training data.

If no one can point to a verified source of truth, what you have is not an expert system. It is a sophisticated prediction engine that reflects the statistical patterns of everything that has been written about your domain, by people with varying levels of expertise and varying degrees of accuracy, at various points in time.

The output will be polished, confident and well-structured. Whether it is correct depends on whether the majority of those human authors happened to be correct.

This is my big takeaway: The model is not the product. The knowledge layer is. And for most AI deployments in expert domains, that layer has not been built yet.

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.