RUFARO MAFINYANI | Data — the raw material on which everything runs

This is the second entry in AI Fluency Corner, a 16-part weekly series building one connected mental model of artificial intelligence (AI), in plain language.

Two people stand at the same till, buying the same R14,000 laptop. Both tap a card. One walks out. The other’s phone lights up: “Did you make this transaction?” Same item. Same price. Same shop. Different verdict.

Last week we reduced AI to a defensible idea: it learns patterns from data rather than waiting for a human to write every rule. This week comes the harder question: what data shaped the decision? Because AI does not learn from the world. It learns from the record we kept of the world. Four implications follow.

Data is not reality, it is a sourced version of reality

Data is recorded experience: a payment, route, email, click, complaint or sensor reading. Each record has a source, a purpose, an age and a gap.

A fraud system does not “see” a customer. It may see card transactions, spending ranges, merchant types, device signals, time, location consistency and confirmed fraud. To one customer a R14,000 purchase after payday in her usual suburb is normal. To another, the same purchase on a newly added device, far from usual activity, is an outlier. The system is not judging the laptop. It is comparing this moment with the history available to it.

Before asking whether an AI tool is clever, ask what slice of life it was permitted to observe. A model cannot learn what was never collected.

Before data trains AI, people shape the data

“Trained on data” can sound as though computers drink from a pristine digital river. In reality, people created much of the material, selected what to capture, and often labelled examples so systems could learn. Transactions may be labelled “fraud” or “genuine”. Complaints may be tagged “resolved” or “escalated”. Applicants may be labelled “successful”, even when old definitions carried old blind spots.

Not every model requires a human to label every record. But human behaviour, definitions and judgment sit upstream of nearly every useful business application. Labels are instructions disguised as history. If a business calls customers “low quality leads” when the real failure was slow service, the model does not correct the misunderstanding. It learns it and scales it. AI can automate a pattern. It cannot decide whether the pattern deserved to exist.

Data quality and quantity decide where AI can safely work

Once sourced and labelled, data must be tested for relevance, accuracy, completeness, recency, consistency and sufficient volume. A few reliable examples may support a prototype; they cannot safely justify an automated decision affecting thousands.

This is why a business can be ready for one AI use case and recklessly early for another. A chatbot answering from approved, current product documents may be useful. An agent changing customer records or releasing payments requires richer evidence, permissions, audit trails and human escalation. AI-powered automation needs stable process data. Insight and scenario analysis need broad, timely histories. Optimisation needs reliable constraints and outcomes. Prototyping tolerates incomplete data only while it remains exploration, not production.

Quantity matters because models need enough examples to distinguish a pattern from an accident. Quality matters because thousands of inaccurate, irrelevant or stale examples simply teach the wrong lesson confidently. Bad data inside an AI agent can automate the wrong decision across an entire workflow: at speed, at scale and before anyone notices why.

The output is a claim, not a conclusion

Even with sound data and a capable model, results must be verified in proportion to what happens next. With generative AI, the chain is easy to follow: prompt → sourced data → validated evidence → model → inference → verified action.

When reliable evidence is missing, a model may fill the space with something shaped like an answer: a citation that never existed, code calling a fictional function, or a contract summary containing a clause nobody wrote.

South Africa supplied an uncomfortably neat example. The Draft National Artificial Intelligence Policy was withdrawn after fictitious sources were confirmed in its reference list. Communications minister Solly Malatsi said the most plausible explanation was that AI-generated citations had been included without proper verification. The joke writes itself: a policy about governing AI was undone by the absence of the most basic AI governance step: check the output.

This is not an argument against AI. It is an argument for placing it where the evidence and controls justify it.

What this means for your work

Think of a workflow as a chain: capture information, analyse, decide, act, then review. AI can assist throughout, but not with equal autonomy. It can classify documents, summarise calls or prototype options with modest risk; generate scenarios when data is current; and act as an agent, only where inputs are trusted, permissions controlled and results reversible or escalated.

AI fluency means asking before deployment: do we have enough of the right data; is it accurate and recent; who validates it; what action follows; and who checks the outcome? The model is the impressive middle. The data before it and oversight after it decide whether optimisation is productive, safe and accurate, or merely fast.

Our task this week

Choose one process at work and map its data: what is captured, who verifies it, where it may be stale or missing, and which AI role it can safely support today: assistant, chatbot, insight tool, automation or agent.

• Mafinyani is senior partner in financial engineering & artificial intelligence at specialised finance, risk and applied technology firm Intellica Analytics.

Next week: Algorithms, the instructions behind every decision. If data is the material, algorithms are the machinery that turns it into outcomes.
