RUFARO MAFINYANI | Machine learning ― how systems learn from data

This is the fourth instalment of AI Fluency Corner, a 16-part weekly series building one connected mental model of artificial intelligence (AI), in plain language.

You open YouTube. Before you type a word, videos are already waiting. You open Gmail and several messages are already in spam. Your banking app queries a transaction before you have noticed the notification. None of this is guesswork. It is machine learning, and all three are running simultaneously, on three different types of it.

Last week we examined algorithms, the step-by-step instructions that turn inputs into outputs. This week comes the twist: in many AI systems, those instructions are not written in advance. They emerge from the data.

What machine learning actually is

Machine learning is a branch of AI in which a system improves at a task by finding patterns in data, rather than following rules written for every scenario. Instead of a developer writing “if this phrase appears, mark it as spam,” the system studies thousands of emails already labelled spam or genuine and learns the fingerprints separating one from the other.

This is the reversal: from instructions followed to patterns discovered. Traditional programming requires a human to foresee every case. Machine learning lets the system find what matters from the evidence and improve as new evidence arrives.

Three types, three different questions

Machine learning is not a single technique. There are three principal types, and knowing which applies changes the questions worth asking:

Supervised learning is the most common in business. The system trains on labelled examples, meaning data where the correct answer is already known. A credit model is shown past applications, each marked repaid or defaulted, and learns the relationship between inputs and outcomes. Every time you click “Report spam,” you provide a fresh labelled example that can be used to refine the filter.
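For readers who want to see the idea in miniature, here is a rough sketch of supervised learning applied to spam, written in Python. It is a simple word-counting classifier (a basic naive Bayes approach), not how Gmail or any real filter works. The six messages, their labels and the function names `train` and `classify` are all invented for illustration. Notice that nobody writes a rule saying “free means spam”: the pattern comes from the labelled examples.

```python
import math
from collections import Counter

# Hypothetical labelled examples, invented for illustration only.
TRAINING_DATA = [
    ("win a free prize now", "spam"),
    ("claim your free cash reward", "spam"),
    ("free prize waiting click now", "spam"),
    ("meeting moved to monday morning", "genuine"),
    ("please review the attached report", "genuine"),
    ("lunch on monday with the team", "genuine"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label whose learned word pattern best fits the text."""
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Start from how common the label is, then add evidence word by word.
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Add-one smoothing: an unseen word lowers the score
            # without ruling the label out entirely.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


word_counts, label_counts = train(TRAINING_DATA)
print(classify("free cash prize", word_counts, label_counts))
print(classify("report for monday meeting", word_counts, label_counts))
```

In this toy version, clicking “Report spam” would amount to adding one more labelled message to `TRAINING_DATA` and running `train` again. Change the labels and the same code learns a different lesson, which is why the quality of the labels matters so much.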
The model is only as reliable as the historical labels: if past decisions carried bias, the model inherits it and scales it. Labels are instructions disguised as history.

Unsupervised learning works without labels. The system is handed raw data and asked to find structure in it. A retailer’s model may group customers by purchasing rhythm and promotional response, with clusters emerging from behaviour, not from a prior idea of what segments should exist. Powerful for discovery. Harder to verify, because there is no labelled ground truth to test against.

Reinforcement learning trains through interaction. A system takes an action, receives a signal (reward or penalty) and adjusts. Chatbots that improve with every escalation avoided and pricing engines that update from market response both run on this principle. The reward function is the strategy: if the objective is poorly defined, the model optimises for the wrong thing.

Where this is already running

FNB’s banking app scores your financial health using supervised models trained on millions of account histories. The comparison placing you below peers reflects patterns from labelled outcomes, not a human analyst reviewing your account.

Standard Bank’s anti-money laundering systems use unsupervised models to detect anomalous clusters across transaction networks. The system was not given a definition of suspicious. It found one by first establishing what normal looks like.

Takealot’s recommendation engine groups buyers by behavioural similarity, reinforced by conversion signals. It did not know you when you arrived. It learnt from everyone who arrived before you.

Credit providers are combining supervised and unsupervised models to manage risk in informal and gig economies, populations that older rule-based systems could not adequately assess.

Three questions before trusting the output

Machine learning produces authoritative-looking outputs. A score, a shortlist, a recommendation, a risk flag.
Understanding the type behind each one changes how much confidence it deserves. For supervised models, ask who labelled the examples and how. A label such as “successful hire” may hide assumptions about what success meant at a specific time. For unsupervised models, ask what the clusters actually represent: customers who look alike in the data should not necessarily be treated alike in practice. For reinforcement systems, ask what is being rewarded: a chatbot rewarded only for closing tickets quickly may frustrate customers with shallow answers.

What was it trained on, and when? A model built before a significant economic shift may apply patterns that no longer hold. Training data has an expiry date. It is rarely printed on the label.

How often is it updated? A model accurate at deployment degrades as the world diverges from its training assumptions. Vendors who cannot answer this are selling past performance as present capability.

What this means for your work

Machine learning does not rise above the business that built it. It reflects the data, labels, objectives and incentives it was given, and scales whatever it found, useful or otherwise. A supervised model trained on a biased hiring history reproduces that history at volume. A reinforcement system optimising for the wrong reward gets very good at the wrong outcome.

Fluency does not require building models. It requires knowing which type is in use, what it was taught, and how often the lesson is refreshed. A model nobody is monitoring is not a system. It is an assumption, running unattended. A pattern is only as useful as the world it was trained on.

Your task this week

Next time someone says a product “uses machine learning”, ask three questions: which type; trained on what data; updated how often? The answers will be worth more than any product brochure.
• Mafinyani is senior partner in financial engineering & artificial intelligence at specialised finance, risk and applied technology firm Intellica Analytics. Next week: neural networks and deep learning — why modern AI can see, hear, and transcribe, and what that means for the tools being sold to you.
