Data Quality 2.0: From Scoring To Data Reliability Engineering

Sandesh Gawande, with 29+ years of experience in data and CEO of iceDQ: We engineer data reliability, because quality is never an accident.gettyFor the past 20 years, data quality has mainly been assessed by scoring. Profilers measure six dimensions—accuracy, completeness, consistency, timeliness, validity and uniqueness—along with the more recently added observability metrics. A high score means “good” data.But modern data ecosystems span dozens of sources, streaming pipelines, lakehouses and AI models making decisions in real time. Measurement often happens far downstream and almost always too late—by then, the damage may be done.Scores show outcomes, but they don’t reveal how a data system is built, tested and operated, or whether the data meets the needs of the business. Judging only by results is like grading a cake without checking the kitchen. Data Quality 2.0 closes that gap.Why Scoring Alone Is Not EnoughFour questions inside any organization reveal the limits of DQ scoring—and define what comes next:1. Prevention: Are defects being avoided before they reach production?2. Operations: Can the pipeline detect, halt and contain bad data in real time?3. Metrics: Are we measuring what the business actually depends on?4. Insights: When data fails, do we know origin, impact and cause (and not just that it happened)?To stay ahead, organizations must extend the DQ discipline across four domains: how data systems are built, how they are operated, what is measured and how problems are diagnosed. That broader discipline is data reliability engineering.Understanding The Domains Of Data Reliability Data quality is one capability inside a much larger system. Data reliability spans four domains: 1. Prevention: Build-And-Test PhaseMost data defects are not introduced in production. The build phase is where reliability is engineered in or quietly engineered out.Developers need business stakeholders present when defining transformation and audit rules. For instance, edge cases like voided transactions, partial refunds or multi-currency adjustments can break pipelines later if they aren’t addressed early.ETL testing needs to happen before deployment. Code tested only against sample rows could fail on production volumes schema variations. Pre-production testing across realistic scenarios can be a cheaper place to find a defect. Production is usually the most expensive.Pipelines should be observable from the inside with row-level tracing and check-points embedded during development, instead of bolted on after an incident. A pipeline that cannot tell you what it is doing is a pipeline you cannot trust. You need to build in whitebox monitoring and proper logging.2. Operations: Production PhaseOnce a pipeline is live, the focus shifts from design to operation. The goal is to stop bad data from propagating. Prioritize establishing checks and controls in orchestration. Daily schedules should enforce data conditions such as row-count thresholds, reconciliation balances and dependency completeness even before allowing the next stage to run.Aim to stop mechanisms with alerts. When a defect is detected, the pipeline should halt, not continue with corrupted inputs. This is the data equivalent of Toyota’s andon cord. Without it, a single upstream failure could result in a downstream cascade in a short amount of time.3. Metrics: Layered MeasurementsThis is where DQ 2.0 most directly extends DQ 1.0. The six dimensions of data quality metrics are necessary and remain useful, but a reliability-engineered system should also measure:• Observability Metrics: Think pipeline-level signals, like processing delays, volume anomalies, data drift, schema drift. These catch problems the data itself does not yet show.• Business Metrics: A row can be technically valid and still be business-wrong. Anomalies in the values that matter to the business may appear as sudden spikes in transaction amounts, unusual discount patterns or outlier order sizes. • Business Reconciliations: End-to-end integrity across systems matters. Does the source ledger total match the reporting layer? Does shipping data line up with orders data? Reconciliation answers a question no single-system DQ check can: Did the truth survive the journey?Each layer catches what the previous one cannot. Together, they describe whether the system is delivering business reality, not just whether the data passes a profile.4. Insights: Origin, Cause And ImpactDetection is necessary, but not sufficient. Knowing something is wrong without knowing where it started, what it affected or why it happened produces alert fatigue as opposed to true resolution. Insight-driven reliability requires three answers every time a defect is flagged.Where did the defect actually start? Symptoms appear in the gold or reporting layer, but the root cause is almost always upstream—a malformed source file, a schema change in an API, a failed dependency. Look at the origin. What downstream tables, dashboards, models and decisions are affected? Without impact analysis, every defect feels equally urgent, and the wrong things get fixed first.Was this an improper code change, an execution-timing issue, an upstream data shift or an infrastructure failure? Looking at the cause helps you identify how to approach remedying. The fix for each is different. Conflating them means you risk shipping the wrong patch.From Scoring To Reliability EngineeringScoring tells you what just happened and even then, DQ dimensions can be incomplete, because they do not measure business metrics or financial reconciliation. Engineering tells you whether the system is built to keep delivering. Practically, this means treating data pipelines the way reliability engineers treat any production system: design for failure, instrument every layer, set targets, build stop mechanisms and invest as much in diagnosis as in detection. The six dimensions stay in the picture in the inner circle, but they sit inside a larger discipline that asks better questions.Data Quality 1.0 asked: Is the data correct right now?Data Quality 2.0 asks: Is the system engineered to keep producing correct data and to tell us, fast, when it is not?Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives. Do I qualify?

Data Quality 2.0: From Scoring To Data Reliability Engineering

Other newsrooms on this story

Related reading

# Turning Data Quality Checks into a Measurable Trust Score

From Raw Data To Smarter Decisions: Decision Intelligence Best Practices

Most data strategies fail before they start: What organisations miss about…

The Architecture Of Trust: Why Quality Engineering Must Move Upstream In The…

Understanding Data Accuracy Meaning: Best Practices for Engineers

Sumeh: one API for data quality across 14 engines

Related reading

# Turning Data Quality Checks into a Measurable Trust Score

From Raw Data To Smarter Decisions: Decision Intelligence Best Practices

Most data strategies fail before they start: What organisations miss about…

The Architecture Of Trust: Why Quality Engineering Must Move Upstream In The…

Understanding Data Accuracy Meaning: Best Practices for Engineers

Sumeh: one API for data quality across 14 engines

Other newsrooms on this story

Other newsrooms on this story

Related reading

# Turning Data Quality Checks into a Measurable Trust Score

From Raw Data To Smarter Decisions: Decision Intelligence Best Practices

Most data strategies fail before they start: What organisations miss about…

​The Architecture Of Trust: Why Quality Engineering Must Move Upstream In The…

Understanding Data Accuracy Meaning: Best Practices for Engineers

Sumeh: one API for data quality across 14 engines

Related reading

# Turning Data Quality Checks into a Measurable Trust Score

From Raw Data To Smarter Decisions: Decision Intelligence Best Practices

Most data strategies fail before they start: What organisations miss about…

​The Architecture Of Trust: Why Quality Engineering Must Move Upstream In The…

Understanding Data Accuracy Meaning: Best Practices for Engineers

Sumeh: one API for data quality across 14 engines

Other newsrooms on this story

The Architecture Of Trust: Why Quality Engineering Must Move Upstream In The…

The Architecture Of Trust: Why Quality Engineering Must Move Upstream In The…