Why Most Disaster Recovery Solves The Wrong Problem

Michael Campbell, CEO at Fusion Risk Management.

In October 2025, over the course of nine days, outages across two major cloud providers disrupted operations for organizations worldwide. It started with an AWS outage that restricted access to critical systems. Just over a week later, a configuration change in Microsoft Azure's routing service triggered a second, cascading disruption. As organizations rely on a growing network of cloud platforms, third-party providers and interconnected services, disruptions like these are more frequent and harder to contain.

Recovery is no longer a technical problem with a business consequence. Anticipating disruptions and prioritizing recovery steps are business decisions that depend on understanding how services and business processes are connected and on strong technical execution. Yet, most organizations still approach recovery as a purely technical function, leaving critical decisions disconnected from the business outcomes executives actually care about.

What matters is understanding how disruption affects services, revenue streams, customer commitments and regulatory obligations most critical to the business. Much of the cost of a modern disruption is the time spent figuring out what is impacted, who owns the response and what to restore first. The challenge becomes exponentially harder in large, interconnected enterprises where recovery depends on systems, services, sites and other dependencies that are constantly changing.

Why Recovery Breaks Down In Practice

Disaster recovery is often described as a straightforward process: restore systems and resume operations. In reality, it rarely follows this path.

Large enterprises rely on hundreds, sometimes thousands, of systems that support critical operations, often in ways that are not obvious until something fails. As one resilience leader described it during a recent industry event, "There are five apps you've never heard of that have to come up first."
The result is predictable. Teams restore what they can, when they can, and less important systems come back before critical services.

Recovery is as much a sequencing challenge as a technical one, and most organizations still rely on manual coordination to determine what gets restored first. In an analysis of more than 4,500 conversations with resilience leaders over five years, Fusion's Enterprise Resilience Report found that more than half of North American organizations continue to manage recovery with manual tools such as spreadsheets, shared drives and static documents. The technology to address this already exists; adoption is the constraint.

The hard part of resilience used to be creating the plans. AI is making that process faster and easier. The real challenge has always been ensuring those plans reflect how the business actually operates, and that the right actions happen in the right order under pressure.

Recovery plans are often built on assumptions that systems are stable, dependencies are known and scenarios can be planned. In a modern environment, those assumptions break down. Cloud environments change, dependencies shift, and third-party services are added or replaced. What looked accurate even a few months ago can be misleading during an incident.

The same disconnect often exists with external vendors. The teams negotiating service level agreements (SLAs) frequently lack visibility into how critical a vendor’s service is to business operations. As a result, contractual SLAs are often misaligned with the recovery time objectives the business actually depends on. During a disruption, organizations discover that a “contractually acceptable” outage may still create unacceptable operational or financial impact.

Aligning Recovery To What The Business Needs

Recovery is measured by how quickly customers regain access to services, revenue resumes and the business meets its regulatory obligations.
System uptime is the input, not the outcome. Improving this requires a shift built on a few core principles:

1. Anchor recovery in real-time operational data. Decisions should reflect what is actually happening, not what was documented months ago. When teams can see what is impacted in real time, they can act faster and with more precision.

2. Map dependencies across the business, not just inside IT. Understanding how systems, applications and services connect to business operations is what separates a coordinated recovery from a guess.

3. Prioritize by business impact, not by system. The most disruptive failures often involve systems that don't appear high on a typical recovery list. Treating prioritization as a business question rather than an infrastructure question surfaces those dependencies before they become incidents.

4. Optimize the sequencing itself. Determining the optimal order of recovery is a problem humans cannot solve under pressure in the time available. When sequencing is automated and aligned to business priorities, teams move from analysis to execution in minutes rather than days.

The difference shows up during an actual disruption. In many organizations, the response starts with uncertainty about which systems are affected, what should be recovered first and which teams need to act. In a more mature model, impacted systems are identified in real time, dependencies make it clear which services are at risk, and recovery begins immediately on what matters most to customers and revenue.

It also changes how organizations prepare. Less time goes into maintaining documentation, and more into testing realistic scenarios and finding gaps before a disruption exposes them.

From Recovery To Resilience

Recovery is a business function. It directly affects financial performance, customer experience and regulatory exposure.

Leaders should expect clear answers to a few basic questions:
• Which services matter most to our business?
• What do those services depend on?
• If something fails, can teams move toward recovery immediately, or do they need to stop and figure it out first?

Too often, this clarity is missing. The next era of recovery is not about restoring everything. It is about deciding what level of service the business can operate at during a disruption, and orchestrating around that. The organizations that can answer that question in real time will recover faster, protect more revenue, and maintain customer trust through whatever comes next. Everyone else will keep discovering, mid-incident, that their plans were descriptions of recovery rather than paths to it.

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.
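
For readers who want to see what automated sequencing might look like, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular product: the system names, the impact scores and the `recovery_order` function are all hypothetical. The idea is that every system inherits the business impact of the most important service that depends on it, so the "apps you've never heard of" rise to the top of the list, and systems are then restored in dependency order with the highest inherited impact first.

```python
import heapq
from collections import defaultdict


def recovery_order(depends_on, impact):
    """Return a restore order that respects dependencies and favors business impact.

    depends_on: maps a system to the systems that must be up before it (hypothetical data).
    impact: maps a business-facing service to an impact score (hypothetical scale).
    """
    systems = set(depends_on) | {d for deps in depends_on.values() for d in deps}

    # Invert the map: for each system, which systems rely on it directly?
    dependents = defaultdict(set)
    for system, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(system)

    # A system inherits the highest impact of anything that relies on it,
    # directly or indirectly, so hidden prerequisites are not ranked last.
    inherited = {}

    def priority(system, path=()):
        if system in path:
            raise ValueError("circular dependency involving " + system)
        if system not in inherited:
            scores = [impact.get(system, 0)]
            scores += [priority(d, path + (system,)) for d in dependents[system]]
            inherited[system] = max(scores)
        return inherited[system]

    # Kahn's algorithm with a priority queue: restore whatever is unblocked,
    # highest inherited impact first (ties broken alphabetically).
    remaining = {s: len(set(depends_on.get(s, ()))) for s in systems}
    ready = [(-priority(s), s) for s in systems if remaining[s] == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, system = heapq.heappop(ready)
        order.append(system)
        for dependent in dependents[system]:
            remaining[dependent] -= 1
            if remaining[dependent] == 0:
                heapq.heappush(ready, (-priority(dependent), dependent))

    if len(order) != len(systems):
        raise ValueError("circular dependency detected")
    return order


# Hypothetical example: checkout is the revenue-critical service.
depends_on = {
    "checkout": ["payments-api", "identity"],
    "payments-api": ["token-vault"],
    "identity": ["directory"],
    "intranet-wiki": ["directory"],
    "reporting": ["warehouse"],
}
impact = {"checkout": 100, "reporting": 20, "intranet-wiki": 5}

order = recovery_order(depends_on, impact)
print(order)
```

In this made-up example, the directory and the token vault have no impact score of their own, yet they are restored ahead of reporting and the intranet wiki because checkout cannot come back without them. A real implementation would draw the dependency map and impact scores from live operational data rather than a hand-written dictionary.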
