One Month On: “Largest IT outage in history”, what happened, what was the impact, what did we learn?
Written by Alex Locatelli, Chief Technology Officer
The recent global IT outage was a significant disruption that impacted businesses, governments, and individuals worldwide. Triggered by a faulty update from a major cybersecurity vendor, the outage brought down numerous essential services and websites, causing widespread operational halts and communication breakdowns that lasted for days for some operators.
The NZ Herald reported the incident as the largest in global history, disrupting banking, flights and retail across New Zealand. Despite the scale of the fallout, the disruption was caused by a content update, rather than a full software update, to security software running on Windows computers.
What Did We Learn?
- How did the blue screen outage emerge?
- Content updates have downstream impacts
- Small database of large customers = more risk
- Reputational damage is the biggest cost
The disruptions began when CrowdStrike pushed out a faulty update for one of its tools, “Falcon.” In a statement about the ongoing situation, the company said the defect was found “in a single content update for Windows hosts”, noting that Mac and Linux systems were not impacted. The update in question was a content update, not a software update. Usually with software updates, users are notified in advance and asked to test the update before rolling it out across their systems. In this case, testing of the content update wasn’t completed before it was pushed out to all customers globally.
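The test-then-roll-out practice described above can be illustrated with a minimal sketch of a staged rollout. This is not CrowdStrike's actual process: the function name `staged_rollout`, the `RolloutResult` type, the host names and the idea of grouping hosts into "rings" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class RolloutResult:
    completed: bool            # True only if every ring received the update
    hosts_updated: List[str]   # hosts that received the update, in order
    halted_at_ring: int        # index of the ring that failed, or -1 if none


def staged_rollout(
    rings: Sequence[Sequence[str]],
    apply_update: Callable[[str], bool],
) -> RolloutResult:
    """Push an update ring by ring (hypothetical helper).

    apply_update(host) is assumed to return True if the host is healthy
    after the update. The rollout stops after the first ring in which
    any host reports unhealthy, so later rings are never touched.
    """
    updated: List[str] = []
    for index, ring in enumerate(rings):
        ring_healthy = True
        for host in ring:
            updated.append(host)
            if not apply_update(host):
                ring_healthy = False
        if not ring_healthy:
            return RolloutResult(False, updated, index)
    return RolloutResult(True, updated, -1)


if __name__ == "__main__":
    # Hypothetical rings: a test lab first, then a pilot group, then everyone else.
    rings = [["test-lab-1"], ["pilot-1", "pilot-2"], ["prod-1", "prod-2", "prod-3"]]
    # Simulate an update that crashes every host it reaches.
    print(staged_rollout(rings, lambda host: False))
```

With an update that crashes every host, only the single test-lab machine is affected before the rollout halts. Pushing the same update to all rings at once would have taken down all six.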
Though it is understood the faulty content update wasn’t intentional or malicious, the downstream impact on client systems and the scale of the disruption were what made it particularly newsworthy. Affected systems crashed to the dreaded “blue screen of death” (BSOD) and could not be used at all.
The business in question reports having 29,000 customers. While this sounds like a lot, by comparison, New Zealand accounting software firm Xero reports having 4.2 million subscribers. The compounding factor in the incident wasn’t the number of customers but the type of customer: many are large global enterprises. The reported major disruptions affected banks, transport operators such as airlines, supermarkets and major retail stores. Banks in South Africa and New Zealand reported outages impacting payments. Some news stations, particularly in Australia, were unable to broadcast for hours. Hospitals had problems with their appointment systems, leading to delays and sometimes cancellations for critical care.
According to its website, CrowdStrike was founded in 2011 and launched in early 2012. It listed on the Nasdaq exchange five years ago. Last month, the Austin, Texas, company reported that its revenue rose 33% in the latest quarter from the same quarter a year earlier — logging a net profit of $42.8 million, up from $491,000 in the first quarter of last year. It reported having 29,000 subscribing customers.
While the organisation in question will likely recover in time, the gravity of the interruption will live on in its customers’ minds well beyond the accounting and revenue figures. As the New Zealand Herald says, “the grim irony was in the fact that CrowdStrike’s technology was designed to prevent malicious attacks on companies’ systems”. “We stop breaches,” the cybersecurity company says on its website.