Early analysis of the Optus outage points to similarities with the infamous 2021 Facebook disruption, according to industry experts.
Behind the scenes of the Optus outage
Monitoring by Cloudflare, an internet infrastructure company, recorded a surge in Border Gateway Protocol (BGP) announcements from Optus that coincided with the network’s failure. BGP is often described as the internet’s navigation system: networks use these announcements to tell one another which routes are available to reach specific destinations, and routers pick their preferred path from them.
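To make the idea of announcements concrete, here is a minimal sketch of a toy routing table in Python. It is not how Optus or any real router works: the class name `ToyRouteTable`, the prefixes (reserved documentation ranges) and the AS numbers (reserved private-use values) are all invented for illustration, and real BGP path selection weighs many more attributes than the two shown here.

```python
import ipaddress


class ToyRouteTable:
    """Hypothetical, heavily simplified routing table (illustration only)."""

    def __init__(self):
        # Maps a network prefix to the AS paths announced for it.
        self.routes = {}

    def announce(self, prefix, as_path):
        """Record an announcement: 'this prefix is reachable via this AS path'."""
        net = ipaddress.ip_network(prefix)
        self.routes.setdefault(net, []).append(tuple(as_path))

    def withdraw(self, prefix, as_path):
        """Remove a previously announced path; drop the prefix if none remain."""
        net = ipaddress.ip_network(prefix)
        paths = self.routes.get(net, [])
        if tuple(as_path) in paths:
            paths.remove(tuple(as_path))
        if not paths:
            self.routes.pop(net, None)

    def best_path(self, address):
        """Pick the most specific matching prefix, then its shortest AS path."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no route: the destination is unreachable
        most_specific = max(matches, key=lambda net: net.prefixlen)
        return min(self.routes[most_specific], key=len)


table = ToyRouteTable()
table.announce("203.0.113.0/24", [64500, 64501])
table.announce("203.0.113.0/24", [64502])
print(table.best_path("203.0.113.10"))  # the shorter of the two paths

table.withdraw("203.0.113.0/24", [64500, 64501])
table.withdraw("203.0.113.0/24", [64502])
print(table.best_path("203.0.113.10"))  # no announcements left for this prefix
```

Once every announcement for a prefix is withdrawn, the lookup finds no route at all, which is roughly what the rest of the internet sees when a network’s routes disappear.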
Matt Tett, managing director of Enex TestLab, a network analysis firm, told Guardian Australia that the exact origin of the problem had not yet been confirmed. However, he noted that Optus appeared to experience a routing issue around 4 am, which triggered a sharp increase in BGP announcements, a possible factor in the outage.
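A sharp increase of this kind can be spotted by comparing each interval’s announcement count with a typical baseline. The sketch below is one simple way to do that, not a description of Cloudflare’s actual monitoring: the function name `find_spikes`, the threshold factor and the sample counts are all made up for illustration.

```python
from statistics import median


def find_spikes(counts, factor=5.0):
    """Return the indices of intervals whose count exceeds factor x the median.

    `counts` is a list of announcement counts, one per time interval.
    The median serves as a rough baseline because a few extreme values
    barely move it.
    """
    if not counts:
        return []
    baseline = median(counts)
    # Guard against a zero baseline so quiet periods do not flag everything.
    threshold = factor * max(baseline, 1)
    return [i for i, count in enumerate(counts) if count > threshold]


# Invented per-interval counts: steady background, then a burst.
sample_counts = [12, 9, 11, 10, 480, 950, 13]
print(find_spikes(sample_counts))  # indices of the two burst intervals
```

A spike flagged this way only shows that routing changed abruptly; as Tett cautions, it does not by itself identify the root cause.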
Was the Optus outage a data breach or not?
When he woke to news of the outage, Tett’s first instinct was that there were two probable culprits: a data breach or a configuration error. He said widespread disruptions of this kind are more often caused by the latter. Tett conjectured that resolving the issue likely required on-site intervention, with an engineer physically connecting to a router to restore functionality.
Optus, Tett speculated, would be in the midst of an intensive investigation to ascertain the responsible party, whether it be an internal error or an external partner involved in their service operations.
He further explained that the simultaneous impact on internet, landline, and mobile services could be attributed to the integrated nature of modern networks, which are predominantly IP-based. A single fault in the internet protocol network can therefore cascade and bring down every service that runs on top of it.
The massive disruption experienced by Facebook, WhatsApp, and Instagram in 2021, which lasted five hours, was attributed to a complication with BGP. Facebook identified the cause as a configuration change to their backbone routers responsible for directing traffic between data centers, leading to an extensive domino effect that caused the services to cease operation.
Optus CEO Kelly Bayer Rosmarin told the ABC that the company’s engineers had tried multiple restoration strategies during the outage. Despite these efforts, none had yet succeeded in restoring mobile and internet services.
“We had a number of hypotheses – and each one so far that we’ve tested, and put in place new actions for, has not resolved the fundamental issue.”
Following last year’s widely reported hack of Optus, which exposed the personal information of 10 million customers, speculation arose that another cyber-attack had caused the outage. However, CEO Kelly Bayer Rosmarin deemed it “highly unlikely” that the service interruption stemmed from a hack, emphasizing that such outages are “very, very rare occurrences.”
As one of Australia’s three mobile network operators, Optus is aware of how heavily the public depends on its services. Singtel, the parent company of Optus, reported in its latest annual review that it had fortified its infrastructure with “key network infrastructure diversity,” a measure intended to reduce the risk of network disruptions and keep services running.
Featured image credit: Kerem Gülen/DALL-E 3