After a Global IT Outage: 3 Actions for CIOs and CISOs

A failed software update on 19 July 2024 caused one of the largest IT disruptions in history. Approximately 8.5 million Microsoft Windows devices were affected, disrupting critical sectors such as airlines, healthcare, and banking. This outage is a major wake-up call for business leaders to reassess their security posture and third-party vendor relationships.  

Here are three post-incident actions CIOs and CISOs can put into motion: 

 

1. Improve Communication with Vendors 

 

Maintaining regular communication with vendors is essential for a secure and stable IT environment. Rather than focusing solely on selecting new vendors, CIOs and CISOs should frequently engage with their current vendors, review their product offerings and features, and consider having a backup vendor for added protection.

More importantly, it’s essential to regularly review all agreements and contracts with vendors. This includes end-user licensing agreements (EULAs), service level agreements (SLAs), and liability and compensation clauses. Understanding who is liable for outages, faulty software, and operational mistakes is critical. 

Moving forward, CIOs and CISOs must also revisit current vendor criteria and make necessary updates that prioritize trust, reputation, certifications, insurance, history, and cybersecurity practices. Matthew Rosenquist, CISO at Mercury Risk, advises documenting configuration and allowable settings in a policy procedure to maintain consistency and prevent unexpected issues. “When you either upgrade to a new piece of software or a different service, or you change vendors, having it documented will maintain that consistency, and you can feel confident that you’re not going to have some unusual surprises because of poor change management.” 

 

2. Update Incident Response Plans  

 

Robust incident response plans are paramount to lessen the blow of unexpected disruptions and attacks. The Bonadio Group, one of the companies that experienced the blue screen of death, managed to get its servers up and running within three hours because it was prepared. CIO John Roman says, “The reason we were able to do that was we implemented our incident response plan. Most incident response plans are created in the event there’s some type of malware incident. We genericized ours to take into consideration any type of incident, including a global pandemic.”

Therefore, it is vital to create or update incident response plans with clear procedures, roles, responsibilities, and communication protocols. These plans should detail specific steps for detecting, responding to, and recovering from various types of security incidents. Regularly testing the people, processes, and tools involved in incident management is essential to ensure quick and effective responses. 

Additionally, strengthening cooperation between IT and cybersecurity teams is crucial. During a cyberattack or IT disruption, these teams must work closely to contain and eliminate the threat. Emphasizing redundancy and failover mechanisms is also important to ensure that critical systems remain operational even if one component fails. Building redundancy into enterprise systems can prevent widespread disruptions. 
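As a rough illustration of the failover idea, the sketch below tries a primary service first and falls back to a secondary one when the primary fails. The function and provider names (`call_with_failover`, `primary`, `secondary`) are hypothetical and chosen only for this example; a production setup would typically handle this at the infrastructure level rather than in application code.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def call_with_failover(providers: Sequence[Callable[[], T]]) -> T:
    """Try each provider in order and return the first successful result."""
    errors = []
    for provider in providers:
        try:
            return provider()
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)
    # Every provider failed, so surface all collected errors at once.
    raise RuntimeError(f"all {len(providers)} providers failed: {errors}")


# Hypothetical providers: the primary is down, the secondary still works.
def primary() -> str:
    raise ConnectionError("primary is down")


def secondary() -> str:
    return "served by secondary"


result = call_with_failover([primary, secondary])
print(result)
```

The point of the design is simply that no single component is the only path to a critical function; if every provider fails, the caller gets one clear error instead of a silent outage.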

Organizations should prepare PR, legal, and cybersecurity teams for rapid response. This preparation will help mitigate damage and maintain business continuity during an incident. However, it’s important to eliminate as much red tape as possible to ensure action is taken swiftly. “We don’t want to have too many layers of bureaucracy that could slow them down, because that could make all the difference in the world and make sure the disaster recovery and business continuity plans, the communication processes, teams, tools, and necessary outside vendors are well prepared to work together,” Rosenquist says.  

Beyond the behind-the-scenes work, organizations must also practice transparency and honesty with their customers. In the event of an IT outage, ensure customers are informed and supported with the right information and data to prevent panic. CIOs and CISOs must also equip their teams with the right data that clearly illustrates the root cause of the disruption and its business impact.

 

3. Conduct a Thorough Security Audit 

 

It’s high time to conduct a thorough security audit to identify and mitigate risks within your IT ecosystem. This involves a comprehensive examination of network security, endpoint protection, access controls, and data protection measures.

One key aspect of the audit is identifying vulnerabilities, outdated systems, and single points of failure. Software with a single point of failure can cause millions of devices to malfunction simultaneously. Protecting and monitoring vulnerable areas is crucial, and so is having robust recovery options. Updating and testing backup systems every quarter ensures data can be restored quickly in an outage. 
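One simple way to keep the quarterly cadence honest is to track when each backup was last restore-tested and flag anything older than a quarter. The sketch below assumes a plain mapping of system names to last test dates; the system names, the dates, and the 92-day threshold are illustrative assumptions, not figures from any real environment.

```python
from datetime import date, timedelta

# Assumed threshold: roughly one quarter.
QUARTER = timedelta(days=92)


def overdue_backup_tests(last_tested: dict[str, date], today: date) -> list[str]:
    """Return the systems whose last restore test is more than a quarter old."""
    return sorted(
        name for name, tested in last_tested.items() if today - tested > QUARTER
    )


# Hypothetical records of the most recent restore test per system.
records = {
    "billing-db": date(2024, 1, 10),
    "crm": date(2024, 6, 1),
}

overdue = overdue_backup_tests(records, today=date(2024, 7, 19))
print(overdue)
```

In this made-up data set, only `billing-db` is flagged, because its last test is more than 92 days before the reference date.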

Companies such as Black Wallet had their security systems compromised during the 19 July outage, and the experience exposed weaknesses in their overall security posture. CIO Remi Alli explains, “The lack of access to critical security insights put us at risk temporarily, but more importantly, it highlighted vulnerabilities in our overall security posture. We had to quickly shift some of our security protocols and rely on other measures, which was a reminder of the importance of having a robust backup plan and redundancies in place.”

Keeping all software and systems up to date is vital for maintaining a strong security posture. This includes operating systems, applications, security tools, and firmware. Regular patching addresses known vulnerabilities that could be exploited by attackers, reducing the risk of security incidents.  

 

The recent IT outage teaches valuable lessons in managing third-party vendor risk, as downtime causes organizations significant financial losses and reputational damage. Beyond nurturing existing vendor relationships, CIOs, CISOs, and board members must stay on top of incident response plans and regular security audits to enhance resilience and maintain business continuity.
