#186 Business Continuity lessons learnt from CrowdStrike
Description
In July 2024, a logic error in an update to CrowdStrike’s Falcon software caused 8.5 million Windows computers to crash. While a fix was pushed out shortly afterwards, the nature of the error meant that a full recovery of all affected machines took weeks to complete. Many businesses were caught up in the disruption, whether they were affected directly or by proxy through affected suppliers. So, what can businesses learn from this? Today, Ian Battersby and Steve Mason discuss the aftermath of the CrowdStrike crash, the importance of good business continuity and the actions all businesses should take to ensure they are prepared in the event of an IT incident.

You’ll learn
·      What happened following the CrowdStrike crash?
·      How long did it take businesses to recover?
·      Which ISO management system standards would this impact?
·      How can you use your management system to address the effects of an IT incident?
·      How would this change your understanding of the needs and expectations of interested parties?
·      How do risk assessments factor in where IT incidents are concerned?

Resources
·      Isologyhub
·      ISO 22301 Business Continuity

In this episode, we talk about:

[00:30] Join the isologyhub – To get access to a suite of ISO-related tools, training and templates, simply head on over to isologyhub.com to either sign up or book a demo.

[02:05] Episode summary: Ian Battersby is joined by Steve Mason to discuss the recent CrowdStrike crash, the implications for your management system and the business continuity lessons learned that you can apply ahead of any potential future incidents.

[03:00] What happened following the CrowdStrike crash? – In short, an update to CrowdStrike’s Falcon software brought down computer systems globally. 8.5 million Windows systems (in reality, less than 1% of all Windows systems) were affected by this error. Even so, the damage was felt across key pillars of our societal infrastructure, with hospitals and transport services such as trains and airlines among the worst affected.

[04:45] How long did it take CrowdStrike to issue a fix? – CrowdStrike fixed the issue in about 30 minutes, but this didn’t mean that affected computers were automatically fixed. In many cases, applying the fix meant that engineers had to go on site to many different locations, which is both time-consuming and costly, and Microsoft said that some computers might need as many as 15 reboots to clear the problem. So, a fix that many hoped would solve the issue ended up taking a few weeks to fully resolve, as not everyone has IT or tech support in the field to carry out a manual reboot. A lot of businesses were caught out because they don’t factor this into their recovery time, with some assuming that an issue like this is guaranteed to be fixed within 48 hours, which is not something you can promise. You need to be realistic when filling out a Business Impact Assessment (BIA); a rough worked example of this arithmetic appears after the questions below.

[07:55] How do you know in advance if an outage will need physical intervention to resolve? – There is a lesson to be learned from this most recent issue. You need to look at your current business continuity plans and ask yourself:
·      What systems do we use?
·      How reliable are the third-party applications that we use?
·      If an issue like this were to recur, how would it affect us?
·      Do we have the necessary resources to fix it, e.g. staff on site if needed?
Third parties will have a lot of clients, and some may even prioritise those paying for a more premium package, so you can’t always count on them for a quick fix.
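To make the recovery-time point concrete, here is a minimal sketch in Python of a simple continuity register that estimates how long a hands-on, on-site recovery might take when a third-party fix cannot simply be pushed out remotely. The system names, figures and field names are entirely hypothetical and are not taken from the episode; treat it as a back-of-the-envelope aid alongside your BIA, not a prescribed method.

```python
# A minimal sketch of a continuity register for third-party-dependent systems.
# All names and figures below are hypothetical; substitute your own BIA data.
from dataclasses import dataclass


@dataclass
class SystemEntry:
    name: str                  # system or service
    vendor: str                # third-party supplier
    machines_affected: int     # endpoints that would need attention
    remote_fix_possible: bool  # can the vendor's fix be applied remotely?
    minutes_per_machine: int   # realistic hands-on time if a manual fix is needed


def estimated_recovery_hours(entry: SystemEntry, engineers_available: int) -> float:
    """Very rough estimate of elapsed recovery time for one system."""
    if entry.remote_fix_possible:
        return 1.0  # assumption: a remotely deployed fix lands within an hour
    total_minutes = entry.machines_affected * entry.minutes_per_machine
    return total_minutes / max(engineers_available, 1) / 60


register = [
    SystemEntry("Endpoint protection", "Example Security Ltd", 1200, False, 20),
    SystemEntry("Payroll portal", "Example SaaS Co", 0, True, 0),
]

for entry in register:
    hours = estimated_recovery_hours(entry, engineers_available=4)
    print(f"{entry.name}: ~{hours:.1f} hours (~{hours / 24:.1f} days)")
```

Even with modest figures like these, a manual, on-site recovery works out at several days rather than hours, which is exactly why the episode cautions against promising a 48-hour turnaround in a BIA.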
[09:10] How does this impact our businesses in terms of our management standards? – When we begin to analyse how this has impacted our management systems, we can’t afford to say ‘We don’t use CrowdStrike…’