Episodes
In episode 7 of Getting There, Nora and Niall speak with Laura de Vesine of Datadog. Laura shares a unique perspective on the March 2023 Datadog outage, how the incident was handled internally, the damage the outage caused, and the many lessons learned.
Published 06/12/23
In episode 6 of Getting There, Nora and Niall discuss Twitter’s 2022 acquisition by Elon Musk. This talk unpacks the acquisition’s cultural and social implications, the fallout from massive layoffs, and the deprioritization of reliability standards within the company.
Published 03/14/23
In episode 5 of Getting There, Nora and Niall meet for a conversation at SREcon. This talk explores the history of the conference, the state of SRE, the role of company historians, insights on complexity management, and the value of SREs and systems thinkers.
Published 11/16/22
In episode 4 of Getting There, Nora Jones and Niall Murphy discuss the Atlassian outage of April 2022. This talk explores Atlassian’s 20-year history, key takeaways from this 14-day outage, surprising findings from the incident report, and critical discussion of Atlassian’s response.
Published 08/24/22
In episode 3 of Getting There, Nora Jones and Niall Murphy unpack the Roblox outage of October 2021. Together they review the incident report, discuss the contributing factors and the users affected, and examine the attributes of Roblox’s business model that led to this 73-hour outage.
Published 06/28/22
In episode 2 of Getting There, Nora and Niall discuss the socio-technical aspects of the AWS outages that occurred in December 2021. Together they unpack what happened, the inherent implications, and how organizations can learn from outages at such scale.
Published 03/04/22
In this inaugural episode of Getting There, co-hosts Nora Jones and Niall Murphy unpack the October 4, 2021 Facebook outage and the unforeseen challenges and responsibilities that emerge when responding to an incident of such magnitude.
Published 11/22/21