Slight Reliability
More Episodes
This week I'm joined by Karanveer Anand, SRE Technical Program Manager at Google, to discuss blameless post-mortems. We cover:
🦅 The recent CrowdStrike outage and their public post-mortem
🚑 When do we do a blameless post-mortem?
😕 How do we do a blameless post-mortem?
✅ How do we make sure action...
Published 09/03/24
This week Zach Michel from https://middleware.io/ and I discuss the state of OpenTelemetry and what it means to adopt it. We cover:
🌩️ Achieving observability in a SaaS world
🥫 Context propagation - the magic sauce of OTEL
🚪 The telemetry gateway concept and leveraging the OTEL collector
🪵 The state...
Published 08/27/24
In Episode 80, Niall Murphy talked about the need for SREs to be better at articulating the value of our work. In this episode I'm joined by ex-Googler and Engineering Director (SRE) at Culture Amp Artem Yakimenko to explore how we might achieve this. We discuss both quantifiable and qualitative...
Published 07/24/24