Episodes
The Digital Operational Resilience Act (DORA) goes into effect on January 17, 2025, and financial institutions serving the EU will need to meet an enhanced set of requirements related to risk management, network resilience, and incident reporting. While DORA is directly applicable to EU financial institutions, it prompts important discussions about resilience and ensuring digital experiences, and those discussions are relevant to all IT operations teams, regardless of industry or region. Tune in to the...
Published 11/08/24
A recent Salesforce outage highlighted the limitations of status pages and the importance of considering a variety of data points when identifying the source of an outage. Tune in to hear The Internet Report team discuss what happened and why. They’ll also share insights from a recent Microsoft Outlook outage and cover the latest Internet outage trends. Listen now or use the chapters below to jump to the sections that most interest you. CHAPTERS: 00:00 Intro; 00:48 Salesforce Outage; 10:00...
Published 10/25/24
A recent certificate problem impacted ServiceNow, and other issues prevented users from accessing key cloud services including Microsoft 365, Azure Virtual Desktop, and Workday. Tune in to hear what happened during these incidents and a separate data center fire that caused a Reliance Jio outage for customers across multiple areas of India. Listen now or use the chapters below to jump to the sections that most interest you. CHAPTERS: 00:00 Intro; 00:59 ServiceNow Outage; 03:20 Microsoft 365...
Published 10/04/24
During high-traffic seasons like Black Friday or a much-anticipated product launch, maintaining good digital experiences for customers is vital. We’ve all heard tales of floods of eager shoppers crashing a website during a major sale—leaving them unable to make their coveted purchases. To guard against a breakdown like this during high-traffic periods, companies sometimes use various traffic management strategies such as digital waiting rooms. In this episode, The Internet Report team...
Published 09/21/24
Let’s dive into the fascinating world of subsea cables. With special guest Murray Burling, Executive Director of Oceans and Environment at RPS, we’ll explore the current subsea cable ecosystem and chat about what the future might hold. Tune in for insights on how important subsea cables are for today’s digital experiences, how decisions are made on where to place them, the consequences of cable cuts, and route diversity and Internet resilience. CHAPTERS: 00:00 Intro; 02:29 Current Subsea Cable...
Published 09/06/24
Explore the recent Google Cloud and GitHub outages, plus get insights from a network perspective into the August 12 X livestream event featuring Elon Musk and Donald Trump. In the case of Google Cloud, a power issue in one of its European regions impacted connectivity and affected several services and networking equipment. The problems disrupted connectivity into the region as well as some Partner Interconnect connections and associated routes between other Google regions. Traffic to and from...
Published 08/23/24
This week, The Internet Report team and special guest Dave Anderson—a tech industry veteran and co-host of "A Very Melbourne Podcast," which covers the Australian Football League and more—are chatting about how to assure great digital experiences at major sporting events. Large sporting events are always logistically complex, and today that’s even more the case with digital technology permeating every part of operational and experience delivery. And due to the real-time nature of live sports,...
Published 08/10/24
On July 19, many organizations around the globe, including airlines, banks, and hospitals, experienced outages as Windows machines reportedly got stuck in a boot loop that ultimately resulted in the Blue Screen of Death (BSOD). These disruptions had a common source: an update from CrowdStrike, a managed detection and response (MDR) service used to protect Windows endpoints from attack. Tune in to hear The Internet Report team’s insights on this CrowdStrike update and the ensuing IT outages....
Published 07/26/24
On May 17, X reached a major milestone when the social media platform completed its full migration from twitter.com to x.com. While the number and frequency of outages did increase after the company’s acquisition by Elon Musk, there don’t appear to have been any significant disruptions to the X.com platform following the domain migration. In this week’s podcast, The Internet Report team discusses what they observed during (and after) the domain migration, and analyzes X’s performance pre-...
Published 07/16/24
Three recent outages at Starlink, Charles Schwab, and the Internet Archive highlight key reminders for NetOps teams around backup options, the role of intelligence, and understanding your end-to-end service delivery chain. A subset of Starlink users were unable to establish a connection; some users of Schwab.com and its apps may have found themselves unable to transact or trade due to an authentication issue; and the Internet Archive and the Wayback Machine were intermittently overwhelmed by...
Published 06/21/24
Believe it or not, we’re already about halfway through 2024. Looking at the outage data from this year so far, we see continued evolution, following patterns observed over the past few years. Notably, the percentage of cloud service provider (CSP) outages is still increasing, and at a more accelerated rate than seen in recent years. Tune in to learn more about this trend and other themes we’re noticing in the Internet ecosystem, as well as tips for how IT teams can respond to these...
Published 06/14/24
When it comes to assuring great digital experiences for your users, intermittent issues can be incredibly difficult to discover and diagnose because the service is both working and not working simultaneously—or, it may simply be running slow. Some users may experience issues, while for others, everything will work just fine. In this week’s episode, The Internet Report team will explore the complexities that intermittent issues can bring by examining two recent incidents at Meta and...
Published 05/25/24
Explore what happened during recent outages at google.com, X (formerly Twitter), and CDN service jsDelivr. The Internet Report team will also discuss why a detailed understanding of every component in your service delivery chain is vital to maintain the availability and resiliency of your service. If even one component encounters challenges, the entire service can be impacted. In jsDelivr’s case, for example, the detail at issue was an expired cert, which created problems serving content and...
Published 05/10/24
Go under the hood of a ChatGPT outage, H&R Block’s Tax Day disruption, and more incidents from the past few weeks. The Internet Report team will also discuss Microsoft’s update on recent subsea cable cuts and the latest global outage trends. CHAPTERS: 00:00 Intro; 00:57 ChatGPT Outage; 03:35 Revisiting West Coast of Africa Cable Cuts; 09:07 H&R Block Outage; 11:32 Sky Mobile Outage; 12:25 Outage on unpkg CDN; 14:06 PlayHQ Outage; 16:40 Outage Trends: By the Numbers; 19:33 Get in Touch. For...
Published 04/27/24
With tax season coming to a close in the United States, IT teams at tax preparation companies and other organizations in the industry will be taking extra care to make sure that their systems can handle a spike in traffic due to a potential last-minute rush of filings. Tune in to hear The Internet Report hosts discuss how IT teams can navigate major spikes in demand and give customers the best possible digital experience, whether it’s Tax Day, Black Friday, or another high-traffic...
Published 04/13/24
The end-to-end delivery of modern digital services can introduce a complex web of dependencies and failure points, which can stem from direct relationships as well as third-party providers, introducing layers of abstraction for operations teams to keep track of. Managing this complex ecosystem can be challenging. Unexpected issues may arise from seemingly insignificant components, surprising even the largest, most technologically sophisticated organizations. For example, in recent weeks,...
Published 03/30/24
Over a two-day period this past week, major social media platforms—Meta’s Facebook and Instagram, LinkedIn, and Discord—all experienced disruptions. In the same timeframe, Comcast was also impacted by an outage that affected access to specific services and applications. Meta experienced issues with its log-in process, Discord navigated unexpectedly high load volumes, Comcast dealt with 100% packet loss in part of its backbone, and—the following day—LinkedIn worked its way through a backend...
Published 03/16/24
Load is a fundamental but, at times, challenging variable for networks and operations teams to handle. In the past few weeks, ThousandEyes saw various load-related problems impact organizations including Google Cloud, Front, several Australian banks, and Minnesota State University Moorhead. Tune in to learn more about what happened during these incidents, as well as hear our commentary on the recent outage impacting AT&T. Use the timestamps below to jump to the sections that most interest...
Published 03/04/24
When outages happen, it’s what you do next that matters. It’s important to have a backup plan in place that you can quickly activate to minimize the impact of an incident. Over the past two weeks, companies initiated a range of resiliency actions, including asking customers to use alternate authentication methods (or to avoid logging out of a service), setting up a new contact center to re-establish lines of communication, and reverting to manual processes. Tune in to learn more about what...
Published 02/17/24
The ThousandEyes Internet Intelligence team joins us from Cisco Live in Amsterdam, talking about a major theme from the event: security. Tune in to hear their thoughts on how visibility can help companies in their security efforts, the sovereignty of data in flight, and why you don’t have to choose between security and performance. CHAPTERS: 00:00 Intro; 01:09 Evolving Security Landscape; 04:53 Security Excellence & Optimal Digital Experience; 10:13 Sovereignty of Data in Flight; 14:57 Key...
Published 02/10/24
What happened during the recent Microsoft Teams and Azure disruptions? Go under the hood of these incidents and also explore other recent disruptions in this week’s Pulse Update. CHAPTERS: 01:03 Network issue leads to Microsoft Teams service disruption; 04:09 Azure Resource Manager exhausts capacity, causing service issues; 06:20 Oracle Cloud experiences network outage; 09:56 Jira users encounter 503s and other errors; 10:30 Sage outage impacts South Africa; 11:08 Red Hat experiences four...
Published 02/03/24
What caused recent dips in performance for OpenAI’s ChatGPT? Tune in to hear The Internet Report team unpack this and other recent disruptions, including a hack that led to an outage at the Spanish branch of the Orange mobile network, and a blip for customers of the cloud services provider DigitalOcean. They’ll also cover the outage trends they’re seeing in 2024 so far and how extreme cold weather can cause problems for data centers. For more insights on outage trends and analysis of some of...
Published 01/20/24
As they launch into 2024, organizations are facing a different outage landscape than they had at the start of 2023. The past year saw increases in cloud service provider (CSP) outages, application outages, and the percentage of U.S.-centric outages—all of which point to an evolution in the way outages happen and the need for different strategies to minimize the impact of disruptions. In this episode, Mike Hicks (Principal Solutions Analyst at ThousandEyes) unpacks these trends and shares...
Published 01/13/24
As 2023 comes to a close, in the spirit of Dickens’ holiday classic “A Christmas Carol,” let’s reflect on the valuable insights left by the ghosts of network operations teams past, present, and yet to come. Tune in to hear host Mike Hicks (Principal Solutions Analyst at ThousandEyes) discuss lessons from the NetOps teams of the past, the current state of NetOps, and what the future might hold, all with the goal of helping teams take steps to optimize performance and deliver delightful digital...
Published 12/21/23