21: The Man Who Killed ABBA
Description
Last episode, we talked about ABBA, our first A/B testing tool. We used it to test UI changes, new features, content recommendations: anything and everything we could think of. ABBA was so good and worked so well for so long… that we decided to get rid of it. Years of using ABBA taught us what makes for good experimentation, and we eventually realized we needed a better tool, built from scratch. Listen to find out why we pulled the plug on ABBA and how Spotify’s Experimentation Platform was born. And in case you missed it, a version of our internal platform will be available to the public as Confidence, a new enterprise product for developer teams. Read today’s announcement: “Coming Soon: Confidence — An Experimentation Platform from Spotify”.

But first, let’s talk buttons. Everyone always has so many questions about buttons. How do you know which color they should be? Or how big they should be? Or whether the corners should be round or square? The easy answer: an A/B test! If only all product experimentation were as simple as testing buttons.

Senior staff engineer Mark Grey returns to talk with host Dave Zolotusky, along with senior engineer Dima Kunin, who helped build Spotify’s Experimentation Platform and had the honor of finally retiring ABBA. They discuss the ins and outs of enabling experimentation at scale, including targeting criteria, controlling eligibility, the importance of measuring exposure, using properties instead of feature flags, the advantages of separating your app configuration from your experiments, fallback states, and sample ratio mismatches, along with all the other questions you have to answer about your experimentation process before you can even ask something as simple as “what color should a button be”, let alone “will this machine learning model consistently provide recommendations users appreciate over the next year”.

Plus, did you definitely, positively, absolutely eat the bread? Or did you just buy the bread? And a bonus trick question: What’s the difference between “treatments”, “variants”, and “groups”, and why is it always so hard to name things?

Learn more about ABBA and its successor, Spotify’s Experimentation Platform:

20: The Rise and Fall of ABBA: Listen to our previous episode with Mark Grey talking about our very first A/B testing tool, ABBA.
Spotify’s New Experimentation Platform (Part 1): How we went from ABBA to building EP, the internal experimentation platform we use today and that Confidence is based on.
Spotify’s New Experimentation Platform (Part 2): More features of EP, including our custom “salt machine”.

Plus, find out lots more about how we do experimentation at Spotify on our engineering blog, including a little light reading on automated salting and bucket reuse, choosing sequential testing frameworks, comparing quantiles at scale, and how we scale other scientific best practices across the org.

Read what else we’re nerding out about on the Spotify Engineering Blog: engineering.atspotify.com

You should follow us on Twitter @SpotifyEng and on LinkedIn!