Episodes
On the last episode of the Towards Data Science Podcast, host Jeremie Harris offers his perspective on the last two years of AI progress, and what he thinks it means for everything, from AI safety to the future of humanity. Going forward, Jeremie will be exploring these topics on the new Gladstone AI podcast. ***  Intro music: - Artist: Ron Gelinas - Track Title: Daybreak Chill Blend (original mix) - Link to Track: https://youtu.be/d8Y2sKIgFWc Chapters: 0:00 Intro 6:00 The Bitter...
Published 10/19/22
Progress in AI has been accelerating dramatically in recent years, and even months. It seems like every other day, there’s a new, previously-believed-to-be-impossible feat of AI that’s achieved by a world-leading lab. And increasingly, these breakthroughs have been driven by the same simple idea: AI scaling. For those who haven’t been following the AI scaling saga, scaling means training AI systems with larger models, using increasingly absurd quantities of data and processing power. So...
Published 10/12/22
It’s no secret that a new generation of powerful and highly scaled language models is taking the world by storm. Companies like OpenAI, AI21Labs, and Cohere have built models so versatile that they’re powering hundreds of new applications, and unlocking entire new markets for AI-generated text. In light of that, I thought it would be worth exploring the applied side of language modelling — to dive deep into one specific language model-powered tool, to understand what it means to build apps...
Published 10/05/22
Imagine you’re a big hedge fund, and you want to go out and buy yourself some data. Data is really valuable for you — it’s literally going to shape your investment decisions and determine your outcomes. But the moment you receive your data, a cold chill runs down your spine: how do you know your data supplier gave you the data they said they would? From your perspective, you’re staring down 100,000 rows in a spreadsheet, with no way to tell if half of them were made up — or maybe more for...
Published 09/28/22
Today, we live in the era of AI scaling. It seems like everywhere you look, people are pushing to make large language models larger or more multi-modal, leveraging ungodly amounts of processing power to do it. But although that’s one of the defining trends of the modern AI era, it’s not the only one. At the far opposite extreme from the world of hyperscale transformers and giant dense nets is the fast-evolving world of TinyML, where the goal is to pack AI systems onto small edge...
Published 09/21/22
Deep learning models — transformers in particular — are defining the cutting edge of AI today. They’re based on an architecture called an artificial neural network, as you probably already know if you’re a regular Towards Data Science reader. And if you are, then you might also already know that as their name suggests, artificial neural networks were inspired by the structure and function of biological neural networks, like those that handle information processing in our brains. So it’s a...
Published 09/14/22
It’s no secret that the US and China are geopolitical rivals. And it’s also no secret that that rivalry extends into AI — an area both countries consider to be strategically critical. But in a context where potentially transformative AI capabilities are being unlocked every few weeks, many of which lend themselves to military applications with hugely destabilizing potential, you might hope that the US and China would have robust agreements in place to deal with things like runaway conflict...
Published 09/07/22
There’s a website called thispersondoesnotexist.com. When you visit it, you’re confronted by a high-resolution, photorealistic AI-generated picture of a human face. As the website’s name suggests, there’s no human being on the face of the earth who looks quite like the person staring back at you on the page. Each of those generated pictures is a piece of data that captures so much of the essence of what it means to look like a human being. And yet they do so without telling you anything...
Published 05/18/22
Today’s guests are two ML researchers with world-class pedigrees who decided to build a company that puts AI on the blockchain. Now to most people — myself included — “AI on the blockchain” sounds like a winning entry in some kind of startup buzzword bingo. But what I discovered talking to Jacob and Ala was that they actually have good reasons to combine those two ingredients together. At a high level, doing AI on a blockchain allows you to decentralize AI research and reward labs for building better models,...
Published 05/12/22
As you might know if you follow the podcast, we usually talk about the world of cutting-edge AI capabilities, and some of the emerging safety risks and other challenges that the future of AI might bring. But I thought that for today’s episode, it would be fun to change things up a bit and talk about the applied side of data science, and how the field has evolved over the last year or two. And I found the perfect guest to do that with: her name is Sadie St. Lawrence, and among other things,...
Published 05/04/22
If the name data2vec sounds familiar, that’s probably because it made quite a splash on social and even traditional media when it came out, about two months ago. It’s an important entry in what is now a growing list of strategies that are focused on creating individual machine learning architectures that handle many different data types, like text, image and speech. Most self-supervised learning techniques involve getting a model to take some input data (say, an image or a piece of text) and...
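To make that idea concrete, here’s a minimal, hypothetical sketch of the masked-prediction pattern that many self-supervised methods share, written in PyTorch purely for illustration. The random tokens, the 15% mask rate and the tiny encoder are all assumptions of this example, and none of it reflects data2vec’s actual training objective.

# Generic masked-prediction sketch (illustrative only; not data2vec's objective).
import torch
import torch.nn as nn

vocab_size, dim, seq_len = 1000, 64, 32

# Toy encoder: embed tokens, contextualize them, then predict the original ids.
encoder_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
model = nn.Sequential(
    nn.Embedding(vocab_size, dim),
    nn.TransformerEncoder(encoder_layer, num_layers=2),
    nn.Linear(dim, vocab_size),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(1, vocab_size, (8, seq_len))   # stand-in "text" batch; id 0 is reserved
mask = torch.rand(tokens.shape) < 0.15                # hide roughly 15% of positions
corrupted = tokens.masked_fill(mask, 0)               # token id 0 plays the role of [MASK]

logits = model(corrupted)                             # (batch, seq_len, vocab_size)
loss = loss_fn(logits[mask], tokens[mask])            # score predictions at masked positions only
loss.backward()
optimizer.step()

The point is just the shape of the recipe: corrupt part of the input, then train the model to recover what was hidden from the surrounding context, so no human labels are needed.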
Published 04/27/22
AI scaling has really taken off. Ever since GPT-3 came out, it’s become clear that one of the things we’ll need to do to move beyond narrow AI and towards more generally intelligent systems is to massively scale up the size of our models, the amount of processing power they consume and the amount of data they’re trained on, all at the same time. That’s led to a huge wave of highly scaled models that are incredibly expensive to train, largely because of their enormous compute...
Published 04/20/22
There’s an idea in machine learning that most of the progress we see in AI doesn’t come from new algorithms or model architectures. Instead, some argue, progress almost entirely comes from scaling up compute power, datasets and model sizes — and besides those three ingredients, nothing else really matters. Through that lens, the history of AI becomes the history of processing power and compute budgets. And if that turns out to be true, then we might be able to do a decent job of predicting AI...
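As a rough, hedged illustration of what that kind of compute-based forecasting looks like in practice, here’s a toy power-law fit of loss against training compute in Python. Every number below is invented for the example; nothing comes from the episode.

# Toy scaling-law fit: loss ≈ a * compute**(-b), with entirely made-up data points.
import numpy as np

compute = np.array([1e18, 1e19, 1e20, 1e21, 1e22])   # hypothetical training FLOPs
loss = np.array([4.1, 3.4, 2.9, 2.5, 2.2])           # hypothetical evaluation losses

# A power law becomes a straight line in log-log space, so a linear fit
# recovers the exponent b and the prefactor a.
slope, intercept = np.polyfit(np.log10(compute), np.log10(loss), deg=1)
a, b = 10 ** intercept, -slope
print(f"fit: loss ≈ {a:.1f} * compute**(-{b:.3f})")
print(f"extrapolated loss at 1e24 FLOPs: {a * 1e24 ** (-b):.2f}")

If the scaling view is right, simple curves like this carry most of the predictive weight; if it isn’t, they’ll miss whatever algorithmic breakthroughs come next.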
Published 04/13/22
Generating well-referenced and accurate Wikipedia articles has always been an important problem: Wikipedia has essentially become the Internet's encyclopedia of record, and hundreds of millions of people use it to understand the world. But over the last decade Wikipedia has also become a critical source of training data for data-hungry text generation models. As a result, any shortcomings in Wikipedia’s content are at risk of being amplified by the text generation tools of the future. If one...
Published 04/06/22
Trustworthy AI is one of today’s most popular buzzwords. But although everyone seems to agree that we want AI to be trustworthy, definitions of trustworthiness are often fuzzy or inadequate. Maybe that shouldn’t be surprising: it’s hard to come up with a single set of standards that add up to “trustworthiness”, and that apply just as well to a Netflix movie recommendation as a self-driving car. So maybe trustworthy AI needs to be thought of in a more nuanced way — one that reflects the...
Published 03/30/22
Until recently, very few people were paying attention to the potential malicious applications of AI. And that made some sense: in an era where AIs were narrow and had to be purpose-built for every application, you’d need an entire research team to develop AI tools for malicious applications. Since it’s more profitable (and safer) for that kind of talent to work in the legal economy, AI didn’t offer much low-hanging fruit for malicious actors. But today, that’s all changing. As AI becomes...
Published 03/23/22
Imagine, for example, an AI that’s trained to identify cows in images. Ideally, we’d want it to learn to detect cows based on their shape and colour. But what if the cow pictures we put in the training dataset always show cows standing on grass? In that case, we have a spurious correlation between grass and cows, and if we’re not careful, our AI might learn to become a grass detector rather than a cow detector. Even worse, we might only realize that’s happened once we’ve deployed it in the...
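Here’s a minimal, hypothetical sketch of that failure mode on synthetic data: a “grass” feature is perfectly correlated with the “cow” label at training time, absent at test time, and the classifier’s accuracy collapses accordingly. Everything below is made up for illustration.

# Synthetic spurious-correlation demo (cow vs. grass); all data is invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Training set: a weak, noisy "cow shape" signal plus a "grass in background" flag
# that is present exactly when the label is cow (the spurious correlation).
y_train = rng.integers(0, 2, n)
cow_shape = y_train + rng.normal(0.0, 1.0, n)
grass = y_train.astype(float)
X_train = np.column_stack([cow_shape, grass])

clf = LogisticRegression().fit(X_train, y_train)

# Test set: the cows now appear on sand, so the grass flag is 0 for every image.
y_test = rng.integers(0, 2, n)
X_test = np.column_stack([y_test + rng.normal(0.0, 1.0, n), np.zeros(n)])

print("train accuracy:", clf.score(X_train, y_train))   # near-perfect, thanks to grass
print("test accuracy :", clf.score(X_test, y_test))     # much lower: the model leaned on grass

In a real vision pipeline the grass signal would be buried in pixels rather than handed over as a tidy column, but the dynamic is the same: the easiest predictive shortcut wins unless the data or the training setup rules it out.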
Published 03/09/22
Google the phrase “AI over-hyped”, and you’ll find literally dozens of articles from the likes of Forbes, Wired, and Scientific American, all arguing that “AI isn’t really as impressive as it seems from the outside,” and “we still have a long way to go before we come up with *true* AI, don’t you know.” Amusingly, despite the universality of the “AI is over-hyped” narrative, the statement that “We haven’t made as much progress in AI as you might think™️” is often framed as somehow being an...
Published 03/02/22
It’s no secret that AI systems are being used in more and more high-stakes applications. As AI eats the world, it’s becoming critical to ensure that AI systems behave robustly — that they don’t get thrown off by unusual inputs, and start spitting out harmful predictions or recommending dangerous courses of action. If we’re going to have AI drive us to work, or decide who gets bank loans and who doesn’t, we’d better be confident that our AI systems aren’t going to fail because of a freak...
Published 02/09/22
Until very recently, the study of human disease involved looking at big things — like organs or macroscopic systems — and figuring out when and how they can stop working properly. But that’s all started to change: in recent decades, new techniques have allowed us to look at disease in a much more detailed way, by examining the behaviour and characteristics of single cells. One class of those techniques, now known as single-cell genomics — the study of gene expression and function at the level...
Published 02/02/22
If you were scrolling through your newsfeed in late September 2021, you may have caught this splashy headline from The Times of London that read, “Can this man save the world from artificial intelligence?”. The man in question was Mo Gawdat, an entrepreneur and senior tech executive who spent several years as the Chief Business Officer at GoogleX (now called X Development), Google’s semi-secret research facility that experiments with moonshot projects like self-driving cars, flying vehicles,...
Published 01/26/22
Today’s episode is somewhat special, because we’re going to be talking about what might be the first solid quantitative study of the power-seeking tendencies that we can expect advanced AI systems to have in the future. For a long time, there’s kind of been this debate in the AI safety world between people who worry that powerful AIs could eventually displace, or even eliminate, humanity altogether as they find more clever, creative and dangerous ways to optimize their reward metrics on...
Published 01/19/22
Until recently, AI systems have been narrow — they’ve only been able to perform the specific tasks that they were explicitly trained for. And while narrow systems are clearly useful, the holy grail of AI is to build more flexible, general systems. But that can’t be done without good performance metrics that we can optimize for — or that we can at least use to measure generalization ability. Somehow, we need to figure out what number needs to go up in order to bring us closer to...
Published 01/12/22
2021 has been a wild ride in many ways, but its wildest features might actually be AI-related. We’ve seen major advances in everything from language modeling to multi-modal learning, open-ended learning and even AI alignment. So, we thought, what better way to take stock of the big AI-related milestones we’ve reached in 2021 than a cross-over episode with our friends over at the Last Week In AI podcast. *** Intro music: - Artist: Ron Gelinas - Track Title: Daybreak Chill Blend (original...
Published 01/05/22