79. Ryan Carey - What does your AI want?
Listen now
Description
AI safety researchers are increasingly focused on understanding what AI systems want. That may sound like an odd thing to care about: after all, aren’t we just programming AIs to want certain things by providing them with a loss function, or a number to optimize? Well, not necessarily. It turns out that AI systems can have incentives that aren’t necessarily obvious based on their initial programming. Twitter, for example, runs a recommender system whose job is nominally to figure out what tweets you’re most likely to engage with. And while that might make you think that it should be optimizing for matching tweets to people, another way Twitter can achieve its goal is by matching people to tweets — that is, making people easier to predict, by nudging them towards simplistic and partisan views of the world. Some have argued that’s a key reason that social media has had such a divisive impact on online political discourse. So the incentives of many current AIs already deviate from those of their programmers in important and significant ways — ways that are literally shaping society. But there’s a bigger reason they matter: as AI systems continue to develop more capabilities, inconsistencies between their incentives and our own will become more and more important. That’s why my guest for this episode, Ryan Carey, has focused much of his research on identifying and controlling the incentives of AIs. Ryan is a former medical doctor, now pursuing a PhD in machine learning and doing research on AI safety at Oxford University’s Future of Humanity Institute.
More Episodes
On the last episode of the Towards Data Science Podcast, host Jeremie Harris offers his perspective on the last two years of AI progress, and what he thinks it means for everything, from AI safety to the future of humanity. Going forward, Jeremie will be exploring these topics on the new...
Published 10/19/22
Published 10/19/22
Progress in AI has been accelerating dramatically in recent years, and even months. It seems like every other day, there’s a new, previously-believed-to-be-impossible feat of AI that’s achieved by a world-leading lab. And increasingly, these breakthroughs have been driven by the same, simple...
Published 10/12/22