92. Daniel Filan - Peering into neural nets for AI safety
Description
Many AI researchers think it's going to be hard to design AI systems that remain safe as AI capabilities increase. We've already seen on the podcast that the field of AI alignment has emerged to tackle this problem, but a related effort is also being directed at a separate dimension of the safety problem: AI interpretability. Our ability to interpret how AI systems process information and make decisions will likely become an important factor in assuring the reliability of AIs in the future. My guest for this episode has focused his research on exactly that topic. Daniel Filan is an AI safety researcher at Berkeley, where he's supervised by AI pioneer Stuart Russell. Daniel also runs AXRP, a podcast dedicated to technical AI alignment research.
More Episodes
On the last episode of the Towards Data Science Podcast, host Jeremie Harris offers his perspective on the last two years of AI progress, and what he thinks it means for everything from AI safety to the future of humanity. Going forward, Jeremie will be exploring these topics on the new...
Published 10/19/22
Progress in AI has been accelerating dramatically in recent years, and even in recent months. It seems like every other day there's a new feat, previously believed to be impossible, achieved by a world-leading AI lab. And increasingly, these breakthroughs have been driven by the same, simple...
Published 10/12/22