Dan Hendrycks on Catastrophic AI Risks
Description
Dan Hendrycks joins the podcast again to discuss X.ai, how AI risk thinking has evolved, malicious use of AI, AI race dynamics between companies and between militaries, making AI organizations safer, and how representation engineering could help us understand AI traits like deception. You can learn more about Dan's work at https://www.safe.ai

Timestamps:
00:00 X.ai - Elon Musk's new AI venture
02:41 How AI risk thinking has evolved
12:58 AI bioengineering
19:16 AI agents
24:55 Preventing autocracy
34:11 AI race - corporations and militaries
48:04 Bulletproofing AI organizations
1:07:51 Open-source models
1:15:35 Dan's textbook on AI safety
1:22:58 Rogue AI
1:28:09 LLMs and value specification
1:33:14 AI goal drift
1:41:10 Power-seeking AI
1:52:07 AI deception
1:57:53 Representation engineering