Ep 14 - Interp, latent robustness, RLHF limitations w/ Stephen

Artificial General Intelligence (AGI) Show with...

Ep 14 - Interp, latent robustness, RLHF limitations w/ Stephen Casper (PhD AI researcher, MIT)

Listen now

Description

We speak with Stephen Casper, or "Cas" as his friends call him. Cas is a PhD student at MIT in the Computer Science (EECS) department, in the Algorithmic Alignment Group advised by Prof Dylan Hadfield-Menell. Formerly, he worked with the Harvard Kreiman Lab and the Center for Human-Compatible AI (CHAI) at Berkeley. His work focuses on better understanding the internal workings of AI models (better known as “interpretability”), making them robust to various kinds of adversarial attacks, and ca...

More Episodes

See all »

Ep 13 - AI researchers expect AGI sooner w/ Katja Grace (Co-founder & Lead Researcher, AI Impacts)

We speak with Katja Grace. Katja is the co-founder and lead researcher at AI Impacts, a research group trying to answer key questions about the future of AI — when certain capabilities will arise, what will AI look like, how it will all go for humanity.We talk to Katja about:* How AI Impacts...

Published 06/19/24

Artificial General Intelligence (AGI) Show with Soroush Pour

Published 06/19/24

Ep 12 - Education & advocacy for AI safety w/ Rob Miles (YouTube host)

We speak with Rob Miles. Rob is the host of the “Robert Miles AI Safety” channel on YouTube, the single most popular AI alignment video series out there — he has 145,000 subscribers and his top video has ~600,000 views. He goes much deeper than many educational resources out there on alignment,...

Published 03/08/24