AXRP - the AI X-risk Research Podcast

30 - AI Security with Jeffrey Ladish

More Episodes

31 - Singular Learning Theory with Daniel Murfet

Published 05/07/24

29 - Science of Deep Learning with Vikrant Varma

In 2022, it was announced that a fairly simple method can be used to extract the true beliefs of a language model on any given topic, without having to actually understand the topic at hand. Earlier, in 2021, it was announced that neural networks sometimes 'grok': that is, when training them on...

Published 04/25/24
