15 - Natural Abstractions with John Wentworth
Description
Why does anybody care about natural abstractions? Do they somehow relate to math, or value learning? How do E. coli bacteria find sources of sugar? All these questions and more will be answered in this interview with John Wentworth, where we talk about his research plan of understanding agency via natural abstractions.

Topics we discuss, and timestamps:
00:00:31 - Agency in E. coli
00:04:59 - Agency in financial markets
00:08:44 - Inferring agency in real-world systems
00:16:11 - Selection theorems
00:20:22 - Abstraction and natural abstractions
00:32:42 - Information at a distance
00:39:20 - Why the natural abstraction hypothesis matters
00:44:48 - Unnatural abstractions used by humans?
00:49:11 - Probability, determinism, and abstraction
00:52:58 - Whence probabilities in deterministic universes?
01:02:37 - Abstraction and maximum entropy distributions
01:07:39 - Natural abstractions and impact
01:08:50 - Learning human values
01:20:47 - The shape of the research landscape
01:34:59 - Following John's work

The transcript
John on LessWrong

Research that we discuss:
Alignment by default - contains the natural abstraction hypothesis
The telephone theorem
Generalizing Koopman-Pitman-Darmois
The plan
Understanding deep learning requires rethinking generalization - deep learning can fit random data
A closer look at memorization in deep networks - deep learning learns before memorizing
Zero-shot coordination
A new formalism, method, and open issues for zero-shot coordination
Conservative agency via attainable utility preservation
Corrigibility

Errata:
E. coli has ~4,400 genes, not 30,000.
A typical adult human body has thousands of moles of water in it, and therefore must consist of well more than 10 moles total.
More Episodes
Reinforcement Learning from Human Feedback, or RLHF, is one of the main ways that makers of large language models make them 'aligned'. But people have long noted that there are difficulties with this approach when the models are smarter than the humans providing feedback. In this episode, I talk...
Published 06/12/24
What's the difference between a large language model and the human brain? And what's wrong with our theories of agency? In this episode, I chat about these questions with Jan Kulveit, who leads the Alignment of Complex Systems research group. Patreon: patreon.com/axrpodcast Ko-fi:...
Published 05/30/24