14 - Infra-Bayesian Physicalism with Vanessa Kosoy - Listen - AXRP

Description

Late last year, Vanessa Kosoy and Alexander Appel published some research under the heading of "Infra-Bayesian physicalism". But wait - what was infra-Bayesianism again? Why should we care? And what does any of this have to do with physicalism? In this episode, I talk with Vanessa Kosoy about these questions, and get a technical overview of how infra-Bayesian physicalism works and what its implications are.

Topics we discuss, and timestamps:

00:00:48 - The basics of infra-Bayes
00:08:32 - An invitation to infra-Bayes
00:11:23 - What is naturalized induction?
00:19:53 - How infra-Bayesian physicalism helps with naturalized induction
00:19:53 - Bridge rules
00:22:22 - Logical uncertainty
00:23:36 - Open source game theory
00:28:27 - Logical counterfactuals
00:30:55 - Self-improvement
00:32:40 - How infra-Bayesian physicalism works
00:32:47 - World models
00:39:20 - Priors
00:42:53 - Counterfactuals
00:50:34 - Anthropics
00:54:40 - Loss functions
00:56:44 - The monotonicity principle
01:01:57 - How to care about various things
01:08:47 - Decision theory
01:19:53 - Follow-up research
01:20:06 - Infra-Bayesian physicalist quantum mechanics
01:26:42 - Infra-Bayesian physicalist agreement theorems
01:29:00 - The production of infra-Bayesianism research
01:35:14 - Bridge rules and malign priors
01:45:27 - Following Vanessa's work

The transcript
Vanessa on the Alignment Forum

Research that we discuss:

Infra-Bayesian physicalism: a formal theory of naturalized induction
Updating ambiguous beliefs (contains the infra-Bayesian update rule)
Functional Decision Theory: A New Theory of Instrumental Rationality
Space-time embedded intelligence
Attacking the grain of truth problem using Bayes-Savage agents (generating a simplicity prior with Knightian uncertainty using oracle machines)
Quantity of experience: brain-duplication and degrees of consciousness (the thick wires argument)
Online learning in unknown Markov games
Agreeing to disagree (contains the Aumann agreement theorem)
What does the universal prior actually look like? (aka "the Solomonoff prior is malign")
The Solomonoff prior is malign
Eliciting Latent Knowledge
ELK Thought Dump, by Abram Demski
