14 - Infra-Bayesian Physicalism with Vanessa Kosoy
Description
Late last year, Vanessa Kosoy and Alexander Appel published some research under the heading of "Infra-Bayesian physicalism". But wait - what was infra-Bayesianism again? Why should we care? And what does any of this have to do with physicalism? In this episode, I talk with Vanessa Kosoy about these questions, and get a technical overview of how infra-Bayesian physicalism works and what its implications are.

Topics we discuss, and timestamps:
00:00:48 - The basics of infra-Bayes
00:08:32 - An invitation to infra-Bayes
00:11:23 - What is naturalized induction?
00:19:53 - How infra-Bayesian physicalism helps with naturalized induction
00:19:53 - Bridge rules
00:22:22 - Logical uncertainty
00:23:36 - Open source game theory
00:28:27 - Logical counterfactuals
00:30:55 - Self-improvement
00:32:40 - How infra-Bayesian physicalism works
00:32:47 - World models
00:39:20 - Priors
00:42:53 - Counterfactuals
00:50:34 - Anthropics
00:54:40 - Loss functions
00:56:44 - The monotonicity principle
01:01:57 - How to care about various things
01:08:47 - Decision theory
01:19:53 - Follow-up research
01:20:06 - Infra-Bayesian physicalist quantum mechanics
01:26:42 - Infra-Bayesian physicalist agreement theorems
01:29:00 - The production of infra-Bayesianism research
01:35:14 - Bridge rules and malign priors
01:45:27 - Following Vanessa's work

The transcript
Vanessa on the Alignment Forum

Research that we discuss:
Infra-Bayesian physicalism: a formal theory of naturalized induction
Updating ambiguous beliefs (contains the infra-Bayesian update rule)
Functional Decision Theory: A New Theory of Instrumental Rationality
Space-time embedded intelligence
Attacking the grain of truth problem using Bayes-Savage agents (generating a simplicity prior with Knightian uncertainty using oracle machines)
Quantity of experience: brain-duplication and degrees of consciousness (the thick wires argument)
Online learning in unknown Markov games
Agreeing to disagree (contains the Aumann agreement theorem)
What does the universal prior actually look like? (aka "the Solomonoff prior is malign")
The Solomonoff prior is malign
Eliciting Latent Knowledge
ELK Thought Dump, by Abram Demski