3 - Negotiable Reinforcement Learning with Andrew Critch

Description

Link to the paper: Negotiable Reinforcement Learning for Pareto Optimal Sequential Decision-Making

Link to the transcript

Critch's Google Scholar profile

More Episodes

33 - RLHF Problems with Scott Emmons

Reinforcement Learning from Human Feedback, or RLHF, is one of the main ways that makers of large language models make them 'aligned'. But people have long noted that there are difficulties with this approach when the models are smarter than the humans providing feedback. In this episode, I talk...

Published 06/12/24

AXRP - the AI X-risk Research Podcast

32 - Understanding Agency with Jan Kulveit

What's the difference between a large language model and the human brain? And what's wrong with our theories of agency? In this episode, I chat about these questions with Jan Kulveit, who leads the Alignment of Complex Systems research group. Patreon: patreon.com/axrpodcast Ko-fi:...

Published 05/30/24