26 - AI Governance with Elizabeth Seger
Description
The events of this year have highlighted important questions about the governance of artificial intelligence. For instance, what does it mean to democratize AI? And how should we balance benefits and dangers of open-sourcing powerful AI systems such as large language models? In this episode, I speak with Elizabeth Seger about her research on these questions.

Patreon: patreon.com/axrpodcast
Ko-fi: ko-fi.com/axrpodcast

Topics we discuss, and timestamps:
0:00:40 - What kinds of AI?
0:01:30 - Democratizing AI
0:04:44 - How people talk about democratizing AI
0:09:34 - Is democratizing AI important?
0:13:31 - Links between types of democratization
0:22:43 - Democratizing profits from AI
0:27:06 - Democratizing AI governance
0:29:45 - Normative underpinnings of democratization
0:44:19 - Open-sourcing AI
0:50:47 - Risks from open-sourcing
0:56:07 - Should we make AI too dangerous to open source?
1:00:33 - Offense-defense balance
1:03:13 - KataGo as a case study
1:09:03 - Openness for interpretability research
1:15:47 - Effectiveness of substitutes for open sourcing
1:20:49 - Offense-defense balance, part 2
1:29:49 - Making open-sourcing safer?
1:40:37 - AI governance research
1:41:05 - The state of the field
1:43:33 - Open questions
1:49:58 - Distinctive governance issues of x-risk
1:53:04 - Technical research to help governance
1:55:23 - Following Elizabeth's research

The transcript: https://axrp.net/episode/2023/11/26/episode-26-ai-governance-elizabeth-seger.html

Links for Elizabeth:
Personal website: elizabethseger.com
Centre for the Governance of AI (AKA GovAI): governance.ai

Main papers:
Democratizing AI: Multiple Meanings, Goals, and Methods: arxiv.org/abs/2303.12642
Open-sourcing highly capable foundation models: an evaluation of risks, benefits, and alternative methods for pursuing open source objectives: papers.ssrn.com/sol3/papers.cfm?abstract_id=4596436

Other research we discuss:
What Do We Mean When We Talk About "AI democratisation"? (blog post): governance.ai/post/what-do-we-mean-when-we-talk-about-ai-democratisation
Democratic Inputs to AI (OpenAI): openai.com/blog/democratic-inputs-to-ai
Collective Constitutional AI: Aligning a Language Model with Public Input (Anthropic): anthropic.com/index/collective-constitutional-ai-aligning-a-language-model-with-public-input
Against "Democratizing AI": johanneshimmelreich.net/papers/against-democratizing-AI.pdf
Adversarial Policies Beat Superhuman Go AIs: goattack.far.ai
Structured access: an emerging paradigm for safe AI deployment: arxiv.org/abs/2201.05159
Universal and Transferable Adversarial Attacks on Aligned Language Models (aka Adversarial Suffixes): arxiv.org/abs/2307.15043