Episode 12: Jacob Steinhardt, UC Berkeley, on machine learning safety, alignment and measurement
Description
Jacob Steinhardt (Google Scholar) (Website) is an assistant professor at UC Berkeley. His main research interest is designing machine learning systems that are reliable and aligned with human values. His specific research directions include robustness, reward specification and reward hacking, and scalable alignment. Highlights: 📜 “Test accuracy is a very limited metric.” 👨‍👩‍👧‍👦 “You might not be able to get lots of feedback on human values.” 📊 “I’m interested in measuring the progress in AI capabilities.”
More Episodes
Percy Liang is an associate professor of computer science and statistics at Stanford. These days, he’s interested in understanding how foundation models work, how to make them more efficient, modular, and robust, and how they shift the way people interact with AI—although he’s been working on...
Published 05/09/24
Seth Lazar is a professor of philosophy at the Australian National University, where he leads the Machine Intelligence and Normative Theory (MINT) Lab. His unique perspective bridges moral and political philosophy with AI, introducing much-needed rigor to the question of what will make for a good...
Published 03/12/24