Tim & Heinrich — Democraticizing Reinforcement Learning Research
Listen now
Description
Since reinforcement learning requires hefty compute resources, it can be tough to keep up without a serious budget of your own. Find out how the team at Facebook AI Research (FAIR) is looking to increase access and level the playing field with the help of NetHack, an archaic rogue-like video game from the late 80s. Links discussed: The NetHack Learning Environment: https://ai.facebook.com/blog/nethack-learning-environment-to-advance-deep-reinforcement-learning/ Reinforcement learning, intrinsic motivation: https://arxiv.org/abs/2002.12292 Knowledge transfer: https://arxiv.org/abs/1910.08210 Tim Rocktäschel is a Research Scientist at Facebook AI Research (FAIR) London and a Lecturer in the Department of Computer Science at University College London (UCL). At UCL, he is a member of the UCL Centre for Artificial Intelligence and the UCL Natural Language Processing group. Prior to that, he was a Postdoctoral Researcher in the Whiteson Research Lab, a Stipendiary Lecturer in Computer Science at Hertford College, and a Junior Research Fellow in Computer Science at Jesus College, at the University of Oxford. https://twitter.com/_rockt Heinrich Kuttler is an AI and machine learning researcher at Facebook AI Research (FAIR) and before that was a research engineer and team lead at DeepMind. https://twitter.com/HeinrichKuttler https://www.linkedin.com/in/heinrich-kuttler/ Topics covered: 0:00 a lack of reproducibility in RL 1:05 What is NetHack and how did the idea come to be? 5:46 RL in Go vs NetHack 11:04 performance of vanilla agents, what do you optimize for 18:36 transferring domain knowledge, source diving 22:27 human vs machines intrinsic learning 28:19 ICLR paper - exploration and RL strategies 35:48 the future of reinforcement learning 43:18 going from supervised to reinforcement learning 45:07 reproducibility in RL 50:05 most underrated aspect of ML, biggest challenges? Get our podcast on these other platforms: Apple Podcasts: http://wandb.me/apple-podcasts Spotify: http://wandb.me/spotify Google: http://wandb.me/google-podcasts YouTube: http://wandb.me/youtube Soundcloud: http://wandb.me/soundcloud Tune in to our bi-weekly virtual salon and listen to industry leaders and researchers in machine learning share their research: http://wandb.me/salon Join our community of ML practitioners where we host AMA's, share interesting projects and meet other people working in Deep Learning: http://wandb.me/slack Our gallery features curated machine learning reports by researchers exploring deep learning techniques, Kagglers showcasing winning models, and industry leaders sharing best practices: https://wandb.ai/gallery
More Episodes
🚀 Discover the cutting-edge AI hardware development for enterprises in this episode of Gradient Dissent, featuring Rodrigo Liang, CEO of SambaNova Systems.  Rodrigo Liang’s journey from Oracle to founding SambaNova is a tale of innovation and determination. In this episode, Rodrigo discusses the...
Published 04/11/24
🚀 This episode of Gradient Dissent welcomes Edo Liberty, the mind behind Pinecone's revolutionary vector database technology. As a former leader at Amazon AI Labs and Yahoo's New York lab, Edo Liberty's extensive background in AI research and development showcases the complexities behind vector...
Published 03/28/24