Anton Teaches Packy AI | Ep 2 | Chinchilla
Description
We're back! In Episode 2, Anton teaches Packy about DeepMind's March 2022 paper, Training Compute-Optimal Large Language Models, more commonly known as Chinchilla.

Before Chinchilla, the prevailing view was that the best way to improve LLM performance was to scale up model size, which is why the largest models now exceed 500 billion parameters. But there are only so many GPUs in the world, and throwing compute at the problem is expensive and energy intensive. In this paper, DeepMind found that the compute-optimal way to scale an LLM is actually to grow model size (parameters) and training data (tokens) proportionally. Given the race for size, today's models are plenty big but undertrained: they need a lot more data.

In this conversation, we go deep on the paper itself, but we also zoom out to talk about the politics of AI, when AGI will arrive, where to get more data, and why AI won't take our jobs. This one gets a lot more philosophical than our first episode as we explore the implications of Chinchilla and LLMs more generally.

If you enjoyed this conversation, subscribe for more. We plan to release one episode per week, and we want to make this the best way to get a deeper understanding of the mind-blowing progress happening in AI and what it means for everything we do as humans.

LINKS:
Training Compute-Optimal Large Language Models: https://arxiv.org/abs/2203.15556
chinchilla's wild implications: https://www.lesswrong.com/posts/6Fpvc...
Scaling Laws for Neural Language Models (Kaplan et al.): https://arxiv.org/abs/2001.08361

---
Send in a voice message: https://anchor.fm/notboring/message
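For readers who want the proportional-scaling result in concrete terms, here is a minimal back-of-the-envelope sketch in Python (our gloss, not a quote from the episode). It assumes the standard approximation that training cost is C ≈ 6·N·D FLOPs for N parameters and D tokens, plus the paper's rule of thumb of roughly 20 training tokens per parameter; the function name is ours.

# A minimal sketch of the Chinchilla rule of thumb (our gloss, not from the episode).
# Assumptions: training compute C ≈ 6 * N * D FLOPs for N parameters and D tokens,
# and the paper's compute-optimal ratio of roughly 20 tokens per parameter.

def chinchilla_optimal(compute_flops: float, tokens_per_param: float = 20.0):
    """Return (params, tokens) that roughly exhaust a FLOP budget compute-optimally."""
    # C = 6 * N * D with D = tokens_per_param * N  =>  N = sqrt(C / (6 * tokens_per_param))
    n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Chinchilla's own budget (~5.9e23 FLOPs) recovers ~7.0e10 params and ~1.4e12 tokens.
params, tokens = chinchilla_optimal(5.9e23)
print(f"{params:.2e} parameters, {tokens:.2e} tokens")

Run on Chinchilla's own budget, this recovers roughly 70 billion parameters trained on about 1.4 trillion tokens. For contrast, Gopher spent a similar budget on 280 billion parameters and only 300 billion tokens, which is exactly the imbalance the paper flags.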
More Episodes
Age of Miracles is a narrative show that explores the complex industries that will play an important role in creating an abundant future for humanity. Episodes 1 and 2 drop October 27th. Every season, host Packy McCormick – a venture investor and writer of the popular Not Boring newsletter –...
Published 10/13/23
Crusoe is on a mission to align the future of computing with the future of the climate. It has pioneered infrastructure that taps into stranded energy — methane being flared or excess production from clean and renewable sources — to power the compute resources we need to drive our shared progress...
Published 09/07/23