Scaling Laws for Neural Language Models
https://arxiv.org/abs/2001.08361
Summary:
This research paper empirically investigates scaling laws for the performance of Transformer-based language models. The authors find that test loss scales predictably as a power law with model size, dataset size, and the compute used for training, while showing only weak dependence on other architectural details such as network width or depth. They establish simple equations that predict overfitting and training speed, enabling optimal allocation of a fixed compute budget: compute-efficient training favors very large models trained on a relatively modest amount of data and stopped well before convergence.
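As a rough sketch of the functional forms the paper reports (with L the test cross-entropy loss, N the number of non-embedding parameters, D the dataset size in tokens, and C the training compute; the scale constants N_c, D_c, C_c and the exponents are empirical fits, each law holding when the other two factors are not the bottleneck):

  L(N) \approx (N_c / N)^{\alpha_N}
  L(D) \approx (D_c / D)^{\alpha_D}
  L(C) \approx (C_c / C)^{\alpha_C}

The fitted exponents are small (on the order of 0.05-0.1), so each order-of-magnitude increase in N, D, or C yields a steady but modest reduction in loss.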
Published 12/01/24