Minimizing GPU RAM and Scaling Model Training Horizontally with Quantization and Distributed Training
Training multibillion-parameter machine learning models poses significant challenges, particularly around GPU memory limits. A single NVIDIA A100 or H100 GPU with 80 GB of RAM often falls short when training 32-bit full-precision models at this scale. This episode delves into two powerful techniques for overcoming these limits: quantization and distributed training.
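To see why a single 80 GB GPU falls short, a rough back-of-the-envelope estimate helps (a minimal sketch, not a precise profile: it assumes fp32 weights, fp32 gradients, and Adam's two fp32 optimizer states at 4 bytes each, and ignores activations and framework overhead):

```python
# Approximate GPU memory for full-precision (fp32) training with Adam.
# Per parameter: 4 B weights + 4 B gradients + 4 B Adam momentum + 4 B Adam variance.
BYTES_PER_PARAM_FP32_ADAM = 4 + 4 + 4 + 4  # 16 bytes per parameter

def training_memory_gb(num_params: float,
                       bytes_per_param: int = BYTES_PER_PARAM_FP32_ADAM) -> float:
    """Approximate GPU RAM in GB, excluding activations and buffers."""
    return num_params * bytes_per_param / 1e9

# A 7-billion-parameter model trained in fp32 with Adam:
print(f"{training_memory_gb(7e9):.0f} GB")  # 112 GB -- well over one 80 GB A100/H100

# Quantizing the weights to int8 (1 byte per parameter) shrinks just the
# weight footprint from 28 GB to 7 GB:
print(f"{7e9 * 1 / 1e9:.0f} GB")  # 7 GB
```

Estimates like this are why the two techniques below matter: quantization shrinks the bytes per parameter, while distributed training shards the remaining state across multiple GPUs.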