Deep Dive into Inference Optimization for LLMs with Philip Kiely
Description
Today we have Philip Kiely from Baseten on the show. Baseten is a Series B startup focused on providing infrastructure for AI workloads. We go deep on inference optimization: choosing a model, the hype around compound AI, choosing an inference engine, optimization techniques like quantization and speculative decoding, and everything else down to your GPU choice.
More Episodes
Today, we have David Cramer on the show. David is one of the co-founders of Sentry, an application monitoring tool that's one of the most widely adopted tools for developers. Sentry does over 300,000 events per second on average, and there's a lot of fancy work to process these application...
Published 11/12/24
Published 11/12/24
Today on the show, we have Kevin Dubois. Kevin is a Senior Principal Developer Advocate at Red Hat, a Java Champion, and a well-known open source contributor. In our conversation with Kevin, we talk about his history with Java and the evolution of the language and where it now fits within the world...
Published 10/29/24