How do you evaluate an LLM? Try an LLM.
Description
On this episode: Stack Overflow senior data scientist Michael Geden tells Ryan and Ben how data scientists evaluate large language models (LLMs) and their output. They cover the challenges involved in evaluating LLMs, how LLMs are being used to evaluate other LLMs, the importance of data validation, the need for human raters, and the tradeoffs involved in selecting and fine-tuning LLMs.
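The episode's point about LLMs evaluating other LLMs is often called the "LLM-as-judge" pattern: one model produces an answer, a second model scores it against a rubric. Below is a minimal, hypothetical sketch of that loop in Python; `call_llm` is a stand-in for whatever model client you actually use and is stubbed here so the example runs as-is.

```python
# Minimal sketch of the "LLM-as-judge" evaluation pattern.
# `call_llm` is a hypothetical placeholder, not a real client library;
# it is stubbed so this file runs end to end.

def call_llm(prompt: str) -> str:
    # Stub: replace with a real model call (hosted API, local model, etc.).
    return "4"  # pretend the judge model returned a 1-5 score

JUDGE_TEMPLATE = """You are grading an answer to a question.
Question: {question}
Answer: {answer}
Rate the answer's factual accuracy from 1 (wrong) to 5 (fully correct).
Reply with a single digit."""

def judge_answer(question: str, answer: str) -> int:
    """Ask a judge model to score an answer; return 0 if the reply is unparseable."""
    reply = call_llm(JUDGE_TEMPLATE.format(question=question, answer=answer))
    try:
        return int(reply.strip()[0])
    except (ValueError, IndexError):
        return 0

if __name__ == "__main__":
    score = judge_answer("What does SQL stand for?", "Structured Query Language")
    print(f"judge score: {score}")
```

In practice, as the episode notes, such judge scores are usually spot-checked against human raters before being trusted at scale.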
More Episodes
On this episode: The FTC bans most noncompete agreements, the implications of the TikTok “ban,” why a 2017 law is hitting startups with huge tax bills seven years later, and the return of net neutrality. Plus: the wunderkind hacker who ransomed Finland’s anxieties and secrets.
Published 04/30/24
Dr. Richard Hipp, creator of SQLite, shares how he taught himself to program, the challenges he faced in creating SQLite, and the importance of testing and maintaining the software for long-term support.
Published 04/26/24