The New Claude 3.5 Sonnet: Better, Yes, But Not Just in the Way You Might Think

Description

A new state-of-the-art LLM (at least for creative writing and basic reasoning), but what lies behind the numbers that were put out? Is it for real, and are AI agents about to grab your mouse and shake your cursor? Plus, results on my own Simple Bench, and new tools from Runway (Act-One), HeyGen (Zoom Calls) and an updated NotebookLM. AI, without the hype. Weights and Biases' Weave: https://wandb.me/ai_explained

More Episodes

New Google Model Ranked ‘No. 1 LLM’, But There’s a Problem

A new and mysterious Gemini model appears at the top of the leaderboard, but is that the full story? I dig behind the headline to show you some anti-climactic results, give some context with leaks in the last 48 hours of diminishing returns to scaling, and add the response of Altman, OpenAI and...

Published 11/15/24

AI Explained Official Podcast

Published 11/10/24

Leak: ‘GPT-5 exhibits diminishing returns’, Sam Altman: ‘lol’

The last few days have seen two narratives emerge. One, derived from yesterday’s OpenAI leak in The Information, is that GPT-5/Orion is a disappointment, and less of a leap than GPT-3 to GPT-4. The second comes from a series of 4 clips (shown in this video) from Sam Altman, regarding the ‘clear path’...

Published 11/10/24