The New Claude 3.5 Sonnet: Better, Yes, But Not Just in the Way You Might Think
Listen now
Description
A new state of the art LLM (at least for creative writing and basic reasoning) but what lies behind the numbers that were put out? Is it for real, and are AI agents about to grab your mouse and shake your cursor? Plus, results on my own Simple Bench, and new tools from Runway (Act-One), HeyGen (Zoom Calls) and an updated NotebookLM. AI, without the hype.Weights and Biases' Weave: https://wandb.me/ai_explained
More Episodes
A new and mysterious Gemini model appears at the top of the leaderboard, but is that the full story? I dig behind the headline to show you some anti-climactic results, give some context with leaks in the last 48 hours of diminishing returns to scaling, and add the response of Altman, OpenAI and...
Published 11/15/24
Published 11/10/24
The last few days have seen two narratives emerge. One, derived from yesterday’s OpenAI leak in TheInformation, that GPT-5/Orion is a disappointment, and less of a leap than GPT-3 to GPT-4. The second comes from a series of 4 clips (shown in this video) from Sam Altman, regarding the ‘clear path’...
Published 11/10/24