📆 ThursdAI - Nov 14 - Qwen 2.5 Coder, No Walls, Gemini 1114 👑 LLM, ChatGPT OS integrations & more AI news
Description
This week is a very exciting one in the world of AI news, as we get 3 SOTA models: one in overall LLM rankings, one in OSS coding and one in OSS voice, plus a bunch of breaking news during the show (which we reacted to live on the pod, and as we're now doing video, you can see us freak out in real time at 59:32).

00:00 Welcome to ThursdAI
00:25 Meet the Hosts
02:38 Show Format and Community
03:18 TLDR Overview
04:01 Open Source Highlights
13:31 Qwen Coder 2.5 Release
14:00 Speculative Decoding and Model Performance
22:18 Interactive Demos and Artifacts
28:20 Training Insights and Future Prospects
33:54 Breaking News: Nexus Flow
36:23 Exploring Athene v2 Agent Capabilities
36:48 Understanding ArenaHard and Benchmarking
40:55 Scaling and Limitations in AI Models
43:04 Nexus Flow and Scaling Debate
49:00 Open Source LLMs and New Releases
52:29 FrontierMath Benchmark and Quantization Challenges
58:50 Gemini Experimental 1114 Release and Performance
01:11:28 LLM Observability with Weave
01:14:55 Introduction to Tracing and Evaluations
01:15:50 Weave API Toolkit Overview
01:16:08 Buzz Corner: Weights & Biases
01:16:18 Nous Forge Reasoning API
01:26:39 Breaking News: OpenAI's New MacOS Features
01:27:41 Live Demo: ChatGPT Integration with VS Code
01:34:28 Ultravox: Real-Time AI Conversations
01:42:03 Tilde Research and Stargazer Tool
01:46:12 Conclusion and Final Thoughts

This week there was also a debate online about whether deep learning (and "scale is all you need") has hit a wall, with folks like Ilya Sutskever being cited by publications claiming it has, and folks like Yann LeCun saying "I told you so". TL;DR? Multiple huge breakthroughs later, with both Oriol from DeepMind and Sam Altman saying "what wall?" and Heiner from X.ai saying "skill issue", there are no walls in sight, despite how much some tech journalists love to pretend there are. Also, what happened to Yann?
😵‍💫 Ok, back to our scheduled programming. Here's the TL;DR, after which comes a breakdown of the most important things about today's update. As always, I encourage you to watch / listen to the show, as we cover way more than I summarize here 🙂

TL;DR and Show Notes:
* Open Source LLMs
  * Qwen Coder 2.5 32B (+5 others) - Sonnet @ home (HF, Blog, Tech Report)
  * The End of Quantization? (X, Original Thread)
  * Epoch: FrontierMath, a new benchmark for advanced math reasoning in AI (Blog)
  * Common Corpus: Largest multilingual 2T token dataset (Blog)
  * NexusFlow - Athene v2 - open model suite (X, Blog, HF)
* Big CO LLMs + APIs
  * Gemini 1114 is the new king LLM, #1 on LMArena (X)
  * Nous Forge Reasoning API - beta (Blog, X)
  * Reuters reports "AI is hitting a wall" and it's becoming a meme (Article)
  * Cursor acq. SuperMaven (X)
* This Week's Buzz
  * Weave JS/TS support is here 🙌
* Voice & Audio
  * Fixie releases UltraVox SOTA (Demo, HF, API)
  * Suno v4 is coming and it's bonkers amazing (Alex Song, SOTA Jingle)
* Tools demoed
  * Qwen artifacts - HF Demo
  * Tilde Galaxy - Interp Tool

This is a public episode. If you’d like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe