ThursdAI - Sep 19 - 👑 Qwen 2.5 new OSS king LLM, MSFT new MoE, Nous Research's Forge announcement, and Talking AIs in the open source!
Description
Hey folks, Alex here, back with another ThursdAI recap, and let me tell you, this week's episode was a whirlwind of open-source goodness, mind-bending inference techniques, and a whole lotta talk about talking AIs! We dove deep into the world of LLMs, from Alibaba's massive Qwen 2.5 drop to the quirky, real-time reactions of Moshi. We even got a sneak peek at Nous Research's ambitious new project, Forge, which promises to unlock some serious LLM potential. So grab your pumpkin spice latte (it's that time again, isn't it? 🍁), settle in, and let's recap the AI awesomeness that went down on ThursdAI, September 19th!

ThursdAI is brought to you (as always) by Weights & Biases. We still have a few spots left in our Hackathon this weekend, and our new advanced RAG course is now released and is FREE to sign up!

TL;DR of all topics + show notes and links

* Open Source LLMs
  * Alibaba Qwen 2.5 models drop + Qwen 2.5 Math and Qwen 2.5 Coder (X, HF, Blog, Try It)
  * Qwen 2.5 Coder 1.5B is running on a 4-year-old phone (Nisten)
  * KyutAI open sources Moshi & Mimi (Moshiko & Moshika) - end-to-end voice chat model (X, HF, Paper)
  * Microsoft releases GRIN-MoE - tiny (6.6B active) MoE with 79.4 MMLU (X, HF, GitHub)
  * Nvidia announces NVLM 1.0 - frontier-class multimodal LLMs (no weights yet, X)
* Big CO LLMs + APIs
  * OpenAI O1 results from LMsys do NOT disappoint - vibe checks also confirm, new KING LLM in town (Thread)
  * NousResearch announces Forge in waitlist - their MCTS-enabled inference product (X)
* This Week's Buzz - everything Weights & Biases related this week
  * Judgement Day (hackathon) is in 2 days! Still places to come hack with us (Sign up)
  * Our new RAG Course is live - learn all about advanced RAG from WandB, Cohere and Weaviate (sign up for free)
* Vision & Video
  * YouTube announces DreamScreen - generative AI image and video in YouTube Shorts (Blog)
  * CogVideoX-5B-I2V - leading open source img2video model (X, HF)
  * Runway, DreamMachine & Kling all announce text-2-video over API (Runway, DreamMachine)
  * Runway announces video 2 video model (X)
* Tools
  * Snap announces their XR glasses - have hand tracking and AI features (X)

Open Source Explosion!

👑 Qwen 2.5: new king of OSS LLMs with 12 model releases, including instruct, math and coder versions

This week's open-source highlight was undoubtedly the release of Alibaba's Qwen 2.5 models. We had Justin Lin from the Qwen team join us live to break down this monster drop, which includes a whopping seven different sizes, ranging from a nimble 0.5B parameter model all the way up to a colossal 72B beast! And as if that wasn't enough, they also dropped Qwen 2.5 Coder and Qwen 2.5 Math models, further specializing their LLM arsenal.

As Justin mentioned, they heard the community's calls for 14B and 32B models loud and clear, and they delivered! "We do not have enough GPUs to train the models," Justin admitted, "but there are a lot of voices in the community...so we endeavor for it and bring them to you." Talk about listening to your users!

Trained on an astronomical 18 trillion tokens (that's even more than Llama 3.1 at 15T!), Qwen 2.5 shows significant improvements across the board, especially in coding and math. They even open-sourced the previously closed-weight Qwen 2 VL 72B, giving us access to the best open-source vision language models out there. With a 128K context window, these models are ready to tackle some serious tasks. As Nisten exclaimed after putting the 32B model through its paces, "It's really practical…I was dumping in my docs and my code base and then like actually asking questions."

It's safe to say that Qwen 2.5 Coder is now the best coding LLM that you can use, and just in time for our chat, a new update from ZeroEval confirms that Qwen 2.5 models are the absolute kings of OSS LLMs, beating Mistral Large, 4o-mini, Gemini Flash and other huge models with just 72B parameters.

Moshi: The Chatty Cathy of AI

We've covered Moshi Voice back i