📅 ThursdAI Apr 4 - Weave, CMD R+, SWE-Agent, Everyone supports Tool Use + JAMBA deep dive with AI21
Description
Happy first ThursdAI of April folks, did you have fun on April Fools? 👀 I hope you did. I made a poll on my feed and 70% did not participate in April Fools, which makes me a bit sad!

Well, all right, time to dive into this week's news, and of course there is TONS of news, but I want to start with our own breaking news! That's right, we at Weights & Biases have breaking news of our own: today we launched our new product, called Weave!

Weave is our new toolkit to track, version and evaluate LLM apps, so from now on, we have Models (what you probably know as Weights & Biases) and Weave. So if you're writing any kind of RAG system, or anything that uses Claude or OpenAI, Weave is for you!

I'll be focusing on Weave and sharing more on the topic, but today I encourage you to listen to the launch conversation I had with Tim & Scott from the Weave team here at WandB, as they and the rest of the team worked their asses off for this release and we want to celebrate the launch 🎉

TL;DR of all topics covered:

* Open Source LLMs
  * Cohere - CommandR PLUS - 104B RAG optimized Sonnet competitor (Announcement, HF)
  * Princeton SWE-agent - OSS Devin - gets 12.29% on SWE-bench (Announcement, Github)
  * Jamba paper is out (Paper)
  * Mozilla LLamaFile now goes 5x faster on CPUs (Announcement, Blog)
  * Deepmind - Mixture of Depths paper (Thread, ArXiv)
* Big CO LLMs + APIs
  * Cloudflare AI updates (Blog)
  * Anthropic adds function calling support (Announcement, Docs)
  * Groq lands function calling (Announcement, Docs)
  * OpenAI is now open to customers without login requirements
  * Replit Code Repair - 7B finetune of deep-seek that outperforms Opus (X)
  * Google announced Gemini Prices + Logan joins (X)
* This week's Buzz - oh so much BUZZ!
  * Weave launch! Check Weave out! (Weave Docs, Github)
  * Sign up with Promo Code THURSDAI at fullyconnected.com
* Voice & Audio
  * OpenAI Voice Engine will not be released to developers (Blog)
  * Stable Audio v2 dropped (Announcement, Try here)
  * Lightning Whisper MLX - 10x faster than whisper.cpp (Announcement, Github)
* AI Art & Diffusion & 3D
  * Dall-e now has in-painting (Announcement)
* Deep dive
  * Jamba deep dive with Roi Cohen from AI21 and Maxime Labonne

Open Source LLMs

Cohere releases Command R+, 104B RAG focused model (Blog)

Cohere surprised us, and just 2.5 weeks after releasing Command-R (which became very popular and is No. 10 on the LMSys arena) gave us its big brother, Command R PLUS.

With 128K tokens in the context window, this model is multilingual as well, supporting 10 languages, and even tokenizes those languages more efficiently (a first!)

The main focus from Cohere is advanced function calling / tool use, and RAG of course, and this model specializes in those tasks, beating even GPT-4 turbo.

It's clear that Cohere is positioning themselves as RAG leaders, as evidenced by this accompanying tutorial on getting started with RAG apps, and this model further solidifies their place as the experts in this field. Congrats folks, and thanks for the open weights 🫡

SWE-Agent from Princeton

Folks, remember Devin? The agent with a nice UI, born from a super cracked team, that got 13% on SWE-bench, a very hard (for LLMs) benchmark that requires solving real-world issues? Well, now we have an open source agent that comes very, very close to that, called SWE-Agent.

SWE-agent has a dedicated terminal and tools, and utilizes something called ACI (Agent Computer Interface), allowing the agent to navigate, search, and edit code. The dedicated terminal in a Docker environment really helps, as evidenced by a massive 12.3% score on SWE-bench, where GPT-4 gets only 1.4%!

Worth mentioning that SWE-bench is a very hard benchmark that was created by the folks who released SWE-agent, and here are some videos of them showing the agent off. This is truly an impressive achievement!

Deepmind publishes Mixture of Depths (arXiv)

Thanks to Hassan who read the paper and wrote a deep dive, this paper by Deepmind shows thei