📅 ThursdAI Jan 18 - Nous Mixtral, DeepMind AlphaGeometry, LMSys SGLang, Rabbit R1 + Perplexity, Llama 3 is training & more AI news this week
👋 Hey there, been quite a week! It started slow, and whoa, the last two days were jam-packed with news; I could barely keep up! But thankfully, the motto of ThursdAI is: we stay up to date so you don't have to. We hit a milestone this week: 1.1K listeners tuned into the live show recording. That's quite the number, and I'm humbled to present the conversation and updates to that many people. If you're reading this but have never joined live, welcome! We go live every week on ThursdAI, 8:30 AM Pacific time.

TL;DR of all topics covered:

* Open Source LLMs
  * Nous Hermes Mixtral finetune (X, HF DPO version, HF SFT version)
  * NeuralBeagle14-7B from Maxime Labonne (X, HF)
    * It was the best-performing 7B parameter model on the Open LLM Leaderboard when released (now 4th)
    * We had a full conversation with Maxime about merging that will release as a standalone episode on Sunday!
  * LMSys - SGLang - up to 5x faster inference (X, Blog, Github)
  * Neural Magic applying #SparseGPT to famous models to compress them with 50% sparsity (X, Paper)
* Big CO LLMs + APIs
  * 🔥 Google DeepMind solves geometry at Olympiad level with 100M synthetic training examples (Announcement, Blog)
  * Meta announces Llama 3 is training, will have 350,000 H100 GPUs (X)
  * OpenAI releases guidelines for upcoming elections and removes restrictions on military use (Blog)
  * Sam Altman (in Davos) doesn't think that AGI will change things as much as people think (X)
  * Samsung S24 has AI everywhere, including real-time translation of calls (X)
* Voice & Audio
  * Meta releases MAGNeT (X, HF)
* AI Art & Diffusion & 3D
  * Stable Diffusion runs 100% in the browser with WebGPU, Diffusers.js (X thread)
  * DeciAI - Deci Diffusion - a text-to-image 732M-parameter model that's 2.6x faster and 61% cheaper than Stable Diffusion 1.5 with on-par image quality
* Tools & Hardware
  * Rabbit R1 announces a deal with Perplexity, giving a full year of Perplexity Pro to Rabbit R1 users; Perplexity will also be the default search engine on Rabbit (link)
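Since the TL;DR mentions Neural Magic compressing models to 50% sparsity with SparseGPT, here's a minimal sketch of what "50% unstructured sparsity" means. To be clear, this is naive magnitude pruning in plain Python, not the SparseGPT algorithm itself (which prunes layer by layer while reconstructing each layer's outputs); the function is made up for illustration.

```python
import random

def magnitude_prune(weights, sparsity=0.5):
    # Zero out the smallest-magnitude fraction of weights.
    # Naive magnitude pruning, NOT SparseGPT's layer-wise
    # reconstruction; it only illustrates unstructured sparsity.
    k = int(sparsity * len(weights))
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    pruned, zeroed = [], 0
    for w in weights:
        if abs(w) <= threshold and zeroed < k:
            pruned.append(0.0)  # pruned: stored as zero, skippable at inference
            zeroed += 1
        else:
            pruned.append(w)    # kept: large-magnitude weight survives
    return pruned

random.seed(0)
weights = [random.gauss(0, 1) for _ in range(1000)]
sparse = magnitude_prune(weights, 0.5)
print(sum(1 for w in sparse if w == 0.0) / len(sparse))  # 0.5
```

The payoff of sparsity is that a runtime like Neural Magic's DeepSparse can skip the zeroed weights, trading a small quality hit for memory and speed.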
Open Source LLMs

Nous Research releases their first Mixtral finetune, in two versions: DPO and SFT (X, DPO HF)

This is the first Mixtral finetune from Teknium1 and the Nous team, trained on the Hermes dataset. It comes in two variants, SFT and SFT+DPO, and is a really, really capable model; they call it their flagship! This is the first Mixtral finetune to beat Mixtral Instruct, and is potentially the best open source model available right now! 👍

It's already available at places like Together endpoints, there are GGUF versions by TheBloke, and I've been running this model on my Mac for the past few days. Quite remarkable considering it's only January and this is the best open chat model available to us. Make sure you use ample system prompting for it, as it was trained with system prompts in mind.

LMSys: up to 5x faster inference with SGLang & RadixAttention (Blog)

LMSys introduced SGLang, a new interface and runtime for improving the efficiency of large language model (LLM) inference. It claims up to 5x faster inference compared to existing systems like Guidance and vLLM.

SGLang was designed to better support complex LLM programs through features like control flow, prompting techniques, and external interaction. It co-designs the frontend language and the backend runtime.

- On the backend, it proposes a new technique called RadixAttention to automatically handle various patterns of key-value cache reuse, improving performance.
- Early users like LLaVA reported significantly faster inference in their applications compared to other options. The LMSys team released the code on GitHub for others to try it out.

Big CO LLMs + APIs

Meta AI announcements (link)

This #BreakingNews came during our space: Mark Zuckerberg posted a video on Instagram saying that Llama 3 is currently training, and will be open sourced! He also said that Meta will have 350K (that's not a typo, 350,000) H100 GPUs by the end of the year, and a tot…
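On "ample system prompting" for the Hermes finetune: Nous models are typically trained with ChatML-style formatting, where the system prompt gets its own explicit turn. A prompt might be assembled like the sketch below; the helper function is hypothetical, and you should check the exact special tokens against the model card for your release.

```python
def chatml_prompt(system: str, user: str) -> str:
    # Build a ChatML-style prompt with an explicit system turn.
    # Hermes-family models are trained with system prompts in mind,
    # so a descriptive system message noticeably shapes behavior.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # model generates from here
    )

prompt = chatml_prompt(
    "You are Hermes, a helpful, knowledgeable assistant.",
    "Summarize this week's open source LLM news.",
)
print(prompt)
```

If you're going through a chat API (Together, llama.cpp's chat endpoint, etc.), the template is usually applied for you; hand-building the string like this is mainly for raw completion endpoints and local GGUF runs.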
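To give a feel for the RadixAttention idea, here's a toy prefix cache: requests that share a token prefix (say, a common system prompt) reuse the cached state for that prefix instead of recomputing it. This is a deliberately simplified pure-Python sketch of the concept, not SGLang's implementation, which manages actual GPU KV tensors in a radix tree with an LRU eviction policy.

```python
class RadixCache:
    """Toy prefix cache illustrating the idea behind RadixAttention:
    token prefixes shared across requests are computed once and reused."""

    def __init__(self):
        self.root = {}

    def process(self, tokens):
        # Walk the trie: tokens already on a cached path are "hits"
        # (their KV state would be reused); the rest are inserted
        # (their KV state would be computed once, then cached).
        node, hits = self.root, 0
        for tok in tokens:
            if tok in node:
                hits += 1        # cached prefix token: reuse
            else:
                node[tok] = {}   # new token: compute, then cache
            node = node[tok]
        return hits, len(tokens) - hits  # (reused, computed)

cache = RadixCache()
shared = ["You", "are", "a", "helpful", "assistant", "."]
print(cache.process(shared + ["What", "is", "SGLang", "?"]))      # (0, 10)
print(cache.process(shared + ["Summarize", "the", "news", "."]))  # (6, 4)
```

The second request recomputes only its 4-token suffix; at scale, this kind of reuse across requests sharing prompts, few-shot examples, or conversation history is where the claimed speedups come from.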