ThursdAI - The top AI news from the past week

📅 ThursdAI - Jan 24 - ⌛Diffusion Transformers,🧠 fMRI multimodality, Fuyu and Moondream1 VLMs, Google video generation & more AI news

Description

What A SHOW folks, I almost don't want to write anything in the newsletter to MAKE you listen haha, but I will, since I know many of you don't like listening to me babble. But if you choose one episode to listen to instead of just skimming the show notes, make it this one.

We had 2 deep dives. In the first, into the exciting world of multimodality, we chatted with Vik, the creator of Moondream1, and with Wes and Eric, the co-founders of Prophetic, about their EEG/fMRI multimodal transformer (that's right!). Then we had a DEEP dive into the new Hourglass Diffusion Transformers with Tanishq from MedArc/Stability.

More than 1300 tuned in to the live show 🔥 and I got some incredible feedback on the fly, which I cherish, so if you have friends who don't already know about ThursdAI, why not share this with them as well?

TL;DR of all topics covered:

* Open Source LLMs
  * Stability AI releases StableLM 1.6B params (X, Blog, HF)
  * InternLM2-Math - SOTA on math LLMs (90% GPT4 perf.) (X, Demo, Github)
  * MedArc analysis of the best open source models for medical research finds Qwen-72 the best open source doctor (X)
* Big CO LLMs + APIs
  * Google teases LUMIERE - incredibly powerful video generation (TTV and ITV) (X, Blog, ArXiv)
  * 🤗 HuggingFace announces Google partnership (Announcement)
  * OpenAI releases 2 new embeddings models, tweaks turbo models and cuts costs (My analysis, Announcement)
  * Google to add 3 new AI features to Chrome (X, Blog)
* Vision & Video
  * Adept Fuyu Heavy - third in the world in multimodal while being 20x smaller than GPT4V, Gemini Ultra (X, Blog)
  * FireLLaVa - first LLaVa model with a commercially permissive license, from Fireworks (X, Blog, HF, DEMO)
  * Vikhyatk releases Moondream1 - tiny 1.6B VLM trained on Phi 1 (X, Demo, HF)
* This week's buzz 🐝🪄 - What I learned in WandB this week
  * New course announcement from Jason Liu & WandB - LLM Engineering: Structured Outputs (Course link)
* Voice & Audio
  * Meta W2V-BERT - speech encoder for low resource languages (announcement)
  * 11 labs has dubbing studio (my dubbing test)
* AI Art & Diffusion & 3D
  * Instant ID - zero shot face transfer diffusion model (Demo)
  * 🔥 Hourglass Diffusion (HDiT) paper - High Resolution Image synthesis (X, Blog, Paper, Github)
* Tools & Others
  * Prophetic announces MORPHEUS-1, their EEG/fMRI multimodal ultrasonic transformer for Lucid Dream induction (Announcement)
  * NSF announces NAIRR with partnership from all major government agencies & labs, including OAI and WandB (Blog)
  * Runway adds multiple motion brushes for added creativity (X, How to)

Open Source LLMs

Stability releases StableLM 1.6B tiny LLM

* Super super fast tiny model. I was able to run this in LMStudio, which just released an update supporting it
* Punches above its weight, specifically on other languages like German/Spanish/French/Italian (beats Phi)
* Has a surprisingly decent MT-Bench score as well
* The license is not commercial per se, but requires a specific Stability AI membership

I was able to get above 120tok/sec with this model in LM-Studio, and the output was quite reasonable. Honestly, it's quite ridiculous how fast we've gotten to a point where we have an AI model that can weigh less than 1GB and has this level of performance 🤯

Vision & Video & Multimodality

Tiny VLM Moondream1 (1.6B) performs really well (Demo)

New friend of the pod Vik (Vikhyatk) trained Moondream1, a tiny multimodal VLM with LLaVa on top of Phi 1 (not 2 cause.. issues), and while it's not commercially viable, it's really impressive in how fast and how good it is. Here's an example featuring two of my dear friends talking about startups, and you can see how impressively this TINY vision-enabled model understands the scene. This is not cherry picked; this is literally the first image I tried and my first result.

The image features two men sitting in chairs, engaged in a conversation. One man is sitting on the left side of the image, while the other is on the right side. They are both looking at a laptop

More Episodes

📆 ThursdAI - Nov 14 - Qwen 2.5 Coder, No Walls, Gemini 1114 👑 LLM, ChatGPT OS integrations & more AI news

This week is a very exciting one in the world of AI news, as we get 3 SOTA models, one in overall LLM rankings, one in OSS coding and one in OSS voice + a bunch of new breaking news during the show (which we reacted to live on the pod, and as we're now doing video, you can see us freak out in real...

Published 11/15/24

📆 ThursdAI - Nov 7 - Video version, full o1 was given and taken away, Anthropic price hike-u, halloween 💀 recap & more AI news

👋 Hey all, this is Alex, coming to you from the very Sunny California, as I'm in SF again, while there is a complete snow storm back home in Denver (brrr). I flew here for the Hackathon I kept telling you about, and it was glorious, we had over 400 registered, over 200 approved hackers, 21 teams...

Published 11/08/24