📆 ThursdAI - August 1st - Meta SAM 2 for video, Gemini 1.5 is king now?, GPT-4o Voice is here (for some), new Stability, Apple Intelligence also here & more AI news
Description
Starting Monday, Apple released iOS 18.1 with Apple Intelligence, then Meta dropped SAM-2 (Segment Anything Model 2), and then Google first open sourced Gemma 2 2B and now (just literally 2 hours ago, during the live show) released Gemini 1.5 0801 experimental, which takes #1 on LMsys arena across multiple categories. To top it all off, we also got a new SOTA image diffusion model called FLUX.1 from ex-Stability folks and their new Black Forest Labs. This week on the show, we had Joseph & Piotr Skalski from Roboflow talk in depth about Segment Anything, and as the absolute experts on this topic (Skalski is our returning vision expert), they gave us an incredible deep dive into the importance of dedicated vision models (not VLMs). We also had Lukas Atkins & Fernando Neto from Arcee AI talk to us about their new DistillKit and explain model distillation in detail, and finally we had Cristiano Giardina, who is one of the lucky few that got access to OpenAI advanced voice mode, and his new friend GPT-4o came on the show as well! Honestly, how can one keep up with all this? By reading ThursdAI of course, that's how. But ⚠️ buckle up, this is going to be a BIG one (I think over 4.5K words, will mark this as the longest newsletter I penned, I'm sorry, maybe read this one on 2x? 
😂)

[ Chapters ]
00:00 Introduction to the Hosts and Their Work
01:22 Special Guests Introduction: Piotr Skalski and Joseph Nelson
04:12 Segment Anything 2: Overview and Capabilities
15:33 Deep Dive: Applications and Technical Details of SAM2
19:47 Combining SAM2 with Other Models
36:16 Open Source AI: Importance and Future Directions
39:59 Introduction to Distillation and DistillKit
41:19 Introduction to DistillKit and Synthetic Data
41:41 Distillation Techniques and Benefits
44:10 Introducing Fernando and Distillation Basics
44:49 Deep Dive into Distillation Process
50:37 Open Source Contributions and Community Involvement
52:04 ThursdAI Show Introduction and This Week's Buzz
53:12 Weights & Biases New Course and San Francisco Meetup
55:17 OpenAI's Advanced Voice Mode and Cristiano's Experience
01:08:04 SearchGPT Release and Comparison with Perplexity
01:11:37 Apple Intelligence Release and On-Device AI Capabilities
01:22:30 Apple Intelligence and Local AI
01:22:44 Breaking News: Black Forest Labs Emerges
01:24:00 Exploring the New Flux Models
01:25:54 Open Source Diffusion Models
01:30:50 LLM Course and Free Resources
01:32:26 FastHTML and Python Development
01:33:26 Friend.com: Always-On Listening Device
01:41:16 Google Gemini 1.5 Pro Takes the Lead
01:48:45 GitHub Models: A New Era
01:50:01 Concluding Thoughts and Farewell

Show Notes & Links
* Open Source LLMs
  * Meta gives SAM-2 - segment anything with one shot + video capability! (X, Blog, DEMO)
  * Google open sources Gemma 2 2.6B (Blog, HF)
  * MTEB Arena launching on HF - Embeddings head to head (HF)
  * Arcee AI announces DistillKit (X, Blog, Github)
* AI Art & Diffusion & 3D
  * Black Forest Labs - FLUX new SOTA diffusion models (X, Blog, Try It)
  * Midjourney 6.1 update - greater realism + potential Grok integration (X)
* Big CO LLMs + APIs
  * Google updates Gemini 1.5 Pro with 0801 release and is #1 on LMsys arena (X)
  * OpenAI started alpha GPT-4o voice mode (examples)
  * OpenAI releases SearchGPT (Blog, Comparison w/ PPXL)
  * Apple releases beta of iOS 18.1 with Apple Intelligence (X, hands on, Intents)
  * Apple released a technical paper on Apple Intelligence
* This Week's Buzz
  * AI Salons in SF + New Weave course for WandB featuring yours truly!
* Vision & Video
  * Runway ML adds Gen-3 image to video and makes it 7x faster (X)
* Tools & Hardware
  * Avi announces friend.com
  * Jeremy Howard releases FastHTML (Site, Video)
  * Applied LLM course from Hamel dropped all videos

Open Source

It feels like everyone and their grandma is open sourcing incredible AI this week! Seriously, get ready for segment-anything-you-want + real-time-video capability PLUS small AND powerful language models.

Meta Gives Us SAM-2: Segment ANYTHIN