ThursdAI - The top AI news from the past week

🔥 ThursdAI - Feb 15, 2024 - OpenAI changes the Video Game, Google changes the Context game, and other AI news from past week

Description

Holy SH*T! These two words were said on this episode multiple times, way more than ever before I want to say, and it's because we got 2 incredibly exciting breaking news announcements in a very short amount of time (in the span of 3 hours). The OpenAI announcement came as we were recording the space, so you'll get to hear our live reaction to this insanity.

We also had 3 deep-dives, which I am posting in this week's episode. We chatted with Yi Tay and Max Bane from Reka, which trained and released a few new foundational multimodal models this week, and with Dome and Pablo from Stability, who released a new diffusion model called Stable Cascade. Finally, I had a great time hanging with Swyx (from Latent Space) and got a chance to turn the microphone back at him for a conversation about Swyx's background, Latent Space, and AI Engineer.

I was also very happy to be in SF today of all days, as my day is not over yet: there's still an event which we co-host together with A16Z, folks from Nous Research, Ollama and a bunch of other great folks. Just look at all these logos! Open Source FTW 👏

TL;DR of all topics covered:

* Breaking AI News
  * 🔥 OpenAI releases SORA - text to video generation (Sora Blogpost with examples)
  * 🔥 Google teases Gemini 1.5 with a whopping 1 MILLION tokens context window (X, Blog)
* Open Source LLMs
  * Nvidia releases Chat With RTX local models (Blog, Download)
  * Cohere open sources Aya 101 - a 12.8B model supporting 101 languages (X, HuggingFace)
  * Nomic releases Nomic Embed 1.5 with Matryoshka embeddings (X)
* Big CO LLMs + APIs
  * Andrej Karpathy leaves OpenAI (Announcement)
  * OpenAI adds memory to ChatGPT (X)
* This week's Buzz (What I learned at WandB this week)
  * We launched a new course with Hamel Husain on enterprise model management (Course)
* Vision & Video
  * Reka releases Reka-Flash, 21B & Reka Edge MM models (Blog, Demo)
* Voice & Audio
  * WhisperKit runs on WatchOS now! (X)
* AI Art & Diffusion & 3D
  * Stability releases Stable Cascade - new AI model based on Würstchen v3 (Blog, Demo)
* Tools & Others
  * Goody2ai - A very good and aligned AI that does NOT want to break the rules (try it)

🔥 Let's start with Breaking News (in the order of how they happened)

Google teases Gemini 1.5 with a whopping 1M context window

This morning, Jeff Dean released a thread full of crazy multimodal examples of their new Gemini 1.5 model, which can handle up to 1M tokens in the context window. The closest to that model so far was Claude 2.1, and that was not multimodal. They also claim they are researching up to 10M tokens in the context window.

The thread was chock full of great examples, some of which highlighted the multimodality of this incredible model, like being able to pinpoint and give a timestamp of an exact moment in an hour-long movie, just by getting a sketch as input. This honestly blew me away. They were able to use the incredibly large context window, break down the WHOLE 1 hour movie into frames and provide additional text tokens on top of it, and the model had near perfect recall.

They used Greg Kamradt's needle-in-a-haystack analysis on text, video and audio and showed near perfect recall, which highlights how much advancement we got in the area of context windows. Just for reference, less than a year ago, we had this chart from Mosaic when they released MPT. That chart's Y axis tops out at 60K, the Gemini chart goes up to 1 MILLION, and we're less than a year apart. Not only that, Gemini Pro 1.5 is also multimodal.

I've got to give props to the Gemini team, this is quite a huge leap for them, and for the rest of the industry this is a significant jump in what users will expect going forward! No longer will we be told "hey, your context is too long" 🤞

A friend of the pod, Enrico Shipolle, joined the stage (you may remember him from our deep dive into extending the Llama context window to 128K) and showed that a bunch of new research makes all this possib

More Episodes

📆 ThursdAI - Nov 14 - Qwen 2.5 Coder, No Walls, Gemini 1114 👑 LLM, ChatGPT OS integrations & more AI news

This week is a very exciting one in the world of AI news, as we get 3 SOTA models: one in overall LLM rankings, one in OSS coding and one in OSS voice, plus a bunch of new breaking news during the show (which we reacted to live on the pod, and as we're now doing video, you can see us freak out in real...

Published 11/15/24

📆 ThursdAI - Nov 7 - Video version, full o1 was given and taken away, Anthropic price hike-u, halloween 💀 recap & more AI news

👋 Hey all, this is Alex, coming to you from the very Sunny California, as I'm in SF again, while there is a complete snow storm back home in Denver (brrr). I flew here for the Hackathon I kept telling you about, and it was glorious, we had over 400 registered, over 200 approved hackers, 21 teams...

Published 11/08/24