📅 ThursdAI - ChatGPT-4o back on top, Nous Hermes 3 LLama finetune, XAI uncensored Grok2, Anthropic LLM caching & more AI news from another banger week
Look, these crazy weeks don't seem to stop, and though this week started out a bit slower (while folks were waiting to see how the speculation about certain red berry flavored conspiracies is shaking out), the big labs are shipping!
We've got space uncle Elon dropping an "almost-gpt4" level Grok-2 that's uncensored, has access to real time data on X and can draw all kinds of images with Flux. OpenAI announced a new ChatGPT 4o version (not the one from last week that supported structured outputs, a different one!), and Anthropic dropped something that makes AI Engineers salivate!
Oh, and for the second week in a row, ThursdAI live spaces were listened to by over 4K people, which is very humbling, and awesome because, for example, today Nous Research announced Hermes 3 live on ThursdAI before the public heard about it (and I had a long chat w/ Emozilla about it, very well worth listening to).
TL;DR of all topics covered:
* Big CO LLMs + APIs
* xAI releases Grok-2 - frontier level Grok, uncensored + image gen with Flux (𝕏, Blog, Try It)
* OpenAI releases another ChatGPT-4o (and tops LMsys again) (X, Blog)
* Google showcases Gemini Live, Pixel Buds w/ Gemini, Google Assistant upgrades (Blog)
* Anthropic adds Prompt Caching in Beta - cutting costs by up to 90% (X, Blog)
* AI Art & Diffusion & 3D
* Flux now has support for LoRAs, ControlNet, img2img (Fal, Replicate)
* Google Imagen-3 is out of secret preview and it looks very good (𝕏, Paper, Try It)
* This week's Buzz
* Using Weights & Biases Weave to evaluate Claude Prompt Caching (X, Github, Weave Dash)
* Open Source LLMs
* NousResearch drops Hermes 3 - 405B, 70B, 8B Llama 3.1 finetunes (X, Blog, Paper)
* NVIDIA Llama-3.1-Minitron 4B (Blog, HF)
* AnswerAI - colbert-small-v1 (Blog, HF)
* Vision & Video
* Runway Gen-3 Turbo is now available (Try It)
Big Companies & LLM APIs
Grok 2: Real Time Information, Uncensored as Hell, and… Flux?!
The team at xAI definitely knows how to make a statement, dropping a knowledge bomb on us with the release of Grok 2. This isn't your uncle's dad joke model anymore - Grok 2 is a legitimate frontier model, folks.
As Matt Shumer excitedly put it:
“If this model is this good with less than a year of work, the trajectory they’re on, it seems like they will be far above this...very very soon” 🚀
Not only does Grok 2 have impressive scores on MMLU (beating the previous GPT-4o on their benchmarks… from MAY 2024), it even outperforms Llama 3 405B, proving that xAI isn't messing around.
But here's where things get really interesting. Not only does this model access real time data through Twitter, which is a MOAT so wide you could probably park a rocket in it, it's also VERY uncensored. Think generating political content that'd make your grandma clutch her pearls, or imagining Disney characters breaking bad in a way that’s both hilarious and kinda disturbing, all thanks to Grok 2’s integration with Black Forest Labs' Flux image generation model.
With an affordable price point ($8/month for X Premium, including access to Grok 2 and their killer MidJourney competitor?!), it’ll be interesting to see how Grok’s "truth seeking" (as xAI calls it) model plays out. Buckle up, folks, this is going to be wild, especially since all the normies now have the power to create political memes that look VERY realistic within seconds.
Oh yeah… and there’s the upcoming Enterprise API as well… and Grok 2 made its debut in the wild on the LMSys Arena, lurking incognito as "sus-column-r", and is now placed on TOP of Sonnet 3.5, coming in at number 5 overall!
OpenAI's latest ChatGPT is back at #1, but it's all very confusing 😵‍💫
As the news about Grok-2 was settling in, OpenAI decided to, well… drop yet another GPT-4o update on us. While Google was hosting their event, no less. Seriously, OpenAI? I guess they like to one-up Google's new releases (they also kicked Gemini from the #1 position after only 1 week there).
So what