ThursdAI - Mar 28 - 3 new MoEs (XXL, Medium and Small), Opus is 👑 of the arena, Hume is sounding emotional + How Tanishq and Paul turn brainwaves into SDXL images 🧠👁️
Description
Hey everyone, this is Alex, and can you believe that we're almost done with Q1 2024? March 2024 was kind of crazy, so I'm excited to see what April brings (besides the Weights & Biases conference in SF called Fully Connected, which I encourage you to attend and say hi to me and the team!)
This week we have tons of exciting stuff on the leaderboards: say hello to the new best AI in the world, Opus (+ some other surprises). In open source we had new MoEs (one from the Mosaic/Databricks folks, which tops the open source game, one from AI21 called Jamba, which shows that a transformer alternative/hybrid can actually scale, and a tiny MoE from Alibaba), as well as an incredible emotional TTS from Hume.
I also had the pleasure of finally sitting down with friends of the pod Tanishq Abraham and Paul Scotti from MedArc to chat about MindEye 2 and how they teach AI to read minds using diffusion models 🤯🧠👁️
TL;DR of all topics covered:
* AI Leaderboard updates
  * Claude Opus is the number 1 LLM on the arena (and in the world)
  * Claude Haiku passes GPT4-0613
  * 🔥 Starling 7B beta is the best Apache 2 model on LMSys, passing GPT3.5
* Open Source LLMs
  * Databricks/Mosaic DBRX - a new top Open Access model (X, HF)
  * 🔥 AI21 - Jamba 52B - Joint Attention Mamba MoE (Blog, HuggingFace)
  * Alibaba - Qwen1.5-MoE-A2.7B (Announcement, HF)
  * Starling - 7B that beats GPT3.5 on LMSys (HF)
  * LISA beats LoRA as the frontrunner PEFT method (X, Paper)
  * Mistral 0.2 Base released (Announcement)
* Big CO LLMs + APIs
  * Emad leaves Stability 🥺
  * Apple rumors - Baidu, Gemini, Anthropic, who else? (X)
* This week's buzz
  * WandB Workshop in SF confirmed April 17 - LLM evaluations (sign up here)
* Vision & Video
  * Sora showed some demos by actual artists, Air Head was great (Video)
  * Tencent AniPortrait - generates photorealistic animated avatars (X)
  * MedArc - MindEye 2 - fMRI signals to diffusion models (X)
* Voice & Audio
  * Hume demos EVI - empathic voice analysis & generation (X, demo)
* AI Art & Diffusion & 3D
  * Adobe Firefly adds structure reference and style transfer (X, Demo)
* Discussion
  * Deep dive into MindEye 2 with Tanishq & Paul from MedArc
  * Is narrow finetuning done for with larger context + cheaper prices - debate
🔥🔥🔥 Leaderboards updates from LMSys (Arena)
This week's updates to the LMSys arena are significant. (Reminder: LMSys uses a mix of MT-Bench, LLM-as-a-judge evaluation and user Elo scores, where users play with these models and choose which answer they prefer.)
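For anyone who hasn't seen how an Elo-style score moves after a user picks a winner, here's a rough sketch. This is the textbook Elo update, not necessarily how LMSys computes its leaderboard; the starting rating of 1000, the K-factor of 32, the 400-point scale and the model names are all assumptions for illustration.

```python
def expected_score(rating_a, rating_b):
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k=32.0):
    """Return new (rating_a, rating_b) after one vote.

    score_a is 1.0 if the user preferred A, 0.0 if they preferred B,
    and 0.5 for a tie. k=32 is a conventional value, assumed here.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the total rating is conserved.
    return rating_a + delta, rating_b - delta


# Hypothetical models and votes, purely for illustration.
ratings = {"model_a": 1000.0, "model_b": 1000.0}
votes = [
    ("model_a", "model_b", 1.0),  # user preferred model_a
    ("model_a", "model_b", 1.0),  # user preferred model_a
    ("model_a", "model_b", 0.0),  # user preferred model_b
]

for a, b, score_a in votes:
    ratings[a], ratings[b] = update(ratings[a], ratings[b], score_a)
```

The thing to notice is that an upset (a low-rated model beating a high-rated one) moves the ratings more than an expected win does, which is why a small model that keeps winning votes can climb past much bigger ones.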
For the first time since the LMSys arena launched, the top model is NOT GPT-4 based. It's now Claude Opus, which is not surprising if you've used the model. What IS surprising is that Haiku, its tiniest, fastest brother, is now well positioned at number 6, beating a GPT-4 version from the summer, Mistral Large and other models while being dirt cheap.
We also have an incredible showing from the only Apache 2.0 licensed model in the top 15, Starling LM 7B beta, which is now 13th on the chart as an incredible finetune of a finetune (OpenChat) of Mistral 7B.
Yes, you can now run a GPT-3.5-beating model on your Mac, fully offline. Incredible.
Open Source LLMs (Welcome to MoEs)
Mosaic/Databricks gave us DBRX 132B MoE - trained on 12T tokens (X, Blog, HF)
Absolutely crushing the previous records, Mosaic has released the top open access model (one you can download, run and finetune) in a while, beating Llama 70B, Grok-1 (314B) and pretty much every other non-closed-source model in the world, not only on metrics and evals but also on inference speed.
It uses a Mixture of Experts (MoE) architecture with 16 experts, with different experts activating for different tokens. This gives it 36 billion active parameters, compared to 13 billion for Mixtral. DBRX has strong capabilities in math, code, and natural language understanding.
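To make the "active parameters" idea concrete, here's a toy sketch of top-k expert routing, the general trick that lets an MoE run only a slice of its weights per token. It is not DBRX's actual code: the router logits are random, the "experts" are trivial scaling functions, and `TOP_K = 4` is a value picked for illustration.

```python
import math
import random

NUM_EXPERTS = 16  # expert count mentioned in the text above
TOP_K = 4         # assumption for illustration: experts run per token


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, top_k=TOP_K):
    """Return [(expert_index, weight), ...] for the top_k highest-scoring experts.

    Weights are renormalized so they sum to 1 over the chosen experts.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(token, router_logits, experts, top_k=TOP_K):
    """Weighted sum of the chosen experts' outputs; the other experts never run."""
    return sum(w * experts[i](token) for i, w in route(router_logits, top_k))


# Toy experts: expert i just multiplies its input by (i + 1).
experts = [lambda x, scale=i + 1: x * scale for i in range(NUM_EXPERTS)]

random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in router output
picked = route(logits)
output = moe_layer(1.0, logits, experts)
```

Only the experts in `picked` are evaluated for this token, so the compute per token tracks the active parameter count rather than the full model size. That is how a 132B model can run with 36B active parameters.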