ThursdAI - Feb 8 - Google Gemini Ultra is here, Qwen 1.5 with Junyang and deep dive into ColBERT, RAGatouille and DSPy with Connor Shorten and Benjamin Clavie
Description
Hihi, this is Alex, from Weights & Biases, coming to you live, from Yosemite! Well, actually I'm writing these words from a fake virtual Yosemite that appears above my kitchen counter, as I'm now a Vision Pro user and I will force myself to work inside this thing and tell you if it's worth it. I will also be on the lookout for anything AI related in this new spatial computing paradigm, like THIS for example!
But back to reality for a second, we had quite the show today! We had the pleasure of having Junyang Justin Lin, a dev lead at Alibaba, join us and talk about Qwen 1.5 and QwenVL, and then we had a deep dive into quite a few acronyms I've been seeing on my timeline lately, namely DSPy, ColBERT and (the funniest one) RAGatouille, and we had a chat with Connor from Weaviate and Benjamin, the author of RAGatouille, about what it all means! Really really cool show today, hope you don't only read the newsletter but also listen on Spotify, Apple or right here on Substack.
TL;DR of all topics covered:
* Open Source LLMs
* Alibaba releases a BUNCH of new QWEN 1.5 models including a tiny .5B one (X announcement)
* Abacus fine-tunes Smaug, top of HF leaderboard, based on Qwen 72B (X)
* LMsys adds more open source models, sponsored by Together (X)
* Jina Embeddings fine tune for code
* Big CO LLMs + APIs
* Google rebranding Bard to Gemini and launching Gemini Ultra (Gemini)
* OpenAI adds image metadata (Announcement)
* OpenAI keys are now restricted per key (Announcement)
* Vision & Video
* Bria - RMBG 1.4 - Open Source BG removal that runs in your browser (X, DEMO)
* Voice & Audio
* MetaVoice, a new Apache 2.0 licensed TTS - (Announcement)
* AI Art & Diffusion & 3D
* Microsoft added DALL-E editing with "designer" (X thread)
* Stability AI releases update to SVD - video 1.1 launches with a webUI, much nicer videos
* Deep Dive with Benjamin Clavie and Connor Shorten show notes:
* Benjamin's announcement of RAGatouille (X)
* Connor chat with Omar Khattab (author of DSPy and ColBERT) - Weaviate Podcast
* Very helpful intro to ColBERT + RAGatouille - Notion
Open Source LLMs
Alibaba releases Qwen 1.5 - ranges from .5 to 72B (DEMO)
With 6 sizes, including 2 new ones, from as little as a .5B parameter model to an interesting 4B, all the way to a whopping 72B, Alibaba open sources additional QWEN checkpoints. We had the honor of having friend of the pod Junyang Justin Lin on again, and he talked to us about how these sizes were selected, noted that even though this model beats Mistral Medium on some benchmarks, it remains to be seen how well it performs on human evaluations, and shared a bunch of details about open sourcing this.
The models were released with all the latest and greatest quantizations, significantly improved context length (32K) and support for both Ollama and LM Studio (which I helped make happen, and I'm very happy with the way the ThursdAI community is growing and connecting!)
We also had a chat about QwenVL Plus and QwenVL Max, their API-only examples for the best open source vision enabled models, and had the awesome Piotr Skalski from Roboflow on stage to chat with Junyang about those models!
To me, a success of ThursdAI is when the authors of the things we talk about come to the show, and this is Junyang's second appearance, which he joined at midnight at the start of the Chinese New Year, so it's greatly appreciated, and def. give him a listen!
Abacus Smaug climbs to top of the Hugging Face leaderboard
Junyang also mentioned that Smaug, from Abacus, is now at the top of the leaderboards. It is a finetune of the previous Qwen-72B, not even this new one. As the first model to achieve an average score of 80, this is an impressive showing from Abacus. They haven't released any new data yet, but they said they are planning to!
They also said that they are planning to finetune Miqu, which we covered last time, the leak from Mistral that was acknowledged by Arthur Mensch the CEO of Mistral.
The techniques that Abacus use