ThursdAI - Feb 1, 2024 - Code LLama, Bard is now 2nd best LLM?!, new LLaVa is great at OCR, Hermes DB is public + 2 new Embed models + Apple AI is coming 👀
Description
TL;DR of all topics covered + Show notes
* Open Source LLMs
* Meta releases Code-LLama 70B - 67.8% HumanEval (Announcement, HF instruct version, HuggingChat, Perplexity)
* Together added function calling + JSON mode to Mixtral, Mistral and CodeLLama
* RWKV (non-transformer-based) Eagle-7B - (Announcement, Demo, Yam's Thread)
* Someone leaks Miqu, Mistral confirms it's an old version of their model
* Olmo from Allen Institute - fully open source 7B model (Data, Weights, Checkpoints, Training code) - Announcement
* Datasets & Embeddings
* Teknium open sources Hermes dataset (Announcement, Dataset, Lilac)
* Lilac announces Garden - LLM powered clustering cloud for datasets (Announcement)
* BAAI releases BGE-M3 - Multi-lingual (100+ languages), 8K context, multi functional embeddings (Announcement, Github, technical report)
* Nomic AI releases Nomic Embed - fully open source embeddings (Announcement, Tech Report)
* Big CO LLMs + APIs
* Bard with Gemini Pro becomes the 2nd-ranked LLM in the world per LMSYS, beating 2 out of 3 GPT-4 versions (Thread)
* OpenAI launches GPT mention feature, it's powerful! (Thread)
* Vision & Video
* 🔥 LLaVa 1.6 - 34B achieves SOTA among open source vision models (X, Announcement, Demo)
* Voice & Audio
* Argmax releases WhisperKit - super optimized (and on-device) Whisper for iOS/Macs (X, Blogpost, Github)
* Tools
* Infinite Craft - Addictive concept-combining game using LLama 2 (neal.fun/infinite-craft/)
Haaaapy first of the second month of 2024 folks, how was your Jan? Not too bad I hope? We definitely got quite a show today, the live recording turned into a procession of breaking news, authors who came up on stage, deeper interviews and of course... news.
This podcast episode focuses only on the news, but you should know that we had a deeper chat with Eugene (PicoCreator) from RWKV, and a deeper dive into a dataset curation and segmentation tool called Lilac with founders Nikhil & Daniel. We also got a breaking news segment, and a guest joined us to talk about the latest open source release from AI2 👏
Besides that, oof, what a week. It started out with the news that the new Bard API (apparently with Gemini Pro + internet access) is now the 2nd best LLM in the world (according to LMSYS at least). Then there was the whole thing with Miqu, which turned out to be, yes, a leak of an earlier version of a Mistral model, and Mistral acknowledged it. And finally, the main release of LLaVa 1.6, which became the SOTA vision model in open source, was very interesting!
Open Source LLMs
Meta releases CodeLLama 70B
Benches 67.8% on HumanEval (without fine-tuning) and is already available on HuggingChat, Perplexity and TogetherAI, quantized for MLX on Apple Silicon, and has several finetunes, including SQLCoder, which beats GPT-4 on SQL
Has a 16K context window, and is one of the top open models for code
Eagle-7B RWKV based model
I was honestly a bit disappointed with the multilingual performance compared to the 1.8B Stable LM, but the folks on stage told me not to compare this in a traditional sense to a transformer model, but rather to look at the potential here. So we had Eugene from the RWKV team join on stage and talk through the architecture, the fact that RWKV is the first AI model in the Linux Foundation and will always be open source, and that they are working on bigger models! That interview will be released soon.
Olmo from AI2 - new fully open source 7B model (announcement)
This announcement came as Breaking News, I got a tiny ping just before Nathan dropped a magnet link on X, and then they followed up with the Olmo release and announcement.
A fully open source 7B model, including checkpoints, weights, Weights & Biases logs (coming soon), the dataset (Dolma) and just... everything. Anything you can ask about this model, they said they will tell you. Incredible to see how open this effort is, and kudos to the team for such transparency.
They also released a 1B version of Olmo, and you can read the technical report here.
Big CO LLMs + APIs
Mistral handles t