ThursdAI - Feb 1, 2024- Code LLama, Bard is now 2nd best LLM?!, new LLaVa is great at OCR, Hermes DB is public + 2 new Embed models + Apple AI is coming 👀
Description
TL;DR of all topics covered + Show notes

* Open Source LLMs
  * Meta releases Code-LLama 70B - 67.8% HumanEval (Announcement, HF instruct version, HuggingChat, Perplexity)
  * Together added function calling + JSON mode to Mixtral, Mistral and CodeLLama
  * RWKV (non transformer based) Eagle-7B - (Announcement, Demo, Yam's Thread)
  * Someone leaks Miqu, Mistral confirms it's an old version of their model
  * Olmo from Allen Institute - fully open source 7B model (Data, Weights, Checkpoints, Training code) - Announcement
* Datasets & Embeddings
  * Teknium open sources Hermes dataset (Announcement, Dataset, Lilac)
  * Lilac announces Garden - LLM powered clustering cloud for datasets (Announcement)
  * BAAI releases BGE-M3 - multilingual (100+ languages), 8K context, multi-functional embeddings (Announcement, Github, technical report)
  * Nomic AI releases Nomic Embed - fully open source embeddings (Announcement, Tech Report)
* Big CO LLMs + APIs
  * Bard with Gemini Pro becomes 2nd LLM in the world per LMsys, beating 2 out of 3 GPT4 versions (Thread)
  * OpenAI launches GPT mention feature, it's powerful! (Thread)
* Vision & Video
  * 🔥 LLaVa 1.6 - 34B achieves SOTA for open source vision models (X, Announcement, Demo)
* Voice & Audio
  * Argmax releases WhisperKit - super optimized (and on device) Whisper for iOS/Macs (X, Blogpost, Github)
* Tools
  * Infinite Craft - addictive concept-combining game using LLama 2 (neal.fun/infinite-craft/)

Haaaapy first of the second month of 2024 folks, how was your Jan? Not too bad I hope? We definitely got quite a show today: the live recording turned into a procession of breaking news, authors who came up, deeper interviews and of course... news.
This podcast episode focuses only on the news, but you should know that we had a deeper chat with Eugene (PicoCreator) from RWKV, and a deeper dive into a dataset curation and segmentation tool called Lilac with its founders, Nikhil & Daniel. We also got a breaking news segment, and (from ) joined us to talk about the latest open source release from AI2 👏

Besides that, oof, what a week. It started out with the news that the new Bard API (apparently with Gemini Pro + internet access) is now the 2nd best LLM in the world (according to LMSYS at least). Then there was the whole thing with Miqu, which turned out to be, yes, a leak of an earlier version of a Mistral model, and they acknowledged it. And finally, the main release of LLaVa 1.6, which became the SOTA among open source vision models, was very interesting!

Open Source LLMs

Meta releases CodeLLama 70B

Benches 67.8% on HumanEval (without fine-tuning) and is already available on HuggingChat, Perplexity and TogetherAI, is quantized for MLX on Apple Silicon, and has several finetunes, including SQLCoder, which beats GPT-4 on SQL. It has a 16K context window, and is one of the top open models for code.

Eagle-7B RWKV based model

I was honestly a bit disappointed with its multilingual performance compared to the 1.8B Stable LM, but the folks on stage told me not to compare this in a traditional sense to a transformer model, and rather to look at the potential here. So we had Eugene from the RWKV team join on stage and talk through the architecture, the fact that RWKV is the first AI model in the Linux Foundation and will always be open source, and that they are working on bigger models! That interview will be released soon.

Olmo from AI2 - new fully open source 7B model (announcement)

This announcement came as Breaking News. I got a tiny ping just before Nathan dropped a magnet link on X, and then they followed up with the Olmo release and announcement.
A fully open source 7B model, including checkpoints, weights, Weights & Biases logs (coming soon), dataset (Dolma) and just... everything that you can ask, they said they will tell you about this model. Incredible to see how open this effort is, and kudos to the team for such transparency. They also released a 1B version of Olmo, and you can read the technical report here.

Big CO LLMs + APIs

Mistral handles t