πŸ“… ThursdAI Jan 4 - New WizardCoder, Hermes2 on SOLAR, Embedding King? from Microsoft, Alibaba upgrades vision model & more AI news
Here’s a TL;DR and show notes links:

* Open Source LLMs
  * New WizardCoder 33B V1.1 - 79% on HumanEval (X, HF)
  * Teknium's Hermes 2 on SOLAR 10.7B (X, HF)
  * Microsoft - E5 SOTA text embeddings w/ Mistral (X, HF, Paper, Yam's Thread)
* Big CO LLMs + APIs
  * Samsung is about to announce some AI stuff
  * OpenAI GPT store to come next week
  * Perplexity announces a $73.6M Series B round
* Vision
  * Alibaba - QWEN-VL PLUS was updated to 14B (X, Demo)
  * OSU SeeAct - GPT4V as a generalist web agent, if grounded (X, Paper)
* Voice & Audio
  * Nvidia + Suno release NeMo Parakeet, beats Whisper on English ASR (X, HF, DEMO)
* Tools & Agents
  * Stanford - Mobile ALOHA bot - open source cooking robot (Website, X thread)

Open Source LLMs

WizardCoder 33B reaches a whopping 79% on HumanEval pass@1

State-of-the-art open source LLM coding is here: a whopping 79% on HumanEval, with the Wizard team finetuning DeepSeek Coder into the best open source coder, edging closer to GPT-4 and passing Gemini Pro and GPT-3.5 πŸ‘ (at least on some benchmarks).

Teknium releases Hermes 2 on top of SOLAR 10.7B

I downloaded it with LM Studio and have been running it; it's very capable. Right now SOLAR models are still on top of the Hugging Face leaderboard, and Hermes 2 now comes in 7B (Mistral), 10.7B (SOLAR), and 34B (Yi) sizes.

On the podcast I told a story of how this week I actually used the 34B version of Capybara for a task that GPT kept refusing to help me with. It was honestly kind of strange: a simple request to translate kept failing with an ominous "network error". It only highlighted how important the local AI movement is, and now I've had the experience myself of a local model coming through when a hosted, more capable one didn't.

Microsoft releases a new SOTA text embeddings model, E5, finetuned on synthetic data on top of Mistral 7B

We present a new, easy way to create high-quality text embeddings. Our method uses synthetic data and requires less than 1,000 training steps, without the need for complex training stages or large, manually collected datasets. By using advanced language models to generate synthetic data in almost 100 languages, we train open-source models with a standard technique. Our experiments show that our method performs well on tough benchmarks using only synthetic data, and it achieves even better results when we mix synthetic and real data.

We had the great pleasure of having Bo Wang again (one of the authors of the previously SOTA Jina embeddings and a previous podcast guest) to do a deep dive into embeddings, and specifically into E5 with its decoder-only architecture.

While the approach the Microsoft researchers took here is interesting, and despite E5 claiming a top spot on the MTEB leaderboard (pictured above), this model doesn't seem very practical for the way most folks use embeddings right now (RAG), for the following reasons:

* Context length limitation of 32k, with a recommendation not to exceed 4096 tokens.
* Requires a one-sentence instruction for queries (see the sketch after this list), adding complexity for certain use cases like RAG.
* Model size is large (14GB), leading to higher costs for production use.
* Alternative models like bge-large-en-v1.5 are significantly smaller (1.35GB).
* Embedding size is 4096 dimensions, increasing the cost for vector storage.
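To make the instruction requirement concrete, here's a minimal sketch (not from the show) of how one might embed a query and a passage with this model. It assumes the intfloat/e5-mistral-7b-instruct repo on Hugging Face, the "Instruct: ... / Query: ..." prefix format, and last-token pooling; treat the exact repo id, prompt strings, and example task description as assumptions to verify against the model card.

```python
# Minimal sketch, assuming intfloat/e5-mistral-7b-instruct, an
# "Instruct: ...\nQuery: ..." prefix for queries, and last-token pooling.
# Verify the exact format against the model card before relying on it.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "intfloat/e5-mistral-7b-instruct"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
tokenizer.padding_side = "right"               # so the "last non-pad token" pooling below is valid
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# ~14GB of weights in fp16; device_map="auto" (needs accelerate) places it on available GPUs.
model = AutoModel.from_pretrained(MODEL_ID, torch_dtype=torch.float16, device_map="auto")
model.eval()

def last_token_pool(hidden_states, attention_mask):
    # Take the hidden state of the last non-padding token of each sequence.
    last_idx = attention_mask.sum(dim=1) - 1
    return hidden_states[torch.arange(hidden_states.size(0)), last_idx]

# Queries need a one-sentence task instruction; documents are embedded as-is.
task = "Given a web search query, retrieve relevant passages that answer the query"
texts = [
    f"Instruct: {task}\nQuery: how do I chunk documents for RAG?",
    "Chunking splits long documents into overlapping passages before embedding.",
]

batch = tokenizer(texts, padding=True, truncation=True, max_length=4096, return_tensors="pt")
batch = {k: v.to(model.device) for k, v in batch.items()}
with torch.no_grad():
    out = model(**batch)
emb = F.normalize(last_token_pool(out.last_hidden_state, batch["attention_mask"]), dim=-1)
print(emb.shape)        # (2, 4096) - one 4096-dim vector per text
print(emb[0] @ emb[1])  # cosine similarity between query and passage
```

For context on the storage point: at 4096 float32 dimensions each vector takes roughly 16 KB before any index overhead, versus about 4 KB for a 1024-dimension model such as bge-large-en-v1.5, so the raw vectors for a 10M-chunk index grow from roughly 40 GB to roughly 160 GB.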
Big CO LLMs + APIs

OpenAI announces that the GPT store is coming next week!

I can't wait to put up the visual weather GPT I created and see how the store promotes it, and whether I get some revenue share like OpenAI promised during Dev Day.

My daughter and I are also frequent users of Alice the kid painter, a custom GPT that my daughter named Alice, which knows it's speaking to kids over voice and generates coloring pages. We'll see how much this store lives up to the promises.

This week's Buzz (What I learned with WandB this week)

This week was a short one for me, so not a lot of learnings, but I did start this course from W&B, called Training and Fine-tuning Large Language Models (LLMs). It features…