A Deep Dive Into Generative's Newest Models: Gemini vs Mistral (Mixtral-8x7B)–Part I
Description
For the last paper read of the year, Arize CPO & Co-Founder Aparna Dhinakaran is joined by Dat Ngo (ML Solutions Architect) and Aman Khan (Product Manager) for an exploration of the new kids on the block: Gemini and Mixtral-8x7B. There's a lot to cover, so this week's paper read is Part I in a series about Mixtral and Gemini.

In Part I, we provide some background and context for Mixtral 8x7B from Mistral AI, a high-quality sparse mixture of experts (SMoE) model that outperforms Llama 2 70B on most benchmarks with 6x faster inference. Mixtral also matches or outperforms GPT-3.5 on most benchmarks. This open-source model was optimized through supervised fine-tuning and direct preference optimization.

Stay tuned for Part II in January, where we'll build on this conversation and discuss Gemini, developed by teams at DeepMind and Google Research.

Link to transcript and live recording: https://arize.com/blog/a-deep-dive-into-generatives-newest-models-mistral-mixtral-8x7b/

To learn more about ML observability, join the Arize AI Slack community or get the latest on our LinkedIn and Twitter.
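For readers unfamiliar with the sparse mixture-of-experts idea mentioned above, here is a minimal illustrative sketch in PyTorch of top-2 expert routing, the general pattern behind Mixtral 8x7B's feed-forward blocks. This is not Mistral's implementation; the class name, layer sizes, and expert design here are made-up placeholders chosen for brevity.

    # Illustrative sketch only (not Mistral's code): a sparse mixture-of-experts
    # layer where a router picks the top-2 experts per token and mixes their outputs.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SparseMoE(nn.Module):
        def __init__(self, dim=64, hidden=128, num_experts=8, top_k=2):
            super().__init__()
            self.top_k = top_k
            self.router = nn.Linear(dim, num_experts)  # token -> per-expert score
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
                for _ in range(num_experts)
            )

        def forward(self, x):                                    # x: (tokens, dim)
            scores = self.router(x)                              # (tokens, num_experts)
            top_vals, top_idx = scores.topk(self.top_k, dim=-1)  # choose 2 experts per token
            weights = F.softmax(top_vals, dim=-1)                # normalize over chosen experts
            out = torch.zeros_like(x)
            for slot in range(self.top_k):
                idx = top_idx[:, slot]
                w = weights[:, slot].unsqueeze(-1)
                for e, expert in enumerate(self.experts):
                    mask = idx == e
                    if mask.any():                               # only run experts that were selected
                        out[mask] += w[mask] * expert(x[mask])
            return out

    tokens = torch.randn(4, 64)
    print(SparseMoE()(tokens).shape)  # torch.Size([4, 64])

The sparsity is the point: only the selected experts run for each token, so total parameters can grow (8 experts) while per-token compute stays close to a dense model using 2 experts' worth of weights.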
More Episodes
We break down the paper, Trustworthy LLMs: A Survey and Guideline for Evaluating Large Language Models' Alignment. Ensuring alignment (aka: making models behave in accordance with human intentions) has become a critical task before deploying LLMs in real-world applications. However, a major...
Published 05/30/24
Due to the cumbersome nature of human evaluation and limitations of code-based evaluation, Large Language Models (LLMs) are increasingly being used to assist humans in evaluating LLM outputs. Yet LLM-generated evaluators often inherit the problems of the LLMs they evaluate, requiring further...
Published 05/13/24