Description
Today's guest is Yujian Tang from Zilliz, one of the big players in the vector database market. This is the first episode in a series we’re doing on vectors and vector databases. We start with the basics: What is a vector? What are vector embeddings? How does vector search work? And why the heck do I even need a vector database?
Retrieval-augmented generation (RAG) for customizing LLMs is where vector databases are getting a lot of their use. On the surface, RAG seems pretty simple, but in reality, a lot of tinkering goes into taking it to production.
Yujian explains some of the tripwires that you might run into and how to think through those problems. We think you're going to really enjoy this episode.
Timestamps
02:08 Introduction
03:16 What is a Vector?
07:01 How does Vector Search work?
14:08 Why do I need a Vector database?
15:11 Use Cases
17:37 What is RAG?
20:34 RAG vs fine-tuning
29:51 Measuring Performance
32:32 Is RAG here to stay?
35:43 Milvus
37:17 History of Milvus
47:44 Rapid Fire
X
https://twitter.com/yujian_tang
https://twitter.com/seanfalconer