JVector: Cutting-Edge Vector Search in Java
Description
An airhacks.fm conversation with Jonathan Ellis (@spyced) about: discussion of JVector, a Java-based vector search engine, Apache Kudu as an alternative to Cassandra for wide-column databases, FoundationDB as a NoSQL database, explanation of vectors and embeddings in machine learning, different embedding models and their dimensions, the Hamming distance, binary quantization and product quantization for vector compression, the DiskANN algorithm for efficient vector search on disk, optimistic concurrency control in JVector, challenges in implementing academic papers, the Neon database, JVector's performance characteristics and typical database sizes, advantages of Astra DB over Cassandra, separation of compute and storage in cloud databases, JVector's use of Panama and SIMD instructions, the potential for contributions to the JVector project, Upstash's use of JVector for their vector search service, the cutting-edge nature of JVector in the Java ecosystem, the logarithmic performance of JVector for index construction and search, typical search latencies in the 30-50 millisecond range, the young and rapidly evolving field of vector search, the self-contained nature of the JVector codebase
Jonathan Ellis on Twitter: @spyced