Tech Stories by Dmitry Kan
Vector Podcast
Connor Shorten - Research Scientist, Weaviate - ChatGPT, LLMs, Form vs Meaning

Topics:

00:00 Intro

01:54 Things Connor learnt in the past year that changed his perception of Vector Search

02:42 Is search becoming conversational?

05:46 Connor asks Dmitry: How will Large Language Models change Search?

08:39 Vector Search Pyramid

09:53 Large models, data, Form vs Meaning and the octopus underneath the ocean

13:25 Examples of getting help from ChatGPT and how it compares to web search today

18:32 Classical search engines with URLs for verification vs ChatGPT-style answers

20:15 Hybrid search: keywords + semantic retrieval

23:12 Connor asks Dmitry about his experience with sparse retrieval

28:08 SPLADE vectors

34:10 OOD-DiskANN: handling out-of-distribution queries, and nuances of sparse vs dense indexing and search

39:54 Ways to debug a query case in dense retrieval (spoiler: it is a challenge!)

44:47 Intricacies of teaching ML models to understand your data and re-vectorization

49:23 Local IDF vs global IDF and how dense search can approach this issue

54:00 Realtime index

59:01 Natural language to SQL

1:04:47 Turning text into a causal DAG

1:10:41 Engineering and Research as two highly intelligent disciplines

1:18:34 Podcast search

1:25:24 Ref2Vec for recommender systems

1:29:48 Announcements

For Show Notes, please check out the YouTube episode below.

This episode on YouTube: https://www.youtube.com/watch?v=2Q-7taLZ374

Podcast design: Saurabh Rai: https://twitter.com/srvbhr
