I’ve just released an episode with Sonam Pankaj, who works on EmbedAnything. We recorded it at Berlin Buzzwords back in June, where I also got the chance to test my new audio recording gear (RØDE Wireless GO II).
You will find the first episode from Berlin Buzzwords with Doug Turnbull here:
EmbedAnything is an infrastructure layer that lets you embed anything: different text formats, but also other modalities such as audio. It is written in Rust for performance reasons and can embed the text of a PDF 40x faster than in Python.
EmbedAnything sits at the same level as the Encoders layer in the Vector Search Pyramid: https://medium.com/@dmitry-kan/neural-search-frameworks-a-head-to-head-comparison-976aa6662d20
We spoke about this project in detail, and also about metric learning, quality assurance and multimodality.
There are a bunch of show notes with different papers and projects — do check them out.
You can also watch this episode on YouTube:
And listen on major platforms:
RSS: https://rss.com/podcasts/vector-podcast/1663042/
Spotify:
Apple Podcasts:
Vector Podcast from Berlin Buzzwords'24: Sonam Pankaj, EmbedAnything