VoyageAI launches voyage-multimodal-3.5 embedding model with video support; outperforms Cohere by 4.56% and Google by 4.65% on retrieval accuracy using unified transformer architecture

voyage-multimodal-3.5: a new multimodal retrieval frontier with video support

TL;DR – We’re excited to introduce voyage-multimodal-3.5, our next-generation multimodal embedding model built for retrieval over text, images, and videos. Like voyage-multimodal-3, it embeds interleaved text and images (screenshots, PDFs, tables, figures, slides), but now adds explicit support for video frames. It’s also the first production-grade video embedding model to support Matryoshka embeddings for flexible dimensionality. voyage-multimodal-3.5 attains 4.56% higher retrieval accuracy tha...
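To make the "flexible dimensionality" point concrete, here is a minimal sketch of how Matryoshka embeddings are typically used: keep the leading components of a full-size vector and re-normalize before computing similarity. The helper names (`truncate_matryoshka`, `cosine`) and the toy vectors are hypothetical illustrations, not part of the Voyage API, and the post as quoted here does not say which output dimensions voyage-multimodal-3.5 supports.

```python
import math


def truncate_matryoshka(embedding, dim):
    """Keep the first `dim` components and re-normalize to unit length.

    Assumes the vector came from a Matryoshka-trained model, where a
    leading prefix of the full vector is itself a usable embedding.
    """
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and len(embedding)")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to normalize
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 4-dimensional vectors standing in for real model output.
query = [3.0, 4.0, 12.0, 5.0]
document = [3.0, 4.0, -1.0, 2.0]

# Compare at full size and at a truncated size (2 is an arbitrary choice).
full_score = cosine(query, document)
short_score = cosine(truncate_matryoshka(query, 2), truncate_matryoshka(document, 2))
```

Shorter vectors cut storage and search cost, usually at some loss in retrieval quality; how large that loss is for a given dimension depends on the model and should be measured on your own data.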