Google Unveils Gemini Embedding 2, Its First AI Model to Map Text, Images and Video Together

Mar 11, 2026 - 07:00
Google Unveils Gemini Embedding 2, Its First AI Model to Map Text, Images and Video Together
Google released its first fully multimodal embedding model on Tuesday. Dubbed Gemini Embedding 2, the artificial intelligence (AI) model maps text, images, audio, and videos into a single, unified embedding space. This means it uses a architecture to understand concepts whether they are written as words, spoken aloud, or shown in an image or a video.