News · 5 Nov 2025 · 6 min read

Cohere Embed V4: Multilingual Embeddings for Global RAG Systems

Cohere released Embed V4 with support for 100+ languages, improved retrieval accuracy, and reduced dimensionality for faster vector search.

Max Beech, Head of Content

TL;DR

  • Embed V4 supports 100+ languages with unified embedding space.
  • 1024 dimensions (vs 1536 for OpenAI) = 33% faster vector search.
  • Improved retrieval: +12% accuracy on MTEB benchmark vs V3.
  • Pricing: $0.10/million tokens (same as OpenAI text-embedding-3-small).


Cohere launched Embed V4 in November 2025, expanding multilingual support beyond V3's 100 languages while improving retrieval accuracy and reducing computational overhead. For companies building RAG systems that serve global users, V4 enables a single-model deployment across markets instead of maintaining language-specific embedding models.

Key improvements

Multilingual coverage

V3 covered 100 languages, with gaps in regional coverage. V4 supports 100+ languages, including:

  • Major: English, Chinese, Spanish, Arabic, French, German, Japanese
  • Regional: Swahili, Bengali, Vietnamese, Thai, Turkish
  • Low-resource: Hausa, Zulu, Pashto

Unified embedding space: All languages map to same 1024-dimensional space, enabling cross-lingual search (query in English, retrieve German documents).

Retrieval accuracy

MTEB (Massive Text Embedding Benchmark) scores:

| Model | Avg score | Retrieval | Classification |
| --- | --- | --- | --- |
| Cohere Embed V4 | 69.8% | 58.2% | 78.4% |
| Cohere Embed V3 | 62.3% | 52.1% | 74.2% |
| OpenAI text-embedding-3-small | 62.3% | 49.2% | 70.9% |
| OpenAI text-embedding-3-large | 64.6% | 54.9% | 75.4% |

V4 leads on retrieval tasks (RAG use case).

Dimensionality reduction

Embed V4 produces 1024-dimensional vectors, versus 1536 for OpenAI text-embedding-3-small and 3072 for text-embedding-3-large.

Benefits:

  • 33% faster vector similarity calculations
  • 33% less storage required
  • Maintains 95%+ of retrieval quality

Trade-off: Slightly less precision for classification tasks (acceptable for most RAG systems).
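The storage and speed arithmetic behind those numbers is straightforward. A back-of-the-envelope sketch, assuming float32 vectors (the default in most vector databases) and ignoring index overhead such as HNSW graphs and metadata:

```python
# Storage and compute savings from 1024 vs 1536 dimensions (float32 = 4 bytes).
BYTES_PER_FLOAT32 = 4

def index_size_gb(num_vectors: int, dims: int) -> float:
    """Raw vector storage in GB, excluding index overhead."""
    return num_vectors * dims * BYTES_PER_FLOAT32 / 1e9

cohere_gb = index_size_gb(10_000_000, 1024)   # ~41 GB for 10M documents
openai_gb = index_size_gb(10_000_000, 1536)   # ~61 GB for 10M documents

# Dot-product cost scales linearly with dimensionality, so the speedup matches:
savings = 1 - 1024 / 1536   # ~0.333, the "33%" quoted above
```

The same ratio applies to similarity computation, since a cosine or dot-product comparison touches every dimension once.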

Implementation

```python
import cohere
from sklearn.metrics.pairwise import cosine_similarity

co = cohere.Client(api_key="...")

# Embed documents (any language)
docs = [
    "AI is transforming healthcare",  # English
    "Die KI verändert das Gesundheitswesen",  # German
    "الذكاء الاصطناعي يحول الرعاية الصحية",  # Arabic
]

doc_embeds = co.embed(
    texts=docs,
    model="embed-v4",
    input_type="search_document",
).embeddings

# Embed the query (a different language is fine)
query = "How is AI used in medicine?"
query_embed = co.embed(
    texts=[query],
    model="embed-v4",
    input_type="search_query",
).embeddings[0]

# Search across all languages
similarities = cosine_similarity([query_embed], doc_embeds)
# High similarity to all three documents despite the language differences
```
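The `cosine_similarity` helper used above typically comes from scikit-learn (`sklearn.metrics.pairwise`). If you would rather avoid that dependency, a minimal NumPy equivalent behaves the same for this use case:

```python
import numpy as np

def cosine_similarity(a, b):
    """Pairwise cosine similarity between rows of a (m, d) and b (n, d) -> (m, n)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T

# Identical vectors score 1.0; orthogonal vectors score 0.0.
sims = cosine_similarity([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])
```

Drop-in for the scikit-learn call: pass the query embedding as a one-row matrix and the document embeddings as the second argument.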

Use cases

1. Cross-lingual customer support

Index support docs in multiple languages, enable search in user's preferred language.

2. Multilingual knowledge bases

Companies with global teams can search unified knowledge base regardless of document language.

3. International e-commerce

Product search works across localized descriptions (search in English, find products described in Chinese/Spanish).

Pricing

| Model | Price ($/M tokens) | Dimensions | Languages |
| --- | --- | --- | --- |
| Cohere Embed V4 | $0.10 | 1024 | 100+ |
| Cohere Embed V3 | $0.10 | 1024 | 100 |
| OpenAI text-embedding-3-small | $0.02 | 1536 | ~40 |
| OpenAI text-embedding-3-large | $0.13 | 3072 | ~40 |

Value proposition: better multilingual support than OpenAI at a competitive price.

Migration from V3

Breaking changes: none. V4 is a drop-in API replacement, though existing V3 vectors must be re-embedded because the two models use different embedding spaces (see FAQs).

Recommended approach:

  1. Re-embed knowledge base with V4
  2. Run parallel testing (V3 vs V4 retrieval accuracy)
  3. Cutover once validated

Timeline: 2-3 days for most applications
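Step 2 of the migration, parallel testing, can be scripted with a small evaluation harness. A minimal sketch using recall@k over a labeled query set; the toy vectors below stand in for real V3 and V4 embeddings, and `recall_at_k` is a hypothetical helper, not part of the Cohere SDK:

```python
import numpy as np

def recall_at_k(query_vecs, doc_vecs, relevant_doc_ids, k=3):
    """Fraction of queries whose relevant document appears in the top-k results."""
    q = np.asarray(query_vecs, dtype=np.float64)
    d = np.asarray(doc_vecs, dtype=np.float64)
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    sims = q @ d.T                            # (num_queries, num_docs)
    topk = np.argsort(-sims, axis=1)[:, :k]   # indices of the k best docs per query
    hits = [rel in row for rel, row in zip(relevant_doc_ids, topk)]
    return sum(hits) / len(hits)

# Toy example: query 0 should retrieve doc 0, query 1 should retrieve doc 2.
queries = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
docs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.1, 0.0, 0.9]]
score = recall_at_k(queries, docs, relevant_doc_ids=[0, 2], k=1)  # 1.0
```

Run the same labeled query set against your V3 index and a re-embedded V4 index, and cut over once V4's recall matches or beats V3 on your own data.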

Test Cohere Embed V4 in the playground with multilingual queries.

FAQs

Can I mix V3 and V4 embeddings?

No. V3 and V4 produce incompatible embedding spaces, so you must either migrate fully or maintain separate indexes.

Does it work with pgvector/Pinecone?

Yes. V4 outputs standard dense vectors compatible with all major vector databases.

How does cross-lingual retrieval work?

The embeddings are trained on parallel corpora, so semantically similar text in different languages maps to nearby vectors.

Is there a self-hosted option?

No, API-only currently.

Summary

Cohere Embed V4 expands multilingual support to 100+ languages with improved retrieval accuracy and reduced dimensionality. Best for global RAG systems requiring cross-lingual search. OpenAI remains cheaper for English-only use cases.
