license: cc-by-nc-4.0
language:
- multilingual
base_model:
- Qwen/Qwen3-0.6B-Base
tags:
- feature-extraction
- mteb
- sentence-transformers
library_name: transformers

Jina AI: Your Search Foundation, Supercharged!

jina-embeddings-v5-text-small

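jina-embeddings-v5-text-small is Jina AI's fifth-generation multilingual embedding model, released on February 18, 2026. For a lighter-weight option, see jina-embeddings-v5-text-nano (239M parameters).

Elastic Inference Service | ArXiv | Release Note | Blog

Model Overview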

[Figure: jina-embeddings-v5-text architecture]

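jina-embeddings-v5-text-small has 677M parameters. It scores an average of 71.7 on MTEB English v2 and 67.7 on MMTEB, which places it among the strongest multilingual embedding models under 1B parameters. The model is built on Qwen3-0.6B-Base and trained by combining embedding distillation from Qwen3-Embedding-4B with task-specific contrastive losses. It supports 119+ languages and sequences of up to 32K tokens, and it produces embeddings that remain robust under truncation and binary quantization.

It is part of the jina-embeddings-v5-text model family, which also includes jina-embeddings-v5-text-nano, a smaller model for resource-constrained use cases.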
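To make the binary quantization claim concrete, here is a minimal sketch of what quantizing an embedding to one bit per dimension can look like. It assumes sign-based thresholding at zero and Hamming distance for comparison. The model card does not specify the exact scheme, and the random vectors below only stand in for real model output.

```python
import numpy as np


def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Pack the sign of each dimension into bits (1 bit per dimension)."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)


def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Count the differing bits between two packed binary codes."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


# Stand-in data: random vectors in place of real 1024-dimensional embeddings.
rng = np.random.default_rng(0)
base = rng.standard_normal(1024).astype(np.float32)
near = base + 0.1 * rng.standard_normal(1024).astype(np.float32)  # small perturbation
far = rng.standard_normal(1024).astype(np.float32)  # unrelated vector

codes = binarize(np.stack([base, near, far]))
print(codes.shape)  # (3, 128): 1024 bits packed into 128 bytes per vector
print(hamming_distance(codes[0], codes[1]), hamming_distance(codes[0], codes[2]))
```

Each vector shrinks from 4096 bytes (float32) to 128 bytes, and the slightly perturbed vector stays much closer in Hamming distance than the unrelated one.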

Feature	Value
Parameters	677M
Supported Tasks	`retrieval`, `text-matching`, `clustering`, `classification`
Max Sequence Length	32768
Embedding Dimension	1024
Matryoshka Dimensions	32, 64, 128, 256, 512, 768, 1024
Pooling Strategy	Last-token pooling
Base Model	Qwen/Qwen3-0.6B-Base
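The Matryoshka dimensions listed in the table mean that a prefix of the full 1024-dimensional vector can be used as a smaller embedding. The sketch below illustrates this on stand-in data, assuming the usual practice of slicing the leading components and re-normalizing to unit length. Whether the model's own `truncate_dim` option re-normalizes is not stated in this card, and `truncate_embeddings` is a hypothetical helper, not part of the model's API.

```python
import numpy as np

# Dimensions listed in the table above.
MATRYOSHKA_DIMS = (32, 64, 128, 256, 512, 768, 1024)


def truncate_embeddings(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and L2-normalize the result."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}, got {dim}")
    truncated = embeddings[..., :dim]
    norms = np.linalg.norm(truncated, axis=-1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)  # avoid division by zero


# Stand-in data: random vectors in place of real model output.
rng = np.random.default_rng(0)
full = rng.standard_normal((4, 1024)).astype(np.float32)
small = truncate_embeddings(full, 256)
print(small.shape)  # (4, 256)
```

Smaller dimensions reduce storage and speed up similarity search, at some cost in accuracy.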

Training and Evaluation

For training details and evaluation results, see our technical report.

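Usage

Requirements

The following Python packages are required:

transformers>=4.57.0
torch>=2.8.0
peft>=0.15.2

Optional / Recommended

flash-attention: installing flash-attention is recommended for faster, more efficient inference, but it is not required.
sentence-transformers: install this package as well if you want to use the model through the sentence-transformers interface.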

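Via Elastic Inference Service

The fastest way to use v5-text in production. Elastic Inference Service (EIS) provides managed embedding inference with built-in scaling, so you can generate embeddings directly within your Elastic deployment.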

PUT _inference/text_embedding/jina-v5
{
  "service": "elastic",
  "service_settings": {
    "model_id": "jina-embeddings-v5-text-small"
  }
}

See the Elastic Inference Service documentation for setup details.

Via Jina AI Embeddings API

curl https://api.jina.ai/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $JINA_AI_API_TOKEN" \
  -d @- <<EOFEOF
  {
    "model": "jina-embeddings-v5-text-small",
    "task": "text-matching",
    "input": [
        {
            "text": "غروب جميل على الشاطئ"
        },
        {
            "text": "海滩上美丽的日落"
        },
        {
            "text": "A beautiful sunset over the beach"
        },
        {
            "text": "Un beau coucher de soleil sur la plage"
        },
        {
            "text": "Ein wunderschöner Sonnenuntergang am Strand"
        },
        {
            "text": "Ένα όμορφο ηλιοβασίλεμα πάνω από την παραλία"
        },
        {
            "text": "समुद्र तट पर एक खूबसूरत सूर्यास्त"
        },
        {
            "text": "Un bellissimo tramonto sulla spiaggia"
        },
        {
            "text": "浜辺に沈む美しい夕日"
        },
        {
            "text": "해변 위로 아름다운 일몰"
        },
        {
            "image": "https://i.ibb.co/nQNGqL0/beach1.jpg"
        },
        {
            "image": "https://i.ibb.co/r5w8hG8/beach2.jpg"
        }
    ]
  }
EOFEOF
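For callers who prefer Python over curl, the same request can be sketched with the standard library alone. This version sends only text inputs and returns the parsed JSON response unchanged, because the response schema is not described here. `build_request` and `embed` are hypothetical helper names introduced for illustration.

```python
import json
import os
import urllib.request

API_URL = "https://api.jina.ai/v1/embeddings"


def build_request(texts: list[str], task: str, token: str) -> urllib.request.Request:
    """Build the POST request with the same headers and body shape as the curl call."""
    payload = {
        "model": "jina-embeddings-v5-text-small",
        "task": task,
        "input": [{"text": t} for t in texts],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def embed(texts: list[str], task: str = "text-matching") -> dict:
    """Send the request and return the parsed JSON response as-is."""
    token = os.environ["JINA_AI_API_TOKEN"]  # same variable as in the curl example
    with urllib.request.urlopen(build_request(texts, task, token)) as resp:
        return json.load(resp)
```

Calling `embed(["A beautiful sunset over the beach"])` performs the network request. Inspect the returned dictionary to locate the embedding vectors.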

Via transformers

from transformers import AutoModel
import torch

model = AutoModel.from_pretrained(
    "jinaai/jina-embeddings-v5-text-small",
    trust_remote_code=True,
    _attn_implementation="flash_attention_2",  # Recommended but optional
    dtype=torch.bfloat16,  # Recommended for GPUs
)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device=device)

# Optional: set truncate_dim and max_length in encode() to control embedding size and input length

# ========================
# 1. Retrieval Task
# ========================
# Encode query
query_embeddings = model.encode(
    texts=["Overview of climate change impacts on coastal cities"],
    task="retrieval",
    prompt_name="query",
)
# Encode document
document_embeddings = model.encode(
    texts=[
        "Climate change has led to rising sea levels, increased frequency of extreme weather events..."
    ],
    task="retrieval",
    prompt_name="document",
)

# ========================
# 2. Text Matching Task
# ========================
texts = [
    "غروب جميل على الشاطئ",  # Arabic
    "海滩上美丽的日落",  # Chinese
    "Un beau coucher de soleil sur la plage",  # French
    "Ein wunderschöner Sonnenuntergang am Strand",  # German
    "Ένα όμορφο ηλιοβασίλεμα πάνω από την παραλία",  # Greek
    "समुद्र तट पर एक खूबसूरत सूर्यास्त",  # Hindi
    "Un bellissimo tramonto sulla spiaggia",  # Italian
    "浜辺に沈む美しい夕日",  # Japanese
    "해변 위로 아름다운 일몰",  # Korean
]
text_embeddings = model.encode(texts=texts, task="text-matching")

# ========================
# 3. Classification Task
# ========================
texts = [
    "My order hasn't arrived yet and it's been two weeks.",
    "How do I reset my password?",
    "I'd like a refund for my recent purchase.",
    "Your product exceeded my expectations. Great job!",
]
classification_embeddings = model.encode(texts=texts, task="classification")

# ========================
# 4. Clustering Task
# ========================
texts = [
    "We propose a novel neural network architecture for image segmentation.",
    "This paper analyzes the effects of monetary policy on inflation.",
    "Our method achieves state-of-the-art results on object detection benchmarks.",
    "We study the relationship between interest rates and housing prices.",
    "A new attention mechanism is introduced for visual recognition tasks.",
]
clustering_embeddings = model.encode(texts=texts, task="clustering")
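Once query and document embeddings are available, retrieval usually comes down to ranking documents by similarity to the query. The following sketch assumes cosine similarity as the scoring function and uses toy 3-dimensional vectors in place of real embeddings. To apply it to model output, first convert the result of `encode()` to a float NumPy array. How to do that depends on the return type, which is not stated here.

```python
import numpy as np


def cosine_scores(query: np.ndarray, documents: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and each document row."""
    q = query / np.linalg.norm(query)
    d = documents / np.linalg.norm(documents, axis=1, keepdims=True)
    return d @ q


def rank_documents(query: np.ndarray, documents: np.ndarray) -> np.ndarray:
    """Document indices ordered from most to least similar."""
    return np.argsort(-cosine_scores(query, documents))


# Toy 3-dimensional vectors standing in for real embeddings.
query = np.array([1.0, 0.0, 0.0])
documents = np.array([
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # close to the query
    [-1.0, 0.0, 0.0],  # opposite direction
])
print(rank_documents(query, documents))  # [1 0 2]
```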

Via sentence-transformers

from sentence_transformers import SentenceTransformer
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = SentenceTransformer(
    "jinaai/jina-embeddings-v5-text-small",
    trust_remote_code=True,
    device=device,
    model_kwargs={"dtype": torch.bfloat16},  # Recommended for GPUs
    config_kwargs={"_attn_implementation": "flash_attention_2"},  # Recommended but optional
)

# Optional: set truncate_dim in encode() to control embedding size

# ========================
# 1. Retrieval Task
# ========================
# Encode query
query_embeddings = model.encode(
    sentences=["Overview of climate change impacts on coastal cities"],
    task="retrieval",
    prompt_name="query",
)
# Encode document
document_embeddings = model.encode(
    sentences=[
        "Climate change has led to rising sea levels, increased frequency of extreme weather events..."
    ],
    task="retrieval",
    prompt_name="document",
)

# ========================
# 2. Text Matching Task
# ========================
texts = [
    "غروب جميل على الشاطئ",  # Arabic
    "海滩上美丽的日落",  # Chinese
    "Un beau coucher de soleil sur la plage",  # French
    "Ein wunderschöner Sonnenuntergang am Strand",  # German
    "Ένα όμορφο ηλιοβασίλεμα πάνω από την παραλία",  # Greek
    "समुद्र तट पर एक खूबसूरत सूर्यास्त",  # Hindi
    "Un bellissimo tramonto sulla spiaggia",  # Italian
    "浜辺に沈む美しい夕日",  # Japanese
    "해변 위로 아름다운 일몰",  # Korean
]
text_embeddings = model.encode(sentences=texts, task="text-matching")

# ========================
# 3. Classification Task
# ========================
texts = [
    "My order hasn't arrived yet and it's been two weeks.",
    "How do I reset my password?",
    "I'd like a refund for my recent purchase.",
    "Your product exceeded my expectations. Great job!",
]
classification_embeddings = model.encode(sentences=texts, task="classification")

# ========================
# 4. Clustering Task
# ========================
texts = [
    "We propose a novel neural network architecture for image segmentation.",
    "This paper analyzes the effects of monetary policy on inflation.",
    "Our method achieves state-of-the-art results on object detection benchmarks.",
    "We study the relationship between interest rates and housing prices.",
    "A new attention mechanism is introduced for visual recognition tasks.",
]
clustering_embeddings = model.encode(sentences=texts, task="clustering")
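Embeddings produced with `task="classification"` are typically fed into a lightweight downstream classifier. As one possible illustration, the sketch below implements a nearest-centroid classifier over cosine similarity. This is an assumed downstream technique, not something the model card prescribes. The 2-dimensional vectors and the labels are toy stand-ins for real embeddings and categories.

```python
import numpy as np


def normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit L2 norm."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def fit_centroids(embeddings, labels):
    """Average the normalized embeddings of each label into one centroid."""
    embeddings = normalize(np.asarray(embeddings, dtype=np.float64))
    classes = sorted(set(labels))
    centroids = np.stack([
        embeddings[[i for i, label in enumerate(labels) if label == c]].mean(axis=0)
        for c in classes
    ])
    return classes, normalize(centroids)


def predict(embeddings, classes, centroids):
    """Assign each embedding the label of its most similar centroid."""
    scores = normalize(np.asarray(embeddings, dtype=np.float64)) @ centroids.T
    return [classes[i] for i in scores.argmax(axis=1)]


# Toy 2-dimensional vectors standing in for real classification embeddings.
train = np.array([[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.8]])
labels = ["complaint", "complaint", "praise", "praise"]
classes, centroids = fit_centroids(train, labels)
print(predict(np.array([[0.8, 0.2], [0.2, 0.9]]), classes, centroids))
# ['complaint', 'praise']
```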

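Via vLLM

We provide a separate model version for each task (retrieval, text-matching, classification, clustering). In each of these models, the task-specific adapter is merged into the base model weights, which simplifies compatibility with vLLM.

Usage instructions and examples for each task are available in the corresponding model repositories.

Via Text Embeddings Inference

We provide a separate model version for each task (retrieval, text-matching, classification, clustering). In each of these models, the task-specific adapter is merged into the base model weights, which simplifies compatibility with Text Embeddings Inference.

Usage instructions and examples for each task are available in the corresponding model repositories.

Via ONNX and Optimum

We provide a separate model version for each task (retrieval, text-matching, classification, clustering). In each of these models, the task-specific adapter is merged into the base model weights, which makes inference with ONNX Runtime and Hugging Face Optimum possible.

ONNX-format weights are provided in the onnx subfolder of each model repository. Usage instructions and examples for each task are available in the corresponding model repositories.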

License

jina-embeddings-v5-text-small is licensed under CC BY-NC 4.0. For commercial use, please contact us.

Citation

If you find jina-embeddings-v5-text-small useful in your research, please cite the following paper:

@article{akram2026jina,
  title={jina-embeddings-v5-text: Task-Targeted Embedding Distillation},
  author={Mohammad Kalim Akram and Saba Sturua and Nastia Havriushenko and Quentin Herreros and Michael G{\"u}nther and Maximilian Werk and Han Xiao},
  journal={arXiv preprint arXiv:2602.15547},
  year={2026}
}