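Jina AI's fifth-generation multilingual embedding model: 677M parameters, an MTEB English average of 71.7, support for 119+ languages and 32K-token inputs. It targets retrieval, text matching, clustering, and classification, and produces robust embeddings with multiple output dimensions. [This summary was AI-generated]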
license: cc-by-nc-4.0
language:
- multilingual
base_model:
- Qwen/Qwen3-0.6B-Base
tags:
- feature-extraction
- mteb
- sentence-transformers
library_name: transformers
jina-embeddings-v5-text-small
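jina-embeddings-v5-text-small is Jina AI's fifth-generation multilingual embedding model, released on February 18, 2026. For a lighter-weight option, see jina-embeddings-v5-text-nano (239M parameters).
Elastic Inference Service | ArXiv | Release Note | Blog
Model Overview
jina-embeddings-v5-text-small has 677M parameters and scores an average of 71.7 on MTEB English v2 and 67.7 on MMTEB, placing it among the strongest multilingual embedding models under 1B parameters. It is built on Qwen3-0.6B-Base and trained by combining embedding distillation from Qwen3-Embedding-4B with task-specific contrastive losses. It supports 119+ languages and sequences of up to 32K tokens, and it produces embeddings that remain robust under truncation and binary quantization.
It is part of the jina-embeddings-v5-text model family, which also includes jina-embeddings-v5-text-nano, a smaller model for resource-constrained scenarios.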
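The robustness to binary quantization mentioned above can be illustrated with a minimal NumPy sketch. It assumes the common sign-based scheme (one bit per dimension, positive values map to 1) and compares codes by Hamming distance; this may differ from how the model was evaluated. The helper names `binarize` and `hamming_distance` are hypothetical, and the random vectors are stand-ins for real model embeddings.

```python
import numpy as np


def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Pack the sign of each dimension into bits (1024 floats -> 128 bytes)."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)


def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Count the bits that differ between two packed codes."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


# Random stand-ins for real embeddings (assumption: 1024-dimensional output).
rng = np.random.default_rng(0)
anchor = rng.standard_normal(1024)
near = anchor + 0.1 * rng.standard_normal(1024)  # small perturbation of anchor
far = rng.standard_normal(1024)                  # unrelated vector

codes = binarize(np.stack([anchor, near, far]))
print(codes.shape)  # (3, 128)
print(hamming_distance(codes[0], codes[1]), hamming_distance(codes[0], codes[2]))
```

The packed codes take 32 times less space than float32 vectors, and the perturbed vector stays much closer to the anchor in Hamming distance than the unrelated one.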
| Feature | Value |
|---|---|
| Parameters | 677M |
| Supported Tasks | retrieval, text-matching, clustering, classification |
| Max Sequence Length | 32768 |
| Embedding Dimension | 1024 |
| Matryoshka Dimensions | 32, 64, 128, 256, 512, 768, 1024 |
| Pooling Strategy | Last-token pooling |
| Base Model | Qwen/Qwen3-0.6B-Base |
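As a rough illustration of the Matryoshka dimensions in the table, the sketch below truncates full-size vectors to a smaller dimension and rescales them to unit length. It assumes that keeping the leading components and re-normalizing is an acceptable way to shorten an embedding; the function name `truncate_embeddings` is hypothetical, and the random matrix stands in for real model output.

```python
import numpy as np

# Dimensions listed in the table above.
MATRYOSHKA_DIMS = (32, 64, 128, 256, 512, 768, 1024)


def truncate_embeddings(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and rescale each row to unit length."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}, got {dim}")
    truncated = embeddings[..., :dim]
    norms = np.linalg.norm(truncated, axis=-1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)


# Random stand-in for a batch of four 1024-dimensional embeddings.
full = np.random.default_rng(0).standard_normal((4, 1024))
small = truncate_embeddings(full, 256)
print(small.shape)  # (4, 256)
```

Smaller dimensions trade some accuracy for lower storage and faster similarity search, so the right choice depends on the index size and latency budget.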

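Training and Evaluation
For training details and evaluation results, see our technical report.
Usage
Requirements
The following Python packages are required:
- transformers>=4.57.0
- torch>=2.8.0
- peft>=0.15.2
Optional / Recommended
- flash-attention: Installing flash-attention is recommended for faster, more efficient inference, but it is not required.
- sentence-transformers: Install this package as well if you want to use the model through the sentence-transformers interface.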
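To check an environment against the required versions listed above, a small standard-library script along these lines may help. It is only a sketch: the helpers `parse_version` and `check_requirements` are hypothetical, and the version comparison is deliberately simple (it looks at the first three numeric components and ignores pre-release tags).

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the requirements list above.
REQUIRED = {"transformers": "4.57.0", "torch": "2.8.0", "peft": "0.15.2"}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.8.0+cu128' into (2, 8, 0); local and pre-release tags are ignored."""
    parts = []
    for piece in text.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def check_requirements(required: dict[str, str] = REQUIRED) -> dict[str, str]:
    """Return package name -> problem description; empty when everything is fine."""
    problems = {}
    for name, minimum in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems[name] = f"not installed (need >={minimum})"
            continue
        if parse_version(installed) < parse_version(minimum):
            problems[name] = f"{installed} installed (need >={minimum})"
    return problems


if __name__ == "__main__":
    for name, problem in check_requirements().items():
        print(f"{name}: {problem}")
```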
via Elastic Inference Service
The fastest way to use v5-text in production. Elastic Inference Service (EIS) provides managed embedding inference with built-in scaling, so you can generate embeddings directly within your Elastic deployment.
PUT _inference/text_embedding/jina-v5
{
  "service": "elastic",
  "service_settings": {
    "model_id": "jina-embeddings-v5-text-small"
  }
}
For setup details, see the Elastic Inference Service documentation.
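Once the `jina-v5` endpoint above exists, embeddings are requested by POSTing text to it. The sketch below only builds such a request with the standard library and does not send it. The base URL and API key are placeholders, the function name `build_inference_request` is hypothetical, and the request body (`{"input": [...]}`) and `ApiKey` header follow the general Elasticsearch inference API as an assumption; consult the Elastic documentation for the authoritative format.

```python
import json
import urllib.request


def build_inference_request(
    base_url: str,
    api_key: str,
    texts: list[str],
    inference_id: str = "jina-v5",  # the endpoint name created by the PUT call above
) -> urllib.request.Request:
    """Build (but do not send) a text_embedding request for an Elastic deployment."""
    url = f"{base_url.rstrip('/')}/_inference/text_embedding/{inference_id}"
    body = json.dumps({"input": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"ApiKey {api_key}",
        },
    )


# Placeholder deployment URL and key (assumptions, replace with your own).
request = build_inference_request(
    "https://localhost:9200",
    "<your-api-key>",
    ["A beautiful sunset over the beach"],
)
print(request.full_url)
# To send it: urllib.request.urlopen(request), then parse the JSON response.
```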
via Jina AI Embeddings API
curl https://api.jina.ai/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $JINA_AI_API_TOKEN" \
  -d @- <<EOFEOF
  {
    "model": "jina-embeddings-v5-text-small",
    "task": "text-matching",
    "input": [
      {
        "text": "غروب جميل على الشاطئ"
      },
      {
        "text": "海滩上美丽的日落"
      },
      {
        "text": "A beautiful sunset over the beach"
      },
      {
        "text": "Un beau coucher de soleil sur la plage"
      },
      {
        "text": "Ein wunderschöner Sonnenuntergang am Strand"
      },
      {
        "text": "Ένα όμορφο ηλιοβασίλεμα πάνω από την παραλία"
      },
      {
        "text": "समुद्र तट पर एक खूबसूरत सूर्यास्त"
      },
      {
        "text": "Un bellissimo tramonto sulla spiaggia"
      },
      {
        "text": "浜辺に沈む美しい夕日"
      },
      {
        "text": "해변 위로 아름다운 일몰"
      },
      {
        "image": "https://i.ibb.co/nQNGqL0/beach1.jpg"
      },
      {
        "image": "https://i.ibb.co/r5w8hG8/beach2.jpg"
      }
    ]
  }
EOFEOF
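The same request can be assembled from Python. The sketch below mirrors the URL, headers, and payload fields of the curl call for text inputs only, and builds the request without sending it. The function name `build_embedding_request` is hypothetical; reading the token from the `JINA_AI_API_TOKEN` environment variable follows the curl example.

```python
import json
import os
import urllib.request
from typing import Optional

JINA_EMBEDDINGS_URL = "https://api.jina.ai/v1/embeddings"


def build_embedding_request(
    texts: list[str],
    task: str = "text-matching",
    token: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) an embeddings request matching the curl example."""
    if token is None:
        token = os.environ.get("JINA_AI_API_TOKEN", "")
    payload = {
        "model": "jina-embeddings-v5-text-small",
        "task": task,
        "input": [{"text": text} for text in texts],
    }
    return urllib.request.Request(
        JINA_EMBEDDINGS_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


request = build_embedding_request(
    ["A beautiful sunset over the beach", "Un beau coucher de soleil sur la plage"],
    token="<your-token>",
)
print(json.loads(request.data)["task"])  # text-matching
# To send it: urllib.request.urlopen(request), then parse the JSON response.
```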
via transformers
from transformers import AutoModel
import torch

model = AutoModel.from_pretrained(
    "jinaai/jina-embeddings-v5-text-small",
    trust_remote_code=True,
    _attn_implementation="flash_attention_2",  # Recommended but optional
    dtype=torch.bfloat16,  # Recommended for GPUs
)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device=device)

# Optional: set truncate_dim and max_length in encode() to control embedding size and input length

# ========================
# 1. Retrieval Task
# ========================
# Encode query
query_embeddings = model.encode(
    texts=["Overview of climate change impacts on coastal cities"],
    task="retrieval",
    prompt_name="query",
)
# Encode document
document_embeddings = model.encode(
    texts=[
        "Climate change has led to rising sea levels, increased frequency of extreme weather events..."
    ],
    task="retrieval",
    prompt_name="document",
)

# ========================
# 2. Text Matching Task
# ========================
texts = [
    "غروب جميل على الشاطئ",  # Arabic
    "海滩上美丽的日落",  # Chinese
    "Un beau coucher de soleil sur la plage",  # French
    "Ein wunderschöner Sonnenuntergang am Strand",  # German
    "Ένα όμορφο ηλιοβασίλεμα πάνω από την παραλία",  # Greek
    "समुद्र तट पर एक खूबसूरत सूर्यास्त",  # Hindi
    "Un bellissimo tramonto sulla spiaggia",  # Italian
    "浜辺に沈む美しい夕日",  # Japanese
    "해변 위로 아름다운 일몰",  # Korean
]
text_embeddings = model.encode(texts=texts, task="text-matching")

# ========================
# 3. Classification Task
# ========================
texts = [
    "My order hasn't arrived yet and it's been two weeks.",
    "How do I reset my password?",
    "I'd like a refund for my recent purchase.",
    "Your product exceeded my expectations. Great job!",
]
classification_embeddings = model.encode(texts=texts, task="classification")

# ========================
# 4. Clustering Task
# ========================
texts = [
    "We propose a novel neural network architecture for image segmentation.",
    "This paper analyzes the effects of monetary policy on inflation.",
    "Our method achieves state-of-the-art results on object detection benchmarks.",
    "We study the relationship between interest rates and housing prices.",
    "A new attention mechanism is introduced for visual recognition tasks.",
]
clustering_embeddings = model.encode(texts=texts, task="clustering")
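Once query and document embeddings are available, retrieval usually comes down to ranking documents by cosine similarity. The following is a minimal sketch of that step, not part of the model's API: `rank_documents` is a hypothetical helper, and the toy 3-dimensional vectors stand in for real embeddings. If `encode()` returns torch tensors in bfloat16, converting them first (for example with `.float().cpu().numpy()`) is assumed.

```python
import numpy as np


def rank_documents(query_embedding, document_embeddings, top_k=3):
    """Return (index, cosine similarity) pairs for the best-matching documents."""
    query = np.asarray(query_embedding, dtype=np.float32).reshape(-1)
    docs = np.asarray(document_embeddings, dtype=np.float32)
    # Normalize so that the dot product equals cosine similarity.
    query = query / np.linalg.norm(query)
    docs = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    scores = docs @ query
    order = np.argsort(-scores)[:top_k]
    return [(int(i), float(scores[i])) for i in order]


# Toy vectors standing in for real model outputs.
query = [1.0, 0.0, 0.0]
documents = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
for index, score in rank_documents(query, documents, top_k=2):
    print(index, round(score, 3))
```

Here document 1 ranks first and document 2 second, because their directions are closest to the query vector.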
via sentence-transformers
from sentence_transformers import SentenceTransformer
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SentenceTransformer(
    "jinaai/jina-embeddings-v5-text-small",
    trust_remote_code=True,
    device=device,
    model_kwargs={"dtype": torch.bfloat16},  # Recommended for GPUs
    config_kwargs={"_attn_implementation": "flash_attention_2"},  # Recommended but optional
)

# Optional: set truncate_dim in encode() to control embedding size

# ========================
# 1. Retrieval Task
# ========================
# Encode query
query_embeddings = model.encode(
    sentences=["Overview of climate change impacts on coastal cities"],
    task="retrieval",
    prompt_name="query",
)
# Encode document
document_embeddings = model.encode(
    sentences=[
        "Climate change has led to rising sea levels, increased frequency of extreme weather events..."
    ],
    task="retrieval",
    prompt_name="document",
)

# ========================
# 2. Text Matching Task
# ========================
texts = [
    "غروب جميل على الشاطئ",  # Arabic
    "海滩上美丽的日落",  # Chinese
    "Un beau coucher de soleil sur la plage",  # French
    "Ein wunderschöner Sonnenuntergang am Strand",  # German
    "Ένα όμορφο ηλιοβασίλεμα πάνω από την παραλία",  # Greek
    "समुद्र तट पर एक खूबसूरत सूर्यास्त",  # Hindi
    "Un bellissimo tramonto sulla spiaggia",  # Italian
    "浜辺に沈む美しい夕日",  # Japanese
    "해변 위로 아름다운 일몰",  # Korean
]
text_embeddings = model.encode(sentences=texts, task="text-matching")

# ========================
# 3. Classification Task
# ========================
texts = [
    "My order hasn't arrived yet and it's been two weeks.",
    "How do I reset my password?",
    "I'd like a refund for my recent purchase.",
    "Your product exceeded my expectations. Great job!",
]
classification_embeddings = model.encode(sentences=texts, task="classification")

# ========================
# 4. Clustering Task
# ========================
texts = [
    "We propose a novel neural network architecture for image segmentation.",
    "This paper analyzes the effects of monetary policy on inflation.",
    "Our method achieves state-of-the-art results on object detection benchmarks.",
    "We study the relationship between interest rates and housing prices.",
    "A new attention mechanism is introduced for visual recognition tasks.",
]
clustering_embeddings = model.encode(sentences=texts, task="clustering")
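The clustering embeddings still need a clustering algorithm to turn them into groups. The sketch below is one possible way to do that: a small spherical k-means in NumPy with a deterministic farthest-point initialisation. It is not part of the model or of sentence-transformers; `cluster_embeddings` is a hypothetical helper, and the toy 2-dimensional vectors mimic the five abstracts above (items 0, 2, 4 on computer vision, items 1 and 3 on economics).

```python
import numpy as np


def cluster_embeddings(embeddings, k, iterations=10):
    """Group vectors into k clusters by cosine similarity (spherical k-means)."""
    data = np.asarray(embeddings, dtype=np.float64)
    data = data / np.linalg.norm(data, axis=1, keepdims=True)

    # Farthest-point initialisation: start from item 0, then repeatedly add
    # the item that is least similar to every centroid chosen so far.
    chosen = [0]
    while len(chosen) < k:
        similarity_to_chosen = (data @ data[chosen].T).max(axis=1)
        chosen.append(int(np.argmin(similarity_to_chosen)))
    centroids = data[chosen].copy()

    labels = np.zeros(len(data), dtype=int)
    for _ in range(iterations):
        # Assign each item to its most similar centroid.
        labels = np.argmax(data @ centroids.T, axis=1)
        # Move each centroid to the normalized mean of its members.
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:
                mean = members.mean(axis=0)
                centroids[j] = mean / np.linalg.norm(mean)
    return labels.tolist()


# Toy vectors standing in for the five clustering embeddings.
toy = [[1.0, 0.1], [0.1, 1.0], [0.9, 0.2], [0.2, 0.9], [0.95, 0.05]]
print(cluster_embeddings(toy, k=2))  # [0, 1, 0, 1, 0]
```

For real workloads, a library implementation such as scikit-learn's `KMeans` on normalized embeddings is a more robust choice than this hand-rolled loop.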
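via vLLM
We provide a separate model version for each task (retrieval, text-matching, classification, clustering).
In each of these models, the task-specific adapter has been merged into the base model weights.
This modification makes the models compatible with vLLM.
Usage instructions and examples for each task are available in the corresponding model repositories.
via Text Embeddings Inference
We provide a separate model version for each task (retrieval, text-matching, classification, clustering).
In each of these models, the task-specific adapter has been merged into the base model weights.
This modification makes the models compatible with Text Embeddings Inference.
Usage instructions and examples for each task are available in the corresponding model repositories.
via ONNX and Optimum
We provide a separate model version for each task (retrieval, text-matching, classification, clustering).
In each of these models, the task-specific adapter has been merged into the base model weights.
This makes inference possible with ONNX Runtime and Hugging Face Optimum.
ONNX-format weights are provided in the onnx subfolder of each model repository.
Usage instructions and examples for each task are available in the corresponding model repositories.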
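When running a raw ONNX graph, the pooling step may need to be done by hand. The sketch below shows last-token pooling, the strategy listed in the model overview, in plain NumPy. It assumes the graph returns per-token hidden states of shape (batch, sequence, hidden) together with the usual attention mask; if the exported graph already returns pooled embeddings, this step is unnecessary. The helper name `last_token_pool` is hypothetical, and the model repositories remain the authoritative reference.

```python
import numpy as np


def last_token_pool(hidden_states, attention_mask):
    """Select the hidden state of the last real token and L2-normalize it."""
    hidden_states = np.asarray(hidden_states, dtype=np.float32)
    attention_mask = np.asarray(attention_mask)
    batch_index = np.arange(hidden_states.shape[0])
    if attention_mask[:, -1].all():
        # Left padding (or no padding): the final position is always a real token.
        last_positions = np.full(len(batch_index), hidden_states.shape[1] - 1)
    else:
        # Right padding: the last real token sits at (number of real tokens - 1).
        last_positions = attention_mask.sum(axis=1) - 1
    pooled = hidden_states[batch_index, last_positions]
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)


# Toy batch: two sequences of length 3 with hidden size 2; the first is right-padded.
hidden = [
    [[1.0, 0.0], [0.0, 2.0], [9.0, 9.0]],  # last real token is position 1
    [[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]],  # last real token is position 2
]
mask = [[1, 1, 0], [1, 1, 1]]
print(last_token_pool(hidden, mask))  # rows [0, 1] and [0.6, 0.8]
```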
License
jina-embeddings-v5-text-small is licensed under CC BY-NC 4.0. For commercial use, please contact us.
Citation
If you find jina-embeddings-v5-text-small useful in your research, please cite the following paper:
@article{akram2026jina,
  title={jina-embeddings-v5-text: Task-Targeted Embedding Distillation},
  author={Mohammad Kalim Akram and Saba Sturua and Nastia Havriushenko and Quentin Herreros and Michael G{\"u}nther and Maximilian Werk and Han Xiao},
  journal={arXiv preprint arXiv:2602.15547},
  year={2026}
}