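jina-embeddings-v5-text-small: a fifth-generation multilingual embedding model supporting 119+ languages and 32K-token inputs for retrieval, matching, and related tasks

Jina AI's fifth-generation multilingual embedding model: 677M parameters, an MTEB English average of 71.7, and support for 119+ languages and 32K tokens. It is suited to retrieval, matching, clustering, and classification, and its embeddings are robust and available at multiple output dimensions. [This summary was AI-generated]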


license: cc-by-nc-4.0
language:
  • multilingual
base_model:
  • Qwen/Qwen3-0.6B-Base
tags:
  • feature-extraction
  • mteb
  • sentence-transformers
library_name: transformers



Jina AI: Your Search Foundation, Supercharged!

jina-embeddings-v5-text-small

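jina-embeddings-v5-text-small is Jina AI's fifth-generation multilingual embedding model, released on February 18, 2026. For a lighter-weight option, see jina-embeddings-v5-text-nano (239M parameters).

Elastic Inference Service | ArXiv | Release Notes | Blog

Model Overview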

jina-embeddings-v5-text Architecture

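jina-embeddings-v5-text-small has 677M parameters and scores an average of 71.7 on MTEB English v2 and 67.7 on MMTEB, placing it among the strongest multilingual embedding models under 1B parameters. It is built on Qwen3-0.6B-Base and trained by combining embedding distillation from Qwen3-Embedding-4B with task-specific contrastive losses. It supports 119+ languages and sequences of up to 32K tokens, and it produces embeddings that remain robust under truncation and binary quantization.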
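
To make the truncation and binary-quantization claims concrete, the sketch below shows one common way to apply both to an embedding that has already been computed. It is an illustration in dependency-free Python, not Jina AI's implementation: the helper names `truncate_and_normalize`, `binarize`, and `hamming_distance` and the toy 8-dimensional vector are assumptions made for this example, and thresholding at zero is a common binarization convention rather than something the model card specifies.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to rescale for an all-zero vector
    return [x / norm for x in head]


def binarize(embedding):
    """Map each component to one bit: 1 if positive, otherwise 0."""
    return [1 if x > 0 else 0 for x in embedding]


def hamming_distance(bits_a, bits_b):
    """Count the positions where two bit vectors differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Toy vector standing in for a real 1024-dimensional embedding.
full = [0.5, -0.5, 0.5, 0.5, -0.1, 0.2, -0.3, 0.4]

short = truncate_and_normalize(full, 4)  # 4-dimensional, unit length
bits = binarize(full)                    # [1, 0, 1, 1, 0, 1, 0, 1]
```

Truncated vectors are compared with cosine similarity as usual, while binarized vectors are typically compared with Hamming distance, which trades some accuracy for much smaller storage.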

jina-embeddings-v5-text-small is part of the jina-embeddings-v5-text model family, which also includes jina-embeddings-v5-text-nano, a smaller model for resource-constrained use cases.

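| Feature | Value |
|---|---|
| Parameters | 677M |
| Supported tasks | retrieval, text-matching, clustering, classification |
| Max sequence length | 32768 |
| Embedding dimension | 1024 |
| Matryoshka dimensions | 32, 64, 128, 256, 512, 768, 1024 |
| Pooling strategy | Last-token pooling |
| Base model | Qwen/Qwen3-0.6B-Base |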
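
Last-token pooling takes the hidden state of the final real (non-padding) token in each sequence as the sequence embedding. The sketch below illustrates the idea on nested Python lists; the function name `last_token_pool` and the toy inputs are assumptions for this example, and the model's own implementation operates on tensors and may differ in detail.

```python
def last_token_pool(hidden_states, attention_mask):
    """Return the hidden state of the last non-padding token per sequence.

    hidden_states: list of sequences, each a list of per-token vectors.
    attention_mask: list of sequences of 0/1 flags (1 = real token).
    Every sequence is assumed to contain at least one real token.
    """
    pooled = []
    for tokens, mask in zip(hidden_states, attention_mask):
        # Index of the last position whose mask flag is 1.
        last = max(i for i, flag in enumerate(mask) if flag == 1)
        pooled.append(tokens[last])
    return pooled


# Two toy sequences with 2-dimensional token vectors; the second is right-padded.
hidden = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[1.0, 1.1], [1.2, 1.3], [0.0, 0.0]],
]
mask = [
    [1, 1, 1],
    [1, 1, 0],
]

pooled = last_token_pool(hidden, mask)  # [[0.5, 0.6], [1.2, 1.3]]
```

Looking up the last position with a mask value of 1 gives the same result for left-padded and right-padded batches.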


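Training and Evaluation

For training details and evaluation results, see our technical report.

Usage

Requirements

The following Python packages are required: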

  • transformers>=4.57.0
  • torch>=2.8.0
  • peft>=0.15.2

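Optional / Recommended

  • flash-attention: installing flash-attention is recommended for faster and more efficient inference, but it is not required.
  • sentence-transformers: install this package as well if you want to use the model through the sentence-transformers interface.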
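
As a quick sanity check before loading the model, the following sketch compares installed package versions against the minimums listed above using only the standard library. The helper names `version_tuple` and `check_requirements` are made up for this example, and the version parsing is deliberately crude (it reads only the leading numeric components and ignores pre-release tags), so treat it as a convenience rather than a substitute for a real dependency resolver.

```python
from importlib import metadata

# Minimum versions copied from the requirements list above.
REQUIRED = {"transformers": "4.57.0", "torch": "2.8.0", "peft": "0.15.2"}


def version_tuple(version):
    """Parse leading numeric components, e.g. '2.8.0+cu128' -> (2, 8, 0)."""
    parts = []
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:  # pad so that '2.8' compares equal to '2.8.0'
        parts.append(0)
    return tuple(parts)


def check_requirements(required=REQUIRED):
    """Return {package: status}; status is 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = version_tuple(installed) >= version_tuple(minimum)
        report[name] = "ok" if ok else "too old"
    return report


if __name__ == "__main__":
    for package, status in check_requirements().items():
        print(f"{package}: {status}")
```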
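
Via Elastic Inference Service

The fastest way to use v5-text in production. Elastic Inference Service (EIS) provides managed embedding inference with built-in scaling, so you can generate embeddings directly within your Elastic deployment.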

PUT _inference/text_embedding/jina-v5
{
  "service": "elastic",
  "service_settings": {
    "model_id": "jina-embeddings-v5-text-small"
  }
}

For setup details, see the Elastic Inference Service documentation.
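
For readers who script their deployments, the sketch below builds the same endpoint-creation call from Python with the standard library, plus a follow-up request that asks the new endpoint to embed a string. The deployment URL, the API key placeholder, and the helper name `build_request` are hypothetical; the `POST` request with an `input` field follows the general Elasticsearch inference API and is not taken from this model card, so check it against the Elastic documentation for your version.

```python
import json
import urllib.request

# Hypothetical values: replace with your own deployment URL and API key.
ES_URL = "https://localhost:9200"
ES_API_KEY = "<your-api-key>"


def build_request(method, path, body):
    """Build (but do not send) a JSON request to an Elasticsearch endpoint."""
    return urllib.request.Request(
        url=f"{ES_URL}/{path}",
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"ApiKey {ES_API_KEY}",
        },
    )


# Same call as the console snippet above: create the inference endpoint.
create_endpoint = build_request(
    "PUT",
    "_inference/text_embedding/jina-v5",
    {
        "service": "elastic",
        "service_settings": {"model_id": "jina-embeddings-v5-text-small"},
    },
)

# Ask the endpoint to embed a piece of text.
embed_text = build_request(
    "POST",
    "_inference/text_embedding/jina-v5",
    {"input": "A beautiful sunset over the beach"},
)

# To send a request: urllib.request.urlopen(create_endpoint)
```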

Via Jina AI Embeddings API

curl https://api.jina.ai/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $JINA_AI_API_TOKEN" \
  -d @- <<EOFEOF
  {
    "model": "jina-embeddings-v5-text-small",
    "task": "text-matching",
    "input": [
        {
            "text": "غروب جميل على الشاطئ"
        },
        {
            "text": "海滩上美丽的日落"
        },
        {
            "text": "A beautiful sunset over the beach"
        },
        {
            "text": "Un beau coucher de soleil sur la plage"
        },
        {
            "text": "Ein wunderschöner Sonnenuntergang am Strand"
        },
        {
            "text": "Ένα όμορφο ηλιοβασίλεμα πάνω από την παραλία"
        },
        {
            "text": "समुद्र तट पर एक खूबसूरत सूर्यास्त"
        },
        {
            "text": "Un bellissimo tramonto sulla spiaggia"
        },
        {
            "text": "浜辺に沈む美しい夕日"
        },
        {
            "text": "해변 위로 아름다운 일몰"
        },
        {
            "image": "https://i.ibb.co/nQNGqL0/beach1.jpg"
        },
        {
            "image": "https://i.ibb.co/r5w8hG8/beach2.jpg"
        }
    ]
  }
EOFEOF
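
The same request can be assembled from Python. The sketch below mirrors the curl call with the standard library and only builds the request object; the function name `build_embedding_request` is an assumption for this example, and the shape of the JSON reply is not described in this model card, so the commented-out lines simply read the raw response.

```python
import json
import os
import urllib.request

API_URL = "https://api.jina.ai/v1/embeddings"


def build_embedding_request(texts, task, token):
    """Build (but do not send) a request mirroring the curl call above."""
    payload = {
        "model": "jina-embeddings-v5-text-small",
        "task": task,
        "input": [{"text": text} for text in texts],
    }
    return urllib.request.Request(
        url=API_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


request = build_embedding_request(
    texts=["海滩上美丽的日落", "A beautiful sunset over the beach"],
    task="text-matching",
    token=os.environ.get("JINA_AI_API_TOKEN", ""),
)

# To send it and read the raw JSON reply:
# with urllib.request.urlopen(request) as response:
#     reply = json.loads(response.read())
```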

Via transformers

from transformers import AutoModel
import torch

model = AutoModel.from_pretrained(
    "jinaai/jina-embeddings-v5-text-small",
    trust_remote_code=True,
    _attn_implementation="flash_attention_2",  # Recommended but optional
    dtype=torch.bfloat16,  # Recommended for GPUs
)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device=device)

# Optional: set truncate_dim and max_length in encode() to control embedding size and input length

# ========================
# 1. Retrieval Task
# ========================
# Encode query
query_embeddings = model.encode(
    texts=["Overview of climate change impacts on coastal cities"],
    task="retrieval",
    prompt_name="query",
)
# Encode document
document_embeddings = model.encode(
    texts=[
        "Climate change has led to rising sea levels, increased frequency of extreme weather events..."
    ],
    task="retrieval",
    prompt_name="document",
)

# ========================
# 2. Text Matching Task
# ========================
texts = [
    "غروب جميل على الشاطئ",  # Arabic
    "海滩上美丽的日落",  # Chinese
    "Un beau coucher de soleil sur la plage",  # French
    "Ein wunderschöner Sonnenuntergang am Strand",  # German
    "Ένα όμορφο ηλιοβασίλεμα πάνω από την παραλία",  # Greek
    "समुद्र तट पर एक खूबसूरत सूर्यास्त",  # Hindi
    "Un bellissimo tramonto sulla spiaggia",  # Italian
    "浜辺に沈む美しい夕日",  # Japanese
    "해변 위로 아름다운 일몰",  # Korean
]
text_embeddings = model.encode(texts=texts, task="text-matching")

# ========================
# 3. Classification Task
# ========================
texts = [
    "My order hasn't arrived yet and it's been two weeks.",
    "How do I reset my password?",
    "I'd like a refund for my recent purchase.",
    "Your product exceeded my expectations. Great job!",
]
classification_embeddings = model.encode(texts=texts, task="classification")

# ========================
# 4. Clustering Task
# ========================
texts = [
    "We propose a novel neural network architecture for image segmentation.",
    "This paper analyzes the effects of monetary policy on inflation.",
    "Our method achieves state-of-the-art results on object detection benchmarks.",
    "We study the relationship between interest rates and housing prices.",
    "A new attention mechanism is introduced for visual recognition tasks.",
]
clustering_embeddings = model.encode(texts=texts, task="clustering")
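
Once query and document embeddings are available, retrieval comes down to ranking documents by their similarity to the query. The sketch below does this with cosine similarity on plain Python lists. It assumes the embeddings have first been converted to lists of floats (the return type of `model.encode` is not stated here), and the helper names and toy 3-dimensional vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # 45 degrees from the query
]

ranking = rank_documents(query, docs)
best_index = ranking[0][0]  # 1
```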

Via sentence-transformers

from sentence_transformers import SentenceTransformer
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = SentenceTransformer(
    "jinaai/jina-embeddings-v5-text-small",
    trust_remote_code=True,
    device=device,
    model_kwargs={"dtype": torch.bfloat16},  # Recommended for GPUs
    config_kwargs={"_attn_implementation": "flash_attention_2"},  # Recommended but optional
)

# Optional: set truncate_dim in encode() to control embedding size

# ========================
# 1. Retrieval Task
# ========================
# Encode query
query_embeddings = model.encode(
    sentences=["Overview of climate change impacts on coastal cities"],
    task="retrieval",
    prompt_name="query",
)
# Encode document
document_embeddings = model.encode(
    sentences=[
        "Climate change has led to rising sea levels, increased frequency of extreme weather events..."
    ],
    task="retrieval",
    prompt_name="document",
)

# ========================
# 2. Text Matching Task
# ========================
texts = [
    "غروب جميل على الشاطئ",  # Arabic
    "海滩上美丽的日落",  # Chinese
    "Un beau coucher de soleil sur la plage",  # French
    "Ein wunderschöner Sonnenuntergang am Strand",  # German
    "Ένα όμορφο ηλιοβασίλεμα πάνω από την παραλία",  # Greek
    "समुद्र तट पर एक खूबसूरत सूर्यास्त",  # Hindi
    "Un bellissimo tramonto sulla spiaggia",  # Italian
    "浜辺に沈む美しい夕日",  # Japanese
    "해변 위로 아름다운 일몰",  # Korean
]
text_embeddings = model.encode(sentences=texts, task="text-matching")

# ========================
# 3. Classification Task
# ========================
texts = [
    "My order hasn't arrived yet and it's been two weeks.",
    "How do I reset my password?",
    "I'd like a refund for my recent purchase.",
    "Your product exceeded my expectations. Great job!",
]
classification_embeddings = model.encode(sentences=texts, task="classification")

# ========================
# 4. Clustering Task
# ========================
texts = [
    "We propose a novel neural network architecture for image segmentation.",
    "This paper analyzes the effects of monetary policy on inflation.",
    "Our method achieves state-of-the-art results on object detection benchmarks.",
    "We study the relationship between interest rates and housing prices.",
    "A new attention mechanism is introduced for visual recognition tasks.",
]
clustering_embeddings = model.encode(sentences=texts, task="clustering")
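
Classification embeddings are usually fed to a downstream classifier. As one simple possibility, the sketch below implements a nearest-centroid classifier on plain Python lists: each label is represented by the mean of its example embeddings, and a new embedding is assigned to the label with the highest dot product. The labels, helper names, and toy 2-dimensional vectors are hypothetical, and this is not a method prescribed by the model card; dot-product scoring also assumes the embeddings have comparable norms.

```python
def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def nearest_centroid(vector, centroids):
    """Return the label whose centroid has the largest dot product with `vector`."""
    return max(centroids, key=lambda label: dot(vector, centroids[label]))


# Toy 2-dimensional vectors standing in for classification embeddings.
labeled = {
    "complaint": [[1.0, 0.0], [0.8, 0.2]],
    "praise": [[0.0, 1.0], [0.2, 0.8]],
}
centroids = {label: centroid(vecs) for label, vecs in labeled.items()}

predicted = nearest_centroid([0.9, 0.1], centroids)  # "complaint"
```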
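
Via vLLM

We provide a separate model version for each task (retrieval, text-matching, classification, clustering). In each of these models, the task-specific adapter has been merged into the base model weights, which simplifies compatibility with vLLM.

Usage instructions and examples for each task are available in the corresponding model repositories.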

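Via Text Embeddings Inference

We provide a separate model version for each task (retrieval, text-matching, classification, clustering). In each of these models, the task-specific adapter has been merged into the base model weights, which simplifies compatibility with Text Embeddings Inference.

Usage instructions and examples for each task are available in the corresponding model repositories.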

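Via ONNX and Optimum

We provide a separate model version for each task (retrieval, text-matching, classification, clustering). In each of these models, the task-specific adapter has been merged into the base model weights, which makes inference with ONNX Runtime and Hugging Face Optimum possible.

We provide ONNX-format weights in the onnx subfolder of each model repository. Usage instructions and examples for each task are available in the corresponding model repositories.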

License

jina-embeddings-v5-text-small is licensed under CC BY-NC 4.0. For commercial use, please contact us.

Citation

If you find jina-embeddings-v5-text-small useful in your research, please cite the following paper:

@article{akram2026jina,
  title={jina-embeddings-v5-text: Task-Targeted Embedding Distillation},
  author={Mohammad Kalim Akram and Saba Sturua and Nastia Havriushenko and Quentin Herreros and Michael G{\"u}nther and Maximilian Werk and Han Xiao},
  journal={arXiv preprint arXiv:2602.15547},
  year={2026}
}
