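A 3B-parameter model fine-tuned from Granite-4.0-Micro-Base, with enhanced instruction following and tool calling. It supports 12 languages, suits summarization, classification, question answering, RAG, and code tasks, and can serve as the foundation for enterprise AI assistants. (This summary was AI-generated.)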


base_model: ibm-granite/granite-4.0-micro
license: apache-2.0
library_name: transformers
tags: language, unsloth, granite-4.0

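See our collection for all versions of Granite-4.0, including GGUF, 4-bit, and 16-bit formats.

Learn how to run Granite 4.0 correctly: read our guide.

See Unsloth Dynamic 2.0 GGUFs for our quantization benchmarks.

✨ Read our Granite-4.0 guide here!

Fine-tune Granite-4.0 for free using our Google Colab notebook
Read our blog about Granite-4.0 support: https://docs.unsloth.ai/new/ibm-granite-4.0
See more notebooks in our docs here.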

Granite-4.0-Micro

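Model Summary: Granite-4.0-Micro is a 3B-parameter long-context instruct model fine-tuned from Granite-4.0-Micro-Base. It was trained on a combination of permissively licensed open-source instruction datasets and internally collected synthetic datasets. The model was developed with several techniques and a structured chat format, including supervised fine-tuning, model alignment using reinforcement learning, and model merging. Granite 4.0 instruct models feature improved instruction following (IF) and tool-calling capabilities, making them more effective in enterprise applications.

Developers: Granite Team, IBM
HF Collection: Granite 4.0 Language Models HF Collection
GitHub Repository: ibm-granite/granite-4.0-language-models
Website: Granite Docs
Release Date: October 2, 2025
License: Apache 2.0

Supported Languages: English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. Users may fine-tune Granite 4.0 models for languages beyond these.

Intended Use: The model is designed to follow general instructions. It can serve as the foundation for AI assistants across diverse domains, including business applications, and for LLM agents equipped with tool-use capabilities.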

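Capabilities

Summarization
Text classification
Text extraction
Question answering
Retrieval-Augmented Generation (RAG)
Code-related tasks
Function-calling tasks
Multilingual dialog use cases
Fill-In-the-Middle (FIM) code completion

Generation: The following is a simple example of how to use the Granite-4.0-Micro model.

Install the following libraries: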

pip install torch torchvision torchaudio
pip install accelerate
pip install transformers

Then, copy the snippet from the section that is relevant to your use case.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"
model_path = "ibm-granite/granite-4.0-micro"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()
# change input text as desired
chat = [
    { "role": "user", "content": "Please list one IBM Research laboratory located in the United States. You should only output its name and location." },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt").to(device)
# generate output tokens
output = model.generate(**input_tokens, 
                        max_new_tokens=100)
# decode output tokens into text
output = tokenizer.batch_decode(output)
# print output
print(output[0])

Expected output:

<|start_of_role|>user<|end_of_role|>Please list one IBM Research laboratory located in the United States. You should only output its name and location.<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>Almaden Research Center, San Jose, California<|end_of_text|>
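
The decoded string above contains the whole conversation, role markers included. As a minimal sketch, assuming the `<|start_of_role|>`, `<|end_of_role|>`, and `<|end_of_text|>` markers appear exactly as in the expected output, the final assistant reply can be pulled out with a regular expression. `extract_assistant_reply` is a hypothetical helper name used for illustration; it is not part of `transformers`.

```python
import re

# One assistant turn, delimited by the role markers shown in the expected output.
_ASSISTANT_TURN = re.compile(
    r"<\|start_of_role\|>assistant<\|end_of_role\|>(.*?)(?:<\|end_of_text\|>|$)",
    re.DOTALL,
)


def extract_assistant_reply(decoded: str) -> str:
    """Return the text of the last assistant turn, or "" if none is found."""
    turns = _ASSISTANT_TURN.findall(decoded)
    return turns[-1].strip() if turns else ""


# The expected output from the example above, reproduced as a string.
decoded = (
    "<|start_of_role|>user<|end_of_role|>Please list one IBM Research laboratory "
    "located in the United States. You should only output its name and location."
    "<|end_of_text|>\n"
    "<|start_of_role|>assistant<|end_of_role|>Almaden Research Center, "
    "San Jose, California<|end_of_text|>"
)
print(extract_assistant_reply(decoded))
# prints: Almaden Research Center, San Jose, California
```

An alternative that avoids string parsing is to slice the generated ids past the prompt length before decoding, for example `tokenizer.decode(output[0][input_tokens["input_ids"].shape[1]:], skip_special_tokens=True)`, applied to the tensor returned by `model.generate` before it is overwritten by `batch_decode`.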

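Tool calling: Granite-4.0-Micro comes with enhanced tool-calling capabilities, enabling seamless integration with external functions and APIs. To define a list of tools, follow OpenAI's function definition schema.

The following is an example of how to use the tool-calling ability of the Granite-4.0-Micro model: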

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"
model_path = "ibm-granite/granite-4.0-micro"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a specified city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "Name of the city"
                    }
                },
                "required": ["city"]
            }
        }
    }
]

# change input text as desired
chat = [
    { "role": "user", "content": "What's the weather like in Boston right now?" },
]
chat = tokenizer.apply_chat_template(chat,
                                     tokenize=False,
                                     tools=tools,
                                     add_generation_prompt=True)
# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt").to(device)
# generate output tokens
output = model.generate(**input_tokens, 
                        max_new_tokens=100)
# decode output tokens into text
output = tokenizer.batch_decode(output)
# print output
print(output[0])

Expected output:

<|start_of_role|>system<|end_of_role|>You are a helpful assistant with access to the following tools. You may call one or more tools to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "get_current_weather", "description": "Get the current weather for a specified city.", "parameters": {"type": "object", "properties": {"city": {"type": "string", "description": "Name of the city"}}, "required": ["city"]}}}
</tools>

For each tool call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call>. If a tool does not exist in the provided list of tools, notify the user that you do not have the ability to fulfill the request.<|end_of_text|>
<|start_of_role|>user<|end_of_role|>What's the weather like in Boston right now?<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|><tool_call>
{"name": "get_current_weather", "arguments": {"city": "Boston"}}
</tool_call><|end_of_text|>
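
The model only emits the tool call; running the function is left to the caller. The sketch below shows one way to parse the `<tool_call>` payload from the decoded text and dispatch it to a local Python function. It assumes the payload is a JSON object with `name` and `arguments` keys, as in the expected output. `parse_tool_calls`, `run_tool_calls`, `REGISTRY`, and the stubbed `get_current_weather` are hypothetical names for illustration, not part of any library.

```python
import json
import re

# Everything between <tool_call> and </tool_call>, across line breaks.
_TOOL_CALL = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def parse_tool_calls(decoded: str) -> list[dict]:
    """Collect every well-formed JSON tool call found in the decoded text."""
    calls = []
    for payload in _TOOL_CALL.findall(decoded):
        try:
            call = json.loads(payload)
        except json.JSONDecodeError:
            # Skips non-JSON matches, e.g. the format template in the system prompt.
            continue
        if isinstance(call, dict) and "name" in call:
            calls.append(call)
    return calls


def get_current_weather(city: str) -> str:
    """Hypothetical stub; a real implementation would query a weather service."""
    return f"(stub) weather report for {city}"


# Maps tool names from the tool list to local callables (assumed layout).
REGISTRY = {"get_current_weather": get_current_weather}


def run_tool_calls(decoded: str) -> list[str]:
    """Execute each parsed call against the registry and return the results."""
    results = []
    for call in parse_tool_calls(decoded):
        func = REGISTRY.get(call["name"])
        if func is None:
            results.append(f"unknown tool: {call['name']}")
            continue
        results.append(func(**call.get("arguments", {})))
    return results


decoded = (
    "<|start_of_role|>assistant<|end_of_role|><tool_call>\n"
    '{"name": "get_current_weather", "arguments": {"city": "Boston"}}\n'
    "</tool_call><|end_of_text|>"
)
print(run_tool_calls(decoded))
# prints: ['(stub) weather report for Boston']
```

Because invalid JSON is skipped, the same parser can be run over the full decoded conversation: the `<tool_call>` template in the system prompt is ignored and only the assistant's call is executed.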

Evaluation Results:

Benchmark	Metric	Micro Dense	H Micro Dense	H Tiny MoE	H Small MoE
General Tasks
MMLU	5-shot	65.98	67.43	68.65	78.44
MMLU-Pro	5-shot, CoT	44.5	43.48	44.94	55.47
BBH	3-shot, CoT	72.48	69.36	66.34	81.62
AGI EVAL	0-shot, CoT	64.29, 62.15, 70.63 (only three of four values present; column assignment unclear)
GPQA	0-shot, CoT	30.14	32.15	32.59	40.63
Alignment Tasks
AlpacaEval 2.0	-	29.49	31.49	30.61	42.48
IFEval	Instruct, Strict	85.5	86.94	84.78	89.87
IFEval	Prompt, Strict	79.12	81.71	78.1	85.22
IFEval	Average	82.31	84.32	81.44	87.55
ArenaHard	-	25.84	36.15	35.75	46.48
Math Tasks
GSM8K	8-shot	85.45	81.35	84.69	87.27
GSM8K Symbolic	8-shot	79.82	77.5	81.1	87.38
Minerva Math	0-shot, CoT	62.06, 66.44, 69.64 (only three of four values present; column assignment unclear)
DeepMind Math	0-shot, CoT	44.56	43.83	49.92	59.33
Code Tasks
HumanEval	pass@1	(values missing)
HumanEval+	pass@1	(values missing)
MBPP	pass@1	(values missing)
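
The source does not say how the IFEval "Average" is computed. The short check below tests the assumption that it is the arithmetic mean of the Instruct-Strict and Prompt-Strict scores, using the values reported above. `mean_of_strict_scores` is a hypothetical helper name.

```python
# Scores copied from the IFEval rows:
# (Instruct Strict, Prompt Strict, reported Average).
ifeval = {
    "Micro Dense": (85.5, 79.12, 82.31),
    "H Micro Dense": (86.94, 81.71, 84.32),
    "H Tiny MoE": (84.78, 78.1, 81.44),
    "H Small MoE": (89.87, 85.22, 87.55),
}


def mean_of_strict_scores(instruct: float, prompt: float) -> float:
    """Arithmetic mean of the two strict-mode scores (assumed definition)."""
    return (instruct + prompt) / 2


for model, (instruct, prompt, reported) in ifeval.items():
    computed = mean_of_strict_scores(instruct, prompt)
    # Allow 0.01 of slack because the table values are rounded to two decimals.
    status = "consistent" if abs(computed - reported) <= 0.01 else "differs"
    print(f"{model}: computed {computed:.3f}, reported {reported} ({status})")
```

All four columns agree with the reported average to within rounding, which supports the assumption without confirming it.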

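Model Architecture:

Model	Micro Dense	H Micro Dense	H Tiny MoE	H Small MoE
Embedding size	2560	2048	1536	4096
Number of layers	40 attention	4 attention / 36 Mamba2	4 attention / 36 Mamba2	4 attention / 36 Mamba2
Attention head size	64	64	128	128
Number of attention heads	40	32	12	32
Number of KV heads	8	8	4	8
Mamba2 state size	-	128	128	128
Number of Mamba2 heads	-	64	48	128
MLP / Shared expert hidden size	8192	8192	1024	1536
Number of experts	-	-	64	72
Number of active experts	-	-	6	10
Expert hidden size	-	-	512	768
MLP activation	SwiGLU	SwiGLU	SwiGLU	SwiGLU
Sequence length	128K	128K	128K	128K
Position embedding	RoPE	NoPE	NoPE	NoPE
Number of parameters	3B	3B	7B	32B
Number of active parameters	3B	3B	1B	9B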
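
The attention-layer counts and KV-head settings above largely determine how much memory the key/value cache needs at long context. The following is a rough back-of-the-envelope sketch, not a measurement. It assumes 2 bytes per cached value (bf16 or fp16), a batch size of 1, and "128K" meaning 131,072 tokens, and it ignores the Mamba2 state, which does not grow with sequence length. `AttentionConfig` and `kv_cache_bytes` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionConfig:
    name: str
    attention_layers: int
    kv_heads: int
    head_dim: int


# Values taken from the architecture table above.
CONFIGS = [
    AttentionConfig("Micro Dense", 40, 8, 64),
    AttentionConfig("H Micro Dense", 4, 8, 64),
    AttentionConfig("H Tiny MoE", 4, 4, 128),
    AttentionConfig("H Small MoE", 4, 8, 128),
]


def kv_cache_bytes(cfg: AttentionConfig, seq_len: int, bytes_per_value: int = 2) -> int:
    """Estimate KV-cache size: one K and one V tensor per attention layer."""
    per_token = 2 * cfg.attention_layers * cfg.kv_heads * cfg.head_dim * bytes_per_value
    return per_token * seq_len


SEQ_LEN = 128 * 1024  # assumed meaning of "128K"

for cfg in CONFIGS:
    gib = kv_cache_bytes(cfg, SEQ_LEN) / 2**30
    print(f"{cfg.name}: {gib:.1f} GiB")
# prints:
# Micro Dense: 10.0 GiB
# H Micro Dense: 1.0 GiB
# H Tiny MoE: 1.0 GiB
# H Small MoE: 2.0 GiB
```

Under these assumptions, the dense model with 40 attention layers needs roughly ten times the cache of the hybrid variants, which keep only 4 attention layers.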

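Training Data: Overall, our SFT data is largely composed of three key sources: (1) publicly available datasets with permissive licenses, (2) internal synthetic data targeting specific capabilities, and (3) a select set of human-curated data.

Infrastructure: We trained the Granite 4.0 language models on an NVIDIA GB200 NVL72 cluster hosted at CoreWeave. Intra-rack communication runs over the 72-GPU NVLink domain, and a non-blocking, full fat-tree NDR 400 Gb/s InfiniBand network provides inter-rack communication. This cluster gives us a scalable and efficient infrastructure for training our models across thousands of GPUs.

Ethical Considerations and Limitations: Granite 4.0 instruct models are fine-tuned primarily on instruction-response pairs that are mostly in English, along with multilingual data covering multiple languages. Although this model can handle multilingual dialog use cases, its performance may not match its performance on English tasks. In such cases, providing a small number of examples (few-shot) can help the model generate more accurate outputs. While the model has been aligned with safety in mind, it may in some cases produce inaccurate, biased, or unsafe responses to user prompts. We therefore urge the community to use this model with proper safety testing and tuning tailored to their specific tasks.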
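
The few-shot suggestion above can be sketched as a small helper that places worked examples ahead of the real query as alternating user and assistant turns. This is only one way to do it. `build_few_shot_chat` and the German sentiment examples are hypothetical and chosen for illustration.

```python
def build_few_shot_chat(examples, query, system=None):
    """Build a chat list with (user, assistant) example pairs before the query."""
    chat = []
    if system:
        chat.append({"role": "system", "content": system})
    for user_text, assistant_text in examples:
        chat.append({"role": "user", "content": user_text})
        chat.append({"role": "assistant", "content": assistant_text})
    # The real question goes last so the model answers it next.
    chat.append({"role": "user", "content": query})
    return chat


# Hypothetical sentiment-labeling examples in German.
examples = [
    ("Das Essen war ausgezeichnet.", "positiv"),
    ("Der Service war sehr langsam.", "negativ"),
]
query = "Das Hotelzimmer war sauber und ruhig."

chat = build_few_shot_chat(examples, query)
for message in chat:
    print(f"{message['role']}: {message['content']}")
```

The resulting `chat` list has the same shape as the one in the generation example, so it can be passed to `tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)` in the same way.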

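Resources

⭐️ Learn about the latest updates with Granite: https://www.ibm.com/granite
📄 Get started with tutorials, best practices, and prompt engineering advice: https://www.ibm.com/granite/docs/
💡 Learn about the latest Granite learning resources: https://ibm.biz/granite-learning-resources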
