The scripts for this model are no longer updated with new releases. To use this model, go to the linked address.

ChatGLM3-6B for PyTorch

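Introduction

ChatGLM3 is a pre-trained dialogue model jointly released by Zhipu AI and the KEG Lab at Tsinghua University. ChatGLM3-6B is the open-source model in the ChatGLM3 series. It keeps the strengths of the previous two generations, such as fluent dialogue and a low deployment threshold, and introduces the following features:

A more powerful base model: ChatGLM3-6B-Base, the base model of ChatGLM3-6B, uses more diverse training data, more training steps, and a sounder training strategy. Evaluations on datasets covering semantics, mathematics, reasoning, code, and knowledge show that ChatGLM3-6B-Base has the strongest performance among base models under 10B parameters.
More complete feature support: ChatGLM3-6B adopts a newly designed prompt format. In addition to normal multi-turn dialogue, it natively supports complex scenarios such as tool calling (Function Call), code execution (Code Interpreter), and Agent tasks.
A more comprehensive open-source series: In addition to the dialogue model ChatGLM3-6B, the base model ChatGLM3-6B-Base, the long-text dialogue model ChatGLM3-6B-32K, and ChatGLM3-6B-128K, which further strengthens long-text understanding, are also open source. All of these weights are fully open for academic research, and free commercial use is also permitted after registering through a questionnaire.

Reference implementation: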

url=https://github.com/THUDM/ChatGLM3
commitID=d0be06cd278eb58541b971f69f0544b75613ebdd

Implementation adapted for Ascend AI processors:

url=https://gitcode.com/ascend/ModelZoo-PyTorch.git
code_path=PyTorch/built-in/foundation

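Preparing the training environment

Preparing the environment

Installing the Ascend environment

Set up the Ascend environment by following the Ascend community document 《Pytorch框架训练环境准备》 (PyTorch framework training environment preparation). This repository supports the software versions listed in Table 1.

Table 1 Supported Ascend software versions

Software type	Supported version
FrameworkPTAdapter	6.0.RC2
CANN	8.0.RC2
Ascend NPU firmware	24.1.RC2
Ascend NPU driver	24.1.RC2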

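Installing the model environment

The PyTorch version supported by this model and its known third-party dependencies are listed in the table below.

Table 2 Supported versions

Torch_Version	Third-party dependency versions
PyTorch 2.1	transformers == 4.39.2; deepspeed == 0.14.2

Install the dependencies.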

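In the root directory of the model source package, run the following command to install the dependencies required for the model's PyTorch version.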
```
pip install -r requirements.txt
```
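
After installation, it can help to confirm that the installed packages match the pins in Table 2. The following is a minimal sketch, not part of the repository: the helper names `matches` and `check_versions` are hypothetical, and mapping the "PyTorch 2.1" row to the pip distribution name `torch` is an assumption.

```python
from importlib import metadata

# Pins copied from Table 2. Mapping "PyTorch 2.1" to the pip distribution
# name "torch" is an assumption of this sketch.
EXPECTED = {"torch": "2.1", "transformers": "4.39.2", "deepspeed": "0.14.2"}


def matches(found, wanted):
    """Compare only as many dotted components as the pin specifies."""
    wanted_parts = wanted.split(".")
    # Drop local build suffixes such as "+cpu" before comparing.
    found_parts = found.split("+")[0].split(".")
    return found_parts[:len(wanted_parts)] == wanted_parts


def check_versions(expected, get_version=metadata.version):
    """Return {package: (wanted, found)} for every package that does not match."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            found = get_version(package)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found is None or not matches(found, wanted):
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in check_versions(EXPECTED).items():
        print(f"{package}: expected {wanted}, found {found}")
```

An empty result means every listed package is installed at the expected version.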

Preparing the dataset

Download the AdvertiseGen dataset yourself and place it under the scripts path. The folder then contains:

├── AdvertiseGen
│     ├── train.json
│     └── dev.json
├── configs
│     ├── ds_zero_3.json
│     └── sft.yaml
├── finetune_hf.py
├── run_train_8p_glm.sh
├── run_train_8p_glm_32k.sh
├── run_train_8p_glm_32k_no_shuffle.sh
└── run_train_8p_glm_no_shuffle.sh
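
The layout above can be checked before launching training. This is a minimal sketch under the assumption that it is run from the source package root, so that `scripts` is a relative path; the helper name `missing_files` is hypothetical.

```python
from pathlib import Path

# Relative paths taken from the directory listing above.
EXPECTED_FILES = [
    "AdvertiseGen/train.json",
    "AdvertiseGen/dev.json",
    "configs/ds_zero_3.json",
    "configs/sft.yaml",
    "finetune_hf.py",
    "run_train_8p_glm.sh",
    "run_train_8p_glm_32k.sh",
    "run_train_8p_glm_32k_no_shuffle.sh",
    "run_train_8p_glm_no_shuffle.sh",
]


def missing_files(scripts_dir):
    """Return the expected files that are absent under scripts_dir."""
    root = Path(scripts_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumes the current directory is the source package root.
    missing = missing_files("scripts")
    print("missing:", missing if missing else "none")
```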

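Preparing the pre-trained weights

ChatGLM3-6B

Download the ChatGLM3-6B pre-trained weights and configuration files yourself and place them in the "model" folder. Do not overwrite the modeling_chatglm.py file. The model folder contents are as follows: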

  ├── model
      ├──config.json
      ├──configuration_chatglm.py
      ├──pytorch_model-00001-of-00007.bin
      ├──pytorch_model-00002-of-00007.bin
      ├──pytorch_model-00003-of-00007.bin
      ├──pytorch_model-00004-of-00007.bin
      ├──pytorch_model-00005-of-00007.bin
      ├──pytorch_model-00006-of-00007.bin
      ├──pytorch_model-00007-of-00007.bin
      ├──pytorch_model.bin.index.json
      ├──quantization.py
      ├──tokenization_chatglm.py
      ├──tokenizer_config.json
      ├──tokenizer.model
      ├──modeling_chatglm.py

ChatGLM3-6B-32K

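Download the ChatGLM3-6B-32K pre-trained weights and configuration files yourself and place them in the "model-32K" folder. Do not overwrite the modeling_chatglm.py file. The model-32K folder contents are as follows:

  ├── model-32K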
      ├──config.json
      ├──configuration_chatglm.py
      ├──pytorch_model-00001-of-00007.bin
      ├──pytorch_model-00002-of-00007.bin
      ├──pytorch_model-00003-of-00007.bin
      ├──pytorch_model-00004-of-00007.bin
      ├──pytorch_model-00005-of-00007.bin
      ├──pytorch_model-00006-of-00007.bin
      ├──pytorch_model-00007-of-00007.bin
      ├──pytorch_model.bin.index.json
      ├──quantization.py
      ├──tokenization_chatglm.py
      ├──tokenizer_config.json
      ├──tokenizer.model
      ├──modeling_chatglm.py
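
Because the downloaded checkpoint also ships its own modeling_chatglm.py, a plain copy would replace the adapted file in "model" or "model-32K". One way to avoid that is to copy file by file and skip the protected name. The sketch below assumes the download sits in a flat directory; the helper name `copy_weights` and the example paths are hypothetical.

```python
import shutil
from pathlib import Path

# File that already exists in the repository folder and must be kept as-is.
PROTECTED = {"modeling_chatglm.py"}


def copy_weights(download_dir, model_dir, protected=PROTECTED):
    """Copy every regular file from download_dir into model_dir,
    skipping any file whose name is in `protected`.
    Returns the sorted list of file names that were copied."""
    src, dst = Path(download_dir), Path(model_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        if not path.is_file() or path.name in protected:
            continue
        shutil.copy2(path, dst / path.name)
        copied.append(path.name)
    return copied


# Example (hypothetical paths):
# copy_weights("/path/to/downloaded/chatglm3-6b", "model")
# copy_weights("/path/to/downloaded/chatglm3-6b-32k", "model-32K")
```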

Starting training

ChatGLM3-6B

Go to the root directory of the extracted source package.
```
cd /${model_folder_name}
```

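Launch full-parameter fine-tuning

Set up the training environment

# Adjust the path of the `set_env` script below to match your ascend-toolkit installation path
source set_env.sh

Launch 8-card fine-tuning with data shuffling

bash scripts/run_train_8p_glm.sh

Launch 8-card fine-tuning without data shuffling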

bash scripts/run_train_8p_glm_no_shuffle.sh

ChatGLM3-6B-32K

Go to the root directory of the extracted source package.
```
cd /${model_folder_name}
```

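Launch full-parameter fine-tuning

Set up the training environment

# Adjust the path of the `set_env` script below to match your ascend-toolkit installation path
source set_env.sh

Launch 8-card fine-tuning with data shuffling

bash scripts/run_train_8p_glm_32k.sh

Launch 8-card fine-tuning without data shuffling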

bash scripts/run_train_8p_glm_32k_no_shuffle.sh

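The training script parameters are as follows

data_dir                                        // path to the dataset folder
model_dir                                       // path to the model folder
config_file                                     // path to the configuration file
shuffle                                         // whether to shuffle the dataset during training
auto_resume_from_checkpoint                     // whether to resume automatically from the last checkpoint of the previous run

To save a log file, use the following command, replacing the script path and the log file path with your own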

nohup bash /path/to/your/sh/scripts >/path/to/your/log/file 2>&1 &
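
When training is launched from a Python driver rather than an interactive shell, a similar redirection can be approximated with the standard library. This is a rough sketch, not part of the repository: `launch_with_log` is a hypothetical helper, and `start_new_session=True` (POSIX only) stands in for `nohup` by detaching the child from the controlling terminal.

```python
import subprocess


def launch_with_log(command, log_path):
    """Start `command` in the background with stdout and stderr both written
    to log_path (the equivalent of `> log 2>&1 &`). Returns the Popen handle."""
    with open(log_path, "w", encoding="utf-8") as log_file:
        # The child keeps its own copy of the file descriptor, so closing
        # log_file in the parent when the `with` block ends is safe.
        return subprocess.Popen(
            command,
            stdout=log_file,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # POSIX only: detach from the terminal session
        )


# Example (hypothetical log path):
# proc = launch_with_log(["bash", "scripts/run_train_8p_glm.sh"], "train_8p_glm.log")
# proc.wait()  # optional: block until training finishes
```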

Training results

Table 3 Training results

Chip	Cards	Model	Iterations	Global Batch Size	Train Samples per Second
Atlas A2	8p	ChatGLM3-6B	2000	16	13.781
GPU	8p	ChatGLM3-6B	2000	16	15.094
Atlas A2	8p	ChatGLM3-6B-32K	2000	16	11.819
GPU	8p	ChatGLM3-6B-32K	2000	16	12.088
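
To make the comparison in Table 3 easier to read, the throughput of Atlas A2 relative to GPU can be computed directly from the listed values. The sketch below only restates the table's numbers; the function name `npu_to_gpu_ratio` is hypothetical.

```python
# Train samples per second copied from Table 3 (8 cards, global batch size 16).
RESULTS = {
    "ChatGLM3-6B": {"Atlas A2": 13.781, "GPU": 15.094},
    "ChatGLM3-6B-32K": {"Atlas A2": 11.819, "GPU": 12.088},
}


def npu_to_gpu_ratio(model):
    """Atlas A2 throughput as a fraction of GPU throughput for one model."""
    row = RESULTS[model]
    return row["Atlas A2"] / row["GPU"]


for model in RESULTS:
    print(f"{model}: {npu_to_gpu_ratio(model):.1%} of GPU throughput")
```

With these values, Atlas A2 reaches about 91.3% of the GPU throughput for ChatGLM3-6B and about 97.8% for ChatGLM3-6B-32K.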

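Public network address statement

For the public network addresses referenced in the code, see public_address_statement.md

Change log

2024.05.29: Initial release

FAQ

None yet.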