CosyVoice3 (TorchAir) Inference Guide

Overview
Inference Environment Preparation
Quick Start
FAQ

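Overview

CosyVoice3 is a text-to-speech (TTS) system based on a large language model (LLM). It surpasses its predecessor, CosyVoice 2.0, in content consistency, speaker similarity, and prosodic naturalness.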

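Inference Environment Preparation

This model requires the following components and drivers.

Table 1 Version compatibility

Component	Version	Environment setup guide
Firmware and driver	25.5.0	PyTorch framework inference environment preparation
CANN	8.3.0	Includes the kernels package and the toolkit package
Python	3.11	-
PyTorch	2.9.0	-
Ascend Extension PyTorch	2.9.0	-
vllm	0.17.0	-
vllm_ascend	0.17.0rc1	vllm and vllm-ascend installation and usage guide

Note: For Atlas 800I A2 inference servers and Atlas 800T A2 training servers, choose the firmware and driver version that matches your CANN version.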
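
Before moving on, it can help to confirm that the Python packages in the environment match Table 1. The following is a minimal sketch, not part of this repository. The distribution names used as dictionary keys (`torch-npu`, `vllm-ascend`) are assumptions, so adjust them to whatever `pip list` reports. The firmware, driver, and CANN versions are not Python packages and are not covered here.

```python
from importlib import metadata

# Expected versions copied from Table 1. The distribution names used as keys
# are assumptions; adjust them to match what `pip list` shows.
EXPECTED = {
    "torch": "2.9.0",
    "torch-npu": "2.9.0",
    "vllm": "0.17.0",
    "vllm-ascend": "0.17.0rc1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Return {name: (expected, found)} for every package that does not match."""
    mismatches = {}
    for name, want in expected.items():
        found = lookup(name)
        # Local build suffixes such as "+cpu" are ignored in the comparison.
        base = found.split("+")[0] if found else None
        if base != want:
            mismatches[name] = (want, found)
    return mismatches
```

Calling `find_mismatches(EXPECTED)` returns an empty dictionary when every listed package matches. The `lookup` parameter exists only so the check can be exercised without the real packages installed.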

Quick Start

1. Obtain vllm and vllm-ascend (using a Docker installation as an example)

# Update --device according to your device (Atlas A2: /dev/davinci[0-7] Atlas A3:/dev/davinci[0-15]).
# Update the vllm-ascend image according to your environment.
# Note you should download the weight to /root/.cache in advance.
export IMAGE=quay.io/ascend/vllm-ascend:v0.17.0rc1
docker run --rm \
    --name vllm-ascend-env \
    --shm-size=1g \
    --net=host \
    --device /dev/davinci0 \
    --device /dev/davinci1 \
    --device /dev/davinci2 \
    --device /dev/davinci3 \
    --device /dev/davinci4 \
    --device /dev/davinci5 \
    --device /dev/davinci6 \
    --device /dev/davinci7 \
    --device /dev/davinci_manager \
    --device /dev/devmm_svm \
    --device /dev/hisi_hdc \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
    -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    -v /root/.cache:/root/.cache \
    -it $IMAGE bash
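
The `--device` list above depends on how many NPUs the host exposes (8 on Atlas A2, 16 on Atlas A3, according to the comment in the command). As an illustration only, the flags could be generated rather than typed by hand. The helper names below are hypothetical and not part of this repository.

```python
# Management devices taken from the docker command above; they are passed
# through regardless of how many NPUs are mapped.
COMMON_DEVICES = ["/dev/davinci_manager", "/dev/devmm_svm", "/dev/hisi_hdc"]


def device_flags(count):
    """Build the `--device /dev/davinciN` arguments for NPUs 0..count-1."""
    if count < 1:
        raise ValueError("count must be at least 1")
    flags = []
    for index in range(count):
        flags += ["--device", f"/dev/davinci{index}"]
    return flags


def docker_device_args(count):
    """Return the NPU flags followed by the management device flags."""
    args = device_flags(count)
    for path in COMMON_DEVICES:
        args += ["--device", path]
    return args
```

For example, `" ".join(docker_device_args(16))` produces the device portion of the command for an Atlas A3 host. The volume mounts and the image name still have to be supplied as shown in the command.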

2. Obtain the source code of this repository

git clone https://gitcode.com/ascend/ModelZoo-PyTorch.git
cd ModelZoo-PyTorch/ACL_PyTorch/built-in/audio/CosyVoice3

3. Obtain the upstream source code

Obtain the CosyVoice3 source code

# Clone the CosyVoice3 source code
git clone https://github.com/FunAudioLLM/CosyVoice
cd CosyVoice
git reset --hard ace7c47
# Apply the patch
git apply ../diff.patch
# Update the gRPC files
cd ./runtime/python/grpc
bash update_grpc_file.sh

The directory structure is roughly as follows:

📁 CosyVoice/
├── 📁 CosyVoice
    ├── 📁 asset
    ├── 📁 cosyvoice    # CosyVoice source files, not listed individually here
    ├── 📁 runtime
    ├── 📁 third_party
    ├── 📁 tools
    ├── 📄 requirements.txt    # dependencies
    ├── ...
📁 /home/
├── 📁 Fun-CosyVoice3-0.5B-2512    # weight files
    ├── ...                        # see "Obtain the weights" for the file list
Install dependencies

Install any missing dependencies as prompted. The upstream dependency list is in requirements.txt.
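Obtain the weights

This example downloads Fun-CosyVoice3-0.5B-2512 from modelscope.cn. For other weights or other download methods, adapt the steps yourself.
```
modelscope download --model FunAudioLLM/Fun-CosyVoice3-0.5B-2512
```
This example runs inference with the SFT pretrained voices, so also download the spk weights and place them in the weight directory.
```
wget https://www.modelscope.cn/models/iic/CosyVoice-300M-SFT/resolve/master/spk2info.pt
```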
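
Since `spk2info.pt` is downloaded separately, it is easy to leave it outside the weight directory. A minimal sketch of a pre-flight check follows. The function name is hypothetical, and only `spk2info.pt` is checked by default because the other file names in the weight directory are not listed in this guide.

```python
from pathlib import Path


def missing_files(model_dir, required=("spk2info.pt",)):
    """Return the names in `required` that are not regular files in model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]
```

For example, `missing_files("/home/Fun-CosyVoice3-0.5B-2512/")` returns an empty list once the spk weights are in place.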

Model inference

1 Start inference verification

Set the environment variables and run the inference command.

#!/bin/bash
# Based on ./runtime/python/grpc/run_server.sh

# Basic environment variables
export VLLM_WORKER_MULTIPROC_METHOD=spawn
# NPU ID to use, for example 0
export ASCEND_RT_VISIBLE_DEVICES=0
export TASK_QUEUE_ENABLE=1
export CPU_AFFINITY_CONF=1
export VLLM_ASCEND_ENABLE_MLP_OPTIMIZE=1
export VLLM_ASCEND_ENABLE_TOPK_TOPP_OPTIMIZATION=1
export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libjemalloc.so.2:$LD_PRELOAD
# Logging and debug configuration: inspect graph breaks and recompilation causes

# Weight path
MODEL_DIR="/home/Fun-CosyVoice3-0.5B-2512/"
# Maximum concurrency
MAX_CONC=8

# Example launch command: service port, weight path, maximum concurrency, graph mode enabled
# Run from the CosyVoice repository root so that the relative path below resolves
cd ./runtime/python/grpc
python server.py --port 50099 --model_dir "$MODEL_DIR" --max_conc "$MAX_CONC" --graph_mode

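When inference starts, a warm-up runs first by default to trigger the initial compilation. The first compilation takes a long time, but later inference does not need to recompile. Inference proceeds once the warm-up finishes.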
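
The same launch can be assembled from Python, which may be convenient when the port or concurrency has to vary between runs. This is a sketch under the assumption that `server.py` accepts exactly the flags shown in the script above. The function names are hypothetical, and the environment variables and the jemalloc path are copied from that script.

```python
import os
import subprocess

# Environment variables copied from the launch script in this guide.
SERVER_ENV = {
    "VLLM_WORKER_MULTIPROC_METHOD": "spawn",
    "ASCEND_RT_VISIBLE_DEVICES": "0",
    "TASK_QUEUE_ENABLE": "1",
    "CPU_AFFINITY_CONF": "1",
    "VLLM_ASCEND_ENABLE_MLP_OPTIMIZE": "1",
    "VLLM_ASCEND_ENABLE_TOPK_TOPP_OPTIMIZATION": "1",
}
JEMALLOC = "/usr/lib/aarch64-linux-gnu/libjemalloc.so.2"


def build_env(base=None):
    """Copy the base environment and overlay the server settings."""
    env = dict(os.environ if base is None else base)
    env.update(SERVER_ENV)
    # Prepend jemalloc to any existing LD_PRELOAD, as the shell script does.
    env["LD_PRELOAD"] = f"{JEMALLOC}:{env.get('LD_PRELOAD', '')}"
    return env


def build_command(model_dir, port=50099, max_conc=8, graph_mode=True):
    """Build the argument list for server.py using the flags from the script."""
    cmd = ["python", "server.py", "--port", str(port),
           "--model_dir", model_dir, "--max_conc", str(max_conc)]
    if graph_mode:
        cmd.append("--graph_mode")
    return cmd


def launch(model_dir, grpc_dir="./runtime/python/grpc"):
    """Start the server and block until it exits; raises if it fails."""
    subprocess.run(build_command(model_dir), env=build_env(),
                   cwd=grpc_dir, check=True)
```

Calling `launch("/home/Fun-CosyVoice3-0.5B-2512/")` from the CosyVoice repository root mirrors the shell script. Keeping `build_env` and `build_command` separate from `launch` makes it possible to inspect the settings without starting the server.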

2 Performance data (Zero Shot By Id mode, single device, 8 concurrent requests)

Model	Hardware	Average first-packet latency (ms)	Average RTF (real-time factor)
cosyvoice3	800I A2	480	0.54
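
To relate the two metrics in the table to raw timings, the sketch below uses the common definitions: RTF is synthesis time divided by the duration of the generated audio, and first-packet latency is the time from sending a request to receiving the first audio packet. How the figures in the table were actually measured is not stated in this guide, so treat these helpers as an assumption rather than the benchmark method.

```python
def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def first_packet_latency_ms(request_start, first_packet_time):
    """Latency in milliseconds between two timestamps given in seconds."""
    return (first_packet_time - request_start) * 1000.0


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)
```

Under these definitions, an RTF of 0.54 means that 10 seconds of audio take about 5.4 seconds to synthesize. Values below 1 indicate faster-than-real-time synthesis.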