
IndexTTS_vLLM-v1.5 Inference Guide

Overview
Inference environment setup
Quick start
- Get the source code
- Model inference
Model inference performance & accuracy

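Overview

IndexTTS (in full, Index Text-To-Speech) is an industrial-grade, zero-shot, highly controllable text-to-speech (TTS) model series developed and open-sourced by the Bilibili team. It focuses on Chinese-language scenarios and targets three problems that traditional TTS handles poorly: voice cloning, emotion control, and precise duration matching.

Version information: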

url=https://github.com/Ksuriuri/index-tts-vllm
branch=IndexTTS-vLLM-1.0
model_name=index-tts-vllm-v1.5

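Inference environment setup

This model requires the following plugins and drivers.
Table 1 Version compatibility

| Component | Version | Setup guide |
| --- | --- | --- |
| Firmware and driver | 25.3.rc1 | PyTorch framework inference environment setup |
| CANN | 8.3.RC2 | - |
| Python | 3.11 | - |
| PyTorch | 2.7.1 | - |
| Ascend Extension PyTorch | 2.7.1 | - |
| vLLM | 0.11.0 | vLLM-Ascend installation guide |

Note: for Atlas 800I A2/Atlas 800T A2 cards, choose the firmware and driver versions that match the CANN version.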
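
As a quick sanity check, the Python-level rows of Table 1 can be compared against what is installed. The following is a minimal sketch, not part of the original guide: the distribution names (`torch`, `torch-npu`, `vllm`) are assumptions about how each package is published, and the firmware, driver, and CANN versions are not covered because they are not Python packages.

```python
from importlib import metadata

# Expected versions copied from Table 1. The distribution names on the
# left are assumptions, not taken from the guide.
EXPECTED = {
    "torch": "2.7.1",
    "torch-npu": "2.7.1",
    "vllm": "0.11.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected, lookup=installed_version):
    """Map each package name to a (expected, found, ok) tuple."""
    report = {}
    for name, want in expected.items():
        found = lookup(name)
        # Drop local build tags such as "+cpu" before the prefix comparison.
        ok = found is not None and found.split("+")[0].startswith(want)
        report[name] = (want, found, ok)
    return report


if __name__ == "__main__":
    for name, (want, found, ok) in check_versions(EXPECTED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {want}, found {found} -> {status}")
```

The prefix comparison is deliberately loose so that suffixed builds of the same release still pass; tighten it if an exact match is required.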

Quick start

Get the source code

Clone this repository

git clone https://gitcode.com/ascend/ModelZoo-PyTorch.git
cd ModelZoo-PyTorch/ACL_PyTorch/built-in/audio/IndexTTS-vLLM/

Install dependencies

# Reuse the vLLM base image, or follow the vllm-ascend community documentation to set up a vLLM environment:
docker pull quay.io/ascend/vllm-ascend:v0.11.0

Get the model repository source

git clone https://github.com/Ksuriuri/index-tts-vllm
cd index-tts-vllm
git checkout 48a06a5df5a8e19adc50afc1179fb788ef05ad6a
git apply ../diff.patch
pip3 install -r requirements.txt

# Install pynini: the upstream repository depends on WeTextProcessing, which requires OpenFst. On ARM hosts, OpenFst must be compiled and installed manually.
# Download openfst-1.8.3.tar.gz (the release list is at https://www.openfst.org/twiki/bin/view/FST/FstDownload)
wget https://www.openfst.org/twiki/pub/FST/FstDownload/openfst-1.8.3.tar.gz
tar -xzvf openfst-1.8.3.tar.gz && cd openfst-1.8.3

# Build and install
./configure --enable-far --enable-mpdt --enable-pdt && make -j64 && make install
# Confirm that the library was installed
ls -l /usr/local/lib/libfstmpdtscript.so.26
# Add the install path to the library search path
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH && ldconfig
# Install WeTextProcessing with pip
pip3 install WeTextProcessing

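# The sampling step of audio preprocessing now runs on the NPU, but the NPU does not yet support taking the absolute value of a complex tensor, so torchaudio must be patched by hand
pip show torchaudio
# Go to the torchaudio install path reported above, then open the file
vim functional/functional.py

# Original code, at about line 146
if power == 1.0:
    return spec_f.abs()
# Change it to the following
if power == 1.0:
    device = spec_f.device
    spec_f = spec_f.cpu()  # compute on the CPU, then move the result back to the NPU
    return spec_f.abs().to(device)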
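
Because the line number varies between torchaudio releases, it can help to locate the file and the `spec_f.abs()` call programmatically before editing. The sketch below is only an illustration and assumes the `functional/functional.py` layout described above; the helper names are hypothetical. It uses `importlib.util.find_spec`, so torchaudio itself is not imported.

```python
import importlib.util
from pathlib import Path


def find_abs_lines(source):
    """Return the 1-based line numbers that contain `spec_f.abs()`."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if "spec_f.abs()" in line
    ]


def torchaudio_functional_path():
    """Locate functional/functional.py without importing torchaudio."""
    spec = importlib.util.find_spec("torchaudio")
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at torchaudio/__init__.py
    return Path(spec.origin).parent / "functional" / "functional.py"


if __name__ == "__main__":
    path = torchaudio_functional_path()
    if path is None or not path.exists():
        print("torchaudio was not found in this environment")
    else:
        print(path, find_abs_lines(path.read_text(encoding="utf-8")))
```

Note that the patched file still contains `spec_f.abs()` inside the `.to(device)` line, so a match alone does not tell you whether the patch has already been applied.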

Download the model weights

mkdir -p ./checkpoints/Index-TTS-1.5-vLLM

# IndexTTS-1.5
modelscope download --model kusuriuri/Index-TTS-1.5-vLLM --local_dir ./checkpoints/Index-TTS-1.5-vLLM

After the download completes, the directory tree looks like this:

|-- DISCLAIMER
|-- Dockerfile
|-- INDEX_MODEL_LICENSE
|-- LICENSE
|-- README.md
|-- README_EN.md
|-- api_server.py
|-- assets
|-- checkpoints
|-- convert_hf_format.py
|-- convert_hf_format.sh
|-- docker-compose.yaml
|-- entrypoint.sh
|-- indextts
|-- patch_vllm.py
|-- requirements.txt
|-- simple_test.py
|-- tests
|-- tools
`-- webui.py

Model inference

Sample test

export VLLM_USE_V1=1
export ASCEND_RT_VISIBLE_DEVICES=0
unset ASCEND_LAUNCH_BLOCKING
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'

python api_server.py --model_dir ./checkpoints/Index-TTS-1.5-vLLM --port 20007

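Parameters:
- model_dir: path to the model weights
- port: port of the inference service; defaults to 6006

The inference script synthesizes audio for a single test case and prints the result to the screen when it finishes.

Single-sample inference test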
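
The server may take a while to load the weights before it accepts connections. The following sketch, which is not part of the original guide, polls the TCP port until it opens; `wait_for_port` is a hypothetical helper name, and the host and port in the usage note simply mirror the `--port 20007` value used above.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made before `timeout`
    seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("127.0.0.1", 20007, timeout=300)` can be called before sending the first request. An open port only shows that the process is listening, not that the model has finished loading.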
```
python3 api_example.py
```
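
For readers who want to call the service from their own code, a request might look roughly like the sketch below. This is an illustration under stated assumptions, not the content of `api_example.py`: the `/tts_url` route and the `text` / `audio_paths` fields are assumptions that should be checked against `api_server.py`, and both function names are hypothetical.

```python
import json
import urllib.request


def build_tts_request(base_url, text, audio_paths):
    """Build (but do not send) a JSON POST request for the TTS server.

    The "/tts_url" route and the "text" / "audio_paths" fields are
    assumptions; check api_server.py for the actual route and schema.
    """
    payload = json.dumps(
        {"text": text, "audio_paths": audio_paths}, ensure_ascii=False
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/tts_url",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(base_url, text, audio_paths, out_path, timeout=120):
    """Send the request and write the response body to out_path."""
    request = build_tts_request(base_url, text, audio_paths)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return len(audio)
```

Splitting request construction from sending keeps the payload easy to inspect without a running server.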

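Model inference performance & accuracy

| Model | Hardware | Dataset | Batch size | Inference latency (s) | Competitor (A100) latency (s) |
| --- | --- | --- | --- | --- | --- |
| index-tts | Atlas 800T A2 (x86) | 10 characters | 1 | 0.36 | 0.35 |
| index-tts | Atlas 800T A2 (arm) | 10 characters | 1 | 0.48 | 0.35 |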
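
To read the table as relative numbers, each Atlas latency can be divided by the A100 latency. The short calculation below uses only the figures from the table; the variable and function names are illustrative.

```python
# Latencies in seconds, copied from the table above.
A100_SECONDS = 0.35
ATLAS_SECONDS = {
    "Atlas 800T A2 (x86)": 0.36,
    "Atlas 800T A2 (arm)": 0.48,
}


def relative_latency(measured, baseline):
    """Return measured / baseline, rounded to two decimals."""
    return round(measured / baseline, 2)


RATIOS = {
    host: relative_latency(seconds, A100_SECONDS)
    for host, seconds in ATLAS_SECONDS.items()
}

for host, ratio in RATIOS.items():
    print(f"{host}: {ratio:.2f}x the A100 latency")
```

On these figures, the x86 host is about 3% slower than the A100 and the ARM host about 37% slower.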