CosyVoice: a text-to-speech system based on large language models

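Adapts the FunCosyVoice3-2512 model to domestic Ascend 910B-series compute using vLLM-Ascend==0.14.0rc1, vLLM==0.14.1, and Torch_NPU; supports 910C and 910B2/B3/B4. The source is forked from https://github.com/FunAudioLLM/CosyVoice. Scope of changes: adds cosyvoice/utils/device.py; modifies cosyvoice/cli/model.py, cosyvoice/cli/cosyvoice.py, cosyvoice/cli/frontend.py, and cosyvoice/utils/onnx.py. The fork works around the asynchronous error raised on the NPU by the aclnnUnfoldGrad operator behind torch.istft, guards against max_tokens becoming negative and the flow shape being altered, and runs torch.istft on the CPU instead of the NPU. In a 100-run stress test of the FunCosyVoice3-0.5B model on an Ascend NPU, the average RTF was 0.5873, which meets real-time synthesis requirements.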

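🎙️ FunCosyVoice3 for Ascend NPU (Ascend 910B/C)

This project is a heavily customized fork of the official FunAudioLLM/CosyVoice source code, adapted and tuned for domestic Ascend NPU hardware.

It supports the Ascend 910B series (910B2/B3/B4) and the 910C. Generation quality is preserved, while inference stability and real-time performance on domestic hardware are substantially improved.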


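✨ Core features and scope of changes

To run the FunCosyVoice3-2512 model efficiently on Ascend, this project uses Torch_NPU and the vLLM framework for low-level scheduling and operator-level optimization:

🛠️ Code-structure changes

  • Added: cosyvoice/utils/device.py, which provides finer-grained NPU/CPU device scheduling and resource management.
  • Modified:
    • cosyvoice/cli/model.py
    • cosyvoice/cli/cosyvoice.py
    • cosyvoice/cli/frontend.py
    • cosyvoice/utils/onnx.py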
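
The contents of cosyvoice/utils/device.py are not reproduced on this page. As a rough illustration of what NPU/CPU device selection can look like, here is a minimal sketch. The function name `pick_device` and its parameters are hypothetical and are not taken from the repository; the sketch only assumes that importing `torch_npu` registers the `torch.npu` backend.

```python
def pick_device(prefer_npu: bool = True, index: int = 0) -> str:
    """Return 'npu:<index>' when an Ascend NPU is usable, otherwise 'cpu'.

    Hypothetical helper for illustration; not the repository's device.py.
    """
    if not prefer_npu:
        return "cpu"
    try:
        import torch
        import torch_npu  # noqa: F401  (importing registers the torch.npu backend)

        if torch.npu.is_available():
            return f"npu:{index}"
    except Exception:
        # torch or torch_npu is missing, or the CANN runtime failed to load.
        pass
    return "cpu"
```

Falling back to "cpu" rather than raising keeps the same entry point usable on a development machine without an NPU.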

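🔧 Operator- and logic-level fixes (Bug Fixes & Workarounds)

  1. Avoiding asynchronous errors: torch.istft triggers a low-level asynchronous error from the aclnnUnfoldGrad operator on the NPU. The affected path is bypassed and the scheduling around it reworked.
  2. Heterogeneous scheduling: because NPU support for the istft operator is limited, istft calls on the NPU side are intercepted and forced onto the CPU, which keeps audio reconstruction stable and correct.
  3. State fixes: fixed serious logic errors in certain NPU inference scenarios where max_tokens suddenly became negative and the shape of the flow tensor was changed unexpectedly.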
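
The CPU fallback described in item 2 can be sketched as a small wrapper that moves tensor arguments to the CPU, runs the operator there, and moves the result back to the original device. This is an illustrative sketch under stated assumptions, not the repository's actual patch: the name `run_on_cpu` is hypothetical, and the wrapper relies only on the `.device`, `.cpu()`, and `.to()` members that PyTorch tensors provide.

```python
from functools import wraps


def run_on_cpu(op):
    """Wrap `op` so tensor-like arguments are moved to the CPU before the call.

    The result is moved back to the device of the first tensor-like argument.
    Hypothetical helper for illustration only.
    """

    @wraps(op)
    def wrapper(*args, **kwargs):
        origin = None  # device of the first tensor-like argument seen

        def to_cpu(value):
            nonlocal origin
            if hasattr(value, "cpu") and hasattr(value, "device"):
                if origin is None:
                    origin = value.device
                return value.cpu()
            return value  # ints, floats, None, ... pass through unchanged

        cpu_args = [to_cpu(a) for a in args]
        cpu_kwargs = {k: to_cpu(v) for k, v in kwargs.items()}
        result = op(*cpu_args, **cpu_kwargs)
        if origin is not None and hasattr(result, "to"):
            return result.to(origin)
        return result

    return wrapper
```

With PyTorch installed, `cpu_istft = run_on_cpu(torch.istft)` followed by `cpu_istft(spec, n_fft, hop_length, win_length, window=window)` would execute the inverse STFT on the CPU and return the waveform on the device where `spec` lives. The device round trip costs some transfer time, which is the trade-off for avoiding the unsupported NPU operator.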

🚀 Performance

The FunCosyVoice3-0.5B model was stress-tested with sustained, back-to-back inference on an Ascend NPU:

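Metric                       | Result                                    | Conclusion
Test hardware                | Ascend 910B4 32GB                         | -
Load                         | 100 consecutive speech-synthesis requests | Ran stably, with no memory leaks or abnormal interruptions
Average RTF (real-time factor) | 0.5873                                  | Meets basic requirements for real-time voice interaction and synthesis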

Note: an RTF (Real-Time Factor) below 1 means that generating the speech takes less time than playing it back.
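
As a small worked example of the metric, RTF is synthesis time divided by the duration of the audio produced. The sketch below averages per-request RTF values; whether the reported 0.5873 was computed this way or as total synthesis time over total audio time is not stated here, so the averaging convention is an assumption, and the function names are illustrative.

```python
from statistics import mean


def rtf(synthesis_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: time spent synthesizing / duration of the audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def average_rtf(runs):
    """Mean of per-request RTFs; `runs` holds (synthesis_s, audio_s) pairs."""
    return mean([rtf(synth, audio) for synth, audio in runs])
```

For instance, a request that takes 1.0 s to synthesize 2.0 s of audio has an RTF of 0.5, so the audio is produced twice as fast as it plays.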


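📦 Dependencies and installation

This project depends on specific versions of vLLM and the Ascend adaptation packages for best performance. Make sure your environment meets the following requirements:

Core dependency versions

  • vLLM-Ascend == 0.14.0rc1
  • vLLM == 0.14.1
  • Torch_NPU (must match the CANN version in your environment)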
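
To confirm that an environment matches the pinned versions above, a check along the following lines can help. It is a sketch that assumes the packages are installed under the distribution names `vllm` and `vllm-ascend`; the function name `check_pins` is illustrative. A local build suffix such as `+empty` is ignored when comparing, since source builds may carry one.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins taken from the list above; distribution names are an assumption.
PINNED = {"vllm": "0.14.1", "vllm-ascend": "0.14.0rc1"}


def check_pins(pins=PINNED):
    """Map each package name to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None
        # Drop any local version suffix ("+...") before comparing.
        matches = found is not None and found.split("+", 1)[0] == wanted
        report[name] = (found, matches)
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_pins().items():
        print(f"{name}: installed={found} pinned={PINNED[name]} ok={ok}")
```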

Installing from source

  1. git clone https://gitcode.com/AdaMatrix/CosyVoice.git
  2. cd CosyVoice
  3. pip install -r requirements-dev.txt -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host=mirrors.aliyun.com

Downloading the weights

  1. mkdir -p /opt/atomgit/Fun-CosyVoice3-0.5B-2512
  2. nohup modelscope download --model FunAudioLLM/Fun-CosyVoice3-0.5B-2512 --local_dir /opt/atomgit/Fun-CosyVoice3-0.5B-2512 > download.log 2>&1 &
  3. tail -n 100 -f download.log
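
The same download can also be driven from Python through the ModelScope SDK's `snapshot_download` function instead of the CLI. This is a sketch that reuses the model ID and target directory from the commands above; the wrapper name `download_weights` is illustrative, and the SDK is imported lazily so the module loads even where `modelscope` is not installed.

```python
MODEL_ID = "FunAudioLLM/Fun-CosyVoice3-0.5B-2512"
LOCAL_DIR = "/opt/atomgit/Fun-CosyVoice3-0.5B-2512"


def download_weights(model_id: str = MODEL_ID, local_dir: str = LOCAL_DIR) -> str:
    """Download the model snapshot and return the directory it was saved to."""
    # Imported here so that merely importing this module does not need modelscope.
    from modelscope import snapshot_download

    return snapshot_download(model_id, local_dir=local_dir)


if __name__ == "__main__":
    print(download_weights())
```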

🤝 Acknowledgments

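  • The core algorithms and original code framework are inherited from FunAudioLLM/CosyVoice.
  • Thanks to the vLLM and Ascend open-source communities for their support with low-level operator adaptation.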
