File | Last commit | Last updated
feat(torch): Qwen3-Omni support ulysses cp / fix(torch): repeat_kv and activation_offload bug

Co-authored-by: yaoyaoxu<xuyaoyao.824404@huawei.com>

# message auto-generated for no-merge-commit merge:
!2188 merge qwen3omni_ulysses_cp into master

Created-by: yaoyaoxu
Commit-by: yaoyaoxu
Merged-by: ascend-robot

Description:

## Motivation

The currently supported sequence length cannot reach 128K. Context parallelism (CP) must be adapted to support 128K long sequences.

## Modification

1. Qwen3-Omni supports Ulysses CP: the AuT, ViT, and LLM modules have all been adapted. If CP is enabled but no audio data is passed in, or CP size >= seq_len, CP is not applied to the audio module.
2. Fix the repeat_kv bug.
3. Fix the memory leak that occurs when the activation_offload option is enabled.

## Self-test (Optional)

If modifications in this PR may cause or fix function/accuracy/performance DTSs/issues, a self-inspection record needs to be attached.

## BC-breaking (Optional)

If there are compatibility issues, such as dependencies on cann/torch_npu versions, they need to be explained in the PR.

## Checklist

**Before PR**:

- [ ] The new code needs to comply with the Clean Code specification.
- [ ] The PR content is self-checked, clearly expressed, and written in a standardized way.

**After PR**:

- [ ] CLA has been signed and all committers have signed the CLA in this PR.
- [ ] The ci-pipeline is passed, Code Check is passed.

See merge request: Ascend/MindSpeed-MM!2188

Last updated: 2 months ago
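The audio fallback rule in the Modification list (skip CP for the audio module when no audio is passed in or when CP size >= seq_len) can be illustrated with a minimal sketch. This is not the code from the merge request: the function names `should_shard_audio` and `shard_sequence` are hypothetical, `seq_len` is assumed to mean the audio sequence length, and the split is assumed to be contiguous along the sequence dimension.

```python
import math
from typing import List, Optional, Sequence


def should_shard_audio(cp_size: int, audio_seq_len: Optional[int]) -> bool:
    """Decide whether the audio sequence is split across CP ranks.

    Hypothetical helper mirroring the rule in the description:
    no CP for audio if audio is absent or cp_size >= seq_len.
    """
    if cp_size <= 1:
        return False  # CP is not enabled
    if audio_seq_len is None:
        return False  # no audio data was passed in
    if cp_size >= audio_seq_len:
        return False  # too few tokens to give every rank a useful shard
    return True


def shard_sequence(tokens: Sequence[int], cp_size: int, rank: int) -> List[int]:
    """Return the contiguous slice of `tokens` owned by `rank`.

    Assumption: ranks take equal-sized contiguous chunks; when the
    length is not divisible by cp_size the last chunk is shorter.
    """
    chunk = math.ceil(len(tokens) / cp_size)
    return list(tokens[rank * chunk:(rank + 1) * chunk])


if __name__ == "__main__":
    audio = list(range(8))
    cp_size = 4
    if should_shard_audio(cp_size, len(audio)):
        for rank in range(cp_size):
            print(rank, shard_sequence(audio, cp_size, rank))
```

Keeping the decision in one small predicate makes the fallback explicit: when it returns `False`, every rank would process the full audio sequence rather than a shard.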