File | Last commit | Last updated
feat: mxfp8-32x32 quant Co-authored-by: kyle_zhangchi<zhangchi158@huawei.com> # message auto-generated for no-merge-commit merge: !3471 merge feat_mxfp8-32x32 into master feat: mxfp8-32x32 quant Created-by: kyle_zhangchi Commit-by: kyle_zhangchi Merged-by: ascend-robot Description: ## What this PR does / why we need it? Adds an mxfp8-32x32 quantization operator under the Megatron framework to reduce the device memory footprint of weights. ## Does this PR introduce *any* user-facing change? Adds an mxfp8-32x32 option to --fp8-recipe. https://gitcode.com/Ascend/MindSpeed/commit/e065cbca6873bfc02661d088b07d90224333e87d?ref=feat_mxfp8-32x32&prId=3471 ## How was this patch tested? Verification document: https://wiki.huawei.com/domains/170864/wiki/367830/WIKI2026051111046509 See merge request: Ascend/MindSpeed!3471 | 7 days ago
fix(quant): only hif8 add dst_type_max args Co-authored-by: Muu<koimuu@163.com> # message auto-generated for no-merge-commit merge: !3514 merge fix-hif8-tensorwise into master fix(quant): only hif8 add dst_type_max args Created-by: Muuyo Commit-by: Muu Merged-by: ascend-robot Description: fix(quant): only hif8 add dst_type_max args See merge request: Ascend/MindSpeed!3514 | 3 days ago
fix: correct wrong variable name in run_quantizer() Co-authored-by: Keilo_W<wangkaiyu11@h-partners.com> # message auto-generated for no-merge-commit merge: !3479 merge master into master fix: correct wrong variable name in run_quantizer() Created-by: Keilo_W Commit-by: Keilo_W Merged-by: ascend-robot Description: [Code] In mindspeed/te/pytorch/fp8/recipes/delayed_scaling_recipe.py, rename the argument passed inside run_quantizer() from scale to scales. This resolves the error: RuntimeError: npu::npu_quantize() is missing value for argument 'scales'. Declaration: npu::npu_quantize(Tensor input, Tensor scales, Tensor? zero_points, int dtype, int axis=1, bool div_mode=True) -> Tensor See merge request: Ascend/MindSpeed!3479 | 17 days ago
feat: fp8 reuse quant w with te_gmm_mode compatible Co-authored-by: Jia_Austin<dengjia6@huawei.com> # message auto-generated for no-merge-commit merge: !3371 merge fp8_reuse_perf_v2 into master feat: fp8 reuse quant w with te_gmm_mode compatible Created-by: Jia_Austin Commit-by: Jia_Austin Merged-by: ascend-robot Description: What this PR does / why we need it? feat: fp8 reuse quant w with te_gmm_mode compatible; perf/fix: fp8 reuse quant w with te_gmm_mode perf Does this PR introduce any user-facing change? Please describe whether the PR will result in any user-facing usage changes. If there is related documentation, please specify its path. How was this patch tested? Please explain how to verify the correctness and effectiveness of this feature, as well as its usage constraints and limitations. See merge request: Ascend/MindSpeed!3371 | 2 months ago