MindSpeed/mindspeed/te/pytorch/fp8/recipes · Ascend/MindSpeed - AtomGit

ascend-robot: fix(quant): only hif8 add dst_type_max args

File	Last commit	Last updated
__init__.py	feat: mxfp8-32x32 quant. Co-authored-by: kyle_zhangchi <zhangchi158@huawei.com>. Merge request Ascend/MindSpeed!3471 (feat_mxfp8-32x32 into master; created and committed by kyle_zhangchi, merged by ascend-robot). What this PR does: adds an mxfp8-32x32 quantization operator under the Megatron framework to reduce the memory footprint of weights. User-facing change: --fp8-recipe gains a new mxfp8-32x32 option, see https://gitcode.com/Ascend/MindSpeed/commit/e065cbca6873bfc02661d088b07d90224333e87d?ref=feat_mxfp8-32x32&prId=3471. Testing: verification document at https://wiki.huawei.com/domains/170864/wiki/367830/WIKI2026051111046509	7 days ago
current_scaling_recipe.py	fix(quant): only hif8 add dst_type_max args. Co-authored-by: Muu <koimuu@163.com>. Merge request Ascend/MindSpeed!3514 (fix-hif8-tensorwise into master; created by Muuyo, committed by Muu, merged by ascend-robot).	3 days ago
delayed_scaling_recipe.py	fix: correct the wrong variable name in run_quantizer(). Co-authored-by: Keilo_W <wangkaiyu11@h-partners.com>. Merge request Ascend/MindSpeed!3479 (master into master; created and committed by Keilo_W, merged by ascend-robot). Description: [Code] In mindspeed/te/pytorch/fp8/recipes/delayed_scaling_recipe.py, the argument passed inside run_quantizer() is renamed from `scale` to `scales`, which resolves `RuntimeError: npu::npu_quantize() is missing value for argument 'scales'. Declaration: npu::npu_quantize(Tensor input, Tensor scales, Tensor? zero_points, int dtype, int axis=1, bool div_mode=True) -> Tensor`.	17 days ago
float8_block_scaling_recipe.py	feat: fp8 reuse quant w with te_gmm_mode compatible. Co-authored-by: Jia_Austin <dengjia6@huawei.com>. Merge request Ascend/MindSpeed!3371 (fp8_reuse_perf_v2 into master; created and committed by Jia_Austin, merged by ascend-robot). What this PR does: feat: fp8 reuse quant w with te_gmm_mode compatible; perf/fix: fp8 reuse quant w with te_gmm_mode perf. The PR template's user-facing-change and testing sections were left unfilled.	2 months ago
mxfp8_32x32_scaling_recipe.py	feat: mxfp8-32x32 quant. Co-authored-by: kyle_zhangchi <zhangchi158@huawei.com>. Merge request Ascend/MindSpeed!3471 (feat_mxfp8-32x32 into master; created and committed by kyle_zhangchi, merged by ascend-robot). What this PR does: adds an mxfp8-32x32 quantization operator under the Megatron framework to reduce the memory footprint of weights. User-facing change: --fp8-recipe gains a new mxfp8-32x32 option, see https://gitcode.com/Ascend/MindSpeed/commit/e065cbca6873bfc02661d088b07d90224333e87d?ref=feat_mxfp8-32x32&prId=3471. Testing: verification document at https://wiki.huawei.com/domains/170864/wiki/367830/WIKI2026051111046509	7 days ago
mxfp8_scaling_recipe.py	feat: fp8 reuse quant w with te_gmm_mode compatible. Co-authored-by: Jia_Austin <dengjia6@huawei.com>. Merge request Ascend/MindSpeed!3371 (fp8_reuse_perf_v2 into master; created and committed by Jia_Austin, merged by ascend-robot). What this PR does: feat: fp8 reuse quant w with te_gmm_mode compatible; perf/fix: fp8 reuse quant w with te_gmm_mode perf. The PR template's user-facing-change and testing sections were left unfilled.	2 months ago
recipe.py	feat: fp8 reuse quant w with te_gmm_mode compatible. Co-authored-by: Jia_Austin <dengjia6@huawei.com>. Merge request Ascend/MindSpeed!3371 (fp8_reuse_perf_v2 into master; created and committed by Jia_Austin, merged by ascend-robot). What this PR does: feat: fp8 reuse quant w with te_gmm_mode compatible; perf/fix: fp8 reuse quant w with te_gmm_mode perf. The PR template's user-facing-change and testing sections were left unfilled.	2 months ago