| [BugFix] Fix the bug in GBS data balance where state_buffer lacks the key "non_balanced_data".
Co-authored-by: huangdabiao <huangdabiao1@huawei.com>
# message auto-generated for no-merge-commit merge:
!2232 merge gbs_data_balance_bugfix_0302 into master
[BugFix] Fix the bug in GBS data balance where state_buffer lacks the key "non_balanced_data".
Created-by: huangdabiao
Commit-by: huangdabiao
Merged-by: ascend-robot
Description: ## Motivation
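Cause of the bug:
self.state_buffer[data_type]["non_balanced_data"] is initialized as a dict, and used to store the non-iterable objects, only when the batch contains non-iterable objects.
After storing them, the loop
    for data_name in self.state_buffer[data_type]["non_balanced_data"]:
        batch.pop(data_name)
temporarily removes the non-iterable objects from the batch, so that the subsequent balancing step does not fail on data that cannot be split.
However, when the batch contains no non-iterable objects, self.state_buffer[data_type]["non_balanced_data"] is never initialized, so looking up the key "non_balanced_data" in self.state_buffer[data_type] fails.
Why the fix is sound:
This PR adds a check so that self.state_buffer[data_type]["non_balanced_data"] is accessed only when the key "non_balanced_data" exists, which avoids the missing-key error.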
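
The guard described above can be illustrated with a minimal sketch. This is not the actual MindSpeed-MM code: the function name `stash_non_balanced`, the standalone `state_buffer` argument, and the rule that "non-iterable" means "not a list or tuple" are all assumptions made for illustration. Only the key name `"non_balanced_data"` and the `batch.pop(data_name)` loop come from the description.

```python
def stash_non_balanced(state_buffer, data_type, batch):
    """Hypothetical helper: set aside entries that cannot be split across ranks."""
    state = state_buffer.setdefault(data_type, {})

    # Assumption: treat anything that is not a list/tuple as "non-iterable".
    non_iterable = {
        name: value
        for name, value in batch.items()
        if not isinstance(value, (list, tuple))
    }

    # The key is created only when such entries exist (the pre-fix behavior).
    if non_iterable:
        state["non_balanced_data"] = non_iterable

    # The fix: touch the key only if it is present, so a batch without
    # non-iterable entries no longer raises KeyError.
    if "non_balanced_data" in state:
        for data_name in state["non_balanced_data"]:
            batch.pop(data_name)

    return batch
```

Without the `if "non_balanced_data" in state` check, a batch made up entirely of splittable entries would hit the unguarded lookup and raise `KeyError`, which is the failure this PR addresses.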
## Modification
Please briefly describe what modification is made in this PR.
## Self-test (Optional)
If modifications to this PR may cause/fix function/accuracy/performance DTSs/issues, a self-inspection record needs to be attached.
## BC-breaking (Optional)
If there are compatibility issues, such as dependencies on cann/torch_npu versions, they need to be explained in the PR.
## Checklist
**Before PR**:
- [ ] The new code needs to comply with the Clean Code specification.
- [ ] The PR content has been self-checked; the wording is clear and the writing is standardized.
**After PR**:
- [ ] CLA has been signed and all committers have signed the CLA in this PR.
- [ ] The ci-pipeline is passed, Code Check is passed.
See merge request: Ascend/MindSpeed-MM!2232 | 2 months ago |