| [Feature] The vlm model supports two loss calculation methods: token level and sample level
Co-authored-by: zhangxubin<1656631289@qq.com>
# message auto-generated for no-merge-commit merge:
!1557 merge 2.2.0 into 2.2.0
[Feature] The vlm model supports two loss calculation methods: token level and sample level
Created-by: MoCuishle-M
Commit-by: MoCuishle-M;zhangxubin
Merged-by: ascend-robot
Description: ## Motivation
The vlm model supports two loss calculation methods: token level and sample level.
Enable token-level loss calculation using '--calculate-per-token-loss'.
Enable sample-level loss calculation using '--calculate-per-sample-loss'.
Perform argument validation to prevent both '--calculate-per-token-loss' and '--calculate-per-sample-loss' from being enabled simultaneously.
When neither of these two arguments is enabled, the loss computation behavior remains consistent with the current implementation.
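
The mutual-exclusion check described above could look roughly like the following. This is a minimal sketch, not the code from this PR: only the two flag names come from the description, while `build_parser`, `validate_loss_args`, and the returned mode strings are hypothetical names used for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Register the two loss flags named in the PR description."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--calculate-per-token-loss", action="store_true")
    parser.add_argument("--calculate-per-sample-loss", action="store_true")
    return parser


def validate_loss_args(args: argparse.Namespace) -> str:
    """Return the selected loss mode, rejecting the case where both flags are set.

    The mode strings ("token", "sample", "default") are hypothetical labels.
    """
    if args.calculate_per_token_loss and args.calculate_per_sample_loss:
        raise ValueError(
            "--calculate-per-token-loss and --calculate-per-sample-loss "
            "cannot be enabled at the same time"
        )
    if args.calculate_per_token_loss:
        return "token"
    if args.calculate_per_sample_loss:
        return "sample"
    # Neither flag set: keep the pre-existing loss behavior.
    return "default"
```

With this shape, passing neither flag falls through to the existing behavior, and passing both fails early at argument-parsing time rather than during training.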
## Modification
The vlm model supports two loss calculation methods: token level and sample level
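
To make the difference between the two reductions concrete, here is a small pure-Python sketch. It assumes the common definitions (token level: average over all valid tokens in the batch; sample level: average each sample over its own valid tokens, then average across samples). The PR description does not spell out the exact formulas, so treat the function names and the masking convention as assumptions.

```python
from typing import List


def token_level_loss(losses: List[List[float]], masks: List[List[int]]) -> float:
    """Sum the masked per-token losses over the batch, divide by the valid-token count."""
    total = sum(l * m for row_l, row_m in zip(losses, masks) for l, m in zip(row_l, row_m))
    count = sum(m for row in masks for m in row)
    return total / count if count else 0.0


def sample_level_loss(losses: List[List[float]], masks: List[List[int]]) -> float:
    """Average each sample over its own valid tokens, then average across samples."""
    per_sample = []
    for row_l, row_m in zip(losses, masks):
        n_valid = sum(row_m)
        if n_valid == 0:
            continue  # skip samples with no valid tokens
        per_sample.append(sum(l * m for l, m in zip(row_l, row_m)) / n_valid)
    return sum(per_sample) / len(per_sample) if per_sample else 0.0


# Two samples: one with four valid tokens, one with a single valid token.
losses = [[1.0, 1.0, 1.0, 1.0], [4.0, 0.0, 0.0, 0.0]]
masks = [[1, 1, 1, 1], [1, 0, 0, 0]]
token_loss = token_level_loss(losses, masks)    # (4.0 + 4.0) / 5 valid tokens
sample_loss = sample_level_loss(losses, masks)  # (1.0 + 4.0) / 2 samples
```

Under these assumptions, token-level reduction weights long samples more heavily, while sample-level reduction gives every sample equal weight regardless of length.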
## Self-test (Optional)
If modifications to this PR may cause/fix function/accuracy/performance DTSs/issues, a self-inspection record needs to be attached.
## BC-breaking (Optional)
If there are compatibility issues, such as dependencies on cann/torch_npu versions, they need to be explained in the PR.
## Checklist
**Before PR**:
- [ ] The new code needs to comply with the Clean Code specification.
- [ ] The PR content has been self-checked; the wording is clear and the writing is standardized.
**After PR**:
- [ ] CLA has been signed and all committers have signed the CLA in this PR.
- [ ] The CI pipeline and Code Check have both passed.
See merge request: Ascend/MindSpeed-MM!1557 | 7 months ago |