| docs: Add comprehensive docstrings for core modules
Co-authored-by: wangjiangben <wangjiangben@huawei.com>
# message auto-generated for no-merge-commit merge:
!4397 merge feature/add-docstrings into master
Created-by: wangjiangben
Commit-by: wangjiangben
Merged-by: ascend-robot
Description:
## Summary
This PR adds detailed English docstrings for key functions and classes across multiple core modules to improve code documentation and maintainability.
## Changes
### Core Modules (mindspeed_llm/core/)
- **context_parallel**: Add docstrings for context parallel attention and wrapper functions
  - CPDotProductAttention: Context parallel dot product attention implementation
  - attention_init_wrapper: Attention initialization with Ulysses and hybrid CP support
- **datasets**: Add docstrings for dataset building utilities
  - need_to_build_dataset: Determine which ranks need to build datasets
  - build_generic_dataset: Build distributed datasets
- **distributed**: Add docstrings for gradient sync and buffer management
  - start_grad_sync_wrapper: Gradient synchronization with distributed optimizer support
  - recover_gradient_scaling_factors: Restore gradient scaling factors
- **models**: Add docstrings for GPT layer specifications
  - get_gpt_layer_local_spec_wrapper: GPT layer spec with custom normalization
  - build_layers_wrapper: Layer building with MC2 optimization for MoE
- **parallel_state**: Add docstrings for parallel initialization
  - initialize_model_parallel_decorator: Model parallel initialization with expert parallel support
- **transformer**: Add docstrings for transformer block functions
  - get_num_layers_to_build: Calculate layers for current pipeline stage
  - get_layer_offset_wrapper: Layer offset with custom distribution support
  - transformer_block_init_wrapper: TransformerBlock initialization
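
To illustrate what "layers for current pipeline stage" and "layer offset" mean in the transformer entries above, here is a minimal sketch. It assumes the simplest case, where layers divide evenly across pipeline stages; the function names `layers_per_stage` and `layer_offset` are hypothetical and are not the MindSpeed-LLM implementations, which also support custom distributions.

```python
def layers_per_stage(num_layers: int, pipeline_size: int) -> int:
    """Return how many layers each pipeline stage builds (even split only).

    Hypothetical helper for illustration; assumes num_layers is divisible
    by pipeline_size.
    """
    if pipeline_size <= 0:
        raise ValueError("pipeline_size must be positive")
    if num_layers % pipeline_size != 0:
        raise ValueError("num_layers must be divisible by pipeline_size")
    return num_layers // pipeline_size


def layer_offset(stage_rank: int, num_layers: int, pipeline_size: int) -> int:
    """Return the global index of the first layer owned by stage_rank."""
    if not 0 <= stage_rank < pipeline_size:
        raise ValueError("stage_rank out of range")
    return stage_rank * layers_per_stage(num_layers, pipeline_size)
```

With 32 layers and 4 stages, each stage builds 8 layers, and stage 2 starts at global layer index 16.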
### Operators (mindspeed_llm/ops/)
- **triton**: Add docstrings for NPU optimization functions
  - get_npu_properties: Get NPU device properties
  - rms_norm_ref: Reference implementation of RMS normalization with gating
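
For readers unfamiliar with gated RMS normalization, the following is a rough pure-Python sketch of the idea. It is not the `rms_norm_ref` code from this repository: the name `rms_norm_gated`, the choice of SiLU as the gate activation, and applying the gate after normalization are all assumptions made for illustration.

```python
import math


def silu(v: float) -> float:
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def rms_norm_gated(x, weight, z=None, eps=1e-6):
    """Normalize x by its root mean square, scale by weight, optionally gate.

    Assumption: the gate z is passed through SiLU and multiplied in after
    normalization. The real reference implementation may order this differently.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    out = [v * inv_rms * w for v, w in zip(x, weight)]
    if z is not None:
        out = [o * silu(g) for o, g in zip(out, z)]
    return out
```

Without a gate, this reduces to plain RMS normalization: each element is divided by `sqrt(mean(x^2) + eps)` and multiplied by its weight.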
### Transformer Engine (mindspeed_llm/te/)
- Add docstrings for transformer engine attention
  - do_kvallgather_context_parallel: Context parallel attention with KV AllGather strategy
### Training (mindspeed_llm/training/)
- **arguments**: Add docstrings for argument parsing
  - extra_args_provider_decorator: Add MindSpeed-LLM specific arguments
  - parse_args_decorator: Parse arguments with MindSpeed-LLM processing
  - core_transformer_config_from_args_wrapper: Create TransformerConfig with extensions
  - validate_args_v2_decorator: Validate arguments with MindSpeed-LLM extensions
- **checkpointing**: Add docstrings for checkpoint management
  - _load_base_checkpoint_wrapper: Load checkpoint with LoRA support
  - load_checkpoint_wrapper: Load checkpoint with loose loading support
- **initialize**: Add docstrings for initialization
  - _compile_dependencies: Compile dataset index builder dependencies
- **training**: Add docstrings for training utilities
  - _enable_npu_datadump_step_end: Enable NPU data dump
  - update_save_checkpoint_chmod: Update checkpoint file permissions
- **utils**: Add docstrings for utility functions
  - _disable_gc: Context manager to disable garbage collection
  - temporal_async_caller_schedule_async_call: Schedule async call with GC disabled
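
As a rough illustration of the "context manager to disable garbage collection" pattern named in the utils entries above, here is a minimal sketch built on the standard `gc` and `contextlib` modules. The name `disable_gc` is hypothetical, and restoring the previous collector state on exit is an assumption; the repository's `_disable_gc` may behave differently.

```python
import gc
from contextlib import contextmanager


@contextmanager
def disable_gc():
    """Disable the cyclic garbage collector for the duration of the block.

    Assumption: the collector is re-enabled on exit only if it was enabled
    on entry, so nested or pre-disabled use is left untouched.
    """
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()
```

A caller would wrap the sensitive section, for example `with disable_gc(): schedule_call()`, where `schedule_call` stands in for whatever work should not be interrupted by a collection pass.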
## Documentation Standards
All docstrings follow Python standard format:
- Brief description of function/class purpose
- Args: Parameter descriptions with types
- Returns: Return value description
- Note: Important usage notes and constraints (where applicable)
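
A docstring with the four parts listed above might look like the sketch below. The function `scale_values` is a made-up example, not code from this PR, and the exact indentation and type notation are assumptions in the spirit of Google-style docstrings.

```python
from __future__ import annotations


def scale_values(values: list[float], factor: float = 1.0) -> list[float]:
    """Multiply every value by a constant factor.

    Args:
        values (list[float]): Input numbers to scale.
        factor (float): Multiplier applied to each value. Defaults to 1.0.

    Returns:
        list[float]: A new list with each input multiplied by ``factor``.

    Note:
        The input list is not modified.
    """
    return [v * factor for v in values]
```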
## Statistics
- **Files changed**: 13
- **Lines added**: 443
- **Lines removed**: 6
## Testing
- All docstrings are written in English
- Docstrings accurately describe function behavior
- No functional code changes, only documentation improvements
## Related Issues
Improves code documentation and developer experience for MindSpeed-LLM core modules.
See merge request: Ascend/MindSpeed-LLM!4397 | 1 month ago |