File    Last commit    Last updated
SDMA adaptation for A5 (2 months ago)

Co-authored-by: linzhao<1078978268@qq.com>
Co-authored-by: chenyz6<chenxin146@huawei.com>

# message auto-generated for no-merge-commit merge:
!307 merge develop_A5 into develop

SDMA adaptation for A5

Created-by: chenyz6
Commit-by: linzhao;chenyz6
Merged-by: yrewzjsx

Description:

## Description
1. A2 supports allocating HBM through the vmm interface
2. shmem supports memory reservation at 2M granularity
3. shmem supports A5 standard cards

## Type of Change
- [ ] Bug fix
- [x] New feature
- [ ] Performance optimization
- [ ] Documentation update
- [ ] Other (please describe):_______________

## Testing
shmem A2/A3 ok
![image.png](https://raw.gitcode.com/user-images/assets/7672916/ce39c38e-2828-4cc0-b9fb-45049f7bc90d/image.png 'image.png')
trans A3 ok
![image.png](https://raw.gitcode.com/user-images/assets/7672916/badb9901-d08c-46a2-8968-72f743198ec5/image.png 'image.png')
bm A3 ok
![image.png](https://raw.gitcode.com/user-images/assets/7672916/fac45c60-a6ba-4c93-a7d3-8612c2990e15/image.png 'image.png')

## Checklist
- [ ] I have performed a self-review of my own code.
- [ ] I have updated the documentation.
- [ ] I have added tests to prove my changes are effective.

See merge request: Ascend/memfabric_hybrid!307
move transfer performance examples to benchmark (29 days ago)

Co-authored-by: wonder1121<wangdan@huawei.com>

# message auto-generated for no-merge-commit merge:
!564 merge develop_transfer into develop

move transfer performance examples to benchmark

Created-by: wonder1121
Commit-by: wonder1121
Merged-by: yrewzjsx

Description:

==================================================Trans Test Start==================================================
Test completed: latency 67.45us, block size 32KB, total threads=2, per-thread times=100, aggregated throughput 6.81 GB/s
Test completed: latency 65.35us, block size 64KB, total threads=2, per-thread times=100, aggregated throughput 14.60 GB/s
Test completed: latency 65.62us, block size 128KB, total threads=2, per-thread times=100, aggregated throughput 23.32 GB/s
Test completed: latency 75.19us, block size 256KB, total threads=2, per-thread times=100, aggregated throughput 24.00 GB/s
Test completed: latency 66.34us, block size 512KB, total threads=2, per-thread times=100, aggregated throughput 24.32 GB/s
Test completed: latency 84.17us, block size 1024KB, total threads=2, per-thread times=100, aggregated throughput 24.42 GB/s
Test completed: latency 95.28us, block size 2048KB, total threads=2, per-thread times=100, aggregated throughput 24.47 GB/s
Test completed: latency 175.10us, block size 4096KB, total threads=2, per-thread times=100, aggregated throughput 24.49 GB/s
Test completed: latency 346.41us, block size 8192KB, total threads=2, per-thread times=100, aggregated throughput 24.50 GB/s
Test completed: latency 691.24us, block size 16384KB, total threads=2, per-thread times=100, aggregated throughput 24.50 GB/s
==================================================Test End==================================================

See merge request: Ascend/memfabric_hybrid!564
!1112 Update license (6 months ago)
!1145 Unify Python wheel packages (6 months ago)