[AMD] Introduce an OptimizeLDSUsage pass (#3730)
This PR introduces the OptimizeLDSUsage pass, which generalizes the
LDS optimization that was previously part of the
DecomposeUnsupportedLayouts pass.
Overall, it tries to reduce the LDS usage of convert ops by adding an
intermediate layout to the conversion.
---------
Co-authored-by: Lei Zhang <antiagainst@gmail.com>
[BUILD] Prepare for future CUDA updates using more flexible configurations (#4632)
This PR addresses issues arising from recent updates to NVIDIA's CUDA
packages in the conda channel. The key changes are as follows:
- Switched from a .txt file to a .json file for specifying versions
of individual CUDA packages. This change aligns with NVIDIA's
[component version table](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-toolkit-major-component-versions),
where major updates might include packages from different versions
(e.g., CUDA 12.6 includes packages from versions 12.6.20 and 12.6.37).
- Some packages have been split further. For instance, ptxas was
previously part of cuda-nvcc but is now provided in a separate package
named cuda-nvcc-tools.
- Certain packages now have platform-dependent directories after
extraction from the tar file.
Example: The include path for cudacrt should be set as
targets/<platform-name>/include.
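
A minimal sketch of how a build script might consume per-package versions from a .json file and build the platform-dependent include path described above. The JSON layout, the helper names `load_versions` and `include_dir`, the extraction root, and the `example-platform` placeholder are all assumptions for illustration; they are not taken from this PR's actual files.

```python
import json
from pathlib import PurePosixPath

# Hypothetical version file contents: one version string per CUDA package.
# The package names come from the description above; the schema is assumed.
EXAMPLE_VERSIONS_JSON = """
{
  "cuda-nvcc-tools": "12.6.20",
  "cudacrt": "12.6.37"
}
"""


def load_versions(text):
    """Parse a JSON object mapping package name -> version string."""
    versions = json.loads(text)
    if not isinstance(versions, dict):
        raise ValueError("expected a JSON object of package versions")
    for name, version in versions.items():
        if not isinstance(version, str):
            raise ValueError(f"version for {name!r} must be a string")
    return versions


def include_dir(extract_root, platform_name):
    """Build <extract_root>/targets/<platform_name>/include."""
    return PurePosixPath(extract_root) / "targets" / platform_name / "include"


versions = load_versions(EXAMPLE_VERSIONS_JSON)
# "example-platform" stands in for the real <platform-name> directory.
crt_include = include_dir("/tmp/cudacrt", "example-platform")
print(versions["cuda-nvcc-tools"], crt_include)
```

Keeping one version per package, rather than a single toolkit version, lets a build pin ptxas and cudacrt independently when a CUDA release mixes package versions.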