v4.0 update. (#2371)

Junkai-Wu
2025-06-06 14:39:20 +08:00
committed by GitHub
parent 2e2af190bd
commit 8bdbfca682
254 changed files with 29751 additions and 1980 deletions

@@ -42,7 +42,7 @@ Core CuTe DSL Abstractions
- **Atoms** Represent fundamental hardware operations like matrix multiply-accumulate (MMA) or memory copy.
- **Tiled Operations** Define how atoms are applied across thread blocks and warps (e.g., ``TiledMma``, ``TiledCopy``).
-For more on CuTe abstractions, refer to the `CuTe C++ library documentation <https://github.com/NVIDIA/cutlass/blob/main/media/docs/cute/00_quickstart.md>`__.
+For more on CuTe abstractions, refer to the `CuTe C++ library documentation <https://github.com/NVIDIA/cutlass/blob/main/media/docs/cpp/cute/00_quickstart.md>`__.
**Pythonic Kernel Expression**