b1d6e2c9b3
v4.3 update. ( #2709 )
...
* v4.3 update.
* Update the cute_dsl_api changelog's doc link
* Update version to 4.3.0
* Update the example link
* Update doc to encourage user to install DSL from requirements.txt
---------
Co-authored-by: Larry Wu <larwu@nvidia.com >
2025-10-21 14:26:30 -04:00
8825e8be4f
Add required changes for github pipeline. ( #2648 )
2025-09-17 22:22:45 -04:00
7817e47154
Fixed a typo in pipeline description docs. ( #2623 )
2025-09-15 22:32:27 -04:00
25ccb875b8
Fix: a calculation error in the example of dividing out in the 02_layout_algebra doc ( #2635 )
2025-09-15 22:31:33 -04:00
29c1ad704a
Fix doc cute 03_tensor.md link typo ( #2627 )
...
* Update 03_tensor.md fix link typo
change path to relative path
* Update 03_tensor.md
---------
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com >
2025-09-15 22:26:43 -04:00
6a35b4d22f
v4.2 tag release. ( #2638 )
2025-09-15 12:21:53 -04:00
a49a78ffef
v4.2 release. ( #2587 )
...
* Fix default cluster callback values to 1 to avoid profiler failure when these values are not set on the command line.
* v4.2 release.
2025-08-22 18:11:24 -04:00
5b76420d6a
[DOC] Add more exposition to composition example ( #2536 )
...
* Add more exposition to composition example
* Apply suggestions from code review
Co-authored-by: Cris Cecka <ccecka@users.noreply.github.com >
---------
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com >
Co-authored-by: Cris Cecka <ccecka@users.noreply.github.com >
2025-08-11 22:20:36 -04:00
86cf63e2d4
NIT: Grammar ( #2537 )
2025-08-10 22:42:45 -04:00
23139309e9
Fix incorrect K dim in CuTe MMA Atom doc. ( #2544 )
2025-08-10 22:40:56 -04:00
3b054767b3
Fix typo ( #2514 )
2025-07-30 22:14:54 -04:00
a39cf6b511
Fix example in CuTe tutorials ( #2416 )
2025-07-30 22:11:47 -04:00
fd6cfe1ed0
v4.1 release update v2. ( #2481 )
2025-07-21 22:03:55 -04:00
9892624b66
Fix typos in the text ( #2417 )
2025-07-16 21:51:12 -04:00
a1aaf2300a
v4.1 release
2025-07-03 08:07:53 -04:00
8bdbfca682
v4.0 update. ( #2371 )
2025-06-06 02:39:20 -04:00
9354bfd7c1
Keep the documentation consistent with the sgemm_1.cu code. ( #2285 )
...
* Keep the documentation consistent with the sgemm_1.cu code.
* fix typo
---------
Co-authored-by: zky <zky@126.com >
2025-05-19 22:53:15 -04:00
5e9b8e2a25
fix docx ( #2290 )
...
Co-authored-by: xiayongqiang <xiayq1@chinatelecom.cn >
2025-05-19 22:52:37 -04:00
f115c3f854
Release v4.0.0 ( #2294 )
2025-05-13 15:55:29 -04:00
331a1f5b3f
cutlass 3.9 update ( #2255 )
...
* cutlass 3.9 update
* rebase
* fixes out of shared memory for blockwise Blackwell
* doc format
* fix issue 2253
* disable host ref by default
* fix sm120 smem capacity
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com >
Co-authored-by: Haicheng Wu <haichengw@nvidia.com >
2025-04-24 15:42:40 -04:00
bb4dd682dd
Fix broken links and alt text in cluster launch control docs ( #2234 )
...
* Fix broken links in cluster launch control docs
* Improve titles and alt text
2025-04-21 00:01:12 -04:00
5e497243f7
fix: fig link in cute docs ( #2216 )
2025-04-10 14:51:41 -04:00
dd76dec4ef
[Doc] Make C++ code more plausible ( #2156 )
...
Co-authored-by: Haicheng Wu <haichengw@nvidia.com >
2025-04-10 14:35:46 -04:00
09df6ac464
[Doc] fix typo ( #2174 )
...
Co-authored-by: wenju.li <wenju.li@deepctr.cn >
Co-authored-by: Haicheng Wu <haichengw@nvidia.com >
2025-04-10 12:46:53 -04:00
79fc51f4b8
v3.9 update ( #2213 )
...
Co-authored-by: yuzhai <yuzhai@nvidia.com >
2025-04-03 02:10:16 -04:00
6f4921858b
v3.9 update ( #2203 )
...
* v3.9 update
* voidD
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com >
2025-04-02 15:11:18 -04:00
62750a2b75
v3.9 ( #2185 )
...
* v3.8 update x
* fix blackwell gg
* doc change
* doc change
* doc change
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com >
Co-authored-by: Haicheng Wu <haichengw@nvidia.com >
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com >
2025-03-21 01:52:23 -04:00
3fe62887d8
adding blackwell ( #2143 )
2025-03-17 22:20:40 -04:00
bd03b22f64
fix typo ( #2136 )
...
Co-authored-by: XiaoDong <xiaod@nvidia.com >
2025-03-17 22:19:43 -04:00
b84e9802d8
update 3.8 v2 ( #2112 )
...
* update 3.8 v2
* update 3.8
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com >
2025-02-19 22:03:14 -05:00
0642d46dd4
Update 0x_gemm_tutorial.md ( #2090 )
2025-02-10 16:46:43 -05:00
833f6990e0
v3.8.0 update ( #2082 )
...
* 3.8 update
* fix Markus' name
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com >
2025-02-06 21:33:40 -05:00
cc19d4d22b
fix a readme broken link ( #2069 )
2025-01-28 18:03:34 -05:00
389e493055
CUTLASS 3.8 Release ( #2059 )
...
* CUTLASS 3.8 Release
* update
* Update README.md
* Revert "Update README.md"
This reverts commit b353e36fe8.
* update
* update
---------
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com >
Co-authored-by: Haicheng Wu <haichengw@nvidia.com >
2025-01-25 02:44:06 -05:00
b78588d163
CUTLASS 3.7 ( #2045 )
...
* CUTLASS 3.7
* clean up changelog
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com >
Co-authored-by: Haicheng Wu <haichengw@nvidia.com >
2025-01-18 09:53:07 -05:00
cffd5d32b7
Update 0x_gemm_tutorial.md ( #1982 )
...
Shouldn't this be BLK_M, BLK_**K**, k?
2025-01-06 22:04:35 -05:00
3d261a5974
3.6.0 update ( #2005 )
...
* 3.6.0 update
* doc and swap stuff
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com >
Co-authored-by: Haicheng Wu <haichengw@nvidia.com >
2024-12-25 01:34:40 -05:00
33c584364e
Fix CuTe README Typo ( #1951 )
2024-12-10 22:05:40 -05:00
e5f3caf145
Fix README ( #1658 )
...
* Fix README
* Improve README
---------
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com >
2024-10-23 12:52:43 -04:00
ea69cc2849
fix typo ( #1853 )
2024-10-23 12:45:28 -04:00
cc3c29a81a
CUTLASS 3.6.0 ( #1850 )
...
* v3.6
* update changelog
* update readme
* fix typo
* fixing typos
* hopper gemm with weight prefetch
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com >
Co-authored-by: Haicheng Wu <haichengw@nvidia.com >
2024-10-09 15:33:27 -04:00
b27c49e84a
Fix cute doc ( #1529 )
2024-10-07 12:38:32 -04:00
8b2a0408bd
Profiler docs and argument update for raster order ( #1667 )
2024-07-31 16:40:10 -04:00
be60a0b272
CUTLASS 3.5.1 ( #1623 )
...
* CUTLASS 3.5.1
* updates, optimizations, fixes
2024-07-29 08:46:24 -04:00
843adf0408
Fix SMEM index for C in CuTe examples ( #1477 )
2024-07-10 11:14:15 -04:00
2448bb56e6
Update gemm_api_3x.md ( #1386 )
...
Fixed what seems to be an obvious typo.
2024-07-10 10:59:02 -04:00
033d9efd2d
[Documentation] Fixes the confusion between concatenated vs. composed layout in CuTe documentation ( #1498 )
...
* Update 02_layout_algebra.md
* Update 02_layout_algebra.md
2024-05-02 15:35:12 -04:00
acc3ee18a1
Fix typos in cute docs ( #1486 )
...
* fix typos in 02_layout_algebra.md
* fix typos in 03_tensor.md
2024-05-02 15:34:36 -04:00
7d49e6c7e2
Updates for CUTLASS 3.5.0 ( #1468 )
2024-04-11 21:33:40 -04:00
a40e08e9d5
Update 02_layout_algebra.md ( #1451 )
...
change line 348 to reflect correct layout.
2024-04-10 10:57:57 -04:00