3fe62887d8
adding blackwell ( #2143 )
2025-03-17 22:20:40 -04:00
bd03b22f64
fix typo ( #2136 )
Co-authored-by: XiaoDong <xiaod@nvidia.com>
2025-03-17 22:19:43 -04:00
b84e9802d8
update 3.8 v2 ( #2112 )
* update 3.8 v2
* update 3.8
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com>
2025-02-19 22:03:14 -05:00
0642d46dd4
Update 0x_gemm_tutorial.md ( #2090 )
2025-02-10 16:46:43 -05:00
833f6990e0
v3.8.0 update ( #2082 )
* 3.8 update
* fix Markus' name
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com>
2025-02-06 21:33:40 -05:00
cc19d4d22b
fix a readme broken link ( #2069 )
2025-01-28 18:03:34 -05:00
389e493055
CUTLASS 3.8 Release ( #2059 )
* CUTLASS 3.8 Release
* update
* Update README.md
* Revert "Update README.md"
This reverts commit b353e36fe8.
* update
* update
---------
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2025-01-25 02:44:06 -05:00
b78588d163
CUTLASS 3.7 ( #2045 )
* CUTLASS 3.7
* clean up changelog
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2025-01-18 09:53:07 -05:00
cffd5d32b7
Update 0x_gemm_tutorial.md ( #1982 )
Shouldn't this be BLK_M, BLK_**K**, k
2025-01-06 22:04:35 -05:00
3d261a5974
3.6.0 update ( #2005 )
* 3.6.0 update
* doc and swap stuff
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2024-12-25 01:34:40 -05:00
33c584364e
Fix CuTe README Typo ( #1951 )
2024-12-10 22:05:40 -05:00
e5f3caf145
Fix README ( #1658 )
* Fix README
* Improve README
---------
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com>
2024-10-23 12:52:43 -04:00
ea69cc2849
fix typo ( #1853 )
2024-10-23 12:45:28 -04:00
cc3c29a81a
CUTLASS 3.6.0 ( #1850 )
* v3.6
* update changelog
* update readme
* fix typo
* fixing typos
* hopper gemm with weight prefetch
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2024-10-09 15:33:27 -04:00
b27c49e84a
Fix cute doc ( #1529 )
2024-10-07 12:38:32 -04:00
4e5a8f6853
3.5.1 plots and updated readme ( #1708 )
Co-authored-by: dePaul Miller <23461061+depaulmillz@users.noreply.github.com>
2024-08-12 18:55:55 -04:00
8b2a0408bd
Profiler docs and argument update for raster order ( #1667 )
2024-07-31 16:40:10 -04:00
be60a0b272
CUTLASS 3.5.1 ( #1623 )
* CUTLASS 3.5.1
* updates, optimizations, fixes
2024-07-29 08:46:24 -04:00
843adf0408
Fix SMEM index for C in CuTe examples ( #1477 )
2024-07-10 11:14:15 -04:00
2448bb56e6
Update gemm_api_3x.md ( #1386 )
Fixed what seems to be an obvious typo.
2024-07-10 10:59:02 -04:00
033d9efd2d
[Documentation] Fixes the confusion between concatenated vs. composed layout in CuTe documentation ( #1498 )
* Update 02_layout_algebra.md
* Update 02_layout_algebra.md
2024-05-02 15:35:12 -04:00
acc3ee18a1
Fix typos in cute docs ( #1486 )
* fix typos in 02_layout_algebra.md
* fix typos in 03_tensor.md
2024-05-02 15:34:36 -04:00
7d49e6c7e2
Updates for CUTLASS 3.5.0 ( #1468 )
2024-04-11 21:33:40 -04:00
a40e08e9d5
Update 02_layout_algebra.md ( #1451 )
change line 348 to reflect correct layout.
2024-04-10 10:57:57 -04:00
8f7d2789b8
[NFC] improve doc: fix typo in mma doc ( #1417 )
2024-03-27 14:07:20 -04:00
629f4653c3
CUTLASS 3.5.0 ( #1411 )
2024-03-19 17:51:04 -04:00
ffa34e7075
(NFC) improve doc: Add missing verb to sentence ( #1377 )
Co-authored-by: lorenzo chelini <lchelini@nvidia.com>
2024-03-04 15:30:10 -05:00
751eb9a885
Update license year ( #1306 )
2024-01-16 14:37:22 -05:00
2f589ffa76
Updates for 3.4 release. ( #1305 )
2024-01-16 13:42:51 -05:00
8236f30675
CUTLASS 3.4.0 ( #1286 )
* CUTLASS 3.4.0
* Update CHANGELOG.md
---------
Co-authored-by: Pradeep Ramani <prramani@nvidia.com>
2023-12-29 15:21:31 -05:00
f188f9b709
Fix typo in quickstart.md ( #1257 )
2023-12-07 09:49:52 -05:00
1d7f2a207e
Fix several broken links ( #1168 )
Co-authored-by: isaacw <isaacw@nvidia.com>
2023-11-03 00:01:25 -04:00
557be3ab0e
Fix several typos ( #1169 )
Co-authored-by: isaacw <isaacw@nvidia.com>
2023-11-02 23:54:46 -04:00
c008b4aea8
CUTLASS 3.3.0 ( #1167 )
* Release 3.3.0
Adds support for mixed precision GEMMs on Hopper and Ampere
Adds support for < 16B aligned GEMMs on Hopper
Enhancements to EVT
Enhancements to Python interface
Enhancements to Sub-byte type handling in CuTe
Several other bug-fixes and performance improvements.
* minor doc update
2023-11-02 11:09:05 -04:00
fb10fa5308
Fix broken pipeline link in docs ( #1143 )
2023-10-18 12:55:46 -04:00
90d3b0fb18
CUTLASS 3.2.1 ( #1113 )
* Updates for 3.2.1 release.
* Minor fix in gemm op profiler for raster order.
* Add scheduler mapping for raster order in the kernels.
2023-09-26 17:24:26 -04:00
3930f709ce
Fix typo in 0x_gemm_tutorial.md ( #1035 )
2023-08-17 10:52:20 -04:00
4575443d44
CUTLASS 3.2 ( #1024 )
* CUTLASS 3.2
2023-08-07 20:50:32 -04:00
9b923dd4c4
fix minor typos ( #984 )
2023-07-05 09:23:01 -04:00
fde824af21
Update Hopper performance plot for CUTLASS 3.1 + CTK 12.1 ( #967 )
2023-06-01 14:52:40 -04:00
f079619f5e
More updates for 3.1 ( #958 )
* Updates for 3.1
* Minor change
* doc link fix
* Minor updates
2023-05-24 10:17:16 -04:00
6fbc0d3380
Update layout.md
2023-05-17 20:12:58 -04:00
e2953d47c5
Update gemm_api.md
2023-05-12 15:37:31 -04:00
7c04f95415
Updates for 3.1 ( #932 )
2023-04-29 09:34:27 -04:00
54bebe417d
Fix some typos in CuTe tutorials ( #912 )
2023-04-17 16:00:51 -04:00
d572cc1aab
CUTLASS 3.1 ( #915 )
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2023-04-14 23:19:34 -04:00
0964bdb64c
update gemm and conv2d cmdline --help output ( #878 )
2023-04-01 11:38:13 -04:00
7e370c9637
Fix typos 2 ( #842 )
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com>
2023-03-09 23:22:56 -05:00
c4f6b8c6bc
Updates for 3.0 ( #857 )
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2023-03-09 15:27:40 -05:00
a101ac283f
Fix some typos ( #791 )
* fix typo
* fix a dead link to code
2023-02-16 15:56:55 -05:00