121 Commits

Author SHA1 Message Date
a40e08e9d5 Update 02_layout_algebra.md (#1451)
change line 348 to reflect correct layout.
2024-04-10 10:57:57 -04:00
8f7d2789b8 [NFC] improve doc: fix typo in mma doc (#1417) 2024-03-27 14:07:20 -04:00
629f4653c3 CUTLASS 3.5.0 (#1411) 2024-03-19 17:51:04 -04:00
ffa34e7075 (NFC) improve doc: Add missing verb to sentence (#1377)
Co-authored-by: lorenzo chelini <lchelini@nvidia.com>
2024-03-04 15:30:10 -05:00
751eb9a885 Update license year (#1306) 2024-01-16 14:37:22 -05:00
2f589ffa76 Updates for 3.4 release. (#1305) 2024-01-16 13:42:51 -05:00
8236f30675 CUTLASS 3.4.0 (#1286)
* CUTLASS 3.4.0

* Update CHANGELOG.md

---------

Co-authored-by: Pradeep Ramani <prramani@nvidia.com>
2023-12-29 15:21:31 -05:00
f188f9b709 Fix typo in quickstart.md (#1257) 2023-12-07 09:49:52 -05:00
1d7f2a207e Fix several broken links (#1168)
Co-authored-by: isaacw <isaacw@nvidia.com>
2023-11-03 00:01:25 -04:00
557be3ab0e Fix several typos (#1169)
Co-authored-by: isaacw <isaacw@nvidia.com>
2023-11-02 23:54:46 -04:00
c008b4aea8 CUTLASS 3.3.0 (#1167)
* Release 3.3.0

Adds support for mixed precision GEMMs On Hopper and Ampere
Adds support for < 16B aligned GEMMs on Hopper
Enhancements to EVT
Enhancements to Python interface
Enhancements to Sub-byte type handling in CuTe
Several other bug-fixes and performance improvements.

* minor doc update
2023-11-02 11:09:05 -04:00
fb10fa5308 Fix broken pipeline link in docs (#1143) 2023-10-18 12:55:46 -04:00
90d3b0fb18 CUTLASS 3.2.1 (#1113)
* Updates for 3.2.1 release.

* Minor fix in gemm op profiler for raster order.

* Add scheduler mapping for raster order in the kernels.
2023-09-26 17:24:26 -04:00
3930f709ce Fix typo in 0x_gemm_tutorial.md (#1035) 2023-08-17 10:52:20 -04:00
4575443d44 CUTLASS 3.2 (#1024)
* CUTLASS 3.2
2023-08-07 20:50:32 -04:00
9b923dd4c4 fix minor typos (#984) 2023-07-05 09:23:01 -04:00
fde824af21 Update Hopper performance plot for CUTLASS 3.1 + CTK 12.1 (#967) 2023-06-01 14:52:40 -04:00
f079619f5e More updates for 3.1 (#958)
* Updates for 3.1

* Minor change

* doc link fix

* Minor updates
2023-05-24 10:17:16 -04:00
6fbc0d3380 Update layout.md 2023-05-17 20:12:58 -04:00
e2953d47c5 Update gemm_api.md 2023-05-12 15:37:31 -04:00
7c04f95415 Updates for 3.1 (#932) 2023-04-29 09:34:27 -04:00
54bebe417d Fix some typos in CuTe tutorials (#912) 2023-04-17 16:00:51 -04:00
d572cc1aab CUTLASS 3.1 (#915)
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2023-04-14 23:19:34 -04:00
0964bdb64c update gemm and conv2d cmdline --help output (#878) 2023-04-01 11:38:13 -04:00
7e370c9637 Fix typos 2 (#842)
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com>
2023-03-09 23:22:56 -05:00
c4f6b8c6bc Updates for 3.0 (#857)
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2023-03-09 15:27:40 -05:00
ZZK
a101ac283f Fix some typos (#791)
* fix typo

* fix a deadlink to code
2023-02-16 15:56:55 -05:00
277bd6e537 CUTLASS 3.0.0 (#786)
* CUTLASS 3.0.0
2023-01-23 20:55:28 -05:00
66d9cddc83 New updates for 2.11 (#775)
* New updates.

* Minor profiler updates

Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2023-01-20 16:32:57 -05:00
8567b87d65 Update quickstart.md (#704)
* Update quickstart.md

* Update doxygen_mainpage.md

* Update doxygen_mainpage.md

* Update terminology.md
2022-11-29 21:43:03 -05:00
c975e2ccbb releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
FZC
cc85b64cf6 fix typo (#677) 2022-11-01 14:07:33 -04:00
b72cbf957d CUTLASS 2.10 (#615)
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2022-09-03 18:48:46 -04:00
536b20763e Fixed typo in profiler README (#603) 2022-08-24 21:55:13 -04:00
94f01f19d5 Add implicit gemm perf
plot from @manishucsd, presented in gtc'22 cutlass talk
2022-06-23 22:47:11 -04:00
d6f58b2d14 Update functionality.md 2022-05-11 09:34:24 -04:00
57551902d0 Update functionality.md
add some explanations to the functionality table.
2022-05-11 00:01:19 -04:00
70f3ba57f5 Fix typo in shared memory layout description (#471) 2022-04-24 18:32:13 -04:00
12f4108ac2 CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
cd39c75e25 Fix typo in docs, code comments (#429)
* [docs] fix typo in media/docs/layout.md

* [docs] fix comment error

* fix typo in include/cutlass/arch/simd_61.h

* fix stride comment errors in TensorLayout
2022-03-15 21:54:36 -04:00
e96f00586c Make cutlass::gemm::device::GemmArray usable (#295)
* Fix the build of cutlass/gemm/device/gemm_array.h and add a demo for GemmArray

* Add a reference to GemmArray to the docs

Co-authored-by: Ivan Komarov <dfyz@yandex-team.ru>
2022-02-17 20:01:05 -05:00
5fe09c2d67 Updated GEMM performance plot with CUTLASS 2.8 compiled with CUDA 11.5 Toolkit (#375)
Updated GEMM performance plot with CUTLASS 2.8 compiled using CUDA 11.5 Toolkit.

GPUs under test:

    NVIDIA A100
    NVIDIA A2
    NVIDIA TitanV
    NVIDIA GeForce 2080 Ti
2021-12-06 14:21:33 -05:00
808c25337a CUTLASS 2.8 (#363)
CUTLASS 2.8
2021-11-19 13:26:35 -08:00
6fc5008803 Update quickstart.md
fix a broken link
2021-11-11 09:53:46 -05:00
6c2f8f2fb8 CUTLASS 2.6.1 - functional and performance enhancements to strided DGRAD, fixes, and tuning
* cutlass 2.6 update

* remove debug prints

* cutlass 2.6.1 (minor update)

* Updated CHANGELOG.

* Minor edit to readme to indicate patch version.

* Minor edit to readme.

Co-authored-by:  Haicheng Wu <haichengw@nvidia.com>, Andrew Kerr <akerr@nvidia.com>
2021-09-03 10:26:15 -07:00
d36f331b44 fix typo in doc
fix typo
2021-08-08 16:44:22 +08:00
10709dbb64 clean profiler cmd and doc 2021-07-30 11:02:17 -07:00
64dd1e1915 Doc typo
Signed-off-by: Peter Han <fujun.han@iluvatar.ai>
2021-07-29 08:45:59 +08:00
1ac4559d12 Cutlass 2.6 Update 1 (#301)
* cutlass 2.6 update

* remove debug prints
2021-07-27 17:58:30 -07:00
e5d51840e8 CUTLASS 2.6 (#298)
CUTLASS 2.6
2021-07-23 00:40:53 -04:00