18 Commits

Author SHA1 Message Date
a49a78ffef v4.2 release. (#2587)
* Fix default cluster callback values to 1 to avoid profiler failure when these values are not set in command line.

* v4.2 release.
2025-08-22 18:11:24 -04:00
51d730b8be Support "CuTe DSL" auto-labeling in workflow 2025-07-23 00:28:01 -07:00
6c0c8b7484 1. Update bug/feature report template to add component selection. (#2485)
2. Add workflow to apply component label automatically
2025-07-22 12:38:03 -04:00
b244379d9b Merge pull request #2359 from NVIDIA/oss_ci
Initial Workflow Definition for blossom-ci support on CUTLASS GitHub
2025-06-03 14:04:35 -07:00
c008b4aea8 CUTLASS 3.3.0 (#1167)
* Release 3.3.0

Adds support for mixed precision GEMMs On Hopper and Ampere
Adds support for < 16B aligned GEMMs on Hopper
Enhancements to EVT
Enhancements to Python interface
Enhancements to Sub-byte type handling in CuTe
Several other bug-fixes and performance improvements.

* minor doc update
2023-11-02 11:09:05 -04:00
112590114d Add config.yml issue template with Discord link. (#1135) 2023-10-10 12:13:04 -04:00
c975e2ccbb releaase 2.11 (#703) 2022-11-19 09:02:15 -05:00
3bf95e90c2 Update labeler.yml 2022-10-13 08:03:28 -04:00
75fed7493e Update labeler.yml 2022-10-13 08:01:21 -04:00
98b73fc95d Update labeler.yml 2022-10-13 07:55:33 -04:00
4990e3686d Update labeler.yml 2022-10-13 07:52:38 -04:00
4b7365388c Update labeler.yml 2022-10-13 07:32:55 -04:00
0d8405588d Update labeler.yml 2022-10-12 15:32:38 -04:00
f3eea3a4d7 Create labeler.yml 2022-09-29 15:08:44 -04:00
b72cbf957d CUTLASS 2.10 (#615)
Co-authored-by: Aniket Shivam <ashivam@nvidia.com>
2022-09-03 18:48:46 -04:00
21c1fa3849 add .github (#479)
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
2022-04-28 12:36:59 -07:00
12f4108ac2 CUTLASS 2.9 (#468) 2022-04-23 15:02:38 -04:00
4e666e1dfd Updated README and added issue templates. (#382) 2021-12-17 09:26:20 -05:00