|
|
a49a78ffef
|
v4.2 release. (#2587)
* Fix default cluster callback values to 1 to avoid profiler failure when these values are not set on the command line.
* v4.2 release.
|
2025-08-22 18:11:24 -04:00 |
|
|
|
fd6cfe1ed0
|
v4.1 release update v2. (#2481)
|
2025-07-21 22:03:55 -04:00 |
|
|
|
a1aaf2300a
|
v4.1 release
|
2025-07-03 08:07:53 -04:00 |
|
|
|
8bdbfca682
|
v4.0 update. (#2371)
|
2025-06-06 02:39:20 -04:00 |
|
|
|
f115c3f854
|
Release v4.0.0 (#2294)
|
2025-05-13 15:55:29 -04:00 |
|
|
|
331a1f5b3f
|
cutlass 3.9 update (#2255)
* cutlass 3.9 update
* rebase
* fix out-of-shared-memory error for blockwise Blackwell
* doc format
* fix issue 2253
* disable host ref by default
* fix sm120 smem capacity
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
|
2025-04-24 15:42:40 -04:00 |
|
|
|
62750a2b75
|
v3.9 (#2185)
* v3.8 update x
* fix blackwell gg
* doc change
* doc change
* doc change
---------
Co-authored-by: yuzhai <yuzhai@nvidia.com>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
Co-authored-by: Haicheng Wu <57973641+hwu36@users.noreply.github.com>
|
2025-03-21 01:52:23 -04:00 |
|