CUTLASS 3.3.0 (#1167)

* Release 3.3.0

Adds support for mixed precision GEMMs on Hopper and Ampere
Adds support for < 16B aligned GEMMs on Hopper
Enhancements to EVT
Enhancements to Python interface
Enhancements to sub-byte type handling in CuTe
Several other bug fixes and performance improvements.

* minor doc update
Pradeep Ramani
2023-11-02 08:09:05 -07:00
committed by GitHub
parent 922fb5108b
commit c008b4aea8
263 changed files with 16214 additions and 5008 deletions

@@ -269,6 +269,7 @@
"metadata": {},
"outputs": [],
"source": [
"tiles = [td for td in tiles if td.threadblock_shape[0] >= 128]\n",
"idx = random.randint(0, len(tiles)-1)\n",
"td = tiles[idx]\n",
"print('Tile description {} is: {}'.format(idx, td))\n",