CUTLASS 3.3.0 (#1167)

* Release 3.3.0

Adds support for mixed precision GEMMs On Hopper and Ampere
Adds support for < 16B aligned GEMMs on Hopper
Enhancements to EVT
Enhancements to Python interface
Enhancements to Sub-byte type handling in CuTe
Several other bug-fixes and performance improvements.

* minor doc update
This commit is contained in:
Pradeep Ramani
2023-11-02 08:09:05 -07:00
committed by GitHub
parent 922fb5108b
commit c008b4aea8
263 changed files with 16214 additions and 5008 deletions

View File

@ -30,6 +30,7 @@
#
#################################################################################################
import os
import sys
from . import conv2d_operation
@ -47,3 +48,16 @@ from . import rank_2k_operation
from . import rank_k_operation
from . import symm_operation
from . import trmm_operation
# Make enum types from library.py accessible via cutlass_library.*
from .library import *
# Set up `source` to point to the path containing the CUTLASS source.
# Check first if the path cotains a `source` subdirectory -- this will
# be the case when the package has been installed via pip. Otherwise,
# default to the root of CUTLASS.
install_source_path = os.path.join(__path__[0], 'source')
if os.path.isdir(install_source_path):
source_path = install_source_path
else:
source_path = os.path.join(__path__[0], '../..')