|
|
629f4653c3
|
CUTLASS 3.5.0 (#1411)
|
2024-03-19 17:51:04 -04:00 |
|
|
|
751eb9a885
|
Update license year (#1306)
|
2024-01-16 14:37:22 -05:00 |
|
|
|
2f589ffa76
|
Updates for 3.4 release. (#1305)
|
2024-01-16 13:42:51 -05:00 |
|
|
|
146d314057
|
Update fMHA kernels (#992)
* Update fMHA kernels
Upstream recent changes to fMHA that we did in xFormers.
Previous version in CUTLASS: facebookresearch/xformers@b6be33a
Updating to: facebookresearch/xformers@55a4798
* minor changes
* make var work
---------
Co-authored-by: danthe3rd <danthe3rd>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
|
2023-07-12 22:30:46 -04:00 |
|
|
|
9b8166e3f0
|
fMHA: Add backward pass (#844)
* fMHA: Add backward pass
* Better checks for strides/alignments
* Remove fb-internal URL
* torch.Tensor.untyped_storage requires pytorch 2.0+
* minor changes
* make test
---------
Co-authored-by: danthe3rd <danthe3rd>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
|
2023-04-06 20:44:58 -04:00 |
|