Release v4.0.0 (#2294)

2025-05-13 15:55:29 -04:00
parent ad7b2f5e84
commit f115c3f854
299 changed files with 51495 additions and 4413 deletions
--- a/media/docs/cpp/layout.md
+++ b/media/docs/cpp/layout.md
@@ -217,7 +217,7 @@ and `TensorRef` objects for each of the operands whose extents are implied as a
 redundant storage of extent quantities, CUTLASS minimizes capacity utilization of precious resources such as constant memory.
 This is consistent with BLAS conventions.

-# Summary:
+## Summary:

 The design patterns described in this document form a hierarchy:
 * `T *ptr;` is a pointer to a contiguous sequence of elements of type `T`
@@ -225,7 +225,7 @@ The design patterns described in this document form a hierarchy:
 * `TensorRef<T, Layout> ref(ptr, layout);` is an object pointing to an _unbounded_ tensor containing elements of type `T` and a layout of type `Layout`
 * `TensorView<T, Layout> view(ref, extent);` is an object pointing to a _bounded_ tensor containing elements of type `T` and a layout of type `Layout`

-# Appendix: Existing Layouts
+### Appendix: Existing Layouts

 This section enumerates several existing Layout types defined in CUTLASS.

@@ -268,7 +268,7 @@ Permuted Shared Memory Layouts:
 - `TensorOpCrosswise<ElementSize>`


-# Copyright
+### Copyright

 Copyright (c) 2017 - 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 SPDX-License-Identifier: BSD-3-Clause
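
Note on the Summary hunk above: the hierarchy it lists (raw pointer, then `TensorRef`, then `TensorView`) can be illustrated with a small standalone sketch. The types below (`Coord2D`, `RowMajorSketch`, `TensorRefSketch`, `TensorViewSketch`) and the function `demo_read` are hypothetical stand-ins written for this note. They are not the CUTLASS classes and do not reproduce their APIs; they only mirror the idea that a ref pairs a pointer with a layout and a view adds an extent.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2-D logical coordinate (illustration only, not a CUTLASS type).
struct Coord2D {
    int row;
    int col;
};

// Hypothetical layout: maps a logical coordinate to a linear offset,
// assuming row-major storage with leading dimension `ldm`.
struct RowMajorSketch {
    int ldm;
    std::size_t operator()(Coord2D c) const {
        return static_cast<std::size_t>(c.row) * ldm + c.col;
    }
};

// Pointer + layout: an "unbounded" tensor, since no extent is stored.
template <typename T, typename Layout>
struct TensorRefSketch {
    T* ptr;
    Layout layout;
    T& at(Coord2D c) const { return ptr[layout(c)]; }
};

// Ref + extent: a "bounded" tensor that can check coordinates.
template <typename T, typename Layout>
struct TensorViewSketch {
    TensorRefSketch<T, Layout> ref;
    Coord2D extent;
    bool contains(Coord2D c) const {
        return c.row >= 0 && c.row < extent.row &&
               c.col >= 0 && c.col < extent.col;
    }
    T& at(Coord2D c) const { return ref.at(c); }
};

// Builds each level of the hierarchy in turn and reads one element back.
float demo_read() {
    std::vector<float> storage(2 * 4, 0.0f);  // 2 rows, leading dimension 4
    storage[1 * 4 + 2] = 7.0f;                // element at (row 1, col 2)

    TensorRefSketch<float, RowMajorSketch> ref{storage.data(), RowMajorSketch{4}};
    TensorViewSketch<float, RowMajorSketch> view{ref, Coord2D{2, 3}};

    Coord2D c{1, 2};
    return view.contains(c) ? view.at(c) : -1.0f;
}
```

In this sketch the extent lives only in the view, so several refs can share one extent, which is the storage-saving pattern the first hunk's context lines describe.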