@@ -116,7 +116,7 @@ would include the following.
 access instructions (like `cp.async`), then dispatch to the
 custom instruction.
 
-2. The the two `Tensor`s have static layouts and it can be proven
+2. The two `Tensor`s have static layouts and it can be proven
 that element vectorization is valid -- for example, four `LDS.32`s
 can be combined into a single `LDS.128` -- then vectorize the source
 and destinations tensors.
@@ -37,7 +37,7 @@ and the `Layout`s of threads and values within the operation.
 The `MMA_Traits` struct takes the Operation as a template parameter.
 CuTe specializes `MMA_Traits` for each Operation type that it supports.
 
-Together, these two types comprise an "Atom" that decouples the complexity of thread and data layouts from the call site of of the PTX instruction. The Atom's Traits struct exposes information that is relevant to a single MMA operation, no matter the granularity at which it operates.
+Together, these two types comprise an "Atom" that decouples the complexity of thread and data layouts from the call site of the PTX instruction. The Atom's Traits struct exposes information that is relevant to a single MMA operation, no matter the granularity at which it operates.
 
 CuTe MMA atoms expose the semantics of a single MMA operation.
 This is true regardless of the hardware level at which the MMA operates.
@@ -255,7 +255,7 @@ int bar()
 }
 ```
 
-"Static" is an unfortunately overloaded term in C++. Sometimes it means "the opposite of instance," like a "static function" or "static member" of a class. (Some programming languages, like Java, say "class method" to refer to a "static function of a class.") That's not what we mean here. Instead, we mean "part of a compile-time type." For example, `Int<1>` encodes the value 1 at compile time, as part of the type of a templated class `Int<Value>`. `Int<3>` and `Int<4>` have different types. You can get the value of of the type like this: `Int<3>::value`. (The `value` is a `static constexpr` member of the class, where "static" means "opposite of instance.") As soon as you go from `Int<3>` to `Int<3>::value`, you've gone from (3) above (a compile-time value) to (2) above (a `constexpr` value). In some situations, this may mean that the compiler treats it as a run-time value.
+"Static" is an unfortunately overloaded term in C++. Sometimes it means "the opposite of instance," like a "static function" or "static member" of a class. (Some programming languages, like Java, say "class method" to refer to a "static function of a class.") That's not what we mean here. Instead, we mean "part of a compile-time type." For example, `Int<1>` encodes the value 1 at compile time, as part of the type of a templated class `Int<Value>`. `Int<3>` and `Int<4>` have different types. You can get the value of the type like this: `Int<3>::value`. (The `value` is a `static constexpr` member of the class, where "static" means "opposite of instance.") As soon as you go from `Int<3>` to `Int<3>::value`, you've gone from (3) above (a compile-time value) to (2) above (a `constexpr` value). In some situations, this may mean that the compiler treats it as a run-time value.
 
 #### Strides
 
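The distinction drawn in the hunk above, between `Int<3>` as a type and `Int<3>::value` as a `static constexpr` member, can be sketched in a few lines of standalone C++. This is a minimal illustration and not part of the patch: the `Int` template below is a hypothetical stand-in rather than CuTe's actual definition, and `add` and `as_runtime_int` are names invented for the example.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a compile-time integer; NOT CuTe's actual definition.
template <int Value>
struct Int {
  // "static" here means "opposite of instance": one value per type, not per object.
  static constexpr int value = Value;
};

// Different values produce different types.
static_assert(!std::is_same<Int<3>, Int<4>>::value,
              "Int<3> and Int<4> are distinct types");

// The value can be recovered from the type alone, at compile time.
static_assert(Int<3>::value == 3, "the value is encoded in the type");

// Hypothetical helper: adding two compile-time integers returns a new type,
// so the result is still "static" in the sense used above.
template <int A, int B>
constexpr Int<A + B> add(Int<A>, Int<B>) { return {}; }

// Hypothetical helper: once the value is copied into an ordinary int,
// it is no longer part of any type and behaves like any other integer.
inline int as_runtime_int() {
  int v = Int<3>::value;
  assert(v == 3);
  return v + 4;
}
```

The design point is that `add(Int<1>{}, Int<2>{})` has type `Int<3>`, so the sum is known to the type system, whereas the `int` returned by `as_runtime_int()` carries no such information.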
@@ -56,7 +56,7 @@ You may explicitly exclude cuBLAS and cuDNN as dependencies with the following C
 
 ## Build and run the CUTLASS Profiler
 
-From the `build/` directory created above, compile the the CUTLASS Profiler.
+From the `build/` directory created above, compile the CUTLASS Profiler.
 ```bash
 $ make cutlass_profiler -j12
 ```