v4.1 release update v2. (#2481)
@@ -12,6 +12,7 @@ CuTe DSL
    JIT Argument Generation <cute_dsl_general/dsl_jit_arg_generation.rst>
    JIT Argument: Layouts <cute_dsl_general/dsl_dynamic_layout.rst>
    JIT Caching <cute_dsl_general/dsl_jit_caching.rst>
    JIT Compilation Options <cute_dsl_general/dsl_jit_compilation_options.rst>
    Integration with Frameworks <cute_dsl_general/framework_integration.rst>
    Debugging with the DSL <cute_dsl_general/debugging.rst>
    Autotuning with the DSL <cute_dsl_general/autotuning_gemm.rst>
@@ -178,7 +178,7 @@ Limitations of Dynamic Control Flow
     n = 10

     # ❌ This loop is dynamic, early-exit isn't allowed.
-    for i in cutlass.range_dynamic(n):
+    for i in range(n):
         if i == 5:
             break         # Early-exit
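The hunk above only states the restriction. As a minimal sketch in plain Python (not DSL code), one common way to express a "stop at the first match" loop without ``break`` is to guard the loop body with a flag, so every iteration still runs but later ones do nothing. The function names below are hypothetical, and whether this pattern suits a particular kernel in the DSL is an assumption the source does not confirm.

```python
def first_index_with_break(values, target):
    """Reference version: leaves the loop as soon as the target is found."""
    found = -1
    for i in range(len(values)):
        if values[i] == target:
            found = i
            break
    return found


def first_index_without_break(values, target):
    """Same result without early exit: a guard disables later iterations."""
    found = -1
    for i in range(len(values)):
        # The guard `found == -1` plays the role of `break`: once a match
        # has been recorded, the remaining iterations leave `found` unchanged.
        if found == -1 and values[i] == target:
            found = i
    return found
```

The guarded version always executes the full trip count, which trades some redundant iterations for a loop with a single exit.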
@@ -0,0 +1,50 @@
.. _dsl_jit_compilation_options:
.. |DSL| replace:: CuTe DSL

.. _JIT_Compilation_Options:

JIT Compilation Options
=======================

JIT Compilation Options Overview
--------------------------------

When compiling a JIT function using |DSL|, you may want to control various aspects of the compilation process, such as the optimization level or debugging flags. |DSL| provides a flexible interface for specifying these compilation options when invoking ``cute.compile``.

Compilation options allow you to customize how your JIT-compiled functions are built and executed. This can be useful for:

* Enabling or disabling specific compiler optimizations
* Generating debug information for troubleshooting

These options can be passed as keyword arguments to ``cute.compile`` or set globally for all JIT compilations. The available options and their effects are described in the following sections, along with usage examples to help you get started.

``cute.compile`` Compilation Options
------------------------------------

You can provide additional compilation options as a string when calling ``cute.compile``. The |DSL| uses ``argparse`` to parse these options and will raise an error if any invalid options are specified.

.. list-table::
   :header-rows: 1
   :widths: 20 20 15 25

   * - **Option**
     - **Description**
     - **Default**
     - **Type**
   * - ``opt-level``
     - Optimization level of compilation. The higher the level, the more optimizations are applied. The valid value range is [0, 3].
     - 3 (highest level of optimization)
     - int
   * - ``enable-device-assertions``
     - Enable device code assertions.
     - False
     - bool

You can use the following code to specify compilation options:

.. code-block:: python

   jit_executor_with_opt_level_2 = cute.compile(add, 1, 2, options="--opt-level 2")
   jit_executor_with_opt_level_1 = cute.compile(add, 1, 2, options="--opt-level 1")
   jit_executor_with_enable_device_assertions = cute.compile(add, 1, 2, options="--enable-device-assertions")
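The new file says the options string is parsed with ``argparse`` and that invalid options are rejected. The following is a minimal sketch of how such a string could be handled with the standard library alone. The parser, its program name, and the helper ``parse_options`` are hypothetical and only mirror the two documented options; this is not the DSL's own implementation.

```python
import argparse
import shlex


def build_option_parser() -> argparse.ArgumentParser:
    # Hypothetical parser that mirrors the documented options table.
    parser = argparse.ArgumentParser(prog="compile-options")
    # Documented range is [0, 3] with a default of 3.
    parser.add_argument("--opt-level", type=int, choices=range(0, 4), default=3)
    # Boolean flag: present means True, absent means False.
    parser.add_argument("--enable-device-assertions", action="store_true")
    return parser


def parse_options(options: str) -> argparse.Namespace:
    # shlex.split turns "--opt-level 2" into ["--opt-level", "2"].
    return build_option_parser().parse_args(shlex.split(options))
```

With this sketch, ``parse_options("--opt-level 2").opt_level`` is ``2``, and an unknown flag makes ``argparse`` print a usage message and raise ``SystemExit``. How the DSL itself reports the error is not stated in the source.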
@@ -54,6 +54,7 @@ Programming Model
  - Modifiable during execution of JIT-compiled functions
  - Only a specific subset of Python types are supported as dynamic values
  - Primitive types are automatically converted when passed as function arguments:

    - ``int`` → ``Int32`` (may be updated to ``Int64`` in future releases)
    - ``bool`` → ``Bool``
    - ``float`` → ``Float32`` (may be updated to ``Float64`` in future releases)
@@ -77,7 +78,7 @@ Programming Model
             # of the runtime value of `i`
             xs.append(Float32(3.0))

-        for i in range_dynamic(10):
+        for i in range(10):
             # This only appends one element to the list at compile time,
             # as the loop doesn't unroll at compile time
             xs.append(Float32(1.0))
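The comments in the hunk above say the list gains only one element because the loop is not unrolled at compile time. A toy model in plain Python can make that concrete: if a loop body is executed once while it is being recorded, any Python-side mutation in the body happens once, regardless of the trip count. The ``trace_loop_once`` helper is hypothetical and is not how the DSL is implemented; it only illustrates the effect described in the comments.

```python
def trace_loop_once(body, trip_count):
    """Toy stand-in for a tracer: run `body` once, record the trip count."""
    body()  # Python side effects in the body happen here, exactly once
    return {"kind": "dynamic_loop", "trip_count": trip_count}


xs = [1.0, 2.0]
# The recorded loop has 10 iterations, but the Python list is mutated once.
loop_op = trace_loop_once(lambda: xs.append(1.0), trip_count=10)
```

After this runs, ``xs`` holds three elements rather than twelve, which matches the behavior the comments describe for a Python list mutated inside a dynamic loop.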
@@ -142,16 +143,29 @@ Programming Model
     @cute.jit
     def foo():
         a = Int32(1)
-        for i in range_dynamic(10):
+        for i in range(10):
             a = Float32(2)  # Changing type inside loop-body is not allowed in the DSL


 **Built-in Operators**
 The DSL transforms built-in operators like ``and``, ``or``, ``max``, ``min``, etc.
 into MLIR operations. They also follow the same constraints of dependent types.
 For instance, ``a and b`` requires ``a`` and ``b`` to be of the same type.

 Comparisons such as ``==`` on a sequence of dynamic values are known not to produce
 the expected result at runtime.

 **Special Variables**
 The DSL treats ``_`` as a special variable whose value is meant to be ignored.
 Reading ``_`` is not allowed in the DSL.

 Example illustrating functionality in Python that is not supported in the DSL:

 .. code:: python

     @cute.jit
     def foo():
         _ = 1
         print(_)  # This is not allowed in the DSL


 **Object Oriented Programming**
 The DSL is implemented on top of Python and supports Python's object-oriented programming (OOP) features
@@ -179,7 +193,7 @@ Programming Model
     @cute.jit
     def foo(a: Int32, res: cute.Tensor):
         foo = Foo(a)
-        for i in cutlass.range_dynamic(10):
+        for i in range(10):
             foo.set_a(i)

         # This fails to compile because `a` is assigned a local value defined within the for-loop body