v4.1 release update v2. (#2481)

Junkai-Wu
2025-07-22 10:03:55 +08:00
committed by GitHub
parent 9baa06dd57
commit fd6cfe1ed0
179 changed files with 7878 additions and 1286 deletions


@ -12,6 +12,7 @@ CuTe DSL
JIT Argument Generation <cute_dsl_general/dsl_jit_arg_generation.rst>
JIT Argument: Layouts <cute_dsl_general/dsl_dynamic_layout.rst>
JIT Caching <cute_dsl_general/dsl_jit_caching.rst>
JIT Compilation Options <cute_dsl_general/dsl_jit_compilation_options.rst>
Integration with Frameworks <cute_dsl_general/framework_integration.rst>
Debugging with the DSL <cute_dsl_general/debugging.rst>
Autotuning with the DSL <cute_dsl_general/autotuning_gemm.rst>


@ -178,7 +178,7 @@ Limitations of Dynamic Control Flow
n = 10
# ❌ This loop is dynamic; early exit isn't allowed.
for i in cutlass.range_dynamic(n):
for i in range(n):
if i == 5:
break # Early-exit
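One common way to avoid an early exit is to keep the loop running for its full trip count and guard the body with a flag. The sketch below is plain Python, not DSL code: ``first_index_reaching`` is a hypothetical helper, and whether this exact pattern compiles under the DSL is an assumption the source does not confirm.

```python
def first_index_reaching(values, threshold):
    """Return the first index whose value is >= threshold, or -1.

    The loop never uses `break`; a flag suppresses further updates
    once a match has been found.
    """
    found = False
    result = -1
    for i in range(len(values)):
        # Only the first matching iteration updates the result.
        if not found and values[i] >= threshold:
            result = i
            found = True
    return result
```

The loop always runs to completion, so the trade-off is extra iterations in exchange for a loop body with no early exit.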


@ -0,0 +1,50 @@
.. _dsl_jit_compilation_options:
.. |DSL| replace:: CuTe DSL
.. _JIT_Compilation_Options:
JIT Compilation Options
=======================
JIT Compilation Options Overview
--------------------------------
When compiling a JIT function using |DSL|, you may want to control various aspects of the compilation process, such as the optimization level or debugging flags. |DSL| provides a flexible interface for specifying these compilation options when invoking ``cute.compile``.
Compilation options allow you to customize how your JIT-compiled functions are built and executed. This can be useful for:
* Enabling or disabling specific compiler optimizations
* Generating debug information for troubleshooting
These options can be passed to ``cute.compile`` through its ``options`` keyword argument or set globally for all JIT compilations. The available options and their effects are described in the following section, along with usage examples to help you get started.
``cute.compile`` Compilation Options
------------------------------------
You can provide additional compilation options as a string when calling ``cute.compile``. The |DSL| uses ``argparse`` to parse these options and will raise an error if any invalid options are specified.
.. list-table::
:header-rows: 1
:widths: 20 20 15 25
* - **Option**
- **Description**
- **Default**
- **Type**
* - ``opt-level``
- Optimization level of compilation. The higher the level, the more optimizations are applied. The valid value range is [0, 3].
- 3 (highest level of optimization)
- int
* - ``enable-device-assertions``
- Enable device code assertions.
- False
- bool
You can use the following code to specify compilation options:
.. code-block:: python
jit_executor_with_opt_level_2 = cute.compile(add, 1, 2, options="--opt-level 2")
jit_executor_with_opt_level_1 = cute.compile(add, 1, 2, options="--opt-level 1")
jit_executor_with_enable_device_assertions = cute.compile(add, 1, 2, options="--enable-device-assertions")
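To illustrate how an ``argparse``-based options string might be validated, here is a minimal, self-contained sketch. It is not the |DSL| implementation: ``parse_compile_options``, ``OptionError``, and the parser setup are hypothetical names chosen for this example, and only the two options and defaults listed in the table above are modeled.

```python
import argparse
import shlex


class OptionError(ValueError):
    """Raised when an options string cannot be parsed (hypothetical)."""


class _RaisingParser(argparse.ArgumentParser):
    def error(self, message):
        # argparse exits the process by default; raise an exception instead.
        raise OptionError(message)


def parse_compile_options(options: str) -> argparse.Namespace:
    """Parse a string such as "--opt-level 2" into a namespace."""
    parser = _RaisingParser(prog="compile-options", add_help=False)
    # Defaults mirror the table above: opt-level 3, assertions disabled.
    parser.add_argument("--opt-level", type=int, choices=range(0, 4), default=3)
    parser.add_argument("--enable-device-assertions", action="store_true")
    # shlex.split tokenizes the string the way a shell would.
    return parser.parse_args(shlex.split(options))
```

With this sketch, ``parse_compile_options("--opt-level 2").opt_level`` evaluates to ``2``, while an out-of-range level or an unknown flag raises ``OptionError``.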


@ -54,6 +54,7 @@ Programming Model
- Modifiable during execution of JIT-compiled functions
- Only a specific subset of Python types are supported as dynamic values
- Primitive types are automatically converted when passed as function arguments:
- ``int`` → ``Int32`` (may be updated to ``Int64`` in future releases)
- ``bool`` → ``Bool``
- ``float`` → ``Float32`` (may be updated to ``Float64`` in future releases)
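To make the width implications of these conversions concrete, here is a small plain-Python sketch. It does not use the DSL: ``fits_int32`` and ``to_float32`` are hypothetical helpers, and how the DSL itself handles out-of-range or inexact values is not stated here.

```python
import struct

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_int32(n: int) -> bool:
    """Return True if a Python int is representable as a signed 32-bit integer."""
    return INT32_MIN <= n <= INT32_MAX


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    # Packing as "f" stores the value in IEEE 754 single precision.
    return struct.unpack("f", struct.pack("f", x))[0]
```

Python's ``int`` is arbitrary precision and its ``float`` is 64-bit, so values such as ``2**31`` or ``0.1`` cannot be carried over to 32-bit types unchanged.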
@ -77,7 +78,7 @@ Programming Model
# of the runtime value of `i`
xs.append(Float32(3.0))
for i in range_dynamic(10):
for i in range(10):
# This appends only one element to the list at compile time,
# because the loop isn't unrolled at compile time
xs.append(Float32(1.0))
@ -142,16 +143,29 @@ Programming Model
@cute.jit
def foo():
a = Int32(1)
for i in range_dynamic(10):
for i in range(10):
a = Float32(2) # Changing type inside loop-body is not allowed in the DSL
**Built-in Operators**
The DSL transforms built-in operators such as ``and``, ``or``, ``max``, and ``min``
into MLIR operations. They are subject to the same dependent-type constraints.
For instance, ``a and b`` requires ``a`` and ``b`` to be of the same type.
Comparisons such as ``==`` on a sequence of dynamic values are known not to produce
the expected result at runtime.
**Special Variables**
The DSL treats ``_`` as a special variable whose value is meant to be ignored.
Reading ``_`` is not allowed in the DSL.
Example illustrating functionality in Python that is not supported in the DSL:
.. code:: python
@cute.jit
def foo():
_ = 1
print(_) # This is not allowed in the DSL
**Object Oriented Programming**
The DSL is implemented on top of Python and supports Python's object-oriented programming (OOP) features
@ -179,7 +193,7 @@ Programming Model
@cute.jit
def foo(a: Int32, res: cute.Tensor):
foo = Foo(a)
for i in cutlass.range_dynamic(10):
for i in range(10):
foo.set_a(i)
# This fails to compile because `a` is assigned a local value defined within the for-loop body