Update docs for Minimax-Text support (#22562)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
@@ -111,6 +111,10 @@ Models that combine Mamba-2 and Mamba-1 layers with standard attention layers are
`Zamba2ForCausalLM`, `NemotronHForCausalLM`, `FalconH1ForCausalLM`, `GraniteMoeHybridForCausalLM`, and `JambaForCausalLM`). Please note that
these models currently require disabling prefix caching and using the FlashInfer attention backend in V1.
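
As a sketch, these constraints map onto the offline API roughly as follows. The attention backend and engine version are selected through vLLM's `VLLM_ATTENTION_BACKEND` and `VLLM_USE_V1` environment variables; the checkpoint name `Zyphra/Zamba2-2.7B` is an illustrative stand-in for any of the architectures listed above:

```python
import os

# Select the FlashInfer attention backend and the V1 engine before
# vLLM is imported; both are controlled via environment variables.
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASHINFER"
os.environ["VLLM_USE_V1"] = "1"

from vllm import LLM

# Illustrative checkpoint; any of the hybrid architectures listed
# above should be configured the same way.
llm = LLM(
    model="Zyphra/Zamba2-2.7B",
    enable_prefix_caching=False,  # prefix caching must be disabled
)
print(llm.generate("The capital of France is")[0].outputs[0].text)
```
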
Hybrid models with mechanisms different from Mamba are also supported (e.g., `MiniMaxText01ForCausalLM`, `MiniMaxM1ForCausalLM`).
Please note that these models currently require disabling prefix caching, enforcing eager mode, and using the FlashInfer
attention backend in V1.
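
A corresponding sketch for these models adds the eager-mode requirement on top of the same backend setup; the checkpoint name `MiniMaxAI/MiniMax-Text-01` is illustrative:

```python
import os

os.environ["VLLM_ATTENTION_BACKEND"] = "FLASHINFER"
os.environ["VLLM_USE_V1"] = "1"

from vllm import LLM

# Illustrative checkpoint name for the MiniMax family.
llm = LLM(
    model="MiniMaxAI/MiniMax-Text-01",
    enable_prefix_caching=False,  # prefix caching must be disabled
    enforce_eager=True,           # eager mode must be enforced
)
```
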
#### Encoder-Decoder Models
Models requiring cross-attention between separate encoder and decoder (e.g., `BartForConditionalGeneration`, `MllamaForConditionalGeneration`)