Commit Graph

2028 Commits

Author SHA1 Message Date
e1ede29d82 Remove unsafe pickle loading code that was used on pytorch older than 2.4 (#12473)
ComfyUI hasn't started on pytorch 2.4 since last month.
2026-02-14 22:53:52 -05:00
dc9822b7df Add working Qwen 2512 ControlNet (Fun ControlNet) support (#12359) 2026-02-13 22:23:52 -05:00
712efb466b Add left padding to LTXAV text encoder. (#12456) 2026-02-13 21:56:54 -05:00
726af73867 Fix some custom nodes. (#12455) 2026-02-13 20:21:10 -05:00
831351a29e Support generating attention masks for left padded text encoders. (#12454) 2026-02-13 20:15:23 -05:00
e1add563f9 Use torch RMSNorm for flux models and refactor hunyuan video code. (#12432) 2026-02-13 15:35:13 -05:00
8902907d7a dynamic_vram: Training fixes (#12442) 2026-02-13 15:29:37 -05:00
ae79e33345 llama: use a more efficient rope implementation (#12434)
Get rid of the cat and unary negation and inplace add-cmul the two
halves of the rope. Precompute -sin once at the start of the model
rather than every transformer block.

This is slightly faster on both GPU and CPU bound setups.
2026-02-12 19:56:42 -05:00
117e214354 ModelPatcherDynamic: force load non leaf weights (#12433)
The current behaviour of the default ModelPatcher is to .to a model
only if its fully loaded, which is how random non-leaf weights get
loaded in non-LowVRAM conditions.

The however means they never get loaded in dynamic_vram. In the
dynamic_vram case, force load them to the GPU.
2026-02-12 19:51:50 -05:00
e5ae670a40 Update ace15.py to allow min_p sampling (#12373) 2026-02-11 20:28:48 -05:00
3fe61cedda model_patcher: guard against none model_dtype (#12410)
Handle the case where the _model_dtype exists but is none with the
intended fallback.
2026-02-11 14:54:02 -05:00
2a4328d639 ace15: Use dynamic_vram friendly trange (#12409)
Factor out the ksampler trange and use it in ACE LLM to prevent the
silent stall at 0 and rate distortion due to first-step model load.
2026-02-11 14:53:42 -05:00
d297a749a2 dynamic_vram: Fix windows Aimdo crash + Fix LLM performance (#12408)
* model_management: lazy-cache aimdo_tensor

These tensors cosntructed from aimdo-allocations are CPU expensive to
make on the pytorch side. Add a cache version that will be valid with
signature match to fast path past whatever torch is doing.

* dynamic_vram: Minimize fast path CPU work

Move as much as possible inside the not resident if block and cache
the formed weight and bias rather than the flat intermediates. In
extreme layer weight rates this adds up.
2026-02-11 14:50:16 -05:00
76a7fa96db Make built in lora training work on anima. (#12402) 2026-02-10 22:04:32 -05:00
cdcf4119b3 [Trainer] training with proper offloading (#12189)
* Fix bypass dtype/device moving

* Force offloading mode for training

* training context var

* offloading implementation in training node

* fix wrong input type

* Support bypass load lora model, correct adapter/offloading handling
2026-02-10 21:45:19 -05:00
123a7874a9 ops: Fix vanilla-fp8 loaded lora quality (#12390)
This was missing the stochastic rounding required for fp8 downcast
to be consistent with model_patcher.patch_weight_to_device.

Missed in testing as I spend too much time with quantized tensors
and overlooked the simpler ones.
2026-02-10 13:38:28 -05:00
f719f9c062 sd: delay VAE dtype archive until after override (#12388)
VAEs have host specific dtype logic that should override the dynamic
_model_dtype. Defer the archiving of model dtypes until after.
2026-02-10 13:37:46 -05:00
fe053ba5eb mp: dont deep-clone objects from model_options (#12382)
If there are non-trivial python objects nested in the model_options, this
causes all sorts of issues. Traverse lists and dicts so clones can safely
overide settings and BYO objects but stop there on the deepclone.
2026-02-10 13:37:17 -05:00
a4be04c5d7 Ace step prompts match now. (#12376) 2026-02-09 19:45:56 -05:00
baf8c87455 Iimprovements to ACE-Steps 1.5 text encoding (part 2) (#12350) 2026-02-09 19:41:49 -05:00
62315fbb15 Dynamic VRAM fixes - Ace 1.5 performance + a VRAM leak (#12368)
* revert threaded model loader change

This change was only needed to get around the pytorch 2.7 mempool bugs,
and should have been reverted along with #12260. This fixes a different
memory leak where pytorch gets confused about cache emptying.

* load non comfy weights

* MPDynamic: Pre-generate the tensors for vbars

Apparently this is an expensive operation that slows down things.

* bump to aimdo 1.8

New features:
watermark limit feature
logging enhancements
-O2 build on linux
2026-02-09 16:16:08 -05:00
f350a84261 Disable prompt weights for ltxv2. (#12354) 2026-02-07 19:16:28 -05:00
17e7df43d1 Pad ace step 1.5 ref audio if not long enough. (#12341) 2026-02-07 00:02:11 -05:00
039955c527 Some fixes to previous pr. (#12339) 2026-02-06 20:14:52 -05:00
6a26328842 Support fp16 for Cosmos-Predict2 and Anima (#12249) 2026-02-06 20:12:15 -05:00
204e65b8dc Fix bug with last pr (#12338) 2026-02-06 19:48:20 -05:00
a831c19b70 Fix return_word_ids=True with Anima tokenizer (#12328) 2026-02-06 19:38:04 -05:00
eba6c940fd Make ace step 1.5 base model work properly with default workflow. (#12337) 2026-02-06 19:14:56 -05:00
c2d7f07dbf Fix issue when using disable_unet_model_creation (#12315) 2026-02-05 19:24:09 -05:00
458292fef0 Fix some lowvram stuff with ace step 1.5 (#12312) 2026-02-05 19:15:04 -05:00
6555dc65b8 Make ace step 1.5 work without the llm. (#12311) 2026-02-05 16:43:45 -05:00
35183543e0 Add VAE tiled decode node for audio. (#12299) 2026-02-05 01:12:04 -05:00
a246cc02b2 Improvements to ACE-Steps 1.5 text encoding (#12283) 2026-02-05 00:17:37 -05:00
a50c32d63f Disable sage attention on ace step 1.5 (#12297) 2026-02-04 22:15:30 -05:00
6125b80979 Add llm sampling options and make reference audio work on ace step 1.5 (#12295) 2026-02-04 21:29:22 -05:00
c8fcbd66ee Try to fix ace text encoder slowness on some configs. (#12290) 2026-02-04 19:37:05 -05:00
26dd7eb421 Fix ace step nan issue on some hardware/pytorch configs. (#12289) 2026-02-04 18:25:06 -05:00
ef73070ea4 mp: Fix checkpoint saving (#12268)
Fix regression in the recent model saving refactor. Pass the non unet
pieces down the layers so that checkpoints are complete.
2026-02-04 02:08:45 -05:00
d30c609f5a utils: safetensors: dont slice data on torch level (#12266)
Torch has alignment enforcement when viewing with data type changes
but only relative to itself. Do all tensor constructions straight
off the memory-view individually so pytorch doesnt see an alignment
problem.

The is needed for handling misaligned safetensors weights, which are
reasonably common in third party models.

This limits usage of this safetensors loader to GPU compute only
as CPUs kernnel are very likely to bus error. But it works for
dynamic_vram, where we really dont want to take a deep copy and we
always use GPU copy_ which disentangles the misalignment.
2026-02-04 01:48:47 -05:00
a31681564d Fix crash with ace step 1.5 (#12264) 2026-02-04 00:03:21 -05:00
855849c658 mm: Remove Aimdo exemption for empty_cache (#12260)
Its more important to get the torch caching allocator GC up and running
than supporting the pyt2.7 bug. Switch it on.

Defeature dynamic_vram + pyt2.7.
2026-02-03 21:39:19 -05:00
fe2511468d Support the 4B ace step 1.5 lm model. (#12257)
Can be used as an alternative to the 1.7B
2026-02-03 19:01:38 -05:00
b8315e66cb Fix tiled vae for ace step 1.5 (#12253) 2026-02-03 14:40:45 -05:00
ab1050bec3 Support ace step 1.5 base model loras. (#12252) 2026-02-03 13:54:23 -05:00
85fc35e8fa Fix mac issue. (#12250) 2026-02-03 12:19:39 -05:00
223364743c llama: cast logits as a comfy-weight (#12248)
This is using a different layers weight with .to(). Change it to use
the ops caster if the original layer is a comfy weight so that it picks
up dynamic_vram and async_offload functionality in full.

Co-authored-by: Rattus <rattus128@gmail.com>
2026-02-03 11:31:36 -05:00
affe881354 Fix some issues with mac. (#12247) 2026-02-03 11:07:04 -05:00
f5030e26fd Add progress bar to ace step. (#12242) 2026-02-03 04:09:30 -05:00
3c1a1a2df8 Basic support for the ace step 1.5 model. (#12237) 2026-02-03 00:06:18 -05:00
c05a08ae66 Add back function. (#12234) 2026-02-02 19:52:07 -05:00