Commit Graph

23 Commits

Author SHA1 Message Date
c881a1d689 Support the siglip 2 naflex model as a clip vision model. (#11831)
Not useful yet.
2026-01-12 17:05:54 -05:00
d622a61874 Refactor: move clip_preprocess to comfy.clip_model (#11586) 2025-12-31 17:38:36 -05:00
3412d53b1d USO style reference. (#9677)
Load the projector.safetensors file with the ModelPatchLoader node and use
the siglip_vision_patch14_384.safetensors "clip vision" model and the
USOStyleReferenceNode.
2025-09-02 15:36:22 -04:00
4977f203fa P2 of qwen edit model. (#9412)
* P2 of qwen edit model.

* Typo.

* Fix normal qwen.

* Fix.

* Make the TextEncodeQwenImageEdit also set the ref latent.

If you don't want it to set the ref latent and want to use the
ReferenceLatent node with your custom latent instead just disconnect the
VAE.
2025-08-18 22:38:34 -04:00
0bef826a98 Support llava clip vision model. 2025-03-06 00:24:43 -05:00
85ef295069 Make applying embeddings more efficient.
Adding new tokens no longer makes a whole copy of the embeddings weight
which can be massive on certain models.
2025-03-05 17:34:38 -05:00
d9f0fcdb0c Cleanup. 2025-02-11 17:17:03 -05:00
b124256817 Fix for running via DirectML (#6542)
* Fix for running via DirectML

Fix DirectML empty image generation issue with Flux1. add CPU fallback for unsupported path. Verified the model works on AMD GPUs

* fix formating

* update casual mask calculation
2025-02-11 17:11:32 -05:00
44e19a28d3 Use maximum negative value instead of -inf for masks in text encoders.
This is probably more correct.
2025-02-02 09:46:00 -05:00
8f0009aad0 Support new flux model variants. 2024-11-21 08:38:23 -05:00
d1a6bd6845 Support loading long clipl model with the CLIP loader node. 2024-08-20 10:46:36 -04:00
83dbac28eb Properly set if clip text pooled projection instead of using hack. 2024-08-20 10:46:36 -04:00
2c038ccef0 Lower CLIP memory usage by a bit. 2024-07-31 01:32:35 -04:00
82cae45d44 Fix potential issue with non clip text embeddings. 2024-07-30 14:41:13 -04:00
c2cb8e889b Always return unprojected pooled output for gligen. 2024-02-25 07:33:13 -05:00
1cb3f6a83b Move text projection into the CLIP model code.
Fix issue with not loading the SSD1B clip correctly.
2024-02-25 01:41:08 -05:00
3b9969c1c5 Properly fix attention masks in CLIP with batches. 2024-02-17 12:13:13 -05:00
6c875d846b Fix clip attention mask issues on some hardware. 2024-02-17 07:53:52 -05:00
c6951548cf Update optimized_attention_for_device function for new functions that
support masked attention.
2024-01-07 13:52:08 -05:00
c782144433 Fix clip vision lowvram mode not working. 2023-12-27 13:50:57 -05:00
174eba8e95 Use own clip vision model implementation. 2023-12-09 11:56:31 -05:00
efb704c758 Support attention masking in CLIP implementation. 2023-12-07 02:51:02 -05:00
fbdb14d4c4 Cleaner CLIP text encoder implementation.
Use a simple CLIP model implementation instead of the one from
transformers.

This will allow some interesting things that would too hackish to implement
using the transformers implementation.
2023-12-06 23:50:03 -05:00