Commit Graph

88 Commits

Author SHA1 Message Date
8aea746212 Implement gemma 3 as a text encoder. (#10241)
Not useful yet.
2025-10-06 22:08:08 -04:00
dc95b6acc0 Basic WIP support for the wan animate model. (#9939) 2025-09-19 03:07:17 -04:00
9288c78fc5 Support the HuMo model. (#9903) 2025-09-17 00:12:48 -04:00
c1297f4eb3 Add support for Chroma Radiance (#9682)
* Initial Chroma Radiance support

* Minor Chroma Radiance cleanups

* Update Radiance nodes to ensure latents/images are on the intermediate device

* Fix Chroma Radiance memory estimation.

* Increase Chroma Radiance memory usage factor

* Increase Chroma Radiance memory usage factor once again

* Ensure images are multiples of 16 for Chroma Radiance
Add batch dimension and fix channels when necessary in ChromaRadianceImageToLatent node

* Tile Chroma Radiance NeRF to reduce memory consumption, update memory usage factor

* Update Radiance to support conv nerf final head type.

* Allow setting NeRF embedder dtype for Radiance
Bump Radiance nerf tile size to 32
Support EasyCache/LazyCache on Radiance (maybe)

* Add ChromaRadianceStubVAE node

* Crop Radiance image inputs to multiples of 16 instead of erroring to be in line with existing VAE behavior

* Convert Chroma Radiance nodes to V3 schema.

* Add ChromaRadianceOptions node and backend support.
Cleanups/refactoring to reduce code duplication with Chroma.

* Fix overriding the NeRF embedder dtype for Chroma Radiance

* Minor Chroma Radiance cleanups

* Move Chroma Radiance to its own directory in ldm
Minor code cleanups and tooltip improvements

* Fix Chroma Radiance embedder dtype overriding

* Remove Radiance dynamic nerf_embedder dtype override feature

* Unbork Radiance NeRF embedder init

* Remove Chroma Radiance image conversion and stub VAE nodes
Add a chroma_radiance option to the VAELoader builtin node which uses comfy.sd.PixelspaceConversionVAE
Add a PixelspaceConversionVAE to comfy.sd for converting BHWC 0..1 <-> BCHW -1..1
2025-09-13 17:58:43 -04:00
e01e99d075 Support hunyuan image distilled model. (#9807) 2025-09-10 23:17:34 -04:00
85e34643f8 Support hunyuan image 2.1 regular model. (#9792) 2025-09-10 02:05:07 -04:00
261421e218 Add Hunyuan 3D 2.1 Support (#8714) 2025-09-04 20:36:20 -04:00
88aee596a3 WIP Wan 2.2 S2V model. (#9568) 2025-08-27 01:10:34 -04:00
f7bd5e58dd Make it easier to implement future qwen controlnets. (#9485) 2025-08-21 23:18:04 -04:00
1702e6df16 Implement wan2.2 camera model. (#9357)
Use the old WanCameraImageToVideo node.
2025-08-15 17:29:58 -04:00
560d38f34c Wan2.2 fun control support. (#9292) 2025-08-12 23:26:33 -04:00
c012400240 Initial support for qwen image model. (#9179) 2025-08-04 22:53:25 -04:00
a88788dce6 Wan 2.2 support. (#9080) 2025-07-28 08:00:23 -04:00
ec70ed6aea Omnigen2 model implementation. (#8669) 2025-06-25 19:35:57 -04:00
d6a2137fc3 Support Cosmos predict2 image to video models. (#8535)
Use the CosmosPredict2ImageToVideoLatent node.
2025-06-14 21:37:07 -04:00
251f54a2ad Basic initial support for cosmos predict2 text to image 2B and 14B models. (#8517) 2025-06-13 07:05:23 -04:00
a0651359d7 Return proper error if diffusion model not detected properly. (#8272) 2025-05-25 05:28:11 -04:00
1c2d45d2b5 Fix typo in last PR. (#8144)
More robust model detection for future proofing.
2025-05-15 19:02:19 -04:00
56b6ee6754 Detection code to make ltxv models without config work. (#7986) 2025-05-07 21:28:24 -04:00
16417b40d9 Initial ACE-Step model implementation. (#7972) 2025-05-07 08:33:34 -04:00
08ff5fa08a Cleanup chroma PR. 2025-04-30 20:57:30 -04:00
4ca3d84277 Support for Chroma - Flux1 Schnell distilled with CFG (#7355)
* Upload files for Chroma Implementation

* Remove trailing whitespace

* trim more trailing whitespace..oops

* remove unused imports

* Add supported_inference_dtypes

* Set min_length to 0 and remove attention_mask=True

* Set min_length to 1

* get_mdulations added from blepping and minor changes

* Add lora conversion if statement in lora.py

* Update supported_models.py

* update model_base.py

* add uptream commits

* set modelType.FLOW, will cause beta scheduler to work properly

* Adjust memory usage factor and remove unnecessary code

* fix mistake

* reduce code duplication

* remove unused imports

* refactor for upstream sync

* sync chroma-support with upstream via syncbranch patch

* Update sd.py

* Add Chroma as option for the OptimalStepsScheduler node
2025-04-30 20:57:00 -04:00
ce22f687cc Support for WAN VACE preview model. (#7711)
* Support for WAN VACE preview model.

* Remove print.
2025-04-21 14:40:29 -04:00
c14429940f Support loading WAN FLF model. 2025-04-17 12:04:48 -04:00
9ad792f927 Basic support for hidream i1 model. 2025-04-15 17:35:05 -04:00
83e839a89b Native LotusD Implementation (#7125)
* draft pass at a native comfy implementation of Lotus-D depth and normal est

* fix model_sampling kludges

* fix ruff

---------

Co-authored-by: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
2025-03-21 14:04:15 -04:00
11f1b41bab Initial Hunyuan3Dv2 implementation.
Supports the multiview, mini, turbo models and VAEs.
2025-03-19 16:52:58 -04:00
e1474150de Support fp8_scaled diffusion models that don't use fp8 matrix mult. 2025-03-07 04:39:21 -05:00
93fedd92fe Support LTXV 0.9.5.
Credits: Lightricks team.
2025-03-05 00:13:49 -05:00
4ced06b879 WIP support for Wan I2V model. 2025-02-26 01:49:43 -05:00
63023011b9 WIP support for Wan t2v model. 2025-02-25 17:20:35 -05:00
5715be2ca9 Fix Hunyuan unet config detection for some models. (#6877)
The change to support 32 channel hunyuan models is missing the `key_prefix` on the key.

This addresses a complain in the comments of acc152b674.
2025-02-19 07:14:45 -05:00
acc152b674 Support loading and using SkyReels-V1-Hunyuan-I2V (#6862)
* Support SkyReels-V1-Hunyuan-I2V

* VAE scaling

* Fix T2V

oops

* Proper latent scaling
2025-02-18 17:06:54 -05:00
e5ea112a90 Support Lumina 2 model. 2025-02-04 04:16:30 -05:00
3aaabb12d4 Implement Cosmos Image/Video to World (Video) diffusion models.
Use CosmosImageToVideoLatent to set the input image/video.
2025-01-14 05:14:10 -05:00
2ff3104f70 WIP support for Nvidia Cosmos 7B and 14B text to world (video) models. 2025-01-10 09:14:16 -05:00
b7572b2f87 Fix and enforce no trailing whitespace. 2024-12-31 03:16:37 -05:00
d170292594 Remove some trailing white space. 2024-12-27 18:02:30 -05:00
bddb02660c Add PixArt model support (#6055)
* PixArt initial version

* PixArt Diffusers convert logic

* pos_emb and interpolation logic

* Reduce  duplicate code

* Formatting

* Use optimized attention

* Edit empty token logic

* Basic PixArt LoRA support

* Fix aspect ratio logic

* PixArtAlpha text encode with conds

* Use same detection key logic for PixArt diffusers
2024-12-20 15:25:00 -05:00
bda1482a27 Basic Hunyuan Video model support. 2024-12-16 19:35:40 -05:00
d9d7f3c619 Lint all unused variables (#5989)
* Enable F841

* Autofix

* Remove all unused variable assignment
2024-12-12 17:59:16 -05:00
5e16f1d24b Support Lightricks LTX-Video model. 2024-11-22 08:46:39 -05:00
8f0009aad0 Support new flux model variants. 2024-11-21 08:38:23 -05:00
5e29e7a488 Remove scaled_fp8 key after reading it to silence warning. 2024-11-06 04:56:42 -05:00
daa1565b93 Fix diffusers flux controlnet regression. 2024-10-30 13:11:34 -04:00
09fdb2b269 Support SD3.5 medium diffusers format weights and loras. 2024-10-30 04:24:00 -04:00
13b0ff8a6f Update SD3 code. 2024-10-28 21:58:52 -04:00
5cbb01bc2f Basic Genmo Mochi video model support.
To use:
"Load CLIP" node with t5xxl + type mochi
"Load Diffusion Model" node with the mochi dit file.
"Load VAE" with the mochi vae file.

EmptyMochiLatentVideo node for the latent.
euler + linear_quadratic in the KSampler node.
2024-10-26 06:54:00 -04:00
0075c6d096 Mixed precision diffusion models with scaled fp8.
This change allows supports for diffusion models where all the linears are
scaled fp8 while the other weights are the original precision.
2024-10-21 18:12:51 -04:00
a68bbafddb Support diffusion models with scaled fp8 weights. 2024-10-19 23:47:42 -04:00