* init
* update
* Update model.py
* Update model.py
* remove print
* Fix text encoding
* Prevent empty negative prompt
Really doesn't work otherwise
* fp16 works
* I2V
* Update model_base.py
* Update nodes_hunyuan.py
* Better latent rgb factors
* Use the correct sigclip output...
* Support HunyuanVideo1.5 SR model
* whitespaces...
* Proper latent channel count
* SR model fixes
This still also needs timestep scheduling based on the noise scale; it can already be used with two samplers.
* vae_refiner: roll the convolution through temporal
Work in progress.
Roll the convolution through time using 2-latent-frame chunks and a
FIFO queue for the convolution seams.
* Support HunyuanVideo15 latent resampler
* fix
* Some cleanup
Co-Authored-By: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
* Proper hyvid15 I2V channels
Co-Authored-By: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
* Fix TokenRefiner for fp16
Otherwise x.sum has infs. Just in case, only casting if the input is fp16; I don't know if it's necessary.
* Bugfix for the HunyuanVideo15 SR model
* vae_refiner: roll the convolution through temporal II
Roll the convolution through time using 2-latent-frame chunks and a
FIFO queue for the convolution seams.
Added support for the encoder, lowered to 1 latent frame to save more
VRAM, and made it work for Hunyuan Image 3.0 (as the code is shared).
Fixed names, cleaned up code.
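For illustration, a minimal sketch of the rolling pattern, assuming a temporal convolution with no temporal padding (names and shapes here are illustrative, not the actual vae_refiner code):

    import torch
    import torch.nn as nn

    def rolled_conv(conv: nn.Conv3d, frames: torch.Tensor, chunk: int = 1) -> torch.Tensor:
        # frames: (B, C, T, H, W), assumes T >= conv's temporal kernel size.
        # With temporal padding 0, each output frame needs the kt-1 previous
        # input frames -- the FIFO "seam" queue.
        kt = conv.kernel_size[0]
        queue = frames[:, :, :kt - 1]  # prime the queue
        outs = []
        for t in range(kt - 1, frames.shape[2], chunk):
            x = torch.cat([queue, frames[:, :, t:t + chunk]], dim=2)
            outs.append(conv(x))  # emits only the new frames
            queue = x[:, :, x.shape[2] - (kt - 1):]  # slide the seam window
        return torch.cat(outs, dim=2)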
* Allow any number of input frames in VAE.
* Better VAE encode mem estimation.
* Lowvram fix.
* Fix hunyuan image 2.1 refiner.
* Fix mistake.
* Name changes.
* Rename.
* Whitespace.
* Fix.
* Fix.
---------
Co-authored-by: kijai <40791699+kijai@users.noreply.github.com>
Co-authored-by: Rattus <rattus128@gmail.com>
Slices the model input to the output channel count so the caching tracks only the noise channels; resolves a channel mismatch with models like WanVideo I2V
Also fixes a slicing deprecation in PyTorch 2.9
Clean up a bunch of stacked and no-longer-needed tensors on the QWEN
VRAM peak (currently FFN).
With this I go from OOMing at B=37x1328x1328 to being able to
successfully run B=47 (RTX5090).
The partial unloader path in the model re-use flow skips straight to the
actual unload without any check of the patching UUID. This means that
if you do an upscale flow with a model patch on an existing model, it
will not apply your patches.
Fix by delaying the partial_unload until after the UUID checks. This
is done by making partial_unload a mode of partial_load where extra_mem
is negative.
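The shape of the fix, as a hedged sketch (method and attribute names here are illustrative, not the real ModelPatcher API):

    def partial_load(model, extra_mem):
        if model.patches_uuid != model.applied_patches_uuid:
            model.repatch()  # the UUID check now always runs first
        if extra_mem < 0:
            model.partially_unload(-extra_mem)  # partial_unload is a negative load
        else:
            model.load_more(extra_mem)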
* ops: don't take an offload stream if you don't need one
* ops: prioritize mem transfer
The async offload stream's reason for existence is to transfer from
RAM to GPU. The post-processing compute steps are a bonus on the side
stream, but if the compute stream is running a long kernel, it can
stall the side stream as it waits to type-cast the bias before
transferring the weight. So do a pure transfer of the weight straight up,
then do everything for the bias, then go back to fix the weight type and
apply the weight patches.
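In torch terms the ordering is roughly this (an illustrative sketch, not the actual ops.py code):

    import torch

    def offload_cast(weight_cpu, bias_cpu, dtype, offload_stream):
        with torch.cuda.stream(offload_stream):
            # 1. pure transfer of the big weight first: keeps the copy engine busy
            w = weight_cpu.to("cuda", non_blocking=True)
            # 2. everything bias: transfer plus its type-cast
            b = None
            if bias_cpu is not None:
                b = bias_cpu.to("cuda", non_blocking=True).to(dtype)
            # 3. go back, fix the weight type and apply patches last
            w = w.to(dtype)
        return w, b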
* execution: Roll the UI cache into the outputs
Currently the UI cache is parallel to the output cache with
expectations of being a content superset of the output cache.
At the same time the UI and output caches are maintained completely
separately, making it awkward to free the output cache content without
changing the behaviour of the UI cache.
There are two actual users (getters) of the UI cache. The first is
the case of a direct content hit on the output cache when executing a
node. This case is very naturally handled by merging the UI and outputs
cache.
The second case is the history JSON generation at the end of the prompt.
This currently works by asking the cache for all_node_ids and then
pulling the cache contents for those nodes. all_node_ids is the set of
nodes in the dynamic prompt.
So fold the UI cache into the output cache. The current UI cache setter
now writes to a prompt-scope dict. When the output cache is set, just
get this value from the dict and tuple up with the outputs.
When generating the history, simply iterate prompt-scope dict.
This prepares support for more complex caching strategies (like RAM
pressure caching) where less than 1 workflow will be cached and it
will be desirable to keep the UI cache and output cache in sync.
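A hedged sketch of the folded entry (names are illustrative; the cache entry later becomes a NamedTuple):

    ui_by_node = {}  # prompt-scope dict written by the UI setter

    def set_cached_output(cache, node_id, outputs):
        cache.set(node_id, (outputs, ui_by_node.get(node_id)))  # tuple up

    def history_ui():
        return dict(ui_by_node)  # history JSON just iterates the prompt-scope dict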
* sd: Implement RAM getter for VAE
* model_patcher: Implement RAM getter for ModelPatcher
* sd: Implement RAM getter for CLIP
* Implement RAM Pressure cache
Implement a cache sensitive to RAM pressure. When RAM headroom drops
down below a certain threshold, evict RAM-expensive nodes from the
cache.
Models and tensors are measured directly for RAM usage. An OOM score
is then computed based on the RAM usage of the node.
Note that due to indirection through shared objects (like a model
patcher), multiple nodes can account the same RAM as their individual
usage. The intent is that this will free chains of nodes, particularly
model loaders and associated LoRAs, as they all score similarly and
are sorted close to each other.
This has a bias towards unloading model nodes mid-flow while being able
to keep results like text encodings and the VAE.
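The eviction policy as a hedged sketch (the threshold, scoring attribute and names are illustrative):

    import psutil

    def evict_on_ram_pressure(cache, headroom=4 << 30):
        # Evict RAM-expensive entries until we are back above the headroom.
        if psutil.virtual_memory().available > headroom:
            return
        for node_id, entry in sorted(cache.items(), key=lambda kv: -kv[1].ram_usage):
            del cache[node_id]  # highest OOM score goes first
            if psutil.virtual_memory().available > headroom:
                break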
* execution: Convert the cache entry to NamedTuple
As commented in review.
Convert this to a named tuple and abstract away the tuple type
completely from graph.py.
* mm: factor out the current stream getter
Make this a reusable function.
* ops: sync the offload stream with the consumption of w&b
This sync is necessary as pytorch will queue CUDA async frees on the
same stream that created the tensor. In the case of async offload, this
will be on the offload stream.
Weights and biases can go out of scope in python, which then
triggers the pytorch garbage collector to queue the free operation on
the offload stream, possibly before the compute stream has used the
weight. This causes a use-after-free on the weight data, leading to total
corruption of some workflows.
So sync the offload stream with the compute stream after the weight
has been used so the free has to wait for the weight to be used.
The cast_bias_weight is extended in a backwards-compatible way, with
the new behaviour opt-in via a defaulted parameter. This handles
custom node packs calling cast_bias_weight and disables
async-offload for them (as they do not handle the race).
The pattern is now:
    cast_bias_weight(..., offloadable=True)  # this might be offloaded
    thing(weight, bias, ...)
    uncast_bias_weight(...)
* controlnet: adopt new cast_bias_weight synchronization scheme
This is necessary for safe async weight offloading.
* mm: sync the last stream in the queue, not the next
Currently this peeks ahead to sync the next stream in the queue of
streams with the compute stream. This doesn't allow a lot of
parallelization, as the end result is you can only get one weight load
ahead regardless of how many streams you have.
Rotate the loop logic here to synchronize the end of the queue before
returning the next stream. This allows weights to be loaded ahead of the
compute stream's position.
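A hedged sketch of the rotation (illustrative names; assumes a round-robin queue of CUDA streams):

    def get_offload_stream(queue, compute_stream):
        s = queue.pop(0)  # oldest stream in the round-robin
        s.wait_stream(compute_stream)  # sync only the stream being reused,
        queue.append(s)  # so the newer streams keep loading ahead
        return s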
In the case of --cache-none, lazy and subgraph execution can cause
anything to be run multiple times per workflow. If a re-run node is
itself a subgraph generator, this will crash for two reasons.
First, pending_subgraph_results[] does not clean up entries after use.
So when a pending_subgraph_result is consumed, remove it from the list
so that if the corresponding node is fully re-executed, this lookup
misses and it falls through to execute the node as it should.
Secondly, there is an explicit enforcement against duplicates in the
addition of subgraph nodes as ephemerals to the dynprompt. Remove this
enforcement as the use case is now valid.
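The first of those fixes is essentially a pop (sketch):

    def take_pending(pending_subgraph_results: dict, node_id):
        # Consume-and-remove: a later full re-execution of the node misses this
        # lookup and falls through to executing the node as it should.
        return pending_subgraph_results.pop(node_id, None)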
* Implement mixed precision operations with a registry design and metadata for the quant spec in the checkpoint.
* Updated design using Tensor Subclasses
* Fix FP8 MM
* An actually functional POC
* Remove CK reference and ensure correct compute dtype
* Update unit tests
* ruff lint
* Fix missing keys
* Rename quant dtype parameter
* Rename quant dtype parameter
* Fix unittests for CPU build
* feat(api-nodes): implement new API client for V3 nodes
* feat(api-nodes): implement new API client for V3 nodes
* feat(api-nodes): implement new API client for V3 nodes
* converted WAN nodes to use new client; polishing
* fix(auth): do not leak authentication for absolute urls
* convert BFL API nodes to use new API client; remove deprecated BFL nodes
* converted Google Veo nodes
* fix(Veo3.1 model): take into account "generate_audio" parameter
* execution: fold in dependency aware caching
This makes --cache-none compatible with lazy and expanded
subgraphs.
Currently the --cache-none option is powered by the
DependencyAwareCache. The cache attempts to maintain a parallel
copy of the execution list data structure, however it is only
set up once at the start of execution and does not get meaningful
updates to the execution list.
This causes multiple problems when --cache-none is used with lazy
and expanded subgraphs as the DAC does not accurately update its
copy of the execution data structure.
DAC has an attempt to handle subgraphs, ensure_subcache, however
this does not accurately connect to nodes outside the subgraph.
The current semantics of DAC are to free a node ASAP after the
dependent nodes are executed.
This means that if a subgraph refs such a node, it will be re-queued
and re-executed by the execution_list, but DAC won't see it in
its to-free lists anymore and will leak memory.
Rather than try and cover all the cases where the execution list
changes from inside the cache, move the whole problem to the
executor, which maintains an always up-to-date copy of the wanted
data structure.
The executor now has a fast-moving run-local cache of its own.
Each _to node has its own mini cache, and the cache is unconditionally
primed at the time of add_strong_link.
add_strong_link is called for all of static workflows, lazy links
and expanded subgraphs, so it's the singular source of truth for
output dependencies (a sketch of the mini cache follows below).
In the case of a cache-hit, the executor cache will hold the non-none
value (it will respect updates if they happen somehow as well).
In the case of a cache-miss, the executor caches a None and will
wait for a notification to update the value when the node completes.
When a node completes execution, it simply releases its mini-cache
and in turn its strong refs on its direct ancestor outputs, allowing
for ASAP freeing (same as the DependencyAwareCache but a little more
automatic).
This now allows for re-implementation of --cache-none with no cache
at all. The dependency aware cache was also observing the dependency
semantics for the objects and UI cache, which is not accurate (this
entire logic was always outputs specific).
This also prepares for more complex caching strategies (such as RAM
pressure based caching), where a cache can implement any freeing
strategy completely independently of the DependencyAwareness
requirement.
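A hedged sketch of the per-node mini cache described above (names are illustrative):

    class MiniCache:
        def __init__(self):
            self.values = {}  # output_id -> value (or None while pending)

        def add_strong_link(self, outputs, output_id):
            # Primed unconditionally at link time: hit -> hold the value,
            # miss -> hold None and wait for the completion notification.
            self.values[output_id] = outputs.get(output_id)

        def notify(self, output_id, value):
            if output_id in self.values:
                self.values[output_id] = value

        def release(self):
            # Dropping the dict drops the strong refs on the direct ancestor
            # outputs, giving the ASAP freeing automatically.
            self.values = {}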
* main: re-implement --cache-none as no cache at all
The execution list now tracks the dependency aware caching more
correctly than the DependencyAwareCache.
Change it to a cache that does nothing.
* test_execution: add --cache-none to the test suite
--cache-none is now expected to work universally. Run it through the
full unit test suite. Propagate the server parameterization for whether
or not the server is capable of caching, so that the minority of tests
that specifically check for cache hits can branch. Hard assert NOT
caching in the else branch to give some coverage of --cache-none's
expected behaviour of not actually caching.
* Add get_subgraphs_dir to ComfyExtension and PUBLISHED_SUBGRAPH_DIRS to nodes.py
* Created initial endpoints, although the returned paths are a bit off currently
* Fix path and actually return real data
* Sanitize returned /api/global_subgraphs entries
* Remove leftover function from early prototyping
* Remove added whitespace
* Add None check for sanitize_entry
Same change pattern as 7e8dd275c2
applied to WAN2.2
If this suffers an exception (such as a VRAM OOM), it will leave the
encode() and decode() methods, which skips the cleanup of the WAN
feature cache. The comfy node cache then ultimately keeps a reference
to this object, which is in turn referencing large tensors from the failed
execution.
The feature cache is currently set up as a class variable on the
encoder/decoder; however, the encode and decode functions always clear
it on both entry and exit of normal execution.
It's likely the design intent is that this is usable as a streaming encoder
where the input comes in batches, however the functions as they are
today don't support that.
So simplify by bringing the cache back to a local variable, so that if
it does VRAM OOM, the cache itself is properly garbage when the
encode()/decode() functions disappear from the stack.
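The shape of the fix (hedged sketch; the chunked decode helper is hypothetical):

    import torch

    def decode(self, latents):
        feat_cache = []  # local again, not a class attribute
        outs = []
        for chunk in latents:
            # If this OOMs, feat_cache dies with the stack frame instead of
            # pinning the failed execution's tensors via the class attribute.
            outs.append(self._decode_chunk(chunk, feat_cache))
        return torch.cat(outs)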
* updated V2V node to allow for a control image input
exposing steps in V2V
fixing guidance_scale as an input parameter
TODO: allow motion_intensity as an input param.
* refactor: comment out unsupported resolution and adjust default values in video nodes
* set control_after_generate
* adding new defaults
* fixes
* changed control_after_generate back to True
* changed control_after_generate back to False
---------
Co-authored-by: thorsten <thorsten@tripod-digital.co.nz>
## Summary
Fixed incorrect type hint syntax in `MotionEncoder_tc.__init__()` parameter list.
## Changes
- Line 647: Changed `num_heads=int` to `num_heads: int`
- This corrects the parameter annotation from a default value assignment to proper type hint syntax
## Details
The parameter was using assignment syntax (`=`) instead of type annotation syntax (`:`), which would incorrectly set the default value to the `int` class itself rather than annotating the expected type.
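The difference in miniature:

    # before: `int` (the class itself) becomes the default value
    def __init__(self, num_heads=int): ...

    # after: a proper type annotation with no default
    def __init__(self, num_heads: int): ...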
If this suffers an exception (such as a VRAM OOM), it will leave the
encode() and decode() methods, which skips the cleanup of the WAN
feature cache. The comfy node cache then ultimately keeps a reference
to this object, which is in turn referencing large tensors from the failed
execution.
The feature cache is currently set up as a class variable on the
encoder/decoder; however, the encode and decode functions always clear
it on both entry and exit of normal execution.
It's likely the design intent is that this is usable as a streaming encoder
where the input comes in batches, however the functions as they are
today don't support that.
So simplify by bringing the cache back to a local variable, so that if
it does VRAM OOM, the cache itself is properly garbage when the
encode()/decode() functions disappear from the stack.
When the VAE catches this VRAM OOM, it launches the fallback logic
straight from the exception context.
Python, however, refs the entire call stack that caused the exception,
including any local variables, for the sake of exception reporting and
debugging. In the case of tensors, this can hold on to references
to GBs of VRAM and inhibit the VRAM allocator from freeing them.
So dump the except context completely before going back to the VAE
via the tiler, by getting out of the except block with nothing but
a flag.
This greatly increases the reliability of the tiler fallback,
especially on low-VRAM cards: with the bug, if the leak randomly
held more than the headroom needed for a single tile, the tiler
fallback would OOM and fail the flow.
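The pattern, concretely (a sketch with illustrative names):

    import torch

    def decode_with_fallback(vae, latents):
        oom = False
        try:
            out = vae.decode_full(latents)  # hypothetical non-tiled path
        except torch.cuda.OutOfMemoryError:
            oom = True  # carry nothing but a flag out of the except block
        if oom:
            out = vae.decode_tiled(latents)  # exception context already dumped
        return out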
* feature: Set the Ascend NPU to use a single device
* Enable the `--cuda-device` parameter to support both CUDA and Ascend NPUs simultaneously.
* Make the code just set the ASCEND_RT_VISIBLE_DEVICES environment variable without any other edits to the master branch
---------
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
* flux: math: Use addcmul_ to avoid an expensive VRAM intermediate
The rope process can be the VRAM peak, and holding this intermediate
for the addition result before releasing the original can OOM.
addcmul_ it.
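The idiom (not the exact rope code):

    import torch

    a, b, c = (torch.randn(4) for _ in range(3))
    out = a + b * c  # materializes b * c, then the sum: two intermediates
    a.addcmul_(b, c)  # in-place a += b * c, no extra VRAM intermediate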
* wan: Delete the self attention before cross attention
This saves VRAM when the cross attention and FFN are in play as the
VRAM peak.
Adds installed and required workflow templates version information to the
/system_stats endpoint, allowing the frontend to detect and notify users
when their templates package is outdated.
- Add get_installed_templates_version() and get_required_templates_version()
methods to FrontendManager
- Include templates version info in system_stats response
- Add comprehensive unit tests for the new functionality
When unloading models in load_models_gpu(), the model finalizer was not
being explicitly detached, leading to a memory leak. This caused
a linear increase in memory consumption over time as models are repeatedly
loaded and unloaded.
This change prevents orphaned finalizer references from accumulating in
memory during model switching operations.
* flux: Do the xq and xk ropes one at a time
This was doing independent interleaved tensor math on the q and k
tensors, leading to holding more than the minimum intermediates
in VRAM. On a bad day, it would VRAM OOM on the xk intermediates.
Do everything q and then everything k, so torch can garbage collect
all of q's intermediates before k allocates its intermediates.
This reduces peak VRAM usage for some WAN2.2 inferences (at least).
* wan: Optimize qkv intermediates on attention
As commented. The former logic computed independent pieces of QKV in
parallel, which held more inference intermediates in VRAM, spiking
VRAM usage. Fully roping Q and garbage collecting the intermediates
before touching K reduces the peak inference VRAM usage.
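Schematically (apply_rope here is a hypothetical helper):

    def rope_qk(xq, xk, freqs):
        xq = apply_rope(xq, freqs)  # finish q completely so its intermediates
                                    # become collectable...
        xk = apply_rope(xk, freqs)  # ...before k allocates its own
        return xq, xk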
* Initial Chroma Radiance support
* Minor Chroma Radiance cleanups
* Update Radiance nodes to ensure latents/images are on the intermediate device
* Fix Chroma Radiance memory estimation.
* Increase Chroma Radiance memory usage factor
* Increase Chroma Radiance memory usage factor once again
* Ensure images are multiples of 16 for Chroma Radiance
Add batch dimension and fix channels when necessary in ChromaRadianceImageToLatent node
* Tile Chroma Radiance NeRF to reduce memory consumption, update memory usage factor
* Update Radiance to support conv nerf final head type.
* Allow setting NeRF embedder dtype for Radiance
Bump Radiance nerf tile size to 32
Support EasyCache/LazyCache on Radiance (maybe)
* Add ChromaRadianceStubVAE node
* Crop Radiance image inputs to multiples of 16 instead of erroring to be in line with existing VAE behavior
* Convert Chroma Radiance nodes to V3 schema.
* Add ChromaRadianceOptions node and backend support.
Cleanups/refactoring to reduce code duplication with Chroma.
* Fix overriding the NeRF embedder dtype for Chroma Radiance
* Minor Chroma Radiance cleanups
* Move Chroma Radiance to its own directory in ldm
Minor code cleanups and tooltip improvements
* Fix Chroma Radiance embedder dtype overriding
* Remove Radiance dynamic nerf_embedder dtype override feature
* Unbork Radiance NeRF embedder init
* Remove Chroma Radiance image conversion and stub VAE nodes
Add a chroma_radiance option to the VAELoader builtin node which uses comfy.sd.PixelspaceConversionVAE
Add a PixelspaceConversionVAE to comfy.sd for converting BHWC 0..1 <-> BCHW -1..1
* Looking into a @wrap_attn decorator to look for 'optimized_attention_override' entry in transformer_options
* Created logging code for this branch so that it can be used to track down all the code paths where transformer_options would need to be added
* Fix memory usage issue with inspect
* Made WAN attention receive transformer_options, test node added to wan to test out attention override later
* Added **kwargs to all attention functions so transformer_options could potentially be passed through
* Make sure wrap_attn doesn't make itself recurse infinitely, attempt to load SageAttention and FlashAttention if not enabled so that they can be marked as available or not, create registry for available attention
* Turn off attention logging for now, make AttentionOverrideTestNode have a dropdown with available attention (this is a test node only)
* Make flux work with optimized_attention_override
* Add logs to verify optimized_attention_override is passed all the way into attention function
* Make Qwen work with optimized_attention_override
* Made hidream work with optimized_attention_override
* Made wan patches_replace work with optimized_attention_override
* Made SD3 work with optimized_attention_override
* Made HunyuanVideo work with optimized_attention_override
* Made Mochi work with optimized_attention_override
* Made LTX work with optimized_attention_override
* Made StableAudio work with optimized_attention_override
* Made optimized_attention_override work with ACE Step
* Made Hunyuan3D work with optimized_attention_override
* Make CosmosPredict2 work with optimized_attention_override
* Made CosmosVideo work with optimized_attention_override
* Made Omnigen 2 work with optimized_attention_override
* Made StableCascade work with optimized_attention_override
* Made AuraFlow work with optimized_attention_override
* Made Lumina work with optimized_attention_override
* Made Chroma work with optimized_attention_override
* Made SVD work with optimized_attention_override
* Fix WanI2VCrossAttention so that it expects to receive transformer_options
* Fixed Wan2.1 Fun Camera transformer_options passthrough
* Fixed WAN 2.1 VACE transformer_options passthrough
* Add optimized to get_attention_function
* Disable attention logs for now
* Remove attention logging code
* Remove _register_core_attention_functions, as we wouldn't want someone to call that, just in case
* Satisfy ruff
* Remove AttentionOverrideTest node, that's something to cook up for later
* Fix showing progress from other sessions
Because `client_id` was missing from the `progress_state` message, it
was being sent to all connected sessions. This technically meant that if
someone had a graph with the same nodes, they would see the progress
updates for others.
Also added a test to prevent recurrence and moved the tests around to
make CI easier to hook up.
* Fix CI issues related to timing-sensitive tests
Load the projector.safetensors file with the ModelPatchLoader node and use
the siglip_vision_patch14_384.safetensors "clip vision" model and the
USOStyleReferenceNode.
* bigcat88's progress on adding Google Gemini Image node
* Made Google Gemini Image node functional
* Bump frontend version to get static pricing badge on Gemini Image node
- Update comfyui-frontend-package from 1.25.9 to 1.25.10
- Revert forced legacy navigation mode from PR #9518
- Frontend v1.25.10 includes proper navigation mode fixes and improved display text
* Attempting a universal implementation of EasyCache, starting with flux as test; I screwed up the math a bit, but when I set it just right it works.
* Fixed math to make threshold work as expected, refactored code to use EasyCacheHolder instead of a dict wrapped by object
* Use sigmas from transformer_options instead of timesteps to be compatible with a greater amount of models, make end_percent work
* Make log statement when not skipping useful, preparing for per-cond caching
* Added DIFFUSION_MODEL wrapper around forward function for wan model
* Add subsampling for heuristic inputs
* Add subsampling to output_prev (output_prev_subsampled now)
* Properly consider conds in EasyCache logic
* Created SuperEasyCache to test what happens if caching and reuse is moved outside the scope of conds, added PREDICT_NOISE wrapper to facilitate this test
* Change max reuse_threshold to 3.0
* Mark EasyCache/SuperEasyCache as experimental (beta)
* Make Lumina2 compatible with EasyCache
* Add EasyCache support for Qwen Image
* Fix missing comma, curse you Cursor
* Add EasyCache support to AceStep
* Add EasyCache support to Chroma
* Added EasyCache support to Cosmos Predict t2i
* Make EasyCache not crash with Cosmos Predict ImageToVideo latents, but it does not work well at all
* Add EasyCache support to hidream
* Added EasyCache support to hunyuan video
* Added EasyCache support to hunyuan3d
* Added EasyCache support to LTXV (not very good, but does not crash)
* Implemented EasyCache for aura_flow
* Renamed SuperEasyCache to LazyCache, hardcoded subsample_factor to 8 on nodes
* Extra logging when verbose is true for EasyCache
* convert Google Veo API node to the V3 schema
* use own full io.Schema for Veo3VideoGenerationNode
* fixed typo
* use auth_kwargs instead of auth_token/comfy_api_key
These are not real controlnets but actually a patch on the model so they
will be treated as such.
Put them in the models/model_patches/ folder.
Use the new ModelPatchLoader and QwenImageDiffsynthControlnet nodes.
* P2 of qwen edit model.
* Typo.
* Fix normal qwen.
* Fix.
* Make the TextEncodeQwenImageEdit also set the ref latent.
If you don't want it to set the ref latent and want to use the
ReferenceLatent node with your custom latent instead just disconnect the
VAE.
This node is only useful if someone trains the kontext model to properly
use multiple reference images via the index method.
The default is the offset method which feeds the multiple images like if
they were stitched together as one. This method works with the current
flux kontext model.
Turns out torch.compile has some gaps in context manager decorator
syntax support. I've sent patches to fix that in PyTorch, but it won't
be available for all the folks running older versions of PyTorch, hence
this trivial patch.
* Update default parameters for Moonvalley video nodes
- Changed default negative prompts to a more extensive list for both BaseMoonvalleyVideoNode and MoonvalleyVideo2VideoNode.
- Updated default guidance scale values for both nodes to enhance prompt adherence.
- Set a fixed default seed value for consistency in video generation.
* no message
* ruff fix
---------
Co-authored-by: thorsten <thorsten@tripod-digital.co.nz>
The checkbox for confirming custom node testing is now optional in both bug report and user support templates. This allows users to submit issues even if they haven't been able to test with custom nodes disabled, making the reporting process more accessible.
* Added initial support for basic context windows - in progress
* Add prepare_sampling wrapper for context window to more accurately estimate latent memory requirements, fixed merging wrappers/callbacks dicts in prepare_model_patcher
* Made context windows compatible with different dimensions; works for WAN, but results are bad
* Fix comfy.patcher_extension.merge_nested_dicts calls in prepare_model_patcher in sampler_helpers.py
* Considering adding some callbacks to context window code to allow extensions of behavior without the need to rewrite code
* Made dim slicing cleaner
* Add Wan Context Windows node for testing
* Made context schedule and fuse method functions be stored on the handler instead of needing to be registered in core code to be found
* Moved some code around between node_context_windows.py and context_windows.py
* Change manual context window nodes names/ids
* Added callbacks to IndexListContextHandler
* Adjusted default values for context_length and context_overlap, made schema.inputs definition for WAN Context Windows less annoying
* Make get_resized_cond more robust for various dim sizes
* Fix typo
* Another small fix
* Change the bf16 check, switch non-blocking off by default (with an option to force it on to regain speed on certain classes of iGPUs), and refactor the xpu check.
* Turn non_blocking off by default for xpu.
* Update README.md for Intel GPUs.
* fix(Kling Image API Node): do not pass "image_type" when no image
* fix(Kling Image API Node): raise client-side error when kling_v1 is used with reference image
- Create new Veo3VideoGenerationNode that extends VeoVideoGenerationNode
- Add support for generateAudio parameter (only for Veo3 models)
- Support new Veo3 models: veo-3.0-generate-001, veo-3.0-fast-generate-001
- Fix Veo3 duration constraint to 8 seconds only
- Update original node to be clearly Veo 2 only
- Update API paths to use model parameter: /proxy/veo/{model}/generate
- Regenerate API types from staging to include generateAudio parameter
- Fix TripoModelVersion enum reference after regeneration
- Mark generated API types file in .gitattributes
When a prompt is submitted, it can optionally include
`partial_execution_targets` as a list of ids. If it does, rather than
adding all outputs to the execution list, we add only those in the list.
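A hedged example of such a submission (the /prompt endpoint is the standard one; the exact placement of the field in the body is an assumption):

    import json, urllib.request

    payload = {
        "prompt": workflow,  # your usual graph dict
        "partial_execution_targets": ["9", "12"],  # only these outputs execute
    }
    req = urllib.request.Request(
        "http://127.0.0.1:8188/prompt",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)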
* ComfyAPI Core v0.0.2
* Respond to PR feedback
* Fix Python 3.9 errors
* Fix missing backward compatibility proxy
* Reorganize types a bit
The input types, input impls, and utility types are now all available in
the versioned API. See the change in `comfy_extras/nodes_video.py` for
an example of their usage.
* Remove the need for `--generate-api-stubs`
* Fix generated stubs differing by Python version
* Fix ruff formatting issues
* [moonvalley] Update V2V node to match API specification
- Add exact resolution validation for supported resolutions (1920x1080, 1080x1920, 1152x1152, 1536x1152, 1152x1536)
- Change frame count validation from divisible by 32 to 16
- Add MP4 container format validation
- Remove internal parameters (steps, guidance_scale) from V2V inference params
- Update video duration handling to support only 5 seconds (auto-trim if longer)
- Add motion_intensity parameter (0-100) for Motion Transfer control type
- Add get_container_format() method to VideoInput classes
* update negative prompt
* Added the parameter required_frontend_version in the /system_stats api response
* Update server.py
* Created a function get_required_frontend_version and wrote tests for it
* Refactored the function to return the currently installed frontend package version
* Moved required_frontend to a new function and imported that in server.py
* Corrected test cases using mocking techniques
* Corrected files to comply with ruff formatting
* Add factorization utils for lokr
* Add lokr train impl
* Add loha train impl
* Add adapter map for algo selection
* Add optional grad ckpt and algo selection
* Update __init__.py
* correct key name for loha
* Use custom fwd/bwd func and better init for loha
* Support gradient accumulation
* Fix bugs of loha
* use more stable init
* Add OFT training
* linting
This makes it easier to write asynchronous clients that submit requests, because they can store the task immediately.
Duplicate prompt IDs are rejected by the job queue.
Remove auth_token_comfy_org and api_key_comfy_org from extra_data before
storing prompt history to prevent sensitive authentication tokens from
being persisted in the history endpoint response.
Extends polling duration from 10 minutes to ~68 minutes (256 attempts × 16 seconds) to accommodate longer Kling API operations that were frequently timing out for users.
* Support for async execution functions
This commit adds support for node execution functions defined as async. When
a node's execution function is defined as async, we can continue
executing other nodes while it is processing.
Standard uses of `await` should "just work", but people will still have
to be careful if they spawn actual threads. Because torch doesn't really
have async/await versions of functions, this won't particularly help
with most locally-executing nodes, but it does work for e.g. web
requests to other machines.
In addition to the execute function, the `VALIDATE_INPUTS` and
`check_lazy_status` functions can also be defined as async, though we'll
only resolve one node at a time right now for those.
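For illustration, a node whose execution function is async (the class layout follows the usual custom-node convention; the URL is a placeholder):

    import aiohttp

    class RemoteCaptionExample:
        CATEGORY = "example"
        RETURN_TYPES = ("STRING",)
        FUNCTION = "run"

        @classmethod
        def INPUT_TYPES(cls):
            return {"required": {"url": ("STRING", {"default": "http://localhost:9000/caption"})}}

        async def run(self, url):
            # Other nodes keep executing while this request is in flight.
            async with aiohttp.ClientSession() as session:
                async with session.get(url) as resp:
                    return (await resp.text(),)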
* Add the execution model tests to CI
* Add a missing file
It looks like this got caught by .gitignore? There's probably a better
place to put it, but I'm not sure what that is.
* Add the websocket library for automated tests
* Add additional tests for async error cases
Also fixes one bug that was found when an async function throws an error
after being scheduled on a task.
* Add a feature flags message to reduce bandwidth
We now only send 1 preview message of the latest type the client can
support.
We'll add a console warning when the client fails to send a feature
flags message at some point in the future.
* Add async tests to CI
* Don't actually add new tests in this PR
Will do it in a separate PR
* Resolve unit test in GPU-less runner
* Just remove the tests that GHA can't handle
* Change line endings to UNIX-style
* Avoid loading model_management.py so early
Because model_management.py has a top-level `logging.info`, we have to
be careful not to import that file before we call `setup_logging`. If we
do, we end up having the default logging handler registered in addition
to our custom one.
* feat: "--whitelist-custom-nodes" args for comfy core to go with "--disable-all-custom-nodes" for development purposes
* feat: Simplify custom nodes whitelist logic to use consistent code paths
* Add '@prerelease' to use latest test frontend
Allows download of pre-release versions.
Will always get the latest pre-release version - even if it's older than the latest stable release.
* nit
* add support to read pyproject.toml from custom node
* sf
* use pydantic instead
* sf
* use pydantic_settings
* remove unnecessary try/catch and handle single-file python node
* sf
* [feat] Add GetImageSize node to return image dimensions
Added a simple GetImageSize node in comfy_extras/nodes_images.py that returns width and height of input images. The node displays dimensions on the UI via PromptServer and provides width/height as outputs for further processing.
* add display name mapping
* [fix] Add server module mock to unit tests for PromptServer import
Updated test to mock server module preventing import errors from the new PromptServer usage in GetImageSize node. Uses direct import pattern consistent with rest of codebase.
* Update fix for potential XSS on /view
This commit uses mimetypes to add more restricted filetypes to prevent from being served, since mimetypes are what browsers use to determine how to serve files.
* Fix typo
Fixed a typo that prevented the program from running
Adds mandatory checkbox to bug report and user support templates requiring users to confirm they've tested with custom nodes disabled before submitting issues.
* [feat] Add ImageStitch node for concatenating images with borders
Add ImageStitch node that concatenates images in four directions with optional borders and intelligent size handling. Features include optional second image input, configurable borders with color selection, automatic batch size matching, and dimension alignment via padding or resizing.
Upstreamed from https://github.com/kijai/ComfyUI-KJNodes with enhancements for better error handling and comprehensive test coverage.
* [fix] Fix CI issues with CUDA dependencies and linting
- Mock CUDA-dependent modules in tests to avoid CI failures on CPU-only runners
- Fix ruff linting issues for code style compliance
* [fix] Improve CI compatibility by mocking nodes module import
Prevent CUDA initialization chain by mocking the nodes module at import time,
which is cleaner than deep mocking of CUDA-specific functions.
* [refactor] Clean up ImageStitch tests
- Remove unnecessary sys.path manipulation (pythonpath set in pytest.ini)
- Remove metadata tests that test framework internals rather than functionality
- Rename complex scenario test to be more descriptive of what it tests
* [refactor] Rename 'border' to 'spacing' for semantic accuracy
- Change border_width/border_color to spacing_width/spacing_color in API
- Update all tests to use spacing terminology
- Update comments and variable names throughout
- More accurately describes the gap/separator between images
* Added initial Flux.1 Kontext Pro Image node - recreated branch to save myself sanity from rebase crap after master got rebased
* Add safety filter to Kontext.
* Make safety = 2 and input image is optional.
* Add BFL kontext API nodes.
---------
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
* Make torch compile node use wrapper instead of object_patch for the entire diffusion_models object, allowing key associations on diffusion_models to not break (loras, getting attributes, etc.)
* Moved torch compile code into comfy_api so it can be used by custom nodes with a degree of confidence
* Refactor set_torch_compile_wrapper to support a list of keys instead of just diffusion_model, as well as additional torch.compile args
* remove unused import
* Moved torch compile kwargs to be stored in model_options instead of attachments; attachments are more intended for things to be 'persisted', AKA not deepcopied
* Add some comments
* Remove random line of code, not sure how it got there
* support wan camera models
* fix by ruff check
* change camera_condition type; make camera_condition optional
* support camera trajectory nodes
* fix camera direction
---------
Co-authored-by: Qirui Sun <sunqr0667@126.com>
* [Luma] Print download URL of successful task result directly on nodes (#177)
[Veo] Print download URL of successful task result directly on nodes (#184)
[Recraft] Print download URL of successful task result directly on nodes (#183)
[Pixverse] Print download URL of successful task result directly on nodes (#182)
[Kling] Print download URL of successful task result directly on nodes (#181)
[MiniMax] Print progress text and download URL of successful task result directly on nodes (#179)
[Docs] Link to docs in `API_NODE` class property type annotation comment (#178)
[Ideogram] Print download URL of successful task result directly on nodes (#176)
Show output URL and progress text on Pika nodes (#168)
[BFL] Print download URL of successful task result directly on nodes (#175)
[OpenAI ] Print download URL of successful task result directly on nodes (#174)
* fix ruff errors
* fix 3.10 syntax error
* first pass at opus and mp3 as well as migrating flac to pyav
* minor mp3 encoding fix
* fix ruff
* delete dead code
* split out save audio to separate nodes per filetype
* fix ruff
* Handle Comfy API key based authorization (#167)
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
* Bump frontend version to include API key features (#170)
* bump templates version
---------
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
* Add Ideogram generate node.
* Add staging api.
* Add API_NODE and common error for missing auth token (#5)
* Add Minimax Video Generation + Async Task queue polling example (#6)
* [Minimax] Show video preview and embed workflow in output (#7)
* Remove uv.lock
* Remove polling operations.
* Revert "Remove polling operations."
* Update stubs.
* Added Ideogram and Minimax back in.
* Added initial BFL Flux 1.1 [pro] Ultra node (#11)
* Add --comfy-api-base launch arg (#13)
* Add instructions for staging development. (#14)
* remove validation to make it easier to run against LAN copies of the API
* Manually add BFL polling status response schema (#15)
* Add function for uploading files. (#18)
* Add Luma nodes (#16)
* Refactor util functions (#20)
* Add VIDEO type (#21)
* Add rest of Luma node functionality (#19)
* Fix image_luma_ref not working (#28)
* [Bug] Remove duplicated option T2V-01 in MinimaxTextToVideoNode (#31)
* Add utils to map from pydantic model fields to comfy node inputs (#30)
* add veo2, bump av req (#32)
* Add Recraft nodes (#29)
* Add Kling Nodes (#12)
* Add Camera Concepts (luma_concepts) to Luma Video nodes (#33)
* Add Runway nodes (#17)
* Convert Minimax node to use VIDEO output type (#34)
* Standard `CATEGORY` system for api nodes (#35)
* Set `Content-Type` header when uploading files (#36)
* add better error propagation to veo2 (#37)
* Add Realistic Image and Logo Raster styles for Recraft v3 (#38)
* Fix runway image upload and progress polling (#39)
* Fix image upload for Luma: only include `Content-Type` header field if it's set explicitly (#40)
* Moved Luma nodes to nodes_luma.py (#47)
* Moved Recraft nodes to nodes_recraft.py (#48)
* Add Pixverse nodes (#46)
* Move and fix BFL nodes to node_bfl.py (#49)
* Move and edit Minimax node to nodes_minimax.py (#50)
* Add Minimax Image to Video node + Cleanup (#51)
* Add Recraft Text to Vector node, add Save SVG node to handle its output (#53)
* Added pixverse_template support to Pixverse Text to Video node (#54)
* Added Recraft Controls + Recraft Color RGB nodes (#57)
* split remaining nodes out of nodes_api, make utility lib, refactor ideogram (#61)
* Add types and doctstrings to utils file (#64)
* Fix: `PollingOperation` progress bar update progress by absolute value (#65)
* Use common download function in kling nodes module (#67)
* Fix: Luma video nodes in `api nodes/image` category (#68)
* Set request type explicitly (#66)
* Add `control_after_generate` to all seed inputs (#69)
* Fix bug: deleting `Content-Type` when property does not exist (#73)
* Add preview to Save SVG node (#74)
* change default poll interval (#76), rework veo2
* Add Pixverse and updated Kling types (#75)
* Added Pixverse Image to Video node (#77)
* Add Pixverse Transition Video node (#79)
* Proper ray-1-6 support as fix has been applied in backend (#80)
* Added Recraft Style - Infinite Style Library node (#82)
* add ideogram v3 (#83)
* [Kling] Split Camera Control config to its own node (#81)
* Add Pika i2v and t2v nodes (#52)
* Temporary Fix for Runway (#87)
* Added Stability Stable Image Ultra node (#86)
* Remove Runway nodes (#88)
* Fix: Prompt text can't be validated in Kling nodes when using primitive nodes (#90)
* Fix: typo in node name "Stabiliy" => "Stability" (#91)
* Add String (Multiline) node (#93)
* Update Pika Duration and Resolution options (#94)
* Change base branch to master. Not main. (#95)
* Fix UploadRequest file_name param (#98)
* Removed Infinite Style Library until later (#99)
* fix ideogram style types (#100)
* fix multi image return (#101)
* add metadata saving to SVG (#102)
* Bump templates version to include API node template workflows (#104)
* Fix: `download_url_to_video_output` return type (#103)
* fix 4o generation bug (#106)
* Serve SVG files directly (#107)
* Add a bunch of nodes, 3 ready to use, the rest waiting for endpoint support (#108)
* Revert "Serve SVG files directly" (#111)
* Expose 4 remaining Recraft nodes (#112)
* [Kling] Add `Duration` and `Video ID` outputs (#105)
* Fix: datamodel-codegen sets string#binary type to non-existent `bytes_aliased` variable (#114)
* Fix: Dall-e 2 not setting request content-type dynamically (#113)
* Default request timeout: one hour. (#116)
* Add Kling nodes: camera control, start-end frame, lip-sync, video extend (#115)
* Add 8 nodes - 4 BFL, 4 Stability (#117)
* Fix error for Recraft ImageToImage error for nonexistent random_seed param (#118)
* Add remaining Pika nodes (#119)
* Make controls input work for Recraft Image to Image node (#120)
* Use upstream PR: Support saving Comfy VIDEO type to buffer (#123)
* Use Upstream PR: "Fix: Error creating video when sliced audio tensor chunks are non-c-contiguous" (#127)
* Improve audio upload utils (#128)
* Fix: Nested `AnyUrl` in request model cannot be serialized (Kling, Runway) (#129)
* Show errors and API output URLs to the user (change log levels) (#131)
* Fix: Luma I2I fails when weight is <=0.01 (#132)
* Change category of `LumaConcepts` node from image to video (#133)
* Fix: `image.shape` accessed before `image` is null-checked (#134)
* Apply small fixes and most prompt validation (if needed to avoid API error) (#135)
* Node name/category modifications (#140)
* Add back Recraft Style - Infinite Style Library node (#141)
* Fixed Kling: Check attributes of pydantic types. (#144)
* Bump `comfyui-workflow-templates` version (#142)
* [Kling] Print response data when error validating response (#146)
* Fix: error validating Kling image response, trying to use `"key" in` on Pydantic class instance (#147)
* [Kling] Fix: Correct/verify supported subset of input combos in Kling nodes (#149)
* [Kling] Fix typo in node description (#150)
* [Kling] Fix: CFG min/max not being enforced (#151)
* Rebase launch-rebase (private) on prep-branch (public copy of master) (#153)
* Bump templates version (#154)
* Fix: Kling image gen nodes don't return entire batch when `n` > 1 (#152)
* Remove pixverse_template from PixVerse Transition Video node (#155)
* Invert image_weight value on Luma Image to Image node (#156)
* Invert and resize mask for Ideogram V3 node to match masking conventions (#158)
* [Kling] Fix: image generation nodes not returning Tuple (#159)
* [Bug] [Kling] Fix Kling camera control (#161)
* Kling Image Gen v2 + improve node descriptions for Flux/OpenAI (#160)
* [Kling] Don't return video_id from dual effect video (#162)
* Bump frontend to 1.18.8 (#163)
* Use 3.9 compat syntax (#164)
* Use Python 3.10
* add example env var
* Update templates to 0.1.11
* Bump frontend to 1.18.9
---------
Co-authored-by: Robin Huang <robin.j.huang@gmail.com>
Co-authored-by: Christian Byrne <cbyrne@comfy.org>
Co-authored-by: thot experiment <94414189+thot-experiment@users.noreply.github.com>
* Upload files for Chroma Implementation
* Remove trailing whitespace
* trim more trailing whitespace..oops
* remove unused imports
* Add supported_inference_dtypes
* Set min_length to 0 and remove attention_mask=True
* Set min_length to 1
* get_modulations added from blepping and minor changes
* Add lora conversion if statement in lora.py
* Update supported_models.py
* update model_base.py
* add upstream commits
* set ModelType.FLOW, will cause the beta scheduler to work properly
* Adjust memory usage factor and remove unnecessary code
* fix mistake
* reduce code duplication
* remove unused imports
* refactor for upstream sync
* sync chroma-support with upstream via syncbranch patch
* Update sd.py
* Add Chroma as option for the OptimalStepsScheduler node
* Add basic support for videos as types
This PR adds support for VIDEO as first-class types. In order to avoid
unnecessary costs, VIDEO outputs must implement the `VideoInput` ABC,
but their implementation details can vary. Included are two
implementations of this type which can be returned by other nodes:
* `VideoFromFile` - Created with either a path on disk (as a string) or
a `io.BytesIO` containing the contents of a file in a supported format
(like .mp4). This implementation won't actually load the video unless
necessary. It will also avoid re-encoding when saving if possible.
* `VideoFromComponents` - Created from an image tensor and an optional
audio tensor.
Currently, only h264 encoded videos in .mp4 containers are supported for
saving, but the plan is to add additional encodings/containers in the
near future (particularly .webm).
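A hedged usage sketch (the import path is an assumption):

    from comfy_api.input_impl import VideoFromFile

    video = VideoFromFile("/path/to/clip.mp4")  # not parsed until needed
    # or wrap an in-memory file:
    # import io; video = VideoFromFile(io.BytesIO(mp4_bytes))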
* Add optimization to avoid parsing entire video
* Improve type declarations to reduce warnings
* Make sure BytesIO objects can be read many times
* Fix a potential issue when saving long videos
* Fix incorrect type annotation
* Add a `LoadVideo` node to make testing easier
* Refactor new types out of the base comfy folder
I've created a new `comfy_api` top-level module. The intention is that
anything within this folder would be covered by semver-style versioning
that would allow custom nodes to rely on them not introducing breaking
changes.
* Fix linting issue
This should speed up the lowvram mode a bit. It currently is only enabled when --async-offload is used but it will be enabled by default in the future if there are no problems.
* Add Ideogram generate node.
* Add staging api.
* COMFY_API_NODE_NAME node property
* switch to boolean flag and use original node name for id
* add optional to type
* Add API_NODE and common error for missing auth token (#5)
* Add Minimax Video Generation + Async Task queue polling example (#6)
* [Minimax] Show video preview and embed workflow in output (#7)
* [API Nodes] Send empty request body instead of empty dictionary. (#8)
* Fixed: removed function from rebase.
* Add pydantic.
* Remove uv.lock
* Remove polling operations.
* Update stubs workflow.
* Remove polling comments.
* Update stubs.
* Use pydantic v2.
* Use pydantic v2.
* Add basic OpenAITextToImage node
* Add.
* convert image to tensor.
* Improve types.
* Ruff.
* Push tests.
* Handle multipart form data.
- Don't set content-type for multipart/form-data
- Use the data field instead of JSON
* Change to api.comfy.org
* Handle error code 409.
* Remove nodes.
---------
Co-authored-by: bymyself <cbyrne@comfy.org>
Co-authored-by: Yoland Y <4950057+yoland68@users.noreply.github.com>
* install templates as pip package
* Update requirements.txt
* bump templates version to include hidream
---------
Co-authored-by: Chenlei Hu <hcl@comfy.org>
* support 3d model filtering
* fix lint error: blank line contains whitespace
* add model extensions to test runner mimetype cache manually
* use unittest.mock.patch
* remove mtl file from testcase (actually plaintext support file)
* add dependency aware cache that removes a cached node as soon as all of its descendants have executed. This allows users with lower RAM to run workflows they would otherwise not be able to run. The downside is that every workflow will fully run each time even if no nodes have changed.
* remove test code
* tidy code
* Ensuring a 401 error is returned when user data is not found in multi-user context.
* Returning a 401 error when the provided comfy-user does not exist on the server side.
* draft pass at a native comfy implementation of Lotus-D depth and normal estimation
* fix model_sampling kludges
* fix ruff
---------
Co-authored-by: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
This commit relaxes the divisibility constraint for single-frame
conditionings. For single frames, the index can be arbitrary, while
multi-frame conditionings (>= 9 frames) must still be aligned to 8
frames.
Co-authored-by: Andrew Kvochko <a.kvochko@lightricks.com>
* Better argument handling of front-end-root
Improves handling of the front-end-root launch argument. There have been several instances where users set it, ComfyUI launched as normal, and it completely disregarded the launch arg, which doesn't make sense. Better to indicate to the user that something is incorrect.
* Removed unused import
There was no real reason to use "Optional" typing in the front-end-root argument.
This patch fixes a bug in LTXVCropGuides when the latent has no
keyframes. Additionally, the first frame is always added as a keyframe.
Co-authored-by: Andrew Kvochko <a.kvochko@lightricks.com>
* improved: better installation guide
- change `pip` to `{sys.executable} -m pip`
modified: To prevent the guide message from being obscured by a complex error message, use `exit` instead of `raise`.
* ruff fix
The idea is that you can indicate how much quality vs speed you want.
At the moment:
--fast 2 enables fp16 accumulation if your pytorch supports it.
--fast 5 enables fp8 matrix mult on fp8 models and the optimization above.
--fast without a number enables all optimizations.
* Fix link pointing to non-existing docs
The current link is pointing to a path that does not exist any longer.
I changed it to point to the correct path for custom node datatypes.
* Update node_typing.py
The frontend part isn't done yet so there is no video preview on the node
or dragging the webm on the interface to load the workflow yet.
This uses a new dependency: PyAV.
* add LoadImageOutput node
* add route for input/output/temp files
* update node_typing.py
* use literal type for image_folder field
* mark node as beta
I'm not sure which arches are supported yet. If you see improvements in
memory usage while using --use-pytorch-cross-attention on your AMD GPU, let
me know and I will add it to the list.
* Fix for running via DirectML
Fix the DirectML empty image generation issue with Flux1. Add a CPU fallback for the unsupported path. Verified the model works on AMD GPUs.
* fix formatting
* update causal mask calculation
* Use `torch.special.expm1`
This function provides greater precision than `exp(x) - 1` for small values of `x`.
Found with TorchFix https://github.com/pytorch-labs/torchfix/
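A tiny demonstration:

    import torch

    x = torch.tensor([1e-8])
    print(torch.exp(x) - 1)  # tensor([0.]) -- the 1e-8 is lost to rounding
    print(torch.special.expm1(x))  # tensor([1.0000e-08])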
* Use non-alias
* Add 'sigmas' to transformer_options so that downstream code can know about the full scope of current sampling run, fix Hook Keyframes' guarantee_steps=1 inconsistent behavior with sampling split across different Sampling nodes/sampling runs by referencing 'sigmas'
* Cleaned up hooks.py, refactored Hook.should_register and add_hook_patches to use target_dict instead of target so that more information can be provided about the current execution environment if needed
* Refactor WrapperHook into TransformerOptionsHook, as there is no need to separate out Wrappers/Callbacks/Patches into different hook types (all affect transformer_options)
* Refactored HookGroup to also store a dictionary of hooks separated by hook_type, modified necessary code to no longer need to manually separate out hooks by hook_type
* In inner_sample, change "sigmas" to "sampler_sigmas" in transformer_options to not conflict with the "sigmas" that will overwrite "sigmas" in _calc_cond_batch
* Refactored 'registered' to be HookGroup instead of a list of Hooks, made AddModelsHook operational and compliant with should_register result, moved TransformerOptionsHook handling out of ModelPatcher.register_all_hook_patches, support patches in TransformerOptionsHook properly by casting any patches/wrappers/hooks to proper device at sample time
* Made hook clone code sane, made clear ObjectPatchHook and SetInjectionsHook are not yet operational
* Fix performance of hooks when hooks are appended via Cond Pair Set Props nodes by properly caching between positive and negative conds, make hook_patches_backup behave as intended (in the case that something pre-registers WeightHooks on the ModelPatcher instead of registering it at sample time)
* Filter only registered hooks on self.conds in CFGGuider.sample
* Make hook_scope functional for TransformerOptionsHook
* removed 4 whitespace lines to satisfy Ruff
* Add a get_injections function to ModelPatcher
* Made TransformerOptionsHook contribute to registered hooks properly, added some doc strings and removed a so-far unused variable
* Rename AddModelsHooks to AdditionalModelsHook, rename SetInjectionsHook to InjectionsHook (not yet implemented, but at least getting the naming figured out)
* Clean up a typehint
I think the issue this was working around has been solved.
If you notice that this change slows things down or causes stutters on
your AMD GPU with ROCm on Linux please report it.
This commit fixes the temporal tile size calculation, and removes
a redundant tile at the end of the range when its elements are
completely covered by the previous tile.
Co-authored-by: Andrew Kvochko <a.kvochko@lightricks.com>
* nit
* Add option to log non-error output to stdout
- No change to default behaviour
- Adds CLI argument: --log-stdout
- With this arg present, any logging of a level below logging.ERROR will be sent to stdout instead of stderr
* Add oneAPI device selector and some other minor changes.
* Fix device selector variable name.
* Flip minor version check sign.
* Undo changes to README.md.
This should make it possible to do higher res images/longer videos by
further offloading weights to CPU memory.
Please report an issue if this slows down things on your system.
The 10 step minimum for the AYS scheduler is pointless; it works well at lower steps, like 8 or even 4 steps.
For example with LCM or DMD2.
Example here: https://i.ibb.co/56CSPMj/image.png
* fix attention OOM in xformers
* allow passing attention mask in flux attention
* allow an attn_mask in flux
* attn masks can be done using replace patches instead of a separate dict
* fix return types
* fix return order
* enumerate
* patch the right keys
* arg names
* fix a silly bug
* fix xformers masks
* replace match with if, elif, else
* mask with image_ref_size
* remove unused import
* remove unused import 2
* fix pytorch/xformers attention
This corrects a weird inconsistency with skip_reshape.
It also allows masks of various shapes to be passed, which will be
automatically expanded (in a memory-efficient way) to a size that is
compatible with xformers or pytorch sdpa respectively.
* fix mask shapes
The last ROCM 6.2 build was November 22nd; after that date new builds use ROCM 6.2.4.
The builds from the new URL have been tested and work without problems.
* fix: The custom nodes installed in the paths specified in `extra_model_paths.yaml` encounter a bug where the prestartup script is not imported.
* Ensure custom paths are used during startup
https://github.com/comfyanonymous/ComfyUI/pull/5794
* Add MaHiRo (improved CFG)
long explanation of what it is is [here](https://huggingface.co/spaces/yoinked/blue-arxiv) (2024-1208.1)
note: if the node name has encoding issues (utf 8/whatever), I'd suggest replacing the face at the end with `(>w<)`
* add it to nodes.py, add description, and make it a post_cfg function
* fix
* revert the sampler_cfg_function thing
* switch cfg to args["denoised"]
- Commented out Windows OS from the CI matrix in test-ci.yml.
- Removed the test-win-nightly job to streamline testing on macOS and Linux only.
- Adjusted the matrix strategy to focus on Python versions and CUDA compatibility without Windows support.
* Reapply "Add union link connection type support (#5806)" (#5889)
This reverts commit bf9a90a145.
* Fix union type breaks existing type workarounds
* Add non-string test
* Add tests for hacks and non-string types
* Support python versions lower than 3.11
Now the only symptom of code messing up and keeping references to a model
object when it should not will be endless prints in the log instead of the
next workflow crashing ComfyUI.
* Added hook_patches to ModelPatcher for weights (model)
* Initial changes to calc_cond_batch to eventually support hook_patches
* Added current_patcher property to BaseModel
* Consolidated add_hook_patches_as_diffs into add_hook_patches func, fixed fp8 support for model-as-lora feature
* Added call to initialize_timesteps on hooks in process_conds func, and added call prepare current keyframe on hooks in calc_cond_batch
* Added default_conds support in calc_cond_batch func
* Added initial set of hook-related nodes, added code to register hooks for loras/model-as-loras, small renaming/refactoring
* Made CLIP work with hook patches
* Added initial hook scheduling nodes, small renaming/refactoring
* Fixed MaxSpeed and default conds implementations
* Added support for adding weight hooks that aren't registered on the ModelPatcher at sampling time
* Made Set Clip Hooks node work with hooks from Create Hook nodes, began work on better Create Hook Model As LoRA node
* Initial work on adding 'model_as_lora' lora type to calculate_weight
* Continued work on simpler Create Hook Model As LoRA node, started to implement ModelPatcher callbacks, attachments, and additional_models
* Fix incorrect ref to create_hook_patches_clone after moving function
* Added injections support to ModelPatcher + necessary bookkeeping, added additional_models support in ModelPatcher, conds, and hooks
* Added wrappers to ModelPatcher to facilitate standardized function wrapping
* Started scaffolding for other hook types, refactored get_hooks_from_cond to organize hooks by type
* Fix skip_until_exit logic bug breaking injection after first run of model
* Updated clone_has_same_weights function to account for new ModelPatcher properties, improved AutoPatcherEjector usage in partially_load
* Added WrapperExecutor for non-classbound functions, added calc_cond_batch wrappers
* Refactored callbacks+wrappers to allow storing lists by id
* Added forward_timestep_embed_patch type, added helper functions on ModelPatcher for emb_patch and forward_timestep_embed_patch, added helper functions for removing callbacks/wrappers/additional_models by key, added custom_should_register prop to hooks
* Added get_attachment func on ModelPatcher
* Implement basic MemoryCounter system for determining whether cached weights due to hooks should be offloaded in hooks_backup
* Modified ControlNet/T2IAdapter get_control function to receive transformer_options as additional parameter, made the model_options stored in extra_args in inner_sample be a clone of the original model_options instead of same ref
* Added create_model_options_clone func, modified type annotations to use __future__ so that I can use the better type annotations
* Refactored WrapperExecutor code to remove need for WrapperClassExecutor (now gone), added sampler.sample wrapper (pending review, will likely keep but will see what hacks this could currently let me get rid of in ACN/ADE)
* Added Combine versions of Cond/Cond Pair Set Props nodes, renamed Pair Cond to Cond Pair, fixed default conds never applying hooks (due to hooks key typo)
* Renamed Create Hook Model As LoRA nodes to make the test node the main one (more changes pending)
* Added uuid to conds in CFGGuider and uuids to transformer_options to allow uniquely identifying conds in batches during sampling
* Fixed models not being unloaded properly due to current_patcher reference; the current ComfyUI model cleanup code requires that nothing else has a reference to the ModelPatcher instances
* Fixed default conds not respecting hook keyframes, made keyframes not reset cache when strength is unchanged, fixed Cond Set Default Combine throwing error, fixed model-as-lora throwing error during calculate_weight after a recent ComfyUI update, small refactoring/scaffolding changes for hooks
* Changed CreateHookModelAsLoraTest to be the new CreateHookModelAsLora, rename old ones as 'direct' and will be removed prior to merge
* Added initial support within CLIP Text Encode (Prompt) node for scheduling weight hook CLIP strength via clip_start_percent/clip_end_percent on conds, added schedule_clip toggle to Set CLIP Hooks node, small cleanup/fixes
* Fix range check in get_hooks_for_clip_schedule so that proper keyframes get assigned to corresponding ranges
* Optimized CLIP hook scheduling to treat same strength as same keyframe
* Less fragile memory management.
* Make encode_from_tokens_scheduled call cleaner, rollback change in model_patcher.py for hook_patches_backup dict
* Fix issue.
* Remove useless function.
* Prevent and detect some types of memory leaks.
* Run garbage collector when switching workflow if needed.
* Moved WrappersMP/CallbacksMP/WrapperExecutor to patcher_extension.py
* Refactored code to store wrappers and callbacks in transformer_options, added apply_model and diffusion_model.forward wrappers
* Fix issue.
* Refactored hooks in calc_cond_batch to be part of get_area_and_mult tuple, added extra_hooks to ControlBase to allow custom controlnets w/ hooks, small cleanup and renaming
* Fixed inconsistency of results when schedule_clip is set to False, small renaming/typo fixing, added initial support for ControlNet extra_hooks to work in tandem with normal cond hooks, initial work on calc_cond_batch merging all subdicts in returned transformer_options
* Modified callbacks and wrappers so that unregistered types can be used, allowing custom_nodes to have their own unique callbacks/wrappers if desired
* Updated different hook types to reflect actual progress of implementation, initial scaffolding for working WrapperHook functionality
* Fixed existing weight hook_patches (pre-registered) not working properly for CLIP
* Removed Register/Direct hook nodes since they were present only for testing, removed diff-related weight hook calculation as improved_memory removes unload_model_clones and using sample time registered hooks is less hacky
* Added clip scheduling support to all other native ComfyUI text encoding nodes (sdxl, flux, hunyuan, sd3)
* Made WrapperHook functional, added another wrapper/callback getter, added ON_DETACH callback to ModelPatcher
* Made opt_hooks append by default instead of replace, renamed comfy.hooks set functions to be more accurate
* Added apply_to_conds to Set CLIP Hooks, modified relevant code to allow text encoding to automatically apply hooks to output conds when apply_to_conds is set to True
* Fix cached_hook_patches not respecting target_device/memory_counter results
* Fixed issue with setting weights from hooks instead of copying them, added additional memory_counter check when caching hook patches
* Remove unnecessary torch.no_grad calls for hook patches
* Increased MemoryCounter minimum memory to leave free by *2 until a better way to get inference memory estimate of currently loaded models exists
* For encode_from_tokens_scheduled, allow start_percent and end_percent in add_dict to limit which scheduled conds get encoded for optimization purposes
* Removed a .to call on results of calculate_weight in patch_hook_weight_to_device that was screwing up the intermediate results for fp8 prior to being passed into stochastic_rounding call
* Made encode_from_tokens_scheduled work when no hooks are set on patcher
* Small cleanup of comments
* Turn off hook patch caching when only 1 hook present in sampling, replace some current_hook = None with calls to self.patch_hooks(None) instead to avoid a potential edge case
* On Cond/Cond Pair nodes, removed opt_ prefix from optional inputs
* Allow both FLOATS and FLOAT for floats_strength input
* Revert change, does not work
* Made patch_hook_weight_to_device respect set_func and convert_func
* Make discard_model_sampling True by default
* Add changes manually from 'master' so merge conflict resolution goes more smoothly
* Cleaned up text encode nodes with just a single clip.encode_from_tokens_scheduled call
* Make sure encode_from_tokens_scheduled will respect use_clip_schedule on clip
* Made nodes in nodes_hooks be marked as experimental (beta)
* Add get_nested_additional_models for cases where additional_models could have their own additional_models, and add robustness for circular additional_models references
* Made finalize_default_conds area math consistent with other sampling code
* Changed 'opt_hooks' input of Cond/Cond Pair Set Default Combine nodes to 'hooks'
* Remove a couple old TODO's and a no longer necessary workaround
* Less fragile memory management.
* Fix issue.
* Remove useless function.
* Prevent and detect some types of memory leaks.
* Run garbage collector when switching workflow if needed.
* Fix issue.
Add a way to reshape lora weights.
Allow weight patches to all weight not just .weight and .bias
Add a way for a lora to set a weight to a specific value.
This one should work for skipping the single layers of models like Flux
and Auraflow.
If you want to see how these models work and how many double/single layers
they have see the "ModelMerge*" nodes for the specific model.
* fix --cuda-device arg for AMD/HIP devices
CUDA_VISIBLE_DEVICES is ignored for HIP devices/backend. Instead it uses HIP_VISIBLE_DEVICES. Setting this environment variable has no side effect for CUDA/NVIDIA so it can safely be set in any case and vice versa.
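A minimal sketch of the resulting behavior (the function name is illustrative, not the actual code):
```python
import os

def select_device(device_id: int):
    # CUDA ignores HIP_VISIBLE_DEVICES and HIP ignores CUDA_VISIBLE_DEVICES,
    # so setting both is safe regardless of which backend is in use.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(device_id)
    os.environ["HIP_VISIBLE_DEVICES"] = str(device_id)
```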
* deleted accidental if
* Add /logs/raw and /logs/subscribe for getting logs on frontend
Hijacks stderr/stdout to send all output data to the client on flush
* Use existing send sync method
* Fix get_logs should return string
* Fix bug
* pass no server
* fix tests
* Fix output flush on linux
* Update nodes_images.py
Nodes menu has inconsistency in names, some with spaces between words, others not.
* Update nodes.py
Include the node mapping name line for Image Crop Node
* Update nodes_images.py
* Rename image nodes
add space between words for consistency > Display name mappings
* Update nodes_images.py
To use:
"Load CLIP" node with t5xxl + type mochi
"Load Diffusion Model" node with the mochi dit file.
"Load VAE" with the mochi vae file.
EmptyMochiLatentVideo node for the latent.
euler + linear_quadratic in the KSampler node.
* Frontend Manager: avoid redundant gh calls for static versions
* actually, removing old tmpdir isn't needed
I tested - downloader code handles this case well already
(also rmdir was wrong func anyway, needed shutil.rmtree if it had content)
* add code comment
This is a port of the ModelSamplerTonemapNoiseTest from the experiments
repo.
To replicate that node use LatentOperationTonemapReinhard and
LatentApplyOperationCFG together.
Somehow managed to drop a file called "nul" into a windows checkpoints subdirectory. This caused all sorts of havoc with many nodes that needed the list of checkpoints.
* add internal /folder_paths route
returns a json maps of folder paths
* (minor) format download_models.py
* initial folder path input on download api
* actually, require folder_path and clean up some code
* partial tests update
* fix & logging
* also download to a tmp file not the live file
to avoid compounding errors from network failure
* update tests again
* test tweaks
* workaround the first tests blocker
* fix file handling in tests
* rewrite test for create_model_path
* minor doc fix
* avoid 'mock_directory'
use temp dir to avoid accidental fs pollution from tests
* Run unit tests on Windows as well.
* Test on mac.
* Continue running on error.
* Compared normalized paths to work cross platform.
* Only test common set of mimetypes across operating systems.
* add 'is_default' to model paths config
including impl and doc in example file
* update weirdly overspecific test expectations
* oh there's two
* sigh
* Override user directory.
* Use overridden user directory.
* Remove prints.
* Remove references to global user_files.
* Remove unused replace_folder function.
* Remove newline.
* Remove global during get_user_directory.
* Add validation.
It probably only works on Linux.
For maximum speed on Flux with Nvidia 40 series/ada and newer try using
this node with fp8_e4m3fn and the --fast argument.
* Update sampling.py
* Update samplers.py
* my bad
* "fix" the sampler
* Update samplers.py
* i named it wrong
* minor sampling improvements
mainly using a dynamic rho value (hey this sounds a lot like smea!!!)
* revert rho change
rho? r? its just 1/2
All commits from the past 30 min were done on the top of Mt Fuji
By Comfy, Robin, and Yoland
All other comfy org members died on the way
Introduced unit tests to verify the correctness of various folder path
utility functions such as `get_directory_by_type`, `annotated_filepath`,
and `recursive_search` among others. These tests cover scenarios
including directory retrieval, filepath annotation, recursive file
searches, and filtering files by extensions, enhancing the robustness
and reliability of the codebase.
* Expand user path.
* Add test.
* Add unit test for expanding base path.
* Simplify unit test.
* Remove comment.
* Remove comment.
* Checkpoints.
* Refactor.
Browsers are dumb and let any website make requests to localhost; this should
prevent that without breaking things. CORS prevents the javascript from
reading the response but they can still write it.
At the moment this is only enabled when the --enable-cors-header argument
is not used.
text_encoder_diff should be connected to a CLIPMergeSubtract node.
model_diff and text_encoder_diff are optional inputs so you can create
model only loras, text encoder only loras or a lora that contains both.
This is a format with keys like:
text_encoders.clip_l.transformer.text_model.encoder.layers.9.self_attn.v_proj.lora_up.weight
Instead of waiting for me to add support for specific lora formats you can
convert your text encoder loras to this format instead.
If you want to see an example save a text encoder lora with the SaveLora
node with the commit right after this one.
* Add route for getting output logs
* Include ComfyUI version
* Move to own function
* Changed to memory logger
* Unify logger setup logic
* Fix get version git fallback
---------
Co-authored-by: pythongosssss <125205205+pythongosssss@users.noreply.github.com>
Currently, if a graph partially fails validation (i.e. some outputs are
valid while others have links from missing nodes), the execution loop
could get an exception resulting in server lockup.
This isn't actually possible to reproduce via the default UI, but is a
potential issue for people using the API to construct invalid graphs.
This code automatically forces upcasting attention for MacOS versions 14.5 and 14.6. My computer returns the string "14.6.1" for `platform.mac_ver()[0]`, so this generalizes the comparison to catch more versions.
I am running MacOS Sonoma 14.6.1 (latest version) and was seeing black image generation on previously functional workflows after recent software updates. This PR solved the issue for me.
See comfyanonymous/ComfyUI#3521
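A hedged sketch of the generalized version check (names are illustrative; the real code lives in the model management path):
```python
import platform

def needs_attention_upcast():
    # Parse only the major/minor components so "14.6.1" still matches
    # the 14.5/14.6 range that produces black images without upcasting.
    ver = platform.mac_ver()[0]  # "" on non-macOS platforms
    if not ver:
        return False
    parts = tuple(int(p) for p in ver.split(".")[:2])
    return (14, 5) <= parts <= (14, 6)
```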
This change fixes a bug where non-constant values could be passed to the
IS_CHANGED function. This would result in workflows taking an extra
execution before they acted as if they were cached.
The actual change is like 4 characters -- the rest is adding unit tests.
When generating images with fp8_e4_m3 Flux and batch size >1, using --fast, ComfyUI throws a "view size is not compatible with input tensor's size and stride" error pointing at the first of these two calls to view.
As reshape is semantically equivalent to view except for working on a broader set of inputs, there should be no downside to changing this. The only difference is that it clones the underlying data in cases where .view would error out. I have confirmed that the output still looks as expected, but cannot confirm that no mutable use is made of the tensors anywhere.
Note that --fast is only marginally faster than the default.
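For illustration, the difference between the two calls on a non-contiguous tensor (a generic sketch, not the Flux code path):
```python
import torch

x = torch.randn(4, 8).t()  # transpose produces a non-contiguous tensor
try:
    x.view(-1)             # fails: view requires a compatible size and stride
except RuntimeError as e:
    print("view failed:", e)
y = x.reshape(-1)          # reshape copies the data when a view is impossible
print(y.shape)             # torch.Size([32])
```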
* Create internal route table.
* List files.
* Add GET /internal/files.
Retrieves list of files in models, output, and user directories.
* Refactor file names.
* Use typing_extensions for Python 3.8
* Fix tests.
* Remove print statements.
* Update README.
* Add output and user to valid directory test.
* Add missing type hints.
Optimizations that might break things/lower quality will be put behind
this flag first and might be enabled by default in the future.
Currently the only optimization is float8_e4m3fn matrix multiplication on
4000/ADA series Nvidia cards or later. If you have one of these cards you
will see a speed boost when using fp8_e4m3fn flux for example.
--reserve-vram 1.0 for example will make ComfyUI try to keep 1GB vram free.
This can also be useful if workflows are failing because of OOM errors but
in that case please report it if --reserve-vram improves your situation.
* Add Flux model support for InstantX style controlnet residuals
* Refactor Flux controlnet residual step to a separate method
* Rollback minor change
* New format for applying controlnet residuals: input->double_blocks, output->single_blocks
* Adjust XLabs Flux controlnet to fit new syntax of applying Flux controlnet residuals
* Remove unnecessary import and minor style change
* Execution Model Inversion
This PR inverts the execution model -- from recursively calling nodes to
using a topological sort of the nodes. This change allows for
modification of the node graph during execution. This allows for two
major advantages:
1. The implementation of lazy evaluation in nodes. For example, if a
"Mix Images" node has a mix factor of exactly 0.0, the second image
input doesn't even need to be evaluated (and vice versa if the mix
factor is 1.0); a sketch of this pattern follows the description below.
2. Dynamic expansion of nodes. This allows for the creation of dynamic
"node groups". Specifically, custom nodes can return subgraphs that
replace the original node in the graph. This is an incredibly
powerful concept. Using this functionality, it was easy to
implement:
a. Components (a.k.a. node groups)
b. Flow control (i.e. while loops) via tail recursion
c. All-in-one nodes that replicate the WebUI functionality
d. and more
All of those were able to be implemented entirely via custom nodes,
so those features are *not* a part of this PR. (There are some
front-end changes that should occur before that functionality is
made widely available, particularly around variant sockets.)
The custom nodes associated with this PR can be found at:
https://github.com/BadCafeCode/execution-inversion-demo-comfyui
Note that some of them require that variant socket types ("*") be
enabled.
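As a rough illustration of the lazy-evaluation pattern this enables (a minimal sketch; the node, its fields, and the `{"lazy": True}` input option are assumptions based on this PR's description, not code from it):
```python
class MixImages:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {
            "image1": ("IMAGE", {"lazy": True}),
            "image2": ("IMAGE", {"lazy": True}),
            "mix_factor": ("FLOAT", {"default": 0.5, "min": 0.0, "max": 1.0}),
        }}

    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "mix"

    def check_lazy_status(self, mix_factor, image1, image2):
        # Ask the executor to evaluate only the inputs the mix actually needs;
        # unevaluated lazy inputs arrive as None.
        needed = []
        if mix_factor < 1.0 and image1 is None:
            needed.append("image1")
        if mix_factor > 0.0 and image2 is None:
            needed.append("image2")
        return needed

    def mix(self, image1, image2, mix_factor):
        if mix_factor == 0.0:
            return (image1,)
        if mix_factor == 1.0:
            return (image2,)
        return (image1 * (1.0 - mix_factor) + image2 * mix_factor,)
```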
* Allow `input_info` to be of type `None`
* Handle errors (like OOM) more gracefully
* Add a command-line argument to enable variants
This allows the use of nodes that have sockets of type '*' without
applying a patch to the code.
* Fix an overly aggressive assertion.
This could happen when attempting to evaluate `IS_CHANGED` for a node
during the creation of the cache (in order to create the cache key).
* Fix Pyright warnings
* Add execution model unit tests
* Fix issue with unused literals
Behavior should now match the master branch with regard to undeclared
inputs. Undeclared inputs that are socket connections will be used while
undeclared inputs that are literals will be ignored.
* Make custom VALIDATE_INPUTS skip normal validation
Additionally, if `VALIDATE_INPUTS` takes an argument named `input_types`,
that variable will be a dictionary of the socket type of all incoming
connections. If that argument exists, normal socket type validation will
not occur. This removes the last hurdle for enabling variant types
entirely from custom nodes, so I've removed that command-line option.
I've added appropriate unit tests for these changes.
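A minimal sketch of the described behavior (the node specifics are hypothetical): declaring an `input_types` argument hands `VALIDATE_INPUTS` the socket types of all incoming connections and suppresses normal socket type validation.
```python
class FlexibleNode:
    @classmethod
    def VALIDATE_INPUTS(cls, input_types):
        # Because 'input_types' is declared, normal socket type validation is
        # skipped and this function decides which connection types are valid.
        if input_types.get("value") not in ("IMAGE", "LATENT"):
            return "value must be an IMAGE or LATENT connection"
        return True
```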
* Fix example in unit test
This wouldn't have caused any issues in the unit test, but it would have
bugged the UI if someone copy+pasted it into their own node pack.
* Use fstrings instead of '%' formatting syntax
* Use custom exception types.
* Display an error for dependency cycles
Previously, dependency cycles that were created during node expansion
would cause the application to quit (due to an uncaught exception). Now,
we'll throw a proper error to the UI. We also make an attempt to 'blame'
the most relevant node in the UI.
* Add docs on when ExecutionBlocker should be used
* Remove unused functionality
* Rename ExecutionResult.SLEEPING to PENDING
* Remove superfluous function parameter
* Pass None for uneval inputs instead of default
This applies to `VALIDATE_INPUTS`, `check_lazy_status`, and lazy values
in evaluation functions.
* Add a test for mixed node expansion
This test ensures that a node that returns a combination of expanded
subgraphs and literal values functions correctly.
* Raise exception for bad get_node calls.
* Minor refactor of IsChangedCache.get
* Refactor `map_node_over_list` function
* Fix ui output for duplicated nodes
* Add documentation on `check_lazy_status`
* Add file for execution model unit tests
* Clean up Javascript code as per review
* Improve documentation
Converted some comments to docstrings as per review
* Add a new unit test for mixed lazy results
This test validates that when an output list is fed to a lazy node, the
node will properly evaluate previous nodes that are needed by any inputs
to the lazy node.
No code in the execution model has been changed. The test already
passes.
* Allow kwargs in VALIDATE_INPUTS functions
When kwargs are used, validation is skipped for all inputs as if they
had been mentioned explicitly.
* List cached nodes in `execution_cached` message
This was previously just bugged in this PR.
* Add support for simple tooltips
* Fix overflow
* Add tooltips for nodes in the default workflow
* new line
* Prevent potential crash
* PR feedback
* Hide tooltip when clicking (e.g. combo widget)
* Refactor tooltips, add node level support
* Fix
* move
* Fix test (and undo last change)
* Fixed indent
* Fix dom widgets, dont show tooltip if not over canvas
* Add model downloading endpoint.
* Move client session init to async function.
* Break up large function.
* Send "download_progress" as websocket event.
* Fixed
* Fixed.
* Use async mock.
* Move server set up to right before run call.
* Validate that model subdirectory cannot contain relative paths.
* Add download_model test checking for invalid paths.
* Remove DS_Store.
* Consolidate DownloadStatus and DownloadModelResult
* Add progress_interval as an optional parameter.
* Use tuple type from annotations.
* Use pydantic.
* Update comment.
* Revert "Use pydantic."
This reverts commit 7461e8eb00.
* Add new line.
* Add newline EOF.
* Validate model filename as well.
* Add comment to not reply on internal.
* Restrict downloading to safetensor files only.
* add support for HunYuanDit ControlNet
* fix hunyuandit controlnet
* fix typo in hunyuandit controlnet
* fix typo in hunyuandit controlnet
* fix code format style
* add control_weight support for HunyuanDit Controlnet
* use control_weights in HunyuanDit Controlnet
* fix typo
'_target' allows secrets to pass through, and we're just using the secret that allows uploading to the dashboard and are manually vetting PRs before running this workflow anyway
The keys are just: model.full.model.key.name.lora_up.weight
It is supported by all comfyui supported models.
Now people can just convert loras to this format instead of having to ask
for me to implement them.
* Lower SAG step for finer control
Since the introduction of cfg++ which uses very low cfg value, a step of 0.1 in SAG might be too high for finer control. Even SAG of 0.1 can be too high when cfg is only 0.6, so I change the step to 0.01.
* Lower PAG step as well.
* Update nodes_sag.py
This breaks seeds for resolutions that are not a multiple of 16 in pixel
resolution by using circular padding instead of reflection padding but
should lower the amount of artifacts when doing img2img at those
resolutions.
Fixes an issue where under certain conditions, the ComfyUI custom undo / redo functions would not run when intended to.
When trying to undo an action like deleting several nodes, instead the native browser undo runs - e.g. a textarea gets focus and the last typed text is undone. Clicking outside the text area and typing again just keeps doing the same thing.
* Let tokenizers return weights to be stored in the saved checkpoint.
* Basic hunyuan dit implementation.
* Fix some resolutions not working.
* Support hydit checkpoint save.
* Init with right dtype.
* Switch to optimized attention in pooler.
* Fix black images on hunyuan dit.
* cli_args: Add --duplicate-check-hash-function.
* server.py: compare_image_hash configurable hash function
Uses an argument added in cli_args to specify the type of hashing to default to for duplicate hash checking. Uses an `eval()` to identify the specific hashlib class to utilize, but ultimately safely operates because we have specific options and only those options/choices in the arg parser. So we don't have any unsafe input there.
* Add hasher() to node_helpers
* hashlib selection moved to node_helpers
* default-hashing-function instead of dupe checking hasher
This makes a default-hashing-function option instead of previous selected option.
* Use args.default_hashing_function
* Use safer handling for node_helpers.hasher()
Uses a safer handling method than `eval` to evaluate default hashing function.
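Roughly, the safer lookup amounts to something like this sketch (names are illustrative, not the exact node_helpers code):
```python
import hashlib

HASH_FUNCTIONS = {"md5", "sha1", "sha256", "sha512"}

def hasher(name: str):
    # getattr on the hashlib module replaces eval(); the arg parser already
    # restricts the value to a fixed set of choices, this just re-checks it.
    if name not in HASH_FUNCTIONS:
        raise ValueError(f"unsupported hashing function: {name}")
    return getattr(hashlib, name)
```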
* Stray parentheses are evil.
* Indentation fix.
Somehow when I hit save I didn't notice I missed a space to make indentation work proper. Oops!
* Add frontend manager
* Add tests
* nit
* Add unit test to github CI
* Fix path
* nit
* ignore
* Add logging
* Install test deps
* Remove 'stable' keyword support
* Update test
* Add web-root arg
* Rename web-root to front-end-root
* Add test on non-exist version number
* Use repo owner/name to replace hard coded provider list
* Inline cmd args
* nit
* Fix unit test
* Fix send to workflow
Fix center align of close workflow dialog
Better support for elements around canvas
* More resilient to extra elements added to body
If you have memory issues you can try disabling the smart memory management by running comfyui with:
run_amd_gpu_disable_smart_memory.bat
IF YOU GET A RED ERROR IN THE UI MAKE SURE YOU HAVE A MODEL/CHECKPOINT IN: ComfyUI\models\checkpoints
You can download the stable diffusion XL one from: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors
RECOMMENDED WAY TO UPDATE:
To update the ComfyUI code: update\update_comfyui.bat
TO SHARE MODELS BETWEEN COMFYUI AND ANOTHER UI:
In the ComfyUI directory you will find a file: extra_model_paths.yaml.example
Rename this file to: extra_model_paths.yaml and edit it with your favorite text editor.
If you want to enable the fast fp16 accumulation (faster for fp16 models with slightly less quality):
run_nvidia_gpu_fast_fp16_accumulation.bat
To run it in slow CPU mode:
run_cpu.bat
IF YOU GET A RED ERROR IN THE UI MAKE SURE YOU HAVE A MODEL/CHECKPOINT IN: ComfyUI\models\checkpoints
You can download the stable diffusion 1.5 one from: https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned-emaonly.ckpt
You can download the stable diffusion 1.5 one from: https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/blob/main/v1-5-pruned-emaonly-fp16.safetensors
Before submitting a **Bug Report**, please ensure the following:
- **1:** You are running the latest version of ComfyUI.
- **2:** You have looked at the existing bug reports and made sure this isn't already reported.
- **2:** You have your ComfyUI logs and relevant workflow on hand and will post them in this bug report.
- **3:** You confirmed that the bug is not caused by a custom node. You can disable all custom nodes by passing
`--disable-all-custom-nodes` command line argument.
`--disable-all-custom-nodes` command line argument. If you have custom node try updating them to the latest version.
- **4:** This is an actual bug in ComfyUI, not just a support question. A bug is when you can specify exact
steps to replicate what went wrong and others will be able to repeat your steps and see the same issue happen.
If unsure, ask on the [ComfyUI Matrix Space](https://app.element.io/#/room/%23comfyui_space%3Amatrix.org) or the [Comfy Org Discord](https://discord.gg/comfyorg) first.
## Very Important
Please make sure that you post ALL your ComfyUI logs in the bug report. A bug report without logs will likely be ignored.
- type: checkboxes
  id: custom-nodes-test
  attributes:
    label: Custom Node Testing
    description: Please confirm you have tried to reproduce the issue with all custom nodes disabled.
    options:
      - label: I have tried disabling custom nodes and the issue persists (see [how to disable custom nodes](https://docs.comfy.org/troubleshooting/custom-node-issues#step-1%3A-test-with-all-custom-nodes-disabled) if you need help)
**2:** You have made an effort to find public answers to your question before asking here. In other words, you googled it first, and scrolled through recent help topics.
If unsure, ask on the [ComfyUI Matrix Space](https://app.element.io/#/room/%23comfyui_space%3Amatrix.org) or the [Comfy Org Discord](https://discord.gg/comfyorg) first.
- type: checkboxes
  id: custom-nodes-test
  attributes:
    label: Custom Node Testing
    description: Please confirm you have tried to reproduce the issue with all custom nodes disabled.
    options:
      - label: I have tried disabling custom nodes and the issue persists (see [how to disable custom nodes](https://docs.comfy.org/troubleshooting/custom-node-issues#step-1%3A-test-with-all-custom-nodes-disabled) if you need help)
body: '(Automated Bot Message) CI Tests are running, you can view the results at https://ci.comfy.org/?branch=${{ github.event.pull_request.number }}%2Fmerge'
stale-issue-message: "This issue is being marked stale because it has not had any activity for 30 days. Reply below within 7 days if your issue still isn't solved, and it will be left open. Otherwise, the issue will be closed automatically."
Quantization aims to map a high-precision value x_f to a lower-precision format with minimal loss in accuracy. These smaller formats then serve to reduce the model's memory footprint and increase throughput by using specialized hardware.
When simply converting a value from FP16 to FP8 using the round-to-nearest method we might hit two issues:
- The dynamic range of FP16 (-65,504, 65,504) far exceeds FP8 formats like E4M3 (-448, 448) or E5M2 (-57,344, 57,344), potentially resulting in clipped values
- The original values are concentrated in a small range (e.g. -1,1), leaving many FP8 bits "unused"
By using a scaling factor, we aim to map these values into the quantized-dtype range, making use of the full spectrum. One of the easiest and most common approaches is per-tensor absolute-maximum scaling.
Given that additional information (scaling factor) is needed to "interpret" the quantized values, we describe those as derived datatypes.
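As a minimal sketch of per-tensor absolute-maximum scaling into E4M3 (the helper names are illustrative, not the `comfy/quant_ops.py` API):
```python
import torch

F8_E4M3_MAX = 448.0

def absmax_quantize(x: torch.Tensor):
    # Map the tensor's observed range onto the full E4M3 range.
    scale = x.abs().max() / F8_E4M3_MAX
    q = (x / scale).to(torch.float8_e4m3fn)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor, dtype=torch.float16):
    return q.to(dtype) * scale

w = torch.randn(1024, 1024, dtype=torch.float16) * 0.1
q, scale = absmax_quantize(w)
print((w - dequantize(q, scale)).abs().max())  # per-tensor quantization error
```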
## Quantization in Comfy
```
QuantizedTensor (torch.Tensor subclass)
↓ __torch_dispatch__
Two-Level Registry (generic + layout handlers)
↓
MixedPrecisionOps + Metadata Detection
```
### Representation
To represent these derived datatypes, ComfyUI subclasses torch.Tensor via the `QuantizedTensor` class found in `comfy/quant_ops.py`.
A `Layout` class defines how a specific quantization format behaves:
- Required parameters
- Quantize method
- De-Quantize method
```python
from comfy.quant_ops import QuantizedLayout

class MyLayout(QuantizedLayout):
    @classmethod
    def quantize(cls, tensor, **kwargs):
        # Convert to quantized format
        qdata = ...
        params = {'scale': ..., 'orig_dtype': tensor.dtype}
        return qdata, params

    @staticmethod
    def dequantize(qdata, scale, orig_dtype, **kwargs):
        return qdata.to(orig_dtype) * scale
```
To then run operations using these QuantizedTensors we use two registry systems to define supported operations.
The first is a **generic registry** that handles operations common to all quantized formats (e.g., `.to()`, `.clone()`, `.reshape()`).
The second registry is layout-specific and allows implementing fast paths like `nn.Linear`.
When `torch.nn.functional.linear()` is called with QuantizedTensor arguments, `__torch_dispatch__` automatically routes to the registered implementation.
For any unsupported operation, QuantizedTensor will fall back to `dequantize` and dispatch using the high-precision implementation.
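The routing logic, reduced to a hedged sketch (the dictionary registries and names here are illustrative, not the actual `comfy/quant_ops.py` structures):
```python
import torch

GENERIC_OPS = {}  # handlers shared by every layout (.to, .clone, ...)
LAYOUT_OPS = {}   # (layout_name, op) -> fast-path handler, e.g. linear

def dispatch(op, layout_name, qdata, params, *args):
    handler = LAYOUT_OPS.get((layout_name, op)) or GENERIC_OPS.get(op)
    if handler is not None:
        return handler(qdata, params, *args)
    # Fallback path: dequantize, then run the high-precision implementation.
    dense = qdata.to(params["orig_dtype"]) * params["scale"]
    return op(dense, *args)

# torch.sin has no registered handler, so it falls back to dequantize + sin.
q = (torch.randn(4) / 0.1).to(torch.float8_e4m3fn)
out = dispatch(torch.sin, "absmax_fp8", q, {"scale": 0.1, "orig_dtype": torch.float32})
```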
### Mixed Precision
The `MixedPrecisionOps` class (lines 542-648 in `comfy/ops.py`) enables per-layer quantization decisions, allowing different layers in a model to use different precisions. This is activated when a model config contains a `layer_quant_config` dictionary that specifies which layers should be quantized and how.
**Architecture:**
```python
class MixedPrecisionOps(disable_weight_init):
    _layer_quant_config = {}  # Maps layer names to quantization configs
```
Not all layers tolerate quantization equally. Sensitive operations like final projections can be kept in higher precision, while compute-heavy matmuls are quantized. This provides most of the performance benefits while maintaining quality.
The system is selected in `pick_operations()` when `model_config.layer_quant_config` is present, making it the highest-priority operation mode.
## Checkpoint Format
Quantized checkpoints are stored as standard safetensors files with quantized weight tensors and associated scaling parameters, plus a `_quantization_metadata` JSON entry describing the quantization scheme.
The quantized checkpoint will contain the same layers as the original checkpoint but:
- The weights are stored as quantized values, sometimes using a different storage datatype. E.g. uint8 container for fp8.
- For each quantized weight a number of additional scaling parameters are stored alongside depending on the recipe.
- We store a metadata.json in the metadata of the final safetensor containing the `_quantization_metadata` describing which layers are quantized and what layout has been used.
### Scaling Parameters details
We define 4 possible scaling parameters that should cover most recipes in the near-future:
- **weight_scale**: quantization scalers for the weights
- **weight_scale_2**: global scalers in the context of double scaling
- **pre_quant_scale**: scalers used for smoothing salient weights
- **input_scale**: quantization scalers for the activations
You can find the defined formats in `comfy/quant_ops.py` (QUANT_ALGOS).
### Quantization Metadata
The metadata stored alongside the checkpoint contains:
- **format_version**: String to define a version of the standard
- **layers**: A dictionary mapping layer names to their quantization format. The format string maps to the definitions found in `QUANT_ALGOS`.
Example:
```json
{
  "_quantization_metadata": {
    "format_version": "1.0",
    "layers": {
      "model.layers.0.mlp.up_proj": "float8_e4m3fn",
      "model.layers.0.mlp.down_proj": "float8_e4m3fn",
      "model.layers.1.mlp.up_proj": "float8_e4m3fn"
    }
  }
}
```
## Creating Quantized Checkpoints
To create compatible checkpoints, use any quantization tool provided the output follows the checkpoint format described above and uses a layout defined in `QUANT_ALGOS`.
### Weight Quantization
Weight quantization is straightforward - compute the scaling factor directly from the weight tensor using the absolute maximum method described earlier. Each layer's weights are quantized independently and stored with their corresponding `weight_scale` parameter.
### Calibration (for Activation Quantization)
Activation quantization (e.g., for FP8 Tensor Core operations) requires `input_scale` parameters that cannot be determined from static weights alone. Since activation values depend on actual inputs, we use **post-training calibration (PTQ)**:
1. **Collect statistics**: Run inference on N representative samples
2. **Track activations**: Record the absolute maximum (`amax`) of inputs to each quantized layer
3. **Compute scales**: Derive `input_scale` from collected statistics
4. **Store in checkpoint**: Save `input_scale` parameters alongside weights
The calibration dataset should be representative of your target use case. For diffusion models, this typically means a diverse set of prompts and generation parameters.
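A hedged sketch of the PTQ loop described above (hook-based amax tracking; all names are illustrative):
```python
import torch

def calibrate_input_scales(model, layer_names, samples, qmax=448.0):
    amax = {n: 0.0 for n in layer_names}
    modules = dict(model.named_modules())
    hooks = []
    for n in layer_names:
        def track(mod, inputs, output, n=n):
            # Record the absolute maximum of this layer's input activations.
            amax[n] = max(amax[n], inputs[0].abs().max().item())
        hooks.append(modules[n].register_forward_hook(track))
    with torch.no_grad():
        for x in samples:  # N representative samples
            model(x)
    for h in hooks:
        h.remove()
    # input_scale maps the observed activation range onto the quantized range.
    return {n: a / qmax for n, a in amax.items()}
```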
The most powerful and modular stable diffusion GUI and backend.
-----------

<div align="center">
This UI will let you design and execute advanced stable diffusion pipelines using a graph/nodes/flowchart based interface. For some workflow examples and to see what ComfyUI can do, you can check out:
ComfyUI lets you design and execute advanced stable diffusion pipelines using a graph/nodes/flowchart based interface. Available on Windows, Linux, and macOS.
- Latent previews with [TAESD](#how-to-show-high-quality-previews)
- Starts up very fast.
- Works fully offline: will never download anything.
- Works fully offline: core will never download anything unless you want to.
- Optional API nodes to use paid models from external providers through the online [Comfy API](https://docs.comfy.org/tutorials/api-nodes/overview).
- [Config file](extra_model_paths.yaml.example) to set the search paths for models.
Workflow examples can be found on the [Examples page](https://comfyanonymous.github.io/ComfyUI_examples/)
## Release Process
ComfyUI follows a weekly release cycle targeting Monday, but this regularly changes because of model releases or large changes to the codebase. There are three interconnected repositories:
| `.` | Fit view to selection (Whole graph when nothing is selected) |
| Double-Click LMB | Open node quick search palette |
| `Shift` + Drag | Move multiple wires at once |
| `Ctrl` + `Alt` + LMB | Disconnect all wires from clicked slot |
Ctrl can also be replaced with Cmd instead for macOS users
`Ctrl` can also be replaced with `Cmd` instead for macOS users
# Installing
## Windows
## Windows Portable
There is a portable standalone build for Windows that should work for running on Nvidia GPUs or for running on your CPU only on the [releases page](https://github.com/comfyanonymous/ComfyUI/releases).
### [Direct link to download](https://github.com/comfyanonymous/ComfyUI/releases/download/latest/ComfyUI_windows_portable_nvidia_cu121_or_cpu.7z)
### [Direct link to download](https://github.com/comfyanonymous/ComfyUI/releases/latest/download/ComfyUI_windows_portable_nvidia.7z)
Simply download, extract with [7-Zip](https://7-zip.org) and run. Make sure you put your Stable Diffusion checkpoints/models (the huge ckpt/safetensors files) in: ComfyUI\models\checkpoints
Simply download, extract with [7-Zip](https://7-zip.org) or with the windows explorer on recent windows versions and run. For smaller models you normally only need to put the checkpoints (the huge ckpt/safetensors files) in: ComfyUI\models\checkpoints but many of the larger models have multiple files. Make sure to follow the instructions to know which subfolder to put them in ComfyUI\models\
If you have trouble extracting it, right click the file -> properties -> unblock
Update your Nvidia drivers if it doesn't start.
#### Alternative Downloads:
[Experimental portable for AMD GPUs](https://github.com/comfyanonymous/ComfyUI/releases/latest/download/ComfyUI_windows_portable_amd.7z)
[Portable with pytorch cuda 12.8 and python 3.12](https://github.com/comfyanonymous/ComfyUI/releases/latest/download/ComfyUI_windows_portable_nvidia_cu128.7z).
[Portable with pytorch cuda 12.6 and python 3.12](https://github.com/comfyanonymous/ComfyUI/releases/latest/download/ComfyUI_windows_portable_nvidia_cu126.7z) (Supports Nvidia 10 series and older GPUs).
#### How do I share models between another UI and ComfyUI?
See the [Config file](extra_model_paths.yaml.example) to set the search paths for models. In the standalone windows build you can find this file in the ComfyUI directory. Rename this file to extra_model_paths.yaml and edit it with your favorite text editor.
## Jupyter Notebook
To run it on services like paperspace, kaggle or colab you can use my [Jupyter Notebook](notebooks/comfyui_colab.ipynb)
### AMD GPUs (Experimental: Windows and Linux), RDNA 3, 3.5 and 4 only.
These have less hardware support than the builds above but they work on windows. You also need to install the pytorch version specific to your hardware.
Intel Arc GPU users can install native PyTorch with torch.xpu support using pip. More information can be found [here](https://pytorch.org/docs/main/notes/get_start_xpu.html)
1. To install PyTorch xpu, use the following command:
After this you should have everything installed and can proceed to running ComfyUI.
### Others:
#### Intel GPUs
Intel GPU support is available for all Intel GPUs supported by Intel's Extension for Pytorch (IPEX) with the support requirements listed in the [Installation](https://intel.github.io/intel-extension-for-pytorch/index.html#installation?platform=gpu) page. Choose your platform and method of install and follow the instructions. The steps are as follows:
1. Start by installing the drivers or kernel listed or newer in the Installation page of IPEX linked above for Windows and Linux if needed.
1. Follow the instructions to install [Intel's oneAPI Basekit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html) for your platform.
1. Install the packages for IPEX using the instructions provided in the Installation page for your platform.
1. Follow the [ComfyUI manual installation](#manual-install-windows-linux) instructions for Windows and Linux and run ComfyUI normally as described above after everything is installed.
Additional discussion and help can be found [here](https://github.com/comfyanonymous/ComfyUI/discussions/476).
#### Apple Mac silicon
You can install ComfyUI in Apple Mac silicon (M1 or M2) with any recent macOS version.
> **Note**: Remember to add your models, VAE, LoRAs etc. to the corresponding Comfy folders, as discussed in [ComfyUI manual installation](#manual-install-windows-linux).
#### DirectML (AMD Cards on Windows)
#### Ascend NPUs
```pip install torch-directml``` Then you can launch ComfyUI with: ```python main.py --directml```
For models compatible with Ascend Extension for PyTorch (torch_npu). To get started, ensure your environment meets the prerequisites outlined on the [installation](https://ascend.github.io/docs/sources/ascend/quick_install.html) page. Here's a step-by-step guide tailored to your platform and installation method:
### I already have another UI for Stable Diffusion installed do I really have to install all of these dependencies?
1. Begin by installing the recommended or newer kernel version for Linux as specified in the Installation page of torch-npu, if necessary.
2. Proceed with the installation of Ascend Basekit, which includes the driver, firmware, and CANN, following the instructions provided for your specific platform.
3. Next, install the necessary packages for torch-npu by adhering to the platform-specific instructions on the [Installation](https://ascend.github.io/docs/sources/pytorch/install.html#pytorch) page.
4. Finally, adhere to the [ComfyUI manual installation](#manual-install-windows-linux) guide for Linux. Once all components are installed, you can run ComfyUI as described earlier.
You don't. If you have another UI installed and working with its own python venv you can use that venv to run ComfyUI. You can open up your favorite terminal and activate it:
For models compatible with Cambricon Extension for PyTorch (torch_mlu). Here's a step-by-step guide tailored to your platform and installation method:
or on Windows:
1. Install the Cambricon CNToolkit by adhering to the platform-specific instructions on the [Installation](https://www.cambricon.com/docs/sdk_1.15.0/cntoolkit_3.7.2/cntoolkit_install_3.7.2/index.html)
2. Next, install the PyTorch(torch_mlu) following the instructions on the [Installation](https://www.cambricon.com/docs/sdk_1.15.0/cambricon_pytorch_1.17.0/user_guide_1.9/index.html)
3. Launch ComfyUI by running `python main.py`
With Powershell: ```"path_to_other_sd_gui\venv\Scripts\Activate.ps1"```
#### Iluvatar Corex
With cmd.exe: ```"path_to_other_sd_gui\venv\Scripts\activate.bat"```
For models compatible with Iluvatar Extension for PyTorch. Here's a step-by-step guide tailored to your platform and installation method:
And then you can use that terminal to run ComfyUI without installing any dependencies. Note that the venv folder might be called something else depending on the SD UI.
1. Install the Iluvatar Corex Toolkit by adhering to the platform-specific instructions on the [Installation](https://support.iluvatar.com/#/DocumentCentre?id=1&nameCenter=2&productId=520117912052801536)
2. Launch ComfyUI by running `python main.py`
# Running
For 6700, 6600 and maybe other RDNA2 or older: ```HSA_OVERRIDE_GFX_VERSION=10.3.0 python main.py```
For AMD 7600 and maybe other RDNA3 cards: ```HSA_OVERRIDE_GFX_VERSION=11.0.0 python main.py```
### AMD ROCm Tips
You can enable experimental memory efficient attention on recent pytorch in ComfyUI on some AMD GPUs using this command; it should already be enabled by default on RDNA3. If this improves speed for you on the latest pytorch on your GPU please report it so that I can enable it by default.
You can also try setting this env variable `PYTORCH_TUNABLEOP_ENABLED=1` which might speed things up at the cost of a very slow initial run.
# Notes
Only parts of the graph that have an output with all the correct inputs will be executed.
To use a textual inversion concepts/embeddings in a text prompt put them in the models/embeddings directory
Use ```--preview-method auto``` to enable previews.
The default installation includes a fast latent preview method that's low-resolution. To enable higher-quality previews with [TAESD](https://github.com/madebyollin/taesd), download the [taesd_decoder.pth](https://github.com/madebyollin/taesd/raw/main/taesd_decoder.pth) (for SD1.x and SD2.x) and [taesdxl_decoder.pth](https://github.com/madebyollin/taesd/raw/main/taesdxl_decoder.pth) (for SDXL) models and place them in the `models/vae_approx` folder. Once they're installed, restart ComfyUI to enable high-quality previews.
The default installation includes a fast latent preview method that's low-resolution. To enable higher-quality previews with [TAESD](https://github.com/madebyollin/taesd), download the [taesd_decoder.pth, taesdxl_decoder.pth, taesd3_decoder.pth and taef1_decoder.pth](https://github.com/madebyollin/taesd/) and place them in the `models/vae_approx` folder. Once they're installed, restart ComfyUI and launch it with `--preview-method taesd` to enable high-quality previews.
## How to use TLS/SSL?
Generate a self-signed certificate (not appropriate for shared/production use) and key by running the command: `openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -sha256 -days 3650 -nodes -subj "/C=XX/ST=StateName/L=CityName/O=CompanyName/OU=CompanySectionName/CN=CommonNameOrHostname"`
Use `--tls-keyfile key.pem --tls-certfile cert.pem` to enable TLS/SSL, the app will now be accessible with `https://...` instead of `http://...`.
> Note: Windows users can use [alexisrolland/docker-openssl](https://github.com/alexisrolland/docker-openssl) or one of the [3rd party binary distributions](https://wiki.openssl.org/index.php/Binaries) to run the command example above.
> Note: Windows users can use [alexisrolland/docker-openssl](https://github.com/alexisrolland/docker-openssl) or one of the [3rd party binary distributions](https://wiki.openssl.org/index.php/Binaries) to run the command example above.
<br/><br/>If you use a container, note that the volume mount `-v` can be a relative path so `... -v ".\:/openssl-certs" ...` would create the key & cert files in the current directory of your command prompt or powershell terminal.
## Support and dev channel
[Discord](https://comfy.org/discord): Try the #help or #feedback channels.
[Matrix space: #comfyui_space:matrix.org](https://app.element.io/#/room/%23comfyui_space%3Amatrix.org) (it's like discord but open source).
See also: [https://www.comfy.org/](https://www.comfy.org/)
## Frontend Development
As of August 15, 2024, we have transitioned to a new frontend, which is now hosted in a separate repository: [ComfyUI Frontend](https://github.com/Comfy-Org/ComfyUI_frontend). This repository now hosts the compiled JS (from TS/Vue) under the `web/` directory.
### Reporting Issues and Requesting Features
For any bugs, issues, or feature requests related to the frontend, please use the [ComfyUI Frontend repository](https://github.com/Comfy-Org/ComfyUI_frontend). This will help us manage and address frontend-specific concerns more efficiently.
### Using the Latest Frontend
The new frontend is now the default for ComfyUI. However, please note:
1. The frontend in the main ComfyUI repository is updated fortnightly.
2. Daily releases are available in the separate frontend repository.
To use the most up-to-date frontend version:
1. For the latest daily release, launch ComfyUI with this command line argument:
This approach allows you to easily switch between the stable fortnightly release and the cutting-edge daily updates, or even specific versions for testing purposes.
### Accessing the Legacy Frontend
If you need to use the legacy frontend for any reason, you can access it using the following command line argument:
This will use a snapshot of the legacy frontend preserved in the [ComfyUI Legacy Frontend repository](https://github.com/Comfy-Org/ComfyUI_legacy_frontend).
# QA
### Which GPU should I buy for this?
[See this page for some recommendations](https://github.com/comfyanonymous/ComfyUI/wiki/Which-GPU-should-I-buy-for-ComfyUI)
All routes under the `/internal` path are designated for **internal use by ComfyUI only**. These routes are not intended for use by external applications and may change at any time without notice.
"""Returns a web response that contains the map of custom_nodes names and their associated workflow templates. The ones without templates are omitted."""
class EnumAction(argparse.Action):
parser = argparse.ArgumentParser()
parser.add_argument("--listen", type=str, default="127.0.0.1", metavar="IP", nargs="?", const="0.0.0.0", help="Specify the IP address to listen on (default: 127.0.0.1). If --listen is provided without an argument, it defaults to 0.0.0.0. (listens on all)")
parser.add_argument("--listen", type=str, default="127.0.0.1", metavar="IP", nargs="?", const="0.0.0.0,::", help="Specify the IP address to listen on (default: 127.0.0.1). You can give a list of ip addresses by separating them with a comma like: 127.2.2.2,127.3.3.3 If --listen is provided without an argument, it defaults to 0.0.0.0,:: (listens on all ipv4 and ipv6)")
parser.add_argument("--port", type=int, default=8188, help="Set the listen port.")
parser.add_argument("--tls-keyfile", type=str, help="Path to TLS (SSL) key file. Enables TLS, makes app accessible at https://... requires --tls-certfile to function")
parser.add_argument("--tls-certfile", type=str, help="Path to TLS (SSL) certificate file. Enables TLS, makes app accessible at https://... requires --tls-keyfile to function")
parser.add_argument("--enable-cors-header", type=str, default=None, metavar="ORIGIN", nargs="?", const="*", help="Enable CORS (Cross-Origin Resource Sharing) with optional origin or allow all with default '*'.")
parser.add_argument("--max-upload-size", type=float, default=100, help="Set the maximum upload size in MB.")
parser.add_argument("--base-directory", type=str, default=None, help="Set the ComfyUI base directory for models, custom_nodes, input, output, temp, and user directories.")
parser.add_argument("--extra-model-paths-config", type=str, default=None, metavar="PATH", nargs='+', action='append', help="Load one or more extra_model_paths.yaml files.")
parser.add_argument("--output-directory", type=str, default=None, help="Set the ComfyUI output directory.")
parser.add_argument("--temp-directory", type=str, default=None, help="Set the ComfyUI temp directory (default is in the ComfyUI directory).")
parser.add_argument("--input-directory", type=str, default=None, help="Set the ComfyUI input directory.")
parser.add_argument("--output-directory", type=str, default=None, help="Set the ComfyUI output directory. Overrides --base-directory.")
parser.add_argument("--temp-directory", type=str, default=None, help="Set the ComfyUI temp directory (default is in the ComfyUI directory). Overrides --base-directory.")
parser.add_argument("--input-directory", type=str, default=None, help="Set the ComfyUI input directory. Overrides --base-directory.")
parser.add_argument("--auto-launch", action="store_true", help="Automatically launch ComfyUI in the default browser.")
parser.add_argument("--disable-auto-launch", action="store_true", help="Disable auto launching the browser.")
parser.add_argument("--cuda-device", type=int, default=None, metavar="DEVICE_ID", help="Set the id of the cuda device this instance will use.")
parser.add_argument("--cuda-device", type=int, default=None, metavar="DEVICE_ID", help="Set the id of the cuda device this instance will use. All other devices will not be visible.")
parser.add_argument("--default-device", type=int, default=None, metavar="DEFAULT_DEVICE_ID", help="Set the id of the default device, all other devices will stay visible.")
cm_group = parser.add_mutually_exclusive_group()
cm_group.add_argument("--cuda-malloc", action="store_true", help="Enable cudaMallocAsync (enabled by default for torch 2.0 and up).")
parser.add_argument("--disable-ipex-optimize", action="store_true", help="Disables ipex.optimize when loading models with Intel GPUs.")
parser.add_argument("--oneapi-device-selector", type=str, default=None, metavar="SELECTOR_STRING", help="Sets the oneAPI device(s) this instance will use.")
parser.add_argument("--disable-ipex-optimize", action="store_true", help="Disables ipex.optimize by default when loading models with Intel's Extension for Pytorch.")
parser.add_argument("--supports-fp8-compute", action="store_true", help="ComfyUI will act as if the device supports fp8 compute.")
class LatentPreviewMethod(enum.Enum):
    NoPreviews = "none"
@@ -89,10 +99,20 @@ class LatentPreviewMethod(enum.Enum):
parser.add_argument("--preview-method",type=LatentPreviewMethod,default=LatentPreviewMethod.NoPreviews,help="Default preview method for sampler nodes.",action=EnumAction)
parser.add_argument("--preview-size",type=int,default=512,help="Sets the maximum preview size for sampler nodes.")
cache_group=parser.add_mutually_exclusive_group()
cache_group.add_argument("--cache-classic",action="store_true",help="Use the old style (aggressive) caching.")
cache_group.add_argument("--cache-lru",type=int,default=0,help="Use LRU caching with a maximum of N node results cached. May use more RAM/VRAM.")
cache_group.add_argument("--cache-none",action="store_true",help="Reduced RAM/VRAM usage at the expense of executing every node for each run.")
cache_group.add_argument("--cache-ram",nargs='?',const=4.0,type=float,default=0,help="Use RAM pressure caching with the specified headroom threshold. If available RAM drops below the threhold the cache remove large items to free RAM. Default 4GB")
attn_group=parser.add_mutually_exclusive_group()
attn_group.add_argument("--use-split-cross-attention",action="store_true",help="Use the split cross attention optimization. Ignored when xformers is used.")
attn_group.add_argument("--use-quad-cross-attention",action="store_true",help="Use the sub-quadratic cross attention optimization . Ignored when xformers is used.")
attn_group.add_argument("--use-pytorch-cross-attention",action="store_true",help="Use the new pytorch 2.0 cross attention function.")
vram_group.add_argument("--cpu",action="store_true",help="To use the CPU for everything (slow).")
parser.add_argument("--reserve-vram",type=float,default=None,help="Set the amount of vram in GB you want to reserve for use by your OS/other software. By default some amount is reserved depending on your OS.")
parser.add_argument("--force-non-blocking",action="store_true",help="Force ComfyUI to use non-blocking operations for all applicable tensors. This may improve performance on some non-Nvidia systems but can cause issues with some workflows.")
parser.add_argument("--default-hashing-function",type=str,choices=['md5','sha1','sha256','sha512'],default='sha256',help="Allows you to choose the hash function to use for duplicate filename / contents comparison. Default is sha256.")
parser.add_argument("--disable-smart-memory",action="store_true",help="Force ComfyUI to agressively offload to regular ram instead of keeping models in vram when it can.")
parser.add_argument("--deterministic",action="store_true",help="Make pytorch use slower deterministic algorithms when it can. Note that this might not make images deterministic in all cases.")
class PerformanceFeature(enum.Enum):
    Fp16Accumulation = "fp16_accumulation"
    Fp8MatrixMultiplication = "fp8_matrix_mult"
    CublasOps = "cublas_ops"
    AutoTune = "autotune"

parser.add_argument("--fast", nargs="*", type=PerformanceFeature, help="Enable some untested and potentially quality deteriorating optimizations. This is used to test new features so using it might crash your comfyui. --fast with no arguments enables everything. You can pass a list of specific optimizations if you only want to enable specific ones. Current valid optimizations: {}".format(" ".join(map(lambda c: c.value, PerformanceFeature))))
parser.add_argument("--mmap-torch-files",action="store_true",help="Use mmap when loading ckpt/pt files.")
parser.add_argument("--disable-mmap",action="store_true",help="Don't use mmap when loading safetensors.")
parser.add_argument("--dont-print-server",action="store_true",help="Don't print server output.")
parser.add_argument("--quick-test-for-ci",action="store_true",help="Quick test for CI.")
parser.add_argument("--windows-standalone-build",action="store_true",help="Windows standalone build: Enable convenient things that most people using the standalone windows build will probably enjoy (like auto opening the page on startup).")
parser.add_argument("--disable-metadata",action="store_true",help="Disable saving prompt metadata in files.")
parser.add_argument("--disable-all-custom-nodes",action="store_true",help="Disable loading all custom nodes.")
parser.add_argument("--whitelist-custom-nodes",type=str,nargs='+',default=[],help="Specify custom node folders to load even when --disable-all-custom-nodes is enabled.")
parser.add_argument("--disable-api-nodes",action="store_true",help="Disable loading all api nodes. Also prevents the frontend from communicating with the internet.")
parser.add_argument("--verbose",action="store_true",help="Enables more debug prints.")
parser.add_argument("--verbose",default='INFO',const='DEBUG',nargs="?",choices=['DEBUG','INFO','WARNING','ERROR','CRITICAL'],help='Set the logging level')
parser.add_argument("--log-stdout",action="store_true",help="Send normal process output to stdout instead of stderr (default).")
parser.add_argument(
    "--front-end-version",
    type=str,
    default=DEFAULT_VERSION_STRING,  # default version string defined earlier in the file
    help="""
Specifies the version of the frontend to be used. This command needs internet connectivity to query and
download available frontend implementations from GitHub releases.

The version string should be in the format of:
[repoOwner]/[repoName]@[version]
where version is one of: "latest" or a valid version number (e.g. "1.0.0")
""",
)
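For example, pinning to a specific release rather than `latest` follows the same pattern (the repo owner and version number here are illustrative):

```
python main.py --front-end-version Comfy-Org/ComfyUI_frontend@1.0.0
```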
def is_valid_directory(path: str) -> str:
    """Validate if the given path is a directory, and check permissions."""
    if not os.path.exists(path):
        raise argparse.ArgumentTypeError(f"The path '{path}' does not exist.")
    if not os.path.isdir(path):
        raise argparse.ArgumentTypeError(f"'{path}' is not a directory.")
    if not os.access(path, os.R_OK):
        raise argparse.ArgumentTypeError(f"You do not have read permissions for '{path}'.")
    return path

parser.add_argument(
    "--front-end-root",
    type=is_valid_directory,
    default=None,
    help="The local filesystem path to the directory where the frontend is located. Overrides --front-end-version.",
)
parser.add_argument("--user-directory", type=is_valid_directory, default=None, help="Set the ComfyUI user directory with an absolute path. Overrides --base-directory.")
parser.add_argument("--database-url", type=str, default=f"sqlite:///{database_default_path}", help="Specify the database URL, e.g. for an in-memory database you can use 'sqlite:///:memory:'.")

if comfy.options.args_parsing:
    args = parser.parse_args()
@@ -136,9 +228,16 @@ if args.windows_standalone_build:
"""Base class for string enums. Python's StrEnum is not available until 3.11."""
def__str__(self)->str:
returnself.value
class IO(StrEnum):
    """Node input/output data types.

    Includes functionality for ``"*"`` (`ANY`) and ``"MULTI,TYPES"``.
    """

    STRING = "STRING"
    IMAGE = "IMAGE"
    MASK = "MASK"
    LATENT = "LATENT"
    BOOLEAN = "BOOLEAN"
    INT = "INT"
    FLOAT = "FLOAT"
    COMBO = "COMBO"
    CONDITIONING = "CONDITIONING"
    SAMPLER = "SAMPLER"
    SIGMAS = "SIGMAS"
    GUIDER = "GUIDER"
    NOISE = "NOISE"
    CLIP = "CLIP"
    CONTROL_NET = "CONTROL_NET"
    VAE = "VAE"
    MODEL = "MODEL"
    LORA_MODEL = "LORA_MODEL"
    LOSS_MAP = "LOSS_MAP"
    CLIP_VISION = "CLIP_VISION"
    CLIP_VISION_OUTPUT = "CLIP_VISION_OUTPUT"
    STYLE_MODEL = "STYLE_MODEL"
    GLIGEN = "GLIGEN"
    UPSCALE_MODEL = "UPSCALE_MODEL"
    AUDIO = "AUDIO"
    WEBCAM = "WEBCAM"
    POINT = "POINT"
    FACE_ANALYSIS = "FACE_ANALYSIS"
    BBOX = "BBOX"
    SEGS = "SEGS"
    VIDEO = "VIDEO"

    ANY = "*"
    """Always matches any type, but at a price.

    Causes some functionality issues (e.g. reroutes, link types), and should be avoided whenever possible.
    """
    NUMBER = "FLOAT,INT"
    """A float or an int - could be either"""
    PRIMITIVE = "STRING,FLOAT,INT,BOOLEAN"
    """Could be any of: string, float, int, or bool"""

    def __ne__(self, value: object) -> bool:
        if self == "*" or value == "*":
            return False
        if not isinstance(value, str):
            return True
        a = frozenset(self.split(","))
        b = frozenset(value.split(","))
        return not (b.issubset(a) or a.issubset(b))
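A quick sketch of how this comparison behaves in practice (assuming the enum above is imported as `IO`):

```python
# "*" matches everything, so __ne__ always reports "not different"
assert not (IO.ANY != IO.IMAGE)

# "FLOAT,INT" accepts a plain INT because {"INT"} is a subset of {"FLOAT", "INT"}
assert not (IO.NUMBER != IO.INT)

# disjoint sets like {"STRING"} vs {"IMAGE"} do not match
assert IO.STRING != IO.IMAGE
```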
class RemoteInputOptions(TypedDict):
    route: str
    """The route to the remote source."""
    refresh_button: bool
    """Specifies whether to show a refresh button in the UI below the widget."""
    control_after_refresh: Literal["first", "last"]
    """Specifies the control after the refresh button is clicked. If "first", the first item will be automatically selected, and so on."""
    timeout: int
    """The maximum amount of time to wait for a response from the remote source in milliseconds."""
    max_retries: int
    """The maximum number of retries before aborting the request."""
    refresh: int
    """The TTL of the remote input's value in milliseconds. Specifies the interval at which the remote input's value is refreshed."""
class MultiSelectOptions(TypedDict):
    placeholder: NotRequired[str]
    """The placeholder text to display in the multi-select widget when no items are selected."""
    chip: NotRequired[bool]
    """Specifies whether to use chips instead of comma separated values for the multi-select widget."""
class InputTypeOptions(TypedDict):
    """Provides type hinting for the return type of the INPUT_TYPES node function.

    Due to IDE limitations with unions, for now all options are available for all types (e.g. `label_on` is hinted even when the type is not `IO.BOOLEAN`).
    """

    forceInput: NotRequired[bool]
    """Forces the input to be an input slot rather than a widget even if a widget is available for the input type."""
    lazy: NotRequired[bool]
    """Declares that this input uses lazy evaluation"""
    rawLink: NotRequired[bool]
    """When a link exists, rather than receiving the evaluated value, you will receive the link (i.e. `["nodeId", <outputIndex>]`). Designed for node expansion."""
    tooltip: NotRequired[str]
    """Tooltip for the input (or widget), shown on pointer hover"""
    socketless: NotRequired[bool]
    """All inputs (including widgets) have an input socket to connect links. When ``true``, if there is a widget for this input, no socket will be created."""
    control_after_generate: NotRequired[bool]
    """Specifies whether a control widget should be added to the input, adding options to automatically change the value after each prompt is queued. Currently only used for INT and COMBO types."""
    options: NotRequired[list[str | int | float]]
    """COMBO type only. Specifies the selectable options for the combo widget."""
"""Provides type hinting for the hidden entry of node INPUT_TYPES."""
node_id:NotRequired[Literal["UNIQUE_ID"]]
"""UNIQUE_ID is the unique identifier of the node, and matches the id property of the node on the client side. It is commonly used in client-server communications (see messages)."""
unique_id:NotRequired[Literal["UNIQUE_ID"]]
"""UNIQUE_ID is the unique identifier of the node, and matches the id property of the node on the client side. It is commonly used in client-server communications (see messages)."""
prompt:NotRequired[Literal["PROMPT"]]
"""PROMPT is the complete prompt sent by the client to the server. See the prompt object for a full description."""
"""EXTRA_PNGINFO is a dictionary that will be copied into the metadata of any .png files saved. Custom nodes can store additional information in this dictionary for saving (or as a way to communicate with a downstream node)."""
dynprompt:NotRequired[Literal["DYNPROMPT"]]
"""DYNPROMPT is an instance of comfy_execution.graph.DynamicPrompt. It differs from PROMPT in that it may mutate during the course of execution in response to Node Expansion."""
"""A flag indicating if this node implements the additional code necessary to deal with OUTPUT_IS_LIST nodes.
All inputs of ``type`` will become ``list[type]``, regardless of how many items are passed in. This also affects ``check_lazy_status``.
From the docs:
A node can also override the default input behaviour and receive the whole list in a single call. This is done by setting a class attribute `INPUT_IS_LIST` to ``True``.
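For instance, a list-aware node might look like the following sketch (names illustrative; with `INPUT_IS_LIST = True` every input arrives as a Python list):

```python
class JoinStringsExample:
    INPUT_IS_LIST = True

    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"text": (IO.STRING, {"default": ""})}}

    RETURN_TYPES = (IO.STRING,)
    FUNCTION = "run"

    def run(self, text):
        # text is list[str] here, regardless of how many values were linked
        return ("".join(text),)
```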
# based on code in Kijai's WanVideoWrapper: https://github.com/kijai/ComfyUI-WanVideoWrapper/blob/dbb2523b37e4ccdf45127e5ae33e31362f755c8e/nodes.py#L1302
# only expected overlap is given different weights