ComfyUI

mirror of https://github.com/comfyanonymous/ComfyUI.git synced 2026-07-16 08:28:18 +08:00

Author	SHA1	Message	Date
Dr.Lt.Data	32bd55779a	Merge branch 'master' into dr-support-pip-cm	2025-11-05 07:42:29 +09:00
comfyanonymous	7f3e4d486c	Limit amount of pinned memory on windows to prevent issues. (#10638 )	2025-11-04 17:37:50 -05:00
rattus	a389ee01bb	caching: Handle None outputs tuple case (#10637 )	2025-11-04 14:14:10 -08:00
ComfyUI Wiki	9c71a66790	chore: update workflow templates to v0.2.11 (#10634 )	2025-11-04 10:51:53 -08:00
Dr.Lt.Data	671a769dc6	Merge branch 'master' into dr-support-pip-cm	2025-11-04 23:25:51 +09:00
comfyanonymous	af4b7b5edb	More fp8 torch.compile regressions fixed. (#10625 )	2025-11-03 22:14:20 -05:00
comfyanonymous	0f4ef3afa0	This seems to slow things down slightly on Linux. (#10624 )	2025-11-03 21:47:14 -05:00
comfyanonymous	6b88478f9f	Bring back fp8 torch compile performance to what it should be. (#10622 )	2025-11-03 19:22:10 -05:00
comfyanonymous	e199c8cc67	Fixes (#10621 )	2025-11-03 17:58:24 -05:00
comfyanonymous	0652cb8e2d	Speed up torch.compile (#10620 )	2025-11-03 17:37:12 -05:00
comfyanonymous	958a17199a	People should update their pytorch versions. (#10618 )	2025-11-03 17:08:30 -05:00
ComfyUI Wiki	e974e554ca	chore: update embedded docs to v0.3.1 (#10614 )	2025-11-03 10:59:44 -08:00
Alexander Piskun	4e2110c794	feat(Pika-API-nodes): use new API client (#10608 )	2025-11-03 00:29:08 -08:00
Alexander Piskun	e617cddf24	convert nodes_openai.py to V3 schema (#10604 )	2025-11-03 00:28:13 -08:00
Alexander Piskun	1f3f7a2823	convert nodes_hypernetwork.py to V3 schema (#10583 )	2025-11-03 00:21:47 -08:00
EverNebula	88df172790	fix(caching): treat bytes as hashable (#10567 )	2025-11-03 00:16:40 -08:00
Alexander Piskun	6d6a18b0b7	fix(api-nodes-cloud): stop using sub-folder and absolute path for output of Rodin3D nodes (#10556 )	2025-11-03 00:04:56 -08:00
Dr.Lt.Data	d8b821e47b	Merge branch 'master' into dr-support-pip-cm	2025-11-03 07:12:55 +09:00
comfyanonymous	97ff9fae7e	Clarify help text for --fast argument (#10609 ) Updated help text for the --fast argument to clarify potential risks.	2025-11-02 13:14:04 -05:00
rattus	135fa49ec2	Small speed improvements to --async-offload (#10593 ) * ops: dont take an offload stream if you dont need one * ops: prioritize mem transfer The async offload streams reason for existence is to transfer from RAM to GPU. The post processing compute steps are a bonus on the side stream, but if the compute stream is running a long kernel, it can stall the side stream, as it wait to type-cast the bias before transferring the weight. So do a pure xfer of the weight straight up, then do everything bias, then go back to fix the weight type and do weight patches.	2025-11-01 18:48:53 -04:00
comfyanonymous	44869ff786	Fix issue with pinned memory. (#10597 )	2025-11-01 17:25:59 -04:00
Alexander Piskun	20182a393f	convert StabilityAI to use new API client (#10582 )	2025-11-01 12:14:06 -07:00
Alexander Piskun	5f109fe6a0	added 12s-20s as available output durations for the LTXV API nodes (#10570 )	2025-11-01 12:13:39 -07:00
comfyanonymous	c58c13b2ba	Fix torch compile regression on fp8 ops. (#10580 )	2025-11-01 00:25:17 -04:00
Dr.Lt.Data	16359abbbc	Merge branch 'master' into dr-support-pip-cm	2025-11-01 06:27:21 +09:00
comfyanonymous	7f374e42c8	ScaleROPE now works on Lumina models. (#10578 )	2025-10-31 15:41:40 -04:00
Dr.Lt.Data	8f492d8f34	Merge branch 'master' into dr-support-pip-cm	2025-10-31 12:55:36 +09:00
comfyanonymous	27d1bd8829	Fix rope scaling. (#10560 )	2025-10-30 22:51:58 -04:00
comfyanonymous	614cf9805e	Add a ScaleROPE node. Currently only works on WAN models. (#10559 )	2025-10-30 22:11:38 -04:00
Dr.Lt.Data	ad4b959d7e	Merge branch 'master' into dr-support-pip-cm	2025-10-31 07:31:50 +09:00
rattus	513b0c46fb	Add RAM Pressure cache mode (#10454 ) * execution: Roll the UI cache into the outputs Currently the UI cache is parallel to the output cache with expectations of being a content superset of the output cache. At the same time the UI and output cache are maintained completely seperately, making it awkward to free the output cache content without changing the behaviour of the UI cache. There are two actual users (getters) of the UI cache. The first is the case of a direct content hit on the output cache when executing a node. This case is very naturally handled by merging the UI and outputs cache. The second case is the history JSON generation at the end of the prompt. This currently works by asking the cache for all_node_ids and then pulling the cache contents for those nodes. all_node_ids is the nodes of the dynamic prompt. So fold the UI cache into the output cache. The current UI cache setter now writes to a prompt-scope dict. When the output cache is set, just get this value from the dict and tuple up with the outputs. When generating the history, simply iterate prompt-scope dict. This prepares support for more complex caching strategies (like RAM pressure caching) where less than 1 workflow will be cached and it will be desirable to keep the UI cache and output cache in sync. * sd: Implement RAM getter for VAE * model_patcher: Implement RAM getter for ModelPatcher * sd: Implement RAM getter for CLIP * Implement RAM Pressure cache Implement a cache sensitive to RAM pressure. When RAM headroom drops down below a certain threshold, evict RAM-expensive nodes from the cache. Models and tensors are measured directly for RAM usage. An OOM score is then computed based on the RAM usage of the node. Note the due to indirection through shared objects (like a model patcher), multiple nodes can account the same RAM as their individual usage. The intent is this will free chains of nodes particularly model loaders and associate loras as they all score similar and are sorted in close to each other. Has a bias towards unloading model nodes mid flow while being able to keep results like text encodings and VAE. * execution: Convert the cache entry to NamedTuple As commented in review. Convert this to a named tuple and abstract away the tuple type completely from graph.py.	2025-10-30 17:39:02 -04:00
Alexander Piskun	dfac94695b	fix img2img operation in Dall2 node (#10552 )	2025-10-30 10:22:35 -07:00
Alexander Piskun	163b629c70	use new API client in Pixverse and Ideogram nodes (#10543 )	2025-10-29 23:49:03 -07:00
Jedrzej Kosinski	998bf60beb	Add units/info for the numbers displayed on 'load completely' and 'load partially' log messages (#10538 )	2025-10-29 19:37:06 -04:00
comfyanonymous	906c089957	Fix small performance regression with fp8 fast and scaled fp8. (#10537 )	2025-10-29 19:29:01 -04:00
Dr.Lt.Data	b88c66bfa1	Merge branch 'master' into dr-support-pip-cm	2025-10-30 07:30:50 +09:00
comfyanonymous	25de7b1bfa	Try to fix slow load issue on low ram hardware with pinned mem. (#10536 )	2025-10-29 17:20:27 -04:00
rattus	ab7ab5be23	Fix Race condition in --async-offload that can cause corruption (#10501 ) * mm: factor out the current stream getter Make this a reusable function. * ops: sync the offload stream with the consumption of w&b This sync is nessacary as pytorch will queue cuda async frees on the same stream as created to tensor. In the case of async offload, this will be on the offload stream. Weights and biases can go out of scope in python which then triggers the pytorch garbage collector to queue the free operation on the offload stream possible before the compute stream has used the weight. This causes a use after free on weight data leading to total corruption of some workflows. So sync the offload stream with the compute stream after the weight has been used so the free has to wait for the weight to be used. The cast_bias_weight is extended in a backwards compatible way with the new behaviour opt-in on a defaulted parameter. This handles custom node packs calling cast_bias_weight and defeatures async-offload for them (as they do not handle the race). The pattern is now: cast_bias_weight(... , offloadable=True) #This might be offloaded thing(weight, bias, ...) uncast_bias_weight(...) * controlnet: adopt new cast_bias_weight synchronization scheme This is nessacary for safe async weight offloading. * mm: sync the last stream in the queue, not the next Currently this peeks ahead to sync the next stream in the queue of streams with the compute stream. This doesnt allow a lot of parallelization, as then end result is you can only get one weight load ahead regardless of how many streams you have. Rotate the loop logic here to synchronize the end of the queue before returning the next stream. This allows weights to be loaded ahead of the compute streams position.	2025-10-29 17:17:46 -04:00
comfyanonymous	ec4fc2a09a	Fix case of weights not being unpinned. (#10533 )	2025-10-29 15:48:06 -04:00
comfyanonymous	1a58087ac2	Reduce memory usage for fp8 scaled op. (#10531 )	2025-10-29 15:43:51 -04:00
Alexander Piskun	6c14f3afac	use new API client in Luma and Minimax nodes (#10528 )	2025-10-29 11:14:56 -07:00
comfyanonymous	e525673f72	Fix issue. (#10527 )	2025-10-29 00:37:00 -04:00
comfyanonymous	3fa7a5c04a	Speed up offloading using pinned memory. (#10526 ) To enable this feature use: --fast pinned_memory	2025-10-29 00:21:01 -04:00
Alexander Piskun	210f7a1ba5	convert nodes_recraft.py to V3 schema (#10507 )	2025-10-28 14:38:05 -07:00
rattus	d202c2ba74	execution: Allow a subgraph nodes to execute multiple times (#10499 ) In the case of --cache-none lazy and subgraph execution can cause anything to be run multiple times per workflow. If that rerun nodes is in itself a subgraph generator, this will crash for two reasons. pending_subgraph_results[] does not cleanup entries after their use. So when a pending_subgraph_result is consumed, remove it from the list so that if the corresponding node is fully re-executed this misses lookup and it fall through to execute the node as it should. Secondly, theres is an explicit enforcement against dups in the addition of subgraphs nodes as ephemerals to the dymprompt. Remove this enforcement as the use case is now valid.	2025-10-28 16:22:08 -04:00
contentis	8817f8fc14	Mixed Precision Quantization System (#10498 ) * Implement mixed precision operations with a registry design and metadate for quant spec in checkpoint. * Updated design using Tensor Subclasses * Fix FP8 MM * An actually functional POC * Remove CK reference and ensure correct compute dtype * Update unit tests * ruff lint * Implement mixed precision operations with a registry design and metadate for quant spec in checkpoint. * Updated design using Tensor Subclasses * Fix FP8 MM * An actually functional POC * Remove CK reference and ensure correct compute dtype * Update unit tests * ruff lint * Fix missing keys * Rename quant dtype parameter * Rename quant dtype parameter * Fix unittests for CPU build	2025-10-28 16:20:53 -04:00
comfyanonymous	22e40d2ace	Tell users to update their nvidia drivers if portable doesn't start. (#10518 )	2025-10-28 15:08:08 -04:00
Dr.Lt.Data	de357a01f8	Merge branch 'master' into dr-support-pip-cm	2025-10-28 19:01:11 +09:00
comfyanonymous	3bea4efc6b	Tell users to update nvidia drivers if problem with portable. (#10510 )	2025-10-28 04:45:45 -04:00
comfyanonymous	8cf2ba4ba6	Remove comfy api key from queue api. (#10502 )	2025-10-28 03:23:52 -04:00

4358 Commits