Compare commits

14 Commits

Author SHA1 Message Date
7c0c70b608 Correct PIL contract in ImageBlend channel-cap rationale
PIL.Image.fromarray accepts 2-channel (LA mode) arrays as well, not
just 1/3/4-channel. Reword the inline comments and test docstrings to
say 'rejects > 4-channel arrays', which is the actual constraint
driving the cap. Also drop a too-narrow 'mode in (L, RGB, RGBA)'
assertion in test_save_compatible_output_passes_through_pil so a
future 2-channel result would not be flagged as a failure.
2026-04-27 07:58:20 +00:00
ae88cd1966 Cap ImageBlend channel-mismatch output at 4 channels (RGBA)
Address review feedback: the previous fix allowed ImageBlend to return
tensors with > 4 channels (e.g. blending a 3-channel and a 5-channel
image produced a 5-channel tensor). This shifted the original failure
from blend-time to save/preview-time, because SaveImage and PreviewImage
both call PIL.Image.fromarray, which only supports 1/3/4-channel arrays.

Fix:
- In Blend.execute, the alignment target is now min(max(c1, c2), 4):
  any image with more than 4 channels is truncated, any image with
  fewer is padded with 1.0s up to the (capped) target. This makes the
  RGB/RGBA case work and also makes the >4-channel case work end-to-end
  rather than just deferring its failure.
- Update the regression test that previously codified the wrong
  5-channel-output behavior to assert the correct 4-channel cap.
- Add test_output_capped_at_four_channels (both inputs > 4 channels).
- Add test_save_compatible_output_passes_through_pil that mirrors
  SaveImage's exact PIL.Image.fromarray conversion to catch regressions
  in the save/preview path.
- Add a small workflow-validation test (image_blend_workflow_test.py)
  that loads tests/inference/graphs/image_blend_channel_mismatch.json
  and verifies its node types and wiring, so the demo workflow can't
  silently bitrot.

Verified end-to-end against a local ComfyUI server: the workflow runs,
output is RGBA, downstream SaveImage succeeds.
2026-04-27 07:18:16 +00:00
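The alignment rule described in the ae88cd1966 message above can be illustrated without tensors. The following is a minimal sketch, assuming a pixel is a plain list of floats; `target_channels` and `align_pixel` are hypothetical helper names for illustration, not functions from the ComfyUI codebase.

```python
from typing import List

Pixel = List[float]


def target_channels(c1: int, c2: int, cap: int = 4) -> int:
    """Alignment target from the commit message: min(max(c1, c2), cap)."""
    return min(max(c1, c2), cap)


def align_pixel(pixel: Pixel, target: int) -> Pixel:
    """Truncate extra channels, or pad missing ones with 1.0 (opaque alpha)."""
    if len(pixel) > target:
        return pixel[:target]
    return pixel + [1.0] * (target - len(pixel))


rgb = [0.2, 0.4, 0.6]
five = [0.1, 0.2, 0.3, 0.4, 0.5]
t = target_channels(len(rgb), len(five))
print(t, align_pixel(rgb, t), align_pixel(five, t))
# 4 [0.2, 0.4, 0.6, 1.0] [0.1, 0.2, 0.3, 0.4]
```

Capping at the target before padding means both inputs always end up with the same channel count, whichever one was wider.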
2b731a99fd Add API workflow JSON demonstrating ImageBlend channel-count fix
Adds a self-contained API workflow at
tests/inference/graphs/image_blend_channel_mismatch.json that exercises
the previously-broken RGB + RGBA blend case end-to-end:

  EmptyImage (RGB, red)        -> image1
  EmptyImage (RGB, blue)
    + SolidMask (0.5)          -> JoinImageWithAlpha (RGBA) -> image2
                                 -> ImageBlend (3ch + 4ch)
                                 -> SaveImage

The workflow uses no model checkpoints and runs in seconds on CPU. With
the fix it produces a 4-channel RGBA output (alpha preserved); the
previous behavior would have silently dropped the alpha channel via
node_helpers.image_alpha_fix.

Verified end-to-end against a local ComfyUI instance: workflow executes
successfully, output PNG is RGBA, center pixel is (127, 0, 127, 191) -
red and blue blended at 0.5, with alpha 0.5*1.0 + 0.5*0.5 = 0.75 (191/255).
2026-04-27 07:08:14 +00:00
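The center-pixel value reported in the 2b731a99fd message can be checked by hand. This is a rough sketch under stated assumptions: the blend uses `normal` mode with a factor of 0.5, the RGB input is padded with alpha 1.0, and the RGBA input carries alpha 0.5. `blend_normal` and `to_uint8` are hypothetical names; the 8-bit conversion mirrors the clip-then-truncate step quoted in the test file.

```python
def blend_normal(a: float, b: float, factor: float) -> float:
    """'normal' mode: a*(1 - factor) + b*factor."""
    return a * (1.0 - factor) + b * factor


def to_uint8(v: float) -> int:
    """clip(255 * v, 0, 255), then truncate toward zero like a uint8 cast."""
    return int(min(max(255.0 * v, 0.0), 255.0))


red = [1.0, 0.0, 0.0, 1.0]   # RGB input, alpha padded with 1.0
blue = [0.0, 0.0, 1.0, 0.5]  # RGBA input, alpha 0.5 (assumed)

pixel = tuple(to_uint8(blend_normal(a, b, 0.5)) for a, b in zip(red, blue))
print(pixel)  # (127, 0, 127, 191)
```

The alpha term is 1.0*0.5 + 0.5*0.5 = 0.75, and 255*0.75 = 191.25 truncates to 191.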
c390b88d8c Strengthen ImageBlend regression tests with deterministic assertions
- test_output_clamped now uses inputs that would produce -0.5 without
  clamping, so it actually exercises the torch.clamp(..., 0, 1) call.
- test_padding_value_is_one verifies that the channel-alignment logic
  pads with 1.0 specifically (not 0.0 or some other value), which is the
  semantic guarantee of treating the extra channel as an opaque alpha.
2026-04-27 06:36:44 +00:00
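Both assertions described in the c390b88d8c message reduce to per-value arithmetic. The sketch below assumes the blend formula given in the test docstrings, `image1*(1 - bf) + mode(image1, image2)*bf`, with `difference` taken as `image1 - image2` and `multiply` as `image1 * image2`. `blend_value` and `clamp01` are hypothetical helpers, not part of the node.

```python
def blend_value(a: float, b: float, factor: float, mode: str) -> float:
    """Per-value blend: a*(1 - factor) + mode(a, b)*factor."""
    if mode == "difference":
        mixed = a - b
    elif mode == "multiply":
        mixed = a * b
    else:
        raise ValueError(f"mode not covered by this sketch: {mode}")
    return a * (1.0 - factor) + mixed * factor


def clamp01(v: float) -> float:
    """Clamp a value to the [0, 1] range."""
    return min(max(v, 0.0), 1.0)


# test_output_clamped: image1 = 0, image2 = 1, difference mode, factor 0.5.
raw = blend_value(0.0, 1.0, 0.5, "difference")
print(raw, clamp01(raw))  # -0.5 0.0

# test_padding_value_is_one: a 3-channel zero pixel padded with 1.0,
# multiplied by an all-ones 4-channel pixel at factor 1.0.
image1 = [0.0, 0.0, 0.0] + [1.0]
image2 = [1.0, 1.0, 1.0, 1.0]
out = [clamp01(blend_value(a, b, 1.0, "multiply")) for a, b in zip(image1, image2)]
print(out)  # [0.0, 0.0, 0.0, 1.0]
```

With a factor of 1.0 the multiply result passes through unchanged, so the fourth output channel exposes whatever value was used for padding.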
618f1026fc Address review feedback for ImageBlend channel-count fix
- Add regression tests covering RGB+RGBA, RGBA+RGB, channel gap > 1
  (the exact CORE-103 error case), all blend modes with mismatch, and
  output value clamping.
- Soften the inline comment to reflect that channel padding is well-
  defined for alpha-like extra channels rather than claiming support
  for arbitrary channel layouts.
2026-04-27 06:31:18 +00:00
6a1284e20b Fix ImageBlend node to handle mismatched channel counts
Replace the limited node_helpers.image_alpha_fix call (which only handles
RGB<->RGBA differences by padding by exactly one channel) with the same
generalized channel-padding logic used by the ImageStitch node.

This allows ImageBlend to work with any combination of channel counts
(e.g. 3 vs 4, 3 vs 5, 4 vs 3, etc.) by padding the image with fewer
channels using 1.0s up to the larger channel count. Behavior between
ImageBlend and ImageStitch is now consistent.

Fixes CORE-103.
2026-04-27 06:21:33 +00:00
115f418b64 Make EmptySD3LatentImage node use intermediate dtype. (#13577) 2026-04-26 23:23:57 -04:00
7385eb2800 Add new ComfyUI blueprints and fix subgraph naming (#13371)
* Remove local tag from subgraph name

* New Subgraph blueprints

* Remove duplicate blueprint

* Update Subgraph size

* Update subgraph

* Update Blueprint

* Remove local tag from subgraph name

* New Subgraph blueprints

* Remove duplicate blueprint

* Update Subgraph size

* Update subgraph

* Update Blueprint

* Update LTX 2.0 Pose to Video

* Fix crop blueprint split coverage

Made-with: Cursor

* Clean up image edit blueprint metadata

Made-with: Cursor

* Update subgraph blueprints

---------

Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
2026-04-26 22:59:16 +08:00
df22bcd5e1 Support loading the alpha channel of videos. (#13564)
Not exposed in nodes yet.
2026-04-25 21:02:58 -04:00
5e3f15a830 Bump comfyui-frontend-package to 1.42.15 (#13556) 2026-04-24 17:21:39 -07:00
4304c15e9b Properly load higher bit depth videos. (#13542) 2026-04-24 16:46:10 -04:00
7636599389 chore(api-nodes): add upcoming-deprecation notice to Sora nodes (#13549) 2026-04-24 06:54:10 -07:00
443074eee9 Add OpenAPI 3.1 specification for ComfyUI API (#13397)
* Add OpenAPI 3.1 specification for ComfyUI API

Adds a comprehensive OpenAPI 3.1 spec documenting all HTTP endpoints
exposed by ComfyUI's server, including prompt execution, queue management,
file uploads, userdata, settings, system stats, object info, assets,
and internal routes.

The spec was validated against the source code with adversarial review
from multiple models, and passes Spectral linting with zero errors.

Also removes openapi.yaml from .gitignore so the spec is tracked.

* Mark /api/history endpoints as deprecated

Address Jacob's review feedback on PR #13397 by explicitly marking the
three /api/history operations as deprecated in the OpenAPI spec:

  * GET  /api/history              -> superseded by GET /api/jobs
  * POST /api/history              -> superseded by /api/jobs management
  * GET  /api/history/{prompt_id}  -> superseded by GET /api/jobs/{job_id}

Each operation gains deprecated: true plus a description that names the
replacement. A formal sunset timeline (RFC 9745 Deprecation and RFC 8594
Sunset headers, minimum-runway policy) is being defined separately and
will be applied as a follow-up.
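As an illustration of what the deprecation marking looks like once the spec is parsed, here is a small sketch that lists operations flagged `deprecated: true`. The `spec` dictionary is a hand-written, abridged stand-in built from the three operations named above, not the contents of the real openapi.yaml, and `deprecated_operations` is a hypothetical helper.

```python
from typing import Any, Dict, List, Tuple

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

# Abridged, illustrative spec fragment (assumption, not the real file).
spec: Dict[str, Any] = {
    "paths": {
        "/api/history": {
            "get": {"deprecated": True, "description": "Superseded by GET /api/jobs."},
            "post": {"deprecated": True, "description": "Superseded by /api/jobs management."},
        },
        "/api/history/{prompt_id}": {
            "get": {"deprecated": True, "description": "Superseded by GET /api/jobs/{job_id}."},
        },
        "/api/jobs": {"get": {"description": "List jobs."}},
    }
}


def deprecated_operations(spec: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return sorted (METHOD, path) pairs for operations with deprecated: true."""
    found = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method in HTTP_METHODS and isinstance(op, dict) and op.get("deprecated"):
                found.append((method.upper(), path))
    return sorted(found)


for method, path in deprecated_operations(spec):
    print(method, path)
```

A check like this could run in CI to confirm that the deprecated set matches what reviewers expect, though whether the project does so is not stated in the source.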

* Address Spectral lint findings in openapi.yaml

- Add operation descriptions to 52 endpoints (prompt, queue, upload,
  view, models, userdata, settings, assets, internal, etc.)
- Add schema descriptions to 22 component schemas
- Add parameter descriptions to 8 path parameters that were missing them
- Remove 6 unused component schemas: TaskOutput, EmbeddingsResponse,
  ExtensionsResponse, LogRawResponse, UserInfo, UserDataFullInfo

No wire/shape changes. Reduces Spectral findings from 92 to 4. The
remaining 4 are real issues (WebSocket 101 on /ws, loose error schema,
and two snake_case warnings on real wire field names) and are worth
addressing separately.

* fix(openapi): address jtreminio oneOf review on /api/userdata

Restructure the UserData response schemas to address the review feedback
on the `oneOf` without a discriminator, and fix two accuracy bugs found
while doing it.

Changes
- GET /api/userdata response: extract the inline `oneOf` to a named
  schema (`ListUserdataResponse`) and add the missing third variant
  returned when `split=true` and `full_info=false` (array of
  `[relative_path, ...path_components]`). Previously only two of the
  three actual server response shapes were described.
- UserDataResponse (POST endpoints): correct the description — this
  schema is a single item, not a list — and point at the canonical
  `GetUserDataResponseFullFile` schema instead of the duplicate
  `UserDataResponseFull`. Also removes the malformed blank line in
  `UserDataResponseShort`.
- Delete the now-unused `UserDataResponseFull` and
  `UserDataResponseShort` schemas (replaced by reuse of
  `GetUserDataResponseFullFile` and an inline string variant).
- Add an `x-variant-selector` vendor extension to both `oneOf` sites
  documenting which query-parameter combination selects which branch,
  since a true OpenAPI `discriminator` is not applicable (the variants
  are type-disjoint and the selector lives in the request, not the
  response body).

This keeps the shapes the server actually emits (no wire-breaking
change) while making the selection rule explicit for SDK generators
and readers.

---------

Co-authored-by: guill <jacob.e.segal@gmail.com>
2026-04-23 21:00:25 -07:00
2e0503780d range type (#13322)
Co-authored-by: guill <jacob.e.segal@gmail.com>
2026-04-23 20:51:34 -07:00
31 changed files with 33546 additions and 1096 deletions

1
.gitignore vendored
View File

@@ -21,6 +21,5 @@ venv*/
 *.log
 web_custom_versions/
 .DS_Store
-openapi.yaml
 filtered-openapi.yaml
 uv.lock

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -160,7 +160,7 @@
 },
 "revision": 0,
 "config": {},
-"name": "local-Depth to Image (Z-Image-Turbo)",
+"name": "Depth to Image (Z-Image-Turbo)",
 "inputNode": {
 "id": -10,
 "bounding": [
@@ -2482,4 +2482,4 @@
 "VHS_KeepIntermediate": true
 },
 "version": 0.4
-}
+}

View File

@@ -261,7 +261,7 @@
 },
 "revision": 0,
 "config": {},
-"name": "local-Depth to Video (LTX 2.0)",
+"name": "Depth to Video (LTX 2.0)",
 "inputNode": {
 "id": -10,
 "bounding": [
@@ -5208,4 +5208,4 @@
 "workflowRendererVersion": "LG"
 },
 "version": 0.4
-}
+}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -128,7 +128,7 @@
 },
 "revision": 0,
 "config": {},
-"name": "local-Image Edit (Flux.2 Klein 4B)",
+"name": "Image Edit (Flux.2 Klein 4B)",
 "inputNode": {
 "id": -10,
 "bounding": [
@@ -1837,4 +1837,4 @@
 }
 },
 "version": 0.4
-}
+}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -124,7 +124,7 @@
 },
 "revision": 0,
 "config": {},
-"name": "local-Image Inpainting (Qwen-image)",
+"name": "Image Inpainting (Qwen-image)",
 "inputNode": {
 "id": -10,
 "bounding": [
@@ -1923,4 +1923,4 @@
 "workflowRendererVersion": "LG"
 },
 "version": 0.4
-}
+}

View File

@@ -204,7 +204,7 @@
 },
 "revision": 0,
 "config": {},
-"name": "local-Image Outpainting (Qwen-Image)",
+"name": "Image Outpainting (Qwen-Image)",
 "inputNode": {
 "id": -10,
 "bounding": [
@@ -2749,4 +2749,4 @@
 }
 },
 "version": 0.4
-}
+}

View File

@ -1,15 +1,14 @@
{
"id": "1a761372-7c82-4016-b9bf-fa285967e1e9",
"revision": 0,
"last_node_id": 83,
"last_node_id": 176,
"last_link_id": 0,
"nodes": [
{
"id": 83,
"type": "f754a936-daaf-4b6e-9658-41fdc54d301d",
"id": 176,
"type": "2d2e3c8e-53b3-4618-be52-6d1d99382f0e",
"pos": [
61.999827823554256,
153.3332507624185
-1150,
200
],
"size": [
400,
@ -56,6 +55,38 @@
"name": "layers"
},
"link": null
},
{
"name": "seed",
"type": "INT",
"widget": {
"name": "seed"
},
"link": null
},
{
"name": "unet_name",
"type": "COMBO",
"widget": {
"name": "unet_name"
},
"link": null
},
{
"name": "clip_name",
"type": "COMBO",
"widget": {
"name": "clip_name"
},
"link": null
},
{
"name": "vae_name",
"type": "COMBO",
"widget": {
"name": "vae_name"
},
"link": null
}
],
"outputs": [
@ -66,28 +97,41 @@
"links": []
}
],
"title": "Image to Layers (Qwen-Image-Layered)",
"properties": {
"proxyWidgets": [
[
"-1",
"6",
"text"
],
[
"-1",
"3",
"steps"
],
[
"-1",
"3",
"cfg"
],
[
"-1",
"83",
"layers"
],
[
"3",
"seed"
],
[
"37",
"unet_name"
],
[
"38",
"clip_name"
],
[
"39",
"vae_name"
],
[
"3",
"control_after_generate"
@ -95,6 +139,11 @@
],
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
@ -103,25 +152,20 @@
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": [
"",
20,
2.5,
2
]
"widgets_values": []
}
],
"links": [],
"groups": [],
"version": 0.4,
"definitions": {
"subgraphs": [
{
"id": "f754a936-daaf-4b6e-9658-41fdc54d301d",
"id": "2d2e3c8e-53b3-4618-be52-6d1d99382f0e",
"version": 1,
"state": {
"lastGroupId": 3,
"lastNodeId": 83,
"lastLinkId": 159,
"lastGroupId": 8,
"lastNodeId": 176,
"lastLinkId": 380,
"lastRerouteId": 0
},
"revision": 0,
@ -130,10 +174,10 @@
"inputNode": {
"id": -10,
"bounding": [
-510,
523,
-720,
720,
120,
140
220
]
},
"outputNode": {
@ -156,8 +200,8 @@
],
"localized_name": "image",
"pos": [
-410,
543
-620,
740
]
},
{
@ -168,8 +212,8 @@
150
],
"pos": [
-410,
563
-620,
760
]
},
{
@ -180,8 +224,8 @@
153
],
"pos": [
-410,
583
-620,
780
]
},
{
@ -192,8 +236,8 @@
154
],
"pos": [
-410,
603
-620,
800
]
},
{
@ -204,8 +248,56 @@
159
],
"pos": [
-410,
623
-620,
820
]
},
{
"id": "9f76338b-f4ca-4bb3-b61a-57b3f233061e",
"name": "seed",
"type": "INT",
"linkIds": [
377
],
"pos": [
-620,
840
]
},
{
"id": "8d0422d5-5eee-4f7e-9817-dc613cc62eca",
"name": "unet_name",
"type": "COMBO",
"linkIds": [
378
],
"pos": [
-620,
860
]
},
{
"id": "552eece2-a735-4d00-ae78-ded454622bc1",
"name": "clip_name",
"type": "COMBO",
"linkIds": [
379
],
"pos": [
-620,
880
]
},
{
"id": "1e6d141c-d0f9-4a2b-895c-b6780e57cfa0",
"name": "vae_name",
"type": "COMBO",
"linkIds": [
380
],
"pos": [
-620,
900
]
}
],
@ -231,14 +323,14 @@
"type": "CLIPLoader",
"pos": [
-320,
310
360
],
"size": [
346.7470703125,
106
350,
150
],
"flags": {},
"order": 0,
"order": 5,
"mode": 0,
"inputs": [
{
@ -248,7 +340,7 @@
"widget": {
"name": "clip_name"
},
"link": null
"link": 379
},
{
"localized_name": "type",
@ -283,9 +375,14 @@
}
],
"properties": {
"Node name for S&R": "CLIPLoader",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "CLIPLoader",
"models": [
{
"name": "qwen_2.5_vl_7b_fp8_scaled.safetensors",
@ -312,14 +409,14 @@
"type": "VAELoader",
"pos": [
-320,
460
580
],
"size": [
346.7470703125,
58
350,
110
],
"flags": {},
"order": 1,
"order": 6,
"mode": 0,
"inputs": [
{
@ -329,7 +426,7 @@
"widget": {
"name": "vae_name"
},
"link": null
"link": 380
}
],
"outputs": [
@ -345,9 +442,14 @@
}
],
"properties": {
"Node name for S&R": "VAELoader",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "VAELoader",
"models": [
{
"name": "qwen_image_layered_vae.safetensors",
@ -375,11 +477,11 @@
420
],
"size": [
425.27801513671875,
180.6060791015625
430,
190
],
"flags": {},
"order": 3,
"order": 2,
"mode": 0,
"inputs": [
{
@ -411,9 +513,14 @@
],
"title": "CLIP Text Encode (Negative Prompt)",
"properties": {
"Node name for S&R": "CLIPTextEncode",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "CLIPTextEncode",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
@ -432,12 +539,12 @@
"id": 70,
"type": "ReferenceLatent",
"pos": [
330,
670
140,
700
],
"size": [
204.1666717529297,
46
210,
50
],
"flags": {
"collapsed": true
@ -470,9 +577,14 @@
}
],
"properties": {
"Node name for S&R": "ReferenceLatent",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "ReferenceLatent",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
@ -480,19 +592,18 @@
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": []
}
},
{
"id": 69,
"type": "ReferenceLatent",
"pos": [
330,
710
160,
820
],
"size": [
204.1666717529297,
46
210,
50
],
"flags": {
"collapsed": true
@ -525,9 +636,14 @@
}
],
"properties": {
"Node name for S&R": "ReferenceLatent",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "ReferenceLatent",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
@ -535,8 +651,7 @@
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": []
}
},
{
"id": 66,
@ -547,10 +662,10 @@
],
"size": [
270,
58
110
],
"flags": {},
"order": 4,
"order": 7,
"mode": 0,
"inputs": [
{
@ -580,9 +695,14 @@
}
],
"properties": {
"Node name for S&R": "ModelSamplingAuraFlow",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "ModelSamplingAuraFlow",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
@ -600,11 +720,11 @@
"type": "LatentCutToBatch",
"pos": [
830,
160
140
],
"size": [
270,
82
140
],
"flags": {},
"order": 11,
@ -646,9 +766,14 @@
}
],
"properties": {
"Node name for S&R": "LatentCutToBatch",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "LatentCutToBatch",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
@ -666,12 +791,12 @@
"id": 71,
"type": "VAEEncode",
"pos": [
100,
690
-280,
780
],
"size": [
140,
46
230,
100
],
"flags": {
"collapsed": false
@ -704,9 +829,14 @@
}
],
"properties": {
"Node name for S&R": "VAEEncode",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "VAEEncode",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
@ -714,24 +844,23 @@
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": []
}
},
{
"id": 8,
"type": "VAEDecode",
"pos": [
850,
310
370
],
"size": [
210,
46
50
],
"flags": {
"collapsed": true
},
"order": 7,
"order": 3,
"mode": 0,
"inputs": [
{
@ -759,9 +888,14 @@
}
],
"properties": {
"Node name for S&R": "VAEDecode",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "VAEDecode",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
@ -769,8 +903,7 @@
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": []
}
},
{
"id": 6,
@ -780,11 +913,11 @@
180
],
"size": [
422.84503173828125,
164.31304931640625
430,
170
],
"flags": {},
"order": 6,
"order": 1,
"mode": 0,
"inputs": [
{
@ -816,9 +949,14 @@
],
"title": "CLIP Text Encode (Positive Prompt)",
"properties": {
"Node name for S&R": "CLIPTextEncode",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "CLIPTextEncode",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
@ -838,14 +976,14 @@
"type": "KSampler",
"pos": [
530,
280
340
],
"size": [
270,
400
],
"flags": {},
"order": 5,
"order": 0,
"mode": 0,
"inputs": [
{
@ -879,7 +1017,7 @@
"widget": {
"name": "seed"
},
"link": null
"link": 377
},
{
"localized_name": "steps",
@ -939,9 +1077,14 @@
}
],
"properties": {
"Node name for S&R": "KSampler",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "KSampler",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
@ -964,12 +1107,12 @@
"id": 78,
"type": "GetImageSize",
"pos": [
80,
790
-280,
930
],
"size": [
210,
136
230,
140
],
"flags": {},
"order": 12,
@ -1007,9 +1150,14 @@
}
],
"properties": {
"Node name for S&R": "GetImageSize",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "GetImageSize",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
@ -1017,23 +1165,23 @@
"secondTabText": "Send Back",
"secondTabOffset": 80,
"secondTabWidth": 65
},
"widgets_values": []
}
},
{
"id": 83,
"type": "EmptyQwenImageLayeredLatentImage",
"pos": [
320,
790
-280,
1120
],
"size": [
330.9341796875,
130
340,
200
],
"flags": {},
"order": 13,
"mode": 0,
"showAdvanced": true,
"inputs": [
{
"localized_name": "width",
@ -1083,9 +1231,14 @@
}
],
"properties": {
"Node name for S&R": "EmptyQwenImageLayeredLatentImage",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "EmptyQwenImageLayeredLatentImage",
"enableTabs": false,
"tabWidth": 65,
"tabXOffset": 10,
@ -1109,11 +1262,11 @@
180
],
"size": [
346.7470703125,
82
350,
110
],
"flags": {},
"order": 2,
"order": 4,
"mode": 0,
"inputs": [
{
@ -1123,7 +1276,7 @@
"widget": {
"name": "unet_name"
},
"link": null
"link": 378
},
{
"localized_name": "weight_dtype",
@ -1147,9 +1300,14 @@
}
],
"properties": {
"Node name for S&R": "UNETLoader",
"cnr_id": "comfy-core",
"ver": "0.5.1",
"ue_properties": {
"widget_ue_connectable": {},
"input_ue_unconnectable": {},
"version": "7.7"
},
"Node name for S&R": "UNETLoader",
"models": [
{
"name": "qwen_image_layered_bf16.safetensors",
@ -1191,8 +1349,8 @@
"bounding": [
-330,
110,
366.7470703125,
421.6
370,
610
],
"color": "#3f789e",
"font_size": 24,
@ -1391,6 +1549,38 @@
"target_id": 83,
"target_slot": 2,
"type": "INT"
},
{
"id": 377,
"origin_id": -10,
"origin_slot": 5,
"target_id": 3,
"target_slot": 4,
"type": "INT"
},
{
"id": 378,
"origin_id": -10,
"origin_slot": 6,
"target_id": 37,
"target_slot": 0,
"type": "COMBO"
},
{
"id": 379,
"origin_id": -10,
"origin_slot": 7,
"target_id": 38,
"target_slot": 0,
"type": "COMBO"
},
{
"id": 380,
"origin_id": -10,
"origin_slot": 8,
"target_id": 39,
"target_slot": 0,
"type": "COMBO"
}
],
"extra": {
@ -1400,7 +1590,6 @@
}
]
},
"config": {},
"extra": {
"ds": {
"scale": 1.14,
@ -1409,7 +1598,6 @@
6.855893974423647
]
},
"workflowRendererVersion": "LG"
},
"version": 0.4
}
"ue_links": []
}
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -240,19 +240,34 @@ class VideoFromFile(VideoInput):
             start_time = self.__start_time
         # Get video frames
         frames = []
+        alphas = None
         start_pts = int(start_time / video_stream.time_base)
         end_pts = int((start_time + self.__duration) / video_stream.time_base)
         container.seek(start_pts, stream=video_stream)
+        image_format = 'gbrpf32le'
         for frame in container.decode(video_stream):
+            if alphas is None:
+                for comp in frame.format.components:
+                    if comp.is_alpha:
+                        alphas = []
+                        image_format = 'gbrapf32le'
+                        break
             if frame.pts < start_pts:
                 continue
             if self.__duration and frame.pts >= end_pts:
                 break
-            img = frame.to_ndarray(format='rgb24') # shape: (H, W, 3)
-            img = torch.from_numpy(img) / 255.0 # shape: (H, W, 3)
-            frames.append(img)
-        images = torch.stack(frames) if len(frames) > 0 else torch.zeros(0, 3, 0, 0)
+            img = frame.to_ndarray(format=image_format) # shape: (H, W, 4)
+            if alphas is None:
+                frames.append(torch.from_numpy(img))
+            else:
+                frames.append(torch.from_numpy(img[..., :-1]))
+                alphas.append(torch.from_numpy(img[..., -1:]))
+        images = torch.stack(frames) if len(frames) > 0 else torch.zeros(0, 0, 0, 3)
+        if alphas is not None:
+            alphas = torch.stack(alphas) if len(alphas) > 0 else torch.zeros(0, 0, 0, 1)
         # Get frame rate
         frame_rate = Fraction(video_stream.average_rate) if video_stream.average_rate else Fraction(1)
@@ -295,7 +310,7 @@ class VideoFromFile(VideoInput):
                 })
         metadata = container.metadata
-        return VideoComponents(images=images, audio=audio, frame_rate=frame_rate, metadata=metadata)
+        return VideoComponents(images=images, alpha=alphas, audio=audio, frame_rate=frame_rate, metadata=metadata)
     def get_components(self) -> VideoComponents:
         if isinstance(self.__file, io.BytesIO):
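The alpha handling in the hunk above comes down to two steps: detect whether any pixel-format component is an alpha component, then split the last channel off each decoded frame. A minimal sketch of that idea follows, using nested lists instead of arrays. `Component`, `has_alpha`, and `split_alpha` are hypothetical stand-ins for illustration and are not PyAV or ComfyUI APIs.

```python
from dataclasses import dataclass
from typing import List, Tuple

Pixel = List[float]
Frame = List[List[Pixel]]  # rows of pixels, channels last


@dataclass
class Component:
    """Hypothetical stand-in for a pixel-format component descriptor."""
    is_alpha: bool


def has_alpha(components: List[Component]) -> bool:
    """True if any component of the pixel format is an alpha component."""
    return any(c.is_alpha for c in components)


def split_alpha(frame: Frame) -> Tuple[Frame, Frame]:
    """Split a channels-last frame into color (all but last) and alpha (last)."""
    color = [[px[:-1] for px in row] for row in frame]
    alpha = [[px[-1:] for px in row] for row in frame]
    return color, alpha


frame = [[[0.1, 0.2, 0.3, 1.0], [0.4, 0.5, 0.6, 0.0]]]  # 1 x 2 pixels, 4 channels
if has_alpha([Component(False), Component(False), Component(False), Component(True)]):
    color, alpha = split_alpha(frame)
    print(color)  # [[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]]
    print(alpha)  # [[[1.0], [0.0]]]
```

Slicing with `-1:` rather than `-1` keeps the alpha as a trailing axis of length 1, which matches the `img[..., -1:]` form used in the diff.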

View File

@@ -3,7 +3,7 @@ from dataclasses import dataclass
 from enum import Enum
 from fractions import Fraction
 from typing import Optional
-from .._input import ImageInput, AudioInput
+from .._input import ImageInput, AudioInput, MaskInput
 class VideoCodec(str, Enum):
     AUTO = "auto"
@@ -48,5 +48,4 @@ class VideoComponents:
     frame_rate: Fraction
     audio: Optional[AudioInput] = None
     metadata: Optional[dict] = None
+    alpha: Optional[MaskInput] = None

View File

@@ -33,9 +33,13 @@ class OpenAIVideoSora2(IO.ComfyNode):
     def define_schema(cls):
         return IO.Schema(
             node_id="OpenAIVideoSora2",
-            display_name="OpenAI Sora - Video",
+            display_name="OpenAI Sora - Video (Deprecated)",
             category="api node/video/Sora",
-            description="OpenAI video and audio generation.",
+            description=(
+                "OpenAI video and audio generation.\n\n"
+                "DEPRECATION NOTICE: OpenAI will stop serving the Sora v2 API in September 2026. "
+                "This node will be removed from ComfyUI at that time."
+            ),
             inputs=[
                 IO.Combo.Input(
                     "model",

View File

@@ -11,7 +11,6 @@ import kornia
 import comfy.utils
 import comfy.model_management
 from comfy_extras.nodes_latent import reshape_latent_to
-import node_helpers
 from comfy_api.latest import ComfyExtension, io
 from nodes import MAX_RESOLUTION
@@ -36,8 +35,23 @@ class Blend(io.ComfyNode):
     @classmethod
     def execute(cls, image1: torch.Tensor, image2: torch.Tensor, blend_factor: float, blend_mode: str) -> io.NodeOutput:
-        image1, image2 = node_helpers.image_alpha_fix(image1, image2)
         image2 = image2.to(image1.device)
+        # Reconcile mismatched channel counts. Downstream nodes (SaveImage,
+        # PreviewImage) ultimately call PIL.Image.fromarray, which rejects
+        # arrays with more than 4 channels, so we cap the output at 4
+        # (RGBA): any image with > 4 channels is truncated, and any image
+        # with fewer channels than the (capped) target is padded with 1.0s
+        # so the extra slot behaves like an opaque alpha channel.
+        if image1.shape[-1] != image2.shape[-1] or image1.shape[-1] > 4 or image2.shape[-1] > 4:
+            target_channels = min(max(image1.shape[-1], image2.shape[-1]), 4)
+            if image1.shape[-1] > target_channels:
+                image1 = image1[..., :target_channels]
+            elif image1.shape[-1] < target_channels:
+                image1 = torch.cat([image1, torch.ones(*image1.shape[:-1], target_channels - image1.shape[-1], device=image1.device, dtype=image1.dtype)], dim=-1)
+            if image2.shape[-1] > target_channels:
+                image2 = image2[..., :target_channels]
+            elif image2.shape[-1] < target_channels:
+                image2 = torch.cat([image2, torch.ones(*image2.shape[:-1], target_channels - image2.shape[-1], device=image2.device, dtype=image2.dtype)], dim=-1)
         if image1.shape != image2.shape:
             image2 = image2.permute(0, 3, 1, 2)
             image2 = comfy.utils.common_upscale(image2, image1.shape[2], image1.shape[1], upscale_method='bicubic', crop='center')

View File

@@ -54,7 +54,7 @@ class EmptySD3LatentImage(io.ComfyNode):
     @classmethod
     def execute(cls, width, height, batch_size=1) -> io.NodeOutput:
-        latent = torch.zeros([batch_size, 16, height // 8, width // 8], device=comfy.model_management.intermediate_device())
+        latent = torch.zeros([batch_size, 16, height // 8, width // 8], device=comfy.model_management.intermediate_device(), dtype=comfy.model_management.intermediate_dtype())
         return io.NodeOutput({"samples": latent, "downscale_ratio_spacial": 8})
     generate = execute # TODO: remove

3231
openapi.yaml Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -1,4 +1,4 @@
-comfyui-frontend-package==1.42.14
+comfyui-frontend-package==1.42.15
 comfyui-workflow-templates==0.9.62
 comfyui-embedded-docs==0.4.4
 torch

View File

@ -0,0 +1,151 @@
import sys
from unittest.mock import patch, MagicMock
# `comfy.model_management` initializes the GPU at module import time, which
# fails in CPU-only environments. Stub it out before any `comfy.*` imports
# load it transitively. We don't use it in these tests.
sys.modules.setdefault("comfy.model_management", MagicMock())
import torch # noqa: E402
# Mock nodes module to prevent CUDA initialization during import
mock_nodes = MagicMock()
mock_nodes.MAX_RESOLUTION = 16384
# Mock server module for PromptServer
mock_server = MagicMock()
with patch.dict("sys.modules", {"nodes": mock_nodes, "server": mock_server}):
from comfy_extras.nodes_post_processing import Blend # noqa: E402
class TestImageBlend:
"""Regression tests for the ImageBlend node, especially channel-count handling."""
def create_test_image(self, batch_size=1, height=64, width=64, channels=3):
return torch.rand(batch_size, height, width, channels)
def test_same_shape_rgb(self):
"""Baseline: identical RGB inputs produce an RGB output."""
image1 = self.create_test_image(channels=3)
image2 = self.create_test_image(channels=3)
result = Blend.execute(image1, image2, 0.5, "normal")
assert result[0].shape == (1, 64, 64, 3)
def test_rgb_plus_rgba(self):
"""RGB image1 + RGBA image2 should pad image1 to 4 channels."""
image1 = self.create_test_image(channels=3)
image2 = self.create_test_image(channels=4)
result = Blend.execute(image1, image2, 0.5, "normal")
assert result[0].shape == (1, 64, 64, 4)
def test_rgba_plus_rgb(self):
"""RGBA image1 + RGB image2 should pad image2 to 4 channels."""
image1 = self.create_test_image(channels=4)
image2 = self.create_test_image(channels=3)
result = Blend.execute(image1, image2, 0.5, "normal")
assert result[0].shape == (1, 64, 64, 4)
def test_channel_gap_larger_than_one(self):
"""Channel-count gap > 1 (e.g. 3 vs 5) should not raise.
This is the exact runtime error reported in CORE-103:
'The size of tensor a (5) must match the size of tensor b (3) at
non-singleton dimension 3'.
The output is capped at 4 channels (RGBA) because downstream
SaveImage/PreviewImage rely on PIL.Image.fromarray, which rejects
arrays with more than 4 channels. Without this cap, the failure
would just shift from blend-time to save-time.
"""
image1 = self.create_test_image(channels=3)
image2 = self.create_test_image(channels=5)
result = Blend.execute(image1, image2, 0.5, "multiply")
assert result[0].shape == (1, 64, 64, 4)
def test_output_capped_at_four_channels(self):
"""Both inputs having > 4 channels should still produce a 4-channel
output. PIL.Image.fromarray (used by SaveImage/PreviewImage)
rejects arrays with more than 4 channels."""
image1 = self.create_test_image(channels=6)
image2 = self.create_test_image(channels=5)
result = Blend.execute(image1, image2, 0.5, "normal")
assert result[0].shape == (1, 64, 64, 4)
def test_save_compatible_output_passes_through_pil(self):
"""The blended result must be convertible by PIL.Image.fromarray,
which is what SaveImage/PreviewImage do downstream. Catches the
case where a >4-channel output would silently break save/preview."""
from PIL import Image
import numpy as np
image1 = self.create_test_image(channels=3)
image2 = self.create_test_image(channels=5)
result = Blend.execute(image1, image2, 0.5, "normal")
# Mirror SaveImage's exact conversion (nodes.py:1662). PIL accepts
# 1/2/3/4-channel arrays (L/LA/RGB/RGBA); a >4-channel output would
# raise "TypeError: Cannot handle this data type" here.
arr = np.clip(255.0 * result[0][0].cpu().numpy(), 0, 255).astype(np.uint8)
Image.fromarray(arr)
def test_different_size_and_channels(self):
"""Different spatial size AND different channel counts should both be reconciled."""
image1 = self.create_test_image(height=64, width=64, channels=3)
image2 = self.create_test_image(height=32, width=32, channels=4)
result = Blend.execute(image1, image2, 0.5, "screen")
assert result[0].shape == (1, 64, 64, 4)
    def test_all_blend_modes_with_channel_mismatch(self):
        """Every blend mode should work with mismatched channel counts."""
        image1 = self.create_test_image(channels=3)
        image2 = self.create_test_image(channels=4)
        for mode in [
            "normal",
            "multiply",
            "screen",
            "overlay",
            "soft_light",
            "difference",
        ]:
            result = Blend.execute(image1, image2, 0.5, mode)
            assert result[0].shape == (1, 64, 64, 4), (
                f"blend mode {mode} produced wrong shape"
            )

    def test_output_clamped(self):
        """Output values should be clamped to [0, 1] even when intermediate
        results would go negative.

        With `difference` mode, image1=0 and image2=1, the unclamped blend
        produces image1*(1-bf) + (image1-image2)*bf = -bf, which is negative.
        The output therefore exercises the clamp branch.
        """
        image1 = torch.zeros(1, 8, 8, 3)
        image2 = torch.ones(1, 8, 8, 3)
        result = Blend.execute(image1, image2, 0.5, "difference")
        assert result[0].min() >= 0.0
        assert result[0].max() <= 1.0
        # All pixels would be -0.5 without the clamp; verify they were clipped to 0.
        assert torch.all(result[0] == 0.0)

    def test_padding_value_is_one(self):
        """Verify the padded channel(s) are filled with 1.0, not 0.0 or some
        other value. This is the semantic guarantee of the channel-alignment
        logic (it acts like an opaque alpha channel).

        Setup: image1 has 3 channels of zeros, image2 has 4 channels of ones.
        After padding, image1 becomes [0, 0, 0, X] where X is the pad value.
        With `multiply` blend_mode and blend_factor=1.0:
            output = image1 * (1 - 1) + (image1 * image2) * 1
                   = image1 * image2
                   = [0, 0, 0, X * 1] = [0, 0, 0, X]
        So the fourth output channel (index 3) reveals the pad value used for image1.
        """
        image1 = torch.zeros(1, 4, 4, 3)
        image2 = torch.ones(1, 4, 4, 4)
        result = Blend.execute(image1, image2, 1.0, "multiply")
        assert result[0].shape == (1, 4, 4, 4)
        # First three channels: 0 * 1 = 0
        assert torch.all(result[0][..., :3] == 0.0)
        # Fourth channel: pad_value * 1 = pad_value -> must be 1.0
        assert torch.all(result[0][..., 3] == 1.0)
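
The channel-alignment rule these tests exercise can be sketched in plain Python. This is an illustrative sketch only, not the code in Blend.execute: the helper names (`target_channels`, `align_pixel`, `blend_pixel`) are hypothetical, a pixel is modeled as a flat list of floats rather than a tensor, and the blend formulas are the ones quoted in the docstrings above.

```python
# Hypothetical helpers: a per-pixel model of the behavior the tests assert.

def target_channels(c1, c2, cap=4):
    """Channel count both inputs are aligned to: the larger count, capped at 4."""
    return min(max(c1, c2), cap)


def align_pixel(pixel, target, pad_value=1.0):
    """Truncate extra channels, then pad missing ones with pad_value."""
    trimmed = list(pixel[:target])
    return trimmed + [pad_value] * (target - len(trimmed))


def blend_pixel(p1, p2, blend_factor, mode):
    """Blend two pixels after alignment and clamp the result to [0, 1]."""
    target = target_channels(len(p1), len(p2))
    a = align_pixel(p1, target)
    b = align_pixel(p2, target)
    if mode == "multiply":
        mixed = [x * y for x, y in zip(a, b)]
    elif mode == "difference":
        mixed = [x - y for x, y in zip(a, b)]
    else:
        raise ValueError(f"mode not covered by this sketch: {mode}")
    out = [x * (1 - blend_factor) + m * blend_factor for x, m in zip(a, mixed)]
    return [min(max(v, 0.0), 1.0) for v in out]


padded = blend_pixel([0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0], 1.0, "multiply")
clamped = blend_pixel([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], 0.5, "difference")
```

Under these assumptions, `padded` has four channels and its last entry is the pad value, which mirrors test_padding_value_is_one, while every entry of `clamped` is clipped to 0.0 from -0.5, which mirrors test_output_clamped.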

@@ -0,0 +1,56 @@
import json
import pathlib

WORKFLOW_PATH = (
    pathlib.Path(__file__).resolve().parents[2]
    / "tests"
    / "inference"
    / "graphs"
    / "image_blend_channel_mismatch.json"
)


def test_workflow_loads():
    with open(WORKFLOW_PATH) as f:
        graph = json.load(f)
    assert isinstance(graph, dict) and graph, "workflow JSON is empty"


def test_workflow_uses_expected_node_types():
    """The workflow uses a fixed, minimal set of nodes. If any are renamed
    or removed upstream, this test fails fast instead of letting the demo
    bitrot silently."""
    expected = {
        "EmptyImage",
        "SolidMask",
        "JoinImageWithAlpha",
        "ImageBlend",
        "SaveImage",
    }
    with open(WORKFLOW_PATH) as f:
        graph = json.load(f)
    actual = {node["class_type"] for node in graph.values()}
    assert expected.issubset(actual), (
        f"workflow is missing required node types: {expected - actual}"
    )


def test_workflow_exercises_imageblend_with_mismatched_channels():
    """Sanity-check that the workflow actually wires an RGB output and an
    RGBA output into ImageBlend (the CORE-103 case). If someone edits the
    JSON and accidentally breaks this guarantee, the demo loses its point."""
    with open(WORKFLOW_PATH) as f:
        graph = json.load(f)

    blend_nodes = [n for n in graph.values() if n["class_type"] == "ImageBlend"]
    assert len(blend_nodes) == 1, "expected exactly one ImageBlend node"
    blend = blend_nodes[0]
    src1_id, _ = blend["inputs"]["image1"]
    src2_id, _ = blend["inputs"]["image2"]
    types = {graph[src1_id]["class_type"], graph[src2_id]["class_type"]}
    assert "JoinImageWithAlpha" in types, (
        "workflow no longer feeds an RGBA image into ImageBlend"
    )
    assert "EmptyImage" in types, (
        "workflow no longer feeds a plain RGB image into ImageBlend"
    )

@@ -0,0 +1,69 @@
{
  "1": {
    "inputs": {
      "width": 256,
      "height": 256,
      "batch_size": 1,
      "color": 16711680
    },
    "class_type": "EmptyImage",
    "_meta": {
      "title": "RGB image (3 channels, red)"
    }
  },
  "2": {
    "inputs": {
      "width": 256,
      "height": 256,
      "batch_size": 1,
      "color": 255
    },
    "class_type": "EmptyImage",
    "_meta": {
      "title": "Base image for RGBA (blue)"
    }
  },
  "3": {
    "inputs": {
      "value": 0.5,
      "width": 256,
      "height": 256
    },
    "class_type": "SolidMask",
    "_meta": {
      "title": "Alpha mask (0.5)"
    }
  },
  "4": {
    "inputs": {
      "image": ["2", 0],
      "alpha": ["3", 0]
    },
    "class_type": "JoinImageWithAlpha",
    "_meta": {
      "title": "RGBA image (4 channels)"
    }
  },
  "5": {
    "inputs": {
      "image1": ["1", 0],
      "image2": ["4", 0],
      "blend_factor": 0.5,
      "blend_mode": "normal"
    },
    "class_type": "ImageBlend",
    "_meta": {
      "title": "Blend RGB (3ch) + RGBA (4ch) - exercises CORE-103 fix"
    }
  },
  "6": {
    "inputs": {
      "filename_prefix": "image_blend_channel_test",
      "images": ["5", 0]
    },
    "class_type": "SaveImage",
    "_meta": {
      "title": "Save blended output (will be RGBA)"
    }
  }
}
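
The `color` inputs in this workflow are packed integers. Assuming the usual 0xRRGGBB packing, which is consistent with the node titles (red for 16711680, blue for 255), they can be unpacked as sketched below. The helper name `unpack_rgb` is hypothetical and is not part of ComfyUI.

```python
def unpack_rgb(color):
    """Split a packed 0xRRGGBB integer into (r, g, b) components in 0..255.

    Assumption: the workflow's `color` field uses 0xRRGGBB packing.
    """
    r = (color >> 16) & 0xFF
    g = (color >> 8) & 0xFF
    b = color & 0xFF
    return r, g, b


red = unpack_rgb(16711680)  # node "1": 0xFF0000
blue = unpack_rgb(255)      # node "2": 0x0000FF
```

Under that assumption, node "1" decodes to pure red and node "2" to pure blue, so the two inputs to ImageBlend differ in both color and channel count.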