Compare commits

...

39 Commits

Author SHA1 Message Date
c88cf4414d ComfyUI v0.19.5 2026-04-23 17:27:01 -04:00
7f2e08de2b add 4K resolution to Kling nodes (#13536)
Signed-off-by: bigcat88 <bigcat88@icloud.com>
2026-04-23 17:20:00 -04:00
b53623423e [Partner Nodes] GPTImage: fix price badges, add new resolutions (#13519)
* fix(api-nodes): fixed price badges, add new resolutions

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* proper calculate the total run cost when "n > 1"

Signed-off-by: bigcat88 <bigcat88@icloud.com>

---------

Signed-off-by: bigcat88 <bigcat88@icloud.com>
2026-04-23 17:19:31 -04:00
5e2a5b0cce [Partner Nodes] add SD2 real human support (#13509)
* feat(api-nodes): add SD2 real human support

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* fix: add validation before uploading Assets

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* Add asset_id and group_id displaying on the node

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* extend poll_op to use instead of custom async cycle

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* added the polling for the "Active" status after asset creation

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* updated tooltip for group_id

* allow usage of real human in the ByteDance2FirstLastFrame node

* add reference count limits

* corrected price in status when input assets contain video

Signed-off-by: bigcat88 <bigcat88@icloud.com>

---------

Signed-off-by: bigcat88 <bigcat88@icloud.com>
2026-04-23 17:19:21 -04:00
5b6726d6b7 fix: use Parameter assignment for Stable_Zero123 cc_projection weights (fixes #13492) (#13518)
On Windows with aimdo enabled, disable_weight_init.Linear uses lazy
initialization that sets weight and bias to None to avoid unnecessary
memory allocation. This caused a crash when copy_() was called on the
None weight attribute in Stable_Zero123.__init__.

Replace copy_() with direct torch.nn.Parameter assignment, which works
correctly on both Windows (aimdo enabled) and other platforms.
2026-04-23 17:18:59 -04:00
337b8c3f2b fix(veo): reject 4K resolution for veo-3.0 models in Veo3VideoGenerationNode (#13504)
The tooltip on the resolution input states that 4K is not available for
veo-3.1-lite or veo-3.0 models, but the execute guard only rejected the
lite combination. Selecting 4K with veo-3.0-generate-001 or
veo-3.0-fast-generate-001 would fall through and hit the upstream API
with an invalid request.

Broaden the guard to match the documented behavior and update the error
message accordingly.

Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
2026-04-23 17:18:33 -04:00
cb15dd6d0a chore: update workflow templates to v0.9.61 (#13533) 2026-04-23 17:18:01 -04:00
4a09ad8dca ComfyUI v0.19.4 2026-04-21 21:08:31 -04:00
c135d9f74a Add gpt-image-2 as version option (#13501) 2026-04-21 21:07:09 -04:00
ec62a307a2 Bump comfyui-frontend-package to 1.42.14 (#13493) 2026-04-21 21:06:57 -04:00
685f3db99d Bump comfyui-frontend-package to 1.42.12 (#13489) 2026-04-21 21:06:45 -04:00
58744ac533 [Partner Nodes] added 4K resolution for Veo models; added Veo 3 Lite model (#13330)
* feat(api nodes): added 4K resolution for Veo models; added Veo 3 Lite model

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* increase poll_interval from 5 to 9

---------

Signed-off-by: bigcat88 <bigcat88@icloud.com>
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
2026-04-21 21:06:19 -04:00
e6f1b1e6be feat(api-nodes): add automatic downscaling of videos for ByteDance 2 nodes (#13465) 2026-04-21 21:05:59 -04:00
3086026401 ComfyUI v0.19.3 2026-04-17 13:35:01 -04:00
9635c2ec9b fix(api-nodes): make "obj" output optional in Hunyuan3D Text and Image to 3D (#13449)
Signed-off-by: bigcat88 <bigcat88@icloud.com>
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
2026-04-18 01:31:37 +08:00
f8d92cf313 chore: update workflow templates to v0.9.57 (#13455) 2026-04-17 12:16:39 -05:00
4f48be4138 feat(api-nodes): add new "arrow-1.1" and "arrow-1.1-max" SVG models (#13447)
Signed-off-by: bigcat88 <bigcat88@icloud.com>
2026-04-17 12:02:06 -05:00
541fd10bbe fix(api-nodes): corrected StabilityAI price badges (#13454)
Signed-off-by: bigcat88 <bigcat88@icloud.com>
2026-04-17 11:44:08 -05:00
05f7531148 nodes_textgen: Implement use_default_template for LTX (#13451) 2026-04-17 12:20:09 -04:00
c033bbf516 ComfyUI v0.19.2 2026-04-17 00:26:35 -04:00
1391579c33 Add JsonExtractString node. (#13435) 2026-04-17 00:20:16 -04:00
d0c53c50c2 feat(api-nodes): add 1080p resolution for SeeDance 2.0 model (#13437)
Signed-off-by: bigcat88 <bigcat88@icloud.com>
2026-04-16 20:32:04 -05:00
b41ab53b6f Use ErnieTEModel_ not ErnieTEModel. (#13431) 2026-04-16 10:11:58 -04:00
e9a2d1e4cc Add a way to disable default template in text gen node. (#13424) 2026-04-15 22:59:08 -04:00
1de83f91c3 Fix OOM regression in _apply() for quantized models during inference (#13372)
Skip unnecessary clone of inference-mode tensors when already inside
torch.inference_mode(), matching the existing guard in set_attr_param.
The unconditional clone introduced in 20561aa9 caused transient VRAM
doubling during model movement for FP8/quantized models.
2026-04-15 02:10:36 -07:00
8f374716ee ComfyUI v0.19.1 2026-04-14 22:56:13 -04:00
cb0bbde402 Fix ernie on devices that don't support fp64. (#13414) 2026-04-14 22:54:47 -04:00
7ce3f64c78 Update workflow templates to v0.9.54 (#13412) 2026-04-14 17:35:27 -07:00
c5569e8627 Add string output to preview text node. (#13406) 2026-04-14 14:42:23 -04:00
c16db7fd69 Bump comfyui-frontend-package to 1.42.11 (#13398) 2026-04-14 14:13:35 -04:00
fed4ac031a chore: update workflow templates to v0.9.50 (#13399) 2026-04-14 14:24:37 +08:00
35dfcbbb28 [Partner Nodes] add Sonilo Audio nodes (#13391)
* feat(api-nodes): add Sonilo nodes

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* fix: do not spam frontend with each chunk arrival

Signed-off-by: bigcat88 <bigcat88@icloud.com>

* updated pricing badge

Signed-off-by: bigcat88 <bigcat88@icloud.com>

---------

Signed-off-by: bigcat88 <bigcat88@icloud.com>
2026-04-13 22:21:01 -07:00
722bc73319 Make text generation work with ministral model. (#13395)
Needs template before it works properly.
2026-04-13 20:43:57 -04:00
402ff1cdb7 Fix issue with ernie image. (#13393) 2026-04-13 16:38:42 -04:00
acd718598e ComfyUI v0.19.0 2026-04-13 03:02:36 -04:00
559501e4b8 chore: update workflow templates to v0.9.47 (#13385) 2026-04-12 23:19:09 -07:00
ee2db7488d feat(api-nodes): add SeeDance 2.0 nodes (#13364)
Signed-off-by: bigcat88 <bigcat88@icloud.com>
2026-04-12 19:26:19 -10:00
c2657d5fb9 Fix typo. (#13382) 2026-04-12 23:37:13 -04:00
971932346a Update quant doc so it's not completely wrong. (#13381)
There is still more that needs to be fixed.
2026-04-12 23:27:38 -04:00
24 changed files with 1936 additions and 223 deletions

View File

@ -139,9 +139,9 @@ Example:
"_quantization_metadata": {
"format_version": "1.0",
"layers": {
"model.layers.0.mlp.up_proj": "float8_e4m3fn",
"model.layers.0.mlp.down_proj": "float8_e4m3fn",
"model.layers.1.mlp.up_proj": "float8_e4m3fn"
"model.layers.0.mlp.up_proj": {"format": "float8_e4m3fn"},
"model.layers.0.mlp.down_proj": {"format": "float8_e4m3fn"},
"model.layers.1.mlp.up_proj": {"format": "float8_e4m3fn"}
}
}
}
@ -165,4 +165,4 @@ Activation quantization (e.g., for FP8 Tensor Core operations) requires `input_s
3. **Compute scales**: Derive `input_scale` from collected statistics
4. **Store in checkpoint**: Save `input_scale` parameters alongside weights
The calibration dataset should be representative of your target use case. For diffusion models, this typically means a diverse set of prompts and generation parameters.
The calibration dataset should be representative of your target use case. For diffusion models, this typically means a diverse set of prompts and generation parameters.

View File

@ -15,7 +15,7 @@ def rope(pos: torch.Tensor, dim: int, theta: int) -> torch.Tensor:
scale = torch.arange(0, dim, 2, dtype=torch.float64, device=device) / dim
omega = 1.0 / (theta**scale)
out = torch.einsum("...n,d->...nd", pos, omega)
out = torch.einsum("...n,d->...nd", pos.to(device), omega)
out = torch.stack([torch.cos(out), torch.sin(out)], dim=0)
return out.to(dtype=torch.float32, device=pos.device)
@ -279,7 +279,7 @@ class ErnieImageModel(nn.Module):
rotary_pos_emb = self.pos_embed(torch.cat([image_ids, text_ids], dim=1)).to(x.dtype)
del image_ids, text_ids
sample = self.time_proj(timesteps.to(dtype)).to(self.time_embedding.linear_1.weight.dtype)
sample = self.time_proj(timesteps).to(dtype)
c = self.time_embedding(sample)
shift_msa, scale_msa, gate_msa, shift_mlp, scale_mlp, gate_mlp = [

View File

@ -578,8 +578,8 @@ class Stable_Zero123(BaseModel):
def __init__(self, model_config, model_type=ModelType.EPS, device=None, cc_projection_weight=None, cc_projection_bias=None):
super().__init__(model_config, model_type, device=device)
self.cc_projection = comfy.ops.manual_cast.Linear(cc_projection_weight.shape[1], cc_projection_weight.shape[0], dtype=self.get_dtype(), device=device)
self.cc_projection.weight.copy_(cc_projection_weight)
self.cc_projection.bias.copy_(cc_projection_bias)
self.cc_projection.weight = torch.nn.Parameter(cc_projection_weight.clone())
self.cc_projection.bias = torch.nn.Parameter(cc_projection_bias.clone())
def extra_conds(self, **kwargs):
out = {}

View File

@ -1151,7 +1151,7 @@ def mixed_precision_ops(quant_config={}, compute_dtype=torch.bfloat16, full_prec
if param is None:
continue
p = fn(param)
if p.is_inference():
if (not torch.is_inference_mode_enabled()) and p.is_inference():
p = p.clone()
self.register_parameter(key, torch.nn.Parameter(p, requires_grad=False))
for key, buf in self._buffers.items():

View File

@ -3,7 +3,7 @@ from comfy import sd1_clip
import comfy.text_encoders.llama
class Ministral3_3BTokenizer(Mistral3Tokenizer):
def __init__(self, embedding_directory=None, embedding_size=5120, embedding_key='mistral3_24b', tokenizer_data={}):
def __init__(self, embedding_directory=None, embedding_size=5120, embedding_key='ministral3_3b', tokenizer_data={}):
return super().__init__(embedding_directory=embedding_directory, embedding_size=embedding_size, embedding_key=embedding_key, tokenizer_data=tokenizer_data)
class ErnieTokenizer(sd1_clip.SD1Tokenizer):
@ -35,4 +35,4 @@ def te(dtype_llama=None, llama_quantization_metadata=None):
model_options = model_options.copy()
model_options["quantization_metadata"] = llama_quantization_metadata
super().__init__(device=device, dtype=dtype, model_options=model_options)
return ErnieTEModel
return ErnieTEModel_

View File

@ -82,6 +82,7 @@ class Ministral3_3BConfig:
rope_scale = None
final_norm: bool = True
lm_head: bool = False
stop_tokens = [2]
@dataclass
class Qwen25_3BConfig:
@ -969,7 +970,7 @@ class Mistral3Small24B(BaseLlama, torch.nn.Module):
self.model = Llama2_(config, device=device, dtype=dtype, ops=operations)
self.dtype = dtype
class Ministral3_3B(BaseLlama, torch.nn.Module):
class Ministral3_3B(BaseLlama, BaseQwen3, BaseGenerate, torch.nn.Module):
def __init__(self, config_dict, dtype, device, operations):
super().__init__()
config = Ministral3_3BConfig(**config_dict)

View File

@ -52,6 +52,26 @@ class TaskImageContent(BaseModel):
role: Literal["first_frame", "last_frame", "reference_image"] | None = Field(None)
class TaskVideoContentUrl(BaseModel):
url: str = Field(...)
class TaskVideoContent(BaseModel):
type: str = Field("video_url")
video_url: TaskVideoContentUrl = Field(...)
role: str = Field("reference_video")
class TaskAudioContentUrl(BaseModel):
url: str = Field(...)
class TaskAudioContent(BaseModel):
type: str = Field("audio_url")
audio_url: TaskAudioContentUrl = Field(...)
role: str = Field("reference_audio")
class Text2VideoTaskCreationRequest(BaseModel):
model: str = Field(...)
content: list[TaskTextContent] = Field(..., min_length=1)
@ -64,6 +84,17 @@ class Image2VideoTaskCreationRequest(BaseModel):
generate_audio: bool | None = Field(...)
class Seedance2TaskCreationRequest(BaseModel):
model: str = Field(...)
content: list[TaskTextContent | TaskImageContent | TaskVideoContent | TaskAudioContent] = Field(..., min_length=1)
generate_audio: bool | None = Field(None)
resolution: str | None = Field(None)
ratio: str | None = Field(None)
duration: int | None = Field(None, ge=4, le=15)
seed: int | None = Field(None, ge=0, le=2147483647)
watermark: bool | None = Field(None)
class TaskCreationResponse(BaseModel):
id: str = Field(...)
@ -77,12 +108,62 @@ class TaskStatusResult(BaseModel):
video_url: str = Field(...)
class TaskStatusUsage(BaseModel):
completion_tokens: int = Field(0)
total_tokens: int = Field(0)
class TaskStatusResponse(BaseModel):
id: str = Field(...)
model: str = Field(...)
status: Literal["queued", "running", "cancelled", "succeeded", "failed"] = Field(...)
error: TaskStatusError | None = Field(None)
content: TaskStatusResult | None = Field(None)
usage: TaskStatusUsage | None = Field(None)
class GetAssetResponse(BaseModel):
id: str = Field(...)
name: str | None = Field(None)
url: str | None = Field(None)
asset_type: str = Field(...)
group_id: str = Field(...)
status: str = Field(...)
error: TaskStatusError | None = Field(None)
class SeedanceCreateVisualValidateSessionResponse(BaseModel):
session_id: str = Field(...)
h5_link: str = Field(...)
class SeedanceGetVisualValidateSessionResponse(BaseModel):
session_id: str = Field(...)
status: str = Field(...)
group_id: str | None = Field(None)
error_code: str | None = Field(None)
error_message: str | None = Field(None)
class SeedanceCreateAssetRequest(BaseModel):
group_id: str = Field(...)
url: str = Field(...)
asset_type: str = Field(...)
name: str | None = Field(None, max_length=64)
project_name: str | None = Field(None)
class SeedanceCreateAssetResponse(BaseModel):
asset_id: str = Field(...)
# Dollars per 1K tokens, keyed by (model_id, has_video_input).
SEEDANCE2_PRICE_PER_1K_TOKENS = {
("dreamina-seedance-2-0-260128", False): 0.007,
("dreamina-seedance-2-0-260128", True): 0.0043,
("dreamina-seedance-2-0-fast-260128", False): 0.0056,
("dreamina-seedance-2-0-fast-260128", True): 0.0033,
}
RECOMMENDED_PRESETS = [
@ -112,6 +193,19 @@ RECOMMENDED_PRESETS_SEEDREAM_4 = [
("Custom", None, None),
]
# Seedance 2.0 reference video pixel count limits per model and output resolution.
SEEDANCE2_REF_VIDEO_PIXEL_LIMITS = {
"dreamina-seedance-2-0-260128": {
"480p": {"min": 409_600, "max": 927_408},
"720p": {"min": 409_600, "max": 927_408},
"1080p": {"min": 409_600, "max": 2_073_600},
},
"dreamina-seedance-2-0-fast-260128": {
"480p": {"min": 409_600, "max": 927_408},
"720p": {"min": 409_600, "max": 927_408},
},
}
# The time in this dictionary are given for 10 seconds duration.
VIDEO_TASKS_EXECUTION_TIME = {
"seedance-1-0-lite-t2v-250428": {

File diff suppressed because it is too large Load Diff

View File

@ -221,14 +221,17 @@ class TencentTextToModelNode(IO.ComfyNode):
response_model=To3DProTaskResultResponse,
status_extractor=lambda r: r.Status,
)
obj_result = await download_and_extract_obj_zip(get_file_from_response(result.ResultFile3Ds, "obj").Url)
obj_file_response = get_file_from_response(result.ResultFile3Ds, "obj", raise_if_not_found=False)
obj_result = None
if obj_file_response:
obj_result = await download_and_extract_obj_zip(obj_file_response.Url)
return IO.NodeOutput(
f"{task_id}.glb",
await download_url_to_file_3d(
get_file_from_response(result.ResultFile3Ds, "glb").Url, "glb", task_id=task_id
),
obj_result.obj,
obj_result.texture,
obj_result.obj if obj_result else None,
obj_result.texture if obj_result else None,
)
@ -378,17 +381,30 @@ class TencentImageToModelNode(IO.ComfyNode):
response_model=To3DProTaskResultResponse,
status_extractor=lambda r: r.Status,
)
obj_result = await download_and_extract_obj_zip(get_file_from_response(result.ResultFile3Ds, "obj").Url)
obj_file_response = get_file_from_response(result.ResultFile3Ds, "obj", raise_if_not_found=False)
if obj_file_response:
obj_result = await download_and_extract_obj_zip(obj_file_response.Url)
return IO.NodeOutput(
f"{task_id}.glb",
await download_url_to_file_3d(
get_file_from_response(result.ResultFile3Ds, "glb").Url, "glb", task_id=task_id
),
obj_result.obj,
obj_result.texture,
obj_result.metallic if obj_result.metallic is not None else torch.zeros(1, 1, 1, 3),
obj_result.normal if obj_result.normal is not None else torch.zeros(1, 1, 1, 3),
obj_result.roughness if obj_result.roughness is not None else torch.zeros(1, 1, 1, 3),
)
return IO.NodeOutput(
f"{task_id}.glb",
await download_url_to_file_3d(
get_file_from_response(result.ResultFile3Ds, "glb").Url, "glb", task_id=task_id
),
obj_result.obj,
obj_result.texture,
obj_result.metallic if obj_result.metallic is not None else torch.zeros(1, 1, 1, 3),
obj_result.normal if obj_result.normal is not None else torch.zeros(1, 1, 1, 3),
obj_result.roughness if obj_result.roughness is not None else torch.zeros(1, 1, 1, 3),
None,
None,
None,
None,
None,
)

View File

@ -276,6 +276,7 @@ async def finish_omni_video_task(cls: type[IO.ComfyNode], response: TaskStatusRe
cls,
ApiEndpoint(path=f"/proxy/kling/v1/videos/omni-video/{response.data.task_id}"),
response_model=TaskStatusResponse,
max_poll_attempts=280,
status_extractor=lambda r: (r.data.task_status if r.data else None),
)
return IO.NodeOutput(await download_url_to_video_output(final_response.data.task_result.videos[0].url))
@ -862,7 +863,7 @@ class OmniProTextToVideoNode(IO.ComfyNode):
),
IO.Combo.Input("aspect_ratio", options=["16:9", "9:16", "1:1"]),
IO.Int.Input("duration", default=5, min=3, max=15, display_mode=IO.NumberDisplay.slider),
IO.Combo.Input("resolution", options=["1080p", "720p"], optional=True),
IO.Combo.Input("resolution", options=["4k", "1080p", "720p"], default="1080p", optional=True),
IO.DynamicCombo.Input(
"storyboards",
options=[
@ -904,12 +905,13 @@ class OmniProTextToVideoNode(IO.ComfyNode):
depends_on=IO.PriceBadgeDepends(widgets=["duration", "resolution", "model_name", "generate_audio"]),
expr="""
(
$mode := (widgets.resolution = "720p") ? "std" : "pro";
$res := widgets.resolution;
$mode := $res = "4k" ? "4k" : ($res = "720p" ? "std" : "pro");
$isV3 := $contains(widgets.model_name, "v3");
$audio := $isV3 and widgets.generate_audio;
$rates := $audio
? {"std": 0.112, "pro": 0.14}
: {"std": 0.084, "pro": 0.112};
? {"std": 0.112, "pro": 0.14, "4k": 0.42}
: {"std": 0.084, "pro": 0.112, "4k": 0.42};
{"type":"usd","usd": $lookup($rates, $mode) * widgets.duration}
)
""",
@ -934,6 +936,8 @@ class OmniProTextToVideoNode(IO.ComfyNode):
raise ValueError("kling-video-o1 only supports durations of 5 or 10 seconds.")
if generate_audio:
raise ValueError("kling-video-o1 does not support audio generation.")
if resolution == "4k":
raise ValueError("kling-video-o1 does not support 4k resolution.")
stories_enabled = storyboards is not None and storyboards["storyboards"] != "disabled"
if stories_enabled and model_name == "kling-video-o1":
raise ValueError("kling-video-o1 does not support storyboards.")
@ -963,6 +967,12 @@ class OmniProTextToVideoNode(IO.ComfyNode):
f"must equal the global duration ({duration}s)."
)
if resolution == "4k":
mode = "4k"
elif resolution == "1080p":
mode = "pro"
else:
mode = "std"
response = await sync_op(
cls,
ApiEndpoint(path="/proxy/kling/v1/videos/omni-video", method="POST"),
@ -972,7 +982,7 @@ class OmniProTextToVideoNode(IO.ComfyNode):
prompt=prompt,
aspect_ratio=aspect_ratio,
duration=str(duration),
mode="pro" if resolution == "1080p" else "std",
mode=mode,
multi_shot=multi_shot,
multi_prompt=multi_prompt_list,
shot_type="customize" if multi_shot else None,
@ -1014,7 +1024,7 @@ class OmniProFirstLastFrameNode(IO.ComfyNode):
optional=True,
tooltip="Up to 6 additional reference images.",
),
IO.Combo.Input("resolution", options=["1080p", "720p"], optional=True),
IO.Combo.Input("resolution", options=["4k", "1080p", "720p"], default="1080p", optional=True),
IO.DynamicCombo.Input(
"storyboards",
options=[
@ -1061,12 +1071,13 @@ class OmniProFirstLastFrameNode(IO.ComfyNode):
depends_on=IO.PriceBadgeDepends(widgets=["duration", "resolution", "model_name", "generate_audio"]),
expr="""
(
$mode := (widgets.resolution = "720p") ? "std" : "pro";
$res := widgets.resolution;
$mode := $res = "4k" ? "4k" : ($res = "720p" ? "std" : "pro");
$isV3 := $contains(widgets.model_name, "v3");
$audio := $isV3 and widgets.generate_audio;
$rates := $audio
? {"std": 0.112, "pro": 0.14}
: {"std": 0.084, "pro": 0.112};
? {"std": 0.112, "pro": 0.14, "4k": 0.42}
: {"std": 0.084, "pro": 0.112, "4k": 0.42};
{"type":"usd","usd": $lookup($rates, $mode) * widgets.duration}
)
""",
@ -1093,6 +1104,8 @@ class OmniProFirstLastFrameNode(IO.ComfyNode):
raise ValueError("kling-video-o1 does not support durations greater than 10 seconds.")
if generate_audio:
raise ValueError("kling-video-o1 does not support audio generation.")
if resolution == "4k":
raise ValueError("kling-video-o1 does not support 4k resolution.")
stories_enabled = storyboards is not None and storyboards["storyboards"] != "disabled"
if stories_enabled and model_name == "kling-video-o1":
raise ValueError("kling-video-o1 does not support storyboards.")
@ -1161,6 +1174,12 @@ class OmniProFirstLastFrameNode(IO.ComfyNode):
validate_image_aspect_ratio(i, (1, 2.5), (2.5, 1))
for i in await upload_images_to_comfyapi(cls, reference_images, wait_label="Uploading reference frame(s)"):
image_list.append(OmniParamImage(image_url=i))
if resolution == "4k":
mode = "4k"
elif resolution == "1080p":
mode = "pro"
else:
mode = "std"
response = await sync_op(
cls,
ApiEndpoint(path="/proxy/kling/v1/videos/omni-video", method="POST"),
@ -1170,7 +1189,7 @@ class OmniProFirstLastFrameNode(IO.ComfyNode):
prompt=prompt,
duration=str(duration),
image_list=image_list,
mode="pro" if resolution == "1080p" else "std",
mode=mode,
sound="on" if generate_audio else "off",
multi_shot=multi_shot,
multi_prompt=multi_prompt_list,
@ -1204,7 +1223,7 @@ class OmniProImageToVideoNode(IO.ComfyNode):
"reference_images",
tooltip="Up to 7 reference images.",
),
IO.Combo.Input("resolution", options=["1080p", "720p"], optional=True),
IO.Combo.Input("resolution", options=["4k", "1080p", "720p"], default="1080p", optional=True),
IO.DynamicCombo.Input(
"storyboards",
options=[
@ -1251,12 +1270,13 @@ class OmniProImageToVideoNode(IO.ComfyNode):
depends_on=IO.PriceBadgeDepends(widgets=["duration", "resolution", "model_name", "generate_audio"]),
expr="""
(
$mode := (widgets.resolution = "720p") ? "std" : "pro";
$res := widgets.resolution;
$mode := $res = "4k" ? "4k" : ($res = "720p" ? "std" : "pro");
$isV3 := $contains(widgets.model_name, "v3");
$audio := $isV3 and widgets.generate_audio;
$rates := $audio
? {"std": 0.112, "pro": 0.14}
: {"std": 0.084, "pro": 0.112};
? {"std": 0.112, "pro": 0.14, "4k": 0.42}
: {"std": 0.084, "pro": 0.112, "4k": 0.42};
{"type":"usd","usd": $lookup($rates, $mode) * widgets.duration}
)
""",
@ -1282,6 +1302,8 @@ class OmniProImageToVideoNode(IO.ComfyNode):
raise ValueError("kling-video-o1 does not support durations greater than 10 seconds.")
if generate_audio:
raise ValueError("kling-video-o1 does not support audio generation.")
if resolution == "4k":
raise ValueError("kling-video-o1 does not support 4k resolution.")
stories_enabled = storyboards is not None and storyboards["storyboards"] != "disabled"
if stories_enabled and model_name == "kling-video-o1":
raise ValueError("kling-video-o1 does not support storyboards.")
@ -1320,6 +1342,12 @@ class OmniProImageToVideoNode(IO.ComfyNode):
image_list: list[OmniParamImage] = []
for i in await upload_images_to_comfyapi(cls, reference_images, wait_label="Uploading reference image"):
image_list.append(OmniParamImage(image_url=i))
if resolution == "4k":
mode = "4k"
elif resolution == "1080p":
mode = "pro"
else:
mode = "std"
response = await sync_op(
cls,
ApiEndpoint(path="/proxy/kling/v1/videos/omni-video", method="POST"),
@ -1330,7 +1358,7 @@ class OmniProImageToVideoNode(IO.ComfyNode):
aspect_ratio=aspect_ratio,
duration=str(duration),
image_list=image_list,
mode="pro" if resolution == "1080p" else "std",
mode=mode,
sound="on" if generate_audio else "off",
multi_shot=multi_shot,
multi_prompt=multi_prompt_list,
@ -2860,7 +2888,7 @@ class KlingVideoNode(IO.ComfyNode):
IO.DynamicCombo.Option(
"kling-v3",
[
IO.Combo.Input("resolution", options=["1080p", "720p"]),
IO.Combo.Input("resolution", options=["4k", "1080p", "720p"], default="1080p"),
IO.Combo.Input(
"aspect_ratio",
options=["16:9", "9:16", "1:1"],
@ -2913,7 +2941,11 @@ class KlingVideoNode(IO.ComfyNode):
),
expr="""
(
$rates := {"1080p": {"off": 0.112, "on": 0.168}, "720p": {"off": 0.084, "on": 0.126}};
$rates := {
"4k": {"off": 0.42, "on": 0.42},
"1080p": {"off": 0.112, "on": 0.168},
"720p": {"off": 0.084, "on": 0.126}
};
$res := $lookup(widgets, "model.resolution");
$audio := widgets.generate_audio ? "on" : "off";
$rate := $lookup($lookup($rates, $res), $audio);
@ -2943,7 +2975,12 @@ class KlingVideoNode(IO.ComfyNode):
start_frame: Input.Image | None = None,
) -> IO.NodeOutput:
_ = seed
mode = "pro" if model["resolution"] == "1080p" else "std"
if model["resolution"] == "4k":
mode = "4k"
elif model["resolution"] == "1080p":
mode = "pro"
else:
mode = "std"
custom_multi_shot = False
if multi_shot["multi_shot"] == "disabled":
shot_type = None
@ -3025,6 +3062,7 @@ class KlingVideoNode(IO.ComfyNode):
cls,
ApiEndpoint(path=poll_path),
response_model=TaskStatusResponse,
max_poll_attempts=280,
status_extractor=lambda r: (r.data.task_status if r.data else None),
)
return IO.NodeOutput(await download_url_to_video_output(final_response.data.task_result.videos[0].url))
@ -3057,7 +3095,7 @@ class KlingFirstLastFrameNode(IO.ComfyNode):
IO.DynamicCombo.Option(
"kling-v3",
[
IO.Combo.Input("resolution", options=["1080p", "720p"]),
IO.Combo.Input("resolution", options=["4k", "1080p", "720p"], default="1080p"),
],
),
],
@ -3089,7 +3127,11 @@ class KlingFirstLastFrameNode(IO.ComfyNode):
),
expr="""
(
$rates := {"1080p": {"off": 0.112, "on": 0.168}, "720p": {"off": 0.084, "on": 0.126}};
$rates := {
"4k": {"off": 0.42, "on": 0.42},
"1080p": {"off": 0.112, "on": 0.168},
"720p": {"off": 0.084, "on": 0.126}
};
$res := $lookup(widgets, "model.resolution");
$audio := widgets.generate_audio ? "on" : "off";
$rate := $lookup($lookup($rates, $res), $audio);
@ -3118,6 +3160,12 @@ class KlingFirstLastFrameNode(IO.ComfyNode):
validate_image_aspect_ratio(end_frame, (1, 2.5), (2.5, 1))
image_url = await upload_image_to_comfyapi(cls, first_frame, wait_label="Uploading first frame")
image_tail_url = await upload_image_to_comfyapi(cls, end_frame, wait_label="Uploading end frame")
if model["resolution"] == "4k":
mode = "4k"
elif model["resolution"] == "1080p":
mode = "pro"
else:
mode = "std"
response = await sync_op(
cls,
ApiEndpoint(path="/proxy/kling/v1/videos/image2video", method="POST"),
@ -3127,7 +3175,7 @@ class KlingFirstLastFrameNode(IO.ComfyNode):
image=image_url,
image_tail=image_tail_url,
prompt=prompt,
mode="pro" if model["resolution"] == "1080p" else "std",
mode=mode,
duration=str(duration),
sound="on" if generate_audio else "off",
),
@ -3140,6 +3188,7 @@ class KlingFirstLastFrameNode(IO.ComfyNode):
cls,
ApiEndpoint(path=f"/proxy/kling/v1/videos/image2video/{response.data.task_id}"),
response_model=TaskStatusResponse,
max_poll_attempts=280,
status_extractor=lambda r: (r.data.task_status if r.data else None),
)
return IO.NodeOutput(await download_url_to_video_output(final_response.data.task_result.videos[0].url))

View File

@ -357,13 +357,17 @@ def calculate_tokens_price_image_1_5(response: OpenAIImageGenerationResponse) ->
return ((response.usage.input_tokens * 8.0) + (response.usage.output_tokens * 32.0)) / 1_000_000.0
def calculate_tokens_price_image_2_0(response: OpenAIImageGenerationResponse) -> float | None:
return ((response.usage.input_tokens * 8.0) + (response.usage.output_tokens * 30.0)) / 1_000_000.0
class OpenAIGPTImage1(IO.ComfyNode):
@classmethod
def define_schema(cls):
return IO.Schema(
node_id="OpenAIGPTImage1",
display_name="OpenAI GPT Image 1.5",
display_name="OpenAI GPT Image 2",
category="api node/image/OpenAI",
description="Generates images synchronously via OpenAI's GPT Image endpoint.",
inputs=[
@ -401,7 +405,17 @@ class OpenAIGPTImage1(IO.ComfyNode):
IO.Combo.Input(
"size",
default="auto",
options=["auto", "1024x1024", "1024x1536", "1536x1024"],
options=[
"auto",
"1024x1024",
"1024x1536",
"1536x1024",
"2048x2048",
"2048x1152",
"1152x2048",
"3840x2160",
"2160x3840",
],
tooltip="Image size",
optional=True,
),
@ -427,8 +441,8 @@ class OpenAIGPTImage1(IO.ComfyNode):
),
IO.Combo.Input(
"model",
options=["gpt-image-1", "gpt-image-1.5"],
default="gpt-image-1.5",
options=["gpt-image-1", "gpt-image-1.5", "gpt-image-2"],
default="gpt-image-2",
optional=True,
),
],
@ -442,23 +456,36 @@ class OpenAIGPTImage1(IO.ComfyNode):
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["quality", "n"]),
depends_on=IO.PriceBadgeDepends(widgets=["quality", "n", "model"]),
expr="""
(
$ranges := {
"low": [0.011, 0.02],
"medium": [0.046, 0.07],
"high": [0.167, 0.3]
"gpt-image-1": {
"low": [0.011, 0.02],
"medium": [0.042, 0.07],
"high": [0.167, 0.25]
},
"gpt-image-1.5": {
"low": [0.009, 0.02],
"medium": [0.034, 0.062],
"high": [0.133, 0.22]
},
"gpt-image-2": {
"low": [0.0048, 0.012],
"medium": [0.041, 0.112],
"high": [0.165, 0.43]
}
};
$range := $lookup($ranges, widgets.quality);
$n := widgets.n;
$range := $lookup($lookup($ranges, widgets.model), widgets.quality);
$nRaw := widgets.n;
$n := ($nRaw != null and $nRaw != 0) ? $nRaw : 1;
($n = 1)
? {"type":"range_usd","min_usd": $range[0], "max_usd": $range[1]}
? {"type":"range_usd","min_usd": $range[0], "max_usd": $range[1], "format": {"approximate": true}}
: {
"type":"range_usd",
"min_usd": $range[0],
"max_usd": $range[1],
"format": { "suffix": " x " & $string($n) & "/Run" }
"min_usd": $range[0] * $n,
"max_usd": $range[1] * $n,
"format": { "suffix": "/Run", "approximate": true }
}
)
""",
@ -483,10 +510,18 @@ class OpenAIGPTImage1(IO.ComfyNode):
if mask is not None and image is None:
raise ValueError("Cannot use a mask without an input image")
if model in ("gpt-image-1", "gpt-image-1.5"):
if size not in ("auto", "1024x1024", "1024x1536", "1536x1024"):
raise ValueError(f"Resolution {size} is only supported by GPT Image 2 model")
if model == "gpt-image-1":
price_extractor = calculate_tokens_price_image_1
elif model == "gpt-image-1.5":
price_extractor = calculate_tokens_price_image_1_5
elif model == "gpt-image-2":
price_extractor = calculate_tokens_price_image_2_0
if background == "transparent":
raise ValueError("Transparent background is not supported for GPT Image 2 model")
else:
raise ValueError(f"Unknown model: {model}")

View File

@ -17,6 +17,44 @@ from comfy_api_nodes.util import (
)
from comfy_extras.nodes_images import SVG
_ARROW_MODELS = ["arrow-1.1", "arrow-1.1-max", "arrow-preview"]
def _arrow_sampling_inputs():
"""Shared sampling inputs for all Arrow model variants."""
return [
IO.Float.Input(
"temperature",
default=1.0,
min=0.0,
max=2.0,
step=0.1,
display_mode=IO.NumberDisplay.slider,
tooltip="Randomness control. Higher values increase randomness.",
advanced=True,
),
IO.Float.Input(
"top_p",
default=1.0,
min=0.05,
max=1.0,
step=0.05,
display_mode=IO.NumberDisplay.slider,
tooltip="Nucleus sampling parameter.",
advanced=True,
),
IO.Float.Input(
"presence_penalty",
default=0.0,
min=-2.0,
max=2.0,
step=0.1,
display_mode=IO.NumberDisplay.slider,
tooltip="Token presence penalty.",
advanced=True,
),
]
class QuiverTextToSVGNode(IO.ComfyNode):
@classmethod
@ -39,6 +77,7 @@ class QuiverTextToSVGNode(IO.ComfyNode):
default="",
tooltip="Additional style or formatting guidance.",
optional=True,
advanced=True,
),
IO.Autogrow.Input(
"reference_images",
@ -53,43 +92,7 @@ class QuiverTextToSVGNode(IO.ComfyNode):
),
IO.DynamicCombo.Input(
"model",
options=[
IO.DynamicCombo.Option(
"arrow-preview",
[
IO.Float.Input(
"temperature",
default=1.0,
min=0.0,
max=2.0,
step=0.1,
display_mode=IO.NumberDisplay.slider,
tooltip="Randomness control. Higher values increase randomness.",
advanced=True,
),
IO.Float.Input(
"top_p",
default=1.0,
min=0.05,
max=1.0,
step=0.05,
display_mode=IO.NumberDisplay.slider,
tooltip="Nucleus sampling parameter.",
advanced=True,
),
IO.Float.Input(
"presence_penalty",
default=0.0,
min=-2.0,
max=2.0,
step=0.1,
display_mode=IO.NumberDisplay.slider,
tooltip="Token presence penalty.",
advanced=True,
),
],
),
],
options=[IO.DynamicCombo.Option(m, _arrow_sampling_inputs()) for m in _ARROW_MODELS],
tooltip="Model to use for SVG generation.",
),
IO.Int.Input(
@ -112,7 +115,16 @@ class QuiverTextToSVGNode(IO.ComfyNode):
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.429}""",
depends_on=IO.PriceBadgeDepends(widgets=["model"]),
expr="""
(
$contains(widgets.model, "max")
? {"type":"usd","usd":0.3575}
: $contains(widgets.model, "preview")
? {"type":"usd","usd":0.429}
: {"type":"usd","usd":0.286}
)
""",
),
)
@ -176,12 +188,13 @@ class QuiverImageToSVGNode(IO.ComfyNode):
"auto_crop",
default=False,
tooltip="Automatically crop to the dominant subject.",
advanced=True,
),
IO.DynamicCombo.Input(
"model",
options=[
IO.DynamicCombo.Option(
"arrow-preview",
m,
[
IO.Int.Input(
"target_size",
@ -189,39 +202,12 @@ class QuiverImageToSVGNode(IO.ComfyNode):
min=128,
max=4096,
tooltip="Square resize target in pixels.",
),
IO.Float.Input(
"temperature",
default=1.0,
min=0.0,
max=2.0,
step=0.1,
display_mode=IO.NumberDisplay.slider,
tooltip="Randomness control. Higher values increase randomness.",
advanced=True,
),
IO.Float.Input(
"top_p",
default=1.0,
min=0.05,
max=1.0,
step=0.05,
display_mode=IO.NumberDisplay.slider,
tooltip="Nucleus sampling parameter.",
advanced=True,
),
IO.Float.Input(
"presence_penalty",
default=0.0,
min=-2.0,
max=2.0,
step=0.1,
display_mode=IO.NumberDisplay.slider,
tooltip="Token presence penalty.",
advanced=True,
),
*_arrow_sampling_inputs(),
],
),
)
for m in _ARROW_MODELS
],
tooltip="Model to use for SVG vectorization.",
),
@ -245,7 +231,16 @@ class QuiverImageToSVGNode(IO.ComfyNode):
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.429}""",
depends_on=IO.PriceBadgeDepends(widgets=["model"]),
expr="""
(
$contains(widgets.model, "max")
? {"type":"usd","usd":0.3575}
: $contains(widgets.model, "preview")
? {"type":"usd","usd":0.429}
: {"type":"usd","usd":0.286}
)
""",
),
)

View File

@ -0,0 +1,287 @@
import base64
import json
import logging
import time
from urllib.parse import urljoin
import aiohttp
from typing_extensions import override
from comfy_api.latest import IO, ComfyExtension, Input
from comfy_api_nodes.util import (
ApiEndpoint,
audio_bytes_to_audio_input,
upload_video_to_comfyapi,
validate_string,
)
from comfy_api_nodes.util._helpers import (
default_base_url,
get_auth_header,
get_node_id,
is_processing_interrupted,
)
from comfy_api_nodes.util.common_exceptions import ProcessingInterrupted
from server import PromptServer
logger = logging.getLogger(__name__)
class SoniloVideoToMusic(IO.ComfyNode):
"""Generate music from video using Sonilo's AI model."""
@classmethod
def define_schema(cls) -> IO.Schema:
return IO.Schema(
node_id="SoniloVideoToMusic",
display_name="Sonilo Video to Music",
category="api node/audio/Sonilo",
description="Generate music from video content using Sonilo's AI model. "
"Analyzes the video and creates matching music.",
inputs=[
IO.Video.Input(
"video",
tooltip="Input video to generate music from. Maximum duration: 6 minutes.",
),
IO.String.Input(
"prompt",
default="",
multiline=True,
tooltip="Optional text prompt to guide music generation. "
"Leave empty for best quality - the model will fully analyze the video content.",
),
IO.Int.Input(
"seed",
default=0,
min=0,
max=0xFFFFFFFFFFFFFFFF,
control_after_generate=True,
tooltip="Seed for reproducibility. Currently ignored by the Sonilo "
"service but kept for graph consistency.",
),
],
outputs=[IO.Audio.Output()],
hidden=[
IO.Hidden.auth_token_comfy_org,
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr='{"type":"usd","usd":0.009,"format":{"suffix":"/second"}}',
),
)
@classmethod
async def execute(
cls,
video: Input.Video,
prompt: str = "",
seed: int = 0,
) -> IO.NodeOutput:
video_url = await upload_video_to_comfyapi(cls, video, max_duration=360)
form = aiohttp.FormData()
form.add_field("video_url", video_url)
if prompt.strip():
form.add_field("prompt", prompt.strip())
audio_bytes = await _stream_sonilo_music(
cls,
ApiEndpoint(path="/proxy/sonilo/v2m/generate", method="POST"),
form,
)
return IO.NodeOutput(audio_bytes_to_audio_input(audio_bytes))
class SoniloTextToMusic(IO.ComfyNode):
"""Generate music from a text prompt using Sonilo's AI model."""
@classmethod
def define_schema(cls) -> IO.Schema:
return IO.Schema(
node_id="SoniloTextToMusic",
display_name="Sonilo Text to Music",
category="api node/audio/Sonilo",
description="Generate music from a text prompt using Sonilo's AI model. "
"Leave duration at 0 to let the model infer it from the prompt.",
inputs=[
IO.String.Input(
"prompt",
default="",
multiline=True,
tooltip="Text prompt describing the music to generate.",
),
IO.Int.Input(
"duration",
default=0,
min=0,
max=360,
tooltip="Target duration in seconds. Set to 0 to let the model "
"infer the duration from the prompt. Maximum: 6 minutes.",
),
IO.Int.Input(
"seed",
default=0,
min=0,
max=0xFFFFFFFFFFFFFFFF,
control_after_generate=True,
tooltip="Seed for reproducibility. Currently ignored by the Sonilo "
"service but kept for graph consistency.",
),
],
outputs=[IO.Audio.Output()],
hidden=[
IO.Hidden.auth_token_comfy_org,
IO.Hidden.api_key_comfy_org,
IO.Hidden.unique_id,
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["duration"]),
expr="""
(
widgets.duration > 0
? {"type":"usd","usd": 0.005 * widgets.duration}
: {"type":"usd","usd": 0.005, "format":{"suffix":"/second"}}
)
""",
),
)
@classmethod
async def execute(
cls,
prompt: str,
duration: int = 0,
seed: int = 0,
) -> IO.NodeOutput:
validate_string(prompt, strip_whitespace=True, min_length=1)
form = aiohttp.FormData()
form.add_field("prompt", prompt)
if duration > 0:
form.add_field("duration", str(duration))
audio_bytes = await _stream_sonilo_music(
cls,
ApiEndpoint(path="/proxy/sonilo/t2m/generate", method="POST"),
form,
)
return IO.NodeOutput(audio_bytes_to_audio_input(audio_bytes))
async def _stream_sonilo_music(
cls: type[IO.ComfyNode],
endpoint: ApiEndpoint,
form: aiohttp.FormData,
) -> bytes:
"""POST ``form`` to Sonilo, read the NDJSON stream, and return the first stream's audio bytes."""
url = urljoin(default_base_url().rstrip("/") + "/", endpoint.path.lstrip("/"))
headers: dict[str, str] = {}
headers.update(get_auth_header(cls))
headers.update(endpoint.headers)
node_id = get_node_id(cls)
start_ts = time.monotonic()
last_chunk_status_ts = 0.0
audio_streams: dict[int, list[bytes]] = {}
title: str | None = None
timeout = aiohttp.ClientTimeout(total=1200.0, sock_read=300.0)
async with aiohttp.ClientSession(timeout=timeout) as session:
PromptServer.instance.send_progress_text("Status: Queued", node_id)
async with session.post(url, data=form, headers=headers) as resp:
if resp.status >= 400:
msg = await _extract_error_message(resp)
raise Exception(f"Sonilo API error ({resp.status}): {msg}")
while True:
if is_processing_interrupted():
raise ProcessingInterrupted("Task cancelled")
raw_line = await resp.content.readline()
if not raw_line:
break
line = raw_line.decode("utf-8").strip()
if not line:
continue
try:
evt = json.loads(line)
except json.JSONDecodeError:
logger.warning("Sonilo: skipping malformed NDJSON line")
continue
evt_type = evt.get("type")
if evt_type == "error":
code = evt.get("code", "UNKNOWN")
message = evt.get("message", "Unknown error")
raise Exception(f"Sonilo generation error ({code}): {message}")
if evt_type == "duration":
duration_sec = evt.get("duration_sec")
if duration_sec is not None:
PromptServer.instance.send_progress_text(
f"Status: Generating\nVideo duration: {duration_sec:.1f}s",
node_id,
)
elif evt_type in ("titles", "title"):
# v2m sends a "titles" list, t2m sends a scalar "title"
if evt_type == "titles":
titles = evt.get("titles", [])
if titles:
title = titles[0]
else:
title = evt.get("title") or title
if title:
PromptServer.instance.send_progress_text(
f"Status: Generating\nTitle: {title}",
node_id,
)
elif evt_type == "audio_chunk":
stream_idx = evt.get("stream_index", 0)
chunk_data = base64.b64decode(evt["data"])
if stream_idx not in audio_streams:
audio_streams[stream_idx] = []
audio_streams[stream_idx].append(chunk_data)
now = time.monotonic()
if now - last_chunk_status_ts >= 1.0:
total_chunks = sum(len(chunks) for chunks in audio_streams.values())
elapsed = int(now - start_ts)
status_lines = ["Status: Receiving audio"]
if title:
status_lines.append(f"Title: {title}")
status_lines.append(f"Chunks received: {total_chunks}")
status_lines.append(f"Time elapsed: {elapsed}s")
PromptServer.instance.send_progress_text("\n".join(status_lines), node_id)
last_chunk_status_ts = now
elif evt_type == "complete":
break
if not audio_streams:
raise Exception("Sonilo API returned no audio data.")
PromptServer.instance.send_progress_text("Status: Completed", node_id)
selected_stream = 0 if 0 in audio_streams else min(audio_streams)
return b"".join(audio_streams[selected_stream])
async def _extract_error_message(resp: aiohttp.ClientResponse) -> str:
"""Extract a human-readable error message from an HTTP error response."""
try:
error_body = await resp.json()
detail = error_body.get("detail", {})
if isinstance(detail, dict):
return detail.get("message", str(detail))
return str(detail)
except Exception:
return await resp.text()
class SoniloExtension(ComfyExtension):
@override
async def get_node_list(self) -> list[type[IO.ComfyNode]]:
return [SoniloVideoToMusic, SoniloTextToMusic]
async def comfy_entrypoint() -> SoniloExtension:
return SoniloExtension()

View File

@ -401,7 +401,7 @@ class StabilityUpscaleConservativeNode(IO.ComfyNode):
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.25}""",
expr="""{"type":"usd","usd":0.4}""",
),
)
@ -510,7 +510,7 @@ class StabilityUpscaleCreativeNode(IO.ComfyNode):
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.25}""",
expr="""{"type":"usd","usd":0.6}""",
),
)
@ -593,7 +593,7 @@ class StabilityUpscaleFastNode(IO.ComfyNode):
],
is_api_node=True,
price_badge=IO.PriceBadge(
expr="""{"type":"usd","usd":0.01}""",
expr="""{"type":"usd","usd":0.02}""",
),
)

View File

@ -24,8 +24,9 @@ from comfy_api_nodes.util import (
AVERAGE_DURATION_VIDEO_GEN = 32
MODELS_MAP = {
"veo-2.0-generate-001": "veo-2.0-generate-001",
"veo-3.1-generate": "veo-3.1-generate-preview",
"veo-3.1-fast-generate": "veo-3.1-fast-generate-preview",
"veo-3.1-generate": "veo-3.1-generate-001",
"veo-3.1-fast-generate": "veo-3.1-fast-generate-001",
"veo-3.1-lite": "veo-3.1-lite-generate-001",
"veo-3.0-generate-001": "veo-3.0-generate-001",
"veo-3.0-fast-generate-001": "veo-3.0-fast-generate-001",
}
@ -247,17 +248,8 @@ class VeoVideoGenerationNode(IO.ComfyNode):
raise Exception("Video generation completed but no video was returned")
class Veo3VideoGenerationNode(VeoVideoGenerationNode):
"""
Generates videos from text prompts using Google's Veo 3 API.
Supported models:
- veo-3.0-generate-001
- veo-3.0-fast-generate-001
This node extends the base Veo node with Veo 3 specific features including
audio generation and fixed 8-second duration.
"""
class Veo3VideoGenerationNode(IO.ComfyNode):
"""Generates videos from text prompts using Google's Veo 3 API."""
@classmethod
def define_schema(cls):
@ -279,6 +271,13 @@ class Veo3VideoGenerationNode(VeoVideoGenerationNode):
default="16:9",
tooltip="Aspect ratio of the output video",
),
IO.Combo.Input(
"resolution",
options=["720p", "1080p", "4k"],
default="720p",
tooltip="Output video resolution. 4K is not available for veo-3.1-lite and veo-3.0 models.",
optional=True,
),
IO.String.Input(
"negative_prompt",
multiline=True,
@ -289,11 +288,11 @@ class Veo3VideoGenerationNode(VeoVideoGenerationNode):
IO.Int.Input(
"duration_seconds",
default=8,
min=8,
min=4,
max=8,
step=1,
step=2,
display_mode=IO.NumberDisplay.number,
tooltip="Duration of the output video in seconds (Veo 3 only supports 8 seconds)",
tooltip="Duration of the output video in seconds",
optional=True,
),
IO.Boolean.Input(
@ -332,10 +331,10 @@ class Veo3VideoGenerationNode(VeoVideoGenerationNode):
options=[
"veo-3.1-generate",
"veo-3.1-fast-generate",
"veo-3.1-lite",
"veo-3.0-generate-001",
"veo-3.0-fast-generate-001",
],
default="veo-3.0-generate-001",
tooltip="Veo 3 model to use for video generation",
optional=True,
),
@ -356,21 +355,111 @@ class Veo3VideoGenerationNode(VeoVideoGenerationNode):
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model", "generate_audio"]),
depends_on=IO.PriceBadgeDepends(widgets=["model", "generate_audio", "resolution", "duration_seconds"]),
expr="""
(
$m := widgets.model;
$r := widgets.resolution;
$a := widgets.generate_audio;
($contains($m,"veo-3.0-fast-generate-001") or $contains($m,"veo-3.1-fast-generate"))
? {"type":"usd","usd": ($a ? 1.2 : 0.8)}
: ($contains($m,"veo-3.0-generate-001") or $contains($m,"veo-3.1-generate"))
? {"type":"usd","usd": ($a ? 3.2 : 1.6)}
: {"type":"range_usd","min_usd":0.8,"max_usd":3.2}
$seconds := widgets.duration_seconds;
$pps :=
$contains($m, "lite")
? ($r = "1080p" ? ($a ? 0.08 : 0.05) : ($a ? 0.05 : 0.03))
: $contains($m, "3.1-fast")
? ($r = "4k" ? ($a ? 0.30 : 0.25) : $r = "1080p" ? ($a ? 0.12 : 0.10) : ($a ? 0.10 : 0.08))
: $contains($m, "3.1-generate")
? ($r = "4k" ? ($a ? 0.60 : 0.40) : ($a ? 0.40 : 0.20))
: $contains($m, "3.0-fast")
? ($a ? 0.15 : 0.10)
: ($a ? 0.40 : 0.20);
{"type":"usd","usd": $pps * $seconds}
)
""",
),
)
@classmethod
async def execute(
cls,
prompt,
aspect_ratio="16:9",
resolution="720p",
negative_prompt="",
duration_seconds=8,
enhance_prompt=True,
person_generation="ALLOW",
seed=0,
image=None,
model="veo-3.0-generate-001",
generate_audio=False,
):
if resolution == "4k" and ("lite" in model or "3.0" in model):
raise Exception("4K resolution is not supported by the veo-3.1-lite or veo-3.0 models.")
model = MODELS_MAP[model]
instances = [{"prompt": prompt}]
if image is not None:
image_base64 = tensor_to_base64_string(image)
if image_base64:
instances[0]["image"] = {"bytesBase64Encoded": image_base64, "mimeType": "image/png"}
parameters = {
"aspectRatio": aspect_ratio,
"personGeneration": person_generation,
"durationSeconds": duration_seconds,
"enhancePrompt": True,
"generateAudio": generate_audio,
}
if negative_prompt:
parameters["negativePrompt"] = negative_prompt
if seed > 0:
parameters["seed"] = seed
if "veo-3.1" in model:
parameters["resolution"] = resolution
initial_response = await sync_op(
cls,
ApiEndpoint(path=f"/proxy/veo/{model}/generate", method="POST"),
response_model=VeoGenVidResponse,
data=VeoGenVidRequest(
instances=instances,
parameters=parameters,
),
)
poll_response = await poll_op(
cls,
ApiEndpoint(path=f"/proxy/veo/{model}/poll", method="POST"),
response_model=VeoGenVidPollResponse,
status_extractor=lambda r: "completed" if r.done else "pending",
data=VeoGenVidPollRequest(operationName=initial_response.name),
poll_interval=9.0,
estimated_duration=AVERAGE_DURATION_VIDEO_GEN,
)
if poll_response.error:
raise Exception(f"Veo API error: {poll_response.error.message} (code: {poll_response.error.code})")
response = poll_response.response
filtered_count = response.raiMediaFilteredCount
if filtered_count:
reasons = response.raiMediaFilteredReasons or []
reason_part = f": {reasons[0]}" if reasons else ""
raise Exception(
f"Content blocked by Google's Responsible AI filters{reason_part} "
f"({filtered_count} video{'s' if filtered_count != 1 else ''} filtered)."
)
if response.videos:
video = response.videos[0]
if video.bytesBase64Encoded:
return IO.NodeOutput(InputImpl.VideoFromFile(BytesIO(base64.b64decode(video.bytesBase64Encoded))))
if video.gcsUri:
return IO.NodeOutput(await download_url_to_video_output(video.gcsUri))
raise Exception("Video returned but no data or URL was provided")
raise Exception("Video generation completed but no video was returned")
class Veo3FirstLastFrameNode(IO.ComfyNode):
@ -394,7 +483,7 @@ class Veo3FirstLastFrameNode(IO.ComfyNode):
default="",
tooltip="Negative text prompt to guide what to avoid in the video",
),
IO.Combo.Input("resolution", options=["720p", "1080p"]),
IO.Combo.Input("resolution", options=["720p", "1080p", "4k"]),
IO.Combo.Input(
"aspect_ratio",
options=["16:9", "9:16"],
@ -424,8 +513,7 @@ class Veo3FirstLastFrameNode(IO.ComfyNode):
IO.Image.Input("last_frame", tooltip="End frame"),
IO.Combo.Input(
"model",
options=["veo-3.1-generate", "veo-3.1-fast-generate"],
default="veo-3.1-fast-generate",
options=["veo-3.1-generate", "veo-3.1-fast-generate", "veo-3.1-lite"],
),
IO.Boolean.Input(
"generate_audio",
@ -443,26 +531,20 @@ class Veo3FirstLastFrameNode(IO.ComfyNode):
],
is_api_node=True,
price_badge=IO.PriceBadge(
depends_on=IO.PriceBadgeDepends(widgets=["model", "generate_audio", "duration"]),
depends_on=IO.PriceBadgeDepends(widgets=["model", "generate_audio", "duration", "resolution"]),
expr="""
(
$prices := {
"veo-3.1-fast-generate": { "audio": 0.15, "no_audio": 0.10 },
"veo-3.1-generate": { "audio": 0.40, "no_audio": 0.20 }
};
$m := widgets.model;
$ga := (widgets.generate_audio = "true");
$r := widgets.resolution;
$ga := widgets.generate_audio;
$seconds := widgets.duration;
$modelKey :=
$contains($m, "veo-3.1-fast-generate") ? "veo-3.1-fast-generate" :
$contains($m, "veo-3.1-generate") ? "veo-3.1-generate" :
"";
$audioKey := $ga ? "audio" : "no_audio";
$modelPrices := $lookup($prices, $modelKey);
$pps := $lookup($modelPrices, $audioKey);
($pps != null)
? {"type":"usd","usd": $pps * $seconds}
: {"type":"range_usd","min_usd": 0.4, "max_usd": 3.2}
$pps :=
$contains($m, "lite")
? ($r = "1080p" ? ($ga ? 0.08 : 0.05) : ($ga ? 0.05 : 0.03))
: $contains($m, "fast")
? ($r = "4k" ? ($ga ? 0.30 : 0.25) : $r = "1080p" ? ($ga ? 0.12 : 0.10) : ($ga ? 0.10 : 0.08))
: ($r = "4k" ? ($ga ? 0.60 : 0.40) : ($ga ? 0.40 : 0.20));
{"type":"usd","usd": $pps * $seconds}
)
""",
),
@ -482,6 +564,9 @@ class Veo3FirstLastFrameNode(IO.ComfyNode):
model: str,
generate_audio: bool,
):
if "lite" in model and resolution == "4k":
raise Exception("4K resolution is not supported by the veo-3.1-lite model.")
model = MODELS_MAP[model]
initial_response = await sync_op(
cls,
@ -519,7 +604,7 @@ class Veo3FirstLastFrameNode(IO.ComfyNode):
data=VeoGenVidPollRequest(
operationName=initial_response.name,
),
poll_interval=5.0,
poll_interval=9.0,
estimated_duration=AVERAGE_DURATION_VIDEO_GEN,
)

View File

@ -19,6 +19,7 @@ from .conversions import (
image_tensor_pair_to_batch,
pil_to_bytesio,
resize_mask_to_image,
resize_video_to_pixel_budget,
tensor_to_base64_string,
tensor_to_bytesio,
tensor_to_pil,
@ -90,6 +91,7 @@ __all__ = [
"image_tensor_pair_to_batch",
"pil_to_bytesio",
"resize_mask_to_image",
"resize_video_to_pixel_budget",
"tensor_to_base64_string",
"tensor_to_bytesio",
"tensor_to_pil",

View File

@ -156,6 +156,7 @@ async def poll_op(
estimated_duration: int | None = None,
cancel_endpoint: ApiEndpoint | None = None,
cancel_timeout: float = 10.0,
extra_text: str | None = None,
) -> M:
raw = await poll_op_raw(
cls,
@ -176,6 +177,7 @@ async def poll_op(
estimated_duration=estimated_duration,
cancel_endpoint=cancel_endpoint,
cancel_timeout=cancel_timeout,
extra_text=extra_text,
)
if not isinstance(raw, dict):
raise Exception("Expected JSON response to validate into a Pydantic model, got non-JSON (binary or text).")
@ -260,6 +262,7 @@ async def poll_op_raw(
estimated_duration: int | None = None,
cancel_endpoint: ApiEndpoint | None = None,
cancel_timeout: float = 10.0,
extra_text: str | None = None,
) -> dict[str, Any]:
"""
Polls an endpoint until the task reaches a terminal state. Displays time while queued/processing,
@ -299,6 +302,7 @@ async def poll_op_raw(
price=state.price,
is_queued=state.is_queued,
processing_elapsed_seconds=int(proc_elapsed),
extra_text=extra_text,
)
await asyncio.sleep(1.0)
except Exception as exc:
@ -389,6 +393,7 @@ async def poll_op_raw(
price=state.price,
is_queued=False,
processing_elapsed_seconds=int(state.base_processing_elapsed),
extra_text=extra_text,
)
return resp_json
@ -462,6 +467,7 @@ def _display_time_progress(
price: float | None = None,
is_queued: bool | None = None,
processing_elapsed_seconds: int | None = None,
extra_text: str | None = None,
) -> None:
if estimated_total is not None and estimated_total > 0 and is_queued is False:
pe = processing_elapsed_seconds if processing_elapsed_seconds is not None else elapsed_seconds
@ -469,7 +475,8 @@ def _display_time_progress(
time_line = f"Time elapsed: {int(elapsed_seconds)}s (~{remaining}s remaining)"
else:
time_line = f"Time elapsed: {int(elapsed_seconds)}s"
_display_text(node_cls, time_line, status=status, price=price)
text = f"{time_line}\n\n{extra_text}" if extra_text else time_line
_display_text(node_cls, text, status=status, price=price)
async def _diagnose_connectivity() -> dict[str, bool]:

View File

@ -129,22 +129,38 @@ def pil_to_bytesio(img: Image.Image, mime_type: str = "image/png") -> BytesIO:
return img_byte_arr
def _compute_downscale_dims(src_w: int, src_h: int, total_pixels: int) -> tuple[int, int] | None:
"""Return downscaled (w, h) with even dims fitting ``total_pixels``, or None if already fits.
Source aspect ratio is preserved; output may drift by a fraction of a percent because both dimensions
are rounded down to even values (many codecs require divisible-by-2).
"""
pixels = src_w * src_h
if pixels <= total_pixels:
return None
scale = math.sqrt(total_pixels / pixels)
new_w = max(2, int(src_w * scale))
new_h = max(2, int(src_h * scale))
new_w -= new_w % 2
new_h -= new_h % 2
return new_w, new_h
def downscale_image_tensor(image: torch.Tensor, total_pixels: int = 1536 * 1024) -> torch.Tensor:
"""Downscale input image tensor to roughly the specified total pixels."""
"""Downscale input image tensor to roughly the specified total pixels.
Output dimensions are rounded down to even values so that the result is guaranteed to fit within ``total_pixels``
and is compatible with codecs that require even dimensions (e.g. yuv420p).
"""
samples = image.movedim(-1, 1)
total = int(total_pixels)
scale_by = math.sqrt(total / (samples.shape[3] * samples.shape[2]))
if scale_by >= 1:
dims = _compute_downscale_dims(samples.shape[3], samples.shape[2], int(total_pixels))
if dims is None:
return image
width = round(samples.shape[3] * scale_by)
height = round(samples.shape[2] * scale_by)
s = common_upscale(samples, width, height, "lanczos", "disabled")
s = s.movedim(1, -1)
return s
new_w, new_h = dims
return common_upscale(samples, new_w, new_h, "lanczos", "disabled").movedim(1, -1)
def downscale_image_tensor_by_max_side(image: torch.Tensor, *, max_side: int) -> torch.Tensor:
def downscale_image_tensor_by_max_side(image: torch.Tensor, *, max_side: int) -> torch.Tensor:
"""Downscale input image tensor so the largest dimension is at most max_side pixels."""
samples = image.movedim(-1, 1)
height, width = samples.shape[2], samples.shape[3]
@ -399,6 +415,72 @@ def trim_video(video: Input.Video, duration_sec: float) -> Input.Video:
raise RuntimeError(f"Failed to trim video: {str(e)}") from e
def resize_video_to_pixel_budget(video: Input.Video, total_pixels: int) -> Input.Video:
"""Downscale a video to fit within ``total_pixels`` (w * h), preserving aspect ratio.
Returns the original video object untouched when it already fits. Preserves frame rate, duration, and audio.
Aspect ratio is preserved up to a fraction of a percent (even-dim rounding).
"""
src_w, src_h = video.get_dimensions()
scale_dims = _compute_downscale_dims(src_w, src_h, total_pixels)
if scale_dims is None:
return video
return _apply_video_scale(video, scale_dims)
def _apply_video_scale(video: Input.Video, scale_dims: tuple[int, int]) -> Input.Video:
"""Re-encode ``video`` scaled to ``scale_dims`` with a single decode/encode pass."""
out_w, out_h = scale_dims
output_buffer = BytesIO()
input_container = None
output_container = None
try:
input_source = video.get_stream_source()
input_container = av.open(input_source, mode="r")
output_container = av.open(output_buffer, mode="w", format="mp4")
video_stream = output_container.add_stream("h264", rate=video.get_frame_rate())
video_stream.width = out_w
video_stream.height = out_h
video_stream.pix_fmt = "yuv420p"
audio_stream = None
for stream in input_container.streams:
if isinstance(stream, av.AudioStream):
audio_stream = output_container.add_stream("aac", rate=stream.sample_rate)
audio_stream.sample_rate = stream.sample_rate
audio_stream.layout = stream.layout
break
for frame in input_container.decode(video=0):
frame = frame.reformat(width=out_w, height=out_h, format="yuv420p")
for packet in video_stream.encode(frame):
output_container.mux(packet)
for packet in video_stream.encode():
output_container.mux(packet)
if audio_stream is not None:
input_container.seek(0)
for audio_frame in input_container.decode(audio=0):
for packet in audio_stream.encode(audio_frame):
output_container.mux(packet)
for packet in audio_stream.encode():
output_container.mux(packet)
output_container.close()
input_container.close()
output_buffer.seek(0)
return InputImpl.VideoFromFile(output_buffer)
except Exception as e:
if input_container is not None:
input_container.close()
if output_container is not None:
output_container.close()
raise RuntimeError(f"Failed to resize video: {str(e)}") from e
def _f32_pcm(wav: torch.Tensor) -> torch.Tensor:
"""Convert audio to float 32 bits PCM format. Copy-paste from nodes_audio.py file."""
if wav.dtype.is_floating_point:

View File

@ -11,7 +11,7 @@ class PreviewAny():
"required": {"source": (IO.ANY, {})},
}
RETURN_TYPES = ()
RETURN_TYPES = (IO.STRING,)
FUNCTION = "main"
OUTPUT_NODE = True
@ -33,7 +33,7 @@ class PreviewAny():
except Exception:
value = 'source exists, but could not be serialized.'
return {"ui": {"text": (value,)}}
return {"ui": {"text": (value,)}, "result": (value,)}
NODE_CLASS_MAPPINGS = {
"PreviewAny": PreviewAny,

View File

@ -1,4 +1,5 @@
import re
import json
from typing_extensions import override
from comfy_api.latest import ComfyExtension, io
@ -375,6 +376,39 @@ class RegexReplace(io.ComfyNode):
return io.NodeOutput(result)
class JsonExtractString(io.ComfyNode):
@classmethod
def define_schema(cls):
return io.Schema(
node_id="JsonExtractString",
display_name="Extract String from JSON",
category="utils/string",
search_aliases=["json", "extract json", "parse json", "json value", "read json"],
inputs=[
io.String.Input("json_string", multiline=True),
io.String.Input("key", multiline=False),
],
outputs=[
io.String.Output(),
]
)
@classmethod
def execute(cls, json_string, key):
try:
data = json.loads(json_string)
if isinstance(data, dict) and key in data:
value = data[key]
if value is None:
return io.NodeOutput("")
return io.NodeOutput(str(value))
return io.NodeOutput("")
except (json.JSONDecodeError, TypeError):
return io.NodeOutput("")
class StringExtension(ComfyExtension):
@override
async def get_node_list(self) -> list[type[io.ComfyNode]]:
@ -390,6 +424,7 @@ class StringExtension(ComfyExtension):
RegexMatch,
RegexExtract,
RegexReplace,
JsonExtractString,
]
async def comfy_entrypoint() -> StringExtension:

View File

@ -35,6 +35,7 @@ class TextGenerate(io.ComfyNode):
io.Int.Input("max_length", default=256, min=1, max=2048),
io.DynamicCombo.Input("sampling_mode", options=sampling_options, display_name="Sampling Mode"),
io.Boolean.Input("thinking", optional=True, default=False, tooltip="Operate in thinking mode if the model supports it."),
io.Boolean.Input("use_default_template", optional=True, default=True, tooltip="Use the built in system prompt/template if the model has one.", advanced=True),
],
outputs=[
io.String.Output(display_name="generated_text"),
@ -42,9 +43,9 @@ class TextGenerate(io.ComfyNode):
)
@classmethod
def execute(cls, clip, prompt, max_length, sampling_mode, image=None, thinking=False) -> io.NodeOutput:
def execute(cls, clip, prompt, max_length, sampling_mode, image=None, thinking=False, use_default_template=True) -> io.NodeOutput:
tokens = clip.tokenize(prompt, image=image, skip_template=False, min_length=1, thinking=thinking)
tokens = clip.tokenize(prompt, image=image, skip_template=not use_default_template, min_length=1, thinking=thinking)
# Get sampling parameters from dynamic combo
do_sample = sampling_mode.get("sampling_mode") == "on"
@ -160,12 +161,12 @@ class TextGenerateLTX2Prompt(TextGenerate):
)
@classmethod
def execute(cls, clip, prompt, max_length, sampling_mode, image=None, thinking=False) -> io.NodeOutput:
def execute(cls, clip, prompt, max_length, sampling_mode, image=None, thinking=False, use_default_template=True) -> io.NodeOutput:
if image is None:
formatted_prompt = f"<start_of_turn>system\n{LTX2_T2V_SYSTEM_PROMPT.strip()}<end_of_turn>\n<start_of_turn>user\nUser Raw Input Prompt: {prompt}.<end_of_turn>\n<start_of_turn>model\n"
else:
formatted_prompt = f"<start_of_turn>system\n{LTX2_I2V_SYSTEM_PROMPT.strip()}<end_of_turn>\n<start_of_turn>user\n\n<image_soft_token>\n\nUser Raw Input Prompt: {prompt}.<end_of_turn>\n<start_of_turn>model\n"
return super().execute(clip, formatted_prompt, max_length, sampling_mode, image, thinking)
return super().execute(clip, formatted_prompt, max_length, sampling_mode, image, thinking, use_default_template)
class TextgenExtension(ComfyExtension):

View File

@ -1,3 +1,3 @@
# This file is automatically generated by the build process when version is
# updated in pyproject.toml.
__version__ = "0.18.1"
__version__ = "0.19.5"

View File

@ -1,6 +1,6 @@
[project]
name = "ComfyUI"
version = "0.18.1"
version = "0.19.5"
readme = "README.md"
license = { file = "LICENSE" }
requires-python = ">=3.10"

View File

@ -1,5 +1,5 @@
comfyui-frontend-package==1.42.10
comfyui-workflow-templates==0.9.45
comfyui-frontend-package==1.42.14
comfyui-workflow-templates==0.9.61
comfyui-embedded-docs==0.4.3
torch
torchsde