ComfyUI v0.25.1

[Partner Nodes] feat(Kling): add support for Kling V3-Turbo model (#14528)
ComfyUI v0.25.0
2026-06-19 04:57:26 +08:00 · 2026-06-18 00:07:36 +00:00 · 2026-06-18 07:57:37 +08:00 · 2026-06-15 23:45:14 -04:00 · 2026-06-16 11:42:00 +08:00 · 2026-06-15 20:23:09 -07:00
11 changed files with 213 additions and 45 deletions
--- a/comfy/cli_args.py
+++ b/comfy/cli_args.py
@@ -145,6 +145,7 @@ vram_group.add_argument("--novram", action="store_true", help="When lowvram isn'
 vram_group.add_argument("--cpu", action="store_true", help="To use the CPU for everything (slow).")

 parser.add_argument("--reserve-vram", type=float, default=None, help="Set the amount of vram in GB you want to reserve for use by your OS/other software. By default some amount is reserved depending on your OS.")
+parser.add_argument("--vram-headroom", type=float, default=0, help="Set the amount of vram in GB for DynamicVRAM to maintain as extra headroom above default. ComfyUI will try and keep this much VRAM completely free and unused, even counting VRAM from other apps.")

 parser.add_argument("--async-offload", nargs='?', const=2, type=int, default=None, metavar="NUM_STREAMS", help="Use async weight offloading. An optional argument controls the amount of offload streams. Default is 2. Enabled by default on Nvidia.")
 parser.add_argument("--disable-async-offload", action="store_true", help="Disable async weight offloading.")
--- a/comfy_api/latest/_input_impl/video_types.py
+++ b/comfy_api/latest/_input_impl/video_types.py
@@ -325,21 +325,25 @@ class VideoFromFile(VideoInput):
                            checked_alpha = True

                        # Fix non-deterministic video decode when the video width is not a multiple of 32
-                        # For non-yuvj pixel formats (all H.264/H.265 video)
+                        # For non-yuvj pixel formats: most H.264/H.265 video and static images (e.g. lossy WebP via LoadImage)
+                        # Pad both axes to a multiple of 32 and smear the border so the alignment padding never bleeds into the cropped edges
                        if image_format in ('gbrpf32le', 'gbrapf32le') and frame.width % 32 != 0:
                            if align_graph is None:
                                pad_w = ((frame.width + 31) // 32) * 32
+                                pad_h = ((frame.height + 31) // 32) * 32
                                g = av.filter.Graph()
                                g_src = g.add_buffer(width=frame.width, height=frame.height,
                                                     format=frame.format.name, time_base=video_stream.time_base)
-                                g_pad = g.add('pad', f'{pad_w}:{frame.height}:0:0')
+                                g_pad = g.add('pad', f'{pad_w}:{pad_h}:0:0')
+                                g_fill = g.add('fillborders', f'left=0:right={pad_w - frame.width}:top=0:bottom={pad_h - frame.height}:mode=smear')
                                g_sink = g.add('buffersink')
                                g_src.link_to(g_pad)
-                                g_pad.link_to(g_sink)
+                                g_pad.link_to(g_fill)
+                                g_fill.link_to(g_sink)
                                g.configure()
                                align_graph = (g, g_src, g_sink)
                            align_graph[1].push(frame)
-                            img = np.ascontiguousarray(align_graph[2].pull().to_ndarray(format=image_format)[:, :frame.width])
+                            img = np.ascontiguousarray(align_graph[2].pull().to_ndarray(format=image_format)[:frame.height, :frame.width])
                        else:
                            img = frame.to_ndarray(format=image_format)
                        if frame.rotation != 0:
--- a/comfy_api_nodes/apis/kling.py
+++ b/comfy_api_nodes/apis/kling.py
@@ -149,3 +149,59 @@ class MotionControlRequest(BaseModel):
    character_orientation: str = Field(...)
    mode: str = Field(..., description="'pro' or 'std'")
    model_name: str = Field(...)
+
+
+class Kling3TurboSettings(BaseModel):
+    resolution: str = Field("720p", description="'720p' or '1080p'")
+    aspect_ratio: str | None = Field(None, description="'16:9'/'9:16'/'1:1'; text-to-video only")
+    duration: int = Field(5, description="3-15 second")
+
+
+class Kling3TurboText2VideoRequest(BaseModel):
+    prompt: str = Field(..., description="<=3072 chars; may use multi-shot 'shot n, m, words; ...'")
+    settings: Kling3TurboSettings | None = Field(None)
+
+
+class Kling3TurboContent(BaseModel):
+    type: str = Field(..., description="'prompt' or 'first_frame'")
+    text: str | None = Field(None, description="for type=prompt; <=2500 chars")
+    url: str | None = Field(None, description="for type=first_frame")
+
+
+class Kling3TurboImage2VideoRequest(BaseModel):
+    contents: list[Kling3TurboContent] = Field(..., description="prompt + first_frame materials")
+    settings: Kling3TurboSettings | None = Field(None)
+
+
+class Kling3TurboCreateData(BaseModel):
+    id: str | None = Field(None, description="Task ID")
+    status: str | None = Field(None)
+    message: str | None = Field(None)
+
+
+class Kling3TurboCreateResponse(BaseModel):
+    code: int | None = Field(None)
+    message: str | None = Field(None)
+    request_id: str | None = Field(None)
+    data: Kling3TurboCreateData | None = Field(None)
+
+
+class Kling3TurboOutput(BaseModel):
+    type: str | None = Field(None, description="'video', 'image', 'audio', ...")
+    id: str | None = Field(None)
+    url: str | None = Field(None)
+    duration: str | None = Field(None)
+
+
+class Kling3TurboTaskData(BaseModel):
+    id: str | None = Field(None)
+    status: str | None = Field(None, description="submitted | processing | succeeded | failed")
+    message: str | None = Field(None)
+    outputs: list[Kling3TurboOutput] | None = Field(None)
+
+
+class Kling3TurboQueryResponse(BaseModel):
+    code: int | None = Field(None)
+    message: str | None = Field(None)
+    request_id: str | None = Field(None)
+    data: list[Kling3TurboTaskData] | None = Field(None)
--- a/comfy_api_nodes/nodes_kling.py
+++ b/comfy_api_nodes/nodes_kling.py
@@ -60,6 +60,12 @@ from comfy_api_nodes.apis.kling import (
    OmniProImageRequest,
    OmniProReferences2VideoRequest,
    OmniProText2VideoRequest,
+    Kling3TurboSettings,
+    Kling3TurboText2VideoRequest,
+    Kling3TurboContent,
+    Kling3TurboImage2VideoRequest,
+    Kling3TurboCreateResponse,
+    Kling3TurboQueryResponse,
    TaskStatusResponse,
    TextToVideoWithAudioRequest,
 )
@@ -2847,6 +2853,67 @@ class MotionControl(IO.ComfyNode):
        return IO.NodeOutput(await download_url_to_video_output(final_response.data.task_result.videos[0].url))


+def build_turbo_shot_prompt(multi_prompt: list[MultiPromptEntry]) -> str:
+    """Render storyboard entries into the Turbo multi-shot prompt 'shot n, m, words; ...'."""
+    return "; ".join(f"shot {i}, {int(e.duration)}, {e.prompt}" for i, e in enumerate(multi_prompt, 1)) + ";"
+
+
+def _turbo_video_url(response: Kling3TurboQueryResponse) -> str:
+    """Extract the result video URL from a /tasks response (data[].outputs[] where type == 'video')."""
+    task = response.data[0] if response.data else None
+    if task and task.outputs:
+        for output in task.outputs:
+            if output.type == "video" and output.url:
+                return output.url
+    raise RuntimeError(f"Kling 3.0 Turbo task finished without a video output: {response.model_dump()}")
+
+
+async def execute_kling_turbo(
+    cls: type[IO.ComfyNode],
+    *,
+    prompt: str,
+    resolution: str,
+    aspect_ratio: str,
+    duration: int,
+    start_frame: torch.Tensor | None,
+) -> IO.NodeOutput:
+    """Create + poll a Kling 3.0 Turbo task. Image-to-video when start_frame is given, else text-to-video."""
+    if start_frame is not None:
+        validate_image_dimensions(start_frame, min_width=300, min_height=300)
+        validate_image_aspect_ratio(start_frame, (1, 2.5), (2.5, 1))
+        contents = [Kling3TurboContent(type="first_frame", url=tensor_to_base64_string(start_frame))]
+        if prompt:
+            contents.insert(0, Kling3TurboContent(type="prompt", text=prompt))
+        create = await sync_op(
+            cls,
+            ApiEndpoint(path="/proxy/kling/image-to-video/kling-3.0-turbo", method="POST"),
+            response_model=Kling3TurboCreateResponse,
+            data=Kling3TurboImage2VideoRequest(
+                contents=contents,
+                settings=Kling3TurboSettings(resolution=resolution, duration=duration),  # i2v: no aspect_ratio
+            ),
+        )
+    else:
+        create = await sync_op(
+            cls,
+            ApiEndpoint(path="/proxy/kling/text-to-video/kling-3.0-turbo", method="POST"),
+            response_model=Kling3TurboCreateResponse,
+            data=Kling3TurboText2VideoRequest(
+                prompt=prompt,
+                settings=Kling3TurboSettings(resolution=resolution, aspect_ratio=aspect_ratio, duration=duration),
+            ),
+        )
+    if not (create.data and create.data.id):
+        raise RuntimeError(f"Kling 3.0 Turbo create failed. Code: {create.code}, Message: {create.message}")
+    final_response = await poll_op(
+        cls,
+        ApiEndpoint(path="/proxy/kling/tasks", query_params={"task_ids": create.data.id}),
+        response_model=Kling3TurboQueryResponse,
+        status_extractor=lambda r: (r.data[0].status if r.data else None),
+    )
+    return IO.NodeOutput(await download_url_to_video_output(_turbo_video_url(final_response)))
+
+
 class KlingVideoNode(IO.ComfyNode):

    @classmethod
@@ -2884,7 +2951,11 @@ class KlingVideoNode(IO.ComfyNode):
                    ],
                    tooltip="Generate a series of video segments with individual prompts and durations.",
                ),
-                IO.Boolean.Input("generate_audio", default=True),
+                IO.Boolean.Input(
+                    "generate_audio",
+                    default=True,
+                    tooltip="'kling-3.0-turbo' always generates native audio, so the audio toggle is ignored.",
+                ),
                IO.DynamicCombo.Input(
                    "model",
                    options=[
@@ -2899,6 +2970,17 @@ class KlingVideoNode(IO.ComfyNode):
                                ),
                            ],
                        ),
+                        IO.DynamicCombo.Option(
+                            "kling-3.0-turbo",
+                            [
+                                IO.Combo.Input("resolution", options=["1080p", "720p"], default="720p"),
+                                IO.Combo.Input(
+                                    "aspect_ratio",
+                                    options=["16:9", "9:16", "1:1"],
+                                    tooltip="Ignored in image-to-video mode.",
+                                ),
+                            ],
+                        ),
                    ],
                    tooltip="Model and generation settings.",
                ),
@@ -2930,6 +3012,7 @@ class KlingVideoNode(IO.ComfyNode):
            price_badge=IO.PriceBadge(
                depends_on=IO.PriceBadgeDepends(
                    widgets=[
+                        "model",
                        "model.resolution",
                        "generate_audio",
                        "multi_shot",
@@ -2944,14 +3027,7 @@ class KlingVideoNode(IO.ComfyNode):
                ),
                expr="""
                (
-                  $rates := {
-                    "4k": {"off": 0.42, "on": 0.42},
-                    "1080p": {"off": 0.112, "on": 0.168},
-                    "720p": {"off": 0.084, "on": 0.126}
-                  };
                  $res := $lookup(widgets, "model.resolution");
-                  $audio := widgets.generate_audio ? "on" : "off";
-                  $rate := $lookup($lookup($rates, $res), $audio);
                  $ms := widgets.multi_shot;
                  $isSb := $ms != "disabled";
                  $n := $isSb ? $number($substring($ms, 0, 1)) : 0;
@@ -2962,7 +3038,18 @@ class KlingVideoNode(IO.ComfyNode):
                  $d5 := $n >= 5 ? $lookup(widgets, "multi_shot.storyboard_5_duration") : 0;
                  $d6 := $n >= 6 ? $lookup(widgets, "multi_shot.storyboard_6_duration") : 0;
                  $dur := $isSb ? $d1 + $d2 + $d3 + $d4 + $d5 + $d6 : $lookup(widgets, "multi_shot.duration");
-                  {"type":"usd","usd": $rate * $dur}
+                  widgets.model = "kling-3.0-turbo"
+                    ? {"type":"usd","usd": ($res = "1080p" ? 0.14 : 0.112) * $dur}
+                    : (
+                        $rates := {
+                          "4k": {"off": 0.42, "on": 0.42},
+                          "1080p": {"off": 0.112, "on": 0.168},
+                          "720p": {"off": 0.084, "on": 0.126}
+                        };
+                        $audio := widgets.generate_audio ? "on" : "off";
+                        $rate := $lookup($lookup($rates, $res), $audio);
+                        {"type":"usd","usd": $rate * $dur}
+                      )
                )
                """,
            ),
@@ -3015,6 +3102,17 @@ class KlingVideoNode(IO.ComfyNode):
            duration = multi_shot["duration"]
            validate_string(multi_shot["prompt"], min_length=1, max_length=2500)

+        if model["model"] == "kling-3.0-turbo":
+            turbo_prompt = build_turbo_shot_prompt(multi_prompt_list) if custom_multi_shot else multi_shot["prompt"]
+            return await execute_kling_turbo(
+                cls,
+                prompt=turbo_prompt,
+                resolution=model["resolution"],
+                aspect_ratio=model["aspect_ratio"],
+                duration=duration,
+                start_frame=start_frame,
+            )
+
        if start_frame is not None:
            validate_image_dimensions(start_frame, min_width=300, min_height=300)
            validate_image_aspect_ratio(start_frame, (1, 2.5), (2.5, 1))
--- a/comfy_api_nodes/nodes_sonilo.py
+++ b/comfy_api_nodes/nodes_sonilo.py
@@ -111,11 +111,10 @@ class SoniloTextToMusic(IO.ComfyNode):
                ),
                IO.Int.Input(
                    "duration",
-                    default=0,
-                    min=0,
+                    default=30,
+                    min=1,
                    max=360,
-                    tooltip="Target duration in seconds. Set to 0 to let the model "
-                    "infer the duration from the prompt. Maximum: 6 minutes.",
+                    tooltip="Target duration in seconds. Maximum: 6 minutes.",
                ),
                IO.Int.Input(
                    "seed",
@@ -150,14 +149,13 @@ class SoniloTextToMusic(IO.ComfyNode):
    async def execute(
        cls,
        prompt: str,
-        duration: int = 0,
+        duration: int = 1,
        seed: int = 0,
    ) -> IO.NodeOutput:
-        validate_string(prompt, strip_whitespace=True, min_length=1)
+        validate_string(prompt, strip_whitespace=True, min_length=1, max_length=1000)
        form = aiohttp.FormData()
        form.add_field("prompt", prompt)
-        if duration > 0:
-            form.add_field("duration", str(duration))
+        form.add_field("duration", str(duration))
        audio_bytes = await _stream_sonilo_music(
            cls,
            ApiEndpoint(path="/proxy/sonilo/t2m/generate", method="POST"),
--- a/comfy_extras/nodes_rtdetr.py
+++ b/comfy_extras/nodes_rtdetr.py
@@ -14,7 +14,7 @@ class RTDETR_detect(io.ComfyNode):
    def define_schema(cls):
        return io.Schema(
            node_id="RTDETR_detect",
-            display_name="RT-DETR Detect",
+            display_name="Run Real-Time Detection (RT-DETR)",
            category="image/detection",
            search_aliases=["bbox", "bounding box", "object detection", "coco"],
            inputs=[
--- a/comfy_extras/nodes_sam3.py
+++ b/comfy_extras/nodes_sam3.py
@@ -264,7 +264,7 @@ class SAM3_VideoTrack(io.ComfyNode):
    def define_schema(cls):
        return io.Schema(
            node_id="SAM3_VideoTrack",
-            display_name="SAM3 Video Track",
+            display_name="Run SAM3 Video Track",
            category="image/detection",
            search_aliases=["sam3", "video", "track", "propagate"],
            inputs=[
--- a/comfyui_version.py
+++ b/comfyui_version.py
@@ -1,3 +1,3 @@
 # This file is automatically generated by the build process when version is
 # updated in pyproject.toml.
-__version__ = "0.24.0"
+__version__ = "0.25.1"
--- a/main.py
+++ b/main.py
@@ -55,7 +55,11 @@ if __name__ == "__main__" and args.debug_hang:
 import comfy_aimdo.control

 if enables_dynamic_vram():
-    comfy_aimdo.control.init()
+    try:
+        comfy_aimdo.control.init(simple_vram_headroom=None if args.reserve_vram is None else int(args.reserve_vram * 1024 ** 3))
+    except TypeError:
+        # comfy-aimdo 0.4.9 protocol.
+        comfy_aimdo.control.init()

 if os.name == "nt":
    os.environ['MIMALLOC_PURGE_DELAY'] = '0'
@@ -231,23 +235,30 @@ import comfy.model_patcher
 if args.enable_dynamic_vram or (enables_dynamic_vram() and comfy.model_management.is_nvidia() and not comfy.model_management.is_wsl()):
    if (not args.enable_dynamic_vram) and (comfy.model_management.torch_version_numeric < (2, 8)):
        logging.warning("Unsupported Pytorch detected. DynamicVRAM support requires Pytorch version 2.8 or later. Falling back to legacy ModelPatcher. VRAM estimates may be unreliable especially on Windows")
-    elif comfy_aimdo.control.init_devices(d.index for d in comfy.model_management.get_all_torch_devices()):
-        if args.verbose == 'DEBUG':
-            comfy_aimdo.control.set_log_debug()
-        elif args.verbose == 'CRITICAL':
-            comfy_aimdo.control.set_log_critical()
-        elif args.verbose == 'ERROR':
-            comfy_aimdo.control.set_log_error()
-        elif args.verbose == 'WARNING':
-            comfy_aimdo.control.set_log_warning()
-        else: #INFO
-            comfy_aimdo.control.set_log_info()
-
-        comfy.model_patcher.CoreModelPatcher = comfy.model_patcher.ModelPatcherDynamic
-        comfy.memory_management.aimdo_enabled = True
-        logging.info("DynamicVRAM support detected and enabled")
    else:
-        logging.warning("No working comfy-aimdo install detected. DynamicVRAM support disabled. Falling back to legacy ModelPatcher. VRAM estimates may be unreliable especially on Windows")
+        try:
+            aimdo_initialized = comfy_aimdo.control.init_devices((d.index, int(args.vram_headroom * 1024 ** 3)) for d in comfy.model_management.get_all_torch_devices())
+        except TypeError:
+            # comfy-aimdo 0.4.9 protocol.
+            aimdo_initialized = comfy_aimdo.control.init_devices(d.index for d in comfy.model_management.get_all_torch_devices())
+
+        if aimdo_initialized:
+            if args.verbose == 'DEBUG':
+                comfy_aimdo.control.set_log_debug()
+            elif args.verbose == 'CRITICAL':
+                comfy_aimdo.control.set_log_critical()
+            elif args.verbose == 'ERROR':
+                comfy_aimdo.control.set_log_error()
+            elif args.verbose == 'WARNING':
+                comfy_aimdo.control.set_log_warning()
+            else: #INFO
+                comfy_aimdo.control.set_log_info()
+
+            comfy.model_patcher.CoreModelPatcher = comfy.model_patcher.ModelPatcherDynamic
+            comfy.memory_management.aimdo_enabled = True
+            logging.info("DynamicVRAM support detected and enabled")
+        else:
+            logging.warning("No working comfy-aimdo install detected. DynamicVRAM support disabled. Falling back to legacy ModelPatcher. VRAM estimates may be unreliable especially on Windows")


 def cuda_malloc_warning():
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "ComfyUI"
-version = "0.24.0"
+version = "0.25.1"
 readme = "README.md"
 license = { file = "LICENSE" }
 requires-python = ">=3.10"
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,6 +1,6 @@
 comfyui-frontend-package==1.45.15
-comfyui-workflow-templates==0.9.98
-comfyui-embedded-docs==0.5.3
+comfyui-workflow-templates==0.10.0
+comfyui-embedded-docs==0.5.4
 torch
 torchsde
 torchvision
@@ -23,7 +23,7 @@ SQLAlchemy>=2.0.0
 filelock
 av>=16.0.0
 comfy-kitchen==0.2.10
-comfy-aimdo==0.4.9
+comfy-aimdo==0.4.10
 requests
 simpleeval>=1.0.0
 blake3
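The alignment fix in comfy_api/latest/_input_impl/video_types.py rounds both axes up to a multiple of 32, smears the real edge into the padding, and crops back to the source size. Below is a minimal sketch of the arithmetic only, in plain Python with no PyAV dependency. The helper name `aligned_pad_plan` and its return shape are hypothetical and not part of ComfyUI; only the rounding expression and the two filter argument strings mirror the diff above. The real code enters this path only when the width is not a multiple of 32.

```python
def aligned_pad_plan(width: int, height: int, multiple: int = 32) -> dict:
    """Compute padded size and filter arguments for an unaligned frame.

    Hypothetical helper for illustration only; not a ComfyUI function.
    """
    # Round each axis up to the next multiple (same expression as the diff).
    pad_w = ((width + multiple - 1) // multiple) * multiple
    pad_h = ((height + multiple - 1) // multiple) * multiple
    return {
        # Arguments for the 'pad' filter: target width, target height, x, y.
        "pad": f"{pad_w}:{pad_h}:0:0",
        # Arguments for 'fillborders': replicate the real edge into the padding.
        "fillborders": (
            f"left=0:right={pad_w - width}:top=0:bottom={pad_h - height}:mode=smear"
        ),
        # Slice bounds used to crop the decoded array back: [:height, :width].
        "crop": (height, width),
    }


plan = aligned_pad_plan(500, 333)
print(plan["pad"])          # 512:352:0:0
print(plan["fillborders"])  # left=0:right=12:top=0:bottom=19:mode=smear
```

Because 32 is a multiple of the vertical chroma subsampling factors mentioned in the commit message for #14491, the padded height in this sketch never falls below an odd source height.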
Author	SHA1	Message	Date
fen-release[bot]	eca4757d65	ComfyUI v0.25.1	2026-06-18 00:07:36 +00:00
Alexander Piskun	a043f1c8bd	[Partner Nodes] feat(Kling): add support for Kling V3-Turbo model (#14528)	2026-06-18 07:57:37 +08:00
comfyanonymous	135abed8da	ComfyUI v0.25.0	2026-06-15 23:45:14 -04:00
Alexis Rolland	a439dcae07	Update nodes titles (#14417)	2026-06-16 11:42:00 +08:00
John Pollock	5db51b76b4	Fix odd-height crash and edge bleed in unaligned-width image/video decode (#14491) `a1d95f3f` padded the decode width to the next multiple of 32 with the pad filter to fix libswscale's float YUV->GBR edge corruption, but kept the pad target height equal to the source height. The pad filter requires the target height to be a multiple of the input's vertical chroma subsampling factor, so a chroma-subsampled input such as yuv420p (the format the gbrpf32le float branch decodes) with an odd height makes the filter round the target below the input height and fail to configure: 'Padded dimensions cannot be smaller than input dimensions' (Errno 22). This is reachable from LoadImage, which routes static images through VideoFromFile, on a lossy WebP whose width is not a multiple of 32 and whose height is odd. The pad filter also fills the added border with black, and chroma upsampling bleeds that black into the cropped edge of every unaligned-width subsampled decode. Pad both axes to the next multiple of 32 (32 is a multiple of every vertical subsampling factor, including yuv410p's 4 that a plain even rounding misses) and run fillborders mode=smear to replicate the real edge into the padding so it never bleeds into the cropped output, then crop both axes back to the source size. Aligned-width and uint8 paths run the identical to_ndarray call as before and are byte-identical to master; only unaligned-width subsampled inputs change, from a crash or edge artifact to a clean, deterministic decode.	2026-06-15 20:23:09 -07:00
rattus	b13ca1ce7b	main: support fallback to aimdo 0.4.9 (#14489) The aimdo 0.4.10 protocol makes startup fail too early, before the aimdo version warning can be shown, which confuses users. Limp on with 0.4.9, since it still works and users will then see the version warning.	2026-06-15 20:22:24 -07:00
Alexander Piskun	2f4c4e983c	[Partner Nodes] fix(SoniloTextToMusic): always require "duration" to be specified (#14484)	2026-06-16 00:20:01 +08:00
Daxiong (Lin)	83a3f03218	chore: update workflow templates to v0.10.0 (#14482)	2026-06-15 08:06:15 -07:00
rattus	ec4dec93d2	Comfy Aimdo 0.4.10 + Dynamic --reserve-vram + --vram-headroom (#14480) * main: implement --vram-headroom Implement --vram-headroom for dynamic VRAM as a hybrid debug/diagnostic option for people who still report shared VRAM spills. They can adjust the setting by trial and error to maintain a bit more headroom and avoid shared VRAM spills. * main: implement --reserve-vram Implement --reserve-vram as extra headroom on the simple method, which is semantically as close as possible to the stated functionality and former behaviour of non-dynamic VRAM.	2026-06-15 07:54:36 -07:00
Daxiong (Lin)	7d4194d984	chore: update embedded docs to v0.5.4 (#14478)	2026-06-15 16:35:36 +08:00
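The aimdo 0.4.9 fallback from #14489 relies on the older `init()` rejecting the new keyword argument with a `TypeError`. Below is a minimal sketch of that pattern using stand-in functions rather than the real comfy_aimdo API. The names `init_v0410`, `init_v049`, `gb_to_bytes`, and `init_with_fallback` are hypothetical; only the GB-to-bytes conversion and the try/except shape follow the main.py diff.

```python
def gb_to_bytes(gb):
    """Convert a size in GB to whole bytes, passing None through unchanged."""
    return None if gb is None else int(gb * 1024 ** 3)


def init_v0410(simple_vram_headroom=None):
    # Stand-in for a newer init() that accepts the headroom keyword.
    return ("0.4.10", simple_vram_headroom)


def init_v049():
    # Stand-in for an older init() that takes no arguments.
    return ("0.4.9", None)


def init_with_fallback(init, reserve_vram_gb):
    """Try the newer calling convention first, then fall back to the older one."""
    try:
        return init(simple_vram_headroom=gb_to_bytes(reserve_vram_gb))
    except TypeError:
        # The older signature rejects the unexpected keyword argument.
        return init()


print(init_with_fallback(init_v0410, 1.5))  # ('0.4.10', 1610612736)
print(init_with_fallback(init_v049, 1.5))   # ('0.4.9', None)
```

One trade-off of this pattern is that a `TypeError` raised inside a newer-style `init()` for an unrelated reason would also trigger the legacy call.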