[Partner Nodes] chore(StabilityAI): remove StabilityAI nodes (#14737)

Signed-off-by: bigcat88 <bigcat88@icloud.com>
chore: update workflow templates to v0.11.2 (#14741)
2026-07-03 19:27:08 +08:00 · 2026-07-03 14:24:21 +03:00 · 2026-07-03 19:08:11 +08:00 · 2026-07-03 14:00:10 +03:00 · 2026-07-03 02:08:47 -04:00 · 2026-07-02 20:44:54 -07:00
8 changed files with 383 additions and 1083 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -177,10 +177,21 @@
 - Do not use tensors as general-purpose Python data structures. Keep metadata,
  bookkeeping, counters, flags, shape math, padding math, index planning, memory
  estimates, and control-flow decisions in plain Python values unless the data
-  must participate directly in tensor computation. Avoid creating temporary
-  tensors just to use tensor methods for scalar or structural calculations.
+  must participate directly in tensor computation. Do not create tensors for
+  structural metadata that is only used for Python-side control flow. Sequence
+  lengths, cumulative offsets, split indices, window counts, slice boundaries,
+  and repeat counts should be kept as Python ints/lists from the point they are
+  computed. Do not build them as CPU/GPU tensors and then cast, move, validate,
+  or convert them back to Python for `split`, `tensor_split`, indexing plans,
+  loops, or cache keys. Avoid creating temporary tensors just to use tensor
+  methods for scalar or structural calculations.
 - Avoid unnecessary casts and transfers. Preserve the intended compute dtype,
  storage dtype, bias dtype, and original tensor shape metadata.
+- Keep model-native latent layout handling inside the model or latent-format
+  owner, not in helper nodes. Do not collapse, expand, pack, or unpack latent
+  dimensions in nodes or other caller-side adapters just to satisfy a model
+  forward; the model path should consume and return the native latent shape for
+  that model family.
 - Assume inputs to the main model forward are already in the compute dtype by
  default, except integer inputs such as some model timestep tensors. Do not add
  defensive or convenience casts in model code; it is better for invalid dtype
@@ -244,6 +255,14 @@
 - Model implementations should add the minimal number of ComfyUI nodes required
  to run the model. Reuse existing nodes as much as possible; adapting the model
  to work with existing nodes is strongly preferred over creating new nodes.
+- Nodes should output only values they own. Do not add pass-through outputs for
+  workflow convenience unless the node is explicitly an output node. Existing
+  models, latents, conditioning, or other inputs should flow directly to the
+  next consumer instead of being re-emitted unchanged.
+- Nodes should expose only inputs they actually read to produce current
+  behavior. Do not add placeholder, pass-through, compatibility, or
+  workflow-shaping inputs that are ignored or could flow directly to another
+  node.
 - Node-level code must not patch model code directly. Any node behavior that
  modifies, wraps, hooks, or changes model behavior must go through the model
  patcher class instead of reaching into model internals.
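The structural-metadata rule added to AGENTS.md above can be illustrated with a minimal sketch. This is not ComfyUI code: `split_plan` and `split_by_lengths` are hypothetical helpers, and a plain Python list stands in for a tensor so the example runs without torch. The point it tries to show is that lengths and cumulative offsets stay Python ints from the moment they are computed.

```python
from itertools import accumulate


def split_plan(lengths: list[int]) -> list[tuple[int, int]]:
    """Return (start, end) slice boundaries as plain Python ints (hypothetical helper)."""
    ends = list(accumulate(lengths))   # cumulative offsets, kept as Python ints
    starts = [0] + ends[:-1]           # each chunk starts where the previous one ended
    return list(zip(starts, ends))


def split_by_lengths(sequence, lengths: list[int]) -> list:
    """Slice any sliceable sequence using the precomputed Python-side plan."""
    if sum(lengths) != len(sequence):
        raise ValueError("lengths must sum to the sequence length")
    return [sequence[start:end] for start, end in split_plan(lengths)]


tokens = list(range(9))  # stand-in for a tensor's leading dimension (assumption)
chunks = split_by_lengths(tokens, [3, 2, 4])
```

With a real tensor, the same list of ints could be handed directly to a split call, so no tensor is created, moved, or converted back just to plan the slices.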
--- a/comfy_api_nodes/apis/bytedance.py
+++ b/comfy_api_nodes/apis/bytedance.py
@@ -1,4 +1,4 @@
-from typing import Literal
+from typing import Any, Literal

 from pydantic import BaseModel, Field

@@ -316,3 +316,36 @@ VIDEO_TASKS_EXECUTION_TIME = {
        "1080p": 150,
    },
 }
+
+
+class SeedAudioConfig(BaseModel):
+    format: str = Field(default="mp3")
+    sample_rate: int = Field(default=24000)
+    speech_rate: int = Field(default=0)
+    loudness_rate: int = Field(default=0)
+    pitch_rate: int = Field(default=0)
+
+
+class SeedAudioReference(BaseModel):
+    speaker: str | None = Field(default=None)
+    audio_data: str | None = Field(default=None)
+    audio_url: str | None = Field(default=None)
+    image_data: str | None = Field(default=None)
+    image_url: str | None = Field(default=None)
+
+
+class SeedAudioRequest(BaseModel):
+    model: str = Field(default="seed-audio-1.0")
+    text_prompt: str = Field(...)
+    references: list[SeedAudioReference] | None = Field(default=None)
+    audio_config: SeedAudioConfig = Field(default_factory=SeedAudioConfig)
+    watermark: dict[str, Any] = Field(default_factory=dict)
+
+
+class SeedAudioResponse(BaseModel):
+    audio: str | None = Field(default=None)
+    url: str | None = Field(default=None)
+    duration: float | None = Field(default=None)
+    original_duration: float | None = Field(default=None)
+    code: int | None = Field(default=None)
+    message: str | None = Field(default=None)
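As a rough illustration of how the `SeedAudioRequest` fields added above fit together, the sketch below builds an equivalent request body from plain dicts. It is a stand-in, not the Pydantic models themselves: `build_seed_audio_payload` is a hypothetical helper, the field names and defaults are copied from the diff, and the exact wire format the proxy expects (for example, how `None` fields are serialized) is an assumption.

```python
import base64
import json


def build_seed_audio_payload(text_prompt: str, reference_clips: list[bytes]) -> dict:
    """Assemble a request body using the field names and defaults of SeedAudioRequest."""
    # Each reference clip is carried as base64 text in `audio_data`.
    references = [
        {"audio_data": base64.b64encode(clip).decode("utf-8")}
        for clip in reference_clips
    ]
    return {
        "model": "seed-audio-1.0",
        "text_prompt": text_prompt,
        "references": references or None,  # None when no reference is supplied
        "audio_config": {
            "format": "mp3",
            "sample_rate": 24000,
            "speech_rate": 0,
            "loudness_rate": 0,
            "pitch_rate": 0,
        },
        "watermark": {},
    }


# The clip bytes here are placeholder data, not a real MP3.
payload = build_seed_audio_payload("@Audio1 reads: hello there.", [b"fake-mp3-bytes"])
body = json.dumps(payload)
```

The node in this change builds the same structure through the typed models, which keeps the defaults in one place instead of repeating them at each call site.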
--- a/comfy_api_nodes/apis/stability.py
+++ b/comfy_api_nodes/apis/stability.py
@@ -1,147 +0,0 @@
-from enum import Enum
-from typing import Optional
-
-from pydantic import BaseModel, Field, confloat
-
-
-class StabilityFormat(str, Enum):
-    png = 'png'
-    jpeg = 'jpeg'
-    webp = 'webp'
-
-
-class StabilityAspectRatio(str, Enum):
-    ratio_1_1 = "1:1"
-    ratio_16_9 = "16:9"
-    ratio_9_16 = "9:16"
-    ratio_3_2 = "3:2"
-    ratio_2_3 = "2:3"
-    ratio_5_4 = "5:4"
-    ratio_4_5 = "4:5"
-    ratio_21_9 = "21:9"
-    ratio_9_21 = "9:21"
-
-
-def get_stability_style_presets(include_none=True):
-    presets = []
-    if include_none:
-        presets.append("None")
-    return presets + [x.value for x in StabilityStylePreset]
-
-
-class StabilityStylePreset(str, Enum):
-    _3d_model = "3d-model"
-    analog_film = "analog-film"
-    anime = "anime"
-    cinematic = "cinematic"
-    comic_book = "comic-book"
-    digital_art = "digital-art"
-    enhance = "enhance"
-    fantasy_art = "fantasy-art"
-    isometric = "isometric"
-    line_art = "line-art"
-    low_poly = "low-poly"
-    modeling_compound = "modeling-compound"
-    neon_punk = "neon-punk"
-    origami = "origami"
-    photographic = "photographic"
-    pixel_art = "pixel-art"
-    tile_texture = "tile-texture"
-
-
-class Stability_SD3_5_Model(str, Enum):
-    sd3_5_large = "sd3.5-large"
-    # sd3_5_large_turbo = "sd3.5-large-turbo"
-    sd3_5_medium = "sd3.5-medium"
-
-
-class Stability_SD3_5_GenerationMode(str, Enum):
-    text_to_image = "text-to-image"
-    image_to_image = "image-to-image"
-
-
-class StabilityStable3_5Request(BaseModel):
-    model: str = Field(...)
-    mode: str = Field(...)
-    prompt: str = Field(...)
-    negative_prompt: Optional[str] = Field(None)
-    aspect_ratio: Optional[str] = Field(None)
-    seed: Optional[int] = Field(None)
-    output_format: Optional[str] = Field(StabilityFormat.png.value)
-    image: Optional[str] = Field(None)
-    style_preset: Optional[str] = Field(None)
-    cfg_scale: float = Field(...)
-    strength: Optional[confloat(ge=0.0, le=1.0)] = Field(None)
-
-
-class StabilityUpscaleConservativeRequest(BaseModel):
-    prompt: str = Field(...)
-    negative_prompt: Optional[str] = Field(None)
-    seed: Optional[int] = Field(None)
-    output_format: Optional[str] = Field(StabilityFormat.png.value)
-    image: Optional[str] = Field(None)
-    creativity: Optional[confloat(ge=0.2, le=0.5)] = Field(None)
-
-
-class StabilityUpscaleCreativeRequest(BaseModel):
-    prompt: str = Field(...)
-    negative_prompt: Optional[str] = Field(None)
-    seed: Optional[int] = Field(None)
-    output_format: Optional[str] = Field(StabilityFormat.png.value)
-    image: Optional[str] = Field(None)
-    creativity: Optional[confloat(ge=0.1, le=0.5)] = Field(None)
-    style_preset: Optional[str] = Field(None)
-
-
-class StabilityStableUltraRequest(BaseModel):
-    prompt: str = Field(...)
-    negative_prompt: Optional[str] = Field(None)
-    aspect_ratio: Optional[str] = Field(None)
-    seed: Optional[int] = Field(None)
-    output_format: Optional[str] = Field(StabilityFormat.png.value)
-    image: Optional[str] = Field(None)
-    style_preset: Optional[str] = Field(None)
-    strength: Optional[confloat(ge=0.0, le=1.0)] = Field(None)
-
-
-class StabilityStableUltraResponse(BaseModel):
-    image: Optional[str] = Field(None)
-    finish_reason: Optional[str] = Field(None)
-    seed: Optional[int] = Field(None)
-
-
-class StabilityResultsGetResponse(BaseModel):
-    image: Optional[str] = Field(None)
-    finish_reason: Optional[str] = Field(None)
-    seed: Optional[int] = Field(None)
-    id: Optional[str] = Field(None)
-    name: Optional[str] = Field(None)
-    errors: Optional[list[str]] = Field(None)
-    status: Optional[str] = Field(None)
-    result: Optional[str] = Field(None)
-
-
-class StabilityAsyncResponse(BaseModel):
-    id: Optional[str] = Field(None)
-
-
-class StabilityTextToAudioRequest(BaseModel):
-    model: str = Field(...)
-    prompt: str = Field(...)
-    duration: int = Field(190, ge=1, le=190)
-    seed: int = Field(0, ge=0, le=4294967294)
-    steps: int = Field(8, ge=4, le=8)
-    output_format: str = Field("wav")
-
-
-class StabilityAudioToAudioRequest(StabilityTextToAudioRequest):
-    strength: float = Field(0.01, ge=0.01, le=1.0)
-
-
-class StabilityAudioInpaintRequest(StabilityTextToAudioRequest):
-    mask_start: int = Field(30, ge=0, le=190)
-    mask_end: int = Field(190, ge=0, le=190)
-
-
-class StabilityAudioResponse(BaseModel):
-    audio: Optional[str] = Field(None)
--- a/comfy_api_nodes/nodes_bytedance.py
+++ b/comfy_api_nodes/nodes_bytedance.py
@@ -1,3 +1,4 @@
+import base64
 import hashlib
 import logging
 import math
@@ -20,6 +21,10 @@ from comfy_api_nodes.apis.bytedance import (
    GetAssetResponse,
    Image2VideoTaskCreationRequest,
    ImageTaskCreationResponse,
+    SeedAudioConfig,
+    SeedAudioReference,
+    SeedAudioRequest,
+    SeedAudioResponse,
    Seedance2TaskCreationRequest,
    SeedanceCreateAssetRequest,
    SeedanceCreateAssetResponse,
@@ -43,6 +48,8 @@ from comfy_api_nodes.apis.bytedance import (
 )
 from comfy_api_nodes.util import (
    ApiEndpoint,
+    audio_bytes_to_audio_input,
+    audio_input_to_mp3,
    download_url_to_image_tensor,
    download_url_to_video_output,
    downscale_image_tensor_by_max_side,
@@ -51,11 +58,14 @@ from comfy_api_nodes.util import (
    image_tensor_pair_to_batch,
    poll_op,
    sync_op,
+    tensor_to_base64_string,
    upload_audio_to_comfyapi,
    upload_image_to_comfyapi,
    upload_images_to_comfyapi,
    upload_video_to_comfyapi,
+    upscale_image_tensor_to_min_pixels,
    upscale_video_to_min_pixels,
+    validate_audio_duration,
    validate_image_aspect_ratio,
    validate_image_dimensions,
    validate_string,
@@ -2474,6 +2484,311 @@ class ByteDanceCreateVideoAsset(IO.ComfyNode):
        return IO.NodeOutput(asset_id, resolved_group)


+MODE_TEXT = "text only"
+MODE_AUDIO = "audio reference"
+MODE_IMAGE = "image reference"
+MODE_SPEAKER = "preset voice"
+
+# (speaker_id, display_label) for built-in TTS 2.0 voices; resolvable ids are account-scoped.
+SEED_AUDIO_PRESET_VOICES: list[tuple[str, str]] = [
+    ("zh_female_vv_uranus_bigtts", "Vivi (Female, multilingual)"),
+    ("zh_female_xiaohe_uranus_bigtts", "Mindy (Female, multilingual)"),
+    ("en_female_stokie_uranus_bigtts", "Stokie (Female, English)"),
+    ("en_female_dacey_uranus_bigtts", "Dacey (Female, English)"),
+    ("en_male_tim_uranus_bigtts", "Tim (Male, English)"),
+    ("zh_male_m191_uranus_bigtts", "Kian (Male, multilingual)"),
+    ("zh_male_taocheng_uranus_bigtts", "Cedric (Male, multilingual)"),
+    ("zh_male_sophie_uranus_bigtts", "Sophie (Female, multilingual)"),
+    ("zh_female_yingyujiaoxue_uranus_bigtts", "Jean (Female, multilingual)"),
+    ("zh_male_dayi_uranus_bigtts", "Magnus (Male, multilingual)"),
+    ("zh_female_mizai_uranus_bigtts", "Mabel (Female, multilingual)"),
+    ("zh_female_jitangnv_uranus_bigtts", "Nadia (Female, multilingual)"),
+    ("zh_female_meilinvyou_uranus_bigtts", "Opal (Female, multilingual)"),
+    ("zh_female_liuchangnv_uranus_bigtts", "Pearl (Female, multilingual)"),
+    ("zh_male_ruyayichen_uranus_bigtts", "Quentin (Male, multilingual)"),
+    ("zh_female_vivo_uranus_bigtts", "Vienna (Female, multilingual)"),
+    ("zh_female_xiaoai_uranus_bigtts", "Alina (Female, multilingual)"),
+    ("zh_female_cancan_uranus_bigtts", "Corinne (Female, multilingual)"),
+    ("zh_female_tianmeixiaoyuan_uranus_bigtts", "Esther (Female, multilingual)"),
+    ("zh_female_tianmeitaozi_uranus_bigtts", "Freya (Female, multilingual)"),
+    ("zh_female_shuangkuaisisi_uranus_bigtts", "Gigi (Female, multilingual)"),
+    ("zh_female_peiqi_uranus_bigtts", "Holly (Female, multilingual)"),
+    ("zh_female_xiaoxue_uranus_bigtts", "Lyla (Female, multilingual)"),
+    ("zh_female_yuanqi_uranus_bigtts", "Daisy (Female, multilingual)"),
+    ("zh_female_kefunvsheng_uranus_bigtts", "Tracy (Female, multilingual)"),
+    ("zh_male_shaonianzixin_uranus_bigtts", "Jess (Male, multilingual)"),
+    ("zh_female_linjianvhai_uranus_bigtts", "Pinky (Female, multilingual)"),
+    ("zh_female_kiwi_uranus_bigtts", "Sweety (Female, multilingual)"),
+    ("zh_female_sajiaoxuemei_uranus_bigtts", "Sandy (Female, multilingual)"),
+    ("de_male_seven_uranus_bigtts", "Sven (Male, German)"),
+    ("jp_female_minimi_uranus_bigtts", "Minimi (Female, Japanese)"),
+    ("fr_male_usseau_uranus_bigtts", "Usseau (Male, French)"),
+    ("es_male_felipe_uranus_bigtts", "Felipe (Male, Spanish)"),
+    ("id_male_han_uranus_bigtts", "Han (Male, Indonesian)"),
+    ("pt_male_martins_uranus_bigtts", "Martins (Male, Portuguese)"),
+    ("it_male_enzo_uranus_bigtts", "Enzo (Male, Italian)"),
+    ("kr_male_shane_uranus_bigtts", "Shane (Male, Korean)"),
+    ("zh_male_liufei_uranus_bigtts", "Felix (Male, Chinese)"),
+    ("zh_female_qingxinnvsheng_uranus_bigtts", "Celeste (Female, Chinese)"),
+    ("zh_male_sunwukong_uranus_bigtts", "Monkey King (Male, Chinese)"),
+]
+SEED_AUDIO_VOICE_OPTIONS = [label for _, label in SEED_AUDIO_PRESET_VOICES]
+SEED_AUDIO_VOICE_MAP = {label: speaker_id for speaker_id, label in SEED_AUDIO_PRESET_VOICES}
+
+_AUDIO_TAG_RE = re.compile(r"@Audio(\d+)", re.IGNORECASE)
+
+
+def max_audio_tag(prompt: str) -> int:
+    """Highest N referenced as @AudioN in the prompt (0 if none)."""
+    nums = [int(m) for m in _AUDIO_TAG_RE.findall(prompt or "")]
+    return max(nums) if nums else 0
+
+
+def connected_audio_indices(reference_mode: dict) -> list[int]:
+    """Indices (1-based) of connected reference_audio sockets, in order."""
+    return [
+        i
+        for i in range(1, 3 + 1)
+        if reference_mode.get(f"reference_audio_{i}") is not None
+    ]
+
+
+def validate_seed_audio_inputs(
+    text_prompt: str,
+    mode: str,
+    audio_indices: list[int],
+    has_image: bool,
+    preset_voice: str | None = None,
+) -> None:
+    validate_string(text_prompt, field_name="text_prompt", min_length=1, max_length=3000)
+    max_tag = max_audio_tag(text_prompt)
+
+    if mode == MODE_TEXT:
+        if max_tag:
+            raise ValueError(
+                f"The prompt references @Audio{max_tag}, but reference mode is '{MODE_TEXT}'. "
+                f"Switch to '{MODE_AUDIO}' and connect the reference clip(s)."
+            )
+    elif mode == MODE_AUDIO:
+        if not audio_indices:
+            raise ValueError(
+                f"Reference mode '{MODE_AUDIO}' requires at least one reference_audio input "
+                f"(or switch to '{MODE_TEXT}')."
+            )
+        if audio_indices != list(range(1, len(audio_indices) + 1)):
+            raise ValueError(
+                "Connect reference_audio inputs in order without gaps: reference_audio_1, then _2, then _3."
+            )
+        if max_tag > len(audio_indices):
+            raise ValueError(
+                f"The prompt references @Audio{max_tag}, but only {len(audio_indices)} "
+                f"reference audio(s) are connected."
+            )
+    elif mode == MODE_IMAGE:
+        if not has_image:
+            raise ValueError(f"Reference mode '{MODE_IMAGE}' requires a reference_image input.")
+        if max_tag:
+            raise ValueError(
+                f"@AudioN tags are not used in '{MODE_IMAGE}' mode; the prompt should contain "
+                f"only the text to synthesize."
+            )
+    elif mode == MODE_SPEAKER:
+        if not preset_voice or preset_voice not in SEED_AUDIO_VOICE_MAP:
+            raise ValueError(f"Reference mode '{MODE_SPEAKER}' requires selecting a preset voice.")
+        if max_tag > 1:
+            raise ValueError(
+                f"'{MODE_SPEAKER}' mode uses a single voice, so @Audio{max_tag} is out of range. "
+                f"Remove the @AudioN tags — the whole prompt is read in the selected voice."
+            )
+    else:
+        raise ValueError(f"Unknown reference mode: {mode!r}")
+
+
+class ByteDanceSeedAudioNode(IO.ComfyNode):
+
+    @classmethod
+    def define_schema(cls) -> IO.Schema:
+        return IO.Schema(
+            node_id="ByteDanceSeedAudio",
+            display_name="ByteDance Seed Audio 1.0",
+            category="api node/audio/ByteDance",
+            description=(
+                "Generate speech, music, sound effects and multi-speaker dialogue from a single prompt "
+                "with ByteDance Seed Audio 1.0. Describe the voice(s), emotion, ambience, background music "
+                "and sound effects in the prompt, and include the lines to speak. Optionally pick a built-in "
+                "preset voice, clone voices from up to 3 reference clips (tagged @Audio1-3 in the prompt), "
+                "or derive a voice from a character image. Up to 2 minutes of audio per run."
+            ),
+            inputs=[
+                IO.String.Input(
+                    "text_prompt",
+                    multiline=True,
+                    default="",
+                    tooltip=(
+                        "Describe the voice(s), emotion, pacing, ambience, background music and sound "
+                        "effects, and include the lines to speak (name characters inline for dialogue). "
+                        "In 'audio reference' mode, refer to connected clips by order as @Audio1, @Audio2, "
+                        "@Audio3. Maximum 3000 characters."
+                    ),
+                ),
+                IO.DynamicCombo.Input(
+                    "reference_mode",
+                    options=[
+                        IO.DynamicCombo.Option(MODE_TEXT, []),
+                        IO.DynamicCombo.Option(
+                            MODE_AUDIO,
+                            [
+                                IO.Audio.Input(
+                                    "reference_audio_1",
+                                    optional=True,
+                                    tooltip="Reference clip for voice cloning, tagged @Audio1 in the prompt. "
+                                    "Up to 30s.",
+                                ),
+                                IO.Audio.Input(
+                                    "reference_audio_2",
+                                    optional=True,
+                                    tooltip="Reference clip tagged @Audio2 in the prompt. Up to 30s.",
+                                ),
+                                IO.Audio.Input(
+                                    "reference_audio_3",
+                                    optional=True,
+                                    tooltip="Reference clip tagged @Audio3 in the prompt. Up to 30s.",
+                                ),
+                            ],
+                        ),
+                        IO.DynamicCombo.Option(
+                            MODE_IMAGE,
+                            [
+                                IO.Image.Input(
+                                    "reference_image",
+                                    optional=True,
+                                    tooltip="A single character image; the model derives a voice from it. "
+                                    "Cannot be combined with reference audio.",
+                                ),
+                            ],
+                        ),
+                        IO.DynamicCombo.Option(
+                            MODE_SPEAKER,
+                            [
+                                IO.Combo.Input(
+                                    "preset_voice",
+                                    options=SEED_AUDIO_VOICE_OPTIONS,
+                                    default=SEED_AUDIO_VOICE_OPTIONS[0],
+                                    tooltip="A built-in TTS 2.0 voice that reads the prompt. No reference "
+                                    "clip needed, and @AudioN tags are not used in this mode.",
+                                ),
+                            ],
+                        ),
+                    ],
+                    tooltip=(
+                        "How to condition the voice: 'text only' (describe everything in the prompt), "
+                        "'audio reference' (clone up to 3 voices, tagged @Audio1-3), 'image reference' "
+                        "(derive a voice from one character image), or 'preset voice' (pick a built-in "
+                        "named voice that reads the prompt)."
+                    ),
+                ),
+                IO.Combo.Input(
+                    "sample_rate",
+                    options=["8000", "16000", "24000", "32000", "44100", "48000"],
+                    default="24000",
+                    tooltip="Output sample rate in Hz.",
+                ),
+                IO.Int.Input(
+                    "speech_rate",
+                    default=0,
+                    min=-50,
+                    max=100,
+                    tooltip="Speaking speed. 0 = normal, 100 = 2.0x, -50 = 0.5x.",
+                ),
+                IO.Int.Input(
+                    "loudness_rate",
+                    default=0,
+                    min=-50,
+                    max=100,
+                    tooltip="Loudness. 0 = normal, 100 = 2.0x, -50 = 0.5x.",
+                ),
+                IO.Int.Input(
+                    "pitch_rate",
+                    default=0,
+                    min=-12,
+                    max=12,
+                    tooltip="Pitch shift in semitones (-12 to 12).",
+                ),
+                IO.Int.Input(
+                    "seed",
+                    default=42,
+                    min=0,
+                    max=2147483647,
+                    control_after_generate=True,
+                    tooltip="Seed controls whether the node should re-run; "
+                    "results are non-deterministic regardless of seed.",
+                ),
+            ],
+            outputs=[IO.Audio.Output()],
+            hidden=[
+                IO.Hidden.auth_token_comfy_org,
+                IO.Hidden.api_key_comfy_org,
+                IO.Hidden.unique_id,
+            ],
+            is_api_node=True,
+            price_badge=IO.PriceBadge(
+                expr="""{"type":"usd","usd": 0.2145, "format":{"suffix":"/minute","approximate":true}}""",
+            ),
+        )
+
+    @classmethod
+    async def execute(
+        cls,
+        text_prompt: str,
+        reference_mode: dict,
+        sample_rate: str,
+        speech_rate: int,
+        loudness_rate: int,
+        pitch_rate: int,
+        seed: int,
+    ) -> IO.NodeOutput:
+        mode = reference_mode["reference_mode"]
+        audio_indices = connected_audio_indices(reference_mode)
+        image = reference_mode.get("reference_image")
+        preset_voice = reference_mode.get("preset_voice")
+        validate_seed_audio_inputs(text_prompt, mode, audio_indices, image is not None, preset_voice)
+
+        references: list[SeedAudioReference] | None = None
+        if mode == MODE_AUDIO:
+            references = []
+            for i in audio_indices:
+                clip = reference_mode[f"reference_audio_{i}"]
+                validate_audio_duration(clip, max_duration=30.0)
+                mp3_bytes = audio_input_to_mp3(clip).getvalue()
+                references.append(SeedAudioReference(audio_data=base64.b64encode(mp3_bytes).decode("utf-8")))
+        elif mode == MODE_IMAGE:
+            image = upscale_image_tensor_to_min_pixels(image, 160_000)
+            references = [SeedAudioReference(image_data=tensor_to_base64_string(image, mime_type="image/png"))]
+        elif mode == MODE_SPEAKER:
+            references = [SeedAudioReference(speaker=SEED_AUDIO_VOICE_MAP[preset_voice])]
+
+        response = await sync_op(
+            cls,
+            ApiEndpoint(path="/proxy/byteplus/api/v3/tts/create", method="POST"),
+            response_model=SeedAudioResponse,
+            data=SeedAudioRequest(
+                text_prompt=text_prompt,
+                references=references,
+                audio_config=SeedAudioConfig(
+                    sample_rate=int(sample_rate),
+                    speech_rate=speech_rate,
+                    loudness_rate=loudness_rate,
+                    pitch_rate=pitch_rate,
+                ),
+            ),
+        )
+        if not response.audio:
+            raise Exception(
+                f"Seed Audio returned no audio (code={response.code}): {response.message}"
+            )
+        return IO.NodeOutput(audio_bytes_to_audio_input(base64.b64decode(response.audio)))
+
+
 class ByteDanceExtension(ComfyExtension):
    @override
    async def get_node_list(self) -> list[type[IO.ComfyNode]]:
@@ -2490,6 +2805,7 @@ class ByteDanceExtension(ComfyExtension):
            ByteDance2ReferenceNode,
            ByteDanceCreateImageAsset,
            ByteDanceCreateVideoAsset,
+            ByteDanceSeedAudioNode,
        ]


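The `@AudioN` checks in the 'audio reference' branch of `validate_seed_audio_inputs` can be exercised in isolation. In the sketch below, the regex and `max_audio_tag` are copied from the diff above; `check_audio_references` is a hypothetical condensation of that branch that returns a short message instead of raising, purely so the behavior is easy to inspect.

```python
import re
from typing import Optional

_AUDIO_TAG_RE = re.compile(r"@Audio(\d+)", re.IGNORECASE)


def max_audio_tag(prompt: str) -> int:
    """Highest N referenced as @AudioN in the prompt (0 if none)."""
    nums = [int(m) for m in _AUDIO_TAG_RE.findall(prompt or "")]
    return max(nums) if nums else 0


def check_audio_references(prompt: str, connected: list[int]) -> Optional[str]:
    """Return an error message for the 'audio reference' checks, or None if they pass.

    Hypothetical helper: `connected` holds the 1-based indices of connected sockets.
    """
    if not connected:
        return "at least one reference_audio input is required"
    # Sockets must be filled as 1, 2, 3 with no gaps.
    if connected != list(range(1, len(connected) + 1)):
        return "connect reference_audio inputs in order without gaps"
    # Every tag in the prompt must point at a connected clip.
    if max_audio_tag(prompt) > len(connected):
        return "the prompt references a clip that is not connected"
    return None
```

For example, a prompt mentioning `@Audio2` with only `reference_audio_1` connected is rejected, and connecting sockets 1 and 3 while leaving 2 empty fails the gap check before the tags are even compared.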
--- a/comfy_api_nodes/nodes_stability.py
+++ b/comfy_api_nodes/nodes_stability.py
@@ -1,932 +0,0 @@
-from inspect import cleandoc
-from typing import Optional
-from typing_extensions import override
-
-from comfy_api.latest import ComfyExtension, Input, IO
-from comfy_api_nodes.apis.stability import (
-    StabilityUpscaleConservativeRequest,
-    StabilityUpscaleCreativeRequest,
-    StabilityAsyncResponse,
-    StabilityResultsGetResponse,
-    StabilityStable3_5Request,
-    StabilityStableUltraRequest,
-    StabilityStableUltraResponse,
-    StabilityAspectRatio,
-    Stability_SD3_5_Model,
-    Stability_SD3_5_GenerationMode,
-    get_stability_style_presets,
-    StabilityTextToAudioRequest,
-    StabilityAudioToAudioRequest,
-    StabilityAudioInpaintRequest,
-    StabilityAudioResponse,
-)
-from comfy_api_nodes.util import (
-    validate_audio_duration,
-    validate_string,
-    audio_input_to_mp3,
-    bytesio_to_image_tensor,
-    tensor_to_bytesio,
-    audio_bytes_to_audio_input,
-    sync_op,
-    poll_op,
-    ApiEndpoint,
-)
-
-import torch
-import base64
-from io import BytesIO
-from enum import Enum
-
-
-class StabilityPollStatus(str, Enum):
-    finished = "finished"
-    in_progress = "in_progress"
-    failed = "failed"
-
-
-def get_async_dummy_status(x: StabilityResultsGetResponse):
-    if x.name is not None or x.errors is not None:
-        return StabilityPollStatus.failed
-    elif x.finish_reason is not None:
-        return StabilityPollStatus.finished
-    return StabilityPollStatus.in_progress
-
-
-class StabilityStableImageUltraNode(IO.ComfyNode):
-    """
-    Generates images synchronously based on prompt and resolution.
-    """
-
-    @classmethod
-    def define_schema(cls):
-        return IO.Schema(
-            node_id="StabilityStableImageUltraNode",
-            display_name="Stability AI Stable Image Ultra",
-            category="partner/image/Stability AI",
-            description=cleandoc(cls.__doc__ or ""),
-            inputs=[
-                IO.String.Input(
-                    "prompt",
-                    multiline=True,
-                    default="",
-                    tooltip="What you wish to see in the output image. A strong, descriptive prompt that clearly defines" +
-                                    "elements, colors, and subjects will lead to better results. " +
-                                    "To control the weight of a given word use the format `(word:weight)`," +
-                                    "where `word` is the word you'd like to control the weight of and `weight`" +
-                                    "is a value between 0 and 1. For example: `The sky was a crisp (blue:0.3) and (green:0.8)`" +
-                                    "would convey a sky that was blue and green, but more green than blue.",
-                ),
-                IO.Combo.Input(
-                    "aspect_ratio",
-                    options=StabilityAspectRatio,
-                    default=StabilityAspectRatio.ratio_1_1,
-                    tooltip="Aspect ratio of generated image.",
-                ),
-                IO.Combo.Input(
-                    "style_preset",
-                    options=get_stability_style_presets(),
-                    tooltip="Optional desired style of generated image.",
-                    advanced=True,
-                ),
-                IO.Int.Input(
-                    "seed",
-                    default=0,
-                    min=0,
-                    max=4294967294,
-                    step=1,
-                    display_mode=IO.NumberDisplay.number,
-                    control_after_generate=True,
-                    tooltip="The random seed used for creating the noise.",
-                ),
-                IO.Image.Input(
-                    "image",
-                    optional=True,
-                ),
-                IO.String.Input(
-                    "negative_prompt",
-                    default="",
-                    tooltip="A blurb of text describing what you do not wish to see in the output image. This is an advanced feature.",
-                    force_input=True,
-                    optional=True,
-                    advanced=True,
-                ),
-                IO.Float.Input(
-                    "image_denoise",
-                    default=0.5,
-                    min=0.0,
-                    max=1.0,
-                    step=0.01,
-                    tooltip="Denoise of input image; 0.0 yields image identical to input, 1.0 is as if no image was provided at all.",
-                    optional=True,
-                ),
-            ],
-            outputs=[
-                IO.Image.Output(),
-            ],
-            hidden=[
-                IO.Hidden.auth_token_comfy_org,
-                IO.Hidden.api_key_comfy_org,
-                IO.Hidden.unique_id,
-            ],
-            is_api_node=True,
-            price_badge=IO.PriceBadge(
-                expr="""{"type":"usd","usd":0.08}""",
-            ),
-        )
-
-    @classmethod
-    async def execute(
-        cls,
-        prompt: str,
-        aspect_ratio: str,
-        style_preset: str,
-        seed: int,
-        image: Optional[torch.Tensor] = None,
-        negative_prompt: str = "",
-        image_denoise: Optional[float] = 0.5,
-    ) -> IO.NodeOutput:
-        validate_string(prompt, strip_whitespace=False)
-        # prepare image binary if image present
-        image_binary = None
-        if image is not None:
-            image_binary = tensor_to_bytesio(image, total_pixels=1504*1504).read()
-        else:
-            image_denoise = None
-
-        if not negative_prompt:
-            negative_prompt = None
-        if style_preset == "None":
-            style_preset = None
-
-        files = {
-            "image": image_binary
-        }
-
-        response_api = await sync_op(
-            cls,
-            ApiEndpoint(path="/proxy/stability/v2beta/stable-image/generate/ultra", method="POST"),
-            response_model=StabilityStableUltraResponse,
-            data=StabilityStableUltraRequest(
-                prompt=prompt,
-                negative_prompt=negative_prompt,
-                aspect_ratio=aspect_ratio,
-                seed=seed,
-                strength=image_denoise,
-                style_preset=style_preset,
-            ),
-            files=files,
-            content_type="multipart/form-data",
-        )
-
-        if response_api.finish_reason != "SUCCESS":
-            raise Exception(f"Stable Image Ultra generation failed: {response_api.finish_reason}.")
-
-        image_data = base64.b64decode(response_api.image)
-        returned_image = bytesio_to_image_tensor(BytesIO(image_data))
-
-        return IO.NodeOutput(returned_image)
-
-
-class StabilityStableImageSD_3_5Node(IO.ComfyNode):
-    """
-    Generates images synchronously based on prompt and resolution.
-    """
-
-    @classmethod
-    def define_schema(cls):
-        return IO.Schema(
-            node_id="StabilityStableImageSD_3_5Node",
-            display_name="Stability AI Stable Diffusion 3.5 Image",
-            category="partner/image/Stability AI",
-            description=cleandoc(cls.__doc__ or ""),
-            inputs=[
-                IO.String.Input(
-                    "prompt",
-                    multiline=True,
-                    default="",
-                    tooltip="What you wish to see in the output image. A strong, descriptive prompt that clearly defines elements, colors, and subjects will lead to better results.",
-                ),
-                IO.Combo.Input(
-                    "model",
-                    options=Stability_SD3_5_Model,
-                ),
-                IO.Combo.Input(
-                    "aspect_ratio",
-                    options=StabilityAspectRatio,
-                    default=StabilityAspectRatio.ratio_1_1,
-                    tooltip="Aspect ratio of generated image.",
-                ),
-                IO.Combo.Input(
-                    "style_preset",
-                    options=get_stability_style_presets(),
-                    tooltip="Optional desired style of generated image.",
-                    advanced=True,
-                ),
-                IO.Float.Input(
-                    "cfg_scale",
-                    default=4.0,
-                    min=1.0,
-                    max=10.0,
-                    step=0.1,
-                    tooltip="How strictly the diffusion process adheres to the prompt text (higher values keep your image closer to your prompt)",
-                ),
-                IO.Int.Input(
-                    "seed",
-                    default=0,
-                    min=0,
-                    max=4294967294,
-                    step=1,
-                    display_mode=IO.NumberDisplay.number,
-                    control_after_generate=True,
-                    tooltip="The random seed used for creating the noise.",
-                ),
-                IO.Image.Input(
-                    "image",
-                    optional=True,
-                ),
-                IO.String.Input(
-                    "negative_prompt",
-                    default="",
-                    tooltip="Keywords of what you do not wish to see in the output image. This is an advanced feature.",
-                    force_input=True,
-                    optional=True,
-                    advanced=True,
-                ),
-                IO.Float.Input(
-                    "image_denoise",
-                    default=0.5,
-                    min=0.0,
-                    max=1.0,
-                    step=0.01,
-                    tooltip="Denoise of input image; 0.0 yields image identical to input, 1.0 is as if no image was provided at all.",
-                    optional=True,
-                ),
-            ],
-            outputs=[
-                IO.Image.Output(),
-            ],
-            hidden=[
-                IO.Hidden.auth_token_comfy_org,
-                IO.Hidden.api_key_comfy_org,
-                IO.Hidden.unique_id,
-            ],
-            is_api_node=True,
-            price_badge=IO.PriceBadge(
-                depends_on=IO.PriceBadgeDepends(widgets=["model"]),
-                expr="""
-                (
-                  $contains(widgets.model,"large")
-                    ? {"type":"usd","usd":0.065}
-                    : {"type":"usd","usd":0.035}
-                )
-                """,
-            ),
-        )
-
-    @classmethod
-    async def execute(
-        cls,
-        model: str,
-        prompt: str,
-        aspect_ratio: str,
-        style_preset: str,
-        seed: int,
-        cfg_scale: float,
-        image: Optional[torch.Tensor] = None,
-        negative_prompt: str = "",
-        image_denoise: Optional[float] = 0.5,
-    ) -> IO.NodeOutput:
-        validate_string(prompt, strip_whitespace=False)
-        # prepare image binary if image present
-        image_binary = None
-        mode = Stability_SD3_5_GenerationMode.text_to_image
-        if image is not None:
-            image_binary = tensor_to_bytesio(image, total_pixels=1504*1504).read()
-            mode = Stability_SD3_5_GenerationMode.image_to_image
-            aspect_ratio = None
-        else:
-            image_denoise = None
-
-        if not negative_prompt:
-            negative_prompt = None
-        if style_preset == "None":
-            style_preset = None
-
-        files = {
-            "image": image_binary
-        }
-
-        response_api = await sync_op(
-            cls,
-            ApiEndpoint(path="/proxy/stability/v2beta/stable-image/generate/sd3", method="POST"),
-            response_model=StabilityStableUltraResponse,
-            data=StabilityStable3_5Request(
-                prompt=prompt,
-                negative_prompt=negative_prompt,
-                aspect_ratio=aspect_ratio,
-                seed=seed,
-                strength=image_denoise,
-                style_preset=style_preset,
-                cfg_scale=cfg_scale,
-                model=model,
-                mode=mode,
-            ),
-            files=files,
-            content_type="multipart/form-data",
-        )
-
-        if response_api.finish_reason != "SUCCESS":
-            raise Exception(f"Stable Diffusion 3.5 Image generation failed: {response_api.finish_reason}.")
-
-        image_data = base64.b64decode(response_api.image)
-        returned_image = bytesio_to_image_tensor(BytesIO(image_data))
-
-        return IO.NodeOutput(returned_image)
-
-
-class StabilityUpscaleConservativeNode(IO.ComfyNode):
-    """
-    Upscale image with minimal alterations to 4K resolution.
-    """
-
-    @classmethod
-    def define_schema(cls):
-        return IO.Schema(
-            node_id="StabilityUpscaleConservativeNode",
-            display_name="Stability AI Upscale Conservative",
-            category="partner/image/Stability AI",
-            description=cleandoc(cls.__doc__ or ""),
-            inputs=[
-                IO.Image.Input("image"),
-                IO.String.Input(
-                    "prompt",
-                    multiline=True,
-                    default="",
-                    tooltip="What you wish to see in the output image. A strong, descriptive prompt that clearly defines elements, colors, and subjects will lead to better results.",
-                ),
-                IO.Float.Input(
-                    "creativity",
-                    default=0.35,
-                    min=0.2,
-                    max=0.5,
-                    step=0.01,
-                    tooltip="Controls the likelihood of creating additional details not heavily conditioned by the init image.",
-                ),
-                IO.Int.Input(
-                    "seed",
-                    default=0,
-                    min=0,
-                    max=4294967294,
-                    step=1,
-                    display_mode=IO.NumberDisplay.number,
-                    control_after_generate=True,
-                    tooltip="The random seed used for creating the noise.",
-                ),
-                IO.String.Input(
-                    "negative_prompt",
-                    default="",
-                    tooltip="Keywords of what you do not wish to see in the output image. This is an advanced feature.",
-                    force_input=True,
-                    optional=True,
-                    advanced=True,
-                ),
-            ],
-            outputs=[
-                IO.Image.Output(),
-            ],
-            hidden=[
-                IO.Hidden.auth_token_comfy_org,
-                IO.Hidden.api_key_comfy_org,
-                IO.Hidden.unique_id,
-            ],
-            is_api_node=True,
-            price_badge=IO.PriceBadge(
-                expr="""{"type":"usd","usd":0.4}""",
-            ),
-        )
-
-    @classmethod
-    async def execute(
-        cls,
-        image: torch.Tensor,
-        prompt: str,
-        creativity: float,
-        seed: int,
-        negative_prompt: str = "",
-    ) -> IO.NodeOutput:
-        validate_string(prompt, strip_whitespace=False)
-        image_binary = tensor_to_bytesio(image, total_pixels=1024*1024).read()
-
-        if not negative_prompt:
-            negative_prompt = None
-
-        files = {
-            "image": image_binary
-        }
-
-        response_api = await sync_op(
-            cls,
-            ApiEndpoint(path="/proxy/stability/v2beta/stable-image/upscale/conservative", method="POST"),
-            response_model=StabilityStableUltraResponse,
-            data=StabilityUpscaleConservativeRequest(
-                prompt=prompt,
-                negative_prompt=negative_prompt,
-                creativity=round(creativity,2),
-                seed=seed,
-            ),
-            files=files,
-            content_type="multipart/form-data",
-        )
-
-        if response_api.finish_reason != "SUCCESS":
-            raise Exception(f"Stability Upscale Conservative generation failed: {response_api.finish_reason}.")
-
-        image_data = base64.b64decode(response_api.image)
-        returned_image = bytesio_to_image_tensor(BytesIO(image_data))
-
-        return IO.NodeOutput(returned_image)
-
-
-class StabilityUpscaleCreativeNode(IO.ComfyNode):
-    """
-    Upscale image with minimal alterations to 4K resolution.
-    """
-
-    @classmethod
-    def define_schema(cls):
-        return IO.Schema(
-            node_id="StabilityUpscaleCreativeNode",
-            display_name="Stability AI Upscale Creative",
-            category="partner/image/Stability AI",
-            description=cleandoc(cls.__doc__ or ""),
-            inputs=[
-                IO.Image.Input("image"),
-                IO.String.Input(
-                    "prompt",
-                    multiline=True,
-                    default="",
-                    tooltip="What you wish to see in the output image. A strong, descriptive prompt that clearly defines elements, colors, and subjects will lead to better results.",
-                ),
-                IO.Float.Input(
-                    "creativity",
-                    default=0.3,
-                    min=0.1,
-                    max=0.5,
-                    step=0.01,
-                    tooltip="Controls the likelihood of creating additional details not heavily conditioned by the init image.",
-                ),
-                IO.Combo.Input(
-                    "style_preset",
-                    options=get_stability_style_presets(),
-                    tooltip="Optional desired style of generated image.",
-                    advanced=True,
-                ),
-                IO.Int.Input(
-                    "seed",
-                    default=0,
-                    min=0,
-                    max=4294967294,
-                    step=1,
-                    display_mode=IO.NumberDisplay.number,
-                    control_after_generate=True,
-                    tooltip="The random seed used for creating the noise.",
-                ),
-                IO.String.Input(
-                    "negative_prompt",
-                    default="",
-                    tooltip="Keywords of what you do not wish to see in the output image. This is an advanced feature.",
-                    force_input=True,
-                    optional=True,
-                    advanced=True,
-                ),
-            ],
-            outputs=[
-                IO.Image.Output(),
-            ],
-            hidden=[
-                IO.Hidden.auth_token_comfy_org,
-                IO.Hidden.api_key_comfy_org,
-                IO.Hidden.unique_id,
-            ],
-            is_api_node=True,
-            price_badge=IO.PriceBadge(
-                expr="""{"type":"usd","usd":0.6}""",
-            ),
-        )
-
-    @classmethod
-    async def execute(
-        cls,
-        image: torch.Tensor,
-        prompt: str,
-        creativity: float,
-        style_preset: str,
-        seed: int,
-        negative_prompt: str = "",
-    ) -> IO.NodeOutput:
-        validate_string(prompt, strip_whitespace=False)
-        image_binary = tensor_to_bytesio(image, total_pixels=1024*1024).read()
-
-        if not negative_prompt:
-            negative_prompt = None
-        if style_preset == "None":
-            style_preset = None
-
-        files = {
-            "image": image_binary
-        }
-
-        response_api = await sync_op(
-            cls,
-            ApiEndpoint(path="/proxy/stability/v2beta/stable-image/upscale/creative", method="POST"),
-            response_model=StabilityAsyncResponse,
-            data=StabilityUpscaleCreativeRequest(
-                prompt=prompt,
-                negative_prompt=negative_prompt,
-                creativity=round(creativity,2),
-                style_preset=style_preset,
-                seed=seed,
-            ),
-            files=files,
-            content_type="multipart/form-data",
-        )
-
-        response_poll = await poll_op(
-            cls,
-            ApiEndpoint(path=f"/proxy/stability/v2beta/results/{response_api.id}"),
-            response_model=StabilityResultsGetResponse,
-            poll_interval=3,
-            status_extractor=lambda x: get_async_dummy_status(x),
-        )
-
-        if response_poll.finish_reason != "SUCCESS":
-            raise Exception(f"Stability Upscale Creative generation failed: {response_poll.finish_reason}.")
-
-        image_data = base64.b64decode(response_poll.result)
-        returned_image = bytesio_to_image_tensor(BytesIO(image_data))
-
-        return IO.NodeOutput(returned_image)
-
-
-class StabilityUpscaleFastNode(IO.ComfyNode):
-    """
-    Quickly upscales an image via Stability API call to 4x its original size; intended for upscaling low-quality/compressed images.
-    """
-
-    @classmethod
-    def define_schema(cls):
-        return IO.Schema(
-            node_id="StabilityUpscaleFastNode",
-            display_name="Stability AI Upscale Fast",
-            category="partner/image/Stability AI",
-            description=cleandoc(cls.__doc__ or ""),
-            inputs=[
-                IO.Image.Input("image"),
-            ],
-            outputs=[
-                IO.Image.Output(),
-            ],
-            hidden=[
-                IO.Hidden.auth_token_comfy_org,
-                IO.Hidden.api_key_comfy_org,
-                IO.Hidden.unique_id,
-            ],
-            is_api_node=True,
-            price_badge=IO.PriceBadge(
-                expr="""{"type":"usd","usd":0.02}""",
-            ),
-        )
-
-    @classmethod
-    async def execute(cls, image: torch.Tensor) -> IO.NodeOutput:
-        image_binary = tensor_to_bytesio(image, total_pixels=4096*4096).read()
-
-        files = {
-            "image": image_binary
-        }
-
-        response_api = await sync_op(
-            cls,
-            ApiEndpoint(path="/proxy/stability/v2beta/stable-image/upscale/fast", method="POST"),
-            response_model=StabilityStableUltraResponse,
-            files=files,
-            content_type="multipart/form-data",
-        )
-
-        if response_api.finish_reason != "SUCCESS":
-            raise Exception(f"Stability Upscale Fast failed: {response_api.finish_reason}.")
-
-        image_data = base64.b64decode(response_api.image)
-        returned_image = bytesio_to_image_tensor(BytesIO(image_data))
-
-        return IO.NodeOutput(returned_image)
-
-
-class StabilityTextToAudio(IO.ComfyNode):
-    """Generates high-quality music and sound effects from text descriptions."""
-
-    @classmethod
-    def define_schema(cls):
-        return IO.Schema(
-            node_id="StabilityTextToAudio",
-            display_name="Stability AI Text To Audio",
-            category="partner/audio/Stability AI",
-            essentials_category="Audio",
-            description=cleandoc(cls.__doc__ or ""),
-            inputs=[
-                IO.Combo.Input(
-                    "model",
-                    options=["stable-audio-2.5"],
-                ),
-                IO.String.Input("prompt", multiline=True, default=""),
-                IO.Int.Input(
-                    "duration",
-                    default=190,
-                    min=1,
-                    max=190,
-                    step=1,
-                    tooltip="Controls the duration in seconds of the generated audio.",
-                    optional=True,
-                ),
-                IO.Int.Input(
-                    "seed",
-                    default=0,
-                    min=0,
-                    max=4294967294,
-                    step=1,
-                    display_mode=IO.NumberDisplay.number,
-                    control_after_generate=True,
-                    tooltip="The random seed used for generation.",
-                    optional=True,
-                ),
-                IO.Int.Input(
-                    "steps",
-                    default=8,
-                    min=4,
-                    max=8,
-                    step=1,
-                    tooltip="Controls the number of sampling steps.",
-                    optional=True,
-                    advanced=True,
-                ),
-            ],
-            outputs=[
-                IO.Audio.Output(),
-            ],
-            hidden=[
-                IO.Hidden.auth_token_comfy_org,
-                IO.Hidden.api_key_comfy_org,
-                IO.Hidden.unique_id,
-            ],
-            is_api_node=True,
-            price_badge=IO.PriceBadge(
-                expr="""{"type":"usd","usd":0.2}""",
-            ),
-        )
-
-    @classmethod
-    async def execute(cls, model: str, prompt: str, duration: int, seed: int, steps: int) -> IO.NodeOutput:
-        validate_string(prompt, max_length=10000)
-        payload = StabilityTextToAudioRequest(prompt=prompt, model=model, duration=duration, seed=seed, steps=steps)
-        response_api = await sync_op(
-            cls,
-            ApiEndpoint(path="/proxy/stability/v2beta/audio/stable-audio-2/text-to-audio", method="POST"),
-            response_model=StabilityAudioResponse,
-            data=payload,
-            content_type="multipart/form-data",
-        )
-        if not response_api.audio:
-            raise ValueError("No audio file was received in response.")
-        return IO.NodeOutput(audio_bytes_to_audio_input(base64.b64decode(response_api.audio)))
-
-
-class StabilityAudioToAudio(IO.ComfyNode):
-    """Transforms existing audio samples into new high-quality compositions using text instructions."""
-
-    @classmethod
-    def define_schema(cls):
-        return IO.Schema(
-            node_id="StabilityAudioToAudio",
-            display_name="Stability AI Audio To Audio",
-            category="partner/audio/Stability AI",
-            description=cleandoc(cls.__doc__ or ""),
-            inputs=[
-                IO.Combo.Input(
-                    "model",
-                    options=["stable-audio-2.5"],
-                ),
-                IO.String.Input("prompt", multiline=True, default=""),
-                IO.Audio.Input("audio", tooltip="Audio must be between 6 and 190 seconds long."),
-                IO.Int.Input(
-                    "duration",
-                    default=190,
-                    min=1,
-                    max=190,
-                    step=1,
-                    tooltip="Controls the duration in seconds of the generated audio.",
-                    optional=True,
-                ),
-                IO.Int.Input(
-                    "seed",
-                    default=0,
-                    min=0,
-                    max=4294967294,
-                    step=1,
-                    display_mode=IO.NumberDisplay.number,
-                    control_after_generate=True,
-                    tooltip="The random seed used for generation.",
-                    optional=True,
-                ),
-                IO.Int.Input(
-                    "steps",
-                    default=8,
-                    min=4,
-                    max=8,
-                    step=1,
-                    tooltip="Controls the number of sampling steps.",
-                    optional=True,
-                    advanced=True,
-                ),
-                IO.Float.Input(
-                    "strength",
-                    default=1,
-                    min=0.01,
-                    max=1.0,
-                    step=0.01,
-                    display_mode=IO.NumberDisplay.slider,
-                    tooltip="Parameter controls how much influence the audio parameter has on the generated audio.",
-                    optional=True,
-                ),
-            ],
-            outputs=[
-                IO.Audio.Output(),
-            ],
-            hidden=[
-                IO.Hidden.auth_token_comfy_org,
-                IO.Hidden.api_key_comfy_org,
-                IO.Hidden.unique_id,
-            ],
-            is_api_node=True,
-            price_badge=IO.PriceBadge(
-                expr="""{"type":"usd","usd":0.2}""",
-            ),
-        )
-
-    @classmethod
-    async def execute(
-        cls, model: str, prompt: str, audio: Input.Audio, duration: int, seed: int, steps: int, strength: float
-    ) -> IO.NodeOutput:
-        validate_string(prompt, max_length=10000)
-        validate_audio_duration(audio, 6, 190)
-        payload = StabilityAudioToAudioRequest(
-            prompt=prompt, model=model, duration=duration, seed=seed, steps=steps, strength=strength
-        )
-        response_api = await sync_op(
-            cls,
-            ApiEndpoint(path="/proxy/stability/v2beta/audio/stable-audio-2/audio-to-audio", method="POST"),
-            response_model=StabilityAudioResponse,
-            data=payload,
-            content_type="multipart/form-data",
-            files={"audio": audio_input_to_mp3(audio)},
-        )
-        if not response_api.audio:
-            raise ValueError("No audio file was received in response.")
-        return IO.NodeOutput(audio_bytes_to_audio_input(base64.b64decode(response_api.audio)))
-
-
-class StabilityAudioInpaint(IO.ComfyNode):
-    """Transforms part of existing audio sample using text instructions."""
-
-    @classmethod
-    def define_schema(cls):
-        return IO.Schema(
-            node_id="StabilityAudioInpaint",
-            display_name="Stability AI Audio Inpaint",
-            category="partner/audio/Stability AI",
-            description=cleandoc(cls.__doc__ or ""),
-            inputs=[
-                IO.Combo.Input(
-                    "model",
-                    options=["stable-audio-2.5"],
-                ),
-                IO.String.Input("prompt", multiline=True, default=""),
-                IO.Audio.Input("audio", tooltip="Audio must be between 6 and 190 seconds long."),
-                IO.Int.Input(
-                    "duration",
-                    default=190,
-                    min=1,
-                    max=190,
-                    step=1,
-                    tooltip="Controls the duration in seconds of the generated audio.",
-                    optional=True,
-                ),
-                IO.Int.Input(
-                    "seed",
-                    default=0,
-                    min=0,
-                    max=4294967294,
-                    step=1,
-                    display_mode=IO.NumberDisplay.number,
-                    control_after_generate=True,
-                    tooltip="The random seed used for generation.",
-                    optional=True,
-                ),
-                IO.Int.Input(
-                    "steps",
-                    default=8,
-                    min=4,
-                    max=8,
-                    step=1,
-                    tooltip="Controls the number of sampling steps.",
-                    optional=True,
-                    advanced=True,
-                ),
-                IO.Int.Input(
-                    "mask_start",
-                    default=30,
-                    min=0,
-                    max=190,
-                    step=1,
-                    optional=True,
-                    advanced=True,
-                ),
-                IO.Int.Input(
-                    "mask_end",
-                    default=190,
-                    min=0,
-                    max=190,
-                    step=1,
-                    optional=True,
-                    advanced=True,
-                ),
-            ],
-            outputs=[
-                IO.Audio.Output(),
-            ],
-            hidden=[
-                IO.Hidden.auth_token_comfy_org,
-                IO.Hidden.api_key_comfy_org,
-                IO.Hidden.unique_id,
-            ],
-            is_api_node=True,
-            price_badge=IO.PriceBadge(
-                expr="""{"type":"usd","usd":0.2}""",
-            ),
-        )
-
-    @classmethod
-    async def execute(
-        cls,
-        model: str,
-        prompt: str,
-        audio: Input.Audio,
-        duration: int,
-        seed: int,
-        steps: int,
-        mask_start: int,
-        mask_end: int,
-    ) -> IO.NodeOutput:
-        validate_string(prompt, max_length=10000)
-        if mask_end <= mask_start:
-            raise ValueError(f"Value of mask_end({mask_end}) should be greater then mask_start({mask_start})")
-        validate_audio_duration(audio, 6, 190)
-
-        payload = StabilityAudioInpaintRequest(
-            prompt=prompt,
-            model=model,
-            duration=duration,
-            seed=seed,
-            steps=steps,
-            mask_start=mask_start,
-            mask_end=mask_end,
-        )
-        response_api = await sync_op(
-            cls,
-            endpoint=ApiEndpoint(path="/proxy/stability/v2beta/audio/stable-audio-2/inpaint", method="POST"),
-            response_model=StabilityAudioResponse,
-            data=payload,
-            content_type="multipart/form-data",
-            files={"audio": audio_input_to_mp3(audio)},
-        )
-        if not response_api.audio:
-            raise ValueError("No audio file was received in response.")
-        return IO.NodeOutput(audio_bytes_to_audio_input(base64.b64decode(response_api.audio)))
-
-
-class StabilityExtension(ComfyExtension):
-    @override
-    async def get_node_list(self) -> list[type[IO.ComfyNode]]:
-        return [
-            StabilityStableImageUltraNode,
-            StabilityStableImageSD_3_5Node,
-            StabilityUpscaleConservativeNode,
-            StabilityUpscaleCreativeNode,
-            StabilityUpscaleFastNode,
-            StabilityTextToAudio,
-            StabilityAudioToAudio,
-            StabilityAudioInpaint,
-        ]
-
-
-async def comfy_entrypoint() -> StabilityExtension:
-    return StabilityExtension()
--- a/comfy_api_nodes/util/__init__.py
+++ b/comfy_api_nodes/util/__init__.py
@@ -26,6 +26,7 @@ from .conversions import (
    text_filepath_to_base64_string,
    text_filepath_to_data_uri,
    trim_video,
+    upscale_image_tensor_to_min_pixels,
    upscale_video_to_min_pixels,
    video_to_base64_string,
 )
@@ -99,6 +100,7 @@ __all__ = [
    "text_filepath_to_base64_string",
    "text_filepath_to_data_uri",
    "trim_video",
+    "upscale_image_tensor_to_min_pixels",
    "upscale_video_to_min_pixels",
    "video_to_base64_string",
    # Validation utilities
--- a/comfy_api_nodes/util/conversions.py
+++ b/comfy_api_nodes/util/conversions.py
@@ -448,6 +448,15 @@ def _compute_upscale_dims(src_w: int, src_h: int, total_pixels: int) -> tuple[in
    return new_w, new_h


+def upscale_image_tensor_to_min_pixels(image: torch.Tensor, total_pixels: int) -> torch.Tensor:
+    samples = image.movedim(-1, 1)
+    dims = _compute_upscale_dims(samples.shape[3], samples.shape[2], int(total_pixels))
+    if dims is None:
+        return image
+    new_w, new_h = dims
+    return common_upscale(samples, new_w, new_h, "lanczos", "disabled").movedim(1, -1)
+
+
 def upscale_video_to_min_pixels(video: Input.Video, min_pixels: int) -> Input.Video:
    """Upscale a video to meet at least ``min_pixels`` (w * h), preserving aspect ratio.

--- a/requirements.txt
+++ b/requirements.txt
@@ -1,5 +1,5 @@
 comfyui-frontend-package==1.45.20
-comfyui-workflow-templates==0.11.1
+comfyui-workflow-templates==0.11.2
 comfyui-embedded-docs==0.5.6
 torch
 torchsde
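
The `upscale_image_tensor_to_min_pixels` helper added to `conversions.py` in this change delegates its size calculation to `_compute_upscale_dims`, whose body is not part of the diff. As a rough sketch only: assuming that helper scales both sides by the same factor until `w * h` reaches `total_pixels`, and returns `None` when the image is already large enough (which is what the `if dims is None: return image` branch suggests), the calculation might look like the following. `compute_upscale_dims_sketch` is a hypothetical name, and rounding up with `math.ceil` is an assumption, not the project's confirmed behaviour.

```python
from __future__ import annotations

import math
from typing import Optional


def compute_upscale_dims_sketch(src_w: int, src_h: int, total_pixels: int) -> Optional[tuple[int, int]]:
    """Hypothetical stand-in for `_compute_upscale_dims` (real body not shown in the diff).

    Returns (new_w, new_h) with new_w * new_h >= total_pixels and roughly the
    same aspect ratio, or None when the source already has enough pixels.
    """
    if src_w <= 0 or src_h <= 0:
        raise ValueError("source dimensions must be positive")
    if src_w * src_h >= total_pixels:
        # Nothing to do; the caller returns the input image unchanged.
        return None
    # Uniform scale factor applied to both sides so the aspect ratio is kept.
    scale = math.sqrt(total_pixels / (src_w * src_h))
    # Assumption: round up so the pixel count never falls below the target.
    return math.ceil(src_w * scale), math.ceil(src_h * scale)


if __name__ == "__main__":
    print(compute_upscale_dims_sketch(512, 512, 1024 * 1024))
    print(compute_upscale_dims_sketch(2048, 2048, 1024 * 1024))
```

Note that the sizes stay plain Python ints throughout. Only the final resize in the real helper touches the tensor, after `movedim(-1, 1)` moves the channel axis into the position `common_upscale` is given.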
Author	SHA1	Message	Date
Alexander Piskun	77917ed3a6	[Partner Nodes] chore(StabilityAI): remove StabilityAI nodes (#14737) Signed-off-by: bigcat88 <bigcat88@icloud.com>	2026-07-03 14:24:21 +03:00
Daxiong (Lin)	a04ebe05c2	chore: update workflow templates to v0.11.2 (#14741)	2026-07-03 19:08:11 +08:00
Alexander Piskun	9764381998	[Partner Nodes] feat(ByteDance): add support for Seed Audio 1.0 (#14731) Signed-off-by: bigcat88 <bigcat88@icloud.com>	2026-07-03 14:00:10 +03:00
comfyanonymous	1e04ced089	Update AGENTS.md (#14733)	2026-07-03 02:08:47 -04:00
Matt Miller	96e0e3585b	security: fix four vulnerabilities (GHSA-779p-m5rp-r4h4) (#14734)	2026-07-02 20:44:54 -07:00
    * security: fix five vulnerabilities (GHSA-779p-m5rp-r4h4)
      - CVE-2026-56670: force download of SVG/XML responses on /view to prevent stored XSS
      - CVE-2026-56671: contain /experiment/models/preview reads within the model folder
      - CVE-2026-56672: stop inline rendering of uploaded /userdata/{file} content
      - CVE-2026-56673: prevent path traversal in get_annotated_filepath (LoadImage /prompt input)
      - CVE-2026-56674: reject opaque/null Origin to close the CSRF middleware bypass
      Adds regression tests under tests-unit/security_test/ covering all five.
    * security: address review feedback on GHSA-779p fixes
      - Fix Windows CI failure in test_get_annotated_filepath: compare against os.path.abspath(...) to match the intentional abspath normalization added by the traversal hardening (abspath prepends the drive letter on Windows).
      - origin_check: narrow the bare `except:` in is_loopback() to ValueError so genuine interrupts aren't swallowed (review nit).
      - origin_check: guard .port access in is_cross_origin_forbidden() so a malformed/out-of-range port (e.g. Origin: http://127.0.0.1:99999) fails closed with a 403 instead of surfacing an uncaught 500 in the middleware.
      - server /view: escape backslash/quote in the Content-Disposition filename (RFC 6266 quoted-string) so a filename containing a double quote can't malform the response header.
    * security: address CodeRabbit review feedback on GHSA-779p tests
      - test #3: guard the symlink-escape test with a try/except skip so it no longer errors on Windows CI where os.symlink needs elevated privileges / Developer Mode (mirrors the guard in the sibling test #2).
      - test #5: refresh the stale module docstring to describe the actual /view gating (view_image closure calling folder_paths.is_dangerous_content_type, the normalising check) instead of the bypassable raw set-membership test.
    * revert(security): drop CVE-2026-56674 Origin: null CSRF change
      Per maintainer review, the reported CSRF is already mitigated by the pre-existing Sec-Fetch-Site: cross-site check for current browsers, and the null-origin rejection risked breaking legitimate sandboxed-iframe embeds. Restores origin_only_middleware and is_loopback in server.py to their prior state (the Sec-Fetch-Site check is retained) and removes utils/origin_check.py and its regression test. The other four GHSA-779p fixes are unaffected.