Add colored logs (#14036 )

cache-ram: lower thresholds (#14089 )
Use the RAM right up to the wire, as the community is a bit accustomed to. This trades off headroom for the case where large, chunky intermediates arrive and potentially hit pagefile/swap, but a lot of people have "it just fits" workflows out there, so strike a compromise with 75->90%. Disable the inactive cache for all but the very high RAM users.
2026-05-25 10:36:57 +08:00 · 2026-05-25 10:00:55 +08:00 · 2026-05-24 15:26:50 -07:00 · 2026-05-24 15:25:59 -07:00 · 2026-05-24 10:58:35 +08:00 · 2026-05-24 10:48:31 +08:00
9 changed files with 52 additions and 24 deletions
--- a/README.md
+++ b/README.md
@@ -433,7 +433,7 @@ See also: [https://www.comfy.org/](https://www.comfy.org/)
 ## Frontend Development
-As of August 15, 2024, we have transitioned to a new frontend, which is now hosted in a separate repository: [ComfyUI Frontend](https://github.com/Comfy-Org/ComfyUI_frontend). This repository now hosts the compiled JS (from TS/Vue) under the `web/` directory.
+As of August 15, 2024, we have transitioned to a new frontend, which is now hosted in a separate repository: [ComfyUI Frontend](https://github.com/Comfy-Org/ComfyUI_frontend). The compiled JS files (from TS/Vue) are published to [pypi](https://pypi.org/project/comfyui-frontend-package) and installed as a dependency in ComfyUI.
 ### Reporting Issues and Requesting Features
--- a/app/logger.py
+++ b/app/logger.py
@@ -5,6 +5,40 @@ import logging
 import sys
 import threading
+ANSI_NAMED_COLORS = {
+    'black':   '\033[30m',
+    'red':     '\033[31m',
+    'green':   '\033[32m',
+    'yellow':  '\033[33m',
+    'blue':    '\033[34m',
+    'magenta': '\033[35m',
+    'cyan':    '\033[36m',
+    'white':   '\033[37m',
+}
+ANSI_LEVEL_COLORS = {
+    'DEBUG':    ANSI_NAMED_COLORS['cyan'],
+    'INFO':     ANSI_NAMED_COLORS['green'],
+    'WARNING':  ANSI_NAMED_COLORS['yellow'],
+    'ERROR':    ANSI_NAMED_COLORS['red'],
+    'CRITICAL': ANSI_NAMED_COLORS['magenta'],
+}
+ANSI_RESET = '\033[0m'
+ANSI_BOLD  = '\033[1m'
+class ColoredFormatter(logging.Formatter):
+    def format(self, record):
+        color = ANSI_LEVEL_COLORS.get(record.levelname, '')
+        bold  = ANSI_BOLD if record.levelno >= logging.WARNING else ''
+        level_tag = f"{bold}{color}[{record.levelname}]{ANSI_RESET} "
+        message = super().format(record)
+        line_color = ANSI_NAMED_COLORS.get(getattr(record, 'color', ''), '')
+        if line_color:
+            return f"{level_tag}{line_color}{message}{ANSI_RESET}"
+        return level_tag + message
 logs = None
 stdout_interceptor = None
 stderr_interceptor = None
@@ -68,8 +102,10 @@ def setup_logger(log_level: str = 'INFO', capacity: int = 300, use_stdout: bool
    logger = logging.getLogger()
    logger.setLevel(log_level)
+    formatter = ColoredFormatter("%(message)s")
    stream_handler = logging.StreamHandler()
-    stream_handler.setFormatter(logging.Formatter("%(message)s"))
+    stream_handler.setFormatter(formatter)
    if use_stdout:
        # Only errors and critical to stderr
@@ -77,7 +113,7 @@ def setup_logger(log_level: str = 'INFO', capacity: int = 300, use_stdout: bool
        # Lesser to stdout
        stdout_handler = logging.StreamHandler(sys.stdout)
-        stdout_handler.setFormatter(logging.Formatter("%(message)s"))
+        stdout_handler.setFormatter(formatter)
        stdout_handler.addFilter(lambda record: record.levelno < logging.ERROR)
        logger.addHandler(stdout_handler)
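The formatter added above can be exercised on its own. The following is a minimal sketch, assuming a trimmed copy of the color tables from this diff; the `render` helper and the `colored_demo` logger name are hypothetical and exist only for illustration. It shows that the level tag is always colored, while the message body is tinted only when the call passes `extra={'color': ...}`.

```python
import io
import logging

ANSI_RESET = '\033[0m'
ANSI_BOLD = '\033[1m'
# Trimmed copies of the tables from the diff (only the entries used here).
ANSI_NAMED_COLORS = {'red': '\033[31m', 'green': '\033[32m', 'yellow': '\033[33m'}
ANSI_LEVEL_COLORS = {
    'INFO': ANSI_NAMED_COLORS['green'],
    'WARNING': ANSI_NAMED_COLORS['yellow'],
    'ERROR': ANSI_NAMED_COLORS['red'],
}


class ColoredFormatter(logging.Formatter):
    def format(self, record):
        color = ANSI_LEVEL_COLORS.get(record.levelname, '')
        # WARNING and above get a bold level tag.
        bold = ANSI_BOLD if record.levelno >= logging.WARNING else ''
        level_tag = f"{bold}{color}[{record.levelname}]{ANSI_RESET} "
        message = super().format(record)
        # 'color' is an optional per-record attribute supplied via extra=...
        line_color = ANSI_NAMED_COLORS.get(getattr(record, 'color', ''), '')
        if line_color:
            return f"{level_tag}{line_color}{message}{ANSI_RESET}"
        return level_tag + message


def render(level, msg, **kwargs):
    """Hypothetical helper: format one record and return the raw string."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(ColoredFormatter("%(message)s"))
    logger = logging.getLogger("colored_demo")
    logger.handlers = [handler]
    logger.propagate = False
    logger.setLevel(logging.DEBUG)
    logger.log(level, msg, **kwargs)
    return stream.getvalue()


plain = render(logging.INFO, "Loading model")
tinted = render(logging.INFO, "Prompt executed in 1.23 seconds", extra={'color': 'green'})
warned = render(logging.WARNING, "Low memory")
print(repr(plain), repr(tinted), repr(warned), sep="\n")
```

Because the escape sequences are written unconditionally, they would also end up in redirected output; whether that matters depends on how the surrounding logger captures its streams, which this sketch does not model.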
--- a/comfy/cli_args.py
+++ b/comfy/cli_args.py
@@ -111,7 +111,7 @@ parser.add_argument("--preview-method", type=LatentPreviewMethod, default=Latent
 parser.add_argument("--preview-size", type=int, default=512, help="Sets the maximum preview size for sampler nodes.")
 cache_group = parser.add_mutually_exclusive_group()
-cache_group.add_argument("--cache-ram", nargs='*', type=float, default=[], metavar="GB", help="Use RAM pressure caching with the specified headroom thresholds. This is the default caching mode. The first value sets the active-cache threshold; the optional second value sets the inactive-cache/pin threshold. Defaults when no values are provided: active 25%% of system RAM (min 4GB, max 32GB), inactive 75%% of system RAM (min 12GB, max 96GB).")
+cache_group.add_argument("--cache-ram", nargs='*', type=float, default=[], metavar="GB", help="Use RAM pressure caching with the specified headroom thresholds. This is the default caching mode. The first value sets the active-cache threshold; the optional second value sets the inactive-cache/pin threshold. Defaults when no values are provided: active 10%% of system RAM (min 2GB, max 10GB), inactive 100%% of system RAM (max 96GB).")
 cache_group.add_argument("--cache-classic", action="store_true", help="Use the old style (aggressive) caching.")
 cache_group.add_argument("--cache-lru", type=int, default=0, help="Use LRU caching with a maximum of N node results cached. May use more RAM/VRAM.")
 cache_group.add_argument("--cache-none", action="store_true", help="Reduced RAM/VRAM usage at the expense of executing every node for each run.")
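To see how the `--cache-ram` option behaves with `nargs='*'`, the group can be rebuilt in isolation. This is a minimal sketch that copies only the argument definitions shown in the hunk (help strings omitted); it is not the full ComfyUI parser.

```python
import argparse

parser = argparse.ArgumentParser()
cache_group = parser.add_mutually_exclusive_group()
cache_group.add_argument("--cache-ram", nargs='*', type=float, default=[], metavar="GB")
cache_group.add_argument("--cache-classic", action="store_true")
cache_group.add_argument("--cache-lru", type=int, default=0)
cache_group.add_argument("--cache-none", action="store_true")

# Flag absent, or present without values: an empty list, so the
# built-in defaults apply downstream.
no_flag = parser.parse_args([]).cache_ram
bare = parser.parse_args(["--cache-ram"]).cache_ram

# One value overrides the active threshold; two override both thresholds.
one = parser.parse_args(["--cache-ram", "6"]).cache_ram
two = parser.parse_args(["--cache-ram", "6", "48"]).cache_ram

# Combining two cache modes is rejected by the mutually exclusive group.
try:
    parser.parse_args(["--cache-ram", "6", "--cache-none"])
    conflict_rejected = False
except SystemExit:
    conflict_rejected = True

print(no_flag, bare, one, two, conflict_rejected)
```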
--- a/comfy/ldm/modules/attention.py
+++ b/comfy/ldm/modules/attention.py
@@ -741,12 +741,12 @@ optimized_attention = attention_basic
 if model_management.sage_attention_enabled():
    logging.info("Using sage attention")
    optimized_attention = attention_sage
-elif model_management.xformers_enabled():
-    logging.info("Using xformers attention")
-    optimized_attention = attention_xformers
 elif model_management.flash_attention_enabled():
     logging.info("Using Flash Attention")
     optimized_attention = attention_flash
+elif model_management.xformers_enabled():
+    logging.info("Using xformers attention")
+    optimized_attention = attention_xformers
 elif model_management.pytorch_attention_enabled():
    logging.info("Using pytorch attention")
    optimized_attention = attention_pytorch
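The effect of reordering an `if`/`elif` chain like this one can be shown with a small stand-in. This is a sketch only: `pick_attention` and the string backend names are hypothetical and replace the real `model_management.*_enabled()` checks, and the orders are inferred from the commit title "Fix --use-flash-attention ignored when xformers installed".

```python
def pick_attention(enabled, order):
    """Return the first backend in `order` that is enabled, else 'basic'."""
    for name in order:
        if name in enabled:
            return name
    return "basic"


# Priority before and after the change, as read from the hunk above.
OLD_ORDER = ("sage", "xformers", "flash", "pytorch")
NEW_ORDER = ("sage", "flash", "xformers", "pytorch")

# Assumed scenario: xformers is installed and flash attention was requested.
enabled = {"xformers", "flash"}

old_choice = pick_attention(enabled, OLD_ORDER)
new_choice = pick_attention(enabled, NEW_ORDER)
print(old_choice, new_choice)
```

With the old order the xformers branch wins and the flash request is never reached; with the new order flash is selected first.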
--- a/comfy/model_management.py
+++ b/comfy/model_management.py
@@ -1217,7 +1217,7 @@ def get_aimdo_cast_buffer(offload_stream, device):
 def get_pin_buffer(offload_stream):
    pin_buffer = STREAM_PIN_BUFFERS.get(offload_stream, None)
    if pin_buffer is None:
-        pin_buffer = comfy_aimdo.host_buffer.HostBuffer(0, 0, pinned_hostbuf_size(8 * 1024**3))
+        pin_buffer = comfy_aimdo.host_buffer.HostBuffer(0, 0, pinned_hostbuf_size(8 * 1024**3), mark_cold=False)
        STREAM_PIN_BUFFERS[offload_stream] = pin_buffer
    elif offload_stream is not None:
        event = getattr(pin_buffer, "_comfy_event", None)
--- a/comfy/samplers.py
+++ b/comfy/samplers.py
@@ -265,7 +265,6 @@ def _calc_cond_batch(model: BaseModel, conds: list[list[dict]], x_in: torch.Tens
                input_shape = [len(batch_amount) * first_shape[0]] + list(first_shape)[1:]
                cond_shapes = collections.defaultdict(list)
                for tt in batch_amount:
-                    cond = {k: v.size() for k, v in to_run[tt][0].conditioning.items()}
                    for k, v in to_run[tt][0].conditioning.items():
                        cond_shapes[k].append(v.size())
--- a/main.py
+++ b/main.py
@@ -286,8 +286,8 @@ def prompt_worker(q, server_instance):
    cache_ram = 0
    cache_ram_inactive = 0
    if not args.cache_classic and not args.cache_none and args.cache_lru <= 0:
-        cache_ram = min(32.0, max(4.0, comfy.model_management.total_ram * 0.25 / 1024.0))
+        cache_ram = min(10.0, max(2.0, comfy.model_management.total_ram * 0.10 / 1024.0))
-        cache_ram_inactive = min(96.0, max(12.0, comfy.model_management.total_ram * 0.75 / 1024.0))
+        cache_ram_inactive = min(96.0, comfy.model_management.total_ram / 1024.0)
        if len(args.cache_ram) > 0:
            cache_ram = args.cache_ram[0]
        if len(args.cache_ram) > 1:
@@ -344,9 +344,9 @@ def prompt_worker(q, server_instance):
            # Log Time in a more readable way after 10 minutes
            if execution_time > 600:
                execution_time = time.strftime("%H:%M:%S", time.gmtime(execution_time))
-                logging.info(f"Prompt executed in {execution_time}")
+                logging.info(f"Prompt executed in {execution_time}", extra={'color': 'green'})
            else:
-                logging.info("Prompt executed in {:.2f} seconds".format(execution_time))
+                logging.info("Prompt executed in {:.2f} seconds".format(execution_time), extra={'color': 'green'})
            if not asset_seeder.is_disabled():
                paths = _collect_output_absolute_paths(e.history_result)
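The default-threshold arithmetic changed in the first `main.py` hunk can be compared side by side. This is a minimal sketch, assuming `total_ram` is expressed in MB (suggested by the division by 1024.0 to obtain GB); the function names are hypothetical.

```python
def old_defaults(total_ram_mb):
    """Defaults before the change: 25% active (4..32 GB), 75% inactive (12..96 GB)."""
    active = min(32.0, max(4.0, total_ram_mb * 0.25 / 1024.0))
    inactive = min(96.0, max(12.0, total_ram_mb * 0.75 / 1024.0))
    return active, inactive


def new_defaults(total_ram_mb):
    """Defaults after the change: 10% active (2..10 GB), 100% inactive (max 96 GB)."""
    active = min(10.0, max(2.0, total_ram_mb * 0.10 / 1024.0))
    inactive = min(96.0, total_ram_mb / 1024.0)
    return active, inactive


for gb in (16, 64, 256):
    mb = gb * 1024
    print(f"{gb} GB system: old={old_defaults(mb)} new={new_defaults(mb)}")
```

Under that assumption, the active headroom on a 16 GB machine drops from 4 GB to 2 GB, and the inactive threshold equals total RAM for any machine up to 96 GB. If the threshold is read as required free headroom, a value equal to total RAM could never be met, which would be consistent with the commit message's "disable the inactive cache for all but the very high RAM users"; that reading is an inference from this page, not something the diff states.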
--- a/openapi.yaml
+++ b/openapi.yaml
@@ -9585,16 +9585,9 @@ components:
          description: List of plan features
    BillingStatus:
-      type: object
+      type: string
      x-runtime: [cloud]
-      description: "[cloud-only] Overall billing and subscription status."
+      description: "[cloud-only] Overall billing/payment lifecycle status."
-      properties:
-        subscription:
-          $ref: "#/components/schemas/BillingSubscription"
-        balance:
-          $ref: "#/components/schemas/BillingBalance"
-        has_payment_method:
-          type: boolean
      enum:
        - awaiting_payment_method
        - pending_payment
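After this change `BillingStatus` is a plain string enum rather than an object with properties. A consumer-side check could look like the sketch below, which assumes the five values listed in the #14070 commit message; the class and the `parse_billing_status` helper are hypothetical and not part of any generated client.

```python
from enum import Enum


class BillingStatus(str, Enum):
    # Values taken from the #14070 commit message on this page.
    AWAITING_PAYMENT_METHOD = "awaiting_payment_method"
    PENDING_PAYMENT = "pending_payment"
    PAID = "paid"
    PAYMENT_FAILED = "payment_failed"
    INACTIVE = "inactive"


def parse_billing_status(value):
    """Accept only a string that matches one of the enum values."""
    if not isinstance(value, str):
        raise TypeError("BillingStatus must be a string, not an object")
    return BillingStatus(value)  # raises ValueError for unknown strings


status = parse_billing_status("pending_payment")
print(status.value)
```

An object payload such as `{"subscription": {...}}`, which the old hybrid schema appeared to allow, is rejected by this check with a `TypeError`.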
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,4 +1,4 @@
-comfyui-frontend-package==1.43.18
+comfyui-frontend-package==1.44.19
 comfyui-workflow-templates==0.9.82
 comfyui-embedded-docs==0.5.0
 torch
@@ -23,7 +23,7 @@ SQLAlchemy>=2.0.0
 filelock
 av>=14.2.0
 comfy-kitchen>=0.2.8
-comfy-aimdo==0.4.3
+comfy-aimdo==0.4.5
 requests
 simpleeval>=1.0.0
 blake3
Author	SHA1	Message	Date
Talmaj	63bcaec5d1	Add colored logs (#14036 )	2026-05-25 10:00:55 +08:00
rattus	b30e980a20	cache-ram: lower thresholds (#14089 ) Use the RAM right up to the wire, as the community is a bit accustomed to. This trades off headroom for the case where large, chunky intermediates arrive and potentially hit pagefile/swap, but a lot of people have "it just fits" workflows out there, so strike a compromise with 75->90%. Disable the inactive cache for all but the very high RAM users.	2026-05-24 15:26:50 -07:00
rattus	39f963b4b0	mark loads to pins as cold immediately (#14088 ) This does the posix_fadvise to kick pins out of the disk cache (to avoid a double copy in RAM).	2026-05-24 15:25:59 -07:00
Matt Miller	ea62dc11c9	openapi: fix invalid BillingStatus schema (object + enum hybrid) (#14071 )	2026-05-24 10:58:35 +08:00
Robin Huang	32a7092c52	fix: correct description of where compiled FE files live (#14013 )	2026-05-24 10:48:31 +08:00
comfyanonymous	08d809d128	Fix --use-flash-attention ignored when xformers installed. (#14083 )	2026-05-23 17:44:28 -07:00
Comfy Org PR Bot	0af123022d	Bump comfyui-frontend-package to 1.44.19 (#14074 )	2026-05-24 08:27:52 +08:00
comfyanonymous	d80fcafee7	Remove dead code. (#14072 )	2026-05-22 19:56:36 -07:00
Matt Miller	187442cca4	openapi: add enum values + FeedbackRequest schema for cloud cutover (PR E) (#14070 ) * openapi: add enum values + FeedbackRequest schema for cloud cutover (PR E) Adds missing cloud-runtime enum values to vendor schemas that the cloud runtime emits but vendor declared as plain strings. Changes: - JobEntry.status: enum [pending, in_progress, completed, failed, cancelled] - JobDetailResponse.status: same enum - BillingStatus: enum [awaiting_payment_method, pending_payment, paid, payment_failed, inactive] - FeedbackRequest schema added (with type enum) - /api/feedback POST: requestBody now $refs FeedbackRequest All cloud-runtime-emitted; no impact on OSS-local semantics. Identified via Comfy-Org/cloud's TestCutoverSafe gate (BE-1106) as the remaining schema-level divergences after PRs A-D landed and got synced. * openapi: add type enum to Workspace schema (cutover follow-up) Cloud's Workspace runtime shape includes a 'type' field with enum [personal, team] that vendor's Workspace was missing. Cloud handlers reference the generated ingest.WorkspaceType Go enum. Same kind of surgical addition as JobEntry.status / BillingStatus / JobDetailResponse.status in this PR — adds cloud-runtime field to existing vendor schema.	2026-05-22 18:23:22 -07:00