1 scriptable-latent
Balazs Horvath edited this page 2026-04-18 10:29:30 +02:00

ScriptableLatent Node

Overview

The ScriptableLatent node lets you write Python scripts that generate latent representations for diffusion models. A latent is a compressed representation of an image in the model's internal space, typically a tensor of shape [batch, channels, height, width], where channels is usually 4 for Stable Diffusion models.

What is a Latent?

When diffusion models generate images, they don't work directly with pixels. Instead, they work in a compressed "latent space" that's much smaller than the original image. This makes the process faster and more efficient.

Think of it like this:

  • A 512x512 RGB image has 786,432 values (3 channels × 512 × 512 pixels)
  • The matching latent has only 16,384 values (4 channels × 64 × 64, assuming the 8× spatial downscale typical of Stable Diffusion VAEs), and each value is more information-dense

The latent space captures the essential structure of an image. Pixel-level detail is reconstructed when the model's VAE decodes the latent back into an image.
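
To make the size comparison concrete, the counts can be worked out directly. This is a minimal sketch that assumes an 8× spatial downscale and 4 latent channels, which is typical of Stable Diffusion VAEs; other model families may use different factors, and the function names are illustrative only.

```python
# Assumed values: 8x VAE downscale and 4 latent channels (typical for
# Stable Diffusion; other model families may differ).
IMAGE_CHANNELS = 3
LATENT_CHANNELS = 4
DOWNSCALE = 8


def image_value_count(width, height):
    """Number of scalar values in an RGB image."""
    return IMAGE_CHANNELS * height * width


def latent_value_count(width, height):
    """Number of scalar values in the matching latent."""
    return LATENT_CHANNELS * (height // DOWNSCALE) * (width // DOWNSCALE)


image_values = image_value_count(512, 512)
latent_values = latent_value_count(512, 512)
compression = image_values / latent_values
print(image_values, latent_values, compression)
```

Under these assumptions the latent holds 48 times fewer values than the image it represents.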

Inputs

| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| width | INT | 512 | 64-8192 | Width of the latent in pixels |
| height | INT | 512 | 64-8192 | Height of the latent in pixels |
| batch_size | INT | 1 | 1-64 | Number of latents to generate (for batching) |
| seed | INT | 0 | 0-4294967295 | Random seed for reproducibility |
| enable_io_helpers | BOOLEAN | False | - | Enable file I/O helper functions |
| script | STRING | (see below) | - | Python script to execute |

Outputs

  • latent: A dictionary with key "samples" containing a tensor of shape [batch_size, 4, height, width] with dtype float32

Default Script

# Scriptable Latent: set 'output_latent' (dict with 'samples' tensor [B,C,H,W]).
# Injected: width, height, batch_size, seed, torch, np.
# Example - empty latent:
#   output_latent = {'samples': torch.zeros((batch_size, 4, height, width), dtype=torch.float32)}

output_latent = {'samples': torch.zeros((batch_size, 4, height, width), dtype=torch.float32)}

How the Script Works

The default script creates an empty latent filled with zeros. Let's break down what each part of its single statement does:

output_latent = {'samples': torch.zeros((batch_size, 4, height, width), dtype=torch.float32)}
  • torch.zeros(): Creates a tensor filled with zeros
  • (batch_size, 4, height, width): The shape of the tensor
    • batch_size: Number of images in the batch
    • 4: Number of channels (standard for Stable Diffusion)
    • height: Height of the latent
    • width: Width of the latent
  • dtype=torch.float32: Uses 32-bit floating point numbers
  • {'samples': ...}: Wraps the tensor in a dictionary with the key "samples" (ComfyUI's expected format)
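
Before handing a latent to the rest of a workflow, it can help to confirm that it matches the format described above. The sketch below is a hypothetical helper (check_latent is not part of the node) that checks only what this page states: a dict with a "samples" key holding a float32 tensor of shape [batch_size, 4, height, width].

```python
import torch


def check_latent(latent, batch_size, height, width, channels=4):
    """Return a list of problems; an empty list means the latent looks valid.

    Hypothetical helper for illustration, not part of the node.
    """
    if not isinstance(latent, dict) or "samples" not in latent:
        return ["expected a dict with a 'samples' key"]
    samples = latent["samples"]
    if not isinstance(samples, torch.Tensor):
        return ["'samples' must be a torch.Tensor"]

    problems = []
    expected = (batch_size, channels, height, width)
    if tuple(samples.shape) != expected:
        problems.append(f"shape {tuple(samples.shape)} != expected {expected}")
    if samples.dtype != torch.float32:
        problems.append(f"dtype {samples.dtype} != torch.float32")
    return problems


good = {"samples": torch.zeros((2, 4, 8, 16), dtype=torch.float32)}
bad = {"samples": torch.zeros((2, 4, 8, 16), dtype=torch.float64)}
good_problems = check_latent(good, batch_size=2, height=8, width=16)
bad_problems = check_latent(bad, batch_size=2, height=8, width=16)
print(good_problems, bad_problems)
```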

Available Variables

Your script has access to these variables:

  • width (int): The width parameter
  • height (int): The height parameter
  • batch_size (int): The batch_size parameter
  • seed (int): The seed parameter
  • torch: The PyTorch library
  • np: The NumPy library

If enable_io_helpers is True, you also get:

  • get_output_dir(): Returns ComfyUI's output directory path
  • get_input_dir(): Returns ComfyUI's input directory path
  • get_temp_dir(): Returns ComfyUI's temporary directory path
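
As an illustration of what the directory helpers make possible, the sketch below writes a tensor to a directory and reads it back. It assumes the helpers return an ordinary path string; save_and_reload and the file name are hypothetical, and a temporary directory stands in for get_output_dir() so the example also runs outside the node.

```python
import os
import tempfile

import torch


def save_and_reload(samples, directory, filename="latent_example.pt"):
    """Write a tensor to `directory` and read it back (hypothetical helper)."""
    path = os.path.join(directory, filename)
    torch.save(samples, path)
    return torch.load(path)


# Inside the node, with enable_io_helpers on, the directory would come from
# get_output_dir(); a temporary directory stands in for it here.
with tempfile.TemporaryDirectory() as directory:
    original = torch.ones((1, 4, 8, 8), dtype=torch.float32)
    restored = save_and_reload(original, directory)

output_latent = {"samples": restored}
```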

Example: Latent with Random Noise

import torch

# Create a latent of Gaussian noise with standard deviation 0.5 (values are not bounded)
noise = torch.randn(batch_size, 4, height, width) * 0.5
output_latent = {'samples': noise}

Explanation:

  • torch.randn(): Generates random numbers from a normal distribution (mean=0, std=1)
  • * 0.5: Scales the values to have standard deviation 0.5 (common for diffusion models)
  • About 99.7% of the values fall between -1.5 and 1.5 (3 standard deviations); the distribution is unbounded, so a few values land outside that range
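
The scaling can be checked empirically. This is a small sketch with fixed stand-in sizes rather than the node's injected variables; it measures the standard deviation of the scaled noise and the share of values that fall within three standard deviations.

```python
import torch

torch.manual_seed(0)  # fixed seed so the check is repeatable
noise = torch.randn(4, 4, 64, 64) * 0.5

measured_std = noise.std().item()
fraction_within_3_sigma = (noise.abs() <= 1.5).float().mean().item()
print(f"std={measured_std:.3f}, within ±1.5: {fraction_within_3_sigma:.4f}")
```

The measured standard deviation should sit close to 0.5, and nearly all values should fall inside ±1.5.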

Example: Latent with Gradient

import torch

# Create a latent with a horizontal gradient
x_coords = torch.linspace(-1, 1, width).view(1, 1, 1, width).expand(batch_size, 4, height, width)
output_latent = {'samples': x_coords}

Explanation:

  • torch.linspace(-1, 1, width): Creates a sequence of width values evenly spaced from -1 to 1
  • .view(1, 1, 1, width): Reshapes to have shape [1, 1, 1, width]
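  • .expand(batch_size, 4, height, width): Broadcasts to the full latent shape. The result is a view that repeats the values without copying memory; call .clone() on it if a later step needs an independent, writable tensor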
  • This creates a gradient that varies from left to right across the latent
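
The difference between an expanded view and an independent copy can be seen in a short sketch. The sizes below are small stand-ins for the injected variables, and the variable names are illustrative.

```python
import torch

batch_size, height, width = 2, 4, 8  # small stand-in sizes

row = torch.linspace(-1, 1, width).view(1, 1, 1, width)
view = row.expand(batch_size, 4, height, width)  # no copy, shares storage with `row`
owned = view.clone()                             # independent, writable copy

# Writing to the copy leaves the original row untouched.
owned[0, 0, 0, 0] = 5.0
print(view.shape, view.is_contiguous(), row[0, 0, 0, 0].item())
```

Because the expanded view maps many positions onto the same memory, writing to it in place is unsafe; the cloned tensor behaves like any ordinary tensor.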

Example: Latent with Centered Gaussian

import torch

# Create coordinates
y_coords = torch.linspace(-1, 1, height)
x_coords = torch.linspace(-1, 1, width)
Y, X = torch.meshgrid(y_coords, x_coords, indexing='ij')

# Create a Gaussian blob in the center
sigma = 0.3
gaussian = torch.exp(-(X**2 + Y**2) / (2 * sigma**2))

# Expand to latent shape
gaussian = gaussian.unsqueeze(0).unsqueeze(0).expand(batch_size, 4, height, width)
output_latent = {'samples': gaussian}

Explanation:

  • torch.meshgrid(): Creates a 2D grid of coordinates
  • X**2 + Y**2: Computes the squared distance from the center
  • torch.exp(-(X**2 + Y**2) / (2 * sigma**2)): Applies the Gaussian formula
    • The Gaussian function is: exp(-x² / (2σ²))
    • It peaks at 0 (the center) and decays smoothly
    • sigma controls how spread out the blob is
  • .unsqueeze(0).unsqueeze(0): Adds two dimensions at the front
  • .expand(): Repeats the pattern across all channels and batch items
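
A quick way to confirm the grid orientation and the final shape is to run the same steps with deliberately non-square sizes. This is a sketch with stand-in values for height, width, and batch_size.

```python
import torch

height, width = 4, 6  # deliberately non-square stand-in sizes
batch_size = 2

y_coords = torch.linspace(-1, 1, height)
x_coords = torch.linspace(-1, 1, width)
Y, X = torch.meshgrid(y_coords, x_coords, indexing='ij')

# With indexing='ij', both grids have shape [height, width]:
# Y changes down the rows, X changes across the columns.
blob = torch.exp(-(X**2 + Y**2) / (2 * 0.3**2))
samples = blob.unsqueeze(0).unsqueeze(0).expand(batch_size, 4, height, width)
print(Y.shape, X.shape, samples.shape)
```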

Mathematical Background: The Gaussian Function

The Gaussian function (also called a bell curve) is defined as:

f(x) = exp(-x² / (2σ²))

Where:

  • x is the distance from the center
  • σ (sigma) is the standard deviation, which controls the width

Key properties:

  • Maximum value is 1 at x=0 (the center)
  • Approaches 0 as x gets far from the center
  • The inflection points (where the curve changes from convex to concave) are at x = ±σ
  • About 68% of the area under the curve is within ±σ
  • About 95% is within ±2σ
  • About 99.7% is within ±3σ

In the example above, we use a 2D Gaussian:

f(x, y) = exp(-(x² + y²) / (2σ²))

This creates a circular blob that's brightest in the center and fades smoothly toward the edges.
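
The key properties listed above can be verified numerically with the standard library alone. In this sketch, gaussian and area_fraction_within are illustrative names; the area fraction uses the error function, since the share of area within ±kσ equals erf(k / √2) regardless of σ.

```python
import math


def gaussian(x, sigma):
    """Unnormalized Gaussian: exp(-x^2 / (2 sigma^2))."""
    return math.exp(-(x ** 2) / (2 * sigma ** 2))


def area_fraction_within(k):
    """Fraction of the total area within k standard deviations of the center."""
    return math.erf(k / math.sqrt(2))


sigma = 0.3
peak = gaussian(0.0, sigma)
at_one_sigma = gaussian(sigma, sigma)
fractions = [area_fraction_within(k) for k in (1, 2, 3)]
print(peak, at_one_sigma, fractions)
```

The value at x = σ is exp(-0.5), roughly 0.61 of the peak, which is a handy reference when choosing sigma for a blob of a given visible size.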

Common Patterns

Pattern 1: Initialize from Existing Latent

# If you have an input latent, you can modify it
# (This requires connecting a latent to an input, which ScriptableLatent doesn't have by default)
# See ScriptableConditioning for working with inputs

Pattern 2: Combine Multiple Sources

import torch

# Combine noise with a structured pattern (one full sine period across the width)
noise = torch.randn(batch_size, 4, height, width) * 0.3
pattern = torch.sin(torch.linspace(0, 2 * torch.pi, width)).view(1, 1, 1, width).expand(batch_size, 4, height, width)
combined = noise + pattern * 0.5
output_latent = {'samples': combined}

Pattern 3: Use Seed for Reproducibility

import torch

# Set the random seed for reproducible results
torch.manual_seed(seed)
noise = torch.randn(batch_size, 4, height, width) * 0.5
output_latent = {'samples': noise}
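
torch.manual_seed() changes PyTorch's global random state, which can affect other code that draws random numbers afterwards. An alternative, sketched here with the illustrative helper seeded_noise, is to use a local torch.Generator so that only this script's noise depends on the seed.

```python
import torch


def seeded_noise(shape, seed, scale=0.5):
    """Noise from a local generator, leaving the global RNG untouched."""
    generator = torch.Generator().manual_seed(seed)
    return torch.randn(shape, generator=generator) * scale


first = seeded_noise((1, 4, 8, 8), seed=42)
second = seeded_noise((1, 4, 8, 8), seed=42)
other = seeded_noise((1, 4, 8, 8), seed=43)
print(torch.equal(first, second), torch.equal(first, other))
```

Inside the node, the call would look like output_latent = {'samples': seeded_noise((batch_size, 4, height, width), seed)}.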

Error Handling

The node includes error handling:

  • If the script fails to execute, it returns a zero latent
  • If the script doesn't set output_latent, it returns a zero latent
  • If the output format is invalid, it returns a zero latent

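This ensures your workflow won't crash if there's a script error. The trade-off is that failures are quiet: an all-zero latent where you expected something else is a sign that the script failed.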
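
The fallback behavior described above could be implemented along the following lines. This is a hypothetical sketch, not the node's actual code: run_latent_script is an invented name, the namespace omits np to keep the example small, and the real node may validate more (or less) than this.

```python
import torch


def run_latent_script(script, width, height, batch_size, seed):
    """Execute `script` and return its latent, or a zero latent on any problem.

    Hypothetical sketch; the node's real implementation may differ.
    """
    fallback = {
        "samples": torch.zeros((batch_size, 4, height, width), dtype=torch.float32)
    }
    namespace = {
        "width": width,
        "height": height,
        "batch_size": batch_size,
        "seed": seed,
        "torch": torch,
    }
    try:
        exec(script, namespace)
    except Exception as error:
        print(f"script failed: {error}")
        return fallback

    latent = namespace.get("output_latent")
    if not isinstance(latent, dict) or not isinstance(latent.get("samples"), torch.Tensor):
        return fallback
    return latent


ok = run_latent_script(
    "output_latent = {'samples': torch.ones(batch_size, 4, height, width)}", 8, 8, 1, 0
)
broken = run_latent_script("output_latent = 1 / 0", 8, 8, 1, 0)
missing = run_latent_script("x = 1", 8, 8, 1, 0)
```

Each of the three failure cases listed above maps to one branch: the except clause, the missing output_latent, and the type check on the result.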

Security Considerations

This node uses exec() to run your script, so the script runs with the full permissions of the ComfyUI process. This is intentionally powerful for local development, but the node should not be used in production or exposed to untrusted users.

Related Nodes

  • ScriptableEmptyLatent: Similar node for creating empty latents
  • ScriptableImage: For generating images directly instead of latents
  • ScriptableNoise: For generating noise tensors for samplers