Balazs Horvath edited this page 2026-04-18 10:52:18 +02:00

ScriptableImage Node

Overview

The ScriptableImage node allows you to write Python scripts that generate images. Unlike ScriptableLatent which works in the compressed latent space, ScriptableImage generates actual pixel data that can be viewed directly or saved as images.

Image Format in ComfyUI

ComfyUI uses a specific image format: a PyTorch tensor with shape [batch, height, width, channels] where:

  • batch: Number of images (use 1 for single images, >1 for animations)
  • height: Image height in pixels
  • width: Image width in pixels
  • channels: Usually 3 for RGB (red, green, blue)

Values are floating-point numbers between 0.0 and 1.0, where:

  • 0.0 = black (or absence of that color)
  • 1.0 = full intensity of that color
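
To make the layout and value range concrete, here is a minimal sketch that converts 8-bit pixel values (0-255) into this format. The names pixels_u8 and image are illustrative only and are not provided by the node.

```python
import torch

# Hypothetical 2x2 RGB image in 8-bit form, laid out as [height, width, channels]
pixels_u8 = torch.tensor(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [255, 255, 255]]],
    dtype=torch.uint8,
)

# Scale to floats in [0, 1] and add the leading batch dimension
image = (pixels_u8.float() / 255.0).unsqueeze(0)

print(image.shape)  # torch.Size([1, 2, 2, 3])
print(image.dtype)  # torch.float32
```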

Inputs

Parameter          Type     Default      Range          Description
width              INT      512          64-8192        Width of the image in pixels
height             INT      512          64-8192        Height of the image in pixels
seed               INT      0            0-4294967295   Random seed for reproducibility
enable_io_helpers  BOOLEAN  False        -              Enable file I/O helper functions
script             STRING   (see below)  -              Python script to execute

Outputs

  • image: A tensor of shape [batch, height, width, 3] with dtype float32, values in range [0, 1]

Animation Support

The ScriptableImage node supports both still images and animations:

  • Still image: Set batch = 1 in the output tensor
  • Animation: Set batch > 1 in the output tensor (e.g., 30 frames)

Animations can be processed by ComfyUI's video nodes like SaveWEBM or SaveVideo.

Default Script

# Scriptable Image: set 'output_image' (torch.Tensor [B,H,W,C] float 0-1).
# Injected: width, height, seed, torch, np, PIL.
# For animation: set 'output_image' as batch tensor [B,H,W,C] where B > 1.
# Example - single frame (still image):
#   output_image = torch.zeros((1, height, width, 3), dtype=torch.float32)
#   output_image[:, :, :, 0] = 1.0  # Red channel
# Example - animation (batch):
#   frames = []
#   for t in range(10):
#       frame = torch.sin(torch.linspace(0, 2*np.pi, width) + t/5).unsqueeze(0).repeat(height, 1, 1)
#       frame = torch.stack([frame, frame, frame], dim=-1)
#       frames.append(frame)
#   output_image = torch.stack(frames, dim=0)

output_image = torch.zeros((1, height, width, 3), dtype=torch.float32)
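
Traced by hand, the commented animation example appears to produce a 5-D tensor: .repeat(height, 1, 1) on a [1, width] tensor yields [height, 1, width], and the sine values lie in [-1, 1] rather than [0, 1]. How the node treats such output is not specified on this page. The following is a minimal sketch of the same idea that builds [B, H, W, 3] in [0, 1] directly; the small width and height values are stand-ins for the injected variables.

```python
import math
import torch

width, height = 64, 48  # stand-ins for the injected variables

frames = []
for t in range(10):
    # One row of a travelling sine wave, rescaled from [-1, 1] to [0, 1]
    row = (torch.sin(torch.linspace(0, 2 * math.pi, width) + t / 5) + 1) / 2
    # Repeat the row down the image: [width] -> [height, width]
    frame = row.unsqueeze(0).repeat(height, 1)
    # Copy the same values into R, G and B: [height, width, 3]
    frames.append(torch.stack([frame, frame, frame], dim=-1))

output_image = torch.stack(frames, dim=0)  # [10, height, width, 3]
```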

How the Default Script Works

output_image = torch.zeros((1, height, width, 3), dtype=torch.float32)
  • torch.zeros(): Creates a tensor filled with zeros (black image)
  • (1, height, width, 3): The shape
    • 1: Single image (batch size)
    • height: Image height
    • width: Image width
    • 3: RGB channels
  • dtype=torch.float32: Uses 32-bit floating point

This creates a completely black image.

Available Variables

Your script has access to:

  • width (int): The width parameter
  • height (int): The height parameter
  • seed (int): The seed parameter
  • torch: The PyTorch library
  • np: The NumPy library
  • PIL: The Pillow image library

If enable_io_helpers is True, you also get:

  • get_output_dir(): Returns ComfyUI's output directory path
  • get_input_dir(): Returns ComfyUI's input directory path
  • get_temp_dir(): Returns ComfyUI's temporary directory path
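
As an illustration of how these helpers might be used, the sketch below writes a small preview PNG into the output directory. It assumes get_output_dir() takes no arguments and returns a path string, as described above. The fallback to a temporary directory and the file name are assumptions added so the snippet also runs outside the node. Inside the node the script would still need to set output_image.

```python
import os
import tempfile

import numpy as np
from PIL import Image

# get_output_dir() only exists inside the node when enable_io_helpers is True;
# fall back to a temporary directory so the sketch also runs standalone.
try:
    out_dir = get_output_dir()
except NameError:
    out_dir = tempfile.mkdtemp()

# A small solid-red frame in 8-bit [height, width, 3] form
frame_u8 = np.zeros((32, 32, 3), dtype=np.uint8)
frame_u8[:, :, 0] = 255

# "scriptable_preview.png" is a hypothetical file name
path = os.path.join(out_dir, "scriptable_preview.png")
Image.fromarray(frame_u8).save(path)
```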

Example: Solid Red Image

import torch

# Create a black image
output_image = torch.zeros((1, height, width, 3), dtype=torch.float32)

# Set the red channel to 1.0 (full intensity)
output_image[:, :, :, 0] = 1.0

Explanation:

  • [:, :, :, 0] selects all elements in the red channel (index 0)
  • Index 0 = red, 1 = green, 2 = blue
  • Setting it to 1.0 makes the image fully red
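
The same approach extends to any solid color. In this minimal sketch the color tensor is a hypothetical value, and broadcasting copies it into every pixel; width and height are stand-ins for the injected variables.

```python
import torch

width, height = 8, 4  # stand-ins for the injected variables

# Hypothetical color: orange as (R, G, B) in [0, 1]
color = torch.tensor([1.0, 0.5, 0.0])

# Broadcasting copies the 3-element color into every pixel
output_image = torch.zeros((1, height, width, 3), dtype=torch.float32)
output_image[:] = color
```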

Example: Gradient from Blue to Green

import torch

# Create a horizontal gradient from blue to green
gradient = torch.linspace(0, 1, width).view(1, 1, width).expand(1, height, width)
output_image = torch.zeros((1, height, width, 3), dtype=torch.float32)
output_image[:, :, :, 1] = gradient  # Green channel increases left to right
output_image[:, :, :, 2] = 1 - gradient  # Blue channel decreases left to right

Explanation:

  • torch.linspace(0, 1, width): Creates values from 0 to 1 across the width
  • .view(1, 1, width): Reshapes to [1, 1, width] so the values run along the width axis
  • .expand(1, height, width): Repeats that row for every image line, matching the [batch, height, width] shape of one channel
  • Green channel: increases from 0 (left) to 1 (right)
  • Blue channel: decreases from 1 (left) to 0 (right)
  • Result: blue on left, green on right, with a smooth transition
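
The gradient idea generalizes to a blend between any two colors. The sketch below is one possible way to do it, using a per-column blend weight; the names left, right, w and row are illustrative, and width and height stand in for the injected variables.

```python
import torch

width, height = 16, 4  # stand-ins for the injected variables

# Hypothetical endpoint colors as (R, G, B)
left = torch.tensor([0.0, 0.0, 1.0])   # blue
right = torch.tensor([0.0, 1.0, 0.0])  # green

# Blend weight per column, shaped [width, 1] so it broadcasts against [3]
w = torch.linspace(0, 1, width).unsqueeze(1)
row = (1 - w) * left + w * right  # [width, 3]

# Repeat the row for every image line and add the batch dimension
output_image = row.expand(1, height, width, 3).contiguous()
```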

Example: Checkerboard Pattern

import torch

# Create a checkerboard pattern
tile_size = 32
x_indices = torch.arange(width) // tile_size
y_indices = torch.arange(height) // tile_size
checkerboard = (x_indices.unsqueeze(0) + y_indices.unsqueeze(1)) % 2

# Expand to RGB (same value in every channel gives black and white)
output_image = torch.zeros((1, height, width, 3), dtype=torch.float32)
output_image[:, :, :, 0] = checkerboard.float()  # Red channel
output_image[:, :, :, 1] = checkerboard.float()  # Green channel
output_image[:, :, :, 2] = checkerboard.float()  # Blue channel

Explanation:

  • torch.arange(width) // tile_size: Maps each column to the index of the tile it falls in
  • x_indices.unsqueeze(0): Reshapes to a row vector of shape [1, width]
  • y_indices.unsqueeze(1): Reshapes to a column vector of shape [height, 1]
  • Adding them broadcasts to a 2D grid of shape [height, width]
  • % 2: Modulo 2 gives alternating 0 and 1
  • Setting all RGB channels to the same value creates a black-and-white checkerboard
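
To see the broadcasting step on a grid small enough to read, the following sketch runs the same index arithmetic with tiny, made-up dimensions and prints the resulting 0/1 pattern.

```python
import torch

# Tiny stand-in dimensions so the whole grid can be printed
width, height, tile_size = 6, 4, 2

x_indices = torch.arange(width) // tile_size   # tensor([0, 0, 1, 1, 2, 2])
y_indices = torch.arange(height) // tile_size  # tensor([0, 0, 1, 1])

# [1, width] + [height, 1] broadcasts to [height, width]
checkerboard = (x_indices.unsqueeze(0) + y_indices.unsqueeze(1)) % 2

for row in checkerboard.tolist():
    print(row)
# [0, 0, 1, 1, 0, 0]
# [0, 0, 1, 1, 0, 0]
# [1, 1, 0, 0, 1, 1]
# [1, 1, 0, 0, 1, 1]
```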

Example: Plasma Animation

import torch

# Generate plasma effect animation
frames = []
num_frames = 30

for t in range(num_frames):
    # Create coordinate grids; passing y first with indexing='ij' gives shape [height, width]
    x = torch.linspace(0, 2 * torch.pi, width)
    y = torch.linspace(0, 2 * torch.pi, height)
    Y, X = torch.meshgrid(y, x, indexing='ij')
    
    # Plasma formula: sum of sine waves with time variation
    plasma = torch.sin(X + t/5) + torch.sin(Y + t/5) + torch.sin((X + Y)/2 + t/5)
    
    # Normalize from [-3, 3] to [0, 1]
    plasma = (plasma + 3) / 6
    
    # Convert to RGB with color cycling
    r = torch.sin(plasma * torch.pi + t/10)
    g = torch.sin(plasma * torch.pi + t/10 + 2*torch.pi/3)
    b = torch.sin(plasma * torch.pi + t/10 + 4*torch.pi/3)
    
    # Normalize from [-1, 1] to [0, 1]
    rgb = torch.stack([r, g, b], dim=-1)
    rgb = (rgb + 1) / 2
    
    frames.append(rgb)

# Stack frames into batch tensor
output_image = torch.stack(frames, dim=0)  # Shape: [30, height, width, 3]

Explanation:

This is a classic plasma effect. Let's break down the mathematics:

Step 1: Coordinate Grid

x = torch.linspace(0, 2 * torch.pi, width)
y = torch.linspace(0, 2 * torch.pi, height)
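Y, X = torch.meshgrid(y, x, indexing='ij')
  • Creates a 2D grid where X and Y range from 0 to 2π
  • meshgrid creates coordinate matrices for every pixel; passing y first with indexing='ij' yields tensors of shape [height, width], which matches ComfyUI's image layout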

Step 2: Plasma Formula

plasma = torch.sin(X + t/5) + torch.sin(Y + t/5) + torch.sin((X + Y)/2 + t/5)

The plasma effect comes from summing three sine waves:

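  • sin(X + t/5): Horizontal wave; as t grows the pattern shifts toward smaller X (to the left)
  • sin(Y + t/5): Vertical wave; as t grows the pattern shifts toward smaller Y (upward, since row 0 is the top of the image)
  • sin((X + Y)/2 + t/5): Diagonal wave
  • t/5: Controls animation speed (a larger divisor gives slower motion)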

Step 3: Normalization

plasma = (plasma + 3) / 6
  • Each sine wave ranges from -1 to 1
  • Sum of three waves ranges from -3 to 3
  • Adding 3 shifts to [0, 6]
  • Dividing by 6 normalizes to [0, 1]
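
The same shift-and-scale step can be written as a small reusable helper. This is a sketch only: rescale is a hypothetical name, not part of the node, and it assumes the bounds lo and hi are known in advance (here the theoretical limits -3 and 3).

```python
import torch

def rescale(values, lo, hi):
    """Hypothetical helper: map values from [lo, hi] linearly onto [0, 1]."""
    return (values - lo) / (hi - lo)

samples = torch.tensor([-3.0, 0.0, 3.0])
print(rescale(samples, -3.0, 3.0))  # tensor([0.0000, 0.5000, 1.0000])
```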

Step 4: Color Cycling

r = torch.sin(plasma * torch.pi + t/10)
g = torch.sin(plasma * torch.pi + t/10 + 2*torch.pi/3)
b = torch.sin(plasma * torch.pi + t/10 + 4*torch.pi/3)
  • Each channel is a sine wave of the plasma value
  • The offsets (2*torch.pi/3 and 4*torch.pi/3) create phase shifts
  • Phase shifts of 120° (2π/3) and 240° (4π/3) separate the colors
  • This creates smooth color transitions as the plasma value changes

Mathematical Insight: Phase Shifts

When you add a phase shift to a sine wave:

sin(x + φ)
  • φ is the phase shift
  • A shift of 2π/3 (120°) means the wave starts at a different point in its cycle
  • This is how we get different colors at the same plasma value

The three channels are spaced 120° apart, so their phases are evenly distributed over one full cycle, which keeps the color transitions smooth.
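
One consequence of the 120° spacing can be checked numerically: the three raw sine values always cancel, so after rescaling to [0, 1] the channel sum stays at 1.5. The sketch below uses only the standard library; channels is an illustrative helper name, not part of the node.

```python
import math

def channels(x):
    """Return (r, g, b) for phase x, each rescaled from [-1, 1] to [0, 1]."""
    shifts = (0.0, 2 * math.pi / 3, 4 * math.pi / 3)
    return tuple((math.sin(x + s) + 1) / 2 for s in shifts)

for x in (0.0, 1.0, 2.5):
    r, g, b = channels(x)
    # The three raw sines cancel, so r + g + b stays at 1.5 for any x
    print(f"x={x}: r={r:.3f} g={g:.3f} b={b:.3f} sum={r + g + b:.3f}")
```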

Example: Sine Wave Interference

import torch

# Generate interference pattern animation
frames = []
num_frames = 30

for t in range(num_frames):
    # Create coordinate grids (larger range for more waves); y first gives shape [height, width]
    x = torch.linspace(0, 4 * torch.pi, width)
    y = torch.linspace(0, 4 * torch.pi, height)
    Y, X = torch.meshgrid(y, x, indexing='ij')
    
    # Three sine waves travelling in different directions
    wave1 = torch.sin(X + t/5)
    wave2 = torch.sin(Y + t/5)
    wave3 = torch.sin((X + Y)/2 + t/5)
    
    # Interference: sum of waves
    interference = wave1 + wave2 + wave3
    
    # Normalize
    interference = (interference + 3) / 6
    
    # Convert to RGB
    rgb = torch.stack([
        interference,
        torch.sin(interference * torch.pi),
        torch.cos(interference * torch.pi)
    ], dim=-1)
    rgb = (rgb + 1) / 2  # Maps [-1, 1] to [0, 1]; red and green start in [0, 1], so they end up in [0.5, 1]
    
    frames.append(rgb)

output_image = torch.stack(frames, dim=0)

Mathematical Background: Wave Interference

When two or more waves overlap, they interfere with each other:

  • Constructive interference: Waves add together (peaks align with peaks)
  • Destructive interference: Waves cancel out (peaks align with troughs)

The interference pattern is:

I = sin(X + t/5) + sin(Y + t/5) + sin((X + Y)/2 + t/5)

This creates a complex pattern of bright and dark regions that shifts over time.
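
The two interference cases can be illustrated with a pair of equal-amplitude sine waves. In this standard-library sketch the sample points xs are arbitrary: adding a wave to an identical copy doubles the amplitude, while adding a copy shifted by π cancels it.

```python
import math

xs = [i * 0.5 for i in range(10)]  # arbitrary sample points

# In phase: peaks line up with peaks
constructive = [math.sin(x) + math.sin(x) for x in xs]
# Half a cycle apart: peaks line up with troughs
destructive = [math.sin(x) + math.sin(x + math.pi) for x in xs]

peak_constructive = max(abs(v) for v in constructive)
peak_destructive = max(abs(v) for v in destructive)
print(peak_constructive)  # close to 2
print(peak_destructive)   # close to 0 (floating-point rounding only)
```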

Example: Radial Gradient

import torch

# Create a radial gradient from center
y_coords = torch.linspace(-1, 1, height)
x_coords = torch.linspace(-1, 1, width)
Y, X = torch.meshgrid(y_coords, x_coords, indexing='ij')

# Calculate distance from center
distance = torch.sqrt(X**2 + Y**2)

# Create gradient (1 at center, 0 at corners)
gradient = 1 - distance
gradient = torch.clamp(gradient, 0, 1)  # Ensure values stay in [0, 1]

# Convert to RGB (grayscale)
output_image = torch.zeros((1, height, width, 3), dtype=torch.float32)
output_image[:, :, :, 0] = gradient
output_image[:, :, :, 1] = gradient
output_image[:, :, :, 2] = gradient

Explanation:

  • Coordinates range from -1 to 1 in both directions
  • sqrt(X**2 + Y**2): Euclidean distance from center (0, 0)
  • 1 - distance: Inverts so the center is 1, the edge midpoints are 0, and the corners are about -0.41
  • torch.clamp(gradient, 0, 1): Raises the negative corner values to 0

Mathematical Background: Distance Formula

The Euclidean distance from a point (x, y) to the center (0, 0) is:

d = sqrt(x² + y²)

This is the Pythagorean theorem in 2D. For a square where x and y range from -1 to 1:

  • Center (0, 0): d = 0
  • Corner (1, 1): d = sqrt(2) ≈ 1.41
  • Edge (1, 0): d = 1
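
These three cases, together with the brightness they produce after the 1 - d inversion and clamping used in the example, can be checked with a short standard-library sketch. radial_brightness is an illustrative helper name.

```python
import math

def radial_brightness(x, y):
    """Brightness at (x, y): 1 - distance from the center, clamped to [0, 1]."""
    d = math.hypot(x, y)
    return min(max(1.0 - d, 0.0), 1.0)

points = {"center": (0.0, 0.0), "edge": (1.0, 0.0), "corner": (1.0, 1.0)}
for name, (x, y) in points.items():
    print(f"{name}: d={math.hypot(x, y):.2f} brightness={radial_brightness(x, y):.2f}")
# center: d=0.00 brightness=1.00
# edge: d=1.00 brightness=0.00
# corner: d=1.41 brightness=0.00
```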

Common Patterns

Pattern 1: Use NumPy for Mathematical Functions

import numpy as np
import torch

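# NumPy has some functions PyTorch doesn't
x = np.linspace(0, 2*np.pi, width)
y = np.linspace(0, 2*np.pi, height)
X, Y = np.meshgrid(x, y)  # default 'xy' indexing gives shape [height, width]

# sin(X) + cos(Y) lies in [-2, 2]; rescale to [0, 1]
pattern = (np.sin(X) + np.cos(Y) + 2) / 4

# Convert back to PyTorch and repeat the values across the three channels
pattern = torch.from_numpy(pattern).float()
output_image = pattern.unsqueeze(0).unsqueeze(-1).expand(1, height, width, 3).contiguous()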

Pattern 2: Use PIL for Image Operations

from PIL import Image
import numpy as np
import torch

# Create an image using PIL
img = Image.new('RGB', (width, height), color='red')

# Convert to tensor
arr = np.array(img).astype(np.float32) / 255.0
output_image = torch.from_numpy(arr).unsqueeze(0)

Pattern 3: Conditional Coloring

import torch

# Create a pattern
pattern = torch.randn(height, width)

# Color based on value
output_image = torch.zeros((1, height, width, 3), dtype=torch.float32)
output_image[:, :, :, 0] = torch.where(pattern > 0, 1.0, 0.0)  # Red where positive
output_image[:, :, :, 1] = torch.where(pattern < 0, 1.0, 0.0)  # Green where negative
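
This page does not say whether the node seeds PyTorch's global random state from the seed input. Assuming it may not, the sketch below makes the random pattern reproducible by passing an explicitly seeded torch.Generator. The values of width, height and seed are stand-ins for the injected variables.

```python
import torch

width, height, seed = 16, 8, 1234  # stand-ins for the injected variables

# A dedicated generator keeps the result independent of global random state
gen = torch.Generator().manual_seed(seed)
pattern = torch.randn(height, width, generator=gen)

output_image = torch.zeros((1, height, width, 3), dtype=torch.float32)
output_image[0, :, :, 0] = (pattern > 0).float()  # Red where positive
output_image[0, :, :, 1] = (pattern < 0).float()  # Green where negative
```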

Error Handling

The node includes error handling:

  • Invalid output types are rejected
  • Output is automatically clamped to [0, 1]
  • Invalid shapes are corrected when possible
  • Errors fall back to a black image
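
The node's actual checks are not shown on this page. As a rough illustration of what such handling could look like, the sketch below defines a hypothetical sanitize_image helper that covers the four behaviors listed above; it is not the node's implementation.

```python
import torch

def sanitize_image(value, height, width):
    """Hypothetical helper: coerce a script result into [B, H, W, 3] float32 in [0, 1]."""
    fallback = torch.zeros((1, height, width, 3), dtype=torch.float32)
    if not isinstance(value, torch.Tensor):
        return fallback                  # invalid type: fall back to a black image
    if value.dim() == 3:
        value = value.unsqueeze(0)       # [H, W, C] -> [1, H, W, C]
    if value.dim() != 4 or value.shape[-1] != 3:
        return fallback                  # shape cannot be repaired in this sketch
    return value.to(torch.float32).clamp(0.0, 1.0)

too_bright = torch.full((4, 4, 3), 2.0)
print(sanitize_image(too_bright, 4, 4).shape)  # torch.Size([1, 4, 4, 3])
```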

H.264 Compatibility

The node automatically enforces even dimensions for video compatibility:

  • H.264 video is normally encoded with 4:2:0 chroma subsampling (yuv420p), which requires both width and height to be even
  • If a dimension is odd, it is automatically cropped to the nearest even number
  • A warning is logged when this happens
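
The cropping rule can be sketched with plain tensor slicing. crop_to_even is a hypothetical helper written for illustration and may differ from what the node does internally.

```python
import torch

def crop_to_even(image):
    """Hypothetical helper: drop the last row or column when a dimension is odd."""
    _, h, w, _ = image.shape
    return image[:, : h - h % 2, : w - w % 2, :]

frames = torch.zeros((2, 101, 64, 3))
print(crop_to_even(frames).shape)  # torch.Size([2, 100, 64, 3])
```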

Security Considerations

This node uses exec() to run your script, so the script runs arbitrary Python with the same privileges as the ComfyUI process. This is intentionally powerful for local development, but the node should not be used in production or exposed to untrusted users.

Related Nodes

  • ScriptableLatent: For generating latents instead of images
  • ScriptableMask: For generating masks
  • SaveWEBM / SaveVideo: For saving animations as video files