Image-Video Generator

The Image-Video Generator is a four-step pipeline that turns a written script into a complete video with AI-generated images, voiceover audio, and scene transitions. It automates the most time-consuming part of visual content creation: producing the images and assembling them with audio into a watchable video.

This tool is designed for explainer videos, listicle videos, educational content, and any other format that relies mainly on narrated slides or scene-based visuals.

Pipeline Steps

Step 1: Script Chunking

Start by importing your video script — either paste it directly or upload a text file. The AI then analyzes your script and breaks it into logical segments, each representing a distinct scene or visual moment.

Automatic segmentation identifies natural breakpoints based on:

  • Topic transitions in the narration
  • Paragraph and sentence boundaries
  • Timing estimates for each segment
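To make the breakpoint idea concrete, here is a minimal sketch of boundary-based chunking. It is not the product's actual algorithm: the names `chunk_script`, `Segment`, and `estimate_seconds`, the 150 words-per-minute narration pace, and the 12-second cap are all assumptions chosen for illustration. Topic-transition detection is left out because it would need a language model.

```python
import re
from dataclasses import dataclass

WORDS_PER_MINUTE = 150  # assumed narration pace, not a documented value


@dataclass
class Segment:
    text: str
    est_seconds: float


def estimate_seconds(text: str, wpm: int = WORDS_PER_MINUTE) -> float:
    """Estimate narration time from the word count."""
    return round(len(text.split()) * 60 / wpm, 1)


def chunk_script(script: str, max_seconds: float = 12.0) -> list[Segment]:
    """Split on paragraph boundaries, then on sentence boundaries
    whenever a paragraph would run longer than max_seconds."""
    segments: list[Segment] = []
    for paragraph in re.split(r"\n\s*\n", script.strip()):
        paragraph = " ".join(paragraph.split())  # collapse inner whitespace
        if not paragraph:
            continue
        current: list[str] = []
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
            candidate = " ".join(current + [sentence])
            if current and estimate_seconds(candidate) > max_seconds:
                # Adding this sentence would exceed the cap: close the segment.
                text = " ".join(current)
                segments.append(Segment(text, estimate_seconds(text)))
                current = [sentence]
            else:
                current.append(sentence)
        if current:
            text = " ".join(current)
            segments.append(Segment(text, estimate_seconds(text)))
    return segments
```

A single sentence longer than the cap still becomes its own segment in this sketch, which is one reason a manual adjustment pass is useful.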

Manual adjustment lets you fine-tune the segmentation. Merge segments that should be one scene, split segments that cover multiple visual moments, or adjust the boundaries to get the exact breakdown you want.
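The merge and split adjustments can be thought of as simple list operations. The sketch below assumes segments are plain strings and that a split point is a character offset; both `merge_segments` and `split_segment` are hypothetical helpers, not part of the product.

```python
def merge_segments(segments: list[str], index: int) -> list[str]:
    """Merge the segment at `index` with the one that follows it."""
    if not 0 <= index < len(segments) - 1:
        raise IndexError("no following segment to merge with")
    merged = segments[index] + " " + segments[index + 1]
    return segments[:index] + [merged] + segments[index + 2:]


def split_segment(segments: list[str], index: int, at: int) -> list[str]:
    """Split the segment at `index` into two at character offset `at`."""
    if not 0 <= index < len(segments):
        raise IndexError("segment index out of range")
    text = segments[index]
    left, right = text[:at].strip(), text[at:].strip()
    if not left or not right:
        raise ValueError("split point must leave text on both sides")
    return segments[:index] + [left, right] + segments[index + 1:]
```

Both functions return a new list rather than editing in place, which makes it easy to keep the previous breakdown around for an undo.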

Step 2: Content Segments

With the script chunked into segments, this step lets you define the visual content for each one.

For each segment, you can:

  • Write a scene description — Describe what the viewer should see during this part of the narration
  • Set timing — Define how long each segment should display (in seconds)
  • Add notes — Include production notes or special instructions for the generation step

The AI can also suggest scene descriptions based on the script content, giving you a starting point that you can then refine.
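One way to picture what a segment holds after this step is a small record with the three editable fields. The sketch below is an assumption about shape only: `ContentSegment`, its field names, and the 5-second default are illustrative and do not reflect the product's internal data model.

```python
from dataclasses import dataclass


@dataclass
class ContentSegment:
    narration: str                 # script text from Step 1
    scene_description: str = ""    # what the viewer should see
    duration_seconds: float = 5.0  # assumed default display time
    notes: str = ""                # production notes for the generation step

    def __post_init__(self) -> None:
        if self.duration_seconds <= 0:
            raise ValueError("duration_seconds must be positive")


def needs_description(segments: list[ContentSegment]) -> list[int]:
    """Return the indexes of segments that still lack a scene description."""
    return [i for i, s in enumerate(segments) if not s.scene_description.strip()]
```

A check like `needs_description` is a convenient gate before moving on, since image generation in the next step works from the scene description.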

Step 3: Images & Audio

This is the production step where visual and audio assets are created and assembled for each segment.

Image Generation: For each segment, the AI generates an image based on the scene description from Step 2. You can:

  • Review each generated image
  • Regenerate individual images with modified descriptions
  • Choose which AI model tier to use
  • Upload your own images instead of generating them

Audio Assignment: Attach voiceover audio to each segment. You can:

  • Upload pre-recorded voiceover clips for each segment
  • Use AI-generated voiceover (if configured with an audio provider)
  • Adjust timing to sync images with audio
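Syncing an image with its audio mostly comes down to making sure the image stays on screen at least as long as the voiceover clip. The sketch below illustrates that rule; `synced_duration` and the 0.3-second padding are assumptions, not documented behavior.

```python
from typing import Optional


def synced_duration(
    audio_seconds: float,
    configured_seconds: Optional[float] = None,
    padding: float = 0.3,  # assumed breathing room after the voiceover ends
) -> float:
    """Return how long the image should stay on screen for one segment."""
    if audio_seconds < 0 or padding < 0:
        raise ValueError("audio_seconds and padding must be non-negative")
    needed = audio_seconds + padding
    if configured_seconds is None:
        return needed
    # Never cut the image before the audio has finished.
    return max(configured_seconds, needed)
```

For example, a 4-second clip with 0.5 seconds of padding needs 4.5 seconds on screen, so a configured timing of 3 seconds would be raised to 4.5, while a configured timing of 6 seconds would be kept.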

Preview: Preview individual segments with their image and audio to check that everything looks and sounds right before final assembly.

Step 4: Video Generation

The final step compiles all segments into a complete video.

  • Assembly — The AI combines all segment images and audio into a single video with transitions
  • Real-time progress — A live progress tracker shows the compilation status as it runs
  • Download — Once complete, download the finished video in a standard format ready for upload to YouTube
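The assembly step can be pictured as laying segments end to end on a timeline, with each transition overlapping the end of one segment and the start of the next. This is a sketch under assumptions: the product does not document its transition type or length, so the crossfade model and the 0.5-second default in `build_timeline` are hypothetical.

```python
def build_timeline(
    durations: list[float], transition: float = 0.5
) -> list[tuple[float, float]]:
    """Return (start, end) times in seconds for each segment.

    Consecutive segments overlap by `transition` seconds, as they
    would during a crossfade.
    """
    timeline: list[tuple[float, float]] = []
    start = 0.0
    for duration in durations:
        if duration <= transition:
            raise ValueError("each segment must be longer than the transition")
        timeline.append((start, start + duration))
        # The next segment begins while this one is still fading out.
        start += duration - transition
    return timeline


timeline = build_timeline([4.0, 6.0, 5.0])
total_seconds = timeline[-1][1]  # 14.0: 15 seconds of segments minus two overlaps
```

Under this model the finished video is slightly shorter than the sum of the segment timings, which is worth keeping in mind when matching a target length.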

Project Management

The Image-Video Generator supports multiple concurrent projects:

  • Project list — See all your projects with their current step and completion status
  • Auto-save — Progress is saved at every step automatically
  • Resume — Return to any project and pick up exactly where you left off
  • Delete — Remove projects you no longer need
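Auto-save and resume amount to persisting the project's current step and segments after every change, then reading them back later. The sketch below shows one way to do that with a JSON file. The step identifiers, the file layout, and the `save_project` and `load_project` helpers are all hypothetical; the product's actual storage is not documented here.

```python
import json
from pathlib import Path

# Hypothetical identifiers for the four pipeline steps.
STEPS = ["script_chunking", "content_segments", "images_audio", "video_generation"]


def save_project(path: Path, name: str, step: str, segments: list[dict]) -> None:
    """Write the project state to disk, replacing any earlier save."""
    if step not in STEPS:
        raise ValueError(f"unknown step: {step}")
    state = {"name": name, "step": step, "segments": segments}
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)  # swap in the new file only after it is fully written


def load_project(path: Path) -> dict:
    """Read a saved project so work can resume at its recorded step."""
    return json.loads(path.read_text(encoding="utf-8"))
```

Writing to a temporary file first means an interrupted save leaves the previous state intact rather than a half-written file.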

Requirements

This feature must be enabled for your workspace, and your role needs permission to use it. Availability depends on your plan, so check your subscription tier. Contact your workspace admin if you can't access it.

Credits

Credits are consumed for each image generated. The final video compilation step does not consume any additional credits. The total credit cost depends on the number of segments and the AI model tier used for image generation.
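A rough estimate of credit usage follows from the rule that each generated image costs credits and compilation costs nothing. In the sketch below, the tier names and per-image costs in `TIER_COST` are placeholders, not real pricing. It also assumes that regenerating an image counts as another generation and that uploading your own image consumes no credits; check your plan for the actual rules.

```python
# Placeholder tier names and per-image costs; not real pricing.
TIER_COST = {"standard": 1, "premium": 3}


def estimate_credits(
    num_segments: int,
    tier: str,
    regenerations: int = 0,  # assumed to cost the same as a first generation
    uploaded: int = 0,       # segments using your own image, assumed free
) -> int:
    """Estimate total credits for a project at one image per segment."""
    if tier not in TIER_COST:
        raise ValueError(f"unknown tier: {tier}")
    if uploaded > num_segments:
        raise ValueError("uploaded cannot exceed num_segments")
    generated = num_segments - uploaded + regenerations
    return generated * TIER_COST[tier]
```

With these placeholder values, a 10-segment project on the higher tier with 3 uploaded images and 2 regenerations would come to (10 - 3 + 2) × 3 = 27 credits.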