VO to Images API

REST API reference for the VO to Images pipeline. For the product overview and UI walkthrough, see VO to Images.

VO to Images is available via the Flokan public API. Because the pipeline is multi-step and long-running, the API is split into a small set of endpoints you call in sequence — there is no single "do it all" call that returns finished images.

Authentication

All requests require a workspace API key minted in Workspace Settings → API Keys. Pass the key in the Authorization header:

Authorization: Bearer flk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

The key must include the vo_to_images scope, and the user who created the key must have the Access VO to Images permission in the workspace.
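As a minimal sketch, the header can be wrapped in a small shell function so that later calls stay short. The function name `flokan_api`, the `FLOKAN_BASE` variable, and the `FLOKAN_KEY` environment variable are illustrative conveniences, not part of the API; only the header format and the base URL come from this reference.

```shell
# Hypothetical helper: flokan_api METHOD ENDPOINT [extra curl args...]
# Assumes the API key is exported as FLOKAN_KEY (an illustrative variable name).
FLOKAN_BASE="https://app.flokan.com/api/v1/vo-to-images"

flokan_api() {
  method=$1
  endpoint=$2
  shift 2
  # Any remaining arguments (for example -d '{"..."}') are passed through to curl.
  curl -s -X "$method" "${FLOKAN_BASE}${endpoint}" \
    -H "Authorization: Bearer ${FLOKAN_KEY}" \
    -H "Content-Type: application/json" \
    "$@"
}
```

With this in place, `flokan_api GET /presets` would list presets, and `flokan_api POST /projects -d '{"title":"Demo"}'` would create a project.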

Base URL

https://app.flokan.com/api/v1/vo-to-images

End-to-end flow

The shortest path from "I have an audio URL" to "I have generated images" is:

GET /presets (optional) — read your saved settings presets and pick one.
POST /projects — create a project (optionally with a preset_id).
POST /projects/{projectId}/audio — point the project at an audio URL the server can fetch.
POST /projects/{projectId}/auto-run — kicks off transcription + segmentation. Returns a list of segment_ids.
For each segment_id, POST /projects/{projectId}/segments/{segmentId}/prompts — generates prompts for that segment. Safe to fan out in parallel (10 at a time is a good ceiling).
POST /projects/{projectId}/finalize — once all segments have prompts, this enqueues image generation jobs.
GET /projects/{projectId}/auto-status — poll every 5–10 seconds until is_done: true.
GET /projects/{projectId} — read the finished sentences with their image_urls.

If generate_videos: true is set on the project, video jobs are enqueued automatically by the same worker that processes images — no separate finalize call is needed.

Endpoints

`POST /projects` — Create a project

Request body

Field	Type	Required	Notes
`title`	string	yes	1–200 characters
`workflow_mode`	string	no	`automatic` (default) or `manual`. Auto-run only works on `automatic` projects.
`preset_id`	uuid	no	Apply a saved settings preset at creation time
`art_style_id`	uuid	no	Inline override (wins over preset)
`char_gen_style_id`	uuid	no	Inline override
`detect_characters`	boolean	no	When false, auto-run skips the character-detect step
`generate_characters`	boolean	no	When true, character portraits are auto-generated
`generate_videos`	boolean	no	When true, video jobs are enqueued after images complete
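A title may contain quotes or other characters that break hand-written JSON, so one option is to let jq build the request body. The sketch below assumes jq is installed (the worked example in this reference already relies on it); the function name `create_project_body` is hypothetical, and only `title` and `preset_id` are included for brevity.

```shell
# Hypothetical helper: build the request body with jq so the title is escaped safely.
# Usage: create_project_body TITLE [PRESET_ID]
create_project_body() {
  if [ -n "${2:-}" ]; then
    jq -cn --arg title "$1" --arg preset "$2" '{title: $title, preset_id: $preset}'
  else
    jq -cn --arg title "$1" '{title: $title}'
  fi
}

# Example call (prints the new project's id, as the worked example does):
# curl -s -X POST https://app.flokan.com/api/v1/vo-to-images/projects \
#   -H "Authorization: Bearer $FLOKAN_KEY" \
#   -H "Content-Type: application/json" \
#   -d "$(create_project_body 'My "Narrated" Video')" | jq -r '.data.id'
```

The other optional fields from the table can be added to the jq object in the same way.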

`GET /projects` — List projects

Query params

Param	Type	Notes
`page`	integer	Default `1`
`per_page`	integer	Default `50`, max `100`
`status`	string	`draft`, `transcribing`, `characters`, `prompts`, `generating`, `complete`
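The query parameters above combine into an ordinary query string. As a small sketch, a helper can assemble the list URL with the documented defaults; the function name `projects_url` is hypothetical, and the shape of the list response is not shown in this reference, so the example stops at building the URL.

```shell
# Hypothetical helper: projects_url [PAGE] [PER_PAGE] [STATUS]
# Defaults mirror the documented ones: page 1, 50 per page, no status filter.
projects_url() {
  page_number=${1:-1}
  per_page=${2:-50}
  status_filter=${3:-}
  url="https://app.flokan.com/api/v1/vo-to-images/projects?page=${page_number}&per_page=${per_page}"
  if [ -n "$status_filter" ]; then
    url="${url}&status=${status_filter}"
  fi
  printf '%s\n' "$url"
}

# Example: second page of completed projects, 100 per page.
# curl -s "$(projects_url 2 100 complete)" -H "Authorization: Bearer $FLOKAN_KEY"
```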

`GET /projects/{projectId}` — Fetch project with sentences/images/videos

Query params

Param	Type	Notes
`sentences_offset`	integer	Default `0`
`sentences_limit`	integer	Default `150`, max `300`

Each sentence includes its images[] (with public image_url for completed images) and video_jobs[]. Use has_more_sentences + sentences_offset to page through long projects.
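The paging described above can be sketched as a loop that advances `sentences_offset` until `has_more_sentences` is no longer true. This assumes `has_more_sentences` sits under `.data` next to `sentences` (the exact location is not shown in this reference); the function name `fetch_all_sentences` and the `FLOKAN_KEY` variable are illustrative.

```shell
# Usage: fetch_all_sentences PROJECT_ID [PAGE_SIZE]
# Prints one compact JSON object per sentence.
# Assumption: has_more_sentences is found at .data.has_more_sentences.
fetch_all_sentences() {
  project_id=$1
  limit=${2:-150}
  offset=0
  while :; do
    response=$(curl -s "https://app.flokan.com/api/v1/vo-to-images/projects/${project_id}?sentences_offset=${offset}&sentences_limit=${limit}" \
      -H "Authorization: Bearer ${FLOKAN_KEY}")
    printf '%s' "$response" | jq -c '.data.sentences[]'
    more=$(printf '%s' "$response" | jq -r '.data.has_more_sentences')
    # Stop on false, null, or anything other than an explicit true.
    [ "$more" = "true" ] || break
    offset=$((offset + limit))
  done
}
```

For example, `fetch_all_sentences "$PROJECT" 300 | jq -r '.images[].image_url'` would print every image URL in a long project.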

`DELETE /projects/{projectId}` — Delete a project

Deletes the project, all child rows, and the underlying audio / image / video files in storage.

`POST /projects/{projectId}/audio` — Set audio from URL

The server fetches the URL and uploads the file to your workspace's voiceover bucket. There is a hard cap of 200 MB; files above it are rejected with 413. For larger files, host them somewhere reachable and pass the public URL.

Request body

Field	Type	Required	Notes
`audio_url`	string	yes	HTTP(S) URL the Flokan server can fetch (max 2000 chars)

Supported formats: MP3, WAV, M4A, AAC, OGG, FLAC, WebM.
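Because oversized files are rejected, it can be worth checking the reported size of the audio URL before calling this endpoint. The sketch below is only an approximation: it assumes the host answers HEAD requests with a `Content-Length` header, and it reads "200 MB" as 200 × 1024 × 1024 bytes, which this reference does not spell out. The names `content_length`, `audio_fits_cap`, and `MAX_AUDIO_BYTES` are hypothetical.

```shell
# Assumption: "200 MB" is interpreted as 200 * 1024 * 1024 bytes.
MAX_AUDIO_BYTES=$((200 * 1024 * 1024))

# Reads raw HTTP response headers on stdin and prints the last Content-Length value, if any.
content_length() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "content-length" { len = $2 } END { if (len != "") print len }'
}

# Usage: audio_fits_cap URL
# Returns 0 when the reported size is within the cap, 1 when it is too large,
# and 2 when the host does not report a size.
audio_fits_cap() {
  len=$(curl -sIL "$1" | content_length)
  [ -n "$len" ] || return 2
  [ "$len" -le "$MAX_AUDIO_BYTES" ]
}
```

A return value of 2 does not mean the file is too large; it only means the size could not be determined in advance.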

`POST /projects/{projectId}/auto-run` — Run prep pipeline

Transcribes the audio, optionally detects characters, recomputes initial segments, and flips auto_step to generating_prompts.

Returns 202 Accepted. The response includes segment_ids[] — call the prompts endpoint for each segment next.

Response 202

{
  "success": true,
  "data": {
    "project_id": "...",
    "auto_step": "generating_prompts",
    "sentence_count": 87,
    "duration_ms": 312450,
    "transcript": "Welcome to the channel...",
    "characters": [],
    "skipped_character_detection": true,
    "segment_ids": ["seg-1-uuid", "seg-2-uuid", "..."],
    "next_actions": [
      "For each segment_id, call POST /api/v1/vo-to-images/projects/{projectId}/segments/{segmentId}/prompts to generate prompts (parallelizable, up to 10 at a time).",
      "Once all segments have prompts, call POST /api/v1/vo-to-images/projects/{projectId}/finalize to enqueue image generation jobs.",
      "Poll GET /api/v1/vo-to-images/projects/{projectId}/auto-status to track progress."
    ]
  }
}

`POST /projects/{projectId}/segments/{segmentId}/prompts` — Generate prompts for one segment

Safe to call in parallel for many segments. Each call has its own 300-second budget so long projects don't hit a shared timeout.

Request body (all fields optional)

Field	Type	Notes
`smart_camera_angles`	boolean	Enables AI-chosen camera directions on prompts

`POST /projects/{projectId}/finalize` — Enqueue image generation

Call once every segment has prompts. The request is rejected if no sentence is in the prompt_ready or failed status.

`GET /projects/{projectId}/auto-status` — Poll progress

Cheap status endpoint — returns just the project's automatic-mode state and counts.

Response 200

{
  "success": true,
  "data": {
    "project_id": "...",
    "status": "generating",
    "auto_step": "generating_images",
    "workflow_mode": "automatic",
    "is_done": false,
    "progress": {
      "segments_total": 12,
      "sentences_total": 87,
      "prompts_ready": 87,
      "sentences_complete": 41,
      "sentences_failed": 0,
      "image_jobs_active": 16,
      "image_jobs_failed": 0,
      "video_jobs_total": 0,
      "video_jobs_complete": 0,
      "video_jobs_failed": 0
    }
  }
}

Recommended poll cadence: 5–10 seconds. is_done flips to true once every sentence is in a terminal state (and, when generate_videos is on, every video job too).
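A polling loop usually benefits from an upper bound so that a stuck project does not block a script forever. The following is a sketch under stated assumptions: the function name `wait_until_done`, the 30-minute default timeout, and the `FLOKAN_KEY` variable are illustrative choices, not API limits. The field names come from the response shown above.

```shell
# Usage: wait_until_done PROJECT_ID [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 once is_done is true, 1 if the timeout elapses first.
# INTERVAL_SECONDS must be a positive integer.
wait_until_done() {
  project_id=$1
  timeout=${2:-1800}   # illustrative default, not an API limit
  interval=${3:-8}     # within the recommended 5-10 second cadence
  waited=0
  while [ "$waited" -le "$timeout" ]; do
    response=$(curl -s "https://app.flokan.com/api/v1/vo-to-images/projects/${project_id}/auto-status" \
      -H "Authorization: Bearer ${FLOKAN_KEY}")
    if [ "$(printf '%s' "$response" | jq -r '.data.is_done')" = "true" ]; then
      return 0
    fi
    # Report progress on stderr so stdout stays clean for callers.
    printf '%s' "$response" \
      | jq -r '.data.progress | "\(.sentences_complete)/\(.sentences_total) sentences complete"' >&2
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 1
}
```

A caller might write `wait_until_done "$PROJECT" 3600 10 || echo "timed out" >&2`.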

`GET /estimate-credits` — Estimate cost before running a phase

Query params

Param	Type	Required	Notes
`project_id`	uuid	yes
`action`	string	yes	`generate_images`, `generate_prompts`, `create_characters`, or `detect_characters`

Response 200

{
  "success": true,
  "data": {
    "action": "generate_images",
    "count": 87,
    "estimated_credits": 4263.0,
    "per_item_credits": 49.0,
    "is_exact": false,
    "available_credits": 12500.0,
    "can_proceed": true
  }
}
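One way to use this endpoint is as a gate before an expensive phase. The sketch below checks `can_proceed` from the response above and returns a shell exit status; the function name `can_proceed` and the `FLOKAN_KEY` variable are hypothetical.

```shell
# Usage: can_proceed PROJECT_ID ACTION
# Returns 0 when the estimate reports can_proceed: true, non-zero otherwise.
can_proceed() {
  estimate=$(curl -s -G "https://app.flokan.com/api/v1/vo-to-images/estimate-credits" \
    --data-urlencode "project_id=$1" \
    --data-urlencode "action=$2" \
    -H "Authorization: Bearer ${FLOKAN_KEY}")
  # jq -e maps a false or null result to a non-zero exit status.
  printf '%s' "$estimate" | jq -e '.data.can_proceed == true' >/dev/null
}

# Example: only enqueue image generation when the workspace can afford it.
# if can_proceed "$PROJECT" generate_images; then
#   curl -s -X POST "https://app.flokan.com/api/v1/vo-to-images/projects/$PROJECT/finalize" \
#     -H "Authorization: Bearer $FLOKAN_KEY"
# fi
```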

`GET /presets` — List settings presets

Returns every saved preset in the workspace. Pass any preset's id to POST /projects (preset_id) at creation, or to POST /projects/{projectId}/apply-preset to load it onto an existing project.

`POST /projects/{projectId}/apply-preset` — Apply a preset to an existing project

Request body

Field	Type	Required	Notes
`preset_id`	uuid	yes	Must belong to the same workspace

workflow_mode is intentionally not merged — changing it mid-project would invalidate pipeline state.

Errors

All error responses use the envelope { "success": false, "error": "..." }.

Code	Meaning
`400`	Validation error, missing prerequisite (no audio, no art style, wrong `auto_step`, …)
`401`	Missing or invalid API key
`402`	Insufficient AI credits or storage limit exceeded
`403`	Missing scope, missing permission, or feature not enabled on the workspace plan
`404`	Project, segment, or preset not found in this workspace
`413`	Audio file exceeds the 200 MB upload cap
`429`	Rate limit exceeded — see `Retry-After` header and `X-RateLimit-*` headers
`500`	Unexpected server or provider error
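Since both success and error responses share the `success` flag, a script can check it in one place. The following sketch reads a response body on stdin; the function name `check_envelope` and the wording of the message it prints are illustrative.

```shell
# Reads a response body on stdin.
# Prints it unchanged and returns 0 when success is true;
# otherwise prints the error field to stderr and returns 1.
check_envelope() {
  body=$(cat)
  if printf '%s' "$body" | jq -e '.success == true' >/dev/null 2>&1; then
    printf '%s\n' "$body"
  else
    # Falls back to "unknown" when the body has no error field; a body that is
    # not valid JSON yields an empty message.
    printf 'Flokan API error: %s\n' \
      "$(printf '%s' "$body" | jq -r '.error // "unknown"' 2>/dev/null)" >&2
    return 1
  fi
}

# Example:
# curl -s -X POST "https://app.flokan.com/api/v1/vo-to-images/projects/$PROJECT/finalize" \
#   -H "Authorization: Bearer $FLOKAN_KEY" | check_envelope >/dev/null || exit 1
```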

Rate limits

Every response carries X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers. The per-minute limit is determined by the workspace's billing plan.
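A client that fans out many requests may want to back off when it receives a 429. The sketch below retries a curl call and sleeps for the `Retry-After` value between attempts. It assumes the header carries a number of seconds (HTTP also permits a date form, which this sketch treats as missing); the function names, the limit of 5 attempts, and the 5-second fallback are illustrative.

```shell
# Reads raw response headers on stdin and prints the Retry-After value, if present.
retry_after_seconds() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "retry-after" { print $2; exit }'
}

# Usage: curl_with_retry [curl args...]
# Retries up to 5 times on HTTP 429. Prints the final response body and
# returns 0 only for a 2xx or 3xx status.
curl_with_retry() {
  headers=$(mktemp)
  body=$(mktemp)
  attempt=1
  while :; do
    code=$(curl -s -D "$headers" -o "$body" -w '%{http_code}' "$@")
    if [ "$code" != "429" ] || [ "$attempt" -ge 5 ]; then
      break
    fi
    delay=$(retry_after_seconds < "$headers")
    # Fall back to 5 seconds when the header is missing or not a plain integer.
    case "$delay" in
      ''|*[!0-9]*) delay=5 ;;
    esac
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  cat "$body"
  rm -f "$headers" "$body"
  [ "$code" -ge 200 ] && [ "$code" -lt 400 ]
}
```

For example, `curl_with_retry -X POST "$URL" -H "Authorization: Bearer $FLOKAN_KEY"` would behave like the plain curl call but wait out rate-limit responses.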

Worked example

# 1. Create the project, applying a saved preset
PROJECT=$(curl -s -X POST https://app.flokan.com/api/v1/vo-to-images/projects \
  -H "Authorization: Bearer $FLOKAN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"title":"My Narrated Video","preset_id":"PRESET_UUID"}' \
  | jq -r '.data.id')
 
# 2. Attach audio
curl -s -X POST https://app.flokan.com/api/v1/vo-to-images/projects/$PROJECT/audio \
  -H "Authorization: Bearer $FLOKAN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"audio_url":"https://example.com/voiceover.mp3"}'
 
# 3. Run the prep phase — returns segment_ids
SEGMENTS=$(curl -s -X POST https://app.flokan.com/api/v1/vo-to-images/projects/$PROJECT/auto-run \
  -H "Authorization: Bearer $FLOKAN_KEY" \
  | jq -r '.data.segment_ids[]')
 
# 4. Generate prompts for each segment (in parallel, at most 10 at a time)
echo "$SEGMENTS" | xargs -P 10 -I{} curl -s -X POST \
  "https://app.flokan.com/api/v1/vo-to-images/projects/$PROJECT/segments/{}/prompts" \
  -H "Authorization: Bearer $FLOKAN_KEY"
 
# 5. Enqueue image generation
curl -s -X POST https://app.flokan.com/api/v1/vo-to-images/projects/$PROJECT/finalize \
  -H "Authorization: Bearer $FLOKAN_KEY"
 
# 6. Poll until done
while :; do
  DONE=$(curl -s "https://app.flokan.com/api/v1/vo-to-images/projects/$PROJECT/auto-status" \
    -H "Authorization: Bearer $FLOKAN_KEY" | jq -r '.data.is_done')
  [ "$DONE" = "true" ] && break
  sleep 8
done
 
# 7. Read the finished images
curl -s "https://app.flokan.com/api/v1/vo-to-images/projects/$PROJECT" \
  -H "Authorization: Bearer $FLOKAN_KEY" \
  | jq '.data.sentences[] | {sequence, image_urls: [.images[].image_url]}'
