Features

Everything Episio can do — from topic research to publishing. Covers all three creation modes: Single Video, Series, and Creative Series.

Format & Duration

All three creation modes support every format and duration. Set it once — the entire pipeline adapts: canvas size, caption positioning, thumbnail dimensions, and provider aspect ratios.

3 Video Formats

Portrait 9:16

1080×1920 — YouTube Shorts, TikTok, Instagram Reels

Landscape 16:9

1920×1080 — YouTube, Vimeo, Twitter

Square 1:1

1080×1080 — Instagram Feed, LinkedIn, Facebook

5 Duration Presets + Custom

Preset	Duration	Word Count	Best For
Short	~60 seconds	90–120 words	YouTube Shorts, TikTok, Reels
Medium	~2 minutes	180–240 words	Explainers, summaries
Long	~5 minutes	540–660 words	Deep topics, tutorials
Deep	~10 minutes	1140–1260 words	Documentaries, analysis
Max	~12 minutes	1380–1500 words	Full episodes, long-form

Custom duration lets you set any length. Word count scales automatically (~2 words/second). Series episodes inherit format and duration from the project settings.
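
As a rough illustration of how a custom duration could map to a script length, the sketch below applies the ~2 words/second pace mentioned above. The function name and the ±10% tolerance band are assumptions made for this example, not Episio's actual implementation.

```python
WORDS_PER_SECOND = 2.0  # approximate narration pace stated in the docs


def target_word_range(duration_seconds: float, tolerance: float = 0.1) -> tuple[int, int]:
    """Return a (low, high) word-count target for a custom duration.

    `tolerance` is a hypothetical +/- band around the 2 words/second estimate.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    center = duration_seconds * WORDS_PER_SECOND
    return round(center * (1 - tolerance)), round(center * (1 + tolerance))


# A 3-minute custom video targets roughly 360 words, give or take 10%.
low, high = target_word_range(180)
```

The presets in the table use slightly different bands per length, so treat the output as a ballpark figure.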

Topic & Research

Used in Single Video and knowledge-based Series modes. Episio fetches trending topics from multiple live sources and lets you choose or bring your own. Seven content domains focus the research:

Geopolitics

Global conflicts, diplomacy, power shifts, sanctions

Personalities

CEOs, leaders, creators, cultural figures

AI Updates

New models, launches, industry moves

Technology

Hardware, software, space, science breakthroughs

Crypto

Bitcoin, altcoins, DeFi, regulation, market moves

Markets

Stocks, earnings, macro trends, central banks

Custom

Any topic — type your own. AI researches from multiple live sources.

Research sources: Web search, Reddit, RSS feeds, Hacker News — aggregated and ranked by recency and relevance. AI synthesizes findings into your script automatically.

Synthesis topics ★: A special, AI-generated topic type that connects ideas from different fields. When one is selected, the Synthesis script style is pre-set automatically. These are often the highest-performing videos for virality because they surface non-obvious angles that few others have covered.

Narrative angle analysis: After entering a topic, click "Analyze narrative angles" — AI generates 4–5 distinct narrative frames for the same topic. Each angle changes what the story is "about" while using the same facts. Pick the one that fits your channel voice.

Document upload: Upload a PDF, DOCX, TXT, or Markdown file instead of typing a topic. Episio extracts the text and uses it as the research source — great for turning reports, articles, your own notes, or transcripts into videos.

Custom topics automatically adjust research depth based on intent — current news gets live web research, historical topics get deep synthesis, fiction skips research entirely.

Script Generation

Scripts are generated by Claude AI. Two dimensions shape the output: script style (structure) and narrator type (voice and tone). With 10 of each, that gives 100 unique combinations.

10 Script Styles

Style	Structure	Best For
Synthesis	Cross-domain connections + unique angle	Viral explainers — most shared
Suspenseful	Identity withheld, reveal at 80%	Mystery and intrigue topics
Story	4-beat narrative arc	Human interest, historical drama
True Crime	Atmospheric slow reveal	Crime, investigation, scandal
Educational	Myth → bust → insight	Science, debunking, counterintuitive facts
News	Fact → implication → impact	Current events, breaking developments
Dramatic	Emotion arc, human face on data	Finance, geopolitics with stakes
Conspiracy	3 dots connected, viewer concludes	Pattern-finding, alternative angles
Comedy	Setup → punchline, rule of 3	Light topics, tech culture, satire
Retrospective	Then vs now, looking back with wisdom	Anniversaries, legacy, nostalgia

10 Narrator Types

Narrator	Tone	Example Opener
Documentary	Authoritative, measured	"In the summer of 1947, something shifted..."
Direct Address	Confrontational, personal	"You've been lied to about this for years."
Investigative	Methodical, evidence-driven	"The documents tell a different story."
Comedy	Witty, irreverent	"So apparently, the government forgot..."
True Crime	Atmospheric, tense	"At 3am on a Tuesday, the phone rang."
Teacher	Patient, curious	"Let me ask you something surprising..."
Breaking News	Urgent, present tense	"This is happening right now."
First Person	Intimate, experiential	"I was there when it happened."
Retrospective	Wise, reflective	"Nobody saw it coming — including me."
Devil's Advocate	Contrarian, challenging	"What if everything you believe is wrong?"

Script Review — Gate Before Footage

After generation, you land on the Script Review page before any footage is generated. This is where you:

Read the full script and edit any sentence directly
Refine via AI chat — "make the opening more dramatic", "add a statistic", "cut to 60 seconds"
Switch script style or narrator type and regenerate
Approve and proceed to footage generation only when satisfied

This gate is the most important step for quality control. Footage generation is where credits are spent — a script you approve is a script worth generating. Never skip the script review.

AI-Composed Music

Episio doesn't pick music from a library. It reads the emotional arc of your script and composes an original soundtrack from scratch, matched to your video's length. Original every time. No royalty issues. No Content ID claims on YouTube.

AI composed

Original per video

Duration-matched

Exact fit, no fades

Royalty-free

No copyright claims

8 emotional profiles — you set the direction; AI composes within it:

Dramatic

Mysterious

Epic

Melancholic

Upbeat

Inspiring

Corporate

Tense

Auto-classification: AI reads your script and selects the emotional profile automatically. You can override it at any time.

Custom MP3 upload: Upload your own track — it replaces the composed score entirely. Good for branded channels with a signature sound.

Music remix: Change the profile after video generation and reassemble with one click.

Voiceover & Languages

Episio supports 17 voice languages for voiceover generation. Captions render natively in 10 languages (English + 5 Indian + 4 European). Select the language on the creation page — the voice engine auto-routes based on your choice.

Language	Voice Engine	Captions
English	ElevenLabs	✓
Hindi	Sarvam AI (auto)	✓
Telugu	Sarvam AI (auto)	✓
Tamil	Sarvam AI (auto)	✓
Kannada	Sarvam AI (auto)	✓
Malayalam	Sarvam AI (auto)	✓
Spanish, French, Portuguese, German	ElevenLabs	✓
Bengali, Arabic, Japanese, Korean, Chinese, Turkish, Indonesian	ElevenLabs	—
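
The routing in the table above can be pictured as a simple lookup. This is a sketch only: the set names and the `route_voice` function are hypothetical, and the real routing logic inside Episio is not documented here.

```python
# Language sets copied from the table above.
SARVAM_LANGUAGES = {"Hindi", "Telugu", "Tamil", "Kannada", "Malayalam"}
ELEVENLABS_LANGUAGES = {
    "English", "Spanish", "French", "Portuguese", "German",
    "Bengali", "Arabic", "Japanese", "Korean", "Chinese", "Turkish", "Indonesian",
}
CAPTION_LANGUAGES = SARVAM_LANGUAGES | {"English", "Spanish", "French", "Portuguese", "German"}


def route_voice(language: str) -> dict:
    """Return the voice engine and caption support for a language (hypothetical helper)."""
    if language in SARVAM_LANGUAGES:
        engine = "Sarvam AI"
    elif language in ELEVENLABS_LANGUAGES:
        engine = "ElevenLabs"
    else:
        raise ValueError(f"Unsupported voice language: {language}")
    return {"engine": engine, "captions": language in CAPTION_LANGUAGES}


tamil = route_voice("Tamil")        # Sarvam AI, captions supported
japanese = route_voice("Japanese")  # ElevenLabs, voiceover only
```

The two engine sets add up to the 17 voice languages, and the caption set to the 10 caption languages.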

ElevenLabs Voices

High-quality voices with distinct accents and tones.

Voice	Accent	Gender	Tone
Adam	American	Male	Deep, authoritative
Antoni	American	Male	Warm, conversational
Arnold	American	Male	Strong, confident
Bella	American	Female	Soft, engaging
Domi	American	Female	Confident, professional
Elli	American	Female	Friendly, youthful
Josh	American	Male	Clear, energetic
Rachel	American	Female	Calm, professional

Sarvam AI — Indian Languages

Native-quality voiceover for 5 Indian languages. Episio auto-routes to Sarvam AI when an Indian language is selected — no manual setup needed.

Language	Female Speaker	Male Speaker
Hindi	Neha	Rahul
Telugu	Kavya	Aditya
Tamil	Priya	Ashutosh
Kannada	Surabhi	Vikram
Malayalam	Anjali	Arjun

Voice Cloning

Upload a 30-second audio sample → Episio creates a clone via ElevenLabs. Use it across all your videos for a consistent personal brand voice. Set it as your default in Settings.

Custom Voice ID: Have an ElevenLabs voice ID not listed? Paste it directly — Episio supports any ElevenLabs voice ID.

Visual Generation

14 AI providers power footage generation — all via your FAL key. The footage director plans each scene automatically, or you can override per scene.

Quality Tiers

Tier	Credits/Video	Provider Mix	When to Use
Budget	~3 credits	Flux Ken Burns images	High-volume, drafts, testing
Standard	~8 credits	Kling i2v + Flux B-roll + Pexels	Daily content — recommended
Premium	~20 credits	Veo3 + Kling Pro + WAN 14B	Showcase, brand content

Free sources: Pexels stock footage, Archive.org historical footage, NASA imagery — all at zero credit cost. The footage director balances free and AI sources automatically.
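
To plan a batch, you can multiply the approximate per-video figures from the tier table. The sketch below is a back-of-the-envelope estimate; the helper name is made up for illustration, and actual usage varies with how many free sources the footage director picks.

```python
# Approximate credits per video, taken from the Quality Tiers table.
APPROX_CREDITS_PER_VIDEO = {"Budget": 3, "Standard": 8, "Premium": 20}


def estimate_credits(videos_per_tier: dict[str, int]) -> int:
    """Estimate total credits for a batch, e.g. {"Standard": 5, "Premium": 1}."""
    return sum(
        APPROX_CREDITS_PER_VIDEO[tier] * count
        for tier, count in videos_per_tier.items()
    )


# Ten drafts, five daily videos, and one showcase video:
# 10*3 + 5*8 + 1*20 = 90 credits (approximate).
weekly_estimate = estimate_credits({"Budget": 10, "Standard": 5, "Premium": 1})
```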

See the Provider Guide for full comparison with costs, render times, and recommended combos.

Scene Studio

Scene Studio gives you full control over any scene in your video. Open it from the review page by clicking any scene card. 9 ways to create or replace a scene:

Footage Sources

AI Video

Generate a clip from any of the 14 AI providers with a custom prompt

Upload

Upload your own video clip — drag and drop, any format

Image / Ken Burns

Upload or generate an image, apply Ken Burns motion (zoom, pan, speed)

Me (Avatar)

Use your portrait or video for a talking-head scene with lipsync

Procedural Visuals — Zero Credit Cost

Map

Interactive map with location pins — great for geopolitics and travel

Chart

6 chart types (bar, line, pie, area, radar, donut) with 7 color palettes

Comparison

Side-by-side comparison cards — ideal for versus-style content

Quote

3 styled quote layouts — minimal, bold, or cinematic

Process Flow

Step-by-step diagrams — ideal for how-it-works explanations

Scene Sub-clip Splitting

Any scene can be split into 2, 3, or 4 sub-clips. Each slot is filled independently — use AI generation for one, upload for another, or pull from Pexels. The narration text is divided proportionally across the slots. Ideal for long scenes that need visual variety.
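
One way to picture the proportional split: each slot receives a share of the narration words matching its share of the scene. The sketch below assumes the split is by word count and that each slot has a weight (for example, its duration in seconds). Both are assumptions, since the exact rule Episio uses is not specified here.

```python
def split_narration(text: str, slot_weights: list[float]) -> list[str]:
    """Divide narration words across 2-4 sub-clip slots in proportion to their weights."""
    if not 2 <= len(slot_weights) <= 4:
        raise ValueError("a scene can be split into 2, 3, or 4 sub-clips")
    total = sum(slot_weights)
    if total <= 0:
        raise ValueError("slot weights must sum to a positive number")

    words = text.split()
    parts, start, cumulative = [], 0, 0.0
    for weight in slot_weights:
        cumulative += weight
        # Cut at the cumulative share so rounding never drops or repeats a word.
        end = round(len(words) * cumulative / total)
        parts.append(" ".join(words[start:end]))
        start = end
    return parts


# A 2-second slot followed by a 6-second slot gets 1/4 and 3/4 of the words.
parts = split_narration("one two three four five six seven eight", [2, 6])
```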

Image upload → Ken Burns clip: Upload any JPG, PNG, or WebP image from Scene Studio and Episio applies Ken Burns motion (zoom in, zoom out, pan left, pan right at slow/normal/fast speed) to create a 6-second animated clip at zero AI credit cost.

Free visuals save credits

Maps, charts, comparisons, quotes, and process flows are rendered locally — zero credit cost. Assign them to data-heavy scenes to preserve credits for hero visuals.

Creative Series

Creative Series is a distinct creation mode for purely imaginative content. No topic research, no news — just a creative vision, AI-built characters, and episode after episode of cinematic storytelling.

What Makes It Different

AI World Bible

Write one paragraph — AI generates the full production bible: era, tone, color palette, camera style, music mood.

Consistent AI Characters

AI generates portrait photos for every character. Same face, same style, locked across unlimited episodes.

Brainstorm Drawer

Chat with AI before generating. Develop camera angles, scene ideas, world-building depth, character motivations, dialogue.

Ambient or Lipsync Mode

Ambient: cinematic scenes with no dialogue, background music. Lipsync: characters speak, AI syncs lip movement to voice.

The Brainstorm Drawer

A side drawer that opens while you write your episode scenario. The AI knows your full world bible and character list — every suggestion is grounded in your specific world.

Topics you can explore in the drawer:

Camera angles: Low-angle handheld, wide establishing shot, tight close-up on a prop. Specific and cinematic.
World building: Recurring visual motifs, sound design palette, era-specific details that anchor the setting.
Character depth: Motivations, relationships, unspoken conflicts, how a character would react to this scene.
Scene ideas: Episode structure, what happens, bittersweet turns, how to show emotion without dialogue.
Dialogue: Lines characters speak in lipsync mode, or narration phrasing for ambient episodes.
Mood & tone: Emotional arc, pacing notes, how the score should shift through the episode.

Two Modes

Ambient

Cinematic scenes with no dialogue. Background music and Kling native audio (ambient sounds from the scene). Best for atmospheric, visually-driven storytelling.

Lipsync

Characters speak dialogue. Kling built-in voice synthesis syncs lip movement to voice. Best for character-driven stories where dialogue carries the narrative.

What Kind of World Can You Build?

Historical fiction · Mythology · Folklore · Sci-fi · Fantasy · Slice-of-life · Cozy & nostalgic · Horror · Brand storytelling · True story

Scene Plan Review

After entering your episode scenario, AI generates a scene-by-scene plan before generating any footage. You see: scene number, duration, scene type, characters appearing, and a brief scene description. Review the plan, then click Generate Episode to commit. If the plan isn't right, regenerate with different scenario notes.

Creative Series uses Kling image-to-video with canonical character portraits to maintain face consistency across episodes — the same person looks the same in episode 1 and episode 20.

Series & World Bibles

Knowledge-based Series (for topic-driven channels) use a world bible to maintain consistent visual identity and character continuity across episodes. Every series has a world bible — the creative DNA that controls how AI generates each episode.

World Bible Sections

Section	Purpose
Color Palette	Visual mood — warm, cool, neon, muted, etc.
Camera Style	Cinematic language — close-ups, wide shots, tracking
Era & Setting	Time period and location for all episodes
Visual Motifs	Recurring symbols and imagery
Story Context	Overarching narrative that threads through episodes

Characters

Each character has: name, role (Protagonist / Deuteragonist / Antagonist / Ally / Supporting), visual description, AI-generated portrait, and a Veo3 hint — a short phrase (e.g. "tall man with gray beard") appended to every scene prompt that features this character. Approve one portrait as "canonical" — that face is locked for all episodes.
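
The hint mechanism can be sketched as plain string assembly: for every character in a scene, the short hint is appended to the scene prompt. The `Character` class, the `build_scene_prompt` function, and the exact separator format are hypothetical; only the idea of appending the hint comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Character:
    name: str
    role: str   # e.g. "Protagonist", "Ally"
    hint: str   # short visual phrase, e.g. "tall man with gray beard"


def build_scene_prompt(scene_description: str, characters_in_scene: list[Character]) -> str:
    """Append each featured character's hint to the scene prompt (illustrative format)."""
    hints = [c.hint for c in characters_in_scene if c.hint]
    if not hints:
        return scene_description
    return f"{scene_description}. Characters: {'; '.join(hints)}"


elias = Character(name="Elias", role="Protagonist", hint="tall man with gray beard")
prompt = build_scene_prompt("He walks into the archive at dusk", [elias])
```

Keeping the hint short matters because it rides along on every prompt that features the character.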

Shot Arc Phases

Group episodes into story arcs — each phase has its own camera language and optional character age. This lets you evolve the visual style as the story progresses:

Shot style: camera language for this phase (e.g. "intimate close-ups, handheld" for early episodes; "wide epic shots, crane" for the climax)
Default character age: set once per phase for biography series; AI references the age in scene prompts automatically
Quality tier: Budget for setup, Standard for core, Premium for finale

Continuity Features

AI suggests episode topics based on series context and prior episodes
Continuity checker flags contradictions with previous episodes
Character pre-flight check warns about missing portraits before generation
World bible injected into every scene regeneration prompt

Avatar & Lipsync

Create talking-head videos from a portrait photo or video clip. The avatar speaks your voiceover with lip-synced animation. Available on the Studio plan.

Two Input Modes

Portrait Photo

Upload a face photo → AI animates it as a talking head

Video Clip

Upload a video of yourself → lipsync replaces audio with your voiceover

3 Lipsync Engines

Engine	Cost/sec	Quality	Speed	Best For
SadTalker	$0.001	Good	Fast	Budget creators, batch content
MuseTalk	$0.003	Great	Medium	Balanced quality/cost — recommended
Sync.so	$0.05	Excellent	Slow	Premium showcase videos

Start with MuseTalk — best balance of quality and cost. Use Sync.so only for your most important videos or channel trailer.
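
To compare engines for a given clip length, multiply the per-second rate from the table by the length of the talking-head footage. The helper below is an illustrative calculation only; the function name is made up, and the rates are the ones listed above.

```python
# Per-second rates in USD, from the lipsync engine table.
COST_PER_SECOND_USD = {"SadTalker": 0.001, "MuseTalk": 0.003, "Sync.so": 0.05}


def lipsync_cost(engine: str, seconds: float) -> float:
    """Estimated lipsync cost in USD for `seconds` of talking-head footage."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return round(COST_PER_SECOND_USD[engine] * seconds, 4)


# For a 60-second avatar clip: SadTalker ~$0.06, MuseTalk ~$0.18, Sync.so ~$3.00.
costs = {engine: lipsync_cost(engine, 60) for engine in COST_PER_SECOND_USD}
```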

Captions

Burned-in captions — no CapCut, no Descript, no subtitle files. Auto-generated from your voiceover with word-level timestamp sync. Configure on the review page before publishing.

Setting	Options
Display Mode	Word-by-Word · Phrase · Full Line
Font	Impact · Montserrat · Oswald · Roboto · Playfair Display
Highlight Color	Yellow · White · Orange · Green · Purple + custom hex
Position	Top · Center · Bottom
Languages	10 languages, rendered in native script (Hindi, Telugu, Tamil, Kannada, Malayalam, English + 4 European)

High-retention formula

Word-by-Word + Impact + Yellow highlight + Bottom = the caption style that dominates TikTok and YouTube Shorts right now.

Captions are burned into the final MP4 via FFmpeg — upload directly to YouTube, TikTok, or Instagram with no separate subtitle file needed.
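
To show how the display modes relate to word-level timestamps, the sketch below groups timed words into caption cues. The `Word` class, the `build_cues` function, and the words-per-cue sizes for Phrase and Full Line are assumptions for illustration; Episio's actual grouping rules are not documented here.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


# Hypothetical chunk sizes per display mode (only Word-by-Word = 1 is self-evident).
WORDS_PER_CUE = {"Word-by-Word": 1, "Phrase": 3, "Full Line": 7}


def build_cues(words: list[Word], mode: str) -> list[tuple[float, float, str]]:
    """Group timed words into (start, end, text) cues for the chosen display mode."""
    size = WORDS_PER_CUE[mode]
    cues = []
    for i in range(0, len(words), size):
        chunk = words[i:i + size]
        cues.append((chunk[0].start, chunk[-1].end, " ".join(w.text for w in chunk)))
    return cues


timed = [Word("You've", 0.0, 0.3), Word("been", 0.3, 0.5),
         Word("lied", 0.5, 0.9), Word("to", 0.9, 1.0)]
phrase_cues = build_cues(timed, "Phrase")  # two cues: three words, then one
```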

Thumbnails

AI-generated thumbnails via Ideogram — no Canva, no Photoshop, no designer. The prompt is auto-generated from your script and topic. Available in 3 styles:

Cinematic

Dramatic lighting, atmospheric depth. Best for documentary and storytelling channels.

Bold Text

High-contrast, large type. Best for finance, tech, and news-style channels.

Documentary

Measured, authoritative. Best for educational and analysis channels.

4 AI variants at once: Click "Generate thumbnails" to produce 4 different variants simultaneously. Claude Vision scores each one for click-through potential — titles, contrast, emotional pull. The top-scored variant is pre-selected, but you can pick any.

Edit the prompt: Modify the generation prompt to steer the style and composition before generating.

Series thumbnails: Character portraits are injected into the thumbnail prompt for face-consistent thumbnails across all episodes.

Review & Editing

The review page is where you fine-tune your video before publishing. You control every scene, the music, captions, and thumbnail — nothing is locked.

Scene Grid

Each scene card shows: clip preview, narration text, type badge (AI Video, Stock, Image, Chart, etc.), source provider, mood, and status. Click any scene to open Scene Studio.

Regenerating a Scene

Edit the prompt, select a different AI model, upload your own clip, or browse Pexels stock footage. Cost and ETA shown before confirming.

Reassembly

After editing scenes, music, or captions — click Reassemble to re-render the full video. Takes ~1–2 minutes.

Publishing

Publish directly to YouTube from the review page. Connect your channel in Settings → Publishing.

YouTube Publish Panel

Channel selector (if multiple channels connected)
Title (100 character limit)
Description
Hashtags (auto-suggested from topic)
Pinned comment
Privacy: Private, Unlisted, or Public

Real-time status: uploading → processing → published. You get the YouTube link as soon as it's live. Can't publish now? Download the MP4 and post manually to TikTok or Instagram Reels.
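
If you prepare publish metadata outside Episio (for example in a spreadsheet or script), a quick pre-check against the panel's constraints can catch problems early. This is a sketch under assumptions: the function name and error messages are invented for this example, and only the 100-character title limit and the three privacy options come from the panel described above.

```python
TITLE_LIMIT = 100
PRIVACY_OPTIONS = {"Private", "Unlisted", "Public"}


def check_publish_fields(title: str, privacy: str) -> list[str]:
    """Return a list of problems with the publish fields; an empty list means OK."""
    problems = []
    if not title.strip():
        problems.append("Title is empty")
    elif len(title) > TITLE_LIMIT:
        problems.append(f"Title is {len(title)} characters; the limit is {TITLE_LIMIT}")
    if privacy not in PRIVACY_OPTIONS:
        problems.append(f"Privacy must be one of: {', '.join(sorted(PRIVACY_OPTIONS))}")
    return problems


issues = check_publish_fields("Why the 1947 summit still shapes trade today", "Unlisted")
```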

Coming soon

Instagram Reels and TikTok direct publishing are in development. Currently YouTube only — download for other platforms.

Library

All your videos in one place. Filter by status: All, Completed, Review Needed, In Progress, Failed.

Run cards show: thumbnail, topic, date, cost, and status badge. Active runs display a live progress bar with stage counter.

Actions: View/edit, download video + thumbnail, or publish — all from the library card.

Settings & Brand Kit

API Keys

Connect your keys for each service. Each key has a test button to verify connectivity.

Service	What It Powers
FAL.ai	All video and image generation (14 providers)
ElevenLabs	Voiceover for English and other non-Indian languages + voice cloning
Sarvam AI	Indian language voiceover (auto-routed)
Ideogram	Thumbnail generation
HeyGen	Avatar / presenter mode
Anthropic	Script generation + AI brainstorm
Tavily	Topic research + web search

Brand Kit

Logo watermark: Upload your logo (PNG/JPG ≤2MB). Choose corner position (4 options) and opacity (30–100%) — burned into every video automatically.
Branded intro clip: Upload an MP4/MOV intro (≤200MB) — prepended to every video before the generated content. Great for channel identity.
Default voice + language: Set once — applied to every new video. Change per-video at creation time.

Presenter / Avatar (HeyGen)

Settings → Presenter. Configure an avatar that appears in talking-head scenes as an alternative to lipsync:

HeyGen avatar picker: Browse and search from hundreds of ready-made avatars. Choose rendering style (full body, close-up, circle).
Photo avatar: Upload a portrait photo of yourself — Episio creates a personalized avatar via HeyGen (5–20 minute processing time). Your face, your avatar.
HeyGen API key: Required for avatar mode — add in Settings → API Keys.

YouTube Connection

Settings → Publishing → Connect YouTube. OAuth flow links your channel for direct publishing. Multiple channels supported — choose per video.