Features
Everything Episio can do — from topic research to publishing. Covers all three creation modes: Single Video, Series, and Creative Series.
All three creation modes support every format and duration. Set it once — the entire pipeline adapts: canvas size, caption positioning, thumbnail dimensions, and provider aspect ratios.
3 Video Formats
Portrait 9:16
1080×1920 — YouTube Shorts, TikTok, Instagram Reels
Landscape 16:9
1920×1080 — YouTube, Vimeo, Twitter
Square 1:1
1080×1080 — Instagram Feed, LinkedIn, Facebook
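As a rough illustration of how one format choice can drive canvas size downstream, the three formats above can be held in a small lookup table. This is a sketch only: `VideoFormat`, `FORMATS`, and `canvas_size` are hypothetical names, not part of Episio.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class VideoFormat:
    name: str
    aspect_ratio: str
    width: int
    height: int


# Dimensions copied from the format list above; the keys are illustrative.
FORMATS: Dict[str, VideoFormat] = {
    "portrait": VideoFormat("Portrait", "9:16", 1080, 1920),
    "landscape": VideoFormat("Landscape", "16:9", 1920, 1080),
    "square": VideoFormat("Square", "1:1", 1080, 1080),
}


def canvas_size(format_key: str) -> Tuple[int, int]:
    """Return (width, height) in pixels for a format key."""
    fmt = FORMATS[format_key]
    return fmt.width, fmt.height


def matches_ratio(fmt: VideoFormat) -> bool:
    """Check that the pixel dimensions agree with the stated aspect ratio."""
    ratio_w, ratio_h = (int(part) for part in fmt.aspect_ratio.split(":"))
    return fmt.width * ratio_h == fmt.height * ratio_w
```

For example, `canvas_size("portrait")` returns `(1080, 1920)`, and `matches_ratio` holds for all three entries.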
5 Duration Presets + Custom
| Preset | Duration | Word Count | Best For |
|---|---|---|---|
| Short | ~60 seconds | 90–120 words | YouTube Shorts, TikTok, Reels |
| Medium | ~2 minutes | 180–240 words | Explainers, summaries |
| Long | ~5 minutes | 540–660 words | Deep topics, tutorials |
| Deep | ~10 minutes | 1140–1260 words | Documentaries, analysis |
| Max | ~12 minutes | 1380–1500 words | Full episodes, long-form |
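The word-count column can be read as a budget per preset. The sketch below looks up which preset a draft script fits and derives the implied speaking pace. The figures come from the table above; the function names are hypothetical, and how Episio enforces these ranges internally is not documented here.

```python
from typing import Dict, Optional, Tuple

# preset name -> (approximate seconds, min words, max words), from the table above
PRESETS: Dict[str, Tuple[int, int, int]] = {
    "Short": (60, 90, 120),
    "Medium": (120, 180, 240),
    "Long": (300, 540, 660),
    "Deep": (600, 1140, 1260),
    "Max": (720, 1380, 1500),
}


def preset_for_word_count(words: int) -> Optional[str]:
    """Return the preset whose word range contains `words`, or None if it falls in a gap."""
    for name, (_, low, high) in PRESETS.items():
        if low <= words <= high:
            return name
    return None


def words_per_minute_range(name: str) -> Tuple[float, float]:
    """Speaking pace implied by a preset's duration and word range."""
    seconds, low, high = PRESETS[name]
    return low * 60 / seconds, high * 60 / seconds
```

A 600-word script falls in the Long preset, while a 300-word script sits between Medium and Long and returns `None`, which is where a custom duration would apply.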
Used in Single Video and knowledge-based Series modes. Episio fetches trending topics from multiple live sources and lets you choose or bring your own. Seven content domains focus the research:
Geopolitics
Global conflicts, diplomacy, power shifts, sanctions
Personalities
CEOs, leaders, creators, cultural figures
AI Updates
New models, launches, industry moves
Technology
Hardware, software, space, science breakthroughs
Crypto
Bitcoin, altcoins, DeFi, regulation, market moves
Markets
Stocks, earnings, macro trends, central banks
Custom
Any topic — type your own. AI researches from multiple live sources.
Research sources: Web search, Reddit, RSS feeds, Hacker News — aggregated and ranked by recency and relevance. AI synthesizes findings into your script automatically.
Synthesis topics ★: A special AI-generated topic type that connects ideas from different fields. When one is selected, the Synthesis script style is pre-set automatically. These are often the highest-performing videos for virality because they surface non-obvious, cross-domain angles.
Narrative angle analysis: After entering a topic, click "Analyze narrative angles" — AI generates 4–5 distinct narrative frames for the same topic. Each angle changes what the story is "about" while using the same facts. Pick the one that fits your channel voice.
Document upload: Upload a PDF, DOCX, TXT, or Markdown file instead of typing a topic. Episio extracts the text and uses it as the research source — great for turning reports, articles, your own notes, or transcripts into videos.
Scripts are generated by Claude AI. Two dimensions shape the output: script style (structure) and narrator type (voice and tone). With 10 of each, that gives 100 unique combinations.
10 Script Styles
| Style | Structure | Best For |
|---|---|---|
| Synthesis | Cross-domain connections + unique angle | Viral explainers — most shared |
| Suspenseful | Identity withheld, reveal at 80% | Mystery and intrigue topics |
| Story | 4-beat narrative arc | Human interest, historical drama |
| True Crime | Atmospheric slow reveal | Crime, investigation, scandal |
| Educational | Myth → bust → insight | Science, debunking, counterintuitive facts |
| News | Fact → implication → impact | Current events, breaking developments |
| Dramatic | Emotion arc, human face on data | Finance, geopolitics with stakes |
| Conspiracy | 3 dots connected, viewer concludes | Pattern-finding, alternative angles |
| Comedy | Setup → punchline, rule of 3 | Light topics, tech culture, satire |
| Retrospective | Then vs now, looking back with wisdom | Anniversaries, legacy, nostalgia |
10 Narrator Types
| Narrator | Tone | Example Opener |
|---|---|---|
| Documentary | Authoritative, measured | "In the summer of 1947, something shifted..." |
| Direct Address | Confrontational, personal | "You've been lied to about this for years." |
| Investigative | Methodical, evidence-driven | "The documents tell a different story." |
| Comedy | Witty, irreverent | "So apparently, the government forgot..." |
| True Crime | Atmospheric, tense | "At 3am on a Tuesday, the phone rang." |
| Teacher | Patient, curious | "Let me ask you something surprising..." |
| Breaking News | Urgent, present tense | "This is happening right now." |
| First Person | Intimate, experiential | "I was there when it happened." |
| Retrospective | Wise, reflective | "Nobody saw it coming — including me." |
| Devil's Advocate | Contrarian, challenging | "What if everything you believe is wrong?" |
Script Review — Gate Before Footage
After generation, you land on the Script Review page before any footage is generated. This is where you:
- Read the full script and edit any sentence directly
- Refine via AI chat — "make the opening more dramatic", "add a statistic", "cut to 60 seconds"
- Switch script style or narrator type and regenerate
- Approve and proceed to footage generation only when satisfied
Episio doesn't pick music from a library. It reads the emotional arc of your script and composes an original soundtrack from scratch — matched exactly to your video's length. Original every time. No royalty issues. No ContentID strikes on YouTube.
- AI composed: original per video
- Duration-matched: exact fit, no fades
- Royalty-free: no copyright claims
8 emotional profiles: you set the direction, and AI composes within it.
Auto-classification: AI reads your script and selects the emotional profile automatically. You can override it at any time.
Custom MP3 upload: Upload your own track — it replaces the composed score entirely. Good for branded channels with a signature sound.
Music remix: Change the profile after video generation and reassemble with one click.
Episio supports 17 voice languages for voiceover generation. Captions render natively in 10 languages (English + 5 Indian + 4 European). Select the language on the creation page — the voice engine auto-routes based on your choice.
| Language | Voice Engine | Captions |
|---|---|---|
| English | ElevenLabs | ✓ |
| Hindi | Sarvam AI (auto) | ✓ |
| Telugu | Sarvam AI (auto) | ✓ |
| Tamil | Sarvam AI (auto) | ✓ |
| Kannada | Sarvam AI (auto) | ✓ |
| Malayalam | Sarvam AI (auto) | ✓ |
| Spanish, French, Portuguese, German | ElevenLabs | ✓ |
| Bengali, Arabic, Japanese, Korean, Chinese, Turkish, Indonesian | ElevenLabs | — |
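The routing in the table can be summarized as a simple lookup. This is a minimal sketch of the documented behavior, not Episio's actual code; `route_language` and the set names are hypothetical.

```python
from typing import Tuple

# Language groups taken from the table above.
SARVAM_LANGUAGES = {"Hindi", "Telugu", "Tamil", "Kannada", "Malayalam"}
ELEVENLABS_WITH_CAPTIONS = {"English", "Spanish", "French", "Portuguese", "German"}
ELEVENLABS_VOICE_ONLY = {
    "Bengali", "Arabic", "Japanese", "Korean", "Chinese", "Turkish", "Indonesian",
}


def route_language(language: str) -> Tuple[str, bool]:
    """Return (voice engine, native captions available) for a language."""
    if language in SARVAM_LANGUAGES:
        return "Sarvam AI", True
    if language in ELEVENLABS_WITH_CAPTIONS:
        return "ElevenLabs", True
    if language in ELEVENLABS_VOICE_ONLY:
        return "ElevenLabs", False
    raise ValueError(f"Unsupported language: {language}")
```

The three sets together contain 17 languages, 10 of which have native captions, matching the counts stated above.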
ElevenLabs Voices
High-quality voices with distinct accents and tones.
| Voice | Accent | Gender | Tone |
|---|---|---|---|
| Adam | American | Male | Deep, authoritative |
| Antoni | American | Male | Warm, conversational |
| Arnold | American | Male | Strong, confident |
| Bella | American | Female | Soft, engaging |
| Domi | American | Female | Confident, professional |
| Elli | American | Female | Friendly, youthful |
| Josh | American | Male | Clear, energetic |
| Rachel | American | Female | Calm, professional |
Sarvam AI — Indian Languages
Native-quality voiceover for 5 Indian languages. Episio auto-routes to Sarvam AI when an Indian language is selected — no manual setup needed.
| Language | Female Speaker | Male Speaker |
|---|---|---|
| Hindi | Neha | Rahul |
| Telugu | Kavya | Aditya |
| Tamil | Priya | Ashutosh |
| Kannada | Surabhi | Vikram |
| Malayalam | Anjali | Arjun |
Voice Cloning
Upload a 30-second audio sample → Episio creates a clone via ElevenLabs. Use it across all your videos for a consistent personal brand voice. Set it as your default in Settings.
Custom Voice ID: Have an ElevenLabs voice ID not listed? Paste it directly — Episio supports any ElevenLabs voice ID.
14 AI providers power footage generation — all via your FAL key. The footage director plans each scene automatically, or you can override per scene.
Quality Tiers
| Tier | Credits/Video | Provider Mix | When to Use |
|---|---|---|---|
| Budget | ~3 credits | Flux Ken Burns images | High-volume, drafts, testing |
| Standard | ~8 credits | Kling i2v + Flux B-roll + Pexels | Daily content — recommended |
| Premium | ~20 credits | Veo3 + Kling Pro + WAN 14B | Showcase, brand content |
Free sources: Pexels stock footage, Archive.org historical footage, NASA imagery — all at zero credit cost. The footage director balances free and AI sources automatically.
See the Provider Guide for full comparison with costs, render times, and recommended combos.
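For budgeting, the approximate per-video credits in the tier table can be multiplied out across a batch. The sketch below assumes the "~" figures are treated as flat averages, which real runs will deviate from; `estimate_credits` is a hypothetical helper.

```python
from typing import Dict

# Approximate credits per video, from the Quality Tiers table.
CREDITS_PER_VIDEO: Dict[str, int] = {"Budget": 3, "Standard": 8, "Premium": 20}


def estimate_credits(videos_per_tier: Dict[str, int]) -> int:
    """Rough total credits for a batch, given a video count per tier."""
    return sum(
        CREDITS_PER_VIDEO[tier] * count for tier, count in videos_per_tier.items()
    )
```

For instance, 10 Budget drafts, 5 Standard videos, and 1 Premium showcase come to roughly 3×10 + 8×5 + 20×1 = 90 credits.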
Scene Studio gives you full control over any scene in your video. Open it from the review page by clicking any scene card. 9 ways to create or replace a scene (4 footage sources and 5 procedural visuals):
Footage Sources
AI Video
Generate a clip from any of the 14 AI providers with a custom prompt
Upload
Upload your own video clip — drag and drop, any format
Image / Ken Burns
Upload or generate an image, apply Ken Burns motion (zoom, pan, speed)
Me (Avatar)
Use your portrait or video for a talking-head scene with lipsync
Procedural Visuals — Zero Credit Cost
Map
Interactive map with location pins — great for geopolitics and travel
Chart
6 chart types (bar, line, pie, area, radar, donut) with 7 color palettes
Comparison
Side-by-side comparison cards — ideal for versus-style content
Quote
3 styled quote layouts — minimal, bold, or cinematic
Process Flow
Step-by-step diagrams — ideal for how-it-works explanations
Scene Sub-clip Splitting
Any scene can be split into 2, 3, or 4 sub-clips. Each slot is filled independently — use AI generation for one, upload for another, or pull from Pexels. The narration text is divided proportionally across the slots. Ideal for long scenes that need visual variety.
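One way to picture the proportional split is to divide the narration's words according to a weight per slot. The sketch below assumes the weights are slot durations, which the text above does not specify; `split_narration` is a hypothetical name.

```python
from typing import List, Sequence


def split_narration(text: str, weights: Sequence[float]) -> List[str]:
    """Divide narration words across 2 to 4 slots in proportion to `weights`."""
    if not 2 <= len(weights) <= 4:
        raise ValueError("a scene splits into 2, 3, or 4 sub-clips")
    words = text.split()
    total = sum(weights)
    chunks: List[str] = []
    start = 0
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if index == len(weights) - 1:
            end = len(words)  # the last slot takes whatever remains
        else:
            end = round(len(words) * cumulative / total)
        chunks.append(" ".join(words[start:end]))
        start = end
    return chunks
```

With weights `[2, 1, 1]`, an eight-word narration is divided into chunks of four, two, and two words.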
Image upload → Ken Burns clip: Upload any JPG, PNG, or WebP image from Scene Studio and Episio applies Ken Burns motion (zoom in, zoom out, pan left, pan right at slow/normal/fast speed) to create a 6-second animated clip at zero AI credit cost.
Free visuals save credits
Creative Series is a distinct creation mode for purely imaginative content. No topic research, no news — just a creative vision, AI-built characters, and episode after episode of cinematic storytelling.
What Makes It Different
AI World Bible
Write one paragraph — AI generates the full production bible: era, tone, color palette, camera style, music mood.
Consistent AI Characters
AI generates portrait photos for every character. Same face, same style, locked across unlimited episodes.
Brainstorm Drawer
Chat with AI before generating. Develop camera angles, scene ideas, world-building depth, character motivations, dialogue.
Ambient or Lipsync Mode
Ambient: cinematic scenes with no dialogue, background music. Lipsync: characters speak, AI syncs lip movement to voice.
The Brainstorm Drawer
A side drawer that opens while you write your episode scenario. The AI knows your full world bible and character list — every suggestion is grounded in your specific world.
Topics you can explore in the drawer:
- Camera angles: Low-angle handheld, wide establishing shot, tight close-up on a prop. Specific and cinematic.
- World building: Recurring visual motifs, sound design palette, era-specific details that anchor the setting.
- Character depth: Motivations, relationships, unspoken conflicts, how a character would react to this scene.
- Scene ideas: Episode structure, what happens, bittersweet turns, how to show emotion without dialogue.
- Dialogue: Lines characters speak in lipsync mode, or narration phrasing for ambient episodes.
- Mood & tone: Emotional arc, pacing notes, how the score should shift through the episode.
Two Modes
Ambient
Cinematic scenes with no dialogue. Background music and Kling native audio (ambient sounds from the scene). Best for atmospheric, visually-driven storytelling.
Lipsync
Characters speak dialogue. Kling built-in voice synthesis syncs lip movement to voice. Best for character-driven stories where dialogue carries the narrative.
What Kind of World Can You Build?
Scene Plan Review
After entering your episode scenario, AI generates a scene-by-scene plan before generating any footage. You see: scene number, duration, scene type, characters appearing, and a brief scene description. Review the plan, then click Generate Episode to commit. If the plan isn't right, regenerate with different scenario notes.
Knowledge-based Series (for topic-driven channels) use a world bible to maintain consistent visual identity and character continuity across episodes. Every series has a world bible — the creative DNA that controls how AI generates each episode.
World Bible Sections
| Section | Purpose |
|---|---|
| Color Palette | Visual mood — warm, cool, neon, muted, etc. |
| Camera Style | Cinematic language — close-ups, wide shots, tracking |
| Era & Setting | Time period and location for all episodes |
| Visual Motifs | Recurring symbols and imagery |
| Story Context | Overarching narrative that threads through episodes |
Characters
Each character has: name, role (Protagonist / Deuteragonist / Antagonist / Ally / Supporting), visual description, AI-generated portrait, and a Veo3 hint — a short phrase (e.g. "tall man with gray beard") appended to every scene prompt that features this character. Approve one portrait as "canonical" — that face is locked for all episodes.
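The Veo3 hint mechanism can be sketched as string composition: each featured character's hint is appended to the scene prompt. The exact prompt format Episio uses is not documented here, so the comma-joined layout, the `Character` class, and `build_scene_prompt` are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Character:
    name: str
    veo3_hint: str  # short visual phrase, e.g. "tall man with gray beard"


def build_scene_prompt(base_prompt: str, characters_in_scene: List[Character]) -> str:
    """Append the hint of every character featured in the scene to the prompt."""
    hints = [c.veo3_hint for c in characters_in_scene if c.veo3_hint]
    return ", ".join([base_prompt, *hints])
```

A scene with no featured characters keeps its prompt unchanged, while a scene featuring a character with the hint "tall man with gray beard" gets that phrase appended.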
Shot Arc Phases
Group episodes into story arcs — each phase has its own camera language and optional character age. This lets you evolve the visual style as the story progresses:
- Shot style: camera language for this phase (e.g. "intimate close-ups, handheld" for early episodes; "wide epic shots, crane" for the climax)
- Default character age: set once per phase for biography series; AI references the age in scene prompts automatically
- Quality tier: Budget for setup, Standard for core, Premium for finale
Continuity Features
- AI suggests episode topics based on series context and prior episodes
- Continuity checker flags contradictions with previous episodes
- Character pre-flight check warns about missing portraits before generation
- World bible injected into every scene regeneration prompt
Create talking-head videos from a portrait photo or video clip. The avatar speaks your voiceover with lip-synced animation. Available on the Studio plan.
Two Input Modes
Portrait Photo
Upload a face photo → AI animates it as a talking head
Video Clip
Upload a video of yourself → lipsync replaces audio with your voiceover
3 Lipsync Engines
| Engine | Cost/sec | Quality | Speed | Best For |
|---|---|---|---|---|
| SadTalker | $0.001 | Good | Fast | Budget creators, batch content |
| MuseTalk | $0.003 | Great | Medium | Balanced quality/cost — recommended |
| Sync.so | $0.05 | Excellent | Slow | Premium showcase videos |
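Because the engines are priced per second, cost scales linearly with voiceover length. The sketch below multiplies the table's rates by a duration; `lipsync_cost` is a hypothetical helper, and `Decimal` is used only to keep the currency arithmetic exact.

```python
from decimal import Decimal
from typing import Dict

# Cost per second in USD, from the engine table above.
COST_PER_SECOND: Dict[str, Decimal] = {
    "SadTalker": Decimal("0.001"),
    "MuseTalk": Decimal("0.003"),
    "Sync.so": Decimal("0.05"),
}


def lipsync_cost(engine: str, seconds: int) -> Decimal:
    """Estimated lipsync cost in USD for a clip of the given length."""
    return COST_PER_SECOND[engine] * seconds
```

For a 60-second clip this works out to $0.06 with SadTalker, $0.18 with MuseTalk, and $3.00 with Sync.so.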
Burned-in captions — no CapCut, no Descript, no subtitle files. Auto-generated from your voiceover with word-level timestamp sync. Configure on the review page before publishing.
| Setting | Options |
|---|---|
| Display Mode | Word-by-Word · Phrase · Full Line |
| Font | Impact · Montserrat · Oswald · Roboto · Playfair Display |
| Highlight Color | Yellow · White · Orange · Green · Purple + custom hex |
| Position | Top · Center · Bottom |
| Languages | 10 languages, rendered in native script (English, Hindi, Telugu, Tamil, Kannada, Malayalam + 4 European) |
High-retention formula
Captions are burned into the final MP4 via FFmpeg — upload directly to YouTube, TikTok, or Instagram with no separate subtitle file needed.
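To show how word-level timestamps relate to the Word-by-Word and Phrase display modes, the sketch below groups timed words into caption cues. The fixed phrase size and the `Word`, `Cue`, and `build_cues` names are assumptions; Episio's actual grouping rules, and the Full Line mode, are not covered.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Cue:
    text: str
    start: float
    end: float


def build_cues(words: List[Word], mode: str, phrase_size: int = 3) -> List[Cue]:
    """Group timed words into cues: one word each, or fixed-size phrases."""
    if mode == "word":
        size = 1
    elif mode == "phrase":
        size = phrase_size
    else:
        raise ValueError(f"unsupported mode: {mode}")
    cues: List[Cue] = []
    for i in range(0, len(words), size):
        group = words[i:i + size]
        # A cue spans from its first word's start to its last word's end.
        cues.append(Cue(" ".join(w.text for w in group), group[0].start, group[-1].end))
    return cues
```

Each cue inherits its start from the first word in the group and its end from the last, so captions stay in sync with the voiceover regardless of display mode.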
AI-generated thumbnails via Ideogram — no Canva, no Photoshop, no designer. The prompt is auto-generated from your script and topic. Available in 3 styles:
Cinematic
Dramatic lighting, atmospheric depth. Best for documentary and storytelling channels.
Bold Text
High-contrast, large type. Best for finance, tech, and news-style channels.
Documentary
Measured, authoritative. Best for educational and analysis channels.
4 AI variants at once: Click "Generate thumbnails" to produce 4 different variants simultaneously. Claude Vision scores each one for click-through potential — titles, contrast, emotional pull. The top-scored variant is pre-selected, but you can pick any.
Edit the prompt: Modify the generation prompt to steer the style and composition before generating.
Series thumbnails: Character portraits are injected into the thumbnail prompt for face-consistent thumbnails across all episodes.
The review page is where you fine-tune your video before publishing. You control every scene, the music, captions, and thumbnail — nothing is locked.
Scene Grid
Each scene card shows: clip preview, narration text, type badge (AI Video, Stock, Image, Chart, etc.), source provider, mood, and status. Click any scene to open Scene Studio.
Regenerating a Scene
Edit the prompt, select a different AI model, upload your own clip, or browse Pexels stock footage. Cost and ETA shown before confirming.
Reassembly
After editing scenes, music, or captions — click Reassemble to re-render the full video. Takes ~1–2 minutes.
Publish directly to YouTube from the review page. Connect your channel in Settings → Publishing.
YouTube Publish Panel
- Channel selector (if multiple channels connected)
- Title (100 character limit)
- Description
- Hashtags (auto-suggested from topic)
- Pinned comment
- Privacy: Private, Unlisted, or Public
Real-time status: uploading → processing → published. You get the YouTube link as soon as it's live. Can't publish now? Download the MP4 and post manually to TikTok or Instagram Reels.
All your videos in one place. Filter by status: All, Completed, Review Needed, In Progress, Failed.
Run cards show: thumbnail, topic, date, cost, and status badge. Active runs display a live progress bar with stage counter.
Actions: View/edit, download video + thumbnail, or publish — all from the library card.
API Keys
Connect your keys for each service. Each key has a test button to verify connectivity.
| Service | What It Powers |
|---|---|
| FAL.ai | All video and image generation (14 providers) |
| ElevenLabs | Voiceover for English and other non-Indian languages + voice cloning |
| Sarvam AI | Indian language voiceover (auto-routed) |
| Ideogram | Thumbnail generation |
| HeyGen | Avatar / presenter mode |
| Anthropic | Script generation + AI brainstorm |
| Tavily | Topic research + web search |
Brand Kit
- Logo watermark: Upload your logo (PNG/JPG ≤2MB). Choose corner position (4 options) and opacity (30–100%) — burned into every video automatically.
- Branded intro clip: Upload an MP4/MOV intro (≤200MB) — prepended to every video before the generated content. Great for channel identity.
- Default voice + language: Set once — applied to every new video. Change per-video at creation time.
Presenter / Avatar (HeyGen)
Settings → Presenter. Configure an avatar that appears in talking-head scenes as an alternative to lipsync:
- HeyGen avatar picker: Browse and search from hundreds of ready-made avatars. Choose rendering style (full body, close-up, circle).
- Photo avatar: Upload a portrait photo of yourself — Episio creates a personalized avatar via HeyGen (5–20 minute processing time). Your face, your avatar.
- HeyGen API key: Required for avatar mode — add in Settings → API Keys.
YouTube Connection
Settings → Publishing → Connect YouTube. OAuth flow links your channel for direct publishing. Multiple channels supported — choose per video.

