mcp-video — Video Editing for AI Agents

The Problem

AI agents can't edit video. Until now.

AI agents can write code, analyze documents, browse the web, create images — but video editing? Existing tools are either GUI-only (agents can't click buttons), raw FFmpeg wrappers (agents can't memorize hundreds of flags), or cloud APIs (expensive, slow, vendor lock-in). mcp-video bridges this gap.

⚙

26 Video Editing Tools

Trim, merge, add text, sync audio, resize, convert, speed change, watermarks, subtitles, storyboards, crop, rotate, fade, filters, color grading, audio normalization, PiP, split-screen, batch processing, and more.

⚡

Progress Callbacks

Real-time progress reporting on long operations. Agents can tell you "50% done..." instead of going silent during converts and merges.

🔒

Local & Private

Runs on your machine. No cloud uploads, no API keys, no per-minute billing. Your video never leaves your computer.

📷

Visual Verification

Returns a thumbnail of the output after every operation. Agents can confirm results look right without opening the file.

🔧

Auto-Fix Error Handling

Parses FFmpeg errors into structured responses with actionable suggestions. "Codec error: vp9" becomes "Auto-convert from vp9 to H.264/AAC."

🎨

Filters & Color Grading

Blur, sharpen, brightness, contrast, saturation, grayscale, sepia, invert, vignette. Plus color presets: warm, cool, vintage, cinematic, noir.

🎵

Audio Normalization

Normalize loudness to LUFS targets: YouTube (-16), broadcast (-23), Spotify (-14). Correct true peak for broadcast compliance.

📶

PiP & Split-Screen

Picture-in-picture overlay with positioning, opacity, and timing. Side-by-side or top/bottom split-screen with auto-resize.

📦

Batch Processing

Apply the same operation to multiple files in one call. Trim 10 videos, blur 5, or convert a whole directory at once.

⚡

Validation Before Execution

Filter types, color presets, and parameters are validated before hitting FFmpeg. Get clear error messages, not cryptic KeyError tracebacks.

Why mcp-video

Not just another FFmpeg wrapper.

"Can't I just have Claude Code run ffmpeg commands?" Sure — if you enjoy debugging 200-character filter graphs, wrestling with flag order semantics, and paying dollars per short clip on cloud APIs. Here's why mcp-video is different.

	mcp-video	Claude Code + FFmpeg	Cloud APIs
Cost	Free, local	Free, but your time isn't	$0.28–$2.50 per 5s video
Privacy	100% local	100% local	Uploads to cloud
Error messages	Structured + auto-fix	FFmpeg stderr (cryptic)	HTTP errors
Parameter validation	Before execution	Only at runtime	API schema check
Text overlay	`add_text(text="Hi")`	`-vf "drawtext=text='Hi':fontfile=...:fontsize=24:x=(w-text_w)/2:y=h-50"`	Varies by API
Speed change	Handles audio-less videos	Fails if no audio stream	Usually handled
Discovery	26 self-documenting tools	Must know hundreds of flags	API docs
Testing	380 tests, CI-ready	None (ad-hoc)	Provider's tests
Progress reporting	Real-time %	Silence until done	Varies
Visual verification	Thumbnail after ops	No	Varies

The real problems with raw FFmpeg

Flag order changes behavior silently

Put -ss 10 before -i input.mp4 and FFmpeg seeks instantly. Put it after, and it decodes every frame up to 10s first. Same flags, completely different performance. Claude Code can't know which you meant.

Cryptic error messages

Try adding audio to a silent video and you get: "Stream specifier ':a' matches no streams." No hint that the video has no audio. mcp-video checks before running and gives you a clear error.

Intermediate files = quality loss

Every FFmpeg call re-encodes. Chain 5 operations and you've re-encoded 5 times. mcp-video's timeline DSL composes operations into a single FFmpeg command where possible.

Claude forgets between commands

In a 10-step workflow, Claude must re-discover file paths, resolutions, and durations at every step. mcp-video returns structured results (path, duration, resolution) so the agent always has context.

Why not cloud APIs?

Expensive at scale

Replicate charges $0.28–$0.50 per second of generated video. A 60-second clip costs $17–$30. RunwayML's Pro plan gives you ~90 seconds of Gen-4.5 for $28/month. mcp-video: $0, forever.

Vendor lock-in

Your workflow depends on their API staying up, their pricing not changing, and their model versions not breaking your prompts. With mcp-video, FFmpeg runs on your machine — no API deprecation risk.

Saves tokens. Saves turns.

Every wasted turn costs money. With raw FFmpeg, Claude spends tokens parsing hundreds of lines of stderr, retrying failed commands, and re-probing files it already processed. mcp-video eliminates all of that.

	mcp-video	Claude Code + FFmpeg
Turns for 5 operations	4–5	6–10
Output tokens per turn	50–80 (JSON)	200–500 (stderr)
Error retries	0–1 (pre-validated)	2–3 typical
Re-probing between steps	Never	Every step
Token reduction	~40–50% fewer turns	baseline

Capabilities

26 tools. Every operation you need.

Every tool returns structured JSON. On success, you get output paths, duration, resolution. On failure, you get error types with auto-fix suggestions.

Tool	What It Does
video_info	Get metadata: duration, resolution, codec, fps, file size
video_trim	Trim clip by start time + duration or end time
video_merge	Concatenate multiple clips with optional transitions
video_add_text	Overlay text, titles, captions with custom positioning
video_add_audio	Add, replace, or mix audio tracks with fade effects
video_resize	Change resolution or apply preset aspect ratios
video_convert	Convert between mp4, webm, gif, mov
video_speed	Speed up or slow down (slow-mo, time-lapse)
video_thumbnail	Extract a single frame at any timestamp
video_preview	Generate fast low-res preview for quick review
video_storyboard	Extract key frames + create grid for review
video_subtitles	Burn SRT/VTT subtitles into the video
video_watermark	Add image watermark with opacity control
video_export	Render final video with quality presets
video_edit	Full timeline-based edit (JSON DSL for complex edits)
video_extract_audio	Extract audio as mp3, wav, aac, ogg, or flac
video_crop	Crop to rectangular region with custom offset
video_rotate	Rotate 90/180/270 degrees or flip horizontal/vertical
video_fade	Add fade in/out effects to video
video_filter	Apply visual filter (blur, sharpen, grayscale, sepia, invert, vignette)
video_blur	Blur video with custom radius and strength
video_color_grade	Apply color preset (warm, cool, vintage, cinematic, noir)
video_normalize_audio	Normalize loudness to LUFS target (YouTube, broadcast, Spotify)
video_overlay	Picture-in-picture overlay with positioning and opacity
video_split_screen	Side-by-side or top/bottom layout for two videos
video_batch	Apply same operation to multiple files at once

Three Interfaces

However you want to use it.

mcp-video works as an MCP server for AI agents, a Python library for scripts, and a CLI tool for the terminal.

MCP Server (for AI agents)

Add to your Claude Code or Cursor MCP config. Then just talk to your agent.

{
  "mcpServers": {
    "mcp-video": {
      "command": "uvx",
      "args": ["mcp-video"]
    }
  }
}

Then: *"Trim this video from 0:30 to 1:00, add a title, and export for TikTok."*

Python Client

Clean API for automation, pipelines, and batch processing.

from mcp_video import Client

editor = Client()
clip = editor.trim("v.mp4", start="0:30", duration="15")
final = editor.resize(clip.output_path, aspect_ratio="9:16")
result = editor.export(final.output_path)
print(result.resolution)  # 1080x1920

CLI Tool

# Get info
$ mcp-video info video.mp4

# Trim and convert
$ mcp-video trim video.mp4 -s 0:30 -d 15
$ mcp-video convert video.mp4 -f webm -q high

# Generate storyboard
$ mcp-video storyboard video.mp4 -n 12

Timeline DSL

Complex edits in a single JSON object.

Describe multi-track edits with video clips, audio, text overlays, transitions, and export settings — all in one call.

editor.edit({
    "width": 1080, "height": 1920,
    "tracks": [
        {
            "type": "video",
            "clips": [
                {"source": "intro.mp4", "duration": 5},
                {"source": "main.mp4", "trim_start": 10, "duration": 30},
            ],
            "transitions": [{"after_clip": 0, "type": "fade", "duration": 1.0}],
        },
        {
            "type": "audio",
            "clips": [{"source": "music.mp3", "volume": 0.7}],
        },
        {
            "type": "text",
            "elements": [{"text": "EPISODE 42", "position": "top-center"}],
        },
    ],
    "export": {"format": "mp4", "quality": "high"},
})

Templates

One function per platform.

Pre-built templates handle the correct dimensions, text positioning, and export settings for each social media platform.

TikTok

9:16, 1080x1920. Caption at bottom-center. Optional background music at volume 0.5.

YouTube Shorts

9:16, title at top-center with size 42. Same format as TikTok but different text placement.

Instagram Reel

9:16, caption at bottom-center. Clean, simple template.

YouTube Video

16:9, 1920x1080. Supports title card (3s), outro clip, and background music.

Instagram Post

1:1, 1080x1080. Square format with optional caption overlay.

Custom

Build your own with the Timeline DSL. Any dimensions, any number of tracks.

Video editing for AI agents.