
🧠 How to Use AI for Video Editing with Remotion and Claude Code

Who this is for: Content creators, marketers, and video editors who want to use AI to handle the mechanical parts of video production (adding captions, trimming footage, applying motion graphics) without spending hours in a timeline editor. You don't need to know how to code, but you do need to be comfortable running commands in a terminal.

What you'll learn: What Remotion is and how it connects to Claude Code, which video editing tasks AI can genuinely handle today (and which it can't), the exact setup you need for two different workflows, and how to prompt Claude well enough to get results worth keeping.

What is Remotion for video editing?
Remotion is a React-based framework that lets you create videos using code instead of a timeline editor. You define animations, captions, and layouts programmatically, and Remotion renders them into a video file.

How does Claude Code help with video editing?
Claude Code generates Remotion code based on natural language prompts. Instead of manually editing video, you describe what you want (captions, overlays, animations) and Claude writes the code to produce it.

Can AI fully edit videos automatically?
No. AI can handle structured tasks like captions, overlays, and segment-based cuts, but it cannot perform frame-perfect editing or fully automate creative decisions on raw footage.

TL;DR: Too Long, Didn't Read

  • Remotion is a React-based framework that renders video programmatically. You describe what you want, Claude writes the code, and Remotion renders it to MP4.

  • The Remotion Agent Skill teaches Claude how to write Remotion code correctly. It's a knowledge file, not a plugin: you install it once and it changes how Claude generates video code.

  • There are two distinct workflows: one for adding overlays and captions on top of a clean video, and one for AI-assisted segment cutting using audio transcription.

  • Automated "video cleaning" (trimming filler words frame-precisely) is not possible with this workflow. AI processes video as sampled frames, not at millisecond precision. This is a hard technical limitation, not a gap that will be fixed with better prompting.

  • What does work reliably: animated word-by-word captions synced to speech, segment cutting from manually-defined timestamps, and motion graphics like lower thirds, logo intros, and animated overlays.

  • Your source video must be H.264 MP4. If you're recording on a Mac or iPhone, you're probably shooting HEVC or MOV, so you'll need to convert first.

  • Vague prompts produce generic results. The more specific you are about timing, font, color, and layout, the better the output.

  • Everything renders locally. No cloud service. No per-minute fees.


1. What Is Remotion for AI Video Editing?

Traditional video editors (Premiere, Final Cut, DaVinci) work on a visual timeline. You drag clips, apply effects, and export. It's intuitive but manual. Every project is a one-off. You can't version control it, you can't generate 50 variations from a template, and you certainly can't describe what you want in plain English and have it built for you.

Remotion approaches video differently. Instead of a timeline, it's code. Specifically, it's React, the same framework used to build web applications. You write components that describe what should appear on each frame, and Remotion renders those components into an MP4 file. Time is just a variable. Want something to fade in at second 3? You write that in code.
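To make "time is just a variable" concrete, here is a minimal standalone sketch of that fade-in. It mimics the shape of Remotion's interpolate helper without depending on the library; the function name and constants here are ours, for illustration only:

```typescript
// Opacity as a pure function of the current frame: the core Remotion idea.
// In a real composition you'd read the frame from useCurrentFrame() and map
// it with interpolate(); this standalone version just shows the concept.
const FPS = 30;

function fadeInOpacity(frame: number, startSec: number, durationSec: number): number {
  const start = startSec * FPS;          // frame where the fade begins
  const end = start + durationSec * FPS; // frame where the fade completes
  if (frame <= start) return 0;
  if (frame >= end) return 1;
  return (frame - start) / (end - start); // linear ramp from 0 to 1
}

// "Fade in at second 3, over half a second" expressed as code:
console.log(fadeInOpacity(0, 3, 0.5));   // 0 (before the fade)
console.log(fadeInOpacity(120, 3, 0.5)); // 1 (after the fade)
```

Because the frame drives everything, rendering the same component at a different frame number deterministically produces a different picture; that determinism is what lets Claude reason about timing in code.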

Remotion's own YouTube channel has a brief overview video, and that video was itself edited using Claude Code.

For most people, that sounds like a step backward. Learning React to edit a video? But that's the wrong way to think about it. What it actually means is this: if you can describe a video in English, and Claude can write the React code for you, you've just bypassed the entire learning curve.

That's what the Remotion Agent Skill enables. It's a SKILL.md file, a knowledge package that teaches Claude how Remotion works: its APIs, animation patterns, common pitfalls, and best practices. When you install it, Claude stops guessing at Remotion syntax and starts writing code that actually works.

The Remotion MCP is a separate but related component. Where the agent skill gives Claude knowledge to write Remotion code, the Remotion MCP server gives Claude access to indexed Remotion documentation at runtime, so it can look up specific APIs and check technical details as it builds your composition. In practice, the agent skill handles the majority of the work for most users.

The preview environment, Remotion Studio, runs at localhost:3000 in your browser. You describe a change, Claude updates the code, you reload the preview, and see the result, all without rendering. Rendering to a final MP4 only happens when you're satisfied.

2. What AI Video Editing Can and Cannot Do (With Remotion)

Being clear about this upfront will save you time and frustration.

What it does well

Animated captions synced to speech. Whisper transcribes your audio and Remotion renders word-by-word or line-by-line captions with custom fonts, colors, and animations, frame-perfectly synced. This is the single most polished use case in this entire workflow, and it's well documented and reliable.

Segment cutting from defined timestamps. When you (or an AI analyzing a transcript) provide a list of which segments to keep and which to cut as a JSON file, Remotion renders only those segments seamlessly. The output is a trimmed video with no raw editing software required.

Motion graphics and overlays. Lower thirds, animated titles, logo intros, progress bars, branded overlays: this is Remotion's strongest native capability. These are composited on top of your source video with full control over timing, style, and animation.

What it cannot do

Automatic filler word removal. This is the most common misconception about AI video editing, and it's worth explaining the exact reason. AI models process video by sampling frames at intervals, typically one frame per second. If your ideal cut point falls at 2 seconds and 15 frames, but the model only sampled the frames at 2 seconds and 3 seconds, it cannot identify that precise moment. Frame-perfect cleaning requires external tooling that operates at the actual frame level, not on sampled approximations. This is not a limitation of Claude's intelligence; it's a fundamental constraint of how video is tokenized for AI models.
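The arithmetic behind that limitation can be sketched in a few lines. This is an illustration of the sampling problem, not part of any tool; the function name is ours:

```typescript
// How far a desired cut point can land from the nearest moment the model
// actually "saw", when video is sampled at a fixed interval (e.g. 1 fps).
function samplingErrorMs(cutMs: number, sampleIntervalMs: number): number {
  const nearestSample = Math.round(cutMs / sampleIntervalMs) * sampleIntervalMs;
  return Math.abs(cutMs - nearestSample);
}

// Ideal cut at 2 seconds + 15 frames (2500 ms at 30 fps), sampled once per second:
console.log(samplingErrorMs(2500, 1000)); // 500 ms off: half a second of error
```

At 30 fps, 500 ms is 15 frames of slack, which is why sampled-frame analysis can suggest a cut region but never a frame-exact cut.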

Direct editing of raw, uncut footage. Remotion is an editing and animation layer that works on top of a clean source video. It doesn't analyze messy footage and produce a highlight reel automatically. Your source needs to be pre-cut before Remotion adds to it.

We tested this for ourselves, and the finished output confirmed the caveat above: AI isn't always perfect at identifying exact cut points. Because it doesn't process every single frame of a video, only sampled moments, you'll often get close suggestions, but not frame-perfect edits.

3. AI Video Editing Setup (Two Workflows Explained)

There are two distinct workflows, and they have different requirements. Know which one you're doing before you start installing things.

Scenario A (Overlay & Animation): You have a clean, pre-cut video and you want to add captions, motion graphics, or effects on top of it.

Scenario B (AI-Assisted Cutting): You have raw or semi-raw footage and you want to use Whisper to transcribe the audio, then use Claude to identify which segments to keep, and render a trimmed output.

| Requirement | Scenario A | Scenario B |
| --- | --- | --- |
| Node.js + npm | Required | Required |
| Remotion (npx create-video) | Required | Required |
| Remotion Agent Skill | Required | Required |
| Claude Code with skills enabled | Required | Required |
| Clean pre-cut source video (H.264 MP4) | Required | Required |
| ffmpeg (npx remotion ffmpeg) | Optional | Required |
| whisper.cpp + language model | Not needed | Required |
| Claude API key in environment | Not needed | Required for AI cuts |

If you're not sure which to start with, start with Scenario A. It has fewer dependencies, faster iteration, and the results are immediately visible.

4. How to Add AI Captions and Motion Graphics with Remotion

Here's a quick overview before the detailed step-by-step breakdown:

Step 1: Install Node.js and npm

If you don't have Node.js installed, download it from nodejs.org. The LTS version is fine. npm is included.

Verify in your terminal:

```bash
node --version
npm --version
```

Step 2: Create a new Remotion project

Navigate to the folder where you want to work, then run:

```bash
npx create-video@latest
```

This scaffolds a Remotion project with the starter template. When prompted, give the project a name. Don't worry about which template you choose; Claude is going to rewrite most of it anyway.

Step 3: Install the Remotion Agent Skill

The skill can be installed at the project level (only applies to this project) or globally (applies to all Claude Code sessions).

Project-level install (recommended to start):

```bash
npx skills add remotion-dev/skills
```

This downloads the skill files into .claude/skills/ in your project directory. You should see a SKILL.md file plus rule files such as rules/animations.md and rules/audio.md. These are the knowledge files Claude will reference.

Global install (if you plan to use Remotion across multiple projects):

```bash
npx skills add remotion-dev/skills --global
```

To keep the skill current:

```bash
npx remotion skills update
```

Step 4: Prepare your source video

This is where most first attempts fail. Remotion's video seeking depends on H.264 encoding. If your source is HEVC, a MOV container, or anything other than an H.264 MP4, convert it first:

```bash
npx remotion ffmpeg -i source.mov -c:v libx264 -crf 23 output.mp4
```

Put the converted file in your Remotion project's public/ folder. This makes it available to Remotion as a static asset.

Step 5: Open Claude Code and describe what you want

Open your terminal in the Remotion project directory and start a Claude Code session. Reference the skill explicitly to make sure Claude activates it:

Using the remotion-best-practices skill, take the video in public/output.mp4 and add animated word-by-word captions in white Inter font, 36px, centered at the bottom of the frame, with the active word highlighted in yellow. Match the timing to the audio.

Claude will write or update your Remotion composition code. You'll see it modify files in src/.

Step 6: Preview in Remotion Studio

Start the preview server:

bash

npx remotion studio

Open your browser at localhost:3000 (Remotion Studio). You'll see your video composition and can scrub through it, test animations, and check caption sync before committing to a render.

Make adjustments by going back to Claude Code and describing what to change. This loop (describe, preview, adjust) is the core of the workflow. You're not rendering between every iteration.

Step 7: Render the final video

When you're satisfied with the preview:

```bash
npx remotion render
```

A 1080p video typically takes 5–15 minutes to render locally. The output lands in the out/ directory.

5. How to Use AI for Video Cutting with Whisper + Claude

This scenario uses Whisper to transcribe your footage and produce a timestamped transcript. Claude then analyzes that transcript to identify which segments are worth keeping, and outputs a JSON file of keep/cut markers. Remotion renders only the kept segments into a final video.

Step 1: Complete all Scenario A setup first

Scenario B builds on everything in Scenario A. Make sure your project is created, the skill is installed, and your video is in H.264 MP4.

Step 2: Install ffmpeg

If ffmpeg isn't already on your machine:

```bash
npx remotion ffmpeg
```

Remotion bundles a version of ffmpeg through its CLI. This is the safest way to get a version that's compatible with Remotion's pipeline.

Step 3: Install whisper.cpp and a language model

whisper.cpp is a C/C++ port of OpenAI's Whisper transcription model that runs entirely locally: no API key, no cloud, no usage fees.

Install via Remotion's package:

```bash
npm install @remotion/install-whisper-cpp
npx remotion whisper install
```

This will download whisper.cpp binaries and prompt you to download a language model. For most content, the medium.en model balances speed and accuracy well. The large model is more accurate but slower.

Step 4: Transcribe your audio

Remotion provides a transcription command that handles the audio extraction and whisper.cpp processing:

```bash
npx remotion whisper transcribe public/output.mp4
```

This outputs a JSON file with word-level timestamps in the format:

```json
{ "text": "So today we're going to", "startMs": 1200, "endMs": 2800 }
```
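Those millisecond timestamps map directly onto Remotion frames. Here is a small sketch of how a transcript entry becomes a frame range; the helper name and the 30 fps assumption are ours, for illustration:

```typescript
// Convert a Whisper transcript entry (milliseconds) into the frame range
// during which its caption should be on screen, for a given frame rate.
type TranscriptEntry = { text: string; startMs: number; endMs: number };

function entryToFrames(entry: TranscriptEntry, fps: number): { from: number; to: number } {
  return {
    from: Math.floor((entry.startMs / 1000) * fps), // first visible frame
    to: Math.ceil((entry.endMs / 1000) * fps),      // last visible frame
  };
}

const entry: TranscriptEntry = { text: "So today we're going to", startMs: 1200, endMs: 2800 };
console.log(entryToFrames(entry, 30)); // { from: 36, to: 84 }
```

This per-entry mapping is what makes the word-by-word highlight animation possible: each word knows exactly which frames it owns.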

Step 5: Have Claude analyze the transcript and generate cut markers

This is where your Claude API key is required. In Claude Code, provide the transcript JSON and ask Claude to identify the segments worth keeping. Be specific about your criteria:

Here is the transcript JSON from my video. Identify the segments where I'm speaking clearly about the main topic and mark them as "keep". Mark all pauses longer than 2 seconds, filler sections, and off-topic tangents as "cut". Output a segments JSON with startMs, endMs, and action for each segment.

Claude will output something like:

```json
[
  { "startMs": 0, "endMs": 4500, "action": "keep" },
  { "startMs": 4500, "endMs": 7200, "action": "cut" },
  { "startMs": 7200, "endMs": 18400, "action": "keep" }
]
```
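Before handing this JSON to Remotion, a quick sanity check helps. Here is an illustrative sketch (the function names and the 30 fps assumption are ours, not part of Remotion) that converts the "keep" segments into frame ranges and totals the trimmed length:

```typescript
type Segment = { startMs: number; endMs: number; action: "keep" | "cut" };

// Frame ranges for the segments that survive the cut, at a given frame rate.
function keepRanges(segments: Segment[], fps: number): Array<{ from: number; to: number }> {
  return segments
    .filter((s) => s.action === "keep")
    .map((s) => ({
      from: Math.round((s.startMs / 1000) * fps),
      to: Math.round((s.endMs / 1000) * fps),
    }));
}

// Total duration of the trimmed output, in milliseconds.
function keptDurationMs(segments: Segment[]): number {
  return segments
    .filter((s) => s.action === "keep")
    .reduce((total, s) => total + (s.endMs - s.startMs), 0);
}

const segments: Segment[] = [
  { startMs: 0, endMs: 4500, action: "keep" },
  { startMs: 4500, endMs: 7200, action: "cut" },
  { startMs: 7200, endMs: 18400, action: "keep" },
];

console.log(keepRanges(segments, 30)); // [{ from: 0, to: 135 }, { from: 216, to: 552 }]
console.log(keptDurationMs(segments)); // 15700 (of 18400 ms of source)
```

Checking the kept duration against your expectations is a fast way to catch a bad cut list before spending render time on it.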

Important: Review this output before using it. Claude's AI-analyzed cut points are a strong starting point but are not frame-perfect. You may want to manually adjust timestamps for critical transitions before feeding the JSON to Remotion.

Step 6: Render the cut video

Pass the segments JSON to your Remotion composition. Claude Code can wire this up for you; describe what you want:

Using the remotion-best-practices skill, update the composition to read segments from public/segments.json and render only the "keep" segments from public/output.mp4 in sequence, with a 3-frame crossfade between each segment.

Preview, adjust, then render.

6. How to Prompt Claude for Better Video Results

The quality of your output is almost entirely determined by the specificity of your prompt. Remotion has a lot of knobs, and if you don't specify them, Claude makes assumptions that may not match what you had in mind.

Be explicit about every visual dimension

Weak prompt: "Add captions to the video."

Strong prompt: "Add word-by-word animated captions to the video. Font: Inter Bold, 40px, white. Position: horizontally centered, 80px from the bottom. The active word should scale up to 44px and turn #FFD700 yellow. Inactive words stay white and 40px. Animate in from opacity 0 over 4 frames."

The difference in output quality is significant. Every detail you leave unspecified is a detail Claude will decide for you.

Specify timing in frames, not vague language

Remotion works in frames. A 30fps video has 30 frames per second. When you say "fade in quickly," Claude will guess what that means. When you say "fade in over 8 frames," Claude has an exact number to work with.
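The conversion itself is trivial, which is exactly why it's worth stating explicitly in your prompts. A one-liner, assuming a 30 fps composition:

```typescript
// Seconds to frames at a known frame rate: the unit Remotion thinks in.
const secondsToFrames = (seconds: number, fps = 30): number => Math.round(seconds * fps);

console.log(secondsToFrames(0.25)); // 8: "fade in over 8 frames" is a quarter second
console.log(secondsToFrames(3));    // 90: "start at second 3" means frame 90
```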

Reference remotion.dev/prompts before writing your first prompt

The Remotion prompts library is a public gallery of community-created prompts with preview videos. Before prompting Claude for something you haven't tried before, find a similar example in the library. It will show you what's achievable and give you the right vocabulary to describe the result you want.

Ask for one thing at a time

Claude can handle multi-part requests, but video compositions are complex. If you ask for captions, a lower third, a progress bar, and an intro animation all in one prompt, the chance of something breaking is high. Build incrementally: get captions working, then add the lower third, then add the progress bar.

When something looks wrong, describe the visual state, not the fix

Instead of: "The animation is broken, fix it."

Say: "The caption text is appearing all at once instead of word by word. Each word should appear as it's spoken in the audio."

Claude can reason from a description of the visual problem to the code fix. It cannot reason from "broken" alone.

7. Common AI Video Editing Problems (And Fixes)

The video won't seek / captions are out of sync. Almost always a codec issue. Your source video is probably not H.264. Run the ffmpeg conversion command from Step 4 of Scenario A and replace your source file.

Claude Code generates code that throws errors in Remotion Studio. The skill may be out of date. Run npx remotion skills update to get the latest rules and patterns, then ask Claude to regenerate the component.

The preview server won't start. Check that you're in the right directory (the Remotion project root, where package.json is). Then check that Node.js is installed correctly with node --version. If you see a version number, run npm install inside the project directory to ensure all dependencies are installed.

Whisper transcription is slow. The large model is accurate but can take 2–3× real-time on a standard laptop. Switch to medium.en for faster transcription if perfect accuracy isn't required. base.en is the fastest but less accurate for technical or accented speech.

AI-generated cut points don't match the actual content. This is expected. Claude's transcript analysis identifies likely cut points but cannot see the video; it only sees the words and timestamps. Review the segments JSON manually and adjust any timestamps that feel off. Remotion's frame-level control means you can be precise once you're in the JSON.

Render fails partway through. This usually means Remotion ran out of memory on a long video or complex composition. Try rendering in segments (the --frames flag on npx remotion render accepts a range like --frames=0-500) and concatenating the outputs with ffmpeg.

Captions overlap with existing on-screen text. You need to specify the position more carefully. Give Claude explicit coordinates: "Position the caption block at Y: 820px from the top, horizontally centered, in a 1920×1080 frame." If the video has a speaker title lower third, note that in your prompt so Claude positions captions to avoid overlap.

8. FAQs

Do I need to know React or JavaScript to use this workflow? No. Claude writes all the Remotion code for you. You describe in English, Claude generates the code, you preview and iterate. That said, being able to read code helps when things go wrong β€” you can describe the error to Claude more precisely.

Does this replace video editing software like Premiere or DaVinci? For motion graphics, caption overlays, and programmatic edits, it can replace those workflows. For anything involving complex multicam editing, color grading, or audio mixing, it doesn't. Think of it as a complement: edit and clean your footage in your usual tool, then use Remotion for the programmatic layers on top.

How much does this cost? Remotion itself is free for personal and non-commercial use. Rendering is local, so there are no per-minute cloud fees. You'll need access to Claude Code (a paid Claude subscription or API credits), and for Scenario B the whisper.cpp model runs locally with no API cost. The only additional cost is if you use paid Remotion components like Animated Captions.

What video formats can Remotion output? By default, Remotion renders to H.264 MP4. It also supports WebM, HEVC, ProRes, and GIF. The format is set with the --codec flag when rendering.

Can I use this for videos longer than a few minutes? Yes, but render times scale with video length and composition complexity. A 10-minute video with animated captions can take 30–60 minutes to render locally. For long-form content, consider rendering in segments.

Can the Remotion workflow generate AI voiceover? Not natively, but it integrates well with tools like ElevenLabs. You generate the audio externally, export it as an MP3 or WAV, place it in the public/ folder, and reference it in your Remotion composition. Claude can wire the audio source into the composition for you.

What's the difference between the Remotion Agent Skill and the Remotion MCP? The agent skill is a knowledge file that teaches Claude how to write Remotion code. The Remotion MCP is a separate server that gives Claude runtime access to indexed Remotion documentation via a vector database. In practice, the agent skill handles most use cases. The MCP is more useful if you're building complex custom compositions and need Claude to look up specific API details.

My source video is a screen recording from Loom or Zoom. Will this work? These files are often exported as MP4 but encoded with H.265 or with variable frame rates, which can cause seeking issues in Remotion. Always run the ffmpeg conversion step regardless of the file extension to ensure you have a reliable H.264 source.

Got a burning question about AI tools or workflows? Hit reply or drop a comment. You might just inspire the next guide.


💌 We'd Love Your Feedback

Got 30 seconds? Tell us what you liked (or didn't).

Until next time,
Team PracticalyAI
