Every other video model on Cliprise starts from nothing — a text prompt, an image, or both — and generates video. Luma Modify starts from footage you already have.
Released as Ray3 Modify by Luma AI on December 18, 2025, this is a video-to-video editing model with one central purpose: take recorded footage and transform the scene — environment, costume, character appearance, visual style — while keeping the original performance exactly as it was filmed. The actor's movement, timing, eye line, and emotional delivery stay intact. Everything else can change.
This is a different tool for a different moment in production. Not for generating new content. For transforming footage that already exists.
What Luma Modify Actually Does
The problem Luma Modify was built to solve has existed since AI video tools became production-relevant: generating video from scratch is expressive but hard to control, while real human performance captured on camera is authentic but locked to the original environment and conditions of the shoot.
Luma Modify bridges these two. You provide filmed footage as input — a real performance, captured with a real camera — and the model treats that human performance as the directorial instruction. Your text prompt describes how the scene should change. The model applies that transformation while treating the original motion, timing, framing, and emotional delivery as constraints to preserve rather than elements to replace.
In Luma AI's description: the human performer becomes the source of direction for AI. The model conditions its generation on the input footage, which means it follows the real-world motion rather than inventing new motion from a text description.
Practically, this means:
- You filmed an actor walking through a studio. Luma Modify can place that same walk in a forest, a city street, or a spaceship corridor — without re-shooting.
- You filmed a product demonstration. Luma Modify can change the background setting, the presenter's costume, or the visual style for different market versions — without new shoots for each version.
- You filmed a performance for a brand campaign. Luma Modify can apply a character reference to replace the actor's appearance with a brand mascot or animated character, keeping the performance timing and movement from the original take.
Three Controls That Define the Workflow
1. Modify Strength: Adhere → Reimagine
Every transformation in Luma Modify is governed by the Modify Strength slider. This is the most important single control in the model.
Adhere (low strength): Conservative transformation. The model makes targeted changes close to your source footage. Use this for: costume swaps where you want the rest of the frame unchanged, subtle background adjustments, lighting mood changes, localized element swaps.
Reimagine (high strength): Extensive transformation. The model applies significant changes to the visual environment based on your prompt. Use this for: complete environment changes, dramatic style transformations, placing a performance in a completely different world.
The practical rule: Start at Adhere. If the change is not sufficient, move toward Reimagine. Moving in this direction progressively applies more extensive transformation — but also increases the risk of performance detail loss at the far end.
Luma AI's guidance is explicit: match the strength setting to the scope of the modification you want. A costume swap belongs at Adhere. Teleporting a performer from a studio to the surface of Mars belongs at Reimagine.
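The start-at-Adhere rule can be sketched as a simple escalation loop. Everything here is illustrative: `modify_clip` and `is_sufficient` are hypothetical stand-ins for whatever runs a Modify pass and judges its output, and the numeric strength scale (low = Adhere, high = Reimagine) is an assumption, not the platform's actual parameter.

```python
# Hypothetical sketch of "start at Adhere, escalate toward Reimagine".
# modify_clip() and is_sufficient() are placeholders, not real Cliprise/Luma APIs.

def escalate_strength(clip, prompt, modify_clip, is_sufficient,
                      start=0.2, step=0.15, maximum=0.9):
    """Run Modify passes at increasing strength until the change is sufficient.

    Returns (output, strength_used), or (last_output, maximum) if the change
    was never judged sufficient within the allowed range.
    """
    strength = start
    output = None
    while strength <= maximum:
        output = modify_clip(clip, prompt, strength=strength)
        if is_sufficient(output):
            return output, strength
        strength = round(strength + step, 2)  # step toward Reimagine
    return output, maximum
```

Capping `maximum` below the top of the scale reflects the warning above: the far Reimagine end risks losing performance detail, so an automated sweep should stop short of it.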
2. Start and End Frame (Keyframe Control)
Start and End Frame lets you define the exact opening composition and the exact closing composition of the modified output. You provide two images — one showing where the shot should begin, one showing where it should end — and the model generates the transformation between them.
This gives you precise control over transitions, reveals, and scene continuations. The use cases:
- Continuity across cuts: The end frame of one shot matches the start frame of the next, creating visual continuity across a sequence even when the underlying source clips are from different recordings.
- Controlled reveals: The shot begins at a specific framing and resolves at a defined endpoint — a pull-back, a turn, a transition to a new environment.
- Character state transitions: Begin with a neutral expression frame, end with an emotional reaction frame, and let the model generate the natural transition between them.
Per Luma AI's documentation: ensure the target of your modification is visible in the Start or End Frame for best results. If you want a costume to appear a certain way at the end of the shot, show that costume in the End Frame.
3. Character Reference
Character Reference allows you to provide a still image of a target character and apply that character's appearance — likeness, costume, identity — onto the performer in your source footage. The model replaces the visual appearance while keeping all original performance data: movement, timing, eye line, emotional delivery.
The reference can be: a photograph of a real person (for brand or actor continuity), an illustrated character design (for animation or game workflows), a brand mascot, or any other character visual identity.
This is the model's most distinctive capability. It enables a clean separation between performance direction (handled on set with a real actor and camera) and character identity (applied in post via the reference). A small crew can shoot multiple performances with a single stand-in talent, then apply a different character identity to each performance for different deliverables.
When to Use Luma Modify vs Other Video Models
Luma Modify serves a specific production moment: post-capture transformation of footage you already have. It is not a replacement for generation models — it is a different tool used at a different stage.
| Situation | Correct model |
|---|---|
| You have footage and want to change the scene/costume | Luma Modify |
| You want to apply a character identity to a performance | Luma Modify |
| You need specific start/end frame control on a transformation | Luma Modify |
| You need to generate new video from a text description | Kling 3.0, Veo 3.1, Sora 2 |
| You need to generate video from a still image | Kling 2.5 Turbo, Kling 3.0 |
| You need video with native audio from generation | Kling 3.0, Seedance 2.0 |
A note on Luma models on Cliprise: Luma Modify (Ray3 Modify) is available on Cliprise as a video editing model. Luma's text-to-video and image-to-video generation models (Ray3 standard generation) are not currently on Cliprise. For those generation use cases, Kling 3.0, Veo 3.1, and Sora 2 are the primary video generation models available on the platform.
Practical Workflow: Performance Capture to Multi-Context Campaign Asset
This workflow demonstrates the most common professional use case: a single performance shoot producing multiple campaign versions.
Shoot: Film a talent performance in a standard studio. Clean lighting, neutral or greenscreen background, good camera control (slow to medium movement — faster movement gives the model less clean input). Keep clips under 10 seconds per segment.
Version 1 — Outdoor lifestyle context: Upload clip to Luma Modify. Set Modify Strength toward middle. Prompt: "Urban street scene, late afternoon golden hour, pedestrians in background, natural environment." Review output — if transformation is insufficient, increase strength slightly.
Version 2 — Brand environment: Return to original clip. Prompt: "Modern retail interior, branded environment, clean professional setting." Apply appropriate strength.
Version 3 — Character reference: In Character Reference mode, upload the source clip together with a character reference image (an illustrated brand mascot or a specific talent). The model applies the character identity to the original performance.
Assembly: Three distinct versions from one shoot, each appropriate for a different placement in the campaign — outdoor digital, retail, and character-branded content.
Post-processing: Run outputs through Topaz Video Upscaler if the delivery specification requires higher resolution than the source footage.
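The one-shoot, three-version workflow above can be expressed as a small batch plan. This is a convenience sketch only: the job dictionaries and their field names are assumptions for illustration, not a real Cliprise or Luma API schema.

```python
# Illustrative batch plan for the one-shoot, three-version workflow above.
# Field names ("mode", "strength", etc.) are assumptions, not a real API schema.

def build_campaign_jobs(source_clip, mascot_reference):
    """Return one Modify job per campaign version, all from a single source clip."""
    return [
        # Version 1: outdoor lifestyle context, mid-range strength
        {"clip": source_clip, "mode": "modify", "strength": "middle",
         "prompt": ("Urban street scene, late afternoon golden hour, "
                    "pedestrians in background, natural environment")},
        # Version 2: brand environment, same source clip
        {"clip": source_clip, "mode": "modify", "strength": "middle",
         "prompt": ("Modern retail interior, branded environment, "
                    "clean professional setting")},
        # Version 3: character reference applied to the same performance
        {"clip": source_clip, "mode": "character_reference",
         "reference_image": mascot_reference,
         "prompt": "Apply character identity, preserve original performance"},
    ]

jobs = build_campaign_jobs("studio_take_01.mp4", "mascot.png")
```

The point of the structure is that every job shares the same `source_clip`: the performance is captured once, and only the transformation parameters vary per deliverable.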
Prompting for Transformation
Luma Modify responds to clear scene description prompts — not creative narrative, but specific visual environment and style instructions.
What to specify:
- Environment type and setting: "mountain forest in winter," "modern office interior," "cyberpunk city at night"
- Lighting condition: "golden hour sunlight," "overcast diffused light," "dramatic studio key light"
- Atmosphere and style: "photorealistic," "stylized animation aesthetic," "cinematic color grade"
- Specific elements to change: "change costume to business attire," "swap background to branded retail setting"
What to avoid:
- Describing the performance or actor behavior — Luma Modify follows the source footage for performance, not the prompt
- Overly complex multi-step instructions — separate transformations into distinct generation passes if needed
- Prompting for elements that conflict with the source footage's camera movement or perspective
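The specify/avoid guidance above can be captured in a small prompt builder that assembles only scene-description components (environment, lighting, style, element changes) and has no slot for performance direction. This is a convenience sketch, not a Cliprise feature; the comma-joined format is an assumption.

```python
# Sketch of a prompt builder that assembles only scene-description components,
# per the specify/avoid guidance above. Not a Cliprise or Luma feature.

def build_modify_prompt(environment, lighting=None, style=None, changes=()):
    """Join environment, lighting, style, and element changes into one prompt.

    Deliberately has no parameter for actor behavior: performance comes from
    the source footage, not the prompt.
    """
    parts = [environment]
    if lighting:
        parts.append(lighting)
    if style:
        parts.append(style)
    parts.extend(changes)  # e.g. "change costume to business attire"
    return ", ".join(parts)

prompt = build_modify_prompt(
    "cyberpunk city at night",
    lighting="dramatic studio key light",
    style="cinematic color grade",
    changes=["change costume to business attire"],
)
```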
On aspect ratio: If your source footage is not in a standard preset ratio (16:9, 9:16, 1:1, 4:3), the model will crop to the nearest preset. Check which elements will be preserved in the crop before uploading — adjust your source clip framing if needed.
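You can check the cropping behavior before upload with a small calculation: find the preset ratio nearest to the source frame, then the largest centered crop at that ratio. The preset list comes from the paragraph above; the nearest-match rule and center-crop assumption are illustrative guesses at the behavior, not documented mechanics.

```python
# Find the nearest preset aspect ratio and the largest crop at that ratio.
# Preset list from the text; nearest-match selection is an assumption.

PRESETS = {"16:9": 16 / 9, "9:16": 9 / 16, "1:1": 1.0, "4:3": 4 / 3}

def nearest_preset_crop(width, height):
    """Return (preset_name, crop_width, crop_height) for a source frame."""
    source = width / height
    name, target = min(PRESETS.items(), key=lambda kv: abs(kv[1] - source))
    if source > target:            # source too wide: trim the sides
        crop_w, crop_h = int(round(height * target)), height
    else:                          # source too tall: trim top and bottom
        crop_w, crop_h = width, int(round(width / target))
    return name, crop_w, crop_h
```

A 2000x1000 source (2.0:1), for example, is nearest to 16:9 and would lose roughly 220 pixels of width, so anything near the left and right edges should be treated as expendable when framing the shoot.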
Real Limitations
10-second input limit. For longer sequences, segment into clips and assemble in post. This is a real constraint for longer-form content — plan segmentation in advance so cut points fall at natural pause points in the performance.
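The segmentation advice above can be sketched as a planning step: given a total duration and candidate pause points in the performance, choose cut times so no segment exceeds the limit, preferring the latest natural pause in each window. The greedy selection is an assumption about how one might plan cuts, not a platform feature.

```python
# Plan cut points so every segment stays within the 10-second input limit,
# preferring natural pause points in the performance. Illustrative sketch.

def plan_cuts(duration, pauses, limit=10.0):
    """Return cut times for a clip of `duration` seconds.

    Picks the latest pause inside each window of `limit` seconds, falling
    back to a hard cut at the limit when no pause is available.
    """
    cuts, start = [], 0.0
    while duration - start > limit:
        window = [p for p in pauses if start < p <= start + limit]
        cut = max(window) if window else start + limit
        cuts.append(cut)
        start = cut
    return cuts
```

For a 25-second take with pauses at 4, 9, 13, 18, and 24 seconds, this yields cuts at 9 and 18, producing three segments that all land on natural pauses and stay under the limit.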
Fine detail loss at high strength. At the Reimagine end of the strength scale, subtle performance details — micro-expressions, fine hand movements — can be partially lost in heavy transformations. For performance-critical content, test at lower strength first.
Fast camera movement. Rapid handheld movement, fast pans, and action-camera footage give the model less stable input to condition on. Smoother, more controlled camera movement produces more reliable transformation results.
No audio generation. Like Flux Kontext, Luma Modify is a visual model. Audio — music, sound design, voiceover — is added in post-production. If your final output requires audio synchronized to the performance, add it after transformation.
Related Articles
- AI Video Generation 2026: 22+ Models, Workflows, and What Actually Works — Full video model landscape
- Kling 3.0 Complete Guide 2026 — Primary generation model for new video
- Kling 2.5 Turbo Complete Guide 2026 — High-motion video generation
- AI Avatar Video Generator 2026: Complete Guide — AI presenter and talking-head video
- AI Video Ads 2026: Complete Guide to AI-Powered Advertising — Commercial production workflows
- Best AI Video Models on Cliprise 2026: Ranked by Use Case — Model selection by use case