VeloEdit: Training-Free Consistent and Continuous Instruction-Based Image Editing via Velocity Field Decomposition

Zongqing Li1,2 Zhihui Liu2 Yujie Xie2 Shansiyuan Wu2 Hongshen Lv1 Songzhi Su1
1 Xiamen University
2 Truesight

TLDR

We introduce VeloEdit, a training-free method that enables consistent and continuous image editing by decomposing and manipulating the velocity field in diffusion models. Our method automatically identifies preservation and editing regions, achieving smooth multi-intensity editing results without any additional training.

Under Review
[Interactive demo: 'Convert to color image', with a slider from 0.0 to 1.0]

Abstract

Instruction-based image editing aims to modify source content according to textual instructions. However, existing methods built upon flow matching often struggle to maintain consistency in non-edited regions due to denoising-induced reconstruction errors that cause drift in preserved content. Moreover, they typically lack fine-grained control over edit strength. To address these limitations, we propose VeloEdit, a training-free method that enables highly consistent and continuously controllable editing. VeloEdit dynamically identifies editing regions by quantifying the discrepancy between the velocity fields responsible for preserving source content and those driving the desired edits. Based on this partition, we enforce consistency in preservation regions by substituting the editing velocity with the source-restoring velocity, while enabling continuous modulation of edit intensity in target regions via velocity interpolation. Unlike prior works that rely on complex attention manipulation or auxiliary trainable modules, VeloEdit operates directly on the velocity fields. Extensive experiments on Flux.1 Kontext and Qwen-Image-Edit demonstrate that VeloEdit improves visual consistency and editing continuity with negligible additional computational cost.

Strength Controlled Image Editing

[Interactive examples, each with an edit-strength slider from 0.0 to 1.0]
'Add flowers to the helmet'
'Turn off the light'
'Convert to pixel style'

Motivation

We extend instruction-driven image editing models to provide continuous control over edit strength while maintaining visual consistency in non-edited regions. Unlike existing methods that require training or complex attention manipulation, VeloEdit operates directly on the velocity field of diffusion models, enabling a simple yet effective approach to controllable editing. By analyzing velocity field similarity, we automatically identify which regions should be preserved and which should be edited, then smoothly interpolate between source and editing velocities to achieve fine-grained control.

Method

VeloEdit decomposes the velocity field into preservation and editing components based on similarity analysis. The key insight is that regions where the source-preserving and editing velocities are highly similar should remain unchanged, while regions where they differ are the ones that require editing control. The method consists of three steps:

(1) Velocity Field Decomposition: we compute the cosine similarity between the source and editing velocities at each spatial location.
(2) Consistency Preservation: in high-similarity regions, we replace the editing velocity with the source velocity.
(3) Continuous Intensity Control: in low-similarity regions, we interpolate between the two velocities using a strength parameter α ∈ [0, 1].

This approach requires no training and provides both consistency and continuous control in a unified framework. Because it operates only on the predicted velocities, it can in principle be applied to other velocity-predicting editing models; our experiments cover Flux.1 Kontext and Qwen-Image-Edit.
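The three steps above can be sketched for a single denoising step as follows. This is a minimal NumPy illustration, not the authors' implementation. The function name `blend_velocities`, the (H, W, C) array layout, the hard similarity threshold `tau` and its default of 0.9, and the interpolation direction (α = 0 gives the source velocity, α = 1 gives the editing velocity) are all assumptions made for the example; the text above does not specify how the high- and low-similarity regions are separated.

```python
import numpy as np


def blend_velocities(v_src, v_edit, alpha, tau=0.9, eps=1e-8):
    """Combine source and editing velocities for one denoising step.

    v_src, v_edit: arrays of shape (H, W, C), one velocity vector per
        spatial location (layout assumed for illustration).
    alpha: edit strength in [0, 1].
    tau: hypothetical cosine-similarity threshold separating preservation
        regions from editing regions (not taken from the paper).
    Returns the blended velocity and the boolean preservation mask.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    # Step 1: cosine similarity between the two velocities at each location.
    dot = np.sum(v_src * v_edit, axis=-1)
    norms = np.linalg.norm(v_src, axis=-1) * np.linalg.norm(v_edit, axis=-1)
    cos_sim = dot / (norms + eps)

    # High similarity -> preservation region; low similarity -> editing region.
    preserve = cos_sim >= tau  # shape (H, W)

    # Step 3: linear interpolation used inside the editing region
    # (assumed direction: alpha = 0 -> source, alpha = 1 -> full edit).
    v_interp = (1.0 - alpha) * v_src + alpha * v_edit

    # Step 2: in preservation regions, use the source velocity unchanged.
    v_out = np.where(preserve[..., None], v_src, v_interp)
    return v_out, preserve
```

In an actual sampler, the two inputs would be the model's velocity predictions for the source-restoring and editing branches at the current timestep, and the returned velocity would be passed to the usual integration step. Sweeping `alpha` from 0 to 1 then corresponds to the edit-strength slider shown in the examples.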

[Figure: Method architecture]

Results

[Interactive examples, each with an edit-strength slider from 0.0 to 1.0]
'Turn into Van Gogh's style'
'Make the bird fluffy'
'Make her hair curly'
'It is daytime now'
'It is raining now'
'The lake is frozen'
'Turn into a simple line drawing'
'Add a graffiti to the girl's face'
'Turn the horse into a bronze horse'
'Make the car shiny and brand-new'