Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation

Language: 中文 | English

1 minute read

Published:

Note: this is the English version paired with the Chinese post Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation.

Abstract

Visual effects (VFX) are central to modern video and film production. Although recent video generation models enable low-cost VFX creation, they are typically trained with single-effect LoRA adapters and therefore cannot produce multiple effects at user-specified locations. To address cross-effect interference and the lack of spatial controllability in joint multi-VFX training, we propose Omni-Effects, the first unified framework for prompt-driven and spatially controllable composite VFX generation. The core design includes: 1) a LoRA-based Mixture-of-Experts (LoRA-MoE) module that integrates diverse effects in a single model while alleviating inter-task interference; and 2) a Spatial-Aware Prompt (SAP) module that injects spatial masks into text tokens for precise spatial control, equipped with an Independent-Information Flow (IIF) submodule to isolate control signals of different effects and avoid unwanted blending. We further construct the Omni-VFX dataset and a dedicated VFX evaluation protocol. Extensive experiments demonstrate that Omni-Effects achieves accurate spatial control and diverse, high-quality effects, supporting user-defined effect types and locations.

Introduction

Briefly introduce:

  • the role and cost of traditional VFX production;
  • limitations of single-effect LoRA-based methods for real-world workflows;
  • the need for unified, spatially-controllable multi-effect generation.

Method

1. Problem Definition

Formulate unified VFX generation with:

  • multiple effect types;
  • user-specified spatial regions;
  • quality, independence and controllability requirements.

2. Approach

Describe:

  • the LoRA-MoE module: expert design, routing / combination strategy and how it reduces cross-effect interference;
  • the SAP module: how spatial masks are embedded into prompts;
  • the IIF design: how information flow is separated across effects.

3. Data and Training

Summarize the Omni-VFX dataset construction pipeline and the training setup for the unified model.

4. Results and Analysis

Highlight:

  • single-effect quality vs. single-LoRA baselines;
  • spatial accuracy compared with existing editing / generation methods;
  • independence of multiple effects on the same frame.

Conclusion and Future Work

Summarize the contributions and outline:

  • extension to higher-resolution and production-grade VFX;
  • better temporal modeling for long videos;
  • interactive tools built on top of Omni-Effects.

References