Text-to-Video Editing

4 papers with code • 0 benchmarks • 0 datasets

Most implemented papers

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

chenyangqiqi/fatezero • ICCV 2023

We also have a better zero-shot shape-aware editing ability based on the text-to-video model.

ControlVideo: Conditional Control for One-shot Text-driven Video Editing and Beyond

thu-ml/controlvideo • 26 May 2023

This paper presents ControlVideo for text-driven video editing, that is, generating a video that aligns with a given text while preserving the structure of the source video.

Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising

g-u-n/gen-l-video • 29 May 2023

To address this challenge, we introduce a novel paradigm dubbed Gen-L-Video, capable of extending off-the-shelf short video diffusion models to generate and edit videos comprising hundreds of frames with diverse semantic segments, without additional training and while preserving content consistency.

Contextualized Diffusion Models for Text-Guided Image and Video Generation

yangling0818/contextdiff • 26 Feb 2024

To address this issue, we propose a novel and general contextualized diffusion model (ContextDiff) that incorporates the cross-modal context, encompassing interactions and alignments between the text condition and the visual sample, into the forward and reverse processes.
