# [CVPR 2024] Video-P2P: Video Editing with Cross-attention Control

The official implementation of Video-P2P.

Shaoteng Liu, Yuechen Zhang, Wenbo Li, Zhe Lin, Jiaya Jia

## Changelog

- 2023.03.20 Release Demo.
- 2023.03.19 Release Code.
- 2023.03.09 Paper preprint on arXiv.

## Todo

- Release the code with 6 examples.
- Update a faster version.
- Release data.
- Release the Gradio Demo.
- Add local Gradio Demo.
- Release more configs and new applications.

## Setup

    pip install -r requirements.txt

The code was tested on both Tesla V100 32GB and RTX3090 24GB. At least 20GB of VRAM is required.

The environment is similar to Tune-A-Video and prompt-to-prompt.

xformers on a 3090 may run into this issue.

## Quickstart

Replace `pretrained_model_path` with the path to your Stable Diffusion model. To download the pre-trained model, refer to diffusers.

    # Stage 1: Tuning for model initialization.
    # You can reduce the number of tuning epochs to speed this up.
    python run_tuning.py --config="configs/rabbit-jump-tune.yaml"

    # Stage 2: Attention control
    # Faster mode (1 min on V100):
    python run_videop2p.py --config="configs/rabbit-jump-p2p.yaml" --fast
    # Official mode (10 mins on V100, more stable):
    python run_videop2p.py --config="configs/rabbit-jump-p2p.yaml"

Find your results in `Video-P2P/outputs/xxx/results`.

## Dataset

We release our dataset here. Download it under `./data` and explore your creativity!

## Results

- `configs/rabbit-jump-p2p.yaml`
- `configs/penguin-run-p2p.yaml`
- `configs/man-motor-p2p.yaml`
- `configs/car-drive-p2p.yaml`
- `configs/tiger-forest-p2p.yaml`
- `configs/bird-forest-p2p.yaml`

## Gradio demo

Run the following command to launch the local demo built with gradio:

    python app_gradio.py

Find the demo on HuggingFace here. The demo code borrows heavily from Tune-A-Video.
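The two Quickstart stages can also be prepared from a small helper script. The sketch below is not part of the repository: `set_model_path` and `stage_commands` are hypothetical helper names. It assumes that `pretrained_model_path` appears as a single top-level `key: value` line in each YAML config, and that every example ships a `configs/<name>-tune.yaml` next to its `configs/<name>-p2p.yaml`, as `rabbit-jump` does. It only builds the command lines; run them from the repository root.

```python
import re
import shlex
import sys
from typing import List


def set_model_path(config_text: str, model_path: str) -> str:
    """Return config_text with the pretrained_model_path value replaced.

    Assumption: the key sits on its own top-level line, e.g.
    `pretrained_model_path: "./some/path"`. Other lines are left untouched.
    """
    pattern = re.compile(r"^(pretrained_model_path[ \t]*:[ \t]*).*$", flags=re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: f'{m.group(1)}"{model_path}"', config_text
    )
    if count != 1:
        raise ValueError(
            f"expected exactly one pretrained_model_path line, found {count}"
        )
    return new_text


def stage_commands(name: str, fast: bool = False,
                   python: str = sys.executable) -> List[List[str]]:
    """Build the Stage 1 (tuning) and Stage 2 (attention control) commands.

    Assumption: configs follow the configs/<name>-tune.yaml and
    configs/<name>-p2p.yaml naming used by the rabbit-jump example.
    """
    tune = [python, "run_tuning.py", f"--config=configs/{name}-tune.yaml"]
    edit = [python, "run_videop2p.py", f"--config=configs/{name}-p2p.yaml"]
    if fast:
        edit.append("--fast")  # faster mode from the Quickstart
    return [tune, edit]


if __name__ == "__main__":
    # Print the commands instead of running them; pass each list to
    # subprocess.run(cmd, check=True) to execute the stages in order.
    for cmd in stage_commands("rabbit-jump", fast=True, python="python"):
        print(shlex.join(cmd))
```

To apply the path change to a file, read the config with `pathlib.Path.read_text`, pass the text through `set_model_path`, and write it back. Keeping the rewrite line-based avoids an extra YAML dependency, but it will raise `ValueError` if the key is missing, nested, or repeated.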
## Citation

    @misc{liu2023videop2p,
      author={Liu, Shaoteng and Zhang, Yuechen and Li, Wenbo and Lin, Zhe and Jia, Jiaya},
      title={Video-P2P: Video Editing with Cross-attention Control},
      journal={arXiv:2303.04761},
      year={2023},
    }

## References

- prompt-to-prompt: https://github.com/google/prompt-to-prompt
- Tune-A-Video: https://github.com/showlab/Tune-A-Video
- diffusers: https://github.com/huggingface/diffusers