Yuchao Gu

Ph.D. Student

Show Lab
National University of Singapore

Email: yuchaogu9710 [at] gmail.com

Biography

Hi there! I am a third-year Ph.D. student in Show Lab @ NUS, working with Prof. Mike Shou. Before that, I received my master's degree from Nankai University in 2022, working with Prof. Ming-Ming Cheng. In 2019, I received my bachelor's degree from Beijing University of Chemical Technology, working with Prof. Wei Hu.

My previous research focused on visual generation, particularly foundational generative modeling and instance control:

Foundational generative modeling: my work first reveals the reconstruction and generation dilemma of vision tokenizers in (VQFR, ECCV 2022) and (RethinkVQ, CVPR 2024). In the video generation domain, my work introduces a frame autoregressive model (FAR, arXiv 2025) for long-context video modeling. I also collaborated on unified models for understanding and generation (Show-O, ICLR 2025) and on foundation models for video generation (Show-1, IJCV 2024) and (Tune-A-Video, ICCV 2023).
Instance control in generation: my work investigates multi-instance identity control (Mix-of-Show, NeurIPS 2023), multi-instance position control (ROICtrl, CVPR 2025) and instance trajectory control in video generation (VideoSwap, CVPR 2024). I also collaborated on instance motion control in video generation (DragAnything, ECCV 2024) and (MotionDirector, ECCV 2024).

My current research interests are long-context video generation and unified models. I am always open to discussion and collaboration. Feel free to reach out.

News

2025 Feb: One paper (ROICtrl) got accepted by CVPR 2025.
2025 Jan: One paper (Show-O) got accepted by ICLR 2025.
2024 Sept: One paper (EvolveDirector) got accepted by NeurIPS 2024.
2024 July: Two papers (MotionDirector, DragAnything) got accepted by ECCV 2024, with MotionDirector selected for Oral presentation.
2024 Feb: Four papers (VideoSwap, RethinkVQ, DynVideo-E, MaskINT) got accepted by CVPR 2024.
2023 Sept: Two papers (Mix-of-Show, DatasetDM) got accepted by NeurIPS 2023.
2023 Jun: Joined Meta GenAI as a Research Intern in Menlo Park, CA.
2023 July: One paper (Tune-A-Video) got accepted by ICCV 2023.
2022 July: One paper (VQFR) got accepted by ECCV 2022 as Oral.
2022 July: Joined Show Lab @ NUS to start my Ph.D. journey!
2021 Sep: Joined Tencent ARC Lab as a Research Intern.

Selected Publications

	Long-Context Autoregressive Video Modeling with Next-Frame Prediction. Yuchao Gu, Weijia Mao and Mike Zheng Shou. arXiv, 2025 [project] [paper] [code]
	ROICtrl: Boosting Instance Control for Visual Generation. Yuchao Gu, Yipin Zhou, Yunfan Ye, Yixin Nie, Licheng Yu, Pingchuan Ma, Kevin Qinghong Lin and Mike Zheng Shou. CVPR, 2025 [project] [paper] [code]
	Show-O: One Single Transformer to Unify Multimodal Understanding and Generation. Jinheng Xie, Weijia Mao, Zechen Bai, David Junhao Zhang, Weihao Wang, Kevin Qinghong Lin, Yuchao Gu, Zhijie Chen, Zhenheng Yang and Mike Zheng Shou. ICLR, 2025 [project] [paper] [code]
	Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation. David Junhao Zhang, Jay Zhangjie Wu, Jia-Wei Liu, Rui Zhao, Lingmin Ran, Yuchao Gu, Difei Gao and Mike Zheng Shou. IJCV, 2024 [project] [paper] [code]
	MotionDirector: Motion Customization of Text-to-Video Diffusion Models. Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, David Junhao Zhang, Jiawei Liu, Weijia Wu, Jussi Keppo and Mike Zheng Shou. European Conference on Computer Vision (ECCV), 2024. Oral. [project] [paper] [code]
	VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence. Yuchao Gu, Yipin Zhou, Bichen Wu, Licheng Yu, Jiawei Liu, Rui Zhao, Jay Zhangjie Wu, David Junhao Zhang, Mike Zheng Shou and Kevin Tang. IEEE Computer Vision and Pattern Recognition Conference (CVPR), 2024 [project] [paper] [code]
	Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis. Yuchao Gu, Xintao Wang, Yixiao Ge, Ying Shan, Xiaohu Qie and Mike Zheng Shou. IEEE Computer Vision and Pattern Recognition Conference (CVPR), 2024 [paper] [code]
	Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models. Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan and Mike Zheng Shou. Neural Information Processing Systems (NeurIPS), 2023 [project] [paper] [code]
	Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation. Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Weixian Lei, Yuchao Gu, Wynne Hsu, Ying Shan, Xiaohu Qie and Mike Zheng Shou. International Conference on Computer Vision (ICCV), 2023 [project] [paper] [code]
	VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder. Yuchao Gu, Xintao Wang, Liangbin Xie, Chao Dong, Gen Li, Ying Shan and Ming-Ming Cheng. European Conference on Computer Vision (ECCV), 2022. Oral. [project] [paper] [code]