- Article, November 2024
ReSyncer: Rewiring Style-Based Generator for Unified Audio-Visually Synced Facial Performer
- Jiazhi Guan,
- Zhiliang Xu,
- Hang Zhou,
- Kaisiyuan Wang,
- Shengyi He,
- Zhanwang Zhang,
- Borong Liang,
- Haocheng Feng,
- Errui Ding,
- Jingtuo Liu,
- Jingdong Wang,
- Youjian Zhao,
- Ziwei Liu
Abstract: Lip-syncing videos with given audio is the foundation for various applications including the creation of virtual presenters or performers. While recent studies explore high-fidelity lip-sync with different techniques, their task-orientated models ...
- research-article, July 2023
Efficient Video Portrait Reenactment via Grid-based Codebook
- Kaisiyuan Wang,
- Hang Zhou,
- Qianyi Wu,
- Jiaxiang Tang,
- Zhiliang Xu,
- Borong Liang,
- Tianshu Hu,
- Errui Ding,
- Jingtuo Liu,
- Ziwei Liu,
- Jingdong Wang
SIGGRAPH '23: ACM SIGGRAPH 2023 Conference Proceedings, Article No.: 66, Pages 1–9
https://doi.org/10.1145/3588432.3591509
While progress has been made in the field of portrait reenactment, the problem of how to efficiently produce high-fidelity and accurate videos remains. Recent studies build direct mappings between driving signals and their predictions, leading to ...
- research-article, February 2023
Robust video portrait reenactment via personalized representation quantization
- Kaisiyuan Wang,
- Changcheng Liang,
- Hang Zhou,
- Jiaxiang Tang,
- Qianyi Wu,
- Dongliang He,
- Zhibin Hong,
- Jingtuo Liu,
- Errui Ding,
- Ziwei Liu,
- Jingdong Wang
AAAI'23/IAAI'23/EAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, Article No.: 285, Pages 2564–2572
https://doi.org/10.1609/aaai.v37i2.25354
While progress has been made in the field of portrait reenactment, the problem of how to produce high-fidelity and robust videos remains. Recent studies normally find it challenging to handle rarely seen target poses due to the limitation of source data. ...
- research-article, November 2022
Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers
- Yasheng Sun,
- Hang Zhou,
- Kaisiyuan Wang,
- Qianyi Wu,
- Zhibin Hong,
- Jingtuo Liu,
- Errui Ding,
- Jingdong Wang,
- Ziwei Liu,
- Koike Hideki
SA '22: SIGGRAPH Asia 2022 Conference Papers, Article No.: 17, Pages 1–9
https://doi.org/10.1145/3550469.3555393
Previous studies have explored generating accurately lip-synced talking faces for arbitrary targets given audio conditions. However, most of them deform or generate the whole facial area, leading to non-realistic results. In this work, we delve into the ...
- Article, October 2022
UFO: Unified Feature Optimization
- Teng Xi,
- Yifan Sun,
- Deli Yu,
- Bi Li,
- Nan Peng,
- Gang Zhang,
- Xinyu Zhang,
- Zhigang Wang,
- Jinwen Chen,
- Jian Wang,
- Lufei Liu,
- Haocheng Feng,
- Junyu Han,
- Jingtuo Liu,
- Errui Ding,
- Jingdong Wang
Abstract: This paper proposes a novel Unified Feature Optimization (UFO) paradigm for training and deploying deep models under real-world and large-scale scenarios, which requires a collection of multiple AI functions. UFO aims to benefit each single task ...
- Article, October 2022
StyleSwap: Style-Based Generator Empowers Robust Face Swapping
- Zhiliang Xu,
- Hang Zhou,
- Zhibin Hong,
- Ziwei Liu,
- Jiaming Liu,
- Zhizhi Guo,
- Junyu Han,
- Jingtuo Liu,
- Errui Ding,
- Jingdong Wang
Abstract: Numerous attempts have been made to the task of person-agnostic face swapping given its wide applications. While existing methods mostly rely on tedious network and loss designs, they still struggle in the information balancing between the source ...
- research-article, October 2021
StrucTexT: Structured Text Understanding with Multi-Modal Transformers
- Yulin Li,
- Yuxi Qian,
- Yuechen Yu,
- Xiameng Qin,
- Chengquan Zhang,
- Yan Liu,
- Kun Yao,
- Junyu Han,
- Jingtuo Liu,
- Errui Ding
MM '21: Proceedings of the 29th ACM International Conference on Multimedia, Pages 1912–1920
https://doi.org/10.1145/3474085.3475345
Structured text understanding on Visually Rich Documents (VRDs) is a crucial part of Document Intelligence. Due to the complexity of content and layout in VRDs, structured text understanding has been a challenging task. Most existing studies decoupled ...
- research-article, April 2021
AutoDet: Pyramid Network Architecture Search for Object Detection
International Journal of Computer Vision (IJCV), Volume 129, Issue 4, Pages 1087–1105
https://doi.org/10.1007/s11263-020-01415-x
Abstract: Feature pyramids have delivered significant improvement in object detection. However, building effective feature pyramids heavily relies on expert knowledge, and also requires strenuous efforts to balance effectiveness and efficiency. Automatic ...
- research-article, October 2020
Learning Global Structure Consistency for Robust Object Tracking
MM '20: Proceedings of the 28th ACM International Conference on Multimedia, Pages 229–237
https://doi.org/10.1145/3394171.3413644
Fast appearance variations and the distractions of similar objects are two of the most challenging problems in visual object tracking. Unlike many existing trackers that focus on modeling only the target, in this work, we consider the transient ...
- Article, August 2020
Real Image Super Resolution via Heterogeneous Model Ensemble Using GP-NAS
Abstract: With advancement in deep neural network (DNN), recent state-of-the-art (SOTA) image super-resolution (SR) methods have achieved impressive performance using deep residual network with dense skip connections. While these models perform well on ...
- Article, August 2020
AIM 2020 Challenge on Real Image Super-Resolution: Methods and Results
- Pengxu Wei,
- Hannan Lu,
- Radu Timofte,
- Liang Lin,
- Wangmeng Zuo,
- Zhihong Pan,
- Baopu Li,
- Teng Xi,
- Yanwen Fan,
- Gang Zhang,
- Jingtuo Liu,
- Junyu Han,
- Errui Ding,
- Tangxin Xie,
- Liang Cao,
- Yan Zou,
- Yi Shen,
- Jialiang Zhang,
- Yu Jia,
- Kaihua Cheng,
- Chenhuan Wu,
- Yue Lin,
- Cen Liu,
- Yunbo Peng,
- Xueyi Zou,
- Zhipeng Luo,
- Yuehan Yao,
- Zhenyu Xu,
- Syed Waqas Zamir,
- Aditya Arora,
- Salman Khan,
- Munawar Hayat,
- Fahad Shahbaz Khan,
- Keon-Hee Ahn,
- Jun-Hyuk Kim,
- Jun-Ho Choi,
- Jong-Seok Lee,
- Tongtong Zhao,
- Shanshan Zhao,
- Yoseob Han,
- Byung-Hoon Kim,
- JaeHyun Baek,
- Haoning Wu,
- Dejia Xu,
- Bo Zhou,
- Wei Guan,
- Xiaobo Li,
- Chen Ye,
- Hao Li,
- Haoyu Zhong,
- Yukai Shi,
- Zhijing Yang,
- Xiaojun Yang,
- Haoyu Zhong,
- Xin Li,
- Xin Jin,
- Yaojun Wu,
- Yingxue Pang,
- Sen Liu,
- Zhi-Song Liu,
- Li-Wen Wang,
- Chu-Tak Li,
- Marie-Paule Cani,
- Wan-Chi Siu,
- Yuanbo Zhou,
- Rao Muhammad Umer,
- Christian Micheloni,
- Xiaofeng Cong,
- Rajat Gupta,
- Keon-Hee Ahn,
- Jun-Hyuk Kim,
- Jun-Ho Choi,
- Jong-Seok Lee,
- Feras Almasri,
- Thomas Vandamme,
- Olivier Debeir
- research-article, January 2020
Progressively Refined Face Detection Through Semantics-Enriched Representation Learning
IEEE Transactions on Information Forensics and Security (TIFS), Volume 15, Pages 1394–1406
https://doi.org/10.1109/TIFS.2019.2941800
Feature pyramids aim to learn multi-scale representations for detecting faces over various scales. However, they often lack adequate context over different scales, especially when there are many tiny faces in the wild. In this paper, we propose an ...
- research-article, October 2019
A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning
- Pengfei Wang,
- Chengquan Zhang,
- Fei Qi,
- Zuming Huang,
- Mengyi En,
- Junyu Han,
- Jingtuo Liu,
- Errui Ding,
- Guangming Shi
MM '19: Proceedings of the 27th ACM International Conference on Multimedia, Pages 1277–1285
https://doi.org/10.1145/3343031.3350988
Detecting scene text of arbitrary shapes has been a challenging task over the past years. In this paper, we propose a novel segmentation-based text detector, namely SAST, which employs a context attended multi-task learning framework based on a Fully ...
- research-article, October 2019
Editing Text in the Wild
MM '19: Proceedings of the 27th ACM International Conference on Multimedia, Pages 1500–1508
https://doi.org/10.1145/3343031.3350929
In this paper, we are interested in editing text in natural images, which aims to replace or modify a word in the source image with another one while maintaining its realistic look. This task is challenging, as the styles of both background and text ...
- Article, September 2018
PyramidBox: A Context-Assisted Single Shot Face Detector
Abstract: Face detection has been well studied for many years, and one of the remaining challenges is to detect small, blurred and partially occluded faces in uncontrolled environments. This paper proposes a novel context-assisted single shot face detector, named ...