Showing 1–49 of 49 results for author: Chu, Q

Searching in archive cs.
  1. arXiv:2406.07457  [pdf, other]

    cs.LG stat.ML

    Estimating the Hallucination Rate of Generative AI

    Authors: Andrew Jesson, Nicolas Beltran-Velez, Quentin Chu, Sweta Karlekar, Jannik Kossen, Yarin Gal, John P. Cunningham, David Blei

    Abstract: This work is about estimating the hallucination rate for in-context learning (ICL) with Generative AI. In ICL, a conditional generative model (CGM) is prompted with a dataset and asked to make a prediction based on that dataset. The Bayesian interpretation of ICL assumes that the CGM is calculating a posterior predictive distribution over an unknown Bayesian model of a latent parameter and data. W…

    Submitted 11 June, 2024; originally announced June 2024.

  2. arXiv:2406.05227  [pdf, other]

    cs.LG

    Mixed-Curvature Decision Trees and Random Forests

    Authors: Philippe Chlenski, Quentin Chu, Itsik Pe'er

    Abstract: We extend decision tree and random forest algorithms to product space manifolds: Cartesian products of Euclidean, hyperspherical, and hyperbolic manifolds. Such spaces have extremely expressive geometries capable of representing many arrangements of distances with low metric distortion. To date, all classifiers for product spaces fit a single linear decision boundary, and no regressor has been des…

    Submitted 18 July, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

  3. arXiv:2405.06699  [pdf]

    cs.CL cs.AI

    ChatSOS: Vector Database Augmented Generative Question Answering Assistant in Safety Engineering

    Authors: Haiyang Tang, Dongping Chen, Qingzhao Chu

    Abstract: With the rapid advancement of natural language processing technologies, generative artificial intelligence techniques, represented by large language models (LLMs), are gaining increasing prominence and demonstrating significant potential for applications in safety engineering. However, fundamental LLMs face constraints such as limited training data coverage and unreliable responses. This study dev…

    Submitted 8 May, 2024; originally announced May 2024.

  4. arXiv:2404.00513  [pdf, other]

    cs.CV

    Transformer based Pluralistic Image Completion with Reduced Information Loss

    Authors: Qiankun Liu, Yuqi Jiang, Zhentao Tan, Dongdong Chen, Ying Fu, Qi Chu, Gang Hua, Nenghai Yu

    Abstract: Transformer based methods have achieved great success in image inpainting recently. However, we find that these solutions regard each pixel as a token, thus suffering from an information loss issue from two aspects: 1) They downsample the input image into much lower resolutions for efficiency consideration. 2) They quantize $256^3$ RGB values to a small number (such as 512) of quantized color valu…

    Submitted 14 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted by TPAMI (2024). arXiv admin note: text overlap with arXiv:2205.05076

  5. arXiv:2403.18405  [pdf, other]

    cs.AI cs.IR

    Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval

    Authors: Shengjie Ma, Chong Chen, Qi Chu, Jiaxin Mao

    Abstract: Collecting relevant judgments for legal case retrieval is a challenging and time-consuming task. Accurately judging the relevance between two legal cases requires a considerable effort to read the lengthy text and a high level of domain expertise to extract Legal Facts and make juridical judgments. With the advent of advanced large language models, some recent studies have suggested that it is pro…

    Submitted 27 March, 2024; originally announced March 2024.

  6. arXiv:2403.02148  [pdf, other]

    cs.CV

    MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection

    Authors: Tianxiang Chen, Zi Ye, Zhentao Tan, Tao Gong, Yue Wu, Qi Chu, Bin Liu, Nenghai Yu, Jieping Ye

    Abstract: Recently, infrared small target detection (ISTD) has made significant progress, thanks to the development of basic models. Specifically, the models combining CNNs with transformers can successfully extract both local and global features. However, the disadvantage of the transformer is also inherited, i.e., the quadratic computational complexity to sequence length. Inspired by the recent basic mode…

    Submitted 24 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: The first Mamba-based model for infrared small target detection

  7. arXiv:2402.02327  [pdf, other]

    cs.CV cs.SD eess.AS

    Bootstrapping Audio-Visual Segmentation by Strengthening Audio Cues

    Authors: Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Le Lu, Jieping Ye, Nenghai Yu

    Abstract: How to effectively interact audio with vision has garnered considerable interest within the multi-modality research field. Recently, a novel audio-visual segmentation (AVS) task has been proposed, aiming to segment the sounding objects in video frames under the guidance of audio cues. However, most existing AVS methods are hindered by a modality imbalance where the visual features tend to dominate…

    Submitted 6 February, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  8. arXiv:2402.02046  [pdf, other]

    cs.CV

    TCI-Former: Thermal Conduction-Inspired Transformer for Infrared Small Target Detection

    Authors: Tianxiang Chen, Zhentao Tan, Qi Chu, Yue Wu, Bin Liu, Nenghai Yu

    Abstract: Infrared small target detection (ISTD) is critical to national security and has been extensively applied in military areas. ISTD aims to segment small target pixels from background. Most ISTD networks focus on designing feature extraction blocks or feature fusion modules, but rarely describe the ISTD process from the feature map evolution perspective. In the ISTD process, the network attention gra…

    Submitted 3 February, 2024; originally announced February 2024.

  9. arXiv:2312.08629  [pdf]

    cs.AI

    ChatSOS: LLM-based knowledge Q&A system for safety engineering

    Authors: Haiyang Tang, Zhenyi Liu, Dongping Chen, Qingzhao Chu

    Abstract: Recent advancements in large language models (LLMs) have notably propelled natural language processing (NLP) capabilities, demonstrating significant potential in safety engineering applications. Despite these advancements, LLMs face constraints in processing specialized tasks, attributed to factors such as corpus size, input processing limitations, and privacy concerns. Obtaining useful informatio…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: in Chinese language

  10. arXiv:2312.02520  [pdf, other]

    cs.CV

    Towards More Unified In-context Visual Understanding

    Authors: Dianmo Sheng, Dongdong Chen, Zhentao Tan, Qiankun Liu, Qi Chu, Jianmin Bao, Tao Gong, Bin Liu, Shengwei Xu, Nenghai Yu

    Abstract: The rapid advancement of large language models (LLMs) has accelerated the emergence of in-context learning (ICL) as a cutting-edge approach in the natural language processing domain. Recently, ICL has been employed in visual understanding tasks, such as semantic segmentation and image captioning, yielding promising results. However, existing visual ICL framework can not enable producing content ac…

    Submitted 16 March, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: Accepted by CVPR 2024

  11. arXiv:2310.15624  [pdf, other]

    cs.CV cs.LG

    GUPNet++: Geometry Uncertainty Propagation Network for Monocular 3D Object Detection

    Authors: Yan Lu, Xinzhu Ma, Lei Yang, Tianzhu Zhang, Yating Liu, Qi Chu, Tong He, Yonghui Li, Wanli Ouyang

    Abstract: Geometry plays a significant role in monocular 3D object detection. It can be used to estimate object depth by using the perspective projection between object's physical size and 2D projection in the image plane, which can introduce mathematical priors into deep models. However, this projection process also introduces error amplification, where the error of the estimated height is amplified and re…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 18 pages, 9 figures

  12. arXiv:2309.16668  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    RealFill: Reference-Driven Generation for Authentic Image Completion

    Authors: Luming Tang, Nataniel Ruiz, Qinghao Chu, Yuanzhen Li, Aleksander Holynski, David E. Jacobs, Bharath Hariharan, Yael Pritch, Neal Wadhwa, Kfir Aberman, Michael Rubinstein

    Abstract: Recent advances in generative imagery have brought forth outpainting and inpainting models that can produce high-quality, plausible image content in unknown regions. However, the content these models hallucinate is necessarily inauthentic, since they are unaware of the true scene. In this work, we propose RealFill, a novel generative approach for image completion that fills in missing regions of a…

    Submitted 14 May, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: SIGGRAPH 2024 (Journal Track). Project page: https://realfill.github.io

  13. arXiv:2309.12657  [pdf, other]

    cs.CV

    Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding

    Authors: Jiazhen Wang, Bin Liu, Changtao Miao, Zhiwei Zhao, Wanyi Zhuang, Qi Chu, Nenghai Yu

    Abstract: AI-synthesized text and images have gained significant attention, particularly due to the widespread dissemination of multi-modal manipulations on the internet, which has resulted in numerous negative impacts on society. Existing methods for multi-modal manipulation detection and grounding primarily focus on fusing vision-language features to make predictions, while overlooking the importance of m…

    Submitted 13 January, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. Camera-ready version and supplementary material

  14. arXiv:2306.10900  [pdf, other]

    cs.CV cs.AI

    MotionGPT: Finetuned LLMs Are General-Purpose Motion Generators

    Authors: Yaqi Zhang, Di Huang, Bin Liu, Shixiang Tang, Yan Lu, Lu Chen, Lei Bai, Qi Chu, Nenghai Yu, Wanli Ouyang

    Abstract: Generating realistic human motion from given action descriptions has experienced significant advancements because of the emerging requirement of digital humans. While recent works have achieved impressive results in generating motion directly from textual action descriptions, they often support only a single modality of the control signal, which limits their application in the real digital human i…

    Submitted 18 March, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

    Comments: 18 pages, 8 figures, accepted by AAAI 2024

  15. EVOPOSE: A Recursive Transformer For 3D Human Pose Estimation With Kinematic Structure Priors

    Authors: Yaqi Zhang, Yan Lu, Bin Liu, Zhiwei Zhao, Qi Chu, Nenghai Yu

    Abstract: Transformer is popular in recent 3D human pose estimation, which utilizes long-term modeling to lift 2D keypoints into the 3D space. However, current transformer-based methods do not fully exploit the prior knowledge of the human skeleton provided by the kinematic structure. In this paper, we propose a novel transformer-based model EvoPose to introduce the human body prior knowledge for 3D human p…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: 5 pages, 2 figures, 4 tables, published in the proceedings of IEEE ICASSP 2023

  16. arXiv:2306.09008  [pdf, other]

    cs.CV

    Exploring the Application of Large-scale Pre-trained Models on Adverse Weather Removal

    Authors: Zhentao Tan, Yue Wu, Qiankun Liu, Qi Chu, Le Lu, Jieping Ye, Nenghai Yu

    Abstract: Image restoration under adverse weather conditions (e.g., rain, snow and haze) is a fundamental computer vision problem and has important indications for various downstream applications. Different from early methods that are specially designed for specific type of weather, most recent works tend to remove various adverse weather effects simultaneously through either spatial feature representation…

    Submitted 15 June, 2023; originally announced June 2023.

  17. arXiv:2306.05390  [pdf, other]

    cs.CV

    HQ-50K: A Large-scale, High-quality Dataset for Image Restoration

    Authors: Qinhong Yang, Dongdong Chen, Zhentao Tan, Qiankun Liu, Qi Chu, Jianmin Bao, Lu Yuan, Gang Hua, Nenghai Yu

    Abstract: This paper introduces a new large-scale image restoration dataset, called HQ-50K, which contains 50,000 high-quality images with rich texture details and semantic diversity. We analyze existing image restoration datasets from five different perspectives, including data scale, resolution, compression rates, texture details, and semantic coverage. However, we find that all of these datasets are defi…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: Dataset and code will be available at https://github.com/littleYaang/HQ-50K

  18. arXiv:2305.10794  [pdf, other]

    cs.CV

    Multi-spectral Class Center Network for Face Manipulation Detection and Localization

    Authors: Changtao Miao, Qi Chu, Zhentao Tan, Zhenchao Jin, Tao Gong, Wanyi Zhuang, Yue Wu, Bin Liu, Honggang Hu, Nenghai Yu

    Abstract: As deepfake content proliferates online, advancing face manipulation forensics has become crucial. To combat this emerging threat, previous methods mainly focus on studying how to distinguish authentic and manipulated face images. Although impressive, image-level classification lacks explainability and is limited to specific application scenarios, spurring recent research on pixel-level prediction…

    Submitted 13 July, 2024; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: Update Version

  19. arXiv:2305.06145  [pdf, other]

    cs.CV

    Clothes-Invariant Feature Learning by Causal Intervention for Clothes-Changing Person Re-identification

    Authors: Xulin Li, Yan Lu, Bin Liu, Yuenan Hou, Yating Liu, Qi Chu, Wanli Ouyang, Nenghai Yu

    Abstract: Clothes-invariant feature extraction is critical to the clothes-changing person re-identification (CC-ReID). It can provide discriminative identity features and eliminate the negative effects caused by the confounder--clothing changes. But we argue that there exists a strong spurious correlation between clothes and human identity, that restricts the common likelihood-based ReID method P(Y|X) to ex…

    Submitted 10 May, 2023; originally announced May 2023.

  20. arXiv:2303.09522  [pdf, other]

    cs.CV cs.CL cs.GR cs.LG

    P+: Extended Textual Conditioning in Text-to-Image Generation

    Authors: Andrey Voynov, Qinghao Chu, Daniel Cohen-Or, Kfir Aberman

    Abstract: We introduce an Extended Textual Conditioning space in text-to-image models, referred to as $P+$. This space consists of multiple textual conditions, derived from per-layer prompts, each corresponding to a layer of the denoising U-net of the diffusion model. We show that the extended space provides greater disentangling and control over image synthesis. We further introduce Extended Textual Inve…

    Submitted 15 July, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

  21. arXiv:2301.04265  [pdf, other]

    cs.CV cs.AI

    Adversarial Alignment for Source Free Object Detection

    Authors: Qiaosong Chu, Shuyan Li, Guangyi Chen, Kai Li, Xiu Li

    Abstract: Source-free object detection (SFOD) aims to transfer a detector pre-trained on a label-rich source domain to an unlabeled target domain without seeing source data. While most existing SFOD methods generate pseudo labels via a source-pretrained model to guide training, these pseudo labels usually contain high noises due to heavy domain discrepancy. In order to obtain better pseudo supervisions, we…

    Submitted 10 January, 2023; originally announced January 2023.

  22. arXiv:2212.03863  [pdf, other]

    cs.CV cs.LG

    X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion

    Authors: Hanqing Zhao, Dianmo Sheng, Jianmin Bao, Dongdong Chen, Dong Chen, Fang Wen, Lu Yuan, Ce Liu, Wenbo Zhou, Qi Chu, Weiming Zhang, Nenghai Yu

    Abstract: Copy-Paste is a simple and effective data augmentation strategy for instance segmentation. By randomly pasting object instances onto new background images, it creates new training data for free and significantly boosts the segmentation performance, especially for rare object categories. Although diverse, high-quality object instances used in Copy-Paste result in more performance gain, previous wor…

    Submitted 31 May, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

    Comments: ICML 2023, code is available at https://github.com/yoctta/XPaste

  23. arXiv:2210.12752  [pdf, other]

    cs.CV

    UIA-ViT: Unsupervised Inconsistency-Aware Method based on Vision Transformer for Face Forgery Detection

    Authors: Wanyi Zhuang, Qi Chu, Zhentao Tan, Qiankun Liu, Haojie Yuan, Changtao Miao, Zixiang Luo, Nenghai Yu

    Abstract: Intra-frame inconsistency has been proved to be effective for the generalization of face forgery detection. However, learning to focus on these inconsistency requires extra pixel-level forged location annotations. Acquiring such annotations is non-trivial. Some existing methods generate large-scale synthesized data with location annotations, which is only composed of real images and cannot capture…

    Submitted 23 October, 2022; originally announced October 2022.

    Comments: accepted by ECCV 2022 (oral)

  24. arXiv:2208.00967  [pdf, other]

    cs.CV

    Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification

    Authors: Xulin Li, Yan Lu, Bin Liu, Yating Liu, Guojun Yin, Qi Chu, Jinyang Huang, Feng Zhu, Rui Zhao, Nenghai Yu

    Abstract: Graph-based models have achieved great success in person re-identification tasks recently, which compute the graph topology structure (affinities) among different people first and then pass the information across them to achieve stronger features. But we find existing graph-based methods in the visible-infrared person re-identification task (VI-ReID) suffer from bad generalization because of two i…

    Submitted 14 November, 2022; v1 submitted 1 August, 2022; originally announced August 2022.

  25. arXiv:2207.03776  [pdf, other]

    cs.CV

    Towards Intrinsic Common Discriminative Features Learning for Face Forgery Detection using Adversarial Learning

    Authors: Wanyi Zhuang, Qi Chu, Haojie Yuan, Changtao Miao, Bin Liu, Nenghai Yu

    Abstract: Existing face forgery detection methods usually treat face forgery detection as a binary classification problem and adopt deep convolution neural networks to learn discriminative features. The ideal discriminative features should be only related to the real/fake labels of facial images. However, we observe that the features learned by vanilla classification networks are correlated to unnecessary p…

    Submitted 8 July, 2022; originally announced July 2022.

  26. arXiv:2205.05076  [pdf, other]

    cs.CV cs.GR

    Reduce Information Loss in Transformers for Pluralistic Image Inpainting

    Authors: Qiankun Liu, Zhentao Tan, Dongdong Chen, Qi Chu, Xiyang Dai, Yinpeng Chen, Mengchen Liu, Lu Yuan, Nenghai Yu

    Abstract: Transformers have achieved great success in pluralistic image inpainting recently. However, we find existing transformer based solutions regard each pixel as a token, thus suffer from information loss issue from two aspects: 1) They downsample the input image into much lower resolutions for efficiency consideration, incurring information loss and extra misalignment for the boundaries of masked reg…

    Submitted 15 May, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

    Comments: CVPR 2022, code is available at https://github.com/liuqk3/PUT

  27. arXiv:2201.01297  [pdf, other]

    cs.CV

    Online Multi-Object Tracking with Unsupervised Re-Identification Learning and Occlusion Estimation

    Authors: Qiankun Liu, Dongdong Chen, Qi Chu, Lu Yuan, Bin Liu, Lei Zhang, Nenghai Yu

    Abstract: Occlusion between different objects is a typical challenge in Multi-Object Tracking (MOT), which often leads to inferior tracking results due to the missing detected objects. The common practice in multi-object tracking is re-identifying the missed objects after their reappearance. Though tracking performance can be boosted by the re-identification, the annotation of identity is required to train…

    Submitted 4 January, 2022; originally announced January 2022.

    Comments: To Appear at Neurocomputing 2022

  28. arXiv:2110.09510  [pdf, other]

    cs.CV cs.LG

    Unsupervised Finetuning

    Authors: Suichan Li, Dongdong Chen, Yinpeng Chen, Lu Yuan, Lei Zhang, Qi Chu, Bin Liu, Nenghai Yu

    Abstract: This paper studies "unsupervised finetuning", the symmetrical problem of the well-known "supervised finetuning". Given a pretrained model and small-scale unlabeled target data, unsupervised finetuning is to adapt the representation pretrained from the source domain to the target domain so that better transfer performance can be obtained. This problem is more challenging than the supervised counter…

    Submitted 18 October, 2021; originally announced October 2021.

  29. arXiv:2109.03495  [pdf, other]

    cs.CV

    Temporal RoI Align for Video Object Recognition

    Authors: Tao Gong, Kai Chen, Xinjiang Wang, Qi Chu, Feng Zhu, Dahua Lin, Nenghai Yu, Huamin Feng

    Abstract: Video object detection is challenging in the presence of appearance deterioration in certain video frames. Therefore, it is a natural choice to aggregate temporal information from other frames of the same video into the current frame. However, RoI Align, as one of the most core procedures of video detectors, still remains extracting features from a single-frame feature map for proposals, making th…

    Submitted 10 September, 2021; v1 submitted 8 September, 2021; originally announced September 2021.

    Comments: Accepted by AAAI 2021

  30. arXiv:2108.12382  [pdf, other]

    cs.CV

    ISNet: Integrate Image-Level and Semantic-Level Context for Semantic Segmentation

    Authors: Zhenchao Jin, Bin Liu, Qi Chu, Nenghai Yu

    Abstract: Co-occurrent visual pattern makes aggregating contextual information a common paradigm to enhance the pixel representation for semantic image segmentation. The existing approaches focus on modeling the context from the perspective of the whole image, i.e., aggregating the image-level contextual information. Despite impressive, these methods weaken the significance of the pixel representations of t…

    Submitted 27 August, 2021; originally announced August 2021.

    Comments: Accepted by ICCV2021

  31. arXiv:2108.11819  [pdf, other]

    cs.CV

    Mining Contextual Information Beyond Image for Semantic Segmentation

    Authors: Zhenchao Jin, Tao Gong, Dongdong Yu, Qi Chu, Jian Wang, Changhu Wang, Jie Shao

    Abstract: This paper studies the context aggregation problem in semantic image segmentation. The existing researches focus on improving the pixel representations by aggregating the contextual information within individual images. Though impressive, these methods neglect the significance of the representations of the pixels of the corresponding class beyond the input image. To address this, this paper propos…

    Submitted 26 August, 2021; originally announced August 2021.

    Comments: Accepted by ICCV2021

  32. arXiv:2107.13774  [pdf, other]

    cs.CV

    Geometry Uncertainty Projection Network for Monocular 3D Object Detection

    Authors: Yan Lu, Xinzhu Ma, Lei Yang, Tianzhu Zhang, Yating Liu, Qi Chu, Junjie Yan, Wanli Ouyang

    Abstract: Geometry Projection is a powerful depth estimation method in monocular 3D object detection. It estimates depth dependent on heights, which introduces mathematical priors into the deep model. But projection process also introduces the error amplification problem, in which the error of the estimated height will be amplified and reflected greatly at the output depth. This property leads to uncontroll…

    Submitted 13 August, 2021; v1 submitted 29 July, 2021; originally announced July 2021.

    Comments: To appear at ICCV2021

  33. arXiv:2107.13718  [pdf, ps, other]

    cs.CV

    Cascaded Residual Density Network for Crowd Counting

    Authors: Kun Zhao, Luchuan Song, Bin Liu, Qi Chu, Nenghai Yu

    Abstract: Crowd counting is a challenging task due to the issues such as scale variation and perspective variation in real crowd scenes. In this paper, we propose a novel Cascaded Residual Density Network (CRDNet) in a coarse-to-fine approach to generate the high-quality density map for crowd counting more accurately. (1) We estimate the residual density maps by multi-scale pyramidal features through cascad…

    Submitted 28 July, 2021; originally announced July 2021.

  34. arXiv:2107.13706  [pdf, other]

    cs.CV

    Abnormal Behavior Detection Based on Target Analysis

    Authors: Luchuan Song, Bin Liu, Huihui Zhu, Qi Chu, Nenghai Yu

    Abstract: Abnormal behavior detection in surveillance video is a pivotal part of the intelligent city. Most existing methods only consider how to detect anomalies, with less considering to explain the reason of the anomalies. We investigate an orthogonal perspective based on the reason of these abnormal behaviors. To this end, we propose a multivariate fusion method that analyzes each target through three b…

    Submitted 28 July, 2021; originally announced July 2021.

  35. arXiv:2107.12369  [pdf, other]

    cs.CV cs.LG

    Improve Unsupervised Pretraining for Few-label Transfer

    Authors: Suichan Li, Dongdong Chen, Yinpeng Chen, Lu Yuan, Lei Zhang, Qi Chu, Bin Liu, Nenghai Yu

    Abstract: Unsupervised pretraining has achieved great success and many recent works have shown unsupervised pretraining can achieve comparable or even slightly better transfer performance than supervised pretraining on downstream target datasets. But in this paper, we find this conclusion may not hold when the target dataset has very few labeled samples for finetuning, i.e., few-label transfer. We analyze th…

    Submitted 26 July, 2021; originally announced July 2021.

    Comments: ICCV 2021. arXiv admin note: substantial text overlap with arXiv:2012.05899

  36. arXiv:2103.07915  [pdf, other]

    cs.CV

    Towards Generalizable and Robust Face Manipulation Detection via Bag-of-local-feature

    Authors: Changtao Miao, Qi Chu, Weihai Li, Tao Gong, Wanyi Zhuang, Nenghai Yu

    Abstract: Over the past several years, in order to solve the problem of malicious abuse of facial manipulation technology, face manipulation detection technology has obtained considerable attention and achieved remarkable progress. However, most existing methods have very impoverished generalization ability and robustness. In this paper, we propose a novel method for face manipulation detection, which can i…

    Submitted 14 March, 2021; originally announced March 2021.

    Comments: 5 pages, 2 figures

  37. arXiv:2103.06878  [pdf, other]

    cs.CV cs.GR

    Diverse Semantic Image Synthesis via Probability Distribution Modeling

    Authors: Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Bin Liu, Gang Hua, Nenghai Yu

    Abstract: Semantic image synthesis, translating semantic layouts to photo-realistic images, is a one-to-many mapping problem. Though impressive progress has been recently made, diverse semantic synthesis that can efficiently produce semantic-level multimodal results, still remains a challenge. In this paper, we propose a novel diverse semantic image synthesis framework from the perspective of semantic class…

    Submitted 11 March, 2021; originally announced March 2021.

    Comments: Accepted By CVPR 2021

  38. arXiv:2012.05899  [pdf, other]

    cs.CV cs.LG

    Are Fewer Labels Possible for Few-shot Learning?

    Authors: Suichan Li, Dongdong Chen, Yinpeng Chen, Lu Yuan, Lei Zhang, Qi Chu, Nenghai Yu

    Abstract: Few-shot learning is challenging due to its very limited data and labels. Recent studies in big transfer (BiT) show that few-shot learning can greatly benefit from pretraining on large scale labeled dataset in a different domain. This paper asks a more challenging question: "can we use as few as possible labels for few-shot learning in both pretraining (with no labels) and fine-tuning (with fewer…

    Submitted 10 December, 2020; originally announced December 2020.

  39. arXiv:2012.04644  [pdf, other]

    cs.CV cs.GR

    Efficient Semantic Image Synthesis via Class-Adaptive Normalization

    Authors: Zhentao Tan, Dongdong Chen, Qi Chu, Menglei Chai, Jing Liao, Mingming He, Lu Yuan, Gang Hua, Nenghai Yu

    Abstract: Spatially-adaptive normalization (SPADE) is remarkably successful recently in conditional semantic image synthesis, which modulates the normalized activation with spatially-varying transformations learned from semantic layouts, to prevent the semantic information from being washed away. Despite its impressive performance, a more thorough understanding of the advantages insi…

    Submitted 4 May, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

    Comments: To appear at TPAMI 2021, code is available at https://github.com/tzt101/CLADE.git

  40. arXiv:2010.16417  [pdf, other]

    cs.CV

    MichiGAN: Multi-Input-Conditioned Hair Image Generation for Portrait Editing

    Authors: Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Lu Yuan, Sergey Tulyakov, Nenghai Yu

    Abstract: Despite the recent success of face image generation with GANs, conditional hair editing remains challenging due to the under-explored complexity of its geometry and appearance. In this paper, we present MichiGAN (Multi-Input-Conditioned Hair Image GAN), a novel conditional image generation method for interactive portrait hair manipulation. To provide user control over every major hair visual facto…

    Submitted 30 October, 2020; originally announced October 2020.

    Comments: Siggraph 2020, code is available at https://github.com/tzt101/MichiGAN

  41. arXiv:2004.02867  [pdf, other]

    cs.CV cs.GR

    Rethinking Spatially-Adaptive Normalization

    Authors: Zhentao Tan, Dongdong Chen, Qi Chu, Menglei Chai, Jing Liao, Mingming He, Lu Yuan, Nenghai Yu

    Abstract: Spatially-adaptive normalization is remarkably successful recently in conditional semantic image synthesis, which modulates the normalized activation with spatially-varying transformations learned from semantic layouts, to preserve the semantic information from being washed away. Despite its impressive performance, a more thorough understanding of the true advantages inside the box is still highly…

    Submitted 6 April, 2020; originally announced April 2020.

  42. arXiv:2003.13194  [pdf, other]

    cs.CV

    Density-Aware Graph for Deep Semi-Supervised Visual Recognition

    Authors: Suichan Li, Bin Liu, Dongdong Chen, Qi Chu, Lu Yuan, Nenghai Yu

    Abstract: Semi-supervised learning (SSL) has been extensively studied to improve the generalization ability of deep neural networks for visual recognition. To involve the unlabelled data, most existing SSL methods are based on common density-based cluster assumption: samples lying in the same high-density region are likely to belong to the same class, including the methods performing consistency regularizat…

    Submitted 29 March, 2020; originally announced March 2020.

    Comments: Accepted by CVPR2020

  43. arXiv:2002.12489  [pdf, other]

    cs.CV

    Cross-modality Person re-identification with Shared-Specific Feature Transfer

    Authors: Yan Lu, Yue Wu, Bin Liu, Tianzhu Zhang, Baopu Li, Qi Chu, Nenghai Yu

    Abstract: Cross-modality person re-identification (cm-ReID) is a challenging but key technology for intelligent video analysis. Existing works mainly focus on learning common representation by embedding different modalities into a same feature space. However, only learning the common characteristics means great information loss, lowering the upper bound of feature distinctiveness. In this paper, we tackle t…

    Submitted 12 March, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

    Comments: To appear at CVPR2020

  44. Incremental Nonlinear Fault-Tolerant Control of a Quadrotor with Complete Loss of Two Opposing Rotors

    Authors: Sihao Sun, Xuerui Wang, Qiping Chu, Coen de Visser

    Abstract: In order to further expand the flight envelope of quadrotors under actuator failures, we design a nonlinear sensor-based fault-tolerant controller to stabilize a quadrotor with failure of two opposing rotors in the high-speed flight condition (> 8m/s). The incremental nonlinear dynamic inversion (INDI) approach which excels in handling model uncertainties is adopted to compensate for the significa…

    Submitted 26 October, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: 15 pages, 18 figures

    MSC Class: 93C85 93C95 93D25

    Journal ref: IEEE Transactions on Robotics, 2020

  45. arXiv:1708.02843  [pdf, other]

    cs.CV

    Online Multi-Object Tracking Using CNN-based Single Object Tracker with Spatial-Temporal Attention Mechanism

    Authors: Qi Chu, Wanli Ouyang, Hongsheng Li, Xiaogang Wang, Bin Liu, Nenghai Yu

    Abstract: In this paper, we propose a CNN-based framework for online MOT. This framework utilizes the merits of single object trackers in adapting appearance models and searching for target in the next frame. Simply applying single object tracker for MOT will encounter the problem in computational efficiency and drifted results caused by occlusion. Our framework achieves computational efficiency by sharing…

    Submitted 13 August, 2017; v1 submitted 9 August, 2017; originally announced August 2017.

    Comments: Accepted at International Conference on Computer Vision (ICCV) 2017

  46. arXiv:1702.02256  [pdf, other]

    astro-ph.IM cs.DC

    Acceleration of low-latency gravitational wave searches using Maxwell-microarchitecture GPUs

    Authors: Xiangyu Guo, Qi Chu, Shin Kee Chung, Zhihui Du, Linqing Wen

    Abstract: Low-latency detections of gravitational waves (GWs) are crucial to enable prompt follow-up observations to astrophysical transients by conventional telescopes. We have developed a low-latency pipeline using a technique called Summed Parallel Infinite Impulse Response (SPIIR) filtering, realized by a Graphic Processing Unit (GPU). In this paper, we exploit the new Maxwell memory access arc…

    Submitted 7 February, 2017; originally announced February 2017.

  47. Cascaded Incremental Nonlinear Dynamic Inversion Control for MAV Disturbance Rejection

    Authors: Ewoud J. J. Smeur, Guido C. H. E. de Croon, Qiping Chu

    Abstract: Micro Aerial Vehicles (MAVs) are limited in their operation outdoors near obstacles by their ability to withstand wind gusts. Currently widespread position control methods such as Proportional Integral Derivative control do not perform well under the influence of gusts. Incremental Nonlinear Dynamic Inversion (INDI) is a sensor-based control technique that can control nonlinear systems subject to…

    Submitted 12 January, 2022; v1 submitted 25 January, 2017; originally announced January 2017.

    Comments: 12 pages, 15 figures. Update: the transfer function in Eq. 12 was incorrect; this has been adjusted. All the conclusions are still valid

    Journal ref: Control Engineering Practice 73 (2018) 79-90

  48. arXiv:1609.06767  [pdf, ps, other]

    cs.RO

    Adaptive Control Strategy for Constant Optical Flow Divergence Landing

    Authors: H. W. Ho, G. C. H. E. de Croon, E. van Kampen, Q. P. Chu, M. Mulder

    Abstract: Bio-inspired methods can provide efficient solutions to perform autonomous landing for Micro Air Vehicles (MAVs). Flying insects such as honeybees perform vertical landings by keeping flow divergence constant. This leads to an exponential decay of both height and vertical velocity, and allows for smooth and safe landings. However, the presence of noise and delay in obtaining flow divergence estima…

    Submitted 21 September, 2016; originally announced September 2016.

    Comments: This manuscript is submitted to the IEEE Transactions on Robotics

  49. Framework for state and unknown input estimation of linear time-varying systems

    Authors: Peng Lu, Erik-Jan van Kampen, Cornelis C. de Visser, Qiping Chu

    Abstract: The design of unknown-input decoupled observers and filters requires the assumption of an existence condition in the literature. This paper addresses an unknown input filtering problem where the existence condition is not satisfied. Instead of designing a traditional unknown input decoupled filter, a Double-Model Adaptive Estimation approach is extended to solve the unknown input filtering problem…

    Submitted 26 June, 2016; originally announced June 2016.

    Comments: This paper has been accepted by Automatica. It considers unknown input estimation, or fault and disturbance estimation. Existing approaches consider the case where the effects of fault and disturbance can be decoupled. In our paper, we consider the case where the effects of fault and disturbance are coupled. This approach can be easily extended to nonlinear systems

    Journal ref: Automatica, 73 (2016), 145-154