
Showing 1–48 of 48 results for author: Ming, Y

Searching in archive cs.
  1. arXiv:2406.14852  [pdf, other]

    cs.CV cs.AI

    Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models

    Authors: Jiayu Wang, Yifei Ming, Zhenmei Shi, Vibhav Vineet, Xin Wang, Neel Joshi

    Abstract: Large language models (LLMs) and vision-language models (VLMs) have demonstrated remarkable performance across a wide range of tasks and domains. Despite this promise, spatial understanding and reasoning -- a fundamental component of human cognition -- remains under-explored. We develop novel benchmarks that cover diverse aspects of spatial reasoning such as relationship understanding, navigation,…

    Submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2405.05526  [pdf, other]

    cs.RO

    Benchmarking Neural Radiance Fields for Autonomous Robots: An Overview

    Authors: Yuhang Ming, Xingrui Yang, Weihan Wang, Zheng Chen, Jinglun Feng, Yifan Xing, Guofeng Zhang

    Abstract: Neural Radiance Fields (NeRF) have emerged as a powerful paradigm for 3D scene representation, offering high-fidelity renderings and reconstructions from a set of sparse and unstructured sensor data. In the context of autonomous robotics, where perception and understanding of the environment are pivotal, NeRF holds immense promise for improving performance. In this paper, we present a comprehensiv…

    Submitted 26 July, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: 32 pages, 5 figures, 8 tables

  3. arXiv:2405.01468  [pdf, other]

    cs.LG cs.AI cs.CV

    Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models

    Authors: Yifei Ming, Yixuan Li

    Abstract: Pre-trained contrastive vision-language models have demonstrated remarkable performance across a wide range of tasks. However, they often struggle on fine-grained datasets with categories not adequately represented during pre-training, which makes adaptation necessary. Recent works have shown promising results by utilizing samples from web-scale databases for retrieval-augmented adaptation, especi…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: The paper is accepted at ICML 2024

  4. arXiv:2404.18437  [pdf, ps, other]

    cs.IT

    A family of self-orthogonal divisible codes with locality 2

    Authors: Ziling Heng, Mengjie Yang, Yang Ming

    Abstract: Linear codes are widely studied due to their applications in communication, cryptography, quantum codes, distributed storage and many other fields. In this paper, we use the trace and norm functions over finite fields to construct a family of linear codes. The weight distributions of the codes are determined in three cases via Gaussian sums. The codes are shown to be self-orthogonal divisible code…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 25 pages

  5. arXiv:2403.20331  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models

    Authors: Atsuyuki Miyai, Jingkang Yang, Jingyang Zhang, Yifei Ming, Qing Yu, Go Irie, Yixuan Li, Hai Li, Ziwei Liu, Kiyoharu Aizawa

    Abstract: This paper introduces a novel and significant challenge for Vision Language Models (VLMs), termed Unsolvable Problem Detection (UPD). UPD examines the VLM's ability to withhold answers when faced with unsolvable problems in the context of Visual Question Answering (VQA) tasks. UPD encompasses three distinct settings: Absent Answer Detection (AAD), Incompatible Answer Set Detection (IASD), and Inco…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Code: https://github.com/AtsuMiyai/UPD

  6. arXiv:2403.12536  [pdf, other]

    cs.CV

    Vox-Fusion++: Voxel-based Neural Implicit Dense Tracking and Mapping with Multi-maps

    Authors: Hongjia Zhai, Hai Li, Xingrui Yang, Gan Huang, Yuhang Ming, Hujun Bao, Guofeng Zhang

    Abstract: In this paper, we introduce Vox-Fusion++, a multi-maps-based robust dense tracking and mapping system that seamlessly fuses neural implicit representations with traditional volumetric fusion techniques. Building upon the concept of implicit mapping and positioning systems, our approach extends its applicability to real-world scenarios. Our system employs a voxel-based neural implicit surface repre…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 14 pages. arXiv admin note: text overlap with arXiv:2210.15858

  7. arXiv:2403.10021  [pdf, other]

    cs.CR

    Time-Frequency Jointed Imperceptible Adversarial Attack to Brainprint Recognition with Deep Learning Models

    Authors: Hangjie Yi, Yuhang Ming, Dongjun Liu, Wanzeng Kong

    Abstract: EEG-based brainprint recognition with deep learning models has garnered much attention in biometric identification. Yet, studies have indicated vulnerability to adversarial attacks in deep learning models with EEG inputs. In this paper, we introduce a novel adversarial attack method that jointly attacks time-domain and frequency-domain EEG signals by employing wavelet transform. Different from mos…

    Submitted 30 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: This work is accepted by ICME 2024

  8. arXiv:2402.16280  [pdf, other]

    cs.CV

    Few-Shot Learning for Annotation-Efficient Nucleus Instance Segmentation

    Authors: Yu Ming, Zihao Wu, Jie Yang, Danyi Li, Yuan Gao, Changxin Gao, Gui-Song Xia, Yuanqing Li, Li Liang, Jin-Gang Yu

    Abstract: Nucleus instance segmentation from histopathology images suffers from the extremely laborious and expert-dependent annotation of nucleus instances. As a promising solution to this task, annotation-efficient deep learning paradigms have recently attracted much research interest, such as weakly-/semi-supervised learning, generative adversarial learning, etc. In this paper, we propose to formulate an…

    Submitted 27 February, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

  9. arXiv:2402.07785  [pdf, other]

    cs.LG

    HYPO: Hyperspherical Out-of-Distribution Generalization

    Authors: Yifei Ming, Haoyue Bai, Julian Katz-Samuels, Yixuan Li

    Abstract: Out-of-distribution (OOD) generalization is critical for machine learning models deployed in the real world. However, achieving this can be fundamentally challenging, as it requires the ability to learn invariant features across different domains or environments. In this paper, we propose a novel framework HYPO (HYPerspherical OOD generalization) that provably learns domain-invariant representatio…

    Submitted 19 March, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: The conference version of this paper is published at ICLR 2024; First two authors contributed equally

  10. arXiv:2312.09538  [pdf, other]

    cs.CV cs.RO

    AEGIS-Net: Attention-guided Multi-Level Feature Aggregation for Indoor Place Recognition

    Authors: Yuhang Ming, Jian Ma, Xingrui Yang, Weichen Dai, Yong Peng, Wanzeng Kong

    Abstract: We present AEGIS-Net, a novel indoor place recognition model that takes in RGB point clouds and generates global place descriptors by aggregating lower-level color, geometry features and higher-level implicit semantic features. However, rather than simple feature concatenation, self-attention modules are employed to select the most important local features that best describe an indoor place. Our A…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted by 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)

  11. arXiv:2312.06424  [pdf, other]

    cs.IR

    Cross Domain LifeLong Sequential Modeling for Online Click-Through Rate Prediction

    Authors: Ruijie Hou, Zhaoyang Yang, Yu Ming, Hongyu Lu, Zhuobin Zheng, Yu Chen, Qinsong Zeng, Ming Chen

    Abstract: Deep neural networks (DNNs) that incorporated lifelong sequential modeling (LSM) have brought great success to recommendation systems in various social media platforms. While continuous improvements have been made in domain-specific LSM, limited work has been done in cross-domain LSM, which considers modeling of lifelong sequences of both target domain and source domain. In this paper, we propose…

    Submitted 17 May, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted by KDD 2024

  12. arXiv:2310.05962  [pdf, other]

    cs.IT cs.LG eess.SP

    Improving the Performance of R17 Type-II Codebook with Deep Learning

    Authors: Ke Ma, Yiliang Sang, Yang Ming, Jin Lian, Chang Tian, Zhaocheng Wang

    Abstract: The Type-II codebook in Release 17 (R17) exploits the angular-delay-domain partial reciprocity between uplink and downlink channels to select part of angular-delay-domain ports for measuring and feeding back the downlink channel state information (CSI), where the performance of existing deep learning enhanced CSI feedback methods is limited due to the deficiency of sparse structures. To address th…

    Submitted 13 September, 2023; originally announced October 2023.

    Comments: Accepted by IEEE GLOBECOM 2023, conference version of arXiv:2305.08081

  13. arXiv:2308.02670  [pdf, other]

    cs.RO cs.CV

    EDI: ESKF-based Disjoint Initialization for Visual-Inertial SLAM Systems

    Authors: Weihan Wang, Jiani Li, Yuhang Ming, Philippos Mordohai

    Abstract: Visual-inertial initialization can be classified into joint and disjoint approaches. Joint approaches tackle both the visual and the inertial parameters together by aligning observations from feature-bearing points based on IMU integration then use a closed-form solution with visual and acceleration observations to find initial velocity and gravity. In contrast, disjoint approaches independently s…

    Submitted 4 August, 2023; originally announced August 2023.

  14. How Does Fine-Tuning Impact Out-of-Distribution Detection for Vision-Language Models?

    Authors: Yifei Ming, Yixuan Li

    Abstract: Recent large vision-language models such as CLIP have shown remarkable out-of-distribution (OOD) detection and generalization performance. However, their zero-shot in-distribution (ID) accuracy is often limited for downstream datasets. Recent CLIP-based fine-tuning methods such as prompt learning have demonstrated significant improvements in ID classification and OOD generalization where OOD label…

    Submitted 28 July, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

    Comments: Accepted to IJCV 2023

    Journal ref: International Journal of Computer Vision 2023

  15. arXiv:2305.08081  [pdf, other]

    cs.IT cs.AI

    Deep Learning Empowered Type-II Codebook: New Paradigm for Enhancing CSI Feedback

    Authors: Ke Ma, Yiliang Sang, Yang Ming, Jin Lian, Chang Tian, Zhaocheng Wang

    Abstract: Deep learning based channel state information (CSI) feedback in frequency division duplex systems has drawn much attention in both academia and industry. In this paper, we focus on integrating the Type-II codebook in the beyond fifth-generation (B5G) wireless systems with deep learning to enhance the performance of CSI feedback. In contrast to its counterpart in Release 16, the Type-II codebook in…

    Submitted 30 May, 2023; v1 submitted 14 May, 2023; originally announced May 2023.

    Comments: This updated version has been submitted to IEEE for possible publication. Copyright may be transferred without notice

  16. arXiv:2303.07527  [pdf, other]

    cs.LG cs.CV

    Domain Generalization via Nuclear Norm Regularization

    Authors: Zhenmei Shi, Yifei Ming, Ying Fan, Frederic Sala, Yingyu Liang

    Abstract: The ability to generalize to unseen domains is crucial for machine learning systems deployed in the real world, especially when we only have data from limited training domains. In this paper, we propose a simple and effective regularization method based on the nuclear norm of the learned features for domain generalization. Intuitively, the proposed regularizer mitigates the impacts of environmenta…

    Submitted 4 December, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: 23 pages

  17. arXiv:2301.02299  [pdf, other]

    cs.CL cs.AI cs.LG

    Sequentially Controlled Text Generation

    Authors: Alexander Spangher, Xinyu Hua, Yao Ming, Nanyun Peng

    Abstract: While GPT-2 generates sentences that are remarkably human-like, longer documents can ramble and do not follow human-like writing structure. We study the problem of imposing structure on long-range text. We propose a novel controlled text generation task, sequentially controlled text generation, and identify a dataset, NewsDiscourse, as a starting point for this task. We develop a sequential control…

    Submitted 5 January, 2023; originally announced January 2023.

    Comments: 19 pages. 10 pages main body, 3 pages references, 6 pages appendix

    Journal ref: Findings of the 2022 Conference on Empirical Methods in Natural Language Processing

  18. arXiv:2211.13445  [pdf, other]

    cs.CV cs.AI cs.LG

    Delving into Out-of-Distribution Detection with Vision-Language Representations

    Authors: Yifei Ming, Ziyang Cai, Jiuxiang Gu, Yiyou Sun, Wei Li, Yixuan Li

    Abstract: Recognizing out-of-distribution (OOD) samples is critical for machine learning systems deployed in the open world. The vast majority of OOD detection methods are driven by a single modality (e.g., either vision or language), leaving the rich information in multi-modal representations untapped. Inspired by the recent success of vision-language pre-training, this paper enriches the landscape of OOD…

    Submitted 24 November, 2022; originally announced November 2022.

    Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  19. Vox-Fusion: Dense Tracking and Mapping with Voxel-based Neural Implicit Representation

    Authors: Xingrui Yang, Hai Li, Hongjia Zhai, Yuhang Ming, Yuqian Liu, Guofeng Zhang

    Abstract: In this work, we present a dense tracking and mapping system named Vox-Fusion, which seamlessly fuses neural implicit representations with traditional volumetric fusion methods. Our approach is inspired by the recently developed implicit mapping and positioning system and further extends the idea so that it can be freely applied to practical scenarios. Specifically, we leverage a voxel-based neura…

    Submitted 6 March, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

  20. arXiv:2210.01806  [pdf]

    eess.IV cs.AI cs.CV cs.LG

    Low-Light Image Restoration Based on Retina Model using Neural Networks

    Authors: Yurui Ming, Yuanyuan Liang

    Abstract: We report the possibility of using a simple neural network for effortless restoration of low-light images inspired by the retina model, which mimics the neurophysiological principles and dynamics of various types of optical neurons. The proposed neural network model saves the cost of computational overhead in contrast with traditional signal-processing models, and generates results comparable with…

    Submitted 4 October, 2022; originally announced October 2022.

  21. arXiv:2209.07919  [pdf, other]

    cs.RO cs.CV

    iDF-SLAM: End-to-End RGB-D SLAM with Neural Implicit Mapping and Deep Feature Tracking

    Authors: Yuhang Ming, Weicai Ye, Andrew Calway

    Abstract: We propose a novel end-to-end RGB-D SLAM, iDF-SLAM, which adopts a feature-based deep neural tracker as the front-end and a NeRF-style neural implicit mapper as the back-end. The neural implicit mapper is trained on-the-fly, while the neural tracker, though pretrained on the ScanNet dataset, is also finetuned along with the training of the neural implicit mapper. Under such a design, our iDF-…

    Submitted 16 September, 2022; originally announced September 2022.

    Comments: 7 pages, 6 figures, 3 tables

  22. arXiv:2207.08794  [pdf, other]

    cs.CV cs.RO

    DeFlowSLAM: Self-Supervised Scene Motion Decomposition for Dynamic Dense SLAM

    Authors: Weicai Ye, Xingyuan Yu, Xinyue Lan, Yuhang Ming, Jinyu Li, Hujun Bao, Zhaopeng Cui, Guofeng Zhang

    Abstract: We present a novel dual-flow representation of scene motion that decomposes the optical flow into a static flow field caused by the camera motion and another dynamic flow field caused by the objects' movements in the scene. Based on this representation, we present a dynamic SLAM, dubbed DeFlowSLAM, that exploits both static and dynamic pixels in the images to solve the camera poses, rather than si…

    Submitted 13 January, 2023; v1 submitted 18 July, 2022; originally announced July 2022.

    Comments: Homepage: https://zju3dv.github.io/deflowslam

  23. arXiv:2207.01610  [pdf, other]

    cs.CV cs.RO

    PVO: Panoptic Visual Odometry

    Authors: Weicai Ye, Xinyue Lan, Shuo Chen, Yuhang Ming, Xingyuan Yu, Hujun Bao, Zhaopeng Cui, Guofeng Zhang

    Abstract: We present PVO, a novel panoptic visual odometry framework to achieve more comprehensive modeling of the scene motion, geometry, and panoptic segmentation information. Our PVO models visual odometry (VO) and video panoptic segmentation (VPS) in a unified view, which makes the two tasks mutually beneficial. Specifically, we introduce a panoptic update module into the VO Module with the guidance of…

    Submitted 26 March, 2023; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: CVPR2023 Project page: https://zju3dv.github.io/pvo/ code: https://github.com/zju3dv/PVO

  24. arXiv:2206.13687  [pdf, other]

    cs.LG cs.AI cs.CV

    POEM: Out-of-Distribution Detection with Posterior Sampling

    Authors: Yifei Ming, Ying Fan, Yixuan Li

    Abstract: Out-of-distribution (OOD) detection is indispensable for machine learning models deployed in the open world. Recently, the use of an auxiliary outlier dataset during training (also known as outlier exposure) has shown promising performance. As the sample space for potential OOD data can be prohibitively large, sampling informative outliers is essential. In this work, we propose a novel posterior s…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: ICML 2022 (Long Talk); First two authors contributed equally

    Journal ref: Thirty-ninth International Conference on Machine Learning (2022)

  25. arXiv:2205.15592  [pdf]

    cs.LG cs.CR

    Semantic Autoencoder and Its Potential Usage for Adversarial Attack

    Authors: Yurui Ming, Cuihuan Du, Chin-Teng Lin

    Abstract: An autoencoder can give rise to an appropriate latent representation of the input data; however, a representation based solely on the intrinsic properties of the input data is usually poor at expressing semantic information. A typical case is the potential inability to form a clear boundary when clustering these representations. By encoding the latent representation that not…

    Submitted 31 May, 2022; originally announced May 2022.

  26. arXiv:2205.11616  [pdf, other]

    cs.CL cs.LG

    Utilizing Language-Image Pretraining for Efficient and Robust Bilingual Word Alignment

    Authors: Tuan Dinh, Jy-yong Sohn, Shashank Rajput, Timothy Ossowski, Yifei Ming, Junjie Hu, Dimitris Papailiopoulos, Kangwook Lee

    Abstract: Word translation without parallel corpora has become feasible, rivaling the performance of supervised methods. Recent findings have shown that the accuracy and robustness of unsupervised word translation (UWT) can be improved by making use of visual observations, which are universal representations across languages. In this work, we investigate the potential of using not only visual observations b…

    Submitted 7 November, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP Findings)

  27. arXiv:2204.06507  [pdf, other]

    cs.LG cs.CV

    Out-of-Distribution Detection with Deep Nearest Neighbors

    Authors: Yiyou Sun, Yifei Ming, Xiaojin Zhu, Yixuan Li

    Abstract: Out-of-distribution (OOD) detection is a critical task for deploying machine learning models in the open world. Distance-based methods have demonstrated promise, where testing samples are detected as OOD if they are relatively far away from in-distribution (ID) data. However, prior methods impose a strong distributional assumption of the underlying feature space, which may not always hold. In this…

    Submitted 7 December, 2022; v1 submitted 13 April, 2022; originally announced April 2022.

    Comments: 15 pages, 4 figures, accepted in ICML 2022

  28. FD-SLAM: 3-D Reconstruction Using Features and Dense Matching

    Authors: Xingrui Yang, Yuhang Ming, Zhaopeng Cui, Andrew Calway

    Abstract: It is well known that visual SLAM systems based on dense matching are locally accurate but are also susceptible to long-term drift and map corruption. In contrast, feature matching methods can achieve greater long-term consistency but can suffer from inaccurate local pose estimation when feature information is sparse. Based on these observations, we propose an RGB-D SLAM system that leverages the…

    Submitted 25 March, 2022; originally announced March 2022.

  29. arXiv:2203.09125  [pdf, other]

    cs.CV cs.AI cs.LG

    Are Vision Transformers Robust to Spurious Correlations?

    Authors: Soumya Suvra Ghosal, Yifei Ming, Yixuan Li

    Abstract: Deep neural networks may be susceptible to learning spurious correlations that hold on average but not in atypical test samples. With the recent emergence of vision transformer (ViT) models, it remains underexplored how spurious correlations are manifested in such architectures. In this paper, we systematically investigate the robustness of vision transformers to spurious correlations on three…

    Submitted 17 March, 2022; originally announced March 2022.

  30. arXiv:2203.04450  [pdf, other]

    cs.CV cs.LG

    How to Exploit Hyperspherical Embeddings for Out-of-Distribution Detection?

    Authors: Yifei Ming, Yiyou Sun, Ousmane Dia, Yixuan Li

    Abstract: Out-of-distribution (OOD) detection is a critical task for reliable machine learning. Recent advances in representation learning give rise to distance-based OOD detection, where testing samples are detected as OOD if they are relatively far away from the centroids or prototypes of in-distribution (ID) classes. However, prior methods directly take off-the-shelf contrastive losses that suffice for c…

    Submitted 15 April, 2023; v1 submitted 8 March, 2022; originally announced March 2022.

    Comments: Published at ICLR 2023

    Journal ref: The Eleventh International Conference on Learning Representations, 2023

  31. arXiv:2202.02070  [pdf, other]

    cs.CV

    CGiS-Net: Aggregating Colour, Geometry and Implicit Semantic Features for Indoor Place Recognition

    Authors: Yuhang Ming, Xingrui Yang, Guofeng Zhang, Andrew Calway

    Abstract: We describe a novel approach to indoor place recognition from RGB point clouds based on aggregating low-level colour and geometry features with high-level implicit semantic features. It uses a 2-stage deep learning framework, in which the first stage is trained for the auxiliary task of semantic segmentation and the second stage uses features from layers in the first stage to generate discriminate…

    Submitted 11 July, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

    Comments: Accepted by 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)

  32. arXiv:2109.05642  [pdf, other]

    cs.LG cs.AI

    On the Impact of Spurious Correlation for Out-of-distribution Detection

    Authors: Yifei Ming, Hang Yin, Yixuan Li

    Abstract: Modern neural networks can assign high confidence to inputs drawn from outside the training distribution, posing threats to models in real-world deployments. While much research attention has been placed on designing new out-of-distribution (OOD) detection methods, the precise definition of OOD is often left in vagueness and falls short of the desired notion of OOD in reality. In this paper, we pr…

    Submitted 12 September, 2021; originally announced September 2021.

    Journal ref: AAAI 2022

  33. arXiv:2108.02522  [pdf, other]

    cs.CV cs.RO

    Object-Augmented RGB-D SLAM for Wide-Disparity Relocalisation

    Authors: Yuhang Ming, Xingrui Yang, Andrew Calway

    Abstract: We propose a novel object-augmented RGB-D SLAM system that is capable of constructing a consistent object map and performing relocalisation based on centroids of objects in the map. The approach aims to overcome the view dependence of appearance-based relocalisation methods using point features or images. During the map construction, we use a pre-trained neural network to detect objects and estima…

    Submitted 5 August, 2021; originally announced August 2021.

    Comments: Accepted by 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021)

  34. arXiv:2107.08356  [pdf, other]

    cs.CL cs.HC cs.LG cs.MM

    DeHumor: Visual Analytics for Decomposing Humor

    Authors: Xingbo Wang, Yao Ming, Tongshuang Wu, Haipeng Zeng, Yong Wang, Huamin Qu

    Abstract: Despite being a critical communication skill, grasping humor is challenging -- a successful use of humor requires a mixture of both engaging content build-up and an appropriate vocal delivery (e.g., pause). Prior studies on computational humor emphasize the textual and audio features immediately next to the punchline, yet overlooking longer-term context setup. Moreover, the theories are usually to…

    Submitted 18 July, 2021; originally announced July 2021.

    Comments: 15 pages. A preprint version of a publication at IEEE Transactions on Visualization and Computer Graphics (TVCG), 2021

    ACM Class: I.2.7; I.7.0; H.4.2; J.4

  35. arXiv:2102.10994  [pdf]

    q-bio.NC cs.LG cs.NE eess.SP

    Coherence of Working Memory Study Between Deep Neural Network and Neurophysiology

    Authors: Yurui Ming

    Abstract: The automatic feature extraction capability of deep neural networks (DNN) gives them the potential to analyse complicated electroencephalogram (EEG) data captured from brain functionality research. This work investigates the potential coherent correspondence between the region-of-interest (ROI) for DNN to explore, and ROI for conventional neurophysiological oriented methods to work with, exempli…

    Submitted 6 February, 2021; originally announced February 2021.

  36. arXiv:2012.09613  [pdf, other]

    cs.LG

    Model-based Reinforcement Learning for Continuous Control with Posterior Sampling

    Authors: Ying Fan, Yifei Ming

    Abstract: Balancing exploration and exploitation is crucial in reinforcement learning (RL). In this paper, we study model-based posterior sampling for reinforcement learning (PSRL) in continuous state-action spaces theoretically and empirically. First, we show the first regret bound of PSRL in continuous spaces which is polynomial in the episode length to the best of our knowledge. With the assumption that…

    Submitted 16 November, 2021; v1 submitted 20 November, 2020; originally announced December 2020.

    Comments: Accepted to ICML 2021

    Journal ref: Proceedings of the 38th International Conference on Machine Learning, PMLR 139:3078-3087, 2021

  37. arXiv:2011.11048  [pdf, other]

    cs.HC cs.LG cs.SI

    GNNLens: A Visual Analytics Approach for Prediction Error Diagnosis of Graph Neural Networks

    Authors: Zhihua Jin, Yong Wang, Qianwen Wang, Yao Ming, Tengfei Ma, Huamin Qu

    Abstract: Graph Neural Networks (GNNs) aim to extend deep learning techniques to graph data and have achieved significant progress in graph analysis tasks (e.g., node classification) in recent years. However, similar to other deep neural networks like Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), GNNs behave like a black box with their details hidden from model developers and us…

    Submitted 7 April, 2022; v1 submitted 22 November, 2020; originally announced November 2020.

    Comments: 17 pages

  38. arXiv:2010.04884  [pdf]

    cs.RO

    Truck-and-Trailer Backer-Upper problem using Cascaded Fuzzy Controllers

    Authors: Yurui Ming

    Abstract: In this paper we craft a cascaded fuzzy controlling system for the traditional Truck-and-Trailer Backer-Upper problem, which is a benchmark for testing various intelligent controlling systems. Inspired by the most inclination of human operations, we decompose the original overall controlling problem into two sub-controlling problems. A first fuzzy controller which predicts the optimal deviation…

    Submitted 9 October, 2020; originally announced October 2020.

  39. arXiv:2008.08353  [pdf, other]

    cs.LG cs.HC stat.ML

    DECE: Decision Explorer with Counterfactual Explanations for Machine Learning Models

    Authors: Furui Cheng, Yao Ming, Huamin Qu

    Abstract: With machine learning models being increasingly applied to various decision-making scenarios, people have spent growing efforts to make machine learning models more transparent and explainable. Among various explanation techniques, counterfactual explanations have the advantages of being human-friendly and actionable -- a counterfactual explanation tells the user how to gain the desired prediction…

    Submitted 19 August, 2020; originally announced August 2020.

    Comments: 10 pages, 7 figures. The paper will be published in IEEE Transactions on Visualization and Computer Graphics (TVCG)

    ACM Class: I.2.0; H.5.2

  40. arXiv:2003.12428  [pdf]

    cs.MM cs.LG stat.ML

    A General Approach for Using Deep Neural Network for Digital Watermarking

    Authors: Yurui Ming, Weiping Ding, Zehong Cao, Chin-Teng Lin

    Abstract: Technologies of the Internet of Things (IoT) facilitate digital contents such as images being acquired in a massive way. However, consideration from the privacy or legislation perspective still demands the need for intellectual content protection. In this paper, we propose a general deep neural network (DNN) based watermarking method to fulfill this goal. Instead of training a neural network for p…

    Submitted 8 March, 2020; originally announced March 2020.

  41. arXiv:2001.02399  [pdf]

    cs.LG eess.SP stat.ML

    EEG-based Drowsiness Estimation for Driving Safety using Deep Q-Learning

    Authors: Yurui Ming, Dongrui Wu, Yu-Kai Wang, Yuhui Shi, Chin-Teng Lin

    Abstract: Fatigue is the most vital factor of road fatalities and one manifestation of fatigue during driving is drowsiness. In this paper, we propose using deep Q-learning to analyze an electroencephalogram (EEG) dataset captured during a simulated endurance driving test. By measuring the correlation between drowsiness and driving performance, this experiment represents an important brain-computer interfac…

    Submitted 16 May, 2020; v1 submitted 8 January, 2020; originally announced January 2020.

  42. arXiv:1907.09728  [pdf, other]

    cs.LG cs.HC stat.ML

    Interpretable and Steerable Sequence Learning via Prototypes

    Authors: Yao Ming, Panpan Xu, Huamin Qu, Liu Ren

    Abstract: One of the major challenges in machine learning nowadays is to provide predictions with not only high accuracy but also user-friendly explanations. Although in recent years we have witnessed increasingly popular use of deep neural networks for sequence modeling, it is still challenging to explain the rationales behind the model outputs, which is essential for building trust and supporting the doma…

    Submitted 23 July, 2019; originally announced July 2019.

    Comments: Accepted as a full paper at KDD 2019 on May 8, 2019

    Journal ref: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019

  43. arXiv:1907.08796  [pdf, other]

    cs.HC

    Learning Vis Tools: Teaching Data Visualization Tutorials

    Authors: Leo Yu-Ho Lo, Yao Ming, Huamin Qu

    Abstract: Teaching and advocating data visualization are among the most important activities in the visualization community. With growing interest in data analysis from business and science professionals, data visualization courses attract students across different disciplines. However, comprehensive visualization training requires students to have a certain level of proficiency in programming, a requiremen…

    Submitted 7 October, 2019; v1 submitted 20 July, 2019; originally announced July 2019.

    Comments: 5 pages, 1 figure, IEEE VIS 2019

  44. arXiv:1902.05009  [pdf, other]

    cs.LG cs.HC stat.ML

    ATMSeer: Increasing Transparency and Controllability in Automated Machine Learning

    Authors: Qianwen Wang, Yao Ming, Zhihua Jin, Qiaomu Shen, Dongyu Liu, Micah J. Smith, Kalyan Veeramachaneni, Huamin Qu

    Abstract: To relieve the pain of manually selecting machine learning algorithms and tuning hyperparameters, automated machine learning (AutoML) methods have been developed to automatically search for good models. Due to the huge model search space, it is impossible to try all models. Users tend to distrust automatic results and increase the search budget as much as they can, thereby undermining the efficien…

    Submitted 13 February, 2019; originally announced February 2019.

    Comments: Published in the ACM Conference on Human Factors in Computing Systems (CHI), 2019, Glasgow, Scotland UK

    Journal ref: In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). Association for Computing Machinery, New York, NY, USA, Paper 681, 1-12

  45. arXiv:1807.06228  [pdf, other]

    cs.LG cs.AI cs.HC stat.ML

    RuleMatrix: Visualizing and Understanding Classifiers with Rules

    Authors: Yao Ming, Huamin Qu, Enrico Bertini

    Abstract: With the growing adoption of machine learning techniques, there is a surge of research interest towards making machine learning systems more transparent and interpretable. Various visualizations have been developed to help model developers understand, diagnose, and refine machine learning models. However, a large number of potential but neglected users are the domain experts with little knowledge…

    Submitted 17 July, 2018; originally announced July 2018.

    Comments: Accepted by IEEE Conference of Visual Analytics Science and Technology 2018

  46. arXiv:1710.10777  [pdf, other]

    cs.CL cs.AI

    Understanding Hidden Memories of Recurrent Neural Networks

    Authors: Yao Ming, Shaozu Cao, Ruixiang Zhang, Zhen Li, Yuanzhe Chen, Yangqiu Song, Huamin Qu

    Abstract: Recurrent neural networks (RNNs) have been successfully applied to various natural language processing (NLP) tasks and achieved better results than conventional methods. However, the lack of understanding of the mechanisms behind their effectiveness limits further improvements on their architectures. In this paper, we present a visual analytics method for understanding and comparing RNN models for…

    Submitted 30 October, 2017; originally announced October 2017.

    Comments: Published at IEEE Conference on Visual Analytics Science and Technology (IEEE VAST 2017)

  47. arXiv:1710.00166  [pdf, other]

    cs.CV

    PCANet-II: When PCANet Meets the Second Order Pooling

    Authors: Lei Tian, Xiaopeng Hong, Guoying Zhao, Chunxiao Fan, Yue Ming, Matti Pietikäinen

    Abstract: PCANet, as one noticeable shallow network, employs the histogram representation for feature pooling. However, there are three main problems about this kind of pooling method. First, the histogram-based pooling method binarizes the feature maps and leads to inevitable discriminative information loss. Second, it is difficult to effectively combine other visual cues into a compact representation, bec…

    Submitted 30 September, 2017; originally announced October 2017.

  48. arXiv:1606.00305  [pdf, ps, other]

    cs.CV

    Improving Deep Neural Network with Multiple Parametric Exponential Linear Units

    Authors: Yang Li, Chunxiao Fan, Yong Li, Qiong Wu, Yue Ming

    Abstract: Activation function is crucial to the recent successes of deep neural networks. In this paper, we first propose a new activation function, Multiple Parametric Exponential Linear Units (MPELU), aiming to generalize and unify the rectified and exponential linear units. As the generalized form, MPELU shares the advantages of Parametric Rectified Linear Unit (PReLU) and Exponential Linear Unit (ELU),…

    Submitted 17 January, 2017; v1 submitted 1 June, 2016; originally announced June 2016.