Showing 1–50 of 93 results for author: Wan, L

Searching in archive cs.
  1. arXiv:2406.10903  [pdf, other]

    cs.LG cs.CL cs.SE

    New Solutions on LLM Acceleration, Optimization, and Application

    Authors: Yingbing Huang, Lily Jiaxin Wan, Hanchen Ye, Manvi Jha, Jinghua Wang, Yuhong Li, Xiaofan Zhang, Deming Chen

    Abstract: Large Language Models (LLMs) have become extremely potent instruments with exceptional capacities for comprehending and producing human-like text in a wide range of applications. However, the increasing size and complexity of LLMs present significant challenges in both training and deployment, leading to substantial computational and storage costs as well as heightened energy consumption. In this…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: This is an expanded and more comprehensive study based on our invited DAC-24 paper with the same title and co-authors

  2. arXiv:2405.04175  [pdf, other]

    cs.CV

    Topicwise Separable Sentence Retrieval for Medical Report Generation

    Authors: Junting Zhao, Yang Zhou, Zhihao Chen, Huazhu Fu, Liang Wan

    Abstract: Automated radiology reporting holds immense clinical potential in alleviating the burdensome workload of radiologists and mitigating diagnostic bias. Recently, retrieval-based report generation methods have garnered increasing attention due to their inherent advantages in terms of the quality and consistency of generated reports. However, due to the long-tail distribution of the training data, the…

    Submitted 7 May, 2024; originally announced May 2024.

  3. arXiv:2405.01584  [pdf, other]

    cs.CL cs.LG eess.SP

    Lightweight Conceptual Dictionary Learning for Text Classification Using Information Compression

    Authors: Li Wan, Tansu Alpcan, Margreta Kuijper, Emanuele Viterbo

    Abstract: We propose a novel, lightweight supervised dictionary learning framework for text classification based on data compression and representation. This two-phase algorithm initially employs the Lempel-Ziv-Welch (LZW) algorithm to construct a dictionary from text datasets, focusing on the conceptual significance of dictionary elements. Subsequently, dictionaries are refined considering label data, opti…

    Submitted 28 April, 2024; originally announced May 2024.

    Comments: 12 pages, TKDE format

  4. arXiv:2405.00074  [pdf, other]

    cs.LG cs.SE

    PAODING: A High-fidelity Data-free Pruning Toolkit for Debloating Pre-trained Neural Networks

    Authors: Mark Huasong Meng, Hao Guan, Liuhuo Wan, Sin Gee Teo, Guangdong Bai, Jin Song Dong

    Abstract: We present PAODING, a toolkit to debloat pretrained neural network models through the lens of data-free pruning. To preserve the model fidelity, PAODING adopts an iterative process, which dynamically measures the effect of deleting a neuron to identify candidates that have the least impact to the output layer. Our evaluation shows that PAODING can significantly reduce the model size, generalize on…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 3 pages

  5. arXiv:2404.11111  [pdf, other]

    cs.CV

    CorrNet+: Sign Language Recognition and Translation via Spatial-Temporal Correlation

    Authors: Lianyu Hu, Wei Feng, Liqing Gao, Zekang Liu, Liang Wan

    Abstract: In sign language, the conveyance of human body trajectories predominantly relies upon the coordinated movements of hands and facial expressions across successive frames. Despite the recent advancements of sign language understanding methods, they often solely focus on individual frames, inevitably overlooking the inter-frame correlations that are essential for effectively modeling human body traje…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2303.03202

  6. arXiv:2404.10253  [pdf, other]

    cs.DC

    Kilometer-Level Coupled Modeling Using 40 Million Cores: An Eight-Year Journey of Model Development

    Authors: Xiaohui Duan, Yuxuan Li, Zhao Liu, Bin Yang, Juepeng Zheng, Haohuan Fu, Shaoqing Zhang, Shiming Xu, Yang Gao, Wei Xue, Di Wei, Xiaojing Lv, Lifeng Yan, Haopeng Huang, Haitian Lu, Lingfeng Wan, Haoran Lin, Qixin Chang, Chenlin Li, Quanjie He, Zeyu Song, Xuantong Wang, Yangyang Yu, Xilong Fan, Zhaopeng Qu, et al. (16 additional authors not shown)

    Abstract: With current and future leading systems adopting heterogeneous architectures, adapting existing models for heterogeneous supercomputers is of urgent need for improving model resolution and reducing modeling uncertainty. This paper presents our three-week effort on porting a complex earth system model, CESM 2.2, to a 40-million-core Sunway supercomputer. Taking a non-intrusive approach that tries t…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 18 pages, 13 figures

  7. arXiv:2404.06661  [pdf, other]

    cs.CV

    Efficient Denoising using Score Embedding in Score-based Diffusion Models

    Authors: Andrew S. Na, William Gao, Justin W. L. Wan

    Abstract: It is well known that training a denoising score-based diffusion models requires tens of thousands of epochs and a substantial number of image data to train the model. In this paper, we propose to increase the efficiency in training score-based diffusion models. Our method allows us to decrease the number of epochs needed to train the diffusion model. We accomplish this by solving the log-density…

    Submitted 9 April, 2024; originally announced April 2024.

  8. arXiv:2403.01414  [pdf, other]

    cs.CV

    Unsigned Orthogonal Distance Fields: An Accurate Neural Implicit Representation for Diverse 3D Shapes

    Authors: Yujie Lu, Long Wan, Nayu Ding, Yulong Wang, Shuhan Shen, Shen Cai, Lin Gao

    Abstract: Neural implicit representation of geometric shapes has witnessed considerable advancements in recent years. However, common distance field based implicit representations, specifically signed distance field (SDF) for watertight shapes or unsigned distance field (UDF) for arbitrary shapes, routinely suffer from degradation of reconstruction accuracy when converting to explicit surface points and mes…

    Submitted 1 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: accepted by CVPR 2024

  9. arXiv:2402.17978  [pdf, other]

    cs.LG cs.AI cs.MA

    Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning

    Authors: Zeyang Liu, Lipeng Wan, Xinrui Yang, Zhuoran Chen, Xingyu Chen, Xuguang Lan

    Abstract: Effective exploration is crucial to discovering optimal strategies for multi-agent reinforcement learning (MARL) in complex coordination tasks. Existing methods mainly utilize intrinsic rewards to enable committed exploration or use role-based learning for decomposing joint action spaces instead of directly conducting a collective search in the entire action-observation space. However, they often…

    Submitted 1 March, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: The 38th Annual AAAI Conference on Artificial Intelligence

  10. arXiv:2402.03807  [pdf, other]

    cs.LG cs.AI

    SEABO: A Simple Search-Based Method for Offline Imitation Learning

    Authors: Jiafei Lyu, Xiaoteng Ma, Le Wan, Runze Liu, Xiu Li, Zongqing Lu

    Abstract: Offline reinforcement learning (RL) has attracted much attention due to its ability in learning from static offline datasets and eliminating the need of interacting with the environment. Nevertheless, the success of offline RL relies heavily on the offline transitions annotated with reward labels. In practice, we often need to hand-craft the reward function, which is sometimes difficult, labor-int…

    Submitted 21 February, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: To appear in ICLR2024

  11. arXiv:2402.02701  [pdf, other]

    cs.LG cs.AI stat.ML

    Understanding What Affects Generalization Gap in Visual Reinforcement Learning: Theory and Empirical Evidence

    Authors: Jiafei Lyu, Le Wan, Xiu Li, Zongqing Lu

    Abstract: Recently, there are many efforts attempting to learn useful policies for continuous control in visual reinforcement learning (RL). In this scenario, it is important to learn a generalizable policy, as the testing environment may differ from the training environment, e.g., there exist distractors during deployment. Many practical algorithms are proposed to handle this problem. However, to the best…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: Part of this work is accepted as AAMAS 2024 extended abstract

  12. arXiv:2401.17268  [pdf, other]

    cs.CL cs.AI cs.LG

    Weaver: Foundation Models for Creative Writing

    Authors: Tiannan Wang, Jiamin Chen, Qingrui Jia, Shuai Wang, Ruoyu Fang, Huilin Wang, Zhaowei Gao, Chunzhao Xie, Chuou Xu, Jihong Dai, Yibin Liu, Jialong Wu, Shengwei Ding, Long Li, Zhiwei Huang, Xinle Deng, Teng Yu, Gangan Ma, Han Xiao, Zixin Chen, Danjun Xiang, Yunxia Wang, Yuanyuan Zhu, Yi Xiao, Jing Wang, et al. (21 additional authors not shown)

    Abstract: This work introduces Weaver, our first family of large language models (LLMs) dedicated to content creation. Weaver is pre-trained on a carefully selected corpus that focuses on improving the writing capabilities of large language models. We then fine-tune Weaver for creative and professional writing purposes and align it to the preference of professional writers using a suit of novel methods for…

    Submitted 30 January, 2024; originally announced January 2024.

  13. arXiv:2401.15946  [pdf, ps, other]

    cs.IT

    Approaching Maximum Likelihood Decoding Performance via Reshuffling ORBGRAND

    Authors: Li Wan, Wenyi Zhang

    Abstract: Guessing random additive noise decoding (GRAND) is a recently proposed decoding paradigm particularly suitable for codes with short length and high rate. Among its variants, ordered reliability bits GRAND (ORBGRAND) exploits soft information in a simple and effective fashion to schedule its queries, thereby allowing efficient hardware implementation. Compared with maximum likelihood (ML) decoding,…

    Submitted 28 April, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  14. MGARD: A multigrid framework for high-performance, error-controlled data compression and refactoring

    Authors: Qian Gong, Jieyang Chen, Ben Whitney, Xin Liang, Viktor Reshniak, Tania Banerjee, Jaemoon Lee, Anand Rangarajan, Lipeng Wan, Nicolas Vidal, Qing Liu, Ana Gainaru, Norbert Podhorszki, Richard Archibald, Sanjay Ranka, Scott Klasky

    Abstract: We describe MGARD, a software providing MultiGrid Adaptive Reduction for floating-point scientific data on structured and unstructured grids. With exceptional data compression capability and precise error control, MGARD addresses a wide range of requirements, including storage reduction, high-performance I/O, and in-situ data analysis. It features a unified application programming interface (API)…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: 20 pages, 8 figures

    Journal ref: SoftwareX, 24(2023), 101590

  15. arXiv:2401.04283  [pdf, ps, other]

    eess.AS cs.SD

    FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation

    Authors: Yang Liu, Li Wan, Yun Li, Yiteng Huang, Ming Sun, James Luan, Yangyang Shi, Xin Lei

    Abstract: Despite the potential of diffusion models in speech enhancement, their deployment in Acoustic Echo Cancellation (AEC) has been restricted. In this paper, we propose DI-AEC, pioneering a diffusion-based stochastic regeneration approach dedicated to AEC. Further, we propose FADI-AEC, fast score-based diffusion AEC framework to save computational demands, making it favorable for edge devices. It stan…

    Submitted 8 January, 2024; originally announced January 2024.

  16. Spatiotemporally adaptive compression for scientific dataset with feature preservation -- a case study on simulation data with extreme climate events analysis

    Authors: Qian Gong, Chengzhu Zhang, Xin Liang, Viktor Reshniak, Jieyang Chen, Anand Rangarajan, Sanjay Ranka, Nicolas Vidal, Lipeng Wan, Paul Ullrich, Norbert Podhorszki, Robert Jacob, Scott Klasky

    Abstract: Scientific discoveries are increasingly constrained by limited storage space and I/O capacities. For time-series simulations and experiments, their data often need to be decimated over timesteps to accommodate storage and I/O limitations. In this paper, we propose a technique that addresses storage costs while improving post-analysis accuracy through spatiotemporal adaptive, error-controlled lossy…

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: 10 pages, 13 figures, 2023 IEEE International Conference on e-Science and Grid Computing

    Journal ref: 2023 IEEE 19th International Conference on e-Science, Limassol, Cyprus, 2023, pp. 1-10

  17. arXiv:2401.01054  [pdf, other]

    cs.LG cs.AI

    Elastic Multi-Gradient Descent for Parallel Continual Learning

    Authors: Fan Lyu, Wei Feng, Yuepan Li, Qing Sun, Fanhua Shang, Liang Wan, Liang Wang

    Abstract: The goal of Continual Learning (CL) is to continuously learn from new data streams and accomplish the corresponding tasks. Previously studied CL assumes that data are given in sequence nose-to-tail for different tasks, thus indeed belonging to Serial Continual Learning (SCL). This paper studies the novel paradigm of Parallel Continual Learning (PCL) in dynamic multi-task scenarios, where a diverse…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: Submitted to IEEE TPAMI

  18. arXiv:2312.04416  [pdf, other]

    cs.LG cs.CY

    Monitoring Sustainable Global Development Along Shared Socioeconomic Pathways

    Authors: Michelle W. L. Wan, Jeffrey N. Clark, Edward A. Small, Elena Fillola Mayoral, Raúl Santos-Rodríguez

    Abstract: Sustainable global development is one of the most prevalent challenges facing the world today, hinging on the equilibrium between socioeconomic growth and environmental sustainability. We propose approaches to monitor and quantify sustainable development along the Shared Socioeconomic Pathways (SSPs), including mathematically derived scoring algorithms, and machine learning methods. These integrat…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: 5 pages, 1 figure. Presented at NeurIPS 2023 Workshop: Tackling Climate Change with Machine Learning

  19. arXiv:2310.20490  [pdf, other]

    cs.CV cs.LG

    Long-Tailed Learning as Multi-Objective Optimization

    Authors: Weiqi Li, Fan Lyu, Fanhua Shang, Liang Wan, Wei Feng

    Abstract: Real-world data is extremely imbalanced and presents a long-tailed distribution, resulting in models that are biased towards classes with sufficient samples and perform poorly on rare classes. Recent methods propose to rebalance classes but they undertake the seesaw dilemma (what is increasing performance on tail classes may decrease that of head classes, and vice versa). In this paper, we argue t…

    Submitted 1 November, 2023; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: In submission

  20. arXiv:2310.20305  [pdf]

    cs.CV

    Bilateral Network with Residual U-blocks and Dual-Guided Attention for Real-time Semantic Segmentation

    Authors: Liang Liao, Liang Wan, Mingsheng Liu, Shusheng Li

    Abstract: When some application scenarios need to use semantic segmentation technology, like automatic driving, the primary concern comes to real-time performance rather than extremely high segmentation accuracy. To achieve a good trade-off between speed and accuracy, two-branch architecture has been proposed in recent years. It treats spatial information and semantics information separately which allows th…

    Submitted 31 October, 2023; originally announced October 2023.

  21. arXiv:2310.07898  [pdf, other]

    cs.SE cs.DB

    FlorDB: Multiversion Hindsight Logging for Continuous Training

    Authors: Rolando Garcia, Anusha Dandamudi, Gabriel Matute, Lehan Wan, Joseph Gonzalez, Joseph M. Hellerstein, Koushik Sen

    Abstract: Production Machine Learning involves continuous training: hosting multiple versions of models over time, often with many model versions running at once. When model performance does not meet expectations, Machine Learning Engineers (MLEs) debug issues by exploring and analyzing numerous prior versions of code and training data to identify root causes and mitigate problems. Traditional debugging and…

    Submitted 2 March, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  22. arXiv:2310.04367  [pdf]

    stat.ML cs.LG

    A Marketplace Price Anomaly Detection System at Scale

    Authors: Akshit Sarpal, Qiwen Kang, Fangping Huang, Yang Song, Lijie Wan

    Abstract: Online marketplaces execute large volume of price updates that are initiated by individual marketplace sellers each day on the platform. This price democratization comes with increasing challenges with data quality. Lack of centralized guardrails that are available for a traditional online retailer causes a higher likelihood for inaccurate prices to get published on the website, leading to poor cu…

    Submitted 9 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: 10 pages, 4 figures, 7 tables

  23. arXiv:2309.16127  [pdf, other]

    cs.CV

    Open Compound Domain Adaptation with Object Style Compensation for Semantic Segmentation

    Authors: Tingliang Feng, Hao Shi, Xueyang Liu, Wei Feng, Liang Wan, Yanlin Zhou, Di Lin

    Abstract: Many methods of semantic image segmentation have borrowed the success of open compound domain adaptation. They minimize the style gap between the images of source and target domains, more easily predicting the accurate pseudo annotations for target domain's images that train segmentation network. The existing methods globally adapt the scene style of the images, whereas the object styles of differ…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: Accepted by NeurIPS 2023

  24. arXiv:2309.15965  [pdf, other]

    cs.LG cs.CY math.MG

    TraCE: Trajectory Counterfactual Explanation Scores

    Authors: Jeffrey N. Clark, Edward A. Small, Nawid Keshtmand, Michelle W. L. Wan, Elena Fillola Mayoral, Enrico Werner, Christopher P. Bourdeaux, Raul Santos-Rodriguez

    Abstract: Counterfactual explanations, and their associated algorithmic recourse, are typically leveraged to understand, explain, and potentially alter a prediction coming from a black-box classifier. In this paper, we propose to extend the use of counterfactuals to evaluate progress in sequential decision making tasks. To this end, we introduce a model-agnostic modular framework, TraCE (Trajectory Counterf…

    Submitted 26 January, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: 10 pages, 4 figures, appendix

  25. arXiv:2309.12865  [pdf, other]

    cs.CV

    Bridging Sensor Gaps via Single-Direction Tuning for Hyperspectral Image Classification

    Authors: Xizhe Xue, Haokui Zhang, Ying Li, Liuwei Wan, Zongwen Bai, Mike Zheng Shou

    Abstract: Recently, some researchers started exploring the use of ViTs in tackling HSI classification and achieved remarkable results. However, the training of ViT models requires a considerable number of training samples, while hyperspectral data, due to its high annotation costs, typically has a relatively small number of training samples. This contradiction has not been effectively addressed. In this pap…

    Submitted 22 September, 2023; originally announced September 2023.

  26. arXiv:2309.10993  [pdf, other]

    cs.SD cs.HC eess.AS

    Directional Source Separation for Robust Speech Recognition on Smart Glasses

    Authors: Tiantian Feng, Ju Lin, Yiteng Huang, Weipeng He, Kaustubh Kalgaonkar, Niko Moritz, Li Wan, Xin Lei, Ming Sun, Frank Seide

    Abstract: Modern smart glasses leverage advanced audio sensing and machine learning technologies to offer real-time transcribing and captioning services, considerably enriching human experiences in daily communications. However, such systems frequently encounter challenges related to environmental noises, resulting in degradation to speech recognition and speaker change detection. To improve voice quality,…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP 2024

  27. arXiv:2308.10601  [pdf, other]

    cs.CV cs.CR cs.LG eess.IV

    Improving the Transferability of Adversarial Examples with Arbitrary Style Transfer

    Authors: Zhijin Ge, Fanhua Shang, Hongying Liu, Yuanyuan Liu, Liang Wan, Wei Feng, Xiaosen Wang

    Abstract: Deep neural networks are vulnerable to adversarial examples crafted by applying human-imperceptible perturbations on clean inputs. Although many attack methods can achieve high success rates in the white-box setting, they also exhibit weak transferability in the black-box setting. Recently, various methods have been proposed to improve adversarial transferability, in which the input transformation…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: 10 pages, 2 figures, accepted by the 31st ACM International Conference on Multimedia (MM '23)

  28. arXiv:2308.05784  [pdf, other]

    eess.IV cs.CV

    High-performance Data Management for Whole Slide Image Analysis in Digital Pathology

    Authors: Haoju Leng, Ruining Deng, Shunxing Bao, Dazheng Fang, Bryan A. Millis, Yucheng Tang, Haichun Yang, Xiao Wang, Yifan Peng, Lipeng Wan, Yuankai Huo

    Abstract: When dealing with giga-pixel digital pathology in whole-slide imaging, a notable proportion of data records holds relevance during each analysis operation. For instance, when deploying an image analysis algorithm on whole-slide images (WSI), the computational bottleneck often lies in the input-output (I/O) system. This is particularly notable as patch-level processing introduces a considerable I/O…

    Submitted 20 August, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

  29. arXiv:2308.04493  [pdf, other]

    quant-ph cs.LG q-fin.CP

    Efficient option pricing with unary-based photonic computing chip and generative adversarial learning

    Authors: Hui Zhang, Lingxiao Wan, Sergi Ramos-Calderer, Yuancheng Zhan, Wai-Keong Mok, Hong Cai, Feng Gao, Xianshu Luo, Guo-Qiang Lo, Leong Chuan Kwek, José Ignacio Latorre, Ai Qun Liu

    Abstract: In the modern financial industry system, the structure of products has become more and more complex, and the bottleneck constraint of classical computing power has already restricted the development of the financial industry. Here, we present a photonic chip that implements the unary approach to European option pricing, in combination with the quantum amplitude estimation algorithm, to achieve a q…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: 11 pages, 7 figures

    Journal ref: Photonics Research 10.1364/PRJ.493865 (2023)

  30. Role Engine Implementation for a Continuous and Collaborative Multi-Robot System

    Authors: Behzad Akbari, Zikai Wang, Haibin Zhu, Lucas Wan, Ryan Adderson, Ya-Jun Pan

    Abstract: In situations involving teams of diverse robots, assigning appropriate roles to each robot and evaluating their performance is crucial. These roles define the specific characteristics of a robot within a given context. The stream actions exhibited by a robot based on its assigned role are referred to as the process role. Our research addresses the depiction of process roles using a multivariate pr…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: 10 pages, 18 figures, submitted to IEEE Transactions on Systems, Man and Cybernetics (T-SMC)

  31. arXiv:2306.08956  [pdf, other]

    cs.SD eess.AS stat.ML

    Multi-Loss Convolutional Network with Time-Frequency Attention for Speech Enhancement

    Authors: Liang Wan, Hongqing Liu, Yi Zhou, Jie Ji

    Abstract: The Dual-Path Convolution Recurrent Network (DPCRN) was proposed to effectively exploit time-frequency domain information. By combining the DPRNN module with Convolution Recurrent Network (CRN), the DPCRN obtained a promising performance in speech separation with a limited model size. In this paper, we explore self-attention in the DPCRN module and design a model called Multi-Loss Convolutional Ne…

    Submitted 15 June, 2023; originally announced June 2023.

  32. arXiv:2305.18443  [pdf, other]

    cs.LG

    Off-Policy RL Algorithms Can be Sample-Efficient for Continuous Control via Sample Multiple Reuse

    Authors: Jiafei Lyu, Le Wan, Zongqing Lu, Xiu Li

    Abstract: Sample efficiency is one of the most critical issues for online reinforcement learning (RL). Existing methods achieve higher sample efficiency by adopting model-based methods, Q-ensemble, or better exploration mechanisms. We, instead, propose to train an off-policy RL agent via updating on a fixed sampled batch multiple times, thus reusing these samples and better exploiting them within a single o…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: 37 pages

  33. arXiv:2305.14566  [pdf, other]

    eess.IV cs.CV

    An Accelerated Pipeline for Multi-label Renal Pathology Image Segmentation at the Whole Slide Image Level

    Authors: Haoju Leng, Ruining Deng, Zuhayr Asad, R. Michael Womick, Haichun Yang, Lipeng Wan, Yuankai Huo

    Abstract: Deep-learning techniques have been used widely to alleviate the labour-intensive and time-consuming manual annotation required for pixel-level tissue characterization. Our previous study introduced an efficient single dynamic network - Omni-Seg - that achieved multi-class multi-scale pathological segmentation with less computational complexity. However, the patch-wise segmentation paradigm still a…

    Submitted 23 May, 2023; originally announced May 2023.

  34. arXiv:2305.12106  [pdf]

    cs.CV cs.AI

    Human-annotated label noise and their impact on ConvNets for remote sensing image scene classification

    Authors: Longkang Peng, Tao Wei, Xuehong Chen, Xiaobei Chen, Rui Sun, Luoma Wan, Jin Chen, Xiaolin Zhu

    Abstract: Convolutional neural networks (ConvNets) have been successfully applied to satellite image scene classification. Human-labeled training datasets are essential for ConvNets to perform accurate classification. Errors in human-annotated training datasets are unavoidable due to the complexity of satellite images. However, the distribution of real-world human-annotated label noises on remote sensing im…

    Submitted 30 April, 2024; v1 submitted 20 May, 2023; originally announced May 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  35. arXiv:2304.12592  [pdf, other]

    cs.CV cs.AI

    MMRDN: Consistent Representation for Multi-View Manipulation Relationship Detection in Object-Stacked Scenes

    Authors: Han Wang, Jiayuan Zhang, Lipeng Wan, Xingyu Chen, Xuguang Lan, Nanning Zheng

    Abstract: Manipulation relationship detection (MRD) aims to guide the robot to grasp objects in the right order, which is important to ensure the safety and reliability of grasping in object stacked scenes. Previous works infer manipulation relationship by deep neural network trained with data collected from a predefined view, which has limitation in visual dislocation in unstructured environments. Multi-vi…

    Submitted 25 April, 2023; originally announced April 2023.

  36. arXiv:2304.04660  [pdf, other]

    cs.LG cs.AI

    Uncertainty-driven Trajectory Truncation for Data Augmentation in Offline Reinforcement Learning

    Authors: Junjie Zhang, Jiafei Lyu, Xiaoteng Ma, Jiangpeng Yan, Jun Yang, Le Wan, Xiu Li

    Abstract: Equipped with the trained environmental dynamics, model-based offline reinforcement learning (RL) algorithms can often successfully learn good policies from fixed-sized datasets, even some datasets with poor quality. Unfortunately, however, it can not be guaranteed that the generated samples from the trained dynamics model are reliable (e.g., some synthetic samples may lie outside of the support r…

    Submitted 26 July, 2023; v1 submitted 10 April, 2023; originally announced April 2023.

  37. HybridMIM: A Hybrid Masked Image Modeling Framework for 3D Medical Image Segmentation

    Authors: Zhaohu Xing, Lei Zhu, Lequan Yu, Zhiheng Xing, Liang Wan

    Abstract: Masked image modeling (MIM) with transformer backbones has recently been exploited as a powerful self-supervised pre-training technique. The existing MIM methods adopt the strategy to mask random patches of the image and reconstruct the missing pixels, which only considers semantic information at a lower level, and causes a long pre-training time. This paper presents HybridMIM, a novel hybrid self-…

    Submitted 18 March, 2023; originally announced March 2023.

    Comments: 10 pages, submitted to TMI

  38. arXiv:2303.10326  [pdf, other]

    eess.IV cs.CV

    Diff-UNet: A Diffusion Embedded Network for Volumetric Segmentation

    Authors: Zhaohu Xing, Liang Wan, Huazhu Fu, Guang Yang, Lei Zhu

    Abstract: In recent years, Denoising Diffusion Models have demonstrated remarkable success in generating semantically valuable pixel-wise representations for image generative modeling. In this study, we propose a novel end-to-end framework, called Diff-UNet, for medical volumetric segmentation. Our approach integrates the diffusion model into a standard U-shaped architecture to extract semantic information…

    Submitted 18 March, 2023; originally announced March 2023.

    Comments: 8 pages

  39. arXiv:2303.09370  [pdf, other]

    cs.CV

    Learning Physical-Spatio-Temporal Features for Video Shadow Removal

    Authors: Zhihao Chen, Liang Wan, Yefan Xiao, Lei Zhu, Huazhu Fu

    Abstract: Shadow removal in a single image has received increasing attention in recent years. However, removing shadows over dynamic scenes remains largely under-explored. In this paper, we propose the first data-driven video shadow removal model, termed PSTNet, by exploiting three essential characteristics of video shadows, i.e., physical property, spatio relation, and temporal coherence. Specifically, a d…

    Submitted 16 March, 2023; originally announced March 2023.

  40. arXiv:2303.07618  [pdf, other]

    cs.CV

    Medical Phrase Grounding with Region-Phrase Context Contrastive Alignment

    Authors: Zhihao Chen, Yang Zhou, Anh Tran, Junting Zhao, Liang Wan, Gideon Ooi, Lionel Cheng, Choon Hua Thng, Xinxing Xu, Yong Liu, Huazhu Fu

    Abstract: Medical phrase grounding (MPG) aims to locate the most relevant region in a medical image, given a phrase query describing certain medical findings, which is an important task for medical image analysis and radiological diagnosis. However, existing visual grounding methods rely on general visual features for identifying objects in natural images and are not capable of capturing the subtle and spec…

    Submitted 13 March, 2023; originally announced March 2023.

  41. arXiv:2302.08950  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Handling the Alignment for Wake Word Detection: A Comparison Between Alignment-Based, Alignment-Free and Hybrid Approaches

    Authors: Vinicius Ribeiro, Yiteng Huang, Yuan Shangguan, Zhaojun Yang, Li Wan, Ming Sun

    Abstract: Wake word detection exists in most intelligent homes and portable devices. It offers these devices the ability to "wake up" when summoned at a low cost of power and computing. This paper focuses on understanding alignment's role in developing a wake-word system that answers a generic phrase. We discuss three approaches. The first is alignment-based, where the model is trained with frame-wise cross…

    Submitted 7 June, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

    Comments: Accepted to Interspeech 2023

  42. arXiv:2211.12075  [pdf, other]

    cs.MA cs.LG

    Greedy based Value Representation for Optimal Coordination in Multi-agent Reinforcement Learning

    Authors: Lipeng Wan, Zeyang Liu, Xingyu Chen, Xuguang Lan, Nanning Zheng

    Abstract: Due to the representation limitation of the joint Q value function, multi-agent reinforcement learning methods with linear value decomposition (LVD) or monotonic value decomposition (MVD) suffer from relative overgeneralization. As a result, they can not ensure optimal consistency (i.e., the correspondence between individual greedy actions and the maximal true Q value). In this paper, we derive th…

    Submitted 22 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2112.04454

  43. arXiv:2211.06566  [pdf, other]

    cs.LG

    Innovative Drug-like Molecule Generation from Flow-based Generative Model

    Authors: Haotian Zhang, Linxiaoyi Wan

    Abstract: To design a drug given a biological molecule by using deep learning methods, there are many successful models published recently. People commonly used generative models to design new molecules given certain protein. LiGAN was regarded as the baseline of deep learning model which was developed on convolutional neural networks. Recently, GraphBP showed its ability to predict innovative "real" chemic…

    Submitted 11 November, 2022; originally announced November 2022.

  44. arXiv:2211.04635  [pdf, other]

    cs.LG cs.AI eess.AS

    LiCo-Net: Linearized Convolution Network for Hardware-efficient Keyword Spotting

    Authors: Haichuan Yang, Zhaojun Yang, Li Wan, Biqiao Zhang, Yangyang Shi, Yiteng Huang, Ivaylo Enchev, Limin Tang, Raziel Alvarez, Ming Sun, Xin Lei, Raghuraman Krishnamoorthi, Vikas Chandra

    Abstract: This paper proposes a hardware-efficient architecture, Linearized Convolution Network (LiCo-Net) for keyword spotting. It is optimized specifically for low-power processor units like microcontrollers. ML operators exhibit heterogeneous efficiency profiles on power-efficient hardware. Given the exact theoretical computation cost, int8 operators are more computation-effective than float operators, a…

    Submitted 8 November, 2022; originally announced November 2022.

  45. arXiv:2210.15392  [pdf, other]

    cs.CV

    LeNo: Adversarial Robust Salient Object Detection Networks with Learnable Noise

    Authors: He Wang, Lin Wan, He Tang

    Abstract: Pixel-wise prediction with deep neural network has become an effective paradigm for salient object detection (SOD) and achieved remarkable performance. However, very few SOD models are robust against adversarial attacks which are visually imperceptible for human visual attention. The previous work robust saliency (ROSA) shuffles the pre-segmented superpixels and then refines the coarse saliency ma…

    Submitted 7 December, 2022; v1 submitted 27 October, 2022; originally announced October 2022.

    Comments: 8 pages, 6 figures, accepted by AAAI 2023

  46. arXiv:2210.04251  [pdf, other]

    cs.LG cs.AI

    State Advantage Weighting for Offline RL

    Authors: Jiafei Lyu, Aicheng Gong, Le Wan, Zongqing Lu, Xiu Li

    Abstract: We present state advantage weighting for offline reinforcement learning (RL). In contrast to action advantage $A(s,a)$ that we commonly adopt in QSA learning, we leverage state advantage $A(s,s^\prime)$ and QSS learning for offline RL, hence decoupling the action from values. We expect the agent can get to the high-reward state and the action is determined by how the agent can get to that correspo…

    Submitted 8 November, 2022; v1 submitted 9 October, 2022; originally announced October 2022.

    Comments: 3rd Offline RL workshop at NeurIPS 2022. arXiv admin note: text overlap with arXiv:2206.07989

  47. arXiv:2209.12241  [pdf, other]

    cs.LG

    Exploring Example Influence in Continual Learning

    Authors: Qing Sun, Fan Lyu, Fanhua Shang, Wei Feng, Liang Wan

    Abstract: Continual Learning (CL) sequentially learns new tasks like human beings, with the goal to achieve better Stability (S, remembering past tasks) and Plasticity (P, adapting to new tasks). Due to the fact that past training data is not available, it is valuable to explore the influence difference on S and P among training examples, which may improve the learning pattern towards better SP. Inspired by…

    Submitted 25 September, 2022; originally announced September 2022.

    Comments: Accepted at NeurIPS 2022

  48. arXiv:2209.01517  [pdf, other]

    cs.CV

    Joint Prediction of Meningioma Grade and Brain Invasion via Task-Aware Contrastive Learning

    Authors: Tianling Liu, Wennan Liu, Lequan Yu, Liang Wan, Tong Han, Lei Zhu

    Abstract: Preoperative and noninvasive prediction of the meningioma grade is important in clinical practice, as it directly influences the clinical decision making. What's more, brain invasion in meningioma (i.e., the presence of tumor tissue within the adjacent brain tissue) is an independent criterion for the grading of meningioma and influences the treatment strategy. Although efforts have been reported…

    Submitted 3 September, 2022; originally announced September 2022.

    Comments: Accepted by MICCAI 2022

  49. arXiv:2208.14876  [pdf, other]

    eess.IV cs.CV

    NestedFormer: Nested Modality-Aware Transformer for Brain Tumor Segmentation

    Authors: Zhaohu Xing, Lequan Yu, Liang Wan, Tong Han, Lei Zhu

    Abstract: Multi-modal MR imaging is routinely used in clinical practice to diagnose and investigate brain tumors by providing rich complementary information. Previous multi-modal MRI segmentation methods usually perform modal fusion by concatenating multi-modal MRIs at an early/middle stage of the network, which hardly explores non-linear dependencies between modalities. In this work, we propose a novel Nes…

    Submitted 31 August, 2022; originally announced August 2022.

    Comments: MICCAI 2022

  50. arXiv:2203.02285  [pdf, other]

    cs.ET cs.LG physics.optics quant-ph

    A photonic chip-based machine learning approach for the prediction of molecular properties

    Authors: Hui Zhang, Jonathan Wei Zhong Lau, Lingxiao Wan, Liang Shi, Hong Cai, Xianshu Luo, Patrick Lo, Chee-Kong Lee, Leong-Chuan Kwek, Ai Qun Liu

    Abstract: Machine learning methods have revolutionized the discovery process of new molecules and materials. However, the intensive training process of neural networks for molecules with ever-increasing complexity has resulted in exponential growth in computation cost, leading to long simulation time and high energy consumption. Photonic chip technology offers an alternative platform for implementing neural…

    Submitted 25 December, 2022; v1 submitted 2 March, 2022; originally announced March 2022.