Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content

Showing 1–50 of 353 results for author: Fan, X

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.19042  [pdf, other

    cs.CR cs.DC

    Towards Credential-based Device Registration in DApps for DePINs with ZKPs

    Authors: Jonathan Heiss, Fernando Castillo, Xinxin Fan

    Abstract: Decentralized Physical Infrastructure Networks (DePINS) are secured and governed by blockchains but beyond crypto-economic incentives, they lack measures to establish trust in participating devices and their services. The verification of relevant device credentials during device registration helps to overcome this problem. However, on-chain verification in decentralized applications (dApp) disclos… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2406.02064  [pdf, other

    cs.LG cs.CR cs.CV

    Advancing Generalized Transfer Attack with Initialization Derived Bilevel Optimization and Dynamic Sequence Truncation

    Authors: Yaohua Liu, Jiaxin Gao, Xuan Liu, Xianghao Jiao, Xin Fan, Risheng Liu

    Abstract: Transfer attacks generate significant interest for real-world black-box applications by crafting transferable adversarial examples through surrogate models. Whereas, existing works essentially directly optimize the single-level objective w.r.t. the surrogate model, which always leads to poor interpretability of attack mechanism and limited generalization performance over unknown victim models. In… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted by IJCAI 2024. 10 pages

  3. arXiv:2406.01631  [pdf, other

    cs.IR cs.LG

    An LLM-based Recommender System Environment

    Authors: Nathan Corecco, Giorgio Piatti, Luca A. Lanzendörfer, Flint Xiaofeng Fan, Roger Wattenhofer

    Abstract: Reinforcement learning (RL) has gained popularity in the realm of recommender systems due to its ability to optimize long-term rewards and guide users in discovering relevant content. However, the successful implementation of RL in recommender systems is challenging because of several factors, including the limited availability of online data for training on-policy methods. This scarcity requires… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

  4. arXiv:2405.18291  [pdf, other

    cs.LG cs.AI cs.DC

    FedSAC: Dynamic Submodel Allocation for Collaborative Fairness in Federated Learning

    Authors: Zihui Wang, Zheng Wang, Lingjuan Lyu, Zhaopeng Peng, Zhicheng Yang, Chenglu Wen, Rongshan Yu, Cheng Wang, Xiaoliang Fan

    Abstract: Collaborative fairness stands as an essential element in federated learning to encourage client participation by equitably distributing rewards based on individual contributions. Existing methods primarily focus on adjusting gradient allocations among clients to achieve collaborative fairness. However, they frequently overlook crucial factors such as maintaining consistency across local models and… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD'24

  5. arXiv:2405.17141  [pdf, other

    eess.IV cs.CV

    MVMS-RCN: A Dual-Domain Unfolding CT Reconstruction with Multi-sparse-view and Multi-scale Refinement-correction

    Authors: Xiaohong Fan, Ke Chen, Huaming Yi, Yin Yang, Jianping Zhang

    Abstract: X-ray Computed Tomography (CT) is one of the most important diagnostic imaging techniques in clinical applications. Sparse-view CT imaging reduces the number of projection views to a lower radiation dose and alleviates the potential risk of radiation exposure. Most existing deep learning (DL) and deep unfolding sparse-view CT reconstruction methods: 1) do not fully use the projection data; 2) do n… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 12 pages, submitted

  6. arXiv:2405.15268  [pdf, other

    cs.LG

    ParamReL: Learning Parameter Space Representation via Progressively Encoding Bayesian Flow Networks

    Authors: Zhangkai Wu, Xuhui Fan, Jin Li, Zhilin Zhao, Hui Chen, Longbing Cao

    Abstract: The recently proposed Bayesian Flow Networks~(BFNs) show great potential in modeling parameter spaces, offering a unified strategy for handling continuous, discretized, and discrete data. However, BFNs cannot learn high-level semantic representation from the parameter space since {common encoders, which encode data into one static representation, cannot capture semantic changes in parameters.} Thi… ▽ More

    Submitted 5 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  7. arXiv:2405.15193  [pdf, other

    cs.DB cs.DS

    CuckooGraph: A Scalable and Space-Time Efficient Data Structure for Large-Scale Dynamic Graphs

    Authors: Zhuochen Fan, Yalun Cai, Zirui Liu, Jiarui Guo, Xin Fan, Tong Yang, Bin Cui

    Abstract: Graphs play an increasingly important role in various big data applications. However, existing graph data structures cannot simultaneously address the performance bottlenecks caused by the dynamic updates, large scale, and high query complexity of current graphs. This paper proposes a novel data structure for large-scale dynamic graphs called CuckooGraph. It does not need to know the amount of gra… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

  8. arXiv:2405.14139  [pdf, other

    q-bio.NC cs.LG cs.NE

    Contribute to balance, wire in accordance: Emergence of backpropagation from a simple, bio-plausible neuroplasticity rule

    Authors: Xinhao Fan, Shreesh P Mysore

    Abstract: Backpropagation (BP) has been pivotal in advancing machine learning and remains essential in computational applications and comparative studies of biological and artificial neural networks. Despite its widespread use, the implementation of BP in the brain remains elusive, and its biological plausibility is often questioned due to inherent issues such as the need for symmetry of weights between for… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  9. arXiv:2405.11683  [pdf, other

    cs.LG

    Conditionally-Conjugate Gaussian Process Factor Analysis for Spike Count Data via Data Augmentation

    Authors: Yididiya Y. Nadew, Xuhui Fan, Christopher J. Quinn

    Abstract: Gaussian process factor analysis (GPFA) is a latent variable modeling technique commonly used to identify smooth, low-dimensional latent trajectories underlying high-dimensional neural recordings. Specifically, researchers model spiking rates as Gaussian observations, resulting in tractable inference. Recently, GPFA has been extended to model spike count data. However, due to the non-conjugacy of… ▽ More

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: 23 pages, 2 figures, ICML

  10. arXiv:2405.11416  [pdf, other

    cs.LG

    Discrete-state Continuous-time Diffusion for Graph Generation

    Authors: Zhe Xu, Ruizhong Qiu, Yuzhong Chen, Huiyuan Chen, Xiran Fan, Menghai Pan, Zhichen Zeng, Mahashweta Das, Hanghang Tong

    Abstract: Graph is a prevalent discrete data structure, whose generation has wide applications such as drug discovery and circuit design. Diffusion generative models, as an emerging research focus, have been applied to graph generation tasks. Overall, according to the space of states and time steps, diffusion generative models can be categorized into discrete-/continuous-state discrete-/continuous-time fash… ▽ More

    Submitted 18 May, 2024; originally announced May 2024.

  11. arXiv:2405.10959  [pdf, other

    cs.CY cs.LG

    Foundation Models for Education: Promises and Prospects

    Authors: Tianlong Xu, Richard Tong, Jing Liang, Xing Fan, Haoyang Li, Qingsong Wen

    Abstract: With the advent of foundation models like ChatGPT, educators are excited about the transformative role that AI might play in propelling the next education revolution. The developing speed and the profound impact of foundation models in various industries force us to think deeply about the changes they will make to education, a domain that is critically important for the future of humans. In this p… ▽ More

    Submitted 8 April, 2024; originally announced May 2024.

    Comments: Accepted by IEEE Intelligent Systems

  12. arXiv:2405.10793  [pdf, other

    cs.RO

    CCTNet: A Circular Convolutional Transformer Network for LiDAR-based Place Recognition Handling Movable Objects Occlusion

    Authors: Gang Wang, Chaoran Zhu, Qian Xu, Tongzhou Zhang, Hai Zhang, XiaoPeng Fan, Jue Hu

    Abstract: Place recognition is a fundamental task for robotic application, allowing robots to perform loop closure detection within simultaneous localization and mapping (SLAM), and achieve relocalization on prior maps. Current range image-based networks use single-column convolution to maintain feature invariance to shifts in image columns caused by LiDAR viewpoint change.However, this raises the issues su… ▽ More

    Submitted 26 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

  13. arXiv:2405.06814  [pdf, other

    cs.CV

    Dual-Task Vision Transformer for Rapid and Accurate Intracerebral Hemorrhage Classification on CT Images

    Authors: Jialiang Fan, Guoyu Lu, Xinhui Fan, Lucan Li

    Abstract: Intracerebral hemorrhage (ICH) is a severe and sudden medical condition caused by the rupture of blood vessels in the brain, leading to permanent damage to brain tissue and often resulting in functional disabilities or death in patients. Diagnosis and analysis of ICH typically rely on brain CT imaging. Given the urgency of ICH conditions, early treatment is crucial, necessitating rapid analysis of… ▽ More

    Submitted 18 June, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figure3

  14. arXiv:2405.04289  [pdf, ps, other

    cs.NE

    Direct Training High-Performance Deep Spiking Neural Networks: A Review of Theories and Methods

    Authors: Chenlin Zhou, Han Zhang, Liutao Yu, Yumin Ye, Zhaokun Zhou, Liwei Huang, Zhengyu Ma, Xiaopeng Fan, Huihui Zhou, Yonghong Tian

    Abstract: Spiking neural networks (SNNs) offer a promising energy-efficient alternative to artificial neural networks (ANNs), in virtue of their high biological plausibility, rich spatial-temporal dynamics, and event-driven computation. The direct training algorithms based on the surrogate gradient method provide sufficient flexibility to design novel SNN architectures and explore the spatial-temporal dynam… ▽ More

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 29 pages

  15. arXiv:2405.02476  [pdf, other

    cs.ET cs.CR cs.DC

    SSI4IoT: Unlocking the Potential of IoT Tailored Self-Sovereign Identity

    Authors: Thusitha Dayaratne, Xinxin Fan, Yuhong Liu, Carsten Rudolph

    Abstract: The emerging Self-Sovereign Identity (SSI) techniques, such as Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs), move control of digital identity from conventional identity providers to individuals and lay down the foundation for people, organizations, and things establishing rich digital relationship. The existing applications of SSI mainly focus on creating person-to-person and… ▽ More

    Submitted 3 May, 2024; originally announced May 2024.

  16. arXiv:2405.00222  [pdf, other

    quant-ph cs.NI

    Optimized Distribution of Entanglement Graph States in Quantum Networks

    Authors: Xiaojie Fan, Caitao Zhan, Himanshu Gupta, C. R. Ramakrishnan

    Abstract: Building large-scale quantum computers, essential to demonstrating quantum advantage, is a key challenge. Quantum Networks (QNs) can help address this challenge by enabling the construction of large, robust, and more capable quantum computing platforms by connecting smaller quantum computers. Moreover, unlike classical systems, QNs can enable fully secured long-distance communication. Thus, quantu… ▽ More

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 11 pages, 13 figures

  17. arXiv:2404.16371  [pdf, other

    cs.CV

    Multimodal Information Interaction for Medical Image Segmentation

    Authors: Xinxin Fan, Lin Liu, Haoran Zhang

    Abstract: The use of multimodal data in assisted diagnosis and segmentation has emerged as a prominent area of interest in current research. However, one of the primary challenges is how to effectively fuse multimodal features. Most of the current approaches focus on the integration of multimodal features while ignoring the correlation and consistency between different modal features, leading to the inclusi… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

  18. arXiv:2404.16271  [pdf

    cs.CR cond-mat.mtrl-sci

    True random number generation using metastable 1T' molybdenum ditelluride

    Authors: Yang Liu, Pengyu Liu, Yingyi Wen, Zihan Liang, Songwei Liu, Lekai Song, Jingfang Pei, Xiaoyue Fan, Teng Ma, Gang Wang, Shuo Gao, Kong-Pang Pun, Xiaolong Chen, Guohua Hu

    Abstract: True random numbers play a critical role in secure cryptography. The generation relies on a stable and readily extractable entropy source. Here, from solution-processed structurally metastable 1T' MoTe2, we prove stable output of featureless, stochastic, and yet stable conductance noise at a broad temperature (down to 15 K) with minimal power consumption (down to 0.05 micro-W). Our characterizatio… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

  19. arXiv:2404.15657  [pdf, other

    cs.LG cs.AI

    FedSI: Federated Subnetwork Inference for Efficient Uncertainty Quantification

    Authors: Hui Chen, Hengyu Liu, Zhangkai Wu, Xuhui Fan, Longbing Cao

    Abstract: While deep neural networks (DNNs) based personalized federated learning (PFL) is demanding for addressing data heterogeneity and shows promising performance, existing methods for federated learning (FL) suffer from efficient systematic uncertainty quantification. The Bayesian DNNs-based PFL is usually questioned of either over-simplified model structures or high computational and memory costs. In… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

  20. arXiv:2404.13419  [pdf, other

    cs.SC

    On Modeling Multi-Criteria Decision Making with Uncertain Information using Probabilistic Rules

    Authors: Shengxin Hong, Xiuyi Fan

    Abstract: Decision-making processes often involve dealing with uncertainty, which is traditionally addressed through probabilistic models. However, in practical scenarios, assessing probabilities reliably can be challenging, compounded by diverse perceptions of probabilistic information among decision makers. To address this variability and accommodate diverse preferences regarding uncertainty, we introduce… ▽ More

    Submitted 20 April, 2024; originally announced April 2024.

  21. arXiv:2404.11536  [pdf, other

    cs.LG cs.AI

    FedPFT: Federated Proxy Fine-Tuning of Foundation Models

    Authors: Zhaopeng Peng, Xiaoliang Fan, Yufan Chen, Zheng Wang, Shirui Pan, Chenglu Wen, Ruisheng Zhang, Cheng Wang

    Abstract: Adapting Foundation Models (FMs) for downstream tasks through Federated Learning (FL) emerges a promising strategy for protecting data privacy and valuable FMs. Existing methods fine-tune FM by allocating sub-FM to clients in FL, however, leading to suboptimal performance due to insufficient tuning and inevitable error accumulations of gradients. In this paper, we propose Federated Proxy Fine-Tuni… ▽ More

    Submitted 28 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI'24

  22. arXiv:2404.10253  [pdf, other

    cs.DC

    Kilometer-Level Coupled Modeling Using 40 Million Cores: An Eight-Year Journey of Model Development

    Authors: Xiaohui Duan, Yuxuan Li, Zhao Liu, Bin Yang, Juepeng Zheng, Haohuan Fu, Shaoqing Zhang, Shiming Xu, Yang Gao, Wei Xue, Di Wei, Xiaojing Lv, Lifeng Yan, Haopeng Huang, Haitian Lu, Lingfeng Wan, Haoran Lin, Qixin Chang, Chenlin Li, Quanjie He, Zeyu Song, Xuantong Wang, Yangyang Yu, Xilong Fan, Zhaopeng Qu , et al. (16 additional authors not shown)

    Abstract: With current and future leading systems adopting heterogeneous architectures, adapting existing models for heterogeneous supercomputers is of urgent need for improving model resolution and reducing modeling uncertainty. This paper presents our three-week effort on porting a complex earth system model, CESM 2.2, to a 40-million-core Sunway supercomputer. Taking a non-intrusive approach that tries t… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 18 pages, 13 figures

  23. arXiv:2404.01174  [pdf, other

    cs.CV cs.MM

    SpikeMba: Multi-Modal Spiking Saliency Mamba for Temporal Video Grounding

    Authors: Wenrui Li, Xiaopeng Hong, Ruiqin Xiong, Xiaopeng Fan

    Abstract: Temporal video grounding (TVG) is a critical task in video content understanding, requiring precise alignment between video content and natural language instructions. Despite significant advancements, existing methods face challenges in managing confidence bias towards salient objects and capturing long-term dependencies in video sequences. To address these issues, we introduce SpikeMba: a multi-m… ▽ More

    Submitted 23 May, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  24. arXiv:2403.20156  [pdf, other

    cs.LG cs.AI

    CAESAR: Enhancing Federated RL in Heterogeneous MDPs through Convergence-Aware Sampling with Screening

    Authors: Hei Yi Mak, Flint Xiaofeng Fan, Luca A. Lanzendörfer, Cheston Tan, Wei Tsang Ooi, Roger Wattenhofer

    Abstract: In this study, we delve into Federated Reinforcement Learning (FedRL) in the context of value-based agents operating across diverse Markov Decision Processes (MDPs). Existing FedRL methods typically aggregate agents' learning by averaging the value functions across them to improve their performance. However, this aggregation strategy is suboptimal in heterogeneous environments where agents converg… ▽ More

    Submitted 16 April, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

  25. arXiv:2403.17676  [pdf

    physics.app-ph cs.ET

    Analysis on reservoir activation with the nonlinearity harnessed from solution-processed MoS2 devices

    Authors: Songwei Liu, Yang Liu, Yingyi Wen, Jingfang Pei, Pengyu Liu, Lekai Song, Xiaoyue Fan, Wenchen Yang, Danmei Pan, Teng Ma, Yue Lin, Gang Wang, Guohua Hu

    Abstract: Reservoir computing is a recurrent neural network that has been applied across various domains in machine learning. The implementation of reservoir computing, however, often demands heavy computations for activating the reservoir. Configuring physical reservoir networks and harnessing the nonlinearity from the underlying devices for activation is an emergent solution to address the computational c… ▽ More

    Submitted 26 March, 2024; originally announced March 2024.

  26. arXiv:2403.16552  [pdf, other

    cs.NE cs.AI cs.CV

    QKFormer: Hierarchical Spiking Transformer using Q-K Attention

    Authors: Chenlin Zhou, Han Zhang, Zhaokun Zhou, Liutao Yu, Liwei Huang, Xiaopeng Fan, Li Yuan, Zhengyu Ma, Huihui Zhou, Yonghong Tian

    Abstract: Spiking Transformers, which integrate Spiking Neural Networks (SNNs) with Transformer architectures, have attracted significant attention due to their potential for energy efficiency and high performance. However, existing models in this domain still suffer from suboptimal performance. We introduce several innovations to improve the performance: i) We propose a novel spike-form Q-K attention mecha… ▽ More

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 10 pages, code: https://github.com/zhouchenlin2096/QKFormer

  27. arXiv:2403.14617  [pdf, other

    cs.CV cs.AI cs.LG

    Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion

    Authors: Xiang Fan, Anand Bhattad, Ranjay Krishna

    Abstract: We introduce Videoshop, a training-free video editing algorithm for localized semantic edits. Videoshop allows users to use any editing software, including Photoshop and generative inpainting, to modify the first frame; it automatically propagates those changes, with semantic, spatial, and temporally consistent motion, to the remaining frames. Unlike existing methods that enable edits only through… ▽ More

    Submitted 22 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: Project page at https://videoshop-editing.github.io/

  28. arXiv:2403.05924  [pdf, other

    cs.CV

    CSCNET: Class-Specified Cascaded Network for Compositional Zero-Shot Learning

    Authors: Yanyi Zhang, Qi Jia, Xin Fan, Yu Liu, Ran He

    Abstract: Attribute and object (A-O) disentanglement is a fundamental and critical problem for Compositional Zero-shot Learning (CZSL), whose aim is to recognize novel A-O compositions based on foregone knowledge. Existing methods based on disentangled representation learning lose sight of the contextual dependency between the A-O primitive pairs. Inspired by this, we propose a novel A-O disentangled framew… ▽ More

    Submitted 13 March, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

    Comments: ICASSP 2024

  29. arXiv:2403.05751  [pdf, other

    cs.LG cs.AI

    MG-TSD: Multi-Granularity Time Series Diffusion Models with Guided Learning Process

    Authors: Xinyao Fan, Yueying Wu, Chang Xu, Yuhao Huang, Weiqing Liu, Jiang Bian

    Abstract: Recently, diffusion probabilistic models have attracted attention in generative time series forecasting due to their remarkable capacity to generate high-fidelity samples. However, the effective utilization of their strong modeling ability in the probabilistic time series forecasting task remains an open question, partially due to the challenge of instability arising from their stochastic nature.… ▽ More

    Submitted 15 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: International Conference on Learning Representations (ICLR) 2024

  30. arXiv:2403.04703  [pdf, other

    cs.RO

    mmPlace: Robust Place Recognition with Intermediate Frequency Signal of Low-cost Single-chip Millimeter Wave Radar

    Authors: Chengzhen Meng, Yifan Duan, Chenming He, Dequan Wang, Xiaoran Fan, Yanyong Zhang

    Abstract: Place recognition is crucial for tasks like loop-closure detection and re-localization. Single-chip millimeter wave radar (single-chip radar in short) emerges as a low-cost sensor option for place recognition, with the advantage of insensitivity to degraded visual environments. However, it encounters two challenges. Firstly, sparse point cloud from single-chip radar leads to poor performance when… ▽ More

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 8 pages, 8 figures

  31. arXiv:2403.04365  [pdf, other

    cs.NI cs.SE

    DV-Hop localization based on Distance Estimation using Multinode and Hop Loss in WSNs

    Authors: Penghong Wang, Xingtao Wang, Wenrui Li, Xiaopeng Fan, Debin Zhao

    Abstract: Location awareness is a critical issue in wireless sensor network applications. For more accurate location estimation, the two issues should be considered extensively: 1) how to sufficiently utilize the connection information between multiple nodes and 2) how to select a suitable solution from multiple solutions obtained by the Euclidean distance loss. In this paper, a DV-Hop localization based on… ▽ More

    Submitted 7 March, 2024; originally announced March 2024.

  32. arXiv:2402.19111  [pdf, other

    eess.IV cs.CV

    Deep Network for Image Compressed Sensing Coding Using Local Structural Sampling

    Authors: Wenxue Cui, Xingtao Wang, Xiaopeng Fan, Shaohui Liu, Xinwei Gao, Debin Zhao

    Abstract: Existing image compressed sensing (CS) coding frameworks usually solve an inverse problem based on measurement coding and optimization-based image reconstruction, which still exist the following two challenges: 1) The widely used random sampling matrix, such as the Gaussian Random Matrix (GRM), usually leads to low measurement coding efficiency. 2) The optimization-based reconstruction methods gen… ▽ More

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted by ACM Transactions on Multimedia Computing Communications and Applications (TOMM)

  33. arXiv:2402.18796  [pdf, other

    cs.RO

    MOSAIC: A Modular System for Assistive and Interactive Cooking

    Authors: Huaxiaoyue Wang, Kushal Kedia, Juntao Ren, Rahma Abdullah, Atiksh Bhardwaj, Angela Chao, Kelly Y Chen, Nathaniel Chin, Prithwish Dan, Xinyi Fan, Gonzalo Gonzalez-Pumariega, Aditya Kompella, Maximus Adrian Pace, Yash Sharma, Xiangwan Sun, Neha Sunkara, Sanjiban Choudhury

    Abstract: We present MOSAIC, a modular architecture for home robots to perform complex collaborative tasks, such as cooking with everyday users. MOSAIC tightly collaborates with humans, interacts with users using natural language, coordinates multiple robots, and manages an open vocabulary of everyday objects. At its core, MOSAIC employs modularity: it leverages multiple large-scale pre-trained models for g… ▽ More

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 22 pages, 13 figures

  34. arXiv:2402.18493  [pdf, other

    cs.CV

    Sunshine to Rainstorm: Cross-Weather Knowledge Distillation for Robust 3D Object Detection

    Authors: Xun Huang, Hai Wu, Xin Li, Xiaoliang Fan, Chenglu Wen, Cheng Wang

    Abstract: LiDAR-based 3D object detection models have traditionally struggled under rainy conditions due to the degraded and noisy scanning signals. Previous research has attempted to address this by simulating the noise from rain to improve the robustness of detection models. However, significant disparities exist between simulated and actual rain-impacted data points. In this work, we propose a novel rain… ▽ More

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI2024

  35. arXiv:2402.17516  [pdf, other

    cs.LG cs.AI

    QUCE: The Minimisation and Quantification of Path-Based Uncertainty for Generative Counterfactual Explanations

    Authors: Jamie Duell, Monika Seisenberger, Hsuan Fu, Xiuyi Fan

    Abstract: Deep Neural Networks (DNNs) stand out as one of the most prominent approaches within the Machine Learning (ML) domain. The efficacy of DNNs has surged alongside recent increases in computational capacity, allowing these approaches to scale to significant complexities for addressing predictive challenges in big data. However, as the complexity of DNN models rises, interpretability diminishes. In re… ▽ More

    Submitted 29 April, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  36. arXiv:2402.15959  [pdf, other

    cs.CV

    Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks

    Authors: Zhiying Jiang, Xingyuan Li, Jinyuan Liu, Xin Fan, Risheng Liu

    Abstract: Image stitching seamlessly integrates images captured from varying perspectives into a single wide field-of-view image. Such integration not only broadens the captured scene but also augments holistic perception in computer vision applications. Given a pair of captured images, subtle perturbations and distortions which go unnoticed by the human visual system tend to attack the correspondence match… ▽ More

    Submitted 24 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI2024

  37. arXiv:2402.14601  [pdf, other

    cs.CY cs.AI cs.HC cs.LG

    Bringing Generative AI to Adaptive Learning in Education

    Authors: Hang Li, Tianlong Xu, Chaoli Zhang, Eason Chen, Jing Liang, Xing Fan, Haoyang Li, Jiliang Tang, Qingsong Wen

    Abstract: The recent surge in generative AI technologies, such as large language models and diffusion models, have boosted the development of AI applications in various domains, including science, finance, and education. Concurrently, adaptive learning, a concept that has gained substantial interest in the educational sphere, has proven its efficacy in enhancing students' learning efficiency. In this positi… ▽ More

    Submitted 22 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: 14 pages, 5 figures

  38. arXiv:2402.08303   

    cs.CL cs.AI cs.CE cs.HC cs.LG

    ChatCell: Facilitating Single-Cell Analysis with Natural Language

    Authors: Yin Fang, Kangwei Liu, Ningyu Zhang, Xinle Deng, Penghui Yang, Zhuo Chen, Xiangru Tang, Mark Gerstein, Xiaohui Fan, Huajun Chen

    Abstract: As Large Language Models (LLMs) rapidly evolve, their influence in science is becoming increasingly prominent. The emerging capabilities of LLMs in task generalization and free-form dialogue can significantly advance fields like chemistry and biology. However, the field of single-cell biology, which forms the foundational building blocks of living organisms, still faces several challenges. High kn… ▽ More

    Submitted 19 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: I have decided to temporarily withdraw this draft as I am in the process of making further revisions to improve its content. Code: https://github.com/zjunlp/ChatCell Dataset: https://huggingface.co/datasets/zjunlp/ChatCell-Instructions Demo: https://chat.openai.com/g/g-vUwj222gQ-chatcell

  39. arXiv:2402.05808  [pdf, other

    cs.AI cs.CL cs.LG

    Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

    Authors: Zhiheng Xi, Wenxiang Chen, Boyang Hong, Senjie Jin, Rui Zheng, Wei He, Yiwen Ding, Shichun Liu, Xin Guo, Junzhe Wang, Honglin Guo, Wei Shen, Xiaoran Fan, Yuhao Zhou, Shihan Dou, Xiao Wang, Xinbo Zhang, Peng Sun, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: In this paper, we propose R$^3$: Learning Reasoning through Reverse Curriculum Reinforcement Learning (RL), a novel method that employs only outcome supervision to achieve the benefits of process supervision for large language models. The core challenge in applying RL to complex reasoning is to identify a sequence of actions that result in positive rewards and provide appropriate supervision for o… ▽ More

    Submitted 17 March, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: Preprint. Codes released: https://github.com/WooooDyy/LLM-Reverse-Curriculum-RL

  40. arXiv:2402.01391  [pdf, other

    cs.SE cs.CL

    StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

    Authors: Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui

    Abstract: The advancement of large language models (LLMs) has significantly propelled the field of code generation. Previous work integrated reinforcement learning (RL) with compiler feedback for exploring the output space of LLMs to enhance code generation quality. However, the lengthy code generated by LLMs in response to complex human requirements makes RL exploration a challenge. Also, since the unit te… ▽ More

    Submitted 5 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: 13 pages, 5 figures

  41. arXiv:2401.17221  [pdf, other

    cs.CV cs.AI cs.CL cs.LG

    MouSi: Poly-Visual-Expert Vision-Language Models

    Authors: Xiaoran Fan, Tao Ji, Changhao Jiang, Shuo Li, Senjie Jin, Sirui Song, Junke Wang, Boyang Hong, Lu Chen, Guodong Zheng, Ming Zhang, Caishuang Huang, Rui Zheng, Zhiheng Xi, Yuhao Zhou, Shihan Dou, Junjie Ye, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang

    Abstract: Current large vision-language models (VLMs) often encounter challenges such as insufficient capabilities of a single visual component and excessively long visual tokens. These issues can limit the model's effectiveness in accurately interpreting complex visual information and over-lengthy contextual information. Addressing these challenges is crucial for enhancing the performance and applicability… ▽ More

    Submitted 30 January, 2024; originally announced January 2024.

  42. arXiv:2401.14656  [pdf, other

    cs.CL

    Scientific Large Language Models: A Survey on Biological & Chemical Domains

    Authors: Qiang Zhang, Keyang Ding, Tianwen Lyv, Xinda Wang, Qingyu Yin, Yiwen Zhang, Jing Yu, Yuhao Wang, Xiaotong Li, Zhuoyi Xiang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Mengyao Zhang, Jinlu Zhang, Jiyu Cui, Renjun Xu, Hongyang Chen, Xiaohui Fan, Huabin Xing, Huajun Chen

    Abstract: Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension, representing a significant stride toward artificial general intelligence. The application of LLMs extends beyond conventional linguistic boundaries, encompassing specialized linguistic systems developed within various scientific disciplines. This growing interest has led to the advent o… ▽ More

    Submitted 26 January, 2024; originally announced January 2024.

  43. arXiv:2401.11759  [pdf, other

    cs.DC

    Integrated Sensing, Communication, and Computing: An Information-oriented Resource Transaction Mechanism

    Authors: Ning Chen, Zhipeng Cheng, Xuwei Fan, Zhang Liu, Bangzhen Huang, Jie Yang, Yifeng Zhao, Lianfen Huang

    Abstract: Information acquisition from target perception represents the key enabling technology of the Internet of Automatic Vehicles (IoAV), which is essential for the decision-making and control operation of connected automatic vehicles (CAVs). Exploring target information involves multiple operations on data, e.g., wireless sensing (for data acquisition), communication (for data transmission), and comput… ▽ More

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 8 pages, 4 figures, 2 tables

  44. arXiv:2401.08326  [pdf, other

    cs.CL cs.AI

    RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning

    Authors: Junjie Ye, Yilong Wu, Songyang Gao, Caishuang Huang, Sixian Li, Guanyu Li, Xiaoran Fan, Qi Zhang, Tao Gui, Xuanjing Huang

    Abstract: Tool learning has generated widespread interest as a vital means of interaction between Large Language Models (LLMs) and the physical world. Current research predominantly emphasizes LLMs' capacity to utilize tools in well-structured environments while overlooking their stability when confronted with the inevitable noise of the real world. To bridge this gap, we introduce RoTBench, a multi-level b… ▽ More

    Submitted 19 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

  45. arXiv:2401.08123  [pdf, other

    cs.CV

    The Devil is in the Details: Boosting Guided Depth Super-Resolution via Rethinking Cross-Modal Alignment and Aggregation

    Authors: Xinni Jiang, Zengsheng Kuang, Chunle Guo, Ruixun Zhang, Lei Cai, Xiao Fan, Chongyi Li

    Abstract: Guided depth super-resolution (GDSR) involves restoring missing depth details using the high-resolution RGB image of the same scene. Previous approaches have struggled with the heterogeneity and complementarity of the multi-modal inputs, and neglected the issues of modal misalignment, geometrical misalignment, and feature selection. In this study, we rethink some essential components in GDSR netwo… ▽ More

    Submitted 16 January, 2024; originally announced January 2024.

  46. arXiv:2401.06080  [pdf, other

    cs.AI

    Secrets of RLHF in Large Language Models Part II: Reward Modeling

    Authors: Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang , et al. (2 additional authors not shown)

    Abstract: Reinforcement Learning from Human Feedback (RLHF) has become a crucial technology for aligning language models with human values and intentions, enabling models to produce more helpful and harmless responses. Reward models are trained as proxies for human preferences to drive reinforcement learning optimization. While reward models are often considered central to achieving high performance, they f… ▽ More

    Submitted 12 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

  47. arXiv:2401.05709  [pdf, other

    cs.NI eess.SP

    Probability-based Distance Estimation Model for 3D DV-Hop Localization in WSNs

    Authors: Penghong Wang, Hao Wang, Wenrui Li, Xiaopeng Fan, Debin Zhao

    Abstract: Localization is one of the pivotal issues in wireless sensor network applications. In 3D localization studies, most algorithms focus on enhancing the location prediction process, lacking theoretical derivation of the detection distance of an anchor node at the varying hops, engenders a localization performance bottleneck. To address this issue, we propose a probability-based average distance estim… ▽ More

    Submitted 11 January, 2024; originally announced January 2024.

  48. arXiv:2401.03489  [pdf, other

    cs.LG cs.AI cs.DC cs.MA

    Decentralized Federated Policy Gradient with Byzantine Fault-Tolerance and Provably Fast Convergence

    Authors: Philip Jordan, Florian Grötschla, Flint Xiaofeng Fan, Roger Wattenhofer

    Abstract: In Federated Reinforcement Learning (FRL), agents aim to collaboratively learn a common task, while each agent is acting in its local environment without exchanging raw trajectories. Existing approaches for FRL either (a) do not provide any fault-tolerance guarantees (against misbehaving agents), or (b) rely on a trusted central agent (a single point of failure) for aggregating updates. We provide… ▽ More

    Submitted 7 January, 2024; originally announced January 2024.

    Comments: Accepted at AAMAS'24

  49. arXiv:2401.02668  [pdf, other

    cs.DC cs.LG

    Towards Integrated Fine-tuning and Inference when Generative AI meets Edge Intelligence

    Authors: Ning Chen, Zhipeng Cheng, Xuwei Fan, Xiaoyu Xia, Lianfen Huang

    Abstract: The high-performance generative artificial intelligence (GAI) represents the latest evolution of computational intelligence, while the blessing of future 6G networks also makes edge intelligence (EI) full of development potential. The inevitable encounter between GAI and EI can unleash new opportunities, where GAI's pre-training based on massive computing resources and large-scale unlabeled corpor… ▽ More

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: 11 pages, 8 figures, and 5 tables

  50. arXiv:2401.02662  [pdf, other

    cs.NI eess.SP

    GainNet: Coordinates the Odd Couple of Generative AI and 6G Networks

    Authors: Ning Chen, Jie Yang, Zhipeng Cheng, Xuwei Fan, Zhang Liu, Bangzhen Huang, Yifeng Zhao, Lianfen Huang, Xiaojiang Du, Mohsen Guizani

    Abstract: The rapid expansion of AI-generated content (AIGC) reflects the iteration from assistive AI towards generative AI (GAI) with creativity. Meanwhile, the 6G networks will also evolve from the Internet-of-everything to the Internet-of-intelligence with hybrid heterogeneous network architectures. In the future, the interplay between GAI and the 6G will lead to new opportunities, where GAI can learn th… ▽ More

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: 10 pages, 5 figures, 1 table