Showing 1–50 of 214 results for author: Fang, X

Searching in archive cs.
  1. arXiv:2407.17745  [pdf, other]

    cs.CL

    Beyond Entity Alignment: Towards Complete Knowledge Graph Alignment via Entity-Relation Synergy

    Authors: Xiaohan Fang, Chaozhuo Li, Yi Zhao, Qian Zang, Litian Zhang, Jiquan Peng, Xi Zhang, Jibing Gong

    Abstract: Knowledge Graph Alignment (KGA) aims to integrate knowledge from multiple sources to address the limitations of individual Knowledge Graphs (KGs) in terms of coverage and depth. However, current KGA models fall short in achieving a "complete" knowledge graph alignment. Existing models primarily emphasize the linkage of cross-graph entities but overlook aligning relations across KGs, thereby prov…

    Submitted 24 July, 2024; originally announced July 2024.

  2. arXiv:2407.16842  [pdf, other]

    cs.RO

    Adapting Image-based RL Policies via Predicted Rewards

    Authors: Weiyao Wang, Xinyuan Fang, Gregory D. Hager

    Abstract: Image-based reinforcement learning (RL) faces significant challenges in generalization when the visual environment undergoes substantial changes between training and deployment. Under such circumstances, learned policies may not perform well, leading to degraded results. Previous approaches to this problem have largely focused on broadening the training observation distribution, employing technique…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: L4DC 2024

  3. arXiv:2407.16164  [pdf, other]

    cs.LG cs.AI cs.CR cs.CV

    Representation Magnitude has a Liability to Privacy Vulnerability

    Authors: Xingli Fang, Jung-Eun Kim

    Abstract: The privacy-preserving approaches to machine learning (ML) models have made substantial progress in recent years. However, it is still opaque in which circumstances and conditions the model becomes privacy-vulnerable, leading to a challenge for ML models to maintain both performance and privacy. In this paper, we first explore the disparity between member and non-member data in the representation…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: Accepted in the AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, 2024

  4. arXiv:2407.11691  [pdf, other]

    cs.CV

    VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models

    Authors: Haodong Duan, Junming Yang, Yuxuan Qiao, Xinyu Fang, Lin Chen, Yuan Liu, Xiaoyi Dong, Yuhang Zang, Pan Zhang, Jiaqi Wang, Dahua Lin, Kai Chen

    Abstract: We present VLMEvalKit: an open-source toolkit for evaluating large multi-modality models based on PyTorch. The toolkit aims to provide a user-friendly and comprehensive framework for researchers and developers to evaluate existing multi-modality models and publish reproducible evaluation results. In VLMEvalKit, we implement over 70 different large multi-modality models, including both proprietary…

    Submitted 16 July, 2024; originally announced July 2024.

  5. arXiv:2407.09274  [pdf, other]

    cs.LG cs.AI q-bio.BM

    Unifying Sequences, Structures, and Descriptions for Any-to-Any Protein Generation with the Large Multimodal Model HelixProtX

    Authors: Zhiyuan Chen, Tianhao Chen, Chenggang Xie, Yang Xue, Xiaonan Zhang, Jingbo Zhou, Xiaomin Fang

    Abstract: Proteins are fundamental components of biological systems and can be represented through various modalities, including sequences, structures, and textual descriptions. Despite the advances in deep learning and scientific large language models (LLMs) for protein research, current methodologies predominantly focus on limited specialized tasks -- often predicting one protein modality from another. Th…

    Submitted 12 July, 2024; originally announced July 2024.

  6. arXiv:2407.02783  [pdf, ps, other]

    cs.CL cs.AI

    52B to 1T: Lessons Learned via Tele-FLM Series

    Authors: Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang

    Abstract: Large Language Models (LLMs) represent a significant stride toward Artificial General Intelligence. As scaling laws underscore the potential of increasing model sizes, the academic community has intensified its investigations into LLMs with capacities exceeding 50 billion parameters. This technical report builds on our prior work with Tele-FLM (also known as FLM-2), a publicly available 52-billion…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: For the Tele-FLM-52B tech report, see also 2404.16645

  7. arXiv:2407.02052  [pdf, other]

    eess.AS cs.SD

    The USTC-NERCSLIP Systems for The ICMC-ASR Challenge

    Authors: Minghui Wu, Luzhen Xu, Jie Zhang, Haitao Tang, Yanyan Yue, Ruizhi Liao, Jintao Zhao, Zhengzhe Zhang, Yichi Wang, Haoyin Yan, Hongliang Yu, Tongle Ma, Jiachen Liu, Chongliang Wu, Yongchao Li, Yanyong Zhang, Xin Fang, Yue Zhang

    Abstract: This report describes the submitted system to the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) challenge, which considers the ASR task with multi-speaker overlapping and Mandarin accent dynamics in the ICMC case. We implement the front-end speaker diarization using the self-supervised learning representation based multi-speaker embedding and beamforming using the speaker position,…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted at ICASSP 2024

  8. arXiv:2406.16995  [pdf, other]

    q-bio.QM cs.AI

    A large language model for predicting T cell receptor-antigen binding specificity

    Authors: Xing Fang, Chenpeng Yu, Shiye Tian, Hui Liu

    Abstract: The human immune response depends on the binding of T-cell receptors (TCRs) to antigens (pTCR), which prompts T cells to eliminate viruses, tumor cells, and other pathogens. The ability of the human immune system to respond to unknown viruses and bacteria stems from TCR diversity. However, this vast diversity poses challenges for TCR-antigen binding prediction methods. In this study, we p…

    Submitted 24 June, 2024; originally announced June 2024.

  9. arXiv:2406.14544  [pdf, other]

    cs.CV cs.CL

    Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs

    Authors: Yuxuan Qiao, Haodong Duan, Xinyu Fang, Junming Yang, Lin Chen, Songyang Zhang, Jiaqi Wang, Dahua Lin, Kai Chen

    Abstract: Vision Language Models (VLMs) demonstrate remarkable proficiency in addressing a wide array of visual questions, which requires strong perception and reasoning faculties. Assessing these two competencies independently is crucial for model refinement, despite the inherent difficulty due to the intertwined nature of seeing and reasoning in existing VLMs. To tackle this issue, we present Prism, an in…

    Submitted 20 June, 2024; originally announced June 2024.

  10. arXiv:2406.14515  [pdf, other]

    cs.CV cs.MM

    MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding

    Authors: Xinyu Fang, Kangrui Mao, Haodong Duan, Xiangyu Zhao, Yining Li, Dahua Lin, Kai Chen

    Abstract: The advent of large vision-language models (LVLMs) has spurred research into their applications in multi-modal contexts, particularly in video understanding. Traditional VideoQA benchmarks, despite providing quantitative metrics, often fail to encompass the full spectrum of video content and inadequately assess models' temporal comprehension. To address these limitations, we introduce MMBench-Vide…

    Submitted 20 June, 2024; originally announced June 2024.

  11. arXiv:2406.10238  [pdf, other]

    cs.CL cs.LG cs.SI

    Early Detection of Misinformation for Infodemic Management: A Domain Adaptation Approach

    Authors: Minjia Mao, Xiaohang Zhao, Xiao Fang

    Abstract: An infodemic refers to an enormous amount of true information and misinformation disseminated during a disease outbreak. Detecting misinformation at the early stage of an infodemic is key to managing it and reducing its harm to public health. An early-stage infodemic is characterized by a large volume of unlabeled information concerning a disease. As a result, conventional misinformation detection met…

    Submitted 2 June, 2024; originally announced June 2024.

  12. arXiv:2406.03325  [pdf, other]

    physics.flu-dyn cs.CV

    EngineBench: Flow Reconstruction in the Transparent Combustion Chamber III Optical Engine

    Authors: Samuel J. Baker, Michael A. Hobley, Isabel Scherl, Xiaohang Fang, Felix C. P. Leach, Martin H. Davy

    Abstract: We present EngineBench, the first machine learning (ML) oriented database to use high quality experimental data for the study of turbulent flows inside combustion machinery. Prior datasets for ML in fluid mechanics are synthetic or use overly simplistic geometries. EngineBench is comprised of real-world particle image velocimetry (PIV) data that captures the turbulent airflow patterns in a special…

    Submitted 5 June, 2024; originally announced June 2024.

  13. arXiv:2405.17741  [pdf, other]

    cs.AI

    LoRA-Switch: Boosting the Efficiency of Dynamic LLM Adapters via System-Algorithm Co-design

    Authors: Rui Kong, Qiyang Li, Xinyu Fang, Qingtian Feng, Qingfeng He, Yazhu Dong, Weijun Wang, Yuanchun Li, Linghe Kong, Yunxin Liu

    Abstract: Recent literature has found that an effective method to customize or further improve large language models (LLMs) is to add dynamic adapters, such as low-rank adapters (LoRA) with Mixture-of-Experts (MoE) structures. Though such dynamic adapters incur modest computational complexity, they surprisingly lead to huge inference latency overhead, slowing down the decoding speed by 2.5+ times. In this p…

    Submitted 27 May, 2024; originally announced May 2024.

  14. arXiv:2405.14604  [pdf, other]

    cs.CL

    A Watermark for Low-entropy and Unbiased Generation in Large Language Models

    Authors: Minjia Mao, Dongjun Wei, Zeyu Chen, Xiao Fang, Michael Chau

    Abstract: Recent advancements in large language models (LLMs) have highlighted the risk of misuse, raising concerns about accurately detecting LLM-generated content. A viable solution for the detection problem is to inject imperceptible identifiers into LLMs, known as watermarks. Previous work demonstrates that unbiased watermarks ensure unforgeability and preserve text quality by maintaining the expectatio…

    Submitted 23 May, 2024; originally announced May 2024.

  15. arXiv:2405.06840  [pdf, other]

    cs.AR cs.SE

    MEIC: Re-thinking RTL Debug Automation using LLMs

    Authors: Ke Xu, Jialin Sun, Yuchen Hu, Xinwei Fang, Weiwei Shan, Xi Wang, Zhe Jiang

    Abstract: The deployment of Large Language Models (LLMs) for code debugging (e.g., C and Python) is widespread, benefiting from their ability to understand and interpret intricate concepts. However, in the semiconductor industry, utilising LLMs to debug Register Transfer Level (RTL) code is still insufficient, largely due to the underrepresentation of RTL-specific data in training sets. This work introduces…

    Submitted 10 May, 2024; originally announced May 2024.

  16. arXiv:2405.06653  [pdf, other]

    q-bio.BM cs.LG

    A unified cross-attention model for predicting antigen binding specificity to both HLA and TCR molecules

    Authors: Chenpeng Yu, Xing Fang, Hui Liu

    Abstract: Immune checkpoint inhibitors have demonstrated promising clinical efficacy across various tumor types, yet the percentage of patients who benefit from them remains low. The binding affinity between antigens and HLA-I/TCR molecules plays a critical role in antigen presentation and T-cell activation. Some computational methods have been developed to predict antigen-HLA or antigen-TCR binding spe…

    Submitted 8 April, 2024; originally announced May 2024.

  17. arXiv:2405.06068  [pdf, other]

    cs.LG eess.SP stat.AP stat.ML

    Deep Learning-Based Residual Useful Lifetime Prediction for Assets with Uncertain Failure Modes

    Authors: Yuqi Su, Xiaolei Fang

    Abstract: Industrial prognostics focuses on utilizing degradation signals to forecast and continually update the residual useful life of complex engineering systems. However, existing prognostic models for systems with multiple failure modes face several challenges in real-world applications, including overlapping degradation signals from multiple components, the presence of unlabeled historical data, and t…

    Submitted 9 May, 2024; originally announced May 2024.

  18. arXiv:2405.04840  [pdf, other]

    cs.IR

    Federated Adaptation for Foundation Model-based Recommendations

    Authors: Chunxu Zhang, Guodong Long, Hongkuan Guo, Xiao Fang, Yang Song, Zhaojie Liu, Guorui Zhou, Zijian Zhang, Yang Liu, Bo Yang

    Abstract: With the recent success of large language models, particularly foundation models with generalization abilities, applying foundation models for recommendations becomes a new paradigm to improve existing recommendation systems. It becomes a new open challenge to enable the foundation model to capture user preference changes in a timely manner with reasonable communication and computation costs while…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted as a regular paper of IJCAI'24

  19. arXiv:2404.17674  [pdf, other]

    cs.LG cs.AI cs.CR

    Center-Based Relaxed Learning Against Membership Inference Attacks

    Authors: Xingli Fang, Jung-Eun Kim

    Abstract: Membership inference attacks (MIAs) are currently considered one of the main privacy attack strategies, and their defense mechanisms have also been extensively explored. However, there is still a gap between the existing defense approaches and ideal models in performance and deployment costs. In particular, we observed that the privacy vulnerability of the model is closely correlated with the gap…

    Submitted 29 May, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

    Comments: Accepted in the Conference on Uncertainty in Artificial Intelligence (UAI) 2024, PMLR

  20. arXiv:2404.16687  [pdf, other]

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  21. arXiv:2404.16645  [pdf, other]

    cs.CL cs.AI

    Tele-FLM Technical Report

    Authors: Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang

    Abstract: Large language models (LLMs) have showcased profound capabilities in language understanding and generation, facilitating a wide array of applications. However, there is a notable paucity of detailed, open-sourced methodologies on efficiently scaling LLMs beyond 50 billion parameters with minimum trial-and-error cost and computational resources. In this report, we introduce Tele-FLM (aka FLM-2), a…

    Submitted 25 April, 2024; originally announced April 2024.

  22. arXiv:2404.15684  [pdf, other]

    cs.NI

    Generative Diffusion Model (GDM) for Optimization of Wi-Fi Networks

    Authors: Tie Liu, Xuming Fang, Rong He

    Abstract: Generative Diffusion Models (GDMs) have made significant strides in modeling complex data distributions across diverse domains. Meanwhile, Deep Reinforcement Learning (DRL) has demonstrated substantial improvements in optimizing Wi-Fi network performance. Wi-Fi optimization problems are highly challenging to model mathematically, and DRL methods can bypass complex mathematical modeling, while GDM…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: This paper has been submitted to GlobeCom 2024 and is currently under review

  23. arXiv:2404.13598  [pdf, other]

    cs.NI eess.SP

    An Integrated Communication and Computing Scheme for Wi-Fi Networks based on Generative AI and Reinforcement Learning

    Authors: Xinyang Du, Xuming Fang

    Abstract: The continuous evolution of future mobile communication systems is heading towards the integration of communication and computing, with Mobile Edge Computing (MEC) emerging as a crucial means of implementing Artificial Intelligence (AI) computation. MEC could enhance the computational performance of wireless edge networks by offloading computing-intensive tasks to MEC servers. However, in edge com…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: This paper has been submitted to GlobeCom 2024 and is currently under review

  24. arXiv:2404.13299  [pdf, other]

    cs.CV

    PCQA: A Strong Baseline for AIGC Quality Assessment Based on Prompt Condition

    Authors: Xi Fang, Weigang Wang, Xiaoxin Lv, Jun Yan

    Abstract: The development of Large Language Models (LLMs) and Diffusion Models has brought a boom in Artificial Intelligence Generated Content (AIGC). It is essential to build an effective quality assessment framework to provide a quantifiable evaluation of different images or videos based on the AIGC technologies. The content generated by AIGC methods is driven by the crafted prompts. Therefore, it is intuitiv…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: Published in CVPR-2024's NTIRE: New Trends in Image Restoration and Enhancement workshop and challenges

  25. arXiv:2404.10260  [pdf, other]

    q-bio.BM cs.AI

    HelixFold-Multimer: Elevating Protein Complex Structure Prediction to New Heights

    Authors: Xiaomin Fang, Jie Gao, Jing Hu, Lihang Liu, Yang Xue, Xiaonan Zhang, Kunrui Zhu

    Abstract: While monomer protein structure prediction tools boast impressive accuracy, the prediction of protein complex structures remains a daunting challenge in the field. This challenge is particularly pronounced in scenarios involving complexes with protein chains from different species, such as antigen-antibody interactions, where accuracy often falls short. Limited by the accuracy of complex predictio…

    Submitted 17 May, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  26. arXiv:2404.10211  [pdf, other]

    cs.LG cs.AI

    Anomaly Correction of Business Processes Using Transformer Autoencoder

    Authors: Ziyou Gong, Xianwen Fang, Ping Wu

    Abstract: An event log records all events that occur during the execution of business processes, so detecting and correcting anomalies in the event log can provide a reliable guarantee for subsequent process analysis. Previous work mainly includes next-event-prediction-based methods and autoencoder-based methods. These methods cannot accurately and efficiently detect anomalies and correct anomalies at the same t…

    Submitted 15 April, 2024; originally announced April 2024.

  27. arXiv:2404.08715  [pdf, other]

    stat.ML cs.CR cs.LG stat.AP

    Differentially Private Log-Location-Scale Regression Using Functional Mechanism

    Authors: Jiewen Sheng, Xiaolei Fang

    Abstract: This article introduces differentially private log-location-scale (DP-LLS) regression models, which incorporate differential privacy into LLS regression through the functional mechanism. The proposed models are established by injecting noise into the log-likelihood function of LLS regression for perturbed parameter estimation. We will derive the sensitivities utilized to determine the magnitude of…

    Submitted 12 April, 2024; originally announced April 2024.

  28. arXiv:2404.01638  [pdf, other]

    cs.NI

    Collaborative Optimization of Wireless Communication and Computing Resource Allocation based on Multi-Agent Federated Weighting Deep Reinforcement Learning

    Authors: Junjie Wu, Xuming Fang

    Abstract: As artificial intelligence (AI)-enabled wireless communication systems continue their evolution, distributed learning has gained widespread attention for its ability to offer enhanced data privacy protection, improved resource utilization, and enhanced fault tolerance within wireless communication applications. Federated learning further enhances the ability of resource coordination and model gene…

    Submitted 2 April, 2024; originally announced April 2024.

  29. arXiv:2403.13233  [pdf, other]

    cs.CL

    Technical Report: Competition Solution For BetterMixture

    Authors: Shuaijiang Zhao, Xiaoquan Fang

    Abstract: In the era of flourishing large-scale models, the challenge of selecting and optimizing datasets from the vast and complex sea of data, to enhance the performance of large language models within the constraints of limited computational resources, has become paramount. This paper details our solution for the BetterMixture challenge, which focuses on the fine-tuning data mixing for large language mo…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 6 pages

  30. arXiv:2403.11091  [pdf, other]

    cs.SD cs.CV eess.AS

    Multitask frame-level learning for few-shot sound event detection

    Authors: Liang Zou, Genwei Yan, Ruoyu Wang, Jun Du, Meng Lei, Tian Gao, Xin Fang

    Abstract: This paper focuses on few-shot Sound Event Detection (SED), which aims to automatically recognize and classify sound events with limited samples. However, prevailing methods in few-shot SED predominantly rely on segment-level predictions, which often fail to provide detailed, fine-grained predictions, particularly for events of brief duration. Although frame-level prediction strategies have been…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: 6 pages, 4 figures, conference

  31. arXiv:2403.08818  [pdf, other]

    cs.LG cs.AI cs.CL

    Multimodal Fusion of EHR in Structures and Semantics: Integrating Clinical Records and Notes with Hypergraph and LLM

    Authors: Hejie Cui, Xinyu Fang, Ran Xu, Xuan Kan, Joyce C. Ho, Carl Yang

    Abstract: Electronic Health Records (EHRs) have become increasingly popular to support clinical decision-making and healthcare in recent decades. EHRs usually contain heterogeneous information, such as structural data in tabular form and unstructured data in textual notes. Different types of information in EHRs can complement each other and provide a more complete picture of the health status of a patient.…

    Submitted 19 February, 2024; originally announced March 2024.

  32. arXiv:2403.01826  [pdf, other]

    cs.CE

    A Novel Shortest Path Query Algorithm Based on Optimized Adaptive Topology Structure

    Authors: Xiao Fang, Xuyang Song, Jiyuan Ma, Guanhua Liu, Shurong Pang, Wenbo Zhao, Cong Cao, Ling Fan

    Abstract: Urban rail transit is a fundamental component of public transportation; however, common station-based path search algorithms often overlook the impact of transfer times on search results, leading to decreased accuracy. To solve this problem, this paper proposes a novel shortest path query algorithm based on adaptive topology optimization called the Adaptive Topology Extension Road Network Struct…

    Submitted 4 March, 2024; originally announced March 2024.

  33. arXiv:2403.00642  [pdf, other]

    cs.LG cs.AI cs.CV

    Rethinking The Uniformity Metric in Self-Supervised Learning

    Authors: Xianghong Fang, Jian Li, Qiang Sun, Benyou Wang

    Abstract: Uniformity plays an important role in evaluating learned representations, providing insights into self-supervised learning. In our quest for effective uniformity metrics, we pinpoint four principled properties that such metrics should possess. Namely, an effective uniformity metric should remain invariant to instance permutations and sample replications while accurately capturing feature redundanc…

    Submitted 26 April, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Journal ref: ICLR 2024

  34. arXiv:2402.19001  [pdf, other]

    cs.CV

    Analysis of the Two-Step Heterogeneous Transfer Learning for Laryngeal Blood Vessel Classification: Issue and Improvement

    Authors: Xinyi Fang, Xu Yang, Chak Fong Chong, Kei Long Wong, Yapeng Wang, Tiankui Zhang, Sio-Kei Im

    Abstract: Accurate classification of laryngeal vascular as benign or malignant is crucial for early detection of laryngeal cancer. However, organizations with limited access to laryngeal vascular images face challenges due to the lack of large and homogeneous public datasets for effective learning. Distinguished from the most familiar works, which directly transfer the ImageNet pre-trained models to the tar…

    Submitted 14 April, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

  35. arXiv:2402.17944  [pdf, other]

    cs.CL

    Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey

    Authors: Xi Fang, Weijie Xu, Fiona Anting Tan, Jiani Zhang, Ziqing Hu, Yanjun Qi, Scott Nickleach, Diego Socolinsky, Srinivasan Sengamedu, Christos Faloutsos

    Abstract: Recent breakthroughs in large language modeling have facilitated rigorous exploration of their application in diverse tasks related to tabular data modeling, such as prediction, tabular data synthesis, question answering, and table understanding. Each task presents unique challenges and opportunities. However, there is currently a lack of a comprehensive review that summarizes and compares the key t…

    Submitted 21 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: 41 pages, 4 figures, 8 tables

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: TMLR 2024

  36. arXiv:2402.15852  [pdf, other]

    cs.CV cs.RO

    NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation

    Authors: Jiazhao Zhang, Kunyu Wang, Rongtao Xu, Gengze Zhou, Yicong Hong, Xiaomeng Fang, Qi Wu, Zhizheng Zhang, He Wang

    Abstract: Vision-and-language navigation (VLN) stands as a key research problem of Embodied AI, aiming at enabling agents to navigate in unseen environments following linguistic instructions. In this field, generalization is a long-standing challenge, either to out-of-distribution scenes or from Sim to Real. In this paper, we propose NaVid, a video-based large vision language model (VLM), to mitigate such a…

    Submitted 30 June, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

    Comments: Accepted by Robotics: Science and Systems (RSS 2024)

  37. arXiv:2402.12685  [pdf, other]

    cs.AI

    XRL-Bench: A Benchmark for Evaluating and Comparing Explainable Reinforcement Learning Techniques

    Authors: Yu Xiong, Zhipeng Hu, Ye Huang, Runze Wu, Kai Guan, Xingchen Fang, Ji Jiang, Tianze Zhou, Yujing Hu, Haoyu Liu, Tangjie Lyu, Changjie Fan

    Abstract: Reinforcement Learning (RL) has demonstrated substantial potential across diverse fields, yet understanding its decision-making process, especially in real-world scenarios where rationality and safety are paramount, is an ongoing challenge. This paper delves into Explainable RL (XRL), a subfield of Explainable AI (XAI) aimed at unravelling the complexities of RL models. Our focus rests on state-e…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 10 pages, 5 figures

  38. arXiv:2402.02361  [pdf, other]

    cs.LG

    Pruner: A Speculative Exploration Mechanism to Accelerate Tensor Program Tuning

    Authors: Liang Qiao, Jun Shi, Xiaoyu Hao, Xi Fang, Minfan Zhao, Ziqi Zhu, Junshi Chen, Hong An, Bing Li, Honghui Yuan, Xinyang Wang, Xulong Tang

    Abstract: Tensor program tuning is essential for the efficient deployment of deep neural networks. Search-based approaches have demonstrated scalability and effectiveness in automatically finding high-performance programs for specific hardware. However, the search process is often inefficient, taking hours or even days to discover optimal programs due to the exploration mechanisms guided by an accurate but…

    Submitted 29 June, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  39. arXiv:2402.01018  [pdf, other]

    cs.CL cs.AI

    HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent

    Authors: Weijie Xu, Zicheng Huang, Wenxiang Hu, Xi Fang, Rajesh Kumar Cherukuri, Naumaan Nayyar, Lorenzo Malandri, Srinivasan H. Sengamedu

    Abstract: Recent advancements in Large Language Models (LLMs) have been reshaping Natural Language Processing (NLP) tasks in several domains. Their use in the field of Human Resources (HR) still has room for expansion and could be beneficial for several time-consuming tasks. Examples such as time-off submissions, medical claims filing, and access requests are noteworthy, but they are by no means the sole in…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 13 pages, 9 figures

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: EACL 2024

  40. arXiv:2401.16991  [pdf, other]

    cs.CV

    Category-wise Fine-Tuning: Resisting Incorrect Pseudo-Labels in Multi-Label Image Classification with Partial Labels

    Authors: Chak Fong Chong, Xinyi Fang, Jielong Guo, Yapeng Wang, Wei Ke, Chan-Tong Lam, Sio-Kei Im

    Abstract: Large-scale image datasets are often partially labeled, where only a few categories' labels are known for each image. Assigning pseudo-labels to unknown labels to gain additional training signals has become prevalent for training deep classification models. However, some pseudo-labels are inevitably incorrect, leading to a notable decline in the model classification performance. In this paper, we…

    Submitted 30 January, 2024; originally announced January 2024.

  41. arXiv:2401.02118  [pdf, other]

    cs.IT eess.SP

    Radio Map-Based Spectrum Sharing for Joint Communication and Sensing

    Authors: Xionran Fang, Wei Feng, Yunfei Chen, Dingxi Yang, Ning Ge, Zhiyong Feng, Yue Gao

    Abstract: The sixth-generation (6G) network is expected to provide both communication and sensing (C&S) services. However, spectrum scarcity poses a major challenge to the harmonious coexistence of C&S systems. Without effective cooperation, the interference resulting from spectrum sharing impairs the performance of both systems. This paper addresses C&S interference within a distributed network. Different…

    Submitted 27 June, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

  42. arXiv:2312.16498  [pdf, other]

    cs.CV

    A Non-Uniform Low-Light Image Enhancement Method with Multi-Scale Attention Transformer and Luminance Consistency Loss

    Authors: Xiao Fang, Xin Gao, Baofeng Li, Feng Zhai, Yu Qin, Zhihang Meng, Jiansheng Lu, Chun Xiao

    Abstract: Low-light image enhancement aims to improve the perception of images collected in dim environments and provide high-quality data support for image recognition tasks. When dealing with photos captured under non-uniform illumination, existing methods cannot adaptively extract the differentiated luminance information, which will easily cause over-exposure and under-exposure. From the perspective of u…

    Submitted 27 December, 2023; originally announced December 2023.

  43. arXiv:2312.14862  [pdf, other]

    cs.CL cs.AI

    YAYI 2: Multilingual Open-Source Large Language Models

    Authors: Yin Luo, Qingchao Kong, Nan Xu, Jia Cao, Bao Hao, Baoyu Qu, Bo Chen, Chao Zhu, Chenyang Zhao, Donglei Zhang, Fan Feng, Feifei Zhao, Hailong Sun, Hanxuan Yang, Haojun Pan, Hongyu Liu, Jianbin Guo, Jiangtao Du, Jingyi Wang, Junfeng Li, Lei Sun, Liduo Liu, Lifeng Dong, Lili Liu, Lin Wang, et al. (28 additional authors not shown)

    Abstract: As the latest advancements in natural language processing, large language models (LLMs) have achieved human-level language understanding and generation abilities in many real-world tasks, and even have been regarded as a potential path to the artificial general intelligence. To better facilitate research on LLMs, many open-source LLMs, such as Llama 2 and Falcon, have recently been proposed and ga…

    Submitted 22 December, 2023; originally announced December 2023.

  44. Angle-Displacement Rigidity Theory with Application to Distributed Network Localization

    Authors: Xu Fang, Xiaolei Li, Lihua Xie

    Abstract: This paper investigates the localization problem of a network in 2-D and 3-D spaces given the positions of anchor nodes in a global frame and inter-node relative measurements in local coordinate frames. It is assumed that the local frames of different nodes have different unknown orientations. First, an angle-displacement rigidity theory is developed, which can be used to localize all the free nod…

    Submitted 19 December, 2023; originally announced December 2023.

  45. arXiv:2312.09576  [pdf, other]

    eess.IV cs.CV

    SegRap2023: A Benchmark of Organs-at-Risk and Gross Tumor Volume Segmentation for Radiotherapy Planning of Nasopharyngeal Carcinoma

    Authors: Xiangde Luo, Jia Fu, Yunxin Zhong, Shuolin Liu, Bing Han, Mehdi Astaraki, Simone Bendazzoli, Iuliana Toma-Dasu, Yiwen Ye, Ziyang Chen, Yong Xia, Yanzhou Su, Jin Ye, Junjun He, Zhaohu Xing, Hongqiu Wang, Lei Zhu, Kaixiang Yang, Xin Fang, Zhiwei Wang, Chan Woong Lee, Sang Joon Park, Jaehee Chun, Constantin Ulrich, Klaus H. Maier-Hein, et al. (17 additional authors not shown)

    Abstract: Radiation therapy is a primary and effective NasoPharyngeal Carcinoma (NPC) treatment strategy. The precise delineation of Gross Tumor Volumes (GTVs) and Organs-At-Risk (OARs) is crucial in radiation treatment, directly impacting patient prognosis. Previously, the delineation of GTVs and OARs was performed by experienced radiation oncologists. Recently, deep learning has achieved promising results…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: A challenge report of SegRap2023 (organized in conjunction with MICCAI2023)

  46. arXiv:2312.08532  [pdf, other]

    cs.LG

    Cooperative Learning for Cost-Adaptive Inference

    Authors: Xingli Fang, Richard Bradford, Jung-Eun Kim

    Abstract: We propose a cooperative training framework for deep neural network architectures that enables the runtime network depths to change to satisfy dynamic computing resource requirements. In our framework, the number of layers participating in computation can be chosen dynamically to meet performance-cost trade-offs at inference runtime. Our method trains two Teammate nets and a Leader net, and two se…

    Submitted 26 December, 2023; v1 submitted 13 December, 2023; originally announced December 2023.

  47. arXiv:2312.07207  [pdf]

    cs.CV stat.ML

    MCFNet: Multi-scale Covariance Feature Fusion Network for Real-time Semantic Segmentation

    Authors: Xiaojie Fang, Xingguo Song, Xiangyin Meng, Xu Fang, Sheng Jin

    Abstract: The low-level spatial detail information and high-level semantic abstract information are both essential to the semantic segmentation task. The features extracted by the deep network can obtain rich semantic information, while a lot of spatial information is lost. However, how to recover spatial detail information effectively and fuse it with high-level semantics has not been well addressed so far…

    Submitted 12 December, 2023; originally announced December 2023.

  48. arXiv:2312.06050  [pdf, other]

    cs.LG eess.IV stat.ML

    Federated Multilinear Principal Component Analysis with Applications in Prognostics

    Authors: Chengyu Zhou, Yuqi Su, Tangbin Xia, Xiaolei Fang

    Abstract: Multilinear Principal Component Analysis (MPCA) is a widely utilized method for the dimension reduction of tensor data. However, the integration of MPCA into federated learning remains unexplored in existing research. To tackle this gap, this article proposes a Federated Multilinear Principal Component Analysis (FMPCA) method, which enables multiple users to collaboratively reduce the dimension of…

    Submitted 28 April, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

  49. arXiv:2311.14689  [pdf]

    cs.CY

    Analyze Factors Influencing Drivers' Cell Phone Online Ride-hailing Software Using While driving: A Case Study in China

    Authors: Xiangnan Song, Xianghong Li, Kai Yin, Huimin Qi, Xufei Fang

    Abstract: The road safety of traffic is greatly affected by the driving performance of online ride-hailing, which has become an increasingly popular travel option for many people. Little attention has been paid to the fact that the use of cell phone online ride-hailing software by drivers to accept orders while driving is one of the causes of traffic accidents involving online ride-hailing. This paper, adop…

    Submitted 5 November, 2023; originally announced November 2023.

    Comments: 17 pages, 7 tables and 2 figures

    ACM Class: F.1.0

  50. arXiv:2311.13182  [pdf, other]

    cs.CV

    Differentiable Radio Frequency Ray Tracing for Millimeter-Wave Sensing

    Authors: Xingyu Chen, Xinyu Zhang, Qiyue Xia, Xinmin Fang, Chris Xiaoxuan Lu, Zhengxiong Li

    Abstract: Millimeter wave (mmWave) sensing is an emerging technology with applications in 3D object characterization and environment mapping. However, realizing precise 3D reconstruction from sparse mmWave signals remains challenging. Existing methods rely on data-driven learning, constrained by dataset availability and difficulty in generalization. We propose DiffSBR, a differentiable framework for mmWave-…

    Submitted 22 November, 2023; originally announced November 2023.