Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content

Showing 1–50 of 1,195 results for author: Ma, J

Searching in archive cs. Search in all archives.
.
  1. Causal Inference with Latent Variables: Recent Advances and Future Prospectives

    Authors: Yaochen Zhu, Yinhan He, Jing Ma, Mengxuan Hu, Sheng Li, Jundong Li

    Abstract: Causality lays the foundation for the trajectory of our world. Causal inference (CI), which aims to infer intrinsic causal relations among variables of interest, has emerged as a crucial research topic. Nevertheless, the lack of observation of important variables (e.g., confounders, mediators, exogenous variables, etc.) severely compromises the reliability of CI methods. The issue may arise from t… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD'24 Survey Track

  2. arXiv:2406.13153  [pdf, other

    cs.CV

    SwinStyleformer is a favorable choice for image inversion

    Authors: Jiawei Mao, Guangyi Zhao, Xuesong Yin, Yuanqi Chang

    Abstract: This paper proposes the first pure Transformer structure inversion network called SwinStyleformer, which can compensate for the shortcomings of the CNNs inversion framework by handling long-range dependencies and learning the global structure of objects. Experiments found that the inversion network with the Transformer backbone could not successfully invert the image. The above phenomena arise fro… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  3. arXiv:2406.12587  [pdf, other

    cs.CV

    Restorer: Solving Multiple Image Restoration Tasks with One Set of Parameters

    Authors: Jiawei Mao, Xuesong Yin, Yuanqi Chang

    Abstract: Although there are many excellent solutions in image restoration, the fact that they are specifically designed for a single image restoration task may prevent them from being state-of-the-art (SOTA) in other types of image restoration tasks. While some approaches require considering multiple image restoration tasks, they are still not sufficient for the requirements of the real world and may suffe… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  4. arXiv:2406.12119  [pdf

    cs.LG cs.AI cs.SI

    Deploying scalable traffic prediction models for efficient management in real-world large transportation networks during hurricane evacuations

    Authors: Qinhua Jiang, Brian Yueshuai He, Changju Lee, Jiaqi Ma

    Abstract: Accurate traffic prediction is vital for effective traffic management during hurricane evacuation. This paper proposes a predictive modeling system that integrates Multilayer Perceptron (MLP) and Long-Short Term Memory (LSTM) models to capture both long-term congestion patterns and short-term speed patterns. Leveraging various input variables, including archived traffic data, spatial-temporal road… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE ITS Magazine and currently under review

  5. arXiv:2406.11678  [pdf, other

    cs.IR cs.CL

    TourRank: Utilizing Large Language Models for Documents Ranking with a Tournament-Inspired Strategy

    Authors: Yiqun Chen, Qi Liu, Yi Zhang, Weiwei Sun, Daiting Shi, Jiaxin Mao, Dawei Yin

    Abstract: Large Language Models (LLMs) are increasingly employed in zero-shot documents ranking, yielding commendable results. However, several significant challenges still persist in LLMs for ranking: (1) LLMs are constrained by limited input length, precluding them from processing a large number of documents simultaneously; (2) The output document sequence is influenced by the input order of documents, re… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  6. arXiv:2406.11668  [pdf, other

    cs.CL

    "Not Aligned" is Not "Malicious": Being Careful about Hallucinations of Large Language Models' Jailbreak

    Authors: Lingrui Mei, Shenghua Liu, Yiwei Wang, Baolong Bi, Jiayi Mao, Xueqi Cheng

    Abstract: "Jailbreak" is a major safety concern of Large Language Models (LLMs), which occurs when malicious prompts lead LLMs to produce harmful outputs, raising issues about the reliability and safety of LLMs. Therefore, an effective evaluation of jailbreaks is very crucial to develop its mitigation strategies. However, our research reveals that many jailbreaks identified by current evaluations may actual… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  7. arXiv:2406.11519  [pdf, other

    cs.CV eess.IV

    HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

    Authors: Di Wang, Meiqi Hu, Yao Jin, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, Chuan Fu, Hongruixuan Chen, Chengxi Han, Naoto Yokoya, Jing Zhang, Minqiang Xu, Lin Liu, Lefei Zhang, Chen Wu, Bo Du, Dacheng Tao, Liangpei Zhang

    Abstract: Foundation models (FMs) are revolutionizing the analysis and understanding of remote sensing (RS) scenes, including aerial RGB, multispectral, and SAR images. However, hyperspectral images (HSIs), which are rich in spectral information, have not seen much application of FMs, with existing methods often restricted to specific tasks and lacking generality. To fill this gap, we introduce HyperSIGMA,… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: The code and models will be released at https://github.com/WHU-Sigma/HyperSIGMA

  8. arXiv:2406.11288  [pdf, other

    cs.CL cs.CV

    MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models

    Authors: Shengkang Wang, Hongzhan Lin, Ziyang Luo, Zhen Ye, Guang Chen, Jing Ma

    Abstract: Large vision-language models (LVLMs) have significantly improved multimodal reasoning tasks, such as visual question answering and image captioning. These models embed multimodal facts within their parameters, rather than relying on external knowledge bases to store factual information explicitly. However, the content discerned by LVLMs may deviate from actual facts due to inherent bias or incorre… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 22 pages, 8 figures

  9. arXiv:2406.11206  [pdf, other

    cs.LG cs.CR stat.ML

    Retraining with Predicted Hard Labels Provably Increases Model Accuracy

    Authors: Rudrajit Das, Inderjit S. Dhillon, Alessandro Epasto, Adel Javanmard, Jieming Mao, Vahab Mirrokni, Sujay Sanghavi, Peilin Zhong

    Abstract: The performance of a model trained with \textit{noisy labels} is often improved by simply \textit{retraining} the model with its own predicted \textit{hard} labels (i.e., $1$/$0$ labels). Yet, a detailed theoretical characterization of this phenomenon is lacking. In this paper, we theoretically analyze retraining in a linearly separable setting with randomly corrupted labels given to us and prove… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  10. arXiv:2406.11179  [pdf, other

    cs.LG cs.AI

    Learning Iterative Reasoning through Energy Diffusion

    Authors: Yilun Du, Jiayuan Mao, Joshua B. Tenenbaum

    Abstract: We introduce iterative reasoning through energy diffusion (IRED), a novel framework for learning to reason for a variety of tasks by formulating reasoning and decision-making problems with energy-based optimization. IRED learns energy functions to represent the constraints between input conditions and desired outputs. After training, IRED adapts the number of optimization steps during inference ba… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: ICML 2024, website: https://energy-based-model.github.io/ired/

  11. arXiv:2406.10959  [pdf, ps, other

    math.OC cs.LG

    On Convergence and Rate of Convergence of Policy Improvement Algorithms

    Authors: Jin Ma, Gaozhan Wang, Jianfeng Zhang

    Abstract: In this paper we provide a simple proof from scratch for the convergence of Policy Improvement Algorithm (PIA) for a continuous time entropy-regularized stochastic control problem. Such convergence has been established by Huang-Wang-Zhou(2023) by using sophisticated PDE estimates for the iterative PDEs involved in the PIA. Our approach builds on some Feynman-Kac type probabilistic representation f… ▽ More

    Submitted 20 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: In this version we added a convergence result for the 1-dimensional model with diffusion controls

  12. arXiv:2406.10724  [pdf, other

    eess.IV cs.CV cs.LG

    Beyond the Visible: Jointly Attending to Spectral and Spatial Dimensions with HSI-Diffusion for the FINCH Spacecraft

    Authors: Ian Vyse, Rishit Dagli, Dav Vrat Chadha, John P. Ma, Hector Chen, Isha Ruparelia, Prithvi Seran, Matthew Xie, Eesa Aamer, Aidan Armstrong, Naveen Black, Ben Borstein, Kevin Caldwell, Orrin Dahanaggamaarachchi, Joe Dai, Abeer Fatima, Stephanie Lu, Maxime Michet, Anoushka Paul, Carrie Ann Po, Shivesh Prakash, Noa Prosser, Riddhiman Roy, Mirai Shinjo, Iliya Shofman , et al. (4 additional authors not shown)

    Abstract: Satellite remote sensing missions have gained popularity over the past fifteen years due to their ability to cover large swaths of land at regular intervals, making them ideal for monitoring environmental trends. The FINCH mission, a 3U+ CubeSat equipped with a hyperspectral camera, aims to monitor crop residue cover in agricultural fields. Although hyperspectral imaging captures both spectral and… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: To appear in 38th Annual Small Satellite Conference

  13. arXiv:2406.10047  [pdf, ps, other

    math.CO cs.IT

    On automorphism groups of polar codes

    Authors: Jicheng Ma, Guiying Yan

    Abstract: Over the past years, Polar codes have arisen as a highly effective class of linear codes, equipped with a decoding algorithm of low computational complexity. This family of codes share a common algebraic formalism with the well-known Reed-Muller codes, which involves monomial evaluations. As useful algebraic codes, more specifically known as decreasing monomial codes, a lot of decoding work has be… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 14 pages

    MSC Class: 05E18; 20B25; 94B05; 94B35

  14. arXiv:2406.09046  [pdf, other

    cs.LG cs.DB

    ExioML: Eco-economic dataset for Machine Learning in Global Sectoral Sustainability

    Authors: Yanming Guo, Jin Ma

    Abstract: The Environmental Extended Multi-Regional Input-Output analysis is the predominant framework in Ecological Economics for assessing the environmental impact of economic activities. This paper introduces ExioML, the first Machine Learning benchmark dataset designed for sustainability analysis, aimed at lowering barriers and fostering collaboration between Machine Learning and Ecological Economics re… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  15. arXiv:2406.08757  [pdf, other

    cs.CL cs.AI

    SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding

    Authors: Jiefeng Ma, Yan Wang, Chenyu Liu, Jun Du, Yu Hu, Zhenrong Zhang, Pengfei Hu, Qing Wang, Jianshu Zhang

    Abstract: Accurately identifying and organizing textual content is crucial for the automation of document processing in the field of form understanding. Existing datasets, such as FUNSD and XFUND, support entity classification and relationship prediction tasks but are typically limited to local and entity-level annotations. This limitation overlooks the hierarchically structured representation of documents,… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: NeurIPS 2024 Track on Datasets and Benchmarks under review

  16. arXiv:2406.07275  [pdf, other

    cs.AI

    DCA-Bench: A Benchmark for Dataset Curation Agents

    Authors: Benhao Huang, Yingzhuo Yu, Jin Huang, Xingjian Zhang, Jiaqi Ma

    Abstract: The quality of datasets plays an increasingly crucial role in the research and development of modern artificial intelligence (AI). Despite the proliferation of open dataset platforms nowadays, data quality issues, such as insufficient documentation, inaccurate annotations, and ethical concerns, remain common in datasets widely used in AI. Furthermore, these issues are often subtle and difficult to… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  17. arXiv:2406.06999  [pdf, other

    cs.CV

    Teaching with Uncertainty: Unleashing the Potential of Knowledge Distillation in Object Detection

    Authors: Junfei Yi, Jianxu Mao, Tengfei Liu, Mingjie Li, Hanyu Gu, Hui Zhang, Xiaojun Chang, Yaonan Wang

    Abstract: Knowledge distillation (KD) is a widely adopted and effective method for compressing models in object detection tasks. Particularly, feature-based distillation methods have shown remarkable performance. Existing approaches often ignore the uncertainty in the teacher model's knowledge, which stems from data noise and imperfect training. This limits the student model's ability to learn latent knowle… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  18. arXiv:2406.06357  [pdf, other

    cs.CL cs.AI

    MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows

    Authors: Xingjian Zhang, Yutong Xie, Jin Huang, Jinge Ma, Zhaoying Pan, Qijia Liu, Ziyang Xiong, Tolga Ergen, Dongsub Shim, Honglak Lee, Qiaozhu Mei

    Abstract: Scientific innovation relies on detailed workflows, which include critical steps such as analyzing literature, generating ideas, validating these ideas, interpreting results, and inspiring follow-up research. However, scientific publications that document these workflows are extensive and unstructured. This makes it difficult for both human researchers and AI systems to effectively navigate and ex… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:1706.03762 by other authors

  19. arXiv:2406.05216  [pdf, other

    cs.LG

    TabPFGen -- Tabular Data Generation with TabPFN

    Authors: Junwei Ma, Apoorv Dankar, George Stein, Guangwei Yu, Anthony Caterini

    Abstract: Advances in deep generative modelling have not translated well to tabular data. We argue that this is caused by a mismatch in structure between popular generative models and discriminative models of tabular data. We thus devise a technique to turn TabPFN -- a highly performant transformer initially designed for in-context discriminative tabular tasks -- into an energy-based generative model, which… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

  20. arXiv:2406.05207  [pdf, other

    cs.LG

    Retrieval & Fine-Tuning for In-Context Tabular Models

    Authors: Valentin Thomas, Junwei Ma, Rasa Hosseinzadeh, Keyvan Golestan, Guangwei Yu, Maksims Volkovs, Anthony Caterini

    Abstract: Tabular data is a pervasive modality spanning a wide range of domains, and the inherent diversity poses a considerable challenge for deep learning. Recent advancements using transformer-based in-context learning have shown promise on smaller and less complex datasets, but have struggled to scale to larger and more complex ones. To address this limitation, we propose a combination of retrieval and… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

  21. arXiv:2406.04628  [pdf, other

    cs.CE q-bio.QM

    Projecting Molecules into Synthesizable Chemical Spaces

    Authors: Shitong Luo, Wenhao Gao, Zuofan Wu, Jian Peng, Connor W. Coley, Jianzhu Ma

    Abstract: Discovering new drug molecules is a pivotal yet challenging process due to the near-infinitely large chemical space and notorious demands on time and resources. Numerous generative models have recently been introduced to accelerate the drug discovery process, but their progression to experimental validation remains limited, largely due to a lack of consideration for synthetic accessibility in prac… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

  22. arXiv:2406.04426  [pdf, other

    cs.CV cs.AI cs.LG cs.RO

    DeTra: A Unified Model for Object Detection and Trajectory Forecasting

    Authors: Sergio Casas, Ben Agro, Jiageng Mao, Thomas Gilles, Alexander Cui, Thomas Li, Raquel Urtasun

    Abstract: The tasks of object detection and trajectory forecasting play a crucial role in understanding the scene for autonomous driving. These tasks are typically executed in a cascading manner, making them prone to compounding errors. Furthermore, there is usually a very thin interface between the two tasks, creating a lossy information bottleneck. To address these challenges, our approach formulates the… ▽ More

    Submitted 13 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  23. arXiv:2406.02877  [pdf, other

    cs.LG cs.DC

    FedStaleWeight: Buffered Asynchronous Federated Learning with Fair Aggregation via Staleness Reweighting

    Authors: Jeffrey Ma, Alan Tu, Yiling Chen, Vijay Janapa Reddi

    Abstract: Federated Learning (FL) endeavors to harness decentralized data while preserving privacy, facing challenges of performance, scalability, and collaboration. Asynchronous Federated Learning (AFL) methods have emerged as promising alternatives to their synchronous counterparts bounded by the slowest agent, yet they add additional challenges in convergence guarantees, fairness with respect to compute… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  24. arXiv:2406.02143  [pdf, other

    cs.CL

    Reinforcement Tuning for Detecting Stances and Debunking Rumors Jointly with Large Language Models

    Authors: Ruichao Yang, Wei Gao, Jing Ma, Hongzhan Lin, Bo Wang

    Abstract: Learning multi-task models for jointly detecting stance and verifying rumors poses challenges due to the need for training data of stance at post level and rumor veracity at claim level, which are difficult to obtain. To address this issue, we leverage large language models (LLMs) as the foundation annotators for the joint stance detection (SD) and rumor verification (RV) tasks, dubbed as JSDRV. W… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: ACL 2024 (Findings)

  25. arXiv:2406.01967  [pdf, other

    cs.RO cs.AI cs.LG

    DrEureka: Language Model Guided Sim-To-Real Transfer

    Authors: Yecheng Jason Ma, William Liang, Hung-Ju Wang, Sam Wang, Yuke Zhu, Linxi Fan, Osbert Bastani, Dinesh Jayaraman

    Abstract: Transferring policies learned in simulation to the real world is a promising strategy for acquiring robot skills at scale. However, sim-to-real approaches typically rely on manual design and tuning of the task reward function as well as the simulation physics parameters, rendering the process slow and human-labor intensive. In this paper, we investigate using Large Language Models (LLMs) to automa… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Robotics: Science and Systems (RSS) 2024. Project website and open-source code: https://eureka-research.github.io/dr-eureka/

  26. arXiv:2406.01138  [pdf, ps, other

    eess.SP cs.IT

    Precise Analysis of Covariance Identifiability for Activity Detection in Grant-Free Random Access

    Authors: Shengsong Luo, Junjie Ma, Chongbin Xu, Xin Wang

    Abstract: We consider the identifiability issue of maximum likelihood based activity detection in massive MIMO based grant-free random access. A prior work by Chen et al. indicates that the identifiability undergoes a phase transition for commonly-used random signatures. In this paper, we provide an analytical characterization of the boundary of the phase transition curve. Our theoretical results agree well… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  27. arXiv:2406.01027  [pdf, other

    cs.DB cs.LG

    PRICE: A Pretrained Model for Cross-Database Cardinality Estimation

    Authors: Tianjing Zeng, Junwei Lan, Jiahong Ma, Wenqing Wei, Rong Zhu, Pengfei Li, Bolin Ding, Defu Lian, Zhewei Wei, Jingren Zhou

    Abstract: Cardinality estimation (CardEst) is essential for optimizing query execution plans. Recent ML-based CardEst methods achieve high accuracy but face deployment challenges due to high preparation costs and lack of transferability across databases. In this paper, we propose PRICE, a PRetrained multI-table CardEst model, which addresses these limitations. PRICE takes low-level but transferable features… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  28. arXiv:2406.00816  [pdf, other

    cs.LG cs.CR cs.CV

    Invisible Backdoor Attacks on Diffusion Models

    Authors: Sen Li, Junchi Ma, Minhao Cheng

    Abstract: In recent years, diffusion models have achieved remarkable success in the realm of high-quality image generation, garnering increased attention. This surge in interest is paralleled by a growing concern over the security threats associated with diffusion models, largely attributed to their susceptibility to malicious exploitation. Notably, recent research has brought to light the vulnerability of… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: Code: https://github.com/invisibleTriggerDiffusion/invisible_triggers_for_diffusion

  29. arXiv:2406.00735  [pdf, other

    q-bio.BM cs.AI cs.LG

    Full-Atom Peptide Design based on Multi-modal Flow Matching

    Authors: Jiahan Li, Chaoran Cheng, Zuofan Wu, Ruihan Guo, Shitong Luo, Zhizhou Ren, Jian Peng, Jianzhu Ma

    Abstract: Peptides, short chains of amino acid residues, play a vital role in numerous biological processes by interacting with other target molecules, offering substantial potential in drug discovery. In this work, we present PepFlow, the first multi-modal deep generative model grounded in the flow-matching framework for the design of full-atom peptides that target specific protein receptors. Drawing inspi… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  30. arXiv:2406.00480  [pdf, other

    cs.CV

    AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning

    Authors: Duojun Huang, Xinyu Xiong, Jie Ma, Jichang Li, Zequn Jie, Lin Ma, Guanbin Li

    Abstract: Powered by massive curated training data, Segment Anything Model (SAM) has demonstrated its impressive generalization capabilities in open-world scenarios with the guidance of prompts. However, the vanilla SAM is class agnostic and heavily relies on user-provided prompts to segment objects of interest. Adapting this method to diverse tasks is crucial for accurate target identification and to avoid… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: CVPR2024

  31. arXiv:2406.00037  [pdf, other

    cs.CL cs.AI

    Aligning LLMs through Multi-perspective User Preference Ranking-based Feedback for Programming Question Answering

    Authors: Hongyu Yang, Liyang He, Min Hou, Shuanghong Shen, Rui Li, Jiahui Hou, Jianhui Ma, Junda Zhao

    Abstract: Code Community Question Answering (CCQA) seeks to tackle programming-related issues, thereby boosting productivity in both software engineering and academic research. Recent advancements in Reinforcement Learning from Human Feedback (RLHF) have transformed the fine-tuning process of Large Language Models (LLMs) to produce responses that closely mimic human behavior. Leveraging LLMs with RLHF for p… ▽ More

    Submitted 27 May, 2024; originally announced June 2024.

  32. arXiv:2406.00017  [pdf, other

    cs.CL cs.AI cs.MM

    PTA: Enhancing Multimodal Sentiment Analysis through Pipelined Prediction and Translation-based Alignment

    Authors: Shezheng Song, Shasha Li, Shan Zhao, Chengyu Wang, Xiaopeng Li, Jie Yu, Qian Wan, Jun Ma, Tianwei Yan, Wentao Ma, Xiaoguang Mao

    Abstract: Multimodal aspect-based sentiment analysis (MABSA) aims to understand opinions in a granular manner, advancing human-computer interaction and other fields. Traditionally, MABSA methods use a joint prediction approach to identify aspects and sentiments simultaneously. However, we argue that joint models are not always superior. Our analysis shows that joint models struggle to align relevant text to… ▽ More

    Submitted 13 June, 2024; v1 submitted 22 May, 2024; originally announced June 2024.

    Comments: Code will be released upon publication

  33. arXiv:2405.21045  [pdf

    cs.LG

    An Attention-Based Multi-Context Convolutional Encoder-Decoder Neural Network for Work Zone Traffic Impact Prediction

    Authors: Qinhua Jiang, Xishun Liao, Yaofa Gong, Jiaqi Ma

    Abstract: Work zone is one of the major causes of non-recurrent traffic congestion and road incidents. Despite the significance of its impact, studies on predicting the traffic impact of work zones remain scarce. In this paper, we propose a data integration pipeline that enhances the utilization of work zone and traffic data from diversified platforms, and introduce a novel deep learning model to predict th… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

  34. arXiv:2405.19524  [pdf, other

    cs.CR cs.AI

    AI Risk Management Should Incorporate Both Safety and Security

    Authors: Xiangyu Qi, Yangsibo Huang, Yi Zeng, Edoardo Debenedetti, Jonas Geiping, Luxi He, Kaixuan Huang, Udari Madhushani, Vikash Sehwag, Weijia Shi, Boyi Wei, Tinghao Xie, Danqi Chen, Pin-Yu Chen, Jeffrey Ding, Ruoxi Jia, Jiaqi Ma, Arvind Narayanan, Weijie J Su, Mengdi Wang, Chaowei Xiao, Bo Li, Dawn Song, Peter Henderson, Prateek Mittal

    Abstract: The exposure of security vulnerabilities in safety-aligned language models, e.g., susceptibility to adversarial attacks, has shed light on the intricate interplay between AI safety and AI security. Although the two disciplines now come together under the overarching goal of AI risk management, they have historically evolved separately, giving rise to differing perspectives. Therefore, in this pape… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  35. arXiv:2405.19338  [pdf, other

    eess.SP cs.AI cs.CV

    Accurate Patient Alignment without Unnecessary Imaging Dose via Synthesizing Patient-specific 3D CT Images from 2D kV Images

    Authors: Yuzhen Ding, Jason M. Holmes, Hongying Feng, Baoxin Li, Lisa A. McGee, Jean-Claude M. Rwigema, Sujay A. Vora, Daniel J. Ma, Robert L. Foote, Samir H. Patel, Wei Liu

    Abstract: In radiotherapy, 2D orthogonally projected kV images are used for patient alignment when 3D-on-board imaging(OBI) unavailable. But tumor visibility is constrained due to the projection of patient's anatomy onto a 2D plane, potentially leading to substantial setup errors. In treatment room with 3D-OBI such as cone beam CT(CBCT), the field of view(FOV) of CBCT is limited with unnecessarily high imag… ▽ More

    Submitted 1 April, 2024; originally announced May 2024.

    Comments: 17 pages, 8 figures and tables

  36. arXiv:2405.19088  [pdf, other

    cs.CL cs.CV

    Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions

    Authors: Zhe Hu, Tuo Liang, Jing Li, Yiren Lu, Yunlai Zhou, Yiran Qiao, Jing Ma, Yu Yin

    Abstract: Recent advancements in large multimodal language models have demonstrated remarkable proficiency across a wide range of tasks. Yet, these models still struggle with understanding the nuances of human humor through juxtaposition, particularly when it involves nonlinear narratives that underpin many jokes and humor cues. This paper investigates this challenge by focusing on comics with contradictory… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  37. arXiv:2405.18458  [pdf

    cs.LG physics.optics

    Asymmetrical estimator for training grey-box deep photonic neural networks

    Authors: Yizhi Wang, Minjia Chen, Chunhui Yao, Jie Ma, Ting Yan, Richard Penty, Qixiang Cheng

    Abstract: Physical neural networks (PNNs) are emerging paradigms for neural network acceleration due to their high-bandwidth, in-propagation analogue processing. Despite the advantages of PNN for inference, training remains a challenge. The imperfect information of the physical transformation means the failure of conventional gradient-based updates from backpropagation (BP). Here, we present the asymmetrica… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 17 pages, 5 figures

    MSC Class: 78-05

  38. arXiv:2405.18435  [pdf, other

    eess.IV cs.CV

    QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge

    Authors: Hongwei Bran, Fernando Navarro, Ivan Ezhov, Amirhossein Bayat, Dhritiman Das, Florian Kofler, Suprosanna Shit, Diana Waldmannstetter, Johannes C. Paetzold, Xiaobin Hu, Benedikt Wiestler, Lucas Zimmer, Tamaz Amiranashvili, Chinmay Prabhakar, Christoph Berger, Jonas Weidner, Michelle Alonso-Basant, Arif Rashid, Ujjwal Baid, Wesam Adel, Deniz Ali, Bhakti Baheti, Yingbin Bai, Ishaan Bhatt, Sabri Can Cetindag , et al. (55 additional authors not shown)

    Abstract: Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the de… ▽ More

    Submitted 19 March, 2024; originally announced May 2024.

    Comments: initial technical report

  39. arXiv:2405.18405  [pdf, other

    cs.CV cs.AI

    WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization

    Authors: Jiawei Ma, Yulei Niu, Shiyuan Huang, Guangxing Han, Shih-Fu Chang

    Abstract: Language has been useful in extending the vision encoder to data from diverse distributions without empirical discovery in training domains. However, as the image description is mostly at coarse-grained level and ignores visual details, the resulted embeddings are still ineffective in overcoming complexity of domains at inference time. We present a self-supervision framework WIDIn, Wording Images… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  40. arXiv:2405.18255  [pdf, other

    cs.CR cs.SI eess.SP

    Channel Reciprocity Based Attack Detection for Securing UWB Ranging by Autoencoder

    Authors: Wenlong Gou, Chuanhang Yu, Juntao Ma, Gang Wu, Vladimir Mordachev

    Abstract: A variety of ranging threats represented by Ghost Peak attack have raised concerns regarding the security performance of Ultra-Wide Band (UWB) systems with the finalization of the IEEE 802.15.4z standard. Based on channel reciprocity, this paper proposes a low complexity attack detection scheme that compares Channel Impulse Response (CIR) features of both ranging sides utilizing an autoencoder wit… ▽ More

    Submitted 10 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    ACM Class: H.1.1

  41. arXiv:2405.18081  [pdf, other

    math.ST cs.IT math.PR stat.ML

    Optimality of Approximate Message Passing Algorithms for Spiked Matrix Models with Rotationally Invariant Noise

    Authors: Rishabh Dudeja, Songbin Liu, Junjie Ma

    Abstract: We study the problem of estimating a rank one signal matrix from an observed matrix generated by corrupting the signal with additive rotationally invariant noise. We develop a new class of approximate message-passing algorithms for this problem and provide a simple and concise characterization of their dynamics in the high-dimensional limit. At each iteration, these algorithms exploit prior knowle… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  42. arXiv:2405.17976  [pdf

    cs.AI cs.CL

    Yuan 2.0-M32: Mixture of Experts with Attention Router

    Authors: Shaohua Wu, Jiangang Luo, Xi Chen, Lingjun Li, Xudong Zhao, Tong Yu, Chao Wang, Yue Wang, Fei Wang, Weixu Qiao, Houbo He, Zeru Zhang, Zeyu Sun, Junxiong Mao, Chong Shen

    Abstract: Yuan 2.0-M32, with a similar base architecture as Yuan-2.0 2B, uses a mixture-of-experts architecture with 32 experts of which 2 experts are active. A new router network, Attention Router, is proposed and adopted for a more efficient selection of experts, which improves the accuracy compared to the model with classical router network. Yuan 2.0-M32 is trained with 2000B tokens from scratch, and the… ▽ More

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: 14 pages,3 figures, 7 tables

  43. arXiv:2405.17468  [pdf, other

    cs.LG cs.AI

    Deep Activity Model: A Generative Approach for Human Mobility Pattern Synthesis

    Authors: Xishun Liao, Brian Yueshuai He, Qinhua Jiang, Chenchen Kuai, Jiaqi Ma

    Abstract: Human mobility significantly impacts various aspects of society, including transportation, urban planning, and public health. The increasing availability of diverse mobility data and advancements in deep learning have revolutionized mobility modeling. Existing deep learning models, however, mainly study spatio-temporal patterns using trajectories and often fall short in capturing the underlying se… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

  44. arXiv:2405.17293  [pdf, other

    cs.LG cs.AI

    Efficient Ensembles Improve Training Data Attribution

    Authors: Junwei Deng, Ting-Wei Li, Shichang Zhang, Jiaqi Ma

    Abstract: Training data attribution (TDA) methods aim to quantify the influence of individual training data points on the model predictions, with broad applications in data-centric AI, such as mislabel detection, data selection, and copyright compensation. However, existing methods in this field, which can be categorized as retraining-based and gradient-based, have struggled with the trade-off between compu… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  45. arXiv:2405.17132  [pdf, other

    cs.LG

    Your decision path does matter in pre-training industrial recommenders with multi-source behaviors

    Authors: Chunjing Gan, Binbin Hu, Bo Huang, Ziqi Liu, Jian Ma, Zhiqiang Zhang, Wenliang Zhong, Jun Zhou

    Abstract: Online service platforms offering a wide range of services through miniapps have become crucial for users who visit these platforms with clear intentions to find services they are interested in. Aiming at effective content delivery, cross-domain recommendation are introduced to learn high-quality representations by transferring behaviors from data-rich scenarios. However, these methods overlook th… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  46. arXiv:2405.16930  [pdf, other

    cs.CV

    From Obstacle to Opportunity: Enhancing Semi-supervised Learning with Synthetic Data

    Authors: Zerun Wang, Jiafeng Mao, Liuyu Xiang, Toshihiko Yamasaki

    Abstract: Semi-supervised learning (SSL) can utilize unlabeled data to enhance model performance. In recent years, with increasingly powerful generative models becoming available, a large number of synthetic images have been uploaded to public image sets. Therefore, when collecting unlabeled data from these sources, the inclusion of synthetic images is inevitable. This prompts us to consider the impact of u… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  47. arXiv:2405.16821  [pdf, other

    cs.CL

    Perturbation-Restrained Sequential Model Editing

    Authors: Jun-Yu Ma, Hong Wang, Hao-Xiang Xu, Zhen-Hua Ling, Jia-Chen Gu

    Abstract: Model editing is an emerging field that focuses on updating the knowledge embedded within large language models (LLMs) without extensive retraining. However, current model editing methods significantly compromise the general abilities of LLMs as the number of edits increases, and this trade-off poses a substantial challenge to the continual learning of LLMs. In this paper, we first theoretically a… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  48. arXiv:2405.16599  [pdf, other

    cs.RO

    MCGMapper: Light-Weight Incremental Structure from Motion and Visual Localization With Planar Markers and Camera Groups

    Authors: Yusen Xie, Zhenmin Huang, Kai Chen, Lei Zhu, Jun Ma

    Abstract: Structure from Motion (SfM) and visual localization in indoor texture-less scenes and industrial scenarios present prevalent yet challenging research topics. Existing SfM methods designed for natural scenes typically yield low accuracy or map-building failures due to insufficient robust feature extraction in such settings. Visual markers, with their artificially designed features, can effectively… ▽ More

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 8 pages,8 figures

  49. arXiv:2405.16472  [pdf, other

    cs.LG

    Multi-Level Additive Modeling for Structured Non-IID Federated Learning

    Authors: Shutong Chen, Tianyi Zhou, Guodong Long, Jie Ma, Jing Jiang, Chengqi Zhang

    Abstract: The primary challenge in Federated Learning (FL) is to model non-IID distributions across clients, whose fine-grained structure is important to improve knowledge sharing. For example, some knowledge is globally shared across all clients, some is only transferable within a subgroup of clients, and some are client-specific. To capture and exploit this structure, we train models organized in a multi-… ▽ More

    Submitted 26 May, 2024; originally announced May 2024.

  50. arXiv:2405.12875  [pdf

    cs.CV cs.CL

    Diffusion-RSCC: Diffusion Probabilistic Model for Change Captioning in Remote Sensing Images

    Authors: Xiaofei Yu, Yitong Li, Jie Ma

    Abstract: Remote sensing image change captioning (RSICC) aims at generating human-like language to describe the semantic changes between bi-temporal remote sensing image pairs. It provides valuable insights into environmental dynamics and land management. Unlike conventional change captioning task, RSICC involves not only retrieving relevant information across different modalities and generating fluent capt… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.