Showing 1–50 of 391 results for author: Jia, X

Searching in archive cs.
  1. arXiv:2407.03566  [pdf, ps, other]

    cs.IT eess.SP

    Stacked Intelligent Metasurfaces for Wireless Sensing and Communication: Applications and Challenges

    Authors: Hao Liu, Jiancheng An, Xing Jia, Shining Lin, Xianghao Yao, Lu Gan, Bruno Clerckx, Chau Yuen, Mehdi Bennis, Mérouane Debbah

    Abstract: The rapid advancement of wireless communication technologies has precipitated an unprecedented demand for high data rates, extremely low latency, and ubiquitous connectivity. In order to achieve these goals, stacked intelligent metasurfaces (SIM) has been developed as a novel solution to perform advanced signal processing tasks directly in the electromagnetic wave domain, thus achieving ultra-fast…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 8 pages, 5 figures, 1 table

  2. arXiv:2407.00955  [pdf, other]

    cs.IT cs.AI eess.SP

    Task-oriented Over-the-air Computation for Edge-device Co-inference with Balanced Classification Accuracy

    Authors: Xiang Jiao, Dingzhu Wen, Guangxu Zhu, Wei Jiang, Wu Luo, Yuanming Shi

    Abstract: Edge-device co-inference, which concerns the cooperation between edge devices and an edge server for completing inference tasks over wireless networks, has been a promising technique for enabling various kinds of intelligent services at the network edge, e.g., auto-driving. In this paradigm, the concerned design objective of the network shifts from the traditional communication throughput to the e…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: This paper was accepted by IEEE Transactions on Vehicular Technology on June 30, 2024

  3. arXiv:2406.12538  [pdf, other]

    cs.LG cs.AI cs.RO

    Variational Distillation of Diffusion Policies into Mixture of Experts

    Authors: Hongyi Zhou, Denis Blessing, Ge Li, Onur Celik, Xiaogang Jia, Gerhard Neumann, Rudolf Lioutikov

    Abstract: This work introduces Variational Diffusion Distillation (VDD), a novel method that distills denoising diffusion policies into Mixtures of Experts (MoE) through variational inference. Diffusion Models are the current state-of-the-art in generative modeling due to their exceptional ability to accurately learn and represent complex, multi-modal distributions. This ability allows Diffusion Models to r…

    Submitted 18 June, 2024; originally announced June 2024.

  4. arXiv:2406.08234  [pdf, other]

    cs.LG cs.RO

    MaIL: Improving Imitation Learning with Mamba

    Authors: Xiaogang Jia, Qian Wang, Atalay Donat, Bowen Xing, Ge Li, Hongyi Zhou, Onur Celik, Denis Blessing, Rudolf Lioutikov, Gerhard Neumann

    Abstract: This work introduces Mamba Imitation Learning (MaIL), a novel imitation learning (IL) architecture that offers a computationally efficient alternative to state-of-the-art (SoTA) Transformer policies. Transformer-based policies have achieved remarkable results due to their ability in handling human-recorded data with inherently non-Markovian behavior. However, their high performance comes with the…

    Submitted 12 June, 2024; originally announced June 2024.

  5. arXiv:2406.07423  [pdf, other]

    cs.LG cs.AI stat.ML

    Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling

    Authors: Denis Blessing, Xiaogang Jia, Johannes Esslinger, Francisco Vargas, Gerhard Neumann

    Abstract: Monte Carlo methods, Variational Inference, and their combinations play a pivotal role in sampling from intractable probability distributions. However, current studies lack a unified evaluation framework, relying on disparate performance measures and limited method comparisons across diverse tasks, complicating the assessment of progress and hindering the decision-making of practitioners. In respo…

    Submitted 11 June, 2024; originally announced June 2024.

  6. arXiv:2406.06089  [pdf, other]

    cs.CV

    Texture Re-scalable Universal Adversarial Perturbation

    Authors: Yihao Huang, Qing Guo, Felix Juefei-Xu, Ming Hu, Xiaojun Jia, Xiaochun Cao, Geguang Pu, Yang Liu

    Abstract: Universal adversarial perturbation (UAP), also known as image-agnostic perturbation, is a fixed perturbation map that can fool the classifier with high probabilities on arbitrary images, making it more practical for attacking deep models in the real world. Previous UAP methods generate a scale-fixed and texture-fixed perturbation map for all images, which ignores the multi-scale objects in images…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 14 pages (accepted by TIFS2024)

  7. arXiv:2406.04367  [pdf, other]

    physics.flu-dyn cs.LG physics.comp-ph

    Physics-enhanced Neural Operator for Simulating Turbulent Transport

    Authors: Shengyu Chen, Peyman Givi, Can Zheng, Xiaowei Jia

    Abstract: The precise simulation of turbulent flows is of immense importance in a variety of scientific and engineering fields, including climate science, freshwater science, and the development of energy-efficient manufacturing processes. Within the realm of turbulent flow simulation, direct numerical simulation (DNS) is widely considered to be the most reliable approach, but it is prohibitively expensive…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: 13 pages

  8. arXiv:2406.03877  [pdf, other]

    cs.RO cs.CV

    Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving

    Authors: Xiaosong Jia, Zhenjie Yang, Qifeng Li, Zhiyuan Zhang, Junchi Yan

    Abstract: In an era marked by the rapid scaling of foundation models, autonomous driving technologies are approaching a transformative threshold where end-to-end autonomous driving (E2E-AD) emerges due to its potential of scaling up in the data-driven manner. However, existing E2E-AD methods are mostly evaluated under the open-loop log-replay manner with L2 errors and collision rate as metrics (e.g., in nuS…

    Submitted 11 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Fix typos in text and Table 4. More reference

  9. arXiv:2406.03141  [pdf, other]

    q-bio.BM cs.LG

    Floating Anchor Diffusion Model for Multi-motif Scaffolding

    Authors: Ke Liu, Weian Mao, Shuaike Shen, Xiaoran Jiao, Zheng Sun, Hao Chen, Chunhua Shen

    Abstract: Motif scaffolding seeks to design scaffold structures for constructing proteins with functions derived from the desired motif, which is crucial for the design of vaccines and enzymes. Previous works approach the problem by inpainting or conditional generation. Both of them can only scaffold motifs with fixed positions, and the conditional generation cannot guarantee the presence of motifs. However…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  10. arXiv:2406.02064  [pdf, other]

    cs.LG cs.CR cs.CV

    Advancing Generalized Transfer Attack with Initialization Derived Bilevel Optimization and Dynamic Sequence Truncation

    Authors: Yaohua Liu, Jiaxin Gao, Xuan Liu, Xianghao Jiao, Xin Fan, Risheng Liu

    Abstract: Transfer attacks generate significant interest for real-world black-box applications by crafting transferable adversarial examples through surrogate models. Whereas, existing works essentially directly optimize the single-level objective w.r.t. the surrogate model, which always leads to poor interpretability of attack mechanism and limited generalization performance over unknown victim models. In…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted by IJCAI 2024. 10 pages

  11. arXiv:2406.00034  [pdf, other]

    cs.CL cs.AI

    Adaptive Activation Steering: A Tuning-Free LLM Truthfulness Improvement Method for Diverse Hallucinations Categories

    Authors: Tianlong Wang, Xianfeng Jiao, Yifan He, Zhongzhi Chen, Yinghao Zhu, Xu Chu, Junyi Gao, Yasha Wang, Liantao Ma

    Abstract: Recent studies have indicated that Large Language Models (LLMs) harbor an inherent understanding of truthfulness, yet often fail to express fully and generate false statements. This gap between "knowing" and "telling" poses a challenge for ensuring the truthfulness of generated content. To address this, we introduce Adaptive Activation Steering (ACT), a tuning-free method that adaptively shift LLM…

    Submitted 26 May, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.17811

  12. arXiv:2405.21040  [pdf, other]

    cs.CL cs.AI

    Direct Alignment of Language Models via Quality-Aware Self-Refinement

    Authors: Runsheng Yu, Yong Wang, Xiaoqi Jiao, Youzhi Zhang, James T. Kwok

    Abstract: Reinforcement Learning from Human Feedback (RLHF) has been commonly used to align the behaviors of Large Language Models (LLMs) with human preferences. Recently, a popular alternative is Direct Policy Optimization (DPO), which replaces an LLM-based reward model with the policy itself, thus obviating the need for extra memory and training time to learn the reward model. However, DPO does not consid…

    Submitted 31 May, 2024; originally announced May 2024.

  13. arXiv:2405.21018  [pdf, other]

    cs.LG cs.CL cs.CR

    Improved Techniques for Optimization-Based Jailbreaking on Large Language Models

    Authors: Xiaojun Jia, Tianyu Pang, Chao Du, Yihao Huang, Jindong Gu, Yang Liu, Xiaochun Cao, Min Lin

    Abstract: Large language models (LLMs) are being rapidly developed, and a key component of their widespread deployment is their safety-related alignment. Many red-teaming efforts aim to jailbreak LLMs, where among these efforts, the Greedy Coordinate Gradient (GCG) attack's success has led to a growing interest in the study of optimization-based jailbreaking techniques. Although GCG is a significant milesto…

    Submitted 5 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

  14. arXiv:2405.20224  [pdf, other]

    cs.CV

    EvaGaussians: Event Stream Assisted Gaussian Splatting from Blurry Images

    Authors: Wangbo Yu, Chaoran Feng, Jiye Tang, Xu Jia, Li Yuan, Yonghong Tian

    Abstract: 3D Gaussian Splatting (3D-GS) has demonstrated exceptional capabilities in 3D scene reconstruction and novel view synthesis. However, its training heavily depends on high-quality, sharp images and accurate camera poses. Fulfilling these requirements can be challenging in non-ideal real-world scenarios, where motion-blurred images are commonly encountered in high-speed moving cameras or low-light e…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Project Page: https://drexubery.github.io/EvaGaussians/

  15. arXiv:2405.18071  [pdf, other]

    cs.CV

    Text Modality Oriented Image Feature Extraction for Detecting Diffusion-based DeepFake

    Authors: Di Yang, Yihao Huang, Qing Guo, Felix Juefei-Xu, Xiaojun Jia, Run Wang, Geguang Pu, Yang Liu

    Abstract: The widespread use of diffusion methods enables the creation of highly realistic images on demand, thereby posing significant risks to the integrity and safety of online information and highlighting the necessity of DeepFake detection. Our analysis of features extracted by traditional image encoders reveals that both low-level and high-level features offer distinct advantages in identifying DeepFa…

    Submitted 28 May, 2024; originally announced May 2024.

  16. arXiv:2405.18023  [pdf, ps, other]

    cs.IT

    Generator polynomials of cyclic expurgated or extended Goppa codes

    Authors: Xue Jia, Fengwei Li, Huan Sun, Qin Yue

    Abstract: Classical Goppa codes are a well-known class of codes with applications in code-based cryptography, which are a special case of alternant codes. Many papers are devoted to the search for Goppa codes with a cyclic extension or with a cyclic parity-check subcode. Let $\Bbb F_q$ be a finite field with $q=2^l$ elements, where $l$ is a positive integer. In this paper, we determine all the generator pol…

    Submitted 28 May, 2024; originally announced May 2024.

  17. arXiv:2405.14517  [pdf, other]

    cs.LG cs.CR

    Identity Inference from CLIP Models using Only Textual Data

    Authors: Songze Li, Ruoxi Cheng, Xiaojun Jia

    Abstract: The widespread usage of large-scale multimodal models like CLIP has heightened concerns about the leakage of personally identifiable information (PII). Existing methods for identity inference in CLIP models, i.e., to detect the presence of a person's PII used for training a CLIP model, require querying the model with full PII, including textual descriptions of the person and corresponding images (…

    Submitted 23 May, 2024; originally announced May 2024.

  18. arXiv:2405.14189  [pdf, other]

    cs.CL cs.CV

    Semantic-guided Prompt Organization for Universal Goal Hijacking against LLMs

    Authors: Yihao Huang, Chong Wang, Xiaojun Jia, Qing Guo, Felix Juefei-Xu, Jian Zhang, Geguang Pu, Yang Liu

    Abstract: With the rising popularity of Large Language Models (LLMs), assessing their trustworthiness through security tasks has gained critical importance. Regarding the new task of universal goal hijacking, previous efforts have concentrated solely on optimization algorithms, overlooking the crucial role of the prompt. To fill this gap, we propose a universal goal hijacking method called POUGH that incorp…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 15 pages

  19. arXiv:2405.12447  [pdf, ps, other]

    cs.CV

    EPL: Empirical Prototype Learning for Deep Face Recognition

    Authors: Weijia Fan, Jiajun Wen, Xi Jia, Linlin Shen, Jiancan Zhou, Qiufu Li

    Abstract: Prototype learning is widely used in face recognition, which takes the row vectors of coefficient matrix in the last linear layer of the feature extraction model as the prototypes for each class. When the prototypes are updated using the facial sample feature gradients in the model training, they are prone to being pulled away from the class center by the hard samples, resulting in decreased overa…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 16 pages, 2 figures, 6 tables

  20. arXiv:2405.11884  [pdf, other]

    cs.LG cs.DC

    Vertical Federated Learning Hybrid Local Pre-training

    Authors: Wenguo Li, Xinling Guo, Xu Jiao, Tiancheng Huang, Xiaoran Yan, Yao Yang

    Abstract: Vertical Federated Learning (VFL), which has a broad range of real-world applications, has received much attention in both academia and industry. Enterprises aspire to exploit more valuable features of the same users from diverse departments to boost their model prediction skills. VFL addresses this demand and concurrently secures individual parties from exposing their raw data. However, conventio…

    Submitted 21 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

  21. arXiv:2405.07594  [pdf, other]

    cs.CV

    RGBD-Glue: General Feature Combination for Robust RGB-D Point Cloud Registration

    Authors: Congjia Chen, Xiaoyu Jia, Yanhong Zheng, Yufu Qu

    Abstract: Point cloud registration is a fundamental task for estimating rigid transformations between point clouds. Previous studies have used geometric information for extracting features, matching and estimating transformation. Recently, owing to the advancement of RGB-D sensors, researchers have attempted to utilize visual information to improve registration performance. However, these studies focused on…

    Submitted 13 May, 2024; originally announced May 2024.

  22. arXiv:2405.01741  [pdf, other]

    cs.CR cs.AI cs.AR cs.LG

    PVF (Parameter Vulnerability Factor): A Scalable Metric for Understanding AI Vulnerability Against SDCs in Model Parameters

    Authors: Xun Jiao, Fred Lin, Harish D. Dixit, Joel Coburn, Abhinav Pandey, Han Wang, Venkat Ramesh, Jianyu Huang, Wang Xu, Daniel Moore, Sriram Sankar

    Abstract: Reliability of AI systems is a fundamental concern for the successful deployment and widespread adoption of AI technologies. Unfortunately, the escalating complexity and heterogeneity of AI hardware systems make them increasingly susceptible to hardware faults, e.g., silent data corruptions (SDC), that can potentially corrupt model parameters. When this occurs during AI inference/servicing, it can…

    Submitted 11 June, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  23. arXiv:2405.01356  [pdf, other]

    cs.CV

    Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance

    Authors: Kelvin C. K. Chan, Yang Zhao, Xuhui Jia, Ming-Hsuan Yang, Huisheng Wang

    Abstract: In subject-driven text-to-image synthesis, the synthesis process tends to be heavily influenced by the reference images provided by users, often overlooking crucial attributes detailed in the text prompt. In this work, we propose Subject-Agnostic Guidance (SAG), a simple yet effective solution to remedy the problem. We show that through constructing a subject-agnostic condition and applying our pr…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024

  24. arXiv:2404.18213  [pdf, other]

    cs.CV cs.AI

    S$^2$Mamba: A Spatial-spectral State Space Model for Hyperspectral Image Classification

    Authors: Guanchun Wang, Xiangrong Zhang, Zelin Peng, Tianyang Zhang, Xiuping Jia, Licheng Jiao

    Abstract: Land cover analysis using hyperspectral images (HSI) remains an open problem due to their low spatial resolution and complex spectral information. Recent studies are primarily dedicated to designing Transformer-based architectures for spatial-spectral long-range dependencies modeling, which is computationally expensive with quadratic complexity. Selective structured state space model (Mamba), whic…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 13 pages, 9 figures

  25. arXiv:2404.16054  [pdf, other]

    cs.HC cs.AI

    LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Automation Task Evaluation

    Authors: Li Zhang, Shihe Wang, Xianqing Jia, Zhihan Zheng, Yunhe Yan, Longxi Gao, Yuanchun Li, Mengwei Xu

    Abstract: The emergent large language/multimodal models facilitate the evolution of mobile agents, especially in the task of mobile UI automation. However, existing evaluation approaches, which rely on human validation or established datasets to compare agent-predicted actions with predefined ones, are unscalable and unfaithful. To overcome these limitations, this paper presents LlamaTouch, a testbed for on…

    Submitted 12 April, 2024; originally announced April 2024.

  26. arXiv:2404.15677  [pdf, other]

    cs.CV

    CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models

    Authors: Qinghe Wang, Baolu Li, Xiaomin Li, Bing Cao, Liqian Ma, Huchuan Lu, Xu Jia

    Abstract: Recent advances in text-to-image models have opened new frontiers in human-centric generation. However, these models cannot be directly employed to generate images with consistent newly coined identities. In this work, we propose CharacterFactory, a framework that allows sampling new characters with consistent identities in the latent space of GANs for diffusion models. More specifically, we consi…

    Submitted 27 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: Code will be released very soon: https://github.com/qinghew/CharacterFactory

  27. arXiv:2404.13509  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    MFHCA: Enhancing Speech Emotion Recognition Via Multi-Spatial Fusion and Hierarchical Cooperative Attention

    Authors: Xinxin Jiao, Liejun Wang, Yinfeng Yu

    Abstract: Speech emotion recognition is crucial in human-computer interaction, but extracting and using emotional cues from audio poses challenges. This paper introduces MFHCA, a novel method for Speech Emotion Recognition using Multi-Spatial Fusion and Hierarchical Cooperative Attention on spectrograms and raw audio. We employ the Multi-Spatial Fusion module (MF) to efficiently identify emotion-related spe…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: Main paper (5 pages). Accepted for publication by ICME 2024

  28. arXiv:2404.13420  [pdf, other]

    cs.CV

    NeurCADRecon: Neural Representation for Reconstructing CAD Surfaces by Enforcing Zero Gaussian Curvature

    Authors: Qiujie Dong, Rui Xu, Pengfei Wang, Shuangmin Chen, Shiqing Xin, Xiaohong Jia, Wenping Wang, Changhe Tu

    Abstract: Despite recent advances in reconstructing an organic model with the neural signed distance function (SDF), the high-fidelity reconstruction of a CAD model directly from low-quality unoriented point clouds remains a significant challenge. In this paper, we address this challenge based on the prior observation that the surface of a CAD model is generally composed of piecewise surface patches, each a…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: ACM Transactions on Graphics (SIGGRAPH 2024)

  29. arXiv:2404.11797  [pdf, other]

    cs.CV cs.AI cs.LG

    When are Foundation Models Effective? Understanding the Suitability for Pixel-Level Classification Using Multispectral Imagery

    Authors: Yiqun Xie, Zhihao Wang, Weiye Chen, Zhili Li, Xiaowei Jia, Yanhua Li, Ruichen Wang, Kangyang Chai, Ruohan Li, Sergii Skakun

    Abstract: Foundation models, i.e., very large deep learning models, have demonstrated impressive performances in various language and vision tasks that are otherwise difficult to reach using smaller-size models. The major success of GPT-type of language models is particularly exciting and raises expectations on the potential of foundation models in other domains including satellite remote sensing. In this c…

    Submitted 17 April, 2024; originally announced April 2024.

  30. arXiv:2404.11056  [pdf, other]

    cs.LG cs.AI cs.CR

    LMEraser: Large Model Unlearning through Adaptive Prompt Tuning

    Authors: Jie Xu, Zihan Wu, Cong Wang, Xiaohua Jia

    Abstract: To address the growing demand for privacy protection in machine learning, we propose a novel and efficient machine unlearning approach for Large Models, called LMEraser. Existing unlearning research suffers from entangled training data and complex model architectures, incurring extremely high computational costs for large models. LMEraser takes a divide-and-conquer strat…

    Submitted 17 April, 2024; originally announced April 2024.

  31. arXiv:2404.10595  [pdf, other]

    cs.CV

    Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases

    Authors: Kai Chen, Yanze Li, Wenhua Zhang, Yanxin Liu, Pengxiang Li, Ruiyuan Gao, Lanqing Hong, Meng Tian, Xinhai Zhao, Zhenguo Li, Dit-Yan Yeung, Huchuan Lu, Xu Jia

    Abstract: Large Vision-Language Models (LVLMs) have received widespread attention in advancing the interpretable self-driving. Existing evaluations of LVLMs primarily focus on the multi-faceted capabilities in natural circumstances, lacking automated and quantifiable assessment for self-driving, let alone the severe road corner cases. In this paper, we propose CODA-LM, the very first benchmark for the autom…

    Submitted 26 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: Project Page: https://coda-dataset.github.io/coda-lm/

  32. arXiv:2404.10335  [pdf, other]

    cs.CV

    Efficiently Adversarial Examples Generation for Visual-Language Models under Targeted Transfer Scenarios using Diffusion Models

    Authors: Qi Guo, Shanmin Pang, Xiaojun Jia, Qing Guo

    Abstract: Targeted transfer-based attacks involving adversarial examples pose a significant threat to large visual-language models (VLMs). However, the state-of-the-art (SOTA) transfer-based attacks incur high costs due to excessive iteration counts. Furthermore, the generated adversarial examples exhibit pronounced adversarial noise and demonstrate limited efficacy in evading defense methods such as DiffPu…

    Submitted 18 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  33. arXiv:2404.09497  [pdf, other]

    cs.AR

    Towards Efficient SRAM-PIM Architecture Design by Exploiting Unstructured Bit-Level Sparsity

    Authors: Cenlin Duan, Jianlei Yang, Yiou Wang, Yikun Wang, Yingjie Qi, Xiaolin He, Bonan Yan, Xueyan Wang, Xiaotao Jia, Weisheng Zhao

    Abstract: Bit-level sparsity in neural network models harbors immense untapped potential. Eliminating redundant calculations of randomly distributed zero-bits significantly boosts computational efficiency. Yet, traditional digital SRAM-PIM architecture, limited by rigid crossbar architecture, struggles to effectively exploit this unstructured sparsity. To address this challenge, we propose Dyadic Block PIM…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted by DAC'24

  34. The Survey on Multi-Source Data Fusion in Cyber-Physical-Social Systems: Foundational Infrastructure for Industrial Metaverses and Industries 5.0

    Authors: Xiao Wang, Yutong Wang, Jing Yang, Xiaofeng Jia, Lijun Li, Weiping Ding, Fei-Yue Wang

    Abstract: As the concept of Industries 5.0 develops, industrial metaverses are expected to operate in parallel with the actual industrial processes to offer "Human-Centric" Safe, Secure, Sustainable, Sensitive, Service, and Smartness "6S" manufacturing solutions. Industrial metaverses not only visualize the process of productivity in a dynamic and evolutional way, but also provide an immersive laboratory…

    Submitted 11 April, 2024; originally announced April 2024.

    Journal ref: Information Fusion 2024

  35. arXiv:2404.04201  [pdf, ps, other]

    cs.PL cs.FL

    V-Star: Learning Visibly Pushdown Grammars from Program Inputs

    Authors: Xiaodong Jia, Gang Tan

    Abstract: Accurate description of program inputs remains a critical challenge in the field of programming languages. Active learning, as a well-established field, achieves exact learning for regular languages. We offer an innovative grammar inference tool, V-Star, based on the active learning of visibly pushdown automata. V-Star deduces nesting structures of program input languages from sample inputs, emplo…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: PLDI '24

  36. arXiv:2404.01165  [pdf, other]

    cs.CL

    LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models

    Authors: Haoran Li, Junqi Liu, Zexian Wang, Shiyuan Luo, Xiaowei Jia, Huaxiu Yao

    Abstract: The modeling of environmental ecosystems plays a pivotal role in the sustainable management of our planet. Accurate prediction of key environmental variables over space and time can aid in informed policy and decision-making, thus improving people's livelihood. Recently, deep learning-based methods have shown promise in modeling the spatial-temporal relationships for predicting environmental varia…

    Submitted 1 April, 2024; originally announced April 2024.

  37. arXiv:2403.19907  [pdf, ps, other]

    cs.LG cs.AI

    Beyond the Known: Novel Class Discovery for Open-world Graph Learning

    Authors: Yucheng Jin, Yun Xiong, Juncheng Fang, Xixi Wu, Dongxiao He, Xing Jia, Bingchen Zhao, Philip Yu

    Abstract: Node classification on graphs is of great importance in many applications. Due to the limited labeling capability and evolution in real-world open scenarios, novel classes can emerge on unlabeled testing nodes. However, little attention has been paid to novel class discovery on graphs. Discovering novel classes is challenging as novel and known class nodes are correlated by edges, which makes thei…

    Submitted 28 March, 2024; originally announced March 2024.

  38. arXiv:2403.18923  [pdf, other]

    cs.NE cs.AI cs.LG

    Nature-Guided Cognitive Evolution for Predicting Dissolved Oxygen Concentrations in North Temperate Lakes

    Authors: Runlong Yu, Robert Ladwig, Xiang Xu, Peijun Zhu, Paul C. Hanson, Yiqun Xie, Xiaowei Jia

    Abstract: Predicting dissolved oxygen (DO) concentrations in north temperate lakes requires a comprehensive study of phenological patterns across various ecosystems, which highlights the significance of selecting phenological features and feature interactions. Process-based models are limited by partial process knowledge or oversimplified feature representations, while machine learning models face challenge…

    Submitted 15 February, 2024; originally announced March 2024.

  39. arXiv:2403.15989  [pdf, other]

    cs.LG cs.AI cs.CE

    Knowledge-guided Machine Learning: Current Trends and Future Prospects

    Authors: Anuj Karpatne, Xiaowei Jia, Vipin Kumar

    Abstract: This paper presents an overview of scientific modeling and discusses the complementary strengths and weaknesses of ML methods for scientific modeling in comparison to process-based models. It also provides an introduction to the current state of research in the emerging field of scientific knowledge-guided machine learning (KGML) that aims to use both scientific knowledge and data in ML frameworks…

    Submitted 1 May, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

  40. arXiv:2403.13331  [pdf, other]

    cs.CV cs.RO

    AMP: Autoregressive Motion Prediction Revisited with Next Token Prediction for Autonomous Driving

    Authors: Xiaosong Jia, Shaoshuai Shi, Zijun Chen, Li Jiang, Wenlong Liao, Tao He, Junchi Yan

    Abstract: As an essential task in autonomous driving (AD), motion prediction aims to predict the future states of surround objects for navigation. One natural solution is to estimate the position of other agents in a step-by-step manner where each predicted time-step is conditioned on both observed time-steps and previously predicted time-steps, i.e., autoregressive prediction. Pioneering works like SocialL…

    Submitted 21 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  41. arXiv:2403.12445  [pdf, other]

    cs.CV

    Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory

    Authors: Sensen Gao, Xiaojun Jia, Xuhong Ren, Ivor Tsang, Qing Guo

    Abstract: Vision-language pre-training (VLP) models exhibit remarkable capabilities in comprehending both images and text, yet they remain susceptible to multimodal adversarial examples (AEs). Strengthening adversarial attacks and uncovering vulnerabilities, especially common issues in VLP models (e.g., high transferable AEs), can stimulate further research on constructing reliable and practical VLP models.…

    Submitted 19 March, 2024; originally announced March 2024.

  42. arXiv:2403.07289  [pdf, other]

    cs.CV

    Rediscovering BCE Loss for Uniform Classification

    Authors: Qiufu Li, Xi Jia, Jiancan Zhou, Linlin Shen, Jinming Duan

    Abstract: This paper introduces the concept of uniform classification, which employs a unified threshold to classify all samples rather than adaptive threshold classifying each individual sample. We also propose the uniform classification accuracy as a metric to measure the model's performance in uniform classification. Furthermore, begin with a naive loss, we mathematically derive a loss function suitable…

    Submitted 11 March, 2024; originally announced March 2024.

  43. arXiv:2403.05247  [pdf, other]

    cs.CV eess.IV

    Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds

    Authors: Tianrui Lou, Xiaojun Jia, Jindong Gu, Li Liu, Siyuan Liang, Bangyan He, Xiaochun Cao

    Abstract: Adversarial attack methods based on point manipulation for 3D point cloud classification have revealed the fragility of 3D models, yet the adversarial examples they produce are easily perceived or defended against. The trade-off between imperceptibility and adversarial strength leads most point attack methods to inevitably introduce easily detectable outlier points upon a successful attack. An…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  44. arXiv:2403.02877  [pdf, other]

    cs.CV cs.AI cs.RO

    ActiveAD: Planning-Oriented Active Learning for End-to-End Autonomous Driving

    Authors: Han Lu, Xiaosong Jia, Yichen Xie, Wenlong Liao, Xiaokang Yang, Junchi Yan

    Abstract: End-to-end differentiable learning for autonomous driving (AD) has recently become a prominent paradigm. One main bottleneck lies in its voracious appetite for high-quality labeled data, e.g., 3D bounding boxes and semantic segmentation, which are notoriously expensive to manually annotate. The difficulty is further pronounced due to the prominent fact that the behaviors within samples in AD often s…

    Submitted 5 March, 2024; originally announced March 2024.

  45. arXiv:2402.16720  [pdf, other]

    cs.RO

    Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving (in CARLA-v2)

    Authors: Qifeng Li, Xiaosong Jia, Shaobo Wang, Junchi Yan

    Abstract: Real-world autonomous driving (AD), especially urban driving, involves many corner cases. The recently released AD simulator CARLA v2 adds 39 common events in the driving scene and provides a more quasi-realistic testbed than CARLA v1. It poses a new challenge to the community, and so far no literature has reported any success on the new scenarios in v2, as existing works mostly have to rely on speci…

    Submitted 26 February, 2024; originally announced February 2024.

  46. arXiv:2402.15627  [pdf, other]

    cs.LG cs.DC

    MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs

    Authors: Ziheng Jiang, Haibin Lin, Yinmin Zhong, Qi Huang, Yangrui Chen, Zhi Zhang, Yanghua Peng, Xiang Li, Cong Xie, Shibiao Nong, Yulu Jia, Sun He, Hongmin Chen, Zhihao Bai, Qi Hou, Shipeng Yan, Ding Zhou, Yiyao Sheng, Zhuo Jiang, Haohan Xu, Haoran Wei, Zhang Zhang, Pengfei Nie, Leqi Zou, Sida Zhao, et al. (7 additional authors not shown)

    Abstract: We present the design, implementation and engineering experience in building and deploying MegaScale, a production system for training large language models (LLMs) at the scale of more than 10,000 GPUs. Training LLMs at this scale brings unprecedented challenges to training efficiency and stability. We take a full-stack approach that co-designs the algorithmic and system components across model bl…

    Submitted 23 February, 2024; originally announced February 2024.

  47. arXiv:2402.15572  [pdf, other]

    cs.AI cs.CV cs.RO

    Improving Explainable Object-induced Model through Uncertainty for Automated Vehicles

    Authors: Shihong Ling, Yue Wan, Xiaowei Jia, Na Du

    Abstract: The rapid evolution of automated vehicles (AVs) has the potential to provide safer, more efficient, and comfortable travel options. However, these systems face challenges regarding reliability in complex driving scenarios. Recent explainable AV architectures neglect crucial information related to inherent uncertainties while providing explanations for actions. To overcome such challenges, our stud…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: In Proceedings of the 2024 ACM / IEEE International Conference on Human-Robot Interaction (HRI '24), March 11–14, 2024, Boulder, CO, USA. ACM, New York, NY, USA, 9 pages

  48. arXiv:2402.14606  [pdf, other]

    cs.RO

    Towards Diverse Behaviors: A Benchmark for Imitation Learning with Human Demonstrations

    Authors: Xiaogang Jia, Denis Blessing, Xinkai Jiang, Moritz Reuss, Atalay Donat, Rudolf Lioutikov, Gerhard Neumann

    Abstract: Imitation learning with human data has demonstrated remarkable success in teaching robots a wide range of skills. However, the inherent diversity in human behavior leads to the emergence of multi-modal data distributions, thereby presenting a formidable challenge for existing imitation learning algorithms. Quantifying a model's capacity to capture and replicate this diversity effectively is sti…

    Submitted 22 February, 2024; originally announced February 2024.

  49. arXiv:2402.13379  [pdf, other]

    cs.LG cs.CY

    Referee-Meta-Learning for Fast Adaptation of Locational Fairness

    Authors: Weiye Chen, Yiqun Xie, Xiaowei Jia, Erhu He, Han Bao, Bang An, Xun Zhou

    Abstract: When dealing with data from distinct locations, machine learning algorithms tend to demonstrate an implicit preference for some locations over others, which constitutes biases that sabotage the spatial fairness of the algorithm. This unfairness can easily introduce biases in subsequent decision-making given the broad adoption of learning-based solutions in practice. However, locational biases in A…

    Submitted 20 February, 2024; originally announced February 2024.

  50. arXiv:2402.11497  [pdf, other]

    cs.CV

    Thyroid ultrasound diagnosis improvement via multi-view self-supervised learning and two-stage pre-training

    Authors: Jian Wang, Xin Yang, Xiaohong Jia, Wufeng Xue, Rusi Chen, Yanlin Chen, Xiliang Zhu, Lian Liu, Yan Cao, Jianqiao Zhou, Dong Ni, Ning Gu

    Abstract: Thyroid nodule classification and segmentation in ultrasound images are crucial for computer-aided diagnosis; however, they face limitations owing to insufficient labeled data. In this study, we proposed a multi-view contrastive self-supervised method to improve thyroid nodule classification and segmentation performance with limited manual labels. Our method aligns the transverse and longitudinal…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: The article has been accepted by the journal Computers in Biology and Medicine