Showing 1–50 of 206 results for author: Luo, T

Searching in archive cs.
  1. arXiv:2407.02886  [pdf, other]

    cs.CR

    A Wolf in Sheep's Clothing: Practical Black-box Adversarial Attacks for Evading Learning-based Windows Malware Detection in the Wild

    Authors: Xiang Ling, Zhiyu Wu, Bin Wang, Wei Deng, Jingzheng Wu, Shouling Ji, Tianyue Luo, Yanjun Wu

    Abstract: Given the remarkable achievements of existing learning-based malware detection in both academia and industry, this paper presents MalGuise, a practical black-box adversarial attack framework that evaluates the security risks of existing learning-based Windows malware detection systems under the black-box setting. MalGuise first employs a novel semantics-preserving transformation of call-based redi…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted by 33rd USENIX Security Symposium 2024

  2. arXiv:2406.17245  [pdf, other]

    cs.LG cs.AI cs.CL

    Unlocking Continual Learning Abilities in Language Models

    Authors: Wenyu Du, Shuang Cheng, Tongxu Luo, Zihan Qiu, Zeyu Huang, Ka Chun Cheung, Reynold Cheng, Jie Fu

    Abstract: Language models (LMs) exhibit impressive performance and generalization capabilities. However, LMs struggle with the persistent challenge of catastrophic forgetting, which undermines their long-term sustainability in continual learning (CL). Existing approaches usually address the issue by incorporating old task data or task-wise inductive bias into LMs. However, old data and accurate task informa…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: preprint, 19 pages

  3. arXiv:2406.16560  [pdf]

    cs.SI physics.soc-ph

    GNNTAL:A Novel Model for Identifying Critical Nodes in Complex Networks

    Authors: Hao Wang, Ting Luo, Shuang-ping Yang, Ming Jing, Jian Wang, Na Zhao

    Abstract: Identification of critical nodes is a prominent topic in the study of complex networks. Numerous methods have been proposed, yet most exhibit inherent limitations. Traditional approaches primarily analyze specific structural features of the network; however, node influence is typically the result of a combination of multiple factors. Machine learning-based methods struggle to effectively represent…

    Submitted 24 June, 2024; originally announced June 2024.

  4. arXiv:2406.08148  [pdf, other]

    cs.LG cs.AI

    Probing Implicit Bias in Semi-gradient Q-learning: Visualizing the Effective Loss Landscapes via the Fokker–Planck Equation

    Authors: Shuyu Yin, Fei Wen, Peilin Liu, Tao Luo

    Abstract: Semi-gradient Q-learning is applied in many fields, but due to the absence of an explicit loss function, studying its dynamics and implicit bias in the parameter space is challenging. This paper introduces the Fokker–Planck equation and employs partial data obtained through sampling to construct and visualize the effective loss landscape within a two-dimensional parameter space. This visualizatio…

    Submitted 12 June, 2024; originally announced June 2024.

  5. arXiv:2406.05852  [pdf, other]

    cs.CV cs.GR

    RefGaussian: Disentangling Reflections from 3D Gaussian Splatting for Realistic Rendering

    Authors: Rui Zhang, Tianyue Luo, Weidong Yang, Ben Fei, Jingyi Xu, Qingyuan Zhou, Keyi Liu, Ying He

    Abstract: 3D Gaussian Splatting (3D-GS) has made a notable advancement in the field of neural rendering, 3D scene reconstruction, and novel view synthesis. Nevertheless, 3D-GS encounters the main challenge when it comes to accurately representing physical reflections, especially in the case of total reflection and semi-reflection that are commonly found in real-world scenes. This limitation causes reflectio…

    Submitted 9 June, 2024; originally announced June 2024.

  6. arXiv:2406.00079  [pdf, other]

    cs.LG

    Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling

    Authors: Sili Huang, Jifeng Hu, Zhejian Yang, Liwei Yang, Tao Luo, Hechang Chen, Lichao Sun, Bo Yang

    Abstract: Recent works have shown the remarkable superiority of transformer models in reinforcement learning (RL), where the decision-making problem is formulated as sequential generation. Transformer-based agents could emerge with self-improvement in online environments by providing task contexts, such as multiple trajectories, called in-context RL. However, due to the quadratic computation complexity of a…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2405.20692; text overlap with arXiv:2305.16554, arXiv:2210.14215 by other authors

  7. arXiv:2405.17501  [pdf, other]

    cs.LG math.OC

    Geometry of Critical Sets and Existence of Saddle Branches for Two-layer Neural Networks

    Authors: Leyang Zhang, Yaoyu Zhang, Tao Luo

    Abstract: This paper presents a comprehensive analysis of critical point sets in two-layer neural networks. To study such complex entities, we introduce the critical embedding operator and critical reduction operator as our tools. Given a critical point, we use these operators to uncover the whole underlying critical set representing the same output function, which exhibits a hierarchical structure. Further…

    Submitted 25 May, 2024; originally announced May 2024.

  8. arXiv:2405.15413  [pdf, other]

    eess.IV cs.CV cs.IT

    MambaVC: Learned Visual Compression with Selective State Spaces

    Authors: Shiyu Qin, Jinpeng Wang, Yimin Zhou, Bin Chen, Tianci Luo, Baoyi An, Tao Dai, Shutao Xia, Yaowei Wang

    Abstract: Learned visual compression is an important and active task in multimedia. Existing approaches have explored various CNN- and Transformer-based designs to model content distribution and eliminate redundancy, where balancing efficacy (i.e., rate-distortion trade-off) and efficiency remains a challenge. Recently, state-space models (SSMs) have shown promise due to their long-range modeling capacity a…

    Submitted 28 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 17 pages, 15 figures

  9. arXiv:2405.15319  [pdf, other]

    cs.CL cs.AI

    Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training

    Authors: Wenyu Du, Tongxu Luo, Zihan Qiu, Zeyu Huang, Yikang Shen, Reynold Cheng, Yike Guo, Jie Fu

    Abstract: LLMs are computationally expensive to pre-train due to their large scale. Model growth emerges as a promising approach by leveraging smaller models to accelerate the training of larger ones. However, the viability of these model growth methods in efficient LLM pre-training remains underexplored. This work identifies three critical Obstacles: (O1) lack of comprehen…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Preprint; the project link: https://llm-stacking.github.io/

  10. arXiv:2405.12398  [pdf, other]

    cs.LG

    ASMR: Activation-sharing Multi-resolution Coordinate Networks For Efficient Inference

    Authors: Jason Chun Lok Li, Steven Tin Sui Luo, Le Xu, Ngai Wong

    Abstract: Coordinate network or implicit neural representation (INR) is a fast-emerging method for encoding natural signals (such as images and videos) with the benefits of a compact neural representation. While numerous methods have been proposed to increase the encoding capabilities of an INR, an often overlooked aspect is the inference efficiency, usually measured in multiply-accumulate (MAC) count. This…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: ICLR 2024 (v3: 21 pages, 11 figures, Project Page: https://github.com/stevolopolis/asmr.git)

  11. arXiv:2405.10531  [pdf, other]

    cs.LG cs.CV

    Nonparametric Teaching of Implicit Neural Representations

    Authors: Chen Zhang, Steven Tin Sui Luo, Jason Chun Lok Li, Yik-Chung Wu, Ngai Wong

    Abstract: We investigate the learning of implicit neural representation (INR) using an overparameterized multilayer perceptron (MLP) via a novel nonparametric teaching perspective. The latter offers an efficient example selection framework for teaching nonparametrically defined (viz. non-closed-form) target functions, such as image functions defined by 2D grids of pixels. To address the costly training of I…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: ICML 2024 (24 pages, 13 figures)

  12. arXiv:2405.08419  [pdf, other]

    cs.CV

    WaterMamba: Visual State Space Model for Underwater Image Enhancement

    Authors: Meisheng Guan, Haiyong Xu, Gangyi Jiang, Mei Yu, Yeyao Chen, Ting Luo, Yang Song

    Abstract: Underwater imaging often suffers from low quality due to factors affecting light propagation and absorption in water. To improve image quality, some underwater image enhancement (UIE) methods based on convolutional neural networks (CNN) and Transformer have been proposed. However, CNN-based UIE methods are limited in modeling long-range dependencies, and Transformer-based methods involve a large n…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2403.06098

  13. arXiv:2404.15615  [pdf, other]

    cs.HC cs.LG

    MDDD: Manifold-based Domain Adaptation with Dynamic Distribution for Non-Deep Transfer Learning in Cross-subject and Cross-session EEG-based Emotion Recognition

    Authors: Ting Luo, Jing Zhang, Yingwei Qiu, Li Zhang, Yaohua Hu, Zhuliang Yu, Zhen Liang

    Abstract: Emotion decoding using Electroencephalography (EEG)-based affective brain-computer interfaces represents a significant area within the field of affective computing. In the present study, we propose a novel non-deep transfer learning method, termed Manifold-based Domain adaptation with Dynamic Distribution (MDDD). The proposed MDDD includes four main modules: manifold feature transformation, dyn…

    Submitted 23 April, 2024; originally announced April 2024.

  14. arXiv:2404.07984  [pdf, other]

    cs.CV

    View Selection for 3D Captioning via Diffusion Ranking

    Authors: Tiange Luo, Justin Johnson, Honglak Lee

    Abstract: Scalable annotation approaches are crucial for constructing extensive 3D-text datasets, facilitating a broader range of applications. However, existing methods sometimes lead to the generation of hallucinated captions, compromising caption quality. This paper explores the issue of hallucination in 3D object captioning, with a focus on the Cap3D method, which renders 3D objects into 2D views for captio…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Dataset link: https://huggingface.co/datasets/tiange/Cap3D

  15. arXiv:2404.04859  [pdf, other]

    cs.LG stat.ML

    Demystifying Lazy Training of Neural Networks from a Macroscopic Viewpoint

    Authors: Yuqing Li, Tao Luo, Qixuan Zhou

    Abstract: In this paper, we advance the understanding of neural network training dynamics by examining the intricate interplay of various factors introduced by weight parameters in the initialization process. Motivated by the foundational work of Luo et al. (J. Mach. Learn. Res., Vol. 22, Iss. 1, No. 71, pp 3327-3373), we explore the gradient descent dynamics of neural networks through the lens of macroscop…

    Submitted 7 April, 2024; originally announced April 2024.

  16. arXiv:2403.16048  [pdf, other]

    cs.CV

    Edit3K: Universal Representation Learning for Video Editing Components

    Authors: Xin Gu, Libo Zhang, Fan Chen, Longyin Wen, Yufei Wang, Tiejian Luo, Sijie Zhu

    Abstract: This paper focuses on understanding the predominant video creation pipeline, i.e., compositional video editing with six main types of editing components, including video effects, animation, transition, filter, sticker, and text. In contrast to existing visual representation learning of visual materials (i.e., images/videos), we aim to learn visual representations of editing actions/components that…

    Submitted 24 March, 2024; originally announced March 2024.

  17. Table-Lookup MAC: Scalable Processing of Quantised Neural Networks in FPGA Soft Logic

    Authors: Daniel Gerlinghoff, Benjamin Chen Ming Choong, Rick Siow Mong Goh, Weng-Fai Wong, Tao Luo

    Abstract: Recent advancements in neural network quantisation have yielded remarkable outcomes, with three-bit networks reaching state-of-the-art full-precision accuracy in complex tasks. These achievements present valuable opportunities for accelerating neural networks by computing in reduced precision. Implementing it on FPGAs can take advantage of bit-level reconfigurability, which is not available on con…

    Submitted 17 March, 2024; originally announced March 2024.

  18. arXiv:2403.02990  [pdf, other]

    cs.CL cs.AI

    Data Augmentation using Large Language Models: Data Perspectives, Learning Paradigms and Challenges

    Authors: Bosheng Ding, Chengwei Qin, Ruochen Zhao, Tianze Luo, Xinze Li, Guizhen Chen, Wenhan Xia, Junjie Hu, Anh Tuan Luu, Shafiq Joty

    Abstract: In the rapidly evolving field of large language models (LLMs), data augmentation (DA) has emerged as a pivotal technique for enhancing model performance by diversifying training examples without the need for additional data collection. This survey explores the transformative impact of LLMs on DA, particularly addressing the unique challenges and opportunities they present in the context of natural…

    Submitted 2 July, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  19. arXiv:2403.00448  [pdf, other]

    cs.SE

    When Large Language Models Confront Repository-Level Automatic Program Repair: How Well They Done?

    Authors: Yuxiao Chen, Jingzheng Wu, Xiang Ling, Changjiang Li, Zhiqing Rui, Tianyue Luo, Yanjun Wu

    Abstract: In recent years, large language models (LLMs) have demonstrated substantial potential in addressing automatic program repair (APR) tasks. However, the current evaluation of these models for APR tasks focuses solely on the limited context of the single function or file where the bug is located, overlooking the valuable information in the repository-level context. This paper investigates the perform…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: Accepted by ICSE 2024 Industry Challenge Track

  20. arXiv:2402.16899  [pdf, other]

    cs.LG cs.AI

    A priori Estimates for Deep Residual Network in Continuous-time Reinforcement Learning

    Authors: Shuyu Yin, Qixuan Zhou, Fei Wen, Tao Luo

    Abstract: Deep reinforcement learning excels in numerous large-scale practical applications. However, existing performance analyses ignore the unique characteristics of continuous-time control problems, are unable to directly estimate the generalization error of the Bellman optimal loss, and require a boundedness assumption. Our work focuses on continuous-time control problems and proposes a method that is a…

    Submitted 7 March, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

  21. arXiv:2402.16349  [pdf, other]

    cs.LG eess.SY

    C-GAIL: Stabilizing Generative Adversarial Imitation Learning with Control Theory

    Authors: Tianjiao Luo, Tim Pearce, Huayu Chen, Jianfei Chen, Jun Zhu

    Abstract: Generative Adversarial Imitation Learning (GAIL) trains a generative policy to mimic a demonstrator. It uses on-policy Reinforcement Learning (RL) to optimize a reward signal derived from a GAN-like discriminator. A major drawback of GAIL is its training instability - it inherits the complex training dynamics of GANs, and the distribution shift introduced by RL. This can cause oscillations during…

    Submitted 26 February, 2024; originally announced February 2024.

  22. arXiv:2402.16008  [pdf, other]

    cs.CV cs.LG

    Unmasking Dementia Detection by Masking Input Gradients: A JSM Approach to Model Interpretability and Precision

    Authors: Yasmine Mustafa, Tie Luo

    Abstract: The evolution of deep learning and artificial intelligence has significantly reshaped technological landscapes. However, their effective application in crucial sectors such as medicine demands more than just superior performance, but trustworthiness as well. While interpretability plays a pivotal role, existing explainable AI (XAI) approaches often do not reveal Clever Hans behavior where a…

    Submitted 25 February, 2024; originally announced February 2024.

    Comments: 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), May 2024, Taiwan

  23. arXiv:2402.16005  [pdf, other]

    cs.CV cs.LG

    Adversarial-Robust Transfer Learning for Medical Imaging via Domain Assimilation

    Authors: Xiaohui Chen, Tie Luo

    Abstract: In the field of Medical Imaging, extensive research has been dedicated to leveraging its potential in uncovering critical diagnostic features in patients. Artificial Intelligence (AI)-driven medical diagnosis relies on sophisticated machine learning and deep learning models to analyze, detect, and identify diseases from medical images. Despite the remarkable performance of these models, characteri…

    Submitted 25 February, 2024; originally announced February 2024.

    Comments: 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), May 2024, Taiwan

  24. arXiv:2402.15958  [pdf, other]

    cs.LG math.DS

    On the dynamics of three-layer neural networks: initial condensation

    Authors: Zheng-An Chen, Tao Luo

    Abstract: Empirical and theoretical works show that the input weights of two-layer neural networks, when initialized with small values, converge towards isolated orientations. This phenomenon, referred to as condensation, indicates that the gradient descent methods tend to spontaneously reduce the complexity of neural networks during the training process. In this work, we elucidate the mechanisms behind the…

    Submitted 27 February, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

    MSC Class: 37N40; 68T07; 34E05; 34C11

  25. arXiv:2402.13717  [pdf, other]

    cs.CL

    Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent

    Authors: Xiaoyan Yu, Tongxu Luo, Yifan Wei, Fangyu Lei, Yiming Huang, Hao Peng, Liehuang Zhu

    Abstract: Large Language Models (LLMs) have revolutionized open-domain dialogue agents but encounter challenges in multi-character role-playing (MCRP) scenarios. To address the issue, we present Neeko, an innovative framework designed for efficient multiple characters imitation. Unlike existing methods, Neeko employs a dynamic low-rank adapter (LoRA) strategy, enabling it to adapt seamlessly to diverse char…

    Submitted 1 March, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  26. arXiv:2402.12851  [pdf, other]

    cs.CL

    MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models

    Authors: Tongxu Luo, Jiahe Lei, Fangyu Lei, Weihao Liu, Shizhu He, Jun Zhao, Kang Liu

    Abstract: Fine-tuning is often necessary to enhance the adaptability of Large Language Models (LLM) to downstream tasks. Nonetheless, the process of updating billions of parameters demands significant computational resources and training time, which poses a substantial obstacle to the widespread application of large-scale models in various scenarios. To address this issue, Parameter-Efficient Fine-Tuning (P…

    Submitted 20 February, 2024; originally announced February 2024.

  27. arXiv:2401.15541  [pdf, other]

    cs.DC cs.LG

    Stitching Satellites to the Edge: Pervasive and Efficient Federated LEO Satellite Learning

    Authors: Mohamed Elmahallawy, Tie Luo

    Abstract: In the ambitious realm of space AI, the integration of federated learning (FL) with low Earth orbit (LEO) satellite constellations holds immense promise. However, many challenges persist in terms of feasibility, learning efficiency, and convergence. These hurdles stem from the bottleneck in communication, characterized by sporadic and irregular connectivity between LEO satellites and ground statio…

    Submitted 8 April, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

  28. arXiv:2401.13858  [pdf, other]

    cs.LG q-bio.BM

    Graph Diffusion Transformer for Multi-Conditional Molecular Generation

    Authors: Gang Liu, Jiaxin Xu, Tengfei Luo, Meng Jiang

    Abstract: Inverse molecular design with diffusion models holds great potential for advancements in material and drug discovery. Despite success in unconditional molecule generation, integrating multiple properties such as synthetic score and gas permeability as condition constraints into diffusion models remains unexplored. We present the Graph Diffusion Transformer (Graph DiT) for multi-conditional molecul…

    Submitted 6 May, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: 21 pages, 9 figures, 7 tables

  29. arXiv:2401.11944  [pdf, other]

    cs.CL cs.AI cs.CV

    CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark

    Authors: Ge Zhang, Xinrun Du, Bei Chen, Yiming Liang, Tongxu Luo, Tianyu Zheng, Kang Zhu, Yuyang Cheng, Chunpu Xu, Shuyue Guo, Haoran Zhang, Xingwei Qu, Junjie Wang, Ruibin Yuan, Yizhi Li, Zekun Wang, Yudong Liu, Yu-Hsuan Tsai, Fengji Zhang, Chenghua Lin, Wenhao Huang, Wenhu Chen, Jie Fu

    Abstract: As the capabilities of large multimodal models (LMMs) continue to advance, evaluating the performance of LMMs emerges as an increasing need. Additionally, there is an even larger gap in evaluating the advanced knowledge and reasoning abilities of LMMs in non-English contexts such as Chinese. We introduce CMMMU, a new Chinese Massive Multi-discipline Multimodal Understanding benchmark designed to e…

    Submitted 18 March, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

  30. arXiv:2401.10153  [pdf, other]

    cs.NI cs.CV

    Importance-Aware Image Segmentation-based Semantic Communication for Autonomous Driving

    Authors: Jie Lv, Haonan Tong, Qiang Pan, Zhilong Zhang, Xinxin He, Tao Luo, Changchuan Yin

    Abstract: This article studies the problem of image segmentation-based semantic communication in autonomous driving. In real traffic scenes, detecting the key objects (e.g., vehicles, pedestrians and obstacles) is more crucial than that of other objects to guarantee driving safety. Therefore, we propose a vehicular image segmentation-oriented semantic communication system, termed VIS-SemCom, where image seg…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 10 pages, 8 figures

  31. arXiv:2401.01578  [pdf, other]

    cs.CV

    Context-Guided Spatio-Temporal Video Grounding

    Authors: Xin Gu, Heng Fan, Yan Huang, Tiejian Luo, Libo Zhang

    Abstract: Spatio-temporal video grounding (or STVG) task aims at locating a spatio-temporal tube for a specific instance given a text query. Despite advancements, current methods easily suffer from distractors or heavy object appearance variations in videos due to insufficient object information from the text, leading to degradation. Addressing this, we propose a novel framework, context-guided STVG (CG-STVG…

    Submitted 3 January, 2024; originally announced January 2024.

  32. arXiv:2401.00685  [pdf, other]

    cs.LG cs.AI cs.DC

    Communication-Efficient Federated Learning for LEO Satellite Networks Integrated with HAPs Using Hybrid NOMA-OFDM

    Authors: Mohamed Elmahallawy, Tie Luo, Khaled Ramadan

    Abstract: Space AI has become increasingly important and sometimes even necessary for government, businesses, and society. An active research topic under this mission is integrating federated learning (FL) with satellite communications (SatCom) so that numerous low Earth orbit (LEO) satellites can collaboratively train a machine learning model. However, the special communication environment of SatCom leads…

    Submitted 16 February, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

  33. arXiv:2401.00161  [pdf, other]

    cs.LG cs.AI

    DiffHybrid-UQ: Uncertainty Quantification for Differentiable Hybrid Neural Modeling

    Authors: Deepak Akhare, Tengfei Luo, Jian-Xun Wang

    Abstract: The hybrid neural differentiable models mark a significant advancement in the field of scientific machine learning. These models, integrating numerical representations of known physics into deep neural networks, offer enhanced predictive capabilities and show great potential for data-driven modeling of complex physical systems. However, a critical and yet unaddressed challenge lies in the quantifi…

    Submitted 30 December, 2023; originally announced January 2024.

  34. arXiv:2312.13555  [pdf, other]

    cs.LG cs.CV

    CR-SAM: Curvature Regularized Sharpness-Aware Minimization

    Authors: Tao Wu, Tie Luo, Donald C. Wunsch

    Abstract: The capacity to generalize to future unseen data stands as one of the most crucial attributes of deep neural networks. Sharpness-Aware Minimization (SAM) aims to enhance the generalizability by minimizing worst-case loss using one-step gradient ascent as an approximation. However, as training progresses, the non-linearity of the loss landscape increases, rendering one-step gradient ascent less e…

    Submitted 23 December, 2023; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: AAAI 2024, main track. Code available on Github. Appendix is also included in this updated version

  35. arXiv:2312.13118  [pdf, other]

    cs.LG cs.CR

    LRS: Enhancing Adversarial Transferability through Lipschitz Regularized Surrogate

    Authors: Tao Wu, Tie Luo, Donald C. Wunsch

    Abstract: The transferability of adversarial examples is of central importance to transfer-based black-box adversarial attacks. Previous works for generating transferable adversarial examples focus on attacking given pretrained surrogate models while the connections between surrogate models and adversarial transferability have been overlooked. In this paper, we propose Lipschitz Regularized Surro…

    Submitted 21 January, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: AAAI 2024 main track. Code available on Github (see abstract). Appendix is included in this updated version

  36. arXiv:2312.12273  [pdf, other]

    cs.CV

    VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering

    Authors: Chun-Mei Feng, Yang Bai, Tao Luo, Zhen Li, Salman Khan, Wangmeng Zuo, Xinxing Xu, Rick Siow Mong Goh, Yong Liu

    Abstract: Although progress has been made in Composed Image Retrieval (CIR), we empirically find that a certain percentage of failure retrieval results are not consistent with their relative captions. To address this issue, this work provides a Visual Question Answering (VQA) perspective to boost the performance of CIR. The resulting VQA4CIR is a post-processing approach and can be directly plugged into exist…

    Submitted 19 December, 2023; originally announced December 2023.

  37. arXiv:2311.07798  [pdf, other]

    cs.LG

    Probabilistic Physics-integrated Neural Differentiable Modeling for Isothermal Chemical Vapor Infiltration Process

    Authors: Deepak Akhare, Zeping Chen, Richard Gulotty, Tengfei Luo, Jian-Xun Wang

    Abstract: Chemical vapor infiltration (CVI) is a widely adopted manufacturing technique used in producing carbon-carbon and carbon-silicon carbide composites. These materials are especially valued in the aerospace and automotive industries for their robust strength and lightweight characteristics. The densification process during CVI critically influences the final performance, quality, and consistency of t…

    Submitted 13 November, 2023; originally announced November 2023.

  38. arXiv:2311.02382  [pdf, other]

    cs.DC cs.AI

    Ultra-Long Sequence Distributed Transformer

    Authors: Xiao Wang, Isaac Lyngaas, Aristeidis Tsaris, Peng Chen, Sajal Dash, Mayanka Chandra Shekar, Tao Luo, Hong-Jun Yoon, Mohamed Wahib, John Gouley

    Abstract: Transformer models trained on long sequences often achieve higher accuracy than short sequences. Unfortunately, conventional transformers struggle with long sequence training due to the overwhelming computation and memory requirements. Existing methods for long sequence training offer limited speedup and memory reduction, and may compromise accuracy. This paper presents a novel and efficient distr…

    Submitted 8 November, 2023; v1 submitted 4 November, 2023; originally announced November 2023.

  39. arXiv:2310.17331  [pdf]

    cs.CE

    A novel solution for seepage problems using physics-informed neural networks

    Authors: Tianfu Luo, Yelin Feng, Qingfu Huang, Zongliang Zhang, Mingjiao Yan, Zaihong Yang, Dawei Zheng, Yang Yang

    Abstract: A Physics-Informed Neural Network (PINN) provides a distinct advantage by synergizing neural networks' capabilities with the problem's governing physical laws. In this study, we introduce an innovative approach for solving seepage problems by utilizing the PINN, harnessing the capabilities of Deep Neural Networks (DNNs) to approximate hydraulic head distributions in seepage analysis. To effectivel…

    Submitted 25 November, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

  40. arXiv:2310.16936  [pdf, other]

    cs.CV cs.LG

    Diagnosing Alzheimer's Disease using Early-Late Multimodal Data Fusion with Jacobian Maps

    Authors: Yasmine Mustafa, Tie Luo

    Abstract: Alzheimer's disease (AD) is a prevalent and debilitating neurodegenerative disorder impacting a large aging population. Detecting AD in all its presymptomatic and symptomatic stages is crucial for early intervention and treatment. An active research direction is to explore machine learning methods that harness multimodal data fusion to outperform human inspection of medical scans. However, existin…

    Submitted 27 October, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

    Comments: To be published in Proceedings of 2023 IEEE Healthcom, December 2023

  41. arXiv:2310.15075  [pdf, other]

    cs.CL

    TableQAKit: A Comprehensive and Practical Toolkit for Table-based Question Answering

    Authors: Fangyu Lei, Tongxu Luo, Pengqi Yang, Weihao Liu, Hanwen Liu, Jiahe Lei, Yiming Huang, Yifan Wei, Shizhu He, Jun Zhao, Kang Liu

    Abstract: Table-based question answering (TableQA) is an important task in natural language processing, which requires comprehending tables and employing various reasoning ways to answer the questions. This paper introduces TableQAKit, the first comprehensive toolkit designed specifically for TableQA. The toolkit designs a unified platform that includes plentiful TableQA datasets and integrates popular meth…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Work in progress

  42. arXiv:2310.06603  [pdf, other]

    cs.AI cs.CV cs.RO

    V2X-AHD:Vehicle-to-Everything Cooperation Perception via Asymmetric Heterogenous Distillation Network

    Authors: Caizhen He, Hai Wang, Long Chen, Tong Luo, Yingfeng Cai

    Abstract: Object detection is the central issue of intelligent traffic systems, and recent advancements in single-vehicle lidar-based 3D detection indicate that it can provide accurate position information for intelligent agents to make decisions and plan. Compared with single-vehicle perception, multi-view vehicle-road cooperation perception has fundamental advantages, such as the elimination of blind spot…

    Submitted 10 October, 2023; originally announced October 2023.

  43. arXiv:2309.15431  [pdf, other]

    cs.CV cs.AI

    Local Compressed Video Stream Learning for Generic Event Boundary Detection

    Authors: Libo Zhang, Xin Gu, Congcong Li, Tiejian Luo, Heng Fan

    Abstract: Generic event boundary detection aims to localize the generic, taxonomy-free event boundaries that segment videos into chunks. Existing methods typically require video frames to be decoded before feeding into the network, which contains significant spatio-temporal redundancy and demands considerable computational power and storage space. To remedy these issues, we propose a novel compressed video…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: Accepted by IJCV. arXiv admin note: substantial text overlap with arXiv:2203.15336

  44. arXiv:2309.12669  [pdf, other]

    cs.CL

    HRoT: Hybrid prompt strategy and Retrieval of Thought for Table-Text Hybrid Question Answering

    Authors: Tongxu Luo, Fangyu Lei, Jiahe Lei, Weihao Liu, Shihu He, Jun Zhao, Kang Liu

    Abstract: Answering numerical questions over hybrid contents from the given tables and text (TextTableQA) is a challenging task. Recently, Large Language Models (LLMs) have gained significant attention in the NLP community. With the emergence of large language models, In-Context Learning and Chain-of-Thought prompting have become two particularly popular research topics in this field. In this paper, we intro…

    Submitted 22 September, 2023; originally announced September 2023.

  45. Collaborative Three-Stream Transformers for Video Captioning

    Authors: Hao Wang, Libo Zhang, Heng Fan, Tiejian Luo

    Abstract: As the most critical components in a sentence, subject, predicate and object require special attention in the video captioning task. To implement this idea, we design a novel framework, named COllaborative three-Stream Transformers (COST), to model the three parts separately and complement each other for better representation. Specifically, COST is formed by three branches of transformers to explo…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: Accepted by CVIU

  46. arXiv:2309.04790  [pdf, other]

    cs.CL

    MMHQA-ICL: Multimodal In-context Learning for Hybrid Question Answering over Text, Tables and Images

    Authors: Weihao Liu, Fangyu Lei, Tongxu Luo, Jiahe Lei, Shizhu He, Jun Zhao, Kang Liu

    Abstract: In the real world, knowledge often exists in a multimodal and heterogeneous form. Addressing the task of question answering with hybrid data types, including text, tables, and images, is a challenging task (MMHQA). Recently, with the rise of large language models (LLM), in-context learning (ICL) has become the most popular way to solve QA problems. We propose MMHQA-ICL framework for addressing thi…

    Submitted 9 September, 2023; originally announced September 2023.

  47. arXiv:2309.01828  [pdf, other]

    cs.CR cs.LG

    Secure and Efficient Federated Learning in LEO Constellations using Decentralized Key Generation and On-Orbit Model Aggregation

    Authors: Mohamed Elmahallawy, Tie Luo, Mohamed I. Ibrahem

    Abstract: Satellite technologies have advanced drastically in recent years, leading to a heated interest in launching small satellites into low Earth orbit (LEOs) to collect massive data such as satellite imagery. Downloading these data to a ground station (GS) to perform centralized learning to build an AI model is not practical due to the limited and expensive bandwidth. Federated learning (FL) offers a p…

    Submitted 4 September, 2023; originally announced September 2023.

  48. arXiv:2309.00508  [pdf, other]

    cs.LG math.DS

    Local Recovery of Two-layer Neural Networks at Overparameterization

    Authors: Leyang Zhang, Yaoyu Zhang, Tao Luo

    Abstract: Under mild assumptions, we investigate the structure of loss landscape of two-layer neural networks near global minima, determine the set of parameters which recovers the target function, and characterize the gradient flows around it. With novel techniques, our work uncovers some simple aspects of the complicated loss landscape and reveals how model, target function, samples and initialization aff…

    Submitted 18 June, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

  49. arXiv:2308.08230  [pdf, other]

    cs.LG

    Exploring Winograd Convolution for Cost-effective Neural Network Fault Tolerance

    Authors: Xinghua Xue, Cheng Liu, Bo Liu, Haitong Huang, Ying Wang, Tao Luo, Lei Zhang, Huawei Li, Xiaowei Li

    Abstract: Winograd is generally utilized to optimize convolution performance and computational efficiency because of the reduced multiplication operations, but the reliability issues brought by winograd are usually overlooked. In this work, we observe the great potential of winograd convolution in improving neural network (NN) fault tolerance. Based on the observation, we evaluate winograd convolution fault…

    Submitted 16 August, 2023; originally announced August 2023.

  50. arXiv:2308.08182  [pdf, other]

    cs.CV

    Unsupervised Domain Adaptive Detection with Network Stability Analysis

    Authors: Wenzhang Zhou, Heng Fan, Tiejian Luo, Libo Zhang

    Abstract: Domain adaptive detection aims to improve the generality of a detector, learned from the labeled source domain, on the unlabeled target domain. In this work, drawing inspiration from the concept of stability from the control theory that a robust system requires to remain consistent both externally and internally regardless of disturbances, we propose a novel framework that achieves unsupervised do…

    Submitted 16 August, 2023; originally announced August 2023.

    Journal ref: ICCV, 2023