Showing 1–50 of 739 results for author: Yu, L

Searching in archive cs.
  1. arXiv:2406.18995  [pdf, other]

    cs.LG cs.AI

    FedMLP: Federated Multi-Label Medical Image Classification under Task Heterogeneity

    Authors: Zhaobin Sun, Nannan Wu, Junjie Shi, Li Yu, Xin Yang, Kwang-Ting Cheng, Zengqiang Yan

    Abstract: Cross-silo federated learning (FL) enables decentralized organizations to collaboratively train models while preserving data privacy and has made significant progress in medical image classification. One common assumption is task homogeneity where each client has access to all classes during training. However, in clinical practice, given a multi-label classification task, constrained by the level…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Early accepted by MICCAI 2024

  2. arXiv:2406.18992  [pdf, other]

    cs.CV cs.AI cs.LG

    Semi-supervised Concept Bottleneck Models

    Authors: Lijie Hu, Tianhao Huang, Huanyi Xie, Chenyang Ren, Zhengyu Hu, Lu Yu, Di Wang

    Abstract: Concept Bottleneck Models (CBMs) have garnered increasing attention due to their ability to provide concept-based explanations for black-box deep learning models while achieving high final prediction accuracy using human-like concepts. However, the training of current CBMs heavily relies on the accuracy and richness of annotated concepts in the dataset. These concept labels are typically provided…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 17 pages

  3. arXiv:2406.17582  [pdf, other]

    cs.HC cs.GR cs.MM

    Crafting Dynamic Virtual Activities with Advanced Multimodal Models

    Authors: Changyang Li, Lap-Fai Yu

    Abstract: In this paper, we investigate the use of large multimodal models (LMMs) for generating virtual activities, leveraging the integration of vision-language modalities to enable the interpretation of virtual environments. This approach not only facilitates the recognition of scene layouts, semantic contexts, and object identities, but also empowers LMMs to abstract the elements of a scene. By correlat…

    Submitted 15 March, 2024; originally announced June 2024.

  4. arXiv:2406.17274  [pdf, other]

    cs.CL cs.LG

    Can We Trust the Performance Evaluation of Uncertainty Estimation Methods in Text Summarization?

    Authors: Jianfeng He, Runing Yang, Linlin Yu, Changbin Li, Ruoxi Jia, Feng Chen, Ming Jin, Chang-Tien Lu

    Abstract: Text summarization, a key natural language generation (NLG) task, is vital in various domains. However, the high cost of inaccurate summaries in risk-critical applications, particularly those involving human-in-the-loop decision-making, raises concerns about the reliability of uncertainty estimation on text summarization (UE-TS) evaluation methods. This concern stems from the dependency of uncerta…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 63 pages, 41 figures, 11 tables

  5. arXiv:2406.15034  [pdf, other]

    cs.CV

    SVFormer: A Direct Training Spiking Transformer for Efficient Video Action Recognition

    Authors: Liutao Yu, Liwei Huang, Chenlin Zhou, Han Zhang, Zhengyu Ma, Huihui Zhou, Yonghong Tian

    Abstract: Video action recognition (VAR) plays a crucial role in various domains such as surveillance, healthcare, and industrial automation, making it highly significant for society. Consequently, it has long been a research hotspot in the computer vision field. As artificial neural networks (ANNs) are flourishing, convolutional neural networks (CNNs), including 2D-CNNs and 3D-CNNs, as well as variants of th…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted by IJCAI 2024 workshop - Human Brain and Artificial Intelligence

  6. arXiv:2406.14844  [pdf, other]

    cs.LG cs.AI

    DN-CL: Deep Symbolic Regression against Noise via Contrastive Learning

    Authors: Jingyi Liu, Yanjie Li, Lina Yu, Min Wu, Weijun Li, Wenqiang Li, Meilan Hao, Yusong Deng, Shu Wei

    Abstract: Noise ubiquitously exists in signals due to numerous factors including physical, electronic, and environmental effects. Traditional methods of symbolic regression, such as genetic programming or deep learning models, aim to find the most fitting expressions for these signals. However, these methods often overlook the noise present in real-world data, leading to reduced fitting accuracy. To tackle…

    Submitted 20 June, 2024; originally announced June 2024.

  7. arXiv:2406.14045  [pdf, other]

    cs.LG cs.AI

    Understanding Different Design Choices in Training Large Time Series Models

    Authors: Yu-Neng Chuang, Songchen Li, Jiayi Yuan, Guanchu Wang, Kwei-Herng Lai, Leisheng Yu, Sirui Ding, Chia-Yuan Chang, Qiaoyu Tan, Daochen Zha, Xia Hu

    Abstract: Inspired by Large Language Models (LLMs), Time Series Forecasting (TSF), a long-standing task in time series analysis, is undergoing a transition towards Large Time Series Models (LTSMs), aiming to train universal transformer-based models for TSF. However, training LTSMs on heterogeneous time series data poses unique challenges, including diverse frequencies, dimensions, and patterns across datase…

    Submitted 20 June, 2024; originally announced June 2024.

  8. arXiv:2406.13783  [pdf, ps, other]

    econ.TH cs.GT

    Nash equilibria of quasisupermodular games

    Authors: Lu Yu

    Abstract: We prove three results on the existence and structure of Nash equilibria for quasisupermodular games. One theorem is purely order-theoretic, and the other two involve topological hypotheses. Our topological results generalize Zhou's theorem (for supermodular games) and Calciano's theorem.

    Submitted 19 June, 2024; originally announced June 2024.

  9. arXiv:2406.13565  [pdf, other]

    cs.CV cs.CR

    Exploring Multi-view Pixel Contrast for General and Robust Image Forgery Localization

    Authors: Zijie Lou, Gang Cao, Kun Guo, Haochen Zhu, Lifang Yu

    Abstract: Image forgery localization, which aims to segment tampered regions in an image, is a fundamental yet challenging digital forensic task. While some deep learning-based forensic methods have achieved impressive results, they directly learn pixel-to-label mappings without fully exploiting the relationship between pixels in the feature space. To address such deficiency, we propose a Multi-view Pixel-w…

    Submitted 19 June, 2024; originally announced June 2024.

  10. arXiv:2406.12569  [pdf, other]

    cs.LG

    MOYU: A Theoretical Study on Massive Over-activation Yielded Uplifts in LLMs

    Authors: Chi Ma, Mincong Huang, Chao Wang, Yujie Wang, Lei Yu, Chuan Liu, Wei Lin

    Abstract: Massive Over-activation Yielded Uplifts (MOYU) is an inherent property of large language models, and dynamic activation (DA) based on the MOYU property is a clever yet under-explored strategy designed to accelerate inference in these models. Existing methods that utilize MOYU often face a significant 'Impossible Trinity': struggling to simultaneously maintain model performance, enhance inference spe…

    Submitted 18 June, 2024; originally announced June 2024.

  11. arXiv:2406.11614  [pdf, other]

    cs.CL cs.AI

    Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces

    Authors: Yihuai Hong, Lei Yu, Shauli Ravfogel, Haiqin Yang, Mor Geva

    Abstract: The task of "unlearning" certain concepts in large language models (LLMs) has attracted immense attention recently, due to its importance for mitigating undesirable model behaviours, such as the generation of harmful, private, or incorrect information. Current protocols to evaluate unlearning methods largely rely on behavioral tests, without monitoring the presence of unlearned knowledge within th…

    Submitted 17 June, 2024; originally announced June 2024.

  12. arXiv:2406.10655  [pdf, ps, other]

    cs.CR

    E-SAGE: Explainability-based Defense Against Backdoor Attacks on Graph Neural Networks

    Authors: Dingqiang Yuan, Xiaohua Xu, Lei Yu, Tongchang Han, Rongchang Li, Meng Han

    Abstract: Graph Neural Networks (GNNs) have recently been widely adopted in multiple domains. Yet, they are notably vulnerable to adversarial and backdoor attacks. In particular, backdoor attacks based on subgraph insertion have been shown to be effective in graph classification tasks while being stealthy, successfully circumventing various existing defense methods. In this paper, we propose E-SAGE, a novel…

    Submitted 15 June, 2024; originally announced June 2024.

  13. arXiv:2406.09904  [pdf, other]

    cs.LG

    QQQ: Quality Quattuor-Bit Quantization for Large Language Models

    Authors: Ying Zhang, Peng Zhang, Mincong Huang, Jingyang Xiang, Yujie Wang, Chao Wang, Yineng Zhang, Lei Yu, Chuan Liu, Wei Lin

    Abstract: Quantization is a proven effective method for compressing large language models. Although popular techniques like W8A8 and W4A16 effectively maintain model performance, they often fail to concurrently speed up the prefill and decoding stages of inference. W4A8 is a promising strategy to accelerate both of them, but it usually leads to a significant performance degradation. To address these issues, w…

    Submitted 14 June, 2024; originally announced June 2024.

  14. arXiv:2406.05410  [pdf, other]

    cs.AI cs.CL

    MLLM-SR: Conversational Symbolic Regression base Multi-Modal Large Language Models

    Authors: Yanjie Li, Weijun Li, Lina Yu, Min Wu, Jingyi Liu, Wenqiang Li, Shu Wei, Yusong Deng

    Abstract: Formulas are the language of communication between humans and nature. It is an important research topic of artificial intelligence to find expressions from observed data to reflect the relationship between each variable in the data, which is called a symbolic regression problem. The existing symbolic regression methods directly generate expressions according to the given observation data, and we c…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 13 pages

  15. arXiv:2406.04680  [pdf, other]

    eess.IV cs.CV

    MTS-Net: Dual-Enhanced Positional Multi-Head Self-Attention for 3D CT Diagnosis of May-Thurner Syndrome

    Authors: Yixin Huang, Yiqi Jin, Ke Tao, Kaijian Xia, Jianfeng Gu, Lei Yu, Lan Du, Cunjian Chen

    Abstract: May-Thurner Syndrome (MTS), also known as iliac vein compression syndrome or Cockett's syndrome, is a condition potentially impacting over 20 percent of the population, leading to an increased risk of iliofemoral deep venous thrombosis. In this paper, we present a 3D-based deep learning approach called MTS-Net for diagnosing May-Thurner Syndrome using CT scans. To effectively capture the spatial-t…

    Submitted 7 June, 2024; originally announced June 2024.

  16. arXiv:2406.01062  [pdf, other]

    cs.CV

    SceneTextGen: Layout-Agnostic Scene Text Image Synthesis with Diffusion Models

    Authors: Qilong Zhangli, Jindong Jiang, Di Liu, Licheng Yu, Xiaoliang Dai, Ankit Ramchandani, Guan Pang, Dimitris N. Metaxas, Praveen Krishnan

    Abstract: While diffusion models have significantly advanced the quality of image generation, their capability to accurately and coherently render text within these images remains a substantial challenge. Conventional diffusion-based methods for scene text generation are typically limited by their reliance on an intermediate layout output. This dependency often results in a constrained diversity of text sty…

    Submitted 10 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  17. arXiv:2406.00588  [pdf, other]

    cs.LG cs.CR math.ST

    Generalization Bound and New Algorithm for Clean-Label Backdoor Attack

    Authors: Lijia Yu, Shuang Liu, Yibo Miao, Xiao-Shan Gao, Lijun Zhang

    Abstract: The generalization bound is a crucial theoretical tool for assessing the generalizability of learning methods, and there exists a vast literature on the generalizability of normal learning, adversarial learning, and data poisoning. Unlike other data poisoning attacks, the backdoor attack has the special property that the poisoned triggers are contained in both the training set and the test set and the purpo…

    Submitted 1 June, 2024; originally announced June 2024.

  18. arXiv:2405.20986  [pdf, other]

    cs.LG cs.CV

    Uncertainty Quantification for Bird's Eye View Semantic Segmentation: Methods and Benchmarks

    Authors: Linlin Yu, Bowen Yang, Tianhao Wang, Kangshuo Li, Feng Chen

    Abstract: The fusion of raw features from multiple sensors on an autonomous vehicle to create a Bird's Eye View (BEV) representation is crucial for planning and control systems. There is growing interest in using deep learning models for BEV semantic segmentation. Anticipating segmentation errors and improving the explainability of DNNs is essential for autonomous driving, yet it is under-studied. This pape…

    Submitted 31 May, 2024; originally announced May 2024.

  19. arXiv:2405.19761  [pdf, other]

    cs.AI

    Revisiting CNNs for Trajectory Similarity Learning

    Authors: Zhihao Chang, Linzhu Yu, Huan Li, Sai Wu, Gang Chen, Dongxiang Zhang

    Abstract: Similarity search is a fundamental but expensive operator in querying trajectory data, due to its quadratic complexity of distance computation. To mitigate the computational burden for long trajectories, neural networks have been widely employed for similarity learning and each trajectory is encoded as a high-dimensional vector for similarity search with linear complexity. Given the sequential nat…

    Submitted 30 May, 2024; originally announced May 2024.

  20. arXiv:2405.19327  [pdf, other]

    cs.CL cs.AI cs.LG

    MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

    Authors: Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chou Leuang Yu, Danny Pan, Esther Cheng, Jie Liu, Qunshu Lin, Raven Yuan, Tuney Zheng, Wei Pang, Xinrun Du, Yiming Liang, Yinghao Ma, Yizhi Li, Ziyang Ma, Bill Lin, Emmanouil Benetos, Huan Yang, Junting Zhou, Kaijing Ma, Minghao Liu, Morry Niu, et al. (20 additional authors not shown)

    Abstract: Large Language Models (LLMs) have made great strides in recent years to achieve unprecedented performance across different tasks. However, due to commercial interest, the most competitive models like GPT, Gemini, and Claude have been gated behind proprietary interfaces without disclosing the training details. Recently, many institutions have open-sourced several strong LLMs like LLaMA-3, comparabl…

    Submitted 2 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: https://map-neo.github.io/

  21. arXiv:2405.18997  [pdf, other]

    stat.ML cs.LG

    Kernel Semi-Implicit Variational Inference

    Authors: Ziheng Cheng, Longlin Yu, Tianyu Xie, Shiyue Zhang, Cheng Zhang

    Abstract: Semi-implicit variational inference (SIVI) extends traditional variational families with semi-implicit distributions defined in a hierarchical manner. Due to the intractable densities of semi-implicit distributions, classical SIVI often resorts to surrogates of evidence lower bound (ELBO) that would introduce biases for training. A recent advancement in SIVI, named SIVI-SM, utilizes an alternative…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: ICML 2024 camera ready

  22. arXiv:2405.18035  [pdf, other]

    cs.CL

    Instruction Tuning with Retrieval-based Examples Ranking for Aspect-based Sentiment Analysis

    Authors: Guangmin Zheng, Jin Wang, Liang-Chih Yu, Xuejie Zhang

    Abstract: Aspect-based sentiment analysis (ABSA) identifies sentiment information related to specific aspects and provides deeper market insights to businesses and organizations. With the emergence of large language models (LMs), recent studies have proposed using fixed examples for instruction tuning to reformulate ABSA as a generation task. However, the performance is sensitive to the selection of in-cont…

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: ACL Findings 2024

  23. arXiv:2405.16728  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    Towards Multi-Task Multi-Modal Models: A Video Generative Perspective

    Authors: Lijun Yu

    Abstract: Advancements in language foundation models have primarily fueled the recent surge in artificial intelligence. In contrast, generative learning of non-textual modalities, especially videos, significantly trails behind language modeling. This thesis chronicles our endeavor to build multi-task models for generating videos and other modalities under diverse conditions, as well as for understanding and…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: PhD thesis

  24. arXiv:2405.16577  [pdf, other]

    stat.ML cs.LG

    Reflected Flow Matching

    Authors: Tianyu Xie, Yu Zhu, Longlin Yu, Tong Yang, Ziheng Cheng, Shiyue Zhang, Xiangyu Zhang, Cheng Zhang

    Abstract: Continuous normalizing flows (CNFs) learn an ordinary differential equation to transform prior samples into data. Flow matching (FM) has recently emerged as a simulation-free approach for training CNFs by regressing a velocity model towards the conditional velocity field. However, on constrained domains, the learned velocity model may lead to undesirable flows that result in highly unnatural sampl…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: ICML 2024 camera-ready

  25. arXiv:2405.15984  [pdf, other]

    cs.CL cs.AI

    Evaluating the Adversarial Robustness of Retrieval-Based In-Context Learning for Large Language Models

    Authors: Simon Chi Lok Yu, Jie He, Pasquale Minervini, Jeff Z. Pan

    Abstract: With the emergence of large language models, such as LLaMA and OpenAI GPT-3, In-Context Learning (ICL) gained significant attention due to its effectiveness and efficiency. However, ICL is very sensitive to the choice, order, and verbaliser used to encode the demonstrations in the prompt. Retrieval-Augmented ICL methods try to address this problem by leveraging retrievers to extract semantically r…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 29 pages, 6 figures

  26. arXiv:2405.15549  [pdf, other]

    cs.CV

    SEP: Self-Enhanced Prompt Tuning for Visual-Language Model

    Authors: Hantao Yao, Rui Zhang, Lu Yu, Changsheng Xu

    Abstract: Prompt tuning based on Context Optimization (CoOp) effectively adapts visual-language models (VLMs) to downstream tasks by inferring additional learnable prompt tokens. However, these tokens are less discriminative as they are independent of the pre-trained tokens and fail to capture input-specific knowledge, such as class-aware textual or instance-aware visual knowledge. Leveraging the discrimina…

    Submitted 30 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  27. arXiv:2405.15452  [pdf, other]

    cs.CL cs.AI cs.LG

    Leveraging Logical Rules in Knowledge Editing: A Cherry on the Top

    Authors: Keyuan Cheng, Muhammad Asif Ali, Shu Yang, Gang Lin, Yuxuan Zhai, Haoyang Fei, Ke Xu, Lu Yu, Lijie Hu, Di Wang

    Abstract: Multi-hop Question Answering (MQA) under knowledge editing (KE) is a key challenge in Large Language Models (LLMs). While best-performing solutions in this domain use a plan-and-solve paradigm to split a question into sub-questions followed by response generation, we claim that this approach is sub-optimal as it fails for hard-to-decompose questions, and it does not explicitly cater to correlated…

    Submitted 27 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 18 pages

  28. arXiv:2405.15379  [pdf, ps, other]

    stat.ML cs.LG math.PR math.ST

    Log-Concave Sampling on Compact Supports: A Versatile Proximal Framework

    Authors: Lu Yu

    Abstract: In this paper, we explore sampling from strongly log-concave distributions defined on convex and compact supports. We propose a general proximal framework that involves projecting onto the constrained set, which is highly flexible and supports various projection options. Specifically, we consider the cases of Euclidean and Gauge projections, with the latter having the advantage of being performed…

    Submitted 24 May, 2024; originally announced May 2024.

  29. arXiv:2405.14620  [pdf, other]

    cs.LG

    Closed-form Symbolic Solutions: A New Perspective on Solving Partial Differential Equations

    Authors: Shu Wei, Yanjie Li, Lina Yu, Min Wu, Weijun Li, Meilan Hao, Wenqiang Li, Jingyi Liu, Yusong Deng

    Abstract: Solving partial differential equations (PDEs) in Euclidean space with closed-form symbolic solutions has long been a dream for mathematicians. Inspired by deep learning, Physics-Informed Neural Networks (PINNs) have shown great promise in numerically solving PDEs. However, since PINNs essentially approximate solutions within the continuous function space, their numerical solutions fall short in bo…

    Submitted 23 May, 2024; originally announced May 2024.

  30. arXiv:2405.13762  [pdf, other]

    cs.CV cs.LG cs.MM cs.SD eess.AS

    A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation

    Authors: Gwanghyun Kim, Alonso Martinez, Yu-Chuan Su, Brendan Jou, José Lezama, Agrim Gupta, Lijun Yu, Lu Jiang, Aren Jansen, Jacob Walker, Krishna Somandepalli

    Abstract: Training diffusion models for audiovisual sequences allows for a range of generation tasks by learning conditional distributions of various input-output combinations of the two modalities. Nevertheless, this strategy often requires training a separate model for each task which is expensive. Here, we propose a novel training approach to effectively learn arbitrary conditional distributions in the a…

    Submitted 22 May, 2024; originally announced May 2024.

  31. arXiv:2405.10620  [pdf, other]

    cs.AI cs.CL cs.CV

    MC-GPT: Empowering Vision-and-Language Navigation with Memory Map and Reasoning Chains

    Authors: Zhaohuan Zhan, Lisha Yu, Sijie Yu, Guang Tan

    Abstract: In the Vision-and-Language Navigation (VLN) task, the agent is required to navigate to a destination following a natural language instruction. While learning-based approaches have been a major solution to the task, they suffer from high training costs and lack of interpretability. Recently, Large Language Models (LLMs) have emerged as a promising tool for VLN due to their strong generalization cap…

    Submitted 17 May, 2024; originally announced May 2024.

  32. arXiv:2405.10166  [pdf, other]

    cs.CL cs.PF

    LFED: A Literary Fiction Evaluation Dataset for Large Language Models

    Authors: Linhao Yu, Qun Liu, Deyi Xiong

    Abstract: The rapid evolution of large language models (LLMs) has ushered in the need for comprehensive assessments of their performance across various dimensions. In this paper, we propose LFED, a Literary Fiction Evaluation Dataset, which aims to evaluate the capability of LLMs in long fiction comprehension and reasoning. We collect 95 literary fictions that are either originally written in Chinese or…

    Submitted 16 May, 2024; originally announced May 2024.

  33. arXiv:2405.09882  [pdf, other]

    cs.CV cs.AI

    DiffAM: Diffusion-based Adversarial Makeup Transfer for Facial Privacy Protection

    Authors: Yuhao Sun, Lingyun Yu, Hongtao Xie, Jiaming Li, Yongdong Zhang

    Abstract: With the rapid development of face recognition (FR) systems, the privacy of face images on social media is facing severe challenges due to the abuse of unauthorized FR systems. Some studies utilize adversarial attack techniques to defend against malicious FR systems by generating adversarial examples. However, the generated adversarial examples, i.e., the protected face images, tend to suffer from…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 16 pages, 11 figures

  34. arXiv:2405.09274  [pdf, other]

    cs.LG

    Dynamic Activation Pitfalls in LLaMA Models: An Empirical Study

    Authors: Chi Ma, Mincong Huang, Chao Wang, Yujie Wang, Lei Yu

    Abstract: In this work, we systematically investigate the efficacy of dynamic activation mechanisms within the LLaMA family of language models. Despite the potential of dynamic activation methods to reduce computation and increase speed in models using the ReLU activation function, our empirical findings have uncovered several inherent pitfalls in the current dynamic activation schemes. Through extensive ex…

    Submitted 15 May, 2024; originally announced May 2024.

  35. arXiv:2405.09113  [pdf, ps, other]

    cs.LG

    Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization

    Authors: Kai Hu, Weichen Yu, Tianjun Yao, Xiang Li, Wenhe Liu, Lijun Yu, Yining Li, Kai Chen, Zhiqiang Shen, Matt Fredrikson

    Abstract: Recent research indicates that large language models (LLMs) are susceptible to jailbreaking attacks that can generate harmful content. This paper introduces a novel token-level attack method, Adaptive Dense-to-Sparse Constrained Optimization (ADC), which effectively jailbreaks several open-source LLMs. Our approach relaxes the discrete jailbreak optimization into a continuous optimization and prog…

    Submitted 15 May, 2024; originally announced May 2024.

  36. arXiv:2405.07905  [pdf, other]

    eess.IV cs.CV

    PLUTO: Pathology-Universal Transformer

    Authors: Dinkar Juyal, Harshith Padigela, Chintan Shah, Daniel Shenker, Natalia Harguindeguy, Yi Liu, Blake Martin, Yibo Zhang, Michael Nercessian, Miles Markey, Isaac Finberg, Kelsey Luu, Daniel Borders, Syed Ashar Javed, Emma Krause, Raymond Biju, Aashish Sood, Allen Ma, Jackson Nyman, John Shamshoian, Guillaume Chhor, Darpan Sanghavi, Marc Thibault, Limin Yu, Fedaa Najdawi, et al. (8 additional authors not shown)

    Abstract: Pathology is the study of microscopic inspection of tissue, and a pathology diagnosis is often the medical gold standard to diagnose disease. Pathology images provide a unique challenge for computer-vision-based analysis: a single pathology Whole Slide Image (WSI) is gigapixel-sized and often contains hundreds of thousands to millions of objects of interest across multiple resolutions. In this wor…

    Submitted 13 May, 2024; originally announced May 2024.

  37. arXiv:2405.06944  [pdf, other]

    cs.CV

    Learning Monocular Depth from Focus with Event Focal Stack

    Authors: Chenxu Jiang, Mingyuan Lin, Chi Zhang, Zhenghai Wang, Lei Yu

    Abstract: Depth from Focus estimates depth by determining the moment of maximum focus from multiple shots at different focal distances, i.e., the Focal Stack. However, the limited sampling rate of conventional optical cameras makes it difficult to obtain sufficient focus cues during the focal sweep. Inspired by biological vision, the event camera records intensity changes over time with extremely low latency,…

    Submitted 11 May, 2024; originally announced May 2024.

  38. arXiv:2405.06918  [pdf, other]

    cs.CV

    Super-Resolving Blurry Images with Events

    Authors: Chi Zhang, Mingyuan Lin, Xiang Zhang, Chenxu Jiang, Lei Yu

    Abstract: Super-resolution from motion-blurred images poses a significant challenge due to the combined effects of motion blur and low spatial resolution. To address this challenge, this paper introduces an Event-based Blurry Super Resolution Network (EBSR-Net), which leverages the high temporal resolution of events to mitigate motion blur and improve high-resolution image prediction. Specifically, we propo…

    Submitted 11 May, 2024; originally announced May 2024.

  39. arXiv:2405.06659  [pdf, other]

    q-bio.BM cs.AI cs.LG physics.chem-ph

    ControlMol: Adding Substruture Control To Molecule Diffusion Models

    Authors: Qi Zhengyang, Liu Zijing, Zhang Jiying, Cao He, Li Yu

    Abstract: Designing new molecules is an important task in the field of pharmaceuticals. Due to the vast design space of molecules, generating molecules conditioned on a specific sub-structure relevant to a particular function or therapeutic target is a crucial task in computer-aided drug design. In this paper, we present ControlMol, which adds sub-structure control to molecule generation with diffusion mode…

    Submitted 22 April, 2024; originally announced May 2024.

    Comments: 9 pages, 7 figures

  40. arXiv:2405.05119  [pdf, other]

    stat.ME cs.SI

    Combining Rollout Designs and Clustering for Causal Inference under Low-order Interference

    Authors: Mayleen Cortez-Rodriguez, Matthew Eichhorn, Christina Lee Yu

    Abstract: Estimating causal effects under interference is pertinent to many real-world settings. However, the true interference network may be unknown to the practitioner, precluding many existing techniques that leverage this information. A recent line of work with low-order potential outcomes models uses staggered rollout designs to obtain unbiased estimators that require no network information. However,…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 30 pages, 13 figures

    MSC Class: 62K99 (Primary); 62P30 (Secondary)

  41. arXiv:2405.04289  [pdf, ps, other]

    cs.NE

    Direct Training High-Performance Deep Spiking Neural Networks: A Review of Theories and Methods

    Authors: Chenlin Zhou, Han Zhang, Liutao Yu, Yumin Ye, Zhaokun Zhou, Liwei Huang, Zhengyu Ma, Xiaopeng Fan, Huihui Zhou, Yonghong Tian

    Abstract: Spiking neural networks (SNNs) offer a promising energy-efficient alternative to artificial neural networks (ANNs), in virtue of their high biological plausibility, rich spatial-temporal dynamics, and event-driven computation. The direct training algorithms based on the surrogate gradient method provide sufficient flexibility to design novel SNN architectures and explore the spatial-temporal dynam…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 29 pages

  42. arXiv:2405.01827  [pdf, other]

    cs.CL

    SoftMCL: Soft Momentum Contrastive Learning for Fine-grained Sentiment-aware Pre-training

    Authors: Jin Wang, Liang-Chih Yu, Xuejie Zhang

    Abstract: The pre-training for language models captures general language understanding but fails to distinguish the affective impact of a particular context to a specific word. Recent works have sought to introduce contrastive learning (CL) for sentiment-aware pre-training in acquiring affective information. Nevertheless, these methods present two significant limitations. First, the compatibility of the GPU…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted by LREC-COLING 2024

  43. arXiv:2404.19246  [pdf]

    cs.CR cs.AR

    Logistic Map Pseudo Random Number Generator in FPGA

    Authors: Mateo Jalen Andrew Calderon, Lee Jun Lei Lucas, Syarifuddin Azhar Bin Rosli, Stephanie See Hui Ying, Jarell Lim En Yu, Maoyang Xiang, T. Hui Teo

    Abstract: This project develops a pseudo-random number generator (PRNG) using the logistic map, implemented in Verilog HDL on an FPGA and processes its output through a Central Limit Theorem (CLT) function to achieve a Gaussian distribution. The system integrates additional FPGA modules for real-time interaction and visualisation, including a clock generator, UART interface, XADC, and a 7-segment display dr…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 10 pages, 6 figures

  44. arXiv:2404.17900  [pdf, other]

    cs.CV

    Unsupervised Anomaly Detection via Masked Diffusion Posterior Sampling

    Authors: Di Wu, Shicai Fan, Xue Zhou, Li Yu, Yuzhong Deng, Jianxiao Zou, Baihong Lin

    Abstract: Reconstruction-based methods have been commonly used for unsupervised anomaly detection, in which a normal image is reconstructed and compared with the given test image to detect and locate anomalies. Recently, diffusion models have shown promising applications for anomaly detection due to their powerful generative ability. However, these models lack strict mathematical support for normal image re…

    Submitted 27 April, 2024; originally announced April 2024.

    Journal ref: International Joint Conference on Artificial Intelligence 2024

  45. arXiv:2404.17805  [pdf, other]

    cs.LG cs.CV

    From Optimization to Generalization: Fair Federated Learning against Quality Shift via Inter-Client Sharpness Matching

    Authors: Nannan Wu, Zhuo Kuang, Zengqiang Yan, Li Yu

    Abstract: Due to escalating privacy concerns, federated learning has been recognized as a vital approach for training deep neural networks with decentralized medical data. In practice, it is challenging to ensure consistent imaging quality across various institutions, often attributed to equipment malfunctions affecting a minority of clients. This imbalance in image quality can cause the federated model to…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: This paper is accepted at IJCAI'24 (Main Track)

  46. arXiv:2404.13972  [pdf, other]

    cs.CV

    Non-Uniform Exposure Imaging via Neuromorphic Shutter Control

    Authors: Mingyuan Lin, Jian Liu, Chi Zhang, Zibo Zhao, Chu He, Lei Yu

    Abstract: By leveraging the blur-noise trade-off, imaging with non-uniform exposures largely extends the image acquisition flexibility in harsh environments. However, the limitation of conventional cameras in perceiving intra-frame dynamic information prevents existing methods from being implemented in the real-world frame acquisition for real-time adaptive camera shutter control. To address this challenge,…

    Submitted 22 April, 2024; originally announced April 2024.

  47. arXiv:2404.13550  [pdf, other]

    cs.CV eess.IV

    Pointsoup: High-Performance and Extremely Low-Decoding-Latency Learned Geometry Codec for Large-Scale Point Cloud Scenes

    Authors: Kang You, Kai Liu, Li Yu, Pan Gao, Dandan Ding

    Abstract: Despite considerable progress being achieved in point cloud geometry compression, there still remains a challenge in effectively compressing large-scale scenes with sparse surfaces. Another key challenge lies in reducing decoding latency, a crucial requirement in real-world application. In this paper, we propose Pointsoup, an efficient learning-based geometry codec that attains high-performance an…

    Submitted 21 April, 2024; originally announced April 2024.

  48. arXiv:2404.11979  [pdf, other]

    cs.CV

    MTGA: Multi-view Temporal Granularity aligned Aggregation for Event-based Lip-reading

    Authors: Wenhao Zhang, Jun Wang, Yong Luo, Lei Yu, Wei Yu, Zheng He

    Abstract: Lip-reading is to utilize the visual information of the speaker's lip movements to recognize words and sentences. Existing event-based lip-reading solutions integrate different frame rate branches to learn spatio-temporal features of varying granularities. However, aggregating events into event frames inevitably leads to the loss of fine-grained temporal information within frames. To remedy this d…

    Submitted 18 April, 2024; originally announced April 2024.

  49. arXiv:2404.08801  [pdf, other]

    cs.LG cs.CL

    Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

    Authors: Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, Hao Zhang, Jonathan May, Luke Zettlemoyer, Omer Levy, Chunting Zhou

    Abstract: The quadratic complexity and weak length extrapolation of Transformers limits their ability to scale to long sequences, and while sub-quadratic solutions like linear attention and state space models exist, they empirically underperform Transformers in pretraining efficiency and downstream task accuracy. We introduce Megalodon, a neural architecture for efficient sequence modeling with unlimited co…

    Submitted 16 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: 9 pages, 6 figures and 8 tables

  50. arXiv:2404.06330  [pdf, other]

    cs.LG cs.AI

    Generative Pre-Trained Transformer for Symbolic Regression Base In-Context Reinforcement Learning

    Authors: Yanjie Li, Weijun Li, Lina Yu, Min Wu, Jingyi Liu, Wenqiang Li, Meilan Hao, Shu Wei, Yusong Deng

    Abstract: The mathematical formula is the human language to describe nature and is the essence of scientific research. Finding mathematical formulas from observational data is a major demand of scientific research and a major challenge of artificial intelligence. This area is called symbolic regression. Originally symbolic regression was often formulated as a combinatorial optimization problem and solved us…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 21 pages