Showing 1–50 of 1,445 results for author: Yu, L

  1. arXiv:2407.19423  [pdf, ps, other]

    math.CO math.AT

    On Simplicial Complexes with Extremal Total Betti number and Total Bigraded Betti Number

    Authors: Pimeng Dai, Li Yu

    Abstract: We determine which simplicial complexes have the maximum or minimum sum of Betti numbers and sum of bigraded Betti numbers with a given number of vertices in each dimension.

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: 22 pages, 1 figure

    MSC Class: 05E45; 13F55

  2. arXiv:2407.18582  [pdf, ps, other]

    econ.TH

    Order-theoretical fixed point theorems for correspondences and application in game theory

    Authors: Lu Yu

    Abstract: For an ascending correspondence $F:X\to 2^X$ with chain-complete values on a complete lattice $X$, we prove that the set of fixed points is a complete lattice. This strengthens Zhou's fixed point theorem. For chain-complete posets that are not necessarily lattices, we generalize the Abian-Brown and the Markowsky fixed point theorems from single-valued maps to multivalued correspondences. We provid…

    Submitted 26 July, 2024; originally announced July 2024.

  3. arXiv:2407.18390  [pdf, other]

    eess.IV cs.CV

    Adapting Mouse Pathological Model to Human Glomerular Lesion Segmentation

    Authors: Lining Yu, Mengmeng Yin, Ruining Deng, Quan Liu, Tianyuan Yao, Can Cui, Yu Wang, Yaohong Wang, Shilin Zhao, Haichun Yang, Yuankai Huo

    Abstract: Moving from animal models to human applications in preclinical research encompasses a broad spectrum of disciplines in medical science. A fundamental element in the development of new drugs, treatments, diagnostic methods, and in deepening our understanding of disease processes is the accurate measurement of kidney tissues. Past studies have demonstrated the viability of translating glomeruli segm…

    Submitted 25 July, 2024; originally announced July 2024.

  4. arXiv:2407.17884  [pdf, ps, other]

    econ.TH

    Generalization of Zhou fixed point theorem

    Authors: Lu Yu

    Abstract: We give two generalizations of the Zhou fixed point theorem. They weaken the subcompleteness condition of values, and relax the ascending condition of the correspondence. As an application, we derive a generalization of Topkis's theorem on the existence and order structure of the set of Nash equilibria of supermodular games.

    Submitted 25 July, 2024; originally announced July 2024.

  5. arXiv:2407.15158  [pdf, other]

    cs.CV

    HERGen: Elevating Radiology Report Generation with Longitudinal Data

    Authors: Fuying Wang, Shenghui Du, Lequan Yu

    Abstract: Radiology reports provide detailed descriptions of medical imaging integrated with patients' medical histories, while report writing is traditionally labor-intensive, increasing radiologists' workload and the risk of diagnostic errors. Recent efforts in automating this process seek to mitigate these issues by enhancing accuracy and clinical efficiency. Emerging research in automating this process…

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  6. arXiv:2407.14833  [pdf, other]

    cs.HC

    SpatialTouch: Exploring Spatial Data Visualizations in Cross-reality

    Authors: Lixiang Zhao, Tobias Isenberg, Fuqi Xie, Hai-Ning Liang, Lingyun Yu

    Abstract: We propose and study a novel cross-reality environment that seamlessly integrates a monoscopic 2D surface (an interactive screen with touch and pen input) with a stereoscopic 3D space (an augmented reality HMD) to jointly host spatial data visualizations. This innovative approach combines the best of two conventional methods of displaying and manipulating spatial 3D data, enabling users to fluidly…

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: 15 pages, 20 figures, IEEE VIS2024

  7. arXiv:2407.14796  [pdf, other]

    cs.CV cs.AI

    PASSION: Towards Effective Incomplete Multi-Modal Medical Image Segmentation with Imbalanced Missing Rates

    Authors: Junjie Shi, Caozhi Shang, Zhaobin Sun, Li Yu, Xin Yang, Zengqiang Yan

    Abstract: Incomplete multi-modal image segmentation is a fundamental task in medical imaging to refine deployment efficiency when only partial modalities are available. However, the common practice that complete-modality data is visible during model training is far from realistic, as modalities can have imbalanced missing rates in clinical scenarios. In this paper, we, for the first time, formulate such a c…

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: Accepted by ACM MM 2024

  8. arXiv:2407.13212  [pdf, ps, other]

    astro-ph.EP

    Probe the regolith characteristics of asteroids from 9-years infrared observations of WISE/NEOWISE: A case study of the Main-Belt Object (656) Beagle

    Authors: Liang-Liang Yu

    Abstract: This work presents data processing, fitting procedure, modelling and analyzing of 9-years infrared light curves provided by the WISE/NEOWISE telescope, by which the regolith characteristics of Main-Belt Object (656) Beagle is studied. We determine Beagle's effective diameter $D_{\rm eff}=57.3^{+4.5}_{-2.2}$ km, geometric albedo $p_{\rm v}=0.05^{+0.004}_{-0.007}$, mean roughness…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 10 pages, 9 figures, accepted for publication in The Astronomical Journal. arXiv admin note: text overlap with arXiv:2104.02909

  9. arXiv:2407.11448  [pdf, other]

    cs.CV

    cDP-MIL: Robust Multiple Instance Learning via Cascaded Dirichlet Process

    Authors: Yihang Chen, Tsai Hor Chan, Guosheng Yin, Yuming Jiang, Lequan Yu

    Abstract: Multiple instance learning (MIL) has been extensively applied to whole slide histopathology image (WSI) analysis. The existing aggregation strategy in MIL, which primarily relies on the first-order distance (e.g., mean difference) between instances, fails to accurately approximate the true feature distribution of each instance, leading to biased slide-level representations. Moreover, the scarcity…

    Submitted 19 July, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  10. arXiv:2407.07357  [pdf, ps, other]

    cs.LG q-bio.MN

    A deep graph model for the signed interaction prediction in biological network

    Authors: Shuyi Jin, Mengji Zhang, Meijie Wang, Lun Yu

    Abstract: In pharmaceutical research, the strategy of drug repurposing accelerates the development of new therapies while reducing R&D costs. Network pharmacology lays the theoretical groundwork for identifying new drug indications, and deep graph models have become essential for their precision in mapping complex biological networks. Our study introduces an advanced graph model that utilizes graph convolut…

    Submitted 10 July, 2024; originally announced July 2024.

  11. arXiv:2407.06132  [pdf, other]

    cs.IT

    Rényi Common Information for Doubly Symmetric Binary Sources

    Authors: Lei Yu

    Abstract: In this note, we provide analytic expressions for the Rényi common information of orders in $(1,\infty)$ for the doubly symmetric binary source (DSBS). Until now, analytic expressions for the Rényi common information of all orders in $[0,\infty]$ have been completely known for this source. We also consider the Rényi common information of all orders in $[-\infty,0)$ and evaluate it for the DSBS. We…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 20 pages

  12. arXiv:2407.05965  [pdf, other]

    cs.CV cs.AI cs.CL cs.CR cs.LG

    T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models

    Authors: Yibo Miao, Yifan Zhu, Yinpeng Dong, Lijia Yu, Jun Zhu, Xiao-Shan Gao

    Abstract: The recent development of Sora leads to a new era in text-to-video (T2V) generation. Along with this comes the rising concern about its security risks. The generated videos may contain illegal or unethical content, and there is a lack of comprehensive quantitative understanding of their safety, posing a challenge to their reliability and practical deployment. Previous evaluations primarily focus o…

    Submitted 8 July, 2024; originally announced July 2024.

  13. arXiv:2407.05000  [pdf, other]

    cs.LG cs.CL

    LoRA-GA: Low-Rank Adaptation with Gradient Approximation

    Authors: Shaowen Wang, Linxi Yu, Jian Li

    Abstract: Fine-tuning large-scale pretrained models is prohibitively expensive in terms of computational and memory costs. LoRA, as one of the most popular Parameter-Efficient Fine-Tuning (PEFT) methods, offers a cost-effective alternative by fine-tuning an auxiliary low-rank model that has significantly fewer parameters. Although LoRA reduces the computational and memory requirements significantly at each…

    Submitted 16 July, 2024; v1 submitted 6 July, 2024; originally announced July 2024.

  14. arXiv:2407.04916  [pdf, other]

    cs.CV

    Completed Feature Disentanglement Learning for Multimodal MRIs Analysis

    Authors: Tianling Liu, Hongying Liu, Fanhua Shang, Lequan Yu, Tong Han, Liang Wan

    Abstract: Multimodal MRIs play a crucial role in clinical diagnosis and treatment. Feature disentanglement (FD)-based methods, aiming at learning superior feature representations for multimodal data analysis, have achieved significant success in multimodal learning (MML). Typically, existing FD-based methods separate multimodal data into modality-shared and modality-specific features, and employ concatenati…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Submitted to IEEE JBHI in April 2024

  15. arXiv:2407.03779  [pdf, other]

    cs.CL cs.LG

    Functional Faithfulness in the Wild: Circuit Discovery with Differentiable Computation Graph Pruning

    Authors: Lei Yu, Jingcheng Niu, Zining Zhu, Gerald Penn

    Abstract: In this paper, we introduce a comprehensive reformulation of the task known as Circuit Discovery, along with DiscoGP, a novel and effective algorithm based on differentiable masking for discovering circuits. Circuit discovery is the task of interpreting the computational mechanisms of language models (LMs) by dissecting their functions and capabilities into sparse subnetworks (circuits). We identi…

    Submitted 4 July, 2024; originally announced July 2024.

  16. arXiv:2407.02431  [pdf, other]

    cs.LG cs.CR

    On the Robustness of Graph Reduction Against GNN Backdoor

    Authors: Yuxuan Zhu, Michael Mandulak, Kerui Wu, George Slota, Yuseok Jeon, Ka-Ho Chow, Lei Yu

    Abstract: Graph Neural Networks (GNNs) are gaining popularity across various domains due to their effectiveness in learning graph-structured data. Nevertheless, they have been shown to be susceptible to backdoor poisoning attacks, which pose serious threats to real-world applications. Meanwhile, graph reduction techniques, including coarsening and sparsification, which have long been employed to improve the…

    Submitted 8 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  17. arXiv:2407.02280  [pdf, other]

    cs.CV cs.AI

    FedIA: Federated Medical Image Segmentation with Heterogeneous Annotation Completeness

    Authors: Yangyang Xiang, Nannan Wu, Li Yu, Xin Yang, Kwang-Ting Cheng, Zengqiang Yan

    Abstract: Federated learning has emerged as a compelling paradigm for medical image segmentation, particularly in light of increasing privacy concerns. However, most of the existing research relies on relatively stringent assumptions regarding the uniformity and completeness of annotations across clients. Contrary to this, this paper highlights a prevalent challenge in medical practice: incomplete annotatio…

    Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: Early accepted by MICCAI 2024

  18. arXiv:2407.00636  [pdf, ps, other]

    econ.TH

    Nash equilibria of games with generalized complementarities

    Authors: Lu Yu

    Abstract: To generalize complementarities for games, we introduce some conditions weaker than quasisupermodularity and the single crossing property. We prove that the Nash equilibria of a game satisfying these conditions form a nonempty complete lattice. This is a purely order-theoretic generalization of Zhou's theorem.

    Submitted 30 June, 2024; originally announced July 2024.

  19. arXiv:2406.18995  [pdf, other]

    cs.LG cs.AI

    FedMLP: Federated Multi-Label Medical Image Classification under Task Heterogeneity

    Authors: Zhaobin Sun, Nannan Wu, Junjie Shi, Li Yu, Xin Yang, Kwang-Ting Cheng, Zengqiang Yan

    Abstract: Cross-silo federated learning (FL) enables decentralized organizations to collaboratively train models while preserving data privacy and has made significant progress in medical image classification. One common assumption is task homogeneity where each client has access to all classes during training. However, in clinical practice, given a multi-label classification task, constrained by the level…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Early accepted by MICCAI 2024

  20. arXiv:2406.18992  [pdf, other]

    cs.CV cs.AI cs.LG

    Semi-supervised Concept Bottleneck Models

    Authors: Lijie Hu, Tianhao Huang, Huanyi Xie, Chenyang Ren, Zhengyu Hu, Lu Yu, Di Wang

    Abstract: Concept Bottleneck Models (CBMs) have garnered increasing attention due to their ability to provide concept-based explanations for black-box deep learning models while achieving high final prediction accuracy using human-like concepts. However, the training of current CBMs heavily relies on the accuracy and richness of annotated concepts in the dataset. These concept labels are typically provided…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 17 pages

  21. arXiv:2406.17582  [pdf, other]

    cs.HC cs.GR cs.MM

    Crafting Dynamic Virtual Activities with Advanced Multimodal Models

    Authors: Changyang Li, Lap-Fai Yu

    Abstract: In this paper, we investigate the use of large multimodal models (LMMs) for generating virtual activities, leveraging the integration of vision-language modalities to enable the interpretation of virtual environments. This approach not only facilitates the recognition of scene layouts, semantic contexts, and object identities, but also empowers LMMs to abstract the elements of a scene. By correlat…

    Submitted 15 March, 2024; originally announced June 2024.

  22. arXiv:2406.17274  [pdf, other]

    cs.CL cs.LG

    Can We Trust the Performance Evaluation of Uncertainty Estimation Methods in Text Summarization?

    Authors: Jianfeng He, Runing Yang, Linlin Yu, Changbin Li, Ruoxi Jia, Feng Chen, Ming Jin, Chang-Tien Lu

    Abstract: Text summarization, a key natural language generation (NLG) task, is vital in various domains. However, the high cost of inaccurate summaries in risk-critical applications, particularly those involving human-in-the-loop decision-making, raises concerns about the reliability of uncertainty estimation on text summarization (UE-TS) evaluation methods. This concern stems from the dependency of uncerta…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 63 pages, 41 figures, 11 tables

  23. arXiv:2406.15671  [pdf, other]

    physics.app-ph

    Fluxon transmission measurements of engineered long Josephson junctions for efficient computing

    Authors: Han Cai, Liuqi Yu, Ryan Clarke, Waltraut Wustmann, Kevin D. Osborn

    Abstract: Single-Flux Quantum (SFQ) digital logic is typically both energy efficient and fast, but the logic that uses reversibility provides the most extreme method for improving efficiency. We are studying engineered long Josephson junctions (LJJs) that are components for future ballistic logic gates within a logic family named Reversible Fluxon Logic (RFL). Therein, the bit states are represented by two…

    Submitted 21 June, 2024; originally announced June 2024.

  24. arXiv:2406.15034  [pdf, other]

    cs.CV

    SVFormer: A Direct Training Spiking Transformer for Efficient Video Action Recognition

    Authors: Liutao Yu, Liwei Huang, Chenlin Zhou, Han Zhang, Zhengyu Ma, Huihui Zhou, Yonghong Tian

    Abstract: Video action recognition (VAR) plays crucial roles in various domains such as surveillance, healthcare, and industrial automation, making it highly significant for the society. Consequently, it has long been a research spot in the computer vision field. As artificial neural networks (ANNs) are flourishing, convolution neural networks (CNNs), including 2D-CNNs and 3D-CNNs, as well as variants of th…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted by IJCAI 2024 workshop - Human Brain and Artificial Intelligence

  25. arXiv:2406.14844  [pdf, other]

    cs.LG cs.AI

    DN-CL: Deep Symbolic Regression against Noise via Contrastive Learning

    Authors: Jingyi Liu, Yanjie Li, Lina Yu, Min Wu, Weijun Li, Wenqiang Li, Meilan Hao, Yusong Deng, Shu Wei

    Abstract: Noise ubiquitously exists in signals due to numerous factors including physical, electronic, and environmental effects. Traditional methods of symbolic regression, such as genetic programming or deep learning models, aim to find the most fitting expressions for these signals. However, these methods often overlook the noise present in real-world data, leading to reduced fitting accuracy. To tackle…

    Submitted 20 June, 2024; originally announced June 2024.

  26. arXiv:2406.14045  [pdf, other]

    cs.LG cs.AI

    Understanding Different Design Choices in Training Large Time Series Models

    Authors: Yu-Neng Chuang, Songchen Li, Jiayi Yuan, Guanchu Wang, Kwei-Herng Lai, Leisheng Yu, Sirui Ding, Chia-Yuan Chang, Qiaoyu Tan, Daochen Zha, Xia Hu

    Abstract: Inspired by Large Language Models (LLMs), Time Series Forecasting (TSF), a long-standing task in time series analysis, is undergoing a transition towards Large Time Series Models (LTSMs), aiming to train universal transformer-based models for TSF. However, training LTSMs on heterogeneous time series data poses unique challenges, including diverse frequencies, dimensions, and patterns across datase…

    Submitted 20 June, 2024; originally announced June 2024.

  27. arXiv:2406.13783  [pdf, ps, other]

    econ.TH cs.GT

    Nash equilibria of quasisupermodular games

    Authors: Lu Yu

    Abstract: We prove three results on the existence and structure of Nash equilibria for quasisupermodular games. One theorem is purely order-theoretic, and the other two involve topological hypotheses. Our topological results generalize Zhou's theorem (for supermodular games) and Calciano's theorem.

    Submitted 19 June, 2024; originally announced June 2024.

  28. arXiv:2406.13565  [pdf, other]

    cs.CV cs.CR

    Exploring Multi-view Pixel Contrast for General and Robust Image Forgery Localization

    Authors: Zijie Lou, Gang Cao, Kun Guo, Haochen Zhu, Lifang Yu

    Abstract: Image forgery localization, which aims to segment tampered regions in an image, is a fundamental yet challenging digital forensic task. While some deep learning-based forensic methods have achieved impressive results, they directly learn pixel-to-label mappings without fully exploiting the relationship between pixels in the feature space. To address such deficiency, we propose a Multi-view Pixel-w…

    Submitted 19 June, 2024; originally announced June 2024.

  29. arXiv:2406.12569  [pdf, other]

    cs.LG

    MOYU: A Theoretical Study on Massive Over-activation Yielded Uplifts in LLMs

    Authors: Chi Ma, Mincong Huang, Chao Wang, Yujie Wang, Lei Yu

    Abstract: Massive Over-activation Yielded Uplifts(MOYU) is an inherent property of large language models, and dynamic activation(DA) based on the MOYU property is a clever yet under-explored strategy designed to accelerate inference in these models. Existing methods that utilize MOYU often face a significant 'Impossible Trinity': struggling to simultaneously maintain model performance, enhance inference spe…

    Submitted 28 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  30. arXiv:2406.12447  [pdf, other]

    eess.AS

    Text-aware Speech Separation for Multi-talker Keyword Spotting

    Authors: Haoyu Li, Baochen Yang, Yu Xi, Linfeng Yu, Tian Tan, Hao Li, Kai Yu

    Abstract: For noisy environments, ensuring the robustness of keyword spotting (KWS) systems is essential. While much research has focused on noisy KWS, less attention has been paid to multi-talker mixed speech scenarios. Unlike the usual cocktail party problem where multi-talker speech is separated using speaker clues, the key challenge here is to extract the target speech for KWS based on text clues. To ad…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH2024

  31. arXiv:2406.11614  [pdf, other]

    cs.CL cs.AI

    Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces

    Authors: Yihuai Hong, Lei Yu, Shauli Ravfogel, Haiqin Yang, Mor Geva

    Abstract: The task of "unlearning" certain concepts in large language models (LLMs) has attracted immense attention recently, due to its importance for mitigating undesirable model behaviours, such as the generation of harmful, private, or incorrect information. Current protocols to evaluate unlearning methods largely rely on behavioral tests, without monitoring the presence of unlearned knowledge within th…

    Submitted 17 June, 2024; originally announced June 2024.

  32. arXiv:2406.10677  [pdf, ps, other]

    eess.SY

    Intermittent Encryption Strategies for Anti-Eavesdropping Estimation

    Authors: Zhongyao Hu, Bo Chen, Pindi Weng, Jianzheng Wang, Li Yu

    Abstract: In this paper, an anti-eavesdropping estimation problem is investigated. A linear encryption scheme is utilized, which first linearly transforms innovation via an encryption matrix and then encrypts some components of the transformed innovation. To reduce the computation and energy resources consumed by the linear encryption scheme, both stochastic and deterministic intermittent strategies which p…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 12 pages, 5 figures

    MSC Class: 93E-xx

  33. arXiv:2406.10655  [pdf, ps, other]

    cs.CR

    E-SAGE: Explainability-based Defense Against Backdoor Attacks on Graph Neural Networks

    Authors: Dingqiang Yuan, Xiaohua Xu, Lei Yu, Tongchang Han, Rongchang Li, Meng Han

    Abstract: Graph Neural Networks (GNNs) have recently been widely adopted in multiple domains. Yet, they are notably vulnerable to adversarial and backdoor attacks. In particular, backdoor attacks based on subgraph insertion have been shown to be effective in graph classification tasks while being stealthy, successfully circumventing various existing defense methods. In this paper, we propose E-SAGE, a novel…

    Submitted 15 June, 2024; originally announced June 2024.

  34. arXiv:2406.09904  [pdf, other]

    cs.LG

    QQQ: Quality Quattuor-Bit Quantization for Large Language Models

    Authors: Ying Zhang, Peng Zhang, Mincong Huang, Jingyang Xiang, Yujie Wang, Chao Wang, Yineng Zhang, Lei Yu, Chuan Liu, Wei Lin

    Abstract: Quantization is a proven effective method for compressing large language models. Although popular techniques like W8A8 and W4A16 effectively maintain model performance, they often fail to concurrently speed up the prefill and decoding stages of inference. W4A8 is a promising strategy to accelerate both of them while usually leads to a significant performance degradation. To address these issues, w…

    Submitted 28 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

  35. arXiv:2406.09582  [pdf, ps, other]

    econ.TH

    Existence and structure of Nash equilibria for supermodular games

    Authors: Lu Yu

    Abstract: Two theorems announced by Topkis about the topological description of sublattices are proved. They are applied to extend some classical results concerning the existence and the order structure of Nash equilibria of certain supermodular games, with some problems in Zhou's proof corrected.

    Submitted 13 June, 2024; originally announced June 2024.

  36. arXiv:2406.05410  [pdf, other]

    cs.AI cs.CL

    MLLM-SR: Conversational Symbolic Regression base Multi-Modal Large Language Models

    Authors: Yanjie Li, Weijun Li, Lina Yu, Min Wu, Jingyi Liu, Wenqiang Li, Shu Wei, Yusong Deng

    Abstract: Formulas are the language of communication between humans and nature. It is an important research topic of artificial intelligence to find expressions from observed data to reflect the relationship between each variable in the data, which is called a symbolic regression problem. The existing symbolic regression methods directly generate expressions according to the given observation data, and we c…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 13 pages

  37. arXiv:2406.04680  [pdf, other]

    eess.IV cs.CV

    MTS-Net: Dual-Enhanced Positional Multi-Head Self-Attention for 3D CT Diagnosis of May-Thurner Syndrome

    Authors: Yixin Huang, Yiqi Jin, Ke Tao, Kaijian Xia, Jianfeng Gu, Lei Yu, Lan Du, Cunjian Chen

    Abstract: May-Thurner Syndrome (MTS), also known as iliac vein compression syndrome or Cockett's syndrome, is a condition potentially impacting over 20 percent of the population, leading to an increased risk of iliofemoral deep venous thrombosis. In this paper, we present a 3D-based deep learning approach called MTS-Net for diagnosing May-Thurner Syndrome using CT scans. To effectively capture the spatial-t…

    Submitted 7 June, 2024; originally announced June 2024.

  38. arXiv:2406.01062  [pdf, other]

    cs.CV

    Layout-Agnostic Scene Text Image Synthesis with Diffusion Models

    Authors: Qilong Zhangli, Jindong Jiang, Di Liu, Licheng Yu, Xiaoliang Dai, Ankit Ramchandani, Guan Pang, Dimitris N. Metaxas, Praveen Krishnan

    Abstract: While diffusion models have significantly advanced the quality of image generation their capability to accurately and coherently render text within these images remains a substantial challenge. Conventional diffusion-based methods for scene text generation are typically limited by their reliance on an intermediate layout output. This dependency often results in a constrained diversity of text styl…

    Submitted 19 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 7496-7506

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 7496-7506

  39. arXiv:2406.00588  [pdf, other]

    cs.LG cs.CR math.ST

    Generalization Bound and New Algorithm for Clean-Label Backdoor Attack

    Authors: Lijia Yu, Shuang Liu, Yibo Miao, Xiao-Shan Gao, Lijun Zhang

    Abstract: The generalization bound is a crucial theoretical tool for assessing the generalizability of learning methods and there exist vast literatures on generalizability of normal learning, adversarial learning, and data poisoning. Unlike other data poison attacks, the backdoor attack has the special property that the poisoned triggers are contained in both the training set and the test set and the purpo…

    Submitted 1 June, 2024; originally announced June 2024.

  40. arXiv:2405.20986  [pdf, other]

    cs.LG cs.CV

    Uncertainty Quantification for Bird's Eye View Semantic Segmentation: Methods and Benchmarks

    Authors: Linlin Yu, Bowen Yang, Tianhao Wang, Kangshuo Li, Feng Chen

    Abstract: The fusion of raw features from multiple sensors on an autonomous vehicle to create a Bird's Eye View (BEV) representation is crucial for planning and control systems. There is growing interest in using deep learning models for BEV semantic segmentation. Anticipating segmentation errors and improving the explainability of DNNs is essential for autonomous driving, yet it is under-studied. This pape…

    Submitted 31 May, 2024; originally announced May 2024.

  41. arXiv:2405.19761  [pdf, other]

    cs.AI

    Revisiting CNNs for Trajectory Similarity Learning

    Authors: Zhihao Chang, Linzhu Yu, Huan Li, Sai Wu, Gang Chen, Dongxiang Zhang

    Abstract: Similarity search is a fundamental but expensive operator in querying trajectory data, due to its quadratic complexity of distance computation. To mitigate the computational burden for long trajectories, neural networks have been widely employed for similarity learning and each trajectory is encoded as a high-dimensional vector for similarity search with linear complexity. Given the sequential nat…

    Submitted 30 May, 2024; originally announced May 2024.

  42. arXiv:2405.19327  [pdf, other]

    cs.CL cs.AI cs.LG

    MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

    Authors: Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chou Leuang Yu, Danny Pan, Esther Cheng, Jie Liu, Qunshu Lin, Raven Yuan, Tuney Zheng, Wei Pang, Xinrun Du, Yiming Liang, Yinghao Ma, Yizhi Li, Ziyang Ma, Bill Lin, Emmanouil Benetos, Huan Yang, Junting Zhou, Kaijing Ma, Minghao Liu, Morry Niu, et al. (20 additional authors not shown)

    Abstract: Large Language Models (LLMs) have made great strides in recent years to achieve unprecedented performance across different tasks. However, due to commercial interest, the most competitive models like GPT, Gemini, and Claude have been gated behind proprietary interfaces without disclosing the training details. Recently, many institutions have open-sourced several strong LLMs like LLaMA-3, comparabl…

    Submitted 10 July, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: https://map-neo.github.io/

  43. arXiv:2405.18997  [pdf, other]

    stat.ML cs.LG

    Kernel Semi-Implicit Variational Inference

    Authors: Ziheng Cheng, Longlin Yu, Tianyu Xie, Shiyue Zhang, Cheng Zhang

    Abstract: Semi-implicit variational inference (SIVI) extends traditional variational families with semi-implicit distributions defined in a hierarchical manner. Due to the intractable densities of semi-implicit distributions, classical SIVI often resorts to surrogates of evidence lower bound (ELBO) that would introduce biases for training. A recent advancement in SIVI, named SIVI-SM, utilizes an alternative…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: ICML 2024 camera ready

  44. arXiv:2405.18035  [pdf, other]

    cs.CL

    Instruction Tuning with Retrieval-based Examples Ranking for Aspect-based Sentiment Analysis

    Authors: Guangmin Zheng, Jin Wang, Liang-Chih Yu, Xuejie Zhang

    Abstract: Aspect-based sentiment analysis (ABSA) identifies sentiment information related to specific aspects and provides deeper market insights to businesses and organizations. With the emergence of large language models (LMs), recent studies have proposed using fixed examples for instruction tuning to reformulate ABSA as a generation task. However, the performance is sensitive to the selection of in-cont…

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: ACL Findings 2024

  45. arXiv:2405.16728  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    Towards Multi-Task Multi-Modal Models: A Video Generative Perspective

    Authors: Lijun Yu

    Abstract: Advancements in language foundation models have primarily fueled the recent surge in artificial intelligence. In contrast, generative learning of non-textual modalities, especially videos, significantly trails behind language modeling. This thesis chronicles our endeavor to build multi-task models for generating videos and other modalities under diverse conditions, as well as for understanding and…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: PhD thesis

  46. arXiv:2405.16577  [pdf, other]

    stat.ML cs.LG

    Reflected Flow Matching

    Authors: Tianyu Xie, Yu Zhu, Longlin Yu, Tong Yang, Ziheng Cheng, Shiyue Zhang, Xiangyu Zhang, Cheng Zhang

    Abstract: Continuous normalizing flows (CNFs) learn an ordinary differential equation to transform prior samples into data. Flow matching (FM) has recently emerged as a simulation-free approach for training CNFs by regressing a velocity model towards the conditional velocity field. However, on constrained domains, the learned velocity model may lead to undesirable flows that result in highly unnatural sampl…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: ICML 2024 camera-ready

  47. arXiv:2405.15984  [pdf, other]

    cs.CL cs.AI

    Evaluating the Adversarial Robustness of Retrieval-Based In-Context Learning for Large Language Models

    Authors: Simon Chi Lok Yu, Jie He, Pasquale Minervini, Jeff Z. Pan

    Abstract: With the emergence of large language models, such as LLaMA and OpenAI GPT-3, In-Context Learning (ICL) gained significant attention due to its effectiveness and efficiency. However, ICL is very sensitive to the choice, order, and verbaliser used to encode the demonstrations in the prompt. Retrieval-Augmented ICL methods try to address this problem by leveraging retrievers to extract semantically r…

    Submitted 10 July, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: COLM 2024, 29 pages, 6 figures

  48. arXiv:2405.15549  [pdf, other]

    cs.CV

    SEP: Self-Enhanced Prompt Tuning for Visual-Language Model

    Authors: Hantao Yao, Rui Zhang, Lu Yu, Changsheng Xu

    Abstract: Prompt tuning based on Context Optimization (CoOp) effectively adapts visual-language models (VLMs) to downstream tasks by inferring additional learnable prompt tokens. However, these tokens are less discriminative as they are independent of the pre-trained tokens and fail to capture input-specific knowledge, such as class-aware textual or instance-aware visual knowledge. Leveraging the discrimina…

    Submitted 30 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  49. arXiv:2405.15452  [pdf, other]

    cs.CL cs.AI cs.LG

    Leveraging Logical Rules in Knowledge Editing: A Cherry on the Top

    Authors: Keyuan Cheng, Muhammad Asif Ali, Shu Yang, Gang Lin, Yuxuan Zhai, Haoyang Fei, Ke Xu, Lu Yu, Lijie Hu, Di Wang

    Abstract: Multi-hop Question Answering (MQA) under knowledge editing (KE) is a key challenge in Large Language Models (LLMs). While best-performing solutions in this domain use a plan and solve paradigm to split a question into sub-questions followed by response generation, we claim that this approach is sub-optimal as it fails for hard to decompose questions, and it does not explicitly cater to correlated…

    Submitted 27 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 18 pages

  50. arXiv:2405.15379  [pdf, ps, other]

    stat.ML cs.LG math.PR math.ST

    Log-Concave Sampling on Compact Supports: A Versatile Proximal Framework

    Authors: Lu Yu

    Abstract: In this paper, we explore sampling from strongly log-concave distributions defined on convex and compact supports. We propose a general proximal framework that involves projecting onto the constrained set, which is highly flexible and supports various projection options. Specifically, we consider the cases of Euclidean and Gauge projections, with the latter having the advantage of being performed…

    Submitted 24 May, 2024; originally announced May 2024.