
Showing 1–50 of 240 results for author: Meng, Y

Searching in archive cs. Search in all archives.
  1. arXiv:2407.13048  [pdf, other]

    cs.CL

    Establishing Knowledge Preference in Language Models

    Authors: Sizhe Zhou, Sha Li, Yu Meng, Yizhu Jiao, Heng Ji, Jiawei Han

    Abstract: Language models are known to encode a great amount of factual knowledge through pretraining. However, such knowledge might be insufficient to cater to user requests, requiring the model to integrate external knowledge sources and adhere to user-provided specifications. When answering questions about ongoing events, the model should use recent news articles to update its response; when asked to pro…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 27 pages, 8 figures, 23 tables, work in progress

  2. arXiv:2407.11282  [pdf, other]

    cs.CL

    Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models

    Authors: Qingcheng Zeng, Mingyu Jin, Qinkai Yu, Zhenting Wang, Wenyue Hua, Zihao Zhou, Guangyan Sun, Yanda Meng, Shiqing Ma, Qifan Wang, Felix Juefei-Xu, Kaize Ding, Fan Yang, Ruixiang Tang, Yongfeng Zhang

    Abstract: Large Language Models (LLMs) are employed across various high-stakes domains, where the reliability of their outputs is crucial. One commonly used method to assess the reliability of LLMs' responses is uncertainty estimation, which gauges the likelihood of their answers being correct. While many studies focus on improving the accuracy of uncertainty estimations for LLMs, our research investigates…

    Submitted 19 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

  3. arXiv:2407.09852  [pdf]

    cs.LG cs.CE

    Free-form Grid Structure Form Finding based on Machine Learning and Multi-objective Optimisation

    Authors: Yiping Meng, Yiming Sun

    Abstract: Free-form structural forms are widely used to design spatial structures for their irregular spatial morphology. Current free-form form-finding methods cannot adequately meet the material properties, structural requirements or construction conditions, which brings the deviation between the initial 3D geometric design model and the constructed free-form structure. Thus, the main focus of this paper…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: 11 pages, 9 figures

  4. arXiv:2407.05010  [pdf, other]

    cs.CV

    PRANCE: Joint Token-Optimization and Structural Channel-Pruning for Adaptive ViT Inference

    Authors: Ye Li, Chen Tang, Yuan Meng, Jiajun Fan, Zenghao Chai, Xinzhu Ma, Zhi Wang, Wenwu Zhu

    Abstract: We introduce PRANCE, a Vision Transformer compression framework that jointly optimizes the activated channels and reduces tokens, based on the characteristics of inputs. Specifically, PRANCE leverages adaptive token optimization strategies for a certain computational budget, aiming to accelerate ViTs' inference from a unified data and architectural perspective. However, the joint framework poses…

    Submitted 6 July, 2024; originally announced July 2024.

  5. arXiv:2407.04068  [pdf, other]

    cs.CV

    CLIP-DR: Textual Knowledge-Guided Diabetic Retinopathy Grading with Ranking-aware Prompting

    Authors: Qinkai Yu, Jianyang Xie, Anh Nguyen, He Zhao, Jiong Zhang, Huazhu Fu, Yitian Zhao, Yalin Zheng, Yanda Meng

    Abstract: Diabetic retinopathy (DR) is a complication of diabetes and usually takes decades to reach sight-threatening levels. Accurate and robust detection of DR severity is critical for the timely management and treatment of diabetes. However, most current DR grading methods suffer from insufficient robustness to data variability (e.g., colour fundus images), posing a significant difficulty for ac…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted by MICCAI 2024

  6. arXiv:2407.02208  [pdf, other]

    cs.CL cs.AI

    How to Learn in a Noisy World? Self-Correcting the Real-World Data Noise on Machine Translation

    Authors: Yan Meng, Di Wu, Christof Monz

    Abstract: The massive amounts of web-mined parallel data contain large amounts of noise. Semantic misalignment, as the primary source of the noise, poses a challenge for training machine translation systems. In this paper, we first study the impact of real-world hard-to-detect misalignment noise by proposing a process to simulate the realistic misalignment controlled by semantic similarity. After quantitati…

    Submitted 2 July, 2024; originally announced July 2024.

  7. arXiv:2406.17343  [pdf, other]

    cs.CV cs.AI

    Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers

    Authors: Lei Chen, Yuan Meng, Chen Tang, Xinzhu Ma, Jingyan Jiang, Xin Wang, Zhi Wang, Wenwu Zhu

    Abstract: Recent advancements in diffusion models, particularly the trend of architectural transformation from UNet-based Diffusion to Diffusion Transformer (DiT), have significantly improved the quality and scalability of image synthesis. Despite the incredible generative quality, the large computational requirements of these large-scale models significantly hinder the deployments in real-world scenarios.…

    Submitted 25 June, 2024; originally announced June 2024.

  8. arXiv:2406.16357  [pdf, other]

    cs.LG cs.AI cs.SI

    Towards Lightweight Graph Neural Network Search with Curriculum Graph Sparsification

    Authors: Beini Xie, Heng Chang, Ziwei Zhang, Zeyang Zhang, Simin Wu, Xin Wang, Yuan Meng, Wenwu Zhu

    Abstract: Graph Neural Architecture Search (GNAS) has achieved superior performance on various graph-structured tasks. However, existing GNAS studies overlook the applications of GNAS in resource-constraint scenarios. This paper proposes to design a joint graph data and architecture mechanism, which identifies important sub-architectures via the valuable graph data. To search for optimal lightweight Graph N…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024. The two first authors made equal contributions

  9. arXiv:2406.13629  [pdf, other]

    cs.CL cs.LG

    InstructRAG: Instructing Retrieval-Augmented Generation with Explicit Denoising

    Authors: Zhepei Wei, Wei-Lin Chen, Yu Meng

    Abstract: Retrieval-augmented generation (RAG) has shown promising potential to enhance the accuracy and factuality of language models (LMs). However, imperfect retrievers or noisy corpora can introduce misleading or even erroneous information to the retrieved contents, posing a significant challenge to the generation quality. Existing RAG methods typically address this challenge by directly predicting fina…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Code: https://github.com/weizhepei/InstructRAG

  10. arXiv:2406.12928  [pdf, other]

    cs.LG cs.AI cs.CL

    Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox

    Authors: Yijun Liu, Yuan Meng, Fang Wu, Shenhao Peng, Hang Yao, Chaoyu Guan, Chen Tang, Xinzhu Ma, Zhi Wang, Wenwu Zhu

    Abstract: Large language models (LLMs) have exhibited exciting progress in multiple scenarios, while the huge computational demands hinder their deployments in lots of real-world applications. As an effective means to reduce memory footprint and inference cost, quantization also faces challenges in performance degradation at low bit-widths. Understanding the impact of quantization on LLM capabilities, espec…

    Submitted 15 June, 2024; originally announced June 2024.

  11. arXiv:2406.12211  [pdf, other]

    cs.CV

    PCIE_LAM Solution for Ego4D Looking At Me Challenge

    Authors: Kanokphan Lertniphonphan, Jun Xie, Yaqing Meng, Shijing Wang, Feng Chen, Zhepeng Wang

    Abstract: This report presents our team's 'PCIE_LAM' solution for the Ego4D Looking At Me Challenge at CVPR2024. The main goal of the challenge is to accurately determine if a person in the scene is looking at the camera wearer, based on a video where the faces of social partners have been localized. Our proposed solution, InternLSTM, consists of an InternVL image encoder and a Bi-LSTM network. The InternVL…

    Submitted 17 June, 2024; originally announced June 2024.

  12. arXiv:2406.09317  [pdf, other]

    eess.IV cs.CV

    Common and Rare Fundus Diseases Identification Using Vision-Language Foundation Model with Knowledge of Over 400 Diseases

    Authors: Meng Wang, Tian Lin, Aidi Lin, Kai Yu, Yuanyuan Peng, Lianyu Wang, Cheng Chen, Ke Zou, Huiyu Liang, Man Chen, Xue Yao, Meiqin Zhang, Binwei Huang, Chaoxin Zheng, Peixin Zhang, Wei Chen, Yilong Luo, Yifan Chen, Honghe Xia, Tingkun Shi, Qi Zhang, Jinming Guo, Xiaolin Chen, Jingcheng Wang, Yih Chung Tham, et al. (24 additional authors not shown)

    Abstract: Previous foundation models for retinal images were pre-trained with limited disease categories and knowledge base. Here we introduce RetiZero, a vision-language foundation model that leverages knowledge from over 400 fundus diseases. For RetiZero's pre-training, we compiled 341,896 fundus images paired with text descriptions, sourced from public datasets, ophthalmic literature, and online resources…

    Submitted 30 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  13. arXiv:2406.05033  [pdf, other]

    cs.LG math.OC

    Gradient Descent on Logistic Regression with Non-Separable Data and Large Step Sizes

    Authors: Si Yi Meng, Antonio Orvieto, Daniel Yiming Cao, Christopher De Sa

    Abstract: We study gradient descent (GD) dynamics on logistic regression problems with large, constant step sizes. For linearly-separable data, it is known that GD converges to the minimizer with arbitrarily large step sizes, a property which no longer holds when the problem is not separable. In fact, the behaviour can be much more complex -- a sequence of period-doubling bifurcations begins at the critical…

    Submitted 7 June, 2024; originally announced June 2024.

  14. arXiv:2406.01171  [pdf, other]

    cs.CL

    Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization

    Authors: Yu-Min Tseng, Yu-Chao Huang, Teng-Yun Hsiao, Wei-Lin Chen, Chao-Wei Huang, Yu Meng, Yun-Nung Chen

    Abstract: The concept of persona, originally adopted in dialogue literature, has re-surged as a promising framework for tailoring large language models (LLMs) to specific context (e.g., personalized search, LLM-as-a-judge). However, the growing research on leveraging persona in LLMs is relatively disorganized and lacks a systematic taxonomy. To close the gap, we present a comprehensive survey to categorize…

    Submitted 26 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 8-page version

  15. arXiv:2405.20202  [pdf, other]

    cs.AI

    One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments

    Authors: Ke Yi, Yuhui Xu, Heng Chang, Chen Tang, Yuan Meng, Tong Zhang, Jia Li

    Abstract: Large Language Models (LLMs) have advanced rapidly but face significant memory demands. While quantization has shown promise for LLMs, current methods typically require lengthy training to alleviate the performance degradation from quantization loss. However, deploying LLMs across diverse scenarios with different resource constraints, e.g., servers and personal computers, requires repeated trainin…

    Submitted 30 May, 2024; originally announced May 2024.

  16. arXiv:2405.18194  [pdf, other]

    cs.LG cs.CR

    Delving into Differentially Private Transformer

    Authors: Youlong Ding, Xueyang Wu, Yining Meng, Yonggang Luo, Hao Wang, Weike Pan

    Abstract: Deep learning with differential privacy (DP) has garnered significant attention over the past years, leading to the development of numerous methods aimed at enhancing model accuracy and training efficiency. This paper delves into the problem of training Transformer models with differential privacy. Our treatment is modular: the logic is to 'reduce' the problem of training DP Transformer to the mor…

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  17. arXiv:2405.14734  [pdf, other]

    cs.CL cs.LG

    SimPO: Simple Preference Optimization with a Reference-Free Reward

    Authors: Yu Meng, Mengzhou Xia, Danqi Chen

    Abstract: Direct Preference Optimization (DPO) is a widely used offline preference optimization algorithm that reparameterizes reward functions in reinforcement learning from human feedback (RLHF) to enhance simplicity and training stability. In this work, we propose SimPO, a simpler yet more effective approach. The effectiveness of SimPO is attributed to a key design: using the average log probability of a…

    Submitted 8 July, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: Code: https://github.com/princeton-nlp/SimPO. v2 updates: additional baselines (RRHF, SLiC-HF, CPO); a new setting Llama3-Instruct-v0.2 (Appendix G); more analyses (Section 4.4 & Appendix H)

  18. arXiv:2405.14507  [pdf, other]

    cs.CL cs.LG

    Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast

    Authors: Chufan Shi, Cheng Yang, Xinyu Zhu, Jiahao Wang, Taiqiang Wu, Siheng Li, Deng Cai, Yujiu Yang, Yu Meng

    Abstract: Mixture-of-Experts (MoE) has emerged as a prominent architecture for scaling model size while maintaining computational efficiency. In MoE, each token in the input sequence activates a different subset of experts determined by a routing mechanism. However, the unchosen experts in MoE models do not contribute to the output, potentially leading to underutilization of the model's capacity. In this wo…

    Submitted 23 May, 2024; originally announced May 2024.

  19. arXiv:2405.01561  [pdf]

    cs.SE cs.AI cs.CY

    Rapid Mobile App Development for Generative AI Agents on MIT App Inventor

    Authors: Jaida Gao, Calab Su, Etai Miller, Kevin Lu, Yu Meng

    Abstract: The evolution of Artificial Intelligence (AI) stands as a pivotal force shaping our society, finding applications across diverse domains such as education, sustainability, and safety. Leveraging AI within mobile applications makes it easily accessible to the public, catalyzing its transformative potential. In this paper, we present a methodology for the rapid development of AI agent applications u…

    Submitted 31 March, 2024; originally announced May 2024.

    Journal ref: Journal of advances in information science and technology 2(3) 1-8, March 2024

  20. arXiv:2404.14786  [pdf, other]

    cs.AI cs.LG stat.ME

    RealTCD: Temporal Causal Discovery from Interventional Data with Large Language Model

    Authors: Peiwen Li, Xin Wang, Zeyang Zhang, Yuan Meng, Fang Shen, Yue Li, Jialong Wang, Yang Li, Wenwu Zhu

    Abstract: In the field of Artificial Intelligence for Information Technology Operations, causal discovery is pivotal for operation and maintenance of graph construction, facilitating downstream industrial tasks such as root cause analysis. Temporal causal discovery, as an emerging method, aims to identify temporal causal relationships between variables directly from observations by utilizing interventional…

    Submitted 26 May, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  21. arXiv:2404.13278  [pdf, other]

    cs.LG cs.AI cs.DC eess.SP

    Federated Transfer Learning with Task Personalization for Condition Monitoring in Ultrasonic Metal Welding

    Authors: Ahmadreza Eslaminia, Yuquan Meng, Klara Nahrstedt, Chenhui Shao

    Abstract: Ultrasonic metal welding (UMW) is a key joining technology with widespread industrial applications. Condition monitoring (CM) capabilities are critically needed in UMW applications because process anomalies significantly deteriorate the joining quality. Recently, machine learning models emerged as a promising tool for CM in many manufacturing applications due to their ability to learn complex patt…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: 37 pages, 8 figures

  22. arXiv:2404.11614  [pdf, other]

    cs.CV

    Dynamic Typography: Bringing Text to Life via Video Diffusion Prior

    Authors: Zichen Liu, Yihao Meng, Hao Ouyang, Yue Yu, Bolin Zhao, Daniel Cohen-Or, Huamin Qu

    Abstract: Text animation serves as an expressive medium, transforming static communication into dynamic experiences by infusing words with motion to evoke emotions, emphasize meanings, and construct compelling narratives. Crafting animations that are semantically aware poses significant challenges, demanding expertise in graphic design and animation. We present an automated text animation scheme, termed "Dy…

    Submitted 18 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: Our demo page is available at: https://animate-your-word.github.io/demo/

  23. arXiv:2404.09532  [pdf, other]

    cs.CV cs.LG

    TMPQ-DM: Joint Timestep Reduction and Quantization Precision Selection for Efficient Diffusion Models

    Authors: Haojun Sun, Chen Tang, Zhi Wang, Yuan Meng, Jingyan Jiang, Xinzhu Ma, Wenwu Zhu

    Abstract: Diffusion models have emerged as preeminent contenders in the realm of generative models. Distinguished by their distinctive sequential generative processes, characterized by hundreds or even thousands of timesteps, diffusion models progressively reconstruct images from pure Gaussian noise, with each timestep necessitating full inference of the entire model. However, the substantial computational…

    Submitted 15 April, 2024; originally announced April 2024.

  24. arXiv:2404.08195  [pdf, other]

    cs.CV

    Tackling Ambiguity from Perspective of Uncertainty Inference and Affinity Diversification for Weakly Supervised Semantic Segmentation

    Authors: Zhiwei Yang, Yucong Meng, Kexue Fu, Shuo Wang, Zhijian Song

    Abstract: Weakly supervised semantic segmentation (WSSS) with image-level labels intends to achieve dense tasks without laborious annotations. However, due to the ambiguous contexts and fuzzy regions, the performance of WSSS, especially the stages of generating Class Activation Maps (CAMs) and refining pseudo masks, widely suffers from ambiguity while being barely noticed by previous literature. In this wor…

    Submitted 11 April, 2024; originally announced April 2024.

  25. arXiv:2404.07103  [pdf, other]

    cs.CL cs.IR cs.LG

    Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs

    Authors: Bowen Jin, Chulin Xie, Jiawei Zhang, Kashob Kumar Roy, Yu Zhang, Zheng Li, Ruirui Li, Xianfeng Tang, Suhang Wang, Yu Meng, Jiawei Han

    Abstract: Large language models (LLMs), while exhibiting exceptional performance, suffer from hallucinations, especially on knowledge-intensive tasks. Existing works propose to augment LLMs with individual text units retrieved from external knowledge corpora to alleviate the issue. However, in many domains, texts are interconnected (e.g., academic papers in a bibliographic graph are linked by citations and…

    Submitted 15 July, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: 21 pages. Code: https://github.com/PeterGriffinJin/Graph-CoT

  26. arXiv:2404.07066  [pdf, other]

    cs.CL cs.AI cs.LG

    Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?

    Authors: Mingyu Jin, Qinkai Yu, Jingyuan Huang, Qingcheng Zeng, Zhenting Wang, Wenyue Hua, Haiyan Zhao, Kai Mei, Yanda Meng, Kaize Ding, Fan Yang, Mengnan Du, Yongfeng Zhang

    Abstract: Large language models (LLMs) have shown remarkable performances across a wide range of tasks. However, the mechanisms by which these models encode tasks of varying complexities remain poorly understood. In this paper, we explore the hypothesis that LLMs process concepts of varying complexities in different layers, introducing the idea of "Concept Depth" to suggest that more complex concepts are ty…

    Submitted 30 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: 12 pages

  27. arXiv:2404.05639  [pdf, other]

    cs.LG cs.AI cs.CR

    Investigating the Impact of Quantization on Adversarial Robustness

    Authors: Qun Li, Yuan Meng, Chen Tang, Jiacheng Jiang, Zhi Wang

    Abstract: Quantization is a promising technique for reducing the bit-width of deep models to improve their runtime performance and storage efficiency, and thus becomes a fundamental step for deployment. In real-world scenarios, quantized models are often faced with adversarial attacks which cause the model to make incorrect inferences by introducing slight perturbations. However, recent studies have paid le…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted to ICLR 2024 Workshop PML4LRS

  28. arXiv:2403.10013  [pdf, other]

    eess.SY cs.LG math.OC

    LyZNet: A Lightweight Python Tool for Learning and Verifying Neural Lyapunov Functions and Regions of Attraction

    Authors: Jun Liu, Yiming Meng, Maxwell Fitzsimmons, Ruikun Zhou

    Abstract: In this paper, we describe a lightweight Python framework that provides integrated learning and verification of neural Lyapunov functions for stability analysis. The proposed tool, named LyZNet, learns neural Lyapunov functions using physics-informed neural networks (PINNs) to solve Zubov's equation and verifies them using satisfiability modulo theories (SMT) solvers. What distinguishes this tool…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: To appear in the 27th ACM International Conference on Hybrid Systems: Computation and Control (HSCC 2024). arXiv admin note: text overlap with arXiv:2312.09131

  29. arXiv:2403.08495  [pdf, other]

    cs.CL

    Automatic Interactive Evaluation for Large Language Models with State Aware Patient Simulator

    Authors: Yusheng Liao, Yutong Meng, Yuhao Wang, Hongcheng Liu, Yanfeng Wang, Yu Wang

    Abstract: Large Language Models (LLMs) have demonstrated remarkable proficiency in human interactions, yet their application within the medical field remains insufficiently explored. Previous works mainly focus on the performance of medical knowledge with examinations, which is far from the realistic scenarios, falling short in assessing the abilities of LLMs on clinical tasks. In the quest to enhance the a…

    Submitted 20 July, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: 23 pages, 5 figures

  30. arXiv:2403.05300  [pdf, other]

    cs.LG cs.AI

    Unity by Diversity: Improved Representation Learning in Multimodal VAEs

    Authors: Thomas M. Sutter, Yang Meng, Andrea Agostini, Daphné Chopard, Norbert Fortin, Julia E. Vogt, Bahbak Shahbaba, Stephan Mandt

    Abstract: Variational Autoencoders for multimodal data hold promise for many tasks in data analysis, such as representation learning, conditional generation, and imputation. Current architectures either share the encoder output, decoder input, or both across modalities to learn a shared representation. Such architectures impose hard constraints on the model. In this work, we show that a better latent repres…

    Submitted 31 May, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  31. arXiv:2402.11142  [pdf, other]

    cs.CL

    Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction

    Authors: Sizhe Zhou, Yu Meng, Bowen Jin, Jiawei Han

    Abstract: Relation extraction (RE), a crucial task in NLP, aims to identify semantic relationships between entities mentioned in texts. Despite significant advancements in this field, existing models typically rely on extensive annotated data for training, which can be both costly and time-consuming to acquire. Moreover, these models often struggle to adapt to new or unseen relationships. In contrast, few-s…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: 21 pages, 12 Tables, 9 Figures

  32. arXiv:2402.01772  [pdf, other]

    cs.CL cs.AI cs.LG

    Disentangling the Roles of Target-Side Transfer and Regularization in Multilingual Machine Translation

    Authors: Yan Meng, Christof Monz

    Abstract: Multilingual Machine Translation (MMT) benefits from knowledge transfer across different language pairs. However, improvements in one-to-many translation compared to many-to-one translation are only marginal and sometimes even negligible. This performance discrepancy raises the question of to what extent positive transfer plays a role on the target-side for one-to-many MT. In this paper, we conduc…

    Submitted 1 February, 2024; originally announced February 2024.

  33. arXiv:2402.00746  [pdf, other]

    cs.CL

    Health-LLM: Personalized Retrieval-Augmented Disease Prediction System

    Authors: Mingyu Jin, Qinkai Yu, Dong Shu, Chong Zhang, Lizhou Fan, Wenyue Hua, Suiyuan Zhu, Yanda Meng, Zhenting Wang, Mengnan Du, Yongfeng Zhang

    Abstract: Recent advancements in artificial intelligence (AI), especially large language models (LLMs), have significantly advanced healthcare applications and demonstrated potentials in intelligent medical treatment. However, there are conspicuous challenges such as vast data volumes and inconsistent symptom characterization standards, preventing full integration of healthcare AI systems with individual pa…

    Submitted 19 March, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  34. arXiv:2401.12413  [pdf, other]

    cs.CL cs.LG

    How Far Can 100 Samples Go? Unlocking Overall Zero-Shot Multilingual Translation via Tiny Multi-Parallel Data

    Authors: Di Wu, Shaomu Tan, Yan Meng, David Stap, Christof Monz

    Abstract: Zero-shot translation aims to translate between language pairs not seen during training in Multilingual Machine Translation (MMT) and is largely considered an open problem. A common, albeit resource-consuming, solution is to add as many related translation directions as possible to the training corpus. In this paper, we show that for an English-centric model, surprisingly large zero-shot improveme…

    Submitted 26 February, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: 15 pages, 5 figures

  35. arXiv:2401.11825  [pdf, other]

    math.NA cs.LG

    Sparse discovery of differential equations based on multi-fidelity Gaussian process

    Authors: Yuhuang Meng, Yue Qiu

    Abstract: Sparse identification of differential equations aims to compute the analytic expressions from the observed data explicitly. However, there exist two primary challenges. Firstly, it exhibits sensitivity to the noise in the observed data, particularly for the derivatives computations. Secondly, existing literature predominantly concentrates on single-fidelity (SF) data, which imposes limitations on…

    Submitted 22 January, 2024; originally announced January 2024.

  36. arXiv:2401.04925  [pdf, other]

    cs.CL cs.AI

    The Impact of Reasoning Step Length on Large Language Models

    Authors: Mingyu Jin, Qinkai Yu, Dong Shu, Haiyan Zhao, Wenyue Hua, Yanda Meng, Yongfeng Zhang, Mengnan Du

    Abstract: Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models (LLMs). However, the correlation between the effectiveness of CoT and the length of reasoning steps in prompts remains largely unknown. To shed light on this, we have conducted several empirical experiments to explore the relations. Specifically, we design experiments that expand and compress the ra…

    Submitted 22 June, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Findings of ACL 2024

  37. arXiv:2401.04737  [pdf]

    cs.SD cs.AI eess.AS

    Music Genre Classification: A Comparative Analysis of CNN and XGBoost Approaches with Mel-frequency cepstral coefficients and Mel Spectrograms

    Authors: Yigang Meng

    Abstract: In recent years, various well-designed algorithms have empowered music platforms to provide content based on one's preferences. Music genres are defined through various aspects, including acoustic features and cultural considerations. Music genre classification works well with content-based filtering, which recommends content based on music similarity to users. Given a considerable dataset, one pr…

    Submitted 8 January, 2024; originally announced January 2024.

  38. arXiv:2401.01543  [pdf, other]

    cs.CV

    Retraining-free Model Quantization via One-Shot Weight-Coupling Learning

    Authors: Chen Tang, Yuan Meng, Jiacheng Jiang, Shuzhao Xie, Rongwei Lu, Xinzhu Ma, Zhi Wang, Wenwu Zhu

    Abstract: Quantization is of significance for compressing the over-parameterized deep neural models and deploying them on resource-limited devices. Fixed-precision quantization suffers from performance drop due to the limited numerical representation ability. Conversely, mixed-precision quantization (MPQ) is advocated to compress the model effectively by allocating heterogeneous bit-width for layers. MPQ is…

    Submitted 14 June, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

  39. arXiv:2312.10807  [pdf, other]

    cs.RO

    Language-conditioned Learning for Robotic Manipulation: A Survey

    Authors: Hongkuan Zhou, Xiangtong Yao, Yuan Meng, Siming Sun, Zhenshan Bing, Kai Huang, Alois Knoll

    Abstract: Language-conditioned robotic manipulation represents a cutting-edge area of research, enabling seamless communication and cooperation between humans and robotic agents. This field focuses on teaching robotic systems to comprehend and execute instructions conveyed in natural language. To achieve this, the development of robust language understanding models capable of extracting actionable insights…

    Submitted 3 February, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

  40. arXiv:2312.09131  [pdf, other]

    math.OC cs.LG eess.SY

    Physics-Informed Neural Network Lyapunov Functions: PDE Characterization, Learning, and Verification

    Authors: Jun Liu, Yiming Meng, Maxwell Fitzsimmons, Ruikun Zhou

    Abstract: We provide a systematic investigation of using physics-informed neural networks to compute Lyapunov functions. We encode Lyapunov conditions as a partial differential equation (PDE) and use this for training neural network Lyapunov functions. We analyze the analytical properties of the solutions to the Lyapunov and Zubov PDEs. In particular, we show that employing the Zubov equation in training ne…

    Submitted 21 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: The current version has been submitted for publication; corrected some minor typos from v2

  41. arXiv:2312.03725  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    SCStory: Self-supervised and Continual Online Story Discovery

    Authors: Susik Yoon, Yu Meng, Dongha Lee, Jiawei Han

    Abstract: We present a framework SCStory for online story discovery, that helps people digest rapidly published news article streams in real-time without human annotations. To organize news article streams into stories, existing approaches directly encode the articles and cluster them based on representation similarity. However, these methods yield noisy and inaccurate story discovery results because the ge…

    Submitted 26 November, 2023; originally announced December 2023.

    Comments: Presented at WWW'23

  42. arXiv:2312.00601  [pdf, ps, other]

    cs.DS

    Online Graph Coloring with Predictions

    Authors: Antonios Antoniadis, Hajo Broersma, Yang Meng

    Abstract: We introduce learning augmented algorithms to the online graph coloring problem. Although the simple greedy algorithm FirstFit is known to perform poorly in the worst case, we are able to establish a relationship between the structure of any input graph $G$ that is revealed online and the number of colors that FirstFit uses for $G$. Based on this relationship, we propose an online coloring algorit…

    Submitted 1 December, 2023; originally announced December 2023.

  43. arXiv:2312.00375  [pdf, other]

    cs.CV

    Text-Guided 3D Face Synthesis -- From Generation to Editing

    Authors: Yunjie Wu, Yapeng Meng, Zhipeng Hu, Lincheng Li, Haoqian Wu, Kun Zhou, Weiwei Xu, Xin Yu

    Abstract: Text-guided 3D face synthesis has achieved remarkable results by leveraging text-to-image (T2I) diffusion models. However, most existing works focus solely on direct generation and ignore editing, which prevents them from synthesizing customized 3D faces through iterative adjustments. In this paper, we propose a unified text-guided framework from face generation to editing. In the generation s…

    Submitted 1 December, 2023; originally announced December 2023.

  44. arXiv:2312.00374  [pdf, other]

    cs.CR

    The Philosopher's Stone: Trojaning Plugins of Large Language Models

    Authors: Tian Dong, Minhui Xue, Guoxing Chen, Rayne Holland, Shaofeng Li, Yan Meng, Zhen Liu, Haojin Zhu

    Abstract: Open-source Large Language Models (LLMs) have recently gained popularity because of their comparable performance to proprietary LLMs. To efficiently fulfill domain-specialized tasks, open-source LLMs can be refined, without expensive accelerators, using low-rank adapters. However, it is still unknown whether low-rank adapters can be exploited to control LLMs. To address this gap, we demonstrate th…

    Submitted 13 March, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

  45. arXiv:2311.09445  [pdf, other]

    cs.DC

    A Software-Hardware Co-Optimized Toolkit for Deep Reinforcement Learning on Heterogeneous Platforms

    Authors: Yuan Meng, Michael Kinsner, Deshanand Singh, Mahesh A Iyer, Viktor Prasanna

    Abstract: Deep Reinforcement Learning (DRL) is vital in various AI applications. DRL algorithms comprise diverse compute kernels, which may not be simultaneously optimized using a homogeneous architecture. However, even with available heterogeneous architectures, optimizing DRL performance remains a challenge due to the complexity of hardware and programming models employed in modern data centers. To addres…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: Submitted to IPDPS 2024

  46. arXiv:2310.16717  [pdf, other]

    cs.CV

    Prompt-Driven Building Footprint Extraction in Aerial Images with Offset-Building Model

    Authors: Kai Li, Yupeng Deng, Yunlong Kong, Diyou Liu, Jingbo Chen, Yu Meng, Junxian Ma

    Abstract: More accurate extraction of invisible building footprints from very-high-resolution (VHR) aerial images relies on roof segmentation and roof-to-footprint offset extraction. Existing state-of-the-art methods based on instance segmentation suffer from poor generalization when extended to large-scale data production and fail to achieve low-cost human interactive annotation. The latest prompt paradigm…

    Submitted 11 March, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

    ACM Class: I.4.6; I.4.7; I.3.5; I.5.1

  47. arXiv:2310.07641  [pdf, other]

    cs.CL cs.LG

    Evaluating Large Language Models at Evaluating Instruction Following

    Authors: Zhiyuan Zeng, Jiatong Yu, Tianyu Gao, Yu Meng, Tanya Goyal, Danqi Chen

    Abstract: As research in large language models (LLMs) continues to accelerate, LLM-based evaluation has emerged as a scalable and cost-effective alternative to human evaluations for comparing the ever-increasing list of models. This paper investigates the efficacy of these "LLM evaluators", particularly in using them to assess instruction following, a metric that gauges how closely generated text adheres…

    Submitted 16 April, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: ICLR 2024

  48. arXiv:2310.06684  [pdf, other]

    cs.CL cs.LG

    Learning Multiplex Representations on Text-Attributed Graphs with One Language Model Encoder

    Authors: Bowen Jin, Wentao Zhang, Yu Zhang, Yu Meng, Han Zhao, Jiawei Han

    Abstract: In real-world scenarios, texts in a graph are often linked by multiple semantic relations (e.g., papers in an academic graph are referenced by other publications, written by the same author, or published in the same venue), where text documents and their relations form a multiplex text-attributed graph. Mainstream text representation learning methods use pretrained language models (PLMs) to genera…

    Submitted 13 July, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: 9 pages, 11 appendix pages

  49. arXiv:2310.05447  [pdf, other]

    cs.CV

    Towards Fair and Comprehensive Comparisons for Image-Based 3D Object Detection

    Authors: Xinzhu Ma, Yongtao Wang, Yinmin Zhang, Zhiyi Xia, Yuan Meng, Zhihui Wang, Haojie Li, Wanli Ouyang

    Abstract: In this work, we build a modular-designed codebase, formulate strong training recipes, design an error diagnosis toolbox, and discuss current methods for image-based 3D object detection. In particular, different from other highly mature tasks, e.g., 2D object detection, the community of image-based 3D object detection is still evolving, where methods often adopt different training recipes and tric…

    Submitted 11 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: ICCV23, code will be released soon

  50. arXiv:2310.05313  [pdf, other]

    cs.PF cs.DC

    Accelerating Deep Neural Network guided MCTS using Adaptive Parallelism

    Authors: Yuan Meng, Qian Wang, Tianxin Zu, Viktor Prasanna

    Abstract: Deep Neural Network guided Monte-Carlo Tree Search (DNN-MCTS) is a powerful class of AI algorithms. In DNN-MCTS, a Deep Neural Network model is trained collaboratively with a dynamic Monte-Carlo search tree to guide the agent towards actions that yield the highest returns. While the DNN operations are highly parallelizable, the search tree operations involved in MCTS are sequential and often beco…

    Submitted 8 October, 2023; originally announced October 2023.

    Comments: The first two authors contributed equally. Accepted to the 13th Workshop on Irregular Applications: Architectures and Algorithms (IA^3) 2023