Showing 1–50 of 38,265 results for author: Chen

Searching in archive cs.
  1. arXiv:2407.04699  [pdf, other]

    cs.CV cs.AI

    LaRa: Efficient Large-Baseline Radiance Fields

    Authors: Anpei Chen, Haofei Xu, Stefano Esposito, Siyu Tang, Andreas Geiger

    Abstract: Radiance field methods have achieved photorealistic novel view synthesis and geometry reconstruction. But they are mostly applied in per-scene optimization or small-baseline settings. While several recent works investigate feed-forward reconstruction with large baselines by utilizing transformers, they all operate with a standard global attention mechanism and hence ignore the local nature of 3D r…

    Submitted 5 July, 2024; originally announced July 2024.

  2. arXiv:2407.04693  [pdf, other]

    cs.CL cs.AI

    ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models

    Authors: Yuzhe Gu, Ziwei Ji, Wenwei Zhang, Chengqi Lyu, Dahua Lin, Kai Chen

    Abstract: Large language models (LLMs) exhibit hallucinations in long-form question-answering tasks across various domains and wide applications. Current hallucination detection and mitigation datasets are limited in domains and sizes, which struggle to scale due to prohibitive labor costs and insufficient reliability of existing hallucination annotators. To facilitate the scalable oversight of LLM hallucin…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 9 pages

  3. arXiv:2407.04688  [pdf, other]

    cs.CV

    Enhancing Vehicle Re-identification and Matching for Weaving Analysis

    Authors: Mei Qiu, Wei Lin, Stanley Chien, Lauren Christopher, Yaobin Chen, Shu Hu

    Abstract: Vehicle weaving on highways contributes to traffic congestion, raises safety issues, and underscores the need for sophisticated traffic management systems. Current tools are inadequate in offering precise and comprehensive data on lane-specific weaving patterns. This paper introduces an innovative method for collecting non-overlapping video data in weaving zones, enabling the generation of quantit…

    Submitted 5 July, 2024; originally announced July 2024.

  4. arXiv:2407.04686  [pdf, other]

    cs.DS math.NA

    Near-optimal hierarchical matrix approximation from matrix-vector products

    Authors: Tyler Chen, Feyza Duman Keles, Diana Halikias, Cameron Musco, Christopher Musco, David Persson

    Abstract: We describe a randomized algorithm for producing a near-optimal hierarchical off-diagonal low-rank (HODLR) approximation to an $n\times n$ matrix $\mathbf{A}$, accessible only through matrix-vector products with $\mathbf{A}$ and $\mathbf{A}^{\mathsf{T}}$. We prove that, for the rank-$k$ HODLR approximation problem, our method achieves a $(1+β)^{\log(n)}$-optimal approximation in expected Frobenius…

    Submitted 5 July, 2024; originally announced July 2024.

  5. arXiv:2407.04681  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge

    Authors: Yuanze Lin, Yunsheng Li, Dongdong Chen, Weijian Xu, Ronald Clark, Philip Torr, Lu Yuan

    Abstract: In recent years, multimodal large language models (MLLMs) have made significant strides by training on vast high-quality image-text datasets, enabling them to generally understand images well. However, the inherent difficulty in explicitly conveying fine-grained or spatially dense information in text, such as masks, poses a challenge for MLLMs, limiting their ability to answer questions requiring…

    Submitted 5 July, 2024; originally announced July 2024.

  6. arXiv:2407.04675  [pdf, other]

    eess.AS cs.SD

    Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

    Authors: Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, Chen Ding, Linhao Dong, Qianqian Dong, Yujiao Du, Kepan Gao, Lu Gao, Yi Guo, Minglun Han, Ting Han, Wenchao Hu, Xinying Hu, Yuxiang Hu, Deyu Hua, Lu Huang, Mingkun Huang, Youjia Huang, Jishuo Jin, Fanliu Kong, Zongwei Lan, Tianyu Li, et al. (30 additional authors not shown)

    Abstract: Modern automatic speech recognition (ASR) models are required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc.) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data matching scenarios and are gradually approaching a bottleneck. In this wor…

    Submitted 5 July, 2024; originally announced July 2024.

  7. arXiv:2407.04672  [pdf, ps, other]

    cs.DS math.PR

    Rapid Mixing via Coupling Independence for Spin Systems with Unbounded Degree

    Authors: Xiaoyu Chen, Weiming Feng

    Abstract: We develop a new framework to prove the mixing or relaxation time for the Glauber dynamics on spin systems with unbounded degree. It works for general spin systems including both $2$-spin and multi-spin systems. As applications for this approach: $\bullet$ We prove the optimal $O(n)$ relaxation time for the Glauber dynamics of random $q$-list-coloring on an $n$-vertices triangle-tree graph with…

    Submitted 5 July, 2024; originally announced July 2024.

  8. arXiv:2407.04620  [pdf, other]

    cs.LG cs.AI cs.CL

    Learning to (Learn at Test Time): RNNs with Expressive Hidden States

    Authors: Yu Sun, Xinhao Li, Karan Dalal, Jiarui Xu, Arjun Vikram, Genghan Zhang, Yann Dubois, Xinlei Chen, Xiaolong Wang, Sanmi Koyejo, Tatsunori Hashimoto, Carlos Guestrin

    Abstract: Self-attention performs well in long context but has quadratic complexity. Existing RNN layers have linear complexity, but their performance in long context is limited by the expressive power of their hidden state. We propose a new class of sequence modeling layers with linear complexity and an expressive hidden state. The key idea is to make the hidden state a machine learning model itself, and t…

    Submitted 5 July, 2024; originally announced July 2024.

  9. arXiv:2407.04608  [pdf, other]

    math.OC cs.GT cs.MA

    A Multi-Player Potential Game Approach for Sensor Network Localization with Noisy Measurements

    Authors: Gehui Xu, Guanpu Chen, Baris Fidan, Yiguang Hong, Hongsheng Qi, Thomas Parisini, Karl H. Johansson

    Abstract: Sensor network localization (SNL) is a challenging problem due to its inherent non-convexity and the effects of noise in inter-node ranging measurements and anchor node position. We formulate a non-convex SNL problem as a multi-player non-convex potential game and investigate the existence and uniqueness of a Nash equilibrium (NE) in both the ideal setting without measurement noise and the practic…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2311.03326, arXiv:2401.02471

  10. arXiv:2407.04576  [pdf, other]

    cs.DM cs.DS math.PR

    Optimal Mixing for Randomly Sampling Edge Colorings on Trees Down to the Max Degree

    Authors: Charlie Carlson, Xiaoyu Chen, Weiming Feng, Eric Vigoda

    Abstract: We address the convergence rate of Markov chains for randomly generating an edge coloring of a given tree. Our focus is on the Glauber dynamics which updates the color at a randomly chosen edge in each step. For a tree $T$ with $n$ vertices and maximum degree $Δ$, when the number of colors $q$ satisfies $q\geqΔ+2$ then we prove that the Glauber dynamics has an optimal relaxation time of $O(n)$, wh…

    Submitted 5 July, 2024; originally announced July 2024.

  11. arXiv:2407.04490  [pdf, other]

    cs.CV

    Micro-gesture Online Recognition using Learnable Query Points

    Authors: Pengyu Liu, Fei Wang, Kun Li, Guoliang Chen, Yanyan Wei, Shengeng Tang, Zhiliang Wu, Dan Guo

    Abstract: In this paper, we briefly introduce the solution developed by our team, HFUT-VUT, for the Micro-gesture Online Recognition track in the MiGA challenge at IJCAI 2024. The Micro-gesture Online Recognition task involves identifying the category and locating the start and end times of micro-gestures in video clips. Compared to the typical Temporal Action Detection task, the Micro-gesture Online Recogn…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Technical Report of HFUT-VUT for the MiGA challenge at IJCAI 2024

  12. arXiv:2407.04486  [pdf, other]

    q-bio.QM cs.AI

    Variational and Explanatory Neural Networks for Encoding Cancer Profiles and Predicting Drug Responses

    Authors: Tianshu Feng, Rohan Gnanaolivu, Abolfazl Safikhani, Yuanhang Liu, Jun Jiang, Nicholas Chia, Alexander Partin, Priyanka Vasanthakumari, Yitan Zhu, Chen Wang

    Abstract: Human cancers present a significant public health challenge and require the discovery of novel drugs through translational research. Transcriptomics profiling data that describe molecular activities in tumors and cancer cell lines are widely utilized for predicting anti-cancer drug responses. However, existing AI models face challenges due to noise in transcriptomics data and lack of biological i…

    Submitted 5 July, 2024; originally announced July 2024.

  13. arXiv:2407.04460  [pdf, other]

    cs.LG

    Smart Sampling: Helping from Friendly Neighbors for Decentralized Federated Learning

    Authors: Lin Wang, Yang Chen, Yongxin Guo, Xiaoying Tang

    Abstract: Federated Learning (FL) is gaining widespread interest for its ability to share knowledge while preserving privacy and reducing communication costs. Unlike Centralized FL, Decentralized FL (DFL) employs a network architecture that eliminates the need for a central server, allowing direct communication among clients and leading to significant communication resource savings. However, due to data het…

    Submitted 5 July, 2024; originally announced July 2024.

  14. arXiv:2407.04416  [pdf, other]

    cs.SD cs.MM eess.AS

    Improving Audio Generation with Visual Enhanced Caption

    Authors: Yi Yuan, Dongya Jia, Xiaobin Zhuang, Yuanzhe Chen, Zhengxi Liu, Zhuo Chen, Yuping Wang, Yuxuan Wang, Xubo Liu, Mark D. Plumbley, Wenwu Wang

    Abstract: Generative models have shown significant achievements in audio generation tasks. However, existing models struggle with complex and detailed prompts, leading to potential performance degradation. We hypothesize that this problem stems from the low quality and relatively small quantity of training data. In this work, we aim to create a large-scale audio dataset with rich captions for improving audi…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 5 pages with 1 appendix

  15. arXiv:2407.04411  [pdf, other]

    cs.CR cs.AI cs.CL

    Waterfall: Framework for Robust and Scalable Text Watermarking

    Authors: Gregory Kang Ruey Lau, Xinyuan Niu, Hieu Dao, Jiangwei Chen, Chuan-Sheng Foo, Bryan Kian Hsiang Low

    Abstract: Protecting intellectual property (IP) of text such as articles and code is increasingly important, especially as sophisticated attacks become possible, such as paraphrasing by large language models (LLMs) or even unauthorized training of LLMs on copyrighted text to infringe such IP. However, existing text watermarking methods are not robust enough against such attacks nor scalable to millions of u…

    Submitted 5 July, 2024; originally announced July 2024.

  16. arXiv:2407.04362  [pdf, other]

    cs.CV cs.HC

    Towards Context-aware Support for Color Vision Deficiency: An Approach Integrating LLM and AR

    Authors: Shogo Morita, Yan Zhang, Takuto Yamauchi, Sinan Chen, Jialong Li, Kenji Tei

    Abstract: People with color vision deficiency often face challenges in distinguishing colors such as red and green, which can complicate daily tasks and require the use of assistive tools or environmental adjustments. Current support tools mainly focus on presentation-based aids, like the color vision modes found in iPhone accessibility settings. However, offering context-aware support, like indicating the…

    Submitted 5 July, 2024; originally announced July 2024.

  17. arXiv:2407.04336  [pdf, ps, other]

    eess.SP cs.AI

    AI-Based Beam-Level and Cell-Level Mobility Management for High Speed Railway Communications

    Authors: Wen Li, Wei Chen, Shiyue Wang, Yuanyuan Zhang, Michail Matthaiou, Bo Ai

    Abstract: High-speed railway (HSR) communications are pivotal for ensuring rail safety, operations, maintenance, and delivering passenger information services. The high speed of trains creates rapidly time-varying wireless channels, increases the signaling overhead, and reduces the system throughput, making it difficult to meet the growing and stringent needs of HSR applications. In this article, we explore…

    Submitted 5 July, 2024; originally announced July 2024.

  18. arXiv:2407.04315  [pdf, other]

    cs.RO

    Gradient-based Regularization for Action Smoothness in Robotic Control with Reinforcement Learning

    Authors: I Lee, Hoang-Giang Cao, Cong-Tinh Dao, Yu-Cheng Chen, I-Chen Wu

    Abstract: Deep Reinforcement Learning (DRL) has achieved remarkable success, ranging from complex computer games to real-world applications, showing the potential for intelligent agents capable of learning in dynamic environments. However, its application in real-world scenarios presents challenges, including the jerky problem, in which jerky trajectories not only compromise system safety but also increase…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024

  19. arXiv:2407.04297  [pdf, other]

    cs.CR

    HuntFUZZ: Enhancing Error Handling Testing through Clustering Based Fuzzing

    Authors: Jin Wei, Ping Chen, Jun Dai, Xiaoyan Sun, Zhihao Zhang, Chang Xu, Yi Wanga

    Abstract: Testing a program's capability to effectively handle errors is a significant challenge, given that program errors are relatively uncommon. To solve this, Software Fault Injection (SFI)-based fuzzing integrates SFI and traditional fuzzing, injecting and triggering errors for testing (error handling) code. However, we observe that current SFI-based fuzzing approaches have overlooked the correlatio…

    Submitted 5 July, 2024; originally announced July 2024.

  20. arXiv:2407.04294  [pdf, other]

    cs.CR

    SQLaser: Detecting DBMS Logic Bugs with Clause-Guided Fuzzing

    Authors: Jin Wei, Ping Chen, Kangjie Lu, Jun Dai, Xiaoyan Sun

    Abstract: Database Management Systems (DBMSs) are vital components in modern data-driven systems. Their complexity often leads to logic bugs, which are implementation errors within the DBMSs that can lead to incorrect query results, data exposure, unauthorized access, etc., without necessarily causing visible system failures. Existing detection employs two strategies: rule-based bug detection and coverage-g…

    Submitted 5 July, 2024; originally announced July 2024.

  21. arXiv:2407.04281  [pdf, other]

    cs.RO

    WOMD-Reasoning: A Large-Scale Language Dataset for Interaction and Driving Intentions Reasoning

    Authors: Yiheng Li, Chongjian Ge, Chenran Li, Chenfeng Xu, Masayoshi Tomizuka, Chen Tang, Mingyu Ding, Wei Zhan

    Abstract: We propose Waymo Open Motion Dataset-Reasoning (WOMD-Reasoning), a language annotation dataset built on WOMD, with a focus on describing and reasoning about interactions and intentions in driving scenarios. Previous language datasets primarily captured interactions caused by close distances. However, interactions induced by traffic rules and human intentions, which can occur over long distances, are yet…

    Submitted 5 July, 2024; originally announced July 2024.

  22. arXiv:2407.04264  [pdf, ps, other]

    cs.LG math.OC

    Langevin Dynamics: A Unified Perspective on Optimization via Lyapunov Potentials

    Authors: August Y. Chen, Ayush Sekhari, Karthik Sridharan

    Abstract: We study the problem of non-convex optimization using Stochastic Gradient Langevin Dynamics (SGLD). SGLD is a natural and popular variation of stochastic gradient descent where at each step, appropriately scaled Gaussian noise is added. To our knowledge, the only strategy for showing global convergence of SGLD on the loss function is to show that SGLD can sample from a stationary distribution whic…

    Submitted 5 July, 2024; originally announced July 2024.

  23. arXiv:2407.04242  [pdf, other]

    cs.CV

    Fine-grained Context and Multi-modal Alignment for Freehand 3D Ultrasound Reconstruction

    Authors: Zhongnuo Yan, Xin Yang, Mingyuan Luo, Jiongquan Chen, Rusi Chen, Lian Liu, Dong Ni

    Abstract: Fine-grained spatio-temporal learning is crucial for freehand 3D ultrasound reconstruction. Previous works mainly resorted to coarse-grained spatial features and separated temporal dependency learning, and struggled with fine-grained spatio-temporal learning. Mining spatio-temporal information in fine-grained scales is extremely challenging due to learning difficulties in long-range dependen…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Accepted at MICCAI 2024. This is the submitted manuscript and the preprint has not undergone peer review (when applicable) or any post-submission improvements or corrections

  24. arXiv:2407.04217  [pdf, other]

    cs.DB cs.IR

    An Interactive Multi-modal Query Answering System with Retrieval-Augmented Large Language Models

    Authors: Mengzhao Wang, Haotian Wu, Xiangyu Ke, Yunjun Gao, Xiaoliang Xu, Lu Chen

    Abstract: Retrieval-augmented Large Language Models (LLMs) have reshaped traditional query-answering systems, offering unparalleled user experiences. However, existing retrieval techniques often struggle to handle multi-modal query contexts. In this paper, we present an interactive Multi-modal Query Answering (MQA) system, empowered by our newly developed multi-modal retrieval framework and navigation graph…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: This demo paper has been accepted by VLDB 2024

  25. arXiv:2407.04215  [pdf, other]

    cs.CV

    T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models

    Authors: Zhongqi Wang, Jie Zhang, Shiguang Shan, Xilin Chen

    Abstract: While text-to-image diffusion models demonstrate impressive generation capabilities, they also exhibit vulnerability to backdoor attacks, which involve the manipulation of model outputs through malicious triggers. In this paper, for the first time, we propose a comprehensive defense method named T2IShield to detect, localize, and mitigate such attacks. Specifically, we find the "Assimilation Pheno…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  26. arXiv:2407.04203  [pdf, other]

    cs.CV

    HCS-TNAS: Hybrid Constraint-driven Semi-supervised Transformer-NAS for Ultrasound Image Segmentation

    Authors: Renqi Chen

    Abstract: Accurate ultrasound segmentation is pursued because it aids clinicians in achieving a comprehensive diagnosis. Due to the presence of low image quality and high costs associated with annotation, two primary concerns arise: (1) enhancing the understanding of multi-scale features, and (2) improving the resistance to data dependency. To mitigate these concerns, we propose HCS-TNAS, a novel neural arc…

    Submitted 4 July, 2024; originally announced July 2024.

  27. arXiv:2407.04174  [pdf, other]

    cs.NI eess.SP

    Gemini: Integrating Full-fledged Sensing upon Millimeter Wave Communications

    Authors: Yilong Li, Zhe Chen, Jun Luo, Suman Banerjee

    Abstract: Integrating millimeter wave (mmWave) technology in both communication and sensing is promising as it enables the reuse of existing spectrum and infrastructure without draining resources. Most existing systems piggyback sensing onto conventional communication modes without fully exploiting the potential of integrated sensing and communication (ISAC) in mmWave radios (not full-fledged). In this paper…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 12 pages

  28. arXiv:2407.04151  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    Securing Multi-turn Conversational Language Models Against Distributed Backdoor Triggers

    Authors: Terry Tong, Jiashu Xu, Qin Liu, Muhao Chen

    Abstract: The security of multi-turn conversational large language models (LLMs) is understudied despite it being one of the most popular uses of LLMs. Specifically, LLMs are vulnerable to data poisoning backdoor attacks, where an adversary manipulates the training data to cause the model to output malicious responses to predefined triggers. Specific to the multi-turn dialogue setting, LLMs are at the ri…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Submitted to EMNLP 2024

  29. arXiv:2407.04147  [pdf, other]

    cs.SE

    ALPINE: An adaptive language-agnostic pruning method for language models for code

    Authors: Mootez Saad, José Antonio Hernández López, Boqi Chen, Dániel Varró, Tushar Sharma

    Abstract: Language models of code have demonstrated state-of-the-art performance across various software engineering and source code analysis tasks. However, their demanding computational resource requirements and consequential environmental footprint remain significant challenges. This work introduces ALPINE, an adaptive programming language-agnostic pruning technique designed to substantially reduce th…

    Submitted 4 July, 2024; originally announced July 2024.

  30. arXiv:2407.04121  [pdf, other]

    cs.CL cs.AI

    Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models

    Authors: Yuyan Chen, Qiang Fu, Yichen Yuan, Zhihao Wen, Ge Fan, Dayiheng Liu, Dongmei Zhang, Zhixu Li, Yanghua Xiao

    Abstract: Large Language Models (LLMs) have gained widespread adoption in various natural language processing tasks, including question answering and dialogue systems. However, a major drawback of LLMs is the issue of hallucination, where they generate unfaithful or inconsistent content that deviates from the input source, leading to severe consequences. In this paper, we propose a robust discriminator name…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted to CIKM 2023 (Long Paper)

  31. arXiv:2407.04118  [pdf, other]

    cs.CL cs.AI

    MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization

    Authors: Yuyan Chen, Zhihao Wen, Ge Fan, Zhengyu Chen, Wei Wu, Dayiheng Liu, Zhixu Li, Bang Liu, Yanghua Xiao

    Abstract: Prompt engineering, as an efficient and effective way to leverage Large Language Models (LLMs), has drawn a lot of attention from the research community. The existing research primarily emphasizes the importance of adapting prompts to specific tasks, rather than specific LLMs. However, a good prompt is not solely defined by its wording, but also binds to the nature of the LLM in question. In this w…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted to EMNLP 2023 (Findings)

  32. arXiv:2407.04106  [pdf, other]

    cs.AI cs.CL cs.CV

    MiniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis

    Authors: Asma Alkhaldi, Raneem Alnajim, Layan Alabdullatef, Rawan Alyahya, Jun Chen, Deyao Zhu, Ahmed Alsinan, Mohamed Elhoseiny

    Abstract: Recent advancements in artificial intelligence (AI) have precipitated significant breakthroughs in healthcare, particularly in refining diagnostic procedures. However, previous studies have often been constrained to limited functionalities. This study introduces MiniGPT-Med, a vision-language model derived from large-scale language models and tailored for medical applications. MiniGPT-Med demonstr…

    Submitted 4 July, 2024; originally announced July 2024.

  33. arXiv:2407.04105  [pdf, other]

    cs.CL cs.AI

    Can Pre-trained Language Models Understand Chinese Humor?

    Authors: Yuyan Chen, Zhixu Li, Jiaqing Liang, Yanghua Xiao, Bang Liu, Yunwen Chen

    Abstract: Humor understanding is an important and challenging research topic in natural language processing. With the popularity of pre-trained language models (PLMs), some recent work makes preliminary attempts to adopt PLMs for humor recognition and generation. However, these simple attempts do not substantially answer the question: whether PLMs are capable of humor understanding? This paper is the first wo…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted to WSDM 2022

  34. arXiv:2407.04055  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Benchmark on Drug Target Interaction Modeling from a Structure Perspective

    Authors: Xinnan Zhang, Jialin Wu, Junyi Xie, Tianlong Chen, Kaixiong Zhou

    Abstract: The prediction modeling of drug-target interactions is crucial to drug discovery and design, which has seen rapid advancements owing to deep learning technologies. Recently developed methods, such as those based on graph neural networks (GNNs) and Transformers, demonstrate exceptional performance across various datasets by effectively extracting structural information. However, the benchmarking of…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Submitted to NIPS 2024 Dataset and Benchmark

  35. arXiv:2407.04041  [pdf, other]

    cs.CV

    Towards Cross-View-Consistent Self-Supervised Surround Depth Estimation

    Authors: Laiyan Ding, Hualie Jiang, Jie Li, Yongquan Chen, Rui Huang

    Abstract: Depth estimation is a cornerstone for autonomous driving, yet acquiring per-pixel depth ground truth for supervised learning is challenging. Self-Supervised Surround Depth Estimation (SSSDE) from consecutive images offers an economical alternative. While previous SSSDE methods have proposed different mechanisms to fuse information across images, few of them explicitly consider the cross-view const…

    Submitted 4 July, 2024; originally announced July 2024.

  36. arXiv:2407.04029  [pdf, other]

    cs.LG

    Robust Learning under Hybrid Noise

    Authors: Yang Wei, Shuo Chen, Shanshan Ye, Bo Han, Chen Gong

    Abstract: Feature noise and label noise are ubiquitous in practical scenarios and pose great challenges for training a robust machine learning model. Most previous approaches usually deal with only a single problem of either feature noise or label noise. However, in real-world applications, hybrid noise, which contains both feature noise and label noise, is very common due to the unreliable data collecti…

    Submitted 4 July, 2024; originally announced July 2024.

  37. arXiv:2407.03994  [pdf, other]

    cs.CL cs.AI

    Unlocking the Potential of Model Merging for Low-Resource Languages

    Authors: Mingxu Tao, Chen Zhang, Quzhe Huang, Tianyao Ma, Songfang Huang, Dongyan Zhao, Yansong Feng

    Abstract: Adapting large language models (LLMs) to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT). However, this CT-then-SFT approach struggles with limited data in the context of low-resource languages, failing to balance language modeling and task-solving capabilities. We thus propose model merging as an alternative for low-resource languages, combini…

    Submitted 4 July, 2024; originally announced July 2024.

  38. arXiv:2407.03963  [pdf, other]

    cs.CL cs.AI

    LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

    Authors: LLM-jp, :, Akiko Aizawa, Eiji Aramaki, Bowen Chen, Fei Cheng, Hiroyuki Deguchi, Rintaro Enomoto, Kazuki Fujii, Kensuke Fukumoto, Takuya Fukushima, Namgi Han, Yuto Harada, Chikara Hashimoto, Tatsuya Hiraoka, Shohei Hisada, Sosuke Hosokawa, Lu Jie, Keisuke Kamata, Teruhito Kanazawa, Hiroki Kanezashi, Hiroshi Kataoka, Satoru Katsumata, Daisuke Kawahara, Seiya Kawano, et al. (57 additional authors not shown)

    Abstract: This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its…

    Submitted 4 July, 2024; originally announced July 2024.

  39. arXiv:2407.03925  [pdf, other]

    cs.LG

    Reduced-Order Neural Operators: Learning Lagrangian Dynamics on Highly Sparse Graphs

    Authors: Hrishikesh Viswanath, Yue Chang, Julius Berner, Peter Yichen Chen, Aniket Bera

    Abstract: We present a neural operator architecture to simulate Lagrangian dynamics, such as fluid flow, granular flows, and elastoplasticity. Traditional numerical methods, such as the finite element method (FEM), suffer from long run times and large memory consumption. On the other hand, approaches based on graph neural networks are faster but still suffer from long computation times on dense graphs, whic…

    Submitted 4 July, 2024; originally announced July 2024.

  40. arXiv:2407.03917  [pdf, other]

    cs.CV

    Timestep-Aware Correction for Quantized Diffusion Models

    Authors: Yuzhe Yao, Feng Tian, Jun Chen, Haonan Lin, Guang Dai, Yong Liu, Jingdong Wang

    Abstract: Diffusion models have marked a significant breakthrough in the synthesis of semantically coherent images. However, their extensive noise estimation networks and the iterative generation process limit their wider application, particularly on resource-constrained platforms like mobile devices. Existing post-training quantization (PTQ) methods have managed to compress diffusion models to low precisio…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  41. arXiv:2407.03897  [pdf, other]

    cs.LG

    gFlora: a topology-aware method to discover functional co-response groups in soil microbial communities

    Authors: Nan Chen, Merlijn Schram, Doina Bucur

    Abstract: We aim to learn the functional co-response group: a group of taxa whose co-response effect (the representative characteristic of the group) associates well statistically with a functional variable. Different from the state-of-the-art method, we model the soil microbial community as an ecological co-occurrence network with the taxa as nodes (weighted by their abundance) and their relationships (a c…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: BIOKDD accepted

  42. arXiv:2407.03892  [pdf, other]

    cs.SD cs.AI eess.AS

    On the Effectiveness of Acoustic BPE in Decoder-Only TTS

    Authors: Bohan Li, Feiyu Shen, Yiwei Guo, Shuai Wang, Xie Chen, Kai Yu

    Abstract: Discretizing speech into tokens and generating them by a decoder-only model has been a promising direction for text-to-speech (TTS) and spoken language modeling (SLM). To shorten the sequence length of speech tokens, acoustic byte-pair encoding (BPE) has emerged in SLM that treats speech tokens from self-supervised semantic representations as characters to further compress the token sequence. But…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 5 pages, 3 tables, 1 figure. Accepted to Interspeech 2024

  43. arXiv:2407.03804  [pdf, other]

    cs.LG cs.NI

    Multi-Time Scale Service Caching and Pricing in MEC Systems with Dynamic Program Popularity

    Authors: Yiming Chen, Xingyuan Hu, Bo Gu, Shimin Gong, Zhou Su

    Abstract: In mobile edge computing systems, base stations (BSs) equipped with edge servers can provide computing services to users to reduce their task execution time. However, there is always a conflict of interest between the BS and users. The BS prices the service programs based on user demand to maximize its own profit, while the users determine their offloading strategies based on the prices to minimiz…

    Submitted 4 July, 2024; originally announced July 2024.

  44. arXiv:2407.03776  [pdf, other]

    cs.IT

    Energy-Efficient Probabilistic Semantic Communication over Space-Air-Ground Integrated Networks

    Authors: Zhouxiang Zhao, Zhaohui Yang, Mingzhe Chen, Zhaoyang Zhang, Wei Xu, Kaibin Huang

    Abstract: Space-air-ground integrated networks (SAGINs) are emerging as a pivotal element in the evolution of future wireless networks. Despite their potential, the joint design of communication and computation within SAGINs remains a formidable challenge. In this paper, the problem of energy efficiency in a SAGIN-enabled probabilistic semantic communication (PSC) system is investigated. In the considered mod…

    Submitted 4 July, 2024; originally announced July 2024.

  45. arXiv:2407.03750  [pdf, other]

    cs.DB

    GriDB: Scaling Blockchain Database via Sharding and Off-Chain Cross-Shard Mechanism

    Authors: Zicong Hong, Song Guo, Enyuan Zhou, Wuhui Chen, Huawei Huang, Albert Zomaya

    Abstract: Blockchain databases have attracted widespread attention but suffer from poor scalability due to underlying non-scalable blockchains. While blockchain sharding is necessary for a scalable blockchain database, it poses a new challenge named on-chain cross-shard database services. Each cross-shard database service (e.g., cross-shard queries or inter-shard load balancing) involves massive cross-shard…

    Submitted 4 July, 2024; originally announced July 2024.

  46. arXiv:2407.03741  [pdf, other]

    cs.IT

    A Unified Expression for Upper Bounds on the BLER of Spinal Codes over Fading Channels

    Authors: Aimin Li, Xiaomeng Chen, Shaohua Wu, Gary C. F. Lee, Sumei Sun

    Abstract: Performance evaluation of particular channel coding has been a significant topic in coding theory, often involving the use of bounding techniques. This paper focuses on the new family of capacity-achieving codes, Spinal codes, to provide a comprehensive analysis framework to tightly upper bound the block error rate (BLER) of Spinal codes in the finite block length (FBL) regime. First, we resort to…

    Submitted 4 July, 2024; originally announced July 2024.

  47. arXiv:2407.03720  [pdf, other]

    cs.IR cs.CL

    Query-oriented Data Augmentation for Session Search

    Authors: Haonan Chen, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen

    Abstract: Modeling contextual information in a search session has drawn more and more attention when understanding complex user intents. Recent methods are all data-driven, i.e., they train different models on large-scale search log data to identify the relevance between search contexts and candidate documents. The common training paradigm is to pair the search context with different candidate documents and…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: TKDE 2024

  48. arXiv:2407.03719  [pdf, other]

    cs.CV

    Relative Difficulty Distillation for Semantic Segmentation

    Authors: Dong Liang, Yue Sun, Yun Du, Songcan Chen, Sheng-Jun Huang

    Abstract: Current knowledge distillation (KD) methods primarily focus on transferring various structured knowledge and designing corresponding optimization goals to encourage the student network to imitate the output of the teacher network. However, introducing too many additional optimization objectives may lead to unstable training, such as gradient conflicts. Moreover, these methods ignored the guideline…

    Submitted 4 July, 2024; originally announced July 2024.

  49. arXiv:2407.03672  [pdf, other]

    cs.LG cs.AI

    A Survey of Data Synthesis Approaches

    Authors: Hsin-Yu Chang, Pei-Yu Chen, Tun-Hsiang Chou, Chang-Sheng Kao, Hsuan-Yun Yu, Yen-Ting Lin, Yun-Nung Chen

    Abstract: This paper provides a detailed survey of synthetic data techniques. We first discuss the expected goals of using synthetic data in data augmentation, which can be divided into four parts: 1) Improving Diversity, 2) Data Balancing, 3) Addressing Domain Shift, and 4) Resolving Edge Cases. Synthesizing data is closely related to the prevailing machine learning techniques at the time; therefore, we s…

    Submitted 4 July, 2024; originally announced July 2024.

  50. arXiv:2407.03658  [pdf, other]

    cs.CL

    GPT-4 vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels

    Authors: Jianhao Yan, Pingchuan Yan, Yulong Chen, Judy Li, Xianchao Zhu, Yue Zhang

    Abstract: This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains. Through carefully designed annotation rounds, we find that GPT-4 performs comparably to junior translators in terms of total errors made but lags behind medium and senior translators. We a…

    Submitted 4 July, 2024; originally announced July 2024.