Showing 1–50 of 6,253 results for author: Wang, Z

Searching in archive cs. Search in all archives.
  1. arXiv:2408.12446  [pdf, other]

    q-fin.RM cs.LG q-fin.ST

    EX-DRL: Hedging Against Heavy Losses with EXtreme Distributional Reinforcement Learning

    Authors: Parvin Malekzadeh, Zissis Poulos, Jacky Chen, Zeyu Wang, Konstantinos N. Plataniotis

    Abstract: Recent advancements in Distributional Reinforcement Learning (DRL) for modeling loss distributions have shown promise in developing hedging strategies in derivatives markets. A common approach in DRL involves learning the quantiles of loss distributions at specified levels using Quantile Regression (QR). This method is particularly effective in option hedging due to its direct quantile-based risk…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: 14 pages

  2. arXiv:2408.12169  [pdf, other]

    cs.HC

    ReorderBench: A Benchmark for Matrix Reordering

    Authors: Jiangning Zhu, Zheng Wang, Zhiyang Shen, Lai Wei, Fengyuan Tian, Mengchen Liu, Shixia Liu

    Abstract: Matrix reordering permutes the rows and columns of a matrix to reveal meaningful visual patterns, such as blocks that represent clusters. A comprehensive collection of matrices, along with a scoring method for measuring the quality of visual patterns in these matrices, contributes to building a benchmark. This benchmark is essential for selecting or designing suitable reordering algorithms for spe…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: Submitted to IEEE TVCG

  3. arXiv:2408.12142  [pdf, other]

    cs.CL cs.AI

    MDD-5k: A New Diagnostic Conversation Dataset for Mental Disorders Synthesized via Neuro-Symbolic LLM Agents

    Authors: Congchi Yin, Feng Li, Shu Zhang, Zike Wang, Jun Shao, Piji Li, Jianhua Chen, Xun Jiang

    Abstract: The clinical diagnosis of most mental disorders primarily relies on the conversations between psychiatrist and patient. The creation of such diagnostic conversation datasets is promising to boost the AI mental healthcare community. However, directly collecting the conversations in real diagnosis scenarios is near impossible due to stringent privacy and ethical considerations. To address this issue…

    Submitted 22 August, 2024; originally announced August 2024.

  4. arXiv:2408.12128  [pdf, other]

    cs.AI cs.CV

    Diffusion-Based Visual Art Creation: A Survey and New Perspectives

    Authors: Bingyuan Wang, Qifeng Chen, Zeyu Wang

    Abstract: The integration of generative AI in visual art has revolutionized not only how visual content is created but also how AI interacts with and reflects the underlying domain knowledge. This survey explores the emerging realm of diffusion-based visual art creation, examining its development from both artistic and technical perspectives. We structure the survey into three phases, data feature and frame…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: 35 pages, 9 figures

  5. arXiv:2408.12124  [pdf, other]

    cs.LG cs.HC eess.SP

    Recording Brain Activity While Listening to Music Using Wearable EEG Devices Combined with Bidirectional Long Short-Term Memory Networks

    Authors: Jingyi Wang, Zhiqun Wang, Guiran Liu

    Abstract: Electroencephalography (EEG) signals are crucial for investigating brain function and cognitive processes. This study aims to address the challenges of efficiently recording and analyzing high-dimensional EEG signals while listening to music to recognize emotional states. We propose a method combining Bidirectional Long Short-Term Memory (Bi-LSTM) networks with attention mechanisms for EEG signal…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: 15 pages

  6. arXiv:2408.12119  [pdf, other]

    cs.CR cs.AI

    Understanding Data Reconstruction Leakage in Federated Learning from a Theoretical Perspective

    Authors: Zifan Wang, Binghui Zhang, Meng Pang, Yuan Hong, Binghui Wang

    Abstract: Federated learning (FL) is an emerging collaborative learning paradigm that aims to protect data privacy. Unfortunately, recent works show FL algorithms are vulnerable to serious data reconstruction attacks. However, existing works lack a theoretical foundation on to what extent the devices' data can be reconstructed and the effectiveness of these attacks cannot be compared fairly due to their…

    Submitted 22 August, 2024; originally announced August 2024.

  7. arXiv:2408.11871  [pdf, other]

    cs.CL cs.AI

    MegaFake: A Theory-Driven Dataset of Fake News Generated by Large Language Models

    Authors: Lionel Z. Wang, Yiming Ma, Renfei Gao, Beichen Guo, Zhuoran Li, Han Zhu, Wenqi Fan, Zexin Lu, Ka Chung Ng

    Abstract: The advent of large language models (LLMs) has revolutionized online content creation, making it much easier to generate high-quality fake news. This misuse threatens the integrity of our digital environment and ethical standards. Therefore, understanding the motivations and mechanisms behind LLM-generated fake news is crucial. In this study, we analyze the creation of fake news from a social psyc…

    Submitted 19 August, 2024; originally announced August 2024.

  8. arXiv:2408.11869  [pdf, other]

    cs.CL cs.AI cs.LG

    Enhance Lifelong Model Editing with Continuous Data-Adapter Association

    Authors: Jiaang Li, Quan Wang, Zhongnan Wang, Yongdong Zhang, Zhendong Mao

    Abstract: Large language models (LLMs) require model editing to efficiently update specific knowledge within them and avoid factual errors. Most model editing methods are solely designed for single-time use and lead to a significant forgetting effect after sequential edits over time, referred to as lifelong editing. Current approaches manage sequential edits by freezing original parameters and allocating ne…

    Submitted 18 August, 2024; originally announced August 2024.

    Comments: Preprint. Under Review

  9. arXiv:2408.11834  [pdf, other]

    cs.CV cs.AI

    SCREENER: A general framework for task-specific experiment design in quantitative MRI

    Authors: Tianshu Zheng, Zican Wang, Timothy Bray, Daniel C. Alexander, Dan Wu, Hui Zhang

    Abstract: Quantitative magnetic resonance imaging (qMRI) is increasingly investigated for use in a variety of clinical tasks from diagnosis, through staging, to treatment monitoring. However, experiment design in qMRI, the identification of the optimal acquisition protocols, has been focused on obtaining the most precise parameter estimations, with no regard for the specific requirements of downstream tasks…

    Submitted 6 August, 2024; originally announced August 2024.

  10. arXiv:2408.11811  [pdf, other]

    cs.CV cs.RO

    EmbodiedSAM: Online Segment Any 3D Thing in Real Time

    Authors: Xiuwei Xu, Huangxing Chen, Linqing Zhao, Ziwei Wang, Jie Zhou, Jiwen Lu

    Abstract: Embodied tasks require the agent to fully understand 3D scenes simultaneously with its exploration, so an online, real-time, fine-grained and highly-generalized 3D perception model is desperately needed. Since high-quality 3D data is limited, directly training such a model in 3D is almost infeasible. Meanwhile, vision foundation models (VFM) have revolutionized the field of 2D computer vision with…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: Project page: https://xuxw98.github.io/ESAM/

  11. arXiv:2408.11745  [pdf, other]

    cs.CL cs.AI

    FocusLLM: Scaling LLM's Context by Parallel Decoding

    Authors: Zhenyu Li, Yike Zhang, Tengyu Pan, Yutao Sun, Zhichao Duan, Junjie Fang, Rong Han, Zixuan Wang, Jianyong Wang

    Abstract: Empowering LLMs with the ability to utilize useful information from a long context is crucial for many downstream applications. However, achieving long context lengths with the conventional transformer architecture requires substantial training and inference resources. In this paper, we present FocusLLM, a framework designed to extend the context length of any decoder-only LLM, enabling the model…

    Submitted 21 August, 2024; originally announced August 2024.

  12. arXiv:2408.11609  [pdf, other]

    cs.CL cs.AI

    Xinyu: An Efficient LLM-based System for Commentary Generation

    Authors: Yiquan Wu, Bo Tang, Chenyang Xi, Yu Yu, Pengyu Wang, Yifei Liu, Kun Kuang, Haiying Deng, Zhiyu Li, Feiyu Xiong, Jie Hu, Peng Cheng, Zhonghao Wang, Yi Wang, Yi Luo, Mingchuan Yang

    Abstract: Commentary provides readers with a deep understanding of events by presenting diverse arguments and evidence. However, creating commentary is a time-consuming task, even for skilled commentators. Large language models (LLMs) have simplified the process of natural language generation, but their direct application in commentary creation still faces challenges due to unique task requirements. These r…

    Submitted 21 August, 2024; originally announced August 2024.

    ACM Class: I.2.7

  13. arXiv:2408.11554  [pdf, other]

    cs.CL cs.AI

    Differentiating Choices via Commonality for Multiple-Choice Question Answering

    Authors: Wenqing Deng, Zhe Wang, Kewen Wang, Shirui Pan, Xiaowang Zhang, Zhiyong Feng

    Abstract: Multiple-choice question answering (MCQA) becomes particularly challenging when all choices are relevant to the question and are semantically similar. Yet this setting of MCQA can potentially provide valuable clues for choosing the right answer. Existing models often rank each choice separately, overlooking the context provided by other choices. Specifically, they fail to leverage the semantic com…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: 9 pages, accepted to ECAI 2024

  14. arXiv:2408.11396  [pdf, other]

    cs.CL

    MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing

    Authors: Hao Zhou, Zhijun Wang, Shujian Huang, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Weihua Luo, Jiajun Chen

    Abstract: Large Language Models (LLMs) are often English-centric due to the disproportionate distribution of languages in their pre-training data. Enhancing non-English language capabilities through post-pretraining often results in catastrophic forgetting of the ability of original languages. Previous methods either achieve good expansion with severe forgetting or slight forgetting with poor expansion, ind…

    Submitted 21 August, 2024; originally announced August 2024.

  15. arXiv:2408.11372  [pdf, other]

    cs.IR cs.AI

    Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation

    Authors: Hao Wang, Yongqiang Han, Kefan Wang, Kai Cheng, Zhen Wang, Wei Guo, Yong Liu, Defu Lian, Enhong Chen

    Abstract: In the realm of recommendation systems, users exhibit a diverse array of behaviors when interacting with items. This phenomenon has spurred research into learning the implicit semantic relationships between these behaviors to enhance recommendation performance. However, these methods often entail high computational complexity. To address concerns regarding efficiency, pre-training presents a viabl…

    Submitted 21 August, 2024; originally announced August 2024.

  16. arXiv:2408.11370  [pdf, other]

    cs.LG cs.AI

    Graph Classification via Reference Distribution Learning: Theory and Practice

    Authors: Zixiao Wang, Jicong Fan

    Abstract: Graph classification is a challenging problem owing to the difficulty in quantifying the similarity between graphs or representing graphs as vectors, though there have been a few methods using graph kernels or graph neural networks (GNNs). Graph kernels often suffer from computational costs and manual feature engineering, while GNNs commonly utilize global pooling operations, risking the loss of s…

    Submitted 21 August, 2024; originally announced August 2024.

  17. arXiv:2408.11338  [pdf, other]

    cs.AI cs.LG

    Automatic Dataset Construction (ADC): Sample Collection, Data Curation, and Beyond

    Authors: Minghao Liu, Zonglin Di, Jiaheng Wei, Zhongruo Wang, Hengxiang Zhang, Ruixuan Xiao, Haoyu Wang, Jinlong Pang, Hao Chen, Ankit Shah, Hongxin Wei, Xinlei He, Zhaowei Zhao, Haobo Wang, Lei Feng, Jindong Wang, James Davis, Yang Liu

    Abstract: Large-scale data collection is essential for developing personalized training data, mitigating the shortage of training data, and fine-tuning specialized models. However, creating high-quality datasets quickly and accurately remains a challenge due to annotation errors and the substantial time and costs associated with human labor. To address these issues, we propose Automatic Dataset Construction (A…

    Submitted 21 August, 2024; originally announced August 2024.

  18. arXiv:2408.11324  [pdf, other]

    cs.SE

    HITS: High-coverage LLM-based Unit Test Generation via Method Slicing

    Authors: Zejun Wang, Kaibo Liu, Ge Li, Zhi Jin

    Abstract: Large language models (LLMs) have performed well in generating unit tests for Java projects. However, the performance for covering the complex focal methods within the projects is poor. Complex methods comprise many conditions and loops, requiring the test cases to be diverse enough to cover all lines and branches. However, existing test generation methods with LLMs provide the whole method-to-test…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: to be published in ASE 24' Research Track

  19. arXiv:2408.11313  [pdf, other]

    cs.AI

    Unlocking Adversarial Suffix Optimization Without Affirmative Phrases: Efficient Black-box Jailbreaking via LLM as Optimizer

    Authors: Weipeng Jiang, Zhenting Wang, Juan Zhai, Shiqing Ma, Zhengyu Zhao, Chao Shen

    Abstract: Despite prior safety alignment efforts, mainstream LLMs can still generate harmful and unethical content when subjected to jailbreaking attacks. Existing jailbreaking methods fall into two main categories: template-based and optimization-based methods. The former requires significant manual effort and domain knowledge, while the latter, exemplified by Greedy Coordinate Gradient (GCG), which seeks…

    Submitted 20 August, 2024; originally announced August 2024.

  20. arXiv:2408.11182  [pdf, other]

    cs.CR cs.AI

    Hide Your Malicious Goal Into Benign Narratives: Jailbreak Large Language Models through Neural Carrier Articles

    Authors: Zhilong Wang, Haizhou Wang, Nanqing Luo, Lan Zhang, Xiaoyan Sun, Yebo Cao, Peng Liu

    Abstract: Jailbreak attacks on Large Language Models (LLMs) entail crafting prompts aimed at exploiting the models to generate malicious content. This paper proposes a new type of jailbreak attack that shifts the attention of the LLM by inserting a prohibited query into a carrier article. The proposed attack leverages the knowledge graph and a composer LLM to automatically generate a carrier article that…

    Submitted 20 August, 2024; originally announced August 2024.

  21. arXiv:2408.11085  [pdf, other]

    cs.CV

    GSLoc: Efficient Camera Pose Refinement via 3D Gaussian Splatting

    Authors: Changkun Liu, Shuai Chen, Yash Bhalgat, Siyan Hu, Zirui Wang, Ming Cheng, Victor Adrian Prisacariu, Tristan Braud

    Abstract: We leverage 3D Gaussian Splatting (3DGS) as a scene representation and propose a novel test-time camera pose refinement framework, GSLoc. This framework enhances the localization accuracy of state-of-the-art absolute pose regression and scene coordinate regression methods. The 3DGS model renders high-quality synthetic images and depth maps to facilitate the establishment of 2D-3D correspondences.…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: The project page is available at https://gsloc.active.vision

  22. arXiv:2408.10901  [pdf, other]

    cs.CV cs.AI cs.LG

    A Grey-box Attack against Latent Diffusion Model-based Image Editing by Posterior Collapse

    Authors: Zhongliang Guo, Lei Fang, Jingyu Lin, Yifei Qian, Shuai Zhao, Zeyu Wang, Junhao Dong, Cunjian Chen, Ognjen Arandjelović, Chun Pong Lau

    Abstract: Recent advancements in generative AI, particularly Latent Diffusion Models (LDMs), have revolutionized image synthesis and manipulation. However, these generative techniques raise concerns about data misappropriation and intellectual property infringement. Adversarial attacks on machine learning models have been extensively studied, and a well-established body of research has extended these techn…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 21 pages, 7 figures, 10 tables

  23. arXiv:2408.10899  [pdf, other]

    cs.RO

    All Robots in One: A New Standard and Unified Dataset for Versatile, General-Purpose Embodied Agents

    Authors: Zhiqiang Wang, Hao Zheng, Yunshuang Nie, Wenjun Xu, Qingwei Wang, Hua Ye, Zhe Li, Kaidong Zhang, Xuewen Cheng, Wanxi Dong, Chang Cai, Liang Lin, Feng Zheng, Xiaodan Liang

    Abstract: Embodied AI is transforming how AI systems interact with the physical world, yet existing datasets are inadequate for developing versatile, general-purpose agents. These limitations include a lack of standardized formats, insufficient data diversity, and inadequate data volume. To address these issues, we introduce ARIO (All Robots In One), a new data standard that enhances existing datasets by of…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: Project website: https://imaei.github.io/project_pages/ario/

  24. arXiv:2408.10895  [pdf, ps, other]

    cs.AI

    Analytical and Empirical Study of Herding Effects in Recommendation Systems

    Authors: Hong Xie, Mingze Zhong, Defu Lian, Zhen Wang, Enhong Chen

    Abstract: Online rating systems are often used in numerous web or mobile applications, e.g., Amazon and TripAdvisor, to assess the ground-truth quality of products. Due to herding effects, the aggregation of historical ratings (or historical collective opinion) can significantly influence subsequent ratings, leading to misleading and erroneous assessments. We study how to manage product ratings via rating a…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 29 pages

  25. arXiv:2408.10853  [pdf, other]

    cs.SD cs.AI eess.AS

    Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio?

    Authors: Yuankun Xie, Chenxu Xiong, Xiaopeng Wang, Zhiyong Wang, Yi Lu, Xin Qi, Ruibo Fu, Yukun Liu, Zhengqi Wen, Jianhua Tao, Guanjun Li, Long Ye

    Abstract: Currently, Audio Language Models (ALMs) are rapidly advancing due to the developments in large language models and audio neural codecs. These ALMs have significantly lowered the barrier to creating deepfake audio, generating highly realistic and diverse types of deepfake audio, which pose severe threats to society. Consequently, effective audio deepfake detection technologies to detect ALM-based a…

    Submitted 20 August, 2024; originally announced August 2024.

  26. arXiv:2408.10852  [pdf, other]

    cs.SD eess.AS

    EELE: Exploring Efficient and Extensible LoRA Integration in Emotional Text-to-Speech

    Authors: Xin Qi, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Shuchen Shi, Yi Lu, Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Yukun Liu, Guanjun Li, Xuefei Liu, Yongwei Li

    Abstract: In the current era of Artificial Intelligence Generated Content (AIGC), a Low-Rank Adaptation (LoRA) method has emerged. It uses a plugin-based approach to learn new knowledge with lower parameter quantities and computational costs, and it can be plugged in and out based on the specific sub-tasks, offering high flexibility. However, the current application schemes primarily incorporate LoRA into t…

    Submitted 20 August, 2024; originally announced August 2024.

  27. arXiv:2408.10849  [pdf, other]

    cs.SD eess.AS

    A Noval Feature via Color Quantisation for Fake Audio Detection

    Authors: Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Yukun Liu, Guanjun Li, Xin Qi, Yi Lu, Xuefei Liu, Yongwei Li

    Abstract: In the field of deepfake detection, previous studies focus on using reconstruction or mask and prediction methods to train pre-trained models, which are then transferred to fake audio detection training where the encoder is used to extract features, such as wav2vec2.0 and Masked Auto Encoder. These methods have proven that using real audio for reconstruction pre-training can better help the model…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: accepted by ISCSLP2024

  28. arXiv:2408.10588  [pdf, other]

    cs.CV cs.GR

    DEGAS: Detailed Expressions on Full-Body Gaussian Avatars

    Authors: Zhijing Shao, Duotun Wang, Qing-Yao Tian, Yao-Dong Yang, Hengyu Meng, Zeyu Cai, Bo Dong, Yu Zhang, Kang Zhang, Zeyu Wang

    Abstract: Although neural rendering has made significant advancements in creating lifelike, animatable full-body and head avatars, incorporating detailed expressions into full-body avatars remains largely unexplored. We present DEGAS, the first 3D Gaussian Splatting (3DGS)-based modeling method for full-body avatars with rich facial expressions. Trained on multiview videos of a given subject, our method lea…

    Submitted 20 August, 2024; originally announced August 2024.

  29. arXiv:2408.10568  [pdf, other]

    cs.RO

    Constrained Behavior Cloning for Robotic Learning

    Authors: Wensheng Liang, Jun Xie, Zhicheng Wang, Jianwei Tan, Xiaoguang Ma

    Abstract: Behavior cloning (BC) is a popular supervised imitation learning method in the societies of robotics, autonomous driving, etc., wherein complex skills can be learned by direct imitation from expert demonstrations. Despite its rapid development, it is still affected by limited field of view where accumulation of sensors and joint noise bring compounding errors. In this paper, we introduced geometri…

    Submitted 20 August, 2024; originally announced August 2024.

  30. arXiv:2408.10228  [pdf, other]

    eess.SP cs.LG

    ECG Unveiled: Analysis of Client Re-identification Risks in Real-World ECG Datasets

    Authors: Ziyu Wang, Anil Kanduri, Seyed Amir Hossein Aqajari, Salar Jafarlou, Sanaz R. Mousavi, Pasi Liljeberg, Shaista Malik, Amir M. Rahmani

    Abstract: While ECG data is crucial for diagnosing and monitoring heart conditions, it also contains unique biometric information that poses significant privacy risks. Existing ECG re-identification studies rely on exhaustive analysis of numerous deep learning features, confining to ad-hoc explainability towards clinicians' decision making. In this work, we delve into explainability of ECG re-identification…

    Submitted 2 August, 2024; originally announced August 2024.

  31. arXiv:2408.10198  [pdf, other]

    cs.CV cs.GR

    MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model

    Authors: Minghua Liu, Chong Zeng, Xinyue Wei, Ruoxi Shi, Linghao Chen, Chao Xu, Mengqi Zhang, Zhaoning Wang, Xiaoshuai Zhang, Isabella Liu, Hongzhi Wu, Hao Su

    Abstract: Open-world 3D reconstruction models have recently garnered significant attention. However, without sufficient 3D inductive bias, existing methods typically entail expensive training costs and struggle to extract high-quality 3D meshes. In this work, we introduce MeshFormer, a sparse-view reconstruction model that explicitly leverages 3D native structure, input guidance, and training supervision. S…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 20 pages, 9 figures

  32. arXiv:2408.10134  [pdf, other]

    cs.CV cs.MM eess.IV

    Perceptual Depth Quality Assessment of Stereoscopic Omnidirectional Images

    Authors: Wei Zhou, Zhou Wang

    Abstract: Depth perception plays an essential role in the viewer experience for immersive virtual reality (VR) visual environments. However, previous research investigations in the depth quality of 3D/stereoscopic images are rather limited, and in particular, are largely lacking for 3D viewing of 360-degree omnidirectional content. In this work, we make one of the first attempts to develop an objective qual…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: Accepted by IEEE TCSVT

  33. arXiv:2408.10088  [pdf, other]

    cs.SI

    Recent Surge in Public Interest in Transportation: Sentiment Analysis of Baidu Apollo Go Using Weibo Data

    Authors: Shiqi Wang, Zhouye Zhao, Yuhang Xie, Mingchuan Ma, Zirui Chen, Zeyu Wang, Bohao Su, Wenrui Xu, Tianyi Li

    Abstract: Urban mobility and transportation systems have been profoundly transformed by the advancement of autonomous vehicle technologies. Baidu Apollo Go, a pioneering robotaxi service from the Chinese tech giant Baidu, has recently been widely deployed in major cities like Beijing and Wuhan, sparking increased conversation and offering a glimpse into the future of urban mobility. This study investigates p…

    Submitted 19 August, 2024; originally announced August 2024.

    ACM Class: J.4

  34. arXiv:2408.10006  [pdf, other]

    cs.LG

    Unlocking the Power of LSTM for Long Term Time Series Forecasting

    Authors: Yaxuan Kong, Zepu Wang, Yuqi Nie, Tian Zhou, Stefan Zohren, Yuxuan Liang, Peng Sun, Qingsong Wen

    Abstract: Traditional recurrent neural network architectures, such as long short-term memory neural networks (LSTM), have historically held a prominent role in time series forecasting (TSF) tasks. While the recently introduced sLSTM for Natural Language Processing (NLP) introduces exponential gating and memory mixing that are beneficial for long term sequential learning, its potential short memory issue is…

    Submitted 19 August, 2024; originally announced August 2024.

  35. arXiv:2408.09702  [pdf, other]

    cs.CV cs.AI cs.GR

    Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering

    Authors: Ruofan Liang, Zan Gojcic, Merlin Nimier-David, David Acuna, Nandita Vijaykumar, Sanja Fidler, Zian Wang

    Abstract: The correct insertion of virtual objects in images of real-world scenes requires a deep understanding of the scene's lighting, geometry and materials, as well as the image formation process. While recent large-scale diffusion models have shown strong generative and inpainting capabilities, we find that current models do not sufficiently "understand" the scene shown in a single picture to generate…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: ECCV 2024, Project page: https://research.nvidia.com/labs/toronto-ai/DiPIR/

  36. arXiv:2408.09485  [pdf, other]

    cs.CL

    Activated Parameter Locating via Causal Intervention for Model Merging

    Authors: Fanshuang Kong, Richong Zhang, Ziqiao Wang

    Abstract: Model merging combines multiple homologous models into one model, achieving convincing generalization without the necessity of additional training. A key challenge in this problem is resolving parameter redundancies and conflicts across multiple models. Existing models have demonstrated that dropping a portion of delta parameters can alleviate conflicts while maintaining performance. However, thes…

    Submitted 18 August, 2024; originally announced August 2024.

  37. arXiv:2408.09189  [pdf, other]

    cs.LG cs.AI

    SA-GDA: Spectral Augmentation for Graph Domain Adaptation

    Authors: Jinhui Pang, Zixuan Wang, Jiliang Tang, Mingyan Xiao, Nan Yin

    Abstract: Graph neural networks (GNNs) have achieved impressive results for graph-related tasks. However, most GNNs are primarily studied under the cases of signal domain with supervised training, which requires abundant task-specific labels and is difficult to transfer to other domains. There are few works focused on domain adaptation for graph node classification. They mainly focused on aligning the f…

    Submitted 17 August, 2024; originally announced August 2024.

  38. arXiv:2408.08994  [pdf, ps, other]

    cs.LG

    Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds

    Authors: Zhiyong Wang, Dongruo Zhou, John C. S. Lui, Wen Sun

    Abstract: Learning a transition model via Maximum Likelihood Estimation (MLE) followed by planning inside the learned model is perhaps the most standard and simplest Model-based Reinforcement Learning (RL) framework. In this work, we show that such a simple Model-based RL scheme, when equipped with optimistic and pessimistic planning procedures, achieves strong regret and sample complexity bounds in online…

    Submitted 16 August, 2024; originally announced August 2024.

  39. arXiv:2408.08989  [pdf, other]

    cs.AI cs.CV

    Ask, Attend, Attack: A Effective Decision-Based Black-Box Targeted Attack for Image-to-Text Models

    Authors: Qingyuan Zeng, Zhenzhong Wang, Yiu-ming Cheung, Min Jiang

    Abstract: While image-to-text models have demonstrated significant advancements in various vision-language tasks, they remain susceptible to adversarial attacks. Existing white-box attacks on image-to-text models require access to the architecture, gradients, and parameters of the target model, resulting in low practicality. Although the recently proposed gray-box attacks have improved practicality, they su…

    Submitted 16 August, 2024; originally announced August 2024.

  40. arXiv:2408.08909  [pdf]

    cs.CR cs.AI cs.DC

    An Adaptive Differential Privacy Method Based on Federated Learning

    Authors: Zhiqiang Wang, Xinyue Yu, Qianli Huang, Yongguang Gong

    Abstract: Differential privacy is one of the methods to solve the problem of privacy protection in federated learning. Setting the same privacy budget for each round will result in reduced accuracy in training. The existing methods of the adjustment of privacy budget consider fewer influencing factors and tend to ignore the boundaries, resulting in unreasonable privacy budgets. Therefore, we proposed an ada…

    Submitted 13 August, 2024; originally announced August 2024.

  41. arXiv:2408.08862  [pdf, other]

    cs.LG

    Visual Agents as Fast and Slow Thinkers

    Authors: Guangyan Sun, Mingyu Jin, Zhenting Wang, Cheng-Long Wang, Siqi Ma, Qifan Wang, Ying Nian Wu, Yongfeng Zhang, Dongfang Liu

    Abstract: Achieving human-level intelligence requires refining cognitive distinctions between System 1 and System 2 thinking. While contemporary AI, driven by large language models, demonstrates human-like traits, it falls short of genuine cognition. Transitioning from structured benchmarks to real-world scenarios presents challenges for visual agents, often leading to inaccurate and overly confident respon…

    Submitted 16 August, 2024; originally announced August 2024.

  42. arXiv:2408.08780   

    cs.CL

    Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions

    Authors: Chenming Tang, Zhixiang Wang, Yunfang Wu

    Abstract: With the help of in-context learning (ICL), large language models (LLMs) have achieved impressive performance across various tasks. However, the function of descriptive instructions during ICL remains under-explored. In this work, we propose an ensemble prompt framework to describe the selection criteria of multiple in-context examples, and preliminary experiments on machine translation (MT) acros…

    Submitted 21 August, 2024; v1 submitted 16 August, 2024; originally announced August 2024.

    Comments: There are some mistakes in the experimental data

  43. arXiv:2408.08708  [pdf, other]

    cs.CV

    Decoupling Feature Representations of Ego and Other Modalities for Incomplete Multi-modal Brain Tumor Segmentation

    Authors: Kaixiang Yang, Wenqi Shan, Xudong Li, Xuan Wang, Xikai Yang, Xi Wang, Pheng-Ann Heng, Qiang Li, Zhiwei Wang

    Abstract: Multi-modal brain tumor segmentation typically involves four magnetic resonance imaging (MRI) modalities, while incomplete modalities significantly degrade performance. Existing solutions employ explicit or implicit modality adaptation, aligning features across modalities or learning a fused feature robust to modality incompleteness. They share a common goal of encouraging each modality to express…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: 8 pages, 4 figures

  44. arXiv:2408.08524  [pdf, other]

    cs.CV cs.AI

    GS-ID: Illumination Decomposition on Gaussian Splatting via Diffusion Prior and Parametric Light Source Optimization

    Authors: Kang Du, Zhihao Liang, Zeyu Wang

    Abstract: We present GS-ID, a novel framework for illumination decomposition on Gaussian Splatting, achieving photorealistic novel view synthesis and intuitive light editing. Illumination decomposition is an ill-posed problem facing three main challenges: 1) priors for geometry and material are often lacking; 2) complex illumination conditions involve multiple unknown light sources; and 3) calculating surfa…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: 15 pages, 13 figures

  45. arXiv:2408.08515  [pdf, other]

    cs.SE

    Selecting Initial Seeds for Better JVM Fuzzing

    Authors: Tianchang Gao, Junjie Chen, Dong Wang, Yile Guo, Yingquan Zhao, Zan Wang

    Abstract: The literature on traditional program fuzzing has confirmed that effectiveness is largely impacted by redundancy among initial seeds, and has therefore proposed a series of seed selection methods. JVM fuzzing, compared to traditional fuzzing, presents unique characteristics, including large-scale and intricate code, and programs with both syntactic and semantic features. However, it remains unclear whether the ex…

    Submitted 16 August, 2024; originally announced August 2024.

  46. arXiv:2408.08147  [pdf, other]

    cs.DC cs.CL cs.LG

    P/D-Serve: Serving Disaggregated Large Language Model at Scale

    Authors: Yibo Jin, Tao Wang, Huimin Lin, Mingyang Song, Peiyang Li, Yipeng Ma, Yicheng Shan, Zhengfan Yuan, Cailong Li, Yajing Sun, Tiandeng Wu, Xing Chu, Ruizhi Huan, Li Ma, Xiao You, Wenting Zhou, Yunpeng Ye, Wen Liu, Xiangkun Xu, Yongsheng Zhang, Tiantian Dong, Jiawei Zhu, Zhe Wang, Xijian Ju, Jianxun Song, et al. (5 additional authors not shown)

    Abstract: Serving disaggregated large language models (LLMs) over tens of thousands of xPU devices (GPUs or NPUs) with reliable performance faces multiple challenges. 1) Ignoring the diversity (various prefixes and tidal requests), treating all the prompts in a mixed pool is inadequate. To facilitate the similarity per scenario and minimize the inner mismatch on P/D (prefill and decoding) processing, fine-g…

    Submitted 15 August, 2024; originally announced August 2024.

  47. arXiv:2408.08078  [pdf, other]

    cs.CV cs.AI

    Treat Stillness with Movement: Remote Sensing Change Detection via Coarse-grained Temporal Foregrounds Mining

    Authors: Xixi Wang, Zitian Wang, Jingtao Jiang, Lan Chen, Xiao Wang, Bo Jiang

    Abstract: Current works focus on addressing the remote sensing change detection task using bi-temporal images. Although good performance can be achieved, they seldom consider motion cues, which may also be vital. In this work, we revisit the widely adopted bi-temporal image-based framework and propose a novel Coarse-grained Temporal Mining Augmented (CTMA) framework. To be specific, given th…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: In Peer Review

  48. arXiv:2408.08067  [pdf, other]

    cs.CL cs.AI

    RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation

    Authors: Dongyu Ru, Lin Qiu, Xiangkun Hu, Tianhang Zhang, Peng Shi, Shuaichen Chang, Cheng Jiayang, Cunxiang Wang, Shichao Sun, Huanyu Li, Zizhao Zhang, Binjie Wang, Jiarong Jiang, Tong He, Zhiguo Wang, Pengfei Liu, Yue Zhang, Zheng Zhang

    Abstract: Despite Retrieval-Augmented Generation (RAG) showing promising capability in leveraging external knowledge, a comprehensive evaluation of RAG systems is still challenging due to the modular nature of RAG, evaluation of long-form responses and reliability of measurements. In this paper, we propose a fine-grained evaluation framework, RAGChecker, that incorporates a suite of diagnostic metrics for b…

    Submitted 16 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

    Comments: Under Review. Github Repo: https://github.com/amazon-science/RAGChecker

  49. arXiv:2408.08050  [pdf, other]

    cs.CV

    CamoTeacher: Dual-Rotation Consistency Learning for Semi-Supervised Camouflaged Object Detection

    Authors: Xunfa Lai, Zhiyu Yang, Jie Hu, Shengchuan Zhang, Liujuan Cao, Guannan Jiang, Zhiyu Wang, Songan Zhang, Rongrong Ji

    Abstract: Existing camouflaged object detection (COD) methods depend heavily on large-scale pixel-level annotations. However, acquiring such annotations is laborious due to the inherent camouflage characteristics of the objects. Semi-supervised learning offers a promising solution to this challenge. Yet, its application in COD is hindered by significant pseudo-label noise, both pixel-level and instance-level. W…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: Accepted to ECCV 2024

  50. arXiv:2408.08044  [pdf, other]

    cs.CE

    Crystalline Material Discovery in the Era of Artificial Intelligence

    Authors: Zhenzhong Wang, Haowei Hua, Wanyu Lin, Ming Yang, Kay Chen Tan

    Abstract: Crystalline materials, with their symmetrical and periodic structures, possess a diverse array of properties and have been widely used in various fields, e.g., sustainable development. To discover crystalline materials, traditional experimental and computational approaches are often time-consuming and expensive. In recent years, thanks to the explosive growth of crystalline materials data, great in…

    Submitted 21 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.