Showing 1–50 of 200 results for author: Xie, R

Searching in archive cs.
  1. arXiv:2407.03636  [pdf, other]

    cs.CV

    Diff-Restorer: Unleashing Visual Prompts for Diffusion-based Universal Image Restoration

    Authors: Yuhong Zhang, Hengsheng Zhang, Xinning Chai, Zhengxue Cheng, Rong Xie, Li Song, Wenjun Zhang

    Abstract: Image restoration is a classic low-level problem aimed at recovering high-quality images from low-quality images with various degradations such as blur, noise, rain, haze, etc. However, due to the inherent complexity and non-uniqueness of degradation in real-world images, it is challenging for a model trained for single tasks to handle real-world restoration problems effectively. Moreover, existin…

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2407.03635  [pdf, other]

    cs.CV

    MRIR: Integrating Multimodal Insights for Diffusion-based Realistic Image Restoration

    Authors: Yuhong Zhang, Hengsheng Zhang, Xinning Chai, Rong Xie, Li Song, Wenjun Zhang

    Abstract: Realistic image restoration is a crucial task in computer vision, and the use of diffusion-based models for image restoration has garnered significant attention due to their ability to produce realistic results. However, the quality of the generated images is still a significant challenge due to the severity of image degradation and the uncontrollability of the diffusion model. In this work, we de…

    Submitted 4 July, 2024; originally announced July 2024.

  3. arXiv:2407.02371  [pdf, other]

    cs.CV

    OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation

    Authors: Kepan Nan, Rui Xie, Penghao Zhou, Tiehan Fan, Zhenheng Yang, Zhijie Chen, Xiang Li, Jian Yang, Ying Tai

    Abstract: Text-to-video (T2V) generation has recently garnered significant attention thanks to the large multi-modality model Sora. However, T2V generation still faces two important challenges: 1) Lacking a precise open sourced high-quality dataset. The previous popular video datasets, e.g. WebVid-10M and Panda-70M, are either with low quality or too large for most research institutions. Therefore, it is ch…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 15 pages, 9 figures

  4. arXiv:2407.00100  [pdf, other]

    cs.LG cs.AI cs.CL

    Enhancing In-Context Learning via Implicit Demonstration Augmentation

    Authors: Xiaoling Zhou, Wei Ye, Yidong Wang, Chaoya Jiang, Zhemg Lee, Rui Xie, Shikun Zhang

    Abstract: The emergence of in-context learning (ICL) enables large pre-trained language models (PLMs) to make predictions for unseen inputs without updating parameters. Despite its potential, ICL's effectiveness heavily relies on the quality, quantity, and permutation of demonstrations, commonly leading to suboptimal and unstable performance. In this paper, we tackle this challenge for the first time from t…

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: Accepted by ACL 2024 Main; 19 pages, 10 figures

    ACM Class: I.2.7

  5. arXiv:2406.15968  [pdf, other]

    cs.CL cs.LG

    ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods

    Authors: Roy Xie, Junlin Wang, Ruomin Huang, Minxing Zhang, Rong Ge, Jian Pei, Neil Zhenqiang Gong, Bhuwan Dhingra

    Abstract: The rapid scaling of large language models (LLMs) has raised concerns about the transparency and fair use of the pretraining data used for training them. Detecting such content is challenging due to the scale of the data and limited exposure of each instance during training. We propose ReCaLL (Relative Conditional Log-Likelihood), a novel membership inference attack (MIA) to detect LLMs' pretraini…

    Submitted 22 June, 2024; originally announced June 2024.

  6. arXiv:2406.12501  [pdf, other]

    cs.IR

    Improving Multi-modal Recommender Systems by Denoising and Aligning Multi-modal Content and User Feedback

    Authors: Guipeng Xv, Xinyu Li, Ruobing Xie, Chen Lin, Chong Liu, Feng Xia, Zhanhui Kang, Leyu Lin

    Abstract: Multi-modal recommender systems (MRSs) are pivotal in diverse online web platforms and have garnered considerable attention in recent years. However, previous studies overlook the challenges of (1) noisy multi-modal content, (2) noisy user feedback, and (3) aligning multi-modal content with user feedback. In order to tackle these challenges, we propose Denoising and Aligning Multi-modal Recommende…

    Submitted 18 June, 2024; originally announced June 2024.

  7. arXiv:2406.12298  [pdf]

    cs.AI physics.ao-ph

    Research on Dangerous Flight Weather Prediction based on Machine Learning

    Authors: Haoxing Liu, Renjie Xie, Haoshen Qin, Yizhou Li

    Abstract: With the continuous expansion of the scale of air transport, the demand for aviation meteorological support also continues to grow. The impact of hazardous weather on flight safety is critical. How to effectively use meteorological data to improve the early warning capability of flight dangerous weather and ensure the safe flight of aircraft is the primary task of aviation meteorological services.…

    Submitted 18 June, 2024; originally announced June 2024.

  8. arXiv:2406.07961  [pdf, other]

    cs.CV cs.AI

    Accurate Explanation Model for Image Classifiers using Class Association Embedding

    Authors: Ruitao Xie, Jingbang Chen, Limai Jiang, Rui Xiao, Yi Pan, Yunpeng Cai

    Abstract: Image classification is a primary task in data analysis where explainable models are crucially demanded in various applications. Although many methods have been proposed to obtain explainable knowledge from black-box classifiers, these approaches lack efficiency in extracting global knowledge regarding the classification task, and are thus vulnerable to local traps and often lead to poor…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 40th IEEE International Conference on Data Engineering

  9. arXiv:2406.06737  [pdf, other]

    cs.CR cs.CL

    Raccoon: Prompt Extraction Benchmark of LLM-Integrated Applications

    Authors: Junlin Wang, Tianyi Yang, Roy Xie, Bhuwan Dhingra

    Abstract: With the proliferation of LLM-integrated applications such as GPT-s, millions are deployed, offering valuable services through proprietary instruction prompts. These systems, however, are prone to prompt extraction attacks through meticulously designed queries. To help mitigate this problem, we introduce the Raccoon benchmark which comprehensively evaluates a model's susceptibility to prompt extra…

    Submitted 10 June, 2024; originally announced June 2024.

  10. arXiv:2406.04828  [pdf, other]

    cs.IR

    QAGCF: Graph Collaborative Filtering for Q&A Recommendation

    Authors: Changshuo Zhang, Teng Shi, Xiao Zhang, Yanping Zheng, Ruobing Xie, Qi Liu, Jun Xu, Ji-Rong Wen

    Abstract: Question and answer (Q&A) platforms usually recommend question-answer pairs to meet users' knowledge acquisition needs, unlike traditional recommendations that recommend only one item. This makes user behaviors more complex, and presents two challenges for Q&A recommendation, including: the collaborative information entanglement, which means user feedback is influenced by either the question or th…

    Submitted 7 June, 2024; originally announced June 2024.

  11. arXiv:2406.01873  [pdf, other]

    cs.CL cs.CR cs.LG

    CR-UTP: Certified Robustness against Universal Text Perturbations on Large Language Models

    Authors: Qian Lou, Xin Liang, Jiaqi Xue, Yancheng Zhang, Rui Xie, Mengxin Zheng

    Abstract: It is imperative to ensure the stability of every prediction made by a language model; that is, a language's prediction should remain consistent despite minor input variations, like word substitutions. In this paper, we investigate the problem of certifying a language model's robustness against Universal Text Perturbations (UTPs), which have been widely used in universal adversarial attacks and ba…

    Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL Findings 2024

  12. arXiv:2405.20701  [pdf, other]

    cs.CL cs.AI

    Unveiling the Lexical Sensitivity of LLMs: Combinatorial Optimization for Prompt Enhancement

    Authors: Pengwei Zhan, Zhen Xu, Qian Tan, Jie Song, Ru Xie

    Abstract: Large language models (LLMs) demonstrate exceptional instruct-following ability to complete various downstream tasks. Although this impressive ability makes LLMs flexible task solvers, their performance in solving tasks also heavily relies on instructions. In this paper, we reveal that LLMs are over-sensitive to lexical variations in task instructions, even when the variations are imperceptible to…

    Submitted 31 May, 2024; originally announced May 2024.

  13. arXiv:2405.18979  [pdf, other]

    cs.LG stat.ML

    MANO: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts

    Authors: Renchunzi Xie, Ambroise Odonnat, Vasilii Feofanov, Weijian Deng, Jianfeng Zhang, Bo An

    Abstract: Leveraging the models' outputs, specifically the logits, is a common approach to estimating the test accuracy of a pre-trained neural network on out-of-distribution (OOD) samples without requiring access to the corresponding ground truth labels. Despite their ease of implementation and computational efficiency, current logit-based methods are vulnerable to overconfidence issues, leading to predict…

    Submitted 24 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: The three first authors contributed equally

  14. arXiv:2405.15280  [pdf, other]

    cs.IR cs.AI cs.LG

    DFGNN: Dual-frequency Graph Neural Network for Sign-aware Feedback

    Authors: Yiqing Wu, Ruobing Xie, Zhao Zhang, Xu Zhang, Fuzhen Zhuang, Leyu Lin, Zhanhui Kang, Yongjun Xu

    Abstract: The graph-based recommendation has achieved great success in recent years. However, most existing graph-based recommendations focus on capturing user preference based on positive edges/feedback, while ignoring negative edges/feedback (e.g., dislike, low rating) that widely exist in real-world recommender systems. How to utilize negative feedback in graph-based recommendations still remains underex…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD 2024 Research Track

  15. arXiv:2405.14580  [pdf, other]

    cs.GR

    LDM: Large Tensorial SDF Model for Textured Mesh Generation

    Authors: Rengan Xie, Wenting Zheng, Kai Huang, Yizheng Chen, Qi Wang, Qi Ye, Wei Chen, Yuchi Huo

    Abstract: Previous efforts have managed to generate production-ready 3D assets from text or images. However, these methods primarily employ NeRF or 3D Gaussian representations, which are not adept at producing smooth, high-quality geometries required by modern rendering pipelines. In this paper, we propose LDM, a novel feed-forward framework capable of generating high-fidelity, illumination-decoupled textur…

    Submitted 20 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  16. arXiv:2405.11270  [pdf, other]

    cs.CV

    HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos

    Authors: Qifeng Chen, Rengan Xie, Kai Huang, Qi Wang, Wenting Zheng, Rong Li, Yuchi Huo

    Abstract: Recently, implicit neural representation has been widely used to generate animatable human avatars. However, the materials and geometry of those representations are coupled in the neural network and hard to edit, which hinders their application in traditional graphics engines. We present a framework for acquiring human avatars that are attached with high-resolution physically-based material textur…

    Submitted 18 May, 2024; originally announced May 2024.

  17. arXiv:2405.10861  [pdf, other]

    cs.CL cs.AI cs.CY

    Tailoring Vaccine Messaging with Common-Ground Opinions

    Authors: Rickard Stureborg, Sanxing Chen, Ruoyu Xie, Aayushi Patel, Christopher Li, Chloe Qinyu Zhu, Tingnan Hu, Jun Yang, Bhuwan Dhingra

    Abstract: One way to personalize chatbot interactions is by establishing common ground with the intended reader. A domain where establishing mutual understanding could be particularly impactful is vaccine concerns and misinformation. Vaccine interventions are forms of messaging which aim to answer concerns expressed about vaccination. Tailoring responses in this domain is difficult, since opinions often hav…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: NAACL Findings 2024

    MSC Class: 68T50 (Primary) 68T01; 68T37; 91F20 (Secondary)

    ACM Class: I.2; I.2.7; I.7

  18. arXiv:2405.03562  [pdf, other]

    cs.IR

    ID-centric Pre-training for Recommendation

    Authors: Yiqing Wu, Ruobing Xie, Zhao Zhang, Fuzhen Zhuang, Xu Zhang, Leyu Lin, Zhanhui Kang, Yongjun Xu

    Abstract: Classical sequential recommendation models generally adopt ID embeddings to store knowledge learned from user historical behaviors and represent items. However, these unique IDs are challenging to be transferred to new domains. With the thriving of pre-trained language model (PLM), some pioneer works adopt PLM for pre-trained recommendation, where modality information (e.g., text) is considered un…

    Submitted 7 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  19. arXiv:2405.00742  [pdf, other]

    cs.CR cs.LG stat.ML

    Federated Graph Learning for EV Charging Demand Forecasting with Personalization Against Cyberattacks

    Authors: Yi Li, Renyou Xie, Chaojie Li, Yi Wang, Zhaoyang Dong

    Abstract: Mitigating cybersecurity risk in electric vehicle (EV) charging demand forecasting plays a crucial role in the safe operation of collective EV chargings, the stability of the power grid, and the cost-effective infrastructure expansion. However, existing methods either suffer from the data privacy issue and the susceptibility to cyberattacks or fail to consider the spatial correlation among differe…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 11 pages, 4 figures

  20. arXiv:2404.18109  [pdf, other]

    cs.CV

    Finding Beautiful and Happy Images for Mental Health and Well-being Applications

    Authors: Ruitao Xie, Connor Qiu, Guoping Qiu

    Abstract: This paper explores how artificial intelligence (AI) technology can contribute to achieve progress on good health and well-being, one of the United Nations' 17 Sustainable Development Goals. It is estimated that one in ten of the global population lived with a mental disorder. Inspired by studies showing that engaging and viewing beautiful natural images can make people feel happier and less stres…

    Submitted 28 April, 2024; originally announced April 2024.

  21. arXiv:2404.16678  [pdf, other]

    cs.CV

    Multimodal Semantic-Aware Automatic Colorization with Diffusion Prior

    Authors: Han Wang, Xinning Chai, Yiwen Wang, Yuhong Zhang, Rong Xie, Li Song

    Abstract: Colorizing grayscale images offers an engaging visual experience. Existing automatic colorization methods often fail to generate satisfactory results due to incorrect semantic colors and unsaturated colors. In this work, we propose an automatic colorization pipeline to overcome these challenges. We leverage the extraordinary generative ability of the diffusion prior to synthesize color with plausi…

    Submitted 25 April, 2024; originally announced April 2024.

  22. arXiv:2404.16307  [pdf, other]

    cs.LG cs.CV

    Boosting Model Resilience via Implicit Adversarial Data Augmentation

    Authors: Xiaoling Zhou, Wei Ye, Zhemg Lee, Rui Xie, Shikun Zhang

    Abstract: Data augmentation plays a pivotal role in enhancing and diversifying training data. Nonetheless, consistently improving model performance in varied learning scenarios, especially those with inherent data biases, remains challenging. To address this, we propose to augment the deep features of samples by incorporating their adversarial and anti-adversarial perturbation distributions, enabling adapti…

    Submitted 1 June, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: 9 pages, 6 figures, accepted by IJCAI 2024

    ACM Class: I.2.6; I.4.3

  23. arXiv:2404.14567  [pdf, other]

    cs.CL

    WangLab at MEDIQA-M3G 2024: Multimodal Medical Answer Generation using Large Language Models

    Authors: Ronald Xie, Steven Palayew, Augustin Toma, Gary Bader, Bo Wang

    Abstract: This paper outlines our submission to the MEDIQA2024 Multilingual and Multimodal Medical Answer Generation (M3G) shared task. We report results for two standalone solutions under the English category of the task, the first involving two consecutive API calls to the Claude 3 Opus API and the second involving training an image-disease label joint embedding in the style of CLIP for image classificati…

    Submitted 22 April, 2024; originally announced April 2024.

  24. arXiv:2404.14544  [pdf, other]

    cs.CL

    WangLab at MEDIQA-CORR 2024: Optimized LLM-based Programs for Medical Error Detection and Correction

    Authors: Augustin Toma, Ronald Xie, Steven Palayew, Patrick R. Lawler, Bo Wang

    Abstract: Medical errors in clinical text pose significant risks to patient safety. The MEDIQA-CORR 2024 shared task focuses on detecting and correcting these errors across three subtasks: identifying the presence of an error, extracting the erroneous sentence, and generating a corrected sentence. In this paper, we present our approach that achieved top performance in all three subtasks. For the MS dataset,…

    Submitted 22 April, 2024; originally announced April 2024.

  25. arXiv:2404.08796  [pdf, other]

    cs.IR

    The Elephant in the Room: Rethinking the Usage of Pre-trained Language Model in Sequential Recommendation

    Authors: Zekai Qu, Ruobing Xie, Chaojun Xiao, Xingwu Sun, Zhanhui Kang

    Abstract: Sequential recommendation (SR) has seen significant advancements with the help of Pre-trained Language Models (PLMs). Some PLM-based SR models directly use PLM to encode user historical behavior's text sequences to learn user representations, while there is seldom an in-depth exploration of the capability and suitability of PLM in behavior sequence modeling. In this work, we first conduct extensiv…

    Submitted 17 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: 10 pages

  26. arXiv:2404.04062  [pdf, other]

    cs.LG math.OC

    Derivative-free tree optimization for complex systems

    Authors: Ye Wei, Bo Peng, Ruiwen Xie, Yangtao Chen, Yu Qin, Peng Wen, Stefan Bauer, Po-Yen Tung

    Abstract: A tremendous range of design tasks in materials, physics, and biology can be formulated as finding the optimum of an objective function depending on many parameters without knowing its closed-form expression or the derivative. Traditional derivative-free optimization techniques often rely on strong assumptions about objective functions, thereby failing at optimizing non-convex systems beyond 100 d…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 39 pages, 3 figures

  27. arXiv:2404.02078  [pdf, other]

    cs.AI cs.CL cs.LG

    Advancing LLM Reasoning Generalists with Preference Trees

    Authors: Lifan Yuan, Ganqu Cui, Hanbin Wang, Ning Ding, Xingyao Wang, Jia Deng, Boji Shan, Huimin Chen, Ruobing Xie, Yankai Lin, Zhenghao Liu, Bowen Zhou, Hao Peng, Zhiyuan Liu, Maosong Sun

    Abstract: We introduce Eurus, a suite of large language models (LLMs) optimized for reasoning. Finetuned from Mistral-7B and CodeLlama-70B, Eurus models achieve state-of-the-art results among open-source models on a diverse set of benchmarks covering mathematics, code generation, and logical reasoning problems. Notably, Eurus-70B beats GPT-3.5 Turbo in reasoning through a comprehensive benchmarking across 1…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Models and data are available at https://github.com/OpenBMB/Eurus

  28. arXiv:2404.01717  [pdf, other]

    cs.CV eess.IV

    AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation

    Authors: Rui Xie, Ying Tai, Chen Zhao, Kai Zhang, Zhenyu Zhang, Jun Zhou, Xiaoqian Ye, Qian Wang, Jian Yang

    Abstract: Blind super-resolution methods based on stable diffusion showcase formidable generative capabilities in reconstructing clear high-resolution images with intricate details from low-resolution inputs. However, their practical applicability is often hampered by poor efficiency, stemming from the requirement of thousands or hundreds of sampling steps. Inspired by the efficient adversarial diffusion di…

    Submitted 23 May, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  29. arXiv:2403.19287  [pdf, other]

    cs.SE

    CoderUJB: An Executable and Unified Java Benchmark for Practical Programming Scenarios

    Authors: Zhengran Zeng, Yidong Wang, Rui Xie, Wei Ye, Shikun Zhang

    Abstract: In the evolving landscape of large language models (LLMs) tailored for software engineering, the need for benchmarks that accurately reflect real-world development scenarios is paramount. Current benchmarks are either too simplistic or fail to capture the multi-tasking nature of software development. To address this, we introduce CoderUJB, a new benchmark designed to evaluate LLMs across diverse J…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: 11 pages, 4 figures, accepted at ISSTA 2024

    MSC Class: 68N30 (Primary) 68T20 (Secondary)

    ACM Class: D.2.0

  30. arXiv:2403.19185  [pdf, other]

    cs.IT eess.SP

    Deep CSI Compression for Dual-Polarized Massive MIMO Channels with Disentangled Representation Learning

    Authors: Suhang Fan, Wei Xu, Renjie Xie, Shi Jin, Derrick Wing Kwan Ng, Naofal Al-Dhahir

    Abstract: Channel state information (CSI) feedback is critical for achieving the promised advantages of enhancing spectral and energy efficiencies in massive multiple-input multiple-output (MIMO) wireless communication systems. Deep learning (DL)-based methods have been proven effective in reducing the required signaling overhead for CSI feedback. In practical dual-polarized MIMO scenarios, channels in the…

    Submitted 28 March, 2024; originally announced March 2024.

  31. arXiv:2403.18480  [pdf, other]

    cs.IR

    Enhanced Generative Recommendation via Content and Collaboration Integration

    Authors: Yidan Wang, Zhaochun Ren, Weiwei Sun, Jiyuan Yang, Zhixiang Liang, Xin Chen, Ruobing Xie, Su Yan, Xu Zhang, Pengjie Ren, Zhumin Chen, Xin Xin

    Abstract: Generative recommendation has emerged as a promising paradigm aimed at augmenting recommender systems with recent advancements in generative artificial intelligence. This task has been formulated as a sequence-to-sequence generation process, wherein the input sequence encompasses data pertaining to the user's previously interacted items, and the output sequence denotes the generative identifier fo…

    Submitted 27 March, 2024; originally announced March 2024.

  32. arXiv:2403.15747  [pdf, other]

    cs.SE cs.AI

    CodeShell Technical Report

    Authors: Rui Xie, Zhengran Zeng, Zhuohao Yu, Chang Gao, Shikun Zhang, Wei Ye

    Abstract: Code large language models mark a pivotal breakthrough in artificial intelligence. They are specifically crafted to understand and generate programming languages, significantly boosting the efficiency of coding development workflows. In this technical report, we present CodeShell-Base, a seven billion-parameter foundation model with 8K context length, showcasing exceptional proficiency in code com…

    Submitted 23 March, 2024; originally announced March 2024.

  33. From Hardware Fingerprint to Access Token: Enhancing the Authentication on IoT Devices

    Authors: Yue Xiao, Yi He, Xiaoli Zhang, Qian Wang, Renjie Xie, Kun Sun, Ke Xu, Qi Li

    Abstract: The proliferation of consumer IoT products in our daily lives has raised the need for secure device authentication and access control. Unfortunately, these resource-constrained devices typically use token-based authentication, which is vulnerable to token compromise attacks that allow attackers to impersonate the devices and perform malicious operations by stealing the access token. Using hardware…

    Submitted 22 March, 2024; originally announced March 2024.

  34. arXiv:2403.11116  [pdf, other]

    cs.CV cs.AI

    PhD: A Prompted Visual Hallucination Evaluation Dataset

    Authors: Jiazhen Liu, Yuhan Fu, Ruobing Xie, Runquan Xie, Xingwu Sun, Fengzong Lian, Zhanhui Kang, Xirong Li

    Abstract: The rapid growth of Large Language Models (LLMs) has driven the development of Large Vision-Language Models (LVLMs). The challenge of hallucination, prevalent in LLMs, also emerges in LVLMs. However, most existing efforts mainly focus on object hallucination in LVLM, ignoring diverse types of LVLM hallucinations. In this study, we delve into the Intrinsic Vision-Language Hallucination (IVL-Hallu)…

    Submitted 17 March, 2024; originally announced March 2024.

  35. arXiv:2403.10104  [pdf, other]

    cs.CV

    CSDNet: Detect Salient Object in Depth-Thermal via A Lightweight Cross Shallow and Deep Perception Network

    Authors: Xiaotong Yu, Ruihan Xie, Zhihe Zhao, Chang-Wen Chen

    Abstract: While we enjoy the richness and informativeness of multimodal data, it also introduces interference and redundancy of information. To achieve optimal domain interpretation with limited resources, we propose CSDNet, a lightweight Cross Shallow and Deep Perception Network designed to integrate two modalities with less coherence, thereby discarding redundant inform…

    Submitted 15 March, 2024; originally announced March 2024.

  36. arXiv:2403.09671  [pdf, other]

    cs.DC cs.AI

    CoRaiS: Lightweight Real-Time Scheduler for Multi-Edge Cooperative Computing

    Authors: Yujiao Hu, Qingmin Jia, Jinchao Chen, Yuan Yao, Yan Pan, Renchao Xie, F. Richard Yu

    Abstract: Multi-edge cooperative computing that combines constrained resources of multiple edges into a powerful resource pool has the potential to deliver great benefits, such as a tremendous computing power, improved response time, more diversified services. However, the mass heterogeneous resources composition and lack of scheduling strategies make the modeling and cooperating of multi-edge computing sys…

    Submitted 20 May, 2024; v1 submitted 4 February, 2024; originally announced March 2024.

    Comments: Accepted by IEEE Internet of Things Journal

  37. arXiv:2403.08281  [pdf, other]

    cs.CL cs.AI

    Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models

    Authors: Ning Ding, Yulin Chen, Ganqu Cui, Xingtai Lv, Weilin Zhao, Ruobing Xie, Bowen Zhou, Zhiyuan Liu, Maosong Sun

    Abstract: Underlying data distributions of natural language, programming code, and mathematical symbols vary vastly, presenting a complex challenge for large language models (LLMs) that strive to achieve high performance across all three domains simultaneously. Achieving a very high level of proficiency for an LLM within a specific domain often requires extensive training with relevant corpora, which is typ…

    Submitted 26 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  38. arXiv:2403.04526  [pdf, other]

    cs.LG cs.AI cs.CV

    Hyperspectral unmixing for Raman spectroscopy via physics-constrained autoencoders

    Authors: Dimitar Georgiev, Álvaro Fernández-Galiana, Simon Vilms Pedersen, Georgios Papadopoulos, Ruoxiao Xie, Molly M. Stevens, Mauricio Barahona

    Abstract: Raman spectroscopy is widely used across scientific domains to characterize the chemical composition of samples in a non-destructive, label-free manner. Many applications entail the unmixing of signals from mixtures of molecular species to identify the individual components present and their proportions, yet conventional methods for chemometrics often struggle with complex mixture scenarios encoun…

    Submitted 7 March, 2024; originally announced March 2024.

  39. arXiv:2403.04015  [pdf, other]

    cs.LG cs.AI stat.ML

    Knockoff-Guided Feature Selection via A Single Pre-trained Reinforced Agent

    Authors: Xinyuan Wang, Dongjie Wang, Wangyang Ying, Rui Xie, Haifeng Chen, Yanjie Fu

    Abstract: Feature selection prepares the AI-readiness of data by eliminating redundant features. Prior research falls into two primary categories: i) Supervised Feature Selection, which identifies the optimal feature subset based on their relevance to the target variable; ii) Unsupervised Feature Selection, which reduces the feature space dimensionality by capturing the essential information within the feat…

    Submitted 6 March, 2024; originally announced March 2024.

  40. arXiv:2403.02063  [pdf, other]

    cs.CV

    Depth-Guided Robust and Fast Point Cloud Fusion NeRF for Sparse Input Views

    Authors: Shuai Guo, Qiuwen Wang, Yijie Gao, Rong Xie, Li Song

    Abstract: Novel-view synthesis with sparse input views is important for real-world applications like AR/VR and autonomous driving. Recent methods have integrated depth information into NeRFs for sparse input synthesis, leveraging depth prior for geometric and spatial understanding. However, most existing works tend to overlook inaccuracies within depth maps and have low time efficiency. To address these iss…

    Submitted 4 March, 2024; originally announced March 2024.

  41. arXiv:2402.19085  [pdf, other]

    cs.CL cs.AI eess.SY

    Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment

    Authors: Yiju Guo, Ganqu Cui, Lifan Yuan, Ning Ding, Jiexin Wang, Huimin Chen, Bowen Sun, Ruobing Xie, Jie Zhou, Yankai Lin, Zhiyuan Liu, Maosong Sun

    Abstract: Alignment in artificial intelligence pursues the consistency between model responses and human preferences as well as values. In practice, the multifaceted nature of human preferences inadvertently introduces what is known as the "alignment tax" - a compromise where enhancements in alignment within one objective (e.g., harmlessness) can diminish performance in others (e.g., helpfulness). However, exi…

    Submitted 29 February, 2024; originally announced February 2024.

  42. arXiv:2402.18439  [pdf, other]

    cs.CL cs.AI

    Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication

    Authors: Weize Chen, Chenfei Yuan, Jiarui Yuan, Yusheng Su, Chen Qian, Cheng Yang, Ruobing Xie, Zhiyuan Liu, Maosong Sun

    Abstract: Natural language (NL) has long been the predominant format for human cognition and communication, and by extension, has been similarly pivotal in the development and application of Large Language Models (LLMs). Yet, besides NL, LLMs have seen various non-NL formats during pre-training, such as code and logical expression. NL's status as the optimal format for LLMs, particularly in single-LLM reaso…

    Submitted 18 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Code release at https://github.com/thunlp/AutoForm

  43. arXiv:2402.17916  [pdf, other]

    cs.CL cs.AI

    Adversarial Math Word Problem Generation

    Authors: Roy Xie, Chengxuan Huang, Junlin Wang, Bhuwan Dhingra

    Abstract: Large language models (LLMs) have significantly transformed the educational landscape. As current plagiarism detection tools struggle to keep pace with LLMs' rapid advancements, the educational community faces the challenge of assessing students' true problem-solving abilities in the presence of LLMs. In this work, we explore a new paradigm for ensuring fair evaluation -- generating adversarial ex…

    Submitted 15 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Code/data: https://github.com/ruoyuxie/adversarial_mwps_generation

  44. arXiv:2402.17914  [pdf, other]

    cs.CL cs.AI

    Extracting Lexical Features from Dialects via Interpretable Dialect Classifiers

    Authors: Roy Xie, Orevaoghene Ahia, Yulia Tsvetkov, Antonios Anastasopoulos

    Abstract: Identifying linguistic differences between dialects of a language often requires expert knowledge and meticulous human analysis. This is largely due to the complexity and nuance involved in studying various dialects. We present a novel approach to extract distinguishing lexical features of dialects by utilizing interpretable dialect classifiers, even in the absence of human experts. We explore bot…

    Submitted 23 March, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Code is available at https://github.com/ruoyuxie/interpretable_dialect_classifier

  45. arXiv:2402.17010  [pdf, other]

    cs.CL cs.AI

    Can Large Language Models Recall Reference Location Like Humans?

    Authors: Ye Wang, Xinrun Xu, Rui Xie, Wenxin Hu, Wei Ye

    Abstract: When completing knowledge-intensive tasks, humans sometimes need not just an answer but also a corresponding reference passage for auxiliary reading. Previous methods required obtaining pre-segmented article chunks through additional retrieval models. This paper explores leveraging the parameterized knowledge stored during the pre-training phase of large language models (LLMs) to independently rec…

    Submitted 26 February, 2024; originally announced February 2024.

  46. arXiv:2402.16870  [pdf, other]

    cs.NI

    Pioneering Deterministic Scheduling and Network Structure Optimization for Time-Critical Computing Tasks in Industrial IoT

    Authors: Yujiao Hu, Yining Zhu, Huayu Zhang, Yan Pan, Qingmin Jia, Renchao Xie, Gang Yang, F. Richard Yu

    Abstract: The Industrial Internet of Things (IIoT) has become a critical technology to accelerate the process of digital and intelligent transformation of industries. As the cooperative relationship between smart devices in IIoT becomes more complex, getting deterministic responses of IIoT periodic time-critical computing tasks becomes a crucial and nontrivial problem. However, few current works in cloud/ed…

    Submitted 23 January, 2024; originally announced February 2024.

    Comments: Under Review

  47. arXiv:2402.02381  [pdf, other]

    cs.NI cs.AI

    Empowering Computing and Networks Convergence System with Distributed Cooperative Routing

    Authors: Yujiao Hu, Qingmin Jia, Meng Shen, Renchao Xie, Tao Huang, F. Richard Yu

    Abstract: The emergence of intelligent applications and recent advances in the fields of computing and networks are driving the development of the computing and networks convergence (CNC) system. However, existing research has failed to achieve comprehensive scheduling optimization of computing and network resources. This shortfall results in some requirements of computing requests unable to be guaranteed in an e…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: Submitted to IEEE Network

  48. arXiv:2401.17812  [pdf, other]

    cs.NI cs.AI

    Deterministic Computing Power Networking: Architecture, Technologies and Prospects

    Authors: Qingmin Jia, Yujiao Hu, Xiaomao Zhou, Qianpiao Ma, Kai Guo, Huayu Zhang, Renchao Xie, Tao Huang, Yunjie Liu

    Abstract: With the development of new Internet services such as computation-intensive and delay-sensitive tasks, the traditional "Best Effort" network transmission mode has been greatly challenged. The network system is urgently required to provide end-to-end transmission determinacy and computing determinacy for new applications to ensure the safe and efficient operation of services. Based on the research…

    Submitted 31 January, 2024; originally announced January 2024.

  49. arXiv:2401.16777  [pdf, other]

    cs.LG

    Addressing Distribution Shift in Time Series Forecasting with Instance Normalization Flows

    Authors: Wei Fan, Shun Zheng, Pengyang Wang, Rui Xie, Jiang Bian, Yanjie Fu

    Abstract: Due to the non-stationarity of time series, the distribution shift problem largely hinders the performance of time series forecasting. Existing solutions either fail for shifts beyond simple statistics or have limited compatibility with forecasting models. In this paper, we propose a general decoupled formulation for time series forecasting, with no reliance on fixed statistics and no restriction o…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: 17 pages

  50. arXiv:2401.15841  [pdf, other]

    cs.CV

    2L3: Lifting Imperfect Generated 2D Images into Accurate 3D

    Authors: Yizheng Chen, Rengan Xie, Qi Ye, Sen Yang, Zixuan Xie, Tianxiao Chen, Rong Li, Yuchi Huo

    Abstract: Reconstructing 3D objects from a single image is an intriguing but challenging problem. One promising solution is to utilize multi-view (MV) 3D reconstruction to fuse generated MV images into consistent 3D objects. However, the generated images usually suffer from inconsistent lighting, misaligned geometry, and sparse views, leading to poor reconstruction quality. To cope with these problems, we p…

    Submitted 28 January, 2024; originally announced January 2024.