
Showing 1–50 of 280 results for author: Lin, Q

Searching in archive cs. Search in all archives.
  1. arXiv:2407.14402  [pdf, other]

    cs.AI cs.CL cs.DC cs.MA cs.SE

    The Vision of Autonomic Computing: Can LLMs Make It a Reality?

    Authors: Zhiyang Zhang, Fangkai Yang, Xiaoting Qin, Jue Zhang, Qingwei Lin, Gong Cheng, Dongmei Zhang, Saravan Rajmohan, Qi Zhang

    Abstract: The Vision of Autonomic Computing (ACV), proposed over two decades ago, envisions computing systems that self-manage akin to biological organisms, adapting seamlessly to changing environments. Despite decades of research, achieving ACV remains challenging due to the dynamic and complex nature of modern computing systems. Recent advancements in Large Language Models (LLMs) offer promising solutions…

    Submitted 19 July, 2024; originally announced July 2024.

  2. EmoFace: Audio-driven Emotional 3D Face Animation

    Authors: Chang Liu, Qunfen Lin, Zijiao Zeng, Ye Pan

    Abstract: Audio-driven emotional 3D face animation aims to generate emotionally expressive talking heads with synchronized lip movements. However, previous research has often overlooked the influence of diverse emotions on facial expressions or proved unsuitable for driving MetaHuman models. In response to this deficiency, we introduce EmoFace, a novel audio-driven methodology for creating facial animations…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 2024 IEEE Conference Virtual Reality and 3D User Interfaces (VR). IEEE, 2024

  3. arXiv:2407.10627  [pdf, other]

    cs.CL cs.AI cs.LG

    Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena

    Authors: Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, Qingwei Lin, Jianguang Lou, Shifeng Chen, Yansong Tang, Weizhu Chen

    Abstract: Assessing the effectiveness of large language models (LLMs) presents substantial challenges. The method of conducting human-annotated battles in an online Chatbot Arena is a highly effective evaluative technique. However, this approach is limited by the costs and time required for human annotation. In this paper, we introduce Arena Learning, an innovative offline strategy designed to simulate thes…

    Submitted 15 July, 2024; originally announced July 2024.

  4. arXiv:2407.05023  [pdf, other]

    cs.CV

    SurgicalGaussian: Deformable 3D Gaussians for High-Fidelity Surgical Scene Reconstruction

    Authors: Weixing Xie, Junfeng Yao, Xianpeng Cao, Qiqin Lin, Zerui Tang, Xiao Dong, Xiaohu Guo

    Abstract: Dynamic reconstruction of deformable tissues in endoscopic video is a key technology for robot-assisted surgery. Recent reconstruction methods based on neural radiance fields (NeRFs) have achieved remarkable results in the reconstruction of surgical scenes. However, based on implicit representation, NeRFs struggle to capture the intricate details of objects in the scene and cannot achieve real-tim…

    Submitted 6 July, 2024; originally announced July 2024.

  5. arXiv:2407.01103  [pdf, other]

    cs.RO

    FedRC: A Rapid-Converged Hierarchical Federated Learning Framework in Street Scene Semantic Understanding

    Authors: Wei-Bin Kou, Qingfeng Lin, Ming Tang, Shuai Wang, Guangxu Zhu, Yik-Chung Wu

    Abstract: Street Scene Semantic Understanding (denoted as TriSU) is a crucial but complex task for world-wide distributed autonomous driving (AD) vehicles (e.g., Tesla). Its inference model faces poor generalization issue due to inter-city domain-shift. Hierarchical Federated Learning (HFL) offers a potential solution for improving TriSU model generalization, but suffers from slow convergence rate because o…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: This work has been accepted by 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

  6. arXiv:2406.19251  [pdf, other]

    cs.CL cs.AI

    AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation

    Authors: Jia Fu, Xiaoting Qin, Fangkai Yang, Lu Wang, Jue Zhang, Qingwei Lin, Yubo Chen, Dongmei Zhang, Saravan Rajmohan, Qi Zhang

    Abstract: Recent advancements in Large Language Models have transformed ML/AI development, necessitating a reevaluation of AutoML principles for the Retrieval-Augmented Generation (RAG) systems. To address the challenges of hyper-parameter optimization and online adaptation in RAG, we propose the AutoRAG-HP framework, which formulates the hyper-parameter tuning as an online multi-armed bandit (MAB) problem…

    Submitted 27 June, 2024; originally announced June 2024.

  7. arXiv:2406.14004  [pdf, other]

    cs.IR cs.LG

    Do Not Wait: Learning Re-Ranking Model Without User Feedback At Serving Time in E-Commerce

    Authors: Yuan Wang, Zhiyu Li, Changshuo Zhang, Sirui Chen, Xiao Zhang, Jun Xu, Quan Lin

    Abstract: Recommender systems have been widely used in e-commerce, and re-ranking models are playing an increasingly significant role in the domain, which leverages the inter-item influence and determines the final recommendation lists. Online learning methods keep updating a deployed model with the latest available samples to capture the shifting of the underlying data distribution in e-commerce. However,…

    Submitted 20 June, 2024; originally announced June 2024.

  8. arXiv:2406.13923  [pdf, other]

    cs.AI cs.CL cs.CV cs.MM

    PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents

    Authors: Junjie Wang, Yin Zhang, Yatai Ji, Yuxiang Zhang, Chunyang Jiang, Yubo Wang, Kang Zhu, Zekun Wang, Tiezhen Wang, Wenhao Huang, Jie Fu, Bei Chen, Qunshu Lin, Minghao Liu, Ge Zhang, Wenhu Chen

    Abstract: Recent advancements in Large Multimodal Models (LMMs) have leveraged extensive multimodal datasets to enhance capabilities in complex knowledge-driven tasks. However, persistent challenges in perceptual and reasoning errors limit their efficacy, particularly in interpreting intricate visual data and deducing multimodal relationships. Addressing these issues, we introduce a novel dataset format, PI…

    Submitted 19 June, 2024; originally announced June 2024.

  9. arXiv:2406.13719  [pdf, other]

    cs.CV

    GUI Action Narrator: Where and When Did That Action Take Place?

    Authors: Qinchen Wu, Difei Gao, Kevin Qinghong Lin, Zhuoyu Wu, Xiangwu Guo, Peiran Li, Weichen Zhang, Hengxu Wang, Mike Zheng Shou

    Abstract: The advent of Multimodal LLMs has significantly enhanced image OCR recognition capabilities, making GUI automation a viable reality for increasing efficiency in digital tasks. One fundamental aspect of developing a GUI automation system is understanding primitive GUI actions. This comprehension is crucial as it enables agents to learn from user demonstrations, an essential element of automation. T…

    Submitted 19 June, 2024; originally announced June 2024.

  10. arXiv:2406.13372  [pdf, other]

    cs.AI

    Thread: A Logic-Based Data Organization Paradigm for How-To Question Answering with Retrieval Augmented Generation

    Authors: Kaikai An, Fangkai Yang, Liqun Li, Junting Lu, Sitao Cheng, Lu Wang, Pu Zhao, Lele Cao, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

    Abstract: Current question answering systems leveraging retrieval augmented generation perform well in answering factoid questions but face challenges with non-factoid questions, particularly how-to queries requiring detailed step-by-step instructions and explanations. In this paper, we introduce Thread, a novel data organization paradigm that transforms documents into logic units based on their inter-conne…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 21 pages, 4 figures

  11. arXiv:2406.11816  [pdf, other]

    cs.CV

    VideoLLM-online: Online Video Large Language Model for Streaming Video

    Authors: Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, Jia-Wei Liu, Ziteng Gao, Dongxing Mao, Mike Zheng Shou

    Abstract: Recent Large Language Models have been enhanced with vision capabilities, enabling them to comprehend images, videos, and interleaved vision-language content. However, the learning methods of these large multimodal models typically treat videos as predetermined clips, making them less effective and efficient at handling streaming video inputs. In this paper, we propose a novel Learning-In-Video-St…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: CVPR 2024. This arxiv version is upgraded with Llama-3

  12. arXiv:2406.10227  [pdf, other]

    cs.CV cs.AI

    VideoGUI: A Benchmark for GUI Automation from Instructional Videos

    Authors: Kevin Qinghong Lin, Linjie Li, Difei Gao, Qinchen WU, Mingyi Yan, Zhengyuan Yang, Lijuan Wang, Mike Zheng Shou

    Abstract: Graphical User Interface (GUI) automation holds significant promise for enhancing human productivity by assisting with computer tasks. Existing task formulations primarily focus on simple tasks that can be specified by a single, language-only instruction, such as "Insert a new slide." In this work, we introduce VideoGUI, a novel multi-modal benchmark designed to evaluate GUI assistants on visual-c…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 24 pages, 16 tables, 17 figures

  13. arXiv:2406.05686  [pdf, other]

    cs.LG cs.CV cs.CY

    Provable Optimization for Adversarial Fair Self-supervised Contrastive Learning

    Authors: Qi Qi, Quanqi Hu, Qihang Lin, Tianbao Yang

    Abstract: This paper studies learning fair encoders in a self-supervised learning (SSL) setting, in which all data are unlabeled and only a small portion of them are annotated with sensitive attribute. Adversarial fair representation learning is well suited for this scenario by minimizing a contrastive loss over unlabeled data while maximizing an adversarial loss of predicting the sensitive attribute over…

    Submitted 9 June, 2024; originally announced June 2024.

  14. arXiv:2406.03159  [pdf, other]

    cs.NI cs.DC

    Hurry: Dynamic Collaborative Framework For Low-orbit Mega-Constellation Data Downloading

    Authors: Handong Luo, Wenhao Liu, Qi Zhang, Ziheng Yang, Quanwei Lin, Wenjun Zhu, Kun Qiu, Zhe Chen, Yue Gao

    Abstract: Low-orbit mega-constellation network, which utilize thousands of satellites to provide a variety of network services and collect a wide range of space information, is a rapidly growing field. Each satellite collects TB-level data daily, including delay-sensitive data used for crucial tasks, such as military surveillance, natural disaster monitoring, and weather forecasting. According to NASA's sta…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 15 pages, 7 figures

  15. arXiv:2406.01047  [pdf, other]

    cs.DC cs.AI cs.LG

    An Advanced Reinforcement Learning Framework for Online Scheduling of Deferrable Workloads in Cloud Computing

    Authors: Hang Dong, Liwen Zhu, Zhao Shan, Bo Qiao, Fangkai Yang, Si Qin, Chuan Luo, Qingwei Lin, Yuwen Yang, Gurpreet Virdi, Saravan Rajmohan, Dongmei Zhang, Thomas Moscibroda

    Abstract: Efficient resource utilization and perfect user experience usually conflict with each other in cloud computing platforms. Great efforts have been invested in increasing resource utilization but trying not to affect users' experience for cloud computing platforms. In order to better utilize the remaining pieces of computing resources spread over the whole platform, deferrable jobs are provided with…

    Submitted 3 June, 2024; originally announced June 2024.

  16. arXiv:2405.19327  [pdf, other]

    cs.CL cs.AI cs.LG

    MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

    Authors: Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chou Leuang Yu, Danny Pan, Esther Cheng, Jie Liu, Qunshu Lin, Raven Yuan, Tuney Zheng, Wei Pang, Xinrun Du, Yiming Liang, Yinghao Ma, Yizhi Li, Ziyang Ma, Bill Lin, Emmanouil Benetos, Huan Yang, Junting Zhou, Kaijing Ma, Minghao Liu, Morry Niu, et al. (20 additional authors not shown)

    Abstract: Large Language Models (LLMs) have made great strides in recent years to achieve unprecedented performance across different tasks. However, due to commercial interest, the most competitive models like GPT, Gemini, and Claude have been gated behind proprietary interfaces without disclosing the training details. Recently, many institutions have open-sourced several strong LLMs like LLaMA-3, comparabl…

    Submitted 10 July, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: https://map-neo.github.io/

  17. arXiv:2405.19109  [pdf, other]

    cs.CL

    PathReasoner: Modeling Reasoning Path with Equivalent Extension for Logical Question Answering

    Authors: Fangzhi Xu, Qika Lin, Tianzhe Zhao, Jiawei Han, Jun Liu

    Abstract: Logical reasoning task has attracted great interest since it was proposed. Faced with such a task, current competitive models, even large language models (e.g., ChatGPT and PaLM 2), still perform badly. Previous promising LMs struggle in logical consistency modeling and logical structure perception. To this end, we model the logical reasoning task by transforming each logical sample into reasoning…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted by ACL 2024

  18. arXiv:2405.16390  [pdf, other]

    cs.AI cs.LG

    Safe and Balanced: A Framework for Constrained Multi-Objective Reinforcement Learning

    Authors: Shangding Gu, Bilgehan Sel, Yuhao Ding, Lu Wang, Qingwei Lin, Alois Knoll, Ming Jin

    Abstract: In numerous reinforcement learning (RL) problems involving safety-critical systems, a key challenge lies in balancing multiple objectives while simultaneously meeting all stringent safety constraints. To tackle this issue, we propose a primal-based framework that orchestrates policy optimization between multi-objective learning and constraint adherence. Our method employs a novel natural policy gr…

    Submitted 25 May, 2024; originally announced May 2024.

  19. arXiv:2405.15571  [pdf, other]

    cs.HC

    RCInvestigator: Towards Better Investigation of Anomaly Root Causes in Cloud Computing Systems

    Authors: Shuhan Liu, Yunfan Zhou, Lu Ying, Yuan Tian, Jue Zhang, Shandan Zhou, Weiwei Cui, Qingwei Lin, Thomas Moscibroda, Haidong Zhang, Di Weng, Yingcai Wu

    Abstract: Finding the root causes of anomalies in cloud computing systems quickly is crucial to ensure availability and efficiency since accurate root causes can guide engineers to take appropriate actions to address the anomalies and maintain customer satisfaction. However, it is difficult to investigate and identify the root causes based on large-scale and high-dimension monitoring data collected from com…

    Submitted 24 May, 2024; originally announced May 2024.

  20. arXiv:2405.15370  [pdf, other]

    cs.CL

    Large Language Models can Deliver Accurate and Interpretable Time Series Anomaly Detection

    Authors: Jun Liu, Chaoyun Zhang, Jiaxu Qian, Minghua Ma, Si Qin, Chetan Bansal, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang

    Abstract: Time series anomaly detection (TSAD) plays a crucial role in various industries by identifying atypical patterns that deviate from standard trends, thereby maintaining system integrity and enabling prompt response measures. Traditional TSAD models, which often rely on deep learning, require extensive training data and operate as black boxes, lacking interpretability for detected anomalies. To addr…

    Submitted 24 May, 2024; originally announced May 2024.

  21. arXiv:2405.14506  [pdf, other]

    cs.CV cs.AI

    SIAVC: Semi-Supervised Framework for Industrial Accident Video Classification

    Authors: Zuoyong Li, Qinghua Lin, Haoyi Fan, Tiesong Zhao, David Zhang

    Abstract: Semi-supervised learning suffers from the imbalance of labeled and unlabeled training data in the video surveillance scenario. In this paper, we propose a new semi-supervised learning method called SIAVC for industrial accident video classification. Specifically, we design a video augmentation module called the Super Augmentation Block (SAB). SAB adds Gaussian noise and randomly masks video frames…

    Submitted 23 May, 2024; originally announced May 2024.

  22. arXiv:2405.13748  [pdf, other]

    cs.CV

    Monocular Gaussian SLAM with Language Extended Loop Closure

    Authors: Tian Lan, Qinwei Lin, Haoqian Wang

    Abstract: Recently, 3D Gaussian Splatting has shown great potential in visual Simultaneous Localization And Mapping (SLAM). Existing methods have achieved encouraging results on RGB-D SLAM, but studies of the monocular case are still scarce. Moreover, they also fail to correct drift errors due to the lack of loop closure and global optimization. In this paper, we present MG-SLAM, a monocular Gaussian SLAM with a la…

    Submitted 22 May, 2024; originally announced May 2024.

  23. arXiv:2405.10970  [pdf, other]

    cs.LG cs.AI cs.CR

    Untargeted Adversarial Attack on Knowledge Graph Embeddings

    Authors: Tianzhe Zhao, Jiaoyan Chen, Yanchi Ru, Qika Lin, Yuxia Geng, Jun Liu

    Abstract: Knowledge graph embedding (KGE) methods have achieved great success in handling various knowledge graph (KG) downstream tasks. However, KGE methods may learn biased representations on low-quality KGs that are prevalent in the real world. Some recent studies propose adversarial attacks to investigate the vulnerabilities of KGE methods, but their attackers are target-oriented with the KGE method and…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted by SIGIR 2024

  24. arXiv:2405.09362  [pdf, other]

    stat.ML cs.LG

    On the Saturation Effect of Kernel Ridge Regression

    Authors: Yicheng Li, Haobo Zhang, Qian Lin

    Abstract: The saturation effect refers to the phenomenon that the kernel ridge regression (KRR) fails to achieve the information theoretical lower bound when the smoothness of the underground truth function exceeds certain level. The saturation effect has been widely observed in practices and a saturation lower bound of KRR has been conjectured for decades. In this paper, we provide a proof of this long-sta…

    Submitted 28 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    Comments: ICLR 2023; Minor errors are corrected in this version

  25. arXiv:2405.08748  [pdf, other]

    cs.CV

    Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

    Authors: Zhimin Li, Jianwei Zhang, Qin Lin, Jiangfeng Xiong, Yanxin Long, Xinchi Deng, Yingfang Zhang, Xingchao Liu, Minbin Huang, Zedong Xiao, Dayou Chen, Jiajun He, Jiahao Li, Wenyue Li, Chen Zhang, Rongwei Quan, Jianxiang Lu, Jiabin Huang, Xiaoyan Yuan, Xiaoxiao Zheng, Yixuan Li, Jihong Zhang, Chao Zhang, Meng Chen, Jie Liu, et al. (20 additional authors not shown)

    Abstract: We present Hunyuan-DiT, a text-to-image diffusion transformer with fine-grained understanding of both English and Chinese. To construct Hunyuan-DiT, we carefully design the transformer structure, text encoder, and positional encoding. We also build from scratch a whole data pipeline to update and evaluate data for iterative model optimization. For fine-grained language understanding, we train a Mu…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Project Page: https://dit.hunyuan.tencent.com/

  26. arXiv:2405.04146  [pdf, other]

    cs.RO cs.DC

    pFedLVM: A Large Vision Model (LVM)-Driven and Latent Feature-Based Personalized Federated Learning Framework in Autonomous Driving

    Authors: Wei-Bin Kou, Qingfeng Lin, Ming Tang, Sheng Xu, Rongguang Ye, Yang Leng, Shuai Wang, Guofa Li, Zhenyu Chen, Guangxu Zhu, Yik-Chung Wu

    Abstract: Deep learning-based Autonomous Driving (AD) models often exhibit poor generalization due to data heterogeneity in an ever domain-shifting environment. While Federated Learning (FL) could improve the generalization of an AD model (known as FedAD system), conventional models often struggle with under-fitting as the amount of accumulated training data progressively increases. To address this issue, i…

    Submitted 17 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: This paper was submitted to CVPR 2024 in Nov. 2023

  27. arXiv:2405.01677  [pdf, other]

    cs.LG cs.AI

    Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation

    Authors: Shangding Gu, Bilgehan Sel, Yuhao Ding, Lu Wang, Qingwei Lin, Ming Jin, Alois Knoll

    Abstract: Ensuring the safety of Reinforcement Learning (RL) is crucial for its deployment in real-world applications. Nevertheless, managing the trade-off between reward and safety during exploration presents a significant challenge. Improving reward performance through policy adjustments may adversely affect safety performance. In this study, we aim to address this conflicting relation by leveraging the t…

    Submitted 7 June, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  28. arXiv:2404.17780  [pdf, other]

    cs.MA cs.AI

    Verco: Learning Coordinated Verbal Communication for Multi-agent Reinforcement Learning

    Authors: Dapeng Li, Hang Dong, Lu Wang, Bo Qiao, Si Qin, Qingwei Lin, Dongmei Zhang, Qi Zhang, Zhiwei Xu, Bin Zhang, Guoliang Fan

    Abstract: In recent years, multi-agent reinforcement learning algorithms have made significant advancements in diverse gaming environments, leading to increased interest in the broader application of such techniques. To address the prevalent challenge of partial observability, communication-based algorithms have improved cooperative performance through the sharing of numerical embedding between agents. Howe…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: 12 pages, 6 figures

  29. arXiv:2404.15909  [pdf, other]

    cs.CV

    Learning Long-form Video Prior via Generative Pre-Training

    Authors: Jinheng Xie, Jiajun Feng, Zhaoxu Tian, Kevin Qinghong Lin, Yawen Huang, Xi Xia, Nanxu Gong, Xu Zuo, Jiaqi Yang, Yefeng Zheng, Mike Zheng Shou

    Abstract: Concepts involved in long-form videos such as people, objects, and their interactions, can be viewed as following an implicit prior. They are notably complex and continue to pose challenges to be comprehensively learned. In recent years, generative pre-training (GPT) has exhibited versatile capacities in modeling any kind of text content even visual locations. Can this manner work for learning lon…

    Submitted 24 April, 2024; originally announced April 2024.

  30. arXiv:2404.14441  [pdf]

    cs.CV cs.AI cs.LG eess.IV

    Optimizing Contrail Detection: A Deep Learning Approach with EfficientNet-b4 Encoding

    Authors: Qunwei Lin, Qian Leng, Zhicheng Ding, Chao Yan, Xiaonan Xu

    Abstract: In the pursuit of environmental sustainability, the aviation industry faces the challenge of minimizing its ecological footprint. Among the key solutions is contrail avoidance, targeting the linear ice-crystal clouds produced by aircraft exhaust. These contrails exacerbate global warming by trapping atmospheric heat, necessitating precise segmentation and comprehensive analysis of contrail images…

    Submitted 19 April, 2024; originally announced April 2024.

  31. arXiv:2404.12597  [pdf, other]

    cs.LG math.ST stat.ML

    The phase diagram of kernel interpolation in large dimensions

    Authors: Haobo Zhang, Weihao Lu, Qian Lin

    Abstract: The generalization ability of kernel interpolation in large dimensions (i.e., $n \asymp d^{\gamma}$ for some $\gamma>0$) might be one of the most interesting problems in the recent renaissance of kernel regression, since it may help us understand the 'benign overfitting phenomenon' reported in the neural networks literature. Focusing on the inner product kernel on the sphere, we fully characterized the exact…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 18 pages, 1 figure

  32. arXiv:2404.10571  [pdf, other]

    cs.CV

    CMU-Flownet: Exploring Point Cloud Scene Flow Estimation in Occluded Scenario

    Authors: Jingze Chen, Junfeng Yao, Qiqin Lin, Lei Li

    Abstract: Occlusions hinder point cloud frame alignment in LiDAR data, a challenge inadequately addressed by scene flow models tested mainly on occlusion-free datasets. Attempts to integrate occlusion handling within networks often suffer accuracy issues due to two main limitations: a) the inadequate use of occlusion information, often merging it with flow estimation without an effective integration strateg…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 14 pages

  33. arXiv:2403.15157  [pdf, other]

    cs.SE

    AllHands: Ask Me Anything on Large-scale Verbatim Feedback via Large Language Models

    Authors: Chaoyun Zhang, Zicheng Ma, Yuhao Wu, Shilin He, Si Qin, Minghua Ma, Xiaoting Qin, Yu Kang, Yuyi Liang, Xiaoyu Gou, Yajie Xue, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

    Abstract: Verbatim feedback constitutes a valuable repository of user experiences, opinions, and requirements essential for software development. Effectively and efficiently extracting valuable insights from such data poses a challenging task. This paper introduces Allhands, an innovative analytic framework designed for large-scale feedback analysis through a natural language interface, leveraging large la…

    Submitted 3 April, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  34. arXiv:2403.14390  [pdf, other]

    cs.CL

    From Large to Tiny: Distilling and Refining Mathematical Expertise for Math Word Problems with Weakly Supervision

    Authors: Qingwen Lin, Boyan Xu, Zhengting Huang, Ruichu Cai

    Abstract: Addressing the challenge of high annotation costs in solving Math Word Problems (MWPs) through full supervision with intermediate equations, recent works have proposed weakly supervised task settings that rely solely on the final answer as a supervised signal. Existing leading approaches typically employ various search techniques to infer intermediate equations, but cannot ensure their semantic co…

    Submitted 21 March, 2024; originally announced March 2024.

  35. arXiv:2403.12968  [pdf, other]

    cs.CL cs.LG

    LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

    Authors: Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Menglin Xia, Xufang Luo, Jue Zhang, Qingwei Lin, Victor Rühle, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, Dongmei Zhang

    Abstract: This paper focuses on task-agnostic prompt compression for better generalizability and efficiency. Considering the redundancy in natural language, existing approaches compress prompts by removing tokens or lexical units according to their information entropy obtained from a causal language model such as LLaMa-7B. The challenge is that information entropy may be a suboptimal compression metric: (i)…

    Submitted 19 March, 2024; originally announced March 2024.

  36. arXiv:2403.09721  [pdf, other]

    cs.CL cs.AI

    A Semantic Mention Graph Augmented Model for Document-Level Event Argument Extraction

    Authors: Jian Zhang, Changlin Yang, Haiping Zhu, Qika Lin, Fangzhi Xu, Jun Liu

    Abstract: Document-level Event Argument Extraction (DEAE) aims to identify arguments and their specific roles from an unstructured document. The advanced approaches on DEAE utilize prompt-based methods to guide pre-trained language models (PLMs) in extracting arguments from input documents. They mainly concentrate on establishing relations between triggers and entity mentions within documents, leaving two u…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: Accepted By Coling 2024

  37. arXiv:2403.08593  [pdf, other]

    cs.CL cs.AI

    Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments

    Authors: Sitao Cheng, Ziyuan Zhuang, Yong Xu, Fangkai Yang, Chaoyun Zhang, Xiaoting Qin, Xiang Huang, Ling Chen, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan, Qi Zhang

    Abstract: Large Language Models (LLMs) have shown potential in reasoning over structured environments, e.g., knowledge graph and table. Such tasks typically require multi-hop reasoning, i.e., match natural language utterance with instances in the environment. Previous methods leverage LLMs to incrementally build a reasoning path, where the LLMs either invoke tools or pick up schemas by step-by-step interact…

    Submitted 3 July, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted by ACL 2024 Findings. 21 pages, 7 figures, 17 tables

  38. arXiv:2403.08515  [pdf, other]

    cs.NI cs.DC

    Plotinus: A Satellite Internet Digital Twin System

    Authors: Yue Gao, Kun Qiu, Zhe Chen, Wenjun Zhu, Qi Zhang, Handong Luo, Quanwei Lin, Ziheng Yang, Wenhao Liu

    Abstract: The development of an integrated space-air-ground network (SAGIN) requires sophisticated satellite Internet emulation tools that can handle complex, dynamic topologies and offer in-depth analysis. Existing emulation platforms struggle with challenges like the need for detailed implementation across all network layers, real-time response, and scalability. This paper proposes a digital twin system b…

    Submitted 24 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  39. arXiv:2403.08255  [pdf, other]

    cs.CV

    Make Me Happier: Evoking Emotions Through Image Diffusion Models

    Authors: Qing Lin, Jingfeng Zhang, Yew Soon Ong, Mengmi Zhang

    Abstract: Despite the rapid progress in image generation, emotional image editing remains under-explored. The semantics, context, and structure of an image can evoke emotional responses, making emotional image editing techniques valuable for various real-world applications, including treatment of psychological disorders, commercialization of products, and artistic design. For the first time, we present a no…

    Submitted 27 May, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  40. arXiv:2402.17531  [pdf, other]

    cs.SE cs.AI cs.CL

    Nissist: An Incident Mitigation Copilot based on Troubleshooting Guides

    Authors: Kaikai An, Fangkai Yang, Junting Lu, Liqun Li, Zhixing Ren, Hao Huang, Lu Wang, Pu Zhao, Yu Kang, Hua Ding, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

    Abstract: Effective incident management is pivotal for the smooth operation of enterprises-level cloud services. In order to expedite incident mitigation, service teams compile troubleshooting knowledge into Troubleshooting Guides (TSGs) accessible to on-call engineers (OCEs). While automated pipelines are enabled to resolve the most frequent and easy incidents, there still exist complex incidents that requ…

    Submitted 10 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Work in progress

  41. arXiv:2402.07939  [pdf, other]

    cs.HC cs.AI cs.CL

    UFO: A UI-Focused Agent for Windows OS Interaction

    Authors: Chaoyun Zhang, Liqun Li, Shilin He, Xu Zhang, Bo Qiao, Si Qin, Minghua Ma, Yu Kang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

    Abstract: We introduce UFO, an innovative UI-Focused agent to fulfill user requests tailored to applications on Windows OS, harnessing the capabilities of GPT-Vision. UFO employs a dual-agent framework to meticulously observe and analyze the graphical user interface (GUI) and control information of Windows applications. This enables the agent to seamlessly navigate and operate within individual applications…

    Submitted 23 May, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  42. arXiv:2402.04267

    physics.med-ph cs.AI cs.CV eess.IV

    Application analysis of ai technology combined with spiral CT scanning in early lung cancer screening

    Authors: Shulin Li, Liqiang Yu, Bo Liu, Qunwei Lin, Jiaxin Huang

    Abstract: At present, the incidence and fatality rate of lung cancer in China rank first among all malignant tumors. Despite the continuous development and improvement of China's medical level, the overall 5-year survival rate of lung cancer patients is still lower than 20% and is staged. A number of studies have confirmed that early diagnosis and treatment of early stage lung cancer is of great significanc…

    Submitted 26 January, 2024; originally announced February 2024.

    Comments: This article was accepted by Frontiers in Computing and Intelligent Systems https://drpress.org/ojs/index.php/fcis/article/view/15781. arXiv admin note: text overlap with arXiv:nlin/0508031 by other authors

  43. arXiv:2402.03951

    cs.CV cs.AI

    Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping

    Authors: Qinliang Lin, Cheng Luo, Zenghao Niu, Xilin He, Weicheng Xie, Yuanbo Hou, Linlin Shen, Siyang Song

    Abstract: Adversarial examples generated by a surrogate model typically exhibit limited transferability to unknown target systems. To address this problem, many transferability enhancement approaches (e.g., input transformation and model augmentation) have been proposed. However, they show poor performances in attacking systems having different model genera from the surrogate model. In this paper, we propos…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: AAAI 2024

  44. arXiv:2402.02820

    cs.LG

    Revisiting VAE for Unsupervised Time Series Anomaly Detection: A Frequency Perspective

    Authors: Zexin Wang, Changhua Pei, Minghua Ma, Xin Wang, Zhihan Li, Dan Pei, Saravan Rajmohan, Dongmei Zhang, Qingwei Lin, Haiming Zhang, Jianhui Li, Gaogang Xie

    Abstract: Time series Anomaly Detection (AD) plays a crucial role for web systems. Various web systems rely on time series data to monitor and identify anomalies in real time, as well as to initiate diagnosis and remediation procedures. Variational Autoencoders (VAEs) have gained popularity in recent decades due to their superior de-noising capabilities, which are useful for anomaly detection. However, our…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: WWW 2024

  45. arXiv:2402.01684

    cs.CL cs.AI cs.LG

    A Framework to Implement 1+N Multi-task Fine-tuning Pattern in LLMs Using the CGC-LORA Algorithm

    Authors: Chao Song, Zhihao Ye, Qiqiang Lin, Qiuying Peng, Jun Wang

    Abstract: With the productive evolution of large language models (LLMs) in the field of natural language processing (NLP), considerable effort has been made to effectively fine-tune common pre-trained LLMs to fulfill a variety of tasks in one or multiple specific domains. In practice, there are two prevailing ways in which the adaptation can be achieved: (i) Multiple Independent Models: Pre-trained LLMs are fine…

    Submitted 22 January, 2024; originally announced February 2024.

  46. arXiv:2402.01148

    math.ST cs.LG stat.ML

    The Optimality of Kernel Classifiers in Sobolev Space

    Authors: Jianfa Lai, Zhifan Li, Dongming Huang, Qian Lin

    Abstract: Kernel methods are widely used in machine learning, especially for classification problems. However, the theoretical analysis of kernel classification is still limited. This paper investigates the statistical performances of kernel classifiers. With some mild assumptions on the conditional probability $η(x)=\mathbb{P}(Y=1\mid X=x)$, we derive an upper bound on the classification excess risk of a k…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: 21 pages, 2 figures

    MSC Class: 62G08 (Primary); 68T07; 46E22 (secondary) ACM Class: G.3

  47. arXiv:2402.00740

    cs.CV

    DRSM: efficient neural 4d decomposition for dynamic reconstruction in stationary monocular cameras

    Authors: Weixing Xie, Xiao Dong, Yong Yang, Qiqin Lin, Jingze Chen, Junfeng Yao, Xiaohu Guo

    Abstract: With the popularity of monocular videos generated by video sharing and live broadcasting applications, reconstructing and editing dynamic scenes in stationary monocular cameras has become a special but anticipated technology. In contrast to scene reconstructions that exploit multi-view observations, the problem of modeling a dynamic scene from a single view is significantly more under-constrained…

    Submitted 1 February, 2024; originally announced February 2024.

  48. arXiv:2402.00034

    cs.DC cs.AI

    Why does Prediction Accuracy Decrease over Time? Uncertain Positive Learning for Cloud Failure Prediction

    Authors: Haozhe Li, Minghua Ma, Yudong Liu, Pu Zhao, Lingling Zheng, Ze Li, Yingnong Dang, Murali Chintalapati, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

    Abstract: With the rapid growth of cloud computing, a variety of software services have been deployed in the cloud. To ensure the reliability of cloud services, prior studies focus on failure instance (disk, node, and switch, etc.) prediction. Once the output of prediction is positive, mitigation actions are taken to rapidly resolve the underlying failure. According to our real-world practice in Microsoft A…

    Submitted 7 January, 2024; originally announced February 2024.

    ACM Class: K.6.3; I.2.0

  49. Activity Detection for Massive Connectivity in Cell-free Networks with Unknown Large-scale Fading, Channel Statistics, Noise Variance, and Activity Probability: A Bayesian Approach

    Authors: Hao Zhang, Qingfeng Lin, Yang Li, Lei Cheng, Yik-Chung Wu

    Abstract: Activity detection is an important task in the next generation grant-free multiple access. While there are a number of existing algorithms designed for this purpose, they mostly require precise information about the network, such as large-scale fading coefficients, small-scale fading channel statistics, noise variance at the access points, and user activity probability. Acquiring this information…

    Submitted 2 February, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 16 pages, 9 figures, accepted for publication in IEEE Transactions on Signal Processing

    MSC Class: 68T01

  50. arXiv:2401.14758

    cs.LG

    Off-Policy Primal-Dual Safe Reinforcement Learning

    Authors: Zifan Wu, Bo Tang, Qian Lin, Chao Yu, Shangqin Mao, Qianlong Xie, Xingxing Wang, Dong Wang

    Abstract: Primal-dual safe RL methods commonly perform iterations between the primal update of the policy and the dual update of the Lagrange Multiplier. Such a training paradigm is highly susceptible to the error in cumulative cost estimation since this estimation serves as the key bond connecting the primal and dual update processes. We show that this problem causes significant underestimation of cost whe…

    Submitted 15 April, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: ICLR 2024 Poster