Showing 1–50 of 245 results for author: Deng, S

Searching in archive cs.
  1. arXiv:2407.15017  [pdf, other]

    cs.CL cs.AI cs.CV cs.HC cs.LG

    Knowledge Mechanisms in Large Language Models: A Survey and Perspective

    Authors: Mengru Wang, Yunzhi Yao, Ziwen Xu, Shuofei Qiao, Shumin Deng, Peng Wang, Xiang Chen, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang

    Abstract: Understanding knowledge mechanisms in Large Language Models (LLMs) is crucial for advancing towards trustworthy AGI. This paper reviews knowledge mechanism analysis from a novel taxonomy including knowledge utilization and evolution. Knowledge utilization delves into the mechanism of memorization, comprehension and application, and creation. Knowledge evolution focuses on the dynamic progression o…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: Ongoing work (v1); 34 pages, 5 figures

  2. arXiv:2407.08584  [pdf, other]

    cs.DC

    Data-Locality-Aware Task Assignment and Scheduling for Distributed Job Executions

    Authors: Hailiang Zhao, Xueyan Tang, Peng Chen, Jianwei Yin, Shuiguang Deng

    Abstract: This paper investigates a data-locality-aware task assignment and scheduling problem aimed at minimizing job completion times for distributed job executions. Without prior knowledge of future job arrivals, we propose an optimal balanced task assignment algorithm (OBTA) that minimizes the completion time of each arriving job. We significantly reduce OBTA's computational overhead by narrowing the se…

    Submitted 15 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

  3. arXiv:2407.08583  [pdf, other]

    cs.AI cs.CV cs.LG

    The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective

    Authors: Zhen Qin, Daoyuan Chen, Wenhao Zhang, Liuyi Yao, Yilun Huang, Bolin Ding, Yaliang Li, Shuiguang Deng

    Abstract: The rapid development of large language models (LLMs) has been witnessed in recent years. Based on the powerful LLMs, multi-modal LLMs (MLLMs) extend the modality from text to a broader spectrum of domains, attracting widespread attention due to the broader range of application scenarios. As LLMs and MLLMs rely on vast amounts of model parameters and data to achieve emergent capabilities, the impo…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Ongoing work. 31 pages. Related materials are continually maintained and available at https://github.com/modelscope/data-juicer/blob/main/docs/awesome_llm_data.md

  4. arXiv:2407.05784  [pdf, other]

    cs.AR

    Hecaton: Training and Finetuning Large Language Models with Scalable Chiplet Systems

    Authors: Zongle Huang, Shupei Fan, Chen Tang, Xinyuan Lin, Shuwen Deng, Yongpan Liu

    Abstract: Large Language Models (LLMs) have achieved remarkable success in various fields, but their training and finetuning require massive computation and memory, necessitating parallelism which introduces heavy communication overheads. Driven by advances in packaging, the chiplet architecture emerges as a potential solution, as it can integrate computing power, as well as utilize on-package links with be…

    Submitted 8 July, 2024; originally announced July 2024.

  5. arXiv:2407.04272  [pdf, other]

    cs.LG cs.DC

    Accelerating Communication in Deep Learning Recommendation Model Training with Dual-Level Adaptive Lossy Compression

    Authors: Hao Feng, Boyuan Zhang, Fanjiang Ye, Min Si, Ching-Hsiang Chu, Jiannan Tian, Chunxing Yin, Summer Deng, Yuchen Hao, Pavan Balaji, Tong Geng, Dingwen Tao

    Abstract: DLRM is a state-of-the-art recommendation system model that has gained widespread adoption across various industry applications. The large size of DLRM models, however, necessitates the use of multiple devices/GPUs for efficient training. A significant bottleneck in this process is the time-consuming all-to-all communication required to collect embedding data from all devices. To mitigate this, we…

    Submitted 11 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

    Comments: accepted by SC '24

  6. arXiv:2407.04192  [pdf, other]

    cs.LG

    KAN-ODEs: Kolmogorov-Arnold Network Ordinary Differential Equations for Learning Dynamical Systems and Hidden Physics

    Authors: Benjamin C. Koenig, Suyong Kim, Sili Deng

    Abstract: Kolmogorov-Arnold networks (KANs) as an alternative to multi-layer perceptrons (MLPs) are a recent development demonstrating strong potential for data-driven modeling. This work applies KANs as the backbone of a neural ordinary differential equation (ODE) framework, generalizing their use to the time-dependent and temporal grid-sensitive cases often seen in dynamical systems and scientific machine…

    Submitted 18 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: B.C.K. and S.K. contributed equally to this work. 20 pages, 10 figures, and 4 tables. Revised upload includes additional examples and extended discussion of existing examples

    ACM Class: I.6.5; G.1.7

  7. arXiv:2407.00993  [pdf, other]

    cs.AI cs.CL

    Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents

    Authors: Shihan Deng, Weikai Xu, Hongda Sun, Wei Liu, Tao Tan, Jianfeng Liu, Ang Li, Jian Luan, Bin Wang, Rui Yan, Shuo Shang

    Abstract: With the remarkable advancements of large language models (LLMs), LLM-based agents have become a research hotspot in human-computer interaction. However, there is a scarcity of benchmarks available for LLM-based mobile agents. Benchmarking these agents generally faces three main challenges: (1) The inefficiency of UI-only operations imposes limitations to task evaluation. (2) Specific instructions…

    Submitted 1 July, 2024; originally announced July 2024.

  8. arXiv:2406.11087  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    MemDPT: Differential Privacy for Memory Efficient Language Models

    Authors: Yanming Liu, Xinyue Peng, Jiannan Cao, Yuwei Zhang, Chen Ma, Songhang Deng, Mengchen Fu, Xuhong Zhang, Sheng Cheng, Xun Wang, Jianwei Yin, Tianyu Du

    Abstract: Large language models have consistently demonstrated remarkable performance across a wide spectrum of applications. Nonetheless, the deployment of these models can inadvertently expose user privacy to potential risks. The substantial memory demands of these models during training represent a significant resource consumption challenge. The sheer size of these models imposes a considerable burden on…

    Submitted 20 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: 12 pages first version

  9. arXiv:2406.08372  [pdf, other]

    cs.CV

    APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation

    Authors: Weizhao He, Yang Zhang, Wei Zhuo, Linlin Shen, Jiaqi Yang, Songhe Deng, Liang Sun

    Abstract: Few-shot semantic segmentation (FSS) endeavors to segment unseen classes with only a few labeled samples. Current FSS methods are commonly built on the assumption that their training and application scenarios share similar domains, and their performances degrade significantly while applied to a distinct domain. To this end, we propose to leverage the cutting-edge foundation model, the Segment Anyt…

    Submitted 12 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 15 pages, 9 figures

  10. arXiv:2406.07686  [pdf, other]

    cs.CV

    AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation

    Authors: Kai Wang, Shijian Deng, Jing Shi, Dimitrios Hatzinakos, Yapeng Tian

    Abstract: Recent Diffusion Transformers (DiTs) have shown impressive capabilities in generating high-quality single-modality content, including images, videos, and audio. However, it is still under-explored whether the transformer-based diffuser can efficiently denoise the Gaussian noises towards superb multimodal content creation. To bridge this gap, we introduce AV-DiT, a novel and efficient audio-visual…

    Submitted 11 June, 2024; originally announced June 2024.

  11. arXiv:2406.06600  [pdf, other]

    cs.LG cs.AI cs.CL

    HORAE: A Domain-Agnostic Modeling Language for Automating Multimodal Service Regulation

    Authors: Yutao Sun, Mingshuai Chen, Tiancheng Zhao, Kangjia Zhao, He Li, Jintao Chen, Liqiang Lu, Xinkui Zhao, Shuiguang Deng, Jianwei Yin

    Abstract: Artificial intelligence is rapidly encroaching on the field of service regulation. This work presents the design principles behind HORAE, a unified specification language to model multimodal regulation rules across a diverse set of domains. We show how HORAE facilitates an intelligent service regulation pipeline by further exploiting a fine-tuned large language model named HORAE that automates the…

    Submitted 18 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  12. arXiv:2406.05639  [pdf, other]

    cs.SE

    A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Automated Program Repair

    Authors: Guochang Li, Chen Zhi, Jialiang Chen, Junxiao Han, Shuiguang Deng

    Abstract: Automated Program Repair (APR) aims to fix bugs by generating patches. Existing work has demonstrated that the "pre-training and fine-tuning" paradigm enables Large Language Models (LLMs) to improve their fixing capabilities on APR. However, existing work mainly focuses on Full-Model Fine-Tuning (FMFT) for APR and limited research has been conducted on the execution-based evaluation of Parameter-Efficient…

    Submitted 9 June, 2024; originally announced June 2024.

  13. arXiv:2406.05287  [pdf, ps, other]

    cs.LG cs.GT stat.ML

    Group-wise oracle-efficient algorithms for online multi-group learning

    Authors: Samuel Deng, Daniel Hsu, Jingwen Liu

    Abstract: We study the problem of online multi-group learning, a learning model in which an online learner must simultaneously achieve small prediction regret on a large collection of (possibly overlapping) subsequences corresponding to a family of groups. Groups are subsets of the context space, and in fairness applications, they may correspond to subpopulations defined by expressive functions of demograph…

    Submitted 7 June, 2024; originally announced June 2024.

  14. arXiv:2406.04657  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise

    Authors: Vignesh Kothapalli, Tianyu Pang, Shenyang Deng, Zongmin Liu, Yaoqing Yang

    Abstract: Modern training strategies of deep neural networks (NNs) tend to induce a heavy-tailed (HT) spectra of layer weights. Extensive efforts to study this phenomenon have found that NNs with HT weight spectra tend to generalize well. A prevailing notion for the occurrence of such HT spectra attributes gradient noise during training as a key contributing factor. Our work shows that gradient noise is unn…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 31 pages, 37 figures

  15. arXiv:2406.02554  [pdf, other]

    eess.AS cs.AI cs.CL cs.CV cs.LG cs.MM

    Hear Me, See Me, Understand Me: Audio-Visual Autism Behavior Recognition

    Authors: Shijian Deng, Erin E. Kosloski, Siddhi Patel, Zeke A. Barnett, Yiyang Nan, Alexander Kaplan, Sisira Aarukapalli, William T. Doan, Matthew Wang, Harsh Singh, Pamela R. Rollins, Yapeng Tian

    Abstract: In this article, we introduce a novel problem of audio-visual autism behavior recognition, which includes social behavior recognition, an essential aspect previously omitted in AI-assisted autism screening research. We define the task at hand as one that is audio-visual autism behavior recognition, which uses audio and visual cues, including any speech present in the audio, to recognize autism-rel…

    Submitted 22 March, 2024; originally announced June 2024.

  16. arXiv:2405.17969  [pdf, other]

    cs.CL cs.AI cs.CV cs.IR cs.LG

    Knowledge Circuits in Pretrained Transformers

    Authors: Yunzhi Yao, Ningyu Zhang, Zekun Xi, Mengru Wang, Ziwen Xu, Shumin Deng, Huajun Chen

    Abstract: The remarkable capabilities of modern large language models are rooted in their vast repositories of knowledge encoded within their parameters, enabling them to perceive the world and engage in reasoning. The inner workings of how these models store knowledge have long been a subject of intense interest and investigation among researchers. To date, most studies have concentrated on isolated compon…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Work in progress, 25 pages

  17. arXiv:2405.15329  [pdf, other]

    cs.CL

    Decompose and Aggregate: A Step-by-Step Interpretable Evaluation Framework

    Authors: Minzhi Li, Zhengyuan Liu, Shumin Deng, Shafiq Joty, Nancy F. Chen, Min-Yen Kan

    Abstract: The acceleration of Large Language Models (LLMs) research has opened up new possibilities for evaluating generated texts. They serve as scalable and economical evaluators, but the question of how reliable these evaluators are has emerged as a crucial research question. Prior research efforts in the meta-evaluation of LLMs as judges limit the prompting of an LLM to a single use to obtain a final ev…

    Submitted 14 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  18. arXiv:2405.14205  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG cs.MA

    Agent Planning with World Knowledge Model

    Authors: Shuofei Qiao, Runnan Fang, Ningyu Zhang, Yuqi Zhu, Xiang Chen, Shumin Deng, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

    Abstract: Recent endeavors towards directly using large language models (LLMs) as agent models to execute interactive planning tasks have shown commendable results. Despite their achievements, however, they still struggle with brainless trial-and-error in global planning and generating hallucinatory actions in local planning due to their poor understanding of the "real" physical world. Imitating humans' m…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Work in progress

  19. arXiv:2404.17876  [pdf, other]

    cs.CV

    DF-SLAM: Dictionary Factors Representation for High-Fidelity Neural Implicit Dense Visual SLAM System

    Authors: Weifeng Wei, Jie Wang, Shuqi Deng, Jie Liu

    Abstract: We introduce a high-fidelity neural implicit dense visual Simultaneous Localization and Mapping (SLAM) system, termed DF-SLAM. In our work, we employ dictionary factors for scene representation, encoding the geometry and appearance information of the scene as a combination of basis and coefficient factors. Compared to neural implicit dense visual SLAM methods that directly encode scene information…

    Submitted 25 June, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

  20. arXiv:2404.14755  [pdf, other]

    cs.MM cs.AI cs.CV cs.HC

    SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models

    Authors: Bo Lin, Yingjing Xu, Xuanwen Bao, Zhou Zhao, Zuyong Zhang, Zhouyang Wang, Jie Zhang, Shuiguang Deng, Jianwei Yin

    Abstract: With the continuous advancement of vision language models (VLMs) technology, remarkable research achievements have emerged in the dermatology field, the fourth most prevalent human disease category. However, despite these advancements, VLM still faces "hallucination" in dermatological diagnosis, and due to the inherent complexity of dermatological conditions, existing tools offer relatively limite…

    Submitted 23 April, 2024; originally announced April 2024.

  21. arXiv:2404.06443  [pdf, other]

    cs.CV

    Multi-scale Dynamic and Hierarchical Relationship Modeling for Facial Action Units Recognition

    Authors: Zihan Wang, Siyang Song, Cheng Luo, Songhe Deng, Weicheng Xie, Linlin Shen

    Abstract: Human facial action units (AUs) are mutually related in a hierarchical manner, as not only are they associated with each other in both spatial and temporal domains but also AUs located in the same/close facial regions show stronger relationships than those of different facial regions. While no existing approach thoroughly models such hierarchical inter-dependencies among AUs, this paper propos…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR2024

  22. arXiv:2403.19964  [pdf, other]

    cs.CV cs.CY cs.LG

    FairRAG: Fair Human Generation via Fair Retrieval Augmentation

    Authors: Robik Shrestha, Yang Zou, Qiuyu Chen, Zhiheng Li, Yusheng Xie, Siqi Deng

    Abstract: Existing text-to-image generative models reflect or even amplify societal biases ingrained in their training data. This is especially concerning for human image generation where models are biased against certain demographic groups. Existing attempts to rectify this issue are hindered by the inherent limitations of the pre-trained models and fail to substantially improve demographic diversity. In t…

    Submitted 5 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  23. arXiv:2403.19460  [pdf, other]

    cs.RO cs.AI

    RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation

    Authors: Chongkai Gao, Zhengrong Xue, Shuying Deng, Tianhai Liang, Siqi Yang, Lin Shao, Huazhe Xu

    Abstract: We present RiEMann, an end-to-end near Real-time SE(3)-Equivariant Robot Manipulation imitation learning framework from scene point cloud input. Compared to previous methods that rely on descriptor field matching, RiEMann directly predicts the target poses of objects for manipulation without any object segmentation. RiEMann learns a manipulation task from scratch with 5 to 10 demonstrations, gener…

    Submitted 28 March, 2024; originally announced March 2024.

  24. arXiv:2403.14472  [pdf, other]

    cs.CL cs.AI cs.CV cs.HC cs.LG

    Detoxifying Large Language Models via Knowledge Editing

    Authors: Mengru Wang, Ningyu Zhang, Ziwen Xu, Zekun Xi, Shumin Deng, Yunzhi Yao, Qishen Zhang, Linyi Yang, Jindong Wang, Huajun Chen

    Abstract: This paper investigates using knowledge editing techniques to detoxify Large Language Models (LLMs). We construct a benchmark, SafeEdit, which covers nine unsafe categories with various powerful attack prompts and equips comprehensive metrics for systematic evaluation. We conduct experiments with several knowledge editing approaches, indicating that knowledge editing has the potential to detoxify…

    Submitted 28 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: ACL 2024. Project website: https://zjunlp.github.io/project/SafeEdit Benchmark: https://huggingface.co/datasets/zjunlp/SafeEdit

  25. arXiv:2403.12029  [pdf, other]

    cs.CV cs.AI cs.LG

    Align and Distill: Unifying and Improving Domain Adaptive Object Detection

    Authors: Justin Kay, Timm Haucke, Suzanne Stathatos, Siqi Deng, Erik Young, Pietro Perona, Sara Beery, Grant Van Horn

    Abstract: Object detectors often perform poorly on data that differs from their training set. Domain adaptive object detection (DAOD) methods have recently demonstrated strong results on addressing this challenge. Unfortunately, we identify systemic benchmarking pitfalls that call past results into question and hamper further progress: (a) Overestimation of performance due to underpowered baselines, (b) Inc…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 30 pages, 10 figures

  26. arXiv:2403.06259  [pdf, other]

    cs.CL cs.AI cs.DB cs.IR cs.LG

    Editing Conceptual Knowledge for Large Language Models

    Authors: Xiaohan Wang, Shengyu Mao, Ningyu Zhang, Shumin Deng, Yunzhi Yao, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen

    Abstract: Recently, there has been a growing interest in knowledge editing for Large Language Models (LLMs). Current approaches and evaluations merely explore the instance-level editing, while whether LLMs possess the capability to modify concepts remains unclear. This paper pioneers the investigation of editing conceptual knowledge for LLMs, by constructing a novel benchmark dataset ConceptEdit and establi…

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: Work in progress. Code: https://github.com/zjunlp/EasyEdit Dataset: https://huggingface.co/datasets/zjunlp/ConceptEdit

  27. arXiv:2403.05916  [pdf, other]

    cs.CV cs.AI

    GPT as Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing

    Authors: Hao Lu, Xuesong Niu, Jiyao Wang, Yin Wang, Qingyong Hu, Jiaqi Tang, Yuting Zhang, Kaishen Yuan, Bin Huang, Zitong Yu, Dengbo He, Shuiguang Deng, Hao Chen, Yingcong Chen, Shiguang Shan

    Abstract: Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos. Despite its success in language understanding, it is critical to evaluate the performance of downstream tasks for better human-centric applications. This paper assesses the application of MLLMs with 5 crucial abilities for affective computing,…

    Submitted 10 April, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  28. arXiv:2403.04981  [pdf, other]

    cs.ET

    Paving the Way for Pass Disturb Free Vertical NAND Storage via A Dedicated and String-Compatible Pass Gate

    Authors: Zijian Zhao, Sola Woo, Khandker Akif Aabrar, Sharadindu Gopal Kirtania, Zhouhang Jiang, Shan Deng, Yi Xiao, Halid Mulaosmanovic, Stefan Duenkel, Dominik Kleimaier, Steven Soss, Sven Beyer, Rajiv Joshi, Scott Meninger, Mohamed Mohamed, Kijoon Kim, Jongho Woo, Suhwan Lim, Kwangsoo Kim, Wanki Kim, Daewon Ha, Vijaykrishnan Narayanan, Suman Datta, Shimeng Yu, Kai Ni

    Abstract: In this work, we propose a dual-port cell design to address the pass disturb in vertical NAND storage, which can pass signals through a dedicated and string-compatible pass gate. We demonstrate that: i) the pass disturb-free feature originates from weakening of the depolarization field by the pass bias at the high-${V}_{TH}$ (HVT) state and the screening of the applied field by channel at the low-…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 29 pages, 7 figures

  29. arXiv:2403.03101  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG cs.MA

    KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents

    Authors: Yuqi Zhu, Shuofei Qiao, Yixin Ou, Shumin Deng, Ningyu Zhang, Shiwei Lyu, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen

    Abstract: Large Language Models (LLMs) have demonstrated great potential in complex reasoning tasks, yet they fall short when tackling more sophisticated challenges, especially when interacting with environments through generating executable actions. This inadequacy primarily stems from the lack of built-in action knowledge in language agents, which fails to effectively guide the planning trajectories durin…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Work in progress. Project page: https://zjunlp.github.io/project/KnowAgent/ Code: https://github.com/zjunlp/KnowAgent

  30. arXiv:2403.02253  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection

    Authors: Yuexin Li, Chengyu Huang, Shumin Deng, Mei Lin Lock, Tri Cao, Nay Oo, Hoon Wei Lim, Bryan Hooi

    Abstract: Phishing attacks have inflicted substantial losses on individuals and businesses alike, necessitating the development of robust and efficient automated phishing detection approaches. Reference-based phishing detectors (RBPDs), which compare the logos on a target webpage to a known set of logos, have emerged as the state-of-the-art approach. However, a major limitation of existing RBPDs is that the…

    Submitted 15 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted by USENIX Security 2024

  31. arXiv:2402.06123  [pdf, other]

    cs.DC

    Decentralized Proactive Model Offloading and Resource Allocation for Split and Federated Learning

    Authors: Binbin Huang, Hailiang Zhao, Lingbin Wang, Wenzhuo Qian, Yuyu Yin, Shuiguang Deng

    Abstract: In the resource-constrained IoT-edge environment, Split Federated (SplitFed) learning is implemented to enhance training efficiency. This method involves each IoT device dividing its full DNN model at a designated layer into a device-side model and a server-side model, then offloading the latter to the edge server. However, existing research overlooks four critical issues as follows: (1) the heter…

    Submitted 8 February, 2024; originally announced February 2024.

  32. arXiv:2402.00258  [pdf, other]

    cs.LG

    Multi-group Learning for Hierarchical Groups

    Authors: Samuel Deng, Daniel Hsu

    Abstract: The multi-group learning model formalizes the learning scenario in which a single predictor must generalize well on multiple, possibly overlapping subgroups of interest. We extend the study of multi-group learning to the natural case where the groups are hierarchically structured. We design an algorithm for this setting that outputs an interpretable and deterministic decision tree predictor with n…

    Submitted 12 June, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: Accepted in International Conference on Machine Learning 2024 (ICML 2024). Fixed reference description in "Related Work" for multi-task learning

  33. arXiv:2401.12853  [pdf, other]

    cs.GR

    Hyper-Realist Rendering: A Theoretical Framework

    Authors: Ergun Akleman, Murat Kurt, Derya Akleman, Gary Bruins, Sitong Deng, Meena Subramanian

    Abstract: This is the first paper in a series on hyper-realist rendering. In this paper, we introduce the concept of hyper-realist rendering and present a theoretical framework to obtain hyper-realist images. We are using the term Hyper-realism as an umbrella word that captures all types of visual artifacts that can evoke an impression of reality. The hyper-realist artifacts are visual representations that…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: 20 pages

  34. arXiv:2401.12550  [pdf, other]

    cs.AI cs.LG

    UR4NNV: Neural Network Verification, Under-approximation Reachability Works!

    Authors: Zhen Liang, Taoran Wu, Ran Zhao, Bai Xue, Ji Wang, Wenjing Yang, Shaojun Deng, Wanwei Liu

    Abstract: Recently, formal verification of deep neural networks (DNNs) has garnered considerable attention, and over-approximation based methods have become popular due to their effectiveness and efficiency. However, these strategies face challenges in addressing the "unknown dilemma" concerning whether the exact output region or the introduced approximation error violates the property in question. To addre…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: 11 pages, 4 figures

    MSC Class: 68Q60; 68T07 ACM Class: D.2.4; I.2.0

  35. arXiv:2401.09883  [pdf, other]

    cs.CV

    Question-Answer Cross Language Image Matching for Weakly Supervised Semantic Segmentation

    Authors: Songhe Deng, Wei Zhuo, Jinheng Xie, Linlin Shen

    Abstract: Class Activation Map (CAM) has emerged as a popular tool for weakly supervised semantic segmentation (WSSS), allowing the localization of object regions in an image using only image-level labels. However, existing CAM methods suffer from under-activation of target object regions and false-activation of background regions due to the fact that a lack of detailed supervision can hinder the model's ab…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: ACM MM 2023

  36. arXiv:2401.09334  [pdf, other]

    cs.CL cs.AI

    Large Language Models Are Neurosymbolic Reasoners

    Authors: Meng Fang, Shilong Deng, Yudi Zhang, Zijing Shi, Ling Chen, Mykola Pechenizkiy, Jun Wang

    Abstract: A wide range of real-world applications is characterized by their symbolic nature, necessitating a strong capability for symbolic reasoning. This paper investigates the potential application of Large Language Models (LLMs) as symbolic reasoners. We focus on text-based games, significant benchmarks for agents with natural language capabilities, particularly in symbolic tasks like math, map reading,…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  37. arXiv:2401.07439  [pdf, other]

    cs.CV

    Mask-adaptive Gated Convolution and Bi-directional Progressive Fusion Network for Depth Completion

    Authors: Tingxuan Huang, Jiacheng Miao, Shizhuo Deng, Tong, Dongyue Chen

    Abstract: Depth completion is a critical task for handling depth images with missing pixels, which can negatively impact further applications. Recent approaches have utilized Convolutional Neural Networks (CNNs) to reconstruct depth images with the assistance of color images. However, vanilla convolution has non-negligible drawbacks in handling missing pixels. To solve this problem, we propose a new model f…

    Submitted 14 January, 2024; originally announced January 2024.

  38. arXiv:2401.01286  [pdf, other]

    cs.CL cs.AI cs.CV cs.HC cs.LG

    A Comprehensive Study of Knowledge Editing for Large Language Models

    Authors: Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni, Siyuan Cheng, Ziwen Xu, Xin Xu, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Lei Liang, Zhiqiang Zhang, Xiaowei Zhu, Jun Zhou, Huajun Chen

    Abstract: Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. However, a primary limitation lies in the significant computational demands during training, arising from their extensive parameterization. This challenge is further intensified by the dynamic nature of the world, necessitating frequent updates to LLMs t…

    Submitted 28 March, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

    Comments: Ongoing work; 52 pages, 282 citations; benchmark is available at https://huggingface.co/datasets/zjunlp/KnowEdit code is available at https://github.com/zjunlp/EasyEdit paper list is available at https://github.com/zjunlp/KnowledgeEditingPapers

  39. arXiv:2312.17515  [pdf, other]

    cs.CL

    Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game

    Authors: Zijing Shi, Meng Fang, Shunfeng Zheng, Shilong Deng, Ling Chen, Yali Du

    Abstract: Multi-agent collaboration with Large Language Models (LLMs) demonstrates proficiency in basic tasks, yet its efficiency in more complex scenarios remains unexplored. In gaming environments, these agents often face situations without established coordination protocols, requiring them to make intelligent inferences about teammates from limited data. This problem motivates the area of ad hoc teamwork…

    Submitted 29 December, 2023; originally announced December 2023.

    Comments: Code will release soon

  40. arXiv:2312.16883  [pdf, other]

    cs.DC

    Tail-Learning: Adaptive Learning Method for Mitigating Tail Latency in Autonomous Edge Systems

    Authors: Cheng Zhang, Yinuo Deng, Hailiang Zhao, Tianlv Chen, Shuiguang Deng

    Abstract: In the realm of edge computing, the increasing demand for high Quality of Service (QoS), particularly in dynamic multimedia streaming applications (e.g., Augmented Reality/Virtual Reality and online gaming), has prompted the need for effective solutions. Nevertheless, adopting an edge paradigm grounded in distributed computing has exacerbated the issue of tail latency. Given a limited variety of m…

    Submitted 28 December, 2023; originally announced December 2023.

  41. arXiv:2312.06353  [pdf, other]

    cs.LG cs.DC

    Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes

    Authors: Zhen Qin, Daoyuan Chen, Bingchen Qian, Bolin Ding, Yaliang Li, Shuiguang Deng

    Abstract: Pre-trained large language models (LLMs) need fine-tuning to improve their responsiveness to natural language instructions. Federated learning offers a way to fine-tune LLMs using the abundant data on end devices without compromising data privacy. Most existing federated fine-tuning methods for LLMs rely on parameter-efficient fine-tuning techniques, which may not reach the performance height poss…

    Submitted 27 May, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted to ICML 2024. 25 pages, 14 figures, 7 tables. Codes are available at https://github.com/alibaba/FederatedScope/tree/FedKSeed

  42. arXiv:2312.05401  [pdf, other]

    cs.GR

    A Digital Compositing Approach to obtain Animated Chinese Still-life Paintings with Global Effects

    Authors: Sitong Deng, Ergun Akleman

    Abstract: In this work, we present a method for turning Chinese still-life paintings with global illumination effects into dynamic paintings with moving lights. Our goal is to preserve the original look and feel of still-life paintings with moving lights and objects. We have developed a deceptively simple method that can be computed as a composite of two animated texture images using an animated rendering.…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: 14 pages

  43. arXiv:2311.16191  [pdf, other]

    cs.LG cs.AI

    Learning Multi-Pattern Normalities in the Frequency Domain for Efficient Time Series Anomaly Detection

    Authors: Feiyi Chen, Yingying Zhang, Zhen Qin, Lunting Fan, Renhe Jiang, Yuxuan Liang, Qingsong Wen, Shuiguang Deng

    Abstract: Anomaly detection significantly enhances the robustness of cloud systems. While neural network-based methods have recently demonstrated strong advantages, they encounter practical challenges in cloud environments: the contradiction between the impracticality of maintaining a unique model for each service and the limited ability to deal with diverse normal patterns by a unified model, as well as is…

    Submitted 18 March, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

    Comments: Accepted by IEEE 40th International Conference on Data Engineering (ICDE 2024)

  44. arXiv:2311.09101  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Towards A Unified View of Answer Calibration for Multi-Step Reasoning

    Authors: Shumin Deng, Ningyu Zhang, Nay Oo, Bryan Hooi

    Abstract: Large Language Models (LLMs) employing Chain-of-Thought (CoT) prompting have broadened the scope for improving multi-step reasoning capabilities. We generally divide multi-step reasoning into two phases: path generation to generate the reasoning path(s); and answer calibration post-processing the reasoning path(s) to obtain a final answer. However, the existing literature lacks systematic analysis…

    Submitted 25 February, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Working in Progress

  45. arXiv:2311.08588  [pdf, other]

    cs.CL cs.AI cs.SE

    CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation

    Authors: Weixiang Yan, Haitian Liu, Yunkun Wang, Yunzhe Li, Qian Chen, Wen Wang, Tingyu Lin, Weishan Zhao, Li Zhu, Hari Sundaram, Shuiguang Deng

    Abstract: Large Language Models (LLMs) have demonstrated remarkable performance on assisting humans in programming and facilitating programming automation. However, existing benchmarks for evaluating the code understanding and generation capacities of LLMs suffer from severe limitations. First, most benchmarks are insufficient as they focus on a narrow range of popular programming languages and specific tas…

    Submitted 7 June, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: Accepted by ACL 2024 main conference

  46. arXiv:2310.14676  [pdf, other]

    cs.CL

    Pre-Trained Language Models Augmented with Synthetic Scanpaths for Natural Language Understanding

    Authors: Shuwen Deng, Paul Prasse, David R. Reich, Tobias Scheffer, Lena A. Jäger

    Abstract: Human gaze data offer cognitive information that reflects natural language comprehension. Indeed, augmenting language models with human scanpaths has proven beneficial for a range of NLP tasks, including language understanding. However, the applicability of this approach is hampered because the abundance of text corpora is contrasted by a scarcity of gaze data. Although models for the generation o…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Pre-print for EMNLP 2023

  47. arXiv:2310.11713  [pdf, other]

    cs.CV cs.SD eess.AS

    Separating Invisible Sounds Toward Universal Audiovisual Scene-Aware Sound Separation

    Authors: Yiyang Su, Ali Vosoughi, Shijian Deng, Yapeng Tian, Chenliang Xu

    Abstract: The audio-visual sound separation field assumes visible sources in videos, but this excludes invisible sounds beyond the camera's view. Current methods struggle with such sounds lacking visible cues. This paper introduces a novel "Audio-Visual Scene-Aware Separation" (AVSA-Sep) framework. It includes a semantic parser for visible and invisible sounds and a separator for scene-informed separation.…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: Accepted at ICCV 2023 - AV4D, 4 figures, 3 tables

  48. arXiv:2310.10537  [pdf, other]

    cs.LG cs.AI

    Microscaling Data Formats for Deep Learning

    Authors: Bita Darvish Rouhani, Ritchie Zhao, Ankit More, Mathew Hall, Alireza Khodamoradi, Summer Deng, Dhruv Choudhary, Marius Cornea, Eric Dellinger, Kristof Denolf, Stosic Dusan, Venmugil Elango, Maximilian Golub, Alexander Heinecke, Phil James-Roxby, Dharmesh Jani, Gaurav Kolhe, Martin Langhammer, Ada Li, Levi Melnick, Maral Mesmakhosroshahi, Andres Rodriguez, Michael Schulte, Rasoul Shafipour, Lei Shao, et al. (8 additional authors not shown)

    Abstract: Narrow bit-width data formats are key to reducing the computational and storage costs of modern deep learning applications. This paper evaluates Microscaling (MX) data formats that combine a per-block scaling factor with narrow floating-point and integer types for individual elements. MX formats balance the competing needs of hardware efficiency, model accuracy, and user friction. Empirical result…

    Submitted 19 October, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

  49. arXiv:2310.05668  [pdf, other]

    cs.LG

    LARA: A Light and Anti-overfitting Retraining Approach for Unsupervised Time Series Anomaly Detection

    Authors: Feiyi Chen, Zhen Qin, Yingying Zhang, Shuiguang Deng, Yi Xiao, Guansong Pang, Qingsong Wen

    Abstract: Most of current anomaly detection models assume that the normal pattern remains same all the time. However, the normal patterns of Web services change dramatically and frequently. The model trained on old-distribution data is outdated after such changes. Retraining the whole model every time is expensive. Besides, at the beginning of normal pattern changes, there is not enough observation data fro…

    Submitted 23 February, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted by ACM Web Conference 2024 (WWW 24)

  50. arXiv:2310.02141  [pdf, other]

    cs.RO eess.SY

    Adaptive Gait Modeling and Optimization for Principally Kinematic Systems

    Authors: Siming Deng, Noah J. Cowan, Brian A. Bittner

    Abstract: Robotic adaptation to unanticipated operating conditions is crucial to achieving persistence and robustness in complex real world settings. For a wide range of cutting-edge robotic systems, such as micro- and nano-scale robots, soft robots, medical robots, and bio-hybrid robots, it is infeasible to anticipate the operating environment a priori due to complexities that arise from numerous factors i…

    Submitted 18 April, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: 7 pages, 4 figures