
Showing 1–50 of 173 results for author: Zeng, D

Searching in archive cs. Search in all archives.
  1. arXiv:2407.11518  [pdf, other]

    stat.ML cs.LG stat.OT

    Ensemble Transport Filter via Optimized Maximum Mean Discrepancy

    Authors: Dengfei Zeng, Lijian Jiang

    Abstract: In this paper, we present a new ensemble-based filter method by reconstructing the analysis step of the particle filter through a transport map, which directly transports prior particles to posterior particles. The transport map is constructed through an optimization problem described by the Maximum Mean Discrepancy loss function, which matches the expectation information of the approximated poste… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 27 pages, 14 figures

  2. arXiv:2407.09580  [pdf, other]

    cs.CV cs.AI

    Don't Fear Peculiar Activation Functions: EUAF and Beyond

    Authors: Qianchao Wang, Shijun Zhang, Dong Zeng, Zhaoheng Xie, Hengtao Guo, Feng-Lei Fan, Tieyong Zeng

    Abstract: In this paper, we propose a new super-expressive activation function called the Parametric Elementary Universal Activation Function (PEUAF). We demonstrate the effectiveness of PEUAF through systematic and comprehensive experiments on various industrial and image datasets, including CIFAR10, Tiny-ImageNet, and ImageNet. Moreover, we significantly generalize the family of super-expressive activatio… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

  3. arXiv:2407.05383  [pdf, other]

    cs.CV

    Learning Motion Blur Robust Vision Transformers with Dynamic Early Exit for Real-Time UAV Tracking

    Authors: You Wu, Xucheng Wang, Dan Zeng, Hengzhou Ye, Xiaolan Xie, Qijun Zhao, Shuiwang Li

    Abstract: Recently, the surge in the adoption of single-stream architectures utilizing pre-trained ViT backbones represents a promising advancement in the field of generic visual tracking. By integrating feature extraction and fusion into a cohesive framework, these architectures offer improved performance, efficiency, and robustness. However, there has been limited exploration into optimizing these framewo… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

  4. arXiv:2407.04958  [pdf, other]

    cs.LG cs.CV

    Entropy-Informed Weighting Channel Normalizing Flow

    Authors: Wei Chen, Shian Du, Shigui Li, Delu Zeng, John Paisley

    Abstract: Normalizing Flows (NFs) have gained popularity among deep generative models due to their ability to provide exact likelihood estimation and efficient sampling. However, a crucial limitation of NFs is their substantial memory requirements, arising from maintaining the dimension of the latent space equal to that of the input space. Multi-scale architectures bypass this limitation by progressively re… ▽ More

    Submitted 6 July, 2024; originally announced July 2024.

  5. arXiv:2406.08037  [pdf, other]

    cs.CV

    Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking

    Authors: Xiangyang Yang, Dan Zeng, Xucheng Wang, You Wu, Hengzhou Ye, Qijun Zhao, Shuiwang Li

    Abstract: Empowered by transformer-based models, visual tracking has advanced significantly. However, the slow speed of current trackers limits their applicability on devices with constrained computational resources. To address this challenge, we introduce ABTrack, an adaptive computation framework that adaptively bypasses transformer blocks for efficient visual tracking. The rationale behind ABTrack is ro… ▽ More

    Submitted 1 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  6. arXiv:2405.16761  [pdf, other]

    cs.CV cs.AI cs.LG

    Masked Face Recognition with Generative-to-Discriminative Representations

    Authors: Shiming Ge, Weijia Guo, Chenyu Li, Junzheng Zhang, Yong Li, Dan Zeng

    Abstract: Masked face recognition is important for social good but challenged by diverse occlusions that cause insufficient or inaccurate representations. In this work, we propose a unified deep network to learn generative-to-discriminative representations for facilitating masked face recognition. To this end, we split the network into three modules and learn them on synthetic masked faces in a greedy modul… ▽ More

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: Accepted by International Conference on Machine Learning 2024

  7. arXiv:2405.16456  [pdf, other]

    cs.LG cs.AI

    Dominant Shuffle: A Simple Yet Powerful Data Augmentation for Time-series Prediction

    Authors: Kai Zhao, Zuojie He, Alex Hung, Dan Zeng

    Abstract: Recent studies have suggested frequency-domain data augmentation (DA) is effective for time series prediction. Existing frequency-domain augmentations disturb the original data with various full-spectrum noises, leading to excess domain gap between augmented and original data. Although impressive performance has been achieved in certain cases, frequency-domain DA has yet to be generalized to time… ▽ More

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: https://kaizhao.net/time-series

  8. arXiv:2405.08681  [pdf, other]

    cs.CV cs.AI

    Achieving Fairness Through Channel Pruning for Dermatological Disease Diagnosis

    Authors: Qingpeng Kong, Ching-Hao Chiu, Dewen Zeng, Yu-Jen Chen, Tsung-Yi Ho, Jingtong Hu, Yiyu Shi

    Abstract: Numerous studies have revealed that deep learning-based medical image classification models may exhibit bias towards specific demographic attributes, such as race, gender, and age. Existing bias mitigation methods often achieve a high level of fairness at the cost of significant accuracy degradation. In response to this challenge, we propose an innovative and adaptable Soft Nearest Neighbor Loss-bas… ▽ More

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 13 pages, 3 figures, early accepted by International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2024

  9. arXiv:2405.04700  [pdf, other]

    cs.LG cs.AI cs.DC cs.IR

    Robust Implementation of Retrieval-Augmented Generation on Edge-based Computing-in-Memory Architectures

    Authors: Ruiyang Qin, Zheyu Yan, Dewen Zeng, Zhenge Jia, Dancheng Liu, Jianbo Liu, Zhi Zheng, Ningyuan Cao, Kai Ni, Jinjun Xiong, Yiyu Shi

    Abstract: Large Language Models (LLMs) deployed on edge devices learn through fine-tuning and updating a certain portion of their parameters. Although such learning methods can be optimized to reduce resource utilization, the overall required resources remain a heavy burden on edge devices. Instead, Retrieval-Augmented Generation (RAG), a resource-efficient LLM learning method, can improve the quality of th… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

  10. arXiv:2405.01884  [pdf, other]

    cs.CL

    Beyond Single-Event Extraction: Towards Efficient Document-Level Multi-Event Argument Extraction

    Authors: Wanlong Liu, Li Zhou, Dingyi Zeng, Yichen Xiao, Shaohuan Cheng, Chen Zhang, Grandee Lee, Malu Zhang, Wenyu Chen

    Abstract: Recent mainstream event argument extraction methods process each event in isolation, resulting in inefficient inference and ignoring the correlations among multiple events. To address these limitations, here we propose a multiple-event argument extraction model DEEIA (Dependency-guided Encoding and Event-specific Information Aggregation), capable of extracting arguments from all events within a do… ▽ More

    Submitted 16 June, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: Accepted to Findings of ACL 2024

  11. arXiv:2404.14691  [pdf, other]

    cs.DC

    Towards Fast Setup and High Throughput of GPU Serverless Computing

    Authors: Han Zhao, Weihao Cui, Quan Chen, Shulai Zhang, Zijun Li, Jingwen Leng, Chao Li, Deze Zeng, Minyi Guo

    Abstract: Integrating GPUs into serverless computing platforms is crucial for improving efficiency. However, existing solutions for GPU-enabled serverless computing platforms face two significant problems due to coarse-grained GPU management: long setup time and low function throughput. To address these issues, we propose SAGE, a GPU serverless framework with fast setup and high throughput. First, based o… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  12. arXiv:2404.14678  [pdf, other]

    cs.CV

    3DBench: A Scalable 3D Benchmark and Instruction-Tuning Dataset

    Authors: Junjie Zhang, Tianci Hu, Xiaoshui Huang, Yongshun Gong, Dan Zeng

    Abstract: Evaluating the performance of Multi-modal Large Language Models (MLLMs), integrating both point cloud and language, presents significant challenges. The lack of a comprehensive assessment hampers determining whether these models truly represent advancements, thereby impeding further progress in the field. Current evaluations heavily rely on classification and caption tasks, falling short in provid… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  13. arXiv:2404.13792  [pdf, other]

    cs.MM cs.AI cs.CL cs.HC

    Counterfactual Reasoning Using Predicted Latent Personality Dimensions for Optimizing Persuasion Outcome

    Authors: Donghuo Zeng, Roberto S. Legaspi, Yuewen Sun, Xinshuai Dong, Kazushi Ikeda, Peter Spirtes, Kun Zhang

    Abstract: Customizing persuasive conversations related to the outcome of interest for specific users achieves better persuasion results. However, existing persuasive conversation systems rely on persuasive strategies and encounter challenges in dynamically adjusting dialogues to suit the evolving states of individual users during interactions. This limitation restricts the system's ability to deliver flexib… ▽ More

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 14 pages, 10 figures, Accepted by Persuasive Technology 2024

  14. arXiv:2404.13789  [pdf, other]

    cs.SD cs.AI cs.IR cs.MM eess.AS

    Anchor-aware Deep Metric Learning for Audio-visual Retrieval

    Authors: Donghuo Zeng, Yanan Wang, Kazushi Ikeda, Yi Yu

    Abstract: Metric learning minimizes the gap between similar (positive) pairs of data points and increases the separation of dissimilar (negative) pairs, aiming at capturing the underlying data structure and enhancing the performance of tasks like audio-visual cross-modal retrieval (AV-CMR). Recent works employ sampling methods to select impactful data points from the embedding space during training. However… ▽ More

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 9 pages, 5 figures. Accepted by ACM ICMR 2024

  15. arXiv:2403.14995  [pdf, other]

    cs.CV

    Improve Cross-domain Mixed Sampling with Guidance Training for Adaptive Segmentation

    Authors: Wenlve Zhou, Zhiheng Zhou, Tianlei Wang, Delu Zeng

    Abstract: Unsupervised Domain Adaptation (UDA) endeavors to adjust models trained on a source domain to perform well on a target domain without requiring additional annotations. In the context of domain adaptive semantic segmentation, which tackles UDA for dense prediction, the goal is to circumvent the need for costly pixel-level annotations. Typically, various prevailing methods baseline rely on construct… ▽ More

    Submitted 22 March, 2024; originally announced March 2024.

  16. arXiv:2403.02307  [pdf, other]

    eess.IV cs.CV

    Harnessing Intra-group Variations Via a Population-Level Context for Pathology Detection

    Authors: P. Bilha Githinji, Xi Yuan, Zhenglin Chen, Ijaz Gul, Dingqi Shang, Wen Liang, Jianming Deng, Dan Zeng, Dongmei Yu, Chenggang Yan, Peiwu Qin

    Abstract: Realizing sufficient separability between the distributions of healthy and pathological samples is a critical obstacle for pathology detection convolutional models. Moreover, these models exhibit a bias for contrast-based images, with diminished performance on texture-based medical images. This study introduces the notion of a population-level context for pathology detection and employs a graph th… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.

  17. arXiv:2403.02075  [pdf, other]

    cs.CV

    DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction

    Authors: Weiyi Lv, Yuhang Huang, Ning Zhang, Ruei-Sung Lin, Mei Han, Dan Zeng

    Abstract: In Multiple Object Tracking, objects often exhibit non-linear motion of acceleration and deceleration, with irregular direction changes. Tracking-by-detection (TBD) trackers with Kalman Filter motion prediction work well in pedestrian-dominant scenarios but fall short in complex situations when multiple objects perform non-linear and diverse motion simultaneously. To tackle the complex non-linear m… ▽ More

    Submitted 20 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: CVPR2024

  18. arXiv:2402.19103  [pdf, other]

    cs.CL cs.AI

    Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models

    Authors: Hongbang Yuan, Pengfei Cao, Zhuoran Jin, Yubo Chen, Daojian Zeng, Kang Liu, Jun Zhao

    Abstract: Large Language Models (LLMs) have shown impressive capabilities but still suffer from the issue of hallucinations. A significant type of this issue is the false premise hallucination, which we define as the phenomenon when LLMs generate hallucinated text when confronted with false premise questions. In this paper, we perform a comprehensive analysis of the false premise hallucination and elucidate… ▽ More

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 12 pages, 5 figures, 5 tables

  19. arXiv:2402.18344  [pdf, other]

    cs.CL

    Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning

    Authors: Jiachun Li, Pengfei Cao, Chenhao Wang, Zhuoran Jin, Yubo Chen, Daojian Zeng, Kang Liu, Jun Zhao

    Abstract: Large language models exhibit high-level commonsense reasoning abilities, especially with enhancement methods like Chain-of-Thought (CoT). However, we find these CoT-like methods lead to a considerable number of originally correct answers turning wrong, which we define as the Toxic CoT problem. To interpret and mitigate this problem, we first utilize attribution tracing and causal tracing methods… ▽ More

    Submitted 27 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted as a long paper to ACL 2024 Main, 25 pages, 22 figures

  20. arXiv:2402.11274  [pdf, other]

    eess.IV cs.CV cs.LG

    TC-DiffRecon: Texture coordination MRI reconstruction method based on diffusion model and modified MF-UNet method

    Authors: Chenyan Zhang, Yifei Chen, Zhenxiong Fan, Yiyu Huang, Wenchao Weng, Ruiquan Ge, Dong Zeng, Changmiao Wang

    Abstract: Recently, diffusion models have gained significant attention as a novel set of deep learning-based generative methods. These models attempt to sample data from a Gaussian distribution that adheres to a target distribution, and have been successfully adapted to the reconstruction of MRI data. However, as an unconditional generative model, the diffusion model typically disrupts image coordination be… ▽ More

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: 5 pages, 2 figures, accepted at ISBI 2024

    Journal ref: ISBI 2024

  21. arXiv:2402.10045  [pdf]

    cs.CV cs.LG

    Short-Form Videos and Mental Health: A Knowledge-Guided Neural Topic Model

    Authors: Jiaheng Xie, Ruicheng Liang, Yidong Chai, Yang Liu, Daniel Zeng

    Abstract: While short-form videos head to reshape the entire social media landscape, experts are exceedingly worried about their depressive impacts on viewers, as evidenced by medical studies. To prevent widespread consequences, platforms are eager to predict these videos' impact on viewers' mental health. Subsequently, they can take intervention measures, such as revising recommendation algorithms and disp… ▽ More

    Submitted 21 March, 2024; v1 submitted 10 January, 2024; originally announced February 2024.

  22. arXiv:2401.17699  [pdf, other]

    cs.CV

    Unified Physical-Digital Face Attack Detection

    Authors: Hao Fang, Ajian Liu, Haocheng Yuan, Junze Zheng, Dingheng Zeng, Yanhong Liu, Jiankang Deng, Sergio Escalera, Xiaoming Liu, Jun Wan, Zhen Lei

    Abstract: Face Recognition (FR) systems can suffer from physical (i.e., print photo) and digital (i.e., DeepFake) attacks. However, previous related work rarely considers both situations at the same time. This implies the deployment of multiple models and thus more computational burden. The main reasons for this lack of an integrated model are caused by two factors: (1) The lack of a dataset including both… ▽ More

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: 12 pages, 8 figures

  23. arXiv:2401.13929  [pdf, other]

    cs.LG stat.AP stat.ME stat.ML

    Reinforcement Learning with Hidden Markov Models for Discovering Decision-Making Dynamics

    Authors: Xingche Guo, Donglin Zeng, Yuanjia Wang

    Abstract: Major depressive disorder (MDD) presents challenges in diagnosis and treatment due to its complex and heterogeneous nature. Emerging evidence indicates that reward processing abnormalities may serve as a behavioral marker for MDD. To measure reward processing, patients perform computer-based behavioral tasks that involve making choices or responding to stimulants that are associated with different… ▽ More

    Submitted 24 January, 2024; originally announced January 2024.

  24. arXiv:2401.05031  [pdf, other]

    cs.DC

    OTAS: An Elastic Transformer Serving System via Token Adaptation

    Authors: Jinyu Chen, Wenchao Xu, Zicong Hong, Song Guo, Haozhao Wang, Jie Zhang, Deze Zeng

    Abstract: Transformer model empowered architectures have become a pillar of cloud services that keeps reshaping our society. However, the dynamic query loads and heterogeneous user requirements severely challenge current transformer serving systems, which rely on pre-training multiple variants of a foundation model, i.e., with different sizes, to accommodate varying service demands. Unfortunately, such a me… ▽ More

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: Accepted by INFOCOM '24

  25. arXiv:2401.03088  [pdf, other]

    cs.RO cs.HC

    The RoSiD Tool: Empowering Users to Design Multimodal Signals for Human-Robot Collaboration

    Authors: Nathaniel Dennler, David Delgado, Daniel Zeng, Stefanos Nikolaidis, Maja Matarić

    Abstract: Robots that cooperate with humans must be effective at communicating with them. However, people have varied preferences for communication based on many contextual factors, such as culture, environment, and past experience. To communicate effectively, robots must take those factors into consideration. In this work, we present the Robot Signal Design (RoSiD) tool to empower people to easily self-spe… ▽ More

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: Accepted to ISER 2023. 8 pages, 4 figures

  26. arXiv:2401.01717  [pdf]

    cs.CV

    Fact-checking based fake news detection: a review

    Authors: Yuzhou Yang, Yangming Zhou, Qichao Ying, Zhenxing Qian, Dan Zeng, Liang Liu

    Abstract: This paper reviews and summarizes the research results on fact-based fake news from the perspectives of tasks and problems, algorithm strategies, and datasets. First, the paper systematically explains the task definition and core problems of fact-based fake news detection. Second, the paper summarizes the existing detection methods based on the algorithm principles. Third, the paper analyzes the c… ▽ More

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: Invited short review paper (in Chinese)

  27. arXiv:2401.01667  [pdf, other]

    cs.CL

    MLPs Compass: What is learned when MLPs are combined with PLMs?

    Authors: Li Zhou, Wenyu Chen, Yong Cao, Dingyi Zeng, Wanlong Liu, Hong Qu

    Abstract: While Transformer-based pre-trained language models and their variants exhibit strong semantic representation capabilities, the question of comprehending the information gain derived from the additional components of PLMs remains an open question in this field. Motivated by recent efforts that prove Multilayer-Perceptrons (MLPs) modules achieving robust structural capture capabilities, even outper… ▽ More

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  28. arXiv:2312.15927  [pdf, other]

    cs.CV cs.LG

    M3D: Dataset Condensation by Minimizing Maximum Mean Discrepancy

    Authors: Hansong Zhang, Shikun Li, Pengju Wang, Dan Zeng, Shiming Ge

    Abstract: Training state-of-the-art (SOTA) deep models often requires extensive data, resulting in substantial training and storage costs. To address these challenges, dataset condensation has been developed to learn a small synthetic set that preserves essential information from the original large-scale dataset. Nowadays, optimization-oriented methods have been the primary method in the field of dataset co… ▽ More

    Submitted 25 February, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

    Comments: This work has been accepted by AAAI-24

  29. arXiv:2312.15548  [pdf, other]

    cs.CL cs.AI

    YAYI-UIE: A Chat-Enhanced Instruction Tuning Framework for Universal Information Extraction

    Authors: Xinglin Xiao, Yijie Wang, Nan Xu, Yuqi Wang, Hanxuan Yang, Minzheng Wang, Yin Luo, Lei Wang, Wenji Mao, Daniel Zeng

    Abstract: The difficulty of the information extraction task lies in dealing with the task-specific label schemas and heterogeneous data structures. Recent work has proposed methods based on large language models to uniformly model different information extraction tasks. However, these existing methods are deficient in their information extraction capabilities for Chinese languages other than English. In thi… ▽ More

    Submitted 2 April, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

  30. arXiv:2312.14862  [pdf, other]

    cs.CL cs.AI

    YAYI 2: Multilingual Open-Source Large Language Models

    Authors: Yin Luo, Qingchao Kong, Nan Xu, Jia Cao, Bao Hao, Baoyu Qu, Bo Chen, Chao Zhu, Chenyang Zhao, Donglei Zhang, Fan Feng, Feifei Zhao, Hailong Sun, Hanxuan Yang, Haojun Pan, Hongyu Liu, Jianbin Guo, Jiangtao Du, Jingyi Wang, Junfeng Li, Lei Sun, Liduo Liu, Lifeng Dong, Lili Liu, Lin Wang, et al. (28 additional authors not shown)

    Abstract: As the latest advancements in natural language processing, large language models (LLMs) have achieved human-level language understanding and generation abilities in many real-world tasks, and even have been regarded as a potential path to the artificial general intelligence. To better facilitate research on LLMs, many open-source LLMs, such as Llama 2 and Falcon, have recently been proposed and ga… ▽ More

    Submitted 22 December, 2023; originally announced December 2023.

  31. arXiv:2312.13611  [pdf, other]

    cs.LG cs.NI eess.SP

    Topology Learning for Heterogeneous Decentralized Federated Learning over Unreliable D2D Networks

    Authors: Zheshun Wu, Zenglin Xu, Dun Zeng, Junfan Li, Jie Liu

    Abstract: With the proliferation of intelligent mobile devices in wireless device-to-device (D2D) networks, decentralized federated learning (DFL) has attracted significant interest. Compared to centralized federated learning (CFL), DFL mitigates the risk of central server failures due to communication bottlenecks. However, DFL faces several challenges, such as the severe heterogeneity of data distributions… ▽ More

    Submitted 10 March, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: To appear in IEEE Transactions on Vehicular Technology

  32. arXiv:2312.08402  [pdf, other]

    cs.LG cs.AI

    LDM$^2$: A Large Decision Model Imitating Human Cognition with Dynamic Memory Enhancement

    Authors: Xingjin Wang, Linjing Li, Daniel Zeng

    Abstract: With the rapid development of large language models (LLMs), it is highly demanded that LLMs can be adopted to make decisions to enable the artificial general intelligence. Most approaches leverage manually crafted examples to prompt the LLMs to imitate the decision process of human. However, designing optimal prompts is difficult and the patterned prompts can hardly be generalized to more complex… ▽ More

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: Findings of the Association for Computational Linguistics: EMNLP 2023

  33. arXiv:2312.07401  [pdf, other]

    cs.AI

    On Diversified Preferences of Large Language Model Alignment

    Authors: Dun Zeng, Yong Dai, Pengyu Cheng, Longyue Wang, Tianhao Hu, Wanshun Chen, Nan Du, Zenglin Xu

    Abstract: Aligning large language models (LLMs) with human preferences has been recognized as the key to improving LLMs' interaction quality. However, in this pluralistic world, human preferences can be diversified due to annotators' different tastes, which hinders the effectiveness of LLM alignment methods. This paper presents the first quantitative analysis of commonly used human feedback datasets to inve… ▽ More

    Submitted 17 April, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: preprint

  34. arXiv:2312.07331  [pdf, other]

    cs.LG cs.CV cs.HC

    Coupled Confusion Correction: Learning from Crowds with Sparse Annotations

    Authors: Hansong Zhang, Shikun Li, Dan Zeng, Chenggang Yan, Shiming Ge

    Abstract: As the size of datasets grows, accurately annotating them is becoming more impractical due to the cost in both time and money. Therefore, crowd-sourcing has been widely adopted to alleviate the cost of collecting labels, which also inevitably introduces label noise and eventually degrades the performance of the model. To learn from crowd-sourcing annotations, model… ▽ More

    Submitted 20 February, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: This work has been accepted by AAAI-24

  35. arXiv:2312.05572  [pdf, other]

    cs.CV

    R2-Talker: Realistic Real-Time Talking Head Synthesis with Hash Grid Landmarks Encoding and Progressive Multilayer Conditioning

    Authors: Zhiling Ye, LiangGuo Zhang, Dingheng Zeng, Quan Lu, Ning Jiang

    Abstract: Dynamic NeRFs have recently garnered growing attention for 3D talking portrait synthesis. Despite advances in rendering speed and visual quality, challenges persist in enhancing efficiency and effectiveness. We present R2-Talker, an efficient and effective framework enabling realistic real-time talking head synthesis. Specifically, using multi-resolution hash grids, we introduce a novel approach f… ▽ More

    Submitted 9 December, 2023; originally announced December 2023.

  36. arXiv:2312.05568  [pdf, other]

    cs.LG cs.AI

    Sparse Variational Student-t Processes

    Authors: Jian Xu, Delu Zeng

    Abstract: The theory of Bayesian learning incorporates the use of Student-t Processes to model heavy-tailed distributions and datasets with outliers. However, despite Student-t Processes having a similar computational complexity as Gaussian Processes, there has been limited emphasis on the sparse representation of this model. This is mainly due to the increased difficulty in modeling and computation compare… ▽ More

    Submitted 9 December, 2023; originally announced December 2023.

  37. arXiv:2311.15436  [pdf, other]

    cs.CL

    Learning to Skip for Language Modeling

    Authors: Dewen Zeng, Nan Du, Tao Wang, Yuanzhong Xu, Tao Lei, Zhifeng Chen, Claire Cui

    Abstract: Overparameterized large-scale language models have impressive generalization performance of in-context few-shot learning. However, most language models allocate the same amount of parameters or computation to each token, disregarding the complexity or importance of the input data. We argue that in language model pretraining, a variable amount of computation should be assigned to different tokens,… ▽ More

    Submitted 26 November, 2023; originally announced November 2023.

  38. arXiv:2311.10341  [pdf, other]

    cs.LG cs.AI

    Federated Knowledge Graph Completion via Latent Embedding Sharing and Tensor Factorization

    Authors: Maolin Wang, Dun Zeng, Zenglin Xu, Ruocheng Guo, Xiangyu Zhao

    Abstract: Knowledge graphs (KGs), which consist of triples, are inherently incomplete and always require completion procedure to predict missing triples. In real-world scenarios, KGs are distributed across clients, complicating completion tasks due to privacy restrictions. Many frameworks have been proposed to address the issue of federated knowledge graph completion. However, the existing frameworks, inclu… ▽ More

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: Accepted by ICDM 2023

  39. arXiv:2310.13451  [pdf, other]

    cs.SD cs.CV cs.IR cs.MM eess.AS

    Two-Stage Triplet Loss Training with Curriculum Augmentation for Audio-Visual Retrieval

    Authors: Donghuo Zeng, Kazushi Ikeda

    Abstract: The cross-modal retrieval model leverages the potential of triple loss optimization to learn robust embedding spaces. However, existing methods often train these models in a singular pass, overlooking the distinction between semi-hard and hard triples in the optimization process. The oversight of not distinguishing between semi-hard and hard triples leads to suboptimal model performance. In this p… ▽ More

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: 8 pages, 6 figures

  40. arXiv:2310.09772  [pdf, other]

    cs.CL

    Rethinking Relation Classification with Graph Meaning Representations

    Authors: Li Zhou, Wenyu Chen, Dingyi Zeng, Malu Zhang, Daniel Hershcovich

    Abstract: In the field of natural language understanding, the intersection of neural models and graph meaning representations (GMRs) remains a compelling area of research. Despite the growing interest, a critical gap persists in understanding the exact influence of GMRs, particularly concerning relation extraction tasks. Addressing this, we introduce DAGNN-plus, a simple and parameter-efficient neural archi… ▽ More

    Submitted 27 December, 2023; v1 submitted 15 October, 2023; originally announced October 2023.

    Comments: 10 pages

  41. arXiv:2310.07171  [pdf, other]

    cs.LG cs.IT

    Advocating for the Silent: Enhancing Federated Generalization for Non-Participating Clients

    Authors: Zheshun Wu, Zenglin Xu, Dun Zeng, Qifan Wang, Jie Liu

    Abstract: Federated Learning (FL) has surged in prominence due to its capability of collaborative model training without direct data sharing. However, the vast disparity in local data distributions among clients, often termed the Non-Independent Identically Distributed (Non-IID) challenge, poses a significant hurdle to FL's generalization efficacy. The scenario becomes even more complex when not all clients… ▽ More

    Submitted 3 March, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  42. arXiv:2310.05991  [pdf, other]

    cs.CL

    Enhancing Document-level Event Argument Extraction with Contextual Clues and Role Relevance

    Authors: Wanlong Liu, Shaohuan Cheng, Dingyi Zeng, Hong Qu

    Abstract: Document-level event argument extraction poses new challenges of long input and cross-sentence inference compared to its sentence-level counterpart. However, most prior works focus on capturing the relations between candidate arguments and the event trigger in each event, ignoring two crucial points: a) non-argument contextual clue information; b) the relevance among argument roles. In this paper,… ▽ More

    Submitted 19 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: Findings of ACL2023, correct some mistakes. arXiv admin note: text overlap with arXiv:2310.05116

  43. arXiv:2310.05116  [pdf, other]

    cs.CL cs.IR

    Utilizing Contextual Clues and Role Correlations for Enhancing Document-level Event Argument Extraction

    Authors: Wanlong Liu, Dingyi Zeng, Li Zhou, Yichen Xiao, Weishan Kong, Malu Zhang, Shaohuan Cheng, Hongyang Zhao, Wenyu Chen

    Abstract: Document-level event argument extraction is a crucial yet challenging task within the field of information extraction. Current mainstream approaches primarily focus on the information interaction between event triggers and their arguments, facing two limitations: insufficient context interaction and the ignorance of event correlations. Here, we introduce a novel framework named CARLG (Contextual A…

    Submitted 3 April, 2024; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: pre-submission

  44. arXiv:2310.02702  [pdf, other]

    cs.LG

    FedAWARE: Maximizing Gradient Diversity for Heterogeneous Federated Server-side Optimization

    Authors: Dun Zeng, Zenglin Xu, Yu Pan, Qifan Wang, Xiaoying Tang

    Abstract: Federated learning (FL) is a distributed learning framework where numerous clients collaborate with a central server to train a model without sharing local data. However, the standard federated optimization in real-world applications faces both statistical and system heterogeneity challenges, which result in unfavorable convergence behavior. The previous works attempted to modify the local trainin…

    Submitted 24 May, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: Under review

  45. arXiv:2310.02698  [pdf, other]

    cs.LG

    Enhanced Federated Optimization: Adaptive Unbiased Sampling with Reduced Variance

    Authors: Dun Zeng, Zenglin Xu, Yu Pan, Xu Luo, Qifan Wang, Xiaoying Tang

    Abstract: Federated Learning (FL) is a distributed learning paradigm to train a global model across multiple devices without collecting local data. In FL, a server typically selects a subset of clients for each training round to optimize resource usage. Central to this process is the technique of unbiased client sampling, which ensures a representative selection of clients. Current methods primarily utilize…

    Submitted 4 February, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: Under review

  46. arXiv:2309.15875  [pdf, other]

    cs.LG cs.AI

    STAG: Enabling Low Latency and Low Staleness of GNN-based Services with Dynamic Graphs

    Authors: Jiawen Wang, Quan Chen, Deze Zeng, Zhuo Song, Chen Chen, Minyi Guo

    Abstract: Many emerging user-facing services adopt Graph Neural Networks (GNNs) to improve serving accuracy. When the graph used by a GNN model changes, representations (embedding) of nodes in the graph should be updated accordingly. However, the node representation update is too slow, resulting in either long response latency of user queries (the inference is performed after the update completes) or high s…

    Submitted 27 September, 2023; originally announced September 2023.

  47. VideoAdviser: Video Knowledge Distillation for Multimodal Transfer Learning

    Authors: Yanan Wang, Donghuo Zeng, Shinya Wada, Satoshi Kurihara

    Abstract: Multimodal transfer learning aims to transform pretrained representations of diverse modalities into a common domain space for effective multimodal fusion. However, conventional systems are typically built on the assumption that all modalities exist, and the lack of modalities always leads to poor inference performance. Furthermore, extracting pretrained embeddings for all modalities is computatio…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: Accepted by IEEE Access

    Journal ref: in IEEE Access, vol. 11, pp. 51229-51240, 2023

  48. arXiv:2309.12658  [pdf, other]

    cs.LG stat.ML

    Neural Operator Variational Inference based on Regularized Stein Discrepancy for Deep Gaussian Processes

    Authors: Jian Xu, Shian Du, Junmei Yang, Qianli Ma, Delu Zeng

    Abstract: Deep Gaussian Process (DGP) models offer a powerful nonparametric approach for Bayesian inference, but exact inference is typically intractable, motivating the use of various approximations. However, existing approaches, such as mean-field Gaussian assumptions, limit the expressiveness and efficacy of DGP models, while stochastic approximation can be computationally expensive. To tackle these chal…

    Submitted 22 September, 2023; originally announced September 2023.

  49. arXiv:2309.09222  [pdf, other]

    cs.LG stat.ML

    Data-driven Modeling and Inference for Bayesian Gaussian Process ODEs via Double Normalizing Flows

    Authors: Jian Xu, Shian Du, Junmei Yang, Xinghao Ding, John Paisley, Delu Zeng

    Abstract: Recently, Gaussian processes have been used to model the vector field of continuous dynamical systems, referred to as GPODEs, which are characterized by a probabilistic ODE equation. Bayesian inference for these models has been extensively studied and applied in tasks such as time series prediction. However, the use of standard GPs with basic kernels like squared exponential kernels has been commo…

    Submitted 2 January, 2024; v1 submitted 17 September, 2023; originally announced September 2023.

  50. arXiv:2308.11450  [pdf, other]

    cs.CV

    Towards Discriminative Representations with Contrastive Instances for Real-Time UAV Tracking

    Authors: Dan Zeng, Mingliang Zou, Xucheng Wang, Shuiwang Li

    Abstract: Maintaining high efficiency and high precision are two fundamental challenges in UAV tracking due to the constraints of computing resources, battery capacity, and UAV maximum load. Discriminative correlation filters (DCF)-based trackers can yield high efficiency on a single CPU but with inferior precision. Lightweight deep learning (DL)-based trackers can achieve a good balance between efficiency…

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2308.10262