Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content

Showing 1–50 of 127 results for author: Zeng, L

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.20898  [pdf, other

    cs.SE

    ThinkRepair: Self-Directed Automated Program Repair

    Authors: Xin Yin, Chao Ni, Shaohua Wang, Zhenhao Li, Limin Zeng, Xiaohu Yang

    Abstract: Though many approaches have been proposed for Automated Program Repair (APR) and indeed achieved remarkable performance, they still have limitations in fixing bugs that require analyzing and reasoning about the logic of the buggy program. Recently, large language models (LLMs) instructed by prompt engineering have attracted much attention for their powerful ability to address many kinds of tasks i… ▽ More

    Submitted 30 July, 2024; originally announced July 2024.

    Comments: Accepted By ISSTA'24

  2. arXiv:2407.19694  [pdf

    cs.CV

    Structural damage detection via hierarchical damage information with volumetric assessment

    Authors: Isaac Osei Agyemang, Jianwen Chen, Liaoyuan Zeng, Isaac Adjei-Mensah, Daniel Acheampong, Gordon Owusu Boateng, Adu Asare Baffour

    Abstract: Image environments and noisy labels hinder deep learning-based inference models in structural damage detection. Post-detection, there is the challenge of reliance on manual assessments of detected damages. As a result, Guided-DetNet, characterized by Generative Attention Module (GAM), Hierarchical Elimination Algorithm (HEA), and Volumetric Contour Visual Assessment (VCVA), is proposed to mitigate… ▽ More

    Submitted 29 July, 2024; originally announced July 2024.

  3. arXiv:2407.18054  [pdf, other

    eess.IV cs.CV

    LKCell: Efficient Cell Nuclei Instance Segmentation with Large Convolution Kernels

    Authors: Ziwei Cui, Jingfeng Yao, Lunbin Zeng, Juan Yang, Wenyu Liu, Xinggang Wang

    Abstract: The segmentation of cell nuclei in tissue images stained with the blood dye hematoxylin and eosin (H$\&$E) is essential for various clinical applications and analyses. Due to the complex characteristics of cellular morphology, a large receptive field is considered crucial for generating high-quality segmentation. However, previous methods face challenges in achieving a balance between the receptiv… ▽ More

    Submitted 25 July, 2024; originally announced July 2024.

  4. arXiv:2407.15320  [pdf, other

    cs.DC cs.AI cs.LG cs.NI

    Edge Graph Intelligence: Reciprocally Empowering Edge Networks with Graph Intelligence

    Authors: Liekang Zeng, Shengyuan Ye, Xu Chen, Xiaoxi Zhang, Ju Ren, Jian Tang, Yang Yang, Xuemin, Shen

    Abstract: Recent years have witnessed a thriving growth of computing facilities connected at the network edge, cultivating edge computing networks as a fundamental infrastructure for supporting miscellaneous intelligent services. Meanwhile, Artificial Intelligence frontiers have extrapolated Machine Learning to the graph domain and promoted Graph Intelligence (GI), which unlocks unprecedented ability in lea… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: 38 pages, 14 figures

  5. arXiv:2407.12857  [pdf, other

    cs.CL cs.DL cs.IR

    Automated Peer Reviewing in Paper SEA: Standardization, Evaluation, and Analysis

    Authors: Jianxiang Yu, Zichen Ding, Jiaqi Tan, Kangyang Luo, Zhenmin Weng, Chenghua Gong, Long Zeng, Renjing Cui, Chengcheng Han, Qiushi Sun, Zhiyong Wu, Yunshi Lan, Xiang Li

    Abstract: In recent years, the rapid increase in scientific papers has overwhelmed traditional review mechanisms, resulting in varying quality of publications. Although existing methods have explored the capabilities of Large Language Models (LLMs) for automated scientific reviewing, their generated contents are often generic or partial. To address the issues above, we introduce an automated paper reviewing… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

  6. arXiv:2407.08348  [pdf, other

    cs.AI cs.CL cs.LG

    Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On

    Authors: Liang Zeng, Liangjun Zhong, Liang Zhao, Tianwen Wei, Liu Yang, Jujie He, Cheng Cheng, Rui Hu, Yang Liu, Shuicheng Yan, Han Fang, Yahui Zhou

    Abstract: In this paper, we investigate the underlying factors that potentially enhance the mathematical reasoning capabilities of large language models (LLMs). We argue that the data scaling law for math reasoning capabilities in modern LLMs is far from being saturated, highlighting how the model's quality improves with increases in data quantity. To support this claim, we introduce the Skywork-Math model… ▽ More

    Submitted 17 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

  7. arXiv:2406.19791  [pdf, other

    cs.RO

    Mobile Robot Oriented Large-Scale Indoor Dataset for Dynamic Scene Understanding

    Authors: Yifan Tang, Cong Tai, Fangxing Chen, Wanting Zhang, Tao Zhang, Xueping Liu, Yongjin Liu, Long Zeng

    Abstract: Most existing robotic datasets capture static scene data and thus are limited in evaluating robots' dynamic performance. To address this, we present a mobile robot oriented large-scale indoor dataset, denoted as THUD (Tsinghua University Dynamic) robotic dataset, for training and evaluating their dynamic scene understanding algorithms. Specifically, the THUD dataset construction is first detailed,… ▽ More

    Submitted 30 June, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

    Comments: This version has been accepted by ICRA2024 and the dataset has been published, where the link can be found in the paper

    Journal ref: IEEE International Conference on Robotics & Automation,2024

  8. arXiv:2406.19613  [pdf, other

    cs.DC

    Online Optimization of DNN Inference Network Utility in Collaborative Edge Computing

    Authors: Rui Li, Tao Ouyang, Liekang Zeng, Guocheng Liao, Zhi Zhou, Xu Chen

    Abstract: Collaborative Edge Computing (CEC) is an emerging paradigm that collaborates heterogeneous edge devices as a resource pool to compute DNN inference tasks in proximity such as edge video analytics. Nevertheless, as the key knob to improve network utility in CEC, existing works mainly focus on the workload routing strategies among edge devices with the aim of minimizing the routing cost, remaining a… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE/ACM TRANSACTIONS ON NETWORKING (ToN)

  9. arXiv:2406.14283  [pdf, other

    cs.AI

    Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning

    Authors: Chaojie Wang, Yanchen Deng, Zhiyi Lyu, Liang Zeng, Jujie He, Shuicheng Yan, Bo An

    Abstract: Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. However, the auto-regressive generation process makes LLMs prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning. In this paper, by casting multi-step reasoning of LLMs as a heuristic search problem, we aim to alleviate the pathology by introducing… ▽ More

    Submitted 22 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  10. arXiv:2406.08877  [pdf, other

    cs.CV cs.AI

    EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding

    Authors: Yuan-Ming Li, Wei-Jin Huang, An-Lan Wang, Ling-An Zeng, Jing-Ke Meng, Wei-Shi Zheng

    Abstract: We present EgoExo-Fitness, a new full-body action understanding dataset, featuring fitness sequence videos recorded from synchronized egocentric and fixed exocentric (third-person) cameras. Compared with existing full-body action understanding datasets, EgoExo-Fitness not only contains videos from first-person perspectives, but also provides rich annotations. Specifically, two-level temporal bound… ▽ More

    Submitted 16 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted by ECCV2024

  11. arXiv:2406.06563  [pdf, other

    cs.CL cs.AI

    Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

    Authors: Tianwen Wei, Bo Zhu, Liang Zhao, Cheng Cheng, Biye Li, Weiwei Lü, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Liang Zeng, Xiaokun Wang, Yutuan Ma, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou

    Abstract: In this technical report, we introduce the training methodologies implemented in the development of Skywork-MoE, a high-performance mixture-of-experts (MoE) large language model (LLM) with 146 billion parameters and 16 experts. It is initialized from the pre-existing dense checkpoints of our Skywork-13B model. We explore the comparative effectiveness of upcycling versus training from scratch initi… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

  12. arXiv:2406.04603  [pdf, ps, other

    cs.CV

    Simplify Implant Depth Prediction as Video Grounding: A Texture Perceive Implant Depth Prediction Network

    Authors: Xinquan Yang, Xuguang Li, Xiaoling Luo, Leilei Zeng, Yudi Zhang, Linlin Shen, Yongqiang Deng

    Abstract: Surgical guide plate is an important tool for the dental implant surgery. However, the design process heavily relies on the dentist to manually simulate the implant angle and depth. When deep neural networks have been applied to assist the dentist quickly locates the implant position, most of them are not able to determine the implant depth. Inspired by the video grounding task which localizes the… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Journal ref: MICCAI'2024

  13. arXiv:2406.00605  [pdf, other

    cs.CL cs.AI

    LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models

    Authors: Liang Zhao, Tianwen Wei, Liang Zeng, Cheng Cheng, Liu Yang, Peng Cheng, Lijie Wang, Chenxia Li, Xuejie Wu, Bo Zhu, Yimeng Gan, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou

    Abstract: We introduce LongSkywork, a long-context Large Language Model (LLM) capable of processing up to 200,000 tokens. We provide a training recipe for efficiently extending context length of LLMs. We identify that the critical element in enhancing long-context processing capability is to incorporate a long-context SFT stage following the standard SFT stage. A mere 200 iterations can convert the standard… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

  14. arXiv:2406.00023  [pdf, other

    cs.CL

    LocMoE+: Enhanced Router with Token Feature Awareness for Efficient LLM Pre-Training

    Authors: Jing Li, Zhijie Sun, Dachao Lin, Xuan He, Yi Lin, Binfan Zheng, Li Zeng, Rongqian Zhao, Xin Chen

    Abstract: Mixture-of-Experts (MoE) architectures have recently gained increasing popularity within the domain of large language models (LLMs) due to their ability to significantly reduce training and inference overhead. However, MoE architectures face challenges, such as significant disparities in the number of tokens assigned to each expert and a tendency toward homogenization among experts, which adversel… ▽ More

    Submitted 23 May, 2024; originally announced June 2024.

  15. GaussianPrediction: Dynamic 3D Gaussian Prediction for Motion Extrapolation and Free View Synthesis

    Authors: Boming Zhao, Yuan Li, Ziyu Sun, Lin Zeng, Yujun Shen, Rui Ma, Yinda Zhang, Hujun Bao, Zhaopeng Cui

    Abstract: Forecasting future scenarios in dynamic environments is essential for intelligent decision-making and navigation, a challenge yet to be fully realized in computer vision and robotics. Traditional approaches like video prediction and novel-view synthesis either lack the ability to forecast from arbitrary viewpoints or to predict temporal dynamics. In this paper, we introduce GaussianPrediction, a n… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted to SIGGRAPH 2024 Conference. Project Page: https://zju3dv.github.io/gaussian-prediction/

  16. arXiv:2405.18739  [pdf, other

    cs.NI eess.SP

    FlocOff: Data Heterogeneity Resilient Federated Learning with Communication-Efficient Edge Offloading

    Authors: Mulei Ma, Chenyu Gong, Liekang Zeng, Yang Yang, Liantao Wu

    Abstract: Federated Learning (FL) has emerged as a fundamental learning paradigm to harness massive data scattered at geo-distributed edge devices in a privacy-preserving way. Given the heterogeneous deployment of edge devices, however, their data are usually Non-IID, introducing significant challenges to FL including degraded training accuracy, intensive communication costs, and high computing complexity.… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  17. arXiv:2405.18435  [pdf, other

    eess.IV cs.CV

    QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge

    Authors: Hongwei Bran Li, Fernando Navarro, Ivan Ezhov, Amirhossein Bayat, Dhritiman Das, Florian Kofler, Suprosanna Shit, Diana Waldmannstetter, Johannes C. Paetzold, Xiaobin Hu, Benedikt Wiestler, Lucas Zimmer, Tamaz Amiranashvili, Chinmay Prabhakar, Christoph Berger, Jonas Weidner, Michelle Alonso-Basant, Arif Rashid, Ujjwal Baid, Wesam Adel, Deniz Ali, Bhakti Baheti, Yingbin Bai, Ishaan Bhatt, Sabri Can Cetindag , et al. (55 additional authors not shown)

    Abstract: Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the de… ▽ More

    Submitted 24 June, 2024; v1 submitted 19 March, 2024; originally announced May 2024.

    Comments: initial technical report

  18. arXiv:2405.17245  [pdf, other

    cs.DC cs.AI cs.LG cs.NI

    Galaxy: A Resource-Efficient Collaborative Edge AI System for In-situ Transformer Inference

    Authors: Shengyuan Ye, Jiangsu Du, Liekang Zeng, Wenzhong Ou, Xiaowen Chu, Yutong Lu, Xu Chen

    Abstract: Transformer-based models have unlocked a plethora of powerful intelligent applications at the edge, such as voice assistant in smart home. Traditional deployment approaches offload the inference workloads to the remote cloud server, which would induce substantial pressure on the backbone network as well as raise users' privacy concerns. To address that, in-situ inference has been recently recogniz… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE International Conference on Computer Communications 2024

  19. arXiv:2405.03924  [pdf, other

    cs.DB cs.AI cs.LG

    NeurDB: An AI-powered Autonomous Data System

    Authors: Beng Chin Ooi, Shaofeng Cai, Gang Chen, Yanyan Shen, Kian-Lee Tan, Yuncheng Wu, Xiaokui Xiao, Naili Xing, Cong Yue, Lingze Zeng, Meihui Zhang, Zhanhao Zhao

    Abstract: In the wake of rapid advancements in artificial intelligence (AI), we stand on the brink of a transformative leap in data systems. The imminent fusion of AI and DB (AIxDB) promises a new generation of data systems, which will relieve the burden on end-users across all industry sectors by featuring AI-enhanced functionalities, such as personalized and automated in-database AI-powered analytics, sel… ▽ More

    Submitted 4 July, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  20. arXiv:2405.00568  [pdf, other

    cs.DB cs.AI

    Powering In-Database Dynamic Model Slicing for Structured Data Analytics

    Authors: Lingze Zeng, Naili Xing, Shaofeng Cai, Gang Chen, Beng Chin Ooi, Jian Pei, Yuncheng Wu

    Abstract: Relational database management systems (RDBMS) are widely used for the storage and retrieval of structured data. To derive insights beyond statistical aggregation, we typically have to extract specific subdatasets from the database using conventional database operations, and then apply deep neural networks (DNN) training and inference on these respective subdatasets in a separate machine learning… ▽ More

    Submitted 1 May, 2024; originally announced May 2024.

  21. arXiv:2404.17766  [pdf, other

    cs.LG cs.AI cs.DC cs.NI

    Implementation of Big AI Models for Wireless Networks with Collaborative Edge Computing

    Authors: Liekang Zeng, Shengyuan Ye, Xu Chen, Yang Yang

    Abstract: Big Artificial Intelligence (AI) models have emerged as a crucial element in various intelligent applications at the edge, such as voice assistants in smart homes and autonomous robotics in smart factories. Training big AI models, e.g., for personalized fine-tuning and continual model refinement, poses significant challenges to edge devices due to the inherent conflict between limited computing re… ▽ More

    Submitted 26 April, 2024; originally announced April 2024.

  22. arXiv:2404.13692  [pdf, other

    cs.CV

    A sustainable development perspective on urban-scale roof greening priorities and benefits

    Authors: Jie Shao, Wei Yao, Lei Luo, Linzhou Zeng, Zhiyi He, Puzuo Wang, Huadong Guo

    Abstract: Greenspaces are tightly linked to human well-being. Yet, rapid urbanization has exacerbated greenspace exposure inequality and declining human life quality. Roof greening has been recognized as an effective strategy to mitigate these negative impacts. Understanding priorities and benefits is crucial to promoting green roofs. Here, using geospatial big data, we conduct an urban-scale assessment of… ▽ More

    Submitted 21 April, 2024; originally announced April 2024.

  23. arXiv:2404.08681  [pdf, other

    cs.CL

    EFSA: Towards Event-Level Financial Sentiment Analysis

    Authors: Tianyu Chen, Yiming Zhang, Guoxin Yu, Dapeng Zhang, Li Zeng, Qing He, Xiang Ao

    Abstract: In this paper, we extend financial sentiment analysis~(FSA) to event-level since events usually serve as the subject of the sentiment in financial text. Though extracting events from the financial text may be conducive to accurate sentiment predictions, it has specialized challenges due to the lengthy and discontinuity of events in a financial text. To this end, we reconceptualize the event extrac… ▽ More

    Submitted 8 April, 2024; originally announced April 2024.

  24. arXiv:2404.04661  [pdf, other

    cs.LG cs.AI

    Transform then Explore: a Simple and Effective Technique for Exploratory Combinatorial Optimization with Reinforcement Learning

    Authors: Tianle Pu, Changjun Fan, Mutian Shen, Yizhou Lu, Li Zeng, Zohar Nussinov, Chao Chen, Zhong Liu

    Abstract: Many complex problems encountered in both production and daily life can be conceptualized as combinatorial optimization problems (COPs) over graphs. Recent years, reinforcement learning (RL) based models have emerged as a promising direction, which treat the COPs solving as a heuristic learning problem. However, current finite-horizon-MDP based RL models have inherent limitations. They are not all… ▽ More

    Submitted 6 April, 2024; originally announced April 2024.

  25. arXiv:2404.01875  [pdf, other

    eess.SP cs.DC cs.IT cs.LG

    Satellite Federated Edge Learning: Architecture Design and Convergence Analysis

    Authors: Yuanming Shi, Li Zeng, Jingyang Zhu, Yong Zhou, Chunxiao Jiang, Khaled B. Letaief

    Abstract: The proliferation of low-earth-orbit (LEO) satellite networks leads to the generation of vast volumes of remote sensing data which is traditionally transferred to the ground server for centralized processing, raising privacy and bandwidth concerns. Federated edge learning (FEEL), as a distributed machine learning approach, has the potential to address these challenges by sharing only model paramet… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 16 pages, 15 figures

  26. arXiv:2404.01050  [pdf, other

    cs.CV cs.GR cs.HC cs.LG

    Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation

    Authors: Haofeng Liu, Chenshu Xu, Yifei Yang, Lihua Zeng, Shengfeng He

    Abstract: Point-based interactive editing serves as an essential tool to complement the controllability of existing generative models. A concurrent work, DragDiffusion, updates the diffusion latent map in response to user inputs, causing global latent map alterations. This results in imprecise preservation of the original content and unsuccessful editing due to gradient vanishing. In contrast, we present Dr… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  27. arXiv:2403.09317  [pdf, other

    cs.CV cs.AI

    SD-Net: Symmetric-Aware Keypoint Prediction and Domain Adaptation for 6D Pose Estimation In Bin-picking Scenarios

    Authors: Ding-Tao Huang, En-Te Lin, Lipeng Chen, Li-Fu Liu, Long Zeng

    Abstract: Despite the success in 6D pose estimation in bin-picking scenarios, existing methods still struggle to produce accurate prediction results for symmetry objects and real world scenarios. The primary bottlenecks include 1) the ambiguity keypoints caused by object symmetries; 2) the domain gap between real and synthetic data. To circumvent these problem, we propose a new 6D pose estimation network wi… ▽ More

    Submitted 14 March, 2024; originally announced March 2024.

  28. KeNet:Knowledge-enhanced Doc-Label Attention Network for Multi-label text classification

    Authors: Bo Li, Yuyan Chen, Liang Zeng

    Abstract: Multi-Label Text Classification (MLTC) is a fundamental task in the field of Natural Language Processing (NLP) that involves the assignment of multiple labels to a given text. MLTC has gained significant importance and has been widely applied in various domains such as topic recognition, recommendation systems, sentiment analysis, and information retrieval. However, traditional machine learning an… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted in ICASSP 2024

  29. arXiv:2403.00331  [pdf, other

    cs.DC

    WindGP: Efficient Graph Partitioning on Heterogenous Machines

    Authors: Li Zeng, Haohan Huang, Binfan Zheng, Kang Yang, Shengcheng Shao, Jinhua Zhou, Jun Xie, Rongqian Zhao, Xin Chen

    Abstract: Graph Partitioning is widely used in many real-world applications such as fraud detection and social network analysis, in order to enable the distributed graph computing on large graphs. However, existing works fail to balance the computation cost and communication cost on machines with different power (including computing capability, network bandwidth and memory size), as they only consider repli… ▽ More

    Submitted 6 March, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Comments: 19 pages, 15 figures, 18 tables

  30. arXiv:2402.12772  [pdf, other

    cs.HC

    GazePrompt: Enhancing Low Vision People's Reading Experience with Gaze-Aware Augmentations

    Authors: Ru Wang, Zach Potter, Yun Ho, Daniel Killough, Linxiu Zeng, Sanbrita Mondal, Yuhang Zhao

    Abstract: Reading is a challenging task for low vision people. While conventional low vision aids (e.g., magnification) offer certain support, they cannot fully address the difficulties faced by low vision users, such as locating the next line and distinguishing similar words. To fill this gap, we present GazePrompt, a gaze-aware reading aid that provides timely and targeted visual and audio augmentations b… ▽ More

    Submitted 22 February, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    ACM Class: H.5.2

  31. arXiv:2402.09444  [pdf, other

    eess.SP cs.AI cs.CV

    Multimodal Action Quality Assessment

    Authors: Ling-An Zeng, Wei-Shi Zheng

    Abstract: Action quality assessment (AQA) is to assess how well an action is performed. Previous works perform modelling by only the use of visual information, ignoring audio information. We argue that although AQA is highly dependent on visual information, the audio is useful complementary information for improving the score regression accuracy, especially for sports with background music, such as figure s… ▽ More

    Submitted 20 February, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: IEEE Transactions on Image Processing 2024

    ACM Class: I.2.10

  32. arXiv:2401.13920  [pdf, other

    cs.LG cs.AI cs.CL

    LocMoE: A Low-Overhead MoE for Large Language Model Training

    Authors: Jing Li, Zhijie Sun, Xuan He, Li Zeng, Yi Lin, Entong Li, Binfan Zheng, Rongqian Zhao, Xin Chen

    Abstract: The Mixtures-of-Experts (MoE) model is a widespread distributed and integrated learning method for large language models (LLM), which is favored due to its ability to sparsify and expand models efficiently. However, the performance of MoE is limited by load imbalance and high latency of All-to-All communication, along with relatively redundant computation owing to large expert capacity. Load imbal… ▽ More

    Submitted 23 May, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: 1. Update the font size of all figures. 2. Update the name of the proposed layer Grouped Average Pooling (GrAP). 3. Change the order of the Section Contribution Statement

  33. arXiv:2401.08472  [pdf, other

    cs.CV

    Instilling Multi-round Thinking to Text-guided Image Generation

    Authors: Lidong Zeng, Zhedong Zheng, Yinwei Wei, Tat-seng Chua

    Abstract: This paper delves into the text-guided image editing task, focusing on modifying a reference image according to user-specified textual feedback to embody specific attributes. Despite recent advancements, a persistent challenge remains that the single-round generation often overlooks crucial details, particularly in the realm of fine-grained changes like shoes or sleeves. This issue compounds over… ▽ More

    Submitted 9 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: 14 pages, 6 figures

  34. arXiv:2312.15859  [pdf, other

    cs.CV

    SCPMan: Shape Context and Prior Constrained Multi-scale Attention Network for Pancreatic Segmentation

    Authors: Leilei Zeng, Xuechen Li, Xinquan Yang, Linlin Shen, Song Wu

    Abstract: Due to the poor prognosis of Pancreatic cancer, accurate early detection and segmentation are critical for improving treatment outcomes. However, pancreatic segmentation is challenged by blurred boundaries, high shape variability, and class imbalance. To tackle these problems, we propose a multiscale attention network with shape context and prior constraint for robust pancreas segmentation. Specif… ▽ More

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: 9 pages,6 figures

  35. arXiv:2312.10618  [pdf

    stat.ME cs.LG stat.ML

    Sparse Learning and Class Probability Estimation with Weighted Support Vector Machines

    Authors: Liyun Zeng, Hao Helen Zhang

    Abstract: Classification and probability estimation have broad applications in modern machine learning and data science applications, including biology, medicine, engineering, and computer science. The recent development of a class of weighted Support Vector Machines (wSVMs) has shown great values in robustly predicting the class probability and classification for various problems with high accuracy. The cu… ▽ More

    Submitted 17 December, 2023; originally announced December 2023.

  36. arXiv:2312.07327  [pdf, other

    cs.CV

    Adaptive Confidence Multi-View Hashing for Multimedia Retrieval

    Authors: Jian Zhu, Yu Cui, Zhangmin Huang, Xingyu Li, Lei Liu, Lingfang Zeng, Li-Rong Dai

    Abstract: The multi-view hash method converts heterogeneous data from multiple views into binary hash codes, which is one of the critical technologies in multimedia retrieval. However, the current methods mainly explore the complementarity among multiple views while lacking confidence learning and fusion. Moreover, in practical application scenarios, the single-view data contain redundant noise. To conduct… ▽ More

    Submitted 16 January, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: accepted by International Conference on Acoustics, Speech and Signal Processing 2024(ICASSP2024)

  37. arXiv:2312.01560  [pdf, ps, other

    cs.SI

    RaftGP: Random Fast Graph Partitioning

    Authors: Yu Gao, Meng Qin, Yibin Ding, Li Zeng, Chaorui Zhang, Weixi Zhang, Wei Han, Rongqian Zhao, Bo Bai

    Abstract: Graph partitioning (GP), a.k.a. community detection, is a classic problem that divides the node set of a graph into densely-connected blocks. Following prior work on the IEEE HPEC Graph Challenge benchmark and recent advances in graph machine learning, we propose a novel RAndom FasT Graph Partitioning (RaftGP) method based on an efficient graph embedding scheme. It uses the Gaussian random project… ▽ More

    Submitted 3 December, 2023; originally announced December 2023.

  38. arXiv:2311.11046  [pdf

    q-bio.QM cs.LG q-bio.NC

    DenseNet and Support Vector Machine classifications of major depressive disorder using vertex-wise cortical features

    Authors: Vladimir Belov, Tracy Erwin-Grabner, Ling-Li Zeng, Christopher R. K. Ching, Andre Aleman, Alyssa R. Amod, Zeynep Basgoze, Francesco Benedetti, Bianca Besteher, Katharina Brosch, Robin Bülow, Romain Colle, Colm G. Connolly, Emmanuelle Corruble, Baptiste Couvy-Duchesne, Kathryn Cullen, Udo Dannlowski, Christopher G. Davey, Annemiek Dols, Jan Ernsting, Jennifer W. Evans, Lukas Fisch, Paola Fuentes-Claramonte, Ali Saffet Gonul, Ian H. Gotlib , et al. (63 additional authors not shown)

    Abstract: Major depressive disorder (MDD) is a complex psychiatric disorder that affects the lives of hundreds of millions of individuals around the globe. Even today, researchers debate if morphological alterations in the brain are linked to MDD, likely due to the heterogeneity of this disorder. The application of deep learning tools to neuroimaging data, capable of capturing complex non-linear patterns, h… ▽ More

    Submitted 18 November, 2023; originally announced November 2023.

  39. arXiv:2311.09269  [pdf, other

    cs.CV cs.AI

    NormNet: Scale Normalization for 6D Pose Estimation in Stacked Scenarios

    Authors: En-Te Lin, Wei-Jie Lv, Ding-Tao Huang, Long Zeng

    Abstract: Existing Object Pose Estimation (OPE) methods for stacked scenarios are not robust to changes in object scale. This paper proposes a new 6DoF OPE network (NormNet) for different scale objects in stacked scenarios. Specifically, each object's scale is first learned with point-wise regression. Then, all objects in the stacked scenario are normalized into the same scale through semantic segmentation… ▽ More

    Submitted 15 November, 2023; originally announced November 2023.

  40. arXiv:2311.08908  [pdf, other

    stat.ME cs.CV

    Robust Brain MRI Image Classification with SIBOW-SVM

    Authors: Liyun Zeng, Hao Helen Zhang

    Abstract: The majority of primary Central Nervous System (CNS) tumors in the brain are among the most aggressive diseases affecting humans. Early detection of brain tumor types, whether benign or malignant, glial or non-glial, is critical for cancer prevention and treatment, ultimately improving human life expectancy. Magnetic Resonance Imaging (MRI) stands as the most effective technique to detect brain tu… ▽ More

    Submitted 15 November, 2023; originally announced November 2023.

  41. arXiv:2310.10957  [pdf, other

    eess.IV cs.CV

    Medical Image Segmentation via Sparse Coding Decoder

    Authors: Long Zeng, Kaigui Wu

    Abstract: Transformers have achieved significant success in medical image segmentation, owing to its capability to capture long-range dependencies. Previous works incorporate convolutional layers into the encoder module of transformers, thereby enhancing their ability to learn local relationships among pixels. However, transformers may suffer from limited generalization capabilities and reduced robustness,… ▽ More

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 8 pages, 1 figures

    MSC Class: 68T07; 68U10 ACM Class: I.4.6; I.4.7; I.5.1

  42. Continual Action Assessment via Task-Consistent Score-Discriminative Feature Distribution Modeling

    Authors: Yuan-Ming Li, Ling-An Zeng, Jing-Ke Meng, Wei-Shi Zheng

    Abstract: Action Quality Assessment (AQA) is a task that tries to answer how well an action is carried out. While remarkable progress has been achieved, existing works on AQA assume that all the training data are visible for training at one time, but do not enable continual learning on assessing new technical actions. In this work, we address such a Continual Learning problem in AQA (Continual-AQA), which u… ▽ More

    Submitted 2 May, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: 16 pages, 8 figures

  43. arXiv:2309.07726  [pdf, other

    cs.RO

    GRID: Scene-Graph-based Instruction-driven Robotic Task Planning

    Authors: Zhe Ni, Xiaoxin Deng, Cong Tai, Xinyue Zhu, Qinghongbing Xie, Weihang Huang, Xiang Wu, Long Zeng

    Abstract: Recent works have shown that Large Language Models (LLMs) can facilitate the grounding of instructions for robotic task planning. Despite this progress, most existing works have primarily focused on utilizing raw images to aid LLMs in understanding environmental information. However, this approach not only limits the scope of observation but also typically necessitates extensive multimodal data co… ▽ More

    Submitted 10 March, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: 8 pages, 10 figures

  44. arXiv:2309.07075   

    cs.CR cs.DC cs.GT

    Chained-DP: Can We Recycle Privacy Budget?

    Authors: Jingyi Li, Guangjing Huang, Liekang Zeng, Lin Chen, Xu Chen

    Abstract: Privacy-preserving vector mean estimation is a crucial primitive in federated analytics. Existing practices usually resort to Local Differentiated Privacy (LDP) mechanisms that inject random noise into users' vectors when communicating with users and the central server. Due to the privacy-utility trade-off, the privacy budget has been widely recognized as the bottleneck resource that requires well… ▽ More

    Submitted 18 September, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

    Comments: Insufficient completion

  45. arXiv:2308.13774  [pdf, other

    cs.CV cs.IR cs.MM

    Central Similarity Multi-View Hashing for Multimedia Retrieval

    Authors: Jian Zhu, Wen Cheng, Yu Cui, Chang Tang, Yuyang Dai, Yong Li, Lingfang Zeng

    Abstract: Hash representation learning of multi-view heterogeneous data is the key to improving the accuracy of multimedia retrieval. However, existing methods utilize local similarity and fall short of deeply fusing the multi-view features, resulting in poor retrieval accuracy. Current methods only use local similarity to train their model. These methods ignore global similarity. Furthermore, most recent w… ▽ More

    Submitted 26 August, 2023; originally announced August 2023.

    Comments: accepted by the Asia Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data (APWeb-WAIM2023)

  46. arXiv:2308.09357  [pdf, other

    cs.CV cs.MM

    Multi-scale Target-Aware Framework for Constrained Image Splicing Detection and Localization

    Authors: Yuxuan Tan, Yuanman Li, Limin Zeng, Jiaxiong Ye, Wei wang, Xia Li

    Abstract: Constrained image splicing detection and localization (CISDL) is a fundamental task of multimedia forensics, which detects splicing operation between two suspected images and localizes the spliced region on both images. Recent works regard it as a deep matching problem and have made significant progress. However, existing frameworks typically perform feature extraction and correlation matching as… ▽ More

    Submitted 21 August, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: accepted by ACMMM2023

  47. arXiv:2308.01609  [pdf, other

    cs.LG

    Feature Noise Boosts DNN Generalization under Label Noise

    Authors: Lu Zeng, Xuan Chen, Xiaoshuang Shi, Heng Tao Shen

    Abstract: The presence of label noise in the training data has a profound impact on the generalization of deep neural networks (DNNs). In this study, we introduce and theoretically demonstrate a simple feature noise method, which directly adds noise to the features of training data, can enhance the generalization of DNNs under label noise. Specifically, we conduct theoretical analyses to reveal that label n… ▽ More

    Submitted 3 August, 2023; originally announced August 2023.

  48. arXiv:2308.01241  [pdf, other

    cs.NE q-bio.NC

    Digital Twin Brain: a simulation and assimilation platform for whole human brain

    Authors: Wenlian Lu, Longbin Zeng, Xin Du, Wenyong Zhang, Shitong Xiang, Huarui Wang, Jiexiang Wang, Mingda Ji, Yubo Hou, Minglong Wang, Yuhao Liu, Zhongyu Chen, Qibao Zheng, Ningsheng Xu, Jianfeng Feng

    Abstract: In this work, we present a computing platform named digital twin brain (DTB) that can simulate spiking neuronal networks of the whole human brain scale and more importantly, a personalized biological brain structure. In comparison to most brain simulations with a homogeneous global structure, we highlight that the sparseness, couplingness and heterogeneity in the sMRI, DTI and PET data of the brai… ▽ More

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: 12 pages, 11 figures

  49. arXiv:2307.09880  [pdf, other

    cs.NI cs.CV cs.DC cs.RO

    A3D: Adaptive, Accurate, and Autonomous Navigation for Edge-Assisted Drones

    Authors: Liekang Zeng, Haowei Chen, Daipeng Feng, Xiaoxi Zhang, Xu Chen

    Abstract: Accurate navigation is of paramount importance to ensure flight safety and efficiency for autonomous drones. Recent research starts to use Deep Neural Networks to enhance drone navigation given their remarkable predictive capability for visual perception. However, existing solutions either run DNN inference tasks on drones in situ, impeded by the limited onboard resource, or offload the computatio… ▽ More

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted by IEEE/ACM Transactions on Networking

  50. arXiv:2307.08243  [pdf, other

    cs.CV

    Uncertainty-aware State Space Transformer for Egocentric 3D Hand Trajectory Forecasting

    Authors: Wentao Bao, Lele Chen, Libing Zeng, Zhong Li, Yi Xu, Junsong Yuan, Yu Kong

    Abstract: Hand trajectory forecasting from egocentric views is crucial for enabling a prompt understanding of human intentions when interacting with AR/VR systems. However, existing methods handle this problem in a 2D image space which is inadequate for 3D real-world applications. In this paper, we set up an egocentric 3D hand trajectory forecasting task that aims to predict hand trajectories in a 3D space… ▽ More

    Submitted 16 September, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: ICCV 2023 Accepted (Camera Ready)