
Showing 1–50 of 520 results for author: Cai, Y

Searching in archive cs. Search in all archives.
  1. arXiv:2406.18616  [pdf, other]

    cs.SE cs.AI cs.CL

    Towards Large Language Model Aided Program Refinement

    Authors: Yufan Cai, Zhe Hou, Xiaokun Luan, David Miguel Sanan Baena, Yun Lin, Jun Sun, Jin Song Dong

    Abstract: Program refinement involves correctness-preserving transformations from formal high-level specification statements into executable programs. Traditional verification tool support for program refinement is highly interactive and lacks automation. On the other hand, the emergence of large language models (LLMs) enables automatic code generations from informal natural language specifications. However…

    Submitted 26 June, 2024; originally announced June 2024.

    ACM Class: K.6.3

  2. arXiv:2406.18050  [pdf, other]

    cs.CV

    A Multi-Stage Goal-Driven Network for Pedestrian Trajectory Prediction

    Authors: Xiuen Wu, Tao Wang, Yuanzheng Cai, Lingyu Liang, George Papageorgiou

    Abstract: Pedestrian trajectory prediction plays a pivotal role in ensuring the safety and efficiency of various applications, including autonomous vehicles and traffic management systems. This paper proposes a novel method for pedestrian trajectory prediction, called multi-stage goal-driven network (MGNet). Diverging from prior approaches relying on stepwise recursive prediction and the singular forecastin…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Paper accepted by 5th International Conference on Computer Vision, Image and Deep Learning (CVIDL 2024)

  3. arXiv:2406.16170  [pdf, other]

    cs.IR cs.AI

    SimCE: Simplifying Cross-Entropy Loss for Collaborative Filtering

    Authors: Xiaodong Yang, Huiyuan Chen, Yuchen Yan, Yuxin Tang, Yuying Zhao, Eric Xu, Yiwei Cai, Hanghang Tong

    Abstract: The learning objective is integral to collaborative filtering systems, where the Bayesian Personalized Ranking (BPR) loss is widely used for learning informative backbones. However, BPR often experiences slow convergence and suboptimal local optima, partially because it only considers one negative item for each positive item, neglecting the potential impacts of other unobserved items. To address t…

    Submitted 23 June, 2024; originally announced June 2024.

  4. arXiv:2406.15819  [pdf, other]

    cs.LG cs.IT cs.NI eess.SP

    Automatic AI Model Selection for Wireless Systems: Online Learning via Digital Twinning

    Authors: Qiushuo Hou, Matteo Zecchin, Sangwoo Park, Yunlong Cai, Guanding Yu, Kaushik Chowdhury, Osvaldo Simeone

    Abstract: In modern wireless network architectures, such as O-RAN, artificial intelligence (AI)-based applications are deployed at intelligent controllers to carry out functionalities like scheduling or power control. The AI "apps" are selected on the basis of contextual information such as network conditions, topology, traffic statistics, and design goals. The mapping between context and AI model parameter…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: submitted for a journal publication

  5. arXiv:2406.14086  [pdf]

    cs.CV cs.AI cs.LG

    Seg-LSTM: Performance of xLSTM for Semantic Segmentation of Remotely Sensed Images

    Authors: Qinfeng Zhu, Yuanzhi Cai, Lei Fan

    Abstract: Recent advancements in autoregressive networks with linear complexity have driven significant research progress, demonstrating exceptional performance in large language models. A representative model is the Extended Long Short-Term Memory (xLSTM), which incorporates gating mechanisms and memory structures, performing comparably to Transformer architectures in long-sequence language tasks. Autoregr…

    Submitted 20 June, 2024; originally announced June 2024.

  6. arXiv:2406.11572  [pdf, other]

    cs.RO

    Propagative Distance Optimization for Constrained Inverse Kinematics

    Authors: Yu Chen, Yilin Cai, Jinyun Xu, Zhongqiang Ren, Guanya Shi, Howie Choset

    Abstract: This paper investigates a constrained inverse kinematic (IK) problem that seeks a feasible configuration of an articulated robot under various constraints such as joint limits and obstacle collision avoidance. Due to the high-dimensionality and complex constraints, this problem is often solved numerically via iterative local optimization. Classic local optimization methods take joint angles as the…

    Submitted 17 June, 2024; originally announced June 2024.

  7. arXiv:2406.10631  [pdf, other]

    cs.GT cs.LG math.OC

    Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms

    Authors: Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng

    Abstract: Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games, both in theory and practice. Particularly popular algorithms include optimistic multiplicative weights update (OMWU) and optimistic gradient-descent-ascent (OGDA). While both algorithms enjoy $O(1/T)$ ergodic convergence to Nash equilibrium in two-player zero-sum games, OMWU offers several adva…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 27 pages, 4 figures

  8. arXiv:2406.10391  [pdf, other]

    q-bio.QM cs.LG

    BEACON: Benchmark for Comprehensive RNA Tasks and Language Models

    Authors: Yuchen Ren, Zhiyuan Chen, Lifeng Qiao, Hongtai Jing, Yuchen Cai, Sheng Xu, Peng Ye, Xinzhu Ma, Siqi Sun, Hongliang Yan, Dong Yuan, Wanli Ouyang, Xihui Liu

    Abstract: RNA plays a pivotal role in translating genetic instructions into functional outcomes, underscoring its importance in biological processes and disease mechanisms. Despite the emergence of numerous deep learning approaches for RNA, particularly universal RNA language models, there remains a significant lack of standardized benchmarks to assess the effectiveness of these methods. In this study, we i…

    Submitted 14 June, 2024; originally announced June 2024.

  9. arXiv:2406.08654  [pdf, other]

    stat.ML cs.LG math.OC

    Large Stepsize Gradient Descent for Non-Homogeneous Two-Layer Networks: Margin Improvement and Fast Optimization

    Authors: Yuhang Cai, Jingfeng Wu, Song Mei, Michael Lindsey, Peter L. Bartlett

    Abstract: The typical training of neural networks using large stepsize gradient descent (GD) under the logistic loss often involves two distinct phases, where the empirical risk oscillates in the first phase but decreases monotonically in the second phase. We investigate this phenomenon in two-layer networks that satisfy a near-homogeneity condition. We show that the second phase begins once the empirical r…

    Submitted 26 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Clarify our results on sigmoid neural networks

  10. arXiv:2406.07961  [pdf, other]

    cs.CV cs.AI

    Accurate Explanation Model for Image Classifiers using Class Association Embedding

    Authors: Ruitao Xie, Jingbang Chen, Limai Jiang, Rui Xiao, Yi Pan, Yunpeng Cai

    Abstract: Image classification is a primary task in data analysis where explainable models are crucially demanded in various applications. Although amounts of methods have been proposed to obtain explainable knowledge from the black-box classifiers, these approaches lack the efficiency of extracting global knowledge regarding the classification task, thus is vulnerable to local traps and often leads to poor…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 40th IEEE International Conference on Data Engineering

  11. arXiv:2406.03262  [pdf, other]

    cs.CV

    ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection

    Authors: Jiangning Zhang, Haoyang He, Zhenye Gan, Qingdong He, Yuxuan Cai, Zhucun Xue, Yabiao Wang, Chengjie Wang, Lei Xie, Yong Liu

    Abstract: Visual anomaly detection aims to identify anomalous regions in images through unsupervised learning paradigms, with increasing application demand and value in fields such as industrial inspection and medical lesion detection. Despite significant progress in recent years, there is a lack of comprehensive benchmarks to adequately evaluate the performance of various mainstream methods across differen…

    Submitted 6 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  12. arXiv:2406.02222  [pdf, other]

    cs.SE

    Towards an Extensible Model-Based Digital Twin Framework for Space Launch Vehicles

    Authors: Ran Wei, Ruizhe Yang, Shijun Liu, Chongsheng Fan, Rong Zhou, Zekun Wu, Haochi Wang, Yifan Cai, Zhe Jiang

    Abstract: The concept of Digital Twin (DT) is increasingly applied to systems on different levels of abstraction across domains, to support monitoring, analysis, diagnosis, decision making and automated control. Whilst the interest in applying DT is growing, the definition of DT is unclear, neither is there a clear pathway to develop DT to fully realise its capacities. In this paper, we revise the concept o…

    Submitted 4 June, 2024; originally announced June 2024.

  13. arXiv:2406.02079  [pdf, ps, other]

    cs.CL

    Assessing the Performance of Chinese Open Source Large Language Models in Information Extraction Tasks

    Authors: Yida Cai, Hao Sun, Hsiu-Yuan Huang, Yunfang Wu

    Abstract: Information Extraction (IE) plays a crucial role in Natural Language Processing (NLP) by extracting structured information from unstructured text, thereby facilitating seamless integration with various real-world applications that rely on structured data. Despite its significance, recent experiments focusing on English IE tasks have shed light on the challenges faced by Large Language Models (LLMs…

    Submitted 4 June, 2024; originally announced June 2024.

  14. arXiv:2406.01863  [pdf, other]

    cs.CL

    Towards Effective Time-Aware Language Representation: Exploring Enhanced Temporal Understanding in Language Models

    Authors: Jiexin Wang, Adam Jatowt, Yi Cai

    Abstract: In the evolving field of Natural Language Processing, understanding the temporal context of text is increasingly crucial. This study investigates methods to incorporate temporal information during pre-training, aiming to achieve effective time-aware language representation for improved performance on time-related tasks. In contrast to common pre-trained models like BERT, which rely on synchronic d…

    Submitted 3 June, 2024; originally announced June 2024.

  15. arXiv:2406.00965  [pdf, other]

    cs.RO cs.AI

    Efficient Behavior Tree Planning with Commonsense Pruning and Heuristic

    Authors: Xinglin Chen, Yishuai Cai, Yunxin Mao, Minglong Li, Zhou Yang, Wen Shanghua, Wenjing Yang, Weixia Xu, Ji Wang

    Abstract: Behavior Tree (BT) planning is crucial for autonomous robot behavior control, yet its application in complex scenarios is hampered by long planning times. Pruning and heuristics are common techniques to accelerate planning, but it is difficult to design general pruning strategies and heuristic functions for BT planning problems. This paper proposes improving BT planning efficiency for everyday ser…

    Submitted 3 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

  16. arXiv:2405.20693  [pdf, other]

    eess.IV cs.CV

    R$^2$-Gaussian: Rectifying Radiative Gaussian Splatting for Tomographic Reconstruction

    Authors: Ruyi Zha, Tao Jun Lin, Yuanhao Cai, Jiwen Cao, Yanhao Zhang, Hongdong Li

    Abstract: 3D Gaussian splatting (3DGS) has shown promising results in image rendering and surface reconstruction. However, its potential in volumetric reconstruction tasks, such as X-ray computed tomography, remains under-explored. This paper introduces R$^2$-Gaussian, the first 3DGS-based framework for sparse-view tomographic reconstruction. By carefully deriving X-ray rasterization functions, we discover a p…

    Submitted 31 May, 2024; originally announced May 2024.

  17. arXiv:2405.19804  [pdf]

    cs.LG

    Exploring Key Factors for Long-Term Vessel Incident Risk Prediction

    Authors: Tianyi Chen, Hua Wang, Yutong Cai, Maohan Liang, Qiang Meng

    Abstract: Factor analysis acts a pivotal role in enhancing maritime safety. Most previous studies conduct factor analysis within the framework of incident-related label prediction, where the developed models can be categorized into short-term and long-term prediction models. The long-term models offer a more strategic approach, enabling more proactive risk management, compared to the short-term ones. Nevert…

    Submitted 30 May, 2024; originally announced May 2024.

  18. arXiv:2405.16466  [pdf, other]

    cs.NE

    High-Performance Temporal Reversible Spiking Neural Networks with $O(L)$ Training Memory and $O(1)$ Inference Cost

    Authors: JiaKui Hu, Man Yao, Xuerui Qiu, Yuhong Chou, Yuxuan Cai, Ning Qiao, Yonghong Tian, Bo XU, Guoqi Li

    Abstract: Multi-timestep simulation of brain-inspired Spiking Neural Networks (SNNs) boost memory requirements during training and increase inference energy cost. Current training methods cannot simultaneously solve both training and inference dilemmas. This work proposes a novel Temporal Reversible architecture for SNNs (T-RevSNN) to jointly address the training and inference challenges by altering the for…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML2024

  19. arXiv:2405.15426  [pdf, other]

    cs.CR

    AuthNet: Neural Network with Integrated Authentication Logic

    Authors: Yuling Cai, Fan Xiang, Guozhu Meng, Yinzhi Cao, Kai Chen

    Abstract: Model stealing, i.e., unauthorized access and exfiltration of deep learning models, has become one of the major threats. Proprietary models may be protected by access controls and encryption. However, in reality, these measures can be compromised due to system breaches, query-based model extraction or a disgruntled insider. Security hardening of neural networks is also suffering from limits, for e…

    Submitted 24 May, 2024; originally announced May 2024.

  20. arXiv:2405.15267  [pdf, other]

    cs.CV

    Off-the-shelf ChatGPT is a Good Few-shot Human Motion Predictor

    Authors: Haoxuan Qu, Zhaoyang He, Zeyu Hu, Yujun Cai, Jun Liu

    Abstract: To facilitate the application of motion prediction in practice, recently, the few-shot motion prediction task has attracted increasing research attention. Yet, in existing few-shot motion prediction works, a specific model that is dedicatedly trained over human motions is generally required. In this work, rather than tackling this task through training a specific human motion prediction model, we…

    Submitted 24 May, 2024; originally announced May 2024.

  21. arXiv:2405.15196  [pdf, other]

    cs.CV

    DisC-GS: Discontinuity-aware Gaussian Splatting

    Authors: Haoxuan Qu, Zhuoling Li, Hossein Rahmani, Yujun Cai, Jun Liu

    Abstract: Recently, Gaussian Splatting, a method that represents a 3D scene as a collection of Gaussian distributions, has gained significant attention in addressing the task of novel view synthesis. In this paper, we highlight a fundamental limitation of Gaussian Splatting: its inability to accurately render discontinuities and boundaries in images due to the continuous nature of Gaussian distributions. To…

    Submitted 23 May, 2024; originally announced May 2024.

  22. arXiv:2405.15193  [pdf, other]

    cs.DB cs.DS

    CuckooGraph: A Scalable and Space-Time Efficient Data Structure for Large-Scale Dynamic Graphs

    Authors: Zhuochen Fan, Yalun Cai, Zirui Liu, Jiarui Guo, Xin Fan, Tong Yang, Bin Cui

    Abstract: Graphs play an increasingly important role in various big data applications. However, existing graph data structures cannot simultaneously address the performance bottlenecks caused by the dynamic updates, large scale, and high query complexity of current graphs. This paper proposes a novel data structure for large-scale dynamic graphs called CuckooGraph. It does not need to know the amount of gra…

    Submitted 23 May, 2024; originally announced May 2024.

  23. arXiv:2405.15125  [pdf, other]

    cs.CV

    HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting

    Authors: Yuanhao Cai, Zihao Xiao, Yixun Liang, Minghan Qin, Yulun Zhang, Xiaokang Yang, Yaoyao Liu, Alan Yuille

    Abstract: High dynamic range (HDR) novel view synthesis (NVS) aims to create photorealistic images from novel viewpoints using HDR imaging techniques. The rendered HDR images capture a wider range of brightness levels containing more details of the scene than normal low dynamic range (LDR) images. Existing HDR NVS methods are mainly based on NeRF. They suffer from long training time and slow inference speed…

    Submitted 27 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: The first 3D Gaussian Splatting-based method for HDR imaging

  24. arXiv:2405.13084  [pdf, other]

    cs.CL cs.AI

    The 2nd FutureDial Challenge: Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG)

    Authors: Yucheng Cai, Si Chen, Yi Huang, Junlan Feng, Zhijian Ou

    Abstract: The 2nd FutureDial Challenge: Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG), Co-located with SLT 2024

    Submitted 21 May, 2024; originally announced May 2024.

  25. arXiv:2405.13014  [pdf, other]

    cs.CL cs.AI

    QCRD: Quality-guided Contrastive Rationale Distillation for Large Language Models

    Authors: Wei Wang, Zhaowei Li, Qi Xu, Yiqing Cai, Hang Song, Qi Qi, Ran Zhou, Zhida Huang, Tao Wang, Li Xiao

    Abstract: Deploying large language models (LLMs) poses challenges in terms of resource limitations and inference efficiency. To address these challenges, recent research has focused on using smaller task-specific language models, which are enhanced by distilling the knowledge rationales generated by LLMs. However, previous works mostly emphasize the effectiveness of positive knowledge, while overlooking the…

    Submitted 14 May, 2024; originally announced May 2024.

  26. arXiv:2405.12725  [pdf, other]

    cs.CR cs.CV

    Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks

    Authors: Boheng Li, Yishuo Cai, Haowei Li, Feng Xue, Zhifeng Li, Yiming Li

    Abstract: Model quantization is widely used to compress and accelerate deep neural networks. However, recent studies have revealed the feasibility of weaponizing model quantization via implanting quantization-conditioned backdoors (QCBs). These special backdoors stay dormant on released full-precision models but will come into effect after standard quantization. Due to the peculiarity of QCBs, existing defe…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024. 19 pages, 9 figures

  27. arXiv:2405.08493  [pdf]

    cs.CV

    Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study

    Authors: Qinfeng Zhu, Yuan Fang, Yuanzhi Cai, Cheng Chen, Lei Fan

    Abstract: Deep learning methods, especially Convolutional Neural Networks (CNN) and Vision Transformer (ViT), are frequently employed to perform semantic segmentation of high-resolution remotely sensed images. However, CNNs are constrained by their restricted receptive fields, while ViTs face challenges due to their quadratic complexity. Recently, the Mamba model, featuring linear complexity and a global re…

    Submitted 14 May, 2024; originally announced May 2024.

  28. arXiv:2405.07474  [pdf, other]

    cs.AI cs.HC cs.RO

    Integrating Intent Understanding and Optimal Behavior Planning for Behavior Tree Generation from Human Instructions

    Authors: Xinglin Chen, Yishuai Cai, Yunxin Mao, Minglong Li, Wenjing Yang, Weixia Xu, Ji Wang

    Abstract: Robots executing tasks following human instructions in domestic or industrial environments essentially require both adaptability and reliability. Behavior Tree (BT) emerges as an appropriate control architecture for these scenarios due to its modularity and reactivity. Existing BT generation methods, however, either do not involve interpreting natural language or cannot theoretically guarantee the…

    Submitted 27 June, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

  29. arXiv:2405.06283  [pdf, other]

    cs.CV

    Novel Class Discovery for Ultra-Fine-Grained Visual Categorization

    Authors: Yu Liu, Yaqi Cai, Qi Jia, Binglin Qiu, Weimin Wang, Nan Pu

    Abstract: Ultra-fine-grained visual categorization (Ultra-FGVC) aims at distinguishing highly similar sub-categories within fine-grained objects, such as different soybean cultivars. Compared to traditional fine-grained visual categorization, Ultra-FGVC encounters more hurdles due to the small inter-class and large intra-class variation. Given these challenges, relying on human annotation for Ultra-FGVC is…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: 10 pages, 6 figures

  30. arXiv:2405.05564  [pdf, other]

    eess.IV cs.CV cs.LG

    Joint Edge Optimization Deep Unfolding Network for Accelerated MRI Reconstruction

    Authors: Yue Cai, Yu Luo, Jie Ling, Shun Yao

    Abstract: Magnetic Resonance Imaging (MRI) is a widely used imaging technique, however it has the limitation of long scanning time. Though previous model-based and learning-based MRI reconstruction methods have shown promising performance, most of them have not fully utilized the edge prior of MR images, and there is still much room for improvement. In this paper, we build a joint edge optimization model th…

    Submitted 9 May, 2024; originally announced May 2024.

  31. arXiv:2405.05297  [pdf]

    cs.CV

    Deep Learning Method to Predict Wound Healing Progress Based on Collagen Fibers in Wound Tissue

    Authors: Juan He, Xiaoyan Wang, Long Chen, Yunpeng Cai, Zhengshan Wang

    Abstract: Wound healing is a complex process involving changes in collagen fibers. Accurate monitoring of these changes is crucial for assessing the progress of wound healing and has significant implications for guiding clinical treatment strategies and drug screening. However, traditional quantitative analysis methods focus on spatial characteristics such as collagen fiber alignment and variance, lacking t…

    Submitted 8 May, 2024; originally announced May 2024.

  32. arXiv:2405.04120  [pdf, ps, other]

    cs.IT

    Movable Antennas-Enabled Two-User Multicasting: Do We Really Need Alternating Optimization for Minimum Rate Maximization?

    Authors: Guojie Hu, Qingqing Wu, Donghui Xu, Kui Xu, Jiangbo Si, Yunlong Cai, Naofal Al-Dhahir

    Abstract: Movable antenna (MA) technology, which can reconfigure wireless channels by flexibly moving antenna positions in a specified region, has great potential for improving communication performance. In this paper, we consider a new setup of MAs-enabled multicasting, where we adopt a simple setting in which a linear MA array-enabled source (${\rm{S}}$) transmits a common message to two single-antenna us…

    Submitted 7 May, 2024; originally announced May 2024.

  33. Rip-NeRF: Anti-aliasing Radiance Fields with Ripmap-Encoded Platonic Solids

    Authors: Junchen Liu, Wenbo Hu, Zhuo Yang, Jianteng Chen, Guoliang Wang, Xiaoxue Chen, Yantong Cai, Huan-ang Gao, Hao Zhao

    Abstract: Despite significant advancements in Neural Radiance Fields (NeRFs), the renderings may still suffer from aliasing and blurring artifacts, since it remains a fundamental challenge to effectively and efficiently characterize anisotropic areas induced by the cone-casting procedure. This paper introduces a Ripmap-Encoded Platonic Solid representation to precisely and efficiently featurize 3D anisotrop…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: SIGGRAPH 2024, Project page: https://junchenliu77.github.io/Rip-NeRF , Code: https://github.com/JunchenLiu77/Rip-NeRF

  34. arXiv:2404.14248  [pdf, other]

    cs.CV

    NTIRE 2024 Challenge on Low Light Image Enhancement: Methods and Results

    Authors: Xiaoning Liu, Zongwei Wu, Ao Li, Florin-Alexandru Vasluianu, Yulun Zhang, Shuhang Gu, Le Zhang, Ce Zhu, Radu Timofte, Zhi Jin, Hongjun Wu, Chenxi Wang, Haitao Ling, Yuanhao Cai, Hao Bian, Yuxin Zheng, Jing Lin, Alan Yuille, Ben Shao, Jin Guo, Tianli Liu, Mohao Wu, Yixu Feng, Shuo Hou, Haotian Lin, et al. (87 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 low light image enhancement challenge, highlighting the proposed solutions and results. The aim of this challenge is to discover an effective network design or solution capable of generating brighter, clearer, and visually appealing results when dealing with a variety of conditions, including ultra-high resolution (4K and beyond), non-uniform illumination, backlig…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: NTIRE 2024 Challenge Report

  35. arXiv:2404.13617  [pdf, other]

    cs.DC

    Parallel AIG Refactoring via Conflict Breaking

    Authors: Ye Cai, Zonglin Yang, Liwei Ni, Junfeng Liu, Biwei Xie, Xingquan Li

    Abstract: Algorithm parallelization to leverage multi-core platforms for improving the efficiency of Electronic Design Automation (EDA) tools plays a significant role in enhancing the scalability of Integrated Circuit (IC) designs. Logic optimization is a key process in the EDA design flow to reduce the area and depth of the circuit graph by finding logically equivalent graphs for substitution, which is typ…

    Submitted 21 April, 2024; originally announced April 2024.

  36. arXiv:2404.13614  [pdf, other]

    cs.DC

    Enhancing ASIC Technology Mapping via Parallel Supergate Computing

    Authors: Ye Cai, Zonglin Yang, Liwei Ni, Biwei Xie, Xingquan Li

    Abstract: With the development of large-scale integrated circuits, electronic design automation (EDA) tools are increasingly emphasizing efficiency, with parallel algorithms becoming a trend. The optimization of delay reduction is a crucial factor for ASIC technology mapping, and supergate technology proves to be an effective method for achieving this in EDA tools flow. However, we have observed that increa…

    Submitted 21 April, 2024; originally announced April 2024.

  37. arXiv:2404.12104  [pdf, other]

    cs.CV cs.CL cs.LG

    Ethical-Lens: Curbing Malicious Usages of Open-Source Text-to-Image Models

    Authors: Yuzhu Cai, Sheng Yin, Yuxi Wei, Chenxin Xu, Weibo Mao, Felix Juefei-Xu, Siheng Chen, Yanfeng Wang

    Abstract: The burgeoning landscape of text-to-image models, exemplified by innovations such as Midjourney and DALLE 3, has revolutionized content creation across diverse sectors. However, these advancements bring forth critical ethical concerns, particularly with the misuse of open-source models to generate content that violates societal norms. Addressing this, we introduce Ethical-Lens, a framework designe…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 42 pages, 17 figures, 29 tables

  38. arXiv:2404.11890  [pdf, other]

    math.NA cs.LG

    FCNCP: A Coupled Nonnegative CANDECOMP/PARAFAC Decomposition Based on Federated Learning

    Authors: Yukai Cai, Hang Liu, Xiulin Wang, Hongjin Li, Ziyi Wang, Chuanshuai Yang, Fengyu Cong

    Abstract: In the field of brain science, data sharing across servers is becoming increasingly challenging due to issues such as industry competition, privacy security, and administrative procedure policies and regulations. Therefore, there is an urgent need to develop new methods for data analysis and processing that enable scientific collaboration without data sharing. In view of this, this study proposes…

    Submitted 18 April, 2024; originally announced April 2024.

  39. arXiv:2404.05581  [pdf, other]

    cs.RO eess.SY

    Design and Simulation of Time-energy Optimal Anti-swing Trajectory Planner for Autonomous Tower Cranes

    Authors: Souravik Dutta, Yiyu Cai

    Abstract: For autonomous crane lifting, optimal trajectories of the crane are required as reference inputs to the crane controller to facilitate feedforward control. Reducing the unactuated payload motion is a crucial issue for under-actuated tower cranes with spherical pendulum dynamics. The planned trajectory should be optimal in terms of both operating time and energy consumption, to facilitate optimum o…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 18 pages, 12 figures, 9 tables

  40. arXiv:2404.05300  [pdf, other]

    cs.CV

    Texture Classification Network Integrating Adaptive Wavelet Transform

    Authors: Su-Xi Yu, Jing-Yuan He, Yi Wang, Yu-Jiao Cai, Jun Yang, Bo Lin, Wei-Bin Yang, Jian Ruan

    Abstract: Graves' disease is a common condition that is diagnosed clinically by determining the smoothness of the thyroid texture and its morphology in ultrasound images. Currently, the most widely used approach for the automated diagnosis of Graves' disease utilizes Convolutional Neural Networks (CNNs) for both feature extraction and classification. However, these methods demonstrate limited efficacy in ca…

    Submitted 8 April, 2024; originally announced April 2024.

  41. arXiv:2404.04941  [pdf, other]

    cs.CL

    Prompting Large Language Models for Zero-shot Essay Scoring via Multi-trait Specialization

    Authors: Sanwoo Lee, Yida Cai, Desong Meng, Ziyang Wang, Yunfang Wu

    Abstract: Advances in automated essay scoring (AES) have traditionally relied on labeled essays, requiring tremendous cost and expertise for their acquisition. Recently, large language models (LLMs) have achieved great success in various tasks, but their potential is less explored in AES. In this paper, we propose Multi Trait Specialization (MTS), a zero-shot prompting framework to elicit essay scoring capa…

    Submitted 7 April, 2024; originally announced April 2024.

  42. arXiv:2404.04518  [pdf, other]

    cs.CV

    MedIAnomaly: A comparative study of anomaly detection in medical images

    Authors: Yu Cai, Weiwen Zhang, Hao Chen, Kwang-Ting Cheng

    Abstract: Anomaly detection (AD) aims at detecting abnormal samples that deviate from the expected normal patterns. Generally, it can be trained on merely normal data without the requirement for abnormal samples, and thereby plays an important role in the recognition of rare diseases and health screening in the medical domain. Despite numerous related studies, we observe a lack of a fair and comprehensive e…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: Under submission

  43. arXiv:2404.03395  [pdf, ps, other]

    cs.IT cs.ET

    Movable Antennas-Assisted Secure Transmission Without Eavesdroppers' Instantaneous CSI

    Authors: Guojie Hu, Qingqing Wu, Donghui Xu, Kui Xu, Jiangbo Si, Yunlong Cai, Naofal Al-Dhahir

    Abstract: Movable antenna (MA) technology is highly promising for improving communication performance, due to its advantage of flexibly adjusting positions of antennas to reconfigure channel conditions. In this paper, we investigate MAs-assisted secure transmission under a legitimate transmitter Alice, a legitimate receiver Bob and multiple eavesdroppers. Specifically, we consider a practical scenario where…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Submitted for journal publication

  44. arXiv:2404.01705  [pdf]

    cs.CV

    Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model

    Authors: Qinfeng Zhu, Yuanzhi Cai, Yuan Fang, Yihan Yang, Cheng Chen, Lei Fan, Anh Nguyen

    Abstract: High-resolution remotely sensed images pose a challenge for commonly used semantic segmentation methods such as Convolutional Neural Network (CNN) and Vision Transformer (ViT). CNN-based methods struggle with handling such high-resolution images due to their limited receptive field, while ViT faces challenges in handling long sequences. Inspired by Mamba, which adopts a State Space Model (SSM) to…

    Submitted 11 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  45. arXiv:2404.00532  [pdf, other]

    cs.CV

    LLMs are Good Action Recognizers

    Authors: Haoxuan Qu, Yujun Cai, Jun Liu

    Abstract: Skeleton-based action recognition has attracted lots of research attention. Recently, to build an accurate skeleton-based action recognizer, a variety of works have been proposed. Among them, some works use large model architectures as backbones of their recognizers to boost the skeleton data representation capability, while some other works pre-train their recognizers on external data to enrich t…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: CVPR 2024

  46. arXiv:2404.00340  [pdf, other]

    cs.RO eess.SY

    Deep Reinforcement Learning in Autonomous Car Path Planning and Control: A Survey

    Authors: Yiyang Chen, Chao Ji, Yunrui Cai, Tong Yan, Bo Su

    Abstract: Combining data-driven applications with control systems plays a key role in recent Autonomous Car research. This thesis offers a structured review of the latest literature on Deep Reinforcement Learning (DRL) within the realm of autonomous vehicle Path Planning and Control. It collects a series of DRL methodologies and algorithms and their applications in the field, focusing notably on their roles…

    Submitted 30 March, 2024; originally announced April 2024.

  47. arXiv:2403.20079  [pdf, other]

    cs.CV

    SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior

    Authors: Zhongrui Yu, Haoran Wang, Jinze Yang, Hanzhang Wang, Zeke Xie, Yunfeng Cai, Jiale Cao, Zhong Ji, Mingming Sun

    Abstract: Novel View Synthesis (NVS) for street scenes plays a critical role in autonomous driving simulation. The current mainstream technique to achieve it is neural rendering, such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS). Although thrilling progress has been made, when handling street scenes, current methods struggle to maintain rendering quality at the viewpoint that deviate…

    Submitted 29 March, 2024; originally announced March 2024.

  48. Task2Morph: Differentiable Task-inspired Framework for Contact-Aware Robot Design

    Authors: Yishuai Cai, Shaowu Yang, Minglong Li, Xinglin Chen, Yunxin Mao, Xiaodong Yi, Wenjing Yang

    Abstract: Optimizing the morphologies and the controllers that adapt to various tasks is a critical issue in the field of robot design, aka. embodied intelligence. Previous works typically model it as a joint optimization problem and use search-based methods to find the optimal solution in the morphology space. However, they ignore the implicit knowledge of task-to-morphology mapping which can directly insp…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 9 pages, 10 figures, published to IROS

    Journal ref: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2023: 452-459

  49. arXiv:2403.16561  [pdf, other]

    cs.LG cs.AI

    FedFixer: Mitigating Heterogeneous Label Noise in Federated Learning

    Authors: Xinyuan Ji, Zhaowei Zhu, Wei Xi, Olga Gadyatskaya, Zilong Song, Yong Cai, Yang Liu

    Abstract: Federated Learning (FL) heavily depends on label quality for its performance. However, the label distribution among individual clients is always both noisy and heterogeneous. The high loss incurred by client-specific samples in heterogeneous label noise poses challenges for distinguishing between client-specific and noisy label samples, impacting the effectiveness of existing label noise learning…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: accepted by AAA24

  50. arXiv:2403.16458  [pdf, other]

    cs.IT eess.SP

    Next Generation Advanced Transceiver Technologies for 6G

    Authors: Changsheng You, Yunlong Cai, Yuanwei Liu, Marco Di Renzo, Tolga M. Duman, Aylin Yener, A. Lee Swindlehurst

    Abstract: To accommodate new applications such as extended reality, fully autonomous vehicular networks and the metaverse, next generation wireless networks are going to be subject to much more stringent performance requirements than the fifth-generation (5G) in terms of data rates, reliability, latency, and connectivity. It is thus necessary to develop next generation advanced transceiver (NGAT) technologi…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: This paper gives a comprehensive tutorial overview of next generation advanced transceiver (NGAT) technologies for 6G