Showing 1–50 of 290 results for author: Cui, Z

Searching in archive cs.

  1. arXiv:2407.18054  [pdf, other]

    eess.IV cs.CV

    LKCell: Efficient Cell Nuclei Instance Segmentation with Large Convolution Kernels

    Authors: Ziwei Cui, Jingfeng Yao, Lunbin Zeng, Juan Yang, Wenyu Liu, Xinggang Wang

    Abstract: The segmentation of cell nuclei in tissue images stained with the blood dye hematoxylin and eosin (H&E) is essential for various clinical applications and analyses. Due to the complex characteristics of cellular morphology, a large receptive field is considered crucial for generating high-quality segmentation. However, previous methods face challenges in achieving a balance between the receptiv…

    Submitted 25 July, 2024; originally announced July 2024.

  2. arXiv:2407.15325  [pdf, other]

    cs.AI

    Odyssey: Empowering Agents with Open-World Skills

    Authors: Shunyu Liu, Yaoru Li, Kongcheng Zhang, Zhenyu Cui, Wenkai Fang, Yuxuan Zheng, Tongya Zheng, Mingli Song

    Abstract: Recent studies have delved into constructing generalist agents for open-world embodied environments like Minecraft. Despite the encouraging results, existing efforts mainly focus on solving basic programmatic tasks, e.g., material collection and tool-crafting following the Minecraft tech-tree, treating the ObtainDiamond task as the ultimate goal. This limitation stems from the narrowly defined set…

    Submitted 21 July, 2024; originally announced July 2024.

  3. arXiv:2407.14239  [pdf, other]

    cs.AI

    KoMA: Knowledge-driven Multi-agent Framework for Autonomous Driving with Large Language Models

    Authors: Kemou Jiang, Xuan Cai, Zhiyong Cui, Aoyong Li, Yilong Ren, Haiyang Yu, Hao Yang, Daocheng Fu, Licheng Wen, Pinlong Cai

    Abstract: Large language models (LLMs) as autonomous agents offer a novel avenue for tackling real-world challenges through a knowledge-driven manner. These LLM-enhanced methodologies excel in generalization and interpretability. However, the complexity of driving tasks often necessitates the collaboration of multiple, heterogeneous agents, underscoring the need for such LLM-driven agents to engage in coope…

    Submitted 19 July, 2024; originally announced July 2024.

    Comments: 13 pages, 18 figures

  4. arXiv:2407.12257  [pdf, other]

    cs.CV

    Compound Expression Recognition via Multi Model Ensemble for the ABAW7 Challenge

    Authors: Xuxiong Liu, Kang Shen, Jun Yao, Boyan Wang, Minrui Liu, Liuwei An, Zishun Cui, Weijie Feng, Xiao Sun

    Abstract: Compound Expression Recognition (CER) is vital for effective interpersonal interactions. Human emotional expressions are inherently complex due to the presence of compound expressions, requiring the consideration of both local and global facial cues for accurate judgment. In this paper, we propose an ensemble learning-based solution to address this complexity. Our approach involves training three…

    Submitted 26 July, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2403.12572 by other authors

  5. arXiv:2407.11419  [pdf, other]

    cs.CV

    TeethDreamer: 3D Teeth Reconstruction from Five Intra-oral Photographs

    Authors: Chenfan Xu, Zhentao Liu, Yuan Liu, Yulong Dou, Jiamin Wu, Jiepeng Wang, Minjiao Wang, Dinggang Shen, Zhiming Cui

    Abstract: Orthodontic treatment usually requires regular face-to-face examinations to monitor dental conditions of the patients. When in-person diagnosis is not feasible, an alternative is to utilize five intra-oral photographs for remote dental monitoring. However, it lacks 3D information, and how to reconstruct 3D dental models from such sparse view photographs is a challenging problem. In this study,…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: MICCAI2024

  6. arXiv:2407.10671  [pdf, other]

    cs.CL cs.AI

    Qwen2 Technical Report

    Authors: An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, et al. (37 additional authors not shown)

    Abstract: This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, a…

    Submitted 17 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: 25 pages, 1 figure

  7. arXiv:2407.07078  [pdf, other]

    cs.CV

    MoSt-DSA: Modeling Motion and Structural Interactions for Direct Multi-Frame Interpolation in DSA Images

    Authors: Ziyang Xu, Huangxuan Zhao, Ziwei Cui, Wenyu Liu, Chuansheng Zheng, Xinggang Wang

    Abstract: Artificial intelligence has become a crucial tool for medical image analysis. As an advanced cerebral angiography technique, Digital Subtraction Angiography (DSA) poses a challenge where the radiation dose to humans is proportional to the image count. By reducing images and using AI interpolation instead, the radiation can be cut significantly. However, DSA images present more complex motion and s…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted to ECAI2024

  8. arXiv:2407.07020  [pdf, other]

    cs.AI cs.RO

    Less is More: Efficient Brain-Inspired Learning for Autonomous Driving Trajectory Prediction

    Authors: Haicheng Liao, Yongkang Li, Zhenning Li, Chengyue Wang, Chunlin Tian, Yuming Huang, Zilin Bian, Kaiqun Zhu, Guofa Li, Ziyuan Pu, Jia Hu, Zhiyong Cui, Chengzhong Xu

    Abstract: Accurately and safely predicting the trajectories of surrounding vehicles is essential for fully realizing autonomous driving (AD). This paper presents the Human-Like Trajectory Prediction model (HLTP++), which emulates human cognitive processes to improve trajectory prediction in AD. HLTP++ incorporates a novel teacher-student knowledge distillation framework. The "teacher" model equipped with an…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2402.19251

  9. arXiv:2407.06567  [pdf, other]

    cs.CL

    FinCon: A Synthesized LLM Multi-Agent System with Conceptual Verbal Reinforcement for Enhanced Financial Decision Making

    Authors: Yangyang Yu, Zhiyuan Yao, Haohang Li, Zhiyang Deng, Yupeng Cao, Zhi Chen, Jordan W. Suchow, Rong Liu, Zhenyu Cui, Denghui Zhang, Koduvayur Subbalakshmi, Guojun Xiong, Yueru He, Jimin Huang, Dong Li, Qianqian Xie

    Abstract: Large language models (LLMs) have demonstrated notable potential in conducting complex tasks and are increasingly utilized in various financial applications. However, high-quality sequential financial investment decision-making remains challenging. These tasks require multiple interactions with a volatile environment for every decision, demanding sufficient intelligence to maximize returns and man…

    Submitted 10 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: LLM Applications, LLM Agents, Financial Technology, Quantitative Finance, Algorithmic Trading, Cognitive Science

  10. arXiv:2407.05909  [pdf, other]

    cs.CV

    Multi-clue Consistency Learning to Bridge Gaps Between General and Oriented Object in Semi-supervised Detection

    Authors: Chenxu Wang, Chunyan Xu, Ziqi Gu, Zhen Cui

    Abstract: While existing semi-supervised object detection (SSOD) methods perform well in general scenes, they encounter challenges in handling oriented objects in aerial images. We experimentally find three gaps between general and oriented object detection in semi-supervised learning: 1) Sampling inconsistency: the common center sampling is not suitable for oriented objects with larger aspect ratios when s…

    Submitted 8 July, 2024; originally announced July 2024.

  11. arXiv:2406.17215  [pdf, other]

    eess.SY cs.AI

    Enabling Large Language Models to Perform Power System Simulations with Previously Unseen Tools: A Case of Daline

    Authors: Mengshuo Jia, Zeyu Cui, Gabriela Hug

    Abstract: The integration of experiment technologies with large language models (LLMs) is transforming scientific research, offering AI capabilities beyond specialized problem-solving to becoming research assistants for human scientists. In power systems, simulations are essential for research. However, LLMs face significant challenges in power system simulations due to limited pre-existing knowledge and th…

    Submitted 26 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: All the supplementary files mentioned in the manuscript will be open-source upon acceptance

  12. arXiv:2406.12577  [pdf, other]

    cs.CV

    Cephalometric Landmark Detection across Ages with Prototypical Network

    Authors: Han Wu, Chong Wang, Lanzhuju Mei, Tong Yang, Min Zhu, Dinggang Shen, Zhiming Cui

    Abstract: Automated cephalometric landmark detection is crucial in real-world orthodontic diagnosis. Current studies mainly focus on only adult subjects, neglecting the clinically crucial scenario presented by adolescents whose landmarks often exhibit significantly different appearances compared to adults. Hence, an open question arises about how to develop a unified and effective detection algorithm across…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: MICCAI 2024

  13. arXiv:2406.03882  [pdf, other]

    cs.CL cs.SD eess.AS

    Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models

    Authors: Ziyun Cui, Chang Lei, Wen Wu, Yinan Duan, Diyang Qu, Ji Wu, Runsen Chen, Chao Zhang

    Abstract: The early detection of suicide risk is important since it enables the intervention to prevent potential suicide attempts. This paper studies the automatic detection of suicide risk based on spontaneous speech from adolescents, and collects a Mandarin dataset with 15 hours of suicide speech from more than a thousand adolescents aged from ten to eighteen for our experiments. To leverage the diverse…

    Submitted 9 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  14. arXiv:2406.03879  [pdf, other]

    cs.LG cs.CV

    Decay Pruning Method: Smooth Pruning With a Self-Rectifying Procedure

    Authors: Minghao Yang, Linlin Gao, Pengyuan Li, Wenbo Li, Yihong Dong, Zhiying Cui

    Abstract: Current structured pruning methods often result in considerable accuracy drops due to abrupt network changes and loss of information from pruned structures. To address these issues, we introduce the Decay Pruning Method (DPM), a novel smooth pruning approach with a self-rectifying mechanism. DPM consists of two key components: (i) Smooth Pruning: It converts conventional single-step pruning into m…

    Submitted 6 June, 2024; originally announced June 2024.

  15. arXiv:2406.03199  [pdf, other]

    cs.CL cs.AI cs.LG

    Bayesian WeakS-to-Strong from Text Classification to Generation

    Authors: Ziyun Cui, Ziyang Zhang, Wen Wu, Guangzhi Sun, Chao Zhang

    Abstract: Advances in large language models raise the question of how alignment techniques will adapt as models become increasingly complex and humans will only be able to supervise them weakly. Weak-to-Strong mimics such a scenario where weak model supervision attempts to harness the full capabilities of a much stronger model. This work extends Weak-to-Strong to WeakS-to-Strong by exploring an ensemble of…

    Submitted 24 May, 2024; originally announced June 2024.

  16. arXiv:2405.20337  [pdf, other]

    cs.CV cs.AI

    OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving

    Authors: Lening Wang, Wenzhao Zheng, Yilong Ren, Han Jiang, Zhiyong Cui, Haiyang Yu, Jiwen Lu

    Abstract: Understanding the evolution of 3D scenes is important for effective autonomous driving. While conventional methods model scene development with the motion of individual instances, world models emerge as a generative framework to describe the general scene dynamics. However, most existing methods adopt an autoregressive framework to perform next-token prediction, which suffer from inefficiency in mo…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Code is available at: https://github.com/wzzheng/OccSora

  17. GaussianPrediction: Dynamic 3D Gaussian Prediction for Motion Extrapolation and Free View Synthesis

    Authors: Boming Zhao, Yuan Li, Ziyu Sun, Lin Zeng, Yujun Shen, Rui Ma, Yinda Zhang, Hujun Bao, Zhaopeng Cui

    Abstract: Forecasting future scenarios in dynamic environments is essential for intelligent decision-making and navigation, a challenge yet to be fully realized in computer vision and robotics. Traditional approaches like video prediction and novel-view synthesis either lack the ability to forecast from arbitrary viewpoints or to predict temporal dynamics. In this paper, we introduce GaussianPrediction, a n…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted to SIGGRAPH 2024 Conference. Project Page: https://zju3dv.github.io/gaussian-prediction/

  18. arXiv:2405.15301  [pdf, other]

    cs.LG

    Rankability-enhanced Revenue Uplift Modeling Framework for Online Marketing

    Authors: Bowei He, Yunpeng Weng, Xing Tang, Ziqiang Cui, Zexu Sun, Liang Chen, Xiuqiang He, Chen Ma

    Abstract: Uplift modeling has been widely employed in online marketing by predicting the response difference between the treatment and control groups, so as to identify the sensitive individuals toward interventions like coupons or discounts. Compared with traditional conversion uplift modeling, revenue uplift modeling exhibits higher potential due to its direct connection with the corpora…

    Submitted 12 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD 2024

  19. arXiv:2405.14371  [pdf, other]

    cs.DC

    EdgeShard: Efficient LLM Inference via Collaborative Edge Computing

    Authors: Mingjin Zhang, Jiannong Cao, Xiaoming Shen, Zeyang Cui

    Abstract: Large language models (LLMs) have shown great potential in natural language processing and content generation. However, current LLMs heavily rely on cloud computing, leading to prolonged latency, high bandwidth cost, and privacy concerns. Edge computing is promising to address such concerns by deploying LLMs on edge devices, closer to data sources. Some works try to leverage model quantization to…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Under review

  20. arXiv:2405.11257  [pdf, other]

    cs.RO

    PS6D: Point Cloud Based Symmetry-Aware 6D Object Pose Estimation in Robot Bin-Picking

    Authors: Yifan Yang, Zhihao Cui, Qianyi Zhang, Jingtai Liu

    Abstract: 6D object pose estimation holds essential roles in various fields, particularly in the grasping of industrial workpieces. Given challenges like rust, high reflectivity, and absent textures, this paper introduces a point cloud based pose estimation framework (PS6D). PS6D centers on slender and multi-symmetric objects. It extracts multi-scale features through an attention-guided feature extraction m…

    Submitted 18 May, 2024; originally announced May 2024.

  21. arXiv:2405.10705  [pdf, other]

    eess.IV cs.CV

    3D Vessel Reconstruction from Sparse-View Dynamic DSA Images via Vessel Probability Guided Attenuation Learning

    Authors: Zhentao Liu, Huangxuan Zhao, Wenhui Qin, Zhenghong Zhou, Xinggang Wang, Wenping Wang, Xiaochun Lai, Chuansheng Zheng, Dinggang Shen, Zhiming Cui

    Abstract: Digital Subtraction Angiography (DSA) is one of the gold standards in vascular disease diagnosing. With the help of contrast agent, time-resolved 2D DSA images deliver comprehensive insights into blood flow information and can be utilized to reconstruct 3D vessel structures. Current commercial DSA systems typically demand hundreds of scanning views to perform reconstruction, resulting in substanti…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 12 pages, 13 figures, 5 tables

  22. arXiv:2405.09369  [pdf, other]

    cs.IR

    Diffusion-based Contrastive Learning for Sequential Recommendation

    Authors: Ziqiang Cui, Haolun Wu, Bowei He, Ji Cheng, Chen Ma

    Abstract: Self-supervised contrastive learning, which directly extracts inherent data correlations from unlabeled data, has been widely utilized to mitigate the data sparsity issue in sequential recommendation. The majority of existing methods create different augmented views of the same user sequence via random augmentation, and subsequently minimize their distance in the embedding space to enhance the qua…

    Submitted 7 June, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  23. arXiv:2405.08589  [pdf, other]

    cs.CV

    Variable Substitution and Bilinear Programming for Aligning Partially Overlapping Point Sets

    Authors: Wei Lian, Zhesen Cui, Fei Ma, Hang Pan, Wangmeng Zuo

    Abstract: In many applications, the demand arises for algorithms capable of aligning partially overlapping point sets while remaining invariant to the corresponding transformations. This research presents a method designed to meet such requirements through minimization of the objective function of the robust point matching (RPM) algorithm. First, we show that the RPM objective is a cubic polynomial. Then, t…

    Submitted 14 May, 2024; originally announced May 2024.

  24. arXiv:2405.08054  [pdf, other]

    cs.GR cs.CV

    Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning

    Authors: Wenqi Dong, Bangbang Yang, Lin Ma, Xiao Liu, Liyuan Cui, Hujun Bao, Yuewen Ma, Zhaopeng Cui

    Abstract: As humans, we aspire to create media content that is both freely willed and readily controlled. Thanks to the prominent development of generative techniques, we now can easily utilize 2D diffusion methods to synthesize images controlled by raw sketch or designated human poses, and even progressively edit/regenerate local regions with masked inpainting. However, similar workflows in 3D modeling tas…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: Project webpage: https://zju3dv.github.io/coin3d

  25. arXiv:2405.06059  [pdf, other]

    cs.CL cs.AI

    A Mixture-of-Experts Approach to Few-Shot Task Transfer in Open-Ended Text Worlds

    Authors: Christopher Z. Cui, Xiangyu Peng, Mark O. Riedl

    Abstract: Open-ended worlds are those in which there are no pre-specified goals or environmental reward signal. As a consequence, an agent must know how to perform a multitude of tasks. However, when a new task is presented to an agent, we expect it to be able to reuse some of what it knows from previous tasks to rapidly learn that new task. We introduce a novel technique whereby policies for different a pr…

    Submitted 9 May, 2024; originally announced May 2024.

  26. arXiv:2405.04909  [pdf, other]

    cs.CV cs.AI

    Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models

    Authors: Zhengxing Lan, Hongbo Li, Lingshan Liu, Bo Fan, Yisheng Lv, Yilong Ren, Zhiyong Cui

    Abstract: Predicting the future trajectories of dynamic traffic actors is a cornerstone task in autonomous driving. Though existing notable efforts have resulted in impressive performance improvements, a gap persists in scene cognitive and understanding of the complex traffic semantics. This paper proposes Traj-LLM, the first to investigate the potential of using Large Language Models (LLMs) without explici…

    Submitted 8 May, 2024; originally announced May 2024.

  27. arXiv:2404.17520  [pdf, other]

    cs.RO

    A Cognitive-Driven Trajectory Prediction Model for Autonomous Driving in Mixed Autonomy Environment

    Authors: Haicheng Liao, Zhenning Li, Chengyue Wang, Bonan Wang, Hanlin Kong, Yanchen Guan, Guofa Li, Zhiyong Cui, Chengzhong Xu

    Abstract: As autonomous driving technology progresses, the need for precise trajectory prediction models becomes paramount. This paper introduces an innovative model that infuses cognitive insights into trajectory prediction, focusing on perceived safety and dynamic decision-making. Distinct from traditional approaches, our model excels in analyzing interactions and behavior patterns in mixed autonomy traff…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024

  28. arXiv:2404.12031  [pdf, other]

    cs.CV

    MLS-Track: Multilevel Semantic Interaction in RMOT

    Authors: Zeliang Ma, Song Yang, Zhe Cui, Zhicheng Zhao, Fei Su, Delong Liu, Jingyu Wang

    Abstract: The new trend in multi-object tracking task is to track objects of interest using natural language. However, the scarcity of paired prompt-instance data hinders its progress. To address this challenge, we propose a high-quality yet low-cost data generation method based on Unreal Engine 5 and construct a brand-new benchmark dataset, named Refer-UE-City, which primarily includes scenes from intersect…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 17 pages, 8 figures

  29. arXiv:2404.06153  [pdf, other]

    cs.LG q-bio.GN

    scRDiT: Generating single-cell RNA-seq data by diffusion transformers and accelerating sampling

    Authors: Shengze Dong, Zhuorui Cui, Ding Liu, Jinzhi Lei

    Abstract: Motivation: Single-cell RNA sequencing (scRNA-seq) is a groundbreaking technology extensively utilized in biological research, facilitating the examination of gene expression at the individual cell level within a given tissue sample. While numerous tools have been developed for scRNA-seq data analysis, the challenge persists in capturing the distinct features of such data and replicating virtual d…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 11 pages, 4 figures

  30. arXiv:2404.04430  [pdf, other]

    cs.CV

    PhysPT: Physics-aware Pretrained Transformer for Estimating Human Dynamics from Monocular Videos

    Authors: Yufei Zhang, Jeffrey O. Kephart, Zijun Cui, Qiang Ji

    Abstract: While current methods have shown promising progress on estimating 3D human motion from monocular videos, their motion estimates are often physically unrealistic because they mainly consider kinematics. In this paper, we introduce Physics-aware Pretrained Transformer (PhysPT), which improves kinematics-based motion estimates and infers motion forces. PhysPT exploits a Transformer encoder-decoder ba…

    Submitted 5 April, 2024; originally announced April 2024.

  31. arXiv:2404.02152  [pdf, other]

    cs.CV

    GeneAvatar: Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image

    Authors: Chong Bao, Yinda Zhang, Yuan Li, Xiyu Zhang, Bangbang Yang, Hujun Bao, Marc Pollefeys, Guofeng Zhang, Zhaopeng Cui

    Abstract: Recently, we have witnessed the explosive growth of various volumetric representations in modeling animatable head avatars. However, due to the diversity of frameworks, there is no practical method to support high-level applications like 3D head avatar editing across different representations. In this paper, we propose a generic avatar editing approach that can be universally applied to various 3D…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024. Project page: https://zju3dv.github.io/geneavatar/

  32. arXiv:2403.19881  [pdf, other]

    cs.AI

    IME: Integrating Multi-curvature Shared and Specific Embedding for Temporal Knowledge Graph Completion

    Authors: Jiapu Wang, Zheng Cui, Boyue Wang, Shirui Pan, Junbin Gao, Baocai Yin, Wen Gao

    Abstract: Temporal Knowledge Graphs (TKGs) incorporate a temporal dimension, allowing for a precise capture of the evolution of knowledge and reflecting the dynamic nature of the real world. Typically, TKGs contain complex geometric structures, with various geometric structures interwoven. However, existing Temporal Knowledge Graph Completion (TKGC) methods either model TKGs in a single space or neglect the…

    Submitted 28 March, 2024; originally announced March 2024.

  33. arXiv:2403.18282  [pdf, other]

    cs.CV

    SGDM: Static-Guided Dynamic Module Make Stronger Visual Models

    Authors: Wenjie Xing, Zhenchao Cui, Jing Qi

    Abstract: The spatial attention mechanism has been widely used to improve object detection performance. However, its operation is currently limited to static convolutions lacking content-adaptive features. This paper innovatively approaches from the perspective of dynamic convolution. We propose Razor Dynamic Convolution (RDConv) to address the two flaws in dynamic weight convolution, making it hard to imple…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 16 pages, 4 figures

  34. arXiv:2403.16406  [pdf]

    cs.HC

    Development of a Chinese Human-Automation Trust Scale

    Authors: Zixin Cui, Xiangling Zhuang, Seul Chan Lee, Jieun Lee, Xintong Li, Makoto Itoh

    Abstract: The development of a reliable and valid assessment tool of human-automation trust is an important topic. This study aimed to develop a Chinese version of human-automation trust scale (C-HATS) with reasonable reliability and validity based on Lee and See (2004)'s trust model. After three phases of assessments including exploratory factor analysis, item analysis, and confirmatory factor analysis, di…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: 26 pages with 3 figures

  35. arXiv:2403.16095  [pdf, other]

    cs.CV cs.RO

    CG-SLAM: Efficient Dense RGB-D SLAM in a Consistent Uncertainty-aware 3D Gaussian Field

    Authors: Jiarui Hu, Xianhao Chen, Boyin Feng, Guanglin Li, Liangjing Yang, Hujun Bao, Guofeng Zhang, Zhaopeng Cui

    Abstract: Recently neural radiance fields (NeRF) have been widely exploited as 3D representations for dense simultaneous localization and mapping (SLAM). Despite their notable successes in surface modeling and novel view synthesis, existing NeRF-based methods are hindered by their computationally intensive and time-consuming volume rendering pipeline. This paper presents an efficient dense RGB-D SLAM system…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: Project Page: https://zju3dv.github.io/cg-slam

  36. arXiv:2403.08251  [pdf, other]

    cs.MA cs.AI cs.CY

    Emergence of Social Norms in Generative Agent Societies: Principles and Architecture

    Authors: Siyue Ren, Zhiyao Cui, Ruiqi Song, Zhen Wang, Shuyue Hu

    Abstract: Social norms play a crucial role in guiding agents towards understanding and adhering to standards of behavior, thus reducing social conflicts within multi-agent systems (MASs). However, current LLM-based (or generative) MASs lack the capability to be normative. In this paper, we propose a novel architecture, named CRSEC, to empower the emergence of social norms within generative MASs. Our archite…

    Submitted 20 May, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: Published as a conference paper at IJCAI 2024

  37. arXiv:2403.02221  [pdf, other]

    cs.LG

    TPLLM: A Traffic Prediction Framework Based on Pretrained Large Language Models

    Authors: Yilong Ren, Yue Chen, Shuai Liu, Boyue Wang, Haiyang Yu, Zhiyong Cui

    Abstract: Traffic prediction constitutes a pivotal facet within the purview of Intelligent Transportation Systems (ITS), and the attainment of highly precise predictions holds profound significance for efficacious traffic management. The precision of prevailing deep learning-driven traffic prediction models typically sees an upward trend with a rise in the volume of training data. However, the procurement o…

    Submitted 18 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  38. arXiv:2402.19251  [pdf, other]

    cs.AI cs.RO

    A Cognitive-Based Trajectory Prediction Approach for Autonomous Driving

    Authors: Haicheng Liao, Yongkang Li, Zhenning Li, Chengyue Wang, Zhiyong Cui, Shengbo Eben Li, Chengzhong Xu

    Abstract: In autonomous vehicle (AV) technology, the ability to accurately predict the movements of surrounding vehicles is paramount for ensuring safety and operational efficiency. Incorporating human decision-making insights enables AVs to more effectively anticipate the potential actions of other vehicles, significantly improving prediction accuracy and responsiveness in dynamic environments. This paper…

    Submitted 29 February, 2024; originally announced February 2024.

  39. arXiv:2402.17487  [pdf, other]

    cs.CV cs.LG eess.IV

    Bit Rate Matching Algorithm Optimization in JPEG-AI Verification Model

    Authors: Panqi Jia, A. Burakhan Koyuncu, Jue Mao, Ze Cui, Yi Ma, Tiansheng Guo, Timofey Solovyev, Alexander Karabutov, Yin Zhao, Jing Wang, Elena Alshina, Andre Kaup

    Abstract: The research on neural network (NN) based image compression has shown superior performance compared to classical compression frameworks. Unlike the hand-engineered transforms in the classical frameworks, NN-based models learn the non-linear transforms providing more compact bit representations, and achieve faster coding speed on parallel devices over their classical counterparts. Those properties…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted at (IEEE) PCS 2024; 6 pages

  40. arXiv:2402.16718  [pdf]

    physics.med-ph cs.AI

    An Overview of the Development of Stereotactic Body Radiation Therapy

    Authors: Yanqi Zong, Zhengrong Cui, Luqi Lin, Sihao Wang, Yizhi Chen

    Abstract: Stereotactic body radiation therapy (SBRT) refers to focusing high-energy rays in three-dimensional space on the tumor lesion area, reducing the dose received by surrounding normal tissues, which can effectively improve the local control rate of the tumor and reduce the probability of complications. With the comprehensive development of medical imaging, radiation biology and other disciplines, thi…

    Submitted 26 February, 2024; originally announced February 2024.

  41. arXiv:2402.12746  [pdf, ps, other]

    eess.AS cs.SD

    Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network

    Authors: Yanan Chen, Zihao Cui, Yingying Gao, Junlan Feng, Chao Deng, Shilei Zhang

    Abstract: The expectation to deploy a universal neural network for speech enhancement, with the aim of improving noise robustness across diverse speech processing tasks, faces challenges due to the existing lack of awareness within static speech enhancement frameworks regarding the expected speech in downstream modules. These limitations impede the effectiveness of static speech enhancement approaches in ac…

    Submitted 20 February, 2024; originally announced February 2024.

  42. arXiv:2401.10786  [pdf, other]

    cs.CV

    Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion

    Authors: Zuoyue Li, Zhenqiang Li, Zhaopeng Cui, Marc Pollefeys, Martin R. Oswald

    Abstract: Directly generating scenes from satellite imagery offers exciting possibilities for integration into applications like games and map services. However, challenges arise from significant view changes and scene scale. Previous efforts mainly focused on image or video generation, lacking exploration into the adaptability of scene generation for arbitrary views. Existing 3D generation works either ope…

    Submitted 1 April, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

    Journal ref: CVPR 2024

  43. arXiv:2401.06557  [pdf, other]

    cs.LG cs.AI cs.SI stat.ME

    Treatment-Aware Hyperbolic Representation Learning for Causal Effect Estimation with Social Networks

    Authors: Ziqiang Cui, Xing Tang, Yang Qiao, Bowei He, Liang Chen, Xiuqiang He, Chen Ma

    Abstract: Estimating the individual treatment effect (ITE) from observational data is a crucial research topic that holds significant value across multiple domains. How to identify hidden confounders poses a key challenge in ITE estimation. Recent studies have incorporated the structural information of social networks to tackle this challenge, achieving notable advancements. However, these methods utilize g…

    Submitted 12 January, 2024; originally announced January 2024.

    Comments: Accepted by SIAM SDM'24

  44. arXiv:2401.01491   

    cs.CE

    A Hybrid Neural Network Model For Predicting The Nitrate Concentration In The Recirculating Aquaculture System

    Authors: Xiangyu Fan, Jiaxin Lia, Yingzhe Wang, Yingsha Qu, Hao Li, Keming Qu, Zhengguo Cui

    Abstract: This study was groundbreaking in its application of neural network models for nitrate management in the Recirculating Aquaculture System (RAS). A hybrid neural network model was proposed, which accurately predicted daily nitrate concentration and its trends using six water quality parameters. We conducted a 105-day aquaculture experiment, during which we collected 450 samples from five sets of RAS…

    Submitted 15 January, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

    Comments: The content of this paper needs to be further filled and improved

  45. arXiv:2401.00781  [pdf]

    cs.LG stat.ML

    Inferring Heterogeneous Treatment Effects of Crashes on Highway Traffic: A Doubly Robust Causal Machine Learning Approach

    Authors: Shuang Li, Ziyuan Pu, Zhiyong Cui, Seunghyeon Lee, Xiucheng Guo, Dong Ngoduy

    Abstract: Highway traffic crashes exert a considerable impact on both transportation systems and the economy. In this context, accurate and dependable emergency responses are crucial for effective traffic management. However, the influence of crashes on traffic status varies across diverse factors and may be biased due to selection bias. Therefore, there arises a necessity to accurately estimate the heterog…

    Submitted 1 January, 2024; originally announced January 2024.

    Comments: 38 pages, 13 figures, 8 tables

  46. arXiv:2312.17050  [pdf, other]

    cs.CV

    KeDuSR: Real-World Dual-Lens Super-Resolution via Kernel-Free Matching

    Authors: Huanjing Yue, Zifan Cui, Kun Li, Jingyu Yang

    Abstract: Dual-lens super-resolution (SR) is a practical scenario for reference (Ref) based SR by utilizing the telephoto image (Ref) to assist the super-resolution of the low-resolution wide-angle image (LR input). Different from general RefSR, the Ref in dual-lens SR only covers the overlapped field of view (FoV) area. However, current dual-lens SR methods rarely utilize these specific characteristics and…

    Submitted 2 January, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: 14 pages, 10 figures. Accepted by AAAI-2024

  47. arXiv:2312.13156  [pdf, other]

    cs.CE cs.AI

    AccidentGPT: Accident Analysis and Prevention from V2X Environmental Perception with Multi-modal Large Model

    Authors: Lening Wang, Yilong Ren, Han Jiang, Pinlong Cai, Daocheng Fu, Tianqi Wang, Zhiyong Cui, Haiyang Yu, Xuesong Wang, Hanchu Zhou, Helai Huang, Yinhai Wang

    Abstract: Traffic accidents, being a significant contributor to both human casualties and property damage, have long been a focal point of research for many scholars in the field of traffic safety. However, previous studies, whether focusing on static environmental assessments or dynamic driving analyses, as well as pre-accident predictions or post-accident rule analyses, have typically been conducted in is…

    Submitted 28 December, 2023; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: 21 pages, 19 figures

  48. arXiv:2312.10649  [pdf, other]

    cs.CV

    PNeRFLoc: Visual Localization with Point-based Neural Radiance Fields

    Authors: Boming Zhao, Luwei Yang, Mao Mao, Hujun Bao, Zhaopeng Cui

    Abstract: Due to the ability to synthesize high-quality novel views, Neural Radiance Fields (NeRF) have been recently exploited to improve visual localization in a known environment. However, the existing methods mostly utilize NeRFs for data augmentation to improve the regression model training, and the performance on novel viewpoints and appearances is still limited due to the lack of geometric constraint…

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024

  49. arXiv:2312.09093  [pdf, other]

    cs.CV

    Aleth-NeRF: Illumination Adaptive NeRF with Concealing Field Assumption

    Authors: Ziteng Cui, Lin Gu, Xiao Sun, Xianzheng Ma, Yu Qiao, Tatsuya Harada

    Abstract: The standard Neural Radiance Fields (NeRF) paradigm employs a viewer-centered methodology, entangling the aspects of illumination and material reflectance into emission solely from 3D points. This simplified rendering approach presents challenges in accurately modeling images captured under adverse lighting conditions, such as low light or over-exposure. Motivated by the ancient Greek emission the…

    Submitted 24 January, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: AAAI 2024; code available at https://cuiziteng.github.io/Aleth_NeRF_web/ (modified version of previous paper arXiv:2303.05807)

  50. arXiv:2312.07353  [pdf, other]

    cs.CV

    CLIP in Medical Imaging: A Comprehensive Survey

    Authors: Zihao Zhao, Yuxiao Liu, Han Wu, Yonghao Li, Sheng Wang, Lin Teng, Disheng Liu, Zhiming Cui, Qian Wang, Dinggang Shen

    Abstract: Contrastive Language-Image Pre-training (CLIP), a simple yet effective pre-training paradigm, successfully introduces text supervision to vision models. It has shown promising results across various tasks, attributable to its generalizability and interpretability. The use of CLIP has recently gained increasing interest in the medical imaging domain, serving both as a pre-training paradigm for alig…

    Submitted 21 May, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Project page available at https://github.com/zhaozh10/Awesome-CLIP-in-Medical-Imaging