
Showing 1–50 of 637 results for author: Jin, Y

Searching in archive cs.
  1. arXiv:2407.19988  [pdf, other]

    cs.MM

    HeadsetOff: Enabling Photorealistic Video Conferencing on Economical VR Headsets

    Authors: Yili Jin, Xize Duan, Fangxin Wang, Xue Liu

    Abstract: Virtual Reality (VR) headsets have become increasingly popular for remote collaboration, but video conferencing poses challenges when the user's face is covered by the headset. Existing solutions have limitations in terms of accessibility. In this paper, we propose HeadsetOff, a novel system that achieves photorealistic video conferencing on economical VR headsets by leveraging voice-driven face r…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: Accepted by ACM Multimedia 2024

  2. arXiv:2407.19696  [pdf, other]

    cs.CV

    Cross-Layer Feature Pyramid Transformer for Small Object Detection in Aerial Images

    Authors: Zewen Du, Zhenjiang Hu, Guiyu Zhao, Ying Jin, Hongbin Ma

    Abstract: Object detection in aerial images has always been a challenging task due to the generally small size of the objects. Most current detectors prioritize novel detection frameworks, often overlooking research on fundamental components such as feature pyramid networks. In this paper, we introduce the Cross-Layer Feature Pyramid Transformer (CFPT), a novel upsampler-free feature pyramid network designe…

    Submitted 29 July, 2024; originally announced July 2024.

  3. arXiv:2407.16957  [pdf, other]

    cs.CV

    Raindrop Clarity: A Dual-Focused Dataset for Day and Night Raindrop Removal

    Authors: Yeying Jin, Xin Li, Jiadong Wang, Yan Zhang, Malu Zhang

    Abstract: Existing raindrop removal datasets have two shortcomings. First, they consist of images captured by cameras with a focus on the background, leading to the presence of blurry raindrops. To our knowledge, none of these datasets include images where the focus is specifically on raindrops, which results in a blurry background. Second, these datasets predominantly consist of daytime images, thereby lac…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024; dataset and benchmark at: https://github.com/jinyeying/RaindropClarity

  4. arXiv:2407.16508  [pdf, other]

    cs.CV

    ToDER: Towards Colonoscopy Depth Estimation and Reconstruction with Geometry Constraint Adaptation

    Authors: Zhenhua Wu, Yanlin Jin, Liangdong Qiu, Xiaoguang Han, Xiang Wan, Guanbin Li

    Abstract: Visualizing colonoscopy is crucial for medical auxiliary diagnosis to prevent undetected polyps in areas that are not fully observed. Traditional feature-based and depth-based reconstruction approaches usually end up with undesirable results due to incorrect point matching or imprecise depth estimation in realistic colonoscopy videos. Modern deep-based methods often require a sufficient number of…

    Submitted 23 July, 2024; originally announced July 2024.

  5. arXiv:2407.15141  [pdf, other]

    cs.AI cs.LG physics.chem-ph

    Text-Augmented Multimodal LLMs for Chemical Reaction Condition Recommendation

    Authors: Yu Zhang, Ruijie Yu, Kaipeng Zeng, Ding Li, Feng Zhu, Xiaokang Yang, Yaohui Jin, Yanyan Xu

    Abstract: High-throughput reaction condition (RC) screening is fundamental to chemical synthesis. However, current RC screening suffers from laborious and costly trial-and-error workflows. Traditional computer-aided synthesis planning (CASP) tools fail to find suitable RCs due to data sparsity and inadequate reaction representations. Nowadays, large language models (LLMs) are capable of tackling chemistry-r…

    Submitted 21 July, 2024; originally announced July 2024.

  6. arXiv:2407.13122  [pdf, other]

    cs.LG cs.AI

    MO-EMT-NAS: Multi-Objective Continuous Transfer of Architectural Knowledge Between Tasks from Different Datasets

    Authors: Peng Liao, XiLu Wang, Yaochu Jin, WenLi Du

    Abstract: Deploying models across diverse devices demands tradeoffs among multiple objectives due to different resource constraints. Arguably, due to the small model trap problem in multi-objective neural architecture search (MO-NAS) based on a supernet, existing approaches may fail to maintain large models. Moreover, multi-tasking neural architecture search (MT-NAS) excels in handling multiple tasks simult…

    Submitted 17 July, 2024; originally announced July 2024.

  7. arXiv:2407.13108  [pdf, other]

    cs.CV

    UCIP: A Universal Framework for Compressed Image Super-Resolution using Dynamic Prompt

    Authors: Xin Li, Bingchen Li, Yeying Jin, Cuiling Lan, Hanxin Zhu, Yulin Ren, Zhibo Chen

    Abstract: Compressed Image Super-resolution (CSR) aims to simultaneously super-resolve the compressed images and tackle the challenging hybrid distortions caused by compression. However, existing works on CSR usually focus on a single compression codec, i.e., JPEG, ignoring the diverse traditional or learning-based codecs in practical applications, e.g., HEVC, VVC, HIFIC, etc. In this work, we propose…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  8. arXiv:2407.13092  [pdf, other]

    eess.IV cs.CV

    CC-DCNet: Dynamic Convolutional Neural Network with Contrastive Constraints for Identifying Lung Cancer Subtypes on Multi-modality Images

    Authors: Yuan Jin, Gege Ma, Geng Chen, Tianling Lyu, Jan Egger, Junhui Lyu, Shaoting Zhang, Wentao Zhu

    Abstract: The accurate diagnosis of pathological subtypes of lung cancer is of paramount importance for follow-up treatment and prognosis management. Assessment methods utilizing deep learning technologies have introduced novel approaches for clinical diagnosis. However, the majority of existing models rely solely on single-modality image input, leading to limited diagnostic accuracy. To this end, we prop…

    Submitted 17 July, 2024; originally announced July 2024.

  9. arXiv:2407.12443  [pdf, other]

    cs.LG cs.CV

    Preventing Catastrophic Overfitting in Fast Adversarial Training: A Bi-level Optimization Perspective

    Authors: Zhaoxin Wang, Handing Wang, Cong Tian, Yaochu Jin

    Abstract: Adversarial training (AT) has become an effective defense method against adversarial examples (AEs) and it is typically framed as a bi-level optimization problem. Among various AT methods, fast AT (FAT), which employs a single-step attack strategy to guide the training process, can achieve good robustness against adversarial attacks at a low cost. However, FAT methods suffer from the catastrophic…

    Submitted 17 July, 2024; originally announced July 2024.

  10. arXiv:2407.11044  [pdf, other]

    cs.LG cs.AI

    Generalizing soft actor-critic algorithms to discrete action spaces

    Authors: Le Zhang, Yong Gu, Xin Zhao, Yanshuo Zhang, Shu Zhao, Yifei Jin, Xinxin Wu

    Abstract: ATARI is a suite of video games used by reinforcement learning (RL) researchers to test the effectiveness of learning algorithms. Receiving only the raw pixels and the game score, the agent learns to develop sophisticated strategies, even reaching a level comparable to that of a professional human games tester. Ideally, we also want an agent requiring very few interactions with the environment. Previous co…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Chinese Conference on Pattern Recognition and Computer Vision (PRCV) 2024. GitHub repo: https://github.com/lezhang-thu/bigger-better-faster-SAC

  11. arXiv:2407.08125  [pdf, ps, other]

    cs.LG

    Real-Time Summarization of Twitter

    Authors: Yixin Jin, Meiqi Wang, Meng Li, Wenjing Zhou, Yi Shen, Hao Liu

    Abstract: In this paper, we describe our approaches to TREC Real-Time Summarization of Twitter. We focus on the real-time push notification scenario, which requires a system to monitor the stream of sampled tweets and return the tweets that are relevant and novel to given interest profiles. Dirichlet score with and with very little smoothing (baseline) are employed to classify whether a tweet is relevant to a given inter…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: This paper was accepted to the International Conference on Artificial Intelligence and Electromechanical Automation 2024

  12. arXiv:2407.06394  [pdf, other]

    cs.RO cs.MA

    Modeling and Analysis of Multi-Line Orders in Multi-Tote Storage and Retrieval Autonomous Mobile Robot Systems

    Authors: Xiaotao Shan, Yichao Jin, Peizheng Li, Koichi Kondo

    Abstract: As warehouses are emphasizing space utilization and the ability to handle multi-line orders, multi-tote storage and retrieval (MTSR) autonomous mobile robot systems, where robots directly retrieve totes from high shelves, are becoming increasingly popular. This paper presents a novel shared-token, multi-class, semi-open queueing network model to account for multi-line orders with general distribut…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 8 pages, 5 figures. This paper has been accepted for publication in IEEE 20th International Conference on Automation Science and Engineering (IEEE CASE 2024)

  13. arXiv:2407.03760  [pdf, other]

    q-fin.CP cs.LG

    GraphCNNpred: A stock market indices prediction using a Graph based deep learning system

    Authors: Yuhui Jin

    Abstract: The application of deep learning techniques for predicting stock market prices is a prominent and widely researched topic in the field of data science. To effectively predict market trends, it is essential to utilize a diversified dataset. In this paper, we present a graph-neural-network-based convolutional neural network (CNN) model that can be applied to diverse sources of data, in the attempt to e…

    Submitted 17 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: 10 pages. Version 2

    MSC Class: 68Txx

  14. arXiv:2406.19217  [pdf, other]

    cs.CV cs.AI cs.RO

    Think Step by Step: Chain-of-Gesture Prompting for Error Detection in Robotic Surgical Videos

    Authors: Zhimin Shao, Jialang Xu, Danail Stoyanov, Evangelos B. Mazomenos, Yueming Jin

    Abstract: Despite significant advancements in robotic systems and surgical data science, ensuring safe and optimal execution in robot-assisted minimally invasive surgery (RMIS) remains a complex challenge. Current surgical error detection methods involve two parts: identifying surgical gestures and then detecting errors within each gesture clip. These methods seldom consider the rich contextual and semantic…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 8 pages, 4 figures

  15. arXiv:2406.18311  [pdf, other]

    cs.LG

    Online Learning of Multiple Tasks and Their Relationships: Testing on Spam Email Data and EEG Signals Recorded in Construction Fields

    Authors: Yixin Jin, Wenjing Zhou, Meiqi Wang, Meng Li, Xintao Li, Tianyu Hu

    Abstract: This paper examines an online multi-task learning (OMTL) method, which processes data sequentially to predict labels across related tasks. The framework learns task weights and their relatedness concurrently. Unlike previous models that assumed static task relatedness, our approach treats tasks as initially independent, updating their relatedness iteratively using newly calculated weight vectors.…

    Submitted 29 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

  16. arXiv:2406.17963  [pdf, other]

    cs.LG cs.HC cs.SI

    Empowering Interdisciplinary Insights with Dynamic Graph Embedding Trajectories

    Authors: Yiqiao Jin, Andrew Zhao, Yeon-Chang Lee, Meng Ye, Ajay Divakaran, Srijan Kumar

    Abstract: We developed DyGETViz, a novel framework for effectively visualizing dynamic graphs (DGs) that are ubiquitous across diverse real-world systems. This framework leverages recent advancements in discrete-time dynamic graph (DTDG) models to adeptly handle the temporal dynamics inherent in dynamic graphs. DyGETViz effectively captures both micro- and macro-level structural shifts within these graphs,…

    Submitted 28 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

    Comments: 27 pages, 11 figures

  17. arXiv:2406.15031  [pdf, other]

    cs.IT

    New Upper Bounds for Noisy Permutation Channels

    Authors: Lugaoze Feng, Baoji Wang, Guocheng Lv, Xvnan Li, Luhua Wang, Ye Jin

    Abstract: The noisy permutation channel is a useful abstraction introduced by Makur for point-to-point communication networks and biological storage. While asymptotic capacity results exist for this model, a characterization of the second-order asymptotics is not available. Therefore, we analyze the converse bounds for the noisy permutation channel in the finite blocklength regime. To do this, we pres…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 24 pages, submitted to IEEE Transactions on Communications

  18. arXiv:2406.14903  [pdf, other]

    cs.AI

    GIEBench: Towards Holistic Evaluation of Group Identity-based Empathy for Large Language Models

    Authors: Leyan Wang, Yonggang Jin, Tianhao Shen, Tianyu Zheng, Xinrun Du, Chenchen Zhang, Wenhao Huang, Jiaheng Liu, Shi Wang, Ge Zhang, Liuyu Xiang, Zhaofeng He

    Abstract: As large language models (LLMs) continue to develop and gain widespread application, the ability of LLMs to exhibit empathy towards diverse group identities and understand their perspectives is increasingly recognized as critical. Most existing benchmarks for empathy evaluation of LLMs focus primarily on universal human emotions, such as sadness and pain, often overlooking the context of individua…

    Submitted 24 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

  19. arXiv:2406.14865  [pdf, other]

    cs.NE

    Multi-Domain Evolutionary Optimization of Network Structures

    Authors: Jie Zhao, Kang Hao Cheong, Yaochu Jin

    Abstract: Multi-Task Evolutionary Optimization (MTEO), an important field focusing on addressing complex problems through optimizing multiple tasks simultaneously, has attracted much attention. While MTEO has been primarily focusing on task similarity, there remains a hugely untapped potential in harnessing the shared characteristics between different domains to enhance evolutionary optimization. For exampl…

    Submitted 21 June, 2024; originally announced June 2024.

  20. arXiv:2406.14534  [pdf, other]

    eess.IV cs.CV

    Epicardium Prompt-guided Real-time Cardiac Ultrasound Frame-to-volume Registration

    Authors: Long Lei, Jun Zhou, Jialun Pei, Baoliang Zhao, Yueming Jin, Yuen-Chun Jeremy Teoh, Jing Qin, Pheng-Ann Heng

    Abstract: A comprehensive guidance view for cardiac interventional surgery can be provided by the real-time fusion of the intraoperative 2D images and preoperative 3D volume based on the ultrasound frame-to-volume registration. However, cardiac ultrasound images are characterized by a low signal-to-noise ratio and small differences between adjacent frames, coupled with significant dimension variations betwe…

    Submitted 27 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted by MICCAI 2024

  21. arXiv:2406.13125  [pdf, other]

    cs.AI

    A Unified Framework for Combinatorial Optimization Based on Graph Neural Networks

    Authors: Yaochu Jin, Xueming Yan, Shiqing Liu, Xiangyu Wang

    Abstract: Graph neural networks (GNNs) have emerged as a powerful tool for solving combinatorial optimization problems (COPs), exhibiting state-of-the-art performance in both graph-structured and non-graph-structured domains. However, existing approaches lack a unified framework capable of addressing a wide range of COPs. After presenting a summary of representative COPs and a brief review of recent advance…

    Submitted 18 June, 2024; originally announced June 2024.

  22. arXiv:2406.12708  [pdf, other]

    cs.CL

    AgentReview: Exploring Peer Review Dynamics with LLM Agents

    Authors: Yiqiao Jin, Qinlin Zhao, Yiyang Wang, Hao Chen, Kaijie Zhu, Yijia Xiao, Jindong Wang

    Abstract: Peer review is fundamental to the integrity and advancement of scientific publication. Traditional methods of peer review analyses often rely on exploration and statistics of existing peer review data, which do not adequately address the multivariate nature of the process, account for the latent variables, and are further constrained by privacy concerns due to the sensitive nature of the data. We…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 22 pages, 10 figures

  23. arXiv:2406.12195  [pdf, other]

    quant-ph cs.LG

    Quantum Compiling with Reinforcement Learning on a Superconducting Processor

    Authors: Z. T. Wang, Qiuhao Chen, Yuxuan Du, Z. H. Yang, Xiaoxia Cai, Kaixuan Huang, Jingning Zhang, Kai Xu, Jun Du, Yinan Li, Yuling Jiao, Xingyao Wu, Wu Liu, Xiliang Lu, Huikai Xu, Yirong Jin, Ruixia Wang, Haifeng Yu, S. P. Zhao

    Abstract: Effectively implementing quantum algorithms on noisy intermediate-scale quantum (NISQ) processors is a central task in modern quantum technology. NISQ processors feature tens to a few hundred noisy qubits with limited coherence times and gate operations with errors, so NISQ algorithms naturally require employing circuits of short lengths via quantum compilation. Here, we develop a reinforcemen…

    Submitted 17 June, 2024; originally announced June 2024.

  24. arXiv:2406.11519  [pdf, other]

    cs.CV eess.IV

    HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

    Authors: Di Wang, Meiqi Hu, Yao Jin, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, Chuan Fu, Hongruixuan Chen, Chengxi Han, Naoto Yokoya, Jing Zhang, Minqiang Xu, Lin Liu, Lefei Zhang, Chen Wu, Bo Du, Dacheng Tao, Liangpei Zhang

    Abstract: Foundation models (FMs) are revolutionizing the analysis and understanding of remote sensing (RS) scenes, including aerial RGB, multispectral, and SAR images. However, hyperspectral images (HSIs), which are rich in spectral information, have not seen much application of FMs, with existing methods often restricted to specific tasks and lacking generality. To fill this gap, we introduce HyperSIGMA,…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: The code and models will be released at https://github.com/WHU-Sigma/HyperSIGMA

  25. arXiv:2406.10797  [pdf, other]

    cs.CV

    STAR: Scale-wise Text-to-image generation via Auto-Regressive representations

    Authors: Xiaoxiao Ma, Mohan Zhou, Tao Liang, Yalong Bai, Tiejun Zhao, Huaian Chen, Yi Jin

    Abstract: We present STAR, a text-to-image model that employs a scale-wise auto-regressive paradigm. Unlike VAR, which is limited to class-conditioned synthesis within a fixed set of predetermined categories, our STAR enables text-driven open-set generation through three key designs: To boost diversity and generalizability with unseen combinations of objects and concepts, we introduce a pre-trained text encod…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 12 pages, 6 figures

  26. arXiv:2406.10261  [pdf, other]

    cs.CL cs.AI

    FoodSky: A Food-oriented Large Language Model that Passes the Chef and Dietetic Examination

    Authors: Pengfei Zhou, Weiqing Min, Chaoran Fu, Ying Jin, Mingyu Huang, Xiangyang Li, Shuhuan Mei, Shuqiang Jiang

    Abstract: Food is foundational to human life, serving not only as a source of nourishment but also as a cornerstone of cultural identity and social interaction. As the complexity of global dietary needs and preferences grows, food intelligence is needed to enable food perception and reasoning for various tasks, ranging from recipe generation and dietary recommendation to diet-disease correlation discovery a…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 32 pages, 19 figures

  27. arXiv:2406.09682  [pdf, other]

    cs.CR

    Privacy-preserving Quantification of Non-IID Degree in Federated Learning

    Authors: Yuping Yan, Yizhi Wang, Yingchao Yu, Yaochu Jin

    Abstract: Federated learning (FL) offers a privacy-preserving approach to machine learning for multiple collaborators without sharing raw data. However, the existence of non-independent and non-identically distributed (non-IID) datasets across different clients presents a significant challenge to FL, leading to a sharp drop in accuracy, reduced efficiency, and hindered implementation. To address the non-IID…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 8 pages, 8 figures, FL@FM-IJCAI'24

  28. arXiv:2406.09680  [pdf, other]

    cs.LG cs.DC

    Heterogeneous Federated Learning with Convolutional and Spiking Neural Networks

    Authors: Yingchao Yu, Yuping Yan, Jisong Cai, Yaochu Jin

    Abstract: Federated learning (FL) has emerged as a promising paradigm for training models on decentralized data while safeguarding data privacy. Most existing FL systems, however, assume that all machine learning models are of the same type, although it becomes more likely that different edge devices adopt different types of AI models, including both conventional analogue artificial neural networks (ANNs) a…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 8 pages, 5 figures, FL@FM-IJCAI'24

  29. arXiv:2406.09621  [pdf, other]

    cs.IR

    Enhancing Knowledge Retrieval with In-Context Learning and Semantic Search through Generative AI

    Authors: Mohammed-Khalil Ghali, Abdelrahman Farrag, Daehan Won, Yu Jin

    Abstract: Retrieving and extracting knowledge from extensive research documents and large databases presents significant challenges for researchers, students, and professionals in today's information-rich era. Existing retrieval systems, which rely on general-purpose Large Language Models (LLMs), often fail to provide accurate responses to domain-specific inquiries. Additionally, the high cost of pretrainin…

    Submitted 13 June, 2024; originally announced June 2024.

  30. arXiv:2406.08709  [pdf, other]

    cs.LG stat.ME

    Introducing Diminutive Causal Structure into Graph Representation Learning

    Authors: Hang Gao, Peng Qiao, Yifan Jin, Fengge Wu, Jiangmeng Li, Changwen Zheng

    Abstract: When engaging in end-to-end graph representation learning with Graph Neural Networks (GNNs), the intricate causal relationships and rules inherent in graph data pose a formidable challenge for the model in accurately capturing authentic data relationships. A proposed mitigating strategy involves the direct integration of rules or relationships corresponding to the graph data into the model. Howeve…

    Submitted 12 June, 2024; originally announced June 2024.

  31. arXiv:2406.08524  [pdf, other]

    cs.LG cs.DC

    Federated Incomplete Multi-View Clustering with Heterogeneous Graph Neural Networks

    Authors: Xueming Yan, Ziqi Wang, Yaochu Jin

    Abstract: Federated multi-view clustering offers the potential to develop a global clustering model using data distributed across multiple devices. However, current methods face challenges due to the absence of label information and the paramount importance of data privacy. A significant issue is the feature heterogeneity across multi-view data, which complicates the effective mining of complementary cluste…

    Submitted 12 June, 2024; originally announced June 2024.

  32. arXiv:2406.07521  [pdf, other]

    cs.DS cs.LG

    Faster Spectral Density Estimation and Sparsification in the Nuclear Norm

    Authors: Yujia Jin, Ishani Karmarkar, Christopher Musco, Aaron Sidford, Apoorv Vikram Singh

    Abstract: We consider the problem of estimating the spectral density of the normalized adjacency matrix of an $n$-node undirected graph. We provide a randomized algorithm that, with $O(nε^{-2})$ queries to a degree and neighbor oracle and in $O(nε^{-3})$ time, estimates the spectrum up to $ε$ accuracy in the Wasserstein-1 metric. This improves on previous state-of-the-art methods, including an $O(nε^{-7})$…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2024

  33. arXiv:2406.06606  [pdf, other]

    cs.CL cs.AI

    Prototypical Reward Network for Data-Efficient RLHF

    Authors: Jinghan Zhang, Xiting Wang, Yiqiao Jin, Changyu Chen, Xinhao Zhang, Kunpeng Liu

    Abstract: The reward model for Reinforcement Learning from Human Feedback (RLHF) has proven effective in fine-tuning Large Language Models (LLMs). Notably, collecting human feedback for RLHF can be resource-intensive and lead to scalability issues for LLMs and complex tasks. Our proposed framework Proto-RM leverages prototypical networks to enhance reward models under limited human feedback. By enabling sta…

    Submitted 7 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL 2024

  34. arXiv:2406.05392  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas

    Authors: Chengyuan Deng, Yiqun Duan, Xin Jin, Heng Chang, Yijun Tian, Han Liu, Henry Peng Zou, Yiqiao Jin, Yijia Xiao, Yichen Wang, Shenghao Wu, Zongxing Xie, Kuofeng Gao, Sihong He, Jun Zhuang, Lu Cheng, Haohan Wang

    Abstract: Large Language Models (LLMs) have achieved unparalleled success across diverse language modeling tasks in recent years. However, this progress has also intensified ethical concerns, impacting the deployment of LLMs in everyday contexts. This paper provides a comprehensive survey of ethical challenges associated with LLMs, from longstanding issues such as copyright infringement, systematic bias, an…

    Submitted 8 June, 2024; originally announced June 2024.

  35. arXiv:2406.05338  [pdf, other]

    cs.CV

    MotionClone: Training-Free Motion Cloning for Controllable Video Generation

    Authors: Pengyang Ling, Jiazi Bu, Pan Zhang, Xiaoyi Dong, Yuhang Zang, Tong Wu, Huaian Chen, Jiaqi Wang, Yi Jin

    Abstract: Motion-based controllable text-to-video generation involves motions to control the video generation. Previous methods typically require the training of models to encode motion cues or the fine-tuning of video diffusion models. However, these approaches often result in suboptimal motion generation when applied outside the trained domain. In this work, we propose MotionClone, a training-free framewo…

    Submitted 28 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: 17 pages, 12 figures, https://bujiazi.github.io/motionclone.github.io/

  36. arXiv:2406.05017  [pdf, other]

    cs.LG cs.AI

    Adaptively Learning to Select-Rank in Online Platforms

    Authors: Jingyuan Wang, Perry Dong, Ying Jin, Ruohan Zhan, Zhengyuan Zhou

    Abstract: Ranking algorithms are fundamental to various online platforms, from e-commerce sites to content streaming services. Our research addresses the challenge of adaptively ranking items from a candidate pool for heterogeneous users, a key component in personalizing user experience. We develop a user response model that considers diverse user preferences and the varying effects of item positions, aimi…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 25 pages in total. Includes 4 figures and a pdf. International conference on machine learning. PMLR, 2024

  37. arXiv:2406.04680  [pdf, other]

    eess.IV cs.CV

    MTS-Net: Dual-Enhanced Positional Multi-Head Self-Attention for 3D CT Diagnosis of May-Thurner Syndrome

    Authors: Yixin Huang, Yiqi Jin, Ke Tao, Kaijian Xia, Jianfeng Gu, Lei Yu, Lan Du, Cunjian Chen

    Abstract: May-Thurner Syndrome (MTS), also known as iliac vein compression syndrome or Cockett's syndrome, is a condition potentially impacting over 20 percent of the population, leading to an increased risk of iliofemoral deep venous thrombosis. In this paper, we present a 3D-based deep learning approach called MTS-Net for diagnosing May-Thurner Syndrome using CT scans. To effectively capture the spatial-t…

    Submitted 7 June, 2024; originally announced June 2024.

  38. arXiv:2406.04658  [pdf, other]

    cs.CR cs.AI cs.LG

    Advanced Payment Security System: XGBoost, LightGBM and SMOTE Integrated

    Authors: Qi Zheng, Chang Yu, Jin Cao, Yongshun Xu, Qianwen Xing, Yinxin Jin

    Abstract: With the rise of various online and mobile payment systems, transaction fraud has become a significant threat to financial security. This study explores the application of advanced machine learning models, specifically based on XGBoost and LightGBM, for developing a more accurate and robust Payment Security Protection Model. To enhance data reliability, we meticulously processed the data sources a…

    Submitted 26 July, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: This paper is received by https://ieee-metacom.org

  39. arXiv:2406.04614  [pdf, ps, other]

    cs.CL cs.AI

    LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model

    Authors: Zhi Zhou, Jiang-Xin Shi, Peng-Xiao Song, Xiao-Wen Yang, Yi-Xuan Jin, Lan-Zhe Guo, Yu-Feng Li

    Abstract: Large language models (LLMs), including both proprietary and open-source models, have showcased remarkable capabilities in addressing a wide range of downstream tasks. Nonetheless, when it comes to practical Chinese legal tasks, these models fail to meet the actual requirements. Proprietary models do not ensure data privacy for sensitive legal cases, while open-source models demonstrate unsatisfac…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Technical Report

  40. arXiv:2406.04533  [pdf, other]

    cs.AI cs.LG

    Rare Class Prediction Model for Smart Industry in Semiconductor Manufacturing

    Authors: Abdelrahman Farrag, Mohammed-Khalil Ghali, Yu Jin

    Abstract: The evolution of industry has enabled the integration of physical and digital systems, facilitating the collection of extensive data on manufacturing processes. This integration provides a reliable solution for improving process quality and managing equipment health. However, data collected from real manufacturing processes often exhibit challenging properties, such as severe class imbalance, high…

    Submitted 6 June, 2024; originally announced June 2024.

  41. arXiv:2406.03733  [pdf, other]

    cs.LG cs.AI

    Credit Card Fraud Detection Using Advanced Transformer Model

    Authors: Chang Yu, Yongshun Xu, Jin Cao, Ye Zhang, Yinxin Jin, Mengran Zhu

    Abstract: With the proliferation of various online and mobile payment systems, credit card fraud has emerged as a significant threat to financial security. This study focuses on innovative applications of the latest Transformer models for more robust and precise fraud detection. To ensure the reliability of the data, we meticulously processed the data sources, balancing the dataset to address the issue of d…

    Submitted 26 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: This paper has been received by https://ieee-metacom.org/

  42. arXiv:2406.02929  [pdf, other]

    cs.CV cs.LG

    Exploring Data Efficiency in Zero-Shot Learning with Diffusion Models

    Authors: Zihan Ye, Shreyank N. Gowda, Xiaobo Jin, Xiaowei Huang, Haotian Xu, Yaochu Jin, Kaizhu Huang

    Abstract: Zero-Shot Learning (ZSL) aims to enable classifiers to identify unseen classes by enhancing data efficiency at the class level. This is achieved by generating image features from pre-defined semantics of unseen classes. However, most current approaches heavily depend on the number of samples from seen classes, i.e. they do not consider instance-level effectiveness. In this paper, we demonstrate th…

    Submitted 5 June, 2024; originally announced June 2024.

  43. arXiv:2406.02040  [pdf, other]

    cs.LG cs.AI

    DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment

    Authors: Gongpei Zhao, Tao Wang, Congyan Lang, Yi Jin, Yidong Li, Haibin Ling

    Abstract: Graph neural networks are recognized for their strong performance across various applications, with the backpropagation algorithm playing a central role in the development of most GNN models. However, despite its effectiveness, BP has limitations that challenge its biological plausibility and affect the efficiency, scalability and parallelism of training neural networks for graph-based tasks. Whil…

    Submitted 4 June, 2024; originally announced June 2024.

  44. arXiv:2406.01987  [pdf, other]

    cs.CV

    Dealing with All-stage Missing Modality: Towards A Universal Model with Robust Reconstruction and Personalization

    Authors: Yunpeng Zhao, Cheng Chen, Qing You Pang, Quanzheng Li, Carol Tang, Beng-Ti Ang, Yueming Jin

    Abstract: Addressing missing modalities presents a critical challenge in multimodal learning. Current approaches focus on developing models that can handle modality-incomplete inputs during inference, assuming that the full set of modalities are available for all the data during training. This reliance on full-modality data for training limits the use of abundant modality-incomplete samples that are often e…

    Submitted 4 June, 2024; originally announced June 2024.

  45. arXiv:2406.00631  [pdf, other]

    cs.CV

    MGI: Multimodal Contrastive pre-training of Genomic and Medical Imaging

    Authors: Jiaying Zhou, Mingzhou Jiang, Junde Wu, Jiayuan Zhu, Ziyue Wang, Yueming Jin

    Abstract: Medicine is inherently a multimodal discipline. Medical images can reflect the pathological changes of cancer and tumors, while the expression of specific genes can influence their morphological characteristics. However, most deep learning models employed for these medical tasks are unimodal, making predictions using either image data or genomic data exclusively. In this paper, we propose a multim…

    Submitted 2 June, 2024; originally announced June 2024.

  46. arXiv:2405.20585  [pdf, other]

    cs.CL cs.AI

    GAMedX: Generative AI-based Medical Entity Data Extractor Using Large Language Models

    Authors: Mohammed-Khalil Ghali, Abdelrahman Farrag, Hajar Sakai, Hicham El Baz, Yu Jin, Sarah Lam

    Abstract: In the rapidly evolving field of healthcare and beyond, the integration of generative AI in Electronic Health Records (EHRs) represents a pivotal advancement, addressing a critical gap in current information extraction techniques. This paper introduces GAMedX, a Named Entity Recognition (NER) approach utilizing Large Language Models (LLMs) to efficiently extract entities from medical narratives an…

    Submitted 30 May, 2024; originally announced May 2024.

  47. arXiv:2405.17835  [pdf, other]

    cs.CV

    Deform3DGS: Flexible Deformation for Fast Surgical Scene Reconstruction with Gaussian Splatting

    Authors: Shuojue Yang, Qian Li, Daiyun Shen, Bingchen Gong, Qi Dou, Yueming Jin

    Abstract: Tissue deformation poses a key challenge for accurate surgical scene reconstruction. Despite yielding high reconstruction quality, existing methods suffer from slow rendering speeds and long training times, limiting their intraoperative applicability. Motivated by recent progress in 3D Gaussian Splatting, an emerging technology in real-time 3D rendering, this work presents a novel fast reconstruct…

    Submitted 30 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Early accepted at MICCAI 2024, 10 pages, 2 figures

  48. arXiv:2405.14677  [pdf, other]

    cs.CV cs.LG

    RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance

    Authors: Zhicheng Sun, Zhenhao Yang, Yang Jin, Haozhe Chi, Kun Xu, Kun Xu, Liwei Chen, Hao Jiang, Di Zhang, Yang Song, Kun Gai, Yadong Mu

    Abstract: Customizing diffusion models to generate identity-preserving images from user-provided reference images is an intriguing new problem. The prevalent approaches typically require training on extensive domain-specific images to achieve identity preservation, which lacks flexibility across different use cases. To address this issue, we exploit classifier guidance, a training-free technique that steers…

    Submitted 23 May, 2024; originally announced May 2024.

  49. arXiv:2405.14374  [pdf, other]

    stat.ML cs.AI cs.LG

    State-Constrained Offline Reinforcement Learning

    Authors: Charles A. Hepburn, Yue Jin, Giovanni Montana

    Abstract: Traditional offline reinforcement learning methods predominantly operate in a batch-constrained setting. This confines the algorithms to a specific state-action distribution present in the dataset, reducing the effects of distributional shift but restricting the algorithm greatly. In this paper, we alleviate this limitation by introducing a novel framework named state-constrained offline re…

    Submitted 23 May, 2024; originally announced May 2024.

  50. arXiv:2405.13560  [pdf, other]

    cs.HC cs.AI

    Navigating User Experience of ChatGPT-based Conversational Recommender Systems: The Effects of Prompt Guidance and Recommendation Domain

    Authors: Yizhe Zhang, Yucheng Jin, Li Chen, Ting Yang

    Abstract: Conversational recommender systems (CRS) enable users to articulate their preferences and provide feedback through natural language. With the advent of large language models (LLMs), the potential to enhance user engagement with CRS and augment the recommendation process with LLM-generated content has received increasing attention. However, the efficacy of LLM-powered CRS is contingent upon the use…

    Submitted 22 May, 2024; originally announced May 2024.