
Showing 1–50 of 313 results for author: Hu, L

Searching in archive cs. Search in all archives.
  1. arXiv:2407.19183  [pdf, other]

    cs.LG cs.AI cs.NE

    Graph Memory Learning: Imitating Lifelong Remembering and Forgetting of Brain Networks

    Authors: Jiaxing Miao, Liang Hu, Qi Zhang, Longbing Cao

    Abstract: Graph data in real-world scenarios undergo rapid and frequent changes, making it challenging for existing graph models to effectively handle the continuous influx of new data and accommodate data withdrawal requests. The approach to frequently retraining graph models is resource intensive and impractical. To address this pressing challenge, this paper introduces a new concept of graph memory learn…

    Submitted 27 July, 2024; originally announced July 2024.

  2. arXiv:2407.18148  [pdf, other]

    cs.DC cs.LG

    StraightLine: An End-to-End Resource-Aware Scheduler for Machine Learning Application Requests

    Authors: Cheng-Wei Ching, Boyuan Guan, Hailu Xu, Liting Hu

    Abstract: The life cycle of machine learning (ML) applications consists of two stages: model development and model deployment. However, traditional ML systems (e.g., training-specific or inference-specific systems) focus on one particular stage or phase of the life cycle of ML applications. These systems often aim at optimizing model training or accelerating model inference, and they frequently assume homog…

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: 6 pages, 8 figures, to appear in AIoTC'24

  3. arXiv:2407.14953  [pdf, other]

    cs.DB cs.DC

    AgileDART: An Agile and Scalable Edge Stream Processing Engine

    Authors: Liting Hu, Cheng-Wei Ching

    Abstract: Edge applications generate a large influx of sensor data at massive scales. Under many time-critical scenarios, these massive data streams must be processed in a very short time to derive actionable intelligence. However, traditional data processing systems (e.g., stream processing systems, cloud-based IoT data processing systems) are not well-suited for these edge applications. This is because th…

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: 18 pages, 18 figures

  4. arXiv:2407.13194  [pdf, other]

    cs.LG cs.AI

    Robust Multivariate Time Series Forecasting against Intra- and Inter-Series Transitional Shift

    Authors: Hui He, Qi Zhang, Kun Yi, Xiaojun Xue, Shoujin Wang, Liang Hu, Longbing Cao

    Abstract: The non-stationary nature of real-world Multivariate Time Series (MTS) data presents forecasting models with a formidable challenge of the time-variant distribution of time series, referred to as distribution shift. Existing studies on the distribution shift mostly adhere to adaptive normalization techniques for alleviating temporal mean and covariance shifts or time-variant modeling for capturing…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 19 pages, 11 figures

    MSC Class: 68Txx; ACM Class: I.2.6

  5. Towards Robust Recommendation via Decision Boundary-aware Graph Contrastive Learning

    Authors: Jiakai Tang, Sunhao Dai, Zexu Sun, Xu Chen, Jun Xu, Wenhui Yu, Lantao Hu, Peng Jiang, Han Li

    Abstract: In recent years, graph contrastive learning (GCL) has received increasing attention in recommender systems due to its effectiveness in reducing bias caused by data sparsity. However, most existing GCL models rely on heuristic approaches and usually assume entity independence when constructing contrastive views. We argue that these methods struggle to strike a balance between semantic invariance an…

    Submitted 21 July, 2024; v1 submitted 14 July, 2024; originally announced July 2024.

    Comments: KDD 2024

  6. arXiv:2407.09977  [pdf]

    physics.geo-ph cs.AI

    Mitigating Interpretation Bias in Rock Records with Large Language Models: Insights from Paleoenvironmental Analysis

    Authors: Luoqi Wang, Haipeng Li, Linshu Hu, Jiarui Cai, Zhenhong Du

    Abstract: The reconstruction of Earth's history faces significant challenges due to the nonunique interpretations often derived from rock records. The problem has long been recognized but there are no systematic solutions in practice. This study introduces an innovative approach that leverages Large Language Models (LLMs) along with retrieval augmented generation and real-time search capabilities to counter…

    Submitted 17 May, 2024; originally announced July 2024.

  7. arXiv:2407.03771  [pdf, other]

    cs.CV

    SpikeGS: Reconstruct 3D scene via fast-moving bio-inspired sensors

    Authors: Yijia Guo, Liwen Hu, Lei Ma, Tiejun Huang

    Abstract: 3D Gaussian Splatting (3DGS) demonstrates unparalleled superior performance in 3D scene reconstruction. However, 3DGS heavily relies on sharp images. Fulfilling this requirement can be challenging in real-world scenarios, especially when the camera moves fast, which severely limits the application of 3DGS. To address these challenges, we proposed Spike Gaussian Splatting (SpikeGS), the first fra…

    Submitted 4 July, 2024; originally announced July 2024.

  8. arXiv:2407.01599  [pdf, other]

    cs.CL cs.CR cs.CV cs.LG

    JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models

    Authors: Haibo Jin, Leyang Hu, Xinuo Li, Peiyan Zhang, Chonghan Chen, Jun Zhuang, Haohan Wang

    Abstract: The rapid evolution of artificial intelligence (AI) through developments in Large Language Models (LLMs) and Vision-Language Models (VLMs) has brought significant advancements across various technological domains. While these models enhance capabilities in natural language processing and visual interactive tasks, their growing adoption raises critical concerns regarding security and ethical alignm…

    Submitted 24 July, 2024; v1 submitted 25 June, 2024; originally announced July 2024.

    Comments: 45 pages

  9. arXiv:2406.19531  [pdf, other]

    stat.ML cs.LG

    Forward and Backward State Abstractions for Off-policy Evaluation

    Authors: Meiling Hao, Pingfan Su, Liyuan Hu, Zoltan Szabo, Qingyuan Zhao, Chengchun Shi

    Abstract: Off-policy evaluation (OPE) is crucial for evaluating a target policy's impact offline before its deployment. However, achieving accurate OPE in large state spaces remains challenging. This paper studies state abstractions, originally designed for policy learning, in the context of OPE. Our contributions are three-fold: (i) We define a set of irrelevance conditions central to learning state abstracti…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 42 pages, 5 figures

    ACM Class: G.3; I.2.6; G.1.2

  10. arXiv:2406.19215  [pdf, other]

    cs.CL

    SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation

    Authors: Zijun Yao, Weijian Qi, Liangming Pan, Shulin Cao, Linmei Hu, Weichuan Liu, Lei Hou, Juanzi Li

    Abstract: This paper introduces Self-aware Knowledge Retrieval (SeaKR), a novel adaptive RAG model that extracts self-aware uncertainty of LLMs from their internal states. SeaKR activates retrieval when the LLMs present high self-aware uncertainty for generation. To effectively integrate retrieved knowledge snippets, SeaKR re-ranks them based on LLM's self-aware uncertainty to preserve the snippet that redu…

    Submitted 27 June, 2024; originally announced June 2024.

  11. arXiv:2406.18992  [pdf, other]

    cs.CV cs.AI cs.LG

    Semi-supervised Concept Bottleneck Models

    Authors: Lijie Hu, Tianhao Huang, Huanyi Xie, Chenyang Ren, Zhengyu Hu, Lu Yu, Di Wang

    Abstract: Concept Bottleneck Models (CBMs) have garnered increasing attention due to their ability to provide concept-based explanations for black-box deep learning models while achieving high final prediction accuracy using human-like concepts. However, the training of current CBMs heavily relies on the accuracy and richness of annotated concepts in the dataset. These concept labels are typically provided…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 17 pages

  12. arXiv:2406.16968  [pdf, other]

    cs.LG cs.AI

    Multimodal Physiological Signals Representation Learning via Multiscale Contrasting for Depression Recognition

    Authors: Kai Shao, Rui Wang, Yixue Hao, Long Hu, Min Chen, Hans Arno Jacobsen

    Abstract: Depression recognition based on physiological signals such as functional near-infrared spectroscopy (fNIRS) and electroencephalogram (EEG) has made considerable progress. However, most existing studies ignore the complementarity and semantic consistency of multimodal physiological signals under the same stimulation task in complex spatio-temporal patterns. In this paper, we introduce a multimodal…

    Submitted 25 June, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

  13. arXiv:2406.15811  [pdf, other]

    cs.CV

    PointDreamer: Zero-shot 3D Textured Mesh Reconstruction from Colored Point Cloud by 2D Inpainting

    Authors: Qiao Yu, Xianzhi Li, Yuan Tang, Jinfeng Xu, Long Hu, Yixue Hao, Min Chen

    Abstract: Reconstructing textured meshes from colored point clouds is an important but challenging task in 3D graphics and vision. Most existing methods predict colors as implicit functions in 3D or UV space, suffering from blurry textures or the lack of generalization capability. Addressing this, we propose PointDreamer, a novel framework for textured mesh reconstruction from colored point cloud. It produc…

    Submitted 22 June, 2024; originally announced June 2024.

  14. arXiv:2406.14066  [pdf, other]

    cs.AI cs.PF

    Optimizing Speculative Decoding for Serving Large Language Models Using Goodput

    Authors: Xiaoxuan Liu, Cade Daniel, Langxiang Hu, Woosuk Kwon, Zhuohan Li, Xiangxi Mo, Alvin Cheung, Zhijie Deng, Ion Stoica, Hao Zhang

    Abstract: Reducing the inference latency of large language models (LLMs) is crucial, and speculative decoding (SD) stands out as one of the most effective techniques. Rather than letting the LLM generate all tokens directly, speculative decoding employs effective proxies to predict potential outputs, which are then verified by the LLM without compromising the generation quality. Yet, deploying SD in real on…

    Submitted 25 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  15. arXiv:2406.12255  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning

    Authors: Lijie Hu, Liang Liu, Shu Yang, Xin Chen, Hongru Xiao, Mengdi Li, Pan Zhou, Muhammad Asif Ali, Di Wang

    Abstract: Chain-of-Thought (CoT) holds a significant place in augmenting the reasoning performance for large language models (LLMs). While some studies focus on improving CoT accuracy through methods like retrieval enhancement, a rigorous explanation for why CoT achieves such success remains unclear. In this paper, we analyze CoT methods under two different settings by asking the following questions: (1…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 21 pages

  16. arXiv:2406.09742  [pdf, other]

    cs.IR

    IFA: Interaction Fidelity Attention for Entire Lifelong Behaviour Sequence Modeling

    Authors: Wenhui Yu, Chao Feng, Yanze Zhang, Lantao Hu, Peng Jiang, Han Li

    Abstract: The lifelong user behavior sequence provides abundant information of user preference and gains impressive improvement in the recommendation task, however increases computational consumption significantly. To meet the severe latency requirement in online service, a short sub-sequence is sampled based on similarity to the target item. Unfortunately, items not in the sub-sequence are abandoned, leadi…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 7 pages, 2 figures

  17. Modeling User Retention through Generative Flow Networks

    Authors: Ziru Liu, Shuchang Liu, Bin Yang, Zhenghai Xue, Qingpeng Cai, Xiangyu Zhao, Zijian Zhang, Lantao Hu, Han Li, Peng Jiang

    Abstract: Recommender systems aim to fulfill the user's daily demands. While most existing research focuses on maximizing the user's engagement with the system, it has recently been pointed out that how frequently the users come back for the service also reflects the quality and stability of recommendations. However, optimizing this user retention behavior is non-trivial and poses several challenges includi…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: KDD-ADS 2024

  18. arXiv:2406.01460  [pdf, other]

    cs.CV cs.AI

    MLIP: Efficient Multi-Perspective Language-Image Pretraining with Exhaustive Data Utilization

    Authors: Yu Zhang, Qi Zhang, Zixuan Gong, Yiwei Shi, Yepeng Liu, Duoqian Miao, Yang Liu, Ke Liu, Kun Yi, Wei Fan, Liang Hu, Changwei Wang

    Abstract: Contrastive Language-Image Pretraining (CLIP) has achieved remarkable success, leading to rapid advancements in multimodal studies. However, CLIP faces a notable challenge in terms of inefficient data utilization. It relies on a single contrastive supervision for each image-text pair during representation learning, disregarding a substantial amount of valuable information that could offer richer s…

    Submitted 4 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  19. arXiv:2406.00696  [pdf, ps, other]

    cs.CV

    Bilinear-Convolutional Neural Network Using a Matrix Similarity-based Joint Loss Function for Skin Disease Classification

    Authors: Belal Ahmad, Mohd Usama, Tanvir Ahmad, Adnan Saeed, Shabnam Khatoon, Long Hu

    Abstract: In this study, we proposed a model for skin disease classification using a Bilinear Convolutional Neural Network (BCNN) with a Constrained Triplet Network (CTN). BCNN can capture rich spatial interactions between features in image data. This computes the outer product of feature vectors from two different CNNs by a bilinear pooling. The resulting features encode second-order statistics, enabling t…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 16 pages, 11 figures, 2 tables

  20. arXiv:2405.19708  [pdf, other]

    cs.CV cs.AI

    Text Guided Image Editing with Automatic Concept Locating and Forgetting

    Authors: Jia Li, Lijie Hu, Zhixian He, Jingfeng Zhang, Tianhang Zheng, Di Wang

    Abstract: With the advancement of image-to-image diffusion models guided by text, significant progress has been made in image editing. However, a persistent challenge remains in seamlessly incorporating objects into images based on textual instructions, without relying on extra user-provided guidance. Text and images are inherently distinct modalities, bringing out difficulties in fully capturing the semant…

    Submitted 30 May, 2024; originally announced May 2024.

  21. arXiv:2405.16790  [pdf, other]

    cs.CV

    SCSim: A Realistic Spike Cameras Simulator

    Authors: Liwen Hu, Lei Ma, Yijia Guo, Tiejun Huang

    Abstract: Spike cameras, with their exceptional temporal resolution, are revolutionizing high-speed visual applications. Large-scale synthetic datasets have significantly accelerated the development of these cameras, particularly in reconstruction and optical flow. However, current synthetic datasets for spike cameras lack sophistication. Addressing this gap, we introduce SCSim, a novel and more realistic s…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: Accepted by ICME2024. arXiv admin note: substantial text overlap with arXiv:2304.03129

  22. arXiv:2405.16204  [pdf, other]

    cs.CV cs.AI cs.GR

    VOODOO XP: Expressive One-Shot Head Reenactment for VR Telepresence

    Authors: Phong Tran, Egor Zakharov, Long-Nhat Ho, Liwen Hu, Adilbek Karmanov, Aviral Agarwal, McLean Goldwhite, Ariana Bermudez Venegas, Anh Tuan Tran, Hao Li

    Abstract: We introduce VOODOO XP: a 3D-aware one-shot head reenactment method that can generate highly expressive facial expressions from any input driver video and a single 2D portrait. Our solution is real-time, view-consistent, and can be instantly used without calibration or fine-tuning. We demonstrate our solution on a monocular video setting and an end-to-end VR telepresence system for two-way communi…

    Submitted 28 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  23. arXiv:2405.15476  [pdf, other]

    cs.LG cs.AI cs.CV

    Editable Concept Bottleneck Models

    Authors: Lijie Hu, Chenyang Ren, Zhengyu Hu, Cheng-Long Wang, Di Wang

    Abstract: Concept Bottleneck Models (CBMs) have garnered much attention for their ability to elucidate the prediction process through a human-understandable concept layer. However, most previous studies focused on cases where the data, including concepts, are clean. In many scenarios, we always need to remove/insert some training data or new concepts from trained CBMs due to different reasons, such as priva…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 33 pages

  24. arXiv:2405.15452  [pdf, other]

    cs.CL cs.AI cs.LG

    Leveraging Logical Rules in Knowledge Editing: A Cherry on the Top

    Authors: Keyuan Cheng, Muhammad Asif Ali, Shu Yang, Gang Lin, Yuxuan Zhai, Haoyang Fei, Ke Xu, Lu Yu, Lijie Hu, Di Wang

    Abstract: Multi-hop Question Answering (MQA) under knowledge editing (KE) is a key challenge in Large Language Models (LLMs). While best-performing solutions in this domain use a plan and solve paradigm to split a question into sub-questions followed by response generation, we claim that this approach is sub-optimal as it fails for hard to decompose questions, and it does not explicitly cater to correlated…

    Submitted 27 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 18 pages

  25. Modeling User Fatigue for Sequential Recommendation

    Authors: Nian Li, Xin Ban, Cheng Ling, Chen Gao, Lantao Hu, Peng Jiang, Kun Gai, Yong Li, Qingmin Liao

    Abstract: Recommender systems filter out information that meets user interests. However, users may be tired of the recommendations that are too similar to the content they have been exposed to in a short historical period, which is the so-called user fatigue. Despite the significance for a better user experience, user fatigue is seldom explored by existing recommenders. In fact, there are three main challen…

    Submitted 22 May, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

    Comments: SIGIR 2024

  26. arXiv:2405.09780  [pdf, other]

    cs.RO

    EFEAR-4D: Ego-Velocity Filtering for Efficient and Accurate 4D radar Odometry

    Authors: Xiaoyi Wu, Yushuai Chen, Zhan Li, Ziyang Hong, Liang Hu

    Abstract: Odometry is a crucial component for successfully implementing autonomous navigation, relying on sensors such as cameras, LiDARs and IMUs. However, these sensors may encounter challenges in extreme weather conditions, such as snowfall and fog. The emergence of FMCW radar technology offers the potential for robust perception in adverse conditions. As the latest generation of FMCW radars, the 4D mmWa…

    Submitted 15 May, 2024; originally announced May 2024.

  27. arXiv:2405.06646  [pdf, other]

    cs.GR cs.CV

    On-the-fly Learning to Transfer Motion Style with Diffusion Models: A Semantic Guidance Approach

    Authors: Lei Hu, Zihao Zhang, Yongjing Ye, Yiwen Xu, Shihong Xia

    Abstract: In recent years, the emergence of generative models has spurred development of human motion generation, among which the generation of stylized human motion has consistently been a focal point of research. The conventional approach for stylized human motion generation involves transferring the style from given style examples to new motions. Despite decades of research in human motion style transfer…

    Submitted 20 March, 2024; originally announced May 2024.

    Comments: 23 pages

    MSC Class: 68U05; ACM Class: I.3.0

  28. arXiv:2405.02644  [pdf, other]

    cs.LG

    Interpretable Multi-View Clustering

    Authors: Mudi Jiang, Lianyu Hu, Zengyou He, Zhikui Chen

    Abstract: Multi-view clustering has become a significant area of research, with numerous methods proposed over the past decades to enhance clustering accuracy. However, in many real-world applications, it is crucial to demonstrate a clear decision-making process, specifically explaining why samples are assigned to particular clusters. Consequently, there remains a notable gap in developing interpretable met…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 12 pages, 6 figures

    ACM Class: I.2.6

  29. arXiv:2405.01847  [pdf, other]

    cs.IR cs.AI

    A Model-based Multi-Agent Personalized Short-Video Recommender System

    Authors: Peilun Zhou, Xiaoxiao Xu, Lantao Hu, Han Li, Peng Jiang

    Abstract: Recommender selects and presents top-K items to the user at each online request, and a recommendation session consists of several sequential requests. Formulating a recommendation session as a Markov decision process and solving it by reinforcement learning (RL) framework has attracted increasing attention from both academic and industry communities. In this paper, we propose a RL-based industrial…

    Submitted 3 May, 2024; originally announced May 2024.

  30. arXiv:2405.01461  [pdf, other]

    cs.CV

    SATO: Stable Text-to-Motion Framework

    Authors: Wenshuo Chen, Hongru Xiao, Erhang Zhang, Lijie Hu, Lei Wang, Mengyuan Liu, Chen Chen

    Abstract: Is the Text to Motion model robust? Recent advancements in Text to Motion models primarily stem from more accurate predictions of specific actions. However, the text modality typically relies solely on pre-trained Contrastive Language-Image Pretraining (CLIP) models. Our research has uncovered a significant issue with the text-to-motion model: its predictions often exhibit inconsistent outputs, re…

    Submitted 3 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  31. arXiv:2405.01413  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors

    Authors: Yuan Tang, Xu Han, Xianzhi Li, Qiao Yu, Yixue Hao, Long Hu, Min Chen

    Abstract: Large 2D vision-language models (2D-LLMs) have gained significant attention by bridging Large Language Models (LLMs) with images using a simple projector. Inspired by their success, large 3D point cloud-language models (3D-LLMs) also integrate point clouds into LLMs. However, directly aligning point clouds with LLM requires expensive training costs, typically in hundreds of GPU-hours on A100, whic…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 17 pages, 9 figures

  32. arXiv:2405.00614  [pdf, other]

    cs.LG

    Multigroup Robustness

    Authors: Lunjia Hu, Charlotte Peale, Judy Hanwen Shen

    Abstract: To address the shortcomings of real-world datasets, robust learning algorithms have been designed to overcome arbitrary and indiscriminate data corruption. However, practical processes of gathering data may lead to patterns of data corruption that are localized to specific partitions of the training dataset. Motivated by critical applications where the learned model is deployed to make predictions…

    Submitted 1 May, 2024; originally announced May 2024.

  33. M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework

    Authors: Zijian Zhang, Shuchang Liu, Jiaao Yu, Qingpeng Cai, Xiangyu Zhao, Chunxu Zhang, Ziru Liu, Qidong Liu, Hongwei Zhao, Lantao Hu, Peng Jiang, Kun Gai

    Abstract: Multi-domain recommendation and multi-task recommendation have demonstrated their effectiveness in leveraging common information from different domains and objectives for comprehensive user modeling. Nonetheless, the practical recommendation usually faces multiple domains and tasks simultaneously, which cannot be well-addressed by current methods. To this end, we introduce M3oE, an adaptive Multi-…

    Submitted 12 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  34. arXiv:2404.18410  [pdf, other]

    cs.CL

    Mixture-of-Instructions: Comprehensive Alignment of a Large Language Model through the Mixture of Diverse System Prompting Instructions

    Authors: Bowen Xu, Shaoyu Wu, Kai Liu, Lulu Hu

    Abstract: With the proliferation of large language models (LLMs), the comprehensive alignment of such models across multiple tasks has emerged as a critical area of research. Existing alignment methodologies primarily address single task, such as multi-turn dialogue, coding, mathematical problem-solving, and tool usage. However, AI-driven products that leverage language models usually necessitate a fusion o…

    Submitted 28 April, 2024; originally announced April 2024.

  35. arXiv:2404.17609  [pdf, other]

    cs.LG cs.AI cs.CL

    CoSD: Collaborative Stance Detection with Contrastive Heterogeneous Topic Graph Learning

    Authors: Yinghan Cheng, Qi Zhang, Chongyang Shi, Liang Xiao, Shufeng Hao, Liang Hu

    Abstract: Stance detection seeks to identify the viewpoints of individuals either in favor or against a given target or a controversial topic. Current advanced neural models for stance detection typically employ fully parametric softmax classifiers. However, these methods suffer from several limitations, including lack of explainability, insensitivity to the latent data structure, and unimodality, which gre…

    Submitted 19 June, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: 13 pages

  36. arXiv:2404.13503  [pdf, other]

    cs.LG cs.DS stat.ML

    Predict to Minimize Swap Regret for All Payoff-Bounded Tasks

    Authors: Lunjia Hu, Yifan Wu

    Abstract: A sequence of predictions is calibrated if and only if it induces no swap regret to all down-stream decision tasks. We study the Maximum Swap Regret (MSR) of predictions for binary events: the swap regret maximized over all downstream tasks with bounded payoffs. Previously, the best online prediction algorithm for minimizing MSR is obtained by minimizing the K1 calibration error, which upper bound…

    Submitted 24 April, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

  37. arXiv:2404.13282  [pdf, other]

    cs.CV cs.MM

    Wills Aligner: A Robust Multi-Subject Brain Representation Learner

    Authors: Guangyin Bao, Zixuan Gong, Qi Zhang, Jialei Zhou, Wei Fan, Kun Yi, Usman Naseem, Liang Hu, Duoqian Miao

    Abstract: Decoding visual information from human brain activity has seen remarkable advancements in recent research. However, due to the significant variability in cortical parcellation and cognition patterns across subjects, current approaches personalized deep models for each subject, constraining the practicality of this technology in real-world contexts. To tackle the challenges, we introduce Wills Alig…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: 15 pages

  38. arXiv:2404.12721  [pdf, other]

    cs.CV cs.AI cs.LG

    Generalized Few-Shot Meets Remote Sensing: Discovering Novel Classes in Land Cover Mapping via Hybrid Semantic Segmentation Framework

    Authors: Zhuohong Li, Fangxiao Lu, Jiaqi Zou, Lei Hu, Hongyan Zhang

    Abstract: Land-cover mapping is one of the vital applications in Earth observation, aiming at classifying each pixel's land-cover type of remote-sensing images. As natural and human activities change the landscape, the land-cover map needs to be rapidly updated. However, discovering newly appeared land-cover types in existing classification systems is still a non-trivial task hindered by various scales of c…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 11 pages, 11 figures, accepted by CVPR 2024 L3D-IVU Workshop

  39. arXiv:2404.12630  [pdf, other]

    cs.CV cs.MM

    MindTuner: Cross-Subject Visual Decoding with Visual Fingerprint and Semantic Correction

    Authors: Zixuan Gong, Qi Zhang, Guangyin Bao, Lei Zhu, Ke Liu, Liang Hu, Duoqian Miao

    Abstract: Decoding natural visual scenes from brain activity has flourished, with extensive research in single-subject tasks and, however, less in cross-subject tasks. Reconstructing high-quality images in cross-subject tasks is a challenging problem due to profound individual differences between subjects and the scarcity of data annotation. In this work, we proposed MindTuner for cross-subject visual decod…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 14 pages

  40. arXiv:2404.11938  [pdf, other]

    cs.MM cs.DC cs.SD eess.AS

    HyDiscGAN: A Hybrid Distributed cGAN for Audio-Visual Privacy Preservation in Multimodal Sentiment Analysis

    Authors: Zhuojia Wu, Qi Zhang, Duoqian Miao, Kun Yi, Wei Fan, Liang Hu

    Abstract: Multimodal Sentiment Analysis (MSA) aims to identify speakers' sentiment tendencies in multimodal video content, raising serious concerns about privacy risks associated with multimodal data, such as voiceprints and facial images. Recent distributed collaborative learning has been verified as an effective paradigm for privacy preservation in multimodal tasks. However, they often overlook the privac…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 13 pages, IJCAI-2024

  41. arXiv:2404.11171  [pdf, other]

    cs.LG cs.AI eess.SP

    Personalized Heart Disease Detection via ECG Digital Twin Generation

    Authors: Yaojun Hu, Jintai Chen, Lianting Hu, Dantong Li, Jiahuan Yan, Haochao Ying, Huiying Liang, Jian Wu

    Abstract: Heart diseases rank among the leading causes of global mortality, demonstrating a crucial need for early diagnosis and intervention. Most traditional electrocardiogram (ECG) based automated diagnosis methods are trained at population level, neglecting the customization of personalized ECGs to enhance individual healthcare management. A potential solution to address this limitation is to employ dig…

    Submitted 11 May, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  42. arXiv:2404.11111  [pdf, other]

    cs.CV

    CorrNet+: Sign Language Recognition and Translation via Spatial-Temporal Correlation

    Authors: Lianyu Hu, Wei Feng, Liqing Gao, Zekang Liu, Liang Wan

    Abstract: In sign language, the conveyance of human body trajectories predominantly relies upon the coordinated movements of hands and facial expressions across successive frames. Despite the recent advancements of sign language understanding methods, they often solely focus on individual frames, inevitably overlooking the inter-frame correlations that are essential for effectively modeling human body traje…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2303.03202

  43. arXiv:2404.09836  [pdf, other]

    cs.SE cs.CR

    How Far Have We Gone in Stripped Binary Code Understanding Using Large Language Models

    Authors: Xiuwei Shang, Shaoyin Cheng, Guoqiang Chen, Yanming Zhang, Li Hu, Xiao Yu, Gangyang Li, Weiming Zhang, Nenghai Yu

    Abstract: Binary code analysis plays a pivotal role in various software security applications, such as software maintenance, malware detection, software vulnerability discovery, patch analysis, etc. However, unlike source code, understanding binary code is challenging for reverse engineers due to the absence of semantic information. Therefore, automated tools are needed to assist human players in interpreti…

    Submitted 16 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  44. arXiv:2404.08675  [pdf, other]

    cs.IR cs.AI cs.CL

    RecGPT: Generative Personalized Prompts for Sequential Recommendation via ChatGPT Training Paradigm

    Authors: Yabin Zhang, Wenhui Yu, Erhan Zhang, Xu Chen, Lantao Hu, Peng Jiang, Kun Gai

    Abstract: ChatGPT has achieved remarkable success in natural language understanding. Considering that recommendation is indeed a conversation between users and the system with items as words, which has a similar underlying pattern to ChatGPT, we design a new chat framework at the item index level for the recommendation task. Our novelty mainly contains three parts: model, training and inference. For the model p…

    Submitted 6 April, 2024; originally announced April 2024.

  45. arXiv:2404.08226  [pdf, other]

    cs.CV

    Improving Continuous Sign Language Recognition with Adapted Image Models

    Authors: Lianyu Hu, Tongkai Shi, Liqing Gao, Zekang Liu, Wei Feng

    Abstract: The increase of web-scale weakly labelled image-text pairs has greatly facilitated the development of large-scale vision-language models (e.g., CLIP), which have shown impressive generalization performance over a series of downstream tasks. However, the massive model size and scarcity of available data limit their applications to fine-tune the whole model in downstream tasks. Besides, fully fine-…

    Submitted 11 April, 2024; originally announced April 2024.

  46. Sequential Recommendation for Optimizing Both Immediate Feedback and Long-term Retention

    Authors: Ziru Liu, Shuchang Liu, Zijian Zhang, Qingpeng Cai, Xiangyu Zhao, Kesen Zhao, Lantao Hu, Peng Jiang, Kun Gai

    Abstract: In the landscape of Recommender System (RS) applications, reinforcement learning (RL) has recently emerged as a powerful tool, primarily due to its proficiency in optimizing long-term rewards. Nevertheless, it suffers from instability in the learning process, stemming from the intricate interactions among bootstrapping, off-policy training, and function approximation. Moreover, in multi-reward rec…

    Submitted 10 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: SIGIR 2024

  47. arXiv:2404.00979  [pdf, other]

    cs.CV

    PDF: A Probability-Driven Framework for Open World 3D Point Cloud Semantic Segmentation

    Authors: Jinfeng Xu, Siyuan Yang, Xianzhi Li, Yuan Tang, Yixue Hao, Long Hu, Min Chen

    Abstract: Existing point cloud semantic segmentation networks cannot identify unknown classes and update their knowledge, due to a closed-set and static perspective of the real world, which would induce the intelligent agent to make bad decisions. To address this problem, we propose a Probability-Driven Framework (PDF) for open world semantic segmentation that includes (i) a lightweight U-decoder branch to…

    Submitted 23 July, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  48. arXiv:2404.00929  [pdf, other]

    cs.CL cs.AI

    A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias

    Authors: Yuemei Xu, Ling Hu, Jiayi Zhao, Zihan Qiu, Yuqi Ye, Hanwen Gu

    Abstract: Based on the foundation of Large Language Models (LLMs), Multilingual Large Language Models (MLLMs) have been developed to address the challenges of multilingual natural language processing tasks, hoping to achieve knowledge transfer from high-resource to low-resource languages. However, significant limitations and challenges still exist, such as language imbalance, multilingual alignment, and inh…

    Submitted 6 June, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  49. arXiv:2404.00492  [pdf, other]

    cs.CL cs.AI cs.LG

    Multi-hop Question Answering under Temporal Knowledge Editing

    Authors: Keyuan Cheng, Gang Lin, Haoyang Fei, Yuxuan zhai, Lu Yu, Muhammad Asif Ali, Lijie Hu, Di Wang

    Abstract: Multi-hop question answering (MQA) under knowledge editing (KE) has garnered significant attention in the era of large language models. However, existing models for MQA under KE exhibit poor performance when dealing with questions containing explicit temporal contexts. To address this limitation, we propose a novel framework, namely TEMPoral knowLEdge augmented Multi-hop Question Answering (TEMPLE…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 23 pages

  50. arXiv:2404.00489  [pdf, other]

    cs.CL cs.AI cs.LG

    PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression

    Authors: Muhammad Asif Ali, Zhengping Li, Shu Yang, Keyuan Cheng, Yang Cao, Tianhao Huang, Lijie Hu, Lu Yu, Di Wang

    Abstract: Large language models (LLMs) have shown exceptional abilities for multiple different natural language processing tasks. While prompting is a crucial tool for LLM inference, we observe that there is a significant cost associated with exceedingly lengthy prompts. Existing attempts to compress lengthy prompts lead to sub-standard results in terms of readability and interpretability of the compressed…

    Submitted 30 March, 2024; originally announced April 2024.