Showing 1–50 of 474 results for author: He, S

Searching in archive cs.
  1. arXiv:2407.01688  [pdf, other]

    cs.SE

    How We Built Cedar: A Verification-Guided Approach

    Authors: Craig Disselkoen, Aaron Eline, Shaobo He, Kyle Headley, Michael Hicks, Kesha Hietala, John Kastner, Anwar Mamat, Matt McCutchen, Neha Rungta, Bhakti Shah, Emina Torlak, Andrew Wells

    Abstract: This paper presents verification-guided development (VGD), a software engineering process we used to build Cedar, a new policy language for expressive, fast, safe, and analyzable authorization. Developing a system with VGD involves writing an executable model of the system and mechanically proving properties about the model; writing production code for the system and using differential random test…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2407.00987  [pdf, other]

    cs.NI eess.SY

    Exploiting Dependency-Aware Priority Adjustment for Mixed-Criticality TSN Flow Scheduling

    Authors: Miao Guo, Yifei Sun, Chaojie Gu, Shibo He, Zhiguo Shi

    Abstract: Time-Sensitive Networking (TSN) serves as a one-size-fits-all solution for mixed-criticality communication, in which flow scheduling is vital to guarantee real-time transmissions. Traditional approaches statically assign priorities to flows based on their associated applications, resulting in significant queuing delays. In this paper, we observe that assigning different priorities to a flow leads…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted by IWQoS'24

  3. arXiv:2406.19708  [pdf, other]

    cs.NE cs.AI cs.CE q-bio.NC

    A Differentiable Approach to Multi-scale Brain Modeling

    Authors: Chaoming Wang, Muyang Lyu, Tianqiu Zhang, Sichao He, Si Wu

    Abstract: We present a multi-scale differentiable brain modeling workflow utilizing BrainPy, a unique differentiable brain simulator that combines accurate brain simulation with powerful gradient-based optimization. We leverage this capability of BrainPy across different brain scales. At the single-neuron level, we implement differentiable neuron models and employ gradient methods to optimize their fit to e…

    Submitted 1 July, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

    Comments: 2nd Differentiable Almost Everything Workshop at ICML 2024

  4. arXiv:2406.18548  [pdf]

    eess.IV cs.CV

    Exploration of Multi-Scale Image Fusion Systems in Intelligent Medical Image Analysis

    Authors: Yuxiang Hu, Haowei Yang, Ting Xu, Shuyao He, Jiajie Yuan, Haozhang Deng

    Abstract: The diagnosis of brain cancer relies heavily on medical imaging techniques, with MRI being the most commonly used. It is necessary to perform automatic segmentation of brain tumors on MRI images. This project intends to build an MRI algorithm based on U-Net. The residual network and the module used to enhance the context information are combined, and the void space convolution pooling pyramid is a…

    Submitted 23 May, 2024; originally announced June 2024.

  5. arXiv:2406.18085  [pdf, other]

    cs.CL

    Multilingual Knowledge Graph Completion from Pretrained Language Models with Knowledge Constraints

    Authors: Ran Song, Shizhu He, Shengxiang Gao, Li Cai, Kang Liu, Zhengtao Yu, Jun Zhao

    Abstract: Multilingual Knowledge Graph Completion (mKGC) aims at solving queries like (h, r, ?) in different languages by reasoning a tail entity t, thus improving multilingual knowledge graphs. Previous studies leverage multilingual pretrained language models (PLMs) and the generative paradigm to achieve mKGC. Although multilingual pretrained language models contain extensive knowledge of different languages…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 11 pages, ACL 2023

  6. arXiv:2406.17739  [pdf, other]

    cs.CL cs.AI

    Find Parent then Label Children: A Two-stage Taxonomy Completion Method with Pre-trained Language Model

    Authors: Fei Xia, Yixuan Weng, Shizhu He, Kang Liu, Jun Zhao

    Abstract: Taxonomies, which organize domain concepts into hierarchical structures, are crucial for building knowledge systems and downstream applications. As domain knowledge evolves, taxonomies need to be continuously updated to include new concepts. Previous approaches have mainly focused on adding concepts to the leaf nodes of the existing hierarchical tree, which does not fully utilize the taxonomy's kn…

    Submitted 25 June, 2024; originally announced June 2024.

  7. arXiv:2406.17005  [pdf, other]

    cs.CV

    PVUW 2024 Challenge on Complex Video Understanding: Methods and Results

    Authors: Henghui Ding, Chang Liu, Yunchao Wei, Nikhila Ravi, Shuting He, Song Bai, Philip Torr, Deshui Miao, Xin Li, Zhenyu He, Yaowei Wang, Ming-Hsuan Yang, Zhensong Xu, Jiangtao Yao, Chengjing Wu, Ting Liu, Luoqi Liu, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Licheng Jiao, Shuyuan Yang, Mingqi Gao, Jingnan Luo, et al. (12 additional authors not shown)

    Abstract: Pixel-level Video Understanding in the Wild Challenge (PVUW) focuses on complex video understanding. In this CVPR 2024 workshop, we add two new tracks, Complex Video Object Segmentation Track based on MOSE dataset and Motion Expression guided Video Segmentation track based on MeViS dataset. In the two new tracks, we provide additional videos and annotations that feature challenging elements, such as…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: MOSE Challenge: https://henghuiding.github.io/MOSE/ChallengeCVPR2024, MeViS Challenge: https://henghuiding.github.io/MeViS/ChallengeCVPR2024

  8. arXiv:2406.15786  [pdf, other]

    cs.LG cs.AI cs.CL

    What Matters in Transformers? Not All Attention is Needed

    Authors: Shwai He, Guoheng Sun, Zheyu Shen, Ang Li

    Abstract: Scaling Transformer-based large language models (LLMs) has demonstrated promising performance across various tasks. However, this scaling also introduces redundant structures, posing challenges for real-world deployment. Despite some recognition of redundancy in LLMs, the variability of redundancy across different structures, such as MLP and Attention layers, is under-explored. In this work, we in…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 15 pages, 13 figures, 6 tables

  9. arXiv:2406.12382  [pdf, other]

    cs.CL

    From Instance Training to Instruction Learning: Task Adapters Generation from Instructions

    Authors: Huanxuan Liao, Yao Xu, Shizhu He, Yuanzhe Zhang, Yanchao Hao, Shengping Liu, Kang Liu, Jun Zhao

    Abstract: Large language models (LLMs) have acquired the ability to solve general tasks by utilizing instruction finetuning (IFT). However, IFT still relies heavily on instance training of extensive task data, which greatly limits the adaptability of LLMs to real-world scenarios where labeled task instances are scarce and broader task generalization becomes paramount. Contrary to LLMs, humans acquire skills…

    Submitted 18 June, 2024; originally announced June 2024.

  10. arXiv:2406.12100  [pdf, other]

    cs.LG cs.RO

    Adaptive Uncertainty Quantification for Trajectory Prediction Under Distributional Shift

    Authors: Huiqun Huang, Sihong He, Fei Miao

    Abstract: Trajectory prediction models that can infer both finite future trajectories and their associated uncertainties of the target vehicles in an online setting (e.g., real-world application scenarios) are crucial for ensuring the safe and robust navigation and path planning of autonomous vehicle motion. However, the majority of existing trajectory prediction models have neither considered reducing the u…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 9 pages, 2 figures

  11. arXiv:2406.08756  [pdf, other]

    cs.DC cs.LG

    Optimizing Large Model Training through Overlapped Activation Recomputation

    Authors: Ping Chen, Wenjie Zhang, Shuibing He, Yingjie Gu, Zhuwei Peng, Kexin Huang, Xuan Zhan, Weijian Chen, Yi Zheng, Zhefeng Wang, Yanlong Yin, Gang Chen

    Abstract: Large model training has been using recomputation to alleviate the memory pressure and pipelining to exploit the parallelism of data, tensor, and devices. The existing recomputation approaches may incur up to 40% overhead when training real-world models, e.g., the GPT model with 22B parameters. This is because they are executed on demand in the critical training path. In this paper, we design a ne…

    Submitted 27 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 13 pages

  12. arXiv:2406.07147  [pdf]

    cs.HC cs.AI cs.CY

    Wearable Device-Based Physiological Signal Monitoring: An Assessment Study of Cognitive Load Across Tasks

    Authors: Ling He, Yanxin Chen, Wenqi Wang, Shuting He, Xiaoqiang Hu

    Abstract: This study employs cutting-edge wearable monitoring technology to conduct high-precision, high-temporal-resolution cognitive load assessment on EEG data from the FP1 channel and heart rate variability (HRV) data of secondary vocational students (SVS). By jointly analyzing these two critical physiological indicators, the research delves into their application value in assessing cognitive load among…

    Submitted 11 June, 2024; originally announced June 2024.

  13. arXiv:2406.05977  [pdf, other]

    cs.IR

    Weighted KL-Divergence for Document Ranking Model Refinement

    Authors: Yingrui Yang, Yifan Qiao, Shanxiu He, Tao Yang

    Abstract: Transformer-based retrieval and reranking models for text document search are often refined through knowledge distillation together with contrastive learning. A tight distribution matching between the teacher and student models can be hard as over-calibration may degrade training effectiveness when a teacher does not perform well. This paper contrastively reweights KL divergence terms to prioritiz…

    Submitted 9 June, 2024; originally announced June 2024.

  14. arXiv:2406.05392  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas

    Authors: Chengyuan Deng, Yiqun Duan, Xin Jin, Heng Chang, Yijun Tian, Han Liu, Henry Peng Zou, Yiqiao Jin, Yijia Xiao, Yichen Wang, Shenghao Wu, Zongxing Xie, Kuofeng Gao, Sihong He, Jun Zhuang, Lu Cheng, Haohan Wang

    Abstract: Large Language Models (LLMs) have achieved unparalleled success across diverse language modeling tasks in recent years. However, this progress has also intensified ethical concerns, impacting the deployment of LLMs in everyday contexts. This paper provides a comprehensive survey of ethical challenges associated with LLMs, from longstanding issues such as copyright infringement, systematic bias, an…

    Submitted 8 June, 2024; originally announced June 2024.

  15. arXiv:2406.03097  [pdf, other]

    cs.LG cs.AI

    Enhancing the Resilience of Graph Neural Networks to Topological Perturbations in Sparse Graphs

    Authors: Shuqi He, Jun Zhuang, Ding Wang, Luyao Peng, Jun Song

    Abstract: Graph neural networks (GNNs) have been extensively employed in node classification. Nevertheless, recent studies indicate that GNNs are vulnerable to topological perturbations, such as adversarial attacks and edge disruptions. Considerable efforts have been devoted to mitigating these challenges. For example, pioneering Bayesian methodologies, including GraphSS and LlnDT, incorporate Bayesian labe…

    Submitted 5 June, 2024; originally announced June 2024.

  16. arXiv:2406.02542  [pdf, other]

    cs.LG

    Loki: Low-Rank Keys for Efficient Sparse Attention

    Authors: Prajwal Singhania, Siddharth Singh, Shwai He, Soheil Feizi, Abhinav Bhatele

    Abstract: Inference on large language models can be expensive in terms of the compute and memory costs involved, especially when long sequence lengths are used. In particular, the self-attention mechanism used in such models contributes significantly to these costs, which has resulted in several recent works that propose sparse attention approximations for inference. In this work, we propose to approximate…

    Submitted 4 June, 2024; originally announced June 2024.

  17. arXiv:2406.02500  [pdf, other]

    cs.LG cs.AI

    Demystifying the Compression of Mixture-of-Experts Through a Unified Framework

    Authors: Shwai He, Daize Dong, Liang Ding, Ang Li

    Abstract: Scaling large language models has revolutionized the performance across diverse domains, yet the continual growth in model size poses significant challenges for real-world deployment. The Mixture of Experts (MoE) approach addresses this by dynamically selecting and activating only a subset of experts, significantly reducing computational costs while maintaining high performance. However, MoE intro…

    Submitted 24 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: 20 pages, 15 figures, 5 tables

  18. arXiv:2406.00492  [pdf, other]

    eess.IV cs.CV cs.LG

    SAM-VMNet: Deep Neural Networks For Coronary Angiography Vessel Segmentation

    Authors: Xueying Zeng, Baixiang Huang, Yu Luo, Guangyu Wei, Songyan He, Yushuang Shao

    Abstract: Coronary artery disease (CAD) is one of the most prevalent diseases in the cardiovascular field and one of the major contributors to death worldwide. Computed Tomography Angiography (CTA) images are regarded as the authoritative standard for the diagnosis of coronary artery disease, and by performing vessel segmentation and stenosis detection on CTA images, physicians are able to diagnose coronary…

    Submitted 1 June, 2024; originally announced June 2024.

  19. arXiv:2405.19499  [pdf, other]

    cs.LG cs.MA math.OC

    Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments

    Authors: Han Wang, Sihong He, Zhili Zhang, Fei Miao, James Anderson

    Abstract: We explore a Federated Reinforcement Learning (FRL) problem where $N$ agents collaboratively learn a common policy without sharing their trajectory data. To date, existing FRL work has primarily focused on agents operating in the same or "similar" environments. In contrast, our problem setup allows for arbitrarily large levels of environment heterogeneity. To obtain the optimal policy which maxim…

    Submitted 29 May, 2024; originally announced May 2024.

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, 2024 Learning

  20. arXiv:2405.19323  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Are Large Language Models Chameleons?

    Authors: Mingmeng Geng, Sihong He, Roberto Trotta

    Abstract: Do large language models (LLMs) have their own worldviews and personality tendencies? Simulations in which an LLM was asked to answer subjective questions were conducted more than 1 million times. Comparison of the responses from different LLMs with real data from the European Social Survey (ESS) suggests that the effect of prompts on bias and variability is fundamental, highlighting major cultura…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 16 pages, 8 figures

  21. arXiv:2405.17460  [pdf]

    cs.LG cs.AI cs.CV

    Investigation of Customized Medical Decision Algorithms Utilizing Graph Neural Networks

    Authors: Yafeng Yan, Shuyao He, Zhou Yu, Jiajie Yuan, Ziang Liu, Yan Chen

    Abstract: Aiming at the limitations of traditional medical decision system in processing large-scale heterogeneous medical data and realizing highly personalized recommendation, this paper introduces a personalized medical decision algorithm utilizing graph neural network (GNN). This research innovatively integrates graph neural network technology into the medical and health field, aiming to build a high-pr…

    Submitted 23 May, 2024; originally announced May 2024.

  22. arXiv:2405.15269  [pdf, other]

    cs.CV cs.LG

    BDetCLIP: Multimodal Prompting Contrastive Test-Time Backdoor Detection

    Authors: Yuwei Niu, Shuo He, Qi Wei, Feng Liu, Lei Feng

    Abstract: Multimodal contrastive learning methods (e.g., CLIP) have shown impressive zero-shot classification performance due to their strong ability to joint representation learning for visual and textual modalities. However, recent research revealed that multimodal contrastive learning on poisoned pre-training data with a small proportion of maliciously backdoored data can induce backdoored CLIP that coul…

    Submitted 24 May, 2024; originally announced May 2024.

  23. arXiv:2405.09717  [pdf, other]

    cs.CV

    From NeRFs to Gaussian Splats, and Back

    Authors: Siming He, Zach Osman, Pratik Chaudhari

    Abstract: For robotics applications where there is a limited number of (typically ego-centric) views, parametric representations such as neural radiance fields (NeRFs) generalize better than non-parametric ones such as Gaussian splatting (GS) to views that are very different from those in the training data; GS however can render much faster than NeRFs. We develop a procedure to convert back and forth betwee…

    Submitted 10 June, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  24. arXiv:2405.07148  [pdf, other]

    physics.flu-dyn cs.CE

    Investigate the efficiency of incompressible flow simulations on CPUs and GPUs with BSAMR

    Authors: Dewen Liu, Shuai He, Haoran Cheng, Yadong Zeng

    Abstract: Adaptive mesh refinement (AMR) is a classical technique about local refinement in space where needed, thus effectively reducing computational costs for HPC-based physics simulations. Although AMR has been used for many years, little reproducible research discusses the impact of software-based parameters on block-structured AMR (BSAMR) efficiency and how to choose them. This article primarily does…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: 22 pages include reference, 9 figures

  25. arXiv:2405.06929  [pdf, other]

    cs.CV

    PRENet: A Plane-Fit Redundancy Encoding Point Cloud Sequence Network for Real-Time 3D Action Recognition

    Authors: Shenglin He, Xiaoyang Qu, Jiguang Wan, Guokuan Li, Changsheng Xie, Jianzong Wang

    Abstract: Recognizing human actions from point cloud sequence has attracted tremendous attention from both academia and industry due to its wide applications. However, most previous studies on point cloud action recognition typically require complex networks to extract intra-frame spatial features and inter-frame temporal features, resulting in an excessive number of redundant computations. This leads to hi…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: Accepted by the 2024 International Joint Conference on Neural Networks (IJCNN 2024)

  26. arXiv:2405.01327  [pdf, other]

    cs.LG

    Constrained Reinforcement Learning Under Model Mismatch

    Authors: Zhongchang Sun, Sihong He, Fei Miao, Shaofeng Zou

    Abstract: Existing studies on constrained reinforcement learning (RL) may obtain a well-performing policy in the training environment. However, when deployed in a real environment, it may easily violate constraints that were originally satisfied during training because there might be model mismatch between the training and real environments. To address the above challenge, we formulate the problem as constr…

    Submitted 3 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  27. arXiv:2405.00476  [pdf, other]

    cs.LG

    A Comprehensive Survey of Dynamic Graph Neural Networks: Models, Frameworks, Benchmarks, Experiments and Challenges

    Authors: ZhengZhao Feng, Rui Wang, TianXing Wang, Mingli Song, Sai Wu, Shuibing He

    Abstract: Dynamic Graph Neural Networks (GNNs) combine temporal information with GNNs to capture structural, temporal, and contextual relationships in dynamic graphs simultaneously, leading to enhanced performance in various applications. As the demand for dynamic GNNs continues to grow, numerous models and frameworks have emerged to cater to different application needs. There is a pressing need for a compr…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Under review of PVLDB2025

  28. arXiv:2404.18419  [pdf]

    cs.CV cs.AI

    Research on Intelligent Aided Diagnosis System of Medical Image Based on Computer Deep Learning

    Authors: Jiajie Yuan, Linxiao Wu, Yulu Gong, Zhou Yu, Ziang Liu, Shuyao He

    Abstract: This paper combines Struts and Hibernate two architectures together, using DAO (Data Access Object) to store and access data. Then a set of dual-mode humidity medical image library suitable for deep network is established, and a dual-mode medical image assisted diagnosis method based on the image is proposed. Through the test of various feature extraction methods, the optimal operating characteris…

    Submitted 29 April, 2024; originally announced April 2024.

  29. arXiv:2404.17379  [pdf]

    cs.RO

    Adaptive speed planning for Unmanned Vehicle Based on Deep Reinforcement Learning

    Authors: Hao Liu, Yi Shen, Wenjing Zhou, Yuelin Zou, Chang Zhou, Shuyao He

    Abstract: In order to solve the problem of frequent deceleration of unmanned vehicles when approaching obstacles, this article uses a Deep Q-Network (DQN) and its extension, the Double Deep Q-Network (DDQN), to develop a local navigation system that adapts to obstacles while maintaining optimal speed planning. By integrating improved reward functions and obstacle angle determination methods, the system demo…

    Submitted 26 April, 2024; originally announced April 2024.

  30. arXiv:2404.16687  [pdf, other]

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  31. arXiv:2404.16611  [pdf, ps, other]

    cs.IT eess.SP

    Towards Symbiotic SAGIN Through Inter-operator Resource and Service Sharing: Joint Orchestration of User Association and Radio Resources

    Authors: Shizhao He, Jungang Ge, Ying-Chang Liang, Dusit Niyato

    Abstract: The space-air-ground integrated network (SAGIN) is a pivotal architecture to support ubiquitous connectivity in the upcoming 6G era. Inter-operator resource and service sharing is a promising way to realize such a huge network, utilizing resources efficiently and reducing construction costs. Given the rationality of operators, the configuration of resources and services in SAGIN should focus on bo…

    Submitted 25 April, 2024; originally announced April 2024.

  32. arXiv:2404.16571  [pdf, other]

    cs.CV

    MonoPCC: Photometric-invariant Cycle Constraint for Monocular Depth Estimation of Endoscopic Images

    Authors: Zhiwei Wang, Ying Zhou, Shiquan He, Ting Li, Fan Huang, Qiang Ding, Xinxia Feng, Mei Liu, Qiang Li

    Abstract: Photometric constraint is indispensable for self-supervised monocular depth estimation. It involves warping a source image onto a target view using estimated depth&pose, and then minimizing the difference between the warped and target images. However, the endoscopic built-in light causes significant brightness fluctuations, and thus makes the photometric constraint unreliable. Previous efforts onl…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: 11 pages, 10 figures

  33. arXiv:2404.15127  [pdf, other]

    cs.CV cs.CL

    MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning

    Authors: Sunan He, Yuxiang Nie, Zhixuan Chen, Zhiyuan Cai, Hongmei Wang, Shu Yang, Hao Chen

    Abstract: The rapid advancement of large-scale vision-language models has showcased remarkable capabilities across various tasks. However, the lack of extensive and high-quality image-text data in medicine has greatly hindered the development of large-scale medical vision-language models. In this work, we present a diagnosis-guided bootstrapping strategy that exploits both image and label information to con…

    Submitted 23 April, 2024; originally announced April 2024.

  34. arXiv:2404.14741  [pdf, other]

    cs.CL cs.AI

    Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering

    Authors: Yao Xu, Shizhu He, Jiabei Chen, Zihao Wang, Yangqiu Song, Hanghang Tong, Kang Liu, Jun Zhao

    Abstract: To address the issue of insufficient knowledge and the tendency to generate hallucination in Large Language Models (LLMs), numerous studies have endeavored to integrate LLMs with Knowledge Graphs (KGs). However, all these methods are evaluated on conventional Knowledge Graph Question Answering (KGQA) with complete KGs, where the factual triples involved in each question are entirely covered by the…

    Submitted 23 April, 2024; originally announced April 2024.

  35. arXiv:2404.14700  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    FlashSpeech: Efficient Zero-Shot Speech Synthesis

    Authors: Zhen Ye, Zeqian Ju, Haohe Liu, Xu Tan, Jianyi Chen, Yiwen Lu, Peiwen Sun, Jiahao Pan, Weizhen Bian, Shulin He, Qifeng Liu, Yike Guo, Wei Xue

    Abstract: Recent progress in large-scale zero-shot speech synthesis has been significantly advanced by language models and diffusion models. However, the generation process of both methods is slow and computationally intensive. Efficient speech synthesis using a lower computing budget to achieve quality on par with previous work remains a significant challenge. In this paper, we present FlashSpeech, a large…

    Submitted 24 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: Efficient zero-shot speech synthesis

  36. arXiv:2404.12170  [pdf, other]

    eess.SP cs.IT

    Secure Semantic Communication for Image Transmission in the Presence of Eavesdroppers

    Authors: Shunpu Tang, Chen Liu, Qianqian Yang, Shibo He, Dusit Niyato

    Abstract: Semantic communication (SemCom) has emerged as a key technology for the forthcoming sixth-generation (6G) network, attributed to its enhanced communication efficiency and robustness against channel noise. However, the open nature of wireless channels renders them vulnerable to eavesdropping, posing a serious threat to privacy. To address this issue, we propose a novel secure semantic communication…

    Submitted 18 April, 2024; originally announced April 2024.

  37. arXiv:2404.11313  [pdf, other]

    eess.IV cs.AI

    NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results

    Authors: Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei Li, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, et al. (43 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 Challenge on Shortform UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions are submitted and evaluated on the collected dataset KVQ from popular short-form video platform, i.e., Kuaishou/Kwai Platform. The KVQ database is divided into three parts, including 2926 videos for training, 420 videos for validation, and 854 videos for testing. The…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR2024 Workshop. The challenge report for CVPR NTIRE2024 Short-form UGC Video Quality Assessment Challenge

  38. arXiv:2404.11105  [pdf, other]

    cs.DB cs.DC

    XMiner: Efficient Directed Subgraph Matching with Pattern Reduction

    Authors: Pingpeng Yuan, Yujiang Wang, Tianyu Ma, Siyuan He, Ling Liu

    Abstract: Graph pattern matching, one of the fundamental graph mining problems, aims to extract structural patterns of interest from an input graph. The state-of-the-art graph matching algorithms and systems are mainly designed for undirected graphs. Directed graph matching is more complex than undirected graph matching because the edge direction must be taken into account before the exploration of each dir…

    Submitted 17 April, 2024; originally announced April 2024.

  39. arXiv:2404.10378  [pdf, other]

    cs.CV cs.AI cs.CY cs.LG

    Second Edition FRCSyn Challenge at CVPR 2024: Face Recognition Challenge in the Era of Synthetic Data

    Authors: Ivan DeAndres-Tame, Ruben Tolosana, Pietro Melzi, Ruben Vera-Rodriguez, Minchul Kim, Christian Rathgeb, Xiaoming Liu, Aythami Morales, Julian Fierrez, Javier Ortega-Garcia, Zhizhou Zhong, Yuge Huang, Yuxi Mi, Shouhong Ding, Shuigeng Zhou, Shuai He, Lingzhi Fu, Heng Cong, Rongyu Zhang, Zhihong Xiao, Evgeny Smirnov, Anton Pimenov, Aleksei Grigorev, Denis Timoshenko, Kaleb Mesfin Asfaw, et al. (33 additional authors not shown)

    Abstract: Synthetic data is gaining increasing relevance for training machine learning models. This is mainly motivated due to several factors such as the lack of real data and intra-class variability, time and errors produced in manual labeling, and in some cases privacy concerns, among others. This paper presents an overview of the 2nd edition of the Face Recognition Challenge in the Era of Synthetic Data…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: text overlap with arXiv:2311.10476

    Journal ref: IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRw 2024)

  40. arXiv:2404.10365  [pdf, other]

    cs.NI cs.LG eess.SP

    Learning Wireless Data Knowledge Graph for Green Intelligent Communications: Methodology and Experiments

    Authors: Yongming Huang, Xiaohu You, Hang Zhan, Shiwen He, Ningning Fu, Wei Xu

    Abstract: Intelligent communications have played a pivotal role in shaping the evolution of 6G networks. Native artificial intelligence (AI) within green communication systems must meet stringent real-time requirements. To achieve this, deploying lightweight and resource-efficient AI models is necessary. However, as wireless networks generate a multitude of data fields and indicators during operation, only…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 12 pages, 11 figures

  41. arXiv:2404.08896  [pdf, other]

    cs.IR

    Approximate Cluster-Based Sparse Document Retrieval with Segmented Maximum Term Weights

    Authors: Yifan Qiao, Shanxiu He, Yingrui Yang, Parker Carlson, Tao Yang

    Abstract: This paper revisits cluster-based retrieval that partitions the inverted index into multiple groups and skips the index partially at cluster and document levels during online inference using a learned sparse representation. It proposes an approximate search scheme with two parameters to control the rank-safeness competitiveness of pruning with segmented maximum term weights within each cluster. Cl…

    Submitted 13 April, 2024; originally announced April 2024.

  42. arXiv:2404.08217  [pdf, other]

    cs.PL

    Escape with Your Self: Expressive Reachability Types with Sound and Decidable Bidirectional Type Checking

    Authors: Songlin Jia, Guannan Wei, Siyuan He, Yueyang Tang, Yuyan Bao, Tiark Rompf

    Abstract: Despite Rust's success in systems programming, its "shared XOR mutable" principle significantly restricts how mutable values can be used, precluding many useful functional programming idioms. Reachability types are a recent proposal to address the key limitations of Rust-style "shared XOR mutable" approaches by tracking lifetimes and reachability of shared, escaping, and mutable data, even in the…

    Submitted 17 April, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

  43. arXiv:2404.03645  [pdf, other]

    cs.CV

    Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation

    Authors: Shuting He, Henghui Ding

    Abstract: Referring video segmentation relies on natural language expressions to identify and segment objects, often emphasizing motion clues. Previous works treat a sentence as a whole and directly perform identification at the video-level, mixing up static image-level cues with temporal motion cues. However, image-level features cannot well comprehend motion cues in sentences, and static cues are not cruc…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, code: https://github.com/heshuting555/DsHmp

  44. arXiv:2404.02424  [pdf, other]

    cs.LG cs.CV

    Rethinking Pruning for Vision-Language Models: Strategies for Effective Sparsity and Performance Restoration

    Authors: Shwai He, Ang Li, Tianlong Chen

    Abstract: Vision-Language Models (VLMs) integrate information from multiple modalities and have shown remarkable success across various tasks. However, deploying large-scale VLMs in resource-constrained scenarios is challenging. Pruning followed by finetuning offers a potential solution but remains underexplored for VLMs. This study addresses two key questions: how to distribute sparsity across different mo…

    Submitted 24 June, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  45. arXiv:2404.01958  [pdf, other]

    cs.LG

    MESEN: Exploit Multimodal Data to Design Unimodal Human Activity Recognition with Few Labels

    Authors: Lilin Xu, Chaojie Gu, Rui Tan, Shibo He, Jiming Chen

    Abstract: Human activity recognition (HAR) will be an essential function of various emerging applications. However, HAR typically encounters challenges related to modality limitations and label scarcity, leading to an application gap between current solutions and real-world requirements. In this work, we propose MESEN, a multimodal-empowered unimodal sensing framework, to utilize unlabeled multimodal data a…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted to the 21st ACM Conference on Embedded Networked Sensor Systems (SenSys 2023)

  46. arXiv:2404.01268  [pdf, other]

    cs.CL cs.AI cs.DL cs.LG cs.SI

    Mapping the Increasing Use of LLMs in Scientific Papers

    Authors: Weixin Liang, Yaohui Zhang, Zhengxuan Wu, Haley Lepp, Wenlong Ji, Xuandong Zhao, Hancheng Cao, Sheng Liu, Siyu He, Zhi Huang, Diyi Yang, Christopher Potts, Christopher D Manning, James Y. Zou

    Abstract: Scientific publishing lays the foundation of science by disseminating research findings, fostering collaboration, encouraging reproducibility, and ensuring that scientific knowledge is accessible, verifiable, and built upon over time. Recently, there has been immense speculation about how many people are using large language models (LLMs) like ChatGPT in their academic writing, and to what extent…

    Submitted 1 April, 2024; originally announced April 2024.

  47. arXiv:2404.01050  [pdf, other]

    cs.CV cs.GR cs.HC cs.LG

    Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation

    Authors: Haofeng Liu, Chenshu Xu, Yifei Yang, Lihua Zeng, Shengfeng He

    Abstract: Point-based interactive editing serves as an essential tool to complement the controllability of existing generative models. A concurrent work, DragDiffusion, updates the diffusion latent map in response to user inputs, causing global latent map alterations. This results in imprecise preservation of the original content and unsuccessful editing due to gradient vanishing. In contrast, we present Dr…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  48. arXiv:2404.00769  [pdf, other]

    cs.RO

    An Active Perception Game for Robust Autonomous Exploration

    Authors: Siming He, Yuezhan Tao, Igor Spasojevic, Vijay Kumar, Pratik Chaudhari

    Abstract: We formulate active perception for an autonomous agent that explores an unknown environment as a two-player zero-sum game: the agent aims to maximize information gained from the environment while the environment aims to minimize the information gained by the agent. In each episode, the environment reveals a set of actions with their potentially erroneous information gain. In order to select the be…

    Submitted 31 March, 2024; originally announced April 2024.

  49. arXiv:2403.19975  [pdf, other]

    cs.CV

    Context-Aware Integration of Language and Visual References for Natural Language Tracking

    Authors: Yanyan Shao, Shuting He, Qi Ye, Yuchao Feng, Wenhan Luo, Jiming Chen

    Abstract: Tracking by natural language specification (TNL) aims to consistently localize a target in a video sequence given a linguistic description in the initial frame. Existing methodologies perform language-based and template-based matching for target reasoning separately and merge the matching results from two sources, which suffer from tracking drift when language and visual templates misalign with…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  50. arXiv:2403.19414  [pdf, other]

    cs.CL

    BP4ER: Bootstrap Prompting for Explicit Reasoning in Medical Dialogue Generation

    Authors: Yuhong He, Yongqi Zhang, Shizhu He, Jun Wan

    Abstract: Medical dialogue generation (MDG) has gained increasing attention due to its substantial practical value. Previous works typically employ a sequence-to-sequence framework to generate medical responses by modeling dialogue context as sequential text with annotated medical entities. While these methods have been successful in generating fluent responses, they fail to provide process explanations of…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Accepted at LREC-COLING 2024