Showing 1–50 of 1,655 results for author: Wang, T

Searching in archive cs.
  1. arXiv:2409.02108  [pdf, other]

    cs.CV cs.GR cs.MM

    Unveiling Deep Shadows: A Survey on Image and Video Shadow Detection, Removal, and Generation in the Era of Deep Learning

    Authors: Xiaowei Hu, Zhenghao Xing, Tianyu Wang, Chi-Wing Fu, Pheng-Ann Heng

    Abstract: Shadows are formed when light encounters obstacles, leading to areas of diminished illumination. In computer vision, shadow detection, removal, and generation are crucial for enhancing scene understanding, refining image quality, ensuring visual consistency in video editing, and improving virtual environments. This paper presents a comprehensive survey of shadow detection, removal, and generation…

    Submitted 3 September, 2024; originally announced September 2024.

    Comments: Publicly available results, trained models, and evaluation metrics at https://github.com/xw-hu/Unveiling-Deep-Shadows

  2. arXiv:2409.01787  [pdf, other]

    cs.CL

    LLM-GAN: Construct Generative Adversarial Network Through Large Language Models For Explainable Fake News Detection

    Authors: Yifeng Wang, Zhouhong Gu, Siwei Zhang, Suhang Zheng, Tao Wang, Tianyu Li, Hongwei Feng, Yanghua Xiao

    Abstract: Explainable fake news detection predicts the authenticity of news items with annotated explanations. Today, Large Language Models (LLMs) are known for their powerful natural language understanding and explanation generation abilities. However, presenting LLMs for explainable fake news detection remains two main challenges. Firstly, fake news appears reasonable and could easily mislead LLMs, leavin…

    Submitted 3 September, 2024; originally announced September 2024.

  3. arXiv:2409.01676  [pdf, other]

    cs.LG cs.AI eess.SP

    Classifier-Free Diffusion-Based Weakly-Supervised Approach for Health Indicator Derivation in Rotating Machines: Advancing Early Fault Detection and Condition Monitoring

    Authors: Wenyang Hu, Gaetan Frusque, Tianyang Wang, Fulei Chu, Olga Fink

    Abstract: Deriving health indicators of rotating machines is crucial for their maintenance. However, this process is challenging for the prevalent adopted intelligent methods since they may take the whole data distributions, not only introducing noise interference but also lacking the explainability. To address these issues, we propose a diffusion-based weakly-supervised approach for deriving health indicat…

    Submitted 3 September, 2024; originally announced September 2024.

  4. arXiv:2409.01515  [pdf, other]

    cs.CY

    METcross: A framework for short-term forecasting of cross-city metro passenger flow

    Authors: Wenbo Lu, Jinhua Xu, Peikun Li, Ting Wang, Yong Zhang

    Abstract: Metro operation management relies on accurate predictions of passenger flow in the future. This study begins by integrating cross-city (including source and target city) knowledge and developing a short-term passenger flow prediction framework (METcross) for the metro. Firstly, we propose a basic framework for modeling cross-city metro passenger flow prediction from the perspectives of data fusion…

    Submitted 2 September, 2024; originally announced September 2024.

  5. SonoHaptics: An Audio-Haptic Cursor for Gaze-Based Object Selection in XR

    Authors: Hyunsung Cho, Naveen Sendhilnathan, Michael Nebeling, Tianyi Wang, Purnima Padmanabhan, Jonathan Browder, David Lindlbauer, Tanya R. Jonker, Kashyap Todi

    Abstract: We introduce SonoHaptics, an audio-haptic cursor for gaze-based 3D object selection. SonoHaptics addresses challenges around providing accurate visual feedback during gaze-based selection in Extended Reality (XR), e.g., lack of world-locked displays in no- or limited-display smart glasses and visual inconsistencies. To enable users to distinguish objects without visual feedback, SonoHaptics employ…

    Submitted 1 September, 2024; originally announced September 2024.

    Comments: UIST 2024

    ACM Class: H.5.1; H.5.2; H.5.5

  6. arXiv:2409.00644  [pdf, other]

    cs.LG

    Knowledge-data fusion oriented traffic state estimation: A stochastic physics-informed deep learning approach

    Authors: Ting Wang, Ye Li, Rongjun Cheng, Guojian Zou, Takao Dantsujic, Dong Ngoduy

    Abstract: Physics-informed deep learning (PIDL)-based models have recently garnered remarkable success in traffic state estimation (TSE). However, the prior knowledge used to guide regularization training in current mainstream architectures is based on deterministic physical models. The drawback is that a solely deterministic model fails to capture the universally observed traffic flow dynamic scattering ef…

    Submitted 1 September, 2024; originally announced September 2024.

    Comments: under review in Information Fusion

  7. arXiv:2409.00497  [pdf, other]

    quant-ph cs.CR

    Security Loophole Induced by Photorefractive Effect in Continous-variable Quantum Key Distribution System

    Authors: Zehao Zhou, Peng Huang, Tao Wang, Guihua Zeng

    Abstract: Modulators based on the Mach-Zehnder interferometer (MZI) structure are widely used in continuous-variable quantum key distribution (CVQKD) systems. MZI-based variable optical attenuator (VOA) and amplitude modulator can reshape the waveform and control the intensity of coherent state signal to realize secret key information modulation in CVQKD system. However, these devices are not ideal, interna…

    Submitted 31 August, 2024; originally announced September 2024.

  8. arXiv:2409.00387  [pdf, other]

    eess.AS cs.SD

    Progressive Residual Extraction based Pre-training for Speech Representation Learning

    Authors: Tianrui Wang, Jin Li, Ziyang Ma, Rui Cao, Xie Chen, Longbiao Wang, Meng Ge, Xiaobao Wang, Yuguang Wang, Jianwu Dang, Nyima Tashi

    Abstract: Self-supervised learning (SSL) has garnered significant attention in speech processing, excelling in linguistic tasks such as speech recognition. However, jointly improving the performance of pre-trained models on various downstream tasks, each requiring different speech information, poses significant challenges. To this purpose, we propose a progressive residual extraction based self-supervised l…

    Submitted 31 August, 2024; originally announced September 2024.

  9. arXiv:2409.00133  [pdf, other]

    cs.CL cs.AI

    A Survey for Large Language Models in Biomedicine

    Authors: Chong Wang, Mengyao Li, Junjun He, Zhongruo Wang, Erfan Darzi, Zan Chen, Jin Ye, Tianbin Li, Yanzhou Su, Jing Ke, Kaili Qu, Shuxin Li, Yi Yu, Pietro Liò, Tianyun Wang, Yu Guang Wang, Yiqing Shen

    Abstract: Recent breakthroughs in large language models (LLMs) offer unprecedented natural language understanding and generation capabilities. However, existing surveys on LLMs in biomedicine often focus on specific applications or model architectures, lacking a comprehensive analysis that integrates the latest advancements across various biomedical domains. This review, based on an analysis of 484 publicat…

    Submitted 29 August, 2024; originally announced September 2024.

  10. arXiv:2408.16987  [pdf, other]

    cs.LG

    From Model Explanation to Data Misinterpretation: Uncovering the Pitfalls of Post Hoc Explainers in Business Research

    Authors: Ronilo Ragodos, Tong Wang, Lu Feng, Yu, Hu

    Abstract: Machine learning models have been increasingly used in business research. However, most state-of-the-art machine learning models, such as deep neural networks and XGBoost, are black boxes in nature. Therefore, post hoc explainers that provide explanations for machine learning models by, for example, estimating numerical importance of the input features, have been gaining wide usage. Despite the in…

    Submitted 29 August, 2024; originally announced August 2024.

  11. arXiv:2408.16313  [pdf, other]

    cs.CV cs.AI

    FA-YOLO: Research On Efficient Feature Selection YOLO Improved Algorithm Based On FMDS and AGMF Modules

    Authors: Yukang Huo, Mingyuan Yao, Qingbin Tian, Tonghao Wang, Ruifeng Wang, Haihua Wang

    Abstract: Over the past few years, the YOLO series of models has emerged as one of the dominant methodologies in the realm of object detection. Many studies have advanced these baseline models by modifying their architectures, enhancing data quality, and developing new loss functions. However, current models still exhibit deficiencies in processing feature maps, such as overlooking the fusion of cross-scale…

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: 11 pages and 4 figures

  12. arXiv:2408.15916  [pdf, other]

    eess.AS cs.LG cs.SD

    Multi-modal Adversarial Training for Zero-Shot Voice Cloning

    Authors: John Janiczek, Dading Chong, Dongyang Dai, Arlo Faria, Chao Wang, Tao Wang, Yuzong Liu

    Abstract: A text-to-speech (TTS) model trained to reconstruct speech given text tends towards predictions that are close to the average characteristics of a dataset, failing to model the variations that make human speech sound natural. This problem is magnified for zero-shot voice cloning, a task that requires training data with high variance in speaking styles. We build off of recent works which have used…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: Accepted at INTERSPEECH 2024

  13. arXiv:2408.15374  [pdf, other]

    cs.CV cs.LG

    CycleGAN with Better Cycles

    Authors: Tongzhou Wang, Yihan Lin

    Abstract: CycleGAN provides a framework to train image-to-image translation with unpaired datasets using cycle consistency loss [4]. While results are great in many applications, the pixel level cycle consistency can potentially be problematic and cause unrealistic images in certain cases. In this project, we propose three simple modifications to cycle consistency, and show that such an approach achieves b…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: Technical Report 2018

  14. arXiv:2408.10623  [pdf, other]

    cs.CV

    TextMastero: Mastering High-Quality Scene Text Editing in Diverse Languages and Styles

    Authors: Tong Wang, Xiaochao Qu, Ting Liu

    Abstract: Scene text editing aims to modify texts on images while maintaining the style of newly generated text similar to the original. Given an image, a target area, and target text, the task produces an output image with the target text in the selected area, replacing the original. This task has been studied extensively, with initial success using Generative Adversarial Networks (GANs) to balance text fi…

    Submitted 20 August, 2024; originally announced August 2024.

  15. arXiv:2408.09955  [pdf, other]

    cs.MA

    MegaAgent: A Practical Framework for Autonomous Cooperation in Large-Scale LLM Agent Systems

    Authors: Qian Wang, Tianyu Wang, Qinbin Li, Jingsheng Liang, Bingsheng He

    Abstract: With the emergence of large language models (LLMs), LLM-powered multi-agent systems (LLM-MA systems) have been proposed to tackle real-world tasks. However, their agents mostly follow predefined Standard Operating Procedures (SOPs) that remain unchanged across the whole interaction, lacking autonomy and scalability. Additionally, current solutions often overlook the necessity for effective agent c…

    Submitted 20 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

  16. arXiv:2408.09698  [pdf, other]

    cs.IR cs.AI

    Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation

    Authors: Yuyang Ye, Zhi Zheng, Yishan Shen, Tianshu Wang, Hengruo Zhang, Peijun Zhu, Runlong Yu, Kai Zhang, Hui Xiong

    Abstract: Recent advances in Large Language Models (LLMs) have demonstrated significant potential in the field of Recommendation Systems (RSs). Most existing studies have focused on converting user behavior logs into textual prompts and leveraging techniques such as prompt tuning to enable LLMs for recommendation tasks. Meanwhile, research interest has recently grown in multimodal recommendation systems tha…

    Submitted 20 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

  17. arXiv:2408.09076  [pdf, ps, other]

    cs.LG cs.NI

    Twin Sorting Dynamic Programming Assisted User Association and Wireless Bandwidth Allocation for Hierarchical Federated Learning

    Authors: Rung-Hung Gau, Ting-Yu Wang, Chun-Hung Liu

    Abstract: In this paper, we study user association and wireless bandwidth allocation for a hierarchical federated learning system that consists of mobile users, edge servers, and a cloud server. To minimize the length of a global round in hierarchical federated learning with equal bandwidth allocation, we formulate a combinatorial optimization problem. We design the twin sorting dynamic programming (TSDP) a…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: 14 pages

  18. arXiv:2408.08583  [pdf, other]

    cs.LG

    GrassNet: State Space Model Meets Graph Neural Network

    Authors: Gongpei Zhao, Tao Wang, Yi Jin, Congyan Lang, Yidong Li, Haibin Ling

    Abstract: Designing spectral convolutional networks is a formidable task in graph learning. In traditional spectral graph neural networks (GNNs), polynomial-based methods are commonly used to design filters via the Laplacian matrix. In practical applications, however, these polynomial methods encounter inherent limitations, which primarily arise from the low-order truncation of polynomial filters and th…

    Submitted 16 August, 2024; originally announced August 2024.

  19. arXiv:2408.08147  [pdf, other]

    cs.DC cs.CL cs.LG

    P/D-Serve: Serving Disaggregated Large Language Model at Scale

    Authors: Yibo Jin, Tao Wang, Huimin Lin, Mingyang Song, Peiyang Li, Yipeng Ma, Yicheng Shan, Zhengfan Yuan, Cailong Li, Yajing Sun, Tiandeng Wu, Xing Chu, Ruizhi Huan, Li Ma, Xiao You, Wenting Zhou, Yunpeng Ye, Wen Liu, Xiangkun Xu, Yongsheng Zhang, Tiantian Dong, Jiawei Zhu, Zhe Wang, Xijian Ju, Jianxun Song, et al. (5 additional authors not shown)

    Abstract: Serving disaggregated large language models (LLMs) over tens of thousands of xPU devices (GPUs or NPUs) with reliable performance faces multiple challenges. 1) Ignoring the diversity (various prefixes and tidal requests), treating all the prompts in a mixed pool is inadequate. To facilitate the similarity per scenario and minimize the inner mismatch on P/D (prefill and decoding) processing, fine-g…

    Submitted 15 August, 2024; originally announced August 2024.

  20. arXiv:2408.08134  [pdf, other]

    cs.CV

    CorrAdaptor: Adaptive Local Context Learning for Correspondence Pruning

    Authors: Wei Zhu, Yicheng Liu, Yuping He, Tangfei Liao, Kang Zheng, Xiaoqiu Xu, Tao Wang, Tong Lu

    Abstract: In the fields of computer vision and robotics, accurate pixel-level correspondences are essential for enabling advanced tasks such as structure-from-motion and simultaneous localization and mapping. Recent correspondence pruning methods usually focus on learning local consistency through k-nearest neighbors, which makes it difficult to capture robust context for each correspondence. We propose Cor…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: 8 pages, 4 figures, accepted by ECAI

  21. arXiv:2408.07975  [pdf, other]

    cs.RO cs.CL cs.CV

    Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models

    Authors: Tianyu Wang, Haitao Lin, Junqiu Yu, Yanwei Fu

    Abstract: This paper investigates the task of the open-ended interactive robotic manipulation on table-top scenarios. While recent Large Language Models (LLMs) enhance robots' comprehension of user instructions, their lack of visual grounding constrains their ability to physically interact with the environment. This is because the robot needs to locate the target object for manipulation within the physical…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: Accepted by IROS 2024. 8 pages, 5 figures. See https://star-uu-wang.github.io/Polaris/

  22. arXiv:2408.07605  [pdf, other]

    cs.CV

    Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving

    Authors: Yuqing Wen, Yucheng Zhao, Yingfei Liu, Binyuan Huang, Fan Jia, Yanhui Wang, Chi Zhang, Tiancai Wang, Xiaoyan Sun, Xiangyu Zhang

    Abstract: The field of autonomous driving increasingly demands high-quality annotated video training data. In this paper, we propose Panacea+, a powerful and universally applicable framework for generating video data in driving scenes. Built upon the foundation of our previous work, Panacea, Panacea+ adopts a multi-view appearance noise prior mechanism and a super-resolution module for enhanced consistency…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: Project page: https://panacea-ad.github.io/. arXiv admin note: text overlap with arXiv:2311.16813

  23. arXiv:2408.07516  [pdf, other]

    cs.CV eess.IV

    DiffSteISR: Harnessing Diffusion Prior for Superior Real-world Stereo Image Super-Resolution

    Authors: Yuanbo Zhou, Xinlin Zhang, Wei Deng, Tao Wang, Tao Tan, Qinquan Gao, Tong Tong

    Abstract: We introduce DiffSteISR, a pioneering framework for reconstructing real-world stereo images. DiffSteISR utilizes the powerful prior knowledge embedded in pre-trained text-to-image model to efficiently recover the lost texture details in low-resolution stereo images. Specifically, DiffSteISR implements a time-aware stereo cross attention with temperature adapter (TASCATA) to guide the diffusion pro…

    Submitted 14 August, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

  24. arXiv:2408.07238  [pdf, other]

    cs.CL cs.AI cs.LG

    Using Advanced LLMs to Enhance Smaller LLMs: An Interpretable Knowledge Distillation Approach

    Authors: Tong Wang, K. Sudhir, Dat Hong

    Abstract: Advanced Large language models (LLMs) like GPT-4 or LlaMa 3 provide superior performance in complex human-like interactions. But they are costly, or too large for edge devices such as smartphones and harder to self-host, leading to security and privacy concerns. This paper introduces a novel interpretable knowledge distillation approach to enhance the performance of smaller, more economical LLMs t…

    Submitted 13 August, 2024; originally announced August 2024.

  25. arXiv:2408.05981  [pdf, other]

    cs.RO

    CAD-Mesher: A Convenient, Accurate, Dense Mesh-based Mapping Module in SLAM for Dynamic Environments

    Authors: Yanpeng Jia, Fengkui Cao, Ting Wang, Yandong Tang, Shiliang Shao, Lianqing Liu

    Abstract: Most LiDAR odometry and SLAM systems construct maps in point clouds, which are discrete and sparse when zoomed in, making them not directly suitable for navigation. Mesh maps represent a dense and continuous map format with low memory consumption, which can approximate complex structures with simple elements, attracting significant attention of researchers in recent years. However, most implementa…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: 9 pages, 7 figures

  26. arXiv:2408.05758  [pdf, other]

    eess.AS cs.AI cs.CL cs.SD

    VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing

    Authors: Chunyu Qiang, Wang Geng, Yi Zhao, Ruibo Fu, Tao Wang, Cheng Gong, Tianrui Wang, Qiuyu Liu, Jiangyan Yi, Zhengqi Wen, Chen Zhang, Hao Che, Longbiao Wang, Jianwu Dang, Jianhua Tao

    Abstract: Deep learning has brought significant improvements to the field of cross-modal representation learning. For tasks such as text-to-speech (TTS), voice conversion (VC), and automatic speech recognition (ASR), a cross-modal fine-grained (frame-level) sequence representation is desired, emphasizing the semantic content of the text modality while de-emphasizing the paralinguistic information of the spe…

    Submitted 11 August, 2024; originally announced August 2024.

  27. arXiv:2408.05686  [pdf, other]

    cs.LG cs.MA

    The Bandit Whisperer: Communication Learning for Restless Bandits

    Authors: Yunfan Zhao, Tonghan Wang, Dheeraj Nagaraj, Aparna Taneja, Milind Tambe

    Abstract: Applying Reinforcement Learning (RL) to Restless Multi-Arm Bandits (RMABs) offers a promising avenue for addressing allocation problems with resource constraints and temporal dynamics. However, classic RMAB models largely overlook the challenges of (systematic) data errors - a common occurrence in real-world scenarios due to factors like varying data collection protocols and intentional noise for…

    Submitted 10 August, 2024; originally announced August 2024.

  28. arXiv:2408.05419  [pdf, other]

    cs.LG

    Interface Laplace Learning: Learnable Interface Term Helps Semi-Supervised Learning

    Authors: Tangjun Wang, Chenglong Bao, Zuoqiang Shi

    Abstract: We introduce a novel framework, called Interface Laplace learning, for graph-based semi-supervised learning. Motivated by the observation that an interface should exist between different classes where the function value is non-smooth, we introduce a Laplace learning model that incorporates an interface term. This model challenges the long-standing assumption that functions are smooth at all unlabe…

    Submitted 9 August, 2024; originally announced August 2024.

  29. arXiv:2408.03060  [pdf]

    cs.CV cs.GR

    MGFs: Masked Gaussian Fields for Meshing Building based on Multi-View Images

    Authors: Tengfei Wang, Zongqian Zhan, Rui Xia, Linxia Ji, Xin Wang

    Abstract: Over the last few decades, image-based building surface reconstruction has garnered substantial research interest and has been applied across various fields, such as heritage preservation, architectural planning, etc. Compared to the traditional photogrammetric and NeRF-based solutions, recently, Gaussian fields-based methods have exhibited significant potential in generating surface meshes due to…

    Submitted 6 August, 2024; originally announced August 2024.

  30. arXiv:2408.02694  [pdf]

    cs.LG cs.AI q-fin.CP

    KAN based Autoencoders for Factor Models

    Authors: Tianqi Wang, Shubham Singh

    Abstract: Inspired by recent advances in Kolmogorov-Arnold Networks (KANs), we introduce a novel approach to latent factor conditional asset pricing models. While previous machine learning applications in asset pricing have predominantly used Multilayer Perceptrons with ReLU activation functions to model latent factor exposures, our method introduces a KAN-based autoencoder which surpasses MLP models in bot…

    Submitted 3 August, 2024; originally announced August 2024.

    Comments: 7 pages

  31. arXiv:2408.02666  [pdf, other]

    cs.CL cs.AI

    Self-Taught Evaluators

    Authors: Tianlu Wang, Ilia Kulikov, Olga Golovneva, Ping Yu, Weizhe Yuan, Jane Dwivedi-Yu, Richard Yuanzhe Pang, Maryam Fazel-Zarandi, Jason Weston, Xian Li

    Abstract: Model-based evaluation is at the heart of successful model development -- as a reward model for training, and as a replacement for human evaluation. To train such evaluators, the standard approach is to collect a large amount of human preference judgments over model responses, which is costly and the data becomes stale as models improve. In this work, we present an approach that aims to improve e…

    Submitted 8 August, 2024; v1 submitted 5 August, 2024; originally announced August 2024.

  32. arXiv:2408.02503  [pdf, other]

    cs.CL

    UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model

    Authors: Zhaowei Li, Wei Wang, YiQing Cai, Xu Qi, Pengyu Wang, Dong Zhang, Hang Song, Botian Jiang, Zhida Huang, Tao Wang

    Abstract: Significant advancements have recently been achieved in the field of multi-modal large language models (MLLMs), demonstrating their remarkable capabilities in understanding and reasoning across diverse tasks. However, these models are often trained for specific tasks and rely on task-specific input-output formats, limiting their applicability to a broader range of tasks. This raises a fundamental q…

    Submitted 5 August, 2024; originally announced August 2024.

  33. arXiv:2408.02211  [pdf, other]

    cs.GR

    SceneMotifCoder: Example-driven Visual Program Learning for Generating 3D Object Arrangements

    Authors: Hou In Ivan Tam, Hou In Derek Pun, Austin T. Wang, Angel X. Chang, Manolis Savva

    Abstract: Despite advances in text-to-3D generation methods, generation of multi-object arrangements remains challenging. Current methods exhibit failures in generating physically plausible arrangements that respect the provided text description. We present SceneMotifCoder (SMC), an example-driven framework for generating 3D object arrangements through visual program learning. SMC leverages large language m…

    Submitted 4 August, 2024; originally announced August 2024.

  34. arXiv:2408.01791  [pdf]

    cs.NI

    Implementing NAT Hole Punching with QUIC

    Authors: Jinyu Liang, Wei Xu, Taotao Wang, Qing Yang, Shengli Zhang

    Abstract: The widespread adoption of Network Address Translation (NAT) technology has led to a significant number of network end nodes being located in private networks behind NAT devices, impeding direct communication between these nodes. To solve this problem, a technique known as "hole punching" has been devised for NAT traversal to facilitate peer-to-peer communication among end nodes located in distinc…

    Submitted 3 August, 2024; originally announced August 2024.

    Comments: The paper has been accepted for oral presentation at the VTC2024-Fall Conference

  35. arXiv:2408.00940  [pdf, other]

    eess.IV cs.CV

    A dual-task mutual learning framework for predicting post-thrombectomy cerebral hemorrhage

    Authors: Caiwen Jiang, Tianyu Wang, Xiaodan Xing, Mianxin Liu, Guang Yang, Zhongxiang Ding, Dinggang Shen

    Abstract: Ischemic stroke is a severe condition caused by the blockage of brain blood vessels, and can lead to the death of brain tissue due to oxygen deprivation. Thrombectomy has become a common treatment choice for ischemic stroke due to its immediate effectiveness. But, it carries the risk of postoperative cerebral hemorrhage. Clinically, multiple CT scans within 0-72 hours post-surgery are used to moni…

    Submitted 1 August, 2024; originally announced August 2024.

  36. arXiv:2408.00920  [pdf, other]

    cs.LG stat.ML

    Towards Certified Unlearning for Deep Neural Networks

    Authors: Binchi Zhang, Yushun Dong, Tianhao Wang, Jundong Li

    Abstract: In the field of machine unlearning, certified unlearning has been extensively studied in convex machine learning models due to its high efficiency and strong theoretical guarantees. However, its application to deep neural networks (DNNs), known for their highly nonconvex nature, still poses challenges. To bridge the gap between certified unlearning and DNNs, we propose several simple techniques to…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: ICML 2024

  37. arXiv:2408.00803  [pdf, other]

    cs.SE cs.AI cs.CE

    A Comprehensive Survey on Root Cause Analysis in (Micro) Services: Methodologies, Challenges, and Trends

    Authors: Tingting Wang, Guilin Qi

    Abstract: The complex dependencies and propagative faults inherent in microservices, characterized by a dense network of interconnected services, pose significant challenges in identifying the underlying causes of issues. Prompt identification and resolution of disruptive problems are crucial to ensure rapid recovery and maintain system stability. Numerous methodologies have emerged to address this challeng…

    Submitted 23 July, 2024; originally announced August 2024.

  38. arXiv:2407.21040  [pdf, other]

    cs.AI cs.CL cs.DB cs.SE

    Towards Automated Data Sciences with Natural Language and SageCopilot: Practices and Lessons Learned

    Authors: Yuan Liao, Jiang Bian, Yuhui Yun, Shuo Wang, Yubo Zhang, Jiaming Chu, Tao Wang, Kewei Li, Yuchen Li, Xuhong Li, Shilei Ji, Haoyi Xiong

    Abstract: While the field of NL2SQL has made significant advancements in translating natural language instructions into executable SQL scripts for data querying and processing, achieving full automation within the broader data science pipeline - encompassing data querying, analysis, visualization, and reporting - remains a complex challenge. This study introduces SageCopilot, an advanced, industry-grade sys…

    Submitted 21 July, 2024; originally announced July 2024.

  39. arXiv:2407.19451  [pdf, other]

    cs.CV cs.GR

    Perm: A Parametric Representation for Multi-Style 3D Hair Modeling

    Authors: Chengan He, Xin Sun, Zhixin Shu, Fujun Luan, Sören Pirk, Jorge Alejandro Amador Herrera, Dominik L. Michels, Tuanfeng Y. Wang, Meng Zhang, Holly Rushmeier, Yi Zhou

    Abstract: We present Perm, a learned parametric model of human 3D hair designed to facilitate various hair-related applications. Unlike previous work that jointly models the global hair shape and local strand details, we propose to disentangle them using a PCA-based strand representation in the frequency domain, thereby allowing more precise editing and output control. Specifically, we leverage our strand r…

    Submitted 8 August, 2024; v1 submitted 28 July, 2024; originally announced July 2024.

    Comments: Project page: https://cs.yale.edu/homes/che/projects/perm/

  40. arXiv:2407.19446  [pdf, ps, other]

    cs.IT stat.ML

    Leave-One-Out Analysis for Nonconvex Robust Matrix Completion with General Thresholding Functions

    Authors: Tianming Wang, Ke Wei

    Abstract: We study the problem of robust matrix completion (RMC), where the partially observed entries of an underlying low-rank matrix are corrupted by sparse noise. Existing analysis of the non-convex methods for this problem either requires the explicit but empirically redundant regularization in the algorithm or requires sample splitting in the analysis. In this paper, we consider a simple yet efficient…

    Submitted 28 July, 2024; originally announced July 2024.

  41. arXiv:2407.19259  [pdf, other]

    cs.CV cs.AI

    Fine-Grained Scene Graph Generation via Sample-Level Bias Prediction

    Authors: Yansheng Li, Tingzhu Wang, Kang Wu, Linlin Wang, Xin Guo, Wenbin Wang

    Abstract: Scene Graph Generation (SGG) aims to explore the relationships between objects in images and obtain scene summary graphs, thereby better serving downstream tasks. However, the long-tailed problem has adversely affected the scene graph's quality. The predictions are dominated by coarse-grained relationships, lacking more informative fine-grained ones. The union region of one object pair (i.e., one…

    Submitted 27 July, 2024; originally announced July 2024.

    Comments: 24 pages, 10 figures, ECCV2024

  42. arXiv:2407.19147  [pdf, other]

    quant-ph cs.CR

    Reexamination of the realtime protection for user privacy in practical quantum private query

    Authors: Chun-Yan Wei, Xiao-Qiu Cai, Tian-Yin Wang

    Abstract: Quantum private query (QPQ) is the quantum version for symmetrically private retrieval. However, the user privacy in QPQ is generally guarded in the non-realtime and cheat sensitive way. That is, the dishonest database holder's cheating to elicit user privacy can only be discovered after the protocol is finished (when the user finds some errors in the retrieved database item). Such delayed detecti…

    Submitted 26 July, 2024; originally announced July 2024.

  43. arXiv:2407.18248  [pdf, other]

    cs.CL

    Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning

    Authors: Tianduo Wang, Shichen Li, Wei Lu

    Abstract: Effective training of language models (LMs) for mathematical reasoning tasks demands high-quality supervised fine-tuning data. Besides obtaining annotations from human experts, a common alternative is sampling from larger and more powerful LMs. However, this knowledge distillation approach can be costly and unstable, particularly when relying on closed-source, proprietary LMs like GPT-4, whose beh…

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: ACL 2024. Code and data are available at https://github.com/TianduoWang/DPO-ST

  44. arXiv:2407.18074  [pdf, other]

    cs.GT cs.LG cs.MA

    Principal-Agent Reinforcement Learning

    Authors: Dima Ivanov, Paul Dütting, Inbal Talgam-Cohen, Tonghan Wang, David C. Parkes

    Abstract: Contracts are the economic framework which allows a principal to delegate a task to an agent -- despite misaligned interests, and even without directly observing the agent's actions. In many modern reinforcement learning settings, self-interested agents learn to perform a multi-stage task delegated to them by a principal. We explore the significant potential of utilizing contracts to incentivize t…

    Submitted 25 July, 2024; originally announced July 2024.

  45. arXiv:2407.16931  [pdf, other]

    cs.CL

    ScholarChemQA: Unveiling the Power of Language Models in Chemical Research Question Answering

    Authors: Xiuying Chen, Tairan Wang, Taicheng Guo, Kehan Guo, Juexiao Zhou, Haoyang Li, Mingchen Zhuge, Jürgen Schmidhuber, Xin Gao, Xiangliang Zhang

    Abstract: Question Answering (QA) effectively evaluates language models' reasoning and knowledge depth. While QA datasets are plentiful in areas such as the general domain and biomedicine, academic chemistry is less explored. Chemical QA plays a crucial role in both education and research by effectively translating complex chemical information into a readily understandable format. Addressing this gap, we introduce S…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: 14 pages

  46. Spatial-Temporal Cross-View Contrastive Pre-training for Check-in Sequence Representation Learning

    Authors: Letian Gong, Huaiyu Wan, Shengnan Guo, Xiucheng Li, Yan Lin, Erwen Zheng, Tianyi Wang, Zeyu Zhou, Youfang Lin

    Abstract: The rapid growth of location-based services (LBS) has yielded massive amounts of data on human mobility. Effectively extracting meaningful representations for user-generated check-in sequences is pivotal for facilitating various downstream services. However, the user-generated check-in data are simultaneously influenced by the surrounding objective circumstances and the user's subjective intention…

    Submitted 25 July, 2024; v1 submitted 22 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted as a regular paper at IEEE TKDE

  47. arXiv:2407.15334  [pdf, other]

    cs.CV

    Explore the LiDAR-Camera Dynamic Adjustment Fusion for 3D Object Detection

    Authors: Yiran Yang, Xu Gao, Tong Wang, Xin Hao, Yifeng Shi, Xiao Tan, Xiaoqing Ye, Jingdong Wang

    Abstract: Camera and LiDAR serve as informative sensors for accurate and robust autonomous driving systems. However, these sensors often exhibit heterogeneous natures, resulting in distributional modality gaps that present significant challenges for fusion. To address this, a robust fusion technique is crucial, particularly for enhancing 3D object detection. In this paper, we introduce a dynamic adjustment…

    Submitted 21 July, 2024; originally announced July 2024.

  48. arXiv:2407.15259  [pdf, other]

    cs.AI

    New Rules for Causal Identification with Background Knowledge

    Authors: Tian-Zuo Wang, Lue Tao, Zhi-Hua Zhou

    Abstract: Identifying causal relations is crucial for a variety of downstream tasks. In addition to observational data, background knowledge (BK), which could be attained from human expertise or experiments, is usually introduced for uncovering causal relations. This raises an open problem: in the presence of latent variables, which causal relations are identifiable from observational data and BK? In t…

    Submitted 21 July, 2024; originally announced July 2024.

  49. arXiv:2407.14020  [pdf, other]

    q-bio.NC cs.LG

    NeuroBind: Towards Unified Multimodal Representations for Neural Signals

    Authors: Fengyu Yang, Chao Feng, Daniel Wang, Tianye Wang, Ziyao Zeng, Zhiyang Xu, Hyoungseob Park, Pengliang Ji, Hanbin Zhao, Yuanning Li, Alex Wong

    Abstract: Understanding neural activity and information representation is crucial for advancing knowledge of brain function and cognition. Neural activity, measured through techniques like electrophysiology and neuroimaging, reflects various aspects of information processing. Recent advances in deep neural networks offer new approaches to analyzing these signals using pre-trained models. However, challenges…

    Submitted 19 July, 2024; originally announced July 2024.

  50. arXiv:2407.13945  [pdf, other]

    cs.CL

    FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking

    Authors: Zhuoer Wang, Leonardo F. R. Ribeiro, Alexandros Papangelis, Rohan Mukherjee, Tzu-Yen Wang, Xinyan Zhao, Arijit Biswas, James Caverlee, Angeliki Metallinou

    Abstract: API call generation is the cornerstone of large language models' tool-using ability that provides access to the larger world. However, existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request. To address these limitations, we propose an output-side opt…

    Submitted 18 July, 2024; originally announced July 2024.