
Showing 1–50 of 143 results for author: Wen, W

Searching in archive cs.
  1. arXiv:2407.17691  [pdf, other]

    cs.NI eess.SY

    Design, Key Techniques and System-Level Simulation for NB-IoT Networks

    Authors: Shutao Zhang, Peiran Wu, Hongqing Huang, Liya Zhu, Yijia Guo, Wenkun Wen, Tingting Yang, Minghua Xia

    Abstract: Narrowband Internet of Things (NB-IoT) is a promising technology designated specially by the 3rd Generation Partnership Project (3GPP) to meet the growing demand of massive machine-type communications (mMTC). More and more industrial companies choose NB-IoT network as the solution to mMTC due to its unique design and technical specification released by 3GPP. In order to evaluate the performance of… ▽ More

    Submitted 24 July, 2024; originally announced July 2024.

  2. arXiv:2407.13833  [pdf, other]

    cs.CL cs.AI

    Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle

    Authors: Emman Haider, Daniel Perez-Becker, Thomas Portet, Piyush Madan, Amit Garg, David Majercak, Wen Wen, Dongwoo Kim, Ziyi Yang, Jianwen Zhang, Hiteshi Sharma, Blake Bullwinkel, Martin Pouliot, Amanda Minnich, Shiven Chawla, Solianna Herrera, Shahed Warreth, Maggie Engler, Gary Lopez, Nina Chikanov, Raja Sekhar Rao Dheekonda, Bolor-Erdene Jagdagdorj, Roman Lutz, Richard Lundeen, Tori Westerhoff, et al. (5 additional authors not shown)

    Abstract: Recent innovations in language model training have demonstrated that it is possible to create highly performant models that are small enough to run on a smartphone. As these models are deployed in an increasing number of domains, it is critical to ensure that they are aligned with human preferences and safety considerations. In this report, we present our methodology for safety aligning the Phi-3… ▽ More

    Submitted 18 July, 2024; originally announced July 2024.

  3. arXiv:2407.12319  [pdf, other]

    cs.CV

    Serialized Point Mamba: A Serialized Point Cloud Mamba Segmentation Model

    Authors: Tao Wang, Wei Wen, Jingzhi Zhai, Kang Xu, Haoming Luo

    Abstract: Point cloud segmentation is crucial for robotic visual perception and environmental understanding, enabling applications such as robotic navigation and 3D reconstruction. However, handling the sparse and unordered nature of point cloud data presents challenges for efficient and accurate segmentation. Inspired by the Mamba model's success in natural language processing, we propose the Serialized Po… ▽ More

    Submitted 17 July, 2024; originally announced July 2024.

  4. arXiv:2407.05633  [pdf, other]

    cs.LG cs.CR

    AdaPI: Facilitating DNN Model Adaptivity for Efficient Private Inference in Edge Computing

    Authors: Tong Zhou, Jiahui Zhao, Yukui Luo, Xi Xie, Wujie Wen, Caiwen Ding, Xiaolin Xu

    Abstract: Private inference (PI) has emerged as a promising solution to execute computations on encrypted data, safeguarding user privacy and model parameters in edge computing. However, existing PI methods are predominantly developed considering constant resource constraints, overlooking the varied and dynamic resource constraints in diverse edge devices, like energy budgets. Consequently, model providers… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: ICCAD 2024 accepted publication

  5. arXiv:2407.02888  [pdf, ps, other]

    cs.LG cs.AI

    Joint Optimization of Resource Allocation and Data Selection for Fast and Cost-Efficient Federated Edge Learning

    Authors: Yunjian Jia, Zhen Huang, Jiping Yan, Yulu Zhang, Kun Luo, Wanli Wen

    Abstract: Deploying federated learning at the wireless edge introduces federated edge learning (FEEL). Given FEEL's limited communication resources and potential mislabeled data on devices, improper resource allocation or data selection can hurt convergence speed and increase training costs. Thus, to realize an efficient FEEL system, this paper emphasizes jointly optimizing resource allocation and data sele… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  6. arXiv:2406.10932  [pdf, other]

    cs.SD cs.AI eess.AS

    Imperceptible Rhythm Backdoor Attacks: Exploring Rhythm Transformation for Embedding Undetectable Vulnerabilities on Speech Recognition

    Authors: Wenhan Yao, Jiangkun Yang, Yongqiang He, Jia Liu, Weiping Wen

    Abstract: Speech recognition is an essential start ring of human-computer interaction, and recently, deep learning models have achieved excellent success in this task. However, when the model training and private data provider are always separated, some security threats that make deep neural networks (DNNs) abnormal deserve to be researched. In recent years, the typical backdoor attacks have been researched… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  7. arXiv:2406.06544  [pdf, other]

    cs.AR cs.AI

    TSB: Tiny Shared Block for Efficient DNN Deployment on NVCIM Accelerators

    Authors: Yifan Qin, Zheyu Yan, Zixuan Pan, Wujie Wen, Xiaobo Sharon Hu, Yiyu Shi

    Abstract: Compute-in-memory (CIM) accelerators using non-volatile memory (NVM) devices offer promising solutions for energy-efficient and low-latency Deep Neural Network (DNN) inference execution. However, practical deployment is often hindered by the challenge of dealing with the massive amount of model weight parameters impacted by the inherent device variations within non-volatile computing-in-memory (NV… ▽ More

    Submitted 8 May, 2024; originally announced June 2024.

  8. arXiv:2406.02629  [pdf, other]

    cs.CR cs.LG

    SSNet: A Lightweight Multi-Party Computation Scheme for Practical Privacy-Preserving Machine Learning Service in the Cloud

    Authors: Shijin Duan, Chenghong Wang, Hongwu Peng, Yukui Luo, Wujie Wen, Caiwen Ding, Xiaolin Xu

    Abstract: As privacy-preserving becomes a pivotal aspect of deep learning (DL) development, multi-party computation (MPC) has gained prominence for its efficiency and strong security. However, the practice of current MPC frameworks is limited, especially when dealing with large neural networks, exemplified by the prolonged execution time of 25.8 seconds for secure inference on ResNet-152. The primary challe… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 16 pages, 9 figures

  9. arXiv:2405.02238  [pdf, other]

    cs.CR

    Secure and Efficient General Matrix Multiplication On Cloud Using Homomorphic Encryption

    Authors: Yang Gao, Gang Quan, Soamar Homsi, Wujie Wen, Liqiang Wang

    Abstract: Despite the cloud's enormous technical and financial advantages, security and privacy have always been the primary concern for adopting cloud computing facilities, especially for government agencies and commercial sectors with high-security requirements. Homomorphic Encryption (HE) has recently emerged as an effective tool in assuring privacy and security for sensitive applications by allowing computi… ▽ More

    Submitted 22 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: 10 pages, 7 figures, 4 tables

  10. arXiv:2404.14724  [pdf]

    cs.RO

    Tightly Joined Positioning and Control Model for Unmanned Aerial Vehicles Based on Factor Graph Optimization

    Authors: Peiwen Yang, Weisong Wen, Shiyu Bai, Li-Ta Hsu

    Abstract: The execution of flight missions by unmanned aerial vehicles (UAV) primarily relies on navigation. In particular, the navigation pipeline has traditionally been divided into positioning and control, operating in a sequential loop. However, the existing navigation pipeline, where the positioning and control are decoupled, struggles to adapt to ubiquitous uncertainties arising from measurement noise… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  11. arXiv:2404.00252  [pdf, other]

    eess.IV cs.CV

    Learned Scanpaths Aid Blind Panoramic Video Quality Assessment

    Authors: Kanglong Fan, Wen Wen, Mu Li, Yifan Peng, Kede Ma

    Abstract: Panoramic videos have the advantage of providing an immersive and interactive viewing experience. Nevertheless, their spherical nature gives rise to various and uncertain user viewing behaviors, which poses significant challenges for panoramic video quality assessment (PVQA). In this work, we propose an end-to-end optimized, blind PVQA method with explicit modeling of user viewing patterns through… ▽ More

    Submitted 15 May, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  12. arXiv:2403.15191  [pdf, other]

    cs.CR cs.DC

    VORTEX: Real-Time Off-Chain Payments and Cross-Chain Swaps for Cryptocurrencies

    Authors: Di Wu, Jian Liu, Zhengwei Hou, Wu Wen, Kui Ren

    Abstract: In this paper, we present VORTEX, a TEE-based layer-2 solution that tackles two crucial challenges in the realm of cryptocurrencies: off-chain payments and cross-chain swaps. It offers three notable features: - Channel-free off-chain payments: it allows a payer to make direct payments to anyone without requiring any on-chain relationship or intermediary channels. - Real-time yet decentralized cros… ▽ More

    Submitted 5 June, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  13. arXiv:2403.05026  [pdf, other]

    cs.LG cs.AI

    Spectral Invariant Learning for Dynamic Graphs under Distribution Shifts

    Authors: Zeyang Zhang, Xin Wang, Ziwei Zhang, Zhou Qin, Weigao Wen, Hui Xue, Haoyang Li, Wenwu Zhu

    Abstract: Dynamic graph neural networks (DyGNNs) currently struggle with handling distribution shifts that are inherent in dynamic graphs. Existing work on DyGNNs with out-of-distribution settings only focuses on the time domain, failing to handle cases involving distribution shifts in the spectral domain. In this paper, we discover that there exist cases with distribution shifts unobservable in the time do… ▽ More

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: NeurIPS'23

  14. arXiv:2402.19276  [pdf, other]

    eess.IV cs.CV

    Modular Blind Video Quality Assessment

    Authors: Wen Wen, Mu Li, Yabin Zhang, Yiting Liao, Junlin Li, Li Zhang, Kede Ma

    Abstract: Blind video quality assessment (BVQA) plays a pivotal role in evaluating and improving the viewing experience of end-users across a wide range of video-based platforms and services. Contemporary deep learning-based models primarily analyze video content in its aggressively subsampled format, while being blind to the impact of the actual spatial resolution and frame rate on video quality. In this p… ▽ More

    Submitted 31 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted by CVPR 2024; Camera-ready version

  15. arXiv:2402.18934  [pdf, other]

    cs.RO

    RELEAD: Resilient Localization with Enhanced LiDAR Odometry in Adverse Environments

    Authors: Zhiqiang Chen, Hongbo Chen, Yuhua Qi, Shipeng Zhong, Dapeng Feng, Wu Jin, Weisong Wen, Ming Liu

    Abstract: LiDAR-based localization is valuable for applications like mining surveys and underground facility maintenance. However, existing methods can struggle when dealing with uninformative geometric structures in challenging scenarios. This paper presents RELEAD, a LiDAR-centric solution designed to address scan-matching degradation. Our method enables degeneracy-free point cloud registration by solving… ▽ More

    Submitted 15 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Journal ref: published in ICRA 2024

  16. arXiv:2402.11790  [pdf, other]

    cs.RO

    CoLRIO: LiDAR-Ranging-Inertial Centralized State Estimation for Robotic Swarms

    Authors: Shipeng Zhong, Hongbo Chen, Yuhua Qi, Dapeng Feng, Zhiqiang Chen, Jin Wu, Weisong Wen, Ming Liu

    Abstract: Collaborative state estimation using different heterogeneous sensors is a fundamental prerequisite for robotic swarms operating in GPS-denied environments, posing a significant research challenge. In this paper, we introduce a centralized system to facilitate collaborative LiDAR-ranging-inertial state estimation, enabling robotic swarms to operate without the need for anchor deployment. The system… ▽ More

    Submitted 23 February, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Journal ref: published in ICRA 2024

  17. arXiv:2402.11262  [pdf, other]

    cs.IR cs.LG

    Mirror Gradient: Towards Robust Multimodal Recommender Systems via Exploring Flat Local Minima

    Authors: Shanshan Zhong, Zhongzhan Huang, Daifeng Li, Wushao Wen, Jinghui Qin, Liang Lin

    Abstract: Multimodal recommender systems utilize various types of information to model user preferences and item features, helping users discover items aligned with their interests. The integration of multimodal information mitigates the inherent challenges in recommender systems, e.g., the data sparsity problem and cold-start issues. However, it simultaneously magnifies certain risks from multimodal inform… ▽ More

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: Accepted by WWW'24

  18. arXiv:2401.11664  [pdf, other]

    cs.LG cs.AI cs.AR

    Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM

    Authors: Bingbing Li, Geng Yuan, Zigeng Wang, Shaoyi Huang, Hongwu Peng, Payman Behnam, Wujie Wen, Hang Liu, Caiwen Ding

    Abstract: Resistive Random Access Memory (ReRAM) has emerged as a promising platform for deep neural networks (DNNs) due to its support for parallel in-situ matrix-vector multiplication. However, hardware failures, such as stuck-at-fault defects, can result in significant prediction errors during model inference. While additional crossbars can be used to address these failures, they come with storage overhe… ▽ More

    Submitted 21 January, 2024; originally announced January 2024.

  19. Enhancing Communication Efficiency of Semantic Transmission via Joint Processing Technique

    Authors: Xumin Pu, Tiantian Lei, Wanli Wen, Qianbin Chen

    Abstract: This work presents a novel semantic transmission framework in wireless networks, leveraging the joint processing technique. Our framework enables multiple cooperating base stations to efficiently transmit semantic information to multiple users simultaneously. To enhance the semantic communication efficiency of the transmission framework, we formulate an optimization problem with the objective of m… ▽ More

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: 6 pages, 6 figures

  20. arXiv:2312.02439  [pdf, other]

    cs.AI cs.CL cs.CV

    Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation

    Authors: Shanshan Zhong, Zhongzhan Huang, Shanghua Gao, Wushao Wen, Liang Lin, Marinka Zitnik, Pan Zhou

    Abstract: Chain-of-Thought (CoT) guides large language models (LLMs) to reason step-by-step, and can motivate their logical reasoning ability. While effective for logical tasks, CoT is not conducive to creative problem-solving which often requires out-of-box thoughts and is crucial for innovation advancements. In this paper, we explore the Leap-of-Thought (LoT) abilities within LLMs -- a non-sequential, cre… ▽ More

    Submitted 21 April, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: Technical report

  21. arXiv:2311.13169  [pdf, other]

    cs.LG cs.AI

    SiGeo: Sub-One-Shot NAS via Information Theory and Geometry of Loss Landscape

    Authors: Hua Zheng, Kuang-Hung Liu, Igor Fedorov, Xin Zhang, Wen-Yen Chen, Wei Wen

    Abstract: Neural Architecture Search (NAS) has become a widely used tool for automating neural network design. While one-shot NAS methods have successfully reduced computational requirements, they often require extensive training. On the other hand, zero-shot NAS utilizes training-free proxies to evaluate a candidate architecture's test performance but has two limitations: (1) inability to use the informati… ▽ More

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: 24 pages, 7 figures

  22. arXiv:2311.08430  [pdf, other]

    cs.LG cs.AI cs.IR

    Rankitect: Ranking Architecture Search Battling World-class Engineers at Meta Scale

    Authors: Wei Wen, Kuang-Hung Liu, Igor Fedorov, Xin Zhang, Hang Yin, Weiwei Chu, Kaveh Hassani, Mengying Sun, Jiang Liu, Xu Wang, Lin Jiang, Yuxin Chen, Buyun Zhang, Xi Liu, Dehua Cheng, Zhengxing Chen, Guang Zhao, Fangqiu Han, Jiyan Yang, Yuchen Hao, Liang Xiong, Wen-Yen Chen

    Abstract: Neural Architecture Search (NAS) has demonstrated its efficacy in computer vision and potential for ranking systems. However, prior work focused on academic problems, which are evaluated at small scale under well-controlled fixed baselines. In industry systems, such as the ranking systems at Meta, it is unclear whether NAS algorithms from the literature can outperform production baselines because of: (1… ▽ More

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: Wei Wen and Kuang-Hung Liu contribute equally

  23. arXiv:2311.07870  [pdf, other]

    cs.IR cs.AI

    AutoML for Large Capacity Modeling of Meta's Ranking Systems

    Authors: Hang Yin, Kuang-Hung Liu, Mengying Sun, Yuxin Chen, Buyun Zhang, Jiang Liu, Vivek Sehgal, Rudresh Rajnikant Panchal, Eugen Hotaj, Xi Liu, Daifeng Guo, Jamey Zhang, Zhou Wang, Shali Jiang, Huayu Li, Zhengxing Chen, Wen-Yen Chen, Jiyan Yang, Wei Wen

    Abstract: Web-scale ranking systems at Meta, serving billions of users, are complex. Improving ranking models is essential but engineering heavy. Automated Machine Learning (AutoML) can release engineers from the labor-intensive work of tuning ranking models; however, it is unknown if AutoML is efficient enough to meet tight production timelines in the real world and, at the same time, bring additional improvements to… ▽ More

    Submitted 16 November, 2023; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: Hang Yin and Kuang-Hung Liu contribute equally

  24. arXiv:2311.02327  [pdf, other]

    cs.RO cs.DB

    ECMD: An Event-Centric Multisensory Driving Dataset for SLAM

    Authors: Peiyu Chen, Weipeng Guan, Feng Huang, Yihan Zhong, Weisong Wen, Li-Ta Hsu, Peng Lu

    Abstract: Leveraging multiple sensors enhances complex environmental perception and increases resilience to varying luminance conditions and high-speed motion patterns, achieving precise localization and mapping. This paper proposes, ECMD, an event-centric multisensory dataset containing 81 sequences and covering over 200 km of various challenging driving scenarios including high-speed motion, repetitive sc… ▽ More

    Submitted 4 November, 2023; originally announced November 2023.

  25. arXiv:2311.00231  [pdf, other]

    cs.IR cs.LG

    DistDNAS: Search Efficient Feature Interactions within 2 Hours

    Authors: Tunhou Zhang, Wei Wen, Igor Fedorov, Xi Liu, Buyun Zhang, Fangqiu Han, Wen-Yen Chen, Yiping Han, Feng Yan, Hai Li, Yiran Chen

    Abstract: Search efficiency and serving efficiency are two major axes in building feature interactions and expediting the model development process in recommender systems. On large-scale benchmarks, searching for the optimal feature interaction design requires extensive cost due to the sequential workflow on the large volume of data. In addition, fusing interactions of various sources, orders, and mathemati… ▽ More

    Submitted 31 October, 2023; originally announced November 2023.

  26. arXiv:2310.20705  [pdf, other]

    cs.LG cs.IR

    Farthest Greedy Path Sampling for Two-shot Recommender Search

    Authors: Yufan Cao, Tunhou Zhang, Wei Wen, Feng Yan, Hai Li, Yiran Chen

    Abstract: Weight-sharing Neural Architecture Search (WS-NAS) provides an efficient mechanism for developing end-to-end deep recommender models. However, in complex search spaces, distinguishing between superior and inferior architectures (or paths) is challenging. This challenge is compounded by the limited coverage of the supernet and the co-adaptation of subnet weights, which restricts the exploration and… ▽ More

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: 9 pages, 5 figures

  27. arXiv:2310.02542  [pdf, other]

    cs.RO

    Tightly Joining Positioning and Control for Trustworthy Unmanned Aerial Vehicles Based on Factor Graph Optimization in Urban Transportation

    Authors: Peiwen Yang, Weisong Wen

    Abstract: Unmanned aerial vehicles (UAV) showed great potential in improving the efficiency of parcel delivery applications in the coming smart cities era. Unfortunately, the trustworthy positioning and control algorithms of the UAV are significantly challenged in complex urban areas. For example, the ubiquitous global navigation satellite system (GNSS) positioning can be degraded by the signal reflections… ▽ More

    Submitted 3 October, 2023; originally announced October 2023.

  28. arXiv:2309.15683  [pdf, other]

    cs.CV

    End-to-End Streaming Video Temporal Action Segmentation with Reinforce Learning

    Authors: Jinrong Zhang, Wujun Wen, Shenglan Liu, Yunheng Li, Qifeng Li, Lin Feng

    Abstract: The streaming temporal action segmentation (STAS) task, a supplementary task of temporal action segmentation (TAS), has not received adequate attention in the field of video understanding. Existing TAS methods are constrained to offline scenarios due to their heavy reliance on multimodal features and complete contextual information. The STAS task requires the model to classify each frame of the en… ▽ More

    Submitted 23 May, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: submit to TNNLS

  29. arXiv:2309.14331  [pdf, other]

    cs.LG cs.AI cs.CR

    LinGCN: Structural Linearized Graph Convolutional Network for Homomorphically Encrypted Inference

    Authors: Hongwu Peng, Ran Ran, Yukui Luo, Jiahui Zhao, Shaoyi Huang, Kiran Thorat, Tong Geng, Chenghong Wang, Xiaolin Xu, Wujie Wen, Caiwen Ding

    Abstract: The growth of Graph Convolution Network (GCN) model sizes has revolutionized numerous applications, surpassing human performance in areas such as personal healthcare and financial systems. The deployment of GCNs in the cloud raises privacy concerns due to potential adversarial attacks on client data. To address security concerns, Privacy-Preserving Machine Learning (PPML) using Homomorphic Encrypt… ▽ More

    Submitted 4 October, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: NeurIPS 2023 accepted publication

    ACM Class: E.3; I.2; B.0

  30. arXiv:2308.14105  [pdf, other]

    cs.CV cs.AI

    Unified and Dynamic Graph for Temporal Character Grouping in Long Videos

    Authors: Xiujun Shu, Wei Wen, Liangsheng Xu, Ruizhi Qiao, Taian Guo, Hanjun Li, Bei Gan, Xiao Wang, Xing Sun

    Abstract: Video temporal character grouping locates appearing moments of major characters within a video according to their identities. To this end, recent works have evolved from unsupervised clustering to graph-based supervised clustering. However, graph methods are built upon the premise of fixed affinity graphs, bringing many inexact connections. Besides, they extract multi-modal features with kinds of… ▽ More

    Submitted 22 June, 2024; v1 submitted 27 August, 2023; originally announced August 2023.

  31. arXiv:2308.10134  [pdf, other]

    cs.CR cs.LG

    AutoReP: Automatic ReLU Replacement for Fast Private Network Inference

    Authors: Hongwu Peng, Shaoyi Huang, Tong Zhou, Yukui Luo, Chenghong Wang, Zigeng Wang, Jiahui Zhao, Xi Xie, Ang Li, Tony Geng, Kaleel Mahmood, Wujie Wen, Xiaolin Xu, Caiwen Ding

    Abstract: The growth of the Machine-Learning-As-A-Service (MLaaS) market has highlighted clients' data privacy and security issues. Private inference (PI) techniques using cryptographic primitives offer a solution but often have high computation and communication costs, particularly with non-linear operators like ReLU. Many attempts to reduce ReLU operations exist, but they may need heuristic threshold sele… ▽ More

    Submitted 19 August, 2023; originally announced August 2023.

    Comments: ICCV 2023 accepted publication

    ACM Class: E.3; I.2; B.0

  32. arXiv:2308.04197  [pdf, other]

    cs.CV

    D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation

    Authors: Hanjun Li, Xiujun Shu, Sunan He, Ruizhi Qiao, Wei Wen, Taian Guo, Bei Gan, Xing Sun

    Abstract: Temporal sentence grounding (TSG) aims to locate a specific moment from an untrimmed video with a given natural language query. Recently, weakly supervised methods still have a large performance gap compared to fully supervised ones, while the latter requires laborious timestamp annotations. In this study, we aim to reduce the annotation cost yet keep competitive performance for TSG task compared… ▽ More

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: ICCV2023

  33. arXiv:2307.15853  [pdf, other]

    cs.LG cs.ET

    Improving Realistic Worst-Case Performance of NVCiM DNN Accelerators through Training with Right-Censored Gaussian Noise

    Authors: Zheyu Yan, Yifan Qin, Wujie Wen, Xiaobo Sharon Hu, Yiyu Shi

    Abstract: Compute-in-Memory (CiM), built upon non-volatile memory (NVM) devices, is promising for accelerating deep neural networks (DNNs) owing to its in-situ data processing capability and superior energy efficiency. Unfortunately, the well-trained model parameters, after being mapped to NVM devices, can often exhibit large deviations from their intended values due to device variations, resulting in notab… ▽ More

    Submitted 28 July, 2023; originally announced July 2023.

  34. arXiv:2307.13981  [pdf, other]

    cs.CV cs.MM eess.IV

    Analysis of Video Quality Datasets via Design of Minimalistic Video Quality Models

    Authors: Wei Sun, Wen Wen, Xiongkuo Min, Long Lan, Guangtao Zhai, Kede Ma

    Abstract: Blind video quality assessment (BVQA) plays an indispensable role in monitoring and improving the end-users' viewing experience in various real-world video-enabled media applications. As an experimental field, the improvements of BVQA models have been measured primarily on a few human-rated VQA datasets. Thus, it is crucial to gain a better understanding of existing VQA datasets in order to proper… ▽ More

    Submitted 3 April, 2024; v1 submitted 26 July, 2023; originally announced July 2023.

  35. Spectral-DP: Differentially Private Deep Learning through Spectral Perturbation and Filtering

    Authors: Ce Feng, Nuo Xu, Wujie Wen, Parv Venkitasubramaniam, Caiwen Ding

    Abstract: Differential privacy is a widely accepted measure of privacy in the context of deep learning algorithms, and achieving it relies on a noisy training approach known as differentially private stochastic gradient descent (DP-SGD). Because DP-SGD requires direct noise addition to every gradient in a dense neural network, the privacy is achieved at a significant utility cost. In this work, we present Spectral-… ▽ More

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: Accepted in 2023 IEEE Symposium on Security and Privacy (SP)

  36. arXiv:2306.15513  [pdf, other]

    cs.CR

    PASNet: Polynomial Architecture Search Framework for Two-party Computation-based Secure Neural Network Deployment

    Authors: Hongwu Peng, Shanglin Zhou, Yukui Luo, Nuo Xu, Shijin Duan, Ran Ran, Jiahui Zhao, Chenghong Wang, Tong Geng, Wujie Wen, Xiaolin Xu, Caiwen Ding

    Abstract: Two-party computation (2PC) is promising to enable privacy-preserving deep learning (DL). However, the 2PC-based privacy-preserving DL implementation comes with high comparison protocol overhead from the non-linear operators. This work presents PASNet, a novel systematic framework that enables low latency, high energy efficiency & accuracy, and security-guaranteed 2PC-DL by integrating the hardwar… ▽ More

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: DAC 2023 accepted publication; a short version was published at the AAAI 2023 workshop on DL-Hardware Co-Design for AI Acceleration: RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference

    ACM Class: E.3; I.2; B.0

    Journal ref: DAC 2023

  37. arXiv:2305.14561  [pdf, other]

    cs.LG cs.AI cs.AR

    Negative Feedback Training: A Novel Concept to Improve Robustness of NVCIM DNN Accelerators

    Authors: Yifan Qin, Zheyu Yan, Wujie Wen, Xiaobo Sharon Hu, Yiyu Shi

    Abstract: Compute-in-memory (CIM) accelerators built upon non-volatile memory (NVM) devices excel in energy efficiency and latency when performing Deep Neural Network (DNN) inference, thanks to their in-situ data processing capability. However, the stochastic nature and intrinsic variations of NVM devices often result in performance degradation in DNN inference. Introducing these non-ideal device behaviors… ▽ More

    Submitted 12 April, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

  38. arXiv:2305.05200  [pdf, other]

    cs.CV cs.AI

    LSAS: Lightweight Sub-attention Strategy for Alleviating Attention Bias Problem

    Authors: Shanshan Zhong, Wushao Wen, Jinghui Qin, Qiangpu Chen, Zhongzhan Huang

    Abstract: In computer vision, the performance of deep neural networks (DNNs) is highly related to the feature extraction ability, i.e., the ability to recognize and focus on key pixel regions in an image. However, in this paper, we quantitatively and statistically illustrate that DNNs have a serious attention bias problem on many samples from some popular datasets: (1) Position bias: DNNs fully focus on lab… ▽ More

    Submitted 9 May, 2023; originally announced May 2023.

  39. arXiv:2305.05189  [pdf, other]

    cs.CL cs.CV

    SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models

    Authors: Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin

    Abstract: Diffusion models, which have emerged to become popular text-to-image generation models, can produce high-quality and content-rich images guided by textual prompts. However, there are limitations to semantic understanding and commonsense reasoning in existing models when the input prompts are concise narrative, resulting in low-quality image generation. To improve the capacities for narrative promp… ▽ More

    Submitted 29 November, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

    Comments: accepted by ACM MM 2023

  40. arXiv:2304.12214  [pdf, other]

    cs.NE

    Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration

    Authors: Shaoyi Huang, Haowen Fang, Kaleel Mahmood, Bowen Lei, Nuo Xu, Bin Lei, Yue Sun, Dongkuan Xu, Wujie Wen, Caiwen Ding

    Abstract: Biologically inspired Spiking Neural Networks (SNNs) have attracted significant attention for their ability to provide extremely energy-efficient machine intelligence through event-driven operation and sparse activities. As artificial intelligence (AI) becomes ever more democratized, there is an increasing need to execute SNN models on edge devices. Existing works adopt weight pruning to reduce SN… ▽ More

    Submitted 24 April, 2023; originally announced April 2023.

  41. arXiv:2304.06345  [pdf, other]

    cs.CV cs.AI

    ASR: Attention-alike Structural Re-parameterization

    Authors: Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin

    Abstract: The structural re-parameterization (SRP) technique is a novel deep learning technique that achieves interconversion between different network architectures through equivalent parameter transformations. This technique enables the mitigation of the extra costs for performance improvement during training, such as parameter size and inference time, through these transformations during inference, and t… ▽ More

    Submitted 26 August, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: Technical report

  42. arXiv:2304.04183  [pdf, other]

    cs.LG stat.ME

    Nearest-Neighbor Sampling Based Conditional Independence Testing

    Authors: Shuai Li, Ziqi Chen, Hongtu Zhu, Christina Dan Wang, Wang Wen

    Abstract: The conditional randomization test (CRT) was recently proposed to test whether two random variables X and Y are conditionally independent given random variables Z. The CRT assumes that the conditional distribution of X given Z is known under the null hypothesis and then it is compared to the distribution of the observed samples of the original data. The aim of this paper is to develop a novel alte…

    Submitted 9 April, 2023; originally announced April 2023.

    Comments: Accepted at AAAI 2023; 9 Pages, 3 Figures, 2 Tables

  43. arXiv:2303.16552  [pdf, other]

    cs.CR

    Visual Content Privacy Protection: A Survey

    Authors: Ruoyu Zhao, Yushu Zhang, Tao Wang, Wenying Wen, Yong Xiang, Xiaochun Cao

    Abstract: Vision is the most important sense for people, and it is also one of the main ways of cognition. As a result, people tend to utilize visual content to capture and share their life experiences, which greatly facilitates the transfer of information. Meanwhile, it also increases the risk of privacy violations, e.g., an image or video can reveal different kinds of privacy-sensitive information. Resear…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: 24 pages, 13 figures

  44. arXiv:2303.13955  [pdf, other]

    cs.CV

    PIAT: Parameter Interpolation based Adversarial Training for Image Classification

    Authors: Kun He, Xin Liu, Yichen Yang, Zhou Qin, Weigao Wen, Hui Xue, John E. Hopcroft

    Abstract: Adversarial training has been demonstrated to be the most effective approach to defend against adversarial attacks. However, existing adversarial training methods show apparent oscillations and overfitting issues in the training process, degrading the defense efficacy. In this work, we propose a novel framework, termed Parameter Interpolation based Adversarial Training (PIAT), that makes full use o…

    Submitted 24 March, 2023; originally announced March 2023.

  45. arXiv:2302.02292  [pdf, other]

    cs.CR cs.LG

    RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference

    Authors: Hongwu Peng, Shanglin Zhou, Yukui Luo, Nuo Xu, Shijin Duan, Ran Ran, Jiahui Zhao, Shaoyi Huang, Xi Xie, Chenghong Wang, Tong Geng, Wujie Wen, Xiaolin Xu, Caiwen Ding

    Abstract: The proliferation of deep learning (DL) has led to the emergence of privacy and security concerns. To address these issues, secure two-party computation (2PC) has been proposed as a means of enabling privacy-preserving DL computation. However, in practice, 2PC methods often incur high computation and communication overhead, which can impede their use in large-scale systems. To address this challen…

    Submitted 22 February, 2023; v1 submitted 4 February, 2023; originally announced February 2023.

    Comments: This work is an updated version of arXiv:2209.09424; the original version has been withdrawn

    ACM Class: I.2

  46. arXiv:2302.00272  [pdf, other]

    cs.LG cs.AI

    W2SAT: Learning to generate SAT instances from Weighted Literal Incidence Graphs

    Authors: Weihuang Wen, Tianshu Yu

    Abstract: The Boolean Satisfiability (SAT) problem stands out as an attractive NP-complete problem in theoretical computer science and plays a central role in a broad spectrum of computing-related applications. Exploiting and tuning SAT solvers under numerous scenarios require massive high-quality industry-level SAT instances, which unfortunately are quite limited in the real world. To address the data insuff…

    Submitted 1 February, 2023; originally announced February 2023.

  47. Trajectory Smoothing Using GNSS/PDR Integration Via Factor Graph Optimization in Urban Canyons

    Authors: Yihan Zhong, Weisong Wen, Li-Ta Hsu

    Abstract: Accurate and smooth global navigation satellite system (GNSS) positioning for pedestrians in urban canyons is still a challenge due to the multipath effects and the non-line-of-sight (NLOS) receptions caused by the reflections from surrounding buildings. The recently developed factor graph optimization (FGO) based GNSS positioning method opened a new window for improving urban GNSS positioning by…

    Submitted 11 May, 2023; v1 submitted 29 December, 2022; originally announced December 2022.

    Comments: 11 pages, 14 figures

  48. arXiv:2212.05477  [pdf]

    cs.RO

    3D LiDAR Aided GNSS NLOS Mitigation for Reliable GNSS-RTK Positioning in Urban Canyons

    Authors: Xikun Liu, Weisong Wen, Feng Huang, Han Gao, Yongliang Wang, Li-Ta Hsu

    Abstract: GNSS and LiDAR odometry are complementary as they provide absolute and relative positioning, respectively. Their integration in a loosely-coupled manner is straightforward but is challenged in urban canyons due to the GNSS signal reflections. Recently proposed 3D LiDAR-aided (3DLA) GNSS methods employ the point cloud map to identify the non-line-of-sight (NLOS) reception of GNSS signals. This facili…

    Submitted 11 December, 2022; originally announced December 2022.

  49. arXiv:2212.04160  [pdf, ps, other]

    cs.DC cs.CR

    Blockchain for Data Sharing at the Network Edge: Trade-Off Between Capability and Security

    Authors: Yixin Li, Liang Liang, Yunjian Jia, Wanli Wen, Chaowei Tang, Zhengchuan Chen

    Abstract: Blockchain is a promising technology to enable distributed and reliable data sharing at the network edge. The high security in blockchain is undoubtedly a critical factor for the network to handle important data items. On the other hand, according to the dilemma in blockchain, an overemphasis on distributed security will lead to poor transaction-processing capability, which limits the application of…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: 14 pages, 8 figures

  50. arXiv:2211.15127  [pdf]

    cs.RO eess.SY

    Safety-quantifiable Line Feature-based Monocular Visual Localization with 3D Prior Map

    Authors: Xi Zheng, Weisong Wen, Li-Ta Hsu

    Abstract: Accurate and safety-quantifiable localization is of great significance for safety-critical autonomous systems, such as unmanned ground vehicles (UGV) and unmanned aerial vehicles (UAV). The visual odometry-based method can provide accurate positioning in a short period but is subject to drift over time. Moreover, the quantification of the safety of the localization solution (the error is bounded…

    Submitted 28 November, 2022; originally announced November 2022.