
Showing 1–50 of 268 results for author: Wang, R

Searching in archive eess. Search in all archives.
  1. arXiv:2407.16359  [pdf, other]

    math.OC eess.SY

    EM++: A parameter learning framework for stochastic switching systems

    Authors: Renzi Wang, Alexander Bodard, Mathijs Schuurmans, Panagiotis Patrinos

    Abstract: This paper proposes a general switching dynamical system model, and a custom majorization-minimization-based algorithm EM++ for identifying its parameters. For certain families of distributions, such as Gaussian distributions, this algorithm reduces to the well-known expectation-maximization method. We prove global convergence of the algorithm under suitable assumptions, thus addressing an importa… ▽ More

    Submitted 23 July, 2024; originally announced July 2024.

  2. arXiv:2407.11223  [pdf, other]

    eess.IV

    DD_RoTIR: Dual-Domain Image Registration via Image Translation and Hierarchical Feature-matching

    Authors: Ruixiong Wang, Stephen Cross, Alin Achim

    Abstract: Microscopy images obtained from multiple camera lenses or sensors in biological experiments provide a comprehensive understanding of objects from diverse perspectives. However, using multiple microscope setups increases the risk of misalignment of identical target features across different modalities, making multimodal image registration crucial. In this work, we build upon previous successes in b… ▽ More

    Submitted 17 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: 30 pages including supporting information; 15 figures for main context, 5 figures for supporting information; 5 tables; 5 equations in main, 12 in supporting information

  3. arXiv:2407.11031  [pdf, other]

    cs.LG eess.SP

    Purification Of Contaminated Convolutional Neural Networks Via Robust Recovery: An Approach with Theoretical Guarantee in One-Hidden-Layer Case

    Authors: Hanxiao Lu, Zeyu Huang, Ren Wang

    Abstract: Convolutional neural networks (CNNs), one of the key architectures of deep learning models, have achieved superior performance on many machine learning tasks such as image classification, video recognition, and power systems. Despite their success, CNNs can be easily contaminated by natural noises and artificially injected noises such as backdoor attacks. In this paper, we propose a robust recover… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  4. arXiv:2407.09251  [pdf, other]

    cs.LG cs.AI eess.SP

    Deep Adversarial Defense Against Multilevel-Lp Attacks

    Authors: Ren Wang, Yuxuan Li, Alfred Hero

    Abstract: Deep learning models have shown considerable vulnerability to adversarial attacks, particularly as attacker strategies become more sophisticated. While traditional adversarial training (AT) techniques offer some resilience, they often focus on defending against a single type of attack, e.g., the $\ell_\infty$-norm attack, which can fail for other types. This paper introduces a computationally effi… ▽ More

    Submitted 12 July, 2024; originally announced July 2024.

  5. arXiv:2407.08093  [pdf, other]

    eess.IV cs.AI cs.CV eess.SP

    MemWarp: Discontinuity-Preserving Cardiac Registration with Memorized Anatomical Filters

    Authors: Hang Zhang, Xiang Chen, Renjiu Hu, Dongdong Liu, Gaolei Li, Rongguang Wang

    Abstract: Many existing learning-based deformable image registration methods impose constraints on deformation fields to ensure they are globally smooth and continuous. However, this assumption does not hold in cardiac image registration, where different anatomical regions exhibit asymmetric motions during respiration and movements due to sliding organs within the chest. Consequently, such global constraint… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 11 pages, 2 figures, 2 tables

  6. arXiv:2407.03896  [pdf, other]

    eess.SY

    Specification-guided temporal logic control for stochastic systems: a multi-layered approach

    Authors: Birgit C. van Huijgevoort, Ruohan Wang, Sadegh Soudjani, Sofie Haesaert

    Abstract: Designing controllers to satisfy temporal requirements has proven to be challenging for dynamical systems that are affected by uncertainty. This is mainly due to the states evolving in a continuous uncountable space, the stochastic evolution of the states, and infinite-horizon temporal requirements on the system evolution, all of which makes closed-form solutions generally inaccessible. A promisin… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  7. arXiv:2406.19677  [pdf, other]

    cs.NI eess.SP

    End-to-End Uplink Performance Analysis of Satellite-Based IoT Networks: A Stochastic Geometry Approach

    Authors: Jiusi Zhou, Ruibo Wang, Basem Shihada, Mohamed-Slim Alouini

    Abstract: With the deployment of satellite constellations, Internet-of-Things (IoT) devices in remote areas have gained access to low-cost network connectivity. In this paper, we investigate the performance of IoT devices connecting in up-link through low Earth orbit (LEO) satellites to geosynchronous equatorial orbit (GEO) links. We model the dynamic LEO satellite constellation using the stochastic geometr… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

  8. arXiv:2406.11265  [pdf, ps, other]

    eess.SY

    Balancing Performance and Cost for Two-Hop Cooperative Communications: Stackelberg Game and Distributed Multi-Agent Reinforcement Learning

    Authors: Yuanzhe Geng, Erwu Liu, Wei Ni, Rui Wang, Yan Liu, Hao Xu, Chen Cai, Abbas Jamalipour

    Abstract: This paper aims to balance performance and cost in a two-hop wireless cooperative communication network where the source and relays have contradictory optimization goals and make decisions in a distributed manner. This differs from most existing works that have typically assumed that source and relay nodes follow a schedule created implicitly by a central controller. We propose that the relays for… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  9. arXiv:2406.08200  [pdf, other]

    cs.SD cs.AI eess.AS

    Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding

    Authors: Rui Wang, Liping Chen, Kong AiK Lee, Zhen-Hua Ling

    Abstract: Voice anonymization has been developed as a technique for preserving privacy by replacing the speaker's voice in a speech signal with that of a pseudo-speaker, thereby obscuring the original voice attributes from machine recognition and human perception. In this paper, we focus on altering the voice attributes against machine recognition while retaining human perception. We referred to this as the… ▽ More

    Submitted 13 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  10. arXiv:2406.07061  [pdf, other]

    eess.IV cs.CV

    Triage of 3D pathology data via 2.5D multiple-instance learning to guide pathologist assessments

    Authors: Gan Gao, Andrew H. Song, Fiona Wang, David Brenes, Rui Wang, Sarah S. L. Chow, Kevin W. Bishop, Lawrence D. True, Faisal Mahmood, Jonathan T. C. Liu

    Abstract: Accurate patient diagnoses based on human tissue biopsies are hindered by current clinical practice, where pathologists assess only a limited number of thin 2D tissue slices sectioned from 3D volumetric tissue. Recent advances in non-destructive 3D pathology, such as open-top light-sheet microscopy, enable comprehensive imaging of spatially heterogeneous tissue morphologies, offering the feasibili… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: CVPR CVMI 2024

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 6955-6965

  11. Fast-Fading Channel and Power Optimization of the Magnetic Inductive Cellular Network

    Authors: Honglei Ma, Erwu Liu, Zhijun Fang, Rui Wang, Yongbin Gao, Wenjun Yu, Dongming Zhang

    Abstract: The cellular network of magnetic Induction (MI) communication holds promise in long-distance underground environments. In the traditional MI communication, there is no fast-fading channel since the MI channel is treated as a quasi-static channel. However, for the vehicle (mobile) MI (VMI) communication, the unpredictable antenna vibration brings the remarkable fast-fading. As such fast-fading cann… ▽ More

    Submitted 7 July, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: This work has been accepted by the IEEE TWC for publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  12. arXiv:2406.01419  [pdf, other]

    eess.SY

    High-performance magnetostatic wave resonators through deep anisotropic etching of GGG substrates

    Authors: Sudhanshu Tiwari, Anuj Ashok, Connor Devitt, Sunil A. Bhave, Renyuan Wang

    Abstract: Microscale resonators are fundamental and necessary building blocks for modern radio communication filters for mobile devices. The resonator's Q factor ($Q$) determines the insertion loss while coupling ($K_t^2$) governs the fractional bandwidth. The product $k_t^2 \times Q$ is widely recognized as the definitive figure of merit for microresonators. Magnetostatic wave resonators based on Yttrium I… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  13. arXiv:2405.17366  [pdf, other]

    cs.LG eess.SP

    EM-GANSim: Real-time and Accurate EM Simulation Using Conditional GANs for 3D Indoor Scenes

    Authors: Ruichen Wang, Dinesh Manocha

    Abstract: We present a novel machine-learning (ML) approach (EM-GANSim) for real-time electromagnetic (EM) propagation that is used for wireless communication simulation in 3D indoor environments. Our approach uses a modified conditional Generative Adversarial Network (GAN) that incorporates encoded geometry and transmitter location while adhering to the electromagnetic propagation theory. The overall physi… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 10 pages, 8 figures, 5 tables

  14. arXiv:2405.15927  [pdf]

    eess.SP cs.NE eess.SY

    Application based Evaluation of an Efficient Spike-Encoder, "Spiketrum"

    Authors: MHD Anas Alsakkal, Runze Wang, Jayawan Wijekoon, Huajin Tang

    Abstract: Spike-based encoders represent information as sequences of spikes or pulses, which are transmitted between neurons. A prevailing consensus suggests that spike-based approaches demonstrate exceptional capabilities in capturing the temporal dynamics of neural activity and have the potential to provide energy-efficient solutions for low-power applications. The Spiketrum encoder efficiently compresses… ▽ More

    Submitted 31 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: To be published at "IEEE/ACM Transactions on Audio, Speech, and Language Processing"

  15. arXiv:2405.15863  [pdf, other]

    cs.SD cs.AI eess.AS

    Quality-aware Masked Diffusion Transformer for Enhanced Music Generation

    Authors: Chang Li, Ruoyu Wang, Lijuan Liu, Jun Du, Yixuan Sun, Zilu Guo, Zhenrong Zhang, Yuan Jiang

    Abstract: In recent years, diffusion-based text-to-music (TTM) generation has gained prominence, offering a novel approach to synthesizing musical content from textual descriptions. Achieving high accuracy and diversity in this generation process requires extensive, high-quality data, which often constitutes only a fraction of available datasets. Within open-source datasets, the prevalence of issues like mi… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

  16. arXiv:2405.13339  [pdf, other]

    eess.SP

    Floor-Plan-aided Indoor Localization: Zero-Shot Learning Framework, Data Sets, and Prototype

    Authors: Haiyao Yu, Changyang She, Yunkai Hu, Geng Wang, Rui Wang, Branka Vucetic, Yonghui Li

    Abstract: Machine learning has been considered a promising approach for indoor localization. Nevertheless, the sample efficiency, scalability, and generalization ability remain open issues of implementing learning-based algorithms in practical systems. In this paper, we establish a zero-shot learning framework that does not need real-world measurements in a new communication environment. Specifically, a gra… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

  17. arXiv:2405.11432  [pdf, other]

    cs.LG eess.SY

    On Robust Reinforcement Learning with Lipschitz-Bounded Policy Networks

    Authors: Nicholas H. Barbara, Ruigang Wang, Ian R. Manchester

    Abstract: This paper presents a study of robust policy networks in deep reinforcement learning. We investigate the benefits of policy parameterizations that naturally satisfy constraints on their Lipschitz bound, analyzing their empirical performance and robustness on two representative problems: pendulum swing-up and Atari Pong. We illustrate that policy networks with small Lipschitz bounds are significant… ▽ More

    Submitted 18 May, 2024; originally announced May 2024.

  18. arXiv:2405.11115  [pdf]

    eess.IV physics.optics

    Ptychographic non-line-of-sight imaging for depth-resolved visualization of hidden objects

    Authors: Pengming Song, Qianhao Zhao, Ruihai Wang, Ninghe Liu, Yingqi Qiang, Tianbo Wang, Xincheng Zhang, Yi Zhang, Liangcai Cao, Guoan Zheng

    Abstract: Non-line-of-sight (NLOS) imaging enables the visualization of objects hidden from direct view, with applications in surveillance, remote sensing, and light detection and ranging. Here, we introduce a NLOS imaging technique termed ptychographic NLOS (pNLOS), which leverages coded ptychography for depth-resolved imaging of obscured objects. Our approach involves scanning a laser spot on a wall to il… ▽ More

    Submitted 17 May, 2024; originally announced May 2024.

  19. arXiv:2405.06186  [pdf, other]

    cs.IT eess.SP

    Sensing-Assisted Adaptive Channel Contention for Mobile Delay-Sensitive Communications

    Authors: Bojie Lv, Qianren Li, Rui Wang

    Abstract: This paper proposes an adaptive channel contention mechanism to optimize the queuing performance of a distributed millimeter wave (mmWave) uplink system with the capability of environment and mobility sensing. The mobile agents determine their back-off timer parameters according to their local knowledge of the uplink queue lengths, channel quality, and future channel statistics, where the channel… ▽ More

    Submitted 9 May, 2024; originally announced May 2024.

  20. arXiv:2404.19646  [pdf]

    eess.SP

    A Fully Screen-Printed Vanadium-Dioxide Switches Based Wideband Reconfigurable Intelligent Surface for 5G Bands

    Authors: Yiming Yang, Mohammad Vaseem, Ruiqi Wang, Behrooz Makki, Atif Shamim

    Abstract: Reconfigurable Intelligent Surface (RIS) is attracting more and more research interest because of its ability to reprogram the radio environment. Designing and implementing the RIS, however, is challenging because of limitations of printed circuit board (PCB) technology related to manufacturing of large sizes as well as the cost of switches. Thus, a low-cost manufacturing process suitable for larg… ▽ More

    Submitted 30 April, 2024; originally announced April 2024.

  21. arXiv:2404.15830  [pdf, other]

    eess.SP

    SNR Maximization and Localization for UAV-IRS-Assisted Near-Field Systems

    Authors: Hanfu Zhang, Yidan Mei, Erwu Liu, Rui Wang

    Abstract: This letter introduces a novel unmanned aerial vehicle (UAV)-intelligent reflecting surface (IRS) structure into near-field localization systems to enhance the design flexibility of IRS, thereby obtaining additional performance gains. Specifically, a UAV-IRS is utilized to improve the harsh wireless environment and provide localization possibilities. To improve the localization accuracy, a joint o… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 5 pages, 3 figures

  22. arXiv:2404.15761  [pdf, other]

    eess.SP

    Rechargeable UAV Trajectory Optimization for Real-Time Persistent Data Collection of Large-Scale Sensor Networks

    Authors: Rui Wang, Deshi Li, Qingqing Wu, Kaitao Meng, Boning Feng, Lele Cong

    Abstract: Unmanned aerial vehicles (UAVs) have received plenty of attention due to their high flexibility and enhanced communication ability. Nonetheless, the limited onboard energy restricts UAVs' application on persistent data collection missions in large areas. In this paper, we propose a rechargeable UAV-assisted periodic data collection scheme, where a UAV is dispatched to periodically collect data fro… ▽ More

    Submitted 6 June, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: 13 pages, 17 figures, submitted to IEEE for possible publication

  23. arXiv:2404.15584  [pdf]

    eess.SY

    Research on OPF control of three-phase four-wire low-voltage distribution network considering uncertainty

    Authors: Rui Wang, Xiaoqing Bai, Shengquan Huang, Shoupu Wei

    Abstract: As power systems become more complex and uncertain, low-voltage distribution networks face numerous challenges, including three-phase imbalances caused by asymmetrical loads and distributed energy resources. We propose a robust stochastic optimization (RSO) based optimal power flow (OPF) control method for three-phase, four-wire low-voltage distribution networks that consider uncertainty to addres… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: systems optimization, robust optimization, local control

  24. arXiv:2404.12554  [pdf, other]

    eess.SY cs.LG

    Learning Stable and Passive Neural Differential Equations

    Authors: Jing Cheng, Ruigang Wang, Ian R. Manchester

    Abstract: In this paper, we introduce a novel class of neural differential equations, which are intrinsically Lyapunov stable, exponentially stable or passive. We take a recently proposed Polyak Lojasiewicz network (PLNet) as a Lyapunov function and then parameterize the vector field as the descent directions of the Lyapunov function. The resulting models have the same structure as the general Hamiltonian dyn… ▽ More

    Submitted 18 April, 2024; originally announced April 2024.

  25. arXiv:2404.12077  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG eess.AS

    TIMIT Speaker Profiling: A Comparison of Multi-task learning and Single-task learning Approaches

    Authors: Rong Wang, Kun Sun

    Abstract: This study employs deep learning techniques to explore four speaker profiling tasks on the TIMIT dataset, namely gender classification, accent classification, age estimation, and speaker identification, highlighting the potential and challenges of multi-task learning versus single-task models. The motivation for this research is twofold: firstly, to empirically assess the advantages and drawbacks… ▽ More

    Submitted 18 April, 2024; originally announced April 2024.

  26. arXiv:2404.09969  [pdf, other]

    eess.SP

    Reconstructing classes of 3D FRI signals from sampled tomographic projections at unknown angles

    Authors: Renke Wang, Francien G. Bossema, Thierry Blu, Pier Luigi Dragotti

    Abstract: Traditional sampling schemes often assume that the sampling locations are known. Motivated by the recent bioimaging technique known as cryogenic electron microscopy (cryoEM), we consider the problem of reconstructing an unknown 3D structure from samples of its 2D tomographic projections at unknown angles. We focus on 3D convex bilevel polyhedra and 3D point sources and show that the exact estimati… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

  27. arXiv:2404.02461  [pdf, other]

    cs.LG eess.SP

    On the Efficiency and Robustness of Vibration-based Foundation Models for IoT Sensing: A Case Study

    Authors: Tomoyoshi Kimura, Jinyang Li, Tianshi Wang, Denizhan Kara, Yizhuo Chen, Yigong Hu, Ruijie Wang, Maggie Wigness, Shengzhong Liu, Mani Srivastava, Suhas Diggavi, Tarek Abdelzaher

    Abstract: This paper demonstrates the potential of vibration-based Foundation Models (FMs), pre-trained with unlabeled sensing data, to improve the robustness of run-time inference in (a class of) IoT applications. A case study is presented featuring a vehicle classification application using acoustic and seismic sensing. The work is motivated by the success of foundation models in the areas of natural lang… ▽ More

    Submitted 3 April, 2024; originally announced April 2024.

  28. arXiv:2404.02159  [pdf, other]

    cs.IT eess.SP

    Fairness-aware Age-of-Information Minimization in WPT-Assisted Short-Packet THz Communications for mURLLC

    Authors: Yao Zhu, Xiaopeng Yuan, Yulin Hu, Bo Ai, Ruikang Wang, Bin Han, Anke Schmeink

    Abstract: The technological landscape is swiftly advancing towards large-scale systems, creating significant opportunities, particularly in the domain of Terahertz (THz) communications. Networks designed for massive connectivity, comprising numerous Internet of Things (IoT) devices, are at the forefront of this advancement. In this paper, we consider Wireless Power Transfer (WPT)-enabled networks that suppo… ▽ More

    Submitted 15 February, 2024; originally announced April 2024.

  29. arXiv:2403.17275  [pdf]

    eess.SP

    200Gb/s VCSEL transmission using 60m OM4 MMF and KP4 FEC for AI computing clusters

    Authors: Tom Wettlin, Youxi Lin, Nebojsa Stojanovic, Stefano Calabrò, Ruoxu Wang, Lewei Zhang, Maxim Kuschnerov

    Abstract: We show a beyond 200Gb/s VCSEL transmission experiment. Results are based on 35GHz VCSEL and advanced DSP. We show an AIR of 245Gb/s PAM-6 back-to-back, and 200Gb/s PAM-4 over 60m OM4 fiber assuming KP4-FEC.

    Submitted 25 March, 2024; originally announced March 2024.

  30. arXiv:2403.11091  [pdf, other]

    cs.SD cs.CV eess.AS

    Multitask frame-level learning for few-shot sound event detection

    Authors: Liang Zou, Genwei Yan, Ruoyu Wang, Jun Du, Meng Lei, Tian Gao, Xin Fang

    Abstract: This paper focuses on few-shot Sound Event Detection (SED), which aims to automatically recognize and classify sound events with limited samples. However, prevailing methods in few-shot SED predominantly rely on segment-level predictions, which often providing detailed, fine-grained predictions, particularly for events of brief duration. Although frame-level prediction strategies have been… ▽ More

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: 6 pages, 4 figures, conference

  31. arXiv:2403.09302  [pdf, other]

    eess.IV cs.CV cs.LG

    StainFuser: Controlling Diffusion for Faster Neural Style Transfer in Multi-Gigapixel Histology Images

    Authors: Robert Jewsbury, Ruoyu Wang, Abhir Bhalerao, Nasir Rajpoot, Quoc Dang Vu

    Abstract: Stain normalization algorithms aim to transform the color and intensity characteristics of a source multi-gigapixel histology image to match those of a target image, mitigating inconsistencies in the appearance of stains used to highlight cellular components in the images. We propose a new approach, StainFuser, which treats this problem as a style transfer task using a novel Conditional Latent Dif… ▽ More

    Submitted 12 July, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  32. arXiv:2403.04245  [pdf, other]

    cs.SD cs.CV cs.LG cs.MM eess.AS

    A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition

    Authors: Yusheng Dai, Hang Chen, Jun Du, Ruoyu Wang, Shihao Chen, Jiefeng Ma, Haotian Wang, Chin-Hui Lee

    Abstract: Advanced Audio-Visual Speech Recognition (AVSR) systems have been observed to be sensitive to missing video frames, performing even worse than single-modality models. While applying the dropout technique to the video modality enhances robustness to missing frames, it simultaneously results in a performance loss when dealing with complete data input. In this paper, we investigate this contrasting p… ▽ More

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: the paper is accepted by CVPR2024

  33. arXiv:2403.02942  [pdf, other]

    cs.IT eess.SP

    Tensor Decomposition-based Time Varying Channel Estimation for mmWave MIMO-OFDM Systems

    Authors: Ruizhe Wang, Hong Ren, Cunhua Pan, Gui Zhou, Jiangzhou Wang

    Abstract: In this paper, we consider the time-varying channel estimation in millimeter wave (mmWave) multiple-input multiple-output (MIMO) systems with hybrid beamforming architectures. Different from the existing contributions that considered single-carrier mmWave systems with high mobility, the wideband orthogonal frequency division multiplexing (OFDM) system is considered in this work. To solve the channel… ▽ More

    Submitted 5 March, 2024; originally announced March 2024.

  34. arXiv:2403.00897  [pdf, other]

    eess.IV astro-ph.GA cs.AI cs.CV cs.LG

    VisRec: A Semi-Supervised Approach to Radio Interferometric Data Reconstruction

    Authors: Ruoqi Wang, Haitao Wang, Qiong Luo, Feng Wang, Hejun Wu

    Abstract: Radio telescopes produce visibility data about celestial objects, but these data are sparse and noisy. As a result, images created on raw visibility data are of low quality. Recent studies have used deep learning models to reconstruct visibility data to get cleaner images. However, these methods rely on a substantial amount of labeled training data, which requires significant labeling effort from… ▽ More

    Submitted 1 March, 2024; originally announced March 2024.

  35. arXiv:2402.12820  [pdf, other]

    eess.SY

    ASCEND: Accurate yet Efficient End-to-End Stochastic Computing Acceleration of Vision Transformer

    Authors: Tong Xie, Yixuan Hu, Renjie Wei, Meng Li, Yuan Wang, Runsheng Wang, Ru Huang

    Abstract: Stochastic computing (SC) has emerged as a promising computing paradigm for neural acceleration. However, how to accelerate the state-of-the-art Vision Transformer (ViT) with SC remains unclear. Unlike convolutional neural networks, ViTs introduce notable compatibility and efficiency challenges because of their nonlinear functions, e.g., softmax and Gaussian Error Linear Units (GELU). In this pape… ▽ More

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: Accepted in DATE 2024

  36. arXiv:2402.11186  [pdf, other]

    eess.IV physics.med-ph

    Low-Dose CT Reconstruction Using Dataset-free Learning

    Authors: Feng Wang, Renfang Wang, Hong Qiu

    Abstract: Low-Dose computer tomography (LDCT) is an ideal alternative to reduce radiation risk in clinical applications. Although supervised-deep-learning-based reconstruction methods have demonstrated superior performance compared to conventional model-driven reconstruction algorithms, they require collecting massive pairs of low-dose and norm-dose CT images for neural network training, which limits their… ▽ More

    Submitted 22 May, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  37. arXiv:2402.04097  [pdf, other]

    cs.CV eess.IV

    Analysis of Deep Image Prior and Exploiting Self-Guidance for Image Reconstruction

    Authors: Shijun Liang, Evan Bell, Qing Qu, Rongrong Wang, Saiprasad Ravishankar

    Abstract: The ability of deep image prior (DIP) to recover high-quality images from incomplete or corrupted measurements has made it popular in inverse problems in image restoration and medical imaging including magnetic resonance imaging (MRI). However, conventional DIP suffers from severe overfitting and spectral bias effects. In this work, we first provide an analysis of how DIP recovers information from… ▽ More

    Submitted 7 February, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  38. arXiv:2401.15984  [pdf]

    eess.IV physics.med-ph

    Choroidal thinning assessment through facial video analysis

    Authors: Qinghua He, Yi Zhang, Mengxi Shen, Giovanni Gregori, Philip J. Rosenfeld, Ruikang K. Wang

    Abstract: Different features of skin are associated with various medical conditions and provide opportunities to evaluate and monitor body health. This study created a strategy to assess choroidal thinning through the video analysis of facial skin. Videos capturing the entire facial skin were collected from 48 participants with age-related macular degeneration (AMD) and 12 healthy individuals. These facial… ▽ More

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: 8 pages, 4 figures

  39. arXiv:2401.11270  [pdf, other]

    eess.IV

    RoTIR: Rotation-Equivariant Network and Transformers for Fish Scale Image Registration

    Authors: Ruixiong Wang, Alin Achim, Renata Raele-Rolfe, Qiao Tong, Dylan Bergen, Chrissy Hammond, Stephen Cross

    Abstract: Image registration is an essential process for aligning features of interest from multiple images. With the recent development of deep learning techniques, image registration approaches have advanced to a new level. In this work, we present 'Rotation-Equivariant network and Transformers for Image Registration' (RoTIR), a deep-learning-based method for the alignment of fish scale images captured by… ▽ More

    Submitted 27 July, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

    Comments: 7 pages, 4 figures, 2 tables

  40. arXiv:2401.09833  [pdf, other]

    eess.IV cs.AI cs.CV

    Slicer Networks

    Authors: Hang Zhang, Xiang Chen, Rongguang Wang, Renjiu Hu, Dongdong Liu, Gaolei Li

    Abstract: In medical imaging, scans often reveal objects with varied contrasts but consistent internal intensities or textures. This characteristic enables the use of low-frequency approximations for tasks such as segmentation and deformation field estimation. Yet, integrating this concept into neural network architectures for medical image analysis remains underexplored. In this paper, we propose the Slice… ▽ More

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: 8 figures and 3 tables

  41. arXiv:2401.09517  [pdf]

    cs.LG eess.IV q-bio.QM

    Dimensional Neuroimaging Endophenotypes: Neurobiological Representations of Disease Heterogeneity Through Machine Learning

    Authors: Junhao Wen, Mathilde Antoniades, Zhijian Yang, Gyujoon Hwang, Ioanna Skampardoni, Rongguang Wang, Christos Davatzikos

    Abstract: Machine learning has been increasingly used to obtain individualized neuroimaging signatures for disease diagnosis, prognosis, and response to treatment in neuropsychiatric and neurodegenerative disorders. Therefore, it has contributed to a better understanding of disease heterogeneity by identifying disease subtypes that present significant differences in various brain phenotypic measures. In thi… ▽ More

    Submitted 17 January, 2024; originally announced January 2024.

  42. arXiv:2401.08154  [pdf, ps, other]

    cs.CV eess.IV

    TLIC: Learned Image Compression with ROI-Weighted Distortion and Bit Allocation

    Authors: Wei Jiang, Yongqi Zhai, Hangyu Li, Ronggang Wang

    Abstract: This short paper describes our method for the track of image compression. To achieve better perceptual quality, we use the adversarial loss to generate realistic textures, use region of interest (ROI) mask to guide the bit allocation for different regions. Our Team name is TLIC.

    Submitted 23 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: 2nd Place in the Image Compression Track, CLIC 2024, DCC 2024

  43. arXiv:2401.07754  [pdf, ps, other]

    eess.SP

    Passive Beamforming For Practical RIS-Assisted Communication Systems With Non-Ideal Hardware

    Authors: Yiming Liu, Rui Wang, Zhu Han

    Abstract: Reconfigurable intelligent surface (RIS) technology is a promising solution to improve the performance of existing wireless communications. To achieve its cost-effectiveness advantage, there inevitably exist certain hardware impairments in the system. Therefore, it is more reasonable to design passive beamforming in this scenario. Some existing research has considered such problems under transceiv… ▽ More

    Submitted 15 January, 2024; originally announced January 2024.

  44. arXiv:2401.07446  [pdf, other]

    cs.IT eess.SP

    Quantized RIS-aided mmWave Massive MIMO Channel Estimation with Uniform Planar Arrays

    Authors: Ruizhe Wang, Hong Ren, Cunhua Pan, Shi Jin, Petar Popovski, Jiangzhou Wang

    Abstract: In this paper, we investigate a cascaded channel estimation method for a millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) system aided by a reconfigurable intelligent surface (RIS) with the BS equipped with low-resolution analog-to-digital converters (ADCs), where the BS and the RIS are both equipped with a uniform planar array (UPA). Due to the sparse property of mmWave chan…

    Submitted 14 January, 2024; originally announced January 2024.

  45. arXiv:2401.00246  [pdf, other]

    cs.CL cs.SD eess.AS

    Boosting Large Language Model for Speech Synthesis: An Empirical Study

    Authors: Hongkun Hao, Long Zhou, Shujie Liu, Jinyu Li, Shujie Hu, Rui Wang, Furu Wei

    Abstract: Large language models (LLMs) have made significant advancements in natural language processing and are concurrently extending the language ability to other modalities, such as speech and vision. Nevertheless, most of the previous work focuses on prompting LLMs with perception abilities like auditory comprehension, and the effective approach for augmenting LLMs with speech synthesis capabilities re…

    Submitted 30 December, 2023; originally announced January 2024.

  46. arXiv:2312.12534  [pdf, other]

    eess.SP

    Near-Field Localization and Phase Shift Optimization for RIS-Assisted Non-Ideal OFDM Systems

    Authors: Hanfu Zhang, Erwu Liu, Rui Wang, Zhe Xing, Yan Liu

    Abstract: By incorporating reconfigurable intelligent surface (RIS) into communication-assisted localization systems, the issue of signal blockage caused by obstacles can be addressed, and passive beamforming can be employed to enhance localization accuracy. However, existing works mainly consider ideal channels and do not account for the effects of realistic impairments like carrier frequency offset (CFO)…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 11 pages, 11 figures

  47. arXiv:2312.09659  [pdf, ps, other]

    cs.IT eess.SP

    A Near Field Low Time Complexity Beam Training Scheme Based on Spatial Orthogonal Decomposition

    Authors: Xiyuan Liu, Qingqing Wu, Rui Wang, Jun Wu

    Abstract: With the application of high-frequency communication and extremely large MIMO (XL-MIMO), the near-field effect has become increasingly apparent. The near-field beam design now requires consideration not only of the angle of arrival (AoA) information but also the curvature of arrival (CoA) information. However, due to their mutual coupling, orthogonally decomposing the near-field space becomes chal…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: 11 pages with double column, 7 figures

  48. arXiv:2312.08571  [pdf, other]

    cs.SD cs.AI eess.AS

    PhasePerturbation: Speech Data Augmentation via Phase Perturbation for Automatic Speech Recognition

    Authors: Chengxi Lei, Satwinder Singh, Feng Hou, Xiaoyun Jia, Ruili Wang

    Abstract: Most of the current speech data augmentation methods operate on either the raw waveform or the amplitude spectrum of speech. In this paper, we propose a novel speech data augmentation method called PhasePerturbation that operates dynamically on the phase spectrum of speech. Instead of statically rotating a phase by a constant degree, PhasePerturbation utilizes three dynamic phase spectrum operatio…

    Submitted 13 December, 2023; originally announced December 2023.

  49. arXiv:2312.01573  [pdf]

    eess.IV cs.CV

    Survey on deep learning in multimodal medical imaging for cancer detection

    Authors: Yan Tian, Zhaocheng Xu, Yujun Ma, Weiping Ding, Ruili Wang, Zhihong Gao, Guohua Cheng, Linyang He, Xuran Zhao

    Abstract: The task of multimodal cancer detection is to determine the locations and categories of lesions by using different imaging techniques, which is one of the key research methods for cancer diagnosis. Recently, deep learning-based object detection has made significant developments due to its strength in semantic feature extraction and nonlinear function fitting. However, multimodal cancer detection r…

    Submitted 3 December, 2023; originally announced December 2023.

    Journal ref: Neural Computing and Applications. 2023 Nov 29:1-6

  50. arXiv:2311.17065  [pdf, other]

    eess.AS cs.CL cs.LG

    Efficient Deep Speech Understanding at the Edge

    Authors: Rongxiang Wang, Felix Xiaozhu Lin

    Abstract: In contemporary speech understanding (SU), a sophisticated pipeline is employed, encompassing the ingestion of streaming voice input. The pipeline executes beam search iteratively, invoking a deep neural network to generate tentative outputs (referred to as hypotheses) in an autoregressive manner. Periodically, the pipeline assesses attention and Connectionist Temporal Classification (CTC) scores.…

    Submitted 4 December, 2023; v1 submitted 22 November, 2023; originally announced November 2023.