
Showing 1–50 of 161 results for author: Wu, W

Searching in archive eess.
  1. arXiv:2408.04358  [pdf, other]

    eess.SY

    Goal-Oriented UAV Communication Design and Optimization for Target Tracking: A Machine Learning Approach

    Authors: Wenchao Wu, Yanning Wu, Yuanqing Yang, Yansha Deng

    Abstract: To accomplish various tasks, safe and smooth control of unmanned aerial vehicles (UAVs) needs to be guaranteed, which cannot be met by existing ultra-reliable low latency communications (URLLC). This has attracted the attention of the communication field, where most existing work mainly focused on optimizing communication performance (i.e., delay) and ignored the performance of the task (i.e., tra…

    Submitted 8 August, 2024; originally announced August 2024.

  2. arXiv:2407.21381  [pdf, other]

    eess.IV cs.CV

    Identity-Consistent Diffusion Network for Grading Knee Osteoarthritis Progression in Radiographic Imaging

    Authors: Wenhua Wu, Kun Hu, Wenxi Yue, Wei Li, Milena Simic, Changyang Li, Wei Xiang, Zhiyong Wang

    Abstract: Knee osteoarthritis (KOA), a common form of arthritis that causes physical disability, has become increasingly prevalent in society. Employing computer-aided techniques to automatically assess the severity and progression of KOA can greatly benefit KOA treatment and disease management. Particularly, the advancement of X-ray technology in KOA demonstrates its potential for this purpose. Yet, existi…

    Submitted 31 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  3. arXiv:2407.19220  [pdf]

    physics.ed-ph eess.SY

    A Low-Frequency Vibration Experimental Platform for University Physics Experiment Designed by LabVIEW

    Authors: Yangjie Dai, Leijian Wang, Wenbin Wu, Aiping Chen, Dawei Gu

    Abstract: Virtual instrument technology has been increasingly used in university physics experiment teaching. An experimental platform is specifically constructed for studying low-frequency vibrations in university physics, which is based on a computer and its internal sound card, along with a program developed in LabVIEW programming environment to perform control and measurement on our experimental platfor…

    Submitted 27 July, 2024; originally announced July 2024.

    Comments: 13 pages, 8 figures, 2 supplementary files

  4. arXiv:2407.18773  [pdf, other]

    eess.SP

    Channel Estimation for Movable-Antenna MIMO Systems Via Tensor Decomposition

    Authors: Ruoyu Zhang, Lei Cheng, Wei Zhang, Xinrong Guan, Yueming Cai, Wen Wu, Rui Zhang

    Abstract: In this letter, we investigate the channel estimation problem for MIMO wireless communication systems with movable antennas (MAs) at both the transmitter (Tx) and receiver (Rx). To achieve high channel estimation accuracy with low pilot training overhead, we propose a tensor decomposition-based method for estimating the parameters of multi-path channel components, including their azimuth and eleva…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: 5 pages, 3 figures

  5. arXiv:2407.11481  [pdf, other]

    cs.LG cs.AI eess.SP

    Multi-Channel Masked Autoencoder and Comprehensive Evaluations for Reconstructing 12-Lead ECG from Arbitrary Single-Lead ECG

    Authors: Jiarong Chen, Wanqing Wu, Tong Liu, Shenda Hong

    Abstract: In the context of cardiovascular diseases (CVD) that exhibit an elevated prevalence and mortality, the electrocardiogram (ECG) is a popular and standard diagnostic tool for doctors, commonly utilizing a 12-lead configuration in clinical practice. However, the 10 electrodes placed on the surface would cause a lot of inconvenience and discomfort, while the rapidly advancing wearable devices adopt th…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted by KDD-AIDSH 2024

  6. arXiv:2406.19608  [pdf, other]

    eess.SY

    Multi-service collaboration and composition of cloud manufacturing customized production based on problem decomposition

    Authors: Hao Yue, Yingtao Wu, Min Wang, Hesuan Hu, Weimin Wu, Jihui Zhang

    Abstract: Cloud manufacturing system is a service-oriented and knowledge-based one, which can provide solutions for the large-scale customized production. The service resource allocation is the primary factor that restricts the production time and cost in the cloud manufacturing customized production (CMCP). In order to improve the efficiency and reduce the cost in CMCP, we propose a new framework which con…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 12 pages, 8 figures

    ACM Class: J.0

  7. arXiv:2406.03882  [pdf, other]

    cs.CL cs.SD eess.AS

    Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models

    Authors: Ziyun Cui, Chang Lei, Wen Wu, Yinan Duan, Diyang Qu, Ji Wu, Runsen Chen, Chao Zhang

    Abstract: The early detection of suicide risk is important since it enables the intervention to prevent potential suicide attempts. This paper studies the automatic detection of suicide risk based on spontaneous speech from adolescents, and collects a Mandarin dataset with 15 hours of suicide speech from more than a thousand adolescents aged from ten to eighteen for our experiments. To leverage the diverse…

    Submitted 9 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  8. arXiv:2406.01644  [pdf, other]

    eess.IV

    Dual-Stream Attention Network for Hyperspectral Image Unmixing

    Authors: Yufang Wang, Wenmin Wu, Lin Qi, Feng Gao

    Abstract: Hyperspectral image (HSI) contains abundant spatial and spectral information, making it highly valuable for unmixing. In this paper, we propose a Dual-Stream Attention Network (DSANet) for HSI unmixing. The endmembers and abundance of a pixel in HSI have high correlations with its adjacent pixels. Therefore, we adopt a "many to one" strategy to estimate the abundance of the central pixel. In addit…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE IGARSS 2024

  9. arXiv:2406.00654  [pdf, other]

    cs.CL cs.SD eess.AS

    Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback

    Authors: Chen Chen, Yuchen Hu, Wen Wu, Helin Wang, Eng Siong Chng, Chao Zhang

    Abstract: In recent years, text-to-speech (TTS) technology has witnessed impressive advancements, particularly with large-scale training datasets, showcasing human-level speech quality and impressive zero-shot capabilities on unseen speakers. However, despite human subjective evaluations, such as the mean opinion score (MOS), remaining the gold standard for assessing the quality of synthetic speech, even st…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 19 pages, Preprint

  10. arXiv:2405.20064  [pdf, other]

    eess.AS cs.SD

    1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem

    Authors: Mingjie Chen, Hezhao Zhang, Yuanchao Li, Jiachen Luo, Wen Wu, Ziyang Ma, Peter Bell, Catherine Lai, Joshua Reiss, Lin Wang, Philip C. Woodland, Xie Chen, Huy Phan, Thomas Hain

    Abstract: Speech emotion recognition is a challenging classification task with natural emotional speech, especially when the distribution of emotion types is imbalanced in the training and test data. In this case, it is more difficult for a model to learn to separate minority classes, resulting in those sometimes being ignored or frequently misclassified. Previous work has utilised class weighted loss for t…

    Submitted 30 May, 2024; originally announced May 2024.

  11. arXiv:2405.17659  [pdf, other]

    eess.IV cs.CV

    Enhancing Global Sensitivity and Uncertainty Quantification in Medical Image Reconstruction with Monte Carlo Arbitrary-Masked Mamba

    Authors: Jiahao Huang, Liutao Yang, Fanwen Wang, Yang Nan, Weiwen Wu, Chengyan Wang, Kuangyu Shi, Angelica I. Aviles-Rivero, Carola-Bibiane Schönlieb, Daoqiang Zhang, Guang Yang

    Abstract: Deep learning has been extensively applied in medical image reconstruction, where Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) represent the predominant paradigms, each possessing distinct advantages and inherent limitations: CNNs exhibit linear complexity with local sensitivity, whereas ViTs demonstrate quadratic complexity with global sensitivity. The emerging Mamba has sh…

    Submitted 25 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  12. arXiv:2405.17167  [pdf]

    eess.IV cs.CV

    Partitioned Hankel-based Diffusion Models for Few-shot Low-dose CT Reconstruction

    Authors: Wenhao Zhang, Bin Huang, Shuyue Chen, Xiaoling Xu, Weiwen Wu, Qiegen Liu

    Abstract: Low-dose computed tomography (LDCT) plays a vital role in clinical applications by mitigating radiation risks. Nevertheless, reducing radiation doses significantly degrades image quality. Concurrently, common deep learning methods demand extensive data, posing concerns about privacy, cost, and time constraints. Consequently, we propose a few-shot low-dose CT reconstruction method using Partitioned…

    Submitted 27 May, 2024; originally announced May 2024.

  13. arXiv:2404.17926  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Pre-training on High Definition X-ray Images: An Experimental Study

    Authors: Xiao Wang, Yuehang Li, Wentao Wu, Jiandong Jin, Yao Rong, Bo Jiang, Chuanfu Li, Jin Tang

    Abstract: Existing X-ray based pre-trained vision models are usually conducted on a relatively small-scale dataset (less than 500k samples) with limited resolution (e.g., 224 $\times$ 224). However, the key to the success of self-supervised pre-training large models lies in massive training data, and maintaining high resolution in the field of X-ray images is the guarantee of effective solutions to difficul…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: Technology Report

  14. arXiv:2403.16078  [pdf, other]

    cs.SD eess.AS

    Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy

    Authors: Wenxuan Wu, Xueyuan Chen, Xixin Wu, Haizhou Li, Helen Meng

    Abstract: Audio-visual target speech extraction (AV-TSE) is one of the enabling technologies in robotics and many audio-visual applications. One of the challenges of AV-TSE is how to effectively utilize audio-visual synchronization information in the process. AV-HuBERT can be a useful pre-trained model for lip-reading, which has not been adopted by AV-TSE. In this paper, we would like to explore the way to…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: Accepted by IJCNN 2024

  15. arXiv:2403.12352  [pdf, other]

    eess.SP cs.IT

    A New Intelligent Reflecting Surface-Aided Electromagnetic Stealth Strategy

    Authors: Xue Xiong, Beixiong Zheng, A. Lee Swindlehurst, Jie Tang, Wen Wu

    Abstract: Electromagnetic wave absorbing material (EWAM) plays an essential role in manufacturing stealth aircraft, which can achieve the electromagnetic stealth (ES) by reducing the strength of the signal reflected back to the radar system. However, the stealth performance is limited by the coating thickness, incident wave angles, and working frequencies. To tackle these limitations, we propose a new intel…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 5 pages, 4 figures

  16. arXiv:2402.19013  [pdf, other]

    eess.SY

    Ultraviolet Positioning via TDOA: Error Analysis and System Prototype

    Authors: Shihui Yu, Chubing Lv, Yueke Yang, Yuchen Pan, Lei Sun, Juliang Cao, Ruihang Yu, Chen Gong, Wenqi Wu, Zhengyuan Xu

    Abstract: This work performs the design, real-time hardware realization, and experimental evaluation of a positioning system by ultra-violet (UV) communication under photon-level signal detection. The positioning is based on time-difference of arrival (TDOA) principle. Time division-based transmission of synchronization sequence from three transmitters with known positions is applied. We investigate the pos…

    Submitted 14 April, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

  17. arXiv:2402.13798  [pdf, other]

    eess.SY

    AFPR-CIM: An Analog-Domain Floating-Point RRAM-based Compute-In-Memory Architecture with Dynamic Range Adaptive FP-ADC

    Authors: Haobo Liu, Zhengyang Qian, Wei Wu, Hongwei Ren, Zhiwei Liu, Leibin Ni

    Abstract: Power consumption has become the major concern in neural network accelerators for edge devices. The novel non-volatile-memory (NVM) based computing-in-memory (CIM) architecture has shown great potential for better energy efficiency. However, most of the recent NVM-CIM solutions mainly focus on fixed-point calculation and are not applicable to floating-point (FP) processing. In this paper, we propo…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted by DATE 2024

  18. Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation

    Authors: Nineli Lashkarashvili, Wen Wu, Guangzhi Sun, Philip C. Woodland

    Abstract: Foundation models have shown superior performance for speech emotion recognition (SER). However, given the limited data in emotion corpora, finetuning all parameters of large pre-trained models for SER can be both resource-intensive and susceptible to overfitting. This paper investigates parameter-efficient finetuning (PEFT) for SER. Various PEFT adaptors are systematically studied for both classi…

    Submitted 18 February, 2024; originally announced February 2024.

    Journal ref: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, Republic of, 2024, pp. 10986-10990

  19. arXiv:2402.07379  [pdf, other]

    eess.SY

    Distribution Locational Marginal Emission for Carbon Alleviation in Distribution Networks: Formulation, Calculation, and Implication

    Authors: Linwei Sang, Yinliang Xu, Hongbin Sun, Qiuwei Wu, Wenchuan Wu

    Abstract: Regulating the proper carbon-aware intervention policy is one of the keys to emission alleviation in the distribution network, whose basis lies in effectively attributing the emission responsibility using emission factors. This paper establishes the distribution locational marginal emission (DLME) to calculate the marginal change of emission from the marginal change of both active and reactive loa…

    Submitted 11 February, 2024; originally announced February 2024.

  20. arXiv:2401.16564  [pdf]

    eess.SP

    Data and Physics driven Deep Learning Models for Fast MRI Reconstruction: Fundamentals and Methodologies

    Authors: Jiahao Huang, Yinzhe Wu, Fanwen Wang, Yingying Fang, Yang Nan, Cagan Alkan, Lei Xu, Zhifan Gao, Weiwen Wu, Lei Zhu, Zhaolin Chen, Peter Lally, Neal Bangerter, Kawin Setsompop, Yike Guo, Daniel Rueckert, Ge Wang, Guang Yang

    Abstract: Magnetic Resonance Imaging (MRI) is a pivotal clinical diagnostic tool, yet its extended scanning times often compromise patient comfort and image quality, especially in volumetric, temporal and quantitative scans. This review elucidates recent advances in MRI acceleration via data and physics-driven models, leveraging techniques from algorithm unrolling models, enhancement-based models, and plug-…

    Submitted 29 January, 2024; originally announced January 2024.

  21. arXiv:2401.13463  [pdf, other]

    cs.CL cs.IR cs.SD eess.AS

    SpeechDPR: End-to-End Spoken Passage Retrieval for Open-Domain Spoken Question Answering

    Authors: Chyi-Jiunn Lin, Guan-Ting Lin, Yung-Sung Chuang, Wei-Lun Wu, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Lin-shan Lee

    Abstract: Spoken Question Answering (SQA) is essential for machines to reply to user's question by finding the answer span within a given spoken passage. SQA has been previously achieved without ASR to avoid recognition errors and Out-of-Vocabulary (OOV) problems. However, the real-world problem of Open-domain SQA (openSQA), in which the machine needs to first retrieve passages that possibly contain the ans…

    Submitted 18 March, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: Accepted at ICASSP 2024

  22. arXiv:2401.10156  [pdf, other]

    cs.NI eess.SP

    Model-Assisted Learning for Adaptive Cooperative Perception of Connected Autonomous Vehicles

    Authors: Kaige Qu, Weihua Zhuang, Qiang Ye, Wen Wu, Xuemin Shen

    Abstract: Cooperative perception (CP) is a key technology to facilitate consistent and accurate situational awareness for connected and autonomous vehicles (CAVs). To tackle the network resource inefficiency issue in traditional broadcast-based CP, unicast-based CP has been proposed to associate CAV pairs for cooperative perception via vehicle-to-vehicle transmission. In this paper, we investigate unicast-b…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: Accepted by IEEE Transactions on Wireless Communications

  23. arXiv:2401.05844  [pdf]

    physics.med-ph eess.IV

    Self-navigated 3D diffusion MRI using an optimized CAIPI sampling and structured low-rank reconstruction

    Authors: Ziyu Li, Karla L. Miller, Xi Chen, Mark Chiew, Wenchuan Wu

    Abstract: 3D multi-slab acquisitions are an appealing approach for diffusion MRI because they are compatible with the imaging regime delivering optimal SNR efficiency. In conventional 3D multi-slab imaging, shot-to-shot phase variations caused by motion pose challenges due to the use of multi-shot k-space acquisition. Navigator acquisition after each imaging echo is typically employed to correct phase varia…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: 10 pages, 11 figures, 2 tables. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  24. arXiv:2401.02831  [pdf, other]

    cs.CV eess.IV

    Two-stage Progressive Residual Dense Attention Network for Image Denoising

    Authors: Wencong Wu, An Ge, Guannan Lv, Yuelong Xia, Yungang Zhang, Wen Xiong

    Abstract: Deep convolutional neural networks (CNNs) for image denoising can effectively exploit rich hierarchical features and have achieved great success. However, many deep CNN-based denoising models equally utilize the hierarchical features of noisy images without paying attention to the more important and useful features, leading to relatively low performance. To address the issue, we design a new Two-s…

    Submitted 5 January, 2024; originally announced January 2024.

  25. Integrated Sensing and Communication with Massive MIMO: A Unified Tensor Approach for Channel and Target Parameter Estimation

    Authors: Ruoyu Zhang, Lei Cheng, Shuai Wang, Yi Lou, Yulong Gao, Wen Wu, Derrick Wing Kwan Ng

    Abstract: Benefitting from the vast spatial degrees of freedom, the amalgamation of integrated sensing and communication (ISAC) and massive multiple-input multiple-output (MIMO) is expected to simultaneously improve spectral and energy efficiencies as well as the sensing capability. However, a large number of antennas deployed in massive MIMO-ISAC raises critical challenges in acquiring both accurate channe…

    Submitted 3 January, 2024; originally announced January 2024.

    Journal ref: IEEE Transactions on Wireless Communications, 2024

  26. arXiv:2312.15424  [pdf, other]

    eess.SY

    Integrating Renewable Energy Sources as Reserve Providers: Modeling, Pricing, and Properties

    Authors: Wenli Wu, Ye Guo, Jiantao Shi

    Abstract: In pursuit of carbon neutrality, many countries have adopted renewable portfolio standards to facilitate the integration of renewable energy. However, increasing penetration of renewable energy resources will also pose higher requirements on system flexibility. Allowing renewable themselves to participate in the reserve market could be a viable solution. To this end, this paper proposes an optimal…

    Submitted 24 December, 2023; originally announced December 2023.

    Comments: 13 pages, 5 figures

  27. arXiv:2312.13182  [pdf, other]

    cs.RO eess.SY

    Task-oriented Semantics-aware Communications for Robotic Waypoint Transmission: the Value and Age of Information Approach

    Authors: Wenchao Wu, Yuanqing Yang, Yansha Deng, A. Hamid Aghvami

    Abstract: The ultra-reliable and low-latency communication (URLLC) service of the fifth-generation (5G) mobile communication network struggles to support safe robot operation. Nowadays, the sixth-generation (6G) mobile communication network is proposed to provide hyper-reliable and low-latency communication to enable safer control for robots. However, current 5G/ 6G research mainly focused on improving comm…

    Submitted 20 December, 2023; originally announced December 2023.

  28. arXiv:2312.01071  [pdf, other]

    cs.IT eess.SP

    Hybrid Hierarchical DRL Enabled Resource Allocation for Secure Transmission in Multi-IRS-Assisted Sensing-Enhanced Spectrum Sharing Networks

    Authors: Lingyi Wang, Wei Wu, Fuhui Zhou, Qihui Wu, Octavia A. Dobre, Tony Q. S. Quek

    Abstract: Secure communications are of paramount importance in spectrum sharing networks due to the allocation and sharing characteristics of spectrum resources. To further explore the potential of intelligent reflective surfaces (IRSs) in enhancing spectrum sharing and secure transmission performance, a multiple intelligent reflection surface (multi-IRS)-assisted sensing-enhanced wideband spectrum sharing…

    Submitted 2 December, 2023; originally announced December 2023.

  29. arXiv:2311.00518  [pdf]

    eess.IV

    See SIFT in a Rain

    Authors: Wei Wu, Hao Chang, Zhu Li

    Abstract: Rain streaks bring complicated pixel intensity changes and additional gradients, greatly obstructing the extraction of image features from background. This causes serious performance degradation in feature-based applications. Thus, it is critical to remove rain streaks from a single rainy image to recover image features. Recently, many excellent image deraining methods have made remarkable progres…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: A direct DoG feature pyramid recovery from rainy pixels solution for SIFT detection, accepted by T-CSVT, 2023

    Journal ref: IEEE Trans. on Circuits & System for Video Tech., 2023

  30. Transferring speech-generic and depression-specific knowledge for Alzheimer's disease detection

    Authors: Ziyun Cui, Wen Wu, Wei-Qiang Zhang, Ji Wu, Chao Zhang

    Abstract: The detection of Alzheimer's disease (AD) from spontaneous speech has attracted increasing attention while the sparsity of training data remains an important issue. This paper handles the issue by knowledge transfer, specifically from both speech-generic and depression-specific knowledge. The paper first studies sequential knowledge transfer from generic foundation models pretrained on large amoun…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: 8 pages, 4 figures. Accepted by ASRU 2023

  31. arXiv:2310.01342  [pdf, other]

    cs.IT eess.SP

    Near-field Integrated Sensing and Communication: Opportunities and Challenges

    Authors: Jiayi Cong, Changsheng You, Jiapeng Li, Li Chen, Beixiong Zheng, Yuanwei Liu, Wen Wu, Yi Gong, Shi Jin, Rui Zhang

    Abstract: With the extremely large-scale array XL-array deployed in future wireless systems, wireless communication and sensing are expected to operate in the radiative near-field region, which needs to be characterized by the spherical rather than planar wavefronts. Unlike most existing works that considered far-field integrated sensing and communication (ISAC), we study in this article the new near-field…

    Submitted 26 July, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: This work has been accepted by IEEE Wireless Communications Magazine

  32. arXiv:2309.10294  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition

    Authors: Ziyang Ma, Wen Wu, Zhisheng Zheng, Yiwei Guo, Qian Chen, Shiliang Zhang, Xie Chen

    Abstract: In this paper, we explored how to boost speech emotion recognition (SER) with the state-of-the-art speech pre-trained model (PTM), data2vec, text generation technique, GPT-4, and speech synthesis technique, Azure TTS. First, we investigated the representation ability of different speech self-supervised pre-trained models, and we found that data2vec has a good representation ability on the SER task…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  33. arXiv:2309.06825   

    eess.IV cs.CV

    Topology-inspired Cross-domain Network for Developmental Cervical Stenosis Quantification

    Authors: Zhenxi Zhang, Yanyang Wang, Yao Wu, Weifei Wu

    Abstract: Developmental Canal Stenosis (DCS) quantification is crucial in cervical spondylosis screening. Compared with quantifying DCS manually, a more efficient and time-saving manner is provided by deep keypoint localization networks, which can be implemented in either the coordinate or the image domain. However, the vertebral visualization features often lead to abnormal topological structures during ke…

    Submitted 18 September, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: We have discovered that some authors' contributions have been overlooked. We need to spend some time confirming whether the authors adhere to the paper's authorship guidelines and whether their authorship order complies with the standards. After discussion with all co-authors, we decide to withdraw this paper

  34. arXiv:2308.15942  [pdf]

    eess.IV cs.CV

    Stage-by-stage Wavelet Optimization Refinement Diffusion Model for Sparse-View CT Reconstruction

    Authors: Kai Xu, Shiyu Lu, Bin Huang, Weiwen Wu, Qiegen Liu

    Abstract: Diffusion models have emerged as potential tools to tackle the challenge of sparse-view CT reconstruction, displaying superior performance compared to conventional methods. Nevertheless, these prevailing diffusion models predominantly focus on the sinogram or image domains, which can lead to instability during model training, potentially culminating in convergence towards local minimal solutions.…

    Submitted 3 September, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

  35. Two-and-a-half Order Score-based Model for Solving 3D Ill-posed Inverse Problems

    Authors: Zirong Li, Yanyang Wang, Jianjia Zhang, Weiwen Wu, Hengyong Yu

    Abstract: Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) are crucial technologies in the field of medical imaging. Score-based models have proven to be effective in addressing different inverse problems encountered in CT and MRI, such as sparse-view CT and fast MRI reconstruction. However, these models face challenges in achieving accurate three dimensional (3D) volumetric reconstruction. The…

    Submitted 19 December, 2023; v1 submitted 16 August, 2023; originally announced August 2023.

    Comments: 10 pages, 13 figures

    Journal ref: Computers in Biology and Medicine Volume 168, January 2024, 107819

  36. Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations

    Authors: Wen Wu, Chao Zhang, Philip C. Woodland

    Abstract: Although automatic emotion recognition (AER) has recently drawn significant research interest, most current AER studies use manually segmented utterances, which are usually unavailable for dialogue systems. This paper proposes integrating AER with automatic speech recognition (ASR) and speaker diarisation (SD) in a jointly-trained system. Distinct output layers are built for four sub-tasks includi…

    Submitted 14 August, 2023; originally announced August 2023.

    Comments: Interspeech 2023

  37. arXiv:2308.04112  [pdf, other]

    eess.SY

    Multi-Interval Rolling-Window Joint Dispatch and Pricing of Energy and Reserve under Uncertainty

    Authors: Jiantao Shi, Ye Guo, Wenchuan Wu, Hongbin Sun

    Abstract: In this paper, the intra-day multi-interval rolling-window joint dispatch and pricing of energy and reserve is studied under increasing volatile and uncertain renewable generations. A look-ahead energy-reserve co-optimization model is proposed for the rolling-window dispatch, where possible contingencies and load/renewable forecast errors over the look-ahead window are modeled as several scenario…

    Submitted 8 August, 2023; originally announced August 2023.

  38. arXiv:2307.12709  [pdf]

    eess.SY

    A Dynamic Equivalent Energy Storage Model of Natural Gas Networks for Joint Optimal Dispatch of Electricity-Gas Systems

    Authors: Siyuan Wang, Wenchuan Wu, Chenhui Lin, Binbin Chen

    Abstract: The development of energy conversion techniques enhances the coupling between the gas network and power system. However, challenges remain in the joint optimal dispatch of electricity-gas systems. The dynamic model of the gas network, described by partial differential equations, is complex and computationally demanding for power system operators. Furthermore, information privacy concerns and limit…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: 12 pages, 8 figures

  39. arXiv:2307.09729  [pdf, other]

    cs.CV cs.MM eess.IV

    NTIRE 2023 Quality Assessment of Video Enhancement Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Wei Sun, Yulun Zhang, Kai Zhang, Radu Timofte, Guangtao Zhai, Yixuan Gao, Yuqin Cao, Tengchuan Kou, Yunlong Dong, Ziheng Jia, Yilin Li, Wei Wu, Shuming Hu, Sibin Deng, Pengxiang Xiao, Ying Chen, Kai Li, Kai Zhao, Kun Yuan, Ming Sun, Heng Cong, Hao Wang, Lingzhi Fu, et al. (47 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2023 Quality Assessment of Video Enhancement Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2023. This challenge is to address a major challenge in the field of video processing, namely, video quality assessment (VQA) for enhanced videos. The challenge uses the VQA Dataset for Perceptual…

    Submitted 18 July, 2023; originally announced July 2023.

  40. Estimating the Uncertainty in Emotion Attributes using Deep Evidential Regression

    Authors: Wen Wu, Chao Zhang, Philip C. Woodland

    Abstract: In automatic emotion recognition (AER), labels assigned by different human annotators to the same utterance are often inconsistent due to the inherent complexity of emotion and the subjectivity of perception. Though deterministic labels generated by averaging or voting are often used as the ground truth, it ignores the intrinsic uncertainty revealed by the inconsistent labels. This paper proposes…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: Accepted by ACL 2023

    Journal ref: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2023

  41. arXiv:2306.05946  [pdf, other]

    eess.IV cs.NI

    Digital Twin-Assisted Resource Demand Prediction for Multicast Short Video Streaming

    Authors: Xinyu Huang, Wen Wu, Xuemin Sherman Shen

    Abstract: In this paper, we propose a digital twin (DT)-assisted resource demand prediction scheme to enhance prediction accuracy for multicast short video streaming. Particularly, we construct user DTs (UDTs) for collecting real-time user status, including channel condition, location, watching duration, and preference. A reinforcement learning-empowered K-means++ algorithm is developed to cluster users bas…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: 2 pages, 3 figures

  42. arXiv:2306.04980  [pdf, other]

    cs.CL cs.SD eess.AS

    Assessing Phrase Break of ESL Speech with Pre-trained Language Models and Large Language Models

    Authors: Zhiyi Wang, Shaoguang Mao, Wenshan Wu, Yan Xia, Yan Deng, Jonathan Tien

    Abstract: This work introduces approaches to assessing phrase breaks in ESL learners' speech using pre-trained language models (PLMs) and large language models (LLMs). There are two tasks: overall assessment of phrase break for a speech clip and fine-grained assessment of every possible phrase break position. To leverage NLP models, speech input is first force-aligned with texts, and then pre-processed into…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: Accepted by InterSpeech 2023. arXiv admin note: substantial text overlap with arXiv:2210.16029

  43. arXiv:2305.17778  [pdf]

    physics.med-ph eess.IV

    PND-Net: Physics based Non-local Dual-domain Network for Metal Artifact Reduction

    Authors: Jinqiu Xia, Yiwen Zhou, Hailong Wang, Wenxin Deng, Jing Kang, Wangjiang Wu, Mengke Qi, Linghong Zhou, Jianhui Ma, Yuan Xu

    Abstract: Metal artifacts caused by the presence of metallic implants tremendously degrade the reconstructed computed tomography (CT) image quality, affecting clinical diagnosis or reducing the accuracy of organ delineation and dose calculation in radiotherapy. Recently, deep learning methods in sinogram and image domains have been rapidly applied to the metal artifact reduction (MAR) task. The supervised dual-…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: 19 pages, 8 figures

  44. arXiv:2305.15395  [pdf, other]

    eess.SY

    Safety-aware Semi-end-to-end Coordinated Decision Model for Voltage Regulation in Active Distribution Network

    Authors: Linwei Sang, Yinliang Xu, Huan Long, Wenchuan Wu

    Abstract: Prediction plays a vital role in the active distribution network voltage regulation under the high penetration of photovoltaics. Current prediction models aim at minimizing individual prediction errors but overlook their collective impacts on downstream decision-making. Hence, this paper proposes a safety-aware semi-end-to-end coordinated decision model to bridge the gap from the downstream voltag…

    Submitted 24 May, 2023; originally announced May 2023.

  45. Self-supervised representations in speech-based depression detection

    Authors: Wen Wu, Chao Zhang, Philip C. Woodland

    Abstract: This paper proposes handling training data sparsity in speech-based automatic depression detection (SDD) using foundation models pre-trained with self-supervised learning (SSL). An analysis of SSL representations derived from different layers of pre-trained foundation models is first presented for SDD, which provides insight into suitable indicators for depression detection. Knowledge transfer is the…

    Submitted 6 July, 2023; v1 submitted 20 May, 2023; originally announced May 2023.

  46. arXiv:2305.04269  [pdf, other]

    eess.IV cs.CV

    Dual Residual Attention Network for Image Denoising

    Authors: Wencong Wu, Shijie Liu, Yi Zhou, Yungang Zhang, Yu Xiang

    Abstract: In image denoising, deep convolutional neural networks (CNNs) can obtain favorable performance on removing spatially invariant noise. However, many of these networks cannot perform well on removing the real noise (i.e., spatially variant noise) generated during image acquisition or transmission, which severely hinders their application in practical image denoising tasks. Instead of continuously i…

    Submitted 7 May, 2023; originally announced May 2023.

  47. arXiv:2304.14598  [pdf, other]

    cs.IT eess.SP

    A manifold learning-based CSI feedback framework for FDD massive MIMO

    Authors: Yandi Cao, Haifan Yin, Ziao Qin, Weidong Li, Weimin Wu, Merouane Debbah

    Abstract: Massive multi-input multi-output (MIMO) in Frequency Division Duplex (FDD) mode suffers from heavy feedback overhead for Channel State Information (CSI). In this paper, a novel manifold learning-based CSI feedback framework (MLCF) is proposed to reduce the feedback and improve the spectral efficiency of FDD massive MIMO. Manifold learning (ML) is an effective method for dimensionality reduction. H…

    Submitted 27 April, 2023; originally announced April 2023.

    Comments: 12 pages, 5 figures

  48. arXiv:2304.05175  [pdf]

    math.OC eess.SY

    Sufficient Conditions for the Exact Relaxation of Complementarity Constraints for Storages in Multi-period OPF Problems

    Authors: Qi Wang, Wenchuan Wu, Chenhui Lin, Shuwei Xu, Xueliang Li

    Abstract: Storage-concerned Optimal Power Flow (OPF) with complementarity constraints is highly non-convex and intractable. In this paper, we propose two generalized sufficient conditions which guarantee no simultaneous charging and discharging (SCD) in the relaxed multi-period OPF excluding the complementarity constraints. Moreover, we prove that the regions on the locational marginal prices (LMPs) formed…

    Submitted 10 November, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

  49. arXiv:2304.02409  [pdf, other]

    cs.IT eess.SP

    Relative Entropy-Based Waveform Optimization for Rician Target Detection with Dual-Function Radar Communication Systems

    Authors: Xuyang Wang, Bo Tang, Wenjun Wu, Da Li

    Abstract: In this paper, we consider waveform design for dual-function radar-communication systems based on multiple-input multiple-output arrays. To achieve better Rician target detection performance, we use the relative entropy associated with the formulated detection problem as the design metric. We also impose a multiuser interference energy constraint on the waveforms to ensure the achievable sum-rate of th…

    Submitted 5 April, 2023; originally announced April 2023.

  50. arXiv:2304.01822   

    eess.SP eess.SY

    Co-Design for Spectral Coexistence between RIS-aided MIMO Radar and MIMO Communication Systems

    Authors: Da Li, Bo Tang, Xuyang Wang, Wenjun Wu, Lei Xue

    Abstract: Reconfigurable intelligent surface (RIS) refers to a signal reflection surface containing a large number of low-cost passive reflecting elements. RIS can improve the performance of radar and communication systems by dynamically modulating the wireless channels. In this paper, we consider the co-design for improving the co-existence between multiple-input-multiple-output (MIMO) radar and MIMO commu…

    Submitted 14 June, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: The paper has undergone significant rewriting and is currently being revised