
Showing 1–7 of 7 results for author: Ai, W

Searching in archive eess.
  1. arXiv:2312.06337  [pdf, other]

    cs.SD cs.CL eess.AS

    Deep Imbalanced Learning for Multimodal Emotion Recognition in Conversations

    Authors: Tao Meng, Yuntao Shou, Wei Ai, Nan Yin, Keqin Li

    Abstract: The main task of Multimodal Emotion Recognition in Conversations (MERC) is to identify the emotions in modalities, e.g., text, audio, image and video, which is a significant development direction for realizing machine intelligence. However, many data in MERC naturally exhibit an imbalanced distribution of emotion categories, and researchers ignore the negative impact of imbalanced data on emotion…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 16 pages, 9 figures

  2. arXiv:2309.07927  [pdf, ps, other]

    eess.AS cs.CL cs.SD

    Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults

    Authors: Ahmed Adel Attia, Jing Liu, Wei Ai, Dorottya Demszky, Carol Espy-Wilson

    Abstract: Recent advancements in Automatic Speech Recognition (ASR) systems, exemplified by Whisper, have demonstrated the potential of these systems to approach human-level performance given sufficient data. However, this progress doesn't readily extend to ASR for children due to the limited availability of suitable child-specific databases and the distinct characteristics of children's speech. A recent st…

    Submitted 15 May, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

  3. arXiv:2105.14678  [pdf, other]

    cs.CV eess.IV

    Image-to-Video Generation via 3D Facial Dynamics

    Authors: Xiaoguang Tu, Yingtian Zou, Jian Zhao, Wenjie Ai, Jian Dong, Yuan Yao, Zhikang Wang, Guodong Guo, Zhifeng Li, Wei Liu, Jiashi Feng

    Abstract: We present a versatile model, FaceAnime, for various video generation tasks from still images. Video generation from a single face image is an interesting problem and usually tackled by utilizing Generative Adversarial Networks (GANs) to integrate information from the input face image and a sequence of sparse facial landmarks. However, the generated face images usually suffer from quality loss, im…

    Submitted 30 May, 2021; originally announced May 2021.

  4. arXiv:2011.09078  [pdf, other]

    cs.SD cs.MM eess.AS

    Vertical-Horizontal Structured Attention for Generating Music with Chords

    Authors: Yizhou Zhao, Liang Qiu, Wensi Ai, Feng Shi, Song-Chun Zhu

    Abstract: In this paper, we propose a lightweight music-generating model based on variational autoencoder (VAE) with structured attention. Generating music is different from generating text because the melodies with chords give listeners distinguished polyphonic feelings. In a piece of music, a chord consisting of multiple notes comes from either the mixture of multiple instruments or the combination of mul…

    Submitted 17 November, 2020; originally announced November 2020.

  5. arXiv:2005.10455  [pdf, other]

    eess.IV cs.CV

    Single Image Super-Resolution via Residual Neuron Attention Networks

    Authors: Wenjie Ai, Xiaoguang Tu, Shilei Cheng, Mei Xie

    Abstract: Deep Convolutional Neural Networks (DCNNs) have achieved impressive performance in Single Image Super-Resolution (SISR). To further improve the performance, existing CNN-based methods generally focus on designing deeper architecture of the network. However, we argue blindly increasing network's depth is not the most sensible way. In this paper, we propose a novel end-to-end Residual Neuron Attenti…

    Submitted 21 May, 2020; originally announced May 2020.

    Comments: 6 pages, 4 figures, Accepted by IEEE ICIP 2020

  6. arXiv:1703.01107  [pdf, other]

    eess.SY physics.med-ph q-bio.TO

    An intracardiac electrogram model to bridge virtual hearts and implantable cardiac devices

    Authors: Weiwei Ai, Nitish Patel, Partha Roop, Avinash Malik, Nathan Allen, Mark L. Trew

    Abstract: Virtual heart models have been proposed to enhance the safety of implantable cardiac devices through closed loop validation. To communicate with a virtual heart, devices have been driven by cardiac signals at specific sites. As a result, only the action potentials of these sites are sensed. However, the real device implanted in the heart will sense a complex combination of near and far-field extra…

    Submitted 3 March, 2017; originally announced March 2017.

  7. arXiv:1603.05315  [pdf, ps, other]

    eess.SY q-bio.TO

    Towards the Emulation of the Cardiac Conduction System for Pacemaker Testing

    Authors: Eugene Yip, Sidharta Andalam, Partha S. Roop, Avinash Malik, Mark Trew, Weiwei Ai, Nitish Patel

    Abstract: The heart is a vital organ that relies on the orchestrated propagation of electrical stimuli to coordinate each heart beat. Abnormalities in the heart's electrical behaviour can be managed with a cardiac pacemaker. Recently, the closed-loop testing of pacemakers with an emulation (real-time simulation) of the heart has been proposed. An emulated heart would provide realistic reactions to the pacem…

    Submitted 17 March, 2016; v1 submitted 16 March, 2016; originally announced March 2016.