Showing 1–22 of 22 results for author: Fels, S

Searching in archive cs.
  1. arXiv:2309.14586

    cs.SD cs.AI cs.CV eess.AS eess.SP

    Speech Audio Synthesis from Tagged MRI and Non-Negative Matrix Factorization via Plastic Transformer

    Authors: Xiaofeng Liu, Fangxu Xing, Maureen Stone, Jiachen Zhuo, Sidney Fels, Jerry L. Prince, Georges El Fakhri, Jonghye Woo

    Abstract: The tongue's intricate 3D structure, comprising localized functional units, plays a crucial role in the production of speech. When measured using tagged MRI, these functional units exhibit cohesive displacements and derived quantities that facilitate the complex process of speech production. Non-negative matrix factorization-based approaches have been shown to estimate the functional units through…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: MICCAI 2023 (Oral presentation)

  2. arXiv:2211.05773

    cs.CV

    Scaling Neural Face Synthesis to High FPS and Low Latency by Neural Caching

    Authors: Frank Yu, Sid Fels, Helge Rhodin

    Abstract: Recent neural rendering approaches greatly improve image quality, reaching near photorealism. However, the underlying neural networks have high runtime, precluding telepresence and virtual reality applications that require high resolution at low latency. The sequential dependency of layers in deep networks makes their optimization difficult. We break this dependency by caching information from the…

    Submitted 10 November, 2022; originally announced November 2022.

  3. arXiv:2102.04588

    cs.SD cs.CL eess.AS

    A comparative study of two-dimensional vocal tract acoustic modeling based on Finite-Difference Time-Domain methods

    Authors: Debasish Ray Mohapatra, Victor Zappi, Sidney Fels

    Abstract: The two-dimensional (2D) numerical approaches for vocal tract (VT) modelling can afford a better balance between the low computational cost and accurate rendering of acoustic wave propagation. However, they require a high spatio-temporal resolution in the numerical scheme for a precise estimation of acoustic formants at the simulation run-time expense. We have recently proposed a new VT acoustic m…

    Submitted 8 February, 2021; originally announced February 2021.

    Comments: 4 pages, 3 figures

  4. arXiv:2102.01640

    cs.SD cs.CL eess.AS

    SPEAK WITH YOUR HANDS Using Continuous Hand Gestures to control Articulatory Speech Synthesizer

    Authors: Pramit Saha, Debasish Ray Mohapatra, Sidney Fels

    Abstract: This work presents our advancements in controlling an articulatory speech synthesis engine, viz., Pink Trombone, with hand gestures. Our interface translates continuous finger movements and wrist flexion into continuous speech using vocal tract area-function based articulatory speech synthesis. We use Cyberglove II with 18 sensors to capture the kinematic information of the wrist and the…

    Submitted 2 February, 2021; originally announced February 2021.

    Comments: 2 pages, 1 figure

  5. arXiv:2010.14228

    cs.HC cs.SD eess.AS

    New interfaces for musical expression

    Authors: Ivan Poupyrev, Michael J. Lyons, Sidney Fels, Tina Blaine

    Abstract: The rapid evolution of electronics, digital media, advanced materials, and other areas of technology, is opening up unprecedented opportunities for musical interface inventors and designers. The possibilities afforded by these new technologies carry with them the challenges of a complex and often confusing array of choices for musical composers and performers. New musical technologies are at least…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: 2 pages. This item describes the CHI'01 workshop which started the International Conference on New Interfaces for Musical Expression

    ACM Class: H.5.5

    Journal ref: ACM CHI'01 Extended Abstracts on Human Factors in Computing Systems, March 2001, pages 491-492

  6. arXiv:2006.16367

    eess.IV cs.LG cs.SD eess.AS stat.ML

    Ultra2Speech -- A Deep Learning Framework for Formant Frequency Estimation and Tracking from Ultrasound Tongue Images

    Authors: Pramit Saha, Yadong Liu, Bryan Gick, Sidney Fels

    Abstract: Thousands of individuals need surgical removal of their larynx due to critical diseases every year and therefore, require an alternative form of communication to articulate speech sounds after the loss of their voice box. This work addresses the articulatory-to-acoustic mapping problem based on ultrasound (US) tongue images for the development of a silent-speech interface (SSI) that can provide th…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: Accepted for publication in MICCAI 2020

  7. arXiv:2005.09463

    eess.AS cs.LG cs.SD stat.ML

    Learning Joint Articulatory-Acoustic Representations with Normalizing Flows

    Authors: Pramit Saha, Sidney Fels

    Abstract: The articulatory geometric configurations of the vocal tract and the acoustic properties of the resultant speech sound are considered to have a strong causal relationship. This paper aims at finding a joint latent representation between the articulatory and acoustic domain for vowel sounds via invertible neural network models, while simultaneously preserving the respective domain-specific features…

    Submitted 30 September, 2020; v1 submitted 16 May, 2020; originally announced May 2020.

    Comments: 5 pages, 4 figures, accepted for publication in Interspeech 2020

  8. arXiv:1912.05184

    cs.LG stat.ML

    Variational Learning with Disentanglement-PyTorch

    Authors: Amir H. Abdi, Purang Abolmaesumi, Sidney Fels

    Abstract: Unsupervised learning of disentangled representations is an open problem in machine learning. The Disentanglement-PyTorch library is developed to facilitate research, implementation, and testing of new variational algorithms. In this modular library, neural architectures, dimensionality of the latent space, and the training algorithms are fully decoupled, allowing for independent and consistent ex…

    Submitted 11 December, 2019; originally announced December 2019.

    Comments: Disentanglement Challenge - 33rd Conference on Neural Information Processing Systems (NeurIPS) - NeurIPS 2019

  9. arXiv:1912.03120

    eess.IV cs.LG stat.ML

    A Study into Echocardiography View Conversion

    Authors: Amir H. Abdi, Mohammad H. Jafari, Sidney Fels, Theresa Tsang, Purang Abolmaesumi

    Abstract: Transthoracic echo is one of the most common means of cardiac studies in the clinical routines. During the echo exam, the sonographer captures a set of standard cross sections (echo views) of the heart. Each 2D echo view cuts through the 3D cardiac geometry via a unique plane. Consequently, different views share some limited information. In this work, we investigate the feasibility of generating a…

    Submitted 5 December, 2019; originally announced December 2019.

    Comments: Workshop of Medical Imaging Meets NeurIPS, NeurIPS 2019

  10. arXiv:1911.11791

    cs.LG stat.ML

    A Preliminary Study of Disentanglement With Insights on the Inadequacy of Metrics

    Authors: Amir H. Abdi, Purang Abolmaesumi, Sidney Fels

    Abstract: Disentangled encoding is an important step towards a better representation learning. However, despite the numerous efforts, there still is no clear winner that captures the independent features of the data in an unsupervised fashion. In this work we empirically evaluate the performance of six unsupervised disentanglement approaches on the mpi3d toy dataset curated and released for the NeurIPS 2019…

    Submitted 26 November, 2019; originally announced November 2019.

    Comments: Disentanglement Challenge - NeurIPS 2019

  11. arXiv:1910.02020

    cs.HC

    EEG-to-F0: Establishing artificial neuro-muscular pathway for kinematics-based fundamental frequency control

    Authors: Himanshu Goyal, Pramit Saha, Bryan Gick, Sidney Fels

    Abstract: The fundamental frequency (F0) of human voice is generally controlled by changing the vocal fold parameters (including tension, length and mass), which in turn is manipulated by the muscle exciters, activated by the neural synergies. In order to begin investigating the neuromuscular F0 control pathway, we simulate a simple biomechanical arm prototype (instead of an artificial vocal tract) that ten…

    Submitted 24 September, 2019; originally announced October 2019.

  12. An extended two-dimensional vocal tract model for fast acoustic simulation of single-axis symmetric three-dimensional tubes

    Authors: Debasish Ray Mohapatra, Victor Zappi, Sidney Fels

    Abstract: The simulation of two-dimensional (2D) wave propagation is an affordable computational task and its use can potentially improve time performance in vocal tracts' acoustic analysis. Several models have been designed that rely on 2D wave solvers and include 2D representations of three-dimensional (3D) vocal tract-like geometries. However, until now, only the acoustics of straight 3D tubes with circu…

    Submitted 18 September, 2019; originally announced September 2019.

    Comments: 5 pages, 2 figures, Interspeech 2019 submission

  13. Variational Shape Completion for Virtual Planning of Jaw Reconstructive Surgery

    Authors: Amir H. Abdi, Mehran Pesteie, Eitan Prisman, Purang Abolmaesumi, Sidney Fels

    Abstract: The premorbid geometry of the mandible is of significant relevance in jaw reconstructive surgeries and occasionally unknown to the surgical team. In this paper, an optimization framework is introduced to train deep models for completion (reconstruction) of the missing segments of the bone based on the remaining healthy structure. To leverage the contextual information of the surroundings of the di…

    Submitted 15 July, 2019; v1 submitted 27 June, 2019; originally announced June 2019.

    Comments: Proceedings of Medical Image Computing and Computer Assisted Intervention - MICCAI 2019

  14. arXiv:1904.05746

    cs.LG cs.CL cs.SD eess.AS stat.ML

    SPEAK YOUR MIND! Towards Imagined Speech Recognition With Hierarchical Deep Learning

    Authors: Pramit Saha, Muhammad Abdul-Mageed, Sidney Fels

    Abstract: Speech-related Brain Computer Interface (BCI) technologies provide effective vocal communication strategies for controlling devices through speech commands interpreted from brain signals. In order to infer imagined speech from active thoughts, we propose a novel hierarchical deep learning BCI system for subject-independent classification of 11 speech tokens including phonemes and words. Our novel…

    Submitted 8 April, 2019; originally announced April 2019.

    Comments: Under review in INTERSPEECH 2019. arXiv admin note: text overlap with arXiv:1904.04358

  15. arXiv:1904.04358

    cs.LG cs.CL cs.SD eess.AS stat.ML

    Deep Learning the EEG Manifold for Phonological Categorization from Active Thoughts

    Authors: Pramit Saha, Muhammad Abdul-Mageed, Sidney Fels

    Abstract: Speech-related Brain Computer Interfaces (BCI) aim primarily at finding an alternative vocal communication pathway for people with speaking disabilities. As a step towards full decoding of imagined speech from active thoughts, we present a BCI system for subject-independent classification of phonological categories exploiting a novel deep learning based hierarchical feature extraction scheme. To b…

    Submitted 8 April, 2019; originally announced April 2019.

    Comments: Accepted for publication in IEEE ICASSP 2019

  16. arXiv:1904.04352

    cs.LG eess.IV stat.ML

    Hierarchical Deep Feature Learning For Decoding Imagined Speech From EEG

    Authors: Pramit Saha, Sidney Fels

    Abstract: We propose a mixed deep neural network strategy, incorporating parallel combination of Convolutional (CNN) and Recurrent Neural Networks (RNN), cascaded with deep autoencoders and fully connected layers towards automatic identification of imagined speech from EEG. Instead of utilizing raw EEG channel data, we compute the joint variability of the channels in the form of a covariance matrix that pro…

    Submitted 8 April, 2019; originally announced April 2019.

    Comments: Accepted in AAAI 2019 under Student Abstract and Poster Program

  17. arXiv:1902.03541

    cs.HC cs.CY

    Human Computer Interaction Design for Mobile Devices Based on a Smart Healthcare Architecture

    Authors: Pu Liu, Sidney Fels, Nicholas West, Matthias Görges

    Abstract: Smart and IoT-enabled mobile devices have the potential to enhance healthcare services for both patients and healthcare providers. Human computer interaction design is key to realizing a useful and usable connection between the users and these smart healthcare technologies. Appropriate design of such devices enhances the usability, improves effective operation in an integrated healthcare system, a…

    Submitted 10 February, 2019; originally announced February 2019.

    ACM Class: H.5.2; H.1.2

    Journal ref: In: Yurish SY, ed. Advances in Computers and Software Engineering: Reviews. 2019, Volume 2, p. 99-131

  18. arXiv:1811.08029

    cs.SD eess.AS

    Sound-Stream II: Towards Real-Time Gesture Controlled Articulatory Sound Synthesis

    Authors: Pramit Saha, Debasish Ray Mohapatra, Praneeth SV, Sidney Fels

    Abstract: We present an interface involving four degrees-of-freedom (DOF) mechanical control of a two dimensional, mid-sagittal tongue through a biomechanical toolkit called ArtiSynth and a sound synthesis engine called JASS towards articulatory sound synthesis. As a demonstration of the project, the user will learn to produce a range of JASS vocal sounds, by varying the shape and position of the ArtiSynth…

    Submitted 19 November, 2018; originally announced November 2018.

  19. arXiv:1811.07435

    cs.SD cs.CL eess.AS

    Limitations of Source-Filter Coupling In Phonation

    Authors: Debasish Ray Mohapatra, Sidney Fels

    Abstract: The coupling of vocal fold (source) and vocal tract (filter) is one of the most critical factors in source-filter articulation theory. The traditional linear source-filter theory has been challenged by current research which clearly shows the impact of acoustic loading on the dynamic behavior of the vocal fold vibration as well as the variations in the glottal flow pulses shape. This paper outline…

    Submitted 18 November, 2018; originally announced November 2018.

    Comments: 2 pages, 2 figures

  20. Muscle Excitation Estimation in Biomechanical Simulation Using NAF Reinforcement Learning

    Authors: Amir H. Abdi, Pramit Saha, Praneeth Srungarapu, Sidney Fels

    Abstract: Motor control is a set of time-varying muscle excitations which generate desired motions for a biomechanical system. Muscle excitations cannot be directly measured from live subjects. An alternative approach is to estimate muscle activations using inverse motion-driven simulation. In this article, we propose a deep reinforcement learning method to estimate the muscle excitations in simulated biome…

    Submitted 3 May, 2019; v1 submitted 17 September, 2018; originally announced September 2018.

    Comments: 9 pages, 3 figures. Computational Biomechanics for Medicine. MICCAI 2019. Springer, Cham

  21. arXiv:1807.11089

    cs.SD cs.CL cs.CV cs.LG eess.AS

    Towards Automatic Speech Identification from Vocal Tract Shape Dynamics in Real-time MRI

    Authors: Pramit Saha, Praneeth Srungarapu, Sidney Fels

    Abstract: Vocal tract configurations play a vital role in generating distinguishable speech sounds, by modulating the airflow and creating different resonant cavities in speech production. They contain abundant information that can be utilized to better understand the underlying speech production mechanism. As a step towards automatic mapping of vocal tract shape geometry to acoustics, this paper employs ef…

    Submitted 29 July, 2018; originally announced July 2018.

    Comments: To appear in the INTERSPEECH 2018 Proceedings

  22. arXiv:1512.05811

    cs.SD

    Spectral Study of the Vocal Tract in Vowel Synthesis: A Comparison between 1D and 3D Acoustic Analysis

    Authors: Negar M. Harandi, Daniel Aalto, Antti Hannukainen, Jarmo Malinen, Sidney Fels

    Abstract: A state-of-the-art 1D acoustic synthesizer has been previously developed, and coupled to speaker-specific biomechanical models of oropharynx in ArtiSynth. As expected, the formant frequencies of the synthesized vowel sounds were shown to be different from those of the recorded audio. Such discrepancy was hypothesized to be due to the simplified geometry of the vocal tract model as well as the one…

    Submitted 17 December, 2015; originally announced December 2015.