
Showing 1–29 of 29 results for author: Murthy, H

  1. arXiv:2303.07130  [pdf, other]

    eess.IV cs.CV cs.LG

    Enhancing COVID-19 Severity Analysis through Ensemble Methods

    Authors: Anand Thyagachandran, Hema A Murthy

    Abstract: Computed Tomography (CT) scans provide a detailed image of the lungs, allowing clinicians to observe the extent of damage caused by COVID-19. The CT severity score (CTSS) based scoring method is used to identify the extent of lung involvement observed on a CT scan. This paper presents a domain knowledge-based pipeline for extracting regions of infection in COVID-19 patients using a combination of…

    Submitted 17 March, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

  2. arXiv:2302.07480  [pdf]

    cond-mat.mtrl-sci physics.chem-ph

    Iridium-doping as a strategy to realize visible light absorption and p-type behavior in BaTiO3

    Authors: Sujana Chandrappa, Simon Joyson Galbao, P S Sankara Rama Krishnan, Namitha Anna Koshi, Srewashi Das, Stephen Nagaraju Myakala, Seung Cheol Lee, Arnab Dutta, Alexey Cherevan, Satadeep Bhattacharjee, Dharmapura H K Murthy

    Abstract: BaTiO3 is typically a strong n-type material with tuneable optoelectronic properties via doping and controlling the synthesis conditions. It has a wide band gap that can only harness the ultraviolet region of the solar spectrum. Despite significant progress, achieving visible-light absorbing BTO with tuneable carrier concentration has been challenging, a crucial requirement for many applications.…

    Submitted 15 February, 2023; originally announced February 2023.

    Comments: 21 pages, 8 figures

  3. arXiv:2302.06227  [pdf, other]

    eess.AS cs.SD

    Fast and small footprint Hybrid HMM-HiFiGAN based system for speech synthesis in Indian languages

    Authors: Sudhanshu Srivastava, Ishika Gupta, Anusha Prakash, Jom Kuriakose, Hema A. Murthy

    Abstract: Hidden-Markov-model (HMM) based text-to-speech (HTS) offers flexibility in speaking styles along with fast training and synthesis while being computationally less intense. HTS performs well even in low-resource scenarios. The primary drawback is that the voice quality is poor compared to that of E2E systems. A hybrid approach combining HMM-based feature generation and neural-network-based HiFi-GAN…

    Submitted 13 February, 2023; originally announced February 2023.

    Comments: 5 pages, 5 figures

  4. arXiv:2212.11982  [pdf, other]

    eess.AS

    HMM-based data augmentation for E2E systems for building conversational speech synthesis systems

    Authors: Ishika Gupta, Anusha Prakash, Jom Kuriakose, Hema A. Murthy

    Abstract: This paper proposes an approach to build a high-quality text-to-speech (TTS) system for technical domains using data augmentation. An end-to-end (E2E) system is trained on hidden Markov model (HMM) based synthesized speech and further fine-tuned with studio-recorded TTS data to improve the timbre of the synthesized voice. The motivation behind the work is that issues of word skips and repetitions…

    Submitted 22 December, 2022; originally announced December 2022.

    Comments: 6 pages, 7 figures, 33 references

  5. arXiv:2211.08790  [pdf, other]

    eess.AS cs.LG

    Structural Segmentation and Labeling of Tabla Solo Performances

    Authors: Gowriprasad R, R Aravind, Hema A Murthy

    Abstract: Tabla is a North Indian percussion instrument used as an accompaniment and an exclusive instrument for solo performances. Tabla solo is intricate and elaborate, exhibiting rhythmic evolution through a sequence of homogeneous sections marked by shared rhythmic characteristics. Each section has a specific structure and name associated with it. Tabla learning and performance in the Indian subcontinen…

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: 35 pages, 11 figures

  6. arXiv:2211.01603  [pdf, other]

    q-bio.GN cs.LG eess.SP

    Using Signal Processing in Tandem With Adapted Mixture Models for Classifying Genomic Signals

    Authors: Saish Jaiswal, Shreya Nema, Hema A Murthy, Manikandan Narayanan

    Abstract: Genomic signal processing has been used successfully in bioinformatics to analyze biomolecular sequences and gain varied insights into DNA structure, gene organization, protein binding, sequence evolution, etc. But challenges remain in finding the appropriate spectral representation of a biomolecular sequence, especially when multiple variable-length sequences need to be handled consistently. In t…

    Submitted 3 November, 2022; originally announced November 2022.

  7. arXiv:2211.01338  [pdf, other]

    eess.AS cs.CL cs.MM cs.SD eess.IV

    Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages

    Authors: Anusha Prakash, Arun Kumar, Ashish Seth, Bhagyashree Mukherjee, Ishika Gupta, Jom Kuriakose, Jordan Fernandes, K V Vikram, Mano Ranjith Kumar M, Metilda Sagaya Mary, Mohammad Wajahat, Mohana N, Mudit Batra, Navina K, Nihal John George, Nithya Ravi, Pruthwik Mishra, Sudhanshu Srivastava, Vasista Sai Lodagala, Vandan Mujadia, Kada Sai Venkata Vineeth, Vrunda Sukhadia, Dipti Sharma, Hema Murthy, Pushpak Bhattacharya, et al. (2 additional authors not shown)

    Abstract: Cross-lingual dubbing of lecture videos requires the transcription of the original audio, correction and removal of disfluencies, domain term discovery, text-to-text translation into the target language, chunking of text using target language rhythm, text-to-speech synthesis followed by isochronous lipsyncing to the original video. This task becomes challenging when the source and target languages…

    Submitted 1 November, 2022; originally announced November 2022.

  8. arXiv:2210.17153  [pdf, other]

    eess.AS cs.SD

    The Importance of Accurate Alignments in End-to-End Speech Synthesis

    Authors: Anusha Prakash, Hema A Murthy

    Abstract: Unit selection synthesis systems required accurate segmentation and labeling of the speech signal owing to the concatenative nature. Hidden Markov model-based speech synthesis accommodates some transcription errors, but it was later shown that accurate transcriptions yield highly intelligible speech with smaller amounts of training data. With the arrival of end-to-end (E2E) systems, it was observe…

    Submitted 31 October, 2022; originally announced October 2022.

    Comments: Version 1 uploaded

  9. arXiv:2106.01400  [pdf, other]

    eess.AS cs.LG cs.SD

    Dual Script E2E framework for Multilingual and Code-Switching ASR

    Authors: Mari Ganesh Kumar, Jom Kuriakose, Anand Thyagachandran, Arun Kumar A, Ashish Seth, Lodagala Durga Prasad, Saish Jaiswal, Anusha Prakash, Hema Murthy

    Abstract: India is home to multiple languages, and training automatic speech recognition (ASR) systems for languages is challenging. Over time, each language has adopted words from other languages, such as English, leading to code-mixing. Most Indian languages also have their own unique scripts, which poses a major limitation in training multilingual and code-switching ASR systems. Inspired by results in…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: Accepted for publication at Interspeech 2021

  10. arXiv:2105.04946  [pdf, other]

    cond-mat.mtrl-sci

    Probing Photo-excited Charge Carrier Trapping and Defect Formation in Synergistic Doping of SrTiO3

    Authors: Namitha Anna Koshi, Dharmapura H K Murthy, Sudip Chakraborty, Seung-Cheol Lee, Satadeep Bhattacharjee

    Abstract: Strontium titanate (SrTiO3) is widely used as a promising photocatalyst due to its unique band edge alignment with respect to the oxidation and reduction potential corresponding to oxygen evolution reaction (OER) and hydrogen evolution reaction (HER). However, further enhancement of the photocatalytic activity in this material could be envisaged through the effective control of oxygen vacancy stat…

    Submitted 11 May, 2021; originally announced May 2021.

  11. arXiv:2103.03215  [pdf, other]

    eess.AS cs.SD

    Front-end Diarization for Percussion Separation in Taniavartanam of Carnatic Music Concerts

    Authors: Nauman Dawalatabad, Jilt Sebastian, Jom Kuriakose, C. Chandra Sekhar, Shrikanth Narayanan, Hema A. Murthy

    Abstract: Instrument separation in an ensemble is a challenging task. In this work, we address the problem of separating the percussive voices in the taniavartanam segments of Carnatic music. In taniavartanam, a number of percussive instruments play together or in tandem. Separation of instruments in regions where only one percussion is present leads to interference and artifacts at the output, as source se…

    Submitted 4 March, 2021; originally announced March 2021.

  12. arXiv:2011.07279  [pdf, other]

    cs.CV

    Towards Zero-Shot Learning with Fewer Seen Class Examples

    Authors: Vinay Kumar Verma, Ashish Mishra, Anubha Pandey, Hema A. Murthy, Piyush Rai

    Abstract: We present a meta-learning based generative model for zero-shot learning (ZSL) towards a challenging setting when the number of training examples from each seen class is very few. This setup contrasts with the conventional ZSL approaches, where training typically assumes the availability of a sufficiently large number of training examples from each of the seen classes. The proposed approach…

    Submitted 14 November, 2020; originally announced November 2020.

    Comments: Accepted in WACV 2021

  13. arXiv:2011.02195  [pdf, other]

    eess.SP cs.LG cs.SD eess.AS

    Correlation based Multi-phasal models for improved imagined speech EEG recognition

    Authors: Rini A Sharon, Hema A Murthy

    Abstract: Translation of imagined speech electroencephalogram (EEG) into human understandable commands greatly facilitates the design of naturalistic brain computer interfaces. To achieve improved imagined speech unit classification, this work aims to profit from the parallel information contained in multi-phasal EEG data recorded while speaking, imagining and performing articulatory movements corresponding…

    Submitted 4 November, 2020; originally announced November 2020.

    Journal ref: Interspeech SMM 2020

  14. Novel Architectures for Unsupervised Information Bottleneck based Speaker Diarization of Meetings

    Authors: Nauman Dawalatabad, Srikanth Madikeri, C. Chandra Sekhar, Hema A. Murthy

    Abstract: Speaker diarization is an important problem that is topical, and is especially useful as a preprocessor for conversational speech related applications. The objective of this paper is two-fold: (i) segment initialization by uniformly distributing speaker information across the initial segments, and (ii) incorporating speaker discriminative features within the unsupervised diarization framework. In…

    Submitted 13 October, 2020; originally announced October 2020.

    Comments: Accepted in IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, 2021, pp 14-27

  15. arXiv:2010.05497  [pdf, other]

    cs.LG eess.SP q-bio.NC

    The "Sound of Silence" in EEG -- Cognitive voice activity detection

    Authors: Rini A Sharon, Hema A Murthy

    Abstract: Speech cognition bears potential application as a brain computer interface that can improve the quality of life for the otherwise communication impaired people. While speech and resting state EEG are popularly studied, here we attempt to explore a "non-speech" (NS) state of brain activity corresponding to the silence regions of speech audio. Firstly, speech perception is studied to inspect the exis…

    Submitted 12 October, 2020; originally announced October 2020.

  16. arXiv:2009.04983  [pdf, other]

    eess.AS cs.SD

    Exploration of End-to-end Synthesisers for Zero Resource Speech Challenge 2020

    Authors: Karthik Pandia D S, Anusha Prakash, Mano Ranjith Kumar, Hema A Murthy

    Abstract: A spoken dialogue system for an unseen language is referred to as Zero resource speech. It is especially beneficial for developing applications for languages that have low digital resources. Zero resource speech synthesis is the task of building text-to-speech (TTS) models in the absence of transcriptions. In this work, speech is modelled as a sequence of transient and steady-state acoustic units,…

    Submitted 10 September, 2020; originally announced September 2020.

    Comments: Accepted for publication in Interspeech 2020

  17. Evidence of Task-Independent Person-Specific Signatures in EEG using Subspace Techniques

    Authors: Mari Ganesh Kumar, Shrikanth Narayanan, Mriganka Sur, Hema A Murthy

    Abstract: Electroencephalography (EEG) signals are promising as alternatives to other biometrics owing to their protection against spoofing. Previous studies have focused on capturing individual variability by analyzing task/condition-specific EEG. This work attempts to model biometric signatures independent of task/condition by normalizing the associated variance. Toward this goal, the paper extends ideas…

    Submitted 25 March, 2021; v1 submitted 27 July, 2020; originally announced July 2020.

    Comments: ©2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

    Journal ref: IEEE Transactions on Information Forensics and Security, 2021

  18. Generic Indic Text-to-speech Synthesisers with Rapid Adaptation in an End-to-end Framework

    Authors: Anusha Prakash, Hema A Murthy

    Abstract: Building text-to-speech (TTS) synthesisers for Indian languages is a difficult task owing to a large number of active languages. Indian languages can be classified into a finite set of families, prominent among them, Indo-Aryan and Dravidian. The proposed work exploits this property to build a generic TTS system using multiple languages from the same family in an end-to-end framework. Generic syst…

    Submitted 12 June, 2020; originally announced June 2020.

    Journal ref: INTERSPEECH (2002) 2962-2966

  19. Zero resource speech synthesis using transcripts derived from perceptual acoustic units

    Authors: Karthik Pandia D S, Hema A Murthy

    Abstract: Zerospeech synthesis is the task of building vocabulary independent speech synthesis systems, where transcriptions are not available for training data. It is, therefore, necessary to convert training data into a sequence of fundamental acoustic units that can be used for synthesis during the test. This paper attempts to discover, and model perceptual acoustic units consisting of steady-state, and…

    Submitted 8 June, 2020; originally announced June 2020.

  20. arXiv:2001.06657  [pdf, other]

    cs.CV cs.IR cs.LG stat.ML

    Stacked Adversarial Network for Zero-Shot Sketch based Image Retrieval

    Authors: Anubha Pandey, Ashish Mishra, Vinay Kumar Verma, Anurag Mittal, Hema A. Murthy

    Abstract: Conventional approaches to Sketch-Based Image Retrieval (SBIR) assume that the data of all the classes are available during training. The assumption may not always be practical since the data of a few classes may be unavailable, or the classes may not appear at the time of training. Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) relaxes this constraint and allows the algorithm to handle previous…

    Submitted 18 January, 2020; originally announced January 2020.

    Comments: Accepted in WACV'2020

  21. arXiv:1904.07453  [pdf, other]

    eess.AS cs.CR cs.LG cs.SD

    Spoof detection using time-delay shallow neural network and feature switching

    Authors: Mari Ganesh Kumar, Suvidha Rupesh Kumar, Saranya M, B. Bharathi, Hema A. Murthy

    Abstract: Detecting spoofed utterances is a fundamental problem in voice-based biometrics. Spoofing can be performed either by logical accesses like speech synthesis, voice conversion or by physical accesses such as replaying the pre-recorded utterance. Inspired by the state-of-the-art x-vector based speaker verification approach, this paper proposes a time-delay shallow neural network (TD-SNN) for s…

    Submitted 23 January, 2020; v1 submitted 16 April, 2019; originally announced April 2019.

    Journal ref: 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1011-1017

  22. Incremental Transfer Learning in Two-pass Information Bottleneck based Speaker Diarization System for Meetings

    Authors: Nauman Dawalatabad, Srikanth Madikeri, C Chandra Sekhar, Hema A Murthy

    Abstract: The two-pass information bottleneck (TPIB) based speaker diarization system operates independently on different conversational recordings. TPIB system does not consider previously learned speaker discriminative information while diarizing new conversations. Hence, the real time factor (RTF) of TPIB system is high owing to the training time required for the artificial neural network (ANN). This pap…

    Submitted 21 February, 2019; originally announced February 2019.

    Comments: 5 pages, 2 figures, To appear in Proc. ICASSP 2019, May 12-17, 2019, Brighton, UK

  23. arXiv:1803.07144  [pdf]

    physics.app-ph cond-mat.mtrl-sci

    Synthesis and Characterization of Copper Doped Zinc Oxide Thin Films for CO Gas Sensing

    Authors: Sachin S Bharadwaj, Shivaraj B W, H N Narasimha Murthy, M Krishna, Manjush Ganiger, Mohd Idris, Pundaleek Anawal, Vitthal Sangappa Angadi

    Abstract: The objective of this work was to synthesize Copper doped Zinc Oxide (CZO) films and optimize the process parameters by varying the molarity of zinc acetate dihydrate from 0.5 M to 1.0 M, the concentration of copper acetate monohydrate from 1% to 5%, and the annealing temperature from 200 °C to 300 °C, to measure the sensitivity of CZO films for CO (Carbon Monoxide) gas. The concentration of CO gas was maintained…

    Submitted 19 February, 2018; originally announced March 2018.

    Comments: 7 Pages, 16 Figures, 3 Tables

  24. arXiv:1711.02318  [pdf, other]

    cs.SD eess.AS

    Non-uniform time-scaling of Carnatic music transients

    Authors: Venkata Subramanian Viraraghavan, Arpan Pal, R Aravind, Hema Murthy

    Abstract: Gamakas are an integral aspect of Carnatic Music, a form of classical music prevalent in South India. They are used in ragas, which may be seen as melodic scales and/or a set of characteristic melodic phrases. Gamakas exhibit continuous pitch variation often spanning several semitones. In this paper, we study how gamakas scale with tempo and propose a novel approach to change the tempo of Carnatic…

    Submitted 7 November, 2017; originally announced November 2017.

    Comments: The non-uniform time-scaling of CP-notes and transients in Carnatic concert renditions is new; it has not been reported earlier in the literature, but a reviewer pointed out that the proposed algorithm is previously known

  25. arXiv:1709.00663  [pdf, other]

    cs.CV

    A Generative Model For Zero Shot Learning Using Conditional Variational Autoencoders

    Authors: Ashish Mishra, M Shiva Krishna Reddy, Anurag Mittal, Hema A Murthy

    Abstract: Zero shot learning in Image Classification refers to the setting where images from some novel classes are absent in the training data but other information such as natural language descriptions or attribute vectors of the classes are available. This setting is important in the real world since one may not be able to obtain images of all the possible classes at training. While previous approaches h…

    Submitted 27 January, 2018; v1 submitted 3 September, 2017; originally announced September 2017.

  26. arXiv:1608.05892  [pdf]

    cond-mat.mtrl-sci

    A study on the growth mechanism and the process parameters controlling aluminum oxide thin films deposition by pulsed pressure MOCVD

    Authors: Hari Murthy, S. S Miya, Susan Krumdieck

    Abstract: Aluminum oxide thin films were deposited on silicon substrates under different deposition conditions using pulse pressure metal organic chemical vapour deposition (PP-MOCVD). The current study investigates the growth mechanism of the deposited film and the control of the film morphology by varying the processing parameters of PP-MOCVD - choice of solvent, concentration, and presence of a shie…

    Submitted 21 August, 2016; originally announced August 2016.

    Comments: 27 pages, 12 figures, pre-peer review

  27. arXiv:1603.05435  [pdf, ps, other]

    cs.SD

    Modified Group Delay Based MultiPitch Estimation in Co-Channel Speech

    Authors: Rajeev Rajan, Hema A. Murthy

    Abstract: Phase processing has been replaced by group delay processing for the extraction of source and system parameters from speech. Group delay functions are ill-behaved when the transfer function has zeros that are close to the unit circle in the z-domain. The modified group delay function addresses this problem and has been successfully used for formant and monopitch estimation. In this paper, modified gro…

    Submitted 17 March, 2016; originally announced March 2016.

  28. arXiv:1107.1576  [pdf, other]

    cond-mat.mtrl-sci

    Influence of Phase Segregation on Recombination Dynamics in Organic Bulk-Heterojunction Solar Cells

    Authors: Andreas Baumann, Tom J. Savenije, Dharmapura Hanumantharaya K. Murthy, Martin Heeney, Vladimir Dyakonov, Carsten Deibel

    Abstract: We studied the recombination dynamics of charge carriers in organic bulk heterojunction solar cells made of the blend system poly(2,5-bis(3-dodecyl thiophen-2-yl) thieno[2,3-b]thiophene) (pBTCT-C12):[6,6]-phenyl-C61-butyric acid methyl ester (PC61BM) with a donor-acceptor ratio of 1:1 and 1:4. The techniques of charge carrier extraction by linearly increasing voltage (photo-CELIV) and, as local p…

    Submitted 8 July, 2011; originally announced July 2011.

    Comments: 14 pages, 5 figures

    Journal ref: Adv. Func. Mat. 21, 1687 (2011)

  29. Efficient photogeneration of charge carriers in silicon nanowires with a radial doping gradient

    Authors: D. H. K. Murthy, T. Xu, W. H. Chen, J. Houtepen A., T. J. Savenije, L. D. A. Siebbeles, J. P. Nys, Christophe Krzeminski, Bruno Grandidier, Didier Stiévenard, Philippe Pareige, F. Jomard, Gilles Patriache, O. I. Lebedev

    Abstract: From electrodeless time-resolved microwave conductivity measurements, the efficiency of charge carrier generation, their mobility, and decay kinetics on photo-excitation were studied in arrays of Si nanowires grown by the vapor-liquid-solid mechanism. A large enhancement in the magnitude of the photoconductance and charge carrier lifetime is found depending on the incorporation of impurities duri…

    Submitted 28 June, 2011; originally announced June 2011.

    Journal ref: Nanotechnology (2011), Vol. 22, p. 315710