Direct Measurement of Microwave Loss in Nb Films for Superconducting Qubits

Authors: B. Abdisatarov, D. Bafia, A. Murthy, G. Eremeev, H. E. Elsayed-Ali, J. Lee, A. Netepenko, C. P. A. Carlos, S. Leith, G. J. Rosaz, A. Romanenko, A. Grassellino

Abstract: Niobium films are a key component in modern two-dimensional superconducting qubits, yet their contribution to the total qubit decay rate is not fully understood. The presence of different layers of materials and interfaces makes it difficult to identify the dominant loss channels in present two-dimensional qubit designs. In this paper we present the first study which directly correlates measurements of RF losses in such films to material parameters by investigating a high-power impulse magnetron sputtered (HiPIMS) film atop a three-dimensional niobium superconducting radiofrequency (SRF) resonator. By using a 3D SRF structure, we are able to isolate the niobium film loss from other contributions. Our findings indicate that microwave dissipation in the HiPIMS-prepared niobium films, within the quantum regime, resembles that of record-high intrinsic quality factor of bulk niobium SRF cavities, with lifetimes extending into seconds. Microstructure and impurity level of the niobium film do not significantly affect the losses. These results set the scale of microwave losses in niobium films and show that niobium losses do not dominate the observed coherence times in present two-dimensional superconducting qubit designs, instead highlighting the dominant role of the dielectric oxide in limiting the performance. We can also set a bound for when niobium film losses will become a limitation for qubit lifetimes.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 20 pages, 8 figures

arXiv:2405.02251 [pdf, other]

Fermionization and collective excitations of 1D polariton lattices

Authors: Johannes Knörzer, Rafał Ołdziejewski, Puneet A. Murthy, Ivan Amelio

Abstract: We theoretically demonstrate that the hallmarks of correlation and fermionization in a one-dimensional exciton-polaritons gas can be observed with state-of-the-art technology. Our system consists of a chain of excitonic quantum dots coupled to a photonic waveguide, with a low filling of polaritons. We analytically identify the Tonks-Girardeau, Tavis-Cummings and mean-field limits and relate them to different regimes of the excitonic anharmonicity and photonic bandwidth. Using matrix-product states, we numerically calculate the ground-state energies, correlation functions and dynamic structure factor of the system. In particular, the latter has a finite weight in the Lieb-Liniger hole branch, and the density-density correlator displays Friedel-like oscillations for realistic parameters, which reveal the onset of fermionization close to the Tonks-Girardeau regime. Our work encourages future experiments aimed at observing, for the first time and in spite of the moderate excitonic anharmonicity, strongly correlated exciton-polariton physics.

Submitted 3 May, 2024; originally announced May 2024.

Comments: 8 pages, 6 figures, supplemental material: 2 pages, 4 figures

arXiv:2402.01817 [pdf, other]

LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

Authors: Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Mudit Verma, Kaya Stechly, Siddhant Bhambri, Lucas Saldyt, Anil Murthy

Abstract: There is considerable confusion about the role of Large Language Models (LLMs) in planning and reasoning tasks. On one side are over-optimistic claims that LLMs can indeed do these tasks with just the right prompting or self-verification strategies. On the other side are perhaps over-pessimistic claims that all that LLMs are good for in planning/reasoning tasks are as mere translators of the problem specification from one syntactic format to another, and ship the problem off to external symbolic solvers. In this position paper, we take the view that both these extremes are misguided. We argue that auto-regressive LLMs cannot, by themselves, do planning or self-verification (which is after all a form of reasoning), and shed some light on the reasons for misunderstandings in the literature. We will also argue that LLMs should be viewed as universal approximate knowledge sources that have much more meaningful roles to play in planning/reasoning tasks beyond simple front-end/back-end format translators. We present a vision of LLM-Modulo Frameworks that combine the strengths of LLMs with external model-based verifiers in a tighter bi-directional interaction regime. We will show how the models driving the external verifiers themselves can be acquired with the help of LLMs.
We will also argue that rather than simply pipelining LLMs and symbolic components, this LLM-Modulo Framework provides a better neuro-symbolic approach that offers tighter integration between LLMs and symbolic components, and allows extending the scope of model-based planning/reasoning regimes towards more flexible knowledge, problem and preference specifications.

Submitted 11 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Journal ref: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

arXiv:2312.14292 [pdf, other]

Benchmarking Multi-Agent Preference-based Reinforcement Learning for Human-AI Teaming

Authors: Siddhant Bhambri, Mudit Verma, Anil Murthy, Subbarao Kambhampati

Abstract: Preference-based Reinforcement Learning (PbRL) is an active area of research, and has made significant strides in single-agent actor and in observer human-in-the-loop scenarios. However, its application within the co-operative multi-agent RL frameworks, where humans actively participate and express preferences for agent behavior, remains largely uncharted. We consider a two-agent (Human-AI) cooperative setup where both the agents are rewarded according to human's reward function for the team. However, the agent does not have access to it, and instead, utilizes preference-based queries to elicit its objectives and human's preferences for the robot in the human-robot team. We introduce the notion of Human-Flexibility, i.e. whether the human partner is amenable to multiple team strategies, with a special case being Specified Orchestration where the human has a single team policy in mind (most constrained case). We propose a suite of domains to study PbRL for Human-AI cooperative setup which explicitly require forced cooperation. Adapting state-of-the-art single-agent PbRL algorithms to our two-agent setting, we conduct a comprehensive benchmarking study across our domain suite. Our findings highlight the challenges associated with high degree of Human-Flexibility and the limited access to the human's envisioned policy in PbRL for Human-AI cooperation. Notably, we observe that PbRL algorithms exhibit effective performance exclusively in the case of Specified Orchestration which can be seen as an upper bound PbRL performance for future research.

Submitted 21 December, 2023; originally announced December 2023.

arXiv:2312.10697 [pdf, other]

Magnetic Fluctuations in Niobium Pentoxide

Authors: Y. Krasnikova, A. A. Murthy, F. Crisa, M. Bal, Z. Sung, J. Lee, A. Cano, D. M. T. van Zanten, A. Romanenko, A. Grassellino, A. Suter, T. Prokscha, Z. Salman

Abstract: Using a spin-polarized muon beam we were able to capture magnetic dynamics in an amorphous niobium pentoxide thin film. Muons are used to probe internal magnetic fields produced by defects. Magnetic fluctuations could be described by the dynamical Kubo-Toyabe model considering a time-dependent local magnetic field. We state that observed fluctuations result from the correlated motion of electron spins. We expect that oxygen vacancies play a significant role in these films and lead to a complex magnetic field distribution which is non-stationary. The characteristic average rate of magnetic field change is on the order of 100 MHz. The observed dynamics may provide insight into potential noise sources in Nb-based superconducting devices, while also highlighting the limitations imposed by amorphous oxides.

Submitted 17 December, 2023; originally announced December 2023.

Comments: 8 pages, 7 figures

Report number: FERMILAB-PUB-23-642-SQMS

arXiv:2308.06361 [pdf, other]

Quantum control of exciton wavefunctions in 2D semiconductors

Authors: Jenny Hu, Etienne Lorchat, Xueqi Chen, Kenji Watanabe, Takashi Taniguchi, Tony F. Heinz, Puneet A. Murthy, Thibault Chervy

Abstract: Excitons -- bound electron-hole pairs -- play a central role in light-matter interaction phenomena, and are crucial for wide-ranging applications from light harvesting and generation to quantum information processing. A long-standing challenge in solid-state optics has been to achieve precise and scalable control over the quantum mechanical state of excitons in semiconductor heterostructures. Here, we demonstrate a technique for creating tailored and tunable potential landscapes for optically active excitons in 2D semiconductors that enables in-situ wavefunction shaping at the nanoscopic lengthscale. Using nanostructured gate electrodes, we create localized electrostatic traps for excitons in diverse geometries such as quantum dots and rings, and arrays thereof. We show independent spectral tuning of multiple spatially separated quantum dots, which allows us to bring them to degeneracy despite material disorder. Owing to the strong light-matter coupling of excitons in 2D semiconductors, we observe unambiguous signatures of confined exciton wavefunctions in optical reflection and photoluminescence measurements. Our work introduces a new approach to engineering exciton dynamics and interactions at the nanometer scale, with implications for novel optoelectronic devices, topological photonics, and many-body quantum nonlinear optics.

Submitted 11 August, 2023; originally announced August 2023.

arXiv:2305.01022 [pdf]

First Direct Observation of Nanometer size Hydride Precipitations on Superconducting Niobium

Authors: Zu Hawn Sung, Arely Cano, Akshay Murthy, Evguenia Karapetrova, Jaeyel Lee, Martina Martinello, Anna Grassellino, Alexander Romanenko

Abstract: Superconducting niobium serves as a key enabling material for superconducting radio frequency (SRF) technology as well as quantum computing devices. At room temperature, hydrogen commonly occupies tetragonal sites in the Nb lattice as metal (M)-gas (H) phase. When the temperature is decreased, however, solid solution of Nb-H starts to be precipitated. In this study, we show the first identified topographical features associated with nanometer-size hydride phase (Nb$_{1-x}$H$_x$) precipitates on metallic superconducting niobium using cryogenic-atomic force microscopy (AFM). Further, high energy grazing incidence X-ray diffraction reveals information regarding the structure and stoichiometry that these precipitates exhibit. Finally, through time-of-flight secondary ion mass spectroscopy (ToF-SIMS), we are able to locate atomic hydrogen sources near the top surface. This systematic study further explains localized degradation of RF superconductivity by the proximity effect due to hydrogen clusters.

Submitted 1 May, 2023; originally announced May 2023.

arXiv:2304.13257 [pdf, other]

doi 10.1038/s41534-024-00840-x

Systematic Improvements in Transmon Qubit Coherence Enabled by Niobium Surface Encapsulation

Authors: Mustafa Bal, Akshay A. Murthy, Shaojiang Zhu, Francesco Crisa, Xinyuan You, Ziwen Huang, Tanay Roy, Jaeyel Lee, David van Zanten, Roman Pilipenko, Ivan Nekrashevich, Andrei Lunin, Daniel Bafia, Yulia Krasnikova, Cameron J. Kopas, Ella O. Lachman, Duncan Miller, Josh Y. Mutus, Matthew J. Reagor, Hilal Cansizoglu, Jayss Marshall, David P. Pappas, Kim Vu, Kameshwar Yadavalli, Jin-Su Oh, et al. (15 additional authors not shown)

Abstract: We present a novel transmon qubit fabrication technique that yields systematic improvements in T$_1$ relaxation times. We fabricate devices using an encapsulation strategy that involves passivating the surface of niobium and thereby preventing the formation of its lossy surface oxide. By maintaining the same superconducting metal and only varying the surface structure, this comparative investigation examining different capping materials, such as tantalum, aluminum, titanium nitride, and gold, and film substrates across different qubit foundries definitively demonstrates the detrimental impact that niobium oxides have on the coherence times of superconducting qubits, compared to native oxides of tantalum, aluminum or titanium nitride. Our surface-encapsulated niobium qubit devices exhibit T$_1$ relaxation times 2 to 5 times longer than baseline niobium qubit devices with native niobium oxides. When capping niobium with tantalum, we obtain median qubit lifetimes above 300 microseconds, with maximum values up to 600 microseconds, that represent the highest lifetimes to date for superconducting qubits prepared on both sapphire and silicon. Our comparative structural and chemical analysis suggests why amorphous niobium oxides may induce higher losses compared to other amorphous oxides. These results are in line with high-accuracy measurements of the niobium oxide loss tangent obtained with ultra-high Q superconducting radiofrequency (SRF) cavities.
This new surface encapsulation strategy enables even further reduction of dielectric losses via passivation with ambient-stable materials, while preserving fabrication and scalable manufacturability thanks to the compatibility with silicon processes.

Submitted 24 January, 2024; v1 submitted 25 April, 2023; originally announced April 2023.

Journal ref: npj Quantum Inf 10, 43 (2024)

arXiv:2303.07130 [pdf, other]

Enhancing COVID-19 Severity Analysis through Ensemble Methods

Authors: Anand Thyagachandran, Hema A Murthy

Abstract: Computed Tomography (CT) scans provide a detailed image of the lungs, allowing clinicians to observe the extent of damage caused by COVID-19. The CT severity score (CTSS) based scoring method is used to identify the extent of lung involvement observed on a CT scan. This paper presents a domain knowledge-based pipeline for extracting regions of infection in COVID-19 patients using a combination of image-processing algorithms and a pre-trained UNET model. The severity of the infection is then classified into different categories using an ensemble of three machine-learning models: Extreme Gradient Boosting, Extremely Randomized Trees, and Support Vector Machine. The proposed system was evaluated on a validation dataset in the AI-Enabled Medical Image Analysis Workshop and COVID-19 Diagnosis Competition (AI-MIA-COV19D) and achieved a macro F1 score of 64%. These results demonstrate the potential of combining domain knowledge with machine learning techniques for accurate COVID-19 diagnosis using CT scans. The implementation of the proposed system for severity analysis is available at https://github.com/aanandt/Enhancing-COVID-19-Severity-Analysis-through-Ensemble-Methods.git

Submitted 17 March, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

arXiv:2302.06227 [pdf, other]

Fast and small footprint Hybrid HMM-HiFiGAN based system for speech synthesis in Indian languages

Authors: Sudhanshu Srivastava, Ishika Gupta, Anusha Prakash, Jom Kuriakose, Hema A. Murthy

Abstract: Hidden-Markov-model (HMM) based text-to-speech (HTS) offers flexibility in speaking styles along with fast training and synthesis while being computationally less intense. HTS performs well even in low-resource scenarios. The primary drawback is that the voice quality is poor compared to that of E2E systems. A hybrid approach combining HMM-based feature generation and neural-network-based HiFi-GAN vocoder to improve HTS synthesis quality is proposed. HTS is trained on high-resolution mel-spectrograms instead of conventional mel generalized coefficients (MGC), and the output mel-spectrogram corresponding to the input text is used in a HiFi-GAN vocoder trained on Indic languages, to produce naturalness that is equivalent to that of E2E systems, as evidenced from the DMOS and PC tests.

Submitted 13 February, 2023; originally announced February 2023.

Comments: 5 pages, 5 figures

arXiv:2212.11982 [pdf, other]

HMM-based data augmentation for E2E systems for building conversational speech synthesis systems

Authors: Ishika Gupta, Anusha Prakash, Jom Kuriakose, Hema A. Murthy

Abstract: This paper proposes an approach to build a high-quality text-to-speech (TTS) system for technical domains using data augmentation. An end-to-end (E2E) system is trained on hidden Markov model (HMM) based synthesized speech and further fine-tuned with studio-recorded TTS data to improve the timbre of the synthesized voice. The motivation behind the work is that issues of word skips and repetitions are usually absent in HMM systems due to their ability to model the duration distribution of phonemes accurately. Context-dependent pentaphone modeling, along with tree-based clustering and state-tying, takes care of unseen context and out-of-vocabulary words. A language model is also employed to reduce synthesis errors further. Subjective evaluations indicate that speech produced using the proposed system is superior to the baseline E2E synthesis approach in terms of intelligibility when combining complementing attributes from HMM and E2E frameworks. The further analysis highlights the proposed approach's efficacy in low-resource scenarios.

Submitted 22 December, 2022; originally announced December 2022.

Comments: 6 pages, 7 figures, 33 references

arXiv:2212.07419 [pdf, other]

Resonantly enhanced superconductivity mediated by spinor condensates

Authors: Giacomo Bighin, Puneet A. Murthy, Nicolò Defenu, Tilman Enss

Abstract: Achieving strong interactions in fermionic many-body systems is a major theme of research in condensed matter physics. It is well-known that interactions between fermions can be mediated through a bosonic medium, such as a phonon bath or Bose-Einstein condensate (BEC). Here, we show that such induced attraction can be resonantly enhanced when the bosonic medium is a two-component spinor BEC. The strongest interaction is achieved by tuning the boson-boson scattering to the quantum critical spinodal point of the BEC where the sound velocity vanishes. The fermion pairing gap and the superconducting critical temperature can thus be dramatically enhanced. We propose two experimental realizations of this scenario, with exciton-polariton systems in two-dimensional semiconductors and ultracold atomic Bose-Fermi mixtures.

Submitted 14 December, 2022; originally announced December 2022.

Comments: 8 pages, 3 figures

arXiv:2211.08790 [pdf, other]

Structural Segmentation and Labeling of Tabla Solo Performances

Authors: Gowriprasad R, R Aravind, Hema A Murthy

Abstract: Tabla is a North Indian percussion instrument used as an accompaniment and an exclusive instrument for solo performances. Tabla solo is intricate and elaborate, exhibiting rhythmic evolution through a sequence of homogeneous sections marked by shared rhythmic characteristics. Each section has a specific structure and name associated with it. Tabla learning and performance in the Indian subcontinent is based on stylistic schools called gharana-s. Several compositions by various composers from different gharana-s are played in each section. This paper addresses the task of segmenting the tabla solo concert into musically meaningful sections. We then assign suitable section labels and recognize gharana-s from the sections. We present a diverse collection of over 38 hours of solo tabla recordings for the task. We motivate the problem and present different challenges and facets of the tasks. Inspired by the distinct musical properties of tabla solo, we compute several rhythmic and timbral features for the segmentation task. This work explores the approach of automatically locating the significant changes in the rhythmic structure by analyzing local self-similarity in an unsupervised manner. We also explore supervised random forest and a convolutional neural network trained on hand-crafted features. Both supervised and unsupervised approaches are also tested on a set of held-out recordings.
Segmentation of an audio piece into its structural components and labeling is crucial to many music information retrieval applications like repetitive structure finding, audio summarization, and fast music navigation. This work helps us obtain a comprehensive musical description of the tabla solo concert.

Submitted 16 November, 2022; originally announced November 2022.

Comments: 35 pages, 11 figures

arXiv:2211.02787 [pdf, other]

One-point asymptotics for half-flat ASEP

Authors: Evgeni Dimitrov, Anushka Murthy

Abstract: We consider the asymmetric simple exclusion process (ASEP) with half-flat initial condition. We show that the one-point marginals of the ASEP height function are described by those of the $\mbox{Airy}_{2 \rightarrow 1}$ process, introduced by Borodin-Ferrari-Sasamoto in (Commun. Pure Appl. Math., 61, 1603-1629, 2008). This result was conjectured by Ortmann-Quastel-Remenik (Ann. Appl. Probab., 26, 507-548), based on an informal asymptotic analysis of exact formulas for generating functions of the half-flat ASEP height function at one spatial point. Our present work provides a fully rigorous derivation and asymptotic analysis of the same generating functions, under certain parameter restrictions of the model.

Submitted 4 November, 2022; originally announced November 2022.

Comments: 39 pages, 2 figures

MSC Class: 82B20; 60K35

arXiv:2211.01603 [pdf, other]

Using Signal Processing in Tandem With Adapted Mixture Models for Classifying Genomic Signals

Authors: Saish Jaiswal, Shreya Nema, Hema A Murthy, Manikandan Narayanan

Abstract: Genomic signal processing has been used successfully in bioinformatics to analyze biomolecular sequences and gain varied insights into DNA structure, gene organization, protein binding, sequence evolution, etc. But challenges remain in finding the appropriate spectral representation of a biomolecular sequence, especially when multiple variable-length sequences need to be handled consistently. In this study, we address this challenge in the context of the well-studied problem of classifying genomic sequences into different taxonomic units (strain, phyla, order, etc.). We propose a novel technique that employs signal processing in tandem with Gaussian mixture models to improve the spectral representation of a sequence and subsequently the taxonomic classification accuracies. The sequences are first transformed into spectra, and projected to a subspace, where sequences belonging to different taxons are better distinguishable. Our method outperforms a similar state-of-the-art method on established benchmark datasets by an absolute margin of 6.06% accuracy.

Submitted 3 November, 2022; originally announced November 2022.

arXiv:2210.17153 [pdf, other]

The Importance of Accurate Alignments in End-to-End Speech Synthesis

Authors: Anusha Prakash, Hema A Murthy

Abstract: Unit selection synthesis systems required accurate segmentation and labeling of the speech signal owing to the concatenative nature. Hidden Markov model-based speech synthesis accommodates some transcription errors, but it was later shown that accurate transcriptions yield highly intelligible speech with smaller amounts of training data. With the arrival of end-to-end (E2E) systems, it was observed that very good quality speech could be synthesised with large amounts of data. As end-to-end synthesis progressed from Tacotron to FastSpeech2, it has become imminent that features that represent prosody are important for good-quality synthesis. In particular, durations of the sub-word units are important. Variants of FastSpeech use a teacher model or forced alignments to obtain good-quality synthesis. In this paper, we focus on duration prediction, using signal processing cues in tandem with forced alignment to produce accurate phone durations during training. The current work aims to highlight the importance of accurate alignments for good-quality synthesis. An attempt is made to train the E2E systems with accurately labeled data, and compare the same with approximately labeled data.

Submitted 31 October, 2022; originally announced October 2022.

Comments: Version 1 uploaded

arXiv:2207.13024 [pdf, other]

High quality superconducting Nb co-planar resonators on sapphire substrate

Authors: S. Zhu, F. Crisa, M. Bal, A. A. Murthy, J. Lee, Z. Sung, A. Lunin, D. Frolov, R. Pilipenko, D. Bafia, A. Mitra, A. Romanenko, A. Grassellino

Abstract: We present measurements and simulations of superconducting Nb co-planar waveguide resonators on sapphire substrate down to millikelvin temperature range with different readout powers. In the high temperature regime, we demonstrate that the Nb film residual surface resistance is comparable to that observed in the ultra-high quality, bulk Nb 3D superconducting radio frequency cavities while the resonator quality is dominated by the BCS thermally excited quasiparticles. At low temperature both the resonator quality factor and frequency can be well explained using the two-level system models. Through the energy participation ratio simulations, we find that the two-level system loss tangent is $\sim 10^{-2}$, which agrees quite well with similar studies performed on the Nb 3D cavities.

Submitted 26 July, 2022; originally announced July 2022.

arXiv:2207.12495 [pdf, other]

Stress-induced omega phase transition in Nb thin films for superconducting qubits

Authors: Jaeyel Lee, Zuhawn Sung, Akshay A. Murthy, Anna Grassellino, Alex Romanenko

Abstract: We report the observation of omega phase formation in Nb thin films deposited by high-power impulse magnetron sputtering (HiPIMS) for superconducting qubits using transmission electron microscopy (TEM). We hypothesize that this phase transformation to the omega phase with hexagonal structure from bcc phase as well as the formation of {111}<112> mechanical twins is induced by internal stress in the Nb thin films. In terms of lateral dimensions, the size of the omega phase of Nb ranges from 10 to 100 nm, which is comparable to the coherence length of Nb (~40 nm). In terms of overall volume fraction, ~1 vol.% of the Nb grains exhibit this omega phase. We also find that the omega phase in Nb is not observed in large grain Nb samples, suggesting that the phase transition can be suppressed through reducing the grain boundary density, which may serve as a source of strain and dislocations in this system. The current finding may indicate that the Nb thin film is prone to the omega phase transition due to the internal stress in the Nb thin film. We conclude by discussing effects of the omega phase on the superconducting properties of Nb thin films and discussing pathways to mitigate its formation.

Submitted 25 July, 2022; originally announced July 2022.

Comments: 5 pages, 4 figures

arXiv:2204.08605 [pdf, other]

Quantum computing hardware for HEP algorithms and sensing

Authors: M. Sohaib Alam, Sergey Belomestnykh, Nicholas Bornman, Gustavo Cancelo, Yu-Chiu Chao, Mattia Checchin, Vinh San Dinh, Anna Grassellino, Erik J. Gustafson, Roni Harnik, Corey Rae Harrington McRae, Ziwen Huang, Keshav Kapoor, Taeyoon Kim, James B. Kowalkowski, Matthew J. Kramer, Yulia Krasnikova, Prem Kumar, Doga Murat Kurkcuoglu, Henry Lamm, Adam L. Lyon, Despina Milathianaki, Akshay Murthy, Josh Mutus, Ivan Nekrashevich, et al. (15 additional authors not shown)

Abstract: Quantum information science harnesses the principles of quantum mechanics to realize computational algorithms with complexities vastly intractable by current computer platforms. Typical applications range from quantum chemistry to optimization problems and also include simulations for high energy physics. The recent maturing of quantum hardware has triggered preliminary explorations by several institutions (including Fermilab) of quantum hardware capable of demonstrating quantum advantage in multiple domains, from quantum computing to communications, to sensing. The Superconducting Quantum Materials and Systems (SQMS) Center, led by Fermilab, is dedicated to providing breakthroughs in quantum computing and sensing, mediating quantum engineering and HEP based material science. The main goal of the Center is to deploy quantum systems with superior performance tailored to the algorithms used in high energy physics. In this Snowmass paper, we discuss the two most promising superconducting quantum architectures for HEP algorithms, i.e. three-level systems (qutrits) supported by transmon devices coupled to planar devices and multi-level systems (qudits with arbitrary N energy levels) supported by superconducting 3D cavities. For each architecture, we demonstrate exemplary HEP algorithms and identify the current challenges, ongoing work and future opportunities. Furthermore, we discuss the prospects and complexities of interconnecting the different architectures and individual computational nodes. Finally, we review several different strategies of error protection and correction and discuss their potential to improve the performance of the two architectures. This whitepaper seeks to reach out to the HEP community and drive progress in both HEP research and QIS hardware.

Submitted 29 April, 2022; v1 submitted 18 April, 2022; originally announced April 2022.

Comments: contribution to Snowmass 2021

Report number: FERMILAB-PUB-22-260-SQMS

arXiv:2203.08710 [pdf, other]

Developing a Chemical and Structural Understanding of the Surface Oxide in a Niobium Superconducting Qubit

Authors: Akshay A. Murthy, Paul Masih Das, Stephanie M. Ribet, Cameron Kopas, Jaeyel Lee, Matthew J. Reagor, Lin Zhou, Matthew J. Kramer, Mark C. Hersam, Mattia Checchin, Anna Grassellino, Roberto dos Reis, Vinayak P. Dravid, Alexander Romanenko

Abstract: Superconducting thin films of niobium have been extensively employed in transmon qubit architectures. Although these architectures have demonstrated remarkable improvements in recent years, further improvements in performance through materials engineering will aid in large-scale deployment. Here, we use information retrieved from secondary ion mass spectrometry and electron microscopy to conduct a detailed assessment of the surface oxide that forms in ambient conditions for transmon test qubit devices patterned from a niobium film. We observe that this oxide exhibits a varying stoichiometry with NbO and NbO$_2$ found closer to the niobium film and Nb$_2$O$_5$ found closer to the surface. In terms of structural analysis, we find that the Nb$_2$O$_5$ region is semicrystalline in nature and exhibits randomly oriented grains on the order of 1-2 nm corresponding to monoclinic N-Nb$_2$O$_5$ that are dispersed throughout an amorphous matrix. Using fluctuation electron microscopy, we are able to map the relative crystallinity in the Nb$_2$O$_5$ region with nanometer spatial resolution. Through this correlative method, we observe that amorphous regions are more likely to contain oxygen vacancies and exhibit weaker bonds between the niobium and oxygen atoms. Based on these findings, we expect that oxygen vacancies likely serve as a decoherence mechanism in quantum systems.

Submitted 28 July, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

Comments: 13 pages, 4 figures

arXiv:2108.13539 [pdf, other]

doi 10.1063/5.0079321

TOF-SIMS Analysis of Decoherence Sources in Nb Superconducting Resonators

Authors: Akshay A. Murthy, Jae-Yel Lee, Cameron Kopas, Matthew J. Reagor, Anthony P. McFadden, David P. Pappas, Mattia Checchin, Anna Grassellino, Alexander Romanenko

Abstract: Superconducting qubits have emerged as a potentially foundational platform technology for addressing complex computational problems deemed intractable with classical computing. Despite recent advances enabling multiqubit designs that exhibit coherence lifetimes on the order of hundreds of $μ$s, material quality and interfacial structures continue to curb device performance. When niobium is deployed as the superconducting material, two-level system defects in the thin film and adjacent dielectric regions introduce stochastic noise and dissipate electromagnetic energy at the cryogenic operating temperatures. In this study, we utilize time-of-flight secondary ion mass spectrometry (TOF-SIMS) to understand the role specific fabrication procedures play in introducing such dissipation mechanisms in these complex systems. We interrogated Nb thin films and transmon qubit structures fabricated by Rigetti Computing and at the National Institute of Standards and Technology through slight variations in the processing and vacuum conditions. We find that when Nb film is sputtered onto the Si substrate, oxide and silicide regions are generated at various interfaces. We also observe that impurity species such as niobium hydrides and carbides are incorporated within the niobium layer during the subsequent lithographic patterning steps. The formation of these resistive compounds likely impacts the superconducting properties of the Nb thin film. Additionally, we observe the presence of halogen species distributed throughout the patterned thin films. We conclude by hypothesizing the source of such impurities in these structures in an effort to intelligently fabricate superconducting qubits and extend coherence times moving forward.

Submitted 30 August, 2021; originally announced August 2021.

Comments: 7 pages, 4 figures

arXiv:2108.13352 [pdf, other]

Oxygen Vacancies in Niobium Pentoxide as a Source of Two-Level System Losses in Superconducting Niobium

Authors: Daniel Bafia, Akshay Murthy, Anna Grassellino, Alexander Romanenko

Abstract: We identify a major source of quantum decoherence in three-dimensional superconducting radio-frequency (SRF) resonators and two-dimensional transmon qubits composed of oxidized niobium: oxygen vacancies in the niobium pentoxide which drive two-level system (TLS) losses. By probing the effect of sequential in situ vacuum baking treatments on the RF performance of bulk Nb SRF resonators and on the oxide structure of a representative Nb sample using time-of-flight secondary ion mass spectrometry (ToF-SIMS), we find a non-monotonic evolution of cavity quality factor $Q_0$ which correlates with the interplay of Nb$_2$O$_5$ vacancy generation and oxide thickness reduction. We localize this effect to the oxide itself and present the insignificant role of diffused interstitial oxygen in the underlying Nb by regrowing a new oxide via wet oxidation which reveals a mitigation of aggravated TLS losses. We hypothesize that such vacancies in the pentoxide serve as magnetic impurities and are a source of TLS-driven RF loss.

Submitted 26 July, 2024; v1 submitted 30 August, 2021; originally announced August 2021.

arXiv:2108.10385 [pdf]

Discovery of Nb hydride precipitates in superconducting qubits

Authors: Jaeyel Lee, Zuhawn Sung, Akshay A. Murthy, Matt Reagor, Anna Grassellino, Alexander Romanenko

Abstract: We report the first evidence of the formation of niobium hydrides within niobium films on silicon substrates in superconducting qubits fabricated at Rigetti Computing. We combine complementary techniques including room and cryogenic temperature atomic scale high-resolution and scanning transmission electron microscopy (HR-TEM and STEM), atomic force microscopy (AFM), and the time-of-flight secondary ion mass spectroscopy (TOF-SIMS) to reveal the existence of the niobium hydride precipitates directly in the Rigetti chip areas. Electron diffraction and high-resolution transmission electron microscopy (HR-TEM) analyses are performed at room and cryogenic temperatures (~106 K) on superconducting qubit niobium film areas, and reveal the formation of three types of Nb hydride domains with different crystalline orientations and atomic structures. There is also variation in their size and morphology from small (~5 nm) irregular shape domains within the Nb grains to large (~10-100 nm) Nb grains fully converted to niobium hydride. As niobium hydrides are non-superconducting and can easily change in size and location upon different cooldowns to cryogenic temperatures, our findings highlight a new previously unknown source of decoherence in superconducting qubits, contributing to both quasiparticle and two-level system (TLS) losses, and offering a potential explanation for qubit performance changes upon cooldowns. A pathway to mitigate the formation of the Nb hydrides for superconducting qubit applications is also discussed.

Submitted 26 September, 2023; v1 submitted 23 August, 2021; originally announced August 2021.

arXiv:2108.02517 [pdf, other]

Multi-task Federated Edge Learning (MtFEEL) in Wireless Networks

Authors: Sawan Singh Mahara, Shruti M., B. N. Bharath, Akash Murthy

Abstract: Federated Learning (FL) has evolved as a promising technique to handle distributed machine learning across edge devices. A single neural network (NN) that optimises a global objective is generally learned in most work in FL, which could be suboptimal for edge devices. Although works finding a NN personalised for edge device specific tasks exist, they lack generalisation and/or convergence guarantees. In this paper, a novel communication efficient FL algorithm for personalised learning in a wireless setting with guarantees is presented. The algorithm relies on finding a "better" empirical estimate of losses at each device, using a weighted average of the losses across different devices. It is devised from a Probably Approximately Correct (PAC) bound on the true loss in terms of the proposed empirical loss and is bounded by (i) the Rademacher complexity, (ii) the discrepancy, and (iii) a penalty term. Using a signed gradient feedback to find a personalised NN at each device, it is also proven to converge in a Rayleigh flat fading (in the uplink) channel, at a rate of the order max{1/SNR,1/sqrt(T)}. Experimental results show that the proposed algorithm outperforms locally trained devices as well as the conventionally used FedAvg and FedSGD algorithms under practical SNR regimes.

Submitted 9 March, 2022; v1 submitted 5 August, 2021; originally announced August 2021.

arXiv:2105.12987 [pdf]

Freeform nanostructuring of hexagonal boron nitride

Authors: Nolan Lassaline, Deepankur Thureja, Thibault Chervy, Daniel Petter, Puneet A. Murthy, Armin W. Knoll, David J. Norris

Abstract: Hexagonal boron nitride (hBN), long known as a thermally stable ceramic, is now available as atomically smooth, single-crystalline flakes, revolutionizing its use in optoelectronics. For nanophotonics, these flakes offer strong nonlinearities, hyperbolic dispersion, and single-photon emission, providing unique properties for optical and quantum-optical applications. For nanoelectronics, their pristine surfaces, chemical stability, and wide bandgap have made them the key substrate, encapsulant, and gate dielectric for two-dimensional electronic devices. However, while exploring these advantages, researchers have been restricted to flat flakes or those patterned with basic slits and holes, severely limiting advanced architectures. If freely varying flake profiles were possible, the hBN structure would present a powerful design parameter to further manipulate the flow of photons, electrons, and excitons in next-generation devices. Here, we demonstrate freeform nanostructuring of hBN by combining thermal scanning-probe lithography and reactive-ion etching to shape flakes with surprising fidelity. We leverage sub-nanometer height control and high spatial resolution to produce previously unattainable flake structures for a broad range of optoelectronic applications. For photonics, we fabricate microelements and show the straightforward transfer and integration of such elements by placing a spherical hBN microlens between two planar mirrors to obtain a stable, high-quality optical microcavity. We then decrease the patterning length scale to introduce Fourier surfaces for electrons, creating sophisticated, high-resolution landscapes in hBN, offering new possibilities for strain and band-structure engineering. These capabilities can advance the discovery and exploitation of emerging phenomena in hyperbolic metamaterials, polaritonics, twistronics, quantum materials, and 2D optoelectronic devices.

Submitted 27 May, 2021; originally announced May 2021.

arXiv:2103.03215 [pdf, other]

Front-end Diarization for Percussion Separation in Taniavartanam of Carnatic Music Concerts

Authors: Nauman Dawalatabad, Jilt Sebastian, Jom Kuriakose, C. Chandra Sekhar, Shrikanth Narayanan, Hema A. Murthy

Abstract: Instrument separation in an ensemble is a challenging task. In this work, we address the problem of separating the percussive voices in the taniavartanam segments of Carnatic music. In taniavartanam, a number of percussive instruments play together or in tandem. Separation of instruments in regions where only one percussion is present leads to interference and artifacts at the output, as source separation algorithms assume the presence of multiple percussive voices throughout the audio segment. We prevent this by first subjecting the taniavartanam to diarization. This process results in homogeneous clusters consisting of segments of either a single voice or multiple voices. A cluster of segments with multiple voices is identified using the Gaussian mixture model (GMM), which is then subjected to source separation. A deep recurrent neural network (DRNN) based approach is used to separate the multiple instrument segments. The effectiveness of the proposed system is evaluated on a standard Carnatic music dataset. The proposed approach provides close-to-oracle performance for non-overlapping segments and a significant improvement over traditional separation schemes.

Submitted 4 March, 2021; originally announced March 2021.

arXiv:2102.08989 [pdf, other]

Tunable quantum confinement of neutral excitons using electric fields and exciton-charge interactions

Authors: Deepankur Thureja, Atac Imamoglu, Tomasz Smolenski, Alexander Popert, Thibault Chervy, Xiaobo Lu, Song Liu, Katayun Barmak, Kenji Watanabe, Takashi Taniguchi, David J. Norris, Martin Kroner, Puneet A. Murthy

Abstract: Quantum confinement is the discretization of energy when motion of particles is restricted to length scales smaller than their de Broglie wavelength. The experimental realization of this effect has had wide ranging impact in diverse fields of physics and facilitated the development of new technologies. In semiconductor physics, quantum confinement of optically excited quasiparticles, such as excitons or trions, is typically achieved by modulation of material properties, an approach crucially limited by the lack of in situ tunability and scalability of confining potentials. Achieving fully tunable quantum confinement of optical excitations has therefore been an outstanding goal in quantum photonics. Here, we demonstrate electrically controlled quantum confinement of neutral excitons in a gate-defined monolayer p-i-n diode. A combination of dc Stark shift induced by large in-plane fields and a previously unknown confining mechanism based on repulsive interaction between excitons and free charges ensures tight exciton confinement in the narrow neutral region. Quantization of exciton motion manifests in multiple discrete, spectrally narrow, voltage-dependent optical resonances that emerge below the free exciton resonance. Our measurements reveal several unique physical features of these quantum confined excitons, including an in-plane dipolar character, one-dimensional center-of-mass confinement, and strikingly enhanced exciton size in the presence of magnetic fields. Our method provides an experimental route towards creating scalable arrays of identical single photon sources, which will constitute building blocks of strongly correlated photonic systems.

Submitted 11 January, 2022; v1 submitted 17 February, 2021; originally announced February 2021.

arXiv:2012.13842 [pdf, other]

doi 10.1021/acs.nanolett.1c01636

Spatial Mapping of Electrostatics and Dynamics across 2D Heterostructures

Authors: Akshay A. Murthy, Stephanie M. Ribet, Teodor K. Stanev, Pufan Liu, Kenji Watanabe, Takashi Taniguchi, Nathaniel P. Stern, Roberto dos Reis, Vinayak P. Dravid

Abstract: In situ electron microscopy is a key tool for understanding the mechanisms driving novel phenomena in 2D structures. Unfortunately, due to various practical challenges, technologically relevant 2D heterostructures prove challenging to address with electron microscopy. Here, we use the differential phase contrast imaging technique to build a methodology for probing local electrostatic fields during electrical operation with nanoscale precision in such materials. We find that by combining a traditional DPC setup with a high pass filter, we can largely eliminate electric fluctuations emanating from short-range atomic potentials. With this method, a priori electric field expectations can be directly compared with experimentally derived values to readily identify inhomogeneities and potentially problematic regions. We use this platform to analyze the electric field and charge density distribution across layers of hBN and MoS2.

Submitted 22 April, 2021; v1 submitted 26 December, 2020; originally announced December 2020.

Comments: 13 pages, 4 figures

arXiv:2011.07279 [pdf, other]

Towards Zero-Shot Learning with Fewer Seen Class Examples

Authors: Vinay Kumar Verma, Ashish Mishra, Anubha Pandey, Hema A. Murthy, Piyush Rai

Abstract: We present a meta-learning based generative model for zero-shot learning (ZSL) towards a challenging setting when the number of training examples from each seen class is very few. This setup contrasts with the conventional ZSL approaches, where training typically assumes the availability of a sufficiently large number of training examples from each of the seen classes. The proposed approach leverages meta-learning to train a deep generative model that integrates variational autoencoder and generative adversarial networks. We propose a novel task distribution where meta-train and meta-validation classes are disjoint to simulate the ZSL behaviour in training. Once trained, the model can generate synthetic examples from seen and unseen classes. Synthesized samples can then be used to train the ZSL framework in a supervised manner. The meta-learner enables our model to generate high-fidelity samples using only a small number of training examples from seen classes. We conduct extensive experiments and ablation studies on four benchmark datasets of ZSL and observe that the proposed model outperforms state-of-the-art approaches by a significant margin when the number of examples per seen class is very small.

Submitted 14 November, 2020; originally announced November 2020.

Comments: Accepted in WACV 2021

arXiv:2011.05292 [pdf, other]

doi 10.1016/j.bspc.2021.102679

A Stochastic Optimal Control Model with Internal Feedback and Velocity Tracking for Saccades

Authors: Varsha V, Aditya Murthy, Radhakant Padhi

Abstract: A stochastic optimal control based model with velocity tracking and internal feedback for saccadic eye movements is presented in this paper. Recent evidence from neurophysiological studies of superior colliculus suggests the presence of a dynamic input to the saccade generation system that encodes saccade velocity, rather than just the saccade amplitude and direction. The new evidence makes it imperative to test if saccade control can use a desired velocity input which is the basis for the proposed velocity tracking model. The model is validated using behavioral data of saccades generated by healthy human subjects. It generates trajectories of horizontal saccades made to different amplitudes as well as predicts vertical and oblique saccade behavior. This paper presents the first-ever model of the saccadic system in an optimal control framework using an alternate interpretation of velocity-based control, contrary to the dominant end-point based models available in the literature.

Submitted 10 November, 2020; originally announced November 2020.

Comments: 10 pages, 10 figures

arXiv:2011.02195 [pdf, other]

Correlation based Multi-phasal models for improved imagined speech EEG recognition

Authors: Rini A Sharon, Hema A Murthy

Abstract: Translation of imagined speech electroencephalogram (EEG) into human understandable commands greatly facilitates the design of naturalistic brain computer interfaces. To achieve improved imagined speech unit classification, this work aims to profit from the parallel information contained in multi-phasal EEG data recorded while speaking, imagining and performing articulatory movements corresponding to specific speech units. A bi-phase common representation learning module using neural networks is designed to model the correlation and reproducibility between an analysis phase and a support phase. The trained Correlation Network is then employed to extract discriminative features of the analysis phase. These features are further classified into five binary phonological categories using machine learning models such as Gaussian mixture based hidden Markov model and deep neural networks. The proposed approach further handles the non-availability of multi-phasal data during decoding. Topographic visualizations along with result-based inferences suggest that the multi-phasal correlation modelling approach proposed in the paper enhances imagined-speech EEG recognition performance.

Submitted 4 November, 2020; originally announced November 2020.

Journal ref: Interspeech SMM 2020

arXiv:2010.06304 [pdf, other]

doi 10.1109/TASLP.2020.3036231

Novel Architectures for Unsupervised Information Bottleneck based Speaker Diarization of Meetings

Authors: Nauman Dawalatabad, Srikanth Madikeri, C. Chandra Sekhar, Hema A. Murthy

Abstract: Speaker diarization is an important problem that is topical, and is especially useful as a preprocessor for conversational speech related applications. The objective of this paper is two-fold: (i) segment initialization by uniformly distributing speaker information across the initial segments, and (ii) incorporating speaker discriminative features within the unsupervised diarization framework. In the first part of the work, a varying length segment initialization technique for Information Bottleneck (IB) based speaker diarization system using phoneme rate as the side information is proposed. This initialization distributes speaker information uniformly across the segments and provides a better starting point for IB based clustering. In the second part of the work, we present a Two-Pass Information Bottleneck (TPIB) based speaker diarization system that incorporates speaker discriminative features during the process of diarization. The TPIB based speaker diarization system has shown improvement over the baseline IB based system. During the first pass of the TPIB system, a coarse segmentation is performed using IB based clustering. The alignments obtained are used to generate speaker discriminative features using a shallow feed-forward neural network and linear discriminant analysis. The discriminative features obtained are used in the second pass to obtain the final speaker boundaries. In the final part of the paper, variable segment initialization is combined with the TPIB framework. This leverages the advantages of better segment initialization and speaker discriminative features, resulting in an additional improvement in performance. An evaluation on standard meeting datasets shows that a significant absolute improvement of 3.9% and 4.7% is obtained on the NIST and AMI datasets, respectively.

Submitted 13 October, 2020; originally announced October 2020.

Comments: Accepted in IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING

Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, 2021, pp 14-27

arXiv:2010.05497 [pdf, other]

The "Sound of Silence" in EEG -- Cognitive voice activity detection

Authors: Rini A Sharon, Hema A Murthy

Abstract: Speech cognition bears potential application as a brain computer interface that can improve the quality of life for the otherwise communication impaired people. While speech and resting state EEG are popularly studied, here we attempt to explore a "non-speech"(NS) state of brain activity corresponding to the silence regions of speech audio. Firstly, speech perception is studied to inspect the existence of such a state, followed by its identification in speech imagination. Analogous to how voice activity detection is employed to enhance the performance of speech recognition, the EEG state activity detection protocol implemented here is applied to boost the confidence of imagined speech EEG decoding. Classification of speech and NS state is done using two datasets collected from laboratory-based and commercial-based devices. The state sequential information thus obtained is further utilized to reduce the search space of imagined EEG unit recognition. Temporal signal structures and topographic maps of NS states are visualized across subjects and sessions. The recognition performance and the visual distinction observed demonstrates the existence of silence signatures in EEG.

Submitted 12 October, 2020; originally announced October 2020.

arXiv:2009.04983 [pdf, other]

Exploration of End-to-end Synthesisers for Zero Resource Speech Challenge 2020

Authors: Karthik Pandia D S, Anusha Prakash, Mano Ranjith Kumar, Hema A Murthy

Abstract: A Spoken dialogue system for an unseen language is referred to as Zero resource speech. It is especially beneficial for developing applications for languages that have low digital resources. Zero resource speech synthesis is the task of building text-to-speech (TTS) models in the absence of transcriptions. In this work, speech is modelled as a sequence of transient and steady-state acoustic units, and a unique set of acoustic units is discovered by iterative training. Using the acoustic unit sequence, TTS models are trained. The main goal of this work is to improve the synthesis quality of zero resource TTS system. Four different systems are proposed. All the systems consist of three stages: unit discovery, followed by unit sequence to spectrogram mapping, and finally spectrogram to speech inversion. Modifications are proposed to the spectrogram mapping stage. These modifications include training the mapping on voice data, using x-vectors to improve the mapping, two-stage learning, and gender-specific modelling. Evaluation of the proposed systems in the Zerospeech 2020 challenge shows that quite good quality synthesis can be achieved.

Submitted 10 September, 2020; originally announced September 2020.

Comments: Accepted for publication in Interspeech 2020

arXiv:2007.13517 [pdf, other]

doi 10.1109/TIFS.2021.3067998

Evidence of Task-Independent Person-Specific Signatures in EEG using Subspace Techniques

Authors: Mari Ganesh Kumar, Shrikanth Narayanan, Mriganka Sur, Hema A Murthy

Abstract: Electroencephalography (EEG) signals are promising as alternatives to other biometrics owing to their protection against spoofing. Previous studies have focused on capturing individual variability by analyzing task/condition-specific EEG. This work attempts to model biometric signatures independent of task/condition by normalizing the associated variance. Toward this goal, the paper extends ideas from subspace-based text-independent speaker recognition and proposes novel modifications for modeling multi-channel EEG data. The proposed techniques assume that biometric information is present in the entire EEG signal and accumulate statistics across time in a high dimensional space. These high dimensional statistics are then projected to a lower dimensional space where the biometric information is preserved. The lower dimensional embeddings obtained using the proposed approach are shown to be task-independent. The best subspace system identifies individuals with accuracies of 86.4% and 35.9% on datasets with 30 and 920 subjects, respectively, using just nine EEG channels. The paper also provides insights into the subspace model's scalability to unseen tasks and individuals during training and the number of channels needed for subspace modeling.

Submitted 25 March, 2021; v1 submitted 27 July, 2020; originally announced July 2020.

Comments: ©2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Journal ref: IEEE Transactions on Information Forensics and Security, 2021

arXiv:2006.10855 [pdf]

doi 10.1038/s41467-020-19499-x

Lithography-free IR polarization converters via orthogonal in-plane phonons in α-MoO3 flakes

Authors: Sina Abedini Dereshgi, Thomas G. Folland, Akshay A. Murthy, Xianglian Song, Ibrahim Tanriover, Vinayak P. Dravid, Joshua D. Caldwell, Koray Aydin

Abstract: Exploiting polaritons in natural vdW materials has been successful in achieving extreme light confinement and low-loss optical devices and enabling simplified device integration. Recently, α-MoO3 has been reported as a semiconducting biaxial vdW material capable of sustaining naturally orthogonal in-plane phonon polariton modes in IR. In this study, we investigate the polarization-dependent optical characteristics of cavities formed using α-MoO3 to extend the degrees of freedom in the design of IR photonic components exploiting the in-plane anisotropy of this material. Polarization-dependent absorption over 80% in a multilayer Fabry-Perot structure with α-MoO3 is reported without the need for nanoscale fabrication on the α-MoO3. We observe coupling between the α-MoO3 optical phonons and the Fabry-Perot cavity resonances. Using cross-polarized reflectance spectroscopy we show that the strong birefringence results in 15% of the total power converted into the orthogonal polarization with respect to incident wave. These findings can open new avenues in the quest for polarization filters and low-loss, integrated planar IR photonics and in dictating polarization control.

Submitted 16 October, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

arXiv:2006.06971 [pdf, other]

doi 10.21437/Interspeech.2020-2663

Generic Indic Text-to-speech Synthesisers with Rapid Adaptation in an End-to-end Framework

Authors: Anusha Prakash, Hema A Murthy

Abstract: Building text-to-speech (TTS) synthesisers for Indian languages is a difficult task owing to a large number of active languages. Indian languages can be classified into a finite set of families, prominent among them, Indo-Aryan and Dravidian. The proposed work exploits this property to build a generic TTS system using multiple languages from the same family in an end-to-end framework. Generic systems are quite robust as they are capable of capturing a variety of phonotactics across languages. These systems are then adapted to a new language in the same family using small amounts of adaptation data. Experiments indicate that good quality TTS systems can be built using only 7 minutes of adaptation data. An average degradation mean opinion score of 3.98 is obtained for the adapted TTSes. Extensive analysis of systematic interactions between languages in the generic TTSes is carried out. x-vectors are included as speaker embedding to synthesise text in a particular speaker's voice. An interesting observation is that the prosody of the target speaker's voice is preserved. These results are quite promising as they indicate the capability of generic TTSes to handle speaker and language switching seamlessly, along with the ease of adaptation to a new language.

Submitted 12 June, 2020; originally announced June 2020.

Journal ref: INTERSPEECH (2002) 2962-2966

arXiv:2006.04372 [pdf, ps, other]

doi 10.21437/Interspeech.2019-2336

Zero resource speech synthesis using transcripts derived from perceptual acoustic units

Authors: Karthik Pandia D S, Hema A Murthy

Abstract: Zerospeech synthesis is the task of building vocabulary independent speech synthesis systems, where transcriptions are not available for training data. It is, therefore, necessary to convert training data into a sequence of fundamental acoustic units that can be used for synthesis during the test. This paper attempts to discover, and model perceptual acoustic units consisting of steady-state, and transient regions in speech. The transients roughly correspond to CV, VC units, while the steady-state corresponds to sonorants and fricatives. The speech signal is first preprocessed by segmenting the same into CVC-like units using a short-term energy-like contour. These CVC segments are clustered using a connected components-based graph clustering technique. The clustered CVC segments are initialized such that the onset (CV) and decays (VC) correspond to transients, and the rhyme corresponds to steady-states. Following this initialization, the units are allowed to re-organise on the continuous speech into a final set of AUs in an HMM-GMM framework. AU sequences thus obtained are used to train synthesis models. The performance of the proposed approach is evaluated on the Zerospeech 2019 challenge database. Subjective and objective scores show that reasonably good quality synthesis with low bit rate encoding can be achieved using the proposed AUs.

Submitted 8 June, 2020; originally announced June 2020.

arXiv:2001.06657 [pdf, other]

Stacked Adversarial Network for Zero-Shot Sketch based Image Retrieval

Authors: Anubha Pandey, Ashish Mishra, Vinay Kumar Verma, Anurag Mittal, Hema A. Murthy

Abstract: Conventional approaches to Sketch-Based Image Retrieval (SBIR) assume that the data of all the classes are available during training. The assumption may not always be practical since the data of a few classes may be unavailable, or the classes may not appear at the time of training. Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) relaxes this constraint and allows the algorithm to handle previously unseen classes during the test. This paper proposes a generative approach based on the Stacked Adversarial Network (SAN) and the advantage of Siamese Network (SN) for ZS-SBIR. While SAN generates a high-quality sample, SN learns a better distance metric compared to that of the nearest neighbor search. The capability of the generative model to synthesize image features based on the sketch reduces the SBIR problem to that of an image-to-image retrieval problem. We evaluate the efficacy of our proposed approach on TU-Berlin, and Sketchy database in both standard ZSL and generalized ZSL setting. The proposed method yields a significant improvement in standard ZSL as well as in a more challenging generalized ZSL setting (GZSL) for SBIR.

Submitted 18 January, 2020; originally announced January 2020.

Comments: Accepted in WACV'2020

arXiv:1911.11712 [pdf]

Au@MoS2@WS2 Core-Shell Architectures: A Solution to Versatile Colloidal Suspensions of 2D Heterostructures

Authors: Jennifer G. DiStefano, Akshay A. Murthy, Chamille J. Lescott, Roberto dos Reis, Yuan Li, Vinayak P. Dravid

Abstract: For years, solution processing has provided a versatile platform to extend the applications of transition metal dichalcogenides (TMDs) beyond those achievable with traditional preparation methods. However, existing solution-based synthesis and exfoliation approaches are not compatible with complex geometries, particularly when interfacial control is desired. As a result, promising TMD structures, including MoS2/WS2 heterostructures, are barred from the rich assembly and modification opportunities possible with solution preparation. Here, we introduce a strategy that combines traditional vapor phase deposition and solution chemistry to build TMD core-shell heterostructures housed in aqueous media. We report the first synthesized TMD core-shell heterostructure, Au@MoS2@WS2, with an Au nanoparticle core and MoS2 and WS2 shells, and provide a means of suspending the structure in solution to allow for higher order patterning and ligand-based functionalization. High-resolution electron microscopy and Raman spectroscopy provide detailed analysis of the structure and interfaces of the core-shell heterostructures. UV-vis, dynamic light scattering, and zeta potential measurements exhibit the outstanding natural stability and monodispersity of Au@MoS2@WS2 in solution. As a proof of concept, the aqueous environment is utilized to both functionalize the core-shell heterostructures with electrostatic ligands and pattern them into desired configurations on a target substrate. This work harnesses the advantages of vapor phase preparation of nanomaterials and the functionality possible with aqueous suspension to expand future engineering and application opportunities of TMD heterostructures.

Submitted 26 November, 2019; originally announced November 2019.

Comments: 7 pages, 4 figures

arXiv:1911.10824 [pdf, other]

Direct imaging of the order parameter of an atomic superfluid using matterwave optics

Authors: Puneet A. Murthy, Selim Jochim

Abstract: We propose a method to directly measure the complex phase distribution, superfluid density and velocity field in an ultracold atomic superfluid. The method consists of mapping the momentum distribution of the gas to real space using matterwave focusing, and manipulating the amplitude and phase by means of tailor made optical potentials. This makes it possible to find analogues of well-known techniques in optical microscopy such as Zernike phase contrast imaging, dark field imaging and schlieren imaging. Applying these ideas directly at the level of the macroscopic wavefunction of the superfluid will allow visualization of interesting effects such as phase fluctuations and topological defects, and enable measurements of transport properties such as vorticity.

Submitted 25 November, 2019; originally announced November 2019.

arXiv:1910.02879 [pdf]

doi 10.1021/acsnano.9b06581

Direct Visualization of Electric Field induced Structural Dynamics in Monolayer Transition Metal Dichalcogenides

Authors: Akshay A. Murthy, Teodor K. Stanev, Roberto dos Reis, Shiqiang Hao, Chris Wolverton, Nathaniel P. Stern, Vinayak P. Dravid

Abstract: Layered transition metal dichalcogenides (TMDs) offer many attractive features for next-generation low-dimensional device geometries. Due to the practical and fabrication challenges related to in situ methods, the atomistic dynamics that give rise to realizable macroscopic device properties are often unclear. In this study, in situ transmission electron microscopy techniques are utilized in order to understand the structural dynamics at play, especially at interfaces and defects, in the prototypical film of monolayer MoS2 under electrical bias. Through our sample fabrication process, we clearly identify the presence of mass transport in the presence of a lateral electric field. In particular, we observe that the voids present at grain boundaries combine to induce structural deformation. The electric field mediates a net vacancy flux from the grain boundary interior to the exposed surface edge sites that leaves molybdenum clusters in its wake. Following the initial biasing cycles, however, the mass flow is largely diminished, and the resultant structure remains stable over repeated biasing. We believe insights from this work can help explain observations of non-uniform heating and preferential oxidation at grain boundary sites in these materials.

Submitted 11 February, 2020; v1 submitted 7 October, 2019; originally announced October 2019.

Comments: 4 figures, ACS Nano

arXiv:1904.09975 [pdf, other]

Chaotic Quantum Behaved Particle Swarm Optimization for Multiobjective Optimization in Habitability Studies

Authors: Arun John, Anish Murthy

Abstract: In this paper, based on the Quantum-behaved Particle Swarm Optimization algorithm, we evolve the algorithm to optimize a multiobjective optimization problem, namely the Cobb Douglas Habitability function which is based on CES production functions in Economics. We also propose some changes to the Quantum-behaved Particle Swarm Optimization algorithm to mitigate the problem of the algorithm prematurely converging and show the results of the proposed changes to the Quantum-behaved Particle Swarm Optimization.

Submitted 30 April, 2019; v1 submitted 19 April, 2019; originally announced April 2019.

arXiv:1904.07453 [pdf, other]

doi 10.1109/ASRU46091.2019.9003824

Spoof detection using time-delay shallow neural network and feature switching

Authors: Mari Ganesh Kumar, Suvidha Rupesh Kumar, Saranya M, B. Bharathi, Hema A. Murthy

Abstract: Detecting spoofed utterances is a fundamental problem in voice-based biometrics. Spoofing can be performed either by logical accesses like speech synthesis, voice conversion or by physical accesses such as replaying the pre-recorded utterance. Inspired by the state-of-the-art x-vector based speaker verification approach, this paper proposes a time-delay shallow neural network (TD-SNN) for spoof detection for both logical and physical access. The novelty of the proposed TD-SNN system vis-a-vis conventional DNN systems is that it can handle variable length utterances during testing. Performance of the proposed TD-SNN systems and the baseline Gaussian mixture models (GMMs) is analyzed on the ASV-spoof-2019 dataset. The performance of the systems is measured in terms of the minimum normalized tandem detection cost function (min-t-DCF). When studied with individual features, the TD-SNN system consistently outperforms the GMM system for physical access. For logical access, GMM surpasses TD-SNN systems for certain individual features. When combined with the decision-level feature switching (DLFS) paradigm, the best TD-SNN system outperforms the best baseline GMM system on evaluation data with a relative improvement of 48.03% and 49.47% for both logical and physical access, respectively.

Submitted 23 January, 2020; v1 submitted 16 April, 2019; originally announced April 2019.

Journal ref: 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1011--1017

arXiv:1903.10870 [pdf, other]

Algorithms and Improved bounds for online learning under finite hypothesis class

Authors: Ankit Sharma, Late C. A. Murthy

Abstract: Online learning is the process of answering a sequence of questions based on the correct answers to the previous questions. It is studied in many research areas such as game theory, information theory and machine learning. There are two main components of online learning framework. First, the learning algorithm also known as the learner and second, the hypothesis class which is essentially a set of functions which learner uses to predict answers to the questions. Sometimes, this class contains some functions which have the capability to provide correct answers to the entire sequence of questions. This case is called realizable case. And when hypothesis class does not contain such functions is called unrealizable case. The goal of the learner, in both the cases, is to make as few mistakes as that could have been made by most powerful functions in hypothesis class over the entire sequence of questions. Performance of the learners is analysed by theoretical bounds on the number of mistakes made by them. This paper proposes three algorithms to improve the mistakes bound in the unrealizable case. Proposed algorithms perform highly better than the existing ones in the long run when most of the input sequences presented to the learner are likely to be realizable.

Submitted 24 March, 2019; originally announced March 2019.

Comments: 17 pages, 2 figures, 9 tables

arXiv:1903.10672 [pdf, other]

Robustness of Neural Networks to Parameter Quantization

Authors: Abhishek Murthy, Himel Das, Md Ariful Islam

Abstract: Quantization, a commonly used technique to reduce the memory footprint of a neural network for edge computing, entails reducing the precision of the floating-point representation used for the parameters of the network. The impact of such rounding-off errors on the overall performance of the neural network is estimated using testing, which is not exhaustive and thus cannot be used to guarantee the safety of the model. We present a framework based on Satisfiability Modulo Theory (SMT) solvers to quantify the robustness of neural networks to parameter perturbation. To this end, we introduce notions of local and global robustness that capture the deviation in the confidence of class assignments due to parameter quantization. The robustness notions are then cast as instances of SMT problems and solved automatically using solvers, such as dReal. We demonstrate our framework on two simple Multi-Layer Perceptrons (MLP) that perform binary classification on a two-dimensional input. In addition to quantifying the robustness, we also show that Rectified Linear Unit activation results in higher robustness than linear activations for our MLPs.

Submitted 26 March, 2019; originally announced March 2019.

arXiv:1903.08987 [pdf, other]

Some New Copula Based Distribution-free Tests of Independence among Several Random Variables

Authors: Angshuman Roy, Anil Ghosh, Alok Goswami, C. A. Murthy

Abstract: Over the last couple of decades, several copula based methods have been proposed in the literature to test for the independence among several random variables. But these existing tests are not invariant under monotone transformations of the variables, and they often perform poorly if the dependence among the variables is highly non-monotone in nature. In this article, we propose a copula based measure of dependency and use it to construct some new distribution-free tests of independence. The proposed measure and the resulting tests, all are invariant under permutations and monotone transformations of the variables. Our dependency measure involves a kernel function, and we use the Gaussian kernel for that purpose. We adopt a multi-scale approach, where we look at the results obtained for several choices of the bandwidth parameter associated with the Gaussian kernel and aggregate them judiciously. Large sample properties of the dependency measure and the resulting tests are derived under appropriate regularity conditions. Several simulated and real data sets are analyzed to compare the performance of the proposed tests with some popular tests available in the literature.

Submitted 14 November, 2019; v1 submitted 19 March, 2019; originally announced March 2019.

Comments: arXiv admin note: text overlap with arXiv:1708.07485

arXiv:1902.08051 [pdf, other]

doi 10.1109/ICASSP.2019.8683114

Incremental Transfer Learning in Two-pass Information Bottleneck based Speaker Diarization System for Meetings

Authors: Nauman Dawalatabad, Srikanth Madikeri, C Chandra Sekhar, Hema A Murthy

Abstract: The two-pass information bottleneck (TPIB) based speaker diarization system operates independently on different conversational recordings. TPIB system does not consider previously learned speaker discriminative information while diarizing new conversations. Hence, the real time factor (RTF) of TPIB system is high owing to the training time required for the artificial neural network (ANN). This paper attempts to improve the RTF of the TPIB system using an incremental transfer learning approach where the parameters learned by the ANN from other conversations are updated using current conversation rather than learning parameters from scratch. This reduces the RTF significantly. The effectiveness of the proposed approach compared to the baseline IB and the TPIB systems is demonstrated on standard NIST and AMI conversational meeting datasets. With a minor degradation in performance, the proposed system shows a significant improvement of 33.07% and 24.45% in RTF with respect to TPIB system on the NIST RT-04Eval and AMI-1 datasets, respectively.

Submitted 21 February, 2019; originally announced February 2019.

Comments: 5 pages, 2 figures, To appear in Proc. ICASSP 2019, May 12-17, 2019, Brighton, UK

arXiv:1811.04661 [pdf, ps, other]

RelDenClu: A Relative Density based Biclustering Method for identifying non-linear feature relations

Authors: Namita Jain, Susmita Ghosh, C. A. Murthy

Abstract: The existing biclustering algorithms for finding feature relation based biclusters often depend on assumptions like monotonicity or linearity. Though a few algorithms overcome this problem by using density-based methods, they tend to miss out many biclusters because they use global criteria for identifying dense regions. The proposed method, RelDenClu uses the local variations in marginal and joint densities for each pair of features to find the subset of observations, which forms the bases of the relation between them. It then finds the set of features connected by a common set of observations, resulting in a bicluster. To show the effectiveness of the proposed methodology, experimentation has been carried out on fifteen types of simulated datasets. Further, it has been applied to six real-life datasets. For three of these real-life datasets, the proposed method is used for unsupervised learning, while for other three real-life datasets it is used as an aid to supervised learning. For all the datasets the performance of the proposed method is compared with that of seven different state-of-the-art algorithms and the proposed algorithm is seen to produce better results. The efficacy of proposed algorithm is also seen by its use on COVID-19 dataset for identifying some features (genetic, demographics and others) that are likely to affect the spread of COVID-19.

Submitted 11 May, 2021; v1 submitted 12 November, 2018; originally announced November 2018.

arXiv:1810.10169 [pdf, other]

Exploiting Partial Correlations in Distributionally Robust Optimization

Authors: Divya Padmanabhan, Karthik Natarajan, Karthyek R. A. Murthy

Abstract: In this paper, we identify partial correlation information structures that allow for simpler reformulations in evaluating the maximum expected value of mixed integer linear programs with random objective coefficients. To this end, assuming only the knowledge of the mean and the covariance matrix entries restricted to block-diagonal patterns, we develop a reduced semidefinite programming formulation, the complexity of solving which is related to characterizing a suitable projection of the convex hull of the set $\{(\bold{x}, \bold{x}\bold{x}'): \bold{x} \in \mathcal{X}\}$ where $\mathcal{X}$ is the feasible region. In some cases, this lends itself to efficient representations that result in polynomial-time solvable instances, most notably for the distributionally robust appointment scheduling problem with random job durations as well as for computing tight bounds in Project Evaluation and Review Technique (PERT) networks and linear assignment problems. To the best of our knowledge, this is the first example of a distributionally robust optimization formulation for appointment scheduling that permits a tight polynomial-time solvable semidefinite programming reformulation which explicitly captures partially known correlation information between uncertain processing times of the jobs to be scheduled.

Submitted 23 October, 2018; originally announced October 2018.

Showing 1–50 of 70 results for author: Murthy, A