L. Paola García-Perera
Person information
- affiliation: Johns Hopkins University, Center for Language and Speech Processing, Baltimore, MD, USA
- affiliation (former): Nuance Communications, Inc.
- affiliation (former): Agnitio S.L., Madrid, Spain
- affiliation (PhD 2014): University of Zaragoza, Spain
- affiliation: Monterrey Institute of Technology and Higher Education (ITESM), Computer Science Department, Monterrey, Mexico
Other persons with the same name
- Paola García 0002 — Universidad Panamericana, México City, México
2020 – today
- 2024
- [c72] Ruizhe Huang, Mahsa Yarmohammadi, Jan Trmal, Jing Liu, Desh Raj, Leibny Paola García, Alexei V. Ivanov, Patrick Ehlen, Mingzhi Yu, Dan Povey, Sanjeev Khudanpur:
ConEC: Earnings Call Dataset with Real-world Contexts for Benchmarking Contextual Speech Recognition. LREC/COLING 2024: 3700-3706
- [c71] Xiangyu Zhang, Daijiao Liu, Hexin Liu, Qiquan Zhang, Hanyu Meng, Leibny Paola García-Perera, EngSiong Chng, Lina Yao:
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model. EMNLP 2024: 159-171
- [c70] Ruixing Liang, Xiangyu Zhang, Qiong Li, Lai Wei, Hexin Liu, Avisha Kumar, Kelley M. Kempski Leadingham, Joshua Punnoose, Leibny Paola García, Amir Manbachi:
Unidirectional Brain-Computer Interface: Artificial Neural Network Encoding Natural Images to FMRI Response in the Visual Cortex. ICASSP 2024: 1851-1855
- [c69] Hexin Liu, Leibny Paola García, Xiangyu Zhang, Andy W. H. Khong, Sanjeev Khudanpur:
Enhancing Code-Switching Speech Recognition With Interactive Language Biases. ICASSP 2024: 10886-10890
- [c68] Patrick Foley, Matthew Wiesner, Bismarck Odoom, Leibny Paola García-Perera, Kenton Murray, Philipp Koehn:
Where are you from? Geolocating Speech and Applications to Language Identification. NAACL-HLT 2024: 5114-5126
- [c67] Desh Raj, Matthew Wiesner, Matthew Maciejewski, Paola García, Daniel Povey, Sanjeev Khudanpur:
On Speaker Attribution with SURT. Odyssey 2024: 91-98
- [c66] Lucas Goncalves, Ali N. Salman, Abinay Reddy Naini, Laureano Moro-Velázquez, Thomas Thebaud, Paola García, Najim Dehak, Berrak Sisman, Carlos Busso:
Odyssey 2024 - Speech Emotion Recognition Challenge: Dataset, Baseline Framework, and Results. Odyssey 2024: 247-254
- [i41] Desh Raj, Matthew Wiesner, Matthew Maciejewski, Leibny Paola García-Perera, Daniel Povey, Sanjeev Khudanpur:
On Speaker Attribution with SURT. CoRR abs/2401.15676 (2024)
- [i40] Xiangyu Zhang, Daijiao Liu, Hexin Liu, Qiquan Zhang, Hanyu Meng, Leibny Paola García, Eng Siong Chng, Lina Yao:
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model. CoRR abs/2402.10642 (2024)
- [i39] Samuele Cornell, Taejin Park, Steve Huang, Christoph Böddeker, Xuankai Chang, Matthew Maciejewski, Matthew Wiesner, Paola García, Shinji Watanabe:
The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization. CoRR abs/2407.16447 (2024)
- [i38] Zexin Cai, Henry Li Xinyuan, Ashi Garg, Leibny Paola García-Perera, Kevin Duh, Sanjeev Khudanpur, Nicholas Andrews, Matthew Wiesner:
Privacy versus Emotion Preservation Trade-offs in Emotion-Preserving Speaker Anonymization. CoRR abs/2409.03655 (2024)
- [i37] Henry Li Xinyuan, Zexin Cai, Ashi Garg, Kevin Duh, Leibny Paola García-Perera, Sanjeev Khudanpur, Nicholas Andrews, Matthew Wiesner:
HLTCOE JHU Submission to the Voice Privacy Challenge 2024. CoRR abs/2409.08913 (2024)
- [i36] Xinyuan Qian, Jiaran Gao, Yaodan Zhang, Qiquan Zhang, Hexin Liu, Leibny Paola García, Haizhou Li:
SAV-SE: Scene-aware Audio-Visual Speech Enhancement with Selective State Space Model. CoRR abs/2411.07751 (2024)
- 2023
- [j7] Juan Arturo Nolazco-Flores, Ana Verónica Guerrero-Galván, Carolina Del-Valle-Soto, Leibny Paola García-Perera:
Genre Classification of Books on Spanish. IEEE Access 11: 132878-132892 (2023)
- [j6] Shota Horiguchi, Shinji Watanabe, Paola García, Yuki Takashima, Yohei Kawaguchi:
Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors. IEEE ACM Trans. Audio Speech Lang. Process. 31: 706-720 (2023)
- [c65] Tuan Vu Ho, Shota Horiguchi, Shinji Watanabe, Paola García, Takashi Sumiyoshi:
Synthetic Data Augmentation for ASR with Domain Filtering. APSIPA ASC 2023: 1760-1765
- [c64] Dongji Gao, Hainan Xu, Desh Raj, Leibny Paola García-Perera, Daniel Povey, Sanjeev Khudanpur:
Learning From Flawed Data: Weakly Supervised Automatic Speech Recognition. ASRU 2023: 1-8
- [c63] Dongji Gao, Jiatong Shi, Shun-Po Chuang, Leibny Paola García, Hung-Yi Lee, Shinji Watanabe, Sanjeev Khudanpur:
Euro: Espnet Unsupervised ASR Open-Source Toolkit. ICASSP 2023: 1-5
- [c62] Zili Huang, Desh Raj, Paola García, Sanjeev Khudanpur:
Adapting Self-Supervised Models to Multi-Talker Speech Recognition Using Speaker Embeddings. ICASSP 2023: 1-5
- [c61] Ruizhe Huang, Matthew Wiesner, Leibny Paola García-Perera, Daniel Povey, Jan Trmal, Sanjeev Khudanpur:
Building Keyword Search System from End-To-End Asr Systems. ICASSP 2023: 1-5
- [c60] Shuyue Stella Li, Xiangyu Zhang, Shu Zhou, Hongchao Shu, Ruixing Liang, Hexin Liu, Leibny Paola García:
PQLM - Multilingual Decentralized Portable Quantum Language Model. ICASSP 2023: 1-5
- [c59] Hexin Liu, Haihua Xu, Leibny Paola García, Andy W. H. Khong, Yi He, Sanjeev Khudanpur:
Reducing Language Confusion for Code-Switching Speech Recognition with Token-Level Language Diarization. ICASSP 2023: 1-5
- [c58] Jiatong Shi, Chan-Jan Hsu, Ho-Lam Chung, Dongji Gao, Paola García, Shinji Watanabe, Ann Lee, Hung-Yi Lee:
Bridging Speech and Textual Pre-Trained Models With Unsupervised ASR. ICASSP 2023: 1-5
- [c57] Yu Xuan, Xiangyu Zhang, Shuyue Stella Li, Zihan Shen, Xin Xie, Leibny Paola García, Roberto Togneri:
A New Approach to Extract Fetal Electrocardiogram Using Affine Combination of Adaptive Filters. ICASSP 2023: 1-5
- [c56] Chun Chieh Chang, Leibny Paola García-Perera, Sanjeev Khudanpur:
Crosslingual Handwritten Text Generation Using GANs. ICDAR Workshops (2) 2023: 285-301
- [c55] Shuyue Stella Li, Beining Xu, Xiangyu Zhang, Hexin Liu, Wenhan Chao, Paola García:
A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extracters. ICNLSP 2023: 200-211
- [c54] Jesús Villalba, Jonas Borgstrom, Maliha Jahan, Saurabh Kataria, Leibny Paola García, Pedro A. Torres-Carrasquillo, Najim Dehak:
Advances in Language Recognition in Low Resource African Languages: The JHU-MIT Submission for NIST LRE22. INTERSPEECH 2023: 521-525
- [c53] Dongji Gao, Matthew Wiesner, Hainan Xu, Leibny Paola García, Daniel Povey, Sanjeev Khudanpur:
Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts. INTERSPEECH 2023: 924-928
- [c52] Yi Han Victoria Chua, Hexin Liu, Leibny Paola García, Fei Ting Woon, Jinyi Wong, Xiangyu Zhang, Sanjeev Khudanpur, Andy W. H. Khong, Justin Dauwels, Suzy J. Styles:
MERLIon CCS Challenge: A English-Mandarin code-switching child-directed speech corpus for language identification and diarization. INTERSPEECH 2023: 4109-4113
- [c51] Suzy J. Styles, Yi Han Victoria Chua, Fei Ting Woon, Hexin Liu, Leibny Paola García, Sanjeev Khudanpur, Andy W. H. Khong, Justin Dauwels:
Investigating model performance in language identification: beyond simple error statistics. INTERSPEECH 2023: 4129-4133
- [i35] Suzy J. Styles, Yi Han Victoria Chua, Fei Ting Woon, Hexin Liu, Leibny Paola García-Perera, Sanjeev Khudanpur, Andy W. H. Khong, Justin Dauwels:
Investigating model performance in language identification: beyond simple error statistics. CoRR abs/2305.18925 (2023)
- [i34] Dongji Gao, Matthew Wiesner, Hainan Xu, Leibny Paola García, Daniel Povey, Sanjeev Khudanpur:
Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts. CoRR abs/2306.01031 (2023)
- [i33] Samuele Cornell, Matthew Wiesner, Shinji Watanabe, Desh Raj, Xuankai Chang, Paola García, Yoshiki Masuyama, Zhong-Qiu Wang, Stefano Squartini, Sanjeev Khudanpur:
The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios. CoRR abs/2306.13734 (2023)
- [i32] Ruixing Liang, Xiangyu Zhang, Qiong Li, Lai Wei, Hexin Liu, Avisha Kumar, Kelley M. Kempski Leadingham, Joshua Punnoose, Leibny Paola García, Amir Manbachi:
Unidirectional brain-computer interface: Artificial neural network encoding natural images to fMRI response in the visual cortex. CoRR abs/2309.15018 (2023)
- [i31] Dongji Gao, Hainan Xu, Desh Raj, Leibny Paola García-Perera, Daniel Povey, Sanjeev Khudanpur:
Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition. CoRR abs/2309.15796 (2023)
- [i30] Hexin Liu, Leibny Paola García, Xiangyu Zhang, Andy W. H. Khong, Sanjeev Khudanpur:
Enhancing Code-switching Speech Recognition with Interactive Language Biases. CoRR abs/2309.16953 (2023)
- [i29] Shuyue Stella Li, Beining Xu, Xiangyu Zhang, Hexin Liu, Wenhan Chao, Leibny Paola García:
A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors. CoRR abs/2311.15954 (2023)
- 2022
- [j5] Zili Huang, Marc Delcroix, Leibny Paola García-Perera, Shinji Watanabe, Desh Raj, Sanjeev Khudanpur:
Joint speaker diarization and speech recognition based on region proposal networks. Comput. Speech Lang. 72: 101316 (2022)
- [j4] Hexin Liu, Leibny Paola García-Perera, Andy W. H. Khong, Eng Siong Chng, Suzy J. Styles, Sanjeev Khudanpur:
Efficient Self-Supervised Learning Representations for Spoken Language Identification. IEEE J. Sel. Top. Signal Process. 16(6): 1296-1307 (2022)
- [j3] Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Paola García:
Encoder-Decoder Based Attractors for End-to-End Neural Diarization. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1493-1507 (2022)
- [c50] Zili Huang, Shinji Watanabe, Shu-Wen Yang, Paola García, Sanjeev Khudanpur:
Investigating Self-Supervised Learning for Speech Enhancement and Separation. ICASSP 2022: 6837-6841
- [c49] Shota Horiguchi, Yuki Takashima, Paola García, Shinji Watanabe, Yohei Kawaguchi:
Multi-Channel End-To-End Neural Diarization with Distributed Microphones. ICASSP 2022: 7332-7336
- [c48] Yuki Takashima, Shota Horiguchi, Shinji Watanabe, Leibny Paola García-Perera, Yohei Kawaguchi:
Updating Only Encoders Prevents Catastrophic Forgetting of End-to-End ASR Models. INTERSPEECH 2022: 2218-2222
- [c47] Hexin Liu, Leibny Paola García-Perera, Andy W. H. Khong, Suzy J. Styles, Sanjeev Khudanpur:
PHO-LID: A Unified Model Incorporating Acoustic-Phonetic and Phonotactic Information for Language Identification. INTERSPEECH 2022: 2233-2237
- [c46] Jesús Villalba, Bengt J. Borgstrom, Saurabh Kataria, Magdalena Rybicka, Carlos D. Castillo, Jaejin Cho, L. Paola García-Perera, Pedro A. Torres-Carrasquillo, Najim Dehak:
Advances in Cross-Lingual and Cross-Source Audio-Visual Speaker Recognition: The JHU-MIT System for NIST SRE21. Odyssey 2022: 213-220
- [c45] Hexin Liu, Leibny Paola García-Perera, Andy W. H. Khong, Justin Dauwels, Suzy J. Styles, Sanjeev Khudanpur:
Enhancing Language Identification Using Dual-Mode Model with Knowledge Distillation. Odyssey 2022: 248-254
- [c44] Shota Horiguchi, Yuki Takashima, Shinji Watanabe, Paola García:
Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization. SLT 2022: 620-625
- [c43] Yen Meng, Hsuan-Jui Chen, Jiatong Shi, Shinji Watanabe, Paola García, Hung-yi Lee, Hao Tang:
On Compressing Sequences for Self-Supervised Speech Models. SLT 2022: 1128-1135
- [i28] Hexin Liu, Leibny Paola García-Perera, Andy W. H. Khong, Justin Dauwels, Suzy J. Styles, Sanjeev Khudanpur:
Enhance Language Identification using Dual-mode Model with Knowledge Distillation. CoRR abs/2203.03218 (2022)
- [i27] Shota Horiguchi, Shinji Watanabe, Paola García, Yuki Takashima, Yohei Kawaguchi:
Online Neural Diarization of Unlimited Numbers of Speakers. CoRR abs/2206.02432 (2022)
- [i26] Xiangyu Zhang, Zhanhong He, Shuyue Stella Li, Roberto Togneri, Leibny Paola García-Perera:
Investigating self-supervised learning for lyrics recognition. CoRR abs/2209.12702 (2022)
- [i25] Shuyue Stella Li, Xiangyu Zhang, Shu Zhou, Hongchao Shu, Ruixing Liang, Hexin Liu, Leibny Paola García-Perera:
PQLM - Multilingual Decentralized Portable Quantum Language Model for Privacy Protection. CoRR abs/2210.03221 (2022)
- [i24] Shota Horiguchi, Yuki Takashima, Shinji Watanabe, Paola García:
Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization. CoRR abs/2210.03459 (2022)
- [i23] Yen Meng, Hsuan-Jui Chen, Jiatong Shi, Shinji Watanabe, Paola García, Hung-yi Lee, Hao Tang:
On Compressing Sequences for Self-Supervised Speech Models. CoRR abs/2210.07189 (2022)
- [i22] Hexin Liu, Haihua Xu, Leibny Paola García, Andy W. H. Khong, Yi He, Sanjeev Khudanpur:
Reducing Language confusion for Code-switching Speech Recognition with Token-level Language Diarization. CoRR abs/2210.14567 (2022)
- [i21] Zili Huang, Desh Raj, Paola García, Sanjeev Khudanpur:
Adapting self-supervised models to multi-talker speech recognition using speaker embeddings. CoRR abs/2211.00482 (2022)
- [i20] Jiatong Shi, Chan-Jan Hsu, Ho-Lam Chung, Dongji Gao, Paola García, Shinji Watanabe, Ann Lee, Hung-yi Lee:
Bridging Speech and Textual Pre-trained Models with Unsupervised ASR. CoRR abs/2211.03025 (2022)
- [i19] Dongji Gao, Jiatong Shi, Shun-Po Chuang, Leibny Paola García, Hung-yi Lee, Shinji Watanabe, Sanjeev Khudanpur:
EURO: ESPnet Unsupervised ASR Open-source Toolkit. CoRR abs/2211.17196 (2022)
- 2021
- [c42] Shota Horiguchi, Shinji Watanabe, Paola García, Yawen Xue, Yuki Takashima, Yohei Kawaguchi:
Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors. ASRU 2021: 98-105
- [c41] Carlos Rodrigo Castillo-Sanchez, Leibny Paola García-Perera:
The CLIR-CLSP System for the IberSPEECH-RTVE 2020 Speaker Diarization and Identity Assignment Challenge. IberSPEECH 2021
- [c40] Shota Horiguchi, Paola García, Yusuke Fujita, Shinji Watanabe, Kenji Nagamatsu:
End-To-End Speaker Diarization as Post-Processing. ICASSP 2021: 7188-7192
- [c39] Hexin Liu, Leibny Paola García-Perera, Xinyi Zhang, Justin Dauwels, Andy W. H. Khong, Sanjeev Khudanpur, Suzy J. Styles:
End-to-End Language Diarization for Bilingual Code-Switching Speech. Interspeech 2021: 1489-1493
- [c38] Matthew Wiesner, Mousmita Sarma, Ashish Arora, Desh Raj, Dongji Gao, Ruizhe Huang, Supreet Preet, Moris Johnson, Zikra Iqbal, Nagendra Goel, Jan Trmal, Leibny Paola García-Perera, Sanjeev Khudanpur:
Training Hybrid Models on Noisy Transliterated Transcripts for Code-Switched Speech Recognition. Interspeech 2021: 2906-2910
- [c37] Yuki Takashima, Yusuke Fujita, Shota Horiguchi, Shinji Watanabe, Leibny Paola García-Perera, Kenji Nagamatsu:
Semi-Supervised Training with Pseudo-Labeling for End-To-End Neural Diarization. Interspeech 2021: 3096-3100
- [c36] Yawen Xue, Shota Horiguchi, Yusuke Fujita, Yuki Takashima, Shinji Watanabe, Leibny Paola García-Perera, Kenji Nagamatsu:
Online Streaming End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers. Interspeech 2021: 3116-3120
- [c35] Yawen Xue, Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Paola García, Kenji Nagamatsu:
Online End-To-End Neural Diarization with Speaker-Tracing Buffer. SLT 2021: 841-848
- [c34] Yuki Takashima, Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Paola García, Kenji Nagamatsu:
End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection. SLT 2021: 849-856
- [c33] Desh Raj, Leibny Paola García-Perera, Zili Huang, Shinji Watanabe, Daniel Povey, Andreas Stolcke, Sanjeev Khudanpur:
DOVER-Lap: A Method for Combining Overlap-Aware Diarization Outputs. SLT 2021: 881-888
- [e1] Joseph Turian, Björn W. Schuller, Dorien Herremans, Katrin Kirchhoff, L. Paola García-Perera, Philippe Esling:
HEAR: Holistic Evaluation of Audio Representations, Virtual Event, December 13-14, 2021. Proceedings of Machine Learning Research 166, PMLR 2021
- [i18] Yawen Xue, Shota Horiguchi, Yusuke Fujita, Yuki Takashima, Shinji Watanabe, Paola García, Kenji Nagamatsu:
Online End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers. CoRR abs/2101.08473 (2021)
- [i17] Shota Horiguchi, Nelson Yalta, Paola García, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, Sanjeev Khudanpur:
The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap. CoRR abs/2102.01363 (2021)
- [i16] Yuki Takashima, Yusuke Fujita, Shinji Watanabe, Shota Horiguchi, Paola García, Kenji Nagamatsu:
End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection. CoRR abs/2106.04078 (2021)
- [i15] Yuki Takashima, Yusuke Fujita, Shota Horiguchi, Shinji Watanabe, Paola García, Kenji Nagamatsu:
Semi-Supervised Training with Pseudo-Labeling for End-to-End Neural Diarization. CoRR abs/2106.04764 (2021)
- [i14] Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Paola García:
Encoder-Decoder Based Attractor Calculation for End-to-End Neural Diarization. CoRR abs/2106.10654 (2021)
- [i13] Shota Horiguchi, Shinji Watanabe, Paola García, Yawen Xue, Yuki Takashima, Yohei Kawaguchi:
Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors. CoRR abs/2107.01545 (2021)
- [i12] Shota Horiguchi, Yuki Takashima, Paola García, Shinji Watanabe, Yohei Kawaguchi:
Multi-Channel End-to-End Neural Diarization with Distributed Microphones. CoRR abs/2110.04694 (2021)
- 2020
- [j2] Jesús Villalba, Nanxin Chen, David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Jonas Borgstrom, Leibny Paola García-Perera, Fred Richardson, Réda Dehak, Pedro A. Torres-Carrasquillo, Najim Dehak:
State-of-the-art speaker recognition with neural network embeddings in NIST SRE18 and Speakers in the Wild evaluations. Comput. Speech Lang. 60 (2020)
- [c32] Zili Huang, Shinji Watanabe, Yusuke Fujita, Paola García, Yiwen Shao, Daniel Povey, Sanjeev Khudanpur:
Speaker Diarization with Region Proposal Network. ICASSP 2020: 6514-6518
- [c31] Latané Bullock, Hervé Bredin, Leibny Paola García-Perera:
Overlap-Aware Diarization: Resegmentation Using Neural End-to-End Overlapped Speech Detection. ICASSP 2020: 7114-7118
- [c30] Saurabh Kataria, Phani Sankar Nidadavolu, Jesús Villalba, Nanxin Chen, L. Paola García-Perera, Najim Dehak:
Feature Enhancement with Deep Feature Losses for Speaker Verification. ICASSP 2020: 7584-7588
- [c29] Phani Sankar Nidadavolu, Saurabh Kataria, Jesús Villalba, L. Paola García-Perera, Najim Dehak:
Unsupervised Feature Enhancement for Speaker Verification. ICASSP 2020: 7599-7603
- [c28] Marvin Lavechin, Marie-Philippe Gill, Ruben Bousbib, Hervé Bredin, Leibny Paola García-Perera:
End-to-End Domain-Adversarial Voice Activity Detection. INTERSPEECH 2020: 3685-3689
- [c27] Jesús Antonio Villalba López, Daniel Garcia-Romero, Nanxin Chen, Gregory Sell, Jonas Borgstrom, Alan McCree, Leibny Paola García-Perera, Saurabh Kataria, Phani Sankar Nidadavolu, Pedro Torres-Carrasquillo, Najim Dehak:
Advances in Speaker Recognition for Telephone and Audio-Visual Data: the JHU-MIT Submission for NIST SRE19. Odyssey 2020: 273-280
- [c26] Leibny Paola García-Perera, Jesús Villalba, Hervé Bredin, Jun Du, Diego Castán, Alejandrina Cristià, Latané Bullock, Ling Guo, Koji Okabe, Phani Sankar Nidadavolu, Saurabh Kataria, Sizhu Chen, Léo Galmant, Marvin Lavechin, Lei Sun, Marie-Philippe Gill, Bar Ben-Yair, Sajjad Abdoli, Xin Wang, Wassim Bouaziz, Hadrien Titeux, Emmanuel Dupoux, Kong Aik Lee, Najim Dehak:
Speaker Detection in the Wild: Lessons Learned from JSALT 2019. Odyssey 2020: 415-422
- [i11] Zili Huang, Shinji Watanabe, Yusuke Fujita, Paola García, Yiwen Shao, Daniel Povey, Sanjeev Khudanpur:
Speaker Diarization with Region Proposal Network. CoRR abs/2002.06220 (2020)
- [i10] Phani Sankar Nidadavolu, Saurabh Kataria, L. Paola García-Perera, Jesús Villalba, Najim Dehak:
Single Channel Far Field Feature Enhancement For Speaker Verification In The Wild. CoRR abs/2005.08331 (2020)
- [i9] Ashish Arora, Desh Raj, Aswin Shanmugam Subramanian, Ke Li, Bar Ben-Yair, Matthew Maciejewski, Piotr Zelasko, Paola García, Shinji Watanabe, Sanjeev Khudanpur:
The JHU Multi-Microphone Multi-Speaker ASR System for the CHiME-6 Challenge. CoRR abs/2006.07898 (2020)
- [i8] Carlos Rodrigo Castillo-Sanchez, Leibny Paola García-Perera, Anabel Martín-González:
DNN Speaker Tracking with Embeddings. CoRR abs/2007.10248 (2020)
- [i7] Desh Raj, Leibny Paola García-Perera, Zili Huang, Shinji Watanabe, Daniel Povey, Andreas Stolcke, Sanjeev Khudanpur:
DOVER-Lap: A Method for Combining Overlap-aware Diarization Outputs. CoRR abs/2011.01997 (2020)
- [i6] Shota Horiguchi, Paola García, Yusuke Fujita, Shinji Watanabe, Kenji Nagamatsu:
End-to-End Speaker Diarization as Post-Processing. CoRR abs/2012.10055 (2020)
2010 – 2019
- 2019
- [c25] Chun-Chieh Chang, Ashish Arora, Leibny Paola García-Perera, David Etter, Daniel Povey, Sanjeev Khudanpur:
Optical Character Recognition with Chinese and Korean Character Decomposition. WML@ICDAR 2019: 134-139
- [c24] Ashish Arora, Paola García, Shinji Watanabe, Vimal Manohar, Yiwen Shao, Sanjeev Khudanpur, Chun-Chieh Chang, Babak Rekabdar, Bagher BabaAli, Daniel Povey, David Etter, Desh Raj, Hossein Hadian, Jan Trmal:
Using ASR Methods for OCR. ICDAR 2019: 663-668
- [c23] Fei Wu, Leibny Paola García-Perera, Daniel Povey, Sanjeev Khudanpur:
Advances in Automatic Speech Recognition for Child Speech Using Factored Time Delay Neural Network. INTERSPEECH 2019: 1-5
- [c22] Jiamin Xie, Leibny Paola García-Perera, Daniel Povey, Sanjeev Khudanpur:
Multi-PLDA Diarization on Children's Speech. INTERSPEECH 2019: 376-380
- [c21] Jesús Villalba, Nanxin Chen, David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Jonas Borgstrom, Fred Richardson, Suwon Shon, François Grondin, Réda Dehak, Leibny Paola García-Perera, Daniel Povey, Pedro A. Torres-Carrasquillo, Sanjeev Khudanpur, Najim Dehak:
State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18. INTERSPEECH 2019: 1488-1492
- [c20] Matthew Maciejewski, Gregory Sell, Yusuke Fujita, Leibny Paola García-Perera, Shinji Watanabe, Sanjeev Khudanpur:
Analysis of Robustness of Deep Single-Channel Speech Separation Using Corpora Constructed From Multiple Domains. WASPAA 2019: 165-169
- [i5] Latané Bullock, Hervé Bredin, Leibny Paola García-Perera:
Overlap-aware diarization: resegmentation using neural end-to-end overlapped speech detection. CoRR abs/1910.11646 (2019)
- [i4] Saurabh Kataria, Phani Sankar Nidadavolu, Jesús Villalba, Nanxin Chen, Paola García, Najim Dehak:
Feature Enhancement with Deep Feature Losses for Speaker Verification. CoRR abs/1910.11905 (2019)
- [i3] Phani Sankar Nidadavolu, Saurabh Kataria, Jesús Villalba, L. Paola García-Perera, Najim Dehak:
Unsupervised Feature Enhancement for speaker verification. CoRR abs/1910.11915 (2019)
- [i2] Paola García, Jesús Villalba, Hervé Bredin, Jun Du, Diego Castán, Alejandrina Cristià, Latané Bullock, Ling Guo, Koji Okabe, Phani Sankar Nidadavolu, Saurabh Kataria, Sizhu Chen, Léo Galmant, Marvin Lavechin, Lei Sun, Marie-Philippe Gill, Bar Ben-Yair, Sajjad Abdoli, Xin Wang, Wassim Bouaziz, Hadrien Titeux, Emmanuel Dupoux, Kong Aik Lee, Najim Dehak:
Speaker detection in the wild: Lessons learned from JSALT 2019. CoRR abs/1912.00938 (2019)
- 2018
- [c19] Zili Huang, L. Paola García-Perera, Jesús Villalba, Daniel Povey, Najim Dehak:
JHU Diarization System Description. IberSPEECH 2018: 236-239
- [i1] Matthew Maciejewski, Gregory Sell, Leibny Paola García-Perera, Shinji Watanabe, Sanjeev Khudanpur:
Building Corpora for Single-Channel Speech Separation Across Multiple Domains. CoRR abs/1811.02641 (2018)
- 2017
- [c18] Jesús Jorrín, Paola García, Luis Buera:
DNN Bottleneck Features for Speaker Clustering. INTERSPEECH 2017: 1024-1028
- [c17] Oldrich Plchot, Pavel Matejka, Anna Silnova, Ondrej Novotný, Mireia Díez Sánchez, Johan Rohdin, Ondrej Glembek, Niko Brümmer, Albert Swart, Jesús Jorrín-Prieto, Paola García, Luis Buera, Patrick Kenny, Md. Jahangir Alam, Gautam Bhattacharya:
Analysis and Description of ABC Submission to NIST SRE 2016. INTERSPEECH 2017: 1348-1352
- 2016
- [c16] Jesús Jorrín-Prieto, Carlos Vaquero, Paola García:
Analysis of the Impact of the Audio Database Characteristics in the Accuracy of a Speaker Clustering System. Odyssey 2016: 393-399
- 2015
- [c15] Paola García, Eduardo Lleida, Diego Castán, José Manuel Marcos, David Romero:
Context-Aware Communicator for All. HCI (7) 2015: 426-437
- 2013
- [c14] Leibny Paola García-Perera, Bhiksha Raj, Juan Arturo Nolazco-Flores:
Optimization of the DET curve in speaker verification under noisy conditions. ICASSP 2013: 7765-7769
- [c13] Leibny Paola García-Perera, Bhiksha Raj, Juan Arturo Nolazco-Flores:
Ensemble approach in speaker verification. INTERSPEECH 2013: 2455-2459
- 2012
- [c12] L. Paola García-Perera, Juan Arturo Nolazco-Flores, Bhiksha Raj, Richard M. Stern:
Optimization of the DET curve in speaker verification. SLT 2012: 318-323
- 2011
- [j1] L. Paola García-Perera, Roberto Aceves-Lopez, Juan Arturo Nolazco-Flores:
Speaker Verification in Different Database Scenarios. Computación y Sistemas 15(1) (2011)
- 2010
- [c11] Sébastien Marcel, Chris McCool, Pavel Matejka, Timo Ahonen, Jan Cernocký, Shayok Chakraborty, Vineeth Nallure Balasubramanian, Sethuraman Panchanathan, Chi-Ho Chan, Josef Kittler, Norman Poh, Benoit G. B. Fauve, Ondrej Glembek, Oldrich Plchot, Zdenek Jancik, Anthony Larcher, Christophe Lévy, Driss Matrouf, Jean-François Bonastre, Ping-Han Lee, Jui-Yu Hung, Si-Wei Wu, Yi-Ping Hung, Lukás Machlica, John S. D. Mason, Sandra Mau, Conrad Sanderson, David Monzo, Antonio Albiol, Hieu V. Nguyen, Li Bai, Yan Wang, Matti Niskanen, Markus Turtinen, Juan Arturo Nolazco-Flores, L. Paola García-Perera, Roberto Aceves-Lopez, Mauricio Villegas, Roberto Paredes:
On the Results of the First Mobile Biometry (MOBIO) Face and Speaker Verification Evaluation. ICPR Contests 2010: 210-225
- [c10] Juan Arturo Nolazco-Flores, Roberto A. Aceves L., L. Paola García-Perera:
Speech Magnitude-Spectrum Information-Entropy (MSIE) for Automatic Speech Recognition in Noisy Environments. ICPR 2010: 4364-4367
2000 – 2009
- 2008
- [c9] Juan Arturo Nolazco-Flores, L. Paola García-Perera:
Enhancing acoustic models for robust speaker verification. ICASSP 2008: 4837-4840
- 2007
- [c8] Igmar Hernández, Paola García, Juan Arturo Nolazco, Luis Buera, Eduardo Lleida:
Robust Automatic Speech Recognition Using PD-MEEMLIN. IbPRIA (2) 2007: 1-8
- 2006
- [c7] Juan Arturo Nolazco-Flores, J. Carlos Mex-Perera, L. Paola García-Perera, Brenda Sanchez-Torres:
Using PCA to Improve the Generation of Speech Keys. MICAI 2006: 1085-1094
- 2005
- [c6] L. Paola García-Perera, J. Carlos Mex-Perera, Juan Arturo Nolazco-Flores:
Multi-speaker voice cryptographic key generation. AICCSA 2005: 93
- [c5] L. Paola García-Perera, Juan Arturo Nolazco-Flores, J. Carlos Mex-Perera:
Phoneme Spotting for Speech-Based Crypto-key Generation. CIARP 2005: 770-777
- [c4] L. Paola García-Perera, Juan Arturo Nolazco-Flores, J. Carlos Mex-Perera:
Cryptographic-Speech-Key Generation Architecture Improvements. IbPRIA (2) 2005: 579-585
- [c3] L. Paola García-Perera, Juan Arturo Nolazco-Flores, J. Carlos Mex-Perera:
Parameter Optimization in a Text-Dependent Cryptographic-Speech-Key Generation Task. NOLISP 2005: 92-99
- 2004
- [c2] L. Paola García-Perera, J. Carlos Mex-Perera, Juan Arturo Nolazco-Flores:
SVM Applied to the Generation of Biometric Speech Key. CIARP 2004: 637-644
- [c1] L. Paola García-Perera, J. Carlos Mex-Perera, Juan Arturo Nolazco-Flores:
Cryptographic-Speech-Key Generation Using the SVM Technique over the lp-Cepstral Speech Space. Summer School on Neural Networks 2004: 370-374
last updated on 2025-01-24 17:17 CET by the dblp team
all metadata released as open data under CC0 1.0 license