Kshitiz Kumar
2020 – today
- 2023
  - [i5] Mohammad Soleymanpour, Mahmoud Al Ismail, Fahimeh Bahmaninezhad, Kshitiz Kumar, Jian Wu: Bilingual Streaming ASR with Grapheme units and Auxiliary Monolingual Loss. CoRR abs/2308.06327 (2023)
- 2022
  - [c31] Daniel Tompkins, Kshitiz Kumar, Jian Wu: Maximizing Audio Event Detection Model Performance on Small Datasets Through Knowledge Transfer, Data Augmentation, and Pretraining: an Ablation Study. ICASSP 2022: 1016-1020
  - [i4] Daniel Tompkins, Kshitiz Kumar, Jian Wu: Maximizing Audio Event Detection Model Performance on Small Datasets Through Knowledge Transfer, Data Augmentation, And Pretraining: An Ablation Study. CoRR abs/2202.03514 (2022)
- 2021
  - [c30] Amit Das, Kshitiz Kumar, Jian Wu: Multi-Dialect Speech Recognition in English Using Attention on Ensemble of Experts. ICASSP 2021: 6244-6248
  - [c29] Amber Afshan, Kshitiz Kumar, Jian Wu: Sequence-Level Confidence Classifier for ASR Utterance Accuracy and Application to Acoustic Models. Interspeech 2021: 4084-4088
  - [i3] Amber Afshan, Kshitiz Kumar, Jian Wu: Sequence-level Confidence Classifier for ASR Utterance Accuracy and Application to Acoustic Models. CoRR abs/2107.00099 (2021)
- 2020
  - [c28] Kshitiz Kumar, Emilian Stoimenov, Hosam Khalil, Jian Wu: Fast and Slow Acoustic Model. INTERSPEECH 2020: 541-545
  - [c27] Kshitiz Kumar, Bo Ren, Yifan Gong, Jian Wu: Bandpass Noise Generation and Augmentation for Unified ASR. INTERSPEECH 2020: 1683-1687
  - [c26] Kshitiz Kumar, Chaojun Liu, Yifan Gong, Jian Wu: 1-D Row-Convolution LSTM: Fast Streaming ASR at Accuracy Parity with LC-BLSTM. INTERSPEECH 2020: 2107-2111
  - [c25] Vikas Joshi, Rui Zhao, Rupesh R. Mehta, Kshitiz Kumar, Jinyu Li: Transfer Learning Approaches for Streaming End-to-End Speech Recognition System. INTERSPEECH 2020: 2152-2156
  - [i2] Vikas Joshi, Rui Zhao, Rupesh R. Mehta, Kshitiz Kumar, Jinyu Li: Transfer Learning Approaches for Streaming End-to-End Speech Recognition System. CoRR abs/2008.05086 (2020)
2010 – 2019
- 2019
  - [c24] Kshitiz Kumar, Tasos Anastasakos, Yifan Gong: Word Characters and Phone Pronunciation Embedding for ASR Confidence Classifier. ICASSP 2019: 2712-2716
  - [c23] Kshitiz Kumar, Yifan Gong: Static and Dynamic State Predictions for Acoustic Model Combination. ICASSP 2019: 2782-2786
  - [i1] Ke Li, Jinyu Li, Yong Zhao, Kshitiz Kumar, Yifan Gong: Speaker Adaptation for End-to-End CTC Models. CoRR abs/1901.01239 (2019)
- 2018
  - [c22] Xinhui Zhou, Chiman Kwan, Bulent Ayhan, Chanwoo Kim, Kshitiz Kumar, Richard M. Stern: A Comparative Study of Spatial Speech Separation Techniques to Improve Speech Recognition. ISNN 2018: 494-502
  - [c21] Ke Li, Jinyu Li, Yong Zhao, Kshitiz Kumar, Yifan Gong: Speaker Adaptation for End-to-End CTC Models. SLT 2018: 542-549
- 2017
  - [c20] Yong Zhao, Jinyu Li, Kshitiz Kumar, Yifan Gong: Extended low-rank plus diagonal adaptation for deep and recurrent neural networks. ICASSP 2017: 5040-5044
  - [p1] Yifan Gong, Yan Huang, Kshitiz Kumar, Jinyu Li, Chaojun Liu, Guoli Ye, Shi-Xiong Zhang, Yong Zhao, Rui Zhao: Challenges in and Solutions to Deep Learning Network Acoustic Modeling in Speech Recognition Products at Microsoft. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 401-417
- 2016
  - [c19] Chaojun Liu, Yongqiang Wang, Kshitiz Kumar, Yifan Gong: Investigations on speaker adaptation of LSTM RNN models for speech recognition. ICASSP 2016: 5020-5024
  - [c18] Kshitiz Kumar, Chaojun Liu, Yifan Gong: Non-negative intermediate-layer DNN adaptation for a 10-KB speaker adaptation profile. ICASSP 2016: 5285-5289
- 2015
  - [c17] Kshitiz Kumar, Ziad Al Bawab, Yong Zhao, Chaojun Liu, Benoît Dumoulin, Yifan Gong: Confidence-features and confidence-scores for ASR applications in arbitration and DNN speaker adaptation. INTERSPEECH 2015: 702-706
  - [c16] Kshitiz Kumar, Chaojun Liu, Kaisheng Yao, Yifan Gong: Intermediate-layer DNN adaptation for offline and session-based iterative speaker adaptation. INTERSPEECH 2015: 1091-1095
  - [c15] Kshitiz Kumar, Chaojun Liu, Yifan Gong: Delta-melspectra features for noise robustness to DNN-based ASR systems. INTERSPEECH 2015: 2445-2448
- 2014
  - [c14] Kshitiz Kumar, Chaojun Liu, Yifan Gong: Normalization of ASR confidence classifier scores via confidence mapping. INTERSPEECH 2014: 1199-1203
- 2013
  - [c13] Po-Sen Huang, Kshitiz Kumar, Chaojun Liu, Yifan Gong, Li Deng: Predicting speech recognition confidence using deep learning with word identity and score features. ICASSP 2013: 7413-7417
- 2011
  - [b1] Kshitiz Kumar: A Spectro-Temporal Framework for Compensation of Reverberation for Speech Recognition. Carnegie Mellon University, USA, 2011
  - [c12] Kshitiz Kumar, Rita Singh, Bhiksha Raj, Richard M. Stern: Gammatone sub-band magnitude-domain dereverberation for ASR. ICASSP 2011: 4604-4607
  - [c11] Kshitiz Kumar, Chanwoo Kim, Richard M. Stern: Delta-spectral cepstral coefficients for robust speech recognition. ICASSP 2011: 4784-4787
  - [c10] Chanwoo Kim, Kshitiz Kumar, Richard M. Stern: Binaural sound source separation motivated by auditory processing. ICASSP 2011: 5072-5075
  - [c9] Kshitiz Kumar, Bhiksha Raj, Rita Singh, Richard M. Stern: An iterative least-squares technique for dereverberation. ICASSP 2011: 5488-5491
- 2010
  - [c8] Kshitiz Kumar, Richard M. Stern: Maximum-likelihood-based cepstral inverse filtering for blind speech dereverberation. ICASSP 2010: 4282-4285
2000 – 2009
- 2009
  - [c7] Chanwoo Kim, Kshitiz Kumar, Richard M. Stern: Robust speech recognition using a Small Power Boosting algorithm. ASRU 2009: 243-248
  - [c6] Kshitiz Kumar, Jirí Navrátil, Etienne Marcheret, Vit Libal, Ganesh N. Ramaswamy, Gerasimos Potamianos: Audio-visual speech synchronization detection using a bimodal linear prediction model. CVPR Workshops 2009: 53-59
  - [c5] Kshitiz Kumar, Jirí Navrátil, Etienne Marcheret, Vit Libal, Gerasimos Potamianos: Robust audio-visual speech synchrony detection by generalized bimodal linear prediction. INTERSPEECH 2009: 2251-2254
  - [c4] Chanwoo Kim, Kshitiz Kumar, Bhiksha Raj, Richard M. Stern: Signal separation for robust speech recognition based on phase difference information obtained in the frequency domain. INTERSPEECH 2009: 2495-2498
- 2008
  - [c3] Kshitiz Kumar, Qi Wu, Yiming Wang, Marios Savvides: Noise robust speaker identification using Bhattacharyya distance in adapted Gaussian models space. EUSIPCO 2008: 1-4
  - [c2] Kshitiz Kumar, Richard M. Stern: Environment-invariant compensation for reverberation using linear post-filtering for minimum distortion. ICASSP 2008: 4121-4124
- 2007
  - [c1] Kshitiz Kumar, Tsuhan Chen, Richard M. Stern: Profile View Lip Reading. ICASSP (4) 2007: 429-432
last updated on 2024-10-21 20:30 CEST by the dblp team
all metadata released as open data under CC0 1.0 license