Zhehuai Chen

2020 – today

2024
[c32]
  - https://dblp.org/rec/conf/acl/Hu0Y0ZCC24
Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, Eng Siong Chng:
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators. ACL (1) 2024: 74-90
[c31]
  - https://dblp.org/rec/conf/icassp/XuCJG24
Hainan Xu, Zhehuai Chen, Fei Jia, Boris Ginsburg:
Transducers with Pronunciation-Aware Embeddings for Automatic Speech Recognition. ICASSP 2024: 12026-12030
[c30]
  - https://dblp.org/rec/conf/icassp/ChenHAHPLGBG24
Zhehuai Chen, He Huang, Andrei Andrusenko, Oleksii Hrinchuk, Krishna C. Puvvada, Jason Li, Subhankar Ghosh, Jagadeesh Balam, Boris Ginsburg:
SALM: Speech-Augmented Language Model with in-Context Learning for Speech Recognition and Translation. ICASSP 2024: 13521-13525
[i32]
  - https://dblp.org/rec/journals/corr/abs-2401-04235
Christopher Li, Gary Wang, Kyle Kastner, Heng Su, Allen Chen, Andrew Rosenberg, Zhehuai Chen, Zelin Wu, Leonid Velikovich, Pat Rondon, Diamantino Caseiro, Petar S. Aleksic:
High-precision Voice Search Query Correction via Retrievable Speech-text Embedings. CoRR abs/2401.04235 (2024)
[i31]
  - https://dblp.org/rec/journals/corr/abs-2402-06894
Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, Eng Siong Chng:
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators. CoRR abs/2402.06894 (2024)
[i30]
  - https://dblp.org/rec/journals/corr/abs-2404-04295
Hainan Xu, Zhehuai Chen, Fei Jia, Boris Ginsburg:
Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition. CoRR abs/2404.04295 (2024)
[i29]
  - https://dblp.org/rec/journals/corr/abs-2406-12946
Vahid Noroozi, Zhehuai Chen, Somshubra Majumdar, Steve Huang, Jagadeesh Balam, Boris Ginsburg:
Instruction Data Generation and Unsupervised Adaptation for Speech Language Models. CoRR abs/2406.12946 (2024)
[i28]
  - https://dblp.org/rec/journals/corr/abs-2406-18871
Ke-Han Lu, Zhehuai Chen, Szu-Wei Fu, He Huang, Boris Ginsburg, Yu-Chiang Frank Wang, Hung-yi Lee:
DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment. CoRR abs/2406.18871 (2024)
[i27]
  - https://dblp.org/rec/journals/corr/abs-2406-19674
Krishna C. Puvvada, Piotr Zelasko, He Huang, Oleksii Hrinchuk, Nithin Rao Koluguri, Kunal Dhawan, Somshubra Majumdar, Elena Rastorgueva, Zhehuai Chen, Vitaly Lavrukhin, Jagadeesh Balam, Boris Ginsburg:
Less is More: Accurate Speech Recognition & Translation without Web-Scale Data. CoRR abs/2406.19674 (2024)
[i26]
  - https://dblp.org/rec/journals/corr/abs-2406-19954
Zhehuai Chen, He Huang, Oleksii Hrinchuk, Krishna C. Puvvada, Nithin Rao Koluguri, Piotr Zelasko, Jagadeesh Balam, Boris Ginsburg:
BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5. CoRR abs/2406.19954 (2024)
[i25]
  - https://dblp.org/rec/journals/corr/abs-2409-09785
Chao-Han Huck Yang, Taejin Park, Yuan Gong, Yuanchao Li, Zhehuai Chen, Yen-Ting Lin, Chen Chen, Yuchen Hu, Kunal Dhawan, Piotr Zelasko, Chao Zhang, Yun-Nung Chen, Yu Tsao, Jagadeesh Balam, Boris Ginsburg, Sabato Marco Siniscalchi, Eng Siong Chng, Peter Bell, Catherine Lai, Shinji Watanabe, Andreas Stolcke:
Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition. CoRR abs/2409.09785 (2024)
[i24]
  - https://dblp.org/rec/journals/corr/abs-2409-11538
Ke Hu, Zhehuai Chen, Chao-Han Huck Yang, Piotr Zelasko, Oleksii Hrinchuk, Vitaly Lavrukhin, Jagadeesh Balam, Boris Ginsburg:
Chain-of-Thought Prompting for Speech Translation. CoRR abs/2409.11538 (2024)
[i23]
  - https://dblp.org/rec/journals/corr/abs-2409-13523
Piotr Zelasko, Zhehuai Chen, Mengru Wang, Daniel Galvez, Oleksii Hrinchuk, Shuoyang Ding, Ke Hu, Jagadeesh Balam, Vitaly Lavrukhin, Boris Ginsburg:
EMMeTT: Efficient Multimodal Machine Translation Training. CoRR abs/2409.13523 (2024)
[i22]
  - https://dblp.org/rec/journals/corr/abs-2409-20007
Ke-Han Lu, Zhehuai Chen, Szu-Wei Fu, Chao-Han Huck Yang, Jagadeesh Balam, Boris Ginsburg, Yu-Chiang Frank Wang, Hung-yi Lee:
Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data. CoRR abs/2409.20007 (2024)
[i21]
  - https://dblp.org/rec/journals/corr/abs-2410-17485
Yifan Peng, Krishna C. Puvvada, Zhehuai Chen, Piotr Zelasko, He Huang, Kunal Dhawan, Ke Hu, Shinji Watanabe, Jagadeesh Balam, Boris Ginsburg:
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning. CoRR abs/2410.17485 (2024)
[i20]
  - https://dblp.org/rec/journals/corr/abs-2410-22499
Siqi Ouyang, Oleksii Hrinchuk, Zhehuai Chen, Vitaly Lavrukhin, Jagadeesh Balam, Lei Li, Boris Ginsburg:
Anticipating Future with Large Language Model for Simultaneous Machine Translation. CoRR abs/2410.22499 (2024)
2023
[c29]
  - https://dblp.org/rec/conf/icassp/SaekiZCMWZBRR23
Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran:
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-to-Speech. ICASSP 2023: 1-5
[c28]
  - https://dblp.org/rec/conf/icassp/WangCZZHH23
Yongqiang Wang, Zhehuai Chen, Chengjian Zheng, Yu Zhang, Wei Han, Parisa Haghani:
Accelerating RNN-T Training and Inference Using CTC Guidance. ICASSP 2023: 1-5
[c27]
  - https://dblp.org/rec/conf/icassp/WangKBCRRZ23
Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang:
Understanding Shared Speech-Text Representations. ICASSP 2023: 1-5
[c26]
  - https://dblp.org/rec/conf/interspeech/BlauAMWRCGBHR23
Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran:
Using Text Injection to Improve Recognition of Personal Identifiers in Speech. INTERSPEECH 2023: 191-195
[i19]
  - https://dblp.org/rec/journals/corr/abs-2303-01037
Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara N. Sainath, Pedro J. Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu:
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages. CoRR abs/2303.01037 (2023)
[i18]
  - https://dblp.org/rec/journals/corr/abs-2304-14514
Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang:
Understanding Shared Speech-Text Representations. CoRR abs/2304.14514 (2023)
[i17]
  - https://dblp.org/rec/journals/corr/abs-2308-07393
Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran:
Using Text Injection to Improve Recognition of Personal Identifiers in Speech. CoRR abs/2308.07393 (2023)
[i16]
  - https://dblp.org/rec/journals/corr/abs-2310-09424
Zhehuai Chen, He Huang, Andrei Andrusenko, Oleksii Hrinchuk, Krishna C. Puvvada, Jason Li, Subhankar Ghosh, Jagadeesh Balam, Boris Ginsburg:
SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation. CoRR abs/2310.09424 (2023)
2022
[c25]
  - https://dblp.org/rec/conf/icassp/ChenZRRMW22
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Gary Wang:
Tts4pretrain 2.0: Advancing the use of Text and Speech in ASR Pretraining with Consistency and Contrastive Losses. ICASSP 2022: 7677-7681
[c24]
  - https://dblp.org/rec/conf/interspeech/LuWZHCH22
Zhiyun Lu, Yongqiang Wang, Yu Zhang, Wei Han, Zhehuai Chen, Parisa Haghani:
Unsupervised Data Selection via Discrete Speech Representation for ASR. INTERSPEECH 2022: 3393-3397
[c23]
  - https://dblp.org/rec/conf/interspeech/ChenZRRMBZ22
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Ankur Bapna, Heiga Zen:
MAESTRO: Matched Speech Text Representations through Modality Matching. INTERSPEECH 2022: 4093-4097
[c22]
  - https://dblp.org/rec/conf/slt/SainathPBZHCLWS22
Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman:
JOIST: A Joint Speech and Text Streaming Model for ASR. SLT 2022: 52-59
[c21]
  - https://dblp.org/rec/conf/slt/ChenBRZRMC22
Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging Joint Speech-Text Representation Learning for Zero Supervised Speech ASR. SLT 2022: 68-75
[i15]
  - https://dblp.org/rec/journals/corr/abs-2204-03409
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Ankur Bapna, Heiga Zen:
MAESTRO: Matched Speech Text Representations through Modality Matching. CoRR abs/2204.03409 (2022)
[i14]
  - https://dblp.org/rec/journals/corr/abs-2205-08014
Alëna Aksënova, Zhehuai Chen, Chung-Cheng Chiu, Daan van Esch, Pavel Golik, Wei Han, Levi King, Bhuvana Ramabhadran, Andrew Rosenberg, Suzan Schwartz, Gary Wang:
Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data. CoRR abs/2205.08014 (2022)
[i13]
  - https://dblp.org/rec/journals/corr/abs-2210-07353
Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman:
JOIST: A Joint Speech and Text Streaming Model For ASR. CoRR abs/2210.07353 (2022)
[i12]
  - https://dblp.org/rec/journals/corr/abs-2210-10027
Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR. CoRR abs/2210.10027 (2022)
[i11]
  - https://dblp.org/rec/journals/corr/abs-2210-15447
Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran:
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech. CoRR abs/2210.15447 (2022)
[i10]
  - https://dblp.org/rec/journals/corr/abs-2210-16481
Yongqiang Wang, Zhehuai Chen, Chengjian Zheng, Yu Zhang, Wei Han, Parisa Haghani:
Accelerating RNN-T Training and Inference Using CTC guidance. CoRR abs/2210.16481 (2022)
2021
[c20]
  - https://dblp.org/rec/conf/asru/ChenZRRWM21
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro J. Moreno:
Injecting Text in Self-Supervised Speech Pretraining. ASRU 2021: 251-258
[c19]
  - https://dblp.org/rec/conf/icassp/0001CXP0K21
Hang Lv, Zhehuai Chen, Hainan Xu, Daniel Povey, Lei Xie, Sanjeev Khudanpur:
An Asynchronous WFST-Based Decoder for Automatic Speech Recognition. ICASSP 2021: 6019-6023
[c18]
  - https://dblp.org/rec/conf/interspeech/ChenRZZGHEWRM21
Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Heiga Zen, Mohammadreza Ghodsi, Yinghui Huang, Jesse Emond, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno:
Semi-Supervision in ASR: Sequential MixMatch and Factorized TTS-Based Augmentation. Interspeech 2021: 736-740
[c17]
  - https://dblp.org/rec/conf/interspeech/ChenRBZCJCDM21
Zhehuai Chen, Bhuvana Ramabhadran, Fadi Biadsy, Xia Zhang, Youzheng Chen, Liyang Jiang, Fang Chu, Rohan Doshi, Pedro J. Moreno:
Conformer Parrotron: A Faster and Stronger End-to-End Speech Conversion and Recognition Model for Atypical Speech. Interspeech 2021: 4828-4832
[i9]
  - https://dblp.org/rec/journals/corr/abs-2103-09063
Hang Lv, Zhehuai Chen, Hainan Xu, Daniel Povey, Lei Xie, Sanjeev Khudanpur:
An Asynchronous WFST-Based Decoder For Automatic Speech Recognition. CoRR abs/2103.09063 (2021)
[i8]
  - https://dblp.org/rec/journals/corr/abs-2108-12226
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro J. Moreno:
Injecting Text in Self-Supervised Speech Pretraining. CoRR abs/2108.12226 (2021)
2020
[j4]
  - https://dblp.org/rec/journals/taslp/LiuCLHLY20
Qi Liu, Zhehuai Chen, Hao Li, Mingkun Huang, Yizhou Lu, Kai Yu:
Modular End-to-End Automatic Speech Recognition Framework for Acoustic-to-Word Model. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2174-2183 (2020)
[c16]
  - https://dblp.org/rec/conf/icassp/WangRCZRWM20
Gary Wang, Andrew Rosenberg, Zhehuai Chen, Yu Zhang, Bhuvana Ramabhadran, Yonghui Wu, Pedro J. Moreno:
Improving Speech Recognition Using Consistent Predictions on Synthesized Speech. ICASSP 2020: 7029-7033
[c15]
  - https://dblp.org/rec/conf/interspeech/ChenR0WRM20
Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno:
Improving Speech Recognition Using GAN-Based Speech Synthesis and Contrastive Unspoken Text Selection. INTERSPEECH 2020: 556-560
[c14]
  - https://dblp.org/rec/conf/interspeech/WangRCZRM20
Gary Wang, Andrew Rosenberg, Zhehuai Chen, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno:
SCADA: Stochastic, Consistent and Adversarial Data Augmentation to Improve ASR. INTERSPEECH 2020: 2832-2836
[i7]
  - https://dblp.org/rec/journals/corr/abs-2008-00953
Qi Liu, Zhehuai Chen, Hao Li, Mingkun Huang, Yizhou Lu, Kai Yu:
Modular End-to-end Automatic Speech Recognition Framework for Acoustic-to-word Model. CoRR abs/2008.00953 (2020)

2010 – 2019

2019
[c13]
  - https://dblp.org/rec/conf/asru/ChenYXLXPK19
Zhehuai Chen, Mahsa Yarmohammadi, Hainan Xu, Hang Lv, Lei Xie, Daniel Povey, Sanjeev Khudanpur:
Incremental Lattice Determinization for WFST Decoders. ASRU 2019: 1-7
[c12]
  - https://dblp.org/rec/conf/icassp/ChenJWSF19
Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael L. Seltzer, Christian Fuegen:
End-to-end Contextual Speech Recognition Using Class Language Models and a Token Passing Decoder. ICASSP 2019: 6186-6190
[c11]
  - https://dblp.org/rec/conf/interspeech/ChenJWSF19
Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael L. Seltzer, Christian Fuegen:
Joint Grapheme and Phoneme Embeddings for Contextual End-to-End ASR. INTERSPEECH 2019: 3490-3494
2018
[j3]
  - https://dblp.org/rec/journals/speech/ChenQ018
Zhehuai Chen, Yanmin Qian, Kai Yu:
Sequence discriminative training for deep learning based acoustic keyword spotting. Speech Commun. 102: 100-111 (2018)
[j2]
  - https://dblp.org/rec/journals/taslp/ChenDLX18
Zhehuai Chen, Jasha Droppo, Jinyu Li, Wayne Xiong:
Progressive Joint Modeling in Unsupervised Single-Channel Overlapped Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 26(1): 184-196 (2018)
[c10]
  - https://dblp.org/rec/conf/icassp/ChenLLY18
Zhehuai Chen, Qi Liu, Hao Li, Kai Yu:
On Modular Training of Neural Acoustics-to-Word Model for LVCSR. ICASSP 2018: 4754-4758
[c9]
  - https://dblp.org/rec/conf/icassp/ChenD18
Zhehuai Chen, Jasha Droppo:
Sequence Modeling in Unsupervised Single-Channel Overlapped Speech Recognition. ICASSP 2018: 4809-4813
[c8]
  - https://dblp.org/rec/conf/interspeech/ChenLXWPK18
Zhehuai Chen, Justin Luitjens, Hainan Xu, Yiming Wang, Daniel Povey, Sanjeev Khudanpur:
A GPU-based WFST Decoder with Exact Lattice Generation. INTERSPEECH 2018: 2212-2216
[c7]
  - https://dblp.org/rec/conf/interspeech/HuangYCQ018
Mingkun Huang, Yongbin You, Zhehuai Chen, Yanmin Qian, Kai Yu:
Knowledge Distillation for Sequence Model. INTERSPEECH 2018: 3703-3707
[i6]
  - https://dblp.org/rec/journals/corr/abs-1803-01090
Zhehuai Chen, Qi Liu, Hao Li, Kai Yu:
On Modular Training of Neural Acoustics-to-Word Model for LVCSR. CoRR abs/1803.01090 (2018)
[i5]
  - https://dblp.org/rec/journals/corr/abs-1804-03243
Zhehuai Chen, Justin Luitjens, Hainan Xu, Yiming Wang, Daniel Povey, Sanjeev Khudanpur:
A GPU-based WFST Decoder with Exact Lattice Generation. CoRR abs/1804.03243 (2018)
[i4]
  - https://dblp.org/rec/journals/corr/abs-1808-00639
Zhehuai Chen, Yanmin Qian, Kai Yu:
Sequence Discriminative Training for Deep Learning based Acoustic Keyword Spotting. CoRR abs/1808.00639 (2018)
[i3]
  - https://dblp.org/rec/journals/corr/abs-1808-00687
Zhehuai Chen:
Linguistic Search Optimization for Deep Learning Based LVCSR. CoRR abs/1808.00687 (2018)
[i2]
  - https://dblp.org/rec/journals/corr/abs-1812-02142
Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael L. Seltzer, Christian Fuegen:
End-to-end contextual speech recognition using class language models and a token passing decoder. CoRR abs/1812.02142 (2018)
2017
[j1]
  - https://dblp.org/rec/journals/taslp/ChenZQY17
Zhehuai Chen, Yimeng Zhuang, Yanmin Qian, Kai Yu:
Phone Synchronous Speech Recognition With CTC Lattices. IEEE ACM Trans. Audio Speech Lang. Process. 25(1): 86-97 (2017)
[c6]
  - https://dblp.org/rec/conf/cncl/WuHCQY17
Yue Wu, Tianxing He, Zhehuai Chen, Yanmin Qian, Kai Yu:
Multi-view LSTM Language Model with Word-Synchronized Auxiliary Feature for LVCSR. CCL 2017: 398-410
[c5]
  - https://dblp.org/rec/conf/icassp/ChenZY17
Zhehuai Chen, Yimeng Zhuang, Kai Yu:
Confidence measures for CTC-based phone synchronous decoding. ICASSP 2017: 4850-4854
[c4]
  - https://dblp.org/rec/conf/iscide/ChenQY17
Zhehuai Chen, Yanmin Qian, Kai Yu:
A Unified Confidence Measure Framework Using Auxiliary Normalization Graph. IScIDE 2017: 123-133
[i1]
  - https://dblp.org/rec/journals/corr/ChenDLX17
Zhehuai Chen, Jasha Droppo, Jinyu Li, Wayne Xiong:
Progressive Joint Modeling in Unsupervised Single-channel Overlapped Speech Recognition. CoRR abs/1707.07048 (2017)
2016
[c3]
  - https://dblp.org/rec/conf/interspeech/ChenDXY16
Zhehuai Chen, Wei Deng, Tao Xu, Kai Yu:
Phone Synchronous Decoding with CTC Lattice. INTERSPEECH 2016: 1923-1927
[c2]
  - https://dblp.org/rec/conf/iscslp/ZhengCWY16
Da Zheng, Zhehuai Chen, Yue Wu, Kai Yu:
Directed automatic speech transcription error correction using bidirectional LSTM. ISCSLP 2016: 1-5
2015
[c1]
  - https://dblp.org/rec/conf/interspeech/ChenCXY15
Bo Chen, Zhehuai Chen, Jiachen Xu, Kai Yu:
An investigation of context clustering for statistical speech synthesis with deep neural network. INTERSPEECH 2015: 2212-2216
