Xie Chen 0001
Person information
- affiliation: Shanghai Jiao Tong University, China
- affiliation (former): Microsoft, Redmond, WA, USA
- affiliation (former): University of Cambridge, UK
Other persons with the same name
- Xie Chen 0002 — Nanyang Technological University, Singapore
- Xie Chen 0003 — Beijing Institute of Technology, Beijing, China
- Xie Chen 0004 — Hohai University, Nanjing, Jiangsu, China
- Xie Chen 0005 — Syracuse University, Department of Electrical and Computer Engineering, NY, USA
- Xie Chen 0006 — Osaka City University, Faculty of Engineering, Japan
- Xie Chen 0007 — Beijing University of Posts and Telecommunications, State Key Laboratory of Networking and Switching Technology, China
- Xie Chen 0008 — Hangzhou Dianzi University, School of Computer Science and Technology, China
- Xie Chen 0009 — Caltech, Pasadena, CA, USA
2020 – today
- 2024
- [j5] Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian: Advanced Long-Content Speech Recognition With Factorized Neural Transducer. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1803-1815 (2024)
- [c64] Chenpeng Du, Yiwei Guo, Feiyu Shen, Zhijun Liu, Zheng Liang, Xie Chen, Shuai Wang, Hui Zhang, Kai Yu: UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding. AAAI 2024: 17924-17932
- [c63] Ziyang Ma, Zhisheng Zheng, Jiaxin Ye, Jinchao Li, Zhifu Gao, Shiliang Zhang, Xie Chen: emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation. ACL (Findings) 2024: 15747-15760
- [c62] Yifan Yang, Feiyu Shen, Chenpeng Du, Ziyang Ma, Kai Yu, Daniel Povey, Xie Chen: Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS. ICASSP 2024: 10401-10405
- [c61] Yiwei Guo, Chenpeng Du, Ziyang Ma, Xie Chen, Kai Yu: VoiceFlow: Efficient Text-To-Speech with Rectified Flow Matching. ICASSP 2024: 11121-11125
- [c60] Ziyang Ma, Wen Wu, Zhisheng Zheng, Yiwei Guo, Qian Chen, Shiliang Zhang, Xie Chen: Leveraging Speech PTM, Text LLM, And Emotional TTS For Speech Emotion Recognition. ICASSP 2024: 11146-11150
- [c59] Sen Liu, Yiwei Guo, Xie Chen, Kai Yu: StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations. ICASSP 2024: 11521-11525
- [c58] Feiyu Shen, Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu: Acoustic BPE for Speech Generation with Discrete Tokens. ICASSP 2024: 11746-11750
- [c57] Junjie Li, Yiwei Guo, Xie Chen, Kai Yu: SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention. ICASSP 2024: 12296-12300
- [c56] Wen Huang, Anbai Jiang, Bing Han, Xinhu Zheng, Yihong Qiu, Wenxi Chen, Yuzhe Liang, Pingyi Fan, Wei-Qiang Zhang, Cheng Lu, Xie Chen, Jia Liu, Yanmin Qian: Semi-Supervised Acoustic Scene Classification with Test-Time Adaptation. ICME Workshops 2024: 1-5
- [c55] Yuzhe Liang, Wenxi Chen, Anbai Jiang, Yihong Qiu, Xinhu Zheng, Wen Huang, Bing Han, Yanmin Qian, Pingyi Fan, Wei-Qiang Zhang, L. Cheng, Jia Liu, Xie Chen: Improving Acoustic Scene Classification via Self-Supervised and Semi-Supervised Learning with Efficient Audio Transformer. ICME Workshops 2024: 1-6
- [c54] Zhisheng Zheng, Puyuan Peng, Ziyang Ma, Xie Chen, Eunsol Choi, David Harwath: BAT: Learning to Reason about Spatial Sounds with Large Language Models. ICML 2024
- [c53] Mingjie Chen, Hezhao Zhang, Yuanchao Li, Jiachen Luo, Wen Wu, Ziyang Ma, Peter Bell, Catherine Lai, Joshua D. Reiss, Lin Wang, Philip C. Woodland, Xie Chen, Huy Phan, Thomas Hain: 1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem. Odyssey 2024: 260-265
- [i62] Wenxi Chen, Yuzhe Liang, Ziyang Ma, Zhisheng Zheng, Xie Chen: EAT: Self-Supervised Pre-Training with Efficient Audio Transformer. CoRR abs/2401.03497 (2024)
- [i61] Yakun Song, Zhuo Chen, Xiaofei Wang, Ziyang Ma, Xie Chen: ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering. CoRR abs/2401.07333 (2024)
- [i60] Chenpeng Du, Yiwei Guo, Hankun Wang, Yifan Yang, Zhikang Niu, Shuai Wang, Hui Zhang, Xie Chen, Kai Yu: VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech. CoRR abs/2401.14321 (2024)
- [i59] Zhisheng Zheng, Puyuan Peng, Ziyang Ma, Xie Chen, Eunsol Choi, David Harwath: BAT: Learning to Reason about Spatial Sounds with Large Language Models. CoRR abs/2402.01591 (2024)
- [i58] Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, Jiaming Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen: An Embarrassingly Simple Approach for LLM with Strong ASR Capacity. CoRR abs/2402.08846 (2024)
- [i57] Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian: Advanced Long-Content Speech Recognition With Factorized Neural Transducer. CoRR abs/2403.13423 (2024)
- [i56] Yiwei Guo, Chenrun Wang, Yifan Yang, Hankun Wang, Ziyang Ma, Chenpeng Du, Shuai Wang, Hanzheng Li, Shuai Fan, Hui Zhang, Xie Chen, Kai Yu: The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge. CoRR abs/2404.06079 (2024)
- [i55] Sen Liu, Yiwei Guo, Xie Chen, Kai Yu: StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations. CoRR abs/2404.14946 (2024)
- [i54] Zheng Lian, Haiyang Sun, Licai Sun, Zhuofan Wen, Siyuan Zhang, Shun Chen, Hao Gu, Jinming Zhao, Ziyang Ma, Xie Chen, Jiangyan Yi, Rui Liu, Kele Xu, Bin Liu, Erik Cambria, Guoying Zhao, Björn W. Schuller, Jianhua Tao: MER 2024: Semi-Supervised Learning, Noise Robustness, and Open-Vocabulary Multimodal Emotion Recognition. CoRR abs/2404.17113 (2024)
- [i53] Bo Chen, Shoukang Hu, Qi Chen, Chenpeng Du, Ran Yi, Yanmin Qian, Xie Chen: GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting. CoRR abs/2404.19040 (2024)
- [i52] Hankun Wang, Chenpeng Du, Yiwei Guo, Shuai Wang, Xie Chen, Kai Yu: Attention-Constrained Inference for Robust Decoder-Only Text-to-Speech. CoRR abs/2404.19723 (2024)
- [i51] Tao Liu, Feilong Chen, Shuai Fan, Chenpeng Du, Qi Chen, Xie Chen, Kai Yu: AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding. CoRR abs/2405.03121 (2024)
- [i50] Mingjie Chen, Hezhao Zhang, Yuanchao Li, Jiachen Luo, Wen Wu, Ziyang Ma, Peter Bell, Catherine Lai, Joshua D. Reiss, Lin Wang, Philip C. Woodland, Xie Chen, Huy Phan, Thomas Hain: 1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem. CoRR abs/2405.20064 (2024)
- [i49] Guanrou Yang, Ziyang Ma, Fan Yu, Zhifu Gao, Shiliang Zhang, Xie Chen: MaLa-ASR: Multimedia-Assisted LLM-Based ASR. CoRR abs/2406.05839 (2024)
- [i48] Zheshu Song, Jianheng Zhuo, Yifan Yang, Ziyang Ma, Shixiong Zhang, Xie Chen: LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR. CoRR abs/2406.06619 (2024)
- [i47] Ziyang Ma, Mingjie Chen, Hezhao Zhang, Zhisheng Zheng, Wenxi Chen, Xiquan Li, Jiaxin Ye, Xie Chen, Thomas Hain: EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark. CoRR abs/2406.07162 (2024)
- [i46] Xuankai Chang, Jiatong Shi, Jinchuan Tian, Yuning Wu, Yuxun Tang, Yihan Wu, Shinji Watanabe, Yossi Adi, Xie Chen, Qin Jin: The Interspeech 2024 Challenge on Speech Processing Using Discrete Units. CoRR abs/2406.07725 (2024)
- [i45] Anbai Jiang, Bing Han, Zhiqiang Lv, Yufeng Deng, Wei-Qiang Zhang, Xie Chen, Yanmin Qian, Jia Liu, Pingyi Fan: AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection. CoRR abs/2406.11364 (2024)
- [i44] Yifan Yang, Zheshu Song, Jianheng Zhuo, Mingyu Cui, Jinpeng Li, Bo Yang, Yexing Du, Ziyang Ma, Xunying Liu, Ziyuan Wang, Ke Li, Shuai Fan, Kai Yu, Wei-Qiang Zhang, Guoguo Chen, Xie Chen: GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement. CoRR abs/2406.11546 (2024)
- [i43] Yakun Song, Zhuo Chen, Xiaofei Wang, Ziyang Ma, Guanrou Yang, Xie Chen: TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers. CoRR abs/2406.15752 (2024)
- [i42] Bohan Li, Feiyu Shen, Yiwei Guo, Shuai Wang, Xie Chen, Kai Yu: On the Effectiveness of Acoustic BPE in Decoder-Only TTS. CoRR abs/2407.03892 (2024)
- [i41] Ziyang Ma, Yakun Song, Chenpeng Du, Jian Cong, Zhuo Chen, Yuping Wang, Yuxuan Wang, Xie Chen: Language Model Can Listen While Speaking. CoRR abs/2408.02622 (2024)
- [i40] Mingyu Cui, Yifan Yang, Jiajun Deng, Jiawen Kang, Shujie Hu, Tianzi Wang, Zhaoqing Li, Shiliang Zhang, Xie Chen, Xunying Liu: Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR. CoRR abs/2409.08797 (2024)
- [i39] Mingyu Cui, Daxin Tan, Yifan Yang, Dingdong Wang, Huimeng Wang, Xiao Chen, Xie Chen, Xunying Liu: Exploring SSL Discrete Tokens for Multilingual ASR. CoRR abs/2409.08805 (2024)
- 2023
- [j4] Chenpeng Du, Yiwei Guo, Xie Chen, Kai Yu: Speaker Adaptive Text-to-Speech With Timbre-Normalized Vector-Quantized Feature. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3446-3456 (2023)
- [c52] Yujin Wang, Changli Tang, Ziyang Ma, Zhisheng Zheng, Xie Chen, Wei-Qiang Zhang: Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition. ASRU 2023: 1-6
- [c51] Guanrou Yang, Ziyang Ma, Zhisheng Zheng, Yakun Song, Zhikang Niu, Xie Chen: Fast-Hubert: an Efficient Training Framework for Self-Supervised Speech Representation Learning. ASRU 2023: 1-7
- [c50] Qi Chen, Ziyang Ma, Tao Liu, Xu Tan, Qu Lu, Kai Yu, Xie Chen: Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation. ICASSP 2023: 1-5
- [c49] Xie Chen, Ziyang Ma, Changli Tang, Yujin Wang, Zhisheng Zheng: Front-End Adapter: Adapting Front-End Input of Speech Based Self-Supervised Learning for Speech Recognition. ICASSP 2023: 1-5
- [c48] Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian: LongFNT: Long-Form Speech Recognition with Factorized Neural Transducer. ICASSP 2023: 1-5
- [c47] Xun Gong, Wei Wang, Hang Shao, Xie Chen, Yanmin Qian: Factorized AED: Factorized Attention-Based Encoder-Decoder for Text-Only Domain Adaptive ASR. ICASSP 2023: 1-5
- [c46] Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu: Emodiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance. ICASSP 2023: 1-5
- [c45] Tianrui Wang, Xie Chen, Zhuo Chen, Shu Yu, Weibin Zhu: An Adapter Based Multi-Label Pre-Training for Speech Separation and Enhancement. ICASSP 2023: 1-5
- [c44] Ziyang Ma, Zhisheng Zheng, Changli Tang, Yujin Wang, Xie Chen: MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets. INTERSPEECH 2023: 82-86
- [c43] Sen Liu, Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu: DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech. INTERSPEECH 2023: 616-620
- [c42] Zheng Liang, Zheshu Song, Ziyang Ma, Chenpeng Du, Kai Yu, Xie Chen: Improving Code-Switching and Name Entity Recognition in ASR with Speech Editing based Data Augmentation. INTERSPEECH 2023: 919-923
- [c41] Ziyang Ma, Zhisheng Zheng, Guanrou Yang, Yu Wang, Chao Zhang, Xie Chen: Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation. INTERSPEECH 2023: 1269-1273
- [c40] Mingyu Cui, Jiawen Kang, Jiajun Deng, Xi Yin, Yutao Xie, Xie Chen, Xunying Liu: Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems. INTERSPEECH 2023: 2223-2227
- [c39] Zhisheng Zheng, Ziyang Ma, Yu Wang, Xie Chen: Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition. INTERSPEECH 2023: 3307-3311
- [c38] Yifan Yang, Xiaoyu Yang, Liyong Guo, Zengwei Yao, Wei Kang, Fangjun Kuang, Long Lin, Xie Chen, Daniel Povey: Blank-regularized CTC for Frame Skipping in Neural Transducer. INTERSPEECH 2023: 4409-4413
- [c37] Chenpeng Du, Qi Chen, Tianyu He, Xu Tan, Xie Chen, Kai Yu, Sheng Zhao, Jiang Bian: DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder. ACM Multimedia 2023: 4281-4289
- [i38] Xie Chen, Ziyang Ma, Changli Tang, Yujin Wang, Zhisheng Zheng: Front-End Adapter: Adapting Front-End Input of Speech based Self-Supervised Learning for Speech Recognition. CoRR abs/2302.09331 (2023)
- [i37] Qi Chen, Ziyang Ma, Tao Liu, Xu Tan, Qu Lu, Xie Chen, Kai Yu: Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation. CoRR abs/2303.05322 (2023)
- [i36] Chenpeng Du, Qi Chen, Tianyu He, Xu Tan, Xie Chen, Kai Yu, Sheng Zhao, Jiang Bian: DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder. CoRR abs/2303.17550 (2023)
- [i35] Yifan Yang, Xiaoyu Yang, Liyong Guo, Zengwei Yao, Wei Kang, Fangjun Kuang, Long Lin, Xie Chen, Daniel Povey: Blank-regularized CTC for Frame Skipping in Neural Transducer. CoRR abs/2305.11558 (2023)
- [i34] Chenpeng Du, Yiwei Guo, Feiyu Shen, Zhijun Liu, Zheng Liang, Xie Chen, Shuai Wang, Hui Zhang, Kai Yu: UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding. CoRR abs/2306.07547 (2023)
- [i33] Zheng Liang, Zheshu Song, Ziyang Ma, Chenpeng Du, Kai Yu, Xie Chen: Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation. CoRR abs/2306.08588 (2023)
- [i32] Ziyang Ma, Zhisheng Zheng, Guanrou Yang, Yu Wang, Chao Zhang, Xie Chen: Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation. CoRR abs/2306.08920 (2023)
- [i31] Mingyu Cui, Jiawen Kang, Jiajun Deng, Xi Yin, Yutao Xie, Xie Chen, Xunying Liu: Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems. CoRR abs/2306.13307 (2023)
- [i30] Sen Liu, Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu: DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech. CoRR abs/2306.14145 (2023)
- [i29] Zhisheng Zheng, Ziyang Ma, Yu Wang, Xie Chen: Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition. CoRR abs/2308.14814 (2023)
- [i28] Yiwei Guo, Chenpeng Du, Ziyang Ma, Xie Chen, Kai Yu: VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching. CoRR abs/2309.05027 (2023)
- [i27] Yifan Yang, Feiyu Shen, Chenpeng Du, Ziyang Ma, Kai Yu, Daniel Povey, Xie Chen: Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS. CoRR abs/2309.07377 (2023)
- [i26] Peng Wang, Yifan Yang, Zheng Liang, Tian Tan, Shiliang Zhang, Xie Chen: Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer. CoRR abs/2309.07648 (2023)
- [i25] Junzhe Liu, Jianwei Yu, Xie Chen: Improved Factorized Neural Transducer Model For text-only Domain Adaptation. CoRR abs/2309.09524 (2023)
- [i24] Ziyang Ma, Wen Wu, Zhisheng Zheng, Yiwei Guo, Qian Chen, Shiliang Zhang, Xie Chen: Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition. CoRR abs/2309.10294 (2023)
- [i23] Guanrou Yang, Ziyang Ma, Zhisheng Zheng, Yakun Song, Zhikang Niu, Xie Chen: Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning. CoRR abs/2309.13860 (2023)
- [i22] Feiyu Shen, Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu: Acoustic BPE for Speech Generation with Discrete Tokens. CoRR abs/2310.14580 (2023)
- [i21] Hanglei Zhang, Yiwei Guo, Sen Liu, Xie Chen, Kai Yu: Expressive TTS Driven by Natural Language Prompts Using Few Human Annotations. CoRR abs/2311.01260 (2023)
- [i20] Junjie Li, Yiwei Guo, Xie Chen, Kai Yu: SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention. CoRR abs/2312.08676 (2023)
- [i19] Ziyang Ma, Zhisheng Zheng, Jiaxin Ye, Jinchao Li, Zhifu Gao, Shiliang Zhang, Xie Chen: emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation. CoRR abs/2312.15185 (2023)
- 2022
- [c36] Xie Chen, Zhong Meng, Sarangarajan Parthasarathy, Jinyu Li: Factorized Neural Transducer for Efficient Language Model Adaptation. ICASSP 2022: 8132-8136
- [c35] Chenpeng Du, Yiwei Guo, Xie Chen, Kai Yu: VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature. INTERSPEECH 2022: 1596-1600
- [c34] Zhong Meng, Yashesh Gaur, Naoyuki Kanda, Jinyu Li, Xie Chen, Yu Wu, Yifan Gong: Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition. INTERSPEECH 2022: 2608-2612
- [i18] Chenpeng Du, Yiwei Guo, Xie Chen, Kai Yu: VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature. CoRR abs/2204.00768 (2022)
- [i17] Yujin Wang, Changli Tang, Ziyang Ma, Zhisheng Zheng, Xie Chen, Wei-Qiang Zhang: Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition. CoRR abs/2210.15631 (2022)
- [i16] Ziyang Ma, Zhisheng Zheng, Changli Tang, Yujin Wang, Xie Chen: MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets. CoRR abs/2211.07321 (2022)
- [i15] Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian: LongFNT: Long-form Speech Recognition with Factorized Neural Transducer. CoRR abs/2211.09412 (2022)
- [i14] Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu: EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance. CoRR abs/2211.09496 (2022)
- [i13] Changli Tang, Yujin Wang, Xie Chen, Wei-Qiang Zhang: Exploring Effective Fusion Algorithms for Speech Based Self-Supervised Learning Models. CoRR abs/2212.10092 (2022)
- 2021
- [c33] Xie Chen, Yu Wu, Zhenghao Wang, Shujie Liu, Jinyu Li: Developing Real-Time Streaming Transformer Transducer for Speech Recognition on Large-Scale Dataset. ICASSP 2021: 5904-5908
- [c32] Zhong Meng, Naoyuki Kanda, Yashesh Gaur, Sarangarajan Parthasarathy, Eric Sun, Liang Lu, Xie Chen, Jinyu Li, Yifan Gong: Internal Language Model Training for Domain-Adaptive End-To-End Speech Recognition. ICASSP 2021: 7338-7342
- [c31] Deepak Narayanan, Amar Phanishayee, Kaiyu Shi, Xie Chen, Matei Zaharia: Memory-Efficient Pipeline-Parallel DNN Training. ICML 2021: 7937-7947
- [c30] Yan Deng, Rui Zhao, Zhong Meng, Xie Chen, Bing Liu, Jinyu Li, Yifan Gong, Lei He: Improving RNN-T for Domain Scaling Using Semi-Supervised Training with Neural TTS. Interspeech 2021: 751-755
- [c29] Zhong Meng, Yu Wu, Naoyuki Kanda, Liang Lu, Xie Chen, Guoli Ye, Eric Sun, Jinyu Li, Yifan Gong: Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition. Interspeech 2021: 2596-2600
- [c28] Zhong Meng, Sarangarajan Parthasarathy, Eric Sun, Yashesh Gaur, Naoyuki Kanda, Liang Lu, Xie Chen, Rui Zhao, Jinyu Li, Yifan Gong: Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition. SLT 2021: 243-250
- [i12] Zhong Meng, Naoyuki Kanda, Yashesh Gaur, Sarangarajan Parthasarathy, Eric Sun, Liang Lu, Xie Chen, Jinyu Li, Yifan Gong: Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition. CoRR abs/2102.01380 (2021)
- [i11] Zhong Meng, Yu Wu, Naoyuki Kanda, Liang Lu, Xie Chen, Guoli Ye, Eric Sun, Jinyu Li, Yifan Gong: Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition. CoRR abs/2106.02302 (2021)
- [i10] Xie Chen, Zhong Meng, Sarangarajan Parthasarathy, Jinyu Li: Factorized Neural Transducer for Efficient Language Model Adaptation. CoRR abs/2110.01500 (2021)
- [i9] Zhong Meng, Yashesh Gaur, Naoyuki Kanda, Jinyu Li, Xie Chen, Yu Wu, Yifan Gong: Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition. CoRR abs/2110.05354 (2021)
- [i8] Junhao Xu, Xie Chen, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng: Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers. CoRR abs/2111.14836 (2021)
- 2020
- [c27] Junhao Xu, Xie Chen, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng: Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers. ICASSP 2020: 7939-7943
- [i7] Deepak Narayanan, Amar Phanishayee, Kaiyu Shi, Xie Chen, Matei Zaharia: Memory-Efficient Pipeline-Parallel DNN Training. CoRR abs/2006.09503 (2020)
- [i6] Xie Chen, Sarangarajan Parthasarathy, William Gale, Shuangyu Chang, Michael Zeng: LSTM-LM with Long-Term History for First-Pass Decoding in Conversational Speech Recognition. CoRR abs/2010.11349 (2020)
- [i5] Xie Chen, Yu Wu, Zhenghao Wang, Shujie Liu, Jinyu Li: Developing Real-time Streaming Transformer Transducer for Speech Recognition on Large-scale Dataset. CoRR abs/2010.11395 (2020)
- [i4] Zhong Meng, Sarangarajan Parthasarathy, Eric Sun, Yashesh Gaur, Naoyuki Kanda, Liang Lu, Xie Chen, Rui Zhao, Jinyu Li, Yifan Gong: Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition. CoRR abs/2011.01991 (2020)
2010 – 2019
- 2019
- [j3] Xie Chen, Xunying Liu, Yu Wang, Anton Ragni, Jeremy Heng Meng Wong, Mark J. F. Gales: Exploiting Future Word Contexts in Neural Network Language Models for Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 27(9): 1444-1454 (2019)
- [c26] Max W. Y. Lam, Xie Chen, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng: Gaussian Process Lstm Recurrent Neural Network Language Models for Speech Recognition. ICASSP 2019: 7235-7239
- [c25] Xie Chen, Jun Zhang, Tasos Anastasakos, Fil Alleva: Investigation of Sampling Techniques for Maximum Entropy Language Modeling Training. ICASSP 2019: 7240-7244
- [c24] Jianwei Yu, Max W. Y. Lam, Xie Chen, Shoukang Hu, Songxiang Liu, Xixin Wu, Xunying Liu, Helen Meng: Recurrent Neural Network Language Model Training Using Natural Gradient. ICASSP 2019: 7260-7264
- [i3] Sarangarajan Parthasarathy, William Gale, Xie Chen, George Polovets, Shuangyu Chang: Long-span language modeling for speech recognition. CoRR abs/1911.04571 (2019)
- 2018
- [c23] Meng Zhang, Xie Chen, Ronan Cummins, Øistein E. Andersen, Ted Briscoe: The Effect of Adding Authorship Knowledge in Automated Text Scoring. BEA@NAACL-HLT 2018: 305-314
- [c22] Yu Wang, Xie Chen, Mark J. F. Gales, Anton Ragni, Jeremy Heng Meng Wong: Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription. ICASSP 2018: 5899-5903
- [c21] Hainan Xu, Ke Li, Yiming Wang, Jian Wang, Shiyin Kang, Xie Chen, Daniel Povey, Sanjeev Khudanpur: Neural Network Language Modeling with Letter-Based Features and Importance Sampling. ICASSP 2018: 6109-6113
- [c20] Xunying Liu, Shansong Liu, Jinze Sha, Jianwei Yu, Zhiyuan Xu, Xie Chen, Helen Meng: Limited-Memory BFGS Optimization of Recurrent Neural Network Language Models for Speech Recognition. ICASSP 2018: 6114-6118
- [c19] Oscar Chen, Anton Ragni, Mark J. F. Gales, Xie Chen: Active Memory Networks for Language Modeling. INTERSPEECH 2018: 3338-3342
- [i2] Yu Wang, Xie Chen, Mark J. F. Gales, Anton Ragni, Jeremy Heng Meng Wong: Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription. CoRR abs/1802.00254 (2018)
- 2017
- [c18] Xie Chen, X. Liu, Anton Ragni, Y. Wang, Mark J. F. Gales: Future word contexts in neural network language models. ASRU 2017: 97-103
- [c17] Xie Chen, Anton Ragni, J. Vasilakes, Xunying Liu, Kate M. Knill, Mark J. F. Gales: Recurrent neural network language models for keyword search. ICASSP 2017: 5775-5779
- [c16] Tongtong Shen, Longbiao Wang, Xie Chen, Kuntharrgyal Khysru, Jianwu Dang: Exploiting the Tibetan Radicals in Recurrent Neural Network for Low-Resource Language Models. ICONIP (2) 2017: 266-275
- [c15] Xie Chen, Anton Ragni, Xunying Liu, Mark J. F. Gales: Investigating Bidirectional Recurrent Neural Network Language Models for Speech Recognition. INTERSPEECH 2017: 269-273
- [i1] Xie Chen, Xunying Liu, Anton Ragni, Yu Wang, Mark J. F. Gales: Future Word Contexts in Neural Network Language Models. CoRR abs/1708.05592 (2017)
- 2016
- [j2] Xunying Liu, Xie Chen, Yongqiang Wang, Mark J. F. Gales, Philip C. Woodland: Two Efficient Lattice Rescoring Methods Using Recurrent Neural Network Language Models. IEEE ACM Trans. Audio Speech Lang. Process. 24(8): 1438-1449 (2016)
- [j1] Xie Chen, Xunying Liu, Yongqiang Wang, Mark J. F. Gales, Philip C. Woodland: Efficient Training and Evaluation of Recurrent Neural Network Language Models for Automatic Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 24(11): 2146-2157 (2016)
- [c14] Xie Chen, Xunying Liu, Y. Qian, Mark J. F. Gales, Philip C. Woodland: CUED-RNNLM - An open-source toolkit for efficient training and evaluation of recurrent neural network language models. ICASSP 2016: 6000-6004
- [c13] Anton Ragni, Edgar Dakin, Xie Chen, Mark J. F. Gales, Kate M. Knill: Multi-Language Neural Network Language Models. INTERSPEECH 2016: 3042-3046
- 2015
- [c12] Xie Chen, Xunying Liu, Mark J. F. Gales, Philip C. Woodland: Investigation of back-off based interpolation between recurrent neural network and n-gram language models. ASRU 2015: 181-186
- [c11] Thomas Drugman, Yannis Stylianou, Langzhou Chen, Xie Chen, Mark J. F. Gales: Robust excitation-based features for Automatic Speech Recognition. ICASSP 2015: 4664-4668
- [c10] Xie Chen, Xunying Liu, Mark J. F. Gales, Philip C. Woodland: Improving the training and evaluation efficiency of recurrent neural network language models. ICASSP 2015: 5401-5405
- [c9] Xunying Liu, Xie Chen, Mark J. F. Gales, Philip C. Woodland: Paraphrastic recurrent neural network language models. ICASSP 2015: 5406-5410
- [c8] Xie Chen, Xunying Liu, Mark J. F. Gales, Philip C. Woodland: Recurrent neural network language model training with noise contrastive estimation for speech recognition. ICASSP 2015: 5411-5415
- [c7] Xie Chen, T. Tan, Xunying Liu, Pierre Lanchantin, M. Wan, Mark J. F. Gales, Philip C. Woodland: Recurrent neural network language model adaptation for multi-genre broadcast speech recognition. INTERSPEECH 2015: 3511-3515
- 2014
- [c6] Xunying Liu, Yongqiang Wang, Xie Chen, Mark J. F. Gales, Philip C. Woodland: Efficient lattice rescoring using recurrent neural network language models. ICASSP 2014: 4908-4912
- [c5] Takuya Yoshioka, Xie Chen, Mark J. F. Gales: Impact of single-microphone dereverberation on DNN-based meeting transcription systems. ICASSP 2014: 5527-5531
- [c4] Xie Chen, Yongqiang Wang, Xunying Liu, Mark J. F. Gales, Philip C. Woodland: Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch. INTERSPEECH 2014: 641-645
- [c3] Xie Chen, Mark J. F. Gales, Kate M. Knill, Catherine Breslin, Langzhou Chen, K. K. Chin, Vincent Wan: An initial investigation of long-term adaptation for meeting transcription. INTERSPEECH 2014: 954-958
- 2012
- [c2] Xie Chen, Adam Eversole, Gang Li, Dong Yu, Frank Seide: Pipelined Back-Propagation for Context-Dependent Deep Neural Networks. INTERSPEECH 2012: 26-29
- 2011
- [c1] Frank Seide, Gang Li, Xie Chen, Dong Yu: Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription. ASRU 2011: 24-29
last updated on 2024-10-15 20:42 CEST by the dblp team
all metadata released as open data under CC0 1.0 license