Sheng Li 0010
Person information
- unicode name: 李勝
- unicode name: 李胜
- affiliation: National Institute of Information and Communications Technology (NICT), Universal Communication Research Institute (UCRI), Kyoto, Japan
- affiliation (2012-2017, PhD 2016): Kyoto University, Graduate School of Informatics, Japan
- affiliation (2008-2012): Shenzhen Institutes of Advanced Technology, Shenzhen, China
- affiliation (2008-2012): Chinese Academy of Sciences, Beijing, China
- affiliation (2008-2012): Chinese University of Hong Kong, Hong Kong
- affiliation (2002-2009): Nanjing University, China
Other persons with the same name
- Sheng Li — disambiguation page
- Sheng Li 0001 — University of Virginia, Charlottesville, VA, USA (and 3 more)
- Sheng Li 0002 — Peking University, Department of Psychology, Beijing, China (and 2 more)
- Sheng Li 0003 — Harbin Institute of Technology, Laboratory of Machine Intelligence and Translation, China
- Sheng Li 0005 — Zhejiang University of Technology, Hangzhou, China (and 2 more)
- Sheng Li 0006 — Fudan University, School of Computer Science, Shanghai Institute of Intelligent Electronics and Systems, China (and 2 more)
- Sheng Li 0007 — Google, Mountain View, CA, USA (and 3 more)
- Sheng Li 0008 — Peking University, Department of Computer Science, Beijing, China
- Sheng Li 0009 — Xijing University, Xi'an, China (and 1 more)
- Sheng Li 0011 — Zhongnan University of Economics and Law, School of Information and Safety Engineering, Wuhan, China (and 1 more)
- Sheng Li 0012 — Nanjing Institute of Technology, School of Electric Power Engineering, China
- Sheng Li 0013 — Central University of Finance and Economics, Beijing, China
- Sheng Li 0014 — Karlsruhe Institute of Technology, Germany
- Sheng Li 0015 — University of Texas Health Science Center at Houston, TX, USA
- Sheng Li 0016 — Nanjing University of Science and Technology, Nanjing, Jiangsu, China
- Sheng Li 0017 — Alibaba Inc., Hangzhou, China (and 1 more)
- Sheng Li 0018 — Wuhan University of Technology, National Engineering Laboratory for Fiber Optic Sensing Technology, China
- Sheng Li 0019 — University of Pittsburgh, PA, USA (and 1 more)
- Sheng Li 0020 — University of Electronic Science and Technology of China, Chengdu, China (and 1 more)
2020 – today
- 2024
- [j11] Sheng Li, Jiyi Li, Yang Cao: Phantom in the opera: adversarial music attack for robot dialogue system. Frontiers Comput. Sci. 6 (2024)
- [j10] Sheng Li, Jiyi Li, Chenhui Chu: Voices of the Himalayas: Benchmarking Speech Recognition Systems for the Tibetan Language. Int. J. Asian Lang. Process. 34(1): 2450001:1-2450001:13 (2024)
- [j9] Nan Li, Longbiao Wang, Meng Ge, Masashi Unoki, Sheng Li, Jianwu Dang: Robust voice activity detection using an auditory-inspired masked modulation encoder based convolutional attention network. Speech Commun. 157: 103024 (2024)
- [c83] Lele Zheng, Yang Cao, Renhe Jiang, Kenjiro Taura, Yulong Shen, Sheng Li, Masatoshi Yoshikawa: Enhancing Privacy of Spatiotemporal Federated Learning Against Gradient Inversion Attacks. DASFAA (1) 2024: 457-473
- [c82] Sheng Li, Bei Liu, Jianlong Fu: Revisiting Generative Adversarial Network for Downstream Task of Speech Recognition. GEM 2024: 1-3
- [c81] Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li, Raj Dabre, Yi Zhao, Tatsuya Kawahara: MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction. ICASSP 2024: 876-880
- [c80] Yi Zhao, Chunyu Qiang, Hao Li, Yulan Hu, Wangjin Zhou, Sheng Li: Enhancing Realism in 3D Facial Animation Using Conformer-Based Generation and Automated Post-Processing. ICASSP 2024: 8341-8345
- [c79] Yankun Wu, Yuta Nakashima, Noa Garcia, Sheng Li, Zhaoyang Zeng: Reproducibility Companion Paper: Stable Diffusion for Content-Style Disentanglement in Art Analysis. ICMR 2024: 1228-1231
- [i10] Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li, Raj Dabre, Yi Zhao, Tatsuya Kawahara: MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction. CoRR abs/2401.13249 (2024)
- [i9] Lele Zheng, Yang Cao, Renhe Jiang, Kenjiro Taura, Yulong Shen, Sheng Li, Masatoshi Yoshikawa: Enhancing Privacy of Spatiotemporal Federated Learning against Gradient Inversion Attacks. CoRR abs/2407.08529 (2024)
- [i8] Yuka Ko, Sheng Li, Chao-Han Huck Yang, Tatsuya Kawahara: Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction. CoRR abs/2408.16180 (2024)
- [i7] Lele Zheng, Yang Cao, Renhe Jiang, Kenjiro Taura, Yulong Shen, Sheng Li, Masatoshi Yoshikawa: Extracting Spatiotemporal Data from Gradients with Large Language Models. CoRR abs/2410.16121 (2024)
- 2023
- [j8] Soky Kak, Sheng Li, Chenhui Chu, Tatsuya Kawahara: Finetuning Pretrained Model with Embedding of Domain and Language Information for ASR of Very Low-Resource Settings. Int. J. Asian Lang. Process. 33(4): 2350024:1-2350024:17 (2023)
- [j7] Yuqin Lin, Jianwu Dang, Longbiao Wang, Sheng Li, Chenchen Ding: Disordered speech recognition considering low resources and abnormal articulation. Speech Commun. 155: 103002 (2023)
- [c78] Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi: Towards Speech Dialogue Translation Mediating Speakers of Different Languages. ACL (Findings) 2023: 1122-1134
- [c77] Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki: Multi-Domain Dialogue State Tracking with Disentangled Domain-Slot Attention. ACL (Findings) 2023: 4928-4938
- [c76] Zili Qi, Xinhui Hu, Wangjin Zhou, Sheng Li, Hao Wu, Jian Lu, Xinkang Xu: LE-SSL-MOS: Self-Supervised Learning MOS Prediction with Listener Enhancement. ASRU 2023: 1-6
- [c75] Wenqing Wei, Zhengdong Yang, Yuan Gao, Jiyi Li, Chenhui Chu, Shogo Okada, Sheng Li: FedCPC: An Effective Federated Contrastive Learning Method for Privacy Preserving Early-Stage Alzheimer's Speech Detection. ASRU 2023: 1-6
- [c74] Sheng Li, Jiyi Li: Correction while Recognition: Combining Pretrained Language Model for Taiwan-Accented Speech Recognition. ICANN (7) 2023: 389-400
- [c73] Soky Kak, Sheng Li, Chenhui Chu, Tatsuya Kawahara: Domain and Language Adaptation Using Heterogeneous Datasets for Wav2vec2.0-Based Speech Recognition of Low-Resource Language. ICASSP 2023: 1-5
- [c72] Helen Korving, Sheng Li, Di Zhou, Paula Sophia Sterkenburg, Panos Markopoulos, Emilia I. Barakova: Development of a Pain Signaling System Using Machine Learning. ICASSP Workshops 2023: 1-5
- [c71] Qianying Liu, Zhuo Gong, Zhengdong Yang, Yuhang Yang, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Chenhui Chu, Sadao Kurohashi: Hierarchical Softmax for End-To-End Low-Resource Multilingual Speech Recognition. ICASSP 2023: 1-5
- [c70] Chao Tan, Yang Cao, Sheng Li, Masatoshi Yoshikawa: General or Specific? Investigating Effective Privacy Protection in Federated Learning for Speech Emotion Recognition. ICASSP 2023: 1-5
- [c69] Kai Wang, Yuhang Yang, Hao Huang, Ying Hu, Sheng Li: Speakeraugment: Data Augmentation for Generalizable Source Separation via Speaker Parameter Manipulation. ICASSP 2023: 1-5
- [c68] Yuhang Yang, Haihua Xu, Hao Huang, Eng Siong Chng, Sheng Li: Speech-Text Based Multi-Modal Training with Bidirectional Attention for Improved Speech Recognition. ICASSP 2023: 1-5
- [c67] Zhengdong Yang, Shuichiro Shimizu, Wangjin Zhou, Sheng Li, Chenhui Chu: The Kyoto Speech-to-Speech Translation System for IWSLT 2023. IWSLT@ACL 2023: 357-362
- [c66] Wangjin Zhou, Zhengdong Yang, Sheng Li, Chenhui Chu: KyotoMOS: An Automatic MOS Scoring System for Speech Synthesis. MMAsia (Workshops) 2023: 7:1-7:3
- [c65] Xiaojiao Chen, Sheng Li, Jiyi Li, Hao Huang, Yang Cao, Liang He: Reprogramming Self-supervised Learning-based Speech Representations for Speaker Anonymization. MMAsia 2023: 93:1-93:5
- [c64] Xiaojiao Chen, Sheng Li, Jiyi Li, Yang Cao, Hao Huang, Liang He: GhostVec: A New Threat to Speaker Privacy of End-to-End Speech Recognition System. MMAsia 2023: 94:1-94:5
- [i6] Shuichiro Shimizu, Chenhui Chu, Sheng Li, Sadao Kurohashi: Towards Speech Dialogue Translation Mediating Speakers of Different Languages. CoRR abs/2305.09210 (2023)
- 2022
- [j6] Siqing Qin, Longbiao Wang, Sheng Li, Jianwu Dang, Lixin Pan: Improving low-resource Tibetan end-to-end ASR by multilingual and multilevel unit modeling. EURASIP J. Audio Speech Music. Process. 2022(1): 2 (2022)
- [c63] Kai Li, Xugang Lu, Masato Akagi, Jianwu Dang, Sheng Li, Masashi Unoki: Relationship Between Speakers' Physiological Structure and Acoustic Speech Signals: Data-Driven Study Based on Frequency-Wise Attentional Neural Network. EUSIPCO 2022: 379-383
- [c62] Kai Wang, Yizhou Peng, Hao Huang, Ying Hu, Sheng Li: Mining Hard Samples Locally And Globally For Improved Speech Separation. ICASSP 2022: 6037-6041
- [c61] Yongjie Lv, Longbiao Wang, Meng Ge, Sheng Li, Chenchen Ding, Lixin Pan, Yuguang Wang, Jianwu Dang, Kiyoshi Honda: Compressing Transformer-Based ASR Model by Task-Driven Loss and Attention-Based Multi-Level Feature Distillation. ICASSP 2022: 7992-7996
- [c60] Xiaojiao Chen, Sheng Li, Hao Huang: GhostVec: Directly Extracting Speaker Embedding from End-to-End Speech Recognition Model Using Adversarial Examples. ICONIP (6) 2022: 482-492
- [c59] Sheng Li, Jiyi Li, Qianying Liu, Zhuo Gong: An End-to-End Chinese and Japanese Bilingual Speech Recognition Systems with Shared Character Decomposition. ICONIP (6) 2022: 493-503
- [c58] Guangxing Li, Wangjin Zhou, Sheng Li, Yi Zhao, Jichen Yang, Hao Huang: Investigating Effective Domain Adaptation Method for Speaker Verification Task. ICONIP (6) 2022: 517-527
- [c57] Hao Shi, Longbiao Wang, Sheng Li, Jianwu Dang, Tatsuya Kawahara: Monaural Speech Enhancement Based on Spectrogram Decomposition for Convolutional Neural Network-sensitive Feature Extraction. INTERSPEECH 2022: 221-225
- [c56] Nan Li, Meng Ge, Longbiao Wang, Masashi Unoki, Sheng Li, Jianwu Dang: Global Signal-to-noise Ratio Estimation Based on Multi-subband Processing Using Convolutional Neural Network. INTERSPEECH 2022: 361-365
- [c55] Longfei Yang, Wenqing Wei, Sheng Li, Jiyi Li, Takahiro Shinozaki: Augmented Adversarial Self-Supervised Learning for Early-Stage Alzheimer's Speech Detection. INTERSPEECH 2022: 541-545
- [c54] Kai Li, Sheng Li, Xugang Lu, Masato Akagi, Meng Liu, Lin Zhang, Chang Zeng, Longbiao Wang, Jianwu Dang, Masashi Unoki: Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection. INTERSPEECH 2022: 664-668
- [c53] Soky Kak, Sheng Li, Masato Mimura, Chenhui Chu, Tatsuya Kawahara: Leveraging Simultaneous Translation for Enhancing Transcription of Low-resource Language via Cross Attention Mechanism. INTERSPEECH 2022: 1362-1366
- [c52] Siqing Qin, Longbiao Wang, Sheng Li, Yuqin Lin, Jianwu Dang: Finer-grained Modeling units-based Meta-Learning for Low-resource Tibetan Speech Recognition. INTERSPEECH 2022: 2133-2137
- [c51] Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Sheng Li, Raj Dabre, Raphael Rubino, Yi Zhao: Fusion of Self-supervised Learned Models for MOS Prediction. INTERSPEECH 2022: 5443-5447
- [c50] Sheng Li, Jiyi Li, Qianying Liu, Zhuo Gong: Adversarial Speech Generation and Natural Speech Recovery for Speech Content Protection. LREC 2022: 7291-7297
- [c49] Soky Kak, Zhuo Gong, Sheng Li: Nict-Tib1: A Public Speech Corpus Of Lhasa Dialect For Benchmarking Tibetan Language Speech Recognition Systems. O-COCOSDA 2022: 1-5
- [c48] Zhuo Gong, Daisuke Saito, Longfei Yang, Takahiro Shinozaki, Sheng Li, Hisashi Kawai, Nobuaki Minematsu: Self-Adaptive Multilingual ASR Rescoring with Language Identification and Unified Language Model. Odyssey 2022: 415-420
- [c47] Longfei Yang, Jiyi Li, Sheng Li, Takahiro Shinozaki: Multi-Domain Dialogue State Tracking with Top-K Slot Self Attention. SIGDIAL 2022: 231-236
- [i5] Qianying Liu, Yuhang Yang, Zhuo Gong, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Sadao Kurohashi: Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition. CoRR abs/2204.03855 (2022)
- [i4] Zhengdong Yang, Wangjin Zhou, Chenhui Chu, Sheng Li, Raj Dabre, Raphael Rubino, Yi Zhao: Fusion of Self-supervised Learned Models for MOS Prediction. CoRR abs/2204.04855 (2022)
- [i3] Yuhang Yang, Haihua Xu, Hao Huang, Eng Siong Chng, Sheng Li: Speech-text based multi-modal training with bidirectional attention for improved speech recognition. CoRR abs/2211.00325 (2022)
- 2021
- [j5] Soky Kak, Masato Mimura, Tatsuya Kawahara, Chenhui Chu, Sheng Li, Chenchen Ding, Sethserey Sam: TriECCC: Trilingual Corpus of the Extraordinary Chambers in the Courts of Cambodia for Speech Recognition and Translation Studies. Int. J. Asian Lang. Process. 31(3&4): 2250007:1-2250007:21 (2021)
- [c46] Soky Kak, Sheng Li, Masato Mimura, Chenhui Chu, Tatsuya Kawahara: On the Use of Speaker Information for Automatic Speech Recognition in Speaker-imbalanced Corpora. APSIPA ASC 2021: 433-437
- [c45] Hao Shi, Longbiao Wang, Sheng Li, Cunhang Fan, Jianwu Dang, Tatsuya Kawahara: Spectrograms Fusion-based End-to-end Robust Automatic Speech Recognition. APSIPA ASC 2021: 438-442
- [c44] Yizhou Peng, Jicheng Zhang, Haobo Zhang, Haihua Xu, Hao Huang, Sheng Li, Eng Siong Chng: Multilingual Approach to Joint Speech and Accent Recognition with DNN-HMM Framework. APSIPA ASC 2021: 1043-1048
- [c43] Shunfei Chen, Xinhui Hu, Sheng Li, Xinkang Xu: An Investigation of Using Hybrid Modeling Units for Improving End-to-End Speech Recognition System. ICASSP 2021: 6743-6747
- [c42] Nan Li, Longbiao Wang, Masashi Unoki, Sheng Li, Rui Wang, Meng Ge, Jianwu Dang: Robust Voice Activity Detection Using a Masked Auditory Encoder Based Convolutional Neural Network. ICASSP 2021: 6828-6832
- [c41] Hao Huang, Kai Wang, Ying Hu, Sheng Li: Encoder-Decoder Based Pitch Tracking and Joint Model Training for Mandarin Tone Classification. ICASSP 2021: 6943-6947
- [c40] Luya Qiang, Hao Shi, Meng Ge, Haoran Yin, Nan Li, Longbiao Wang, Sheng Li, Jianwu Dang: Speech Dereverberation Based on Scale-Aware Mean Square Error Loss. ICONIP (5) 2021: 55-63
- [c39] Dawei Liu, Longbiao Wang, Sheng Li, Haoyu Li, Chenchen Ding, Ju Zhang, Jianwu Dang: Exploring Effective Speech Representation via ASR for High-Quality End-to-End Multispeaker TTS. ICONIP (6) 2021: 110-118
- [c38] Haoran Yin, Hao Shi, Longbiao Wang, Luya Qiang, Sheng Li, Meng Ge, Gaoyan Zhang, Jianwu Dang: Simultaneous Progressive Filtering-Based Monaural Speech Enhancement. ICONIP (5) 2021: 213-221
- [c37] Kai Wang, Hao Huang, Ying Hu, Zhihua Huang, Sheng Li: End-to-End Speech Separation Using Orthogonal Representation in Complex and Real Time-Frequency Domain. Interspeech 2021: 3046-3050
- [c36] Ding Wang, Shuaishuai Ye, Xinhui Hu, Sheng Li, Xinkang Xu: An End-to-End Dialect Identification System with Transfer Learning from a Multilingual Automatic Speech Recognition Model. Interspeech 2021: 3266-3270
- [c35] Soky Kak, Masato Mimura, Tatsuya Kawahara, Sheng Li, Chenchen Ding, Chenhui Chu, Sethserey Sam: Khmer Speech Translation Corpus of the Extraordinary Chambers in the Courts of Cambodia (ECCC). O-COCOSDA 2021: 122-127
- 2020
- [j4] Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai: Knowledge Distillation-Based Representation Learning for Short-Utterance Spoken Language Identification. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2674-2683 (2020)
- [c34] Yaowei Han, Yang Cao, Sheng Li, Qiang Ma, Masatoshi Yoshikawa: Voice-Indistinguishability - Protecting Voiceprint with Differential Privacy under an Untrusted Server. CCS 2020: 2125-2127
- [c33] Yuqin Lin, Longbiao Wang, Jianwu Dang, Sheng Li, Chenchen Ding: End-to-End Articulatory Modeling for Dysarthric Articulatory Attribute Detection. ICASSP 2020: 7349-7353
- [c32] Hao Shi, Longbiao Wang, Meng Ge, Sheng Li, Jianwu Dang: Spectrograms Fusion with Minimum Difference Masks Estimation for Monaural Speech Dereverberation. ICASSP 2020: 7544-7548
- [c31] Yaowei Han, Sheng Li, Yang Cao, Qiang Ma, Masatoshi Yoshikawa: Voice-Indistinguishability: Protecting Voiceprint In Privacy-Preserving Speech Data Release. ICME 2020: 1-6
- [c30] Shaotong Guo, Longbiao Wang, Sheng Li, Ju Zhang, Cheng Gong, Yuguang Wang, Jianwu Dang, Kiyoshi Honda: Investigation of Effectively Synthesizing Code-Switched Speech Using Highly Imbalanced Mix-Lingual Data. ICONIP (1) 2020: 36-47
- [c29] Hao Shi, Longbiao Wang, Sheng Li, Chenchen Ding, Meng Ge, Nan Li, Jianwu Dang, Hiroshi Seki: Singing Voice Extraction with Attention-Based Spectrograms Fusion. INTERSPEECH 2020: 2412-2416
- [c28] Yuqin Lin, Longbiao Wang, Sheng Li, Jianwu Dang, Chenchen Ding: Staged Knowledge Distillation for End-to-End Dysarthric Speech Recognition and Speech Attribute Transcription. INTERSPEECH 2020: 4791-4795
- [c27] Aye Thida, Nway Nway Han, Sheinn Thawtar Oo, Sheng Li, Chenchen Ding: VOIS: The First Speech Therapy App Specifically Designed for Myanmar Hearing-Impaired Children. O-COCOSDA 2020: 151-154
- [c26] Peng Shen, Xugang Lu, Komei Sugiura, Sheng Li, Hisashi Kawai: Compensation on x-vector for Short Utterance Spoken Language Identification. Odyssey 2020: 47-52
- [c25] Sheng Li, Xugang Lu, Raj Dabre, Peng Shen, Hisashi Kawai: Joint Training End-to-End Speech Recognition Systems with Speaker Attributes. Odyssey 2020: 385-390
- [p1] Xugang Lu, Sheng Li, Masakiyo Fujimoto: Automatic Speech Recognition. Speech-to-Speech Translation 2020: 21-38
- [i2] Yaowei Han, Sheng Li, Yang Cao, Qiang Ma, Masatoshi Yoshikawa: Voice-Indistinguishability: Protecting Voiceprint in Privacy-Preserving Speech Data Release. CoRR abs/2004.07442 (2020)
2010 – 2019
- 2019
- [c24] Lixin Pan, Sheng Li, Longbiao Wang, Jianwu Dang: Effective Training End-to-End ASR systems for Low-resource Lhasa Dialect of Tibetan Language. APSIPA 2019: 1152-1156
- [c23] Soky Kak, Sheng Li, Tatsuya Kawahara, Sopheap Seng: Multi-lingual Transformer Training for Khmer Automatic Speech Recognition. APSIPA 2019: 1893-1896
- [c22] Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai: Interactive Learning of Teacher-student Model for Short Utterance Spoken Language Identification. ICASSP 2019: 5981-5985
- [c21] Ryoichi Takashima, Sheng Li, Hisashi Kawai: Investigation of Sequence-level Knowledge Distillation Methods for CTC Acoustic Models. ICASSP 2019: 6156-6160
- [c20] Sheng Li, Chenchen Ding, Xugang Lu, Peng Shen, Tatsuya Kawahara, Hisashi Kawai: End-to-End Articulatory Attribute Modeling for Low-Resource Multilingual Speech Recognition. INTERSPEECH 2019: 2145-2149
- [c19] Sheng Li, Xugang Lu, Chenchen Ding, Peng Shen, Tatsuya Kawahara, Hisashi Kawai: Investigating Radical-Based End-to-End Speech Recognition Systems for Chinese Dialects and Japanese. INTERSPEECH 2019: 2200-2204
- [c18] Xugang Lu, Peng Shen, Sheng Li, Yu Tsao, Hisashi Kawai: Class-Wise Centroid Distance Metric Learning for Acoustic Event Detection. INTERSPEECH 2019: 3614-3618
- [c17] Sheng Li, Raj Dabre, Xugang Lu, Peng Shen, Tatsuya Kawahara, Hisashi Kawai: Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation. INTERSPEECH 2019: 4400-4404
- [i1] Xugang Lu, Peng Shen, Sheng Li, Yu Tsao, Hisashi Kawai: Deep progressive multi-scale attention for acoustic event classification. CoRR abs/1912.12011 (2019)
- 2018
- [c16] Ryoichi Takashima, Sheng Li, Hisashi Kawai: An Investigation of a Knowledge Distillation Method for CTC Acoustic Models. ICASSP 2018: 5809-5813
- [c15] Ryoichi Takashima, Sheng Li, Hisashi Kawai: CTC Loss Function with a Unit-Level Ambiguity Penalty. ICASSP 2018: 5909-5913
- [c14] Xugang Lu, Peng Shen, Sheng Li, Yu Tsao, Hisashi Kawai: Temporal Attentive Pooling for Acoustic Event Detection. INTERSPEECH 2018: 1354-1357
- [c13] Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai: Feature Representation of Short Utterances Based on Knowledge Distillation for Spoken Language Identification. INTERSPEECH 2018: 1813-1817
- [c12] Sheng Li, Xugang Lu, Ryoichi Takashima, Peng Shen, Tatsuya Kawahara, Hisashi Kawai: Improving CTC-based Acoustic Model with Very Deep Residual Time-delay Neural Networks. INTERSPEECH 2018: 3708-3712
- [c11] Sheng Li, Xugang Lu, Ryoichi Takashima, Peng Shen, Tatsuya Kawahara, Hisashi Kawai: Improving Very Deep Time-Delay Neural Network With Vertical-Attention For Effectively Training CTC-Based ASR Systems. SLT 2018: 77-83
- 2017
- [c10] Sheng Li, Xugang Lu, Peng Shen, Ryoichi Takashima, Tatsuya Kawahara, Hisashi Kawai: Incremental training and constructing the very deep convolutional residual network acoustic models. ASRU 2017: 222-227
- [c9] Sheng Li, Xugang Lu, Shinsuke Sakai, Masato Mimura, Tatsuya Kawahara: Semi-supervised ensemble DNN acoustic model training. ICASSP 2017: 5270-5274
- [c8] Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai: Conditional Generative Adversarial Nets Classifier for Spoken Language Identification. INTERSPEECH 2017: 2814-2818
- 2016
- [b1] Sheng Li: Speech Recognition Enhanced by Lightly-supervised and Semi-supervised Acoustic Model Training. Kyoto University, Japan, 2016
- [j3] Sheng Li, Yuya Akita, Tatsuya Kawahara: Semi-Supervised Acoustic Model Training by Discriminative Data Selection From Multiple ASR Systems' Hypotheses. IEEE ACM Trans. Audio Speech Lang. Process. 24(9): 1524-1534 (2016)
- [c7] Sheng Li, Yuya Akita, Tatsuya Kawahara: Data selection from multiple ASR systems' hypotheses for unsupervised acoustic model training. ICASSP 2016: 5875-5879
- [c6] Sheng Li, Xugang Lu, Shinsuke Mori, Yuya Akita, Tatsuya Kawahara: Confidence estimation for speech recognition systems using conditional random fields trained with partially annotated data. ISCSLP 2016: 1-5
- 2015
- [j2] Sheng Li, Yuya Akita, Tatsuya Kawahara: Automatic Lecture Transcription Based on Discriminative Data Selection for Lightly Supervised Acoustic Model Training. IEICE Trans. Inf. Syst. 98-D(8): 1545-1552 (2015)
- [c5] Sheng Li, Xugang Lu, Yuya Akita, Tatsuya Kawahara: Ensemble speaker modeling using speaker adaptive training deep neural network for speaker adaptation. INTERSPEECH 2015: 2892-2896
- [c4] Sheng Li, Yuya Akita, Tatsuya Kawahara: Discriminative data selection for lightly supervised training of acoustic model using closed caption texts. INTERSPEECH 2015: 3526-3530
- 2014
- [c3] Sheng Li, Yuya Akita, Tatsuya Kawahara: Corpus and transcription system of Chinese Lecture Room. ISCSLP 2014: 442-445
- 2012
- [j1] Lan Wang, Hui Chen, Sheng Li, Helen M. Meng: Phoneme-level articulatory animation in pronunciation training. Speech Commun. 54(7): 845-856 (2012)
- [c2] Sheng Li, Lan Wang: Cross Linguistic Comparison of Mandarin and English EMA Articulatory Data. INTERSPEECH 2012: 903-906
- 2011
- [c1] Sheng Li, Lan Wang, En Qi: The Phoneme-Level Articulator Dynamics for Pronunciation Animation. IALP 2011: 283-286