Jun Du 0002
Person information
- affiliation (PhD 2009): University of Science and Technology of China, USTC, School of Information Science and Technology, Hefei, Anhui, China
- affiliation (2010-2013): Microsoft Research Asia, Department of handwriting recognition, OCR, China
- affiliation (2009-2010): iFlytek Research, Department of speech recognition, China
- affiliation (PhD 2009): University of Science and Technology of China, China
Other persons with the same name
- Jun Du — disambiguation page
- Jun Du 0001 — Tsinghua University, Beijing National Research Center for Information Science and Technology, BNRist, Beijing, China (and 2 more)
- Jun Du 0003 — Shandong Normal University, School of Physics and Electronics, Jinan, China
- Jun Du 0004 — Anhui University, School of Mathematical Sciences, Hefei, China
- Jun Du 0005 — University of Western Ontario, Department of Computer Science, London, ONT, Canada (and 1 more)
- Jun Du 0006 — Shanghai Jiao Tong University, School of Medicine, China (and 1 more)
2020 – today
- 2025
- [j67]Pengfei Hu, Jiefeng Ma, Zhenrong Zhang, Jun Du, Jianshu Zhang:
Count, decompose and correct: A new approach to handwritten Chinese character error correction. Pattern Recognit. 160: 111110 (2025) - [j66]Zilu Guo, Jun Du, Sabato Marco Siniscalchi, Jia Pan, Qingfeng Liu:
Controllable Conformer for Speech Enhancement and Recognition. IEEE Signal Process. Lett. 32: 156-160 (2025)
- 2024
- [j65]Ahmed M. A. Shaalan, Jun Du:
High-order dilated nested arrays with increased degrees of freedom and reduced mutual coupling. Digit. Signal Process. 153: 104650 (2024) - [j64]Nimol Thuon, Jun Du, Zhenrong Zhang, Jiefeng Ma, Pengfei Hu:
Generate, transform, and clean: the role of GANs and transformers in palm leaf manuscript generation and enhancement. Int. J. Document Anal. Recognit. 27(3): 415-432 (2024) - [j63]Zhenrong Zhang, Pengfei Hu, Jiefeng Ma, Jun Du, Jianshu Zhang, Baocai Yin, Bing Yin, Cong Liu:
SEMv2: Table separation line detection based on instance segmentation. Pattern Recognit. 149: 110279 (2024) - [j62]Hang Chen, Qing Wang, Jun Du, Bao-Cai Yin, Jia Pan, Chin-Hui Lee:
Optimizing Audio-Visual Speech Enhancement Using Multi-Level Distortion Measures for Audio-Visual Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2508-2521 (2024) - [j61]Zilu Guo, Qing Wang, Jun Du, Jia Pan, Qing-Feng Liu, Chin-Hui Lee:
A Variance-Preserving Interpolation Approach for Diffusion Models With Applications to Single Channel Speech Enhancement and Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3025-3038 (2024) - [j60]Hang Chen, Qing Wang, Jun Du, Genshun Wan, Shifu Xiong, Baocai Yin, Jia Pan, Chin-Hui Lee:
Collaborative Viseme Subword and End-to-End Modeling for Word-Level Lip Reading. IEEE Trans. Multim. 26: 9358-9371 (2024) - [c203]Yusheng Dai, Hang Chen, Jun Du, Ruoyu Wang, Shihao Chen, Haotian Wang, Chin-Hui Lee:
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition. CVPR 2024: 27435-27445 - [c202]Chenyu Liu, Jia Pan, Jinshui Hu, Baocai Yin, Bing Yin, Mingjun Chen, Cong Liu, Jun Du, Qingfeng Liu:
NAMER: Non-autoregressive Modeling for Handwritten Mathematical Expression Recognition. ECCV (57) 2024: 273-291 - [c201]Zhenrong Zhang, Shuhang Liu, Pengfei Hu, Jiefeng Ma, Jun Du, Jianshu Zhang, Yu Hu:
UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition. EMNLP (Findings) 2024: 6131-6143 - [c200]Hongbo Lan, Tianyou Cheng, Maokui He, Hang Chen, Jun Du:
The USTC System for Cadenza 2024 Challenge. ICASSP Workshops 2024: 57-58 - [c199]Hang Chen, Shilong Wu, Chenxi Wang, Jun Du, Chin-Hui Lee, Sabato Marco Siniscalchi, Shinji Watanabe, Jingdong Chen, Odette Scharenborg, Zhong-Qiu Wang, Bao-Cai Yin, Jia Pan:
Summary on the Multimodal Information-Based Speech Processing (MISP) 2023 Challenge. ICASSP Workshops 2024: 123-124 - [c198]Hanbo Cheng, Jun Du, Pengfei Hu, Jiefeng Ma, Zhenrong Zhang, Mobai Xue:
Viewing Writing as Video: Optical Flow based Multi-Modal Handwritten Mathematical Expression Recognition. ICASSP 2024: 5695-5699 - [c197]Shilong Wu, Chenxi Wang, Hang Chen, Yusheng Dai, Chenyue Zhang, Ruoyu Wang, Hongbo Lan, Jun Du, Chin-Hui Lee, Jingdong Chen, Sabato Marco Siniscalchi, Odette Scharenborg, Zhong-Qiu Wang, Jia Pan, Jianqing Gao:
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction. ICASSP 2024: 8351-8355 - [c196]Minghui Wu, Haitao Tang, Jiahuan Fan, Ruoyu Wang, Hang Chen, Yanyong Zhang, Jun Du, Hengshun Zhou, Lei Sun, Xin Fang, Tian Gao, Genshun Wan, Jia Pan, Jianqing Gao:
Implicit Enhancement of Target Speaker in Speaker-Adaptive ASR through Efficient Joint Optimization. ICASSP 2024: 10051-10055 - [c195]Gaobin Yang, Maokui He, Shutong Niu, Ruoyu Wang, Yanyan Yue, Shuangqing Qian, Shilong Wu, Jun Du, Chin-Hui Lee:
Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture. ICASSP 2024: 11626-11630 - [c194]Haotian Wang, Jun Du, Yusheng Dai, Chin-Hui Lee, Yuling Ren, Yu Liu:
Improving Multi-Modal Emotion Recognition Using Entropy-Based Fusion and Pruning-Based Network Architecture Optimization. ICASSP 2024: 11766-11770 - [c193]Feng Ma, Yanhui Tu, Maokui He, Ruoyu Wang, Shutong Niu, Lei Sun, Zhongfu Ye, Jun Du, Jia Pan, Chin-Hui Lee:
A Spatial Long-Term Iterative Mask Estimation Approach for Multi-Channel Speaker Diarization and Speech Recognition. ICASSP 2024: 12331-12335 - [c192]Zhongyuan Han, Jun Du, Mobai Xue, Jiefeng Ma, Pengfei Hu, Zhenrong Zhang:
Radical Similarity Based Model Optimization and Post-correction for Chinese Character Recognition. ICDAR (1) 2024: 152-168 - [c191]Mingjun Chen, Hao Wu, Qikai Chang, Hanbo Cheng, Jiefeng Ma, Pengfei Hu, Zhenrong Zhang, Chenyu Liu, Changpeng Pi, Jinshui Hu, Baocai Yin, Bing Yin, Cong Liu, Jun Du:
ICDAR 2024 Competition on Recognition of Chemical Structures. ICDAR (6) 2024: 397-409 - [c190]Yicheng Pan, Zhenrong Zhang, Jiefeng Ma, Pengfei Hu, Jun Du, Qing Wang, Jianshu Zhang, Dan Liu, Si Wei:
Maths: Multimodal Transformer-Based Human-Readable Solver. ICME 2024: 1-6 - [c189]Ya Jiang, Qing Wang, Jun Du, Maocheng Hu, Pengfei Hu, Zeyan Liu, Shi Cheng, Zhaoxu Nian, Yuxuan Dong, Mingqi Cai, Xin Fang, Chin-Hui Lee:
Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios. ICME 2024: 1-6 - [c188]Qing Wang, Guirui Zhong, Hengyi Hong, Lei Wang, Mingqi Cai, Xin Fang, Ya Jiang, Jun Du:
The NERCSLIP-USTC System for Semi-Supervised Acoustic Scene Classification of ICME 2024 Grand Challenge. ICME Workshops 2024: 1-4 - [c187]Chen-Yue Zhang, Hang Chen, Jun Du, Sabato Marco Siniscalchi, Ya Jiang, Chin-Hui Lee:
Summary on the Chat-Scenario Chinese Lipreading (ChatCLR) Challenge. ICME Workshops 2024: 1-6 - [c186]Chunxia Qin, Zhenrong Zhang, Pengfei Hu, Chenyu Liu, Jiefeng Ma, Jun Du:
SEMv3: A Fast and Robust Approach to Table Separation Line Detection. IJCAI 2024: 1191-1199 - [c185]Shuxian Wang, Qing Wang, Jun Du, Lei Wang, Fan Chu, Yuxuan Zhou, Mingqi Cai, Xin Fang:
Representation Learning Using Machine Attribute Information for Anomalous Sound Detection in Real Scenarios. IJCNN 2024: 1-7 - [i75]Hanbo Cheng, Chenyu Liu, Pengfei Hu, Zhenrong Zhang, Jiefeng Ma, Jun Du:
Bidirectional Trained Tree-Structured Decoder for Handwritten Mathematical Expression Recognition. CoRR abs/2401.00435 (2024) - [i74]Yusheng Dai, Hang Chen, Jun Du, Ruoyu Wang, Shihao Chen, Jiefeng Ma, Haotian Wang, Chin-Hui Lee:
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition. CoRR abs/2403.04245 (2024) - [i73]Liang Zou, Genwei Yan, Ruoyu Wang, Jun Du, Meng Lei, Tian Gao, Xin Fang:
Multitask frame-level learning for few-shot sound event detection. CoRR abs/2403.11091 (2024) - [i72]Chunxia Qin, Zhenrong Zhang, Pengfei Hu, Chenyu Liu, Jiefeng Ma, Jun Du:
SEMv3: A Fast and Robust Approach to Table Separation Line Detection. CoRR abs/2405.11862 (2024) - [i71]Chang Li, Ruoyu Wang, Lijuan Liu, Jun Du, Yixuan Sun, Zilu Guo, Zhenrong Zhang, Yuan Jiang:
Quality-aware Masked Diffusion Transformer for Enhanced Music Generation. CoRR abs/2405.15863 (2024) - [i70]Rong Gong, Hongfei Xue, Lezhi Wang, Xin Xu, Qisheng Li, Lei Xie, Hui Bu, Shaomei Wu, Jiaming Zhou, Yong Qin, Binbin Zhang, Jun Du, Jia Bin, Ming Li:
AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection. CoRR abs/2406.07256 (2024) - [i69]Jiefeng Ma, Yan Wang, Chenyu Liu, Jun Du, Yu Hu, Zhenrong Zhang, Pengfei Hu, Qing Wang, Jianshu Zhang:
SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding. CoRR abs/2406.08757 (2024) - [i68]Ming Gao, Hang Chen, Jun Du, Xin Xu, Hongxiao Guo, Hui Bu, Jianxing Yang, Ming Li, Chin-Hui Lee:
Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech Corpus Release and Customized System Design. CoRR abs/2406.10304 (2024) - [i67]Chenyu Liu, Jia Pan, Jinshui Hu, Baocai Yin, Bing Yin, Mingjun Chen, Cong Liu, Jun Du, Qingfeng Liu:
NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition. CoRR abs/2407.11380 (2024) - [i66]Shutong Niu, Ruoyu Wang, Jun Du, Gaobin Yang, Yanhui Tu, Siyuan Wu, Shuangqing Qian, Huaxin Wu, Haitao Xu, Xueyang Zhang, Guolong Zhong, Xindi Yu, Jieru Chen, Mengzhi Wang, Di Cai, Tian Gao, Genshun Wan, Feng Ma, Jia Pan, Jianqing Gao:
The USTC-NERCSLIP Systems for the CHiME-8 NOTSOFAR-1 Challenge. CoRR abs/2409.02041 (2024) - [i65]Hongfei Xue, Rong Gong, Mingchen Shao, Xin Xu, Lezhi Wang, Lei Xie, Hui Bu, Jiaming Zhou, Yong Qin, Jun Du, Ming Li, Binbin Zhang, Bin Jia:
Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge. CoRR abs/2409.05430 (2024) - [i64]Pengfei Hu, Zhenrong Zhang, Jiefeng Ma, Shuhang Liu, Jun Du, Jianshu Zhang:
DocMamba: Efficient Document Pre-training with State Space Model. CoRR abs/2409.11887 (2024) - [i63]Zhenrong Zhang, Shuhang Liu, Pengfei Hu, Jiefeng Ma, Jun Du, Jianshu Zhang, Yu Hu:
UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition. CoRR abs/2409.13148 (2024) - [i62]Ruoyu Wang, Shutong Niu, Gaobin Yang, Jun Du, Shuangqing Qian, Tian Gao, Jia Pan:
Incorporating Spatial Cues in Modular Speaker Diarization for Multi-channel Multi-party Meetings. CoRR abs/2409.16803 (2024) - [i61]Shuhang Liu, Zhenrong Zhang, Pengfei Hu, Jiefeng Ma, Jun Du, Qing Wang, Jianshu Zhang, Chenyu Liu:
See then Tell: Enhancing Key Information Extraction with Vision Grounding. CoRR abs/2409.19573 (2024) - [i60]Hanbo Cheng, Limin Lin, Chenyu Liu, Pengcheng Xia, Pengfei Hu, Jiefeng Ma, Jun Du, Jia Pan:
DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation. CoRR abs/2410.13726 (2024) - [i59]Mao-Kui He, Jun Du, Shutong Niu, Qing-Feng Liu, Chin-Hui Lee:
Quality-Aware End-to-End Audio-Visual Neural Speaker Diarization. CoRR abs/2410.22350 (2024) - [i58]Shutong Niu, Jun Du, Ruoyu Wang, Gaobin Yang, Tian Gao, Jia Pan, Yu Hu:
DCF-DS: Deep Cascade Fusion of Diarization and Separation for Speech Recognition under Realistic Single-Channel Conditions. CoRR abs/2411.06667 (2024) - [i57]Haotian Wang, Yuzhe Weng, Yueyan Li, Zilu Guo, Jun Du, Shutong Niu, Jiefeng Ma, Shan He, Xiaoyan Wu, Qiming Hu, Bing Yin, Cong Liu, Qingfeng Liu:
EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion. CoRR abs/2411.16726 (2024)
- 2023
- [j59]Mobai Xue, Jun Du, Bin Wang, Bo Ren, Yu Hu:
Joint optimization for attention-based generation and recognition of chinese characters using tree position embedding. Pattern Recognit. 140: 109538 (2023) - [j58]Shi Cheng, Jun Du, Shutong Niu, Alejandrina Cristià, Xin Wang, Qing Wang, Chin-Hui Lee:
Using iterative adaptation and dynamic mask for child speech extraction under real-world multilingual conditions. Speech Commun. 152: 102956 (2023) - [j57]Li Chai, Hang Chen, Jun Du, Qing-Feng Liu, Chin-Hui Lee:
Space-and-speaker-aware acoustic modeling with effective data augmentation for recognition of multi-array conversational speech. Speech Commun. 153: 102958 (2023) - [j56]Jie Zhang, Rui Tao, Jun Du, Li-Rong Dai:
Energy-Efficient Sparsity-Driven Speech Enhancement in Wireless Acoustic Sensor Networks. IEEE ACM Trans. Audio Speech Lang. Process. 31: 215-228 (2023) - [j55]Shutong Niu, Jun Du, Lei Sun, Yu Hu, Chin-Hui Lee:
QDM-SSD: Quality-Aware Dynamic Masking for Separation-Based Speaker Diarization. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1037-1049 (2023) - [j54]Qing Wang, Jun Du, Huaxin Wu, Jia Pan, Feng Ma, Chin-Hui Lee:
A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1251-1264 (2023) - [j53]Mao-Kui He, Jun Du, Qing-Feng Liu, Chin-Hui Lee:
ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1561-1573 (2023) - [j52]Jie Zhang, Rui Tao, Jun Du, Li-Rong Dai:
SDW-SWF: Speech Distortion Weighted Single-Channel Wiener Filter for Noise Reduction. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3176-3189 (2023) - [j51]Yunqing Li, Jun Du, Jianshu Zhang, Changjie Wu:
A Tree-Structure Analysis Network on Handwritten Chinese Character Error Correction. IEEE Trans. Multim. 25: 3615-3627 (2023) - [j50]Zhenrong Zhang, Jiefeng Ma, Jun Du, Licheng Wang, Jianshu Zhang:
Multimodal Pre-Training Based on Graph Attention Network for Document Understanding. IEEE Trans. Multim. 25: 6743-6755 (2023) - [c184]Jiefeng Ma, Jun Du, Pengfei Hu, Zhenrong Zhang, Jianshu Zhang, Huihui Zhu, Cong Liu:
HRDoc: Dataset and Baseline Method toward Hierarchical Reconstruction of Document Structures. AAAI 2023: 1870-1877 - [c183]Hang Chen, Jun Du, Zhe Wang, Chenxi Wang, Yuling Ren, Qinglong Li, Ruibo Liu, Chin-Hui Lee:
Correlated Multi-Level Speech Enhancement for Robust Real-World ASR Applications Using Mask-Waveform-Feature Optimization. APSIPA ASC 2023: 96-101 - [c182]Chang Wang, Jun Du, Hang Chen, Ruoyu Wang, Chao-Han Huck Yang, Jiangjiang Zhao, Yuling Ren, Qinglong Li, Chin-Hui Lee:
Enhancing Privacy Preservation with Quantum Computing for Word-Level Audio-Visual Speech Recognition. APSIPA ASC 2023: 635-642 - [c181]Shi Cheng, Jun Du, Qing Wang, Ya Jiang, Zhaoxu Nian, Shutong Niu, Chin-Hui Lee, Yu Gao, Wenbin Zhang:
Improving Sound Event Localization and Detection with Class-Dependent Sound Separation for Real-World Scenarios. APSIPA ASC 2023: 2068-2073 - [c180]Shilong Wu, Jun Du, Mao-Kui He, Shutong Niu, Hang Chen, Haitao Tang, Chin-Hui Lee:
Semi-Supervised Multi-Channel Speaker Diarization With Cross-Channel Attention. ASRU 2023: 1-8 - [c179]Yan Wang, Jun Du, Jiefeng Ma, Pengfei Hu, Zhenrong Zhang, Jianshu Zhang:
USTC-iFLYTEK at DocILE: A Multi-modal Approach Using Domain-specific GraphDoc. CLEF (Working Notes) 2023: 598-610 - [c178]Junyi Xie, Jiefeng Ma, Xinnan Zhang, Jianshu Zhang, Jun Du:
Enhancing Math Word Problem Solving Through Salient Clue Prioritization: A Joint Token-Phrase-Level Feature Integration Approach. IALP 2023: 252-257 - [c177]Hang Chen, Shilong Wu, Yusheng Dai, Zhe Wang, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Diyuan Liu, Bao-Cai Yin, Jia Pan, Jianqing Gao, Cong Liu:
Summary on the Multimodal Information Based Speech Processing (MISP) 2022 Challenge. ICASSP 2023: 1-2 - [c176]Ya Jiang, Hang Chen, Jun Du, Qing Wang, Chin-Hui Lee:
Incorporating Lip Features into Audio-Visual Multi-Speaker DOA Estimation by Gated Fusion. ICASSP 2023: 1-5 - [c175]Shutong Niu, Jun Du, Qing Wang, Li Chai, Huaxin Wu, Zhaoxu Nian, Lei Sun, Yi Fang, Jia Pan, Chin-Hui Lee:
An Experimental Study on Sound Event Localization and Detection Under Realistic Testing Conditions. ICASSP 2023: 1-5 - [c174]Ahmed M. A. Shaalan, Jun Du:
Super Dilated Nested Arrays with Ideal Critical Weights and Increased Degrees of Freedom. ICASSP 2023: 1-5 - [c173]Ruoyu Wang, Jun Du, Tian Gao:
Quantum Transfer Learning Using the Large-Scale Unsupervised Pre-Trained Model Wavlm-Large for Synthetic Speech Detection. ICASSP 2023: 1-5 - [c172]Qing Wang, Jun Du, Zhaoxu Nian, Shutong Niu, Li Chai, Huaxin Wu, Jia Pan, Chin-Hui Lee:
Loss Function Design for DNN-Based Sound Event Localization and Detection on Low-Resource Realistic Data. ICASSP 2023: 1-5 - [c171]Zhe Wang, Shilong Wu, Hang Chen, Mao-Kui He, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Diyuan Liu, Baocai Yin, Jia Pan, Jianqing Gao, Cong Liu:
The Multimodal Information Based Speech Processing (Misp) 2022 Challenge: Audio-Visual Diarization And Recognition. ICASSP 2023: 1-5 - [c170]Chenyue Zhang, Hang Chen, Jun Du, Bao-Cai Yin, Jia Pan, Chin-Hui Lee:
Incorporating Visual Information Reconstruction into Progressive Learning for Optimizing audio-visual Speech Enhancement. ICASSP 2023: 1-5 - [c169]Xinzhe Jiang, Jun Du, Pengfei Hu, Mobai Xue, Jiefeng Ma, Jiajia Wu, Jianshu Zhang:
Group, Contrast and Recognize: A Self-supervised Method for Chinese Character Recognition. ICDAR (4) 2023: 411-427 - [c168]Jinshui Hu, Chenyu Liu, Qiandong Yan, Xuyang Zhu, Jiajia Wu, Jun Du, Li-Rong Dai:
Vision-Language Adaptive Mutual Decoder for OOV-STR. ICIG (2) 2023: 175-186 - [c167]Xueyang Zhang, Shuxian Wang, Jun Du, Genwei Yan, Jigang Tang, Tian Gao, Xin Fang, Jia Pan, Jianqing Gao:
Frame-Level Embedding Learning for Few-shot Bioacoustic Event Detection. ICME 2023: 750-755 - [c166]Yusheng Dai, Hang Chen, Jun Du, Xiaofei Ding, Ning Ding, Feijun Jiang, Chin-Hui Lee:
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder. ICME 2023: 2627-2632 - [c165]Gaobin Yang, Jun Du, Maokui He, Shutong Niu, Baoxiang Li, Jiakui Li, Chin-Hui Lee:
AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in the SUPERB Benchmark. INTERSPEECH 2023: 421-425 - [c164]Zilu Guo, Jun Du, Chin-Hui Lee, Yu Gao, Wenbin Zhang:
Variance-Preserving-Based Interpolation Diffusion Models for Speech Enhancement. INTERSPEECH 2023: 1065-1069 - [c163]Haotian Wang, Jun Du, Hengshun Zhou, Chin-Hui Lee, Yuling Ren, Jiangjiang Zhao:
A Multiple-Teacher Pruning Based Self-Distillation (MT-PSD) Approach to Model Compression for Audio-Visual Wake Word Spotting. INTERSPEECH 2023: 2678-2682 - [c162]Shutong Niu, Jun Du, Maokui He, Chin-Hui Lee, Baoxiang Li, Jiakui Li:
Unsupervised Adaptation with Quality-Aware Masking to Improve Target-Speaker Voice Activity Detection for Speaker Diarization. INTERSPEECH 2023: 3482-3486 - [c161]Jinshui Hu, Hao Wu, Mingjun Chen, Chenyu Liu, Jiajia Wu, Shi Yin, Baocai Yin, Bing Yin, Cong Liu, Jun Du, Lirong Dai:
Handwritten Chemical Structure Image to Structure-Specific Markup Using Random Conditional Guided Decoder. ACM Multimedia 2023: 8114-8124 - [c160]Haotian Wang, Yuxuan Xi, Hang Chen, Jun Du, Yan Song, Qing Wang, Hengshun Zhou, Chenxi Wang, Jiefeng Ma, Pengfei Hu, Ya Jiang, Shi Cheng, Jie Zhang, Yuzhe Weng:
Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023. ACM Multimedia 2023: 9531-9535 - [i56]Zhenrong Zhang, Pengfei Hu, Jiefeng Ma, Jun Du, Jianshu Zhang, Huihui Zhu, Baocai Yin, Bing Yin, Cong Liu:
SEMv2: Table Separation Line Detection Based on Conditional Convolution. CoRR abs/2303.04384 (2023) - [i55]Zhe Wang, Shilong Wu, Hang Chen, Mao-Kui He, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Diyuan Liu, Baocai Yin, Jia Pan, Jianqing Gao, Cong Liu:
The Multimodal Information based Speech Processing (MISP) 2022 Challenge: Audio-Visual Diarization and Recognition. CoRR abs/2303.06326 (2023) - [i54]Jiefeng Ma, Jun Du, Pengfei Hu, Zhenrong Zhang, Jianshu Zhang, Huihui Zhu, Cong Liu:
HRDoc: Dataset and Baseline Method Toward Hierarchical Reconstruction of Document Structures. CoRR abs/2303.13839 (2023) - [i53]Zilu Guo, Jun Du, Chin-Hui Lee, Yu Gao, Wenbin Zhang:
Variance-Preserving-Based Interpolation Diffusion Models for Speech Enhancement. CoRR abs/2306.08527 (2023) - [i52]Pengfei Hu, Jiefeng Ma, Zhenrong Zhang, Jun Du, Jianshu Zhang:
Count, Decode and Fetch: A New Approach to Handwritten Chinese Character Error Correction. CoRR abs/2307.16253 (2023) - [i51]Yusheng Dai, Hang Chen, Jun Du, Xiaofei Ding, Ning Ding, Feijun Jiang, Chin-Hui Lee:
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder. CoRR abs/2308.08488 (2023) - [i50]Ruoyu Wang, Maokui He, Jun Du, Hengshun Zhou, Shutong Niu, Hang Chen, Yanyan Yue, Gaobin Yang, Shilong Wu, Lei Sun, Yanhui Tu, Haitao Tang, Shuangqing Qian, Tian Gao, Mengzhi Wang, Genshun Wan, Jia Pan, Jianqing Gao, Chin-Hui Lee:
The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge. CoRR abs/2308.14638 (2023) - [i49]Haotian Wang, Yuxuan Xi, Hang Chen, Jun Du, Yan Song, Qing Wang, Hengshun Zhou, Chenxi Wang, Jiefeng Ma, Pengfei Hu, Ya Jiang, Shi Cheng, Jie Zhang, Yuzhe Weng:
Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023. CoRR abs/2309.07925 (2023) - [i48]Shilong Wu, Chenxi Wang, Hang Chen, Yusheng Dai, Chenyue Zhang, Ruoyu Wang, Hongbo Lan, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Zhong-Qiu Wang, Jia Pan, Jianqing Gao:
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction. CoRR abs/2309.08348 (2023) - [i47]Gaobin Yang, Maokui He, Shutong Niu, Ruoyu Wang, Yanyan Yue, Shuangqing Qian, Shilong Wu, Jun Du, Chin-Hui Lee:
Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture. CoRR abs/2309.09180 (2023)
- 2022
- [j49]Zi-Rui Wang, Jun Du:
Fast writer adaptation with style extractor network for handwritten text recognition. Neural Networks 147: 42-52 (2022) - [j48]Jiajia Wu, Jun Du, Fengren Wang, Chen Yang, Xinzhe Jiang, Jinshui Hu, Bing Yin, Jianshu Zhang, Lirong Dai:
A multimodal attention fusion network with a dynamic vocabulary for TextVQA. Pattern Recognit. 122: 108214 (2022) - [j47]Zhenrong Zhang, Jianshu Zhang, Jun Du, Fengren Wang:
Split, Embed and Merge: An accurate table structure recognizer. Pattern Recognit. 126: 108565 (2022) - [j46]Chen Yang, Jun Du, Jianshu Zhang, Changjie Wu, Mingjun Chen, Jiajia Wu:
Tree-based data augmentation and mutual learning for offline handwritten mathematical expression recognition. Pattern Recognit. 132: 108910 (2022) - [j45]Ahmed M. A. Shaalan, Jun Du, Yanhui Tu:
Dilated Nested Arrays With More Degrees of Freedom (DOFs) and Less Mutual Coupling - Part I: The Fundamental Geometry. IEEE Trans. Signal Process. 70: 2518-2531 (2022) - [c159]Changjie Wu, Jun Du, Yunqing Li, Jianshu Zhang, Chen Yang, Bo Ren, Yiqing Hu:
TDv2: A Novel Tree-Structured Decoder for Offline Mathematical Expression Recognition. AAAI 2022: 2694-2702 - [c158]Yanyan Yue, Jun Du, Maokui He:
Online Neural Speaker Diarization with Core Samples. CCBR 2022: 364-372 - [c157]Ruoyu Wang, Jun Du, Chang Wang:
Multi-branch Network with Circle Loss Using Voice Conversion and Channel Robust Data Augmentation for Synthetic Speech Detection. CCBR 2022: 613-620 - [c156]Ahmed M. A. Shaalan, Jun Du:
The Prototype Co-Prime Array with a Robust Difference Co-Array. ICASSP 2022: 5048-5052 - [c155]Zhaoxu Nian, Jun Du, Yu Ting Yeung, Renyu Wang:
A Time Domain Progressive Learning Approach with SNR Constriction for Single-Channel Speech Enhancement and Recognition. ICASSP 2022: 6277-6281 - [c154]Hengshun Zhou, Jun Du, Chao-Han Huck Yang, Shifu Xiong, Chin-Hui Lee:
A Study of Designing Compact Audio-Visual Wake Word Spotting System Based on Iterative Fine-Tuning in Neural Network Pruning. ICASSP 2022: 7572-7576 - [c153]Shutong Niu, Jun Du, Lei Sun, Chin-Hui Lee:
Improving Separation-Based Speaker Diarization Via Iterative Model Refinement And Speaker Embedding Based Post-Processing. ICASSP 2022: 8387-8391 - [c152]Maokui He, Xiang Lv, Weilin Zhou, Jingjing Yin, Xiaoqi Zhang, Yuxuan Wang, Shutong Niu, Yuhang Cao, Heng Lu, Jun Du, Chin-Hui Lee:
The USTC-Ximalaya System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription (M2met) Challenge. ICASSP 2022: 9166-9170 - [c151]Hang Chen, Hengshun Zhou, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Diyuan Liu, Bao-Cai Yin, Jia Pan, Jianqing Gao, Cong Liu:
The First Multimodal Information Based Speech Processing (Misp) Challenge: Data, Tasks, Baselines And Results. ICASSP 2022: 9266-9270 - [c150]Nimol Thuon, Jun Du, Jianshu Zhang:
Improving Isolated Glyph Classification Task for Palm Leaf Manuscripts. ICFHR 2022: 65-79 - [c149]Xinzhe Jiang, Jianshu Zhang, Jun Du, Zhenrong Zhang, Jiajia Wu:
Scene Text Recognition with Self-supervised Contrastive Predictive Coding. ICPR 2022: 1514-1521 - [c148]Pengfei Hu, Zhenrong Zhang, Jianshu Zhang, Jun Du, Jiajia Wu:
Multimodal Tree Decoder for Table of Contents Extraction in Document Images. ICPR 2022: 1756-1762 - [c147]Hengshun Zhou, Jun Du, Gongzhen Zou, Zhaoxu Nian, Chin-Hui Lee, Sabato Marco Siniscalchi, Shinji Watanabe, Odette Scharenborg, Jingdong Chen, Shifu Xiong, Jianqing Gao:
Audio-Visual Wake Word Spotting in MISP2021 Challenge: Dataset Release and Deep Analysis. INTERSPEECH 2022: 1111-1115 - [c146]Mao-Kui He, Jun Du, Chin-Hui Lee:
End-to-End Audio-Visual Neural Speaker Diarization. INTERSPEECH 2022: 1461-1465 - [c145]Yanyan Yue, Jun Du, Mao-Kui He, Yu Ting Yeung, Renyu Wang:
Online Speaker Diarization with Core Samples Selection. INTERSPEECH 2022: 1466-1470 - [c144]Hang Chen, Jun Du, Yusheng Dai, Chin-Hui Lee, Sabato Marco Siniscalchi, Shinji Watanabe, Odette Scharenborg, Jingdong Chen, Baocai Yin, Jia Pan:
Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis. INTERSPEECH 2022: 1766-1770 - [c143]Yajian Wang, Jun Du, Hang Chen, Qing Wang, Chin-Hui Lee:
Deep Segment Model for Acoustic Scene Classification. INTERSPEECH 2022: 4177-4181 - [c142]Guolong Zhong, Hongyu Song, Ruoyu Wang, Lei Sun, Diyuan Liu, Jia Pan, Xin Fang, Jun Du, Jie Zhang, Lirong Dai:
External Text Based Data Augmentation for Low-Resource Speech Recognition in the Constrained Condition of OpenASR21 Challenge. INTERSPEECH 2022: 4860-4864 - [c141]Qing Wang, Hang Chen, Ya Jiang, Zhe Wang, Yuyang Wang, Jun Du, Chin-Hui Lee:
Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function. ISCSLP 2022: 250-254 - [c140]Chenxi Wang, Hang Chen, Jun Du, Baocai Yin, Jia Pan:
Multi-Task Joint Learning for Embedding Aware Audio-Visual Speech Enhancement. ISCSLP 2022: 255-259 - [c139]Qing Wang, Jun Du, Siyuan Zheng, Yunqing Li, Yajian Wang, Yuzhong Wu, Hu Hu, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Yannan Wang, Chin-Hui Lee:
A Study on Joint Modeling and Data Augmentation of Multi-Modalities for Audio-Visual Scene Classification. ISCSLP 2022: 453-457 - [i46]Maokui He, Xiang Lv, Weilin Zhou, Jingjing Yin, Xiaoqi Zhang, Yuxuan Wang, Shutong Niu, Yuhang Cao, Heng Lu, Jun Du, Chin-Hui Lee:
The USTC-Ximalaya system for the ICASSP 2022 multi-channel multi-party meeting transcription (M2MeT) challenge. CoRR abs/2202.04855 (2022) - [i45]Hengshun Zhou, Jun Du, Chao-Han Huck Yang, Shifu Xiong, Chin-Hui Lee:
A Study of Designing Compact Audio-Visual Wake Word Spotting System Based on Iterative Fine-Tuning in Neural Network Pruning. CoRR abs/2202.08509 (2022) - [i44]Qing Wang, Jun Du, Siyuan Zheng, Yunqing Li, Yajian Wang, Yuzhong Wu, Hu Hu, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Yannan Wang, Chin-Hui Lee:
A study on joint modeling and data augmentation of multi-modalities for audio-visual scene classification. CoRR abs/2203.04114 (2022) - [i43]Zhenrong Zhang, Jiefeng Ma, Jun Du, Licheng Wang, Jianshu Zhang:
Multimodal Pre-training Based on Graph Attention Network for Document Understanding. CoRR abs/2203.13530 (2022) - [i42]Qing Wang, Hang Chen, Ya Jiang, Zhe Wang, Yuyang Wang, Jun Du, Chin-Hui Lee:
Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function. CoRR abs/2210.14581 (2022) - [i41]Pengfei Hu, Zhenrong Zhang, Jianshu Zhang, Jun Du, Jiajia Wu:
Multimodal Tree Decoder for Table of Contents Extraction in Document Images. CoRR abs/2212.02896 (2022)
- 2021
- [j44]Jia Jia, Wei Chen, Kai Yu, Xiaodong He, Jun Du, Heung-Yeung Shum:
The practice of speech and language processing in China. Commun. ACM 64(11): 81-87 (2021) - [j43]Hang Chen, Jun Du, Yu Hu, Li-Rong Dai, Bao-Cai Yin, Chin-Hui Lee:
Correlating subword articulation with lip shapes for embedding aware audio-visual speech enhancement. Neural Networks 143: 171-182 (2021) - [j42]Yixing Zhu, Jun Du:
TextMountain: Accurate scene text detection via instance segmentation. Pattern Recognit. 110: 107336 (2021) - [j41]Zi-Rui Wang, Jun Du:
Joint architecture and knowledge distillation in CNN for Chinese text recognition. Pattern Recognit. 111: 107722 (2021) - [j40]Jiaming Wang, Jun Du, Jianshu Zhang, Bin Wang, Bo Ren:
Stroke constrained attention network for online handwritten mathematical expression recognition. Pattern Recognit. 119: 108047 (2021) - [j39]Li Chai, Jun Du, Qing-Feng Liu, Chin-Hui Lee:
A Cross-Entropy-Guided Measure (CEGM) for Assessing Speech Recognition Performance and Optimizing DNN-Based Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 29: 106-117 (2021) - [j38]Jie Zhang, Jun Du, Li-Rong Dai:
Sensor Selection for Relative Acoustic Transfer Function Steered Linearly-Constrained Beamformers. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1220-1232 (2021) - [j37]Hengshun Zhou, Jun Du, Yuanyuan Zhang, Qing Wang, Qing-Feng Liu, Chin-Hui Lee:
Information Fusion in Attention Networks Using Adaptive and Multi-Level Factorized Bilinear Pooling for Audio-Visual Emotion Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 29: 2617-2629 (2021) - [j36]Jianshu Zhang, Jun Du, Yongxin Yang, Yi-Zhe Song, Lirong Dai:
SRD: A Tree Structure Based Decoder for Online Handwritten Mathematical Expression Recognition. IEEE Trans. Multim. 23: 2471-2480 (2021) - [c138]Xin Fang, Zhen-Hua Ling, Lei Sun, Shutong Niu, Jun Du, Cong Liu, Zhi-Chao Sheng:
A Deep Analysis of Speech Separation Guided Diarization Under Realistic Conditions. APSIPA ASC 2021: 667-671 - [c137]Qifeng Zeng, Jun Du, Zirui Wang:
HMM-based Lip Reading with Stingy Residual 3D Convolution. APSIPA ASC 2021: 1438-1443 - [c136]Koen Oostermeijer, Jun Du, Qing Wang, Chin-Hui Lee:
Speech Enhancement Autoencoder with Hierarchical Latent Structure. ICASSP 2021: 671-675 - [c135]Hu Hu, Chao-Han Huck Yang, Xianjun Xia, Xue Bai, Xin Tang, Yajian Wang, Shutong Niu, Li Chai, Juanjuan Li, Hongning Zhu, Feng Bao, Yuanjun Zhao, Sabato Marco Siniscalchi, Yannan Wang, Jun Du, Chin-Hui Lee:
A Two-Stage Approach to Device-Robust Acoustic Scene Classification. ICASSP 2021: 845-849 - [c134]Ahmed M. A. Shaalan, Jun Du, Yanhui Tu:
TCLA Array: A New Sparse Array Design with Less Mutual Coupling. ICASSP 2021: 4605-4609 - [c133]Zhaoxu Nian, Yan-Hui Tu, Jun Du, Chin-Hui Lee:
A Progressive Learning Approach to Adaptive Noise and Speech Estimation for Speech Enhancement and Noisy Speech Recognition. ICASSP 2021: 6913-6917 - [c132]Jiaming Wang, Qing Wang, Jun Du, Jianshu Zhang, Bin Wang, Bo Ren:
MRD: A Memory Relation Decoder for Online Handwritten Mathematical Expression Recognition. ICDAR (3) 2021: 39-54 - [c131]Mobai Xue, Jun Du, Jianshu Zhang, Zi-Rui Wang, Bin Wang, Bo Ren:
Radical Composition Network for Chinese Character Generation. ICDAR (1) 2021: 252-267 - [c130]Zhenrong Zhang, Jun Du:
Accurate Oriented Instance Segmentation in Aerial Images. ICIG (1) 2021: 160-170 - [c129]Jiefeng Ma, Zirui Wang, Jun Du:
An Open-Source Library of 2D-GMM-HMM Based on Kaldi Toolkit and Its Application to Handwritten Chinese Character Recognition. ICIG (1) 2021: 235-244 - [c128]Hengshun Zhou, Jun Du, Hang Chen, Zijun Jing, Shifu Xiong, Chin-Hui Lee:
Audio-Visual Information Fusion Using Cross-Modal Teacher-Student Learning for Voice Activity Detection in Realistic Environments. Interspeech 2021: 341-345 - [c127]Xiaoqi Zhang, Jun Du, Li Chai, Chin-Hui Lee:
A Maximum Likelihood Approach to SNR-Progressive Learning Using Generalized Gaussian Distribution for LSTM-Based Speech Enhancement. Interspeech 2021: 2701-2705 - [c126]Koen Oostermeijer, Qing Wang, Jun Du:
Lightweight Causal Transformer with Local Self-Attention for Real-Time Speech Enhancement. Interspeech 2021: 2831-2835 - [c125]Hang Chen, Jun Du, Yu Hu, Li-Rong Dai, Bao-Cai Yin, Chin-Hui Lee:
Automatic Lip-Reading with Hierarchical Pyramidal Convolution and Self-Attention for Image Sequences with No Word Boundaries. Interspeech 2021: 3001-3005 - [c124]Yu-Xuan Wang, Jun Du, Maokui He, Shutong Niu, Lei Sun, Chin-Hui Lee:
Scenario-Dependent Speaker Diarization for DIHARD-III Challenge. Interspeech 2021: 3106-3110 - [c123]Maokui He, Desh Raj, Zili Huang, Jun Du, Zhuo Chen, Shinji Watanabe:
Target-Speaker Voice Activity Detection with Improved i-Vector Estimation for Unknown Number of Speaker. Interspeech 2021: 3555-3559 - [c122]Neville Ryant, Prachi Singh, Venkat Krishnamohan, Rajat Varma, Kenneth Church, Christopher Cieri, Jun Du, Sriram Ganapathy, Mark Y. Liberman:
The Third DIHARD Diarization Challenge. Interspeech 2021: 3570-3574 - [c121]Yihui Fu, Luyao Cheng, Shubo Lv, Yukai Jv, Yuxiang Kong, Zhuo Chen, Yanxin Hu, Lei Xie, Jian Wu, Hui Bu, Xin Xu, Jun Du, Jingdong Chen:
AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario. Interspeech 2021: 3665-3669 - [c120]Qing Wang, Huaxin Wu, Zijun Jing, Feng Ma, Yi Fang, Yuxuan Wang, Tairan Chen, Jia Pan, Jun Du, Chin-Hui Lee:
A Model Ensemble Approach for Sound Event Localization and Detection. ISCSLP 2021: 1-5 - [c119]Siyuan Zheng, Jun Du, Hengshun Zhou, Xue Bai, Chin-Hui Lee, Shipeng Li:
Speech Emotion Recognition Based on Acoustic Segment Model. ISCSLP 2021: 1-5 - [c118]Desh Raj, Pavel Denisov, Zhuo Chen, Hakan Erdogan, Zili Huang, Maokui He, Shinji Watanabe, Jun Du, Takuya Yoshioka, Yi Luo, Naoyuki Kanda, Jinyu Li, Scott Wisdom, John R. Hershey:
Integration of Speech Separation, Diarization, and Recognition for Multi-Speaker Meetings: System Description, Comparison, and Analysis. SLT 2021: 897-904 - [c117]Li Chai, Jun Du, Diyuan Liu, Yanhui Tu, Chin-Hui Lee:
Acoustic Modeling for Multi-Array Conversational Speech Recognition in the CHiME-6 Challenge. SLT 2021: 912-918 - [i40]Qing Wang, Jun Du, Huaxin Wu, Jia Pan, Feng Ma, Chin-Hui Lee:
A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection. CoRR abs/2101.02919 (2021) - [i39]Yuxuan Wang, Mao-Kui He, Shutong Niu, Lei Sun, Tian Gao, Xin Fang, Jia Pan, Jun Du, Chin-Hui Lee:
USTC-NELSLIP System Description for DIHARD-III Challenge. CoRR abs/2103.10661 (2021) - [i38]Yihui Fu, Luyao Cheng, Shubo Lv, Yukai Jv, Yuxiang Kong, Zhuo Chen, Yanxin Hu, Lei Xie, Jian Wu, Hui Bu, Xin Xu, Jun Du, Jingdong Chen:
AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario. CoRR abs/2104.03603 (2021) - [i37]Chao-Han Huck Yang, Hu Hu, Sabato Marco Siniscalchi, Qing Wang, Yuyang Wang, Xianjun Xia, Yuanjun Zhao, Yuzhong Wu, Yannan Wang, Jun Du, Chin-Hui Lee:
A Lottery Ticket Hypothesis Framework for Low-Complexity Device-Robust Neural Acoustic Scene Classification. CoRR abs/2107.01461 (2021) - [i36]Shutong Niu, Jun Du, Lei Sun, Chin-Hui Lee:
Separation Guided Speaker Diarization in Realistic Mismatched Conditions. CoRR abs/2107.02357 (2021) - [i35]Zhenrong Zhang, Jianshu Zhang, Jun Du:
Split, embed and merge: An accurate table structure recognizer. CoRR abs/2107.05214 (2021) - [i34]Hengshun Zhou, Jun Du, Yuanyuan Zhang, Qing Wang, Qing-Feng Liu, Chin-Hui Lee:
Information Fusion in Attention Networks Using Adaptive and Multi-level Factorized Bilinear Pooling for Audio-visual Emotion Recognition. CoRR abs/2111.08910 (2021) - 2020
- [j35]Zi-Rui Wang, Jun Du, Jia-Ming Wang:
Writer-aware CNN for parsimonious HMM-based offline handwritten Chinese text recognition. Pattern Recognit. 100: 107102 (2020) - [j34]Jianshu Zhang, Jun Du, Lirong Dai:
Radical analysis network for learning hierarchies of Chinese characters. Pattern Recognit. 103: 107305 (2020) - [j33]Jun Qi, Jun Du, Sabato Marco Siniscalchi, Xiaoli Ma, Chin-Hui Lee:
On Mean Absolute Error for Deep Neural Network Based Vector-to-Vector Regression. IEEE Signal Process. Lett. 27: 1485-1489 (2020) - [j32]Jia Pan, Genshun Wan, Jun Du, Zhongfu Ye:
Online Speaker Adaptation Using Memory-Aware Networks for Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 28: 1025-1037 (2020) - [j31]Yanhui Tu, Jun Du, Tian Gao, Chin-Hui Lee:
A Multi-Target SNR-Progressive Learning Approach to Regression Based Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 28: 1608-1619 (2020) - [j30]Yixing Zhu, Jun Du, Xueqing Wu:
Adaptive Period Embedding for Representing Oriented Objects in Aerial Images. IEEE Trans. Geosci. Remote. Sens. 58(10): 7247-7257 (2020) - [j29]Jun Qi, Jun Du, Sabato Marco Siniscalchi, Xiaoli Ma, Chin-Hui Lee:
Analyzing Upper Bounds on Mean Absolute Errors for Deep Neural Network-Based Vector-to-Vector Regression. IEEE Trans. Signal Process. 68: 3411-3422 (2020) - [c116]Koen Oostermeijer, Qing Wang, Jun Du:
Frequency Gating: Improved Convolutional Neural Networks for Speech Enhancement in the Time-Frequency Domain. APSIPA 2020: 465-470 - [c115]Jun Qi, Xiaoli Ma, Chin-Hui Lee, Jun Du, Sabato Marco Siniscalchi:
Performance Analysis for Tensor-Train Decomposition to Deep Neural Network Based Vector-to-Vector Regression. CISS 2020: 1-6 - [c114]Xue Bai, Jun Du, Jia Pan, Hengshun Zhou, Yanhui Tu, Chin-Hui Lee:
High-Resolution Attention Network with Acoustic Segment Model for Acoustic Scene Classification. ICASSP 2020: 656-660 - [c113]Shutong Niu, Jun Du, Li Chai, Chin-Hui Lee:
A Maximum Likelihood Approach to Multi-Objective Learning Using Generalized Gaussian Distributions for DNN-Based Speech Enhancement. ICASSP 2020: 6229-6233 - [c112]Yanhui Tu, Jun Du, Chin-Hui Lee:
2D-to-2D Mask Estimation for Speech Enhancement Based on Fully Convolutional Neural Network. ICASSP 2020: 6664-6668 - [c111]Bin Gu, Wu Guo, Lirong Dai, Jun Du:
An Improved Deep Neural Network for Modeling Speaker Characteristics at Different Temporal Scales. ICASSP 2020: 6814-6818 - [c110]Lei Sun, Jun Du, Xueyang Zhang, Tian Gao, Xin Fang, Chin-Hui Lee:
Progressive Multi-Target Network Based Speech Enhancement with SNR-Preselection for Robust Speaker Diarization. ICASSP 2020: 7099-7103 - [c109]Xin Wang, Jun Du, Alejandrina Cristià, Lei Sun, Chin-Hui Lee:
A Study of Child Speech Extraction Using Joint Speech Enhancement and Separation in Realistic Conditions. ICASSP 2020: 7304-7308 - [c108]Fenglin Ding, Wu Guo, Lirong Dai, Jun Du:
Attention-Based Gated Scaling Adaptive Acoustic Model for CTC-Based Speech Recognition. ICASSP 2020: 7404-7408 - [c107]Xin Tang, Jun Du, Li Chai, Yannan Wang, Qing Wang, Chin-Hui Lee:
Geometry Constrained Progressive Learning for LSTM-Based Speech Enhancement. ICASSP 2020: 7514-7518 - [c106]Jianshu Zhang, Jun Du, Yongxin Yang, Yi-Zhe Song, Si Wei, Lirong Dai:
A Tree-Structured Decoder for Image-to-Markup Generation. ICML 2020: 11076-11085 - [c105]Changjie Wu, Qing Wang, Jianshu Zhang, Jun Du, Jia-Ming Wang, Jiajia Wu, Jin-Shui Hu:
Stroke Based Posterior Attention for Online Handwritten Mathematical Expression Recognition. ICPR 2020: 2943-2949 - [c104]Chen Yang, Qing Wang, Jun Du, Jianshu Zhang, Changjie Wu, Jiaming Wang:
A Transformer-based Radical Analysis Network for Chinese Character Recognition. ICPR 2020: 3714-3719 - [c103]Yunqing Li, Yixing Zhu, Jun Du, Changjie Wu, Jianshu Zhang:
Radical Counter Network for Robust Chinese Character Recognition. ICPR 2020: 4191-4197 - [c102]Yanhui Tu, Jun Du, Lei Sun, Feng Ma, Jia Pan, Chin-Hui Lee:
A Space-and-Speaker-Aware Iterative Mask Estimation Approach to Multi-Channel Speech Recognition in the CHiME-6 Challenge. INTERSPEECH 2020: 96-100 - [c101]Fenglin Ding, Wu Guo, Bin Gu, Zhen-Hua Ling, Jun Du:
Unsupervised Regularization-Based Adaptive Training for Speech Recognition. INTERSPEECH 2020: 996-1000 - [c100]Hu Hu, Sabato Marco Siniscalchi, Yannan Wang, Xue Bai, Jun Du, Chin-Hui Lee:
An Acoustic Segment Model Based Segment Unit Selection Approach to Acoustic Scene Classification with Partial Utterances. INTERSPEECH 2020: 1201-1205 - [c99]Fenglin Ding, Wu Guo, Bin Gu, Zhen-Hua Ling, Jun Du:
Adaptive Speaker Normalization for CTC-Based Speech Recognition. INTERSPEECH 2020: 1266-1270 - [c98]Bin Gu, Wu Guo, Fenglin Ding, Zhen-Hua Ling, Jun Du:
An Adaptive X-Vector Model for Text-Independent Speaker Verification. INTERSPEECH 2020: 1506-1510 - [c97]Hengshun Zhou, Jun Du, Yanhui Tu, Chin-Hui Lee:
Using Speech Enhancement Preprocessing for Speech Emotion Recognition in Realistic Noisy Conditions. INTERSPEECH 2020: 4098-4102 - [c96]Yu-Xuan Wang, Jun Du, Li Chai, Chin-Hui Lee, Jia Pan:
A Noise-Aware Memory-Attention Network Architecture for Regression-Based Speech Enhancement. INTERSPEECH 2020: 4501-4505 - [c95]Leibny Paola García-Perera, Jesús Villalba, Hervé Bredin, Jun Du, Diego Castán, Alejandrina Cristià, Latané Bullock, Ling Guo, Koji Okabe, Phani Sankar Nidadavolu, Saurabh Kataria, Sizhu Chen, Léo Galmant, Marvin Lavechin, Lei Sun, Marie-Philippe Gill, Bar Ben-Yair, Sajjad Abdoli, Xin Wang, Wassim Bouaziz, Hadrien Titeux, Emmanuel Dupoux, Kong Aik Lee, Najim Dehak:
Speaker Detection in the Wild: Lessons Learned from JSALT 2019. Odyssey 2020: 415-422 - [i33]Fenglin Ding, Wu Guo, Lirong Dai, Jun Du:
Attentive batch normalization for lstm-based acoustic modeling of speech recognition. CoRR abs/2001.00129 (2020) - [i32]Jia-Ming Wang, Jun Du, Jianshu Zhang:
Stroke Constrained Attention Network for Online Handwritten Mathematical Expression Recognition. CoRR abs/2002.08670 (2020) - [i31]Neville Ryant, Kenneth Church, Christopher Cieri, Jun Du, Sriram Ganapathy, Mark Y. Liberman:
Third DIHARD Challenge Evaluation Plan. CoRR abs/2006.05815 (2020) - [i30]Hu Hu, Chao-Han Huck Yang, Xianjun Xia, Xue Bai, Xin Tang, Yajian Wang, Shutong Niu, Li Chai, Juanjuan Li, Hongning Zhu, Feng Bao, Yuanjun Zhao, Sabato Marco Siniscalchi, Yannan Wang, Jun Du, Chin-Hui Lee:
Device-Robust Acoustic Scene Classification Based on Two-Stage Categorization and Data Augmentation. CoRR abs/2007.08389 (2020) - [i29]Hu Hu, Sabato Marco Siniscalchi, Yannan Wang, Xue Bai, Jun Du, Chin-Hui Lee:
An Acoustic Segment Model Based Segment Unit Selection Approach to Acoustic Scene Classification with Partial Utterances. CoRR abs/2008.00107 (2020) - [i28]Jun Qi, Jun Du, Sabato Marco Siniscalchi, Xiaoli Ma, Chin-Hui Lee:
Analyzing Upper Bounds on Mean Absolute Errors for Deep Neural Network Based Vector-to-Vector Regression. CoRR abs/2008.05459 (2020) - [i27]Jun Qi, Jun Du, Sabato Marco Siniscalchi, Xiaoli Ma, Chin-Hui Lee:
On Mean Absolute Error for Deep Neural Network Based Vector-to-Vector Regression. CoRR abs/2008.07281 (2020) - [i26]Hang Chen, Jun Du, Yu Hu, Li-Rong Dai, Bao-Cai Yin, Chin-Hui Lee:
Correlating Subword Articulation with Lip Shapes for Embedding Aware Audio-Visual Speech Enhancement. CoRR abs/2009.09561 (2020) - [i25]Hu Hu, Chao-Han Huck Yang, Xianjun Xia, Xue Bai, Xin Tang, Yajian Wang, Shutong Niu, Li Chai, Juanjuan Li, Hongning Zhu, Feng Bao, Yuanjun Zhao, Sabato Marco Siniscalchi, Yannan Wang, Jun Du, Chin-Hui Lee:
A Two-Stage Approach to Device-Robust Acoustic Scene Classification. CoRR abs/2011.01447 (2020) - [i24]Desh Raj, Pavel Denisov, Zhuo Chen, Hakan Erdogan, Zili Huang, Mao-Kui He, Shinji Watanabe, Jun Du, Takuya Yoshioka, Yi Luo, Naoyuki Kanda, Jinyu Li, Scott Wisdom, John R. Hershey:
Integration of speech separation, diarization, and recognition for multi-speaker meetings: System description, comparison, and analysis. CoRR abs/2011.02014 (2020) - [i23]Koen Oostermeijer, Qing Wang, Jun Du:
Frequency Gating: Improved Convolutional Neural Networks for Speech Enhancement in the Time-Frequency Domain. CoRR abs/2011.04092 (2020) - [i22]Neville Ryant, Prachi Singh, Venkat Krishnamohan, Rajat Varma, Kenneth Church, Christopher Cieri, Jun Du, Sriram Ganapathy, Mark Y. Liberman:
The Third DIHARD Diarization Challenge. CoRR abs/2012.01477 (2020) - [i21]Hengshun Zhou, Debin Meng, Yuanyuan Zhang, Xiaojiang Peng, Jun Du, Kai Wang, Yu Qiao:
Exploring Emotion Features and Fusion Strategies for Audio-Video Emotion Recognition. CoRR abs/2012.13912 (2020) - [i20]Hang Chen, Jun Du, Yu Hu, Li-Rong Dai, Chin-Hui Lee, Bao-Cai Yin:
Lip-reading with Hierarchical Pyramidal Convolution and Self-Attention. CoRR abs/2012.14360 (2020)
2010 – 2019
- 2019
- [j28]Lei Sun, Jun Du, Tian Gao, Yi Fang, Feng Ma, Chin-Hui Lee:
A Speaker-Dependent Approach to Separation of Far-Field Multi-Talker Microphone Array Speech for Front-End Processing in the CHiME-5 Challenge. IEEE J. Sel. Top. Signal Process. 13(4): 827-840 (2019) - [j27]Yixing Zhu, Chixiang Ma, Jun Du:
Rotated cascade R-CNN: A shape robust detector with coordinate regression. Pattern Recognit. 96 (2019) - [j26]Yanhui Tu, Jun Du, Lei Sun, Feng Ma, Hai-Kun Wang, Jingdong Chen, Chin-Hui Lee:
An iterative mask estimation approach to deep learning based multi-channel speech recognition. Speech Commun. 106: 31-43 (2019) - [j25]Jianqing Gao, Jun Du, Enhong Chen:
Mixed-Bandwidth Cross-Channel Speech Recognition via Joint Optimization of DNN-Based Bandwidth Expansion and Acoustic Modeling. IEEE ACM Trans. Audio Speech Lang. Process. 27(3): 559-571 (2019) - [j24]Li Chai, Jun Du, Qing-Feng Liu, Chin-Hui Lee:
Using Generalized Gaussian Distributions to Improve Regression Error Modeling for Deep Learning-Based Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 27(12): 1919-1931 (2019) - [j23]Jun Qi, Jun Du, Sabato Marco Siniscalchi, Chin-Hui Lee:
A Theory on Deep Neural Network Based Vector-to-Vector Regression With an Illustration of Its Expressive Power in Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 27(12): 1932-1943 (2019) - [j22]Yanhui Tu, Jun Du, Chin-Hui Lee:
Speech Enhancement Based on Teacher-Student Deep Learning Using Improved Speech Presence Probability for Noise-Robust Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 27(12): 2080-2091 (2019) - [j21]Jianshu Zhang, Jun Du, Lirong Dai:
Track, Attend, and Parse (TAP): An End-to-End Framework for Online Handwritten Mathematical Expression Recognition. IEEE Trans. Multim. 21(1): 221-233 (2019) - [c94]Xin Tang, Jun Du, Li Chai, Yannan Wang, Qing Wang, Chin-Hui Lee:
A LSTM-Based Joint Progressive Learning Framework for Simultaneous Speech Dereverberation and Denoising. APSIPA 2019: 274-278 - [c93]Nan Zhou, Jun Du, Yanhui Tu, Tian Gao, Chin-Hui Lee:
A Speech Enhancement Neural Network Architecture with SNR-Progressive Multi-Target Learning for Robust Speech Recognition. APSIPA 2019: 873-877 - [c92]Yanhui Tu, Jun Du, Chin-Hui Lee:
DNN Training Based on Classic Gain Function for Single-channel Speech Enhancement and Recognition. ICASSP 2019: 910-914 - [c91]Lei Sun, Jun Du, Tian Gao, Yi Fang, Feng Ma, Jia Pan, Chin-Hui Lee:
A Two-stage Single-channel Speaker-dependent Speech Separation Approach for CHiME-5 Challenge. ICASSP 2019: 6650-6654 - [c90]Changjie Wu, Zi-Rui Wang, Jun Du, Jianshu Zhang, Jia-Ming Wang:
Joint Spatial and Radical Analysis Network For Distorted Chinese Character Recognition. WML@ICDAR 2019: 122-127 - [c89]Jia-Ming Wang, Jun Du, Jianshu Zhang, Zi-Rui Wang:
Multi-modal Attention Network for Handwritten Mathematical Expression Recognition. ICDAR 2019: 1181-1186 - [c88]Hengshun Zhou, Debin Meng, Yuanyuan Zhang, Xiaojiang Peng, Jun Du, Kai Wang, Yu Qiao:
Exploring Emotion Features and Fusion Strategies for Audio-Video Emotion Recognition. ICMI 2019: 562-566 - [c87]Yuanyuan Zhang, Zi-Rui Wang, Jun Du:
Deep Fusion: An Attention Guided Factorized Bilinear Pooling for Audio-video Emotion Recognition. IJCNN 2019: 1-8 - [c86]Neville Ryant, Kenneth Church, Christopher Cieri, Alejandrina Cristià, Jun Du, Sriram Ganapathy, Mark Y. Liberman:
The Second DIHARD Diarization Challenge: Dataset, Task, and Baselines. INTERSPEECH 2019: 978-982 - [c85]Lanhua You, Wu Guo, Li-Rong Dai, Jun Du:
Multi-Task Learning with High-Order Statistics for x-Vector Based Text-Independent Speaker Verification. INTERSPEECH 2019: 1158-1162 - [c84]Lanhua You, Wu Guo, Li-Rong Dai, Jun Du:
Deep Neural Network Embeddings with Gating Mechanisms for Text-Independent Speaker Verification. INTERSPEECH 2019: 1168-1172 - [c83]Feng Ma, Li Chai, Jun Du, Diyuan Liu, Zhongfu Ye, Chin-Hui Lee:
Acoustic Model Ensembling Using Effective Data Augmentation for CHiME-5 Challenge. INTERSPEECH 2019: 1258-1262 - [c82]Li Chai, Jun Du, Chin-Hui Lee:
KL-Divergence Regularized Deep Neural Network Adaptation for Low-Resource Speaker-Dependent Speech Enhancement. INTERSPEECH 2019: 1806-1810 - [c81]Li Chai, Jun Du, Chin-Hui Lee:
A Cross-Entropy-Guided (CEG) Measure for Speech Enhancement Front-End Assessing Performances of Back-End Automatic Speech Recognition. INTERSPEECH 2019: 3431-3435 - [c80]Xue Bai, Jun Du, Zi-Rui Wang, Chin-Hui Lee:
A Hybrid Approach to Acoustic Scene Classification Based on Universal Acoustic Models. INTERSPEECH 2019: 3619-3623 - [c79]Zhi Chen, Wu Guo, Li-Rong Dai, Zhen-Hua Ling, Jun Du:
Neural Text Clustering with Document-Level Attention Based on Dynamic Soft Labels. INTERSPEECH 2019: 4225-4229 - [i19]Yuanyuan Zhang, Zi-Rui Wang, Jun Du:
Deep Fusion: An Attention Guided Factorized Bilinear Pooling for Audio-video Emotion Recognition. CoRR abs/1901.04889 (2019) - [i18]Lanhua You, Wu Guo, Lirong Dai, Jun Du:
Deep Neural Network Embedding Learning with High-Order Statistics for Text-Independent Speaker Verification. CoRR abs/1903.12058 (2019) - [i17]Lanhua You, Wu Guo, Lirong Dai, Jun Du:
Deep Neural Network Embeddings with Gating Mechanisms for Text-Independent Speaker Verification. CoRR abs/1903.12092 (2019) - [i16]Neville Ryant, Kenneth Church, Christopher Cieri, Alejandrina Cristià, Jun Du, Sriram Ganapathy, Mark Y. Liberman:
The Second DIHARD Diarization Challenge: Dataset, task, and baselines. CoRR abs/1906.07839 (2019) - [i15]Yixing Zhu, Xueqing Wu, Jun Du:
Adaptive Period Embedding for Representing Oriented Objects in Aerial Images. CoRR abs/1906.09447 (2019) - [i14]Paola García, Jesús Villalba, Hervé Bredin, Jun Du, Diego Castán, Alejandrina Cristià, Latané Bullock, Ling Guo, Koji Okabe, Phani Sankar Nidadavolu, Saurabh Kataria, Sizhu Chen, Léo Galmant, Marvin Lavechin, Lei Sun, Marie-Philippe Gill, Bar Ben-Yair, Sajjad Abdoli, Xin Wang, Wassim Bouaziz, Hadrien Titeux, Emmanuel Dupoux, Kong Aik Lee, Najim Dehak:
Speaker detection in the wild: Lessons learned from JSALT 2019. CoRR abs/1912.00938 (2019) - [i13]Zi-Rui Wang, Jun Du:
Joint Architecture and Knowledge Distillation in Convolutional Neural Network for Offline Handwritten Chinese Text Recognition. CoRR abs/1912.07806 (2019) - 2018
- [j20]Zi-Rui Wang, Jun Du, Wenchao Wang, Jian-Fang Zhai, Jin-Shui Hu:
A comprehensive study of hybrid neural network hidden Markov model for offline handwritten Chinese text recognition. Int. J. Document Anal. Recognit. 21(4): 241-251 (2018) - [j19]Qing Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee:
A Multiobjective Learning and Ensembling Approach to High-Performance Speech Enhancement With Compact Neural Network Architectures. IEEE ACM Trans. Audio Speech Lang. Process. 26(7): 1181-1193 (2018) - [j18]Yanhui Tu, Jun Du, Chin-Hui Lee:
A Speaker-Dependent Approach to Single-Channel Joint Speech Separation and Acoustic Modeling Based on Deep Neural Networks for Robust Recognition of Multi-Talker Speech. J. Signal Process. Syst. 90(7): 963-973 (2018) - [j17]Lei Sun, Jun Du, Zhipeng Xie, Yong Xu:
Auxiliary Features from Laser-Doppler Vibrometer Sensor for Deep Neural Network Based Robust Speech Recognition. J. Signal Process. Syst. 90(7): 975-983 (2018) - [c78]Jia Pan, Diyuan Liu, Genshun Wan, Jun Du, Qingfeng Liu, Zhongfu Ye:
Online Speaker Adaptation for LVCSR Based on Attention Mechanism. APSIPA 2018: 183-186 - [c77]Yanhui Tu, Jun Du, Nan Zhou, Chin-Hui Lee:
Online LSTM-based Iterative Mask Estimation for Multi-Channel Speech Enhancement and ASR. APSIPA 2018: 362-366 - [c76]Mao-Kui He, Jun Du, Zi-Rui Wang, Lei Sun:
A Novel Training Strategy Using Dynamic Data Generation for Deep Neural Network Based Speech Enhancement. APSIPA 2018: 1228-1232 - [c75]Bing Yin, Jun Du, Lei Sun, Xueyang Zhang, Shan He, Zhenhua Ling, Guoping Hu, Wu Guo:
An Analysis of Speaker Diarization Fusion Methods For The First DIHARD Challenge. APSIPA 2018: 1473-1477 - [c74]Yuanyuan Zhang, Jun Du, Zi-Rui Wang, Jianshu Zhang, Yanhui Tu:
Attention Based Fully Convolutional Network for Speech Emotion Recognition. APSIPA 2018: 1771-1775 - [c73]Zi-Rui Wang, Bao-Cai Yin, Jun Du, Cong Liu, Xiaodong Tao, Guoping Hu:
Fast and Robust Detection of Anatomical Landmarks Using Cascaded 3D Convolutional Networks Guided by Linear Square Regression. CCBR 2018: 599-608 - [c72]Tian Gao, Jun Du, Li-Rong Dai, Chin-Hui Lee:
Densely Connected Progressive Learning for LSTM-Based Speech Enhancement. ICASSP 2018: 5054-5058 - [c71]Neville Ryant, Elika Bergelson, Kenneth Church, Alejandrina Cristià, Jun Du, Sriram Ganapathy, Sanjeev Khudanpur, Diana Kowalski, Mahesh Krishnamoorthy, Rajat Kulshreshta, Mark Y. Liberman, Yu-Ding Lu, Matthew Maciejewski, Florian Metze, Ján Profant, Lei Sun, Yu Tsao, Zhou Yu:
Enhancement and Analysis of Conversational Speech: JSALT 2017. ICASSP 2018: 5154-5158 - [c70]Lei Sun, Jun Du, Tian Gao, Yu-Ding Lu, Yu Tsao, Chin-Hui Lee, Neville Ryant:
A Novel LSTM-Based Speech Preprocessor for Speaker Diarization in Realistic Mismatch Conditions. ICASSP 2018: 5234-5238 - [c69]Wenchao Wang, Jianshu Zhang, Jun Du, Zi-Rui Wang, Yixing Zhu:
DenseRAN for Offline Handwritten Chinese Character Recognition. ICFHR 2018: 104-109 - [c68]Wenchao Wang, Jun Du, Zi-Rui Wang:
Parsimonious HMMs for Offline Handwritten Chinese Text Recognition. ICFHR 2018: 145-150 - [c67]Jianshu Zhang, Yixing Zhu, Jun Du, Lirong Dai:
Radical Analysis Network for Zero-Shot Learning in Printed Chinese Character Recognition. ICME 2018: 1-6 - [c66]Jianshu Zhang, Jun Du, Lirong Dai:
Multi-Scale Attention with Dense Encoder for Handwritten Mathematical Expression Recognition. ICPR 2018: 2245-2250 - [c65]Jianshu Zhang, Yixing Zhu, Jun Du, Lirong Dai:
Trajectory-based Radical Analysis Network for Online Handwritten Chinese Character Recognition. ICPR 2018: 3681-3686 - [c64]Yixing Zhu, Jun Du:
Sliding Line Point Regression for Shape Robust Scene Text Detection. ICPR 2018: 3735-3740 - [c63]Lei Sun, Jun Du, Chao Jiang, Xueyang Zhang, Shan He, Bing Yin, Chin-Hui Lee:
Speaker Diarization with Enhancing Speech for the First DIHARD Challenge. INTERSPEECH 2018: 2793-2797 - [c62]Li Chai, Jun Du, Chin-Hui Lee:
Error Modeling via Asymmetric Laplace Distribution for Deep Neural Network Based Single-Channel Speech Enhancement. INTERSPEECH 2018: 3269-3273 - [c61]Xin Wang, Jun Du, Lei Sun, Qing Wang, Chin-Hui Lee:
A Progressive Deep Learning Approach to Child Speech Separation. ISCSLP 2018: 76-80 - [c60]Qing Wang, Jun Du, Li Chai, Li-Rong Dai, Chin-Hui Lee:
A Maximum Likelihood Approach to Masking-based Speech Enhancement Using Deep Neural Network. ISCSLP 2018: 295-299 - [c59]Hengshun Zhou, Xue Bai, Jun Du:
An Investigation of Transfer Learning Mechanism for Acoustic Scene Classification. ISCSLP 2018: 404-408 - [i12]Jianshu Zhang, Jun Du, Lirong Dai:
Multi-Scale Attention with Dense Encoder for Handwritten Mathematical Expression Recognition. CoRR abs/1801.03530 (2018) - [i11]Yixing Zhu, Jun Du:
Sliding Line Point Regression for Shape Robust Scene Text Detection. CoRR abs/1801.09969 (2018) - [i10]Jianshu Zhang, Yixing Zhu, Jun Du, Lirong Dai:
Trajectory-based Radical Analysis Network for Online Handwritten Chinese Character Recognition. CoRR abs/1801.10109 (2018) - [i9]Yuanyuan Zhang, Jun Du, Zi-Rui Wang, Jianshu Zhang:
Attention Based Fully Convolutional Network for Speech Emotion Recognition. CoRR abs/1806.01506 (2018) - [i8]Wenchao Wang, Jianshu Zhang, Jun Du, Zi-Rui Wang, Yixing Zhu:
DenseRAN for Offline Handwritten Chinese Character Recognition. CoRR abs/1808.04134 (2018) - [i7]Wenchao Wang, Jun Du, Zi-Rui Wang:
Parsimonious HMMs for Offline Handwritten Chinese Text Recognition. CoRR abs/1808.04138 (2018) - [i6]Li Chai, Jun Du, Chin-Hui Lee:
Acoustics-guided evaluation (AGE): a new measure for estimating performance of speech enhancement algorithms for robust ASR. CoRR abs/1811.11517 (2018) - [i5]Yixing Zhu, Jun Du:
TextMountain: Accurate Scene Text Detection via Instance Segmentation. CoRR abs/1811.12786 (2018) - [i4]Zi-Rui Wang, Jun Du, Jia-Ming Wang:
Writer-Aware CNN for Parsimonious HMM-Based Offline Handwritten Chinese Text Recognition. CoRR abs/1812.09809 (2018) - 2017
- [j16]Yanhui Tu, Jun Du, Qing Wang, Xiao Bao, Li-Rong Dai, Chin-Hui Lee:
An information fusion framework with multi-channel feature concatenation and multi-perspective system combination for the deep-learning-based robust recognition of microphone array speech. Comput. Speech Lang. 46: 517-534 (2017) - [j15]Jun Du, Jian-Fang Zhai, Jin-Shui Hu:
Writer adaptation via deeply learned features for online Chinese handwriting recognition. Int. J. Document Anal. Recognit. 20(1): 69-78 (2017) - [j14]Jun Du, Yong Xu:
Hierarchical deep neural network for multivariate regression. Pattern Recognit. 63: 149-157 (2017) - [j13]Jianshu Zhang, Jun Du, Shiliang Zhang, Dan Liu, Yulong Hu, Jin-Shui Hu, Si Wei, Li-Rong Dai:
Watch, attend and parse: An end-to-end neural network based approach to handwritten mathematical expression recognition. Pattern Recognit. 71: 196-206 (2017) - [j12]Tian Gao, Jun Du, Li-Rong Dai, Chin-Hui Lee:
A unified DNN approach to speaker-dependent simultaneous speech enhancement and speech separation in low SNR environments. Speech Commun. 95: 28-39 (2017) - [j11]Yannan Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee:
A Gender Mixture Detection Approach to Unsupervised Single-Channel Speech Separation Based on Deep Neural Networks. IEEE ACM Trans. Audio Speech Lang. Process. 25(7): 1535-1546 (2017) - [c58]Yixing Zhu, Jun Du, Jianshu Zhang:
Dual Learning of the Generator and Recognizer for Chinese Characters. ACPR 2017: 536-541 - [c57]Zi-Rui Wang, Jun Du, Jin-Shui Hu, Yulong Hu:
Deep Convolutional Neural Network Based Hidden Markov Model for Offline Handwritten Chinese Text Recognition. ACPR 2017: 816-821 - [c56]Xin Wang, Jun Du, Yannan Wang:
A maximum likelihood approach to deep neural network based speech dereverberation. APSIPA 2017: 155-158 - [c55]Yanhui Tu, Jun Du, Lei Sun, Chin-Hui Lee:
LSTM-based iterative mask estimation and post-processing for multi-channel speech enhancement. APSIPA 2017: 488-491 - [c54]Qing Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee:
Joint noise and mask aware training for DNN-based speech enhancement with sub-band features. HSCMA 2017: 101-105 - [c53]Lei Sun, Jun Du, Li-Rong Dai, Chin-Hui Lee:
Multiple-target deep learning for LSTM-RNN based speech enhancement. HSCMA 2017: 136-140 - [c52]Jianshu Zhang, Jun Du, Lirong Dai:
A GRU-Based Encoder-Decoder Approach with Attention for Online Handwritten Mathematical Expression Recognition. ICDAR 2017: 902-907 - [c51]Xiao Bao, Tian Gao, Jun Du, Li-Rong Dai:
An investigation of high-resolution modeling units of deep neural networks for acoustic scene classification. IJCNN 2017: 3028-3035 - [c50]Yanhui Tu, Jun Du, Lei Sun, Feng Ma, Chin-Hui Lee:
On Design of Robust Deep Models for CHiME-4 Multi-Channel Speech Recognition with Multiple Configurations of Array Microphones. INTERSPEECH 2017: 394-398 - [c49]Yannan Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee:
A Maximum Likelihood Approach to Deep Neural Network Based Nonlinear Spectral Mapping for Single-Channel Speech Separation. INTERSPEECH 2017: 1178-1182 - [c48]Li Chai, Jun Du, Yannan Wang:
Gaussian density guided deep neural network for single-channel speech enhancement. MLSP 2017: 1-6 - [c47]Shi-Xue Wen, Jun Du, Chin-Hui Lee:
On generating mixing noise signals with basis functions for simulating noisy speech and learning DNN-based speech enhancement models. MLSP 2017: 1-6 - [i3]Yong Xu, Jun Du, Zhen Huang, Li-Rong Dai, Chin-Hui Lee:
Multi-Objective Learning and Mask-Based Post-Processing for Deep Neural Network Based Speech Enhancement. CoRR abs/1703.07172 (2017) - [i2]Jianshu Zhang, Yixing Zhu, Jun Du, Li-Rong Dai:
RAN: Radical analysis networks for zero-shot learning of Chinese characters. CoRR abs/1711.01889 (2017) - [i1]Jianshu Zhang, Jun Du, Li-Rong Dai:
A GRU-based Encoder-Decoder Approach with Attention for Online Handwritten Mathematical Expression Recognition. CoRR abs/1712.03991 (2017) - 2016
- [j10]Tian Gao, Jun Du, Yong Xu, Cong Liu, Li-Rong Dai, Chin-Hui Lee:
Joint training of DNNs by incorporating an explicit dereverberation structure for distant speech recognition. EURASIP J. Adv. Signal Process. 2016: 86 (2016) - [j9]Jun Du, Yanhui Tu, Li-Rong Dai, Chin-Hui Lee:
A Regression Approach to Single-Channel Speech Separation Via High-Resolution Deep Neural Networks. IEEE ACM Trans. Audio Speech Lang. Process. 24(8): 1424-1437 (2016) - [c46]Qing Wang, Jun Du, Li-Rong Dai:
Boosting DNN-based speech enhancement via explicit transformations. APSIPA 2016: 1-4 - [c45]Yannan Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee:
Unsupervised single-channel speech separation via deep neural network for different gender mixtures. APSIPA 2016: 1-4 - [c44]Nan Zhou, Jun Du:
Recognition of Social Touch Gestures Using 3D Convolutional Neural Networks. CCPR (1) 2016: 164-173 - [c43]Zi-Rui Wang, Jun Du:
Writer Code Based Adaptation of Deep Neural Network for Offline Handwritten Chinese Text Recognition. ICFHR 2016: 548-553 - [c42]Jun Du, Zi-Rui Wang, Jian-Fang Zhai, Jin-Shui Hu:
Deep neural network based hidden Markov model for offline handwritten Chinese text recognition. ICPR 2016: 3428-3433 - [c41]Jianqing Gao, Jun Du, Changqing Kong, Huaifang Lu, Enhong Chen, Chin-Hui Lee:
An experimental study on joint modeling of mixed-bandwidth data via deep neural networks for robust speech recognition. IJCNN 2016: 588-594 - [c40]Tian Gao, Jun Du, Li-Rong Dai, Chin-Hui Lee:
SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement. INTERSPEECH 2016: 3713-3717 - [c39]Nana Fan, Jun Du, Li-Rong Dai:
A regression approach to binaural speech segregation via deep neural network. ISCSLP 2016: 1-5 - [c38]Yanhui Tu, Jun Du, Li-Rong Dai, Chin-Hui Lee:
A speaker-dependent deep learning approach to joint speech separation and acoustic modeling for multi-talker automatic speech recognition. ISCSLP 2016: 1-5 - [c37]Zhipeng Xie, Jun Du, Ian McLoughlin, Yong Xu, Feng Ma, Haikun Wang:
Deep neural network for robust speech recognition with auxiliary features from laser-Doppler vibrometer sensor. ISCSLP 2016: 1-5 - 2015
- [j8]Yong Xu, Jun Du, Li-Rong Dai, Chin-Hui Lee:
A Regression Approach to Speech Enhancement Based on Deep Neural Networks. IEEE ACM Trans. Audio Speech Lang. Process. 23(1): 7-19 (2015) - [c36]Jun Du, Qing Wang, Yanhui Tu, Xiao Bao, Li-Rong Dai, Chin-Hui Lee:
An information fusion approach to recognizing microphone array speech in the CHiME-3 challenge based on a deep learning framework. ASRU 2015: 430-435 - [c35]Tian Gao, Jun Du, Li Xu, Cong Liu, Li-Rong Dai, Chin-Hui Lee:
A unified speaker-dependent speech separation and enhancement system based on deep neural networks. ChinaSIP 2015: 687-691 - [c34]Tian Gao, Jun Du, Yong Xu, Cong Liu, Li-Rong Dai, Chin-Hui Lee:
Improving Deep Neural Network Based Speech Enhancement in Low SNR Environments. LVA/ICA 2015: 75-82 - [c33]Yanhui Tu, Jun Du, Li-Rong Dai, Chin-Hui Lee:
Speech Separation based on signal-noise-dependent deep neural networks for robust speech recognition. ICASSP 2015: 61-65 - [c32]Tian Gao, Jun Du, Li-Rong Dai, Chin-Hui Lee:
Joint training of front-end and back-end deep neural networks for robust speech recognition. ICASSP 2015: 4375-4379 - [c31]Jun Du, Jian-Fang Zhai, Jin-Shui Hu, Bo Zhu, Si Wei, Li-Rong Dai:
Writer adaptive feature extraction based on convolutional neural networks for online handwritten Chinese character recognition. ICDAR 2015: 841-845 - [c30]Yannan Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee:
High-resolution acoustic modeling and compact language modeling of language-universal speech attributes for spoken language identification. INTERSPEECH 2015: 992-996 - [c29]Yong Xu, Jun Du, Zhen Huang, Li-Rong Dai, Chin-Hui Lee:
Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement. INTERSPEECH 2015: 1508-1512 - [c28]Qing Wang, Jun Du, Xiao Bao, Zi-Rui Wang, Li-Rong Dai, Chin-Hui Lee:
A universal VAD based on jointly trained deep neural networks. INTERSPEECH 2015: 2282-2286 - 2014
- [j7]Jun Du, Qiang Huo:
An irrelevant variability normalization approach to discriminative training of multi-prototype based classifiers and its applications for online handwritten Chinese character recognition. Pattern Recognit. 47(12): 3959-3966 (2014) - [j6]Yong Xu, Jun Du, Li-Rong Dai, Chin-Hui Lee:
An Experimental Study on Speech Enhancement Based on Deep Neural Networks. IEEE Signal Process. Lett. 21(1): 65-68 (2014) - [j5]Jun Du, Qiang Huo:
An Improved VTS Feature Compensation using Mixture Models of Distortion and IVN Training for Noisy Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 22(11): 1601-1611 (2014) - [c27]Yong Xu, Jun Du, Li-Rong Dai, Chin-Hui Lee:
Global variance equalization for improving deep neural network based speech enhancement. ChinaSIP 2014: 71-75 - [c26]Jun Du, Li-Rong Dai, Qiang Huo:
Synthesized stereo mapping via deep neural networks for noisy speech recognition. ICASSP 2014: 1764-1768 - [c25]Jun Du, Jin-Shui Hu, Bo Zhu, Si Wei, Li-Rong Dai:
Writer Adaptation Using Bottleneck Features and Discriminative Linear Regression for Online Handwritten Chinese Character Recognition. ICFHR 2014: 311-316 - [c24]Jun Du, Jin-Shui Hu, Bo Zhu, Si Wei, Li-Rong Dai:
A Study of Designing Compact Classifiers Using Deep Neural Networks for Online Handwritten Chinese Character Recognition. ICPR 2014: 2950-2955 - [c23]Jun Du, Qing Wang, Tian Gao, Yong Xu, Li-Rong Dai, Chin-Hui Lee:
Robust speech recognition with speech enhanced deep neural networks. INTERSPEECH 2014: 616-620 - [c22]Yong Xu, Jun Du, Li-Rong Dai, Chin-Hui Lee:
Dynamic noise aware training for speech enhancement based on deep neural networks. INTERSPEECH 2014: 2670-2674 - [c21]Yannan Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee:
A fusion approach to spoken language identification based on combining multiple phone recognizers and speech attribute detectors. ISCSLP 2014: 158-162 - [c20]Yanhui Tu, Jun Du, Yong Xu, Li-Rong Dai, Chin-Hui Lee:
Speech separation based on improved deep neural networks with dual outputs of speech features for both target and interfering speakers. ISCSLP 2014: 250-254 - [c19]Yong Xu, Jun Du, Li-Rong Dai, Chin-Hui Lee:
Cross-language transfer learning for deep neural network based speech enhancement. ISCSLP 2014: 336-340 - 2013
- [j4]Jun Du, Qiang Huo:
A discriminative linear regression approach to adaptation of multi-prototype based classifiers and its applications for Chinese OCR. Pattern Recognit. 46(8): 2313-2322 (2013) - [c18]Jun Du, Qiang Huo:
A VTS-based feature compensation approach to noisy speech recognition using mixture models of distortion. ICASSP 2013: 7078-7082 - [c17]Jun Du, Qiang Huo:
An Irrelevant Variability Normalization Based Discriminative Training Approach for Online Handwritten Chinese Character Recognition. ICDAR 2013: 69-73 - 2012
- [c16]Jun Du, Qiang Huo, Kai Chen:
Designing compact classifiers for rotation-free recognition of large vocabulary online handwritten Chinese characters. ICASSP 2012: 1721-1724 - [c15]Jun Du, Qiang Huo:
A discriminative linear regression approach to OCR adaptation. ICPR 2012: 629-632 - [c14]Jun Du, Qiang Huo:
IVN-Based Joint Training Of GMM And HMMs Using An Improved VTS-Based Feature Compensation For Noisy Speech Recognition. INTERSPEECH 2012: 1227-1230 - [c13]Jun Du, Qiang Huo:
Synthesized stereo-based stochastic mapping with data selection for robust speech recognition. ISCSLP 2012: 122-125 - 2011
- [j3]Jun Du, Yu Hu, Hui Jiang:
Boosted Mixture Learning of Gaussian Mixture Hidden Markov Models Based on Maximum Likelihood for Speech Recognition. IEEE Trans. Speech Audio Process. 19(7): 2091-2100 (2011) - [j2]Jun Du, Qiang Huo:
A Feature Compensation Approach Using High-Order Vector Taylor Series Approximation of an Explicit Distortion Model for Noisy Speech Recognition. IEEE Trans. Speech Audio Process. 19(8): 2285-2293 (2011) - [c12]Jun Du, Qiang Huo, Lei Sun, Jian Sun:
Snap and Translate Using Windows Phone. ICDAR 2011: 809-813 - 2010
- [c11]Jun Du, Yu Hu, Li-Rong Dai, Ren-Hua Wang:
HMM-based pseudo-clean speech synthesis for splice algorithm. ICASSP 2010: 4570-4573 - [c10]Jun Du, Yu Hu, Hui Jiang:
Boosted mixture learning of Gaussian mixture HMMs for speech recognition. INTERSPEECH 2010: 2942-2945
2000 – 2009
- 2008
- [c9]Jun Du, Ren-Hua Wang:
Cepstral shape normalization (CSN) for robust speech recognition. ICASSP 2008: 4389-4392 - [c8]Jun Du, Qiang Huo:
A feature compensation approach using piecewise linear approximation of an explicit distortion model for noisy speech recognition. ICASSP 2008: 4721-4724 - [c7]Jun Du, Qiang Huo:
A speech enhancement approach using piecewise linear approximation of an explicit model of environmental distortions. INTERSPEECH 2008: 569-572 - [c6]Jun Du, Qiang Huo:
A feature compensation approach using high-order vector Taylor series approximation of an explicit distortion model for noisy speech recognition. INTERSPEECH 2008: 1257-1260 - [c5]Jun Du, Qiang Huo, Yu Hu:
Evaluation of a Feature Compensation Approach Using High-Order Vector Taylor Series Approximation of an Explicit Distortion Model on Aurora2, Aurora3, and Aurora4 Tasks. ISCSLP 2008: 81-84 - 2007
- [j1]Jun Du, Peng Liu, Frank K. Soong, Jian-Lai Zhou, Ren-Hua Wang:
Performance of Discriminative HMM Training in Noise. Int. J. Comput. Linguistics Chin. Lang. Process. 12(3) (2007) - [c4]Jun Du, Peng Liu, Hui Jiang, Frank K. Soong, Ren-Hua Wang:
A New Minimum Divergence Approach to Discriminative Training. ICASSP (4) 2007: 677-680 - 2006
- [c3]Jun Du, Peng Liu, Frank K. Soong, Jian-Lai Zhou, Ren-Hua Wang:
Minimum divergence based discriminative training. INTERSPEECH 2006 - [c2]Jun Du, Peng Liu, Frank K. Soong, Jian-Lai Zhou, Ren-Hua Wang:
Noisy Speech Recognition Performance of Discriminative HMMs. ISCSLP (Selected Papers) 2006: 358-369 - [c1]Zhijie Yan, Peng Liu, Jun Du, Frank K. Soong, Renhua Wang:
Training Discriminative HMM by Optimal Allocation of Gaussian Kernels. ISCSLP 2006
last updated on 2025-01-13 01:07 CET by the dblp team
all metadata released as open data under CC0 1.0 license