
Xiaohai Tian

SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words

Jun 19, 2024
Via arXiv

CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing

Jan 22, 2024
Via arXiv

Phonetic and Prosody-aware Self-supervised Learning Approach for Non-native Fluency Scoring

May 19, 2023
Via arXiv

An ASR-free Fluency Scoring Approach with Self-Supervised Learning

Mar 13, 2023
Via arXiv

Leveraging phone-level linguistic-acoustic similarity for utterance-level pronunciation scoring

Mar 13, 2023
Via arXiv

TTS-Guided Training for Accent Conversion Without Parallel Data

Dec 20, 2022
Via arXiv

Improving Non-native Word-level Pronunciation Scoring with Phone-level Mixup Data Augmentation and Multi-source Information

Mar 01, 2022
Via arXiv

The Multi-speaker Multi-style Voice Cloning Challenge 2021

Apr 05, 2021
Via arXiv

Spoofing detection under noisy conditions: a preliminary investigation and an initial database

Feb 09, 2016
Via arXiv

A Waveform Representation Framework for High-quality Statistical Parametric Speech Synthesis

Oct 06, 2015
Via arXiv