Feb 6, 2019 · Here, we optimize a speaker embedding model with prototypical network loss (PNL), a state-of-the-art approach for the few-shot image classification task.
ABSTRACT. Speaker embedding models that utilize neural networks to map utterances to a space where distances reflect similarity between speakers have driven ...
A query x_j is classified based on how close it is to the class prototype c_{y_j} of class y_j (computed as the average of f(x) for all x in the support set S_{y_j}).
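The prototype rule described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the embeddings are assumed to be the outputs f(x) of some already-trained model, squared Euclidean distance is assumed as the closeness measure (the text only says "how close"), and the function names and toy speaker labels are hypothetical.

```python
from collections import defaultdict


def class_prototypes(support):
    """Average the embeddings f(x) of each class in the support set.

    `support` is a list of (embedding, label) pairs; embeddings are
    assumed to be precomputed lists of floats of equal length.
    """
    sums = {}
    counts = defaultdict(int)
    for emb, label in support:
        if label not in sums:
            sums[label] = [0.0] * len(emb)
        sums[label] = [s + e for s, e in zip(sums[label], emb)]
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def squared_distance(a, b):
    # Assumed closeness measure: squared Euclidean distance.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))


# Toy support set with two hypothetical speakers and 2-D embeddings.
support = [
    ([0.0, 0.0], "spk_a"),
    ([0.0, 2.0], "spk_a"),
    ([4.0, 4.0], "spk_b"),
    ([6.0, 4.0], "spk_b"),
]
prototypes = class_prototypes(support)
predicted = classify([1.0, 1.5], prototypes)
```

Here the prototypes are the per-speaker means of the support embeddings, and the query is assigned to the nearest one. A training loss built on this rule would typically turn the distances into class probabilities; that step is omitted from this sketch.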
This work introduces metric learning (ML) to enhance deep embedding learning for text-independent speaker verification (SV) and conducts experiments.
The proposed method, which uses fusion MFCC, LPC, PLP, RMS, centroid, and entropy information and combinations of their delta and delta-delta values to further ...
Feb 6, 2019 · The resulting embedding model outperforms the state-of-the-art triplet loss based models in both speaker verification and identification tasks, ...
Speaker embedding models that utilize neural networks to map utterances to a space where distances reflect similarity between speakers have driven recent ...
Jun 25, 2021 · Open-set speaker recognition can be regarded as a metric learning problem, which is to maximize inter-class variance and.
Recently, speaker verification systems using deep neural networks have shown their effectiveness on large scale datasets.
In this paper, we present an extensive evaluation of most popular loss functions for speaker recognition on the VoxCeleb dataset. We demonstrate that the ...