Doctoral candidate in Computer Science at Institut Teknologi Sepuluh Nopember (ITS) Surabaya, and lecturer at STMIK PPKIA Tarakanita Rahmawati, Tarakan, Kalimantan Utara, Indonesia
International Conference on Electronics Representation and Algorithm (ICERA), 2021
The Gaussian Mixture Model (GMM) is one of the most common methods for speaker identification. The quality of a GMM depends on the method used to train its Gaussian components; one such method is k-Means. This study evaluates k-Means GMM training with three centroid initialization methods: randomization, seeding, and density analysis. Seeding uses the k-Means method, whereas density analysis uses the histogram method. Two evaluation criteria are applied: the complexity of the training process and the accuracy of speaker identification. Experiments were conducted with three voice test durations (2, 4, and 6 seconds) and nine Gaussian component counts, ranging from 4 to 20 in steps of 2. The proposed density-analysis method has a 33.7% lower clustering process time while also achieving the highest accuracy, 95.5%.
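As a rough illustration of the pipeline described in the abstract, the sketch below initializes k-Means centroids from histogram density and then derives GMM starting parameters from the resulting clusters. The abstract does not specify how the histogram is built, so everything here is an assumption made for illustration: the function names (`histogram_density_init`, `kmeans`, `gmm_from_clusters`), the choice to bin along the highest-variance feature, the rule that skips adjacent bins, and the synthetic data. It should not be read as the paper's actual procedure.

```python
import numpy as np


def histogram_density_init(X, k, bins=32):
    """Pick k initial centroids from densely populated histogram bins.

    Assumed scheme (not from the paper): bin the highest-variance feature,
    take the most populated bins, and skip bins adjacent to one already chosen.
    """
    axis = np.argmax(X.var(axis=0))
    edges = np.histogram_bin_edges(X[:, axis], bins=bins)
    bin_ids = np.digitize(X[:, axis], edges[1:-1])  # values in 0 .. bins-1
    counts = np.bincount(bin_ids, minlength=bins)

    chosen = []
    for b in np.argsort(counts)[::-1]:  # densest bins first
        if counts[b] == 0:
            break
        if all(abs(b - c) > 1 for c in chosen):
            chosen.append(b)
        if len(chosen) == k:
            break
    if len(chosen) < k:
        raise ValueError("not enough separated non-empty bins; increase `bins`")
    # Each centroid is the mean of the points that fall in a chosen bin.
    return np.stack([X[bin_ids == b].mean(axis=0) for b in chosen])


def nearest(X, centroids):
    """Index of the closest centroid (squared Euclidean) for every point."""
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def kmeans(X, centroids, n_iter=100):
    """Plain Lloyd iterations starting from the given centroids."""
    k = len(centroids)
    for _ in range(n_iter):
        labels = nearest(X, centroids)
        updated = np.stack([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, nearest(X, centroids)


def gmm_from_clusters(X, labels, k):
    """Weights, means, and diagonal variances per cluster.

    Assumes every cluster is non-empty.
    """
    weights = np.bincount(labels, minlength=k) / len(X)
    means = np.stack([X[labels == j].mean(axis=0) for j in range(k)])
    variances = np.stack([X[labels == j].var(axis=0) + 1e-6 for j in range(k)])
    return weights, means, variances


# Synthetic stand-in for speaker feature vectors: three 2-D blobs.
rng = np.random.default_rng(0)
X = np.concatenate([
    rng.normal(loc=c, scale=1.0, size=(200, 2))
    for c in ([0, 0], [10, 10], [20, 0])
])

init = histogram_density_init(X, k=3)
centroids, labels = kmeans(X, init)
weights, means, variances = gmm_from_clusters(X, labels, k=3)
```

Because the initialization is deterministic for a given dataset, repeated runs start from the same centroids, unlike random initialization. Whether this matches the timing or accuracy behaviour reported in the abstract depends on details the abstract does not give.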
Papers by Endyk Noviyantono