An Application To Control Media Player With Voice Commands: Journal of Polytechnic January 2020
JOURNAL of POLYTECHNIC
ORCID1: 0000-0002-1622-9059
ORCID2: 0000-0001-7733-9959
ORCID3: 0000-0003-1644-0476
To cite this article: Avuçlu E., Özçifçi A. and Elen A., “Ses komutları ile media player kontrolü için bir uygulama”
(An application to control media player with voice commands), Politeknik Dergisi, 23(4): 1311-1315, (2020).
DOI: 10.2339/politeknik.646675
An Application to Control Media Player with Voice Commands
Highlights
In the developed application, operations normally performed with the keyboard and mouse can be carried out with voice commands.
Voice commands can be sent with a wireless headset from anywhere within its reception range.
Graphical Abstract
The following Figure shows a general voice recognition process.
Originality
In this study, an application that provides media player control with voice commands was developed.
Findings
In this study, tests were conducted with 20 people. For some words, more than one test was performed
on the same person's voice.
Conclusion
Recognition accuracy of 100% can be achieved by choosing short, fully pronounced words when defining
voice commands.
ABSTRACT
Today, using technology is of great importance for making people's lives easier. Technology has made
running some applications very easy. In this study, an application that provides media player control with voice commands
was developed. The application was developed to meet the needs of people who cannot listen to music on their own because of
a disability. The application was implemented in the C# programming language. To manage the media player with voice commands,
voice recognition libraries were used first. In the developed application, operations performed on the media player with the
keyboard and mouse can be carried out with voice commands. Voice commands can be given with a wireless headset from anywhere
within its reception range.
Keywords: Voice recognition, media player control, disabled individual.
Emre AVUÇLU, Ayhan ÖZÇİFÇİ, Abdullah ELEN / POLİTEKNİK DERGİSİ, Politeknik Dergisi,2020;23(4): 1311-1315
2. MATERIAL and METHOD
The application was programmed in the C# programming language. This section describes how the voice
recognition process is performed.
2.1. Voice Recognition Process
In the first stage, the voice is recorded in the system. Once the voice is recorded, it can go through
various processing steps. Figure 1 shows a general voice recognition process.
The voice wave that forms the sound has two important features: amplitude and frequency [15]. Frequency
determines the pitch and vibration characteristics of the voice, while amplitude determines the intensity
of the voice and the energy it carries. Equation 1 gives the Total Amplitude (TG) calculation.
$TG = \sum_{t=1}^{n} x(t)$ (1)
In this equation, $x(t)$ is the amplitude at time $t$; in other words, it expresses the energy carried by
the voice wave at moment $t$. If the total amplitude calculated by this method is above a certain value,
the meaningful part of the sound, that is, speech, is considered to have started.
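The threshold test on the total amplitude can be illustrated with a short C# sketch. It is not the authors' implementation: the class name `SpeechOnset`, the frame size and the threshold are hypothetical, and $x(t)$ is assumed to be the magnitude of each sample so that positive and negative samples do not cancel in the sum.

```csharp
using System;

public static class SpeechOnset
{
    // Equation 1: TG = sum of x(t) for t = 1..n.
    // Assumption: x(t) is taken as the magnitude |sample|.
    public static double TotalAmplitude(double[] frame)
    {
        double total = 0.0;
        foreach (double sample in frame)
        {
            total += Math.Abs(sample);
        }
        return total;
    }

    // Returns the index of the first frame whose total amplitude exceeds
    // the threshold, or -1 if no frame does. frameSize and threshold are
    // hypothetical tuning parameters, not values from the paper.
    public static int FindSpeechStart(double[] signal, int frameSize, double threshold)
    {
        if (frameSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(frameSize));
        }
        for (int start = 0; start + frameSize <= signal.Length; start += frameSize)
        {
            double[] frame = new double[frameSize];
            Array.Copy(signal, start, frame, 0, frameSize);
            if (TotalAmplitude(frame) > threshold)
            {
                return start / frameSize;
            }
        }
        return -1;
    }
}
```

Calling `SpeechOnset.FindSpeechStart(signal, frameSize, threshold)` on a recording that begins with near-silence would return the index of the first frame loud enough to count as speech under these assumptions.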
Filters are used for two purposes in voice processing: separating the voice signal and correcting the
voice signal. Digital filters are of two types: FIR (Finite Impulse Response) and IIR (Infinite Impulse
Response). In FIR filters, the output $y_n$ is the weighted sum of the current input $x_n$ and the
previous inputs. The mathematical expression of this filter is given by Equation 2.
$y_n = b_0 x_n + b_1 x_{n-1} + b_2 x_{n-2} + \cdots + b_q x_{n-q}$ (2)
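As an illustration of Equation 2, the following C# sketch computes the FIR output for every sample of an input array. The class name `FirFilter` is hypothetical, and the sketch assumes that inputs before the start of the signal are zero; the paper does not state which coefficients $b_k$ were used.

```csharp
using System;

public static class FirFilter
{
    // y[n] = b[0]*x[n] + b[1]*x[n-1] + ... + b[q]*x[n-q]   (Equation 2)
    // Assumption: samples before the start of the signal (n - k < 0) are zero.
    public static double[] Apply(double[] b, double[] x)
    {
        double[] y = new double[x.Length];
        for (int n = 0; n < x.Length; n++)
        {
            double sum = 0.0;
            for (int k = 0; k < b.Length && k <= n; k++)
            {
                sum += b[k] * x[n - k];
            }
            y[n] = sum;
        }
        return y;
    }
}
```

For example, with the two coefficients 0.5 and 0.5 the filter averages each sample with its predecessor, which smooths rapid changes in the signal.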
In Equation 2, $y_n$ is the filter output. In IIR filters, the output $y_n$ is the weighted sum of the
previous outputs together with the weighted sum of the current input $x_n$ and the previous inputs. In
this model, the $x_n$ input and the weighted sum of the previous $p$ outputs together give the filter
output $y_n$. After the voice is digitized, it is encoded and the voice recognition process is
completed. The following libraries should first be added to the system for voice recognition.
using System.Diagnostics;
using System.Speech.Recognition;

// Shared recognition context of the speech engine
private SpeechLib.SpSharedRecoContext objRecoContext = null;
// Grammar loaded into the recognition context
private SpeechLib.ISpeechRecoGrammar grammar = null;
// Rule inside the grammar that holds the command words
private SpeechLib.ISpeechGrammarRule menuRule = null;

Figure 1. Voice recognition process.
The voice is digitized to perform the operations of the recognition process shown in Figure 1. The voice
is first filtered and then sampled for digitization. Figure 2 shows an example of the digitization function.
// Start Windows Media Player when the recognized word is "player"
if (avuclu.Text == "player")
{
    var mediaPlayer = "C:\\Program Files\\Windows Media Player\\wmplayer.exe";
    System.Diagnostics.Process.Start(mediaPlayer);
}
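The check above handles a single word. One possible way to extend the same idea to several commands is a lookup table from recognized words to actions. The sketch below is an assumption for illustration, not the authors' code: the class `VoiceCommandDispatcher` and its methods are hypothetical, and the comparison is made case-insensitive as a design choice.

```csharp
using System;
using System.Collections.Generic;

public class VoiceCommandDispatcher
{
    // Maps a recognized word to the action it triggers (case-insensitive).
    private readonly Dictionary<string, Action> handlers =
        new Dictionary<string, Action>(StringComparer.OrdinalIgnoreCase);

    // Registers or replaces the action for a command word.
    public void Register(string word, Action handler)
    {
        handlers[word] = handler;
    }

    // Runs the action for the recognized text. Returns false if the text
    // is null or no action is registered for it.
    public bool Dispatch(string recognizedText)
    {
        if (recognizedText == null)
        {
            return false;
        }
        if (handlers.TryGetValue(recognizedText.Trim(), out Action handler))
        {
            handler();
            return true;
        }
        return false;
    }
}
```

In the application, the handler registered for "player" could call `System.Diagnostics.Process.Start` with the media player path, as the snippet above does, and `Dispatch` would be called with the recognized text.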
Figure 3. General structure of the system.
The general form design view of the application to be managed by voice commands is shown in Figure 5. We
can activate or deactivate this application at any time. The code block required to activate or
deactivate the application is as follows.