A Comparative Study of Indian and Western Music Forms
We extract a set of global features based on the main dimensions of music, which include melody, rhythm, timbre and spatial features, as described in [8], [9] and [10]. We used the following features to form a vector of 86 global features:
• Energy features- mean and variance of the energy envelope, Root Mean Square (RMS) energy, low-energy rate (percentage of frames having less than average energy).
• Rhythm features- mean and variance of note onset times (successive bursts of energy indicating estimated positions of the notes), event density (average frequency of events, i.e., the number of note onsets per second), tempo, pulse clarity [11] (strength of beats).
• Pitch features- mean and variance of pitch.
• Tonality features- 12 chromagram pitch-class energies, 24 key strengths, major and minor (cross-correlation of the chromagram), 6-dimensional tonal centroid vector from the chromagram (corresponds to the projection of the chords along the circles of fifths, of minor thirds, and of major thirds).
• Timbre features- mean and variance of the attack time of note onsets, rolloff frequency, brightness (percentage of energy above 1500 Hz), 13 Mel-Frequency Cepstral Coefficients (MFCCs), roughness (sensory dissonance), irregularity (degree of variation of the successive peaks of the spectrum).
• Spectral-shape features- zero-crossing rate, spread, centroid, skewness, kurtosis, mean and variance of the Inverse Fourier Transform of the logarithm of the spectrum, flatness, mean and variance of the spectrum, mean and variance of spectral flux, mean and variance of spectrum peaks.
These features are taken to capture the static structure of the spectrogram.

4.3 Feature set 3- Frame-wise features

We analyse each song using a frame size of 100 milliseconds with 50% overlap and extract the following features: 12 chromagram features, 13 MFCC, 13 delta-MFCC, 13

5.1 Experiment 1: Adaboost on Feature set 1

We train One-Vs-One Adaboost classifiers. A sample is assigned the class which gets the maximum number of votes. In case of a tie, the class with the lower index is the predicted label. Table 1 shows the confusion matrices for 10 runs of the experiment for Indian music and Western music. The overall accuracies obtained for Indian music and Western music are 58.70% and 46.90% respectively.

Table 1. Confusion Matrix (in %) for Exp 1
(a) Indian Music
       Ca      Dh      Gh      Pu      Th
Ca   66.50    9.00    5.50    6.50   12.50
Dh   10.50   64.00    6.00    9.00   10.50
Gh   16.50    6.50   64.50    7.00    5.50
Pu   18.50   12.50   14.00   47.50    7.50
Th   19.50    6.50   13.00   10.00   51.00
Legend: Ca:Carnatic, Dh:Dhrupad, Gh:Ghazal, Pu:Punjabi, Th:Thumri
(b) Western Music
       Cl      Hi      Ja      Po      Re
Cl   60.00    8.00   13.50   12.50    6.00
Hi    3.50   58.50   12.00   14.00   12.00
Ja    8.50   21.00   48.00   13.00    9.50
Po    9.50   18.50   22.00   32.00   18.00
Re   14.00   17.50   16.50   16.00   36.00
Legend: Cl:Classical, Hi:Hip-hop, Ja:Jazz, Po:Pop, Re:Reggae

5.2 Experiment 2: Adaboost on Feature set 2

The Adaboost classification technique is used in a manner similar to Experiment 1. The confusion matrices for Indian music and Western music are shown in Table 2. The overall accuracies obtained are: Indian music- 87.30%, Western music- 74.50%.

5.3 Experiment 3: Adaboost on Feature set 1+2

The Adaboost classification technique is used in a manner similar to Experiment 1. Here we take a combination of feature set 1 and feature set 2 for training and testing. Table 3 shows the confusion matrices. The overall accuracies achieved are: Indian music- 87.80%, Western music- 77.50%.
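
As a minimal sketch of the One-Vs-One voting rule from Experiment 1, and of how an overall accuracy relates to a confusion matrix, the following Python may help. The names `ovo_predict`, `overall_accuracy` and `pairwise_winner` are hypothetical and not taken from the original implementation. The accuracy helper assumes an equal number of test samples per class, in which case the overall accuracy is the mean of the diagonal; the numbers used are those of Table 1(a).

```python
from itertools import combinations


def ovo_predict(pairwise_winner, n_classes):
    """One-Vs-One majority vote; ties go to the class with the lower index.

    pairwise_winner(i, j) is a hypothetical callable that returns the class
    (i or j) chosen by the binary classifier trained on classes i and j.
    """
    votes = [0] * n_classes
    for i, j in combinations(range(n_classes), 2):
        votes[pairwise_winner(i, j)] += 1
    # max() returns the first maximal element, so the lower index wins a tie.
    return max(range(n_classes), key=lambda c: votes[c])


def overall_accuracy(confusion_pct):
    """Mean of the diagonal of a row-normalised confusion matrix (in %).

    Assumption: every class has the same number of test samples.
    """
    n = len(confusion_pct)
    return sum(confusion_pct[k][k] for k in range(n)) / n


# Table 1(a), Indian music, rows and columns ordered Ca, Dh, Gh, Pu, Th.
TABLE_1A = [
    [66.50, 9.00, 5.50, 6.50, 12.50],
    [10.50, 64.00, 6.00, 9.00, 10.50],
    [16.50, 6.50, 64.50, 7.00, 5.50],
    [18.50, 12.50, 14.00, 47.50, 7.50],
    [19.50, 6.50, 13.00, 10.00, 51.00],
]

print(round(overall_accuracy(TABLE_1A), 2))  # 58.7
```

Under the equal-class-size assumption, the diagonal mean of Table 1(a) agrees with the 58.70% reported for Indian music in Experiment 1.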

Table 2. Confusion Matrix (in %) for Exp 2
(a) Indian Music
       Ca      Dh      Gh      Pu      Th
Ca   89.50    1.50    5.00    0.50    3.50
Dh    5.00   90.00    2.50    0       2.50
Gh    5.50    6.50   81.00    1.50    5.50
Pu    3.00    0       2.00   95.00    0
Th   10.50    5.00    3.50    0      81.00
Legend: Ca:Carnatic, Dh:Dhrupad, Gh:Ghazal, Pu:Punjabi, Th:Thumri

Table 3. Confusion Matrix (in %) for Exp 3
(a) Indian Music
       Ca      Dh      Gh      Pu      Th
Ca   90.00    1.50    4.00    0.50    4.00
Dh    4.50   91.00    2.00    0       2.50
Gh    6.50    5.50   81.50    2.00    4.50
Pu    3.00    0       2.00   95.00    0
Th    7.50    3.50    7.50    0      81.50
Legend: Ca:Carnatic, Dh:Dhrupad, Gh:Ghazal, Pu:Punjabi, Th:Thumri
(b) Western Music
       Cl      Hi      Ja      Po      Re
Cl   90.50    0       9.00    0.50    0
Hi    0.50   79.00    1.00   12.50    7.00
Ja   10.50    2.00   78.00    0.50    9.00
Po    0.50   15.00    1.50   72.50   10.50
Re    3.50   11.50    9.50    8.00   67.50
Legend: Cl:Classical, Hi:Hip-hop, Ja:Jazz, Po:Pop, Re:Reggae

5.4 Experiment 4: HMM with GMM based on Feature set 3

We train Hidden Markov Models on Feature set 3 with 62 hidden states and full covariance matrices. The emissions of all states were randomly initialized, and the states need not correspond exactly to the underlying features. The algorithm uses maximum likelihood parameter estimation via Expectation Maximization. The state emissions were 62-dimensional chroma and timbre data modelled by Gaussian mixtures with 8 Gaussians. We used the Hidden Markov Model Toolbox for Matlab [23] for this purpose. The confusion matrices for Indian music and Western music are shown in Table 4. Overall accuracies of 98.00% and 67.00% were obtained for Indian and Western music respectively.

Table 4. Confusion Matrix (in %) for Exp 4
(a) Indian Music
       Ca      Dh      Gh      Pu      Th
Ca   95.00    0       3.00    2.00    0
Dh    0      99.00    1.00    0       0
Gh    1.00    0      98.50    0.50    0
Pu    0       0.50    0      99.50    0
Th    1.00    0       1.00    0      98.00
Legend: Ca:Carnatic, Dh:Dhrupad, Gh:Ghazal, Pu:Punjabi, Th:Thumri

Our experiments performed better on Indian music than on Western music for the given classification techniques and sets of features (Figure 2). The differences in performance between the two cultural forms of music can perhaps be traced to the more well-defined structural form and strong melodic content of Indian music. In Indian music, melody and rhythm offer a variety of cues that do not seem to be present in Western music.

Figure 2. Comparison of performance of Indian and Western genres on four experiments

The better performance of Indian music over Western music on similar experiments answers the second question of our research problem: it suggests that the state-of-the-art features (spectral, timbre, harmony, etc.) used for Western music genre classification can be applied to Indian music, where they show higher accuracies.

In Indian music, the classical forms (Carnatic and Dhrupad) performed better than Ghazal. This may be because Indian classical music is more structured, as it has the three components of raga (melody), taal (rhythm) and drone (sustained note), and has a strong melodic structure. In the last three experiments, Punjabi music has the lowest confusion because its two characteristic features, a lively rhythm and a distinct timbre due to the bass sound produced by the Dhol, are accounted for in the features used. In Western music, Classical and Jazz performed better than the other genres, with Reggae performing the worst. These results are consistent with G. Tzanetakis et al. [5].

Performance on Feature set 1 (which is just based on the frequency of chromagram notes) for Indian music and
