A Comparative Study of Indian and Western Music Forms
We extract a set of global features based on the main dimensions of music, which include melody, rhythm, timbre and spatial features, as described in [8], [9] and [10]. We used the following features to form a vector of 86 global features:
• Energy features: mean and variance of the energy envelope, Root Mean Square (RMS) energy, low-energy rate (percentage of frames having less than average energy).
• Rhythm features: mean and variance of note onset times (successive bursts of energy indicating estimated positions of the notes), event density (average frequency of events, i.e., the number of note onsets per second), tempo, pulse clarity [11] (strength of beats).
• Pitch features: mean and variance of pitch.
• Tonality features: 12 chromagram pitch class energies, 24 key strengths for major and minor keys (cross-correlation of the chromagram), 6-dimensional tonal centroid vector from the chromagram (corresponds to projection of the chords along circles of fifths, of minor thirds, and of major thirds).
• Timbre features: mean and variance of attack time of note onsets, rolloff frequency, brightness (percentage of energy above 1500 Hz), 13 Mel-Frequency Cepstral Coefficients (MFCCs), roughness (sensory dissonance), irregularity (degree of variation of the successive peaks of the spectrum).
• Spectral-shape features: zero-crossing rate, spread, centroid, skewness, kurtosis, mean and variance of Inverse Fourier Transform of logarithm of spectrum, flatness, mean and variance of spectrum, mean and variance of spectral flux, mean and variance of spectrum peaks.
These features are taken to capture the static structure of the spectrogram.

4.3 Feature set 3- Frame-wise features
We analyse each song using a frame size of 100 milliseconds with 50% overlap and extract the following features: 12-chromagram features, 13 MFCC, 13 delta-MFCC, 13

5.1 Experiment 1: Adaboost on Feature set 1
We train One-Vs-One Adaboost classifiers. A sample is assigned the class which gets the maximum votes. In case of a tie, the class with the lower index is the predicted label. Table 1 shows the confusion matrix for 10 runs of the experiment for Indian Music and Western Music. The overall accuracies obtained for Indian Music and Western Music are 58.70% and 46.90% respectively.

Table 1. Confusion Matrix (in %) for Exp 1
(a) Indian Music
        Ca      Dh      Gh      Pu      Th
Ca   66.50    9.00    5.50    6.50   12.50
Dh   10.50   64.00    6.00    9.00   10.50
Gh   16.50    6.50   64.50    7.00    5.50
Pu   18.50   12.50   14.00   47.50    7.50
Th   19.50    6.50   13.00   10.00   51.00
Legend: Ca:Carnatic, Dh:Dhrupad, Gh:Ghazal, Pu:Punjabi, Th:Thumri
(b) Western Music
        Cl      Hi      Ja      Po      Re
Cl   60.00    8.00   13.50   12.50    6.00
Hi    3.50   58.50   12.00   14.00   12.00
Ja    8.50   21.00   48.00   13.00    9.50
Po    9.50   18.50   22.00   32.00   18.00
Re   14.00   17.50   16.50   16.00   36.00
Legend: Cl:Classical, Hi:Hip-hop, Ja:Jazz, Po:Pop, Re:Reggae

5.2 Experiment 2: Adaboost on Feature set 2
The Adaboost classification technique is used in a manner similar to Experiment 1. The confusion matrices for Indian music and Western music are shown in Table 2. The overall accuracies obtained are: Indian music- 87.30%, Western music- 74.50%.

5.3 Experiment 3: Adaboost on Feature set 1+2
The Adaboost classification technique is used in a manner similar to Experiment 1. Here we take a combination of feature set 1 and feature set 2 for the training and testing. Table 3 shows the confusion matrices. The overall accuracies achieved are: Indian music- 87.80%, Western music- 77.50%.
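The voting rule used in the Adaboost experiments, and the way the overall accuracies relate to the confusion matrices, can be sketched in Python as follows. This is only an illustrative sketch, not the implementation used in the experiments: the names `ovo_predict`, `overall_accuracy` and the `pairwise_winner` callback are hypothetical, and the accuracy computation assumes an equal number of test samples per class, in which case the overall accuracy is the mean of the diagonal of the percentage confusion matrix.

```python
from itertools import combinations


def ovo_predict(pairwise_winner, n_classes):
    """One-vs-one voting: every pair of classes casts one vote.

    pairwise_winner(i, j) is a hypothetical callback that returns i or j,
    i.e. the decision of the binary classifier trained on classes i and j.
    """
    votes = [0] * n_classes
    for i, j in combinations(range(n_classes), 2):
        votes[pairwise_winner(i, j)] += 1
    # max() returns the first maximal element, so a tie is resolved
    # in favour of the class with the lower index.
    return max(range(n_classes), key=lambda c: votes[c])


def overall_accuracy(confusion_pct):
    """Mean of the diagonal of a row-normalised confusion matrix (in %).

    Assumption: all classes have the same number of test samples.
    """
    diagonal = [row[k] for k, row in enumerate(confusion_pct)]
    return sum(diagonal) / len(diagonal)


# Table 1(a): Indian music, Experiment 1 (rows: Ca, Dh, Gh, Pu, Th).
TABLE_1A = [
    [66.50, 9.00, 5.50, 6.50, 12.50],
    [10.50, 64.00, 6.00, 9.00, 10.50],
    [16.50, 6.50, 64.50, 7.00, 5.50],
    [18.50, 12.50, 14.00, 47.50, 7.50],
    [19.50, 6.50, 13.00, 10.00, 51.00],
]
```

With the values of Table 1(a), `overall_accuracy(TABLE_1A)` evaluates to 58.70, which agrees with the overall accuracy reported for Indian music in Experiment 1.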
Table 2. Confusion Matrix (in %) for Exp 2
(a) Indian Music
        Ca      Dh      Gh      Pu      Th
Ca   89.50    1.50    5.00    0.50    3.50
Dh    5.00   90.00    2.50    0.00    2.50
Gh    5.50    6.50   81.00    1.50    5.50
Pu    3.00    0.00    2.00   95.00    0.00
Th   10.50    5.00    3.50    0.00   81.00
Legend: Ca:Carnatic, Dh:Dhrupad, Gh:Ghazal, Pu:Punjabi, Th:Thumri

Table 3. Confusion Matrix (in %) for Exp 3
(a) Indian Music
        Ca      Dh      Gh      Pu      Th
Ca   90.00    1.50    4.00    0.50    4.00
Dh    4.50   91.00    2.00    0.00    2.50
Gh    6.50    5.50   81.50    2.00    4.50
Pu    3.00    0.00    2.00   95.00    0.00
Th    7.50    3.50    7.50    0.00   81.50
Legend: Ca:Carnatic, Dh:Dhrupad, Gh:Ghazal, Pu:Punjabi, Th:Thumri
(b) Western Music
        Cl      Hi      Ja      Po      Re
Cl   90.50    0.00    9.00    0.50    0.00
Hi    0.50   79.00    1.00   12.50    7.00
Ja   10.50    2.00   78.00    0.50    9.00
Po    0.50   15.00    1.50   72.50   10.50
Re    3.50   11.50    9.50    8.00   67.50
Legend: Cl:Classical, Hi:Hip-hop, Ja:Jazz, Po:Pop, Re:Reggae

5.4 Experiment 4: HMM with GMM based on Feature set 3
We train Hidden Markov Models on Feature set 3 with 62 hidden states and full covariance matrix. The emissions of all states were randomly initialized, and the states need not represent exactly the underlying feature. The algorithm uses maximum likelihood parameter estimation using Expectation Maximization. The state emissions were 62-dimensional chroma and timbre data modelled by Gaussian mixtures with 8 Gaussians. We used the Hidden Markov Model Toolbox for Matlab [23] for the same. The confusion matrices for Indian music and Western music are shown in Table 4. Overall accuracies of 98.00% and 67.00% were obtained for Indian and Western music respectively.

Table 4. Confusion Matrix (in %) for Exp 4
(a) Indian Music
        Ca      Dh      Gh      Pu      Th
Ca   95.00    0.00    3.00    2.00    0.00
Dh    0.00   99.00    1.00    0.00    0.00
Gh    1.00    0.00   98.50    0.50    0.00
Pu    0.00    0.50    0.00   99.50    0.00
Th    1.00    0.00    1.00    0.00   98.00
Legend: Ca:Carnatic, Dh:Dhrupad, Gh:Ghazal, Pu:Punjabi, Th:Thumri

6. DISCUSSION
Our experiments performed better on Indian music than on Western music for the given classification techniques and sets of features (Figure 2). The differences in performance of the two cultural forms of music can perhaps be traced to the more well-defined structural form and strong melodic content of Indian Music. In Indian music, melody and rhythm offer a variety of cues that do not seem to be present in Western music.

Figure 2. Comparison of performance of Indian and Western Genres on four experiments

The better performance of Indian music over Western music on similar experiments answers the second question of our research problem: the state-of-the-art features (spectral, timbre, harmony etc.) used for Western music genre classification can be applied to Indian music, where they show higher accuracies.
In Indian music, the classical forms (Carnatic and Dhrupad) performed better than Ghazal. This may be because Indian classical music is more structured, as it has the three components of raga (melody), taal (rhythm) and drone (sustained note), and has a strong melodic structure. In the last three experiments Punjabi music has the lowest confusion because its two characteristic features, lively rhythm and distinct timbre due to the bass sound produced by the Dhol, are accounted for in the features used. In Western music, Classical and Jazz performed better than other genres, with Reggae performing the worst. These results are consistent with G. Tzanetakis et al. [5].
Performance on Feature set 1 (which is just based on the frequency of chromagram notes) for Indian music and
Western music is about 59% and 47% respectively. This is expected, as Indian music is based on melody, i.e. a sequence of notes played in a given order, whereas Western music is more harmonic in nature, i.e. groups of notes played simultaneously, which is not captured by a single dominant note in Feature set 1.
In Feature set 2 we have used the 12-chromagram pitch class energies, which try to capture the dominance order of notes. The dominance order of notes is also represented by the frequency of chromagram notes in Feature set 1. Thus, there is no significant difference in accuracies of Experiment 2 (using Feature set 2) and Experiment 3 (using Feature set 1+2).
The highest accuracy of 98.0% for Indian music genres in Experiment 4 is better than S. Jothilakshmi et al. [12] and, to the best of our knowledge, the state of the art. The major difference in accuracies between local frame-wise features and global features in the case of Indian music may be because the local frame-wise features are better at capturing the characteristics of Indian music like raga notes and taal (repetition of beats), which require small windows to be analysed. An accuracy of 77.5% for Western music genres in Experiment 3 is better than G. Tzanetakis et al. [5] on the same dataset of five genres (Classical, Hip-hop, Jazz, Pop and Reggae).

7. FUTURE WORK
This work can be extended in various ways: forming a ‘golden-set’ of features that are genre-specific, like rhyme in Ghazal, beats in Folk Punjabi etc.; recognition of patterns like taal in Indian music and chords in Western music; expansion of classes in terms of genres and sub-genres so that we can work with more classes in both datasets (GTZAN has 10 genres); studying music forms of other cultures, for example Chinese and Japanese, and comparing them with Indian and Western genres.

8. REFERENCES
[1] J. J. Aucouturier, F. Pachet. “Representing musical genre: A state of the art” Journal of New Music Research, pp. 83-93, 2003.
[2] E. Benetos, C. Kotropoulos “A tensor-based approach for automatic music genre classification” Proc. EUSIPCO, 2008.
[3] J. Bergstra, N. Casagrande, D. Erhan, D. Eck, B. Kegl “Aggregate features and Adaboost for music classification” Machine Learning vol. 65, no. 2-3, pp. 473-484, 2006.
[4] S. Sukittanon, L. E. Atlas, J. W. Pitton “Modulation-scale analysis for content identification” IEEE Trans. Signal Processing vol. 52, no. 10, pp. 3023-3035, Oct. 2004.
[5] George Tzanetakis and Perry Cook “Music Genre Classification of Audio Signals” IEEE Transactions on Speech and Audio Processing vol. 10, no. 5, pp. 293-302, 2002.
[6] D. P. W. Ellis and G. E. Poliner. “Identifying cover songs with chroma features and dynamic programming beat tracking” ICASSP Hawaii, USA 2007.
[7] S. Kim and S. Narayanan “Dynamic chroma feature vectors with applications to cover song identification” In MMSP Cairns, Australia, 2008.
[8] N. Scaringella, G. Zoia, and D. Mlynek, “Automatic Genre Classification of Music Content: A Survey” IEEE Signal Processing Magazine 2006, 23(2), pp. 133-141.
[9] Kaichun K. Chang, Jyh-Shing Roger Jang, and Costas S. Iliopoulos. “Music Genre Classification via Compressive Sampling” 11th International Society for Music Information Retrieval Conference (ISMIR 2010) 2010, pp. 387-392.
[10] Thibault Langlois, and G. Marques “A Music Classification Method Based on Timbral Features” 10th International Society for Music Information Retrieval Conference (ISMIR 2009), 2009, pp. 81-86.
[11] Olivier Lartillot, Tuomas Eerola, Petri Toiviainen, Jose Fornari “Multi-feature modelling of pulse clarity: Design, validation, and optimization” International Conference on Music Information Retrieval Philadelphia, 2008.
[12] S. Jothilakshmi, N. Kathiresan “Automatic Music Genre Classification for Indian Music” ICSCA 2012.
[13] P. Chordia and A. Rae “Raag recognition using pitch-class and pitch-class dyad distributions” Proc. of ISMIR, 2007 pp. 431-436.
[14] A. Vidwans, K. K. Ganguli and Preeti Rao “Classification of Indian Classical Vocal Styles from Melodic Contours” Proc. of the 2nd CompMusic Workshop, Istanbul, Turkey, July 12-13, 2012.
[15] S. Lippens, J. Martens, and T. De Mulder “A comparison of human and automatic musical genre classification” In IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 4, pages 233-236, 2004.
[16] D. Perrot and R. R. Gjerdingen. “Scanning the dial: an exploration of factors in the identification of musical style” In Proceedings of the 1999 Society for Music Perception and Cognition, 1999.
[17] T. Li, M. Ogihara, and Q. Li. “A comparative study on content-based music genre classification” In Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, pages 282-289, 2003.
[18] K. West and S. Cox, “Features and classifiers for the automatic classification of musical audio signals” Proc. 5th International Conference on Music Information Retrieval (ISMIR), 2004.
[19] Michael I. Mandel and Daniel P. W. Ellis. “Song-level features and support vector machines for music classification” In International Society for Music Information Retrieval, 2005.
[20] E. Pampalk, A. Flexer, and G. Widmer “Improvements of audio-based music similarity and genre classification” In Crawford and Sandler, 2005.
[21] www.magnatune.com (Access date: 5 May 2013)
[22] www.ee.columbia.edu/~dpwe/research/musicsim/ (Access date: 5 May 2013)
[23] http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html (Access date: 5 May 2013)
[24] C. Marques, I. R. Guilherme, R. Y. M. Nakamura and J. P. Papa “New trends in musical genre classification using optimum path forest” ISMIR, 2011.
[25] Yannis Panagakis, Constantine Kotropoulos, Gonzalo R. Arce “Music genre classification via sparse representations of Auditory Temporal Modulations” EUSIPCO, 2009.
[26] Justin Salamon, Bruno Rocha and Emilia Gomez “Music genre classification using melody features extracted from polyphonic music signals” ICASSP, 2012.
[27] Olivier Lartillot, Petri Toiviainen “A Matlab Toolbox for Musical Feature Extraction From Audio” International Conference on Digital Audio Effects, Bordeaux, 2007.
[28] “Raga” http://www.britannica.com/EBchecked/topic/489518 (Access date: 5 May 2013)