DOI: 10.5555/976440.976462

Detecting stress in spoken English using Decision Trees and Support Vector Machines

Published: 01 January 2004

    Abstract

    This paper describes an approach to the detection of stress in spoken New Zealand English. After identifying the vowel segments of the speech signal, the approach extracts two sets of features from them: prosodic features and vowel quality features. These features are then normalised and scaled to obtain speaker-independent values that can be used to classify each vowel segment as stressed or unstressed. We used Decision Trees (C4.5) and Support Vector Machines (LIBSVM) to learn stress-detecting classifiers from various combinations of the features. The approach was evaluated on 60 utterances from adult female speakers, containing 703 vowels, and achieved a maximum accuracy of 84.72%. A combination of features derived from duration and amplitude performed best, but the vowel quality features also gave reasonable results.
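    The normalise-then-classify step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature names, the toy values, the per-speaker z-score normalisation, and the one-level decision stump, which stands in for C4.5 and LIBSVM and is not the authors' method or data.

```python
from statistics import mean, pstdev

# Hypothetical feature names; the paper's actual feature sets are richer.
FEATURES = ("duration", "amplitude")

# Invented toy vowels from two speakers with different speaking rate and loudness.
TOY_ROWS = [
    {"speaker": "A", "duration": 120, "amplitude": 0.80, "stressed": True},
    {"speaker": "A", "duration": 130, "amplitude": 0.90, "stressed": True},
    {"speaker": "A", "duration": 60, "amplitude": 0.50, "stressed": False},
    {"speaker": "A", "duration": 70, "amplitude": 0.60, "stressed": False},
    {"speaker": "B", "duration": 200, "amplitude": 0.55, "stressed": True},
    {"speaker": "B", "duration": 210, "amplitude": 0.60, "stressed": True},
    {"speaker": "B", "duration": 110, "amplitude": 0.30, "stressed": False},
    {"speaker": "B", "duration": 125, "amplitude": 0.35, "stressed": False},
]


def normalise_per_speaker(rows):
    """Z-score each feature within each speaker (an assumed normalisation)."""
    stats = {}
    for speaker in {r["speaker"] for r in rows}:
        group = [r for r in rows if r["speaker"] == speaker]
        for f in FEATURES:
            values = [r[f] for r in group]
            # Fall back to 1.0 so a constant feature does not divide by zero.
            stats[(speaker, f)] = (mean(values), pstdev(values) or 1.0)
    normalised = []
    for r in rows:
        new = {"stressed": r["stressed"]}
        for f in FEATURES:
            mu, sigma = stats[(r["speaker"], f)]
            new[f] = (r[f] - mu) / sigma
        normalised.append(new)
    return normalised


def fit_stump(rows):
    """Find the best rule 'stressed if feature > threshold' on the given rows.

    Returns (feature, threshold, training_accuracy).
    """
    best = None
    for f in FEATURES:
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            correct = sum((r[f] > t) == r["stressed"] for r in rows)
            acc = correct / len(rows)
            if best is None or acc > best[2]:
                best = (f, t, acc)
    return best


raw_stump = fit_stump(TOY_ROWS)
norm_stump = fit_stump(normalise_per_speaker(TOY_ROWS))
```

    On this toy data no single threshold on the raw values separates stressed from unstressed vowels across both speakers, whereas the stump fitted on the per-speaker normalised values does, which is the motivation for speaker-independent features.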


    Cited By

    • (2016) Improving HMM speech synthesis of interrogative sentences by pitch track transformations. Speech Communication 82:C (97-112). DOI: 10.1016/j.specom.2016.06.005. Online publication date: 1-Sep-2016
    • (2006) Genetic programming for automatic stress detection in spoken English. Proceedings of the 2006 international conference on Applications of Evolutionary Computing (460-471). DOI: 10.1007/11732242_41. Online publication date: 10-Apr-2006
    • (2005) Modelling lexical stress. Proceedings of the 8th international conference on Text, Speech and Dialogue (211-218). DOI: 10.1007/11551874_27. Online publication date: 12-Sep-2005

      Published In

      ACSW Frontiers '04: Proceedings of the second workshop on Australasian information security, Data Mining and Web Intelligence, and Software Internationalisation - Volume 32
      January 2004
      192 pages

      Publisher

      Australian Computer Society, Inc.

      Australia

      Author Tags

      1. decision tree
      2. feature extraction
      3. machine learning
      4. speech recognition
      5. stress detection
      6. support vector machine

      Qualifiers

      • Article

      Conference

      ACSW Frontiers '04

      Acceptance Rates

      Overall Acceptance Rate 204 of 424 submissions, 48%
