Abstract

Dialog strategies have long been handcrafted by dialog experts. Only within the last decade has research moved to data-driven methods, leading to statistical models. Still, most dialog systems rely solely on the spoken words and their semantics, although speech signals reveal much more about the speaker, e.g. their age, gender, and emotional state. Using this speaker state information along with the semantics can be a promising way of moving dialog systems towards better performance while making them more natural at the same time. Partially Observable Markov Decision Processes (POMDPs), a state-of-the-art statistical modeling method, offer an easy and unified way of integrating speaker state information into dialog systems. In this contribution we present our ongoing research on combining a POMDP-based dialog manager with speaker state information.
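
To illustrate how a POMDP could carry speaker state alongside semantics, the following is a minimal sketch of the standard POMDP belief update, b'(s') ∝ P(o | s', a) Σ_s P(s' | s, a) b(s), over a state that pairs a user goal with an emotion. It is not the authors' implementation: the state space, the action name `ask_goal`, all probabilities, and the independence of the goal and emotion observations are assumptions chosen only for illustration.

```python
from itertools import product

# Hypothetical toy state space: (user goal, speaker emotion).
GOALS = ["weather", "timetable"]
EMOTIONS = ["neutral", "angry"]
STATES = list(product(GOALS, EMOTIONS))


def transition(next_state, state, action):
    """P(s' | s, a). Assumption: the goal never changes and the
    emotion persists with probability 0.8, whatever the action."""
    (next_goal, next_emotion), (goal, emotion) = next_state, state
    if next_goal != goal:
        return 0.0
    return 0.8 if next_emotion == emotion else 0.2


def observation(obs, state, action):
    """P(o | s', a) for o = (recognized goal, recognized emotion).
    Assumption: the two recognizers err independently."""
    (obs_goal, obs_emotion), (goal, emotion) = obs, state
    p_goal = 0.7 if obs_goal == goal else 0.3
    p_emotion = 0.6 if obs_emotion == emotion else 0.4
    return p_goal * p_emotion


def belief_update(belief, action, obs):
    """Return the normalized belief after taking `action` and seeing `obs`."""
    new_belief = {}
    for s2 in STATES:
        # Prediction step: propagate the old belief through the transition model.
        predicted = sum(transition(s2, s1, action) * belief[s1] for s1 in STATES)
        # Correction step: weight by the likelihood of the observation.
        new_belief[s2] = observation(obs, s2, action) * predicted
    total = sum(new_belief.values())
    return {s: p / total for s, p in new_belief.items()}


# Start from a uniform belief and observe "weather" spoken in an angry voice.
uniform = {s: 1.0 / len(STATES) for s in STATES}
updated = belief_update(uniform, "ask_goal", ("weather", "angry"))
```

In this sketch the semantic and the paralinguistic observation enter through the same observation model, so the dialog manager tracks one joint belief over goal and emotion rather than handling speaker state in a separate component.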

Author information

Corresponding author

Correspondence to Stefan Ultes.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this paper

Cite this paper

Ultes, S., Heinroth, T., Schmitt, A., Minker, W. (2011). A Theoretical Framework for a User-Centered Spoken Dialog Manager. In: Delgado, RC., Kobayashi, T. (eds) Proceedings of the Paralinguistic Information and its Integration in Spoken Dialogue Systems Workshop. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1335-6_24

  • DOI: https://doi.org/10.1007/978-1-4614-1335-6_24

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-1334-9

  • Online ISBN: 978-1-4614-1335-6

  • eBook Packages: Engineering, Engineering (R0)
