DOI:10.1145/1322192.1322235
poster

Disambiguating speech commands using physical context

Published: 12 November 2007

Abstract

Speech has great potential as an input mechanism for ubiquitous computing. However, the conditions currently required for accurate speech recognition, such as a quiet environment and a well-positioned, high-quality microphone, are unreasonable to expect in a realistic setting. A physical environment often offers contextual information that can be sensed and used to augment the speech signal. We investigated whether knowing which equipment was in use, as context, improves speech recognition rates for an electronic personal trainer. In an experiment, participants spoke in an instrumented apartment environment, and we compared the recognition rates of a larger grammar with those of a smaller grammar determined by the context.
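
The idea of a smaller, context-determined grammar can be illustrated with a minimal sketch. It is not the authors' implementation: the equipment names, command phrases, scores, and the functions `full_grammar`, `active_grammar`, and `pick_command` are all hypothetical, and the recognizer is stood in for by a hand-written n-best list of (phrase, score) pairs.

```python
# Hypothetical command sets per piece of equipment (illustrative only;
# the abstract does not list the actual commands or equipment).
GRAMMAR_BY_EQUIPMENT = {
    "treadmill": ["start run", "stop run", "increase speed", "decrease speed"],
    "dumbbells": ["start set", "end set", "increase weight", "decrease weight"],
}


def full_grammar():
    """Larger grammar: every command for every piece of equipment."""
    phrases = set()
    for commands in GRAMMAR_BY_EQUIPMENT.values():
        phrases.update(commands)
    return sorted(phrases)


def active_grammar(equipment_in_use):
    """Smaller grammar chosen by sensed context; fall back to the full one."""
    if equipment_in_use in GRAMMAR_BY_EQUIPMENT:
        return sorted(GRAMMAR_BY_EQUIPMENT[equipment_in_use])
    return full_grammar()


def pick_command(n_best, grammar):
    """Return the highest-scoring hypothesis that the grammar allows."""
    allowed = set(grammar)
    candidates = [(score, phrase) for phrase, score in n_best if phrase in allowed]
    if not candidates:
        return None
    return max(candidates)[1]


# Made-up recognizer output for a noisy utterance of "increase speed".
n_best = [("increase weight", 0.48), ("increase speed", 0.45), ("stop run", 0.07)]

print(pick_command(n_best, full_grammar()))               # larger grammar
print(pick_command(n_best, active_grammar("treadmill")))  # context-restricted grammar
```

With these invented scores, the larger grammar accepts the acoustically similar but wrong phrase, while the treadmill grammar excludes it because that command does not apply to the equipment in use. Real recognizers usually take the grammar as input before decoding rather than filtering afterwards; post-filtering is used here only to keep the sketch self-contained.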

Published In

ICMI '07: Proceedings of the 9th international conference on Multimodal interfaces
November 2007
402 pages
ISBN:9781595938176
DOI:10.1145/1322192

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. context
  2. exercise
  3. fitness
  4. speech recognition

Qualifiers

  • Poster

Conference

ICMI07
ICMI07: International Conference on Multimodal Interface
November 12 - 15, 2007
Aichi, Nagoya, Japan

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%
