DOI: 10.1145/1647314.1647325

Building multimodal applications with EMMA

Published: 02 November 2009

Abstract

Multimodal interfaces combining natural modalities such as speech and touch with dynamic graphical user interfaces can make it easier and more effective for users to interact with applications and services on mobile devices. However, building these interfaces remains a complex and highly specialized task. The W3C EMMA standard provides a representation language for inputs to multimodal systems, facilitating plug-and-play of system components and rapid prototyping of interactive multimodal systems. We illustrate the capabilities of the EMMA standard through examination of its use in a series of mobile multimodal applications for the iPhone.
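
To make the representation concrete, here is a minimal sketch of the kind of EMMA 1.0 document a speech recognizer might emit for a spoken local-search query. The emma:* elements and attributes (emma:one-of for N-best lists, emma:interpretation, emma:confidence, emma:tokens, emma:medium, emma:mode) are defined in the W3C EMMA 1.0 Recommendation; the application payload (<search>, <category>, <location>) and the query itself are hypothetical, since EMMA deliberately leaves the payload schema to the application.

<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- Ranked N-best recognition hypotheses for a single spoken input. -->
  <emma:one-of id="nbest1" emma:medium="acoustic" emma:mode="voice">
    <emma:interpretation id="int1" emma:confidence="0.75"
        emma:tokens="coffee shops near chelsea">
      <!-- Application-specific payload; these element names are
           hypothetical, not part of the EMMA standard. -->
      <search>
        <category>coffee</category>
        <location>chelsea</location>
      </search>
    </emma:interpretation>
    <emma:interpretation id="int2" emma:confidence="0.62"
        emma:tokens="toffee shops near chelsea">
      <search>
        <category>toffee</category>
        <location>chelsea</location>
      </search>
    </emma:interpretation>
  </emma:one-of>
</emma:emma>

Because the container format is standardized, a downstream dialog manager can consume ranked interpretations without knowing which recognizer or understanding component produced them, which is what enables the plug-and-play and rapid prototyping described above.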




Published In

ICMI-MLMI '09: Proceedings of the 2009 international conference on Multimodal interfaces
November 2009
374 pages
ISBN: 9781605587721
DOI: 10.1145/1647314

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. gesture
  2. multimodal
  3. prototyping
  4. speech
  5. standards

Qualifiers

  • Research-article

Conference

ICMI-MLMI '09

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%


Bibliometrics

Article Metrics

  • Downloads (last 12 months): 7
  • Downloads (last 6 weeks): 1

Reflects downloads up to 27 Feb 2025

Cited By

  • (2022) Affordance embeddings for situated language understanding. Frontiers in Artificial Intelligence, vol. 5. DOI: 10.3389/frai.2022.774752. Online publication date: 23-Sep-2022
  • (2021) The Role of Embodiment and Simulation in Evaluating HCI: Experiments and Evaluation. Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management. Human Body, Motion and Behavior, pp. 220-232. DOI: 10.1007/978-3-030-77817-0_17. Online publication date: 3-Jul-2021
  • (2020) Jarvis: A Multimodal Visualization Tool for Bioinformatic Data. HCI International 2020 – Late Breaking Papers: Interaction, Knowledge and Social Media, pp. 104-116. DOI: 10.1007/978-3-030-60152-2_9. Online publication date: 27-Sep-2020
  • (2019) Commercialization of multimodal systems. The Handbook of Multimodal-Multisensor Interfaces, pp. 621-658. DOI: 10.1145/3233795.3233812. Online publication date: 1-Jul-2019
  • (2019) Standardized representations and markup languages for multimodal interaction. The Handbook of Multimodal-Multisensor Interfaces, pp. 347-392. DOI: 10.1145/3233795.3233806. Online publication date: 1-Jul-2019
  • (2019) Smart Emergency Alert System Using Internet of Things and Linked Open Data for Chronic Disease Patients. The Sundarbans: A Disaster-Prone Eco-Region, pp. 174-184. DOI: 10.1007/978-3-030-11884-6_17. Online publication date: 6-Feb-2019
  • (2017) Using programming by demonstration for multimodality in mobile-human interactions. Proceedings of the 29th Conference on l'Interaction Homme-Machine, pp. 243-251. DOI: 10.1145/3132129.3132154. Online publication date: 29-Aug-2017
  • (2016) An IDE for multimodal controls in smart buildings. Proceedings of the 18th ACM International Conference on Multimodal Interaction, pp. 61-65. DOI: 10.1145/2993148.2993162. Online publication date: 31-Oct-2016
  • (2016) Extensible Multimodal Annotation for Intelligent Interactive Systems. Multimodal Interaction with W3C Standards, pp. 37-64. DOI: 10.1007/978-3-319-42816-1_3. Online publication date: 18-Nov-2016
  • (2016) Applications of the Multimodal Interaction Architecture in Ambient Assisted Living. Multimodal Interaction with W3C Standards, pp. 271-291. DOI: 10.1007/978-3-319-42816-1_12. Online publication date: 18-Nov-2016
