Article

The “FAME” interactive space

Authors:

H. Rodriguez

MLMI'05: Proceedings of the Second international conference on Machine Learning for Multimodal Interaction

Pages 126 - 137

https://doi.org/10.1007/11677482_11

Published: 11 July 2005

Abstract

This paper describes the “FAME” multi-modal demonstrator, which integrates multiple communication modes – vision, speech and object manipulation – by combining the physical and virtual worlds to provide support for multi-cultural or multi-lingual communication and problem solving.

The major challenges are the automatic perception of human actions and the understanding of dialogs between people from different cultural or linguistic backgrounds. The system acts as an information butler, demonstrating context awareness through computer vision, speech and dialog modeling. The integrated computer-enhanced human-to-human communication was publicly demonstrated at FORUM2004 in Barcelona and at IST2004 in The Hague.

Specifically, the “Interactive Space” described here features an “Augmented Table” for multi-cultural interaction, which lets several users simultaneously perform multi-modal, cross-lingual retrieval of audio-visual documents previously recorded by an “Intelligent Cameraman” during a week-long seminar.

Published In

MLMI'05: Proceedings of the Second international conference on Machine Learning for Multimodal Interaction

July 2005

487 pages

ISBN: 3540325492

Editors: Steve Renals (University of Edinburgh, Edinburgh, Scotland), Samy Bengio (IDIAP Research Institute, Martigny, Switzerland)

Sponsors

NIST: National Institute of Standards and Technology
SNSF: Swiss National Science Foundation
ESSI/ESPRIT: European Commission

Publisher

Springer-Verlag, Berlin, Heidelberg
