As a new generation of multimodal/media systems begins to define itself, researchers are attempting to learn how to combine different modes into strategically integrated whole systems. In theory, well-designed multimodal systems should be able to integrate ...
We present a statistical approach to developing multimodal recognition systems and, in particular, to integrating the posterior probabilities of parallel input signals involved in the multimodal system. We first identify the primary factors that influence multimodal recognition performance by evaluating the multimodal recognition probabilities. We then develop two techniques, an estimate approach and a learning approach, which are designed to optimize accurate recognition during the multimodal integration process. We evaluate these methods using Quickset, a speech/gesture multimodal system, and report evaluation results based on an empirical corpus collected with Quickset. From an architectural perspective, the integration technique presented offers enhanced robustness. It is also premised on more realistic assumptions than previous multimodal systems using semantic fusion. From a methodological standpoint, the evaluation techniques that we describe provide a valuable tool for evaluating multimodal systems.
During multimodal communication, we speak, shift eye gaze, gesture, and move in a powerful flow of communication that bears little resemblance to the discrete keyboard and mouse clicks entered sequentially with a graphical user interface (GUI). A profound shift is now ...
Multimodal systems process combined natural input modes (such as speech, pen, touch, hand gestures, eye gaze, and head and body movements) in a coordinated manner with multimedia system output. These systems represent a new direction for computing that draws from ...
... 6. Oviatt, S. Mutual disambiguation of recognition errors in a multimodal architecture. ... L., Suhm, B., Bers, J., Holzman, T., Winograd, T., Landay, J., Larson, J., and Ferro, D. Designing the user interface for multimodal speech and gesture ... Human Computer Interaction, in press. ...
... previous research, it was predicted that people would prefer to interact multimodally rather ... exclusively written input, but none of the participants preferred unimodal spoken input. Figure 5 (left panel) illustrates this strong preference for multimodal versus unimodal interaction ...
... landmark or street not currently in view by its name, to which the system responded by automatically centering on and highlighting the described entity. As they worked, people could ... In developing this simulation, an emphasis was placed on providing automated support for ...