In this paper, we address the problem of automatic discovery of speech patterns using audio-visual information fusion. Unlike those previous studies based ...
In this paper, we propose a new multimodal fusion strategy for open-set speaker identification using a combination of early and late integration following ...
Abstract—It is well known that early integration (also called data fusion) is effective when the modalities are correlated, and late integration (also ...
This paper proposes a novel multimodal self-supervised architecture for energy-efficient audio-visual (AV) speech enhancement that integrates Graph Neural ...
In [17], CCA is used to fuse features from speech and lip texture/movement to form audiovisual feature synchronization which aids in speaker identification.
We extract relevant and informative audio-visual features using multiple multi-class Support Vector Machines with probabilistic.
Dec 2, 2022 · In this study, we propose a new audio-visual video summarization framework integrating four ways of audio-visual information fusion with GRU- ...
Sep 13, 2022 · This paper proposes a novel multimodal self-supervised architecture for energy-efficient audio-visual (AV) speech enhancement that ...
Abstract. This paper proposes a new method for bimodal information fusion in audio-visual speech recognition, where cross-modal association is considered in.