Alexey Karpov

Papers by Alexey Karpov

ICANDO: Intellectual Computer Assistant for Disabled Operators

Publication in the conference proceedings of EUSIPCO, Florence, Italy, 2006

Multimodal System for Hands-Free PC Control

Publication in the conference proceedings of EUSIPCO, Antalya, Turkey, 2005

Combining Clustering and Functionals based Acoustic Feature Representations for Classification of Baby Sounds

Companion Publication of the 2020 International Conference on Multimodal Interaction, 2020

This paper investigates different fusion strategies and provides insights into their effectiveness alongside standalone classifiers in the framework of paralinguistic analysis of infant vocalizations. Combinations of Support Vector Machine (SVM) and Extreme Learning Machine (ELM) based classifiers, as well as a weighted kernel version, are explored by training systems on different acoustic feature representations and applying weighted score-level fusion of the predictions. The proposed framework is tested on the INTERSPEECH ComParE-2019 Baby Sounds corpus, which is a collection of Home Bank infant vocalization corpora annotated for five classes. Adhering to the challenge protocol and using a single test set submission, we outperform the challenge baseline Unweighted Average Recall (UAR) score and achieve a result comparable to the state of the art.

Viseme-dependent weight optimization for CHMM-based audio-visual speech recognition

Interspeech 2010, 2010

RESULTS OF THE ISS CREW MISSIONS Main Results of the ISS-41 / 42 Expedition Training and Activity When Carrying out the Mission Plan

The paper considers the results of the ISS-41/42 crew's activity aboard the “Soyuz TMA-14M” spacecraft and the International Space Station. It also contains a comparative analysis and estimation of the crew's contribution to the overall flight program of the ISS. Particular attention is paid to the implementation of applied scientific research and experiments aboard the station. Comments and suggestions on upgrading the ISS Russian Segment are given.

Human-Robot Interaction in a Manned Space Flight: An Ontological Approach

Manned Spaceflight, 2019

The use of robotic systems (RSs) in future manned space missions requires giving the cosmonaut-researcher a holistic view of the forms of interaction within the “human – robot” system (HRS) under adverse environmental conditions. For this purpose, educational and reference materials (ERMs) are needed in the field of ergonomics and its representation in the design of human-machine interfaces (HMI). The paper considers the application of the ontological approach to a topical subject area, the ergonomics of the HMI, as a way of integrating various scientific fields: informatics, ergonomics, psychophysiology, etc.

Cross-Corpus Data Augmentation for Acoustic Addressee Detection

Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue, 2019

Simulation of the «Cosmonaut-Robot» System Interaction on the Lunar Surface Based on Methods of Machine Vision and Computer Graphics

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2017

An Analysis of Visual Faces Datasets

Lecture Notes in Computer Science, 2016

This paper presents an analysis and comparison of datasets of human face images with annotated facial keypoints, which are important in human-machine interaction. The datasets are divided into two groups according to the external conditions of the subject: datasets recorded in laboratory conditions and in-the-wild data. A quick review of state-of-the-art methods for keypoint detection is also provided. Existing methods are categorized into three groups according to their approach to the problem: top-down, bottom-up, and their combination.

Bimodal Speech Recognition Fusing Audio-Visual Modalities

Lecture Notes in Computer Science, 2016

In this paper, we present a novel bimodal speech recognition technique that fuses audio information (the sound signal) and visual information (lip movements) for Russian speech recognition. We propose an architecture for an automatic system for bimodal recognition of audio-visual speech, which uses one stationary Oktava microphone and one high-speed JAI Pulnix camera (200 frames per second at 640×480 pixels) to capture audio and video signals. We also describe the software developed for recording the audio-visual speech database, the phonemic and visemic structures of the Russian language, and probabilistic models of bimodal speech units based on Coupled Hidden Markov Models. A method for transforming a Coupled Hidden Markov Model into an equivalent 2-stream Hidden Markov Model is presented as well.

A Universal Assistive Technology with Multimodal Input and Multimedia Output Interfaces

Lecture Notes in Computer Science, 2014

Promising Approaches for the Use of Service Robots in the Domain of Manned Space Exploration

SPIIRAS Proceedings, 2014

ICANDO: Low cost multimodal interface for hand disabled people

Combined gesture-speech analysis and speech driven gesture synthesis

2006 IEEE International Conference on Multimedia and Expo, ICME 2006 - Proceedings, 2006

Speech activity and speaker novelty detection methods for meeting processing

2009 International Conference on Ultra Modern Telecommunications and Workshops, 2009

Multimodal human computer interaction with MIDAS intelligent infokiosk

Proceedings - International Conference on Pattern Recognition, 2010

... Alexey Karpov, Andrey Ronzhin, Irina Kipyatkova, Alexander Ronzhin St. ... Most important of ...

Multichannel system of audio-visual support of remote mobile participant at e-meeting

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2010

Web-based collaboration using wireless devices with multimedia playback capabilities is a viable alternative to traditional face-to-face meetings. E-meetings are popular in businesses because of their cost savings. To engage quickly and effectively in the meeting activity, the remote user should be able to perceive all events in the meeting room and have the same possibilities like participants

A video monitoring model with a distributed camera system for the smart space

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2010

Designing a multimodal corpus of audio-visual speech using a high-speed camera

International Conference on Signal Processing Proceedings, ICSP, 2012

In this paper, we present research on designing and processing an audio-visual speech database for an automatic Russian speech recognition system using an Oktava MK-012 microphone and a JAI Pulnix RMC-6740GE high-speed camera (200 frames per second). The developed audio-visual speech recording system is described; it provides synchronization and fusion of audio and video data recorded by the independent sensors. The system automatically detects voice activity in the audio signal and stores only speech fragments, discarding non-informative signals. It also takes into account and processes the natural asynchrony of the two speech modalities. Methods are presented for feature extraction from acoustic speech (based on Mel-frequency cepstral coefficients) and visual speech (pixel-based features of the mouth region), and for temporal segmentation of the multimodal data (by forced alignment).

Client and Speech Detection System for Intelligent Infokiosk

Text, Speech and Dialogue, 2010

Andrey Ronzhin, Alexey Karpov, Irin...
