
Selection of an Object Requested by Speech Based on Generic Object Recognition

Authors: Hitoshi Nishimura, Yuko Ozasa, Yasuo Ariki, Mikio Nakano

MMRWHRI '14: Proceedings of the 2014 Workshop on Multimodal, Multi-Party, Real-World Human-Robot Interaction

Pages 23 - 24

https://doi.org/10.1145/2666499.2666505

Published: 16 November 2014


Abstract

In this paper, we propose a method by which a robot selects, from among several objects, the one specified by human speech, based on generic object recognition. Object selection methods based on specific object recognition have been proposed, but generic object recognition is more useful for selection in a real environment. The proposed method selects an object by integrating speech recognition results with generic object recognition results. We investigated how the method of narrowing down candidates based on speech and image recognition results affects the object selection accuracy.
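The integration step described in the abstract can be illustrated with a small sketch. The abstract does not give the scoring rule, so everything here is an assumption made for illustration: the function name `select_object`, the `top_n` narrowing of each modality's candidates, the weight `alpha`, and the log-linear combination of confidences are hypothetical and are not taken from the paper.

```python
import math


def select_object(speech_scores, image_scores, top_n=3, alpha=0.5):
    """Return the index of the object that best matches the spoken request.

    speech_scores: dict mapping label -> speech recognition confidence in (0, 1]
    image_scores:  list with one dict per object, label -> image confidence in (0, 1]
    top_n:         how many labels to keep per modality (assumed narrowing step)
    alpha:         weight of the speech score (assumed log-linear combination)
    Returns None when no label survives narrowing in both modalities.
    """
    def top(scores):
        # Keep only the top_n highest-confidence labels.
        return set(sorted(scores, key=scores.get, reverse=True)[:top_n])

    speech_candidates = top(speech_scores)
    best_index, best_score = None, -math.inf
    for index, scores in enumerate(image_scores):
        # Only labels that survive narrowing in both modalities are scored.
        for label in speech_candidates & top(scores):
            score = (alpha * math.log(speech_scores[label])
                     + (1 - alpha) * math.log(scores[label]))
            if score > best_score:
                best_index, best_score = index, score
    return best_index


# Hypothetical confidences: the utterance is most likely "cup".
speech = {"cup": 0.6, "cap": 0.3, "cat": 0.1}
objects = [
    {"cap": 0.7, "hat": 0.2, "cup": 0.1},
    {"cup": 0.8, "bowl": 0.15, "cap": 0.05},
    {"cat": 0.9, "dog": 0.1},
]
print(select_object(speech, objects, top_n=2))
```

Varying `top_n` in a sketch like this is one way to see how the narrowing of candidates interacts with the final choice, which is the relation the abstract says was investigated.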

Index Terms

Computing methodologies > Artificial intelligence > Computer vision > Computer vision problems > Object recognition


Published In

MMRWHRI '14: Proceedings of the 2014 Workshop on Multimodal, Multi-Party, Real-World Human-Robot Interaction
November 2014, 40 pages
ISBN: 9781450305518
DOI: 10.1145/2666499

General Chairs:
Mary Ellen Foster (Heriot-Watt University, Edinburgh, Scotland)
Manuel Giuliani (University of Salzburg, Austria)
Ronald Petrick (University of Edinburgh, Scotland)

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers: Abstract

Conference

ICMI '14: International Conference on Multimodal Interaction
November 16, 2014
Istanbul, Turkey
Sponsor: SIGCHI

Acceptance Rates

MMRWHRI '14 Paper Acceptance Rate 3 of 5 submissions, 60%;

Overall Acceptance Rate 3 of 5 submissions, 60%

