Searching and Matching Texture-free 3D Shapes in Images

Authors:

Efstratios Gavves,

Cees G. M. Snoek

ICMR '18: Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval

Pages 326 - 334

https://doi.org/10.1145/3206025.3206057

Published: 05 June 2018

Abstract

The goal of this paper is to search for and match the best rendered view of a texture-free 3D shape to an object of interest in a 2D query image. Matching rendered views of 3D shapes to RGB images is challenging because 1) 3D shapes are not always a perfect match for the image queries, 2) there is a large domain difference between rendered and RGB images, and 3) estimating object scale versus distance is inherently ambiguous in images from uncalibrated cameras. In this work we propose a deeply learned matching function that attacks these challenges and can be used in a search engine that finds the appropriate 3D shape and matches it to objects in 2D query images. We evaluate the proposed matching function and search engine with a series of controlled experiments on the 24 most populated vehicle categories in PASCAL3D+. We test how well the learned matching function transfers to unseen 3D shapes and study the overall sensitivity of the search engine w.r.t. the available 3D shapes and the object localization accuracy, showing promising results in retrieving 3D shapes given 2D image queries.
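
The search step described in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: the learned matching function is replaced here by plain cosine similarity, and the names `search_best_view` and `cosine_similarity`, the feature vectors, and the (azimuth, elevation) view labels are all hypothetical placeholders. The sketch only shows the exhaustive loop over shapes and rendered views that returns the best-scoring pair for one query.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
# Hypothetical view label: (azimuth, elevation) in degrees.
View = Tuple[float, float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Stand-in score (assumption); the paper learns its matching function."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def search_best_view(
    query: Vector,
    shapes: Dict[str, List[Tuple[View, Vector]]],
    match: Callable[[Vector, Vector], float] = cosine_similarity,
) -> Tuple[str, View, float]:
    """Return (shape id, view, score) of the rendered view that best matches the query."""
    best = None
    for shape_id, rendered_views in shapes.items():
        for view, feature in rendered_views:
            score = match(query, feature)
            # Keep the highest-scoring (shape, view) pair seen so far.
            if best is None or score > best[2]:
                best = (shape_id, view, score)
    if best is None:
        raise ValueError("no rendered views to search")
    return best
```

Any scoring function with the same two-argument signature can be passed as `match`, which is where a learned matching function would plug in under these assumptions.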

Index Terms

Searching and Matching Texture-free 3D Shapes in Images
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Matching
        Object recognition
      2. Computer vision tasks
        Visual content-based indexing and retrieval

Published In

ICMR '18: Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval

June 2018

550 pages

ISBN:9781450350464

DOI:10.1145/3206025

Conference Chairs:
Kiyoharu Aizawa, The Univ. of Tokyo, Japan
Michael Lew, Leiden Univ., Netherlands
Shin'ichi Satoh, National Inst. of Informatics, Japan

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ICMR '18

Sponsor:

SIGMM

ICMR '18: International Conference on Multimedia Retrieval

June 11 - 14, 2018

Yokohama, Japan

Acceptance Rates

ICMR '18 Paper Acceptance Rate 44 of 136 submissions, 32%;

Overall Acceptance Rate 254 of 830 submissions, 31%
