Generative Model with Coordinate Metric Learning for Object Recognition Based on 3D Models

Wang, Yida; Deng, Weihong

doi:10.1109/TIP.2018.2858553


arXiv:1705.08590 (cs)

[Submitted on 24 May 2017 (v1), last revised 24 Apr 2018 (this version, v2)]

Abstract: Given a large amount of real photos for training, convolutional neural networks show excellent performance on object recognition tasks. However, collecting such data is tedious and the available backgrounds are limited, which makes it hard to establish a perfect database. In this paper, our generative model, trained with synthetic images rendered from 3D models, reduces the workload of data collection and the limitations on capture conditions. Our structure is composed of two sub-networks: a semantic foreground object reconstruction network based on Bayesian inference, and a classification network based on a multi-triplet cost function. The classification network avoids over-fitting on monotone surfaces and fully utilizes pose information by establishing a sphere-like distribution of descriptors in each category, which helps recognition on regular photos according to the poses, lighting conditions, backgrounds, and category information of the rendered images. First, our conjugate structure, called generative model with metric learning, uses additional foreground object channels generated from Bayesian rendering as the joint between the two sub-networks. A pose-based multi-triplet cost function is used for metric learning, which makes it possible to train a category classifier purely on synthetic data. Second, we design a coordinate training strategy in which adaptive noise acts as corruption on the input images, helping both sub-networks benefit from each other and avoiding inharmonious parameter tuning caused by their different convergence speeds. Our structure achieves state-of-the-art accuracy of over 50% on the ShapeNet database despite the data migration obstacle from synthetic images to real photos. This pipeline makes it feasible to recognize objects in real images based only on 3D models.
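
The abstract names a multi-triplet cost function but does not give its formula. As a rough illustration only, the sketch below averages the standard triplet hinge loss over every (positive, negative) pair for one anchor descriptor. The function names, the squared Euclidean distance, the margin value, and the way positives and negatives are chosen (for example, same category with a nearby pose versus a different category or a distant pose) are all assumptions for illustration, not the authors' definition.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def multi_triplet_loss(anchor, positives, negatives, margin=1.0):
    """Average hinge loss over all (positive, negative) pairs for one anchor.

    Hypothetical helper: each term penalizes a negative that is not at
    least `margin` farther from the anchor than the positive is.
    """
    total = 0.0
    count = 0
    for pos in positives:
        d_pos = squared_distance(anchor, pos)
        for neg in negatives:
            d_neg = squared_distance(anchor, neg)
            total += max(0.0, margin + d_pos - d_neg)
            count += 1
    # With no triplets to compare, define the loss as zero.
    return total / count if count else 0.0


if __name__ == "__main__":
    anchor = [0.0, 0.0]
    positives = [[0.1, 0.0]]               # assumed: same category, nearby pose
    negatives = [[2.0, 0.0], [0.5, 0.0]]   # assumed: other category or distant pose
    print(multi_triplet_loss(anchor, positives, negatives))
```

In this toy input, the far negative already satisfies the margin and contributes nothing, so only the closer negative adds to the loss. A pose-aware variant would differ mainly in how the positive and negative sets are sampled from the rendered views.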

Comments:	14 pages
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Report number:	Volume: 27, Issue: 12
Cite as:	arXiv:1705.08590 [cs.CV]
	(or arXiv:1705.08590v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1705.08590
Journal reference:	in IEEE Transactions on Image Processing, vol. 27, no. 12, pp. 5813-5826, Dec. 2018
Related DOI:	https://doi.org/10.1109/TIP.2018.2858553

Submission history

From: Yida Wang
[v1] Wed, 24 May 2017 03:22:18 UTC (7,746 KB)
[v2] Tue, 24 Apr 2018 13:14:41 UTC (8,831 KB)
