Visual Understanding via Multi-Feature Shared Learning with Global Consistency

Zhang, Lei; Zhang, David

doi:10.1109/TMM.2015.2510509


arXiv:1505.05233 (cs)

[Submitted on 20 May 2015 (v1), last revised 9 Sep 2015 (this version, v2)]



Abstract: Image/video data is usually represented with multiple visual features. The value of fusing multi-source information to establish visual attributes has been widely recognized, and multi-feature visual recognition has recently received much attention in multimedia applications. This paper studies visual understanding via a newly proposed l_2-norm based multi-feature shared learning framework, which simultaneously learns a global label matrix and multiple sub-classifiers from labeled multi-feature data. Additionally, a group graph manifold regularizer composed of Laplacian and Hessian graphs is proposed to better preserve the manifold structure of each feature, so that label prediction power is much improved through semi-supervised learning with global label consistency. For convenience, we call the proposed approach Global-Label-Consistent Classifier (GLCC). The merits of the proposed method include: 1) the manifold structure information of each feature is exploited in learning, resulting in a more faithful classification owing to the global label consistency; 2) a group graph manifold regularizer based on Laplacian and Hessian regularization is constructed; 3) an efficient alternating optimization method is introduced as a fast solver, owing to the convex sub-problems. Experiments on several benchmark visual datasets for multimedia understanding, such as the 17-category Oxford Flower dataset, the challenging 101-category Caltech dataset, the YouTube & Consumer Videos dataset and the large-scale NUS-WIDE dataset, demonstrate that the proposed approach compares favorably with state-of-the-art algorithms. An extensive experiment on deep convolutional activation features also shows the effectiveness of the proposed approach. The code is available at this http URL
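As a rough illustration of the graph-regularization idea mentioned in the abstract, the following minimal sketch builds one kNN graph Laplacian per feature, sums them with fixed weights, and solves a regularized least-squares problem for a single shared label matrix. It is not the GLCC method itself: the Hessian term, the sub-classifiers, the learned feature weights and the alternating solver are all omitted. The function names `knn_laplacian` and `shared_label_matrix`, the parameters `k`, `mu` and `weights`, and the small ridge term are assumptions made for this example only.

```python
import numpy as np


def knn_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetrized, unweighted kNN graph."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between the rows of X.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)  # make the adjacency symmetric
    return np.diag(W.sum(axis=1)) - W


def shared_label_matrix(features, Y, labeled_mask, weights=None, mu=1.0, k=5):
    """Minimize sum_i a_i * tr(F^T L_i F) + mu * ||M (F - Y)||_F^2 over F.

    features     : list of (n, d_i) arrays, one per feature type (assumed input layout)
    Y            : (n, c) one-hot labels; rows of unlabeled samples are ignored
    labeled_mask : (n,) boolean array marking the labeled samples
    """
    n = Y.shape[0]
    if weights is None:
        # Fixed uniform weights; the paper's framework is described as learning more than this.
        weights = np.full(len(features), 1.0 / len(features))
    L = sum(a * knn_laplacian(X, k) for a, X in zip(weights, features))
    M = np.diag(labeled_mask.astype(float))
    # Stationarity condition: (L + mu * M) F = mu * M Y.
    # The tiny ridge keeps the system solvable if a graph component has no labeled sample.
    A = L + mu * M + 1e-8 * np.eye(n)
    return np.linalg.solve(A, mu * (M @ Y))
```

Predicted classes can then be read off with `F.argmax(axis=1)`. Because all features share one `F`, samples that are close in any feature graph are pushed toward the same label, which is the intuition behind a globally consistent label matrix.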

Comments:	13 pages, 6 figures; accepted for publication in IEEE Transactions on Multimedia
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:1505.05233 [cs.CV]
	(or arXiv:1505.05233v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1505.05233
Related DOI:	https://doi.org/10.1109/TMM.2015.2510509

Submission history

From: Lei Zhang
[v1] Wed, 20 May 2015 03:01:08 UTC (1,767 KB)
[v2] Wed, 9 Sep 2015 10:07:11 UTC (1,660 KB)
