Large Sparse Cone Non-negative Matrix Factorization for Image Annotation

Published: 20 April 2017

Abstract

Image annotation assigns relevant tags to query images based on their semantic contents. Because Non-negative Matrix Factorization (NMF) is well suited to learning parts-based representations, a number of NMF-based algorithms have recently been proposed for image annotation and have achieved good performance. However, most of these efforts have focused on the representations of images and annotations; the properties of the semantic parts have not been well studied. In this article, we revisit the sparseness-constrained NMF (sNMF) proposed by Hoyer [2004]. By endowing the sparseness constraint with a geometric interpretation and sNMF with theoretical analyses of its generalization ability, we show that NMF with such a sparseness constraint has three advantages for image annotation tasks: (i) The sparseness constraint is more ℓ0-norm oriented than ℓ1-norm-based sparseness, which significantly enhances the ability of NMF to robustly learn semantic parts. (ii) The sparseness constraint has a large cone interpretation and thus allows the reconstruction error of NMF to be smaller, which means that the learned semantic parts are better able to represent images for tagging. (iii) The learned semantic parts are less correlated, which increases the discriminative ability for annotating images. Moreover, we present a new efficient large sparse cone NMF (LsCNMF) algorithm that optimizes the sNMF problem by employing Nesterov's optimal gradient method. We conducted experiments on the PASCAL VOC07 dataset and demonstrated the effectiveness of LsCNMF for image annotation.
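As a small illustration of the sparseness constraint mentioned in the abstract, the sketch below computes the sparseness measure defined in Hoyer [2004], (sqrt(n) - ||x||_1 / ||x||_2) / (sqrt(n) - 1), for a vector with n components. It is not the LsCNMF algorithm and is not taken from the article; the function name `hoyer_sparseness` and the example vectors are assumptions made here for illustration only.

```python
import math


def hoyer_sparseness(x):
    """Sparseness measure from Hoyer (2004).

    Returns 0 for a vector whose components are all equal and 1 for a
    vector with exactly one non-zero component. The function name is a
    hypothetical choice for this sketch.
    """
    n = len(x)
    if n < 2:
        raise ValueError("the measure needs at least two components")
    l1 = sum(abs(v) for v in x)              # l1 norm
    l2 = math.sqrt(sum(v * v for v in x))    # l2 norm
    if l2 == 0.0:
        raise ValueError("the measure is undefined for the zero vector")
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1.0)


if __name__ == "__main__":
    print(hoyer_sparseness([0.0, 0.0, 3.0, 0.0]))  # one-hot vector: 1.0
    print(hoyer_sparseness([1.0, 1.0, 1.0, 1.0]))  # uniform vector: 0.0
    # Rescaling a vector leaves the measure unchanged, because only the
    # ratio of the l1 norm to the l2 norm enters the formula.
    print(hoyer_sparseness([1.0, 2.0, 0.0, 0.0]))
    print(hoyer_sparseness([10.0, 20.0, 0.0, 0.0]))
```

Because the measure depends only on the ratio of the two norms, it is invariant to scaling, which is one informal way to see why it tracks the number of active components rather than their magnitude.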

References

[1]
Kobus Barnard, Pinar Duygulu, David Forsyth, Nando De Freitas, David M Blei, and Michael I. Jordan. 2003. Matching words and pictures. Journal of Machine Learning Research 3 (2003), 1107--1135.
[2]
Peter L. Bartlett and Shahar Mendelson. 2003. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research 3 (2003), 463--482.
[3]
Amir Beck and Marc Teboulle. 2009. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences 2, 1 (2009), 183--202.
[4]
Jaafar BenAbdallah, Juan C. Caicedo, Fabio A. Gonzalez, and Olfa Nasraoui. 2010. Multimodal image annotation using non-negative matrix factorization. In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), Vol. 1. IEEE, 128--135.
[5]
Olivier Bousquet, Stéphane Boucheron, and Gábor Lugosi. 2004. Introduction to statistical learning theory. In Advanced Lectures on Machine Learning. Springer, 169--207.
[6]
Peter Bühlmann and Sara Van De Geer. 2011. Statistics for High-dimensional Data: Methods, Theory and Applications. Springer Science & Business Media.
[7]
Deng Cai, Xiaofei He, Jiawei Han, and Thomas S. Huang. 2011. Graph regularized nonnegative matrix factorization for data representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 33, 8 (2011), 1548--1560.
[8]
Gustavo Carneiro, Antoni B. Chan, Pedro J. Moreno, and Nuno Vasconcelos. 2007. Supervised learning of semantic classes for image annotation and retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence 29, 3 (2007), 394--410.
[9]
Minmin Chen, Alice Zheng, and Kilian Weinberger. 2013. Fast image tagging. In Proceedings of the 30th International Conference on Machine Learning. 1274--1282.
[10]
Wei-Sheng Chin, Yong Zhuang, Yu-Chin Juan, and Chih-Jen Lin. 2015. A fast parallel stochastic gradient method for matrix factorization in shared memory systems. ACM Transactions on Intelligent Systems and Technology 6, 1, Article 2 (March 2015), 24 pages.
[11]
Cheng Deng, Rongrong Ji, Dacheng Tao, Xinbo Gao, and Xuelong Li. 2014. Weakly supervised multi-graph learning for robust image reranking. IEEE Transactions on Multimedia 16, 3 (April 2014), 785--795.
[12]
Chris Ding, Tao Li, and Michael I. Jordan. 2010. Convex and semi-nonnegative matrix factorizations. IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 1 (2010), 45--55.
[13]
Chris Ding, Tao Li, and Wei Peng. 2008. On the equivalence between non-negative matrix factorization and probabilistic latent semantic indexing. Computational Statistics & Data Analysis 52, 8 (2008), 3913--3927.
[14]
Chris H. Q. Ding, Xiaofeng He, and Horst D. Simon. 2005. On the equivalence of nonnegative matrix factorization and spectral clustering. In SDM, Vol. 5. SIAM, 606--610.
[15]
Jonathan Doherty, Kevin Curran, and Paul McKevitt. 2015. Pattern matching techniques for replacing missing sections of audio streamed across wireless networks. ACM Transactions on Intelligent Systems and Technology 6, 2, Article 25 (March 2015), 38 pages.
[16]
David Donoho and Victoria Stodden. 2004. When does non-negative matrix factorization give a correct decomposition into parts? In Advances in Neural Information Processing Systems. MIT Press, Cambridge, 1141--1148.
[17]
David L. Donoho and Michael Elad. 2003. Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization. Proceedings of the National Academy of Sciences 100, 5 (2003), 2197--2202.
[18]
Matthijs Douze, Hervé Jégou, Harsimrat Sandhawalia, Laurent Amsaleg, and Cordelia Schmid. 2009. Evaluation of GIST descriptors for web-scale image search. (July 2009). http://lear.inrialpes.fr/pubs/2009/DJSAS09.
[19]
Nadia Figueroa, Haiwei Dong, and Abdulmotaleb El Saddik. 2015. A combined approach toward consistent reconstructions of indoor spaces based on 6D RGB-D odometry and KinectFusion. ACM Transactions on Intelligent Systems and Technology 6, 2, Article 14 (March 2015), 10 pages.
[20]
Hao Fu, Qian Zhang, and Guoping Qiu. 2012. Random forest for image annotation. In Proceedings of the 12th European Conference on Computer Vision. Springer-Verlag, 86--99.
[21]
Bo Geng, Yangxi Li, Dacheng Tao, Meng Wang, Zheng-Jun Zha, and Chao Xu. 2012. Parallel lasso for large-scale video concept detection. IEEE Transactions on Multimedia 14, 1 (2012), 55--65.
[22]
Nicolas Gillis. 2012. Sparse and unique nonnegative matrix factorization through data preprocessing. Journal of Machine Learning Research 13, 1 (2012), 3349--3386.
[23]
Naiyang Guan, Dacheng Tao, Zhigang Luo, and Bo Yuan. 2011. Manifold regularized discriminative nonnegative matrix factorization with fast gradient descent. IEEE Transactions on Image Processing 20, 7 (2011), 2030--2048.
[24]
Naiyang Guan, Dacheng Tao, Zhigang Luo, and Bo Yuan. 2012. NeNMF: An optimal gradient method for nonnegative matrix factorization. IEEE Transactions on Signal Processing 60, 6 (2012), 2882--2898.
[25]
Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek, and Cordelia Schmid. 2009. Tagprop: Discriminative metric learning in nearest neighbor models for image auto-annotation. In Proceedings of the IEEE 12th International Conference on Computer Vision. IEEE, 309--316.
[26]
Xiaofei He and Partha Niyogi. 2003. Locality preserving projections. (2003), 153--160. http://papers.nips.cc/paper/2359-locality-preserving-projections.
[27]
Derrall Heath, David Norton, and Dan Ventura. 2014. Conveying semantics through visual metaphor. ACM Transactions on Intelligent Systems and Technology 5, 2 (2014), 31.
[28]
Harold Hotelling. 1933. Analysis of a complex of statistical variables with principal components. Journal of Educational Psychology 24 (1933), 417--441.
[29]
Patrik O. Hoyer. 2004. Non-negative matrix factorization with sparseness constraints. Journal of Machine Learning Research 5 (2004), 1457--1469.
[30]
Jing Huang, Xinge You, Yuan Yuan, Feng Yang, and Lin Lin. 2010. Rotation invariant iris feature extraction using Gaussian Markov random fields with non-separable wavelet. Neurocomputing 73, 4 (2010), 883--894.
[31]
Jiwoon Jeon, Victor Lavrenko, and Raghavan Manmatha. 2003. Automatic image annotation and retrieval using cross-media relevance models. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 119--126.
[32]
Rongrong Ji, Yue Gao, Wei Liu, Xing Xie, Qi Tian, and Xuelong Li. 2015. When location meets social multimedia: A survey on vision-based recognition and mining for geo-social multimedia analytics. ACM Transactions on Intelligent Systems and Technology 6, 1, Article 1 (March 2015), 18 pages.
[33]
Rongrong Ji, Hongxun Yao, Wei Liu, Xiaoshuai Sun, and Qi Tian. 2012. Task-dependent visual-codebook compression. IEEE Transactions on Image Processing 21, 4 (April 2012), 2282--2293.
[34]
Liping Jing, Chao Zhang, and Michael K. Ng. 2012. SNMFCA: Supervised NMF-based image classification and annotation. IEEE Transactions on Image Processing 21, 11 (2012), 4508--4521.
[35]
Mahdi M. Kalayeh, Haroon Idrees, and Mubarak Shah. 2014. NMF-KNN: Image annotation using weighted multi-view non-negative matrix factorization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 184--191.
[36]
Jingu Kim, Renato Monteiro, and Haesun Park. 2012. Group sparsity in nonnegative matrix factorization. In SDM. SIAM, 851--862.
[37]
Michel Ledoux and Michel Talagrand. 1991. Probability in Banach Spaces. Springer, Berlin Heidelberg.
[38]
Daniel D. Lee and H. Sebastian Seung. 1999. Learning the parts of objects by non-negative matrix factorization. Nature 401, 6755 (1999), 788--791.
[39]
Daniel D. Lee and H. Sebastian Seung. 2001. Algorithms for non-negative matrix factorization. In Advances in Neural Information Processing Systems. 556--562.
[40]
Weifeng Liu, Hongli Liu, Dapeng Tao, Yanjiang Wang, and Ke Lu. 2015. Multiview Hessian regularized logistic regression for action recognition. Signal Processing 110 (2015), 101--107.
[41]
Weifeng Liu, Huimin Zhang, Dapeng Tao, Yanjiang Wang, and Ke Lu. 2013. Large-scale paralleled sparse principal component analysis. CoRR abs/1312.6182 (2013). http://arxiv.org/abs/1312.6182.
[42]
Xiaoqiang Lu and Xuelong Li. 2014. Multiresolution imaging. IEEE Transactions on Cybernetics 44, 1 (2014), 149--160.
[43]
Xiaoqiang Lu, Yulong Wang, and Yuan Yuan. 2013. Sparse coding from a Bayesian perspective. IEEE Transactions on Neural Networks and Learning Systems 24, 6 (2013), 929--939.
[44]
Xiaoqiang Lu, Yuan Yuan, and Pingkun Yan. 2014. Alternatively constrained dictionary learning for image superresolution. IEEE Transactions on Cybernetics 44, 3 (2014), 366--377.
[45]
Ameesh Makadia, Vladimir Pavlovic, and Sanjiv Kumar. 2008. A new baseline for image annotation. In Proceedings of the 10th European Conference on Computer Vision: Part III. Springer-Verlag.
[46]
Andreas Maurer and Massimiliano Pontil. 2010. K-dimensional coding schemes in Hilbert spaces. IEEE Transactions on Information Theory 56, 11 (2010), 5839--5846.
[47]
Yurii Nesterov. 1983. A method of solving a convex programming problem with convergence rate O(1/k^2). Soviet Mathematics Doklady 27, 2 (1983), 372--376.
[48]
Aude Oliva and Antonio Torralba. 2001. Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision 42, 3 (2001), 145--175.
[49]
Weihua Ou, Xinge You, Dacheng Tao, Pengyue Zhang, Yuanyan Tang, and Ziqi Zhu. 2014. Robust face recognition via occlusion dictionary learning. Pattern Recognition 47, 4 (2014), 1559--1572.
[50]
Zhibin Pan, Xinge You, Hong Chen, Dacheng Tao, and Baochuan Pang. 2013. Generalization performance of magnitude-preserving semi-supervised ranking with graph-based regularization. Information Sciences 221 (2013), 284--296.
[51]
Symeon Papadopoulos, Christos Zigkolis, Yiannis Kompatsiaris, and Athena Vakali. 2011. Cluster-based landmark and event detection for tagged photo collections. IEEE MultiMedia 18, 1 (2011), 52--63.
[52]
Robert Peharz and Franz Pernkopf. 2012. Sparse nonnegative matrix factorization with ℓ0-constraints. Neurocomputing 80 (2012), 38--46.
[53]
Ling Shao, Di Wu, and Xuelong Li. 2014. Learning deep and wide: A spectral method for learning deep networks. IEEE Transactions on Neural Networks and Learning Systems 25, 12 (2014), 2303--2308.
[54]
Yuanlong Shao, Yuan Zhou, Xiaofei He, Deng Cai, and Hujun Bao. 2009. Semi-supervised topic modeling for image annotation. In Proceedings of the 17th ACM International Conference on Multimedia. ACM, 521--524.
[55]
Miaojing Shi, Xinghai Sun, Dacheng Tao, Chao Xu, George Baciu, and Hong Liu. 2015. Exploring spatial correlation for visual object retrieval. ACM Transactions on Intelligent Systems and Technology 6, 2, Article 24 (March 2015), 21 pages.
[56]
Jiayu Tang and Paul H. Lewis. 2008. Non-negative matrix factorisation for object class discovery and image auto-annotation. In Proceedings of the International Conference on Content-based Image and Video Retrieval. ACM, 105--112.
[57]
Dapeng Tao, Jun Cheng, Mingli Song, and Xu Lin. 2016. Manifold ranking-based matrix factorization for saliency detection. IEEE Transactions on Neural Networks and Learning Systems 27, 6 (2016), 1122--1134.
[58]
Dapeng Tao, Lianwen Jin, Weifeng Liu, and Xuelong Li. 2013a. Hessian regularized support vector machines for mobile image annotation on the cloud. IEEE Transactions on Multimedia 15, 4 (2013), 833--844.
[59]
Dapeng Tao, Lianwen Jin, Weifeng Liu, and Xuelong Li. 2013b. Hessian regularized support vector machines for mobile image annotation on the cloud. IEEE Transactions on Multimedia 15, 4 (2013), 833--844.
[60]
Dacheng Tao, Xuelong Li, Xindong Wu, and Stephen J. Maybank. 2007. General tensor discriminant analysis and Gabor features for gait recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 29, 10 (2007), 1700--1715.
[61]
Dacheng Tao, Xiaoou Tang, Xuelong Li, and Yong Rui. 2006. Direct kernel biased discriminant analysis: A new content-based image retrieval relevance feedback algorithm. IEEE Transactions on Multimedia 8, 4 (2006), 716--727.
[62]
Fabian J. Theis, Kurt Stadlthanner, and Toshihisa Tanaka. 2005. First results on uniqueness of sparse non-negative matrix factorization. In Proceedings of the 13th European Signal Processing Conference. Citeseer.
[63]
Chong Wang, David Blei, and Fei-Fei Li. 2009. Simultaneous image classification and annotation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 1903--1910.
[64]
Fei Wang, Noah Lee, Jimeng Sun, Jianying Hu, and Shahram Ebadollahi. 2011. Automatic group sparse coding. Association for the Advancement of Artificial Intelligence. http://www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/view/3717.
[65]
Fei Wang, Tao Li, and Changshui Zhang. 2008. Semi-supervised clustering via matrix factorization. In Proceedings of the International Conference on Data Mining. 1--12.
[66]
Fa Yu Wang, Chong-Yung Chi, Tsung-Han Chan, and Yue Wang. 2010. Nonnegative least-correlated component analysis for separation of dependent sources by volume maximization. IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 5 (2010), 875--888.
[67]
Xuezhi Wen, Ling Shao, Wei Fang, and Yu Xue. 2015. Efficient feature selection and classification for vehicle detection. IEEE Transactions on Circuits and Systems for Video Technology 25, 3 (2015), 508--517.
[68]
John Wright, Allen Y. Yang, Arvind Ganesh, Shankar S. Sastry, and Yi Ma. 2009. Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 31, 2 (2009), 210--227.
[69]
Yang Yang, Yi Yang, Zi Huang, Heng Tao Shen, and Feiping Nie. 2011. Tag localization with spatial correlations and joint group sparsity. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 881--888.
[70]
Jiho Yoo and Seungjin Choi. 2010. Nonnegative matrix factorization with orthogonality constraints. Journal of Computing Science and Engineering 4, 2 (2010), 97--109.
[71]
Xinge You, Qiuhui Chen, Bin Fang, and Yuan Yan Tang. 2006. Thinning character using modulus minima of wavelet transform. International Journal of Pattern Recognition and Artificial Intelligence 20, 3 (2006), 361--375.
[72]
Jun Yu, Dapeng Tao, Jonathan Li, and Jun Cheng. 2014. Semantic preserving distance metric learning and applications. Information Sciences 281 (2014), 674--686.
[73]
Rafal Zdunek and Andrzej Cichocki. 2006. Non-negative matrix factorization with quasi-Newton optimization. In Proceedings of the International Conference on Artificial Intelligence and Soft Computing. 870--879.
[74]
Shaoting Zhang, Junzhou Huang, Yuchi Huang, Yang Yu, Hongsheng Li, and Dimitris N. Metaxas. 2010. Automatic image annotation using group sparsity. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 3312--3319.
[75]
Wei-Shi Zheng, Stan Z. Li, Jian-Huang Lai, and Shengcai Liao. 2007. On constrained sparse matrix factorization. In Proceedings of the IEEE International Conference on Computer Vision. IEEE, 1--8.
[76]
Guoxu Zhou, Shengli Xie, Zuyuan Yang, Jun-Mei Yang, and Zhaoshui He. 2011. Minimum-volume-constrained nonnegative matrix factorization: Enhanced ability of learning parts. IEEE Transactions on Neural Networks 22, 10 (2011), 1626--1637.
[77]
Leyla Zhuhadar, Rong Yang, and Miltiadis D. Lytras. 2013. The impact of social multimedia systems on cyberlearners. Computers in Human Behavior 29, 2 (2013), 378--385.

Information & Contributors

Information

Published In

ACM Transactions on Intelligent Systems and Technology, Volume 8, Issue 3
Special Issue: Mobile Social Multimedia Analytics in the Big Data Era and Regular Papers
May 2017
320 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/3040485
  • Editor: Yu Zheng
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 April 2017
Accepted: 01 August 2016
Revised: 01 November 2015
Received: 01 June 2015
Published in TIST Volume 8, Issue 3

Author Tags

  1. Nesterov's optimal gradient
  2. Non-negative matrix factorization
  3. image annotation
  4. sparseness constraint

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Program for Excellent Young Talents of Yunnan University, the Australian Research Council Projects
  • Shenzhen Technology Project
  • Program for Changjiang Scholars and Innovative Research Team in University of China
  • National Natural Science Foundation of China
  • Guangdong Natural Science Funds
  • Opening Project of State Key Laboratory of Digital Publishing Technology
  • Yunnan Natural Science Funds

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 13
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 30 Aug 2024

Cited By

  • (2022) Momentum-Incorporated Symmetric Non-Negative Latent Factor Models. IEEE Transactions on Big Data 8, 4 (1096--1106). DOI: 10.1109/TBDATA.2020.3012656. Online publication date: 1-Aug-2022.
  • (2021) BALS: Blocked Alternating Least Squares for Parallel Sparse Matrix Factorization on GPUs. IEEE Transactions on Parallel and Distributed Systems 32, 9 (2291--2302). DOI: 10.1109/TPDS.2021.3064942. Online publication date: 1-Sep-2021.
  • (2021) Block-Diagonal Guided Symmetric Nonnegative Matrix Factorization. IEEE Transactions on Knowledge and Data Engineering (1--1). DOI: 10.1109/TKDE.2021.3113943. Online publication date: 2021.
  • (2021) Ball K-Medoids: Faster and Exacter. Advances in Artificial Intelligence and Security (180--192). DOI: 10.1007/978-3-030-78615-1_16. Online publication date: 29-Jun-2021.
  • (2021) Weighted ensemble networks for multiview based tiny object quality assessment. Concurrency and Computation: Practice and Experience 33, 6. DOI: 10.1002/cpe.5995. Online publication date: 23-Jan-2021.
  • (2020) Robust Bi-stochastic Graph Regularized Matrix Factorization for Data Clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence (1--1). DOI: 10.1109/TPAMI.2020.3007673. Online publication date: 2020.
  • (2020) Two fast vector-wise update algorithms for orthogonal nonnegative matrix factorization with sparsity constraint. Journal of Computational and Applied Mathematics 375 (112785). DOI: 10.1016/j.cam.2020.112785. Online publication date: Sep-2020.
  • (2019) Active Transfer Learning Network: A Unified Deep Joint Spectral–Spatial Feature Learning Model for Hyperspectral Image Classification. IEEE Transactions on Geoscience and Remote Sensing 57, 3 (1741--1754). DOI: 10.1109/TGRS.2018.2868851. Online publication date: Mar-2019.
  • (2019) p-Laplacian Regularization for Scene Recognition. IEEE Transactions on Cybernetics 49, 8 (2927--2940). DOI: 10.1109/TCYB.2018.2833843. Online publication date: Aug-2019.
  • (2018) Discriminative and Orthogonal Subspace Constraints-Based Nonnegative Matrix Factorization. ACM Transactions on Intelligent Systems and Technology 9, 6 (1--24). DOI: 10.1145/3229051. Online publication date: 1-Nov-2018.
