The knowing camera 2: recognizing and annotating places-of-interest in smartphone photos

Authors:

Sai Wu

SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval

Pages 707 - 716

https://doi.org/10.1145/2600428.2609557

Published: 03 July 2014

Abstract

This paper presents a project called Knowing Camera for recognizing and annotating places-of-interest (POIs) in smartphone photos in real time, using the online geotagged images available for such places. We propose a "Spatial+Visual" (S+V) framework, which consists of a probabilistic field-of-view model in the spatial phase and a sparse coding similarity metric in the visual phase, to recognize phone-captured POIs. Moreover, we put forward an offline Collaborative Salient Area (COSTAR) mining algorithm that detects common visual features (called Costars) among the noisy photos geotagged on each POI, and thereby cleans the geotagged image database. The mining result can be used to annotate the region-of-interest on the query image during online query processing. In addition, this mining procedure further improves the efficiency and accuracy of the S+V framework. Our experiments on the real-world and Oxford 5K datasets show promising recognition and annotation performance for the proposed approach, and show that the proposed COSTAR mining technique outperforms the state-of-the-art approach.
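To make the idea of a spatial phase concrete, the following is a minimal sketch of how POI candidates could be narrowed down from a phone's position and compass heading before any visual matching. It is not the paper's probabilistic field-of-view model, whose details the abstract does not give. The function names, the Gaussian falloff on angular offset, and the parameters `sigma_deg`, `max_dist_m`, and `min_score` are all assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_M = 6371000.0


def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Haversine distance (metres) and initial bearing (degrees, 0 = north)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    dist = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    y = math.sin(dlon) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlon)
    bearing = math.degrees(math.atan2(y, x)) % 360.0
    return dist, bearing


def fov_score(camera, heading_deg, poi, sigma_deg=15.0, max_dist_m=500.0):
    """Hypothetical score in [0, 1]: 0 beyond max_dist_m, otherwise a Gaussian
    falloff on the angle between the camera heading and the bearing to the POI."""
    dist, bearing = distance_and_bearing(camera[0], camera[1], poi[0], poi[1])
    if dist > max_dist_m:
        return 0.0
    # Signed angular offset wrapped into [-180, 180).
    offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return math.exp(-0.5 * (offset / sigma_deg) ** 2)


def spatial_candidates(camera, heading_deg, pois, min_score=0.1):
    """Return (name, score) pairs scoring at least min_score, best first."""
    scored = [(name, fov_score(camera, heading_deg, loc)) for name, loc in pois.items()]
    kept = [(name, score) for name, score in scored if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Usage: a camera at the origin facing north, with three illustrative POIs.
pois = {"north": (0.001, 0.0), "east": (0.0, 0.001), "far": (1.0, 0.0)}
candidates = spatial_candidates((0.0, 0.0), 0.0, pois)
```

In a two-phase design of this kind, only the surviving candidates would be passed on to the more expensive visual comparison.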

Index Terms

1. Information systems
  1. Information retrieval

Published In

SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval

July 2014

1330 pages

ISBN:9781450322577

DOI:10.1145/2600428

General Chairs: Shlomo Geva (Queensland University of Technology), Andrew Trotman (University of Dunedin)

Program Chairs: Peter Bruza (Queensland University of Technology), Charles L.A. Clarke (University of Waterloo), Kal Järvelin (University of Tampere)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SIGIR '14

Sponsor:

SIGIR

SIGIR '14: The 37th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 6 - 11, 2014

Gold Coast, Queensland, Australia

Acceptance Rates

SIGIR '14 Paper Acceptance Rate 82 of 387 submissions, 21%

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Cited By

Peng P, Gu X, Zhu S, Shou L, Chen G (2019). One net to rule them all: efficient recognition and retrieval of POI from geo-tagged photos. Multimedia Tools and Applications. DOI: 10.1007/s11042-018-6847-y. Online publication date: 4-Jan-2019.
https://doi.org/10.1007/s11042-018-6847-y

Guo N, Luo J, Ling Z, Yang M, Wu W, Fu X (2019). Your clicks reveal your secrets. Multimedia Tools and Applications 78:7 (8337-8362). DOI: 10.1007/s11042-018-6815-6. Online publication date: 1-Apr-2019.
https://dl.acm.org/doi/10.1007/s11042-018-6815-6

Gu J, Zhang L, Wang J, Yu Z, Xin X, Liu Y (2018). Spotlight: Multiple-Object Localization by Mobile Photo Fusion. 2018 4th International Conference on Big Data Computing and Communications (BIGCOM) (231-235). DOI: 10.1109/BIGCOM.2018.00044. Online publication date: Aug-2018.
https://doi.org/10.1109/BIGCOM.2018.00044

Yu J, Chen Y, Xu X (2018). State-of-Art Researches. Sensing Vehicle Conditions for Detecting Driving Behaviors (65-68). DOI: 10.1007/978-3-319-89770-7_5. Online publication date: 19-Apr-2018.
https://doi.org/10.1007/978-3-319-89770-7_5

Peng P, Shou L, Chen K, Chen G, Wu S (2016). KISS. IEEE Transactions on Knowledge and Data Engineering 28:4 (994-1006). DOI: 10.1109/TKDE.2015.2489647. Online publication date: 1-Apr-2016.
https://dl.acm.org/doi/10.1109/TKDE.2015.2489647

Peng P, Chen H, Shou L, Chen K, Chen G, Xu C, Bailey J, Moffat A, Aggarwal C, de Rijke M, Kumar R, Murdock V, Sellis T, Yu J (2015). DeepCamera. Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (1891-1894). DOI: 10.1145/2806416.2806620. Online publication date: 17-Oct-2015.
https://dl.acm.org/doi/10.1145/2806416.2806620

Li H, Peng P, Lu H, Shou L, Chen K, Chen G, Mase K, Langheinrich M, Gatica-Perez D, Gellersen H, Choudhury T, Yatani K (2015). E2C2. Adjunct Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2015 ACM International Symposium on Wearable Computers (9-12). DOI: 10.1145/2800835.2800841. Online publication date: 7-Sep-2015.
https://dl.acm.org/doi/10.1145/2800835.2800841

McAuley J, Targett C, Shi Q, van den Hengel A, Baeza-Yates R, Lalmas M, Moffat A, Ribeiro-Neto B (2015). Image-Based Recommendations on Styles and Substitutes. Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (43-52). DOI: 10.1145/2766462.2767755. Online publication date: 9-Aug-2015.
https://dl.acm.org/doi/10.1145/2766462.2767755
