DOI: 10.1145/2733373.2806392
short-paper

Accelerating Large-scale Image Retrieval on Heterogeneous Architectures with Spark

Published: 13 October 2015

Abstract

Apache Spark is a general-purpose cluster computing system for big data processing that has recently drawn attention from fields such as pattern recognition and machine learning. Unlike MapReduce, Spark is especially suitable for iterative and interactive computations. This work proposes IRlib, a utility library that accelerates large-scale image retrieval applications by combining the computing power of Spark with that of GPUs. Like MLlib, Spark's built-in machine learning library, IRlib fits into the Spark APIs and benefits from Spark's functionality. The contributions of IRlib are twofold. First, IRlib provides a uniform set of APIs for programming image retrieval applications. Second, it provides high-performance modules for common image retrieval algorithms, which dramatically boost the computational performance of Spark on systems equipped with multiple GPUs. Comparative experiments on large-scale image retrieval demonstrate a significant performance improvement for IRlib over both a single-threaded CPU implementation and Spark without GPUs.

Published In

MM '15: Proceedings of the 23rd ACM international conference on Multimedia
October 2015
1402 pages
ISBN: 9781450334594
DOI: 10.1145/2733373
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graphics processing units
  2. heterogeneous computing
  3. image retrieval
  4. spark

Qualifiers

  • Short-paper

Funding Sources

  • Fundamental Research Funds for the Central Universities
  • "Shu Guang" project of Shanghai Municipal Education Commission and Shanghai Education Development Foundation
  • National Natural Science Foundation of China
  • Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning

Conference

MM '15: ACM Multimedia Conference
October 26 - 30, 2015
Brisbane, Australia

Acceptance Rates

MM '15 Paper Acceptance Rate 56 of 252 submissions, 22%;
Overall Acceptance Rate 995 of 4,171 submissions, 24%

Cited By

  • (2019) Data Storage and Management for Big Multimedia. Big Data Analytics for Large‐Scale Multimedia Search, pp. 209-238. DOI: 10.1002/9781119376996.ch8. Online publication date: 15-Mar-2019
  • (2018) Prototyping a Web-Scale Multimedia Retrieval Service Using Spark. ACM Transactions on Multimedia Computing, Communications, and Applications 14:3s, pp. 1-24. DOI: 10.1145/3209662. Online publication date: 15-Jun-2018
  • (2018) In-Memory Stream Indexing of Massive and Fast Incoming Multimedia Content. IEEE Transactions on Big Data 4:1, pp. 40-54. DOI: 10.1109/TBDATA.2017.2697441. Online publication date: 1-Mar-2018
  • (2017) Towards Engineering a Web-Scale Multimedia Service. Proceedings of the 8th ACM on Multimedia Systems Conference, pp. 1-12. DOI: 10.1145/3083187.3083200. Online publication date: 20-Jun-2017
  • (2017) Spark-SIFT: A Spark-Based Large-Scale Image Feature Extract System. 2017 13th International Conference on Semantics, Knowledge and Grids (SKG), pp. 69-76. DOI: 10.1109/SKG.2017.00020. Online publication date: Aug-2017
  • (2017) In-Memory Distributed Indexing for Large-Scale Media Data Retrieval. 2017 IEEE International Symposium on Multimedia (ISM), pp. 232-239. DOI: 10.1109/ISM.2017.38. Online publication date: Dec-2017
