DOI: 10.1145/1458469.1458473
research-article

A metric cache for similarity search

Published: 30 October 2008

Abstract

Similarity search in metric spaces is a general paradigm that applies to many application fields. It can be exploited effectively in content-based image retrieval systems, which are now shifting their target towards Web scale. In this context, an important issue is the design of scalable solutions that combine parallel and distributed architectures with caching at several levels.
To this end, we investigate the design of a similarity cache that works in metric spaces. The cache can answer with either exact or approximate results: even when an exact match is not present, it may return an approximate result set with quality guarantees. Tests on a collection of one million high-quality digital photos show that the proposed caching techniques can have a significant impact on performance, much as caching of text queries has proved effective for traditional Web search engines.
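One way a cache in a metric space can attach a guarantee to a partial answer is through the triangle inequality. The following is a minimal Python sketch of that idea, not the algorithm evaluated in the paper. It assumes each cache entry stores a past query together with its exact k nearest neighbours; the class name `MetricCache`, its methods `insert` and `lookup`, and the "keep the largest guaranteed set" policy are all hypothetical choices made for illustration.

```python
import math


class MetricCache:
    """Illustrative similarity cache; names and policy are hypothetical."""

    def __init__(self, dist):
        self.dist = dist      # any function satisfying the metric axioms
        self.entries = []     # (cached query, ranked k-NN results, radius)

    def insert(self, query, knn_results):
        # Assumption: knn_results is the non-empty, exact k-NN list of query.
        ranked = sorted(knn_results, key=lambda o: self.dist(query, o))
        radius = self.dist(query, ranked[-1])   # distance to the k-th neighbour
        self.entries.append((query, ranked, radius))

    def lookup(self, q, k):
        """Return up to k objects guaranteed to be true nearest neighbours of q."""
        best = []
        for cached_q, ranked, radius in self.entries:
            d = self.dist(q, cached_q)
            if d == 0:
                # Exact hit: the cached ranking already answers the query.
                candidate = ranked[:k]
            else:
                # Triangle inequality: dist(q, o) < radius - d implies
                # dist(cached_q, o) < radius, so every database object this
                # close to q must already be in the cached result list.
                safe = radius - d
                near = [o for o in ranked if self.dist(q, o) < safe]
                candidate = sorted(near, key=lambda o: self.dist(q, o))[:k]
            if len(candidate) > len(best):
                best = candidate
        return best
```

For example, after `cache = MetricCache(math.dist)` and `cache.insert((1.0,), [(1.0,), (2.0,), (-1.0,)])`, a call to `cache.lookup((1.75,), 2)` can be served from the cache alone. When `lookup` returns fewer than `k` objects, a caller could either accept the shorter guaranteed list as an approximate answer or fall back to the underlying index; which choice is appropriate depends on the application.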

Published In

LSDS-IR '08: Proceedings of the 2008 ACM workshop on Large-Scale distributed systems for information retrieval
October 2008
90 pages
ISBN:9781605582542
DOI:10.1145/1458469
Program Chairs: Sebastian Michel, Gleb Skobeltsyn, Wai Gen Yee
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. content-based retrieval
  2. metric spaces
  3. query-result caching

Qualifiers

  • Research-article

Conference

CIKM08: Conference on Information and Knowledge Management
October 30, 2008
Napa Valley, California, USA

Acceptance Rates

Overall Acceptance Rate 3 of 5 submissions, 60%

Article Metrics

  • Downloads (Last 12 months): 18
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 09 Nov 2024

Cited By

  • (2024) Adapting Neural Networks at Runtime: Current Trends in At-Runtime Optimizations for Deep Learning. ACM Computing Surveys 56:10 (1-40). DOI: 10.1145/3657283. Online publication date: 14-May-2024
  • (2024) Toward Inference Delivery Networks: Distributing Machine Learning With Optimality Guarantees. IEEE/ACM Transactions on Networking 32:1 (859-873). DOI: 10.1109/TNET.2023.3305922. Online publication date: Feb-2024
  • (2024) TTL model for an LRU-based similarity caching policy. Computer Networks 241 (110206). DOI: 10.1016/j.comnet.2024.110206. Online publication date: Mar-2024
  • (2023) Ascent Similarity Caching With Approximate Indexes. IEEE/ACM Transactions on Networking 31:3 (1173-1186). DOI: 10.1109/TNET.2022.3217012. Online publication date: Jun-2023
  • (2022) Caching Historical Embeddings in Conversational Search. ACM Transactions on the Web 18:4 (1-19). DOI: 10.1145/3578519. Online publication date: 29-Dec-2022
  • (2022) GRADES: Gradient Descent for Similarity Caching. IEEE/ACM Transactions on Networking (1-12). DOI: 10.1109/TNET.2022.3187044. Online publication date: 2022
  • (2022) Accelerating Deep Learning Classification with Error-controlled Approximate-key Caching. IEEE INFOCOM 2022 - IEEE Conference on Computer Communications (2118-2127). DOI: 10.1109/INFOCOM48880.2022.9796677. Online publication date: 2-May-2022
  • (2022) Computing the Hit Rate of Similarity Caching. GLOBECOM 2022 - 2022 IEEE Global Communications Conference (141-146). DOI: 10.1109/GLOBECOM48099.2022.10000890. Online publication date: 4-Dec-2022
  • (2021) GRADES: Gradient Descent for Similarity Caching. IEEE INFOCOM 2021 - IEEE Conference on Computer Communications (1-10). DOI: 10.1109/INFOCOM42981.2021.9488757. Online publication date: 10-May-2021
  • (2021) Taking two Birds with one k-NN Cache. 2021 IEEE Global Communications Conference (GLOBECOM) (1-6). DOI: 10.1109/GLOBECOM46510.2021.9685954. Online publication date: 7-Dec-2021
