Selecting vantage objects for similarity indexing

Published: 02 September 2011

Abstract

Indexing has become a key element in the pipeline of a multimedia retrieval system, due to continuous increases in database size, data complexity, and complexity of similarity measures. The primary goal of any indexing algorithm is to overcome high computational costs involved with comparing the query to every object in the database. This is achieved by efficient pruning in order to select only a small set of candidate matches. Vantage indexing is an indexing technique that belongs to the category of embedding or mapping approaches, because it maps a dissimilarity space onto a vector space such that traditional access methods can be used for querying. Each object is represented by a vector of dissimilarities to a small set of m reference objects, called vantage objects. Querying takes place within this vector space. The retrieval performance of a system based on this technique can be improved significantly through a proper choice of vantage objects. We propose a new technique for selecting vantage objects that addresses the retrieval performance directly, and present extensive experimental results based on three data sets of different size and modality, including a comparison with other selection strategies. The results clearly demonstrate both the efficacy and scalability of the proposed approach.
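
The embedding described in the abstract can be illustrated with a small Python sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`embed`, `build_index`, `range_query`), the 2D point data, the Euclidean dissimilarity, and the random choice of vantage objects are all hypothetical stand-ins. In particular, the random choice is not the selection technique the paper proposes. The sketch also assumes the dissimilarity is a metric, so that the triangle inequality makes the L-infinity distance between two vantage vectors a lower bound on the original dissimilarity.

```python
import math
import random


def embed(obj, vantage_objects, dist):
    """Represent an object by its dissimilarities to the m vantage objects."""
    return [dist(obj, v) for v in vantage_objects]


def build_index(database, vantage_objects, dist):
    """Precompute the vantage vector of every database object."""
    return [embed(obj, vantage_objects, dist) for obj in database]


def linf(u, v):
    """L-infinity distance between two vantage vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


def range_query(query, radius, database, index, vantage_objects, dist):
    """Return (indices of true matches, number of candidates examined).

    Assumes dist is a metric: |d(q, v) - d(o, v)| <= d(q, o) for every
    vantage object v, so any object whose vantage vector lies farther than
    `radius` from the query's vector cannot be a match and is pruned.
    """
    q_vec = embed(query, vantage_objects, dist)
    candidates = [i for i, vec in enumerate(index) if linf(q_vec, vec) <= radius]
    # Only the surviving candidates are compared with the real dissimilarity.
    hits = [i for i in candidates if dist(query, database[i]) <= radius]
    return hits, len(candidates)


# Illustrative data: 2D points under Euclidean distance (an assumption).
random.seed(0)
database = [(random.random(), random.random()) for _ in range(200)]
vantage_objects = random.sample(database, 3)  # random choice, for illustration only
index = build_index(database, vantage_objects, math.dist)

hits, n_candidates = range_query(
    (0.5, 0.5), 0.1, database, index, vantage_objects, math.dist
)
print(f"{len(hits)} matches found after examining {n_candidates} candidates")
```

The gap between the number of candidates and the number of true matches is where the choice of vantage objects matters: a better set prunes more non-matching objects before the expensive dissimilarity is computed.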

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 7, Issue 3
August 2011
117 pages
ISSN:1551-6857
EISSN:1551-6865
DOI:10.1145/2000486
Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 September 2011
Accepted: 01 January 2010
Revised: 01 September 2009
Received: 01 July 2008
Published in TOMM Volume 7, Issue 3

Author Tags

  1. Multimedia retrieval
  2. embedding methods
  3. indexing
  4. vantage objects

Qualifiers

  • Research-article
  • Research
  • Refereed
