Image similarity search with compact data structures

Published: 13 November 2004

Abstract

Recent theoretical advances on compact data structures (also called "sketches") have raised the question of whether they can be applied effectively to content-based image retrieval systems. The main challenge is to derive an algorithm that achieves high-quality similarity searches while using compact metadata. This paper proposes a new similarity search method consisting of three parts. The first is a new region feature representation with a weighted ℓ1 distance function, together with EMD* match, an improved Earth Mover's Distance (EMD) match, to compute image similarity. The second is a thresholding and transformation algorithm that converts feature vectors into very compact data structures. The third is an EMD-embedding-based filtering method that speeds up the query process. We have implemented a prototype system with the proposed method and performed experiments with a 10,000-image database. Our results show that the proposed method can achieve more effective similarity searches than previous approaches, with metadata 3 to 72 times more compact than that of previous systems. The experiments also show that our EMD-embedding-based filtering technique can speed up the query process by a factor of 5 or more with little loss in query effectiveness.
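
The abstract names its building blocks (a weighted ℓ1 distance, thresholding of feature vectors into compact bit sketches, and a filter step before exact ranking) but does not spell out the algorithms. The following is a minimal Python sketch of one generic way such pieces can fit together, not the authors' method. It assumes features scaled to [0, 1] and uses a random-threshold scheme in which each sketch bit compares one randomly chosen dimension against a random threshold, so that the expected Hamming distance between two sketches grows with their weighted ℓ1 distance. All function names and parameters (`make_sketcher`, `n_bits`, `n_candidates`, and so on) are hypothetical. The filter here uses Hamming distance on sketches as a stand-in for the EMD embedding mentioned in the abstract.

```python
import random


def weighted_l1(x, y, w):
    """Weighted L1 distance: sum over i of w[i] * |x[i] - y[i]|."""
    return sum(wi * abs(xi - yi) for xi, yi, wi in zip(x, y, w))


def make_sketcher(dim, n_bits, weights=None, lo=0.0, hi=1.0, seed=0):
    """Return a function that maps a feature vector to an n_bits-bit integer.

    Assumption (not from the paper): each bit picks one dimension with
    probability proportional to its weight and a uniform threshold in
    [lo, hi]; the bit is 1 when the feature value exceeds the threshold.
    """
    rng = random.Random(seed)
    weights = weights or [1.0] * dim
    dims = rng.choices(range(dim), weights=weights, k=n_bits)
    thresholds = [rng.uniform(lo, hi) for _ in range(n_bits)]

    def sketch(x):
        bits = 0
        for d, t in zip(dims, thresholds):
            bits = (bits << 1) | (1 if x[d] > t else 0)
        return bits

    return sketch


def hamming(a, b):
    """Number of bit positions in which two integer sketches differ."""
    return bin(a ^ b).count("1")


def filter_then_rank(query, database, db_sketches, sketch, w, n_candidates, k):
    """Two-stage search: cheap Hamming filter, then exact weighted L1 ranking.

    Returns the indices of the k best matches among the n_candidates items
    whose sketches are closest to the query sketch.
    """
    q_bits = sketch(query)
    # Stage 1: keep the items whose sketches are nearest in Hamming distance.
    candidates = sorted(
        range(len(database)), key=lambda i: hamming(q_bits, db_sketches[i])
    )[:n_candidates]
    # Stage 2: rank the surviving candidates by the exact distance.
    return sorted(
        candidates, key=lambda i: weighted_l1(query, database[i], w)
    )[:k]
```

In use, the database sketches would be computed once (`db_sketches = [sketch(v) for v in database]`) and stored in place of, or alongside, the full feature vectors. The trade-off is governed by `n_bits` (sketch size versus how well Hamming distance tracks the exact distance) and `n_candidates` (filter speed versus the chance of discarding a true match).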

    Published In

    CIKM '04: Proceedings of the thirteenth ACM international conference on Information and knowledge management
    November 2004
    678 pages
    ISBN:1581138741
    DOI:10.1145/1031171
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. compact data structures
    2. image similarity
    3. search

    Qualifiers

    • Article

    Conference

    CIKM04: Conference on Information and Knowledge Management
    November 8 - 13, 2004
    Washington, D.C., USA

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Cited By

    • (2023) An efficient indexing technique for billion-scale nearest neighbor search. Multimedia Tools and Applications, 82:20 (31673-31689). DOI: 10.1007/s11042-023-14825-z. Online publication date: 23-Mar-2023
    • (2022) Registration-Free Multicomponent Joint AVA Inversion Using Optimal Transport. IEEE Transactions on Geoscience and Remote Sensing, 60 (1-13). DOI: 10.1109/TGRS.2021.3063271. Online publication date: 2022
    • (2021) On the Similarity Search With Hamming Space Sketches. Intelligent Analytics With Advanced Multi-Industry Applications (97-127). DOI: 10.4018/978-1-7998-4963-6.ch005. Online publication date: 2021
    • (2021) A Semantic-Based Strategy to Model Multimedia Social Networks. Transactions on Large-Scale Data- and Knowledge-Centered Systems XLVII (29-50). DOI: 10.1007/978-3-662-62919-2_2. Online publication date: 17-Jan-2021
    • (2020) Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination. Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (2539-2554). DOI: 10.1145/3318464.3380600. Online publication date: 11-Jun-2020
    • (2020) Fast Locally Weighted PLS Modeling for Large-Scale Industrial Processes. Industrial & Engineering Chemistry Research, 59:47 (20779-20786). DOI: 10.1021/acs.iecr.0c03932. Online publication date: 11-Nov-2020
    • (2019) FaceTimeMap. International Journal of Multimedia Data Engineering and Management, 10:2 (37-59). DOI: 10.4018/IJMDEM.2019040103. Online publication date: 1-Apr-2019
    • (2019) Experimental Evaluation of Local Sensitive Hashing Functions for Face Recognition. 2019 5th International Conference on Web Research (ICWR) (184-195). DOI: 10.1109/ICWR.2019.8765276. Online publication date: Apr-2019
    • (2018) Privacy Threats and Protection Recommendations for the Use of Geosocial Network Data in Research. Social Sciences, 7:10 (191). DOI: 10.3390/socsci7100191. Online publication date: 11-Oct-2018
    • (2018) Binary Sketches for Secondary Filtering. ACM Transactions on Information Systems, 37:1 (1-28). DOI: 10.1145/3231936. Online publication date: 6-Dec-2018
