DOI: 10.1145/1254882.1254916
Article

Building high accuracy bloom filters using partitioned hashing

Published: 12 June 2007
    Abstract

    The growing importance of operations such as packet-content inspection, packet classification based on non-IP headers, maintaining flow-state, etc. has led to increased interest in the networking applications of Bloom filters. This is because Bloom filters provide a relatively easy method for hardware implementation of set-membership queries. However, the tradeoff is that Bloom filters only provide a probabilistic test and membership queries can result in false positives. Ideally, we would like this false positive probability to be very low. The main contribution of this paper is a method for significantly reducing this false positive probability in comparison to existing schemes. This is done by developing a partitioned hashing method which results in a choice of hash functions that set far fewer bits in the Bloom filter bit vector than would be the case otherwise. This lower fill factor of the bit vector translates to a much lower false positive probability. We show experimentally that this improved choice can result in as much as a ten-fold increase in accuracy over standard Bloom filters. We also show that the scheme performs much better than other proposed schemes for improving Bloom filters.
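
The link the abstract draws between fill factor and false positives can be made concrete with a small sketch. If a fraction f of the bits is set and a query checks k positions that behave roughly independently, a non-member is accepted with probability about f^k, so a lower fill factor pays off exponentially in k. The Python below is an illustration under stated assumptions, not the paper's algorithm: the class names, the salted BLAKE2b hash, the `GROUP_SEED` constant, and the greedy per-group selection among `candidates` seed sets are all hypothetical choices made here to show one way a "choice of hash functions that sets fewer bits" could look.

```python
import hashlib


def bit_position(item: str, seed: int, m: int) -> int:
    """Map (item, seed) to an index in [0, m) using salted BLAKE2b."""
    digest = hashlib.blake2b(
        item.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little") % m


def estimated_fpp(fill: float, k: int) -> float:
    """Approximate false positive probability for fill factor `fill`."""
    return fill ** k


class StandardBloom:
    """Plain Bloom filter: every item uses the same k hash functions."""

    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.bits = bytearray(m)  # one byte per bit, for readability

    def positions(self, item: str) -> list:
        return [bit_position(item, seed, self.m) for seed in range(self.k)]

    def add(self, item: str) -> None:
        for p in self.positions(item):
            self.bits[p] = 1

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p] for p in self.positions(item))

    def fill_factor(self) -> float:
        return sum(self.bits) / self.m


class GreedyPartitionedBloom:
    """Hypothetical sketch: keys are hashed into groups, and each group
    greedily keeps the candidate seed set that sets the fewest new bits."""

    GROUP_SEED = 2**40  # assumed seed reserved for the grouping hash

    def __init__(self, m: int, k: int, groups: int, candidates: int):
        self.m, self.k = m, k
        self.groups, self.candidates = groups, candidates  # candidates >= 1
        self.bits = bytearray(m)
        self.choice = [0] * groups  # chosen candidate index per group

    def group_of(self, item: str) -> int:
        return bit_position(item, self.GROUP_SEED, self.groups)

    def positions(self, item: str, cand: int) -> list:
        first_seed = cand * self.k  # candidate c uses seeds c*k .. c*k+k-1
        return [bit_position(item, first_seed + j, self.m) for j in range(self.k)]

    def build(self, items) -> None:
        buckets = [[] for _ in range(self.groups)]
        for item in items:
            buckets[self.group_of(item)].append(item)
        for g, bucket in enumerate(buckets):
            best_new = None
            for cand in range(self.candidates):
                # Bits this candidate would newly set, given bits set so far.
                new = {p for item in bucket
                       for p in self.positions(item, cand) if not self.bits[p]}
                if best_new is None or len(new) < len(best_new):
                    self.choice[g], best_new = cand, new
            for p in best_new:
                self.bits[p] = 1

    def __contains__(self, item: str) -> bool:
        cand = self.choice[self.group_of(item)]
        return all(self.bits[p] for p in self.positions(item, cand))

    def fill_factor(self) -> float:
        return sum(self.bits) / self.m
```

Building both filters over the same key set and comparing `fill_factor()` and `estimated_fpp(...)` shows whether the greedy choice helped for that input; the sketch makes no claim about the size of the gain. Note that the per-group `choice` table is extra state a query must consult, a cost this illustration does not account for.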


        Published In

        SIGMETRICS '07: Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
        June 2007
        398 pages
        ISBN: 9781595936394
        DOI: 10.1145/1254882
        • ACM SIGMETRICS Performance Evaluation Review, Volume 35, Issue 1
          SIGMETRICS '07 Conference Proceedings
          June 2007
          382 pages
          ISSN: 0163-5999
          DOI: 10.1145/1269899
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. bloom filter
        2. hashing

        Qualifiers

        • Article

        Conference

        SIGMETRICS07

        Acceptance Rates

        Overall Acceptance Rate 459 of 2,691 submissions, 17%

        Article Metrics

        • Downloads (Last 12 months): 23
        • Downloads (Last 6 weeks): 1

        Cited By

        • (2024) FeCBF: A Novel Sub-Optimal Cascaded Bloom Filter Structure Based on Feature Extraction. IEEE Access 12 (67619-67631). DOI: 10.1109/ACCESS.2024.3399062. Online publication date: 2024
        • (2022) A Pareto optimal Bloom filter family with hash adaptivity. The VLDB Journal 32:3 (525-548). DOI: 10.1007/s00778-022-00755-z. Online publication date: 26-Jul-2022
        • (2021) Hash Adaptive Bloom Filter. 2021 IEEE 37th International Conference on Data Engineering (ICDE) (636-647). DOI: 10.1109/ICDE51399.2021.00061. Online publication date: Apr-2021
        • (2019) Optimizing Bloom Filter: Challenges, Solutions, and Comparisons. IEEE Communications Surveys & Tutorials 21:2 (1912-1949). DOI: 10.1109/COMST.2018.2889329. Online publication date: Oct-2020
        • (2017) Finding Needles in a Haystack: Missing Tag Detection in Large RFID Systems. IEEE Transactions on Communications 65:5 (2036-2047). DOI: 10.1109/TCOMM.2017.2666790. Online publication date: May-2017
        • (2016) False-Positive Probability and Compression Optimization for Tree-Structured Bloom Filters. ACM Transactions on Modeling and Performance Evaluation of Computing Systems 1:4 (1-39). DOI: 10.1145/2940324. Online publication date: 21-Sep-2016
        • (2015) A Performance Evaluation of Hash Functions for IP Reputation Lookup Using Bloom Filters. Proceedings of the 2015 10th International Conference on Availability, Reliability and Security (516-521). DOI: 10.1109/ARES.2015.101. Online publication date: 24-Aug-2015
        • (2015) Parallel Bloom Filter on Xeon Phi Many-Core Processors. Proceedings, Part II, of the 15th International Conference on Algorithms and Architectures for Parallel Processing - Volume 9529 (388-405). DOI: 10.1007/978-3-319-27122-4_27. Online publication date: 18-Nov-2015
        • (2014) Fast Bloom Filters and Their Generalization. IEEE Transactions on Parallel and Distributed Systems 25:1 (93-103). DOI: 10.1109/TPDS.2013.46. Online publication date: 1-Jan-2014
        • (2014) Investigation on bloom filter and implementation of 3k combined parallel tiger bloom filter design. 2014 International Conference on Electronics and Communication Systems (ICECS) (1-7). DOI: 10.1109/ECS.2014.6892509. Online publication date: Feb-2014
