Practical Algorithms for Finding Extremal Sets

Published: 05 April 2016

Abstract

The minimal sets within a collection of sets are defined as the ones that do not have a proper subset within the collection, and the maximal sets are the ones that do not have a proper superset within the collection. Identifying extremal sets is a fundamental problem with a wide range of applications in SAT solvers, data mining, and social network analysis. In this article, we present two novel improvements of the high-quality extremal set identification algorithm, AMS-Lex, described by Bayardo and Panda. The first technique uses memoization to improve the execution time of the single-threaded variant of AMS-Lex, while our second improvement uses parallel programming methods. In a subset of the presented experiments, our memoized algorithm executes more than 400 times faster than the highly efficient publicly available implementation of AMS-Lex. Moreover, we show that our modified algorithm's speedup is not bounded above by a constant and that it increases as the length of the common prefixes in successive input itemsets increases. We provide experimental results using both real-world and synthetic datasets, and show that our multithreaded variant outperforms AMS-Lex by a factor of 3 to 6. We find that on synthetic input datasets, when executed using 16 CPU cores of a 32-core machine, our multithreaded program executes about as fast as the state-of-the-art parallel GPU-based program using an NVIDIA GTX 580 graphics processing unit.
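
To make the definitions in the abstract concrete, the following is a minimal brute-force sketch of extremal set identification. It is not AMS-Lex and not the memoized or multithreaded variants the article describes; it only compares every pair of sets, so it is quadratic in the number of sets. The function name `extremal_sets` and the sample data are assumptions made for illustration.

```python
from typing import FrozenSet, Hashable, Iterable, List, Tuple

SetList = List[FrozenSet[Hashable]]


def extremal_sets(collection: Iterable[Iterable[Hashable]]) -> Tuple[SetList, SetList]:
    """Return (minimal, maximal) sets of a collection by pairwise comparison.

    Hypothetical helper for illustration only: a naive quadratic baseline,
    not the AMS-Lex algorithm discussed in the article.
    """
    # Deduplicate while preserving input order; equal sets are neither
    # proper subsets nor proper supersets of each other.
    sets = list(dict.fromkeys(frozenset(s) for s in collection))
    # A set is minimal if no other set in the collection is a proper subset of it.
    minimal = [s for s in sets if not any(t < s for t in sets)]
    # A set is maximal if no other set in the collection is a proper superset of it.
    maximal = [s for s in sets if not any(t > s for t in sets)]
    return minimal, maximal


if __name__ == "__main__":
    data = [{1, 2}, {1, 2, 3}, {2}, {4, 5}, {4}]
    minimal, maximal = extremal_sets(data)
    print(sorted(sorted(s) for s in minimal))  # [[2], [4]]
    print(sorted(sorted(s) for s in maximal))  # [[1, 2, 3], [4, 5]]
```

In the sample, `{2}` and `{4}` have no proper subset in the collection, so they are minimal, while `{1, 2, 3}` and `{4, 5}` have no proper superset, so they are maximal.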

References

[1] Roberto J. Bayardo and Biswanath Panda. 2011. Fast algorithms for finding extremal sets. In SDM. SIAM/Omnipress, 25--34.
[2] Daniel Bundala, Michael Codish, Luís Cruz-Filipe, Peter Schneider-Kamp, and Jakub Závodný. 2014. Optimal-depth sorting networks. CoRR abs/1412.5302 (2014). http://arxiv.org/abs/1412.5302
[3] Niklas Eén and Armin Biere. 2005. Effective preprocessing in SAT through variable and clause elimination. In SAT (Lecture Notes in Computer Science), Fahiem Bacchus and Toby Walsh (Eds.), Vol. 3569. Springer, 61--75.
[4] Marta Fort, J. Antoni Sellarès, and Nacho Valladares. 2014. Finding extremal sets on the GPU. J. Parallel Distrib. Comput. 74, 1 (Jan. 2014), 1891--1899.
[5] M. Marinov and D. Gregg. 2015. On the GI-completeness of a sorting networks isomorphism. ArXiv e-prints (July 2015).
[6] Taneli Mielikäinen, Pance Panov, and Saso Dzeroski. 2006. Itemset support queries using frequent itemsets and their condensed representations. In Discovery Science (Lecture Notes in Computer Science), Ljupco Todorovski, Nada Lavrac, and Klaus P. Jantke (Eds.), Vol. 4265. Springer, 161--172.
[7] Paul Pritchard. 1991. Opportunistic algorithms for eliminating supersets. Acta Inf. 28, 8 (1991), 733--754.
[8] Paul Pritchard. 1997. An old sub-quadratic algorithm for finding extremal sets. Inf. Process. Lett. 62, 6 (1997), 329--334.
[9] Hong Shen and D. J. Evans. 1996. Fast sequential and parallel algorithms for finding extremal sets. Int. J. Comput. Math. 61, 3--4 (1996), 195--211.
[10] Marcos R. Vieira, Petko Bakalov, and Vassilis J. Tsotras. 2009. On-line discovery of flock patterns in spatio-temporal data. In GIS, Divyakant Agrawal, Walid G. Aref, Chang-Tien Lu, Mohamed F. Mokbel, Peter Scheuermann, Cyrus Shahabi, and Ouri Wolfson (Eds.). ACM, 286--295.
[11] Daniel M. Yellin. 1992. Algorithms for subset testing and finding maximal sets. In SODA, Greg N. Frederickson (Ed.). ACM/SIAM, 386--392.
[12] Daniel M. Yellin and Charanjit S. Jutla. 1993. Finding extremal sets in less than quadratic time. Inf. Process. Lett. 48, 1 (1993), 29--34.

Published In

ACM Journal of Experimental Algorithmics, Volume 21
Special Issue SEA 2014, Regular Papers and Special Issue ALENEX 2013
2016
404 pages
ISSN: 1084-6654
EISSN: 1084-6654
DOI: 10.1145/2888418
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 April 2016
Accepted: 01 February 2016
Revised: 01 October 2015
Received: 01 January 2015
Published in JEA Volume 21

Author Tags

  1. Algorithms
  2. dataset
  3. extremal sets
  4. itemset
  5. memoization
  6. parallel
  7. practical

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Irish Research Council (IRC) and SFI project 12/IA/1381

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 7
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 16 Oct 2024

Cited By

  • (2024) Connected Components for Scaling Partial-order Blocking to Billion Entities. Journal of Data and Information Quality 16:1 (1-29). DOI: 10.1145/3646553. Online publication date: 19-Mar-2024
  • (2023) IoT-Paradigm: Evolution Challenges and Proposed Solutions. 2023 IEEE International Smart Cities Conference (ISC2) (1-5). DOI: 10.1109/ISC257844.2023.10293646. Online publication date: 24-Sep-2023
  • (2022) High-order Line Graphs of Non-uniform Hypergraphs: Algorithms, Applications, and Experimental Analysis. 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (784-794). DOI: 10.1109/IPDPS53621.2022.00081. Online publication date: May-2022
  • (2022) Towards hierarchical affiliation resolution: framework, baselines, dataset. International Journal on Digital Libraries 23:3 (267-288). DOI: 10.1007/s00799-022-00326-1. Online publication date: 28-May-2022
  • (2017) MaxPre: An Extended MaxSAT Preprocessor. Theory and Applications of Satisfiability Testing – SAT 2017 (449-456). DOI: 10.1007/978-3-319-66263-3_28. Online publication date: 9-Aug-2017
