DOI: 10.1145/1698790.1698810
research-article

Hiding co-occurring frequent itemsets

Published: 22 March 2009

Abstract

Knowledge hiding, that is, hiding rules or patterns that can be inferred from published data and are deemed sensitive, has been studied extensively in the context of frequent itemset and association rule mining from transactional data. Research in this area has focused mainly on developing sophisticated methods that distort the data less. In this work, we extend frequent itemset hiding to the co-occurring frequent itemset hiding problem. Co-occurring frequent itemsets are itemsets that co-exist in the output of frequent itemset mining. What distinguishes this problem from classical frequent itemset hiding is the new definition of sensitivity: a set of itemsets is sensitive if all of its itemsets appear together in the frequent itemset mining results. In other words, co-occurrence is defined with reference to the mining results rather than the raw input dataset, and is thus a kind of meta-knowledge. Our notion of co-occurrence also differs from association rules: the itemsets in an association rule must be frequently present in the same set of transactions, whereas co-occurrence does not require joint occurrence in the same transactions.
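The sensitivity definition above can be illustrated with a small sketch. The code is not taken from the paper: the brute-force miner, the function names `frequent_itemsets` and `co_occurs`, the toy transactions, and the absolute support threshold are all assumptions chosen for illustration. It shows two itemsets that co-occur in the mining results even though they never share a transaction, and a singleton sensitive set, which corresponds to classical frequent itemset hiding.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force miner (illustrative only): maps each frequent itemset to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        found = False
        for cand in combinations(items, k):
            itemset = frozenset(cand)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
                found = True
        if not found:
            break  # anti-monotonicity: no larger itemset can be frequent
    return result


def co_occurs(sensitive_set, frequent):
    """A set of itemsets is sensitive-and-exposed if every member is in the mining results."""
    return all(frozenset(s) in frequent for s in sensitive_set)


# Toy data (assumed): {a,b} and {c,d} are each frequent but never share a transaction.
transactions = [{"a", "b"}, {"a", "b"}, {"c", "d"}, {"c", "d"}, {"a", "c"}]
frequent = frequent_itemsets(transactions, min_support=2)

print(co_occurs([{"a", "b"}, {"c", "d"}], frequent))  # True
print(frozenset("abcd") in frequent)                  # False: no joint occurrence
print(co_occurs([{"a", "c"}], frequent))              # False: singleton set, classical case
```

A sensitive set with a single member reduces the check to "is this itemset frequent", which is how classical frequent itemset hiding appears as a special case.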
In this paper, we briefly review the frequent itemset and association rule hiding problems and define co-occurrence hiding along with its real-world motivations. We explore its fundamental properties and show that frequent itemset hiding is a special case of co-occurring frequent itemset hiding. As a solution, we propose a two-stage sanitization framework, essentially a reduction: the first stage constructs an instance of frequent itemset hiding, and the second stage solves that instance. Since the task is shown to be NP-hard and the reduction is one-to-many, we propose heuristics only for the first stage, as the second stage is a well-established field. Finally, we present an experimental evaluation on a couple of datasets.
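The two-stage idea can be sketched as follows, under loudly stated assumptions. The abstract does not specify the paper's heuristics, so the first-stage rule used here (pick the lowest-support member of each co-occurring sensitive set as the victim) is a hypothetical stand-in, and the second stage is a naive item-deletion routine standing in for an established frequent itemset hiding method. All function names and the toy data are illustrative.

```python
def support(itemset, transactions):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def stage1_select_victims(sensitive_sets, transactions, min_support):
    """Stage 1 (hypothetical heuristic): build a classical hiding instance.

    Hiding any one member breaks a group's co-occurrence, so many instances
    are possible; here the lowest-support member is chosen as the victim.
    """
    victims = set()
    for group in sensitive_sets:
        group = [frozenset(s) for s in group]
        sups = {s: support(s, transactions) for s in group}
        exposed = bool(group) and all(v >= min_support for v in sups.values())
        already_broken = any(s in victims for s in group)
        if exposed and not already_broken:
            victims.add(min(group, key=lambda s: (sups[s], sorted(s))))
    return victims


def stage2_hide(victims, transactions, min_support):
    """Stage 2 (naive stand-in): delete one item from supporting transactions
    until each victim's support falls below the threshold."""
    db = [set(t) for t in transactions]  # work on a copy
    for victim in victims:
        item = min(victim)  # arbitrary but deterministic choice of item to delete
        for t in db:
            if support(victim, db) < min_support:
                break
            if victim <= t:
                t.discard(item)
    return db


# Toy data (assumed): one sensitive set whose two members are both frequent.
transactions = [{"a", "b"}, {"a", "b"}, {"c", "d"}, {"c", "d"}, {"a", "c"}]
sensitive_sets = [[{"a", "b"}, {"c", "d"}]]

victims = stage1_select_victims(sensitive_sets, transactions, min_support=2)
sanitized = stage2_hide(victims, transactions, min_support=2)
```

Because only one member per sensitive set has to become infrequent, the other members can stay in the mining results, which is where the choice made in the first stage affects how much the data is distorted.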

    Published In

    EDBT/ICDT '09: Proceedings of the 2009 EDBT/ICDT Workshops
    March 2009
    218 pages
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Conference

    EDBT/ICDT '09: EDBT/ICDT '09 joint conference
    March 22, 2009
    Saint-Petersburg, Russia

    Acceptance Rates

    Overall Acceptance Rate 7 of 10 submissions, 70%
