Boolean factors as a means of clustering of interestingness measures of association rules

Annals of Mathematics and Artificial Intelligence

Abstract

Measures of interestingness play a crucial role in association rule mining. An important methodological problem, addressed by several papers in the literature, is to provide a reasonable classification of these measures. In this paper, we explore Boolean factor analysis, which uses formal concepts corresponding to classes of measures as factors, as a method for clustering the measures. Unlike existing studies, our method reveals overlapping clusters of interestingness measures. We argue that overlap between clusters is a desirable feature of natural groupings of measures and that, because formal concepts are used as factors in Boolean factor analysis, the resulting clusters have a clear meaning and are easy to interpret. We conduct three case studies on clustering of measures, interpret the resulting clusters, and compare the results with those of previous approaches reported in the literature.
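To make the idea of formal concepts acting as overlapping factors more concrete, the following is a minimal sketch, not the method or data used in the paper. It assumes a hypothetical toy Boolean matrix `I` (measures `m0`..`m3` by properties `p0`..`p3`), enumerates formal concepts by brute force, and selects factors with a simple greedy cover. All names here (`extent`, `intent`, `formal_concepts`, `greedy_factors`) are illustrative assumptions.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper): rows are measures m0..m3,
# columns are properties p0..p3; 1 means the measure has the property.
I = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]

def extent(I, cols):
    """Rows (measures) that have every column (property) in cols."""
    return frozenset(i for i, row in enumerate(I) if all(row[j] for j in cols))

def intent(I, rows):
    """Columns (properties) shared by every row (measure) in rows."""
    return frozenset(j for j in range(len(I[0])) if all(I[i][j] for i in rows))

def formal_concepts(I):
    """Brute-force enumeration of (extent, intent) pairs; tiny matrices only."""
    concepts = set()
    for k in range(len(I[0]) + 1):
        for cols in combinations(range(len(I[0])), k):
            ext = extent(I, cols)
            concepts.add((ext, intent(I, ext)))
    # Sort so that ties in the greedy step are broken deterministically.
    return sorted(concepts, key=lambda c: (sorted(c[0]), sorted(c[1])))

def greedy_factors(I):
    """Pick concepts greedily until every 1 in I lies in some chosen concept."""
    uncovered = {(i, j) for i, row in enumerate(I) for j, v in enumerate(row) if v}
    concepts = formal_concepts(I)
    factors = []
    while uncovered:
        best = max(
            concepts,
            key=lambda c: sum((i, j) in uncovered for i in c[0] for j in c[1]),
        )
        factors.append(best)
        uncovered -= {(i, j) for i in best[0] for j in best[1]}
    return factors

factors = greedy_factors(I)
for ext, itt in factors:
    # Each factor is a cluster of measures (extent) described by shared properties (intent).
    print(sorted(f"m{i}" for i in ext), sorted(f"p{j}" for j in itt))
```

In this sketch each factor's extent is read as a cluster of measures and its intent as the properties that describe the cluster. A measure can belong to several extents, which is how overlapping clusters arise.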

Author information

Corresponding author

Correspondence to Jan Outrata.

About this article

Cite this article

Belohlavek, R., Grissa, D., Guillaume, S. et al. Boolean factors as a means of clustering of interestingness measures of association rules. Ann Math Artif Intell 70, 151–184 (2014). https://doi.org/10.1007/s10472-013-9370-x

  • DOI: https://doi.org/10.1007/s10472-013-9370-x
