Soft constraint based pattern mining

Published: 01 July 2007

Abstract

The paradigm of pattern discovery based on constraints was introduced to give the user a tool for driving the discovery process towards potentially interesting patterns, with the positive side effect of more efficient computation. So far, research on this paradigm has mainly focused on the latter aspect: the development of efficient algorithms for evaluating constraint-based mining queries. Due to the lack of research on methodological issues, the constraint-based pattern mining framework still suffers from many problems that limit its practical relevance. In this paper, we analyze these limitations and show how they stem from the same source: in classical constraint-based mining, a constraint is a rigid boolean function that returns either true or false. Interestingness, however, is not a dichotomy. Following this consideration, we introduce the new paradigm of pattern discovery based on Soft Constraints, where constraints are no longer rigid boolean functions. Although based on a simple idea, our proposal has many merits: it provides a rigorous theoretical framework that is very general (having the classical paradigm as a particular instance) and that overcomes all the major methodological drawbacks of the classical constraint-based paradigm, representing an important further step towards practical pattern discovery.
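
To make the contrast between rigid and soft constraints concrete, the following is a minimal sketch and not the paper's own formulation. It assumes a fuzzy-style setting in which each constraint maps a pattern to an interest level in [0, 1] and a conjunction of constraints takes the minimum level; the abstract does not fix these choices. The item prices and the names `crisp`, `ramp`, `combine`, and `select` are hypothetical, introduced only for illustration. A classical boolean constraint appears as the special case whose levels are restricted to 0 and 1.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List

Pattern = FrozenSet[str]
# A soft constraint returns an interest level in [0, 1] instead of True/False.
SoftConstraint = Callable[[Pattern], float]

# Hypothetical item prices, used only for this illustration.
PRICES: Dict[str, float] = {"beer": 4.0, "chips": 2.0, "wine": 12.0}


def total_price(pattern: Pattern) -> float:
    """Sum of the (hypothetical) prices of the items in a pattern."""
    return sum(PRICES[item] for item in pattern)


def crisp(predicate: Callable[[Pattern], bool]) -> SoftConstraint:
    """Embed a classical boolean constraint: True -> 1.0, False -> 0.0."""
    return lambda pattern: 1.0 if predicate(pattern) else 0.0


def ramp(measure: Callable[[Pattern], float], low: float, high: float) -> SoftConstraint:
    """Soft 'measure >= threshold': 0 at or below low, 1 at or above high, linear between."""
    def level(pattern: Pattern) -> float:
        value = measure(pattern)
        if value <= low:
            return 0.0
        if value >= high:
            return 1.0
        return (value - low) / (high - low)
    return level


def combine(constraints: Iterable[SoftConstraint]) -> SoftConstraint:
    """Assumed fuzzy conjunction: a pattern is only as interesting as its weakest constraint."""
    constraints = list(constraints)
    return lambda pattern: min(c(pattern) for c in constraints)


def select(patterns: Iterable[Pattern], constraint: SoftConstraint, threshold: float) -> List[Pattern]:
    """Keep the patterns whose interest level reaches the threshold."""
    return [p for p in patterns if constraint(p) >= threshold]


patterns = [
    frozenset({"beer"}),
    frozenset({"beer", "chips"}),
    frozenset({"beer", "wine"}),
    frozenset({"wine"}),
]

# Rigid query: total price must be at least 10; every pattern is either in or out.
rigid = crisp(lambda p: total_price(p) >= 10.0)

# Soft query: a higher price is gradually more interesting, and so is having two or more items.
soft = combine([
    ramp(total_price, low=4.0, high=14.0),
    ramp(lambda p: float(len(p)), low=0.0, high=2.0),
])

rigid_result = select(patterns, rigid, threshold=1.0)
soft_levels = {p: soft(p) for p in patterns}
```

With the rigid query, a pattern priced just under the threshold is discarded exactly like one priced far below it. The soft query instead yields a graded level per pattern, so patterns can be ranked or filtered at a user-chosen threshold.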

Published In

Data & Knowledge Engineering  Volume 62, Issue 1
July, 2007
198 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. Constraint-based mining
  2. Frequent pattern mining
  3. Semiring-based constraints
  4. Soft constraints

Qualifiers

  • Article

Cited By

  • (2019) Constraint-based sequential pattern mining with decision diagrams. Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, pp. 1495-1502. DOI: 10.1609/aaai.v33i01.33011495. Online publication date: 27-Jan-2019
  • (2019) Mining skypatterns in fuzzy tensors. Data Mining and Knowledge Discovery 33:5, pp. 1298-1322. DOI: 10.1007/s10618-019-00640-4. Online publication date: 1-Sep-2019
  • (2017) Skypattern mining. Artificial Intelligence 244:C, pp. 48-69. DOI: 10.1016/j.artint.2015.04.003. Online publication date: 1-Mar-2017
  • (2015) Soft constraints for pattern mining. Journal of Intelligent Information Systems 44:2, pp. 193-221. DOI: 10.1007/s10844-013-0281-4. Online publication date: 1-Apr-2015
  • (2013) PMBC. Computers in Biology and Medicine 43:5, pp. 481-492. DOI: 10.1016/j.compbiomed.2013.02.006. Online publication date: 1-Jun-2013
  • (2011) The discovery of frequent patterns with logic and constraint programming. Proceedings of the 13th WSEAS international conference on mathematical methods, computational techniques and intelligent systems, and 10th WSEAS international conference on non-linear analysis, non-linear systems and chaos, and 7th WSEAS international conference on dynamical systems and control, and 11th WSEAS international conference on Wavelet analysis and multirate systems: recent researches in computational techniques, non-linear systems and control, pp. 198-203. DOI: 10.5555/2039846.2039881. Online publication date: 1-Jul-2011
  • (2011) Direct local pattern sampling by efficient two-step random procedures. Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 582-590. DOI: 10.1145/2020408.2020500. Online publication date: 21-Aug-2011
  • (2011) Preferences in AI. Artificial Intelligence 175:7-8, pp. 1037-1052. DOI: 10.1016/j.artint.2011.03.004. Online publication date: 1-May-2011
  • (2006) On interactive pattern mining from relational databases. Proceedings of the 5th international conference on Knowledge discovery in inductive databases, pp. 42-62. DOI: 10.5555/1777194.1777200. Online publication date: 18-Sep-2006
