AMIOT: Induced Ordered Tree Mining in Tree-Structured Databases

Authors:

Hiroyuki Kawano

ICDM '05: Proceedings of the Fifth IEEE International Conference on Data Mining

Pages 170 - 177

https://doi.org/10.1109/ICDM.2005.20

Published: 27 November 2005

Abstract

Frequent subtree mining has become increasingly important in recent years. In this paper, we present the AMIOT algorithm, which discovers all frequent ordered subtrees in a tree-structured database. To avoid generating infrequent candidate trees, we propose techniques such as right-and-left tree join and serial tree extension. The proposed methods enumerate, without duplication, only the candidate trees with a high probability of being frequent. Experiments on a synthetic dataset and an XML database show that AMIOT reduces redundant candidate trees and runs up to five times faster than the FREQT algorithm.
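
To make the problem setting concrete, the following is a minimal sketch of what it means for an ordered pattern tree to occur as an induced subtree and how its support could be counted. It is not the AMIOT algorithm and does not implement right-and-left tree join or serial tree extension. The tree encoding `(label, children)`, the function names, and the per-tree support count are all assumptions made for illustration; the paper may define frequency differently.

```python
from typing import List, Tuple

# Hypothetical encoding: a tree is (label, ordered list of child trees).
Tree = Tuple[str, list]


def matches_at(pattern: Tree, node: Tree) -> bool:
    """Return True if `pattern` maps onto `node` with the two roots aligned.

    Labels and parent-child edges must be preserved (induced), and the
    pattern's children must match the node's children in left-to-right
    order (ordered), though not necessarily adjacent siblings.
    """
    p_label, p_children = pattern
    n_label, n_children = node
    if p_label != n_label:
        return False
    matched = 0  # number of pattern children matched so far
    for child in n_children:
        # Greedy leftmost matching is sufficient for an ordered subsequence.
        if matched < len(p_children) and matches_at(p_children[matched], child):
            matched += 1
    return matched == len(p_children)


def occurs_in(pattern: Tree, tree: Tree) -> bool:
    """Return True if `pattern` occurs at any node of `tree`."""
    if matches_at(pattern, tree):
        return True
    return any(occurs_in(pattern, child) for child in tree[1])


def support(pattern: Tree, database: List[Tree]) -> int:
    """Count the database trees that contain `pattern` (assumed definition)."""
    return sum(1 for tree in database if occurs_in(pattern, tree))


# Toy database of three ordered labeled trees.
t1 = ("A", [("B", []), ("C", [])])
t2 = ("A", [("C", []), ("B", [])])  # same labels, different sibling order
t3 = ("R", [("A", [("B", []), ("D", []), ("C", [])])])
pattern = ("A", [("B", []), ("C", [])])

print(support(pattern, [t1, t2, t3]))
```

In this toy database the pattern occurs in `t1` and `t3` but not in `t2`, because sibling order matters for ordered subtrees. A naive miner would run such a check for every candidate pattern, which is why limiting the number of infrequent candidates, as the abstract describes, matters for execution time.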


Published In

ICDM '05: Proceedings of the Fifth IEEE International Conference on Data Mining

November 2005, 837 pages

ISBN: 0769522785

Publisher: IEEE Computer Society, United States
