BOSTER: An Efficient Algorithm for Mining Frequent Unordered Induced Subtrees

Chowdhury, Israt J.; Nayak, Richi

doi:10.1007/978-3-319-11749-2_12


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8786)

Included in the following conference series:

International Conference on Web Information Systems Engineering

Abstract

Extracting frequent subtrees from tree-structured data has important applications in Web mining. In this paper, we introduce a novel canonical form for rooted labelled unordered trees, called the balanced-optimal-search canonical form (BOCF), that handles the isomorphism problem efficiently. Using BOCF, we define an enumeration approach, guided by the tree structure, that systematically enumerates only the valid subtrees. Finally, we present the balanced optimal search tree miner (BOSTER) algorithm, based on BOCF and the proposed enumeration approach, for finding frequent induced subtrees in a database of labelled rooted unordered trees. Experiments on real datasets compare the efficiency of BOSTER against two state-of-the-art algorithms for mining induced unordered subtrees, HybridTreeMiner and UNI3. The results are encouraging.
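
To illustrate why a canonical form helps with the isomorphism problem for unordered trees, the following is a minimal sketch of a generic sorted-string encoding. It is not BOCF, whose definition is not given in this abstract; the tuple representation, the function names, and the assumption that labels contain no parentheses are all hypothetical choices made for the example.

```python
from typing import List, Tuple

# Hypothetical representation: a rooted labelled tree is (label, [children]).
Tree = Tuple[str, List["Tree"]]


def canonical_string(tree: Tree) -> str:
    """Encode a rooted labelled unordered tree as a string.

    Sibling encodings are sorted, so two trees that differ only in the
    order of their children receive the same string. Assumes labels do
    not contain '(' or ')'.
    """
    label, children = tree
    encoded_children = sorted(canonical_string(child) for child in children)
    return label + "(" + "".join(encoded_children) + ")"


def is_isomorphic(a: Tree, b: Tree) -> bool:
    """Two unordered trees are isomorphic iff their canonical strings match."""
    return canonical_string(a) == canonical_string(b)


# Same tree with the children of the root listed in a different order.
t1: Tree = ("a", [("b", []), ("c", [("d", [])])])
t2: Tree = ("a", [("c", [("d", [])]), ("b", [])])
# A structurally different tree: 'd' hangs under 'b' instead of 'c'.
t3: Tree = ("a", [("b", [("d", [])]), ("c", [])])

print(canonical_string(t1))   # a(b()c(d()))
print(is_isomorphic(t1, t2))  # True
print(is_isomorphic(t1, t3))  # False
```

With such an encoding, a miner can compare candidate subtrees by string equality instead of running a pairwise isomorphism test, which is the general role a canonical form plays in frequent subtree mining.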

References

Pei, J., Han, J., Mortazavi-asl, B., Zhu, H.: Mining Access Patterns Efficiently from Web Logs. In: Terano, T., Liu, H., Chen, A.L.P. (eds.) PAKDD 2000. LNCS, vol. 1805, pp. 396–407. Springer, Heidelberg (2000)
Zaki, M.J., Aggarwal, C.C.: XRules: An Effective Structural Classifier for XML Data. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 316–325. ACM, Washington, D.C. (2003)
Wang, Y., DeWitt, D.J., Cai, J.-Y.: X-Diff: An Effective Change Detection Algorithm for XML Documents. In: Proceedings of the 19th International Conference on Data Engineering, pp. 519–530. IEEE, Vienna (2003)
Luccio, F., Enriquez, A.M., Rieumont, P.O., Pagli, L.: Exact Rooted Subtree Matching in Sublinear Time. Università di Pisa Technical Report TR-01 (2001)
Asai, T., Arimura, H., Uno, T., Nakano, S.-I.: Discovering Frequent Substructures in Large Unordered Trees. Springer, Heidelberg (2003)
Nijssen, S., Kok, J.N.: Efficient Discovery of Frequent Unordered Trees. In: First International Workshop on Mining Graphs, Trees and Sequences. Springer, Heidelberg (2003)
Chi, Y., Yang, Y., Muntz, R.R.: Canonical Forms for Labelled Trees and Their Applications in Frequent Subtree Mining. Knowledge and Information Systems 8(2), 203–234 (2005)
Chi, Y., Yang, Y., Muntz, R.R.: HybridTreeMiner: An Efficient Algorithm for Mining Frequent Rooted Trees and Free Trees Using Canonical Forms. In: Proceedings of the 16th International Conference on Scientific and Statistical Database Management, pp. 11–20. IEEE, Santorini (2004)
Chehreghani, M.H.: Efficiently Mining Unordered Trees. In: Proceedings of the 11th IEEE International Conference on Data Mining, Vancouver, BC, pp. 111–120 (2011)
Hadzic, F., Tan, H., Dillon, T.S.: UNI3 - Efficient Algorithm for Mining Unordered Induced Subtrees Using TMG Candidate Generation. In: Proceedings of the 1st IEEE Symposium on Computational Intelligence and Data Mining, Honolulu, Hawaii, pp. 568–575 (2007)
Chowdhury, I.J., Nayak, R.: A Novel Method for Finding Similarities between Unordered Trees Using Matrix Data Model. In: Lin, X., Manolopoulos, Y., Srivastava, D., Huang, G. (eds.) WISE 2013, Part I. LNCS, vol. 8180, pp. 421–430. Springer, Heidelberg (2013)
Valiente, G.: Algorithms on Trees and Graphs. Springer, Heidelberg (2002)
Scholl, A.: Balancing and Sequencing of Assembly Lines. Physica-Verlag, Heidelberg (1999)
Zaki, M.J.: Efficiently Mining Frequent Trees in a Forest: Algorithms and Applications. IEEE Transactions on Knowledge and Data Engineering 17(8), 1021–1035 (2005)

Author information

Authors and Affiliations

School of Electrical Engineering and Computer Science, Science and Engineering Faculty, Queensland University of Technology, Brisbane, Australia
Israt J. Chowdhury & Richi Nayak

Editor information

Editors and Affiliations

University of New South Wales, Sydney, Australia
Boualem Benatallah
Boston University, Boston, MA, USA
Azer Bestavros
Aristotle University of Thessaloniki, Thessaloniki, Greece
Yannis Manolopoulos & Athena Vakali
Victoria University, Footscray, VIC, Australia
Yanchun Zhang

About this paper

Cite this paper

Chowdhury, I.J., Nayak, R. (2014). BOSTER: An Efficient Algorithm for Mining Frequent Unordered Induced Subtrees. In: Benatallah, B., Bestavros, A., Manolopoulos, Y., Vakali, A., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2014. WISE 2014. Lecture Notes in Computer Science, vol 8786. Springer, Cham. https://doi.org/10.1007/978-3-319-11749-2_12

DOI: https://doi.org/10.1007/978-3-319-11749-2_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11748-5
Online ISBN: 978-3-319-11749-2
eBook Packages: Computer Science, Computer Science (R0)
