DOI: 10.1109/ICDM.2005.55
Article

Efficient Mining of High Branching Factor Attribute Trees

Published: 27 November 2005
    Abstract

    In this paper, we present a new tree mining algorithm, DRYADEPARENT, based on the hooking principle first introduced in DRYADE [9]. In the experiments, we demonstrate that the branching factor and depth of the frequent patterns to be found are key factors in the complexity of tree mining algorithms. We show that DRYADEPARENT outperforms the current fastest algorithm, CMTreeMiner, by orders of magnitude on datasets where the frequent patterns have a high branching factor.
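
    The abstract names branching factor and depth as the two pattern properties that drive mining cost but does not define them here. As a rough illustration only, the sketch below assumes the common definitions (depth as the number of edges on the longest root-to-leaf path, branching factor as the largest number of children of any node). The `Node` class and both helper functions are hypothetical names introduced for this example; they are not part of DRYADEPARENT or CMTreeMiner.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A labeled tree node (hypothetical helper type for this illustration)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def depth(node: Node) -> int:
    """Number of edges on the longest root-to-leaf path (assumed definition)."""
    if not node.children:
        return 0
    return 1 + max(depth(child) for child in node.children)


def max_branching_factor(node: Node) -> int:
    """Largest number of children of any node in the tree (assumed definition)."""
    return max([len(node.children)] +
               [max_branching_factor(child) for child in node.children])


# A shallow pattern with a high branching factor.
wide = Node("a", [Node("b"), Node("c"), Node("d"), Node("e")])

# A deep pattern with a branching factor of one (a chain).
deep = Node("a", [Node("b", [Node("c", [Node("d", [Node("e")])])])])

print(depth(wide), max_branching_factor(wide))  # 1 4
print(depth(deep), max_branching_factor(deep))  # 4 1
```

    Both patterns have five nodes, yet they sit at opposite ends of the two measures, which is the kind of difference the abstract says matters for running time.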

    References

    [1] R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proceedings of the 20th VLDB Conference, Santiago, Chile, 1994.
    [2] H. Arimura and T. Uno. An output-polynomial time algorithm for mining frequent closed attribute trees. In 15th International Conference on Inductive Logic Programming (ILP'05), 2005.
    [3] T. Asai, K. Abe, S. Kawasoe, H. Arimura, H. Sakamoto, and S. Arikawa. Efficient substructure discovery from large semi-structured data. In Proc. of the Second SIAM International Conference on Data Mining (SDM2002), Arlington, VA, pages 158-174, April 2002.
    [4] T. Asai, H. Arimura, T. Uno, and S.-i. Nakano. Discovering frequent substructures in large unordered trees. In Proc. of the 6th International Conference on Discovery Science (DS'03), pages 47-61, 2003.
    [5] R. Chalmers and K. Almeroth. Modeling the branching characteristics and efficiency gains of global multicast trees. In Proceedings of the IEEE INFOCOM'2001, April 2001.
    [6] Y. Chi, Y. Yang, Y. Xia, and R. R. Muntz. CMTreeMiner: Mining both closed and maximal frequent subtrees. In The Eighth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'04), 2004.
    [7] S. Nijssen and J. N. Kok. Efficient discovery of frequent unordered trees. In First International Workshop on Mining Graphs, Trees and Sequences, 2003.
    [8] A. Termier. Extraction of frequent trees in an heterogeneous corpus of semi-structured data: application to XML documents mining. Technical Report 1388, LRI, May 2004. http://www.lri.fr/~termier/publis/phdTermierEN.ps.gz.
    [9] A. Termier, M. Rousset, and M. Sebag. DRYADE: a new approach for discovering closed frequent trees in heterogeneous tree databases. In International Conference on Data Mining ICDM'04, Brighton, England, pages 543-546, 2004.
    [10] A. Termier, M. Rousset, M. Sebag, K. Ohara, T. Washio, and H. Motoda. Computation-time efficient and robust attribute tree mining with DRYADEPARENT. In Third International Workshop on Mining Graphs, Trees and Sequences (MGTS), 2005.
    [11] T. Uno, M. Kiyomi, and H. Arimura. LCM v.2: Efficient mining algorithms for frequent/closed/maximal itemsets. In 2nd Workshop on Frequent Itemset Mining Implementations (FIMI'04), 2004.
    [12] M. J. Zaki. Efficiently mining frequent trees in a forest. In Proc. 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 2002.
    [13] M. J. Zaki. Efficiently mining frequent embedded unordered trees. Fundamenta Informaticae, special issue on Advances in Mining Graphs, Trees and Sequences, 65(1-2):33-52, March/April 2005.

    Published In

    ICDM '05: Proceedings of the Fifth IEEE International Conference on Data Mining
    November 2005
    837 pages
    ISBN: 0769522785

    Publisher

    IEEE Computer Society

    United States

    Cited By

    • (2019) An Experimental Comparison of Different Inclusion Relations in Frequent Tree Mining. Fundamenta Informaticae, 89:1 (1-22). DOI: 10.5555/2366284.2366286. Online publication date: 4-Jan-2019
    • (2019) An Experimental Comparison of Different Inclusion Relations in Frequent Tree Mining. Fundamenta Informaticae, 89:1 (1-22). DOI: 10.5555/1497096.1497098. Online publication date: 4-Jan-2019
    • (2018) Frequent tree pattern mining: A survey. Intelligent Data Analysis, 14:6 (603-622). DOI: 10.5555/1890496.1890498. Online publication date: 27-Dec-2018
    • (2009) Mining tree-structured data on multicore systems. Proceedings of the VLDB Endowment, 2:1 (694-705). DOI: 10.14778/1687627.1687706. Online publication date: 1-Aug-2009
    • (2007) PCITMiner. Proceedings of the sixth Australasian conference on Data mining and analytics - Volume 70 (151-160). DOI: 10.5555/1378245.1378265. Online publication date: 3-Dec-2007
    • (2006) TRIPS and TIDES. Proceedings of the 15th ACM international conference on Information and knowledge management (455-464). DOI: 10.1145/1183614.1183680. Online publication date: 6-Nov-2006
