DOI: 10.1145/2701126.2701195
research-article

An efficient method for extracting subtrees against forest query

Published: 08 January 2015

Abstract

In this paper, we present an algorithm to search for and rank the top-k approximately matching subtrees in a tree database, where the query is a collection of trees, i.e., a forest. Although existing algorithms can handle a single-tree query, we argue that a forest query would be significantly useful in some real-life applications, including those in the biological domain. To address this, we propose a method that, given a tree database and a forest query, finds the relevant subtrees and ranks them. Tree edit distance is used to find and rank the subtrees, together with a pruning technique that improves the performance of the algorithm. We have tested our algorithm on different data sets, and the efficiency of the searching and ranking process is promising. Experimental results suggest that our algorithm improves run time at this stage; in future work we would like to make it practical for large data sets.
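The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: ranking the top-k subtrees of a tree database by tree edit distance to a forest query, with pruning. Everything specific here is an assumption made for illustration and not the authors' method: trees are `(label, children)` tuples, edit costs are unit costs, a candidate's score is its minimum distance to any tree in the query forest, and pruning uses the size-difference lower bound (with unit costs, the edit distance between two trees is at least the difference in their node counts). The names `forest_dist`, `tree_dist`, `subtrees`, and `top_k` are hypothetical.

```python
import heapq
from functools import lru_cache

# Assumed representation: a tree is (label, children), children is a tuple of trees.

def size(tree):
    """Number of nodes in a tree."""
    return 1 + sum(size(child) for child in tree[1])

@lru_cache(maxsize=None)
def forest_dist(f, g):
    """Unit-cost edit distance between two ordered forests (tuples of trees)."""
    if not f:
        return sum(size(t) for t in g)   # insert every remaining node
    if not g:
        return sum(size(t) for t in f)   # delete every remaining node
    (f_label, f_kids), (g_label, g_kids) = f[-1], g[-1]
    return min(
        forest_dist(f[:-1] + f_kids, g) + 1,   # delete rightmost root of f
        forest_dist(f, g[:-1] + g_kids) + 1,   # insert rightmost root of g
        forest_dist(f_kids, g_kids)            # match the two rightmost roots
        + forest_dist(f[:-1], g[:-1])
        + (f_label != g_label),                # rename cost if labels differ
    )

def tree_dist(a, b):
    return forest_dist((a,), (b,))

def subtrees(tree):
    """Yield the tree and every subtree rooted at one of its descendants."""
    yield tree
    for child in tree[1]:
        yield from subtrees(child)

def top_k(database, query_forest, k):
    """Return up to k (distance, subtree) pairs, smallest distance first.

    Assumption: a subtree's score is its minimum distance to any query tree.
    """
    if k <= 0:
        return []
    heap = []      # max-heap via negated distance: (-dist, counter, subtree)
    counter = 0    # unique tie-breaker so subtrees are never compared
    for doc in database:
        for cand in subtrees(doc):
            n = size(cand)
            best = None
            for q in query_forest:
                lower = abs(n - size(q))   # size-difference lower bound
                if len(heap) == k and lower >= -heap[0][0]:
                    continue               # cannot beat the current k-th best
                if best is not None and lower >= best:
                    continue               # cannot improve this candidate
                d = tree_dist(cand, q)
                if best is None or d < best:
                    best = d
            if best is None:
                continue
            if len(heap) < k:
                heapq.heappush(heap, (-best, counter, cand))
            elif best < -heap[0][0]:
                heapq.heapreplace(heap, (-best, counter, cand))
            counter += 1
    return [(-neg, t) for neg, _, t in sorted(heap, key=lambda e: (-e[0], e[1]))]

# Usage: one document tree a(b, c(d)) and a forest query {c(d), b}.
db = [("a", (("b", ()), ("c", (("d", ()),))))]
query = [("c", (("d", ()),)), ("b", ())]
results = top_k(db, query, 2)
```

The recursive distance is exponential in the worst case without memoization and is meant only for small trees; a production system would more likely use an algorithm such as Zhang-Shasha or RTED.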



    Information

    Published In

    IMCOM '15: Proceedings of the 9th International Conference on Ubiquitous Information Management and Communication
    January 2015
    674 pages
    ISBN:9781450333771
    DOI:10.1145/2701126
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. XML retrieval
    2. data structure
    3. forest query

    Qualifiers

    • Research-article

    Conference

    IMCOM '15
    Acceptance Rates

    Overall Acceptance Rate 213 of 621 submissions, 34%

