An efficient method for extracting subtrees against forest query

Authors: Shafaet Ashraf, Sheikh Muhammad Sarwar, Md. Abeed Hassan, Saifuddin Md. Tareeq, Anna Fariha

IMCOM '15: Proceedings of the 9th International Conference on Ubiquitous Information Management and Communication

Article No.: 98, Pages 1 - 7

https://doi.org/10.1145/2701126.2701195

Published: 08 January 2015

Abstract

In this paper, we present an algorithm to search for and rank the top-k approximately matching subtrees in a tree database, where the query is a collection of trees, i.e., a forest. Although existing algorithms can handle a single-tree query, we argue that forest queries would be significantly useful in some real-life applications, including the biological domain. To address this, we propose a method that, given a tree database and a forest query, finds the relevant subtrees and ranks them. Tree edit distance is used to find and rank the set of subtrees, together with a pruning technique that improves the performance of the algorithm. We have tested our algorithm on different data sets, and the efficiency of the searching and ranking process shows promising results. Experimental results suggest that our algorithm improves run time at this stage; in future work we would like to make it more useful for large practical data sets.
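
To make the problem setting concrete, the following is a minimal sketch of top-k subtree ranking against a forest query. It is not the authors' algorithm, which the abstract does not specify. Everything in it is an assumption made for illustration: trees are ordered and written as (label, children) tuples, edit operations have unit cost, a candidate subtree is scored by its smallest tree edit distance to any tree in the query forest, and the pruning step is a simple size-difference lower bound standing in for the paper's pruning technique. The names size, subtrees, forest_dist, ted and top_k are hypothetical.

```python
from functools import lru_cache
import heapq

# Assumed representation: a tree is (label, children), children is a tuple of trees.

def size(tree):
    """Number of nodes in a tree."""
    return 1 + sum(size(c) for c in tree[1])

def subtrees(tree):
    """Yield every subtree (a node plus all its descendants) in preorder."""
    yield tree
    for child in tree[1]:
        yield from subtrees(child)

@lru_cache(maxsize=None)
def forest_dist(f, g):
    """Unit-cost edit distance between two ordered forests (tuples of trees).

    Naive memoized recursion on the rightmost roots; adequate for small
    trees only, and much slower than Zhang-Shasha or RTED.
    """
    if not f and not g:
        return 0
    if not f:
        return sum(size(t) for t in g)
    if not g:
        return sum(size(t) for t in f)
    (lv, cv), (lw, cw) = f[-1], g[-1]
    return min(
        forest_dist(f[:-1] + cv, g) + 1,   # delete the rightmost root of f
        forest_dist(f, g[:-1] + cw) + 1,   # insert the rightmost root of g
        # map the two roots onto each other (rename if labels differ)
        forest_dist(cv, cw) + forest_dist(f[:-1], g[:-1]) + (lv != lw),
    )

def ted(a, b):
    """Tree edit distance between two single trees."""
    return forest_dist((a,), (b,))

def top_k(database, query_forest, k):
    """Return the k best (distance, subtree) pairs, smallest distance first.

    Assumes k >= 1. A candidate's score is its minimum distance to any
    query tree (an assumption, not taken from the paper).
    """
    heap = []  # max-heap on distance, stored as (-distance, counter, subtree)
    n = 0
    for doc in database:
        for cand in subtrees(doc):
            n += 1
            cand_size = size(cand)
            kth = -heap[0][0] if len(heap) == k else float("inf")
            best = kth
            for q in query_forest:
                # With unit costs, |size difference| is a lower bound on the
                # distance, so this pair cannot improve on `best`: skip it.
                if abs(cand_size - size(q)) >= best:
                    continue
                best = min(best, ted(cand, q))
            if best < kth:
                entry = (-best, n, cand)
                if len(heap) < k:
                    heapq.heappush(heap, entry)
                else:
                    heapq.heapreplace(heap, entry)
    ranked = sorted(heap, key=lambda e: (-e[0], e[1]))
    return [(-d, t) for d, _, t in ranked]
```

For example, with database = [("a", (("b", ()), ("c", ()))), ("x", (("b", ()), ("d", ())))] and the two-tree query forest [("a", (("b", ()), ("c", ()))), ("d", ())], calling top_k(database, query_forest, 2) ranks the subtree rooted at "a" and the leaf "d" first, each at distance 0. Taking the minimum over query trees is only one possible way to combine per-tree distances; a sum or a separate ranking per query tree would be equally plausible readings of a forest query.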

Index Terms

Information systems > Information retrieval > Information retrieval query processing

Published In

IMCOM '15: Proceedings of the 9th International Conference on Ubiquitous Information Management and Communication

January 2015

674 pages

ISBN: 9781450333771

DOI: 10.1145/2701126

Conference Chairs: Dongsoo S. Kim (Indiana University), Sang-wook Kim (Hanyang University, Korea)

General Chairs: Suk-Han Lee (Sungkyunkwan University, Korea), Lajos Hanzo (University of Southampton, UK), Roslan Ismail (Universiti Kuala Lumpur, Malaysia)

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

IMCOM '15

Sponsor:

SIGAPP

IMCOM '15: The 9th International Conference on Ubiquitous Information Management and Communication

January 8 - 10, 2015

Bali, Indonesia

Acceptance Rates

Overall Acceptance Rate 213 of 621 submissions, 34%
