DOI:10.1145/1838002.1838052
research-article

An ontology based approach to automating data integration in scientific workflows

Published: 16 December 2009

Abstract

The proliferation of data-generating devices such as sensors in scientific applications has made data integration a highly challenging task, since the data stemming from these devices are extremely heterogeneous in terms of structure (schema) and semantics (interpretation). In practice, integration and transformation are typically performed manually by the scientists, which requires extensive effort. Approaches that automate the data integration task as far as possible are therefore badly needed. DaltOn is a generic framework that offers various functionalities for managing data in scientific applications. In this paper, we present DaltOn's functionality for automating the data integration task by exploiting ontologies. We also describe the specific module of our framework that implements this functionality. Finally, we present the core algorithms, which demonstrate that our approach evaluates well.

Published In

FIT '09: Proceedings of the 7th International Conference on Frontiers of Information Technology
December 2009
446 pages
ISBN:9781605586427
DOI:10.1145/1838002
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • COMSATS Institute of Information Technology

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. ontologies
  2. scientific data management
  3. scientific workflows
  4. semantic data integration
  5. software architecture

Qualifiers

  • Research-article
