Abstract
This contribution presents a system that provides access to distributed data sources using Semantic Web technology. Although it was designed primarily for data sharing and scientific collaboration, it is regarded as a base technology useful for many other Semantic Web applications. The system retrieves data using SPARQL queries; data sources can register and withdraw freely, and any RDF Schema or OWL vocabulary can be used to describe their data, as long as it is accessible on the Web. Data heterogeneity is addressed by RDF wrappers such as D2R-Server placed on top of local information systems. A query does not refer directly to actual endpoints; instead, it contains graph patterns adhering to a virtual data set. A mediator then pulls and joins RDF data from the different endpoints, providing a transparent, on-the-fly view to the end user.
The SPARQL protocol was defined to enable systematic access to data on remote endpoints. However, remote SPARQL queries require endpoint URIs to be stated explicitly. The presented system allows users to execute queries without specifying target endpoints. It also supports join and union operations across different remote endpoints. Optimizing such distributed operations is a key factor in the performance of the overall system, and proven concepts from database research can be applied for this purpose.
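To make the join and union operations across endpoints more concrete, the following is a minimal sketch of how a mediator might combine solution sequences that have already been fetched from two endpoints. It is not the authors' implementation. The function names, the sample data, and the simplification that every row of a sequence binds the same variables (no OPTIONAL patterns) are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# One SPARQL solution: variable name -> RDF term (kept as plain strings here).
Binding = Dict[str, str]


def join(left: List[Binding], right: List[Binding]) -> List[Binding]:
    """Hash join of two solution sequences on their shared variables.

    Assumption: every row within a sequence binds the same variables.
    """
    if not left or not right:
        return []
    shared = sorted(set(left[0]) & set(right[0]))

    # Build phase: index the right-hand rows by their shared-variable values.
    index: Dict[Tuple[str, ...], List[Binding]] = {}
    for row in right:
        key = tuple(row[var] for var in shared)
        index.setdefault(key, []).append(row)

    # Probe phase: merge each left-hand row with every compatible right-hand row.
    joined: List[Binding] = []
    for row in left:
        key = tuple(row[var] for var in shared)
        for match in index.get(key, []):
            joined.append({**row, **match})
    return joined


def union(left: List[Binding], right: List[Binding]) -> List[Binding]:
    """Union simply concatenates the two solution sequences."""
    return left + right


# Hypothetical results, as if returned by two different remote endpoints.
endpoint_a = [
    {"station": "ex:s1", "temp": "21.5"},
    {"station": "ex:s2", "temp": "19.0"},
]
endpoint_b = [
    {"station": "ex:s1", "city": "Linz"},
    {"station": "ex:s3", "city": "Wien"},
]

if __name__ == "__main__":
    print(join(endpoint_a, endpoint_b))
    print(union(endpoint_a, endpoint_b))
```

In this sketch both inputs are fully materialized before joining. A real mediator would also have to choose the join order and decide how much data to transfer from each endpoint, which is where the optimization techniques from database research mentioned above would come in.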
Keywords
- Resource Description Framework
- SPARQL Query
- Query Plan
- Resource Description Framework Data
- SPARQL Endpoint
These keywords were added by machine and not by the authors.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Langegger, A., Wöß, W., Blöchl, M. (2008). A Semantic Web Middleware for Virtual Data Integration on the Web. In: Bechhofer, S., Hauswirth, M., Hoffmann, J., Koubarakis, M. (eds) The Semantic Web: Research and Applications. ESWC 2008. Lecture Notes in Computer Science, vol 5021. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68234-9_37
DOI: https://doi.org/10.1007/978-3-540-68234-9_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68233-2
Online ISBN: 978-3-540-68234-9
eBook Packages: Computer Science, Computer Science (R0)