Abstract
This paper describes a simple data model for the composition and metadata management of documents in a distributed setting. We assume that each document resides at the local repository of its provider, so all providers’ repositories, collectively, can be thought of as a single database of documents spread over the network. Providers willing to share their documents with other providers in the network must register them with a coordinator, or mediator, and providers that search for documents matching their needs must address their queries to the mediator. Registering (or un-registering) a document, formulating a query to the mediator, and answering a query at the mediator all rely on document content annotation.
Content annotation depends on the nature of the document: if the document is atomic then an annotation provided explicitly by the author is sufficient, whereas if the document is composite then the author annotation should be augmented by an implied annotation, i.e., an annotation inferred from the annotations of the document’s components.
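The distinction between atomic and composite documents can be illustrated with a minimal sketch. The abstract does not state how the implied annotation is computed, so the combination rule used here (a plain set union of the components' annotations) is an assumption made only for illustration; the paper's actual algorithm may differ. The names `Document`, `implied_annotation` and `annotation` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document with an author-supplied annotation and optional components."""
    name: str
    author_annotation: frozenset = frozenset()
    components: list = field(default_factory=list)

    def is_atomic(self) -> bool:
        # A document with no components is treated as atomic.
        return not self.components


def implied_annotation(doc: Document) -> frozenset:
    """Annotation inferred from the components.

    ASSUMPTION: components are combined by set union; this rule is not
    taken from the paper.
    """
    terms = set()
    for part in doc.components:
        terms |= annotation(part)
    return frozenset(terms)


def annotation(doc: Document) -> frozenset:
    """Author annotation, augmented by the implied one for composite documents."""
    if doc.is_atomic():
        return doc.author_annotation
    return doc.author_annotation | implied_annotation(doc)


# Hypothetical example: a composite document built from two atomic ones.
intro = Document("intro", frozenset({"databases"}))
chapter = Document("chapter", frozenset({"metadata"}))
book = Document("book", frozenset({"tutorial"}), [intro, chapter])
```

In this sketch the annotation of `book` combines the author's own terms with those inherited from `intro` and `chapter`, while the annotation of an atomic document is just what its author supplied.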
The main contributions of this paper are:
1. Providing appropriate definitions of document annotations;
2. Providing an algorithm for the automatic computation of implied annotations;
3. Defining the main services that the mediator should support.
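The mediator services named in the abstract (registering a document, un-registering it, and answering a query) can be sketched as follows. This is a hedged illustration, not the paper's design: the class `Mediator`, its method names, the representation of an annotation as a set of terms, and the matching rule (a document matches when its annotation contains every query term) are all assumptions.

```python
class Mediator:
    """Hypothetical in-memory catalog of registered document annotations."""

    def __init__(self):
        # Maps a document identifier to (provider, annotation terms).
        self._catalog = {}

    def register(self, doc_id, provider, terms):
        """Record a document's annotation so that other providers can find it."""
        self._catalog[doc_id] = (provider, frozenset(terms))

    def unregister(self, doc_id):
        """Remove a document from the catalog; unknown identifiers are ignored."""
        self._catalog.pop(doc_id, None)

    def query(self, terms):
        """Return identifiers of documents whose annotation covers all query terms.

        ASSUMPTION: matching is plain set containment.
        """
        wanted = frozenset(terms)
        return sorted(
            doc_id
            for doc_id, (_provider, ann) in self._catalog.items()
            if wanted <= ann
        )


# Hypothetical usage with made-up providers and terms.
mediator = Mediator()
mediator.register("doc-1", "provider-a", {"databases", "metadata"})
mediator.register("doc-2", "provider-b", {"metadata"})
```

Only annotations are held centrally in this sketch; the documents themselves would stay in their providers' repositories, as the abstract describes.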
Research supported by the EU DELOS Network of Excellence in Digital Libraries and the EU IST Project (Self eLearning Networks), IST-2001-39045.
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rigaux, P., Spyratos, N. (2004). Metadata Inference for Document Retrieval in a Distributed Repository. In: Maher, M.J. (eds) Advances in Computer Science - ASIAN 2004. Higher-Level Decision Making. ASIAN 2004. Lecture Notes in Computer Science, vol 3321. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30502-6_31
DOI: https://doi.org/10.1007/978-3-540-30502-6_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24087-7
Online ISBN: 978-3-540-30502-6
eBook Packages: Computer Science, Computer Science (R0)