Abstract
A semantic (web) crawler refers to a class of web crawlers designed to harvest semantic web content. This paper presents the framework of a semantic crawler that abstracts metadata from online webpages and clusters the metadata by associating them with ontological concepts. The clustering is based on a case-based reasoning (CBR) algorithm adopted from the field of problem solving. We describe the formats of the ontological concepts and the metadata, and the extended CBR algorithm. We also report the system implementation and its evaluation, followed by our conclusions and future work.
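To make the idea of clustering metadata by associating them with ontological concepts more concrete, the following is a minimal sketch and not the paper's extended CBR algorithm, whose details the abstract does not give. It assumes that each metadata record and each concept carries a free-text description, that similarity is a plain Jaccard overlap of words, and that a fixed threshold decides association. The names `Concept`, `similarity`, `associate`, the threshold value, and the example data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical ontological concept with a textual description."""
    name: str
    description: str
    members: list = field(default_factory=list)  # URLs of associated metadata


def tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def similarity(a, b):
    """Jaccard word overlap; an assumed stand-in for the paper's measure."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def associate(metadata, concepts, threshold=0.2):
    """Attach a metadata record to every concept it is similar enough to."""
    matched = []
    for concept in concepts:
        if similarity(metadata["description"], concept.description) >= threshold:
            concept.members.append(metadata["url"])
            matched.append(concept.name)
    return matched


# Illustrative data only; not taken from the paper.
concepts = [
    Concept("AirTransport", "air freight cargo transport service"),
    Concept("Warehousing", "warehouse storage service"),
]
meta = {"url": "http://example.org/a",
        "description": "international air cargo service"}
matched = associate(meta, concepts)
print(matched)
```

In this sketch a record may be associated with several concepts or with none, so the clusters can overlap. Whether the paper's algorithm behaves the same way is not stated in the abstract.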
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dong, H., Hussain, F.K., Chang, E. (2008). A Semantic Crawler Based on an Extended CBR Algorithm. In: Meersman, R., Tari, Z., Herrero, P. (eds) On the Move to Meaningful Internet Systems: OTM 2008 Workshops. OTM 2008. Lecture Notes in Computer Science, vol 5333. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88875-8_135
DOI: https://doi.org/10.1007/978-3-540-88875-8_135
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88874-1
Online ISBN: 978-3-540-88875-8
eBook Packages: Computer Science, Computer Science (R0)