Abstract
Linking historical datasets and making them available on the Web has increasingly become a subject of research in the field of digital humanities. In this paper, we focus on discovering links between ships from a dataset of Dutch maritime events and a historical archive of newspaper articles. We apply a heuristic-based method for finding and filtering links between ship instances; subsequently, we use machine learning to classify articles and combine the classification with domain features for enhanced filtering. We evaluate the resulting links using manually annotated samples as a gold standard. The resulting links are made available as Linked Open Data, thus enriching the original data.
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Bravo Balado, A., de Boer, V., Schreiber, G. (2015). Linking Historical Ship Records to a Newspaper Archive. In: Aiello, L., McFarland, D. (eds) Social Informatics. SocInfo 2014. Lecture Notes in Computer Science, vol 8852. Springer, Cham. https://doi.org/10.1007/978-3-319-15168-7_32
DOI: https://doi.org/10.1007/978-3-319-15168-7_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-15167-0
Online ISBN: 978-3-319-15168-7
eBook Packages: Computer Science, Computer Science (R0)