Visualizing and Analyzing Networks of Named Entities in Biographical Dictionaries for Digital Humanities Research

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2019)

Abstract

This paper shows how named entity extraction and network analysis can be used to examine biographies individually and in groups to aid historians in biographical and prosopographical research. For this purpose, a reference network of 13 100 biographies in the collections of the Biographical Centre of the Finnish Literature Society was created, based on links between the biographies as well as automatically extracted named entities found in the texts. The data was published in a SPARQL endpoint as a Linked Data knowledge graph, on top of which network analytic tools were created and analyses were carried out, showing the usefulness of the approach in Digital Humanities. The reference graph has been used to examine egocentric networks of individual persons as well as networks among groups of people in prosopography. The data and tools presented have been in use since autumn 2018 in the semantic portal BiographySampo, which has had tens of thousands of users.
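To make the idea of a reference network and an egocentric network concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the undirected treatment of links, the `radius` parameter, and the toy identifiers (`person_a` and so on) are all assumptions made for illustration. Each input pair stands for one reference from a biography to another person or entity, whether it came from a link or from an extracted named entity.

```python
from collections import defaultdict, deque


def build_reference_graph(links):
    """Build an undirected adjacency map from (source, target) reference pairs."""
    graph = defaultdict(set)
    for source, target in links:
        if source == target:
            continue  # ignore self-references
        graph[source].add(target)
        graph[target].add(source)
    return dict(graph)


def ego_network(graph, ego, radius=1):
    """Return the subgraph induced by all nodes within `radius` hops of `ego`."""
    if ego not in graph:
        return {}
    distance = {ego: 0}
    queue = deque([ego])
    while queue:  # breadth-first search outwards from the ego node
        node = queue.popleft()
        if distance[node] == radius:
            continue  # do not expand beyond the requested radius
        for neighbour in graph[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    members = set(distance)
    # Keep only edges whose both endpoints lie inside the ego network.
    return {node: graph[node] & members for node in members}


# Toy data with invented identifiers (hypothetical, not from the dataset).
toy_links = [
    ("person_a", "person_b"),
    ("person_a", "person_c"),
    ("person_b", "person_c"),
    ("person_c", "person_d"),
    ("person_d", "person_e"),
]
toy_graph = build_reference_graph(toy_links)
toy_ego = ego_network(toy_graph, "person_a", radius=1)
```

With `radius=1` the result contains the person, the directly referenced neighbours, and the references among those neighbours; a larger radius widens the view towards a group-level network.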

Notes

  1. Online at www.biografiasampo.fi; see project homepage https://seco.cs.aalto.fi/projects/biografiasampo/en/ for further info and publications.

  2. Prosopography is a method that is used to study groups of people through their biographical data. The goal of prosopography is to find connections, trends, and patterns from these groups.

  3. Actually, the biographies in our case study come from several separate databases, including the general National Biography of Finland as a core, supplemented with four other thematic dictionaries [16].

  4. https://kansallisbiografia.fi/english/national-biography.

  5. http://hipla.fi.

  6. http://developers.google.com/maps/.

  7. https://github.com/Traubert/FiNer-rules/blob/master/finer-readme.md.

  8. http://persistence.uni-leipzig.org/nlp2rdf/specification/core.html.

  9. http://dublincore.org/documents/dcmi-terms/.

  10. Denoted with prefix nbf.

  11. https://finto.fi/yso-paikat/en/.

  12. The view currently lists only sentences that contain manually added HTML links.

  13. https://korp.csc.fi/.

  14. https://linkedjazz.org/.

  15. https://www.dbpedia-spotlight.org/demo/.

  16. https://cloud.gate.ac.uk/.

References

  1. Aylett, R.S., Bental, D.S., Stewart, R., Forth, J., Wiggins, G.: Supporting serendipitous discovery. In: Digital Futures (Third Annual Digital Economy Conference), Aberdeen, UK, 23–25 October 2012 (2012)

  2. Borin, L., Forsberg, M., Roxendal, J.: Korp – the corpus infrastructure of Språkbanken. In: Proceedings of LREC 2012, Istanbul: ELRA, pp. 474–478 (2012)

  3. Brouwer, J., Nijboer, H.: Golden agents. A web of linked biographical data for the Dutch Golden Age. In: BD2017 Biographical Data in a Digital World 2017, Proceedings, vol. 2119, pp. 33–38. CEUR Workshop Proceedings (2018). https://ceur-ws.org/Vol-2119/paper6.pdf

  4. Bunescu, R.C., Pasca, M.: Using encyclopedic knowledge for named entity disambiguation. In: EACL 2006, 11th Conference of the European Chapter of the Association for Computational Linguistics, vol. 6, pp. 9–16 (2006)

  5. Elson, D.K., Dames, N., McKeown, K.R.: Extracting social networks from literary fiction. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 138–147. Association for Computational Linguistics (2010)

  6. Ferragina, P., Scaiella, U.: TAGME: on-the-fly annotation of short text fragments (by Wikipedia entities). In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 1625–1628. ACM (2010)

  7. Hachey, B., Radford, W., Nothman, J., Honnibal, M., Curran, J.R.: Evaluating entity linking with Wikipedia. Artif. Intell. 194, 130–150 (2013)

  8. Hakosalo, H., Jalagin, S., Junila, M., Kurvinen, H.: Historiallinen elämä - Biografia ja historiantutkimus. Suomalaisen Kirjallisuuden Seura (SKS), Helsinki (2014)

  9. Heath, T., Bizer, C.: Linked Data: Evolving the Web into a Global Data Space. Synthesis Lectures on the Semantic Web: Theory and Technology. Morgan & Claypool (2011)

  10. Heino, E., et al.: Named entity linking in a complex domain: case second world war history. In: Gracia, J., Bond, F., McCrae, J.P., Buitelaar, P., Chiarcos, C., Hellmann, S. (eds.) LDK 2017. LNCS (LNAI), vol. 10318, pp. 120–133. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59888-8_10

  11. Hellmann, S., Lehmann, J., Auer, S., Brümmer, M.: Integrating NLP using linked data. In: Alani, H., et al. (eds.) ISWC 2013. LNCS, vol. 8219, pp. 98–113. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-41338-4_7

  12. Hyvönen, E.: Publishing and Using Cultural Heritage Linked Data on the Semantic Web. Morgan & Claypool, Palo Alto (2012)

  13. Hyvönen, E., et al.: WarSampo data service and semantic portal for publishing linked open data about the second world war history. In: Sack, H., Blomqvist, E., d’Aquin, M., Ghidini, C., Ponzetto, S.P., Lange, C. (eds.) ESWC 2016. LNCS, vol. 9678, pp. 758–773. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-34129-3_46

  14. Hyvönen, E., Ikkala, E., Tuominen, J.: Linked data brokering service for historical places and maps. In: Proceedings of the 1st Workshop on Humanities in the Semantic Web (WHiSe), vol. 1608, pp. 39–52. CEUR Workshop Proceedings (2016). https://ceur-ws.org/Vol-1608/paper-06.pdf

  15. Hyvönen, E., et al.: BiographySampo – publishing and enriching biographies on the semantic web for digital humanities research. In: Hitzler, P., et al. (eds.) ESWC 2019. LNCS, vol. 11503, pp. 574–589. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-21348-0_37

  16. Hyvönen, E., Leskinen, P., Tamper, M., Tuominen, J., Keravuori, K.: Semantic national biography of Finland. In: Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference (DHN 2018), vol. 2084, pp. 372–385. CEUR Workshop Proceedings (2018). https://ceur-ws.org/Vol-2084/short12.pdf

  17. Ikkala, E., Tuominen, J., Hyvönen, E.: Contextualizing historical places in a gazetteer by using historical maps and linked data. In: Proceedings of Digital Humanities 2016 (DH 2016), Krakow, Poland, pp. 573–577 (2016). https://dh2016.adho.org/abstracts/39

  18. Kettunen, K., Mäkelä, E., Ruokolainen, T., Kuokkala, J., Löfberg, L.: Old content and modern tools – searching named entities in a Finnish OCRed historical newspaper collection 1771–1910. arXiv preprint arXiv:1611.02839 (2016)

  19. Langmead, A., Otis, J., Warren, C., Weingart, S., Zilinski, L.: Towards interoperable network ontologies for the digital humanities. Int. J. Hum. Arts Comput. 10, 22–35 (2016)

  20. Leskinen, P., Hyvönen, E.: Extracting genealogical networks of linked data from biographical texts. In: Hitzler, P., et al. (eds.) ESWC 2019. LNCS, vol. 11762, pp. 121–125. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32327-1_24

  21. Leskinen, P., Hyvönen, E., Tuominen, J.: Analyzing and visualizing prosopographical linked data based on biographies. In: BD2017 Proceedings of the Second Conference on Biographical Data in a Digital World 2017, vol. 2119, pp. 39–44. CEUR Workshop Proceedings (2018). https://ceur-ws.org/Vol-2119/paper7.pdf

  22. Lindquist, T., Long, H.: How can educational technology facilitate student engagement with online primary sources? A user needs assessment. Libr. Hi Tech 29(2), 224–241 (2011)

  23. Mäkelä, E.: Combining a REST lexical analysis web service with SPARQL for mashup semantic annotation from text. In: Presutti, V., Blomqvist, E., Troncy, R., Sack, H., Papadakis, I., Tordai, A. (eds.) ESWC 2014. LNCS, vol. 8798, pp. 424–428. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11955-7_60

  24. Mäkelä, E., Lindquist, T., Hyvönen, E.: CORE - a contextual reader based on linked data. In: Proceedings of Digital Humanities 2016, Krakow, Poland, pp. 267–269 (2016). https://dh2016.adho.org/abstracts/4

  25. Maynard, D., Roberts, I., Greenwood, M.A., Rout, D., Bontcheva, K.: A framework for real-time semantic social media analysis. J. Web Semant. 44, 75–88 (2017)

  26. Mendes, P.N., Jakob, M., García-Silva, A., Bizer, C.: DBpedia spotlight: shedding light on the web of documents. In: Proceedings of the 7th International Conference on Semantic Systems, pp. 1–8. ACM (2011)

  27. Newman, M.: Networks. Oxford University Press, Oxford (2018)

  28. Nguyen, D.B., Hoffart, J., Theobald, M., Weikum, G.: AIDA-light: high-throughput named-entity disambiguation. In: Proceedings of LDOW, Linked Data on the Web, vol. 1184. CEUR Workshop Proceedings (2014). https://ceur-ws.org/Vol-1184/ldow2014_paper_03.pdf

  29. Oksanen, A., Tuominen, J., Mäkelä, E., Tamper, M., Hietanen, A., Hyvönen, E.: Semantic Finlex: transforming, publishing, and using Finnish legislation and case law as linked open data on the web. In: Knowledge of the Law in the Big Data Age. Frontiers in Artificial Intelligence and Applications, vol. 317, pp. 212–228. IOS Press (2019)

  30. Otte, E., Rousseau, R.: Social network analysis: a powerful strategy, also for the information sciences. J. Inf. Sci. 28(6), 441–453 (2002)

  31. Pattuelli, M.C., Miller, M., Lange, L., Thorsen, H.K.: Linked Jazz 52nd street: a LOD crowdsourcing tool to reveal connections among Jazz artists. In: Proceedings of Digital Humanities 2013, pp. 337–339 (2013)

  32. Piccinno, F., Ferragina, P.: From TagME to WAT: a new entity annotator. In: Proceedings of the First International Workshop on Entity Recognition & Disambiguation, pp. 55–62. ACM (2014)

  33. Roberts, B.: Biographical Research. Understanding Social Research. Open University Press (2002)

  34. Small, H.: Co-citation in the scientific literature: a new measure of the relationship between two documents. J. Am. Soc. Inf. Sci. 24(4), 265–269 (1973)

  35. Tamper, M., Leskinen, P., Apajalahti, K., Hyvönen, E.: Using biographical texts as linked data for prosopographical research and applications. In: Ioannides, M., et al. (eds.) EuroMed 2018. LNCS, vol. 11196, pp. 125–137. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01762-0_11

  36. Tuominen, J., Hyvönen, E., Leskinen, P.: Bio CRM: a data model for representing biographical data for prosopographical research. In: Biographical Data in a Digital World 2017, Proceedings, vol. 2119. CEUR Workshop Proceedings (2018). https://ceur-ws.org/Vol-2119/paper7.pdf

  37. Verboven, K., Carlier, M., Dumolyn, J.: A short manual to the art of prosopography. In: Prosopography Approaches and Applications. A Handbook, pp. 35–70. Unit for Prosopographical Research (Linacre College) (2007)

  38. Warren, C.N., Shore, D., Otis, J., Wang, L., Finegold, M., Shalizi, C.: Six degrees of Francis Bacon: a statistical method for reconstructing large historical social networks. DHQ: Digit. Hum. Q. 10(3) (2016)

Acknowledgments

Our research was part of the Severi project (http://seco.cs.aalto.fi/projects/severi), funded mainly by Business Finland. Thanks to Mikko Kivelä for inspirational discussions and CSC - IT Center for Science for computational resources.

Author information

Corresponding author

Correspondence to Minna Tamper.

Copyright information

© 2023 Springer Nature Switzerland AG

About this paper

Cite this paper

Tamper, M., Leskinen, P., Hyvönen, E. (2023). Visualizing and Analyzing Networks of Named Entities in Biographical Dictionaries for Digital Humanities Research. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13451. Springer, Cham. https://doi.org/10.1007/978-3-031-24337-0_15

  • DOI: https://doi.org/10.1007/978-3-031-24337-0_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-24336-3

  • Online ISBN: 978-3-031-24337-0

  • eBook Packages: Computer Science, Computer Science (R0)
