Semantic Perspectives on the Lake District Writing: Spatial Ontology Modeling and Relation Extraction for Deeper Insights

Haris, Erum; Cohn, Anthony G.; Stell, John G.

doi:10.4230/LIPIcs.COSIT.2024.11

Abstract

Extracting spatial details from historical texts can be difficult, hindering our understanding of past landscapes. This study addresses the challenge by analyzing the Corpus of the Lake District Writing, focusing on the English Lake District region. We systematically link theoretical notions from the core concepts of spatial information to provide a basis for the problem domain. This conceptual foundation is complemented with a spatial ontology and a custom gazetteer, allowing a formal and insightful semantic exploration of the massive unstructured corpus. The other, contrasting side of the framework is the use of large language models (LLMs) for spatial relation extraction. We formulate prompts that leverage the LLMs' understanding of the intended task, curate a list of spatial relations representing the most frequently recurring proximity or vicinity relation terms, and extract semantic triples for the top five place names appearing in the corpus. We compare the extraction capabilities of three benchmark LLMs on a historical archive of scholarly significance, illustrating their potential in a challenging, interdisciplinary research problem. Finally, the network comprising the semantic triples is enhanced by incorporating a gazetteer-based classification of the objects involved, thus improving their spatial profiling.

