research-article

Semi-Automatic Building and Learning of a Multilingual Ontology

Published: 18 November 2023

Abstract

Most online platforms, applications, and websites use massive amounts of heterogeneous, evolving data. These data must be structured and normalized before integration to improve search and increase the relevance of results. An ontology can address this critical task by managing data efficiently and providing structured formats through languages such as the Web Ontology Language (OWL). However, building an ontology can be costly, particularly when done manually. In this context, we propose a new methodology for automatically building and learning a multilingual ontology, using Arabic as the base language and a corpus collected from Wikipedia. Our methodology relies on finite-state transducers (FSTs), which are grouped into a cascade to reduce errors and minimize ambiguity. The resulting ontology is extended to English and French, and to language-independent images, via a translator we developed using APIs. The rationale for starting with an Arabic corpus to extract terms is that entity linking is more convenient from Arabic to other languages: many Wikipedia articles in English and French (for instance) have no associated Arabic article, whereas Arabic articles generally do have counterparts in those languages. In addition, working with Arabic terms lets us enrich the dictionaries and graphs of the Arabic module of the free linguistic platform we use. To assess the efficiency of the proposed methodology, we evaluated it using performance metrics. The reported results are encouraging.
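The cascade idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: each transducer is approximated by a regular-expression rewrite rule, the text is an English toy sentence rather than Arabic, and the names `make_rule`, `run_cascade`, `tag_city`, and `tag_university` are hypothetical. It only shows why ordering matters: a later stage can rely on tags emitted by an earlier one.

```python
import re
from typing import Callable, List

# Assumption: a transducer is modeled as a plain text-to-text function.
Transducer = Callable[[str], str]


def make_rule(pattern: str, template: str) -> Transducer:
    """Build a toy transducer from a regex and an output template."""
    compiled = re.compile(pattern)
    return lambda text: compiled.sub(template, text)


def run_cascade(text: str, cascade: List[Transducer]) -> str:
    """Apply each transducer in order; each stage reads the previous output."""
    for transducer in cascade:
        text = transducer(text)
    return text


# Stage 1 (hypothetical rule): tag known city names.
tag_city = make_rule(r"\b(Tunis|Cairo)\b", r"<city>\1</city>")

# Stage 2 (hypothetical rule): reuse the city tag from stage 1
# to recognize a larger organization entity.
tag_university = make_rule(
    r"University of (<city>[^<]+</city>)",
    r"<org>University of \1</org>",
)

sentence = "She studied at the University of Tunis."
result = run_cascade(sentence, [tag_city, tag_university])
print(result)
```

With the stages in this order, the sentence comes out as `She studied at the <org>University of <city>Tunis</city></org>.` If the two stages are swapped, the second rule finds no city tag to build on and only the city is annotated, which is the kind of ordering dependency a cascade makes explicit.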


    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 11
    November 2023
    255 pages
    ISSN:2375-4699
    EISSN:2375-4702
    DOI:10.1145/3633309

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 November 2023
    Online AM: 11 September 2023
    Accepted: 07 August 2023
    Revised: 06 June 2023
    Received: 24 April 2022
    Published in TALLIP Volume 22, Issue 11

    Author Tags

    1. Ontology building and learning
    2. finite state transducer
    3. transducer cascade
    4. API
    5. Arabic NLP

    Qualifiers

    • Research-article
