Applications of Information Extraction, Knowledge Graphs, and Large Language Models

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 30 November 2024 | Viewed by 5128

Special Issue Editors


Dr. Junwen Duan
Guest Editor
School of Computer Science and Engineering, Central South University, Changsha 410073, China
Interests: information extraction; text mining; knowledge graph

Dr. Fangfang Li
Guest Editor
School of Computer Science and Engineering, Central South University, Changsha 410073, China
Interests: text mining; information extraction; knowledge graph

Dr. Tudor Groza
Guest Editor
Rare Care Centre, Perth Children's Hospital, Nedlands, WA 6009, Australia
Interests: natural language processing; knowledge graphs; ontologies

Special Issue Information

Dear Colleagues,

Information extraction (IE), knowledge graphs (KGs), and large language models (LLMs) have emerged as powerful tools for organizing, analyzing, and harnessing the potential of vast amounts of data. This Special Issue aims to explore the synergies and applications of IE, KGs, and LLMs, showcasing their collective impact on information management, knowledge representation, and decision-making processes.

Information extraction involves automatically identifying and extracting structured information from unstructured or semi-structured data sources, such as text documents, websites, social media posts, and scientific literature. Knowledge graphs provide a powerful framework for representing and organizing knowledge, enabling efficient navigation, querying, and inference over interconnected entities and their relationships. Large language models, such as GPT-3.5, have pushed the boundaries of natural language understanding and generation, demonstrating remarkable capabilities in tasks such as text completion, translation, summarization, and question answering.
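
To make this pipeline concrete, the following minimal Python sketch (our illustration, not any specific system) turns text into subject–relation–object triples and loads them into a small graph that can then be queried; the llm_extract_triples helper and its placeholder outputs are assumptions for exposition, standing in for an LLM prompt or a trained relation-extraction model.

    # Minimal sketch: information extraction output -> knowledge graph.
    # `llm_extract_triples` is a hypothetical stand-in for an LLM call
    # or a trained relation-extraction model.
    import networkx as nx

    def llm_extract_triples(text):
        # Placeholder output; a real system would analyze `text`.
        return [("aspirin", "is_a", "NSAID"),
                ("aspirin", "treats", "headache")]

    def build_kg(documents):
        kg = nx.MultiDiGraph()  # allows several relations per entity pair
        for doc in documents:
            for subj, rel, obj in llm_extract_triples(doc):
                kg.add_edge(subj, obj, relation=rel)
        return kg

    kg = build_kg(["Aspirin is an NSAID commonly used to treat headaches."])
    # Simple querying over the interconnected entities:
    print([(u, v, d["relation"]) for u, v, d in kg.edges(data=True)])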

This Special Issue invites original research papers and reviews that showcase the combined applications, methodologies, and advancements in the fields of information extraction, knowledge graphs, and large language models. We welcome submissions on topics including, but not limited to, the following:

  1. Knowledge Graph Construction: techniques and methodologies for constructing knowledge graphs from diverse data sources, incorporating the outputs of large language models for improved entity recognition, relation extraction, and ontology design.
  2. Semantic Search and Recommendation Systems: leveraging the power of large language models and knowledge graphs to enhance search engines and recommendation systems, enabling more accurate and context-aware information retrieval and personalized recommendations (see the illustrative sketch after this list).
  3. Natural Language Processing (NLP) with Large Language Models: exploring the integration of large language models, such as GPT-3.5, with knowledge graphs for tasks such as question answering, sentiment analysis, summarization, and named entity recognition.
  4. Knowledge Graphs in Healthcare and Life Sciences: harnessing the potential of information extraction, large language models, and knowledge graphs to facilitate biomedical data integration, clinical decision support systems, drug discovery, and personalized medicine.
  5. Industry Applications and Ethical Considerations: real-world case studies demonstrating the adoption and impact of combined IE, KG, and LLM technologies in various domains such as finance, e-commerce, manufacturing, transportation, and energy. Additionally, papers addressing the ethical implications and responsible use of large language models in knowledge extraction and representation are encouraged.
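
As a minimal, hypothetical sketch of the embedding-based semantic search mentioned in topic 2 (the embed function below is a toy stand-in for a real sentence-embedding model, and the hashed bag-of-words trick is an assumption for exposition), documents and queries are mapped to vectors and ranked by cosine similarity:

    # Toy semantic-search sketch; `embed` stands in for a real
    # sentence-embedding model (here: a hashed bag-of-words vector).
    import numpy as np

    def embed(text):
        vec = np.zeros(256)
        for word in text.lower().split():
            vec[hash(word) % 256] += 1.0
        return vec / (np.linalg.norm(vec) + 1e-9)  # unit-normalize

    def search(query, documents, top_k=3):
        q = embed(query)
        scores = [float(q @ embed(d)) for d in documents]  # cosine similarity
        order = np.argsort(scores)[::-1][:top_k]
        return [documents[i] for i in order]

    docs = ["Knowledge graphs organize entities and relations.",
            "Large language models generate and summarize text.",
            "Semantic search ranks documents by meaning."]
    print(search("how are entity relationships represented?", docs))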

Papers that showcase innovative approaches, novel algorithms, and practical implementations that advance the state of the art in information extraction, knowledge graphs, and large language models are welcome. We particularly encourage papers that demonstrate the successful deployment of these technologies in real-world scenarios and their impact on decision making, knowledge discovery, and information management.

Dr. Junwen Duan
Dr. Fangfang Li
Dr. Tudor Groza
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website; once registered, use the submission form to submit your manuscript. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • information extraction
  • knowledge graphs
  • large language models
  • natural language processing
  • healthcare applications
  • industry applications
  • data integration

Published Papers (5 papers)


Research


29 pages, 1340 KiB  
Article
Applied Hedge Algebra Approach with Multilingual Large Language Models to Extract Hidden Rules in Datasets for Improvement of Generative AI Applications
by Hai Van Pham and Philip Moore
Information 2024, 15(7), 381; https://doi.org/10.3390/info15070381 - 29 Jun 2024
Viewed by 160
Abstract
Generative AI applications have played an increasingly significant role in real-time tracking applications across many domains, including healthcare, consultancy, dialog boxes (a common type of window in the graphical user interface of an operating system), monitoring systems, and emergency response. This paper presents an approach that combines hedge algebra with a multilingual large language model to find hidden rules in big data for ChatGPT. We present a novel method for extracting natural language knowledge from large datasets, leveraging fuzzy sets and hedge algebra to extract rules that are presented as metadata for ChatGPT and generative AI applications. The proposed model is designed to minimize computational and staff costs for medium-sized enterprises, which are typically resource- and time-limited, and to automate question–response interactions for rules extracted from large data in a multiplicity of domains. Experimental results on domain-specific healthcare datasets validate the effectiveness of the proposed model; the ChatGPT application is tested in healthcare case studies using English- and Vietnamese-language datasets. In comparative experimental testing, the proposed model outperformed the state of the art, achieving performance in the range of 96.70–97.50% on a heart dataset.
22 pages, 1425 KiB  
Article
Towards Reliable Healthcare LLM Agents: A Case Study for Pilgrims during Hajj
by Hanan M. Alghamdi and Abeer Mostafa
Information 2024, 15(7), 371; https://doi.org/10.3390/info15070371 - 26 Jun 2024
Viewed by 286
Abstract
There is a pressing need for healthcare conversational agents with domain-specific expertise that provide accurate, reliable information tailored to specific medical contexts. There is also a notable gap in research on ensuring the credibility and trustworthiness of the information these agents provide, particularly in critical scenarios such as medical emergencies. Pilgrims come from diverse cultural and linguistic backgrounds and often face difficulties in accessing medical advice and information; an AI-powered multilingual chatbot can bridge this gap by providing readily available medical guidance and support, contributing to the well-being and safety of pilgrims. In this paper, we present a comprehensive methodology for enhancing the reliability and efficacy of healthcare conversational agents, with a specific focus on the needs of Hajj pilgrims. Our approach leverages domain-specific fine-tuning of a large language model, alongside synthetic data augmentation strategies, and introduces the HajjHealthQA dataset to optimize the delivery of contextually relevant healthcare information. Additionally, we employ a retrieval-augmented generation (RAG) module to validate uncertain generated responses, which improves model performance by 5%, and we train a secondary AI agent on a well-known health fact-checking dataset to validate medical information in the generated responses. Our approach significantly elevates the chatbot's accuracy and demonstrates its adaptability to a wide range of pilgrim queries. We evaluate the chatbot's performance using quantitative and qualitative metrics, showing that it generates accurate responses and achieves competitive results compared to state-of-the-art models while mitigating the risk of misinformation and providing users with trustworthy health information.
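
As a hedged illustration of the validation idea summarized above (our sketch, not the paper's implementation; the overlap measure and threshold are assumptions), a generated answer can be accepted only when it is sufficiently supported by at least one retrieved reference passage:

    # Illustrative RAG-style answer validation (not the paper's code).
    # An answer is accepted only if its token overlap with some retrieved
    # passage exceeds a threshold; otherwise it is flagged for review.
    def jaccard(a, b):
        sa, sb = set(a.lower().split()), set(b.lower().split())
        return len(sa & sb) / max(len(sa | sb), 1)

    def validate_answer(answer, retrieved, threshold=0.3):
        support = max(jaccard(answer, p) for p in retrieved)
        return support >= threshold  # False -> flag for human review

    passages = ["Drink water regularly and rest in shaded areas."]
    print(validate_answer("Pilgrims should drink water regularly and rest in shaded areas.", passages))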

10 pages, 4787 KiB  
Article
Detecting the Use of ChatGPT in University Newspapers by Analyzing Stylistic Differences with Machine Learning
by Min-Gyu Kim and Heather Desaire
Information 2024, 15(6), 307; https://doi.org/10.3390/info15060307 - 25 May 2024
Viewed by 807
Abstract
Large language models (LLMs) generate text by stringing together words learned from their extensive training data. The leading AI text-generation tool built on LLMs, ChatGPT, has quickly grown a vast user base since its release, but the domains in which it is being heavily leveraged are not yet publicly known. Understanding how generative AI is reshaping print media, and the extent to which it has already been adopted, requires methods to distinguish human-generated text from AI-generated text. Since college students have been early adopters of ChatGPT, we studied the presence of generative AI in newspaper articles written by collegiate journalists. To achieve this objective, an accurate AI detection model is needed. We analyzed university newspaper articles from different universities to determine whether ChatGPT was used to write or edit the news articles. We developed a detection model using classical machine learning and applied it to the news articles. The model achieved 93% accuracy on the training data and performed similarly on the test set, exceeding existing state-of-the-art detection tools. Finally, we applied the model to articles from 2023 and found that ChatGPT was not used to any appreciable measure to write or revise university news articles at the schools we studied.
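
A hedged sketch of the general approach (our illustration; the paper's exact stylistic features, training corpus, and classifier are not reproduced here) is to compute a few stylistic features per article and fit a classical classifier such as logistic regression:

    # Illustrative stylistic-feature detector (not the authors' exact
    # features or model). Labels: 1 = AI-generated, 0 = human-written.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def stylistic_features(text):
        words = text.split()
        sentences = [s for s in text.split(".") if s.strip()]
        return [
            float(np.mean([len(w) for w in words])),  # mean word length
            len(words) / max(len(sentences), 1),      # words per sentence
            len(set(words)) / max(len(words), 1),     # type-token ratio
        ]

    # Hypothetical labeled corpus; a real study would use many articles.
    texts = ["The council met on Tuesday. Students voiced concerns.",
             "In conclusion, the multifaceted landscape underscores synergy."]
    labels = [0, 1]
    X = np.array([stylistic_features(t) for t in texts])
    clf = LogisticRegression().fit(X, labels)
    print(clf.predict(X))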

20 pages, 4194 KiB  
Article
Do Large Language Models Show Human-like Biases? Exploring Confidence-Competence Gap in AI
by Aniket Kumar Singh, Bishal Lamichhane, Suman Devkota, Uttam Dhakal and Chandra Dhakal
Information 2024, 15(2), 92; https://doi.org/10.3390/info15020092 - 6 Feb 2024
Viewed by 2072
Abstract
This study investigates self-assessment tendencies in Large Language Models (LLMs), examining whether the patterns resemble human cognitive biases such as the Dunning–Kruger effect. LLMs, including GPT, BARD, Claude, and LLaMA, are evaluated using confidence scores on reasoning tasks: the models provide self-assessed confidence levels before and after responding to different questions. The results show cases where high confidence does not correlate with correctness, suggesting overconfidence; conversely, low confidence despite accurate responses indicates potential underestimation. Confidence scores vary across problem categories and difficulties, with reduced confidence for complex queries. GPT-4 displays consistent confidence, while LLaMA and Claude demonstrate more variation. Some of these patterns resemble the Dunning–Kruger effect, in which incompetence leads to inflated self-evaluation. While not conclusively evident, these observations parallel this phenomenon and provide a foundation for further exploring the alignment of competence and confidence in LLMs. As LLMs continue to expand their societal roles, further research into their self-assessment mechanisms is warranted to fully understand their capabilities and limitations.
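
The pre/post elicitation protocol described above can be pictured with a small sketch (our illustration, not the authors' code; ask_llm is a hypothetical stand-in for any chat-model API, and the prompt wording is an assumption):

    # Illustrative pre/post confidence elicitation (not the authors' code).
    # `ask_llm` is a hypothetical stand-in for a real chat-model API call.
    def ask_llm(prompt):
        # Placeholder: wire this to an actual model in practice.
        return "50"

    def elicit_confidence(question):
        pre = ask_llm("On a scale of 0-100, how confident are you that you "
                      "can answer this correctly? Reply with a number.\n"
                      "Question: " + question)
        answer = ask_llm("Answer the question: " + question)
        post = ask_llm("You answered: " + answer + "\nOn a scale of 0-100, "
                       "how confident are you that this is correct? "
                       "Reply with a number.")
        return {"pre": pre, "answer": answer, "post": post}

    print(elicit_confidence("What is the capital of Australia?"))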

Review


53 pages, 6188 KiB  
Review
A Survey of Text-Matching Techniques
by Peng Jiang and Xiaodong Cai
Information 2024, 15(6), 332; https://doi.org/10.3390/info15060332 - 5 Jun 2024
Viewed by 419
Abstract
Text matching, as a core technology of natural language processing, plays a key role in tasks such as question-answering systems and information retrieval. In recent years, the development of neural networks, attention mechanisms, and large-scale language models has significantly advanced text-matching technology. However, the rapid development of the field also makes it challenging to fully grasp the overall impact of these improvements. This paper provides a concise yet in-depth overview of the field of text matching: it sorts out the main ideas, problems, and solutions for text-matching methods based on statistical methods and neural networks; delves into matching methods based on large-scale language models, discussing the related configurations, API applications, datasets, and evaluation methods; outlines the applications and classifications of text matching in specific domains; and discusses current open problems and future research directions, to provide useful references for further developments in the field.
