
Semantic Search as Inference: Applications in Health Informatics

Published: 23 December 2014
  • Abstract

    In this thesis, we present models for semantic search: Information Retrieval (IR) models that elicit the meaning behind the words found in documents and queries rather than simply matching keywords. This is achieved by integrating structured domain knowledge with data-driven information retrieval methods.
    The research is set within health informatics and tackles a challenge particular to that domain: how to bridge the 'semantic gap', the mismatch between raw medical data and the way human beings interpret it. Bridging the semantic gap involves addressing two issues: semantics, that is, aligning the meaning or concepts behind the words found in documents and queries; and inference, which uses those semantics to infer relevant information.
    Three semantic search models, all utilising concept-based rather than term-based representations, are developed: the Bag-of-concepts model, which utilises concepts from the SNOMED CT medical ontology as its underlying representation; the Graph-based Concept Weighting model, which captures concept dependence and importance in a novel weighting function; and the core contribution of the thesis, the Graph INference model (GIN), a unified theoretical model of semantic search as inference, achieved by integrating structured domain knowledge (ontologies) with statistical information retrieval methods. It is the GIN that provides the necessary mechanism for inference to bridge the semantic gap. All three models are empirically evaluated using clinical queries and a real-world collection of clinical records taken from the TREC Medical Records Track (MedTrack).
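    To make the contrast between term-based and concept-based matching concrete, the following is a minimal sketch and not the thesis's implementation. The lexicon, the concept identifiers (which are made up and are not real SNOMED CT codes), the one-hop `RELATED` graph, and the `decay` parameter are all assumptions introduced purely for illustration; the GIN itself is defined in the thesis and is not reproduced here.

```python
from collections import Counter

# Hypothetical phrase-to-concept lexicon. The concept IDs are invented for
# this sketch and are NOT real SNOMED CT identifiers.
LEXICON = {
    "heart attack": "C_MI",
    "myocardial infarction": "C_MI",
    "chest pain": "C_CHEST_PAIN",
}

# Hypothetical concept graph: each concept lists directly related concepts.
RELATED = {"C_MI": ["C_CHEST_PAIN"]}


def to_concepts(text):
    """Map text to a bag of concepts using naive substring lookup."""
    lowered = text.lower()
    bag = Counter()
    for phrase, concept in LEXICON.items():
        bag[concept] += lowered.count(phrase)
    return +bag  # unary plus drops concepts with a zero count


def expand(query_bag, decay=0.5):
    """Add one-hop neighbours of each query concept with a decayed weight."""
    weights = dict(query_bag)
    for concept, weight in query_bag.items():
        for neighbour in RELATED.get(concept, []):
            weights[neighbour] = max(weights.get(neighbour, 0.0), weight * decay)
    return weights


def score(query_weights, doc_bag):
    """Weighted overlap between query concepts and a document's concept bag."""
    return sum(w * doc_bag[c] for c, w in query_weights.items())


query = to_concepts("heart attack")
doc_a = to_concepts("Patient admitted with myocardial infarction.")
doc_b = to_concepts("Patient reports chest pain.")

print(score(query, doc_a))          # shares a concept, though no query words
print(score(query, doc_b))          # no shared concept without expansion
print(score(expand(query), doc_b))  # reachable through the related concept
```

    In this toy setting the query and `doc_a` share no words yet map to the same concept, while `doc_b` only becomes retrievable once the query is expanded along the graph, which is the kind of gap that inference is meant to bridge.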
    Our evaluation shows that the use of concept-based representations in the Bag-of-concepts model leads to improved retrieval effectiveness. When concepts are combined within the Graph-based Concept Weighting model, further improvements are possible. The evaluation of GIN highlighted that its inference mechanism is suited to hard queries, those that perform poorly on a term-based system. In-depth analysis also revealed that the GIN returned many new documents that were not retrieved by term-based systems and were therefore never evaluated for relevance as part of the TREC MedTrack. This indicates that evaluating with current IR test collections, whose judgment pools received no contributions from semantic search systems, may underestimate the effectiveness of semantic search systems.
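    The pooling effect described above can be illustrated with a small sketch. The document identifiers, relevance judgments, and rankings below are hypothetical, and the "skip unjudged" variant (scoring a condensed list with unjudged documents removed) is one well-known way to probe the effect; it is not necessarily the evaluation technique the thesis proposes.

```python
def precision_at_k(ranking, qrels, k, skip_unjudged=False):
    """Precision at rank k.

    qrels maps a document id to True (relevant) or False (not relevant).
    Ids missing from qrels are unjudged: by default they count as not
    relevant; with skip_unjudged=True they are removed before scoring.
    """
    if skip_unjudged:
        ranking = [doc for doc in ranking if doc in qrels]
    top = ranking[:k]
    return sum(1 for doc in top if qrels.get(doc, False)) / k


def unjudged_at_k(ranking, qrels, k):
    """Count documents in the top k that were never judged."""
    return sum(1 for doc in ranking[:k] if doc not in qrels)


# Hypothetical judgments, pooled only from term-based runs.
qrels = {"d1": True, "d2": False, "d3": True}

term_run = ["d1", "d2", "d3"]            # every document was judged
semantic_run = ["d1", "n1", "n2", "d3"]  # n1 and n2 were never judged

print(precision_at_k(term_run, qrels, 3))
print(precision_at_k(semantic_run, qrels, 3))
print(unjudged_at_k(semantic_run, qrels, 3))
print(precision_at_k(semantic_run, qrels, 3, skip_unjudged=True))
```

    Here the semantic run scores lower than the term run under the default assumption only because two of its top-ranked documents are unjudged; whether those documents are actually relevant cannot be determined from the existing judgments.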
    This work represents a significant step forward in the integration of structured domain knowledge and data-driven information retrieval methods. Furthermore, the thesis provides an understanding of inference: when and how it should be applied for effective semantic search. It shows that queries with certain characteristics benefit from inference, while others do not. The detailed investigation into the evaluation of semantic search systems shows how current IR test collections may underestimate the effectiveness of such systems, and new techniques for evaluation are suggested. The Graph Inference model, although developed within the medical domain, is generally defined and has implications in other areas, including web search, where an emerging research trend is to utilise structured knowledge resources for more effective semantic search.

    Published In

    ACM SIGIR Forum, Volume 48, Issue 2
    December 2014
    107 pages
    ISSN: 0163-5840
    DOI: 10.1145/2701583

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • Abstract

    Cited By

    • (2024) Result Assessment Tool: Software to Support Studies Based on Data from Search Engines. Advances in Information Retrieval, pp. 206-211. DOI: 10.1007/978-3-031-56069-9_19. Online publication date: 24-Mar-2024
    • (2023) Result Assessment Tool: A Software Toolkit for Conducting Studies Based on Search Results. Proceedings of the Association for Information Science and Technology 60(1), pp. 1143-1145. DOI: 10.1002/pra2.972. Online publication date: 22-Oct-2023
    • (2014) Medical Free-Text to Concept Mapping as an Information Retrieval Problem. Proceedings of the 19th Australasian Document Computing Symposium, pp. 93-96. DOI: 10.1145/2682862.2682880. Online publication date: 26-Nov-2014
    • (2014) Document Timespan Normalisation and Understanding Temporality for Clinical Records Search. Proceedings of the 19th Australasian Document Computing Symposium, pp. 85-88. DOI: 10.1145/2682862.2682879. Online publication date: 26-Nov-2014
    • (2014) Exploiting Inference from Semantic Annotations for Information Retrieval. Proceedings of the 7th International Workshop on Exploiting Semantic Annotations in Information Retrieval, pp. 43-45. DOI: 10.1145/2663712.2666197. Online publication date: 7-Nov-2014
