It is difficult for laypeople, and even for clinicians, to understand the eHealth content found in medical documents on the web. With the goal of building a health search engine, Task 2 of CLEF eHealth 2015 aims to measure the accuracy of information retrieval systems when searching web medical documents. In this task, our approach integrates the extraction of medical concepts into corpus preprocessing: all terms in the documents that are not related to medicine are removed before indexing. We also expand queries to search more effectively. In general, our results are not better than those of other participants in Task 2, except on some queries. When medical concept extraction is combined with query expansion based on laypeople’s queries, retrieval performance is also lower. This can be explained partly by the fact that laypeople’s queries often do not include medical terms, or contain only features describing their health situation. In addition, we also give a brief...
2016 Eighth International Conference on Knowledge and Systems Engineering (KSE), 2016
When reviewing clinical documents, doctors, researchers, and caregivers all want to know when a patient's disorders occur (in the past, present, or future, or from the past until now...) relative to the time the documents were written. This temporal information is very useful for building a treatment regimen and an inquiry system for the patient, and for summarizing the relevant documents. This paper proposes a hybrid approach that combines rules and machine learning to classify the relationship between a patient's disorders and the time the clinical documents were written. The hybrid approach achieves an accuracy of 0.5194, which is higher than that of the best-ranked system (0.328) in the ShARe/CLEF eHealth 2014 Evaluation Lab.
2015 Seventh International Conference on Knowledge and Systems Engineering (KSE), 2015
Disease/Disorder Template Filling is a complicated relation extraction task that requires a combination of several methods. This paper proposes a combined approach for disorder template filling. The system combines three methods: rule-based, regular expression-based, and machine learning-based. Compared with our system proposed for Task 2 of the ShARe/CLEF eHealth Evaluation Lab 2014 [6], this system adds several features to the machine learning-based method. The rule set is based on observation of disease/disorder instances in their dependency tree representation. The regular expressions use the rules from HeidelTime [2]. The machine learning method uses the SVM algorithm to train the classification model on the added features. This addition improved the result for the DocTime Class attribute by up to 6%. The system obtained an overall accuracy of 0.833, an F1-score of 0.445, a precision of 0.406, and a recall of 0.516.
Papers by Quoc HO