Hierarchical Text Classification of Autopsy Reports to Determine MoD and CoD Through Term-Based and Concepts-Based Features

Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Al-Garadi, Mohammed Ali; Rajandram, Retnagowri; Shaikh, Khairunisa

doi:10.1007/978-3-319-62701-4_16

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10357)

Included in the following conference series:

Industrial Conference on Data Mining

Abstract

Text classification is now widely used in the medical domain to classify free-text clinical reports. In this study, text classification techniques were used to determine the cause of death (CoD) from free-text forensic autopsy reports using the proposed term-based and SNOMED CT concept-based features. Detailed term-based and concept-based features were extracted from a set of 1500 forensic autopsy reports covering four manners of death (MoD) and 16 different causes of death. These features were used to train a text classifier. The classifier was deployed in a cascade architecture: the first level predicts the MoD, and the second level predicts the CoD, using the proposed term-based and SNOMED CT concept-based features. To show the significance of the proposed approach, we compared its results with those of four state-of-the-art feature extraction approaches. We also compared one-level classification with two-level classification. The experimental results showed that the proposed approach improved accuracy by 8% compared with the four baselines. Moreover, two-level classification determined the CoD more accurately than one-level classification.
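The cascade architecture described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the classifier, so a small multinomial naive Bayes over raw term counts stands in for it, the SNOMED CT concept-based features are omitted, and the four toy reports and their labels are invented purely for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: whitespace tokenization stands in for real term extraction.
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (a stand-in classifier)."""

    def fit(self, docs, labels):
        self.term_counts = defaultdict(Counter)  # label -> term frequencies
        self.doc_counts = Counter(labels)        # label -> number of reports
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.term_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        tokens = tokenize(doc)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for term in tokens:
                score += math.log((counts[term] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class CascadeClassifier:
    """Level 1 predicts the MoD; level 2 predicts the CoD within that MoD."""

    def fit(self, docs, mods, cods):
        self.mod_model = NaiveBayes().fit(docs, mods)
        self.cod_models = {}
        for mod in set(mods):
            # Train one CoD model per MoD, using only that MoD's reports.
            subset = [(d, c) for d, m, c in zip(docs, mods, cods) if m == mod]
            sub_docs, sub_cods = zip(*subset)
            self.cod_models[mod] = NaiveBayes().fit(sub_docs, sub_cods)
        return self

    def predict(self, doc):
        mod = self.mod_model.predict(doc)
        return mod, self.cod_models[mod].predict(doc)


# Hypothetical toy reports and labels, not taken from the study's data set.
docs = [
    "stab wound chest knife laceration heart",
    "gunshot wound head bullet skull fracture",
    "coronary artery occlusion myocardial infarction",
    "lung consolidation pneumonia infection",
]
mods = ["homicide", "homicide", "natural", "natural"]
cods = ["stab injury", "gunshot injury", "heart disease", "pneumonia"]

cascade = CascadeClassifier().fit(docs, mods, cods)
print(cascade.predict("bullet wound skull"))
print(cascade.predict("pneumonia infection lung"))
```

A one-level baseline in the same sketch would be a single `NaiveBayes` trained directly on the CoD labels. In the cascade, each second-level model only has to separate the causes of death that belong to one manner of death, but an error at the first level cannot be corrected at the second.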

Author information

Authors and Affiliations

Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Ghulam Mujtaba, Liyana Shuib, Ram Gopal Raj & Mohammed Ali Al-Garadi
Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
Retnagowri Rajandram & Khairunisa Shaikh
Department of Computer Science, Sukkur Institute of Business Administration, Sukkur, Pakistan
Ghulam Mujtaba

Corresponding author

Correspondence to Ram Gopal Raj.

Editor information

Editors and Affiliations

Institute of Computer Vision and Applied Computer Sciences, Leipzig, Sachsen, Germany
Petra Perner

Cite this paper

Mujtaba, G., Shuib, L., Raj, R.G., Al-Garadi, M.A., Rajandram, R., Shaikh, K. (2017). Hierarchical Text Classification of Autopsy Reports to Determine MoD and CoD Through Term-Based and Concepts-Based Features. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2017. Lecture Notes in Computer Science, vol 10357. Springer, Cham. https://doi.org/10.1007/978-3-319-62701-4_16

DOI: https://doi.org/10.1007/978-3-319-62701-4_16
Published: 01 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-62700-7
Online ISBN: 978-3-319-62701-4
eBook Packages: Computer Science, Computer Science (R0)