Abstract
This paper demonstrates a successful application of a Fuzzy Bayes machine-learning tool to the classification of large amounts of narrative text, using ROC curves to identify the optimum prediction threshold values at which to filter predictions for manual review. Different thresholds were used for different categories to optimize results and to minimize the resources needed to manually code a large number of claims narratives randomly extracted from a large U.S. insurer. The results indicated that combining computer classification with strategic assignment of narratives to manual coding filtered out approximately 15% of the narratives for manual review and yielded a final accuracy of 81% at the two-digit classification level.
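The per-category thresholding idea in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: each prediction carries a numeric strength score, a "positive" is a prediction that agrees with the manual code, and the optimum ROC point is chosen by maximizing Youden's J (TPR minus FPR). The function names (`roc_points`, `best_threshold`, `thresholds_by_category`, `route`) are hypothetical and do not reflect the TextMiner interface.

```python
from collections import defaultdict


def roc_points(scores, correct):
    """Return (threshold, tpr, fpr) for each distinct score used as a cut-off.

    A prediction counts as accepted when its score >= threshold.
    """
    pos = sum(correct)
    neg = len(correct) - pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, c in zip(scores, correct) if s >= t and c)
        fp = sum(1 for s, c in zip(scores, correct) if s >= t and not c)
        tpr = tp / pos if pos else 0.0
        fpr = fp / neg if neg else 0.0
        points.append((t, tpr, fpr))
    return points


def best_threshold(scores, correct):
    """Pick the cut-off maximizing Youden's J = TPR - FPR (an assumed criterion)."""
    return max(roc_points(scores, correct), key=lambda p: p[1] - p[2])[0]


def thresholds_by_category(predictions):
    """Compute one threshold per predicted category.

    predictions: iterable of (predicted_category, score, was_correct).
    """
    grouped = defaultdict(lambda: ([], []))
    for category, score, ok in predictions:
        grouped[category][0].append(score)
        grouped[category][1].append(bool(ok))
    return {cat: best_threshold(s, c) for cat, (s, c) in grouped.items()}


def route(predicted_category, score, thresholds):
    """Keep the computer-assigned code at or above the threshold, else review manually."""
    return "auto" if score >= thresholds[predicted_category] else "manual"
```

Under these assumptions, thresholds would be estimated on narratives that already have manual codes and then applied to new predictions, so that only weak predictions in each category are sent to human coders.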
References
Wellman, H., Lehto, M., Sorock, G.: Computerized coding of injury narrative data from the National Health Interview Survey. Accid. Anal. Prev. 36, 165–171 (2004)
Lehto, M., Sorock, G.: Machine learning of motor vehicle accident categories from narrative data. Methods Inf. Med. 35(4-5), 309–316 (1996)
Sorock, G., Ranney, T., Lehto, M.: Motor vehicle crashes in roadway construction work zones: an analysis using narrative text from insurance claims. Accid. Anal. Prev. 28, 131–138 (1996)
Lehto, M.R.: TextMiner Manual. ConsumerResearch, Inc., Ann Arbor, MI (2004)
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Corns, H.L., Marucci, H.R., Lehto, M.R. (2007). Development of an Approach for Optimizing the Accuracy of Classifying Claims Narratives Using a Machine Learning Tool (TEXTMINER[4]). In: Smith, M.J., Salvendy, G. (eds) Human Interface and the Management of Information. Methods, Techniques and Tools in Information Design. Human Interface 2007. Lecture Notes in Computer Science, vol 4557. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73345-4_47
DOI: https://doi.org/10.1007/978-3-540-73345-4_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73344-7
Online ISBN: 978-3-540-73345-4
eBook Packages: Computer Science, Computer Science (R0)