Abstract
Unintentional injury due to falls is a serious and expensive health problem among the elderly. This is especially true in the Veterans Health Administration (VHA) ambulatory care setting, where nearly 40% of the male patients are 65 or older and at risk for falls. Health service researchers and clinicians can use VHA administrative data to identify and explore the frequency and nature of fall-related injuries (FRI) to aid in the implementation of clinical and prevention programs. Here we define administrative data as structured (coded) values that are generated as a result of clinical services provided to veterans and stored in databases. However, the limitations of administrative data do not always allow for conclusive decision making, especially in areas where coding may be incomplete. This study uses data and text mining techniques to investigate whether unstructured text-based information in the electronic medical record can validate the administrative data and identify records that should have been coded as fall-related injuries. The challenges highlighted by this study include extracting and preparing data from administrative sources and the full electronic medical records, de-identifying the data (to assure HIPAA compliance), conducting chart reviews to construct a “gold standard” dataset, and applying both supervised and unsupervised text mining techniques and comparing them with traditional medical chart review.
Notes
The VHA Medical SAS Datasets are national administrative data for VHA-provided health care utilized primarily by veterans. The datasets are provided in SAS® format by fiscal year (Oct. 1 - Sept. 30). These data are extracted from the National Patient Care Database (NPCD). For more see http://www.virec.research.va.gov/datasourcesname/Medical-SAS-Datasets/SAS.htm.
An E-code should indicate a fall-related injury.
Acknowledgments
The authors acknowledge the research support, resources, and use of facilities provided by the James A. Haley Veterans’ Hospital in Tampa, Florida.
Cite this article
Tremblay, M.C., Berndt, D.J., Luther, S.L. et al. Identifying fall-related injuries: Text mining the electronic medical record. Inf Technol Manag 10, 253–265 (2009). https://doi.org/10.1007/s10799-009-0061-6
DOI: https://doi.org/10.1007/s10799-009-0061-6