Research Proposal
(ART)
Mutinta Malambo
School of Engineering
1.0 INTRODUCTION
Machine learning is an evolving branch of computational algorithms designed to emulate
human intelligence by learning from the surrounding environment. These algorithms are considered the
workhorse of the new era of so-called big data. Techniques based on machine learning have been applied
successfully in diverse fields, ranging from pattern recognition, computer vision, spacecraft engineering,
finance, entertainment, and computational biology to biomedical and medical applications (Alpaydin E.
Introduction to machine learning. 3rd ed. Cambridge, MA: The MIT Press; 2014).
Supervised machine learning algorithms have been a dominant method in the data mining field. Disease
prediction using health data has recently emerged as a potential application area for these methods.
Antiretroviral therapy (ART) has significantly reduced HIV-related morbidity and mortality. Keeping
persons living with HIV on ART poses a lasting and formidable challenge in the
global public health response. To obtain long-term health benefits, patients must overcome barriers to
physical access to care and psychosocial obstacles, and navigate sometimes difficult and frustrating health
systems. Among the challenges associated with ART is HIV/AIDS patients defaulting on treatment. Using
machine learning techniques, a pattern that defaulters exhibit before they default can be established.
This can help in implementing measures to ensure that HIV/AIDS patients do not default and that they
obtain the long-term benefits ART gives.
1.1 Background
Zambia is among the countries most heavily affected by widespread HIV infection.
The population of Zambia now stands at 10.2 million people with an annual growth rate of 2.9 percent
(Census 2000). More than 50 percent of the population is less than twenty years of age and constitutes
the group most vulnerable to HIV.
Despite free ART services offered countrywide, the number of defaulters is on the increase.
In most settings, regular attendance at health facilities is critical to the comprehensive medical care of
HIV-infected patients. During these visits, health care workers examine patients for clinical progression,
provide and monitor combination antiretroviral therapy (ART), and counsel them on minimizing the risk
of HIV transmission. Despite this need for regular monitoring, loss to follow-up has been reported at a
rate of 33.6 per 1000 patient years [4]. In industrialized settings, less advanced disease, younger age,
injection drug use and homelessness have been associated with loss to follow-up [5–7].
Loss to follow-up is more common in resource-poor settings. In an Antiretroviral Treatment in Lower
Income Countries (ART-LINC) study, loss to follow-up after 1 year was above 40% in some programs [8],
and was associated with more advanced clinical disease and lower CD4 cell counts. Similarly, an analysis of
the large ART programme jointly administered by the Zambian Ministry of Health and the Centre for
Infectious Disease Research in Zambia (CIDRZ) showed that predictors of treatment failure or death also
predicted loss to follow-up [9]. A meta-analysis of studies that traced patients lost to follow-up to
ascertain their vital status showed that in sub-Saharan Africa 40% of those traced had died [10]. High rates
of loss to follow-up may affect mortality estimates in ART programmes if patients lost to follow-up have a
different prognosis compared to similar patients remaining in care [11]. Obtaining valid estimates of loss
to follow-up at different points in time is therefore important when evaluating ART programmes.
Death is a competing risk for loss to follow-up: patients who die can no longer become lost to follow-up.
Competing risks are defined as events that prevent the outcome of interest from occurring. They are
common in longitudinal studies and are particularly important in populations at high risk of death [12,13].
For example, death from all causes is a competing risk when studying recurrences after treatment of
cancer, and death from other causes is a competing risk when studying a specific cause of death. In
standard Kaplan-Meier analyses, the follow-up of those developing a competing event is simply censored,
assuming that the probability of the outcome of interest is the same as that of comparable patients
remaining under observation. However, this assumption is invalid because the outcome of interest can no
longer occur in those developing the competing event, and such analyses will therefore overestimate the
probability of the outcome of interest. This situation can be seen as an extreme form of ‘informative’
censoring, where censoring is associated with the probability of the outcome [14]. Analyses that ignore
competing events are, however, regularly published [7,15,16] even though they may produce misleading
results. We examined how the competing risk of death affected estimates of loss to follow-up in cohorts
of patients.
1. Rosen S, Fox MP, Gill CJ (2007) Patient retention in antiretroviral therapy
1624–1633.
(Brinkhof MW, Dabis F, Myer L, Bangsberg DR, Boulle A, et al. (2008) Early loss of HIV-infected patients
on potent antiretroviral therapy programmes in lower-income countries. Bull World Health Organ 86:
559–567.)
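The competing-risk argument made earlier in this section, that censoring deaths in a Kaplan-Meier analysis overestimates loss to follow-up, can be illustrated with a small worked example. The sketch below is not taken from any of the cited studies: the ten-patient cohort is invented, and the function name `ltfu_estimates` and the outcome labels are illustrative assumptions. It compares the naive estimate (one minus Kaplan-Meier, with deaths censored) against the cumulative incidence of loss to follow-up, which only lets a patient become lost while still alive and in care.

```python
from collections import Counter


def ltfu_estimates(records):
    """Return (naive_km_estimate, cumulative_incidence) of loss to follow-up.

    records: list of (time, outcome) tuples, where outcome is one of
    "ltfu", "death" or "censored" (hypothetical labels for this sketch).
    """
    times = sorted({t for t, _ in records})
    counts = Counter(records)      # counts[(time, outcome)] -> number of patients
    at_risk = len(records)
    km_ltfu_free = 1.0             # Kaplan-Meier with deaths treated as censored
    event_free = 1.0               # probability of being alive and still in care
    cif_ltfu = 0.0                 # cumulative incidence of loss to follow-up
    for t in times:
        d_ltfu = counts[(t, "ltfu")]
        d_death = counts[(t, "death")]
        km_ltfu_free *= 1 - d_ltfu / at_risk
        # A patient can only be lost at time t if event-free just before t.
        cif_ltfu += event_free * d_ltfu / at_risk
        event_free *= 1 - (d_ltfu + d_death) / at_risk
        at_risk -= d_ltfu + d_death + counts[(t, "censored")]
    return 1 - km_ltfu_free, cif_ltfu


# Invented cohort: follow-up time in months and observed outcome.
cohort = [
    (1, "death"), (2, "death"), (3, "ltfu"), (4, "death"), (5, "ltfu"),
    (6, "death"), (7, "ltfu"), (8, "censored"), (9, "censored"), (10, "censored"),
]
naive_km, cif = ltfu_estimates(cohort)
print(f"naive 1-KM: {naive_km:.3f}, cumulative incidence: {cif:.3f}")
```

For this toy cohort the naive estimate is about 0.45, whereas the cumulative incidence is 0.30, which matches the 3 of 10 patients who were actually lost to follow-up.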
(The widespread use of antiretroviral therapy (ART) has transformed national AIDS responses and has
had a huge positive impact on health.[1] ART has been shown to reduce transmission of HIV and HIV-
related morbidity and mortality.[2] In 2012, 9.7 million people received ART in low- and middle-income
countries (LMICs)[1] and, as of 2013, ART prevented an estimated 4.2 million deaths in LMICs in 2002-
2012.[1] However, while increased access to ART has continued throughout the world, disparities in ART
access still exist.
Despite improved and highly successful programmatic coverage with ART, significant numbers of adults
and children drop out of care at various points along the treatment pathway and treatment gains fail to
reach sufficient numbers of children and adolescents.[1] It is essential to understand how and why
people drop out of treatment programs, since retention of people on ART and ensuring adherence to
treatment are critical determinants of successful long-term outcomes. Studies in sub-Saharan Africa
have shown that about half of people who test HIV-positive are lost between testing and being assessed
for eligibility for therapy, and 32% of people considered eligible for ART are then lost between eligibility
assessment and initiation of ART.[3] Data from 23 countries indicate that average retention for people
on ART decreases over time, from about 86% at 12 months to 72% at 60 months.[1] Loss to follow-up
(LTFU) negatively impacts on the immunological benefit of ART and increases AIDS-related morbidity,
mortality, and hospitalizations.[4] LTFU in patients receiving ART can result in serious consequences,
such as discontinuation of treatment, drug toxicity, treatment failure due to poor adherence, and drug
resistance;[5,6,7] this results in an increased risk of death[8,9,10,11,12,13] of up to 40% in studies of
patients LTFU in sub-Saharan Africa.[14,15] Poor nutritional status, lower CD4 count, Tuberculosis (TB)
co-infection, advanced clinical staging, younger age, adverse drug reactions, gaps in services, and
accessibility to services are some of the predictors reported to be associated with LTFU.
[3,14,16,17,18,19]
1. WHO. Global update on HIV treatment: Results, impact and opportunities. 2013. Accessed March 14, 2014,
at: http://www.who.int/about/licensing/copyright_form/en/index.html .
4. Hogg RS, Heath K, Bangsberg D, Yipa B, Press N, O’Shaughnessy MV, et al. Intermittent use of
triple-combination therapy is predictive of mortality at baseline and after 1 year of follow-up. AIDS.
2002;16:1051–8.
5. Kaplan JE, Hanson D, Dworkin MS, Frederick T, Bertolli J, Lindegren ML, et al. Epidemiology of human
immunodeficiency virus associated opportunistic infections in the United States in the era of highly
active antiretroviral therapy. Clin Infect Dis. 2000;30:S5–14.
6. Low-Beer S, Yip B, O’Shaughnessy MV, Hogg RS, Montaner JS. Adherence to triple therapy and viral
load response. J Acquir Immune Defic Syndr. 2000;23:360–1.
7. Taiwo B. Understanding transmitted HIV resistance through the experience in the USA. Int J Infect Dis.
2009;13:552–9.
8. Dalal RP, Macphail C, Mqhayi M, Wing J, Feldman C, Chersich MF, et al. Characteristics and outcomes
of adult patients lost to follow-up at an antiretroviral treatment clinic in Johannesburg, South Africa. J
Acquir Immune Defic Syndr. 2008;47:101–7.
9. Brennan AT, Maskew M, Sanne I, Fox MP. The importance of clinic attendance in the first six months
on antiretroviral treatment: A retrospective analysis at a large public sector HIV clinic in South Africa. J
Int AIDS Soc. 2010;13:49.
10. Bygrave H, Kranzer K, Hilderbrand K, Whittall J, Jouquet G, Goemaere E, et al. Trends in loss to
follow-up among migrant workers on antiretroviral therapy in a community cohort in Lesotho. PLoS One.
2010;5:e13198.
11. Adam BD, Maticka-Tyndale E, Cohen JJ. Adherence practices among people living with HIV. AIDS
Care. 2003;15:263–74.
12. Malcolm SE, Ng JJ, Rosen RK, Stone VE. An examination of HIV/AIDS patients who have excellent
adherence to HAART. AIDS Care. 2003;15:251–61.
13. Murphy DA, Sarr M, Durako SJ, Moscicki AB, Wilson CM, Muenz LR. Adolescent Medicine HIV/AIDS
Research Network. Barriers to HAART adherence among human immunodeficiency virus-infected
adolescents. Arch Pediatr Adolesc Med. 2003;157:249–55.
14. Brinkhof MW, Pujades-Rodriguez M, Egger M. Mortality of patients lost to follow-up in antiretroviral
treatment programmes in resource-limited settings: Systematic review and meta-analysis. PLoS One.
2009;4:e5790.
15. Fatti G, Meintjes G, Shea J, Eley B, Grimwood A. Improved survival and antiretroviral treatment
outcomes in adults receiving community-based adherence support: 5-year results from a multicentre
cohort study in South Africa. J Acquir Immune Defic Syndr. 2012;61:e50–8.
16. Amuron B, Namara G, Birungi J, Nabiryo C, Levin J, Grosskurth H, et al. Mortality and
loss-to-follow-up during the pre-treatment period in an antiretroviral therapy programme under normal
health service conditions in Uganda. BMC Public Health. 2009;9:290.
17. Alvarez-Uria G, Naik PK, Pakam R, Midde M. Factors associated with attrition, mortality, and loss to
follow up after antiretroviral therapy initiation: Data from an HIV cohort study in India. Glob Health
Action. 2013;6:21682.
18. Lanoy E, Mary-Krause M, Tattevin P, Dray-Spira R, Duvivier C, Fischer P, et al. Clinical Epidemiology
Group of French Hospital Database on HIV Infection. Predictors identified for losses to follow-up among
HIV-sero positive patients. J Clin Epidemiol. 2006;59:829–35.
19. Bagchi S. Telemedicine in rural India. PLoS Med. 2006;3:e82.
20. Rosen S, Fox MP, Gill CJ. Patient retention in antiretroviral therapy programs in sub-Saharan Africa: A
systematic review. PLoS Med. 2007;4:e298.
21. Karcher H, Omondi A, Odera J, Kunz A, Harms G. Risk factors for treatment denial and loss to
follow-up in an antiretroviral treatment cohort in Kenya. Trop Med Int Health. 2007;12:687–94.
22. Central Statistical Agency [Ethiopia] and ICF International: Ethiopia Demographic and Health Survey.
Addis Ababa, Ethiopia and Calverton, Maryland, USA: Central Statistical Agency and ICF International;
2012. Accessed February 25, 2014, at
http://www.usaid.gov/sites/default/files/documents/1860/Demographic%20Health%20Survey%202011%20Ethiopia%20Final%20Report.pdf .
23. Moshago T, Haile DB, Enqusilasie F. Survival analysis of HIV infected people on antiretroviral therapy
at Mizan-Aman General Hospital, Southwest Ethiopia. Int J Sci Res. 2014;3:1462–9.
24. Bakand C, Birungi J, Mwesigwa R, Nachega J, Chan K, Palmer A, et al. Survival of HIV-infected
adolescents on antiretroviral therapy in Uganda: Findings from a nationally representative cohort in
Uganda. PLoS One. 2011;6:e19261.
25. Deribe K, Hailekiros F, Biadgilign S, Amberbir A, Beyene BK. Defaulters from antiretroviral treatment
in Jimma University Specialized Hospital, Southwest Ethiopia. Trop Med Int Health. 2008;13:328–33.
26. Schoni-Affolter F, Keiser O, Mwango A, Stringer J, Ledergerber B, Mulenga L, et al. Swiss HIV Cohort
Study, IeDEA Southern Africa. Estimating loss to follow-up in HIV-infected patients on antiretroviral
therapy: The effect of the competing risk of death in Zambia and Switzerland. PLoS One. 2011;6:e27919.
27. Yu JK, Chen SC, Wang KY, Chang CS, Makombe SD, Schouten EJ, et al. True outcomes for patients on
antiretroviral therapy who are “lost to follow-up” in Malawi. Bull World Health Organ. 2007;85:550–4.
28. Alemu AW, San Sebastian M. Determinants of survival in adult HIV patients on antiretroviral therapy
in Oromiyaa, Ethiopia. Glob Health Action. 2010;3.
29. Gerver SM, Chadborn TR, Ibrahim F, Vatsa B, Delpech VC, Easterbrook PJ. High rate of loss to clinical
follow up among African HIV-infected patients attending a London clinic: A retrospective analysis of a
clinical cohort. J Int AIDS Soc. 2010;13:29.
30. Mocroft A, Kirk O, Aldins P, Chies A, Blaxhult A, Chentsova N, et al. EuroSIDA study group. Loss to
follow-up in an international, multicentre observational study. HIV Med. 2008;9:261–9.
31. Weidle PJ, Malamba S, Mwebaze R, Sozi C, Rukundo G, Downing R, et al. Assessment of a pilot
antiretroviral drug therapy programme in Uganda: Patients’ response, survival, and drug resistance.
Lancet. 2002;360:34–40.
32. Wilson CM, Wright PF, Safrit JT, Rudy B. Epidemiology of HIV infection and risk in adolescents and
youth. J Acquir Immune Defic Syndr. 2010;54:S5–6.
33. Murphy DA, Wilson CM, Durako SJ, Muenz LR, Belzer M. Adolescent medicine HIV/AIDS research
network. Antiretroviral medication adherence among REACH HIV-infected adolescent cohort in the USA.
AIDS Care. 2001;13:27–40.
34. Gordillo V, del Amo J, Soriano V, Gonzalez-Lahoz J. Socio demographic and psychological variables
influencing adherence to antiretroviral therapy. AIDS. 1999;13:1763–9.
35. Brinkhof MW, Spycher BD, Yiannoutsos C, Weigel R, Wood R, Messou E, et al. Adjusting mortality for
loss to follow-up: Analysis of five ART programmes in sub-Saharan Africa. PLoS One. 2010;5:e14149.
36. Geng EH, Glidden DV, Emenyonu N, Musinguzi N, Bwana MB, Neilands TB, et al. Tracking a sample of
patients lost to follow-up has a major impact on understanding determinants of survival in HIV-infected
patients on antiretroviral therapy in Africa. Trop Med Int Health. 2010;15:63–9.
37. Lebouche B, Yazdanpanah Y, Gerard Y, Sissoko D, Ajana F, Alcaraz I, et al. Incidence rate and risk
factors for loss to follow-up in a French clinical cohort of HIV-infected patients from January 1985 to
January 1998. HIV Med. 2006;7:140–5.
38. Ministry of Health (MOH) of Ethiopia: Guideline for implementation of antiretroviral therapy. Federal
HIV/AIDS Prevention and Control Office. Federal Ministry of Health. 2010. Accessed March 25, 2014, at
http://www.etharc.,org/.../SPM%20II%20Final%20version%20sept%2026.pdf .)
Over the past decade, Zambia has made progress in the HIV response. New infections and annual
AIDS-related deaths have declined significantly. Despite this progress, the HIV burden remains high
because patients on treatment are defaulting and becoming lost to follow-up. Data mining algorithms can
be used to establish a pattern that defaulters exhibit before they default. This is especially important in
countries like Zambia, which are heavily affected by HIV, where medical cures are still being sought and
all new knowledge and techniques for managing the disease during treatment are welcome.
1.3 Purpose
1.4 Objectives
The main objective of the research is to establish a pattern, using machine learning based models, that
predicts which HIV/AIDS patients on antiretroviral treatment will default on treatment and become lost to
follow-up.
2.3 Relevant Literature: Review of relevant literature that you analysed to come up with the gaps that
led you to formulate the statement of the problem
3.0 METHODOLOGY
CIDRZ-supported activities in HIV care and treatment began in 2004 across four sites in Lusaka. Since
then, the programme has expanded to 68 facilities, most of them government health centres and
hospitals. Across all sites, clinical care is standardized according to the Zambian National HIV guidelines
[18]. Over the analysis period, individuals were eligible for ART when: (a) they were diagnosed with a
stage IV condition according to World Health Organization (WHO) criteria; (b) their CD4 cell count was
below 200 cells/µL; or (c) they had a stage III condition and their CD4 cell count was between 200 and 350 cells/µL.
Clinical and immunologic monitoring occurred every 3–6 months. Although viral load testing is available,
its use is limited for reasons of cost and operational constraints [19], particularly outside of Lusaka.
Patients with missed visits are contacted by community health workers and reminded about their
appointments [20]. All patient-level data are entered into a comprehensive electronic medical record
supported by the Ministry of Health. Additional details of the CIDRZ programme can be found elsewhere
[9], [17]. Approval for use of these programmatic data was obtained from the University of Zambia
Research Ethics Committee (Lusaka, Zambia) and the University of Alabama at Birmingham
(Birmingham, AL, USA). Only routine clinical data were analyzed for the present study and informed
consent from patients was not obtained.
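The three ART eligibility criteria described above can be written as a single rule, which is useful when deriving an eligibility flag from the electronic medical record. The following is a minimal sketch under stated assumptions: the function name `eligible_for_art` and its arguments are hypothetical, the 200 to 350 cells/µL range is treated as inclusive at 350, and a missing CD4 count is assumed to make a patient eligible only through a stage IV diagnosis.

```python
def eligible_for_art(who_stage, cd4_count):
    """Apply the eligibility criteria (a)-(c) from the text.

    who_stage: WHO clinical stage as an integer from 1 to 4.
    cd4_count: CD4 cell count in cells/µL, or None if not measured
               (handling of missing values is an assumption of this sketch).
    """
    if who_stage == 4:                      # (a) stage IV condition
        return True
    if cd4_count is None:
        return False
    if cd4_count < 200:                     # (b) CD4 below 200 cells/µL
        return True
    # (c) stage III with CD4 between 200 and 350 cells/µL (upper bound assumed inclusive)
    return who_stage == 3 and cd4_count <= 350


print(eligible_for_art(who_stage=3, cd4_count=300))
print(eligible_for_art(who_stage=2, cd4_count=300))
```

The first call corresponds to criterion (c); the second patient meets none of the three criteria.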
The scope of this research is primarily the performance analysis of disease prediction approaches
using different variants of supervised machine learning algorithms. Disease prediction and, in a broader
context, medical informatics have gained significant attention from the data science research
community in recent years. This is primarily due to the wide adoption of computer-based technology
in the health sector in different forms (e.g., electronic health records and administrative data) and the
subsequent availability of large health databases for researchers. These electronic data are being utilised
in a wide range of healthcare research areas such as the analysis of healthcare utilisation [10],
measuring the performance of a hospital care network [11], exploring patterns and cost of care [12],
developing disease risk prediction models [13, 14], chronic disease surveillance [15], and comparing
disease prevalence and drug outcomes [16]. Our research focuses on disease risk prediction models
involving machine learning algorithms (e.g., support vector machine, logistic regression and artificial
neural network), specifically supervised learning algorithms. Models based on these algorithms use
labelled training data of patients for training [8, 17, 18]. For the test set, patients are classified into
several groups such as low risk and high risk.
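The supervised workflow described above (train on labelled patient records, then classify unseen patients as low or high risk) can be sketched with logistic regression, one of the algorithms named in the text. Everything specific in this sketch is an assumption made for illustration: the two features (missed visits in the last six months and baseline CD4 count divided by 100), the eight invented training records, the 0.5 threshold, and the function names. It uses plain gradient descent so that it runs without third-party libraries; a real study would more likely rely on an established machine learning package.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.1, epochs=10000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_risk(weights, bias, x, threshold=0.5):
    """Return (risk group, predicted probability of default) for one patient."""
    p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
    return ("high risk" if p >= threshold else "low risk"), p


# Invented labelled records: (missed visits, baseline CD4 / 100); label 1 = defaulted.
train_rows = [(0, 4.5), (0, 3.8), (1, 5.0), (1, 3.0),
              (3, 2.0), (4, 1.5), (3, 2.5), (5, 1.0)]
train_labels = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train_logistic(train_rows, train_labels)
for patient in [(5, 1.0), (0, 4.5)]:
    group, prob = predict_risk(weights, bias, patient)
    print(patient, group, round(prob, 3))
```

In practice the records would be split into separate training and test sets so that reported performance reflects patients the model has not seen.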
Given the growing applicability and effectiveness of supervised machine learning algorithms in
predictive disease modelling, the breadth of research still seems to be expanding. Specifically, we found little
research that makes a comprehensive review of published articles employing different supervised
learning algorithms for disease prediction. Therefore, this research aims to identify key trends among
different types of supervised machine learning algorithms, their performance accuracies and the types
of diseases being studied. In addition, the advantages and limitations of different supervised machine
learning algorithms are summarised. The results of this study will help scholars to better understand
current trends and hotspots of disease prediction models using supervised machine learning algorithms
and formulate their research goals accordingly.
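Comparing the performance of different algorithms requires a common set of metrics. The sketch below is an illustrative assumption rather than a method taken from the reviewed literature: it computes accuracy, sensitivity and specificity from true and predicted labels (1 = defaulted, 0 = retained). The invented example shows why accuracy alone can mislead when defaulters are a minority: a model that never predicts default still scores high accuracy while identifying no defaulters.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # None marks a metric that is undefined because its denominator is zero.
        return num / den if den else None

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),   # share of true defaulters detected
        "specificity": ratio(tn, tn + fp),   # share of retained patients correctly cleared
    }


# Invented example: 2 defaulters among 10 patients, model predicts "retained" for everyone.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here accuracy is 0.8 even though sensitivity is 0.0, so sensitivity and specificity should be reported alongside accuracy.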
10. Culler SD, Parchman ML, Przybylski M. Factors related to potentially preventable hospitalizations among
the elderly. Med Care. 1998;1:804–17.
11. Uddin MS, Hossain L. Social networks enabled coordination model for cost Management of Patient
Hospital Admissions. J Healthc Qual. 2011;33(5):37–48.
12. Lee PP, et al. Cost of patients with primary open-angle glaucoma: a retrospective study of commercial
insurance claims data. Ophthalmology. 2007;114(7):1241–7.
13. Davis DA, Chawla NV, Christakis NA, Barabási A-L. Time to CARE: a collaborative engine for practical
disease prediction. Data Min Knowl Disc. 2010;20(3):388–415.
14. McCormick T, Rudin C, Madigan D. A hierarchical model for association rule mining of sequential events:
an approach to automated medical symptom prediction; 2011.
15.
16. Fisher ES, Malenka DJ, Wennberg JE, Roos NP. Technology assessment using insurance claims: example
of prostatectomy. Int J Technol Assess Health Care. 1990;6(02):194–202.
17. Farran B, Channanath AM, Behbehani K, Thanaraj TA. Predictive models to assess risk of type 2 diabetes,
hypertension and comorbidity: machine-learning algorithms and validation using national health data
from Kuwait-a cohort study. BMJ Open. 2013;3(5):e002457.
18. Ahmad LG, Eshlaghy A, Poorebrahimi A, Ebrahimi M, Razavi A. Using three machine learning techniques
for predicting breast cancer recurrence. J Health Med Inform. 2013;4(124):3.
19. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and
meta-analyses: the PRISMA statement. Ann Intern Med. 2009;151(4):264–9.