Abstract
Systematic reviews are an essential element of modern evidence-based medical practice, but creating them is resource intensive. To mitigate this problem, there have been attempts to use supervised machine learning to automate article triage. This approach has proved helpful for updating existing systematic reviews. However, it holds very little promise for creating new reviews, because labeled training data is rarely available when a review is first created. In this research, we assess and compare the applicability of semi-supervised learning techniques for overcoming this labeling bottleneck and supporting the creation of systematic reviews. The results indicate that semi-supervised learning can significantly reduce human effort and is a viable technique for automating medical systematic review creation with a small training dataset.
Cite this article
Liu, J., Timsina, P. & El-Gayar, O. A comparative analysis of semi-supervised learning: The case of article selection for medical systematic reviews. Inf Syst Front 20, 195–207 (2018). https://doi.org/10.1007/s10796-016-9724-0
DOI: https://doi.org/10.1007/s10796-016-9724-0