Abstract
Preparation of landslide susceptibility maps is considered the first important step in landslide risk assessment, but these maps are also accepted as an end product that can be used for land use planning. The main objective of this study is to explore several state-of-the-art machine learning techniques and to introduce a framework for training and validating shallow landslide susceptibility models using the latest statistical methods. The Son La hydropower basin (Vietnam) was selected as a case study. First, a landslide inventory map was constructed using the historical landslide locations from two national projects in Vietnam. A total of 12 landslide conditioning factors were then constructed from various data sources. Landslide locations were randomly split in a 70:30 ratio for training and validating the models. To choose the best subset of conditioning factors, the predictive ability of the factors was assessed using the information gain ratio with 10-fold cross-validation. Factors with null predictive ability were removed to optimize the models. Subsequently, five landslide models were built using support vector machines (SVM), multi-layer perceptron neural networks (MLP Neural Nets), radial basis function neural networks (RBF Neural Nets), kernel logistic regression (KLR), and logistic model trees (LMT). The resulting models were validated and compared using the receiver operating characteristic (ROC) curve, the Kappa index, and several statistical evaluation measures. Additionally, Friedman and Wilcoxon signed-rank tests were applied to confirm statistically significant differences among the five machine learning models employed in this study. Overall, the MLP Neural Nets model has the highest prediction capability (90.2 %), followed by the SVM model (88.7 %), the KLR model (87.9 %), the RBF Neural Nets model (87.1 %), and the LMT model (86.1 %). The results show that both the KLR and the LMT models are promising methods for shallow landslide susceptibility mapping. The results of this study demonstrate the benefit of selecting the optimal machine learning technique together with a proper conditioning factor selection method in shallow landslide susceptibility mapping.
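The factor screening and validation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, and the 0.5 classification threshold are assumptions made for illustration, the gain ratio is computed once on the whole toy set rather than with 10-fold cross-validation, and the five models themselves are not reproduced.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(factor, labels):
    """Information gain ratio of a discrete factor with respect to class labels."""
    n = len(labels)
    groups = {}
    for value, label in zip(factor, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(factor)
    if split_info == 0:
        return 0.0  # a constant factor has null predictive ability
    return (entropy(labels) - conditional) / split_info


def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def cohen_kappa(predicted, actual):
    """Cohen's kappa: agreement between predictions and truth beyond chance."""
    n = len(actual)
    observed = sum(p == a for p, a in zip(predicted, actual)) / n
    pred_counts, act_counts = Counter(predicted), Counter(actual)
    expected = sum(pred_counts[k] * act_counts[k] for k in act_counts) / n ** 2
    return (observed - expected) / (1 - expected)


def split_70_30(samples, seed=0):
    """Randomly split samples into 70 % training and 30 % validation."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Toy, made-up data (assumption): 1 = landslide cell, 0 = non-landslide cell.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
slope_class = ["steep", "steep", "steep", "gentle",
               "gentle", "gentle", "gentle", "steep"]
constant = ["a"] * 8  # a factor that never varies
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]  # hypothetical model output
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("gain ratio (slope):", round(gain_ratio(slope_class, labels), 4))
print("gain ratio (constant):", gain_ratio(constant, labels))
print("AUC:", auc(scores, labels))
print("kappa:", cohen_kappa(predicted, labels))
```

In this sketch a factor whose gain ratio is zero, such as `constant`, would be the kind of factor dropped before model fitting. The Friedman and Wilcoxon signed-rank comparisons are left out because they need per-model results that the abstract does not provide.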
Acknowledgment
This research was supported by the Geographic Information System group, Department of Business Administration and Computer Science, Faculty of Art and Sciences, Telemark University College, Bø i Telemark, Norway. The authors would like to thank Professor Candan Gokceoglu and three anonymous reviewers for their valuable and constructive comments on the earlier version of the manuscript.
Cite this article
Tien Bui, D., Tuan, T.A., Klempe, H. et al. Spatial prediction models for shallow landslide hazards: a comparative assessment of the efficacy of support vector machines, artificial neural networks, kernel logistic regression, and logistic model tree. Landslides 13, 361–378 (2016). https://doi.org/10.1007/s10346-015-0557-6
DOI: https://doi.org/10.1007/s10346-015-0557-6