Article
An Integrated Approach of Machine Learning, Remote Sensing, and GIS Data for the Landslide Susceptibility Mapping
Israr Ullah 1, Bilal Aslam 2, Syed Hassan Iqbal Ahmad Shah 3,4, Aqil Tariq 5,6,*,†, Shujing Qin 7,*,†, Muhammad Majeed 8 and Hans-Balder Havenith 9
1 Division of Earth Sciences and Geography, RWTH Aachen University, 52062 Aachen, Germany
2 School of Informatics, Computing, and Cyber Systems, Northern Arizona University, Flagstaff, AZ 86011, USA
3 Division of Earth and Planetary Science, University of Hong Kong, Hong Kong, China
4 Laboratory for Space Research, University of Hong Kong, Hong Kong, China
5 State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430072, China
6 Department of Wildlife, Fisheries and Aquaculture, Mississippi State University, 775 Stone Boulevard, Mississippi State, MS 39762, USA
7 State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan 430072, China
8 Department of Botany, University of Gujrat, Hafiz Hayat Campus, Gujrat 50700, Pakistan
9 Georisk & Environment, Department of Geology, University of Liege, 4000 Liege, Belgium
* Correspondence: aqiltariq@whu.edu.cn (A.T.); Shujing.qin@whu.edu.cn (S.Q.)
† These authors contributed equally to this work.
Citation: Ullah, I.; Aslam, B.; Shah, S.H.I.A.; Tariq, A.; Qin, S.; Majeed, M.; Havenith, H.-B. An Integrated Approach of Machine Learning, Remote Sensing, and GIS Data for the Landslide Susceptibility Mapping. Land 2022, 11, 1265. https://doi.org/10.3390/land11081265
Academic Editor: Le Yu
Received: 12 July 2022
Accepted: 3 August 2022
Published: 7 August 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
Abstract: Landslides triggered in mountainous areas can have catastrophic consequences, threaten human life, and cause billions of dollars in economic losses. Hence, it is imperative to map the areas susceptible to landslides to minimize their risk. A large number of landslides occur around Abbottabad, a large city in northern Pakistan. This study aimed to map landslide susceptibility in this region using three Machine Learning (ML) techniques: Linear Regression (LiR), Logistic Regression (LoR), and Support Vector Machine (SVM). Several influencing factors were used to identify the potential landslide areas, including elevation, slope degree, slope aspect, general curvature, plan curvature, profile curvature, landcover classification system (LCCS), Normalized Difference Water Index (NDWI), Normalized Difference Vegetation Index (NDVI), soil, lithology, fault density, topographic roughness index, and road density. The weights of these factors were calculated using the ML techniques, and the weighted overlay tool was used to map the final output. According to the three ML models, lithology, NDWI, slope, and LCCS significantly impact landslide occurrence. The area under the ROC curve (AUC) was used to validate the performance of the models, and the results show that the AUC value of LiR (88%) is higher than those of the SVM (86%) and LoR (85%) models. The ML models and the final susceptibility maps achieved good accuracy, which supports the reliability of the results. The study’s outcome provides a baseline for policymakers to propose adequate protection and mitigation measures against landslides in the region, and other researchers can adopt this methodology to map landslide susceptibility in other areas with similar characteristics.
Keywords: Abbottabad; landslide; machine learning; natural hazard; policymakers
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Landslides, which occur regularly worldwide, are one of the main drivers of terrain transformation [1,2]; they are prevalent geohazards that cause economic losses and casualties [3–5]. From 2008 to 2017, landslides caused more than US $2.7 billion in economic losses and affected over 3 million people, with 10,338 fatalities globally [6,7]. When natural disasters are classified by the natural processes involved, landslides rank third among the most disastrous types of natural hazards [8–10]. Landslides have a heterogeneous spatial distribution, with Asia being the most affected geographical region [11]. About 60% of Pakistan’s entire area features rough mountainous topography and plateaus [12]. The Himalayan region represents the most hazard-prone region of the country. It is exposed to widespread landslide activity due to heavy rainfall during the monsoon season, high seismicity, the presence of high and steep slopes, and the locally thick, unstable soil cover. The 2005 Kashmir earthquake initiated thousands of landslides in the northern region [13,14]. The Abbottabad district, which is the target region of this study, is located in the foothills of the highly rugged Himalayan mountains and is subject to many landslides. Abbottabad city lies 43 km south of Balakot, where the 2005 Kashmir earthquake, with a moment magnitude of 7.6, devastated the whole region [15,16].
In many regions, various types of natural hazards may be coupled and their combined effects may be strongly intensified [17]. In such circumstances, the recognition of the effect of each hazard becomes very challenging [18]. Landslides are difficult to predict, therefore it is essential to understand the causative factors of landslides and to map the areas which are susceptible to future landslide occurrence. For this purpose, landslide susceptibility (LS) mapping is carried out using remote sensing and geographic information system (GIS) data. LS map is categorized into different classes depending upon the degree of susceptibility (very low, low, moderate, high, very high) without considering the rate of occurrence [7,19]. Different methods have been developed and adopted for LS mapping globally, classified as qualitative and quantitative approaches [20–22]. The qualitative approach is a knowledge-driven method based on an expert’s knowledge. It is a relatively subjective or heuristic approach. It evaluates landslide susceptibility by weighing and ranking the influencing factors of landslides based on the researcher’s expertise [23–25]. Some qualitative techniques use analytical tools to perform rating and weighting and are considered semi-quantitative [26,27]. The commonly used subjective methods include simple additive weighting [28], ordered weighted average [29], analytical hierarchy process [30,31], analytical network process [32], TOPSIS [33,34], and the weighted linear combination [26,27]. The quantitative method is a data-driven method and an objective approach [20], which leveraged soft computing, deterministic methods, and statistical algorithms to evaluate the relationship between the landslide-influencing factors and slope instability and predict the probability of landslide [35,36]. 
Artificial neural network (ANN) [37,38], decision trees [39,40], logistic regression (LoR) [35,41], support vector machine (SVM) [42,43], and linear regression (LiR) [44,45] are the commonly used quantitative methods. In this study, we propose the application of LiR, LoR, and SVM to landslide susceptibility mapping. Numerous studies have reported that these models perform better than other conventional ML techniques [46,47]. The SVM model maximizes the margin between classes, which makes it slightly more efficient. It also supports kernels, which allow non-linear relationships to be modeled: the covariates are transformed non-linearly into a high-dimensional space in which distinct classes can be separated linearly [48,49]. The LiR aims to identify the best-fitting line; it is used to handle regression problems and shows how landslide susceptibility changes as the standard deviation of the independent variables (predictors) changes [50]. The LoR fits the values of the linear predictor to a sigmoid curve and maximizes the posterior class probability. It uses independent variables to estimate the likelihood of an event occurring on any given piece of land. It is crucial in LoR that the dependent variable is dichotomous. The independent variables in this model can be measured on a nominal, ordinal, interval, or ratio scale and are predictors of the dependent variable. The dependent variable and independent variables have a nonlinear relationship [42,51].
Many studies have compared the performance of different models for evaluating and analyzing landslide susceptibility. For example, Hong et al. [39] conducted a study using 187 landslide locations and 14 landslide conditioning factors and concluded that the prediction capability is 81.1%, 84.2%, and 93.3% for KLR (Kernel Logistic Regression), SVM, and ADT (Alternating Decision Tree), respectively [47].
The Analytical Hierarchy Process (AHP) and Logistic Regression (LoR) were compared with the combined fuzzy and SVM hybrid model. The results indicated that the combined fuzzy and SVM method with an accuracy of 85.73% performed better than AHP and LR. In another study in Inje, Korea, Park et al. [41] compared four models, Frequency Ratio (FR), AHP, LoR, and ANN methods with the AUC values 0.794, 0.789, 0.794, and 0.806, respectively, suggesting that the ANN led to a better result compared to the other three models. The main goal of this research is to produce GIS-based LS maps over a broader scale of the Abbottabad district using three different machine learning (ML) techniques, including LiR, LoR, and SVM, and compare their accuracy. An extensive database of landslide inventory and influencing factors is formulated for training and validating the LS mapping. Fourteen causative factors broadly grouped into geomorphological, geological, hydrological, and topographical factors were used in this research. The factor analysis is performed to identify the critical parameters by weight. The LS models were validated using validation datasets based on the receiver operating characteristic (ROC) curve, and the area under the curve (AUC). The accuracy assessment is conducted at the end to validate the generated LS maps. There is extensive work previously performed by authors in the northern areas of Pakistan. Qing et al. [52] assessed the debris flow susceptibility mapping along the China-Pakistan Karakoram highway using support vector classification (SVC), and Ali et al. [53] assessed the LS mapping using AHP along the China-Pakistan Economic Corridor. Basharat et al. [54] produced an LS map covering a smaller portion of the study area using the weighted overlay method. Kamp et al. [15] presented landslide susceptibility analysis based on AHP. Torizin et al. 
[55] investigated the landslide susceptibility of the Mansehra and Torghar districts using the weight of evidence (WofE) method. These studies mostly rely on traditional quantitative and decision-making approaches, which are less precise than ML methods. The present study applies ML models, which are expected to improve the accuracy of susceptibility maps. To our knowledge, no prior LS mapping has been performed in the Abbottabad district. This study fills that gap by identifying the landslide-prone areas in the study area. The outcomes may help disaster management authorities, researchers, government, planners, and decision-makers in land use planning to prevent casualties, economic losses, and depletion of land resources in the study area.
2. Materials and Methods
2.1. Study Area
The study area is the Abbottabad district, located in the Khyber Pakhtunkhwa province of Pakistan. It is centered at 34.1688° N, 73.2215° E, as shown in Figure 1, and covers an area of 1969 km² with a population of 1,332,912 [56]. The Abbottabad district is situated to the south-west of the Muzaffarabad district, where the epicenter of the devastating 2005 Kashmir earthquake is located. The maximum elevation in the region is 2957 m above sea level. The region has a fragile geology of igneous, metamorphic, and sedimentary rocks. According to the classification of tectonostratigraphic zones by Gansser et al. (1964) [57], the study area is part of the Lesser Himalayan fold and thrust belt, bounded to the south by the main boundary thrust (MBT) and to the north by the main mantle thrust (MMT) [58]. The Panjal thrust, Nathia Gali thrust fault, Gandghar fault, Kuzagali fault, and MBT run across the Abbottabad district. The Panjal thrust fault trends in a northeast-southwest direction, dipping south-east in most northern regions and north-west in the south-western regions.
The Nathia Gali thrust fault is oriented roughly towards the southwest with a northwest dip direction. The MBT is north-south-oriented with a dip direction towards the southwest. This area is mainly governed by northwest-southeast-oriented compressional stresses, which make the area tectonically active with high seismicity. The Dor and Salhad Nala rivers, which flow through the eastern part of the Abbottabad district from north to south, represent the district’s most important drainage system. During the summer monsoon, these tributaries cause temporary fluctuations in the river discharge system. The annual mean precipitation is 1262 mm. Precipitation increases during the monsoon season from July to September, resulting in frequent floods and making the study area susceptible to landslides. In summary, the study area is highly exposed to slope failures due to intense monsoon rainfall, the ruggedness of the terrain, intermittent earthquakes, and anthropogenic activities. Due to these conditions, the Abbottabad district is marked by a high level of geohazards. The further occurrence of landslides is a major threat that could cause economic losses and casualties.
Figure 1. Location map of the study area showing Abbottabad district along with the distribution of faults and inventory of 116 landslides derived from Landsat 8 pre- and post-event imagery.
2.2. Methodological Framework
The methodological framework of this study is illustrated in Figure 2. The different steps involved are detailed in the following sections.
Figure 2. Methodology flowchart used in the preparation of susceptibility map (NDWI is Normalized Difference Water Index, LCCS is Landcover Classification System, TRI is Topographic Roughness Index, FAO is Food and Agricultural Organization, and NDVI is Normalized Difference Vegetation Index).
2.3. Landslide Inventory Dataset
A landslide inventory is essential for LS mapping because it helps to understand the relationship between the distribution of landslide occurrences and causative factors [39,59]. Past and present landslide events are the key to forecasting future landslide occurrences [20,60]. Data such as historical landslide records, satellite images, field surveys, literature, and aerial photographs can be used to prepare the landslide inventory map. For this paper, landslide location polygons (and their centroids) were mapped using Landsat satellite imagery, whose resolution is suitable for medium- and large-scale slope failures. The inventory was prepared by mapping these landslide locations on Landsat imagery acquired before and after the major seismic and rainfall events since 2005. Both earthquake-induced and rainfall-induced landslides were considered for the inventory mapping. The inventory contains polygons of mass movements. The spatial distribution of the landslide polygons derived from the satellite data was also verified against ground truth from a field survey, which was used to confirm the location and extent of the landslides. The locations were also validated by visiting the Poona landslide, Havelian, shown in Figure 3. This landslide was triggered by the moment magnitude (Mw) 7.5 earthquake (with an epicenter in Afghanistan) that struck the northern area of Pakistan on 26 October 2015 [61]. Landslides are not classified by type in this research because information on landslide type is unavailable and it is challenging to distinguish types for a large inventory. Previous studies [62–64] also used mass movement inventory data for LS mapping. A total of 116 landslide polygons were developed in the study area.
Due to the low visibility and the small size of the inventory map, the 116 polygons were converted and depicted as points on the study area map in Figure 1. An equal number (116) of non-landslide points were generated at random locations in ArcGIS using the random point tool, and the resulting 232 landslide and non-landslide points were split into training (70%) and testing (30%) datasets, as illustrated in Figure 4. The models were trained using the training dataset, and the testing dataset was used for assessing and validating their accuracy.
2.4. Landslide Conditioning Factors Dataset
Many natural and anthropogenic factors contribute to landslide activity, and these factors must be considered when examining landslide susceptibility in a local context. There are no standard rules or guidelines for selecting the landslide causative factors, and the selection depends greatly on data availability and on the local conditions of an area [59,65]. The causative factors are broadly grouped into geological, hydrological, topographical, geomorphological, and meteorological factors. In this study, a total of 14 causative factors, namely elevation, slope degree, slope aspect, curvature, plan curvature, profile curvature, landcover classification system (LCCS), normalized difference water index (NDWI), normalized difference vegetation index (NDVI), soil, lithology, fault density, topographic roughness index (TRI), and road density, are considered for the landslide susceptibility analysis.
Figure 3. Google Earth Pro 7.3 was used to generate (A,B): pre-landslide imagery (2014) (A) and post-landslide imagery (B). Pictures (C,D) show field imagery of the Poona landslide, Havelian, in the study area.
The slope aspect map shows the orientations of the slopes. TRI is a morphological factor that is commonly used in landslide analysis. It is computed from the DEM using the methodology developed by [66].
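The text cites [66] for the TRI computation without giving the formula. As a minimal sketch, the block below assumes one widely used formulation, in which the TRI of a cell is the square root of the summed squared elevation differences between the cell and its eight neighbours. Whether this is the exact variant used in the study is an assumption, and the function name and the tiny example grids are hypothetical.

```python
import math


def terrain_ruggedness_index(dem):
    """Return a TRI grid for a 2D list of elevations.

    Assumed formulation: sqrt of the sum of squared differences between
    a cell and its 8 neighbours. Edge cells have no full neighbourhood
    and are left as None.
    """
    rows, cols = len(dem), len(dem[0])
    tri = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = dem[r][c]
            squared_sum = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # skip the centre cell itself
                    squared_sum += (dem[r + dr][c + dc] - centre) ** 2
            tri[r][c] = math.sqrt(squared_sum)
    return tri


# Hypothetical 3 x 3 elevation windows (metres), for illustration only.
flat = [[100.0, 100.0, 100.0], [100.0, 100.0, 100.0], [100.0, 100.0, 100.0]]
peak = [[100.0, 100.0, 100.0], [100.0, 110.0, 100.0], [100.0, 100.0, 100.0]]

print(terrain_ruggedness_index(flat)[1][1])
print(terrain_ruggedness_index(peak)[1][1])
```

A flat window yields zero ruggedness, while an isolated 10 m peak yields a clearly positive value, which matches the reading of TRI as a terrain roughness measure.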
The curvature of an area shows convex, concave, and flat surfaces. Profile curvature describes the acceleration and deceleration of water flowing down the slope and influences erosion and deposition. Plan curvature is the curvature perpendicular to the slope direction; it affects the convergence and divergence of flow. Roads disturb the stability of slopes through the tremors caused by vehicles, and cutting the slope for road construction and the additional load cause instability that promotes landslides [59]. The shuttle radar topography mission (SRTM) digital elevation model (DEM), with a spatial resolution of 30 m, was used to derive the factors of slope angle, slope aspect, curvature, elevation, profile curvature, TRI, and plan curvature. The tiles of the SRTM DEM (30 m resolution) for the study area were mosaicked together in ArcGIS to produce a sink-free DEM. Landsat-8 images with a 30 m spatial resolution were obtained from USGS Earth Explorer and were used to derive the NDWI and NDVI factors. The NDWI is a causative factor, and higher NDWI values denote higher moisture content. It is calculated from the Landsat-8 data as:
$$\mathrm{NDWI} = \frac{\mathrm{Green} - \mathrm{NIR}}{\mathrm{Green} + \mathrm{NIR}} \qquad (1)$$
The NDVI indicates vegetation density and is also derived from Landsat-8 data. It is calculated as:
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}} \qquad (2)$$
Fault density and lithology were extracted from geological maps of Pakistan obtained from the Survey of Pakistan. The roads were digitized from Google Maps. The line density tool in ArcMap was used to calculate the density of roads and faults. The LCCS map of the study area was extracted from the landcover map of the Himalayas region (FAO-GLCN program). The soil map of the study area was acquired from the FAO data. The thematic maps of these factors were prepared in an ArcGIS environment.
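Equations (1) and (2) share the same normalized-difference form, which the following minimal sketch makes explicit for a single pixel. The function names and the reflectance values are hypothetical. On Landsat-8 OLI the green, red, and near-infrared bands are bands 3, 4, and 5, although the source does not state which band files were used.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None where the sum is zero."""
    denominator = a + b
    if denominator == 0:
        return None  # index is undefined for a zero denominator
    return (a - b) / denominator


def ndwi(green, nir):
    """Equation (1): NDWI = (Green - NIR) / (Green + NIR)."""
    return normalized_difference(green, nir)


def ndvi(nir, red):
    """Equation (2): NDVI = (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


# Hypothetical surface reflectance values for one vegetated pixel.
green, red, nir = 0.10, 0.05, 0.45

print("NDWI:", ndwi(green, nir))
print("NDVI:", ndvi(nir, red))
```

Both indices fall between -1 and 1. For a vegetated pixel such as this one, the NDVI is positive and the NDWI is negative, because near-infrared reflectance dominates both ratios.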
The conditioning factors with different resolutions were resampled to a 30 m resolution to match the factors derived from the SRTM DEM and Landsat-8. The data were standardized and normalized before further processing, which structures the data and minimizes redundancy in the dataset. The raster layer of each conditioning factor was standardized into five classes, and these classes were assigned a weight from 1 (very low) to 5 (very high) depending on their importance in triggering landslides. All the maps prepared in this study use the WGS 1984 datum and the UTM zone 43 projection system.
Figure 4. Historical landslide and generated non-landslide points used for testing and training in the study.
2.5. Susceptibility Modeling Techniques
RStudio was used to implement the LoR, LiR, and SVM models. Following the training of the models, the final landslide susceptibility maps were generated with the weighted overlay (WOM) technique in ArcMap 10.8. The WOM tool creates maps by overlaying several raster layers, with each raster layer given a weight based on its importance [57]. The ML models were trained using a 10-fold cross-validation process.
2.5.1. Linear Regression
A multiple linear regression model, which includes two or more independent variables, is used to predict the variance of the landslide susceptibility. The linear regression model depicts the changes in landslide susceptibility with the change in the standard deviation of the predictor variables. The equation of the multiple regression model is:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon \qquad (3)$$
The right side of the equation is a sum of terms that are linear in the parameters, apart from epsilon (ε, the error term).
Y is the dependent variable, which depends on the presence or absence of landslides; β0 is the intercept and has a fixed value in the regression equation; β1 to βn are the coefficients (weights); X1 to Xn represent the independent variables; and ε represents the model error term.
2.5.2. Logistic Regression
Logistic regression also allows the relationship between the dependent variable and a set of independent parameters to be evaluated. Unlike in linear regression, the dependent variable in the LoR is dichotomous; in this paper it is the presence or absence of landslides. The independent variables can be numerical, categorical, or both [67,68]. There is a non-linear relationship between the independent and dependent variables [69]. The relationship between landslide occurrence and its dependency on several variables can be expressed quantitatively as:
$$P = \frac{1}{1 + e^{-Z}} \qquad (4)$$
where P represents the probability of landslide occurrence. On an S-shaped curve, the probability ranges from 0 to 1. Z represents the linear combination of the independent variables, so the LoR involves fitting an equation of the following form to the data:
$$Z = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n \qquad (5)$$
Here Z is the linear predictor whose logistic transform gives the probability of landslide presence (1) versus absence (0); b0 is the intercept of the model; b1 … bn are the coefficients of the LoR model; and x1 … xn represent the independent variables.
2.5.3. Support Vector Machine
A support vector machine is a supervised ML method. It is based on statistical learning and optimization theories [70], which provide a non-linear perspective on regression and classification problems by mapping the input variables into a high-dimensional attribute space [70]. SVM focuses on the extreme data points, known as support vectors, and draws a decision boundary (a hyperplane) between them to separate the landslide and non-landslide classes.
There are different kernel functions for various decision functions, which allow support vector classifiers to be found systematically in higher dimensions. In the Linear Support Vector Machine (LSVM), the classes are linearly separable and have a linear decision boundary. SVM can increase prediction accuracy and lower the model complexity and test error by avoiding overfitting [71–73]. SVM uses different kernel functions to map the data into a higher-dimensional space. The most popular kernel functions are the linear, polynomial, radial, and sigmoid kernels. One of the most widely used kernels for landslide modeling is the radial kernel function, which is also used in the present study. The radial kernel function is given by:
$$k(x, y) = e^{-\gamma \lVert x - y \rVert^{2}} \qquad (6)$$
where γ is the kernel width parameter.
2.6. Model Evaluation and Accuracy Assessment
The receiver operating characteristic (ROC) curve is used to evaluate the overall performance of the models. The ROC curve is constructed by plotting the sensitivity against 1 − specificity in a two-dimensional space [74–76]. The ROC curve technique is appealing because it is unaffected by changes in the distribution of classes: the curve remains unchanged when the proportions of landslide and non-landslide points in the validation dataset are changed. The area under the ROC curve (AUC) is a summary measure of the ROC analysis that assesses the landslide models’ prediction capabilities using the validation data. An AUC equal to 1 indicates a flawless model, whereas an AUC of 0.5 indicates a non-informative model that performs no better than chance. The landslide model performs best when the AUC value is close to 1 [77–79]. For the accuracy assessment of the final LS maps, the landslide inventory was overlaid on the maps to see how many landslides fall in high landslide susceptibility areas.
3. Results
3.1. Thematic Maps of Conditioning Factors
The LCCS of the study area is categorized into agriculture, bare areas, natural herbaceous, trees, shrubs, urban areas, and water bodies, as can be seen from Figure 5a. Agriculture constitutes a large portion of the study area, followed by natural trees in the eastern region. The soil type map of the study area depicted in Figure 5b shows that the soil in the study area consists of sand, silt, and clay. The most dominant soil type is sand, followed by silt in the south-eastern region. The presence of water bodies is marked by an NDWI greater than 0.5, as shown in Figure 5c. The areas with less water content are marked by a positive value between 0 and 0.2. The NDWI map is categorized into high and low classes. The central region of the study area shows a lower NDWI value and is less prone to landslides due to the built-up area and low moisture content. Slope angles range from 0 to 89 degrees. The north-eastern and eastern parts of the study area tend to have steep slopes, while the slope angle in the central part tends to be lower, as can be observed from Figure 5d. The lithology of the study area comprises various units, classified into three classes: dolomite, schist, and sandstone, as illustrated in Figure 5e. The NDVI map is classified into high and low values, as depicted in Figure 5f. The highest values represent denser vegetation, and bare soil has a value close to zero. Vegetated areas have a positive NDVI value between 0.1 and 0.7. The NDVI values are lower in the western region and higher in the eastern region because of the dense vegetation in the east. The elevation of the study area is also classified into high and low values, with the eastern region of the study area having a higher elevation, as can be observed from Figure 5g. Fault density is also classified into high and low. There are five active faults that run across the Abbottabad district.
The eastern and south-eastern areas are marked by the presence of numerous faults and are prone to landslides, as depicted in Figure 5h. Proximity to roads increases the susceptibility of slopes to landslides. The road density of the study area is shown in Figure 5i. The curvature is categorized into higher and lower values, as illustrated in Figure 5l. Convex surfaces are marked by a positive curvature value. In contrast, a negative curvature value indicates a concave surface, and intermediate values indicate a flat-lying surface. The profile curvature of the study area is shown in Figure 5j. Negative values represent upwardly convex surfaces, while upwardly concave surfaces tend to have positive values. The plan curvature of the study area is represented in Figure 5k. Positive values show laterally convex surfaces, while laterally concave surfaces are represented by negative values. The slope aspect map of the study area is presented in Figure 5m. Hill slopes oriented towards the south-west are the most susceptible to landslide occurrence, followed by north-west-oriented hill slopes, as these slopes receive the highest amounts of seasonal monsoon precipitation. Figure 5n depicts the TRI of the research area. A high TRI signifies rough terrain, whereas a lower value indicates relatively less rough terrain.
Figure 5. Landslide conditioning factor maps used in this study: (a) LCCS, (b) soil type (from bedrock erosion), (c) NDWI, (d) slope, (e) lithology, (f) NDVI, (g) elevation, (h) fault density, (i) road density, (j) profile curvature, (k) plan curvature, (l) total curvature, (m) aspect, (n) TRI.
3.2. Conditioning Factor Analysis
The weights of the 14 conditioning factors obtained from the different ML techniques are shown in Table 1. Table 1 shows that the weight of the same conditioning factor varies between models.
The weights were derived by processing the landslide inventory along with the thematic layers of the conditioning factors in the ML techniques. The weights of the variables were computed with the caret library in RStudio by calculating the relative importance of each variable. The relative importance of each conditioning factor was taken as its weight in each of the three models. According to the results of the LiR model, slope, lithology, soil, and curvature, each with a weight of 9%, are the most crucial parameters for landslide events. For the LoR model, lithology, slope, NDWI, and LCCS (land cover) are vital parameters, each with a weight of 9%. In SVM, NDWI and lithology have the highest importance, each with a weight of 10%, and LCCS and aspect are the second most important parameters, each with a weight of 9%. In general, the factors with the greatest influence on landslides in the study region are lithology, NDWI, slope, and LCCS, while profile curvature played the smallest role in triggering landslides.
Table 1. Weights (%) of conditioning factors from three ML models.
| Model | Aspect | Curvature | Elevation | Lithology | NDVI | NDWI | TRI | Plan Curvature | Profile Curvature | Slope | Faults | Roads | Soil | LCCS | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LoR | 6 | 9 | 8 | 9 | 6 | 9 | 7 | 4 | 5 | 9 | 7 | 5 | 7 | 9 | 100 |
| SVM | 9 | 6 | 7 | 10 | 8 | 10 | 6 | 5 | 3 | 8 | 6 | 5 | 8 | 9 | 100 |
| LiR | 5 | 9 | 8 | 9 | 8 | 8 | 8 | 5 | 4 | 9 | 6 | 5 | 9 | 7 | 100 |
3.3. Landslide Susceptibility (LS) Maps
The LS maps were created by multiplying the derived weights with the factors through the weighted overlay in the GIS environment. The final susceptibility map was split into five susceptibility classes using the Equal Intervals classification technique: very low, low, medium, high, and very high.
3.3.1. Linear Regression (LiR)
The LS map derived from the LiR model is illustrated in Figure 6.
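The weighted overlay and equal-interval classification described in Section 3.3 can be illustrated for a single raster cell. The sketch below is not the ArcMap tool itself: it assumes the overlay is a weighted mean of the 1 to 5 class scores and that the five equal intervals span the full 1 to 5 range. The weights are the LiR row of Table 1, while the cell scores and function names are hypothetical.

```python
# LiR weights (%) as listed in Table 1 of the source.
LIR_WEIGHTS = {
    "aspect": 5, "curvature": 9, "elevation": 8, "lithology": 9,
    "ndvi": 8, "ndwi": 8, "tri": 8, "plan_curvature": 5,
    "profile_curvature": 4, "slope": 9, "faults": 6, "roads": 5,
    "soil": 9, "lccs": 7,
}

CLASS_LABELS = ["very low", "low", "medium", "high", "very high"]


def weighted_overlay(scores, weights):
    """Weighted mean of per-factor class scores (1..5) for one cell."""
    total = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total


def equal_interval_label(value, low=1.0, high=5.0):
    """Assign a value in [low, high] to one of five equal-width classes."""
    width = (high - low) / len(CLASS_LABELS)
    index = int((value - low) / width)
    # Clamp so that value == high falls in the top class.
    index = min(max(index, 0), len(CLASS_LABELS) - 1)
    return CLASS_LABELS[index]


# Hypothetical cell: moderate everywhere, but steep, weak rock, fairly wet.
cell = {name: 3 for name in LIR_WEIGHTS}
cell.update({"slope": 5, "lithology": 5, "ndwi": 4})

value = weighted_overlay(cell, LIR_WEIGHTS)
print(round(value, 2), equal_interval_label(value))
```

In practice the same arithmetic is applied to every cell of the reclassified rasters. GIS tools usually derive the equal intervals from the minimum and maximum of the output raster, so the fixed 1 to 5 range used here is a simplification.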
Medium landslide susceptibility is observed for the western area, while the central and southern regions are marked by a high to very high susceptibility. In contrast, the marginal areas in the west and the higher mountain regions in the east are much less susceptible to landslide activity.

Figure 6. Landslide susceptibility map based on LiR model.

3.3.2. Logistic Regression (LoR)

The LS map produced by the LoR model is shown in Figure 7. The southern and north-western regions show high to very high susceptibility, and the central part exhibits medium susceptibility. In contrast, the north-eastern region of the study area exhibits very low to medium susceptibility.

Figure 7. Landslide susceptibility map based on LoR model.

3.3.3. Support Vector Machine

The LS map produced by the SVM model is depicted in Figure 8. The southern region exhibits a high to very high susceptibility, and the marginal areas in the west and east have medium susceptibility. Very low and low susceptibility regions lie in the middle part of the district towards the east side.

Figure 8. Landslide susceptibility map based on SVM model.

3.4. Model Validation

The AUC for the three models is calculated using the testing dataset, and the resulting curves are shown in Figure 9. To generate each curve, the sensitivity (true positive rate) is plotted against 100 − specificity (false positive rate) at different threshold values. The LiR model (0.88) achieved a higher AUC prediction rate than the SVM (0.86), followed closely by the LoR (0.85), which makes LiR the most accurate of the three models. The higher the AUC value, the higher the accuracy of the model.

Figure 9. ROC curve for the three landslide susceptibility models.

3.5. Accuracy Assessment

To assess the outcomes of the landslide susceptibility analysis, the historical landslide positions were overlaid on the LS maps produced by the different ML models (see Figures 6–8). The accuracy assessment shows that the LiR model attained an accuracy of 85%, followed by the SVM model at 83% and the LoR model at 79%.

4. Discussion

Remote sensing and GIS technologies have been effectively used to assess landslide susceptibility with a variety of methods. The first step in this work was to produce and validate a landslide inventory. For this purpose, Landsat satellite imagery was used to develop the landslide inventory and to compare pre- and post-landslide conditions, and the landslide locations were verified by field survey. The landslide points, along with an equal number of non-landslide points, were used as training (70%) and testing (30%) data.

The second step involved the preparation of LS maps. In this study, 14 causative factors, including land cover, type of soil cover, NDWI, slope, lithology, NDVI, elevation, fault density, road density, TRI, curvature, profile curvature, plan curvature, and aspect, were considered and were standardized and normalized in an ArcGIS environment. The contributing factors were selected on the basis of expert judgment, a critical review of the existing literature, and the landslide inventory. Their weighting during the normalization process was also based on these criteria. The 14 chosen factors were then weighted using the considered ML models. Each model showed variations in the weights for each factor, as can be seen from Table 1. The conditioning factors considered the most influential by all the models were lithology, NDWI, slope, and LCCS. The high susceptibility areas occurred on steep slopes and weak lithologies.
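The sampling and validation steps described above (equal numbers of landslide and non-landslide points, a 70/30 training/testing split, and ROC-based validation) can be sketched in plain Python. This is an illustrative sketch, not the authors' code, which relied on the caret library in R. The function names are hypothetical, splitting each class separately is an assumption (the text does not say whether the split was stratified), and the 116/116 sample counts are the ones reported in the Conclusions. The AUC is computed from its rank interpretation: the probability that a randomly chosen landslide point receives a higher score than a randomly chosen non-landslide point.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), splitting each class separately."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    return sorted(train_idx), sorted(test_idx)


def roc_auc(labels, scores):
    """AUC as the share of (landslide, non-landslide) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A tie between a positive and a negative score counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 116 landslide (1) and 116 non-landslide (0) points, as reported in the paper.
labels = [1] * 116 + [0] * 116
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 162 training and 70 testing points

# Hypothetical scores for four test points: three of four pairs are ordered correctly.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
print(auc)  # 0.75
```

Splitting per class keeps the testing set balanced (35 landslide and 35 non-landslide points here), so the ROC curve is not dominated by one class.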
High susceptibility is found to be characteristic of slopes highly exposed to seismic and climatic effects. The climatic influence is represented mainly by the NDWI and landcover factors, while the seismic influence depends on the fault density, which contributes to the LS in our study area. This argument is supported by multiple examples of previous landslide occurrences in the study area. One example in which both seismic and climatic influences are dominant is the Poona landslide, Havelian, presented in Figure 2. The Poona landslide occurred in November 2015, approximately one month after the October 2015 earthquake; seismic shaking is supposed to have initiated landslide movements that were then intensified by subsequent rainfall, which finally triggered a massive failure. Because of their major contribution to initiating landslides, these highly relevant conditioning factors can be carried over to LS mapping in other areas in the future.

The study area is categorized into five LS classes: very low, low, medium, high, and very high, as can be seen from Figures 6–8. The high susceptibility areas occurred on steep slopes and weak lithologies. Additionally, the regions categorized as high or very high LS correspond to zones with high moisture content, and they align closely with the historical landslides. The areas falling under the different susceptibility classes are summarized graphically in Figure 10. Figure 10 illustrates that most of the study area is classified as a very low susceptibility zone by all three models. The maps produced by the three models show that the five LS classes differ in location and areal percentage, as can be seen from Figures 6–8 and 10.

The third step involved the validation of the models.
A trial-and-error process was carried out to tune the ML methods and determine the model settings with the best estimation performance. Ten-fold cross-validation was used to avoid overfitting. All the models yielded very high prediction accuracy, with AUC values between 0.85 and 0.88; the LiR model has the highest AUC value.

Figure 10. Histogram of the susceptible areas from the different models that fall into the various classes.

The LiR model outperformed the SVM and LoR models. Still, the SVM model assigns a larger proportion of the area to the very high susceptibility class, while the LiR model assigns larger proportions than the other models to the high and medium susceptibility classes. The close affinity of the considered landslide conditioning factors with landslide occurrence resulted in the high accuracy of the models in predicting landslides. Moreover, no multicollinearity was present among the considered conditioning factors. The availability of high-quality data has a significant impact on the results, and the accuracy of the result improves as the number and suitability of factors increase. The selection of contributing factors thus proved to be appropriate. Kalantar et al., 2018 [80] also established that data quality plays an important role in the performance of ML models. The authors employed 14 conditioning factors and used SVM, LoR, and artificial neural networks to assess the effects of different training data on landslide susceptibility mapping. They achieved an overall accuracy of 79.82% for SVM and 81.42% for LoR, which is lower than the AUC-based prediction rates obtained for LoR (85%) and SVM (86%) in the present study.

SVM remains effective when the number of dimensions exceeds the number of samples, whereas LoR usually requires a sufficiently large sample size to predict accurately. When there is a small amount of training data and many features, both perform well.
The LiR assumes no collinearity and normally distributed input features, which may not be the case. Because linear and logistic regression are more susceptible to outliers than SVM, SVM outperforms LoR by a slight margin. Linear regression requires preprocessing to remove multicollinearity, handle outliers, and reduce dimensionality.

This region is affected by frequent landslides, and no proper study has previously been performed in the study area, so this methodology can serve as a baseline for upcoming studies. A limitation is that feature extraction was not performed in this study. For future work, we recommend feature extraction with deep learning convolutional neural networks, which may improve the results. In addition, McNemar’s test could be used in future analysis to assess the statistical significance of the differences between ML models. Further, we recommend using deep learning to avoid the uncertainties in the factors caused by subjective judgment, following the work of [37], to highlight the performance of the hybrid approach of a combined fuzzy and SVM model in the area. In future work, we will focus more on risk analysis, incorporate temporal factors into the results, and examine their effect.

5. Conclusions

This study employed three ML models, namely LiR, LoR, and SVM, to produce LS maps for the Abbottabad district of Khyber Pakhtunkhwa, Pakistan. The main stages of this study were landslide inventory preparation, selection and processing of the conditioning factors, susceptibility mapping, validation of the models, and accuracy assessment. A total of 14 conditioning factors were prepared, including LCCS, soil type, NDWI, slope, lithology, NDVI, elevation, fault density, road density, curvature, profile curvature, plan curvature, TRI, and aspect. The landslide inventory map comprised 232 samples, of which 116 were non-landslide and 116 were landslide locations.
These samples were used to calculate the weights of the conditioning factors with the LiR, LoR, and SVM models. The results reveal that the most influential factors are lithology, NDWI, slope, and LCCS. The weights of all conditioning factors were then combined through the weighted overlay technique to prepare the final landslide susceptibility maps. The study area is subject to landslides induced by both seismic and climatic events. The areas of high susceptibility are marked by high, steep slopes with weaker lithologies and are exposed to high seismic shaking potential. The results indicate that most of the area falls in the very low susceptibility class. The AUC values of all the models were satisfactory. However, the LiR model achieved better results overall and stood above the other models in both model validation and the accuracy of the produced susceptibility maps. The outcomes of this research will provide essential information to researchers, authorities, and planners involved in decision making, land management, and hazard mitigation in the Abbottabad district.

Author Contributions: Conceptualization, I.U. and B.A.; methodology, A.T. and I.U.; software, I.U. and A.T.; validation, A.T., B.A. and I.U.; formal analysis, S.H.I.A.S., A.T. and B.A.; investigation, A.T. and I.U.; resources, S.H.I.A.S., A.T. and S.Q.; data curation, I.U. and A.T.; writing (original draft preparation), I.U., S.Q., M.M. and A.T.; writing (review and editing), A.T., H.-B.H., I.U., S.Q. and B.A.; visualization, A.T. and B.A.; supervision, A.T.; project administration, S.Q.; funding acquisition, S.Q. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded by Postdoctoral Research Foundation of China (grant no. 2020M682477) and the Fundamental Research Funds for the Central Universities (grant no. 2042021kf0053).
Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: The data presented in this study are available on request from the first or corresponding authors.

Acknowledgments: We acknowledge the anonymous reviewers and editors of the journal’s special issue, who provided constructive comments that helped improve the final version of the manuscript. Alban Kuriqi acknowledges the Portuguese Foundation for Science and Technology (FCT) support through PTDC/CTA-OHR/30561/2017 (WinTherface).

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Varnes, D. Slope Movement Types and Processes. Transp. Res. Board Spec. Rep. 1978, 176, 11–33.
2. Farooq, K.; Rogers, J.D.; Ahmed, M.F. Effect of Densification on the Shear Strength of Landslide Material: A Case Study from Salt Range, Pakistan. Earth Sci. Res. 2015, 4, 113–125. https://doi.org/10.5539/esr.v4n1p113.
3. Das, I.; Stein, A.; Kerle, N.; Dadhwal, V.K. Landslide susceptibility mapping along road corridors in the Indian Himalayas using Bayesian logistic regression models. Geomorphology 2012, 179, 116–125. https://doi.org/10.1016/j.geomorph.2012.08.004.
4. Haque, U.; Blum, P.; da Silva, P.F.; Andersen, P.; Pilz, J.; Chalov, S.R.; Malet, J.P.; Auflič, M.J.; Andres, N.; Poyiadji, E.; et al. Fatal landslides in Europe. Landslides 2016, 13, 1545–1554. https://doi.org/10.1007/s10346-016-0689-3.
5. Papoutsis, I.; Kontoes, C.; Alatza, S.; Apostolakis, A.; Loupasakis, C. InSAR greece with parallelized persistent scatterer interferometry: A national ground motion service for big copernicus sentinel-1 data. Remote Sens. 2020, 12, 3207. https://doi.org/10.3390/rs12193207.
6. USAID; UCL. Natural disasters in 2017: Lower mortality, higher cost. Cent. Res. Epidemiol. Disasters 2018.
Retrieved from: https://reliefweb.int/report/world/cred-crunch-newsletter-issue-no-50-march-2018-natural-disasters-2017-lower-mortality (Accessed on 24 April 2020).
7. Chen, W.; Chen, Y.; Tsangaratos, P.; Ilia, I.; Wang, X. Combining evolutionary algorithms and machine learning models in landslide susceptibility assessments. Remote Sens. 2020, 12, 3854. https://doi.org/10.3390/rs12233854.
8. Zillman, J. The Physical impact of Disasters. In Natural Disaster Management. Leicester; Ingleton, J., Ed.; Tudor Rose Holdings Ltd.: Leicester, UK, 1999; p. 320.
9. Feizizadeh, B.; Blaschke, T. Landslide Risk Assessment Based on GIS Multi-Criteria Evaluation: A Case Study in Bostan-Abad County Iran. J. Earth Sci. Eng. 2011, 1, 66–71.
10. Tsironi, V.; Ganas, A.; Karamitros, I.; Efstathiou, E.; Koukouvelas, I.; Sokos, E. Kinematics of Active Landslides in Achaia (Peloponnese, Greece) through InSAR Time Series Analysis and Relation to Rainfall Patterns. Remote Sens. 2022, 14, 844. https://doi.org/10.3390/rs14040844.
11. Froude, M.J.; Petley, D.N. Global fatal landslide occurrence from 2004 to 2016. Nat. Hazards Earth Syst. Sci. 2018, 18, 2161–2181. https://doi.org/10.5194/nhess-18-2161-2018.
12. Hobbs, J.J.; Salter, C.L. Essentials of World Regional Geography; Brooks/Cole Thomson Learning: Melbourne, Australia, 2006; ISBN 9780534466008.
13. Aslam, B.; Zafar, A.; Khalil, U. Comparison of multiple conventional and unconventional machine learning models for landslide susceptibility mapping of Northern part of Pakistan. Environ. Dev. Sustain. 2022, 1–28. https://doi.org/10.1007/s10668-022-023146.
14. Mustafa, Z.U.; Ahmad, S.R.; Luqman, M.; Ahmad, U.; Khan, S.; Nawaz, M.; Javed, A. Investigating Factors of Slope Failure for Different Landsliding Sites in Murree Area, Using Geomatics Techniques. J. Geosci. Environ. Prot. 2015, 3, 39–45. https://doi.org/10.4236/gep.2015.38004.
15. Kamp, U.; Growley, B.J.; Khattak, G.A.; Owen, L.A. GIS-based landslide susceptibility mapping for the 2005 Kashmir earthquake region.
Geomorphology 2008, 101, 631–642. https://doi.org/10.1016/j.geomorph.2008.03.003.
16. Wei, Z.-L.; Shang, Y.-Q.; Sun, H.-Y.; Xu, H.-D.; Wang, D.-F. The effectiveness of a drainage tunnel in increasing the rainfall threshold of a deep-seated landslide. Landslides 2019, 16, 1731–1744. https://doi.org/10.1007/s10346-019-01241-4.
17. Marjanović, M. Advanced Methods for landslide Assessment Using GIS. Ph.D. Thesis, Palacký University Olomouc, Olomouc, Czechia, 2013; Volume 154, pp. 1–128.
18. Kanwal, S.; Atif, S.; Shafiq, M. GIS based landslide susceptibility mapping of northern areas of Pakistan, a case study of Shigar and Shyok Basins. Geomat. Nat. Hazards Risk 2017, 8, 348–366. https://doi.org/10.1080/19475705.2016.1220023.
19. Ozdemir, A.; Altural, T. A comparative study of frequency ratio, weights of evidence and logistic regression methods for landslide susceptibility mapping: Sultan mountains, SW Turkey. J. Asian Earth Sci. 2013, 64, 180–197. https://doi.org/10.1016/j.jseaes.2012.12.014.
20. Guzzetti, F.; Carrara, A.; Cardinali, M.; Reichenbach, P. Landslide hazard evaluation: A review of current techniques and their application in a multi-scale study, Central Italy. Geomorphology 1999, 31, 181–216. https://doi.org/10.1016/S0169-555X(99)000781.
21. Zêzere, J.L.; Pereira, S.; Melo, R.; Oliveira, S.C.; Garcia, R.A.C. Mapping landslide susceptibility using data-driven methods. Sci. Total Environ. 2017, 589, 250–267. https://doi.org/10.1016/j.scitotenv.2017.02.188.
22. Tariq, A.; Yan, J.; Gagnon, A.S.; Riaz Khan, M.; Mumtaz, F. Mapping of cropland, cropping patterns and crop types by combining optical remote sensing images with decision tree classifier and random forest. Geo-Spat. Inf. Sci. 2022, 1–19. https://doi.org/10.1080/10095020.2022.2100287.
23. Tariq, A.; Mumtaz, F.; Zeng, X.; Baloch, M.Y.J.; Moazzam, M.F.U.
Spatio-temporal variation of seasonal heat islands mapping of Pakistan during 2000–2019, using day-time and night-time land surface temperatures MODIS and meteorological stations data. Remote Sens. Appl. Soc. Environ. 2022, 27, 100779. https://doi.org/10.1016/j.rsase.2022.100779.
24. Shah, S.H.I.A.; Jianguo, Y.; Jahangir, Z.; Tariq, A.; Aslam, B. Integrated geophysical technique for groundwater salinity delineation, an approach to agriculture sustainability for Nankana Sahib Area, Pakistan. Geomat. Nat. Hazards Risk 2022, 13, 1043–1064. https://doi.org/10.1080/19475705.2022.2063077.
25. Farhan, M.; Moazzam, U.; Rahman, G.; Munawar, S.; Tariq, A.; Safdar, Q.; Lee, B. Trends of Rainfall Variability and Drought Monitoring Using Standardized Precipitation Index in a Scarcely Gauged Basin of Northern Pakistan. Water 2022, 14, 1132. https://doi.org/10.3390/w14071132.
26. Ayalew, L.; Yamagishi, H. The application of GIS-based logistic regression for landslide susceptibility mapping in the Kakuda-Yahiko Mountains, Central Japan. Geomorphology 2005, 65, 15–31. https://doi.org/10.1016/j.geomorph.2004.06.010.
27. Kouli, M.; Loupasakis, C.; Soupios, P.; Vallianatos, F. Landslide hazard zonation in high risk areas of Rethymno Prefecture, Crete Island, Greece. Nat. Hazards 2010, 52, 599–621. https://doi.org/10.1007/s11069-009-9403-2.
28. Feizizadeh, B.; Blaschke, T. GIS-multicriteria decision analysis for landslide susceptibility mapping: Comparing three methods for the Urmia lake basin, Iran. Nat. Hazards 2013, 65, 2105–2128. https://doi.org/10.1007/s11069-012-0463-3.
29. Ayalew, L.; Yamagishi, H.; Ugawa, N. Landslide susceptibility mapping using GIS-based weighted linear combination, the case in Tsugawa area of Agano River, Niigata Prefecture, Japan. Landslides 2004, 1, 73–81. https://doi.org/10.1007/s10346-003-0006-9.
30. Sejrup, H.P.; Haflidason, H.; Flatebø, T.; Kristensen, D.K.; Grøsfjeld, K.; Larsen, E.
Late-glacial to Holocene environmental changes and climate variability: evidence from Voldafjorden, western Norway. J. Quat. Sci. 2001, 16, 181–198. https://doi.org/10.1002/jqs.593.
31. Alexakis, D.D.; Agapiou, A.; Tzouvaras, M.; Themistocleous, K.; Neocleous, K.; Michaelides, S.; Hadjimitsis, D.G. Integrated use of GIS and remote sensing for monitoring landslides in transportation pavements: The case study of Paphos area in Cyprus. Nat. Hazards 2014, 72, 119–141. https://doi.org/10.1007/s11069-013-0770-3.
32. Neaupane, K.M.; Piantanakulchai, M. Analytic network process model for landslide hazard zonation. Eng. Geol. 2006, 85, 281–294. https://doi.org/10.1016/j.enggeo.2006.02.003.
33. Hwang, C.-L.; Yoon, K. Multiple Objective Decision Making-Methods and Applications. Lect. Notes Econ. Math. Syst. 1981, 1, 1–358. https://doi.org/10.1007/978-3-642-45511-7.
34. Arabameri, A.; Pradhan, B.; Rezaei, K.; Conoscenti, C. Gully erosion susceptibility mapping using GIS-based multi-criteria decision analysis techniques. Catena 2019, 180, 282–297. https://doi.org/10.1016/j.catena.2019.04.032.
35. Bai, S.B.; Wang, J.; Lü, G.N.; Zhou, P.G.; Hou, S.S.; Xu, S.N. GIS-based logistic regression for landslide susceptibility mapping of the Zhongxian segment in the Three Gorges area, China. Geomorphology 2010, 115, 23–31. https://doi.org/10.1016/j.geomorph.2009.09.025.
36. Corominas, J.; van Westen, C.; Frattini, P.; Cascini, L.; Malet, J.P.; Fotopoulou, S.; Catani, F.; Van Den Eeckhaut, M.; Mavrouli, O.; Agliardi, F.; et al. Recommendations for the quantitative analysis of landslide risk. Bull. Eng. Geol. Environ. 2014, 73, 209–263. https://doi.org/10.1007/s10064-013-0538-8.
37. Chen, W.; Pourghasemi, H.R.; Kornejady, A.; Zhang, N. Landslide spatial modeling: Introducing new ensembles of ANN, MaxEnt, and SVM machine learning techniques. Geoderma 2017, 305, 314–327. https://doi.org/10.1016/j.geoderma.2017.06.020.
38. Oh, H.J.; Lee, S.
Shallow landslide susceptibility modeling using the data mining models artificial neural network and boosted tree. Appl. Sci. 2017, 7, 1000. https://doi.org/10.3390/app7101000.
39. Hong, H.; Pradhan, B.; Xu, C.; Bui, D.T. Spatial prediction of landslide hazard at the Yihuang area (China) using two-class kernel logistic regression, alternating decision tree and support vector machines. Catena 2015, 133, 266–281. https://doi.org/10.1016/j.catena.2015.05.019.
40. Pradhan, B. A comparative study on the predictive ability of the decision tree, support vector machine and neuro-fuzzy models in landslide susceptibility mapping using GIS. Comput. Geosci. 2013, 51, 350–365. https://doi.org/10.1016/j.cageo.2012.08.023.
41. Park, S.; Choi, C.; Kim, B.; Kim, J. Landslide susceptibility mapping using frequency ratio, analytic hierarchy process, logistic regression, and artificial neural network methods at the Inje area, Korea. Environ. Earth Sci. 2013, 68, 1443–1464. https://doi.org/10.1007/s12665-012-1842-5.
42. Yao, X.; Tham, L.G.; Dai, F.C. Landslide susceptibility mapping based on Support Vector Machine: A case study on natural slopes of Hong Kong, China. Geomorphology 2008, 101, 572–582. https://doi.org/10.1016/j.geomorph.2008.02.011.
43. Bui, D.T.; Tuan, T.A.; Hoang, N.D.; Thanh, N.Q.; Nguyen, D.B.; Van Liem, N.; Pradhan, B. Spatial prediction of rainfall-induced landslides for the Lao Cai area (Vietnam) using a hybrid intelligent approach of least squares support vector machines inference model and artificial bee colony optimization. Landslides 2017, 14, 447–458. https://doi.org/10.1007/s10346-016-0711-9.
44. Onagh, M.; Kumra, V.K.; Rai, P.K. Landslide Susceptibility Mapping in a Part of Uttarkashi District (India) By Multiple Linear Regression Method. Int. J. Geol. Earth Environ. Sci. 2012, 2, 102–120.
45. Arabameri, A.; Pradhan, B.; Rezaei, K.; Sohrabi, M.; Kalantari, Z.
GIS-based landslide susceptibility mapping using numerical risk factor bivariate model and its ensemble with linear multivariate regression and boosted regression tree algorithms. J. Mt. Sci. 2019, 16, 595–618. https://doi.org/10.1007/s11629-018-5168-y.
46. Chen, W.; Peng, J.; Hong, H.; Shahabi, H.; Pradhan, B.; Liu, J.; Zhu, A.X.; Pei, X.; Duan, Z. Landslide susceptibility modelling using GIS-based machine learning techniques for Chongren County, Jiangxi Province, China. Sci. Total Environ. 2018, 626, 1121–1135. https://doi.org/10.1016/j.scitotenv.2018.01.124.
47. Meng, Q.; Miao, F.; Zhen, J.; Wang, X.; Wang, A.; Peng, Y.; Fan, Q. GIS-based landslide susceptibility mapping with logistic regression, analytical hierarchy process, and combined fuzzy and support vector machine methods: A case study from Wolong Giant Panda Natural Reserve, China. Bull. Eng. Geol. Environ. 2016, 75, 923–944. https://doi.org/10.1007/s10064-015-0786-x.
48. Aslam, B.; Zafar, A.; Khalil, U. Development of integrated deep learning and machine learning algorithm for the assessment of landslide hazard potential. Soft Comput. 2021, 25, 13493–13512. https://doi.org/10.1007/s00500-021-06105-5.
49. Ballabio, C.; Sterlacchini, S. Support Vector Machines for Landslide Susceptibility Mapping: The Staffora River Basin Case Study, Italy. Math. Geosci. 2012, 44, 47–70. https://doi.org/10.1007/s11004-011-9379-9.
50. Onagh, M.; Kumra, V.; Rai, P. Application of Multiple Linear Regression Model in Landslide Susceptibility Zonation Mapping the Case Study Narmab Basin. Int. J. Geol. Earth Environ. Sci. 2012, 2, 87–101.
51. Lee, S.; Min, K. Statistical analysis of landslide susceptibility at Yongin, Korea. Environ. Geol. 2001, 40, 1095–1113. https://doi.org/10.1007/s002540100310.
52. Qing, F.; Zhao, Y.; Meng, X.; Su, X.; Qi, T.; Yue, D.
Application of machine learning to debris flow susceptibility mapping along the China-Pakistan Karakoram Highway. Remote Sens. 2020, 12, 2933. https://doi.org/10.3390/RS12182933.
53. Ali, S.; Biermanns, P.; Haider, R.; Reicherter, K. Landslide susceptibility mapping by using a geographic information system (GIS) along the China-Pakistan Economic Corridor (Karakoram Highway), Pakistan. Nat. Hazards Earth Syst. Sci. 2019, 19, 999–1022. https://doi.org/10.5194/nhess-19-999-2019.
54. Basharat, M.; Shah, H.R.; Hameed, N. Landslide susceptibility mapping using GIS and weighted overlay method: A case study from NW Himalayas, Pakistan. Arab. J. Geosci. 2016, 9, 292. https://doi.org/10.1007/s12517-016-2308-y.
55. Torizin, J.; Fuchs, M.; Awan, A.A.; Ahmad, I.; Akhtar, S.S.; Sadiq, S.; Razzak, A.; Weggenmann, D.; Fawad, F.; Khalid, N.; et al. Statistical landslide susceptibility assessment of the Mansehra and Torghar districts, Khyber Pakhtunkhwa Province, Pakistan. Nat. Hazards 2017, 89, 757–784. https://doi.org/10.1007/s11069-017-2992-2.
56. Pakistan Bureau of Statistics Census Pakistan. 2017. Retrieved from: https://www.pbs.gov.pk/content/final-results-census-2017 (Accessed on 07 May 2022).
57. Gansser, A. Geology of the Himalayas; Interscience Publishers: London, UK; New York, NY, USA; Sydney, Australia, 1964. (tr. Zurich)
58. Akhtar, S.; Rahim, Y.; Hu, B.; Tsang, H.; Ibrar, K.M.; Ullah, M.F.; Bute, S.I. Stratigraphy and Structure of Dhamtaur Area, District Abbottabad, Eastern Hazara, Pakistan. Open J. Geol. 2019, 9, 57–66. https://doi.org/10.4236/ojg.2019.91005.
59. Youssef, A.M.; Pourghasemi, H.R.; Pourtaghi, Z.S.; Al-Katheeri, M.M. Landslide susceptibility mapping using random forest, boosted regression tree, classification and regression tree, and general linear models and comparison of their performance at Wadi Tayyah Basin, Asir Region, Saudi Arabia. Landslides 2016, 13, 839–856. https://doi.org/10.1007/s10346-015-0614-1.
60. Guzzetti, F.; Reichenbach, P.; Cardinali, M.; Galli, M.; Ardizzone, F.
Probabilistic landslide hazard assessment at the basin scale. Geomorphology 2005, 72, 272–299. https://doi.org/10.1016/j.geomorph.2005.06.002.
61. Ismail, N.; Khattak, N. Observed failure modes of unreinforced masonry buildings during the 2015 Hindu Kush earthquake. Earthq. Eng. Eng. Vib. 2019, 18, 301–314. https://doi.org/10.1007/s11803-019-0505-x.
62. Wu, Y.; Ke, Y.; Chen, Z.; Liang, S.; Zhao, H.; Hong, H. Application of alternating decision tree with AdaBoost and bagging ensembles for landslide susceptibility mapping. Catena 2020, 187, 104396. https://doi.org/10.1016/j.catena.2019.104396.
63. Khan, H.; Shafique, M.; Khan, M.A.; Bacha, M.A.; Shah, S.U.; Calligaris, C. Landslide susceptibility assessment using Frequency Ratio, a case study of northern Pakistan. Egypt. J. Remote Sens. Sp. Sci. 2019, 22, 11–24. https://doi.org/10.1016/j.ejrs.2018.03.004.
64. Wang, Y.; Fang, Z.; Hong, H. Comparison of convolutional neural networks for landslide susceptibility mapping in Yanshan County, China. Sci. Total Environ. 2019, 666, 975–993. https://doi.org/10.1016/j.scitotenv.2019.02.263.
65. Reichenbach, P.; Rossi, M.; Malamud, B.D.; Mihir, M.; Guzzetti, F. A review of statistically-based landslide susceptibility models. Earth-Sci. Rev. 2018, 180, 60–91. https://doi.org/10.1016/j.earscirev.2018.03.001.
66. Riley, S.J.; DeGloria, S.D.; Elliot, R. A Terrain Ruggedness that Quantifies Topographic Heterogeneity. Intermt. J. Sci. 1999, 5, 23–27.
67. Lee, S.; Sambath, T. Landslide susceptibility mapping in the Damrei Romel area, Cambodia using frequency ratio and logistic regression models. Environ. Geol. 2006, 50, 847–855. https://doi.org/10.1007/s00254-006-0256-7.
68. Dai, F.C.; Lee, C.F. Landslide characteristics and slope instability modeling using GIS, Lantau Island, Hong Kong. Geomorphology 2002, 42, 213–228. https://doi.org/10.1016/S0169-555X(01)00087-3.
69. Yesilnacar, E.; Topal, T.
Landslide susceptibility mapping: A comparison of logistic regression and neural networks methods in a medium scale study, Hendek region (Turkey). Eng. Geol. 2005, 79, 251–266. https://doi.org/10.1016/j.enggeo.2005.02.002.
70. Vapnik, V. The support vector method of function estimation. In Nonlinear Modeling; Springer: Boston, MA, USA, 1998; pp. 55–85. https://doi.org/10.1007/978-1-4615-5703-6_3.
71. Tariq, A.; Shu, H.; Kuriqi, A.; Siddiqui, S.; Gagnon, A.S.; Lu, L.; Linh, N.T.T.; Pham, Q.B. Characterization of the 2014 Indus River Flood Using Hydraulic Simulations and Satellite Images. Remote Sens. 2021, 13, 2053. https://doi.org/10.3390/rs13112053.
72. Tariq, A.; Shu, H.; Siddiqui, S.; Mousa, B.G.; Munir, I.; Nasri, A.; Waqas, H.; Lu, L.; Baqa, M.F. Forest fire monitoring using spatial-statistical and Geo-spatial analysis of factors determining forest fire in Margalla Hills, Islamabad, Pakistan. Geomat. Nat. Hazards Risk 2021, 12, 1212–1233. https://doi.org/10.1080/19475705.2021.1920477.
73. Waqas, H.; Lu, L.; Tariq, A.; Li, Q.; Baqa, M.F.; Xing, J.; Sajjad, A. Flash Flood Susceptibility Assessment and Zonation Using an Integrating Analytic Hierarchy Process and Frequency Ratio Model for the Chitral District, Khyber Pakhtunkhwa, Pakistan. Water 2021, 13, 1650. https://doi.org/10.3390/w13121650.
74. Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 2006, 27, 861–874. https://doi.org/10.1016/j.patrec.2005.10.010.
75. Tariq, A.; Shu, H.; Siddiqui, S.; Imran, M.; Farhan, M. Monitoring Land Use and Land Cover Changes Using Geospatial Techniques, A Case Study of Fateh Jang, Attock, Pakistan. Geogr. Environ. Sustain. 2021, 14, 41–52. https://doi.org/10.24057/20719388-2020-117.
76. Tariq, A.; Shu, H.; Gagnon, A.S.; Li, Q.; Mumtaz, F.; Hysa, A.; Siddique, M.A.; Munir, I. Assessing Burned Areas in Wildfires and Prescribed Fires with Spectral Indices and SAR Images in the Margalla Hills of Pakistan.
Forests 2021, 12, 1371. https://doi.org/10.3390/f12101371.
77. Vakhshoori, V.; Zare, M. Is the ROC curve a reliable tool to compare the validity of landslide susceptibility maps? Geomat. Nat. Hazards Risk 2018, 9, 249–266. https://doi.org/10.1080/19475705.2018.1424043.
78. Tariq, A.; Shu, H. CA-Markov chain analysis of seasonal land surface temperature and land use landcover change using optical multi-temporal satellite data of Faisalabad, Pakistan. Remote Sens. 2020, 12, 3402. https://doi.org/10.3390/rs12203402.
79. Tariq, A.; Shu, H.; Siddiqui, S.; Munir, I.; Sharifi, A.; Li, Q.; Lu, L. Spatio-temporal analysis of forest fire events in the Margalla Hills, Islamabad, Pakistan using socio-economic and environmental variable data with machine learning methods. J. For. Res. 2021, 13, 12. https://doi.org/10.1007/s11676-021-01354-4.
80. Kalantar, B.; Pradhan, B.; Amir Naghibi, S.; Motevalli, A.; Mansor, S. Assessment of the effects of training data selection on the landslide susceptibility mapping: A comparison between support vector machine (SVM), logistic regression (LR) and artificial neural networks (ANN). Geomat. Nat. Hazards Risk 2018, 9, 49–69. https://doi.org/10.1080/19475705.2017.1407368.