Enhancing the performance of regional land cover mapping

Weicheng Wu; Fadi  Karam

Enhancing the performance of regional land cover mapping, 2016

Different pixel-based, object-based and subpixel-based methods, such as time-series analysis, decision trees, and various supervised approaches, have been proposed to conduct land use/cover classification. However, despite their proven advantages in small dataset tests, their performance is variable and less satisfactory when dealing with large datasets, particularly for regional-scale mapping with high resolution data, because of the complexity and diversity of landscapes and land cover patterns and the unacceptably long processing time. The objective of this paper is to demonstrate the comparatively high performance of an operational approach that integrates multisource information to ensure high mapping accuracy over large areas within an acceptable processing time. The information used includes phenologically contrasted multiseasonal and multispectral bands, a vegetation index, land surface temperature, and topographic features. The performance of different conventional and machine learning classifiers, namely Mahalanobis Distance (MD), Maximum Likelihood (ML), Artificial Neural Networks (ANNs), Support Vector Machines (SVMs) and Random Forests (RFs), was compared using the same datasets in the same IDL (Interactive Data Language) environment. An Eastern Mediterranean area with a complex landscape and steep climate gradients was selected to develop and test the operational approach. The results showed that the SVM and RF classifiers produced the most accurate mapping at local scale (up to 96.85% in Overall Accuracy) but were very time-consuming in whole-scene classification (more than five days per scene), whereas ML fulfilled the task rapidly (about 10 min per scene) with satisfactory accuracy (94.2–96.4%). Thus, the approach composed of the integration of seasonally contrasted multisource data and sampling at subclass level, followed by an ML classification, is a suitable candidate to become an operational and effective regional land cover mapping method.

Weicheng Wu a,⁎, Claudio Zucca b, Fadi Karam c, Guangping Liu d

a State-Key Lab of Nuclear Resources and Environment, East China Institute of Technology (ECIT), Nanchang, 330013 Jiangxi, China
b ICARDA (International Center for Agricultural Research in the Dry Areas), Amman, Jordan
c Litani River Authority, Beirut, Lebanon
d Faculty of Sciences, East China Institute of Technology (ECIT), 330013, Nanchang, Jiangxi, China

⁎ Corresponding author. Email address: Wuwc123@gmail.com, wchwu@ecit.cn (W. Wu)

International Journal of Applied Earth Observations and Geoinformation xxx (2016) xxx-xxx
http://dx.doi.org/10.1016/j.jag.2016.07.014
0303-2434/© 2016 Published by Elsevier Ltd.

ARTICLE INFO
Article history: Received 12 May 2016; received in revised form 19 July 2016; accepted 20 July 2016; available online xxx
Keywords: Multisource data integration; Phenological contrast; Topographic features; Separability; Accuracy

1. Introduction

Land cover (LC) and land use (LU) data are fundamental inputs for a wide range of environmental planning, management and research applications. Nowadays, LC mapping mostly relies on remote sensing, building on more than 40 years of scientific research and technological developments from local to global scale (Atkinson and Tatnall, 1997; Chen et al., 2015; DeFries et al., 1998; Friedl et al., 2002; Gong et al., 1992, 2013; Hansen et al., 2000; Haralick et al., 1973; Wu and Zhang, 2003; Wu et al., 2013a). However, accuracy and reliability may become a challenge when using high resolution data for regional and global mapping. For example, as reported by Gong et al. (2013) concerning their global LC mapping using Landsat data, the Overall Accuracies (OAs) were below 70% for all continental-scale maps and below 75% for most national-scale maps, except for some countries such as Algeria, Saudi Arabia and Libya, where LC patterns are simple.

The conventional classification approaches adopted pattern recognition techniques, including both supervised and unsupervised algorithms, assuming that the study area is composed of a number of unique, internally homogeneous classes that are mutually exclusive (Townshend, 1984). However, such an assumption is not applicable to most natural or semi-natural areas, where there are mixed pixels (Adams et al., 1995; Atkinson, 2005; Hill and Schutt, 2000; Van Der Meer, 1995) and, especially, where LC types exist as continua rather than as a mosaic of discrete classes (Foody et al., 1992; Kent et al., 1997; Wu and Zhang, 2003). As a result, the classes intergrade, showing a low degree of separability, and cannot be distinguished by means of sharp boundaries (Foody et al., 1992).

The separability of classes can be evaluated by the Jeffreys-Matusita Distance (JMD) according to Swain and King (1973) and Richards and Jia (2006). For the pair of classes i and j, this distance can be expressed as:

$$ JMD_{ij} = 2\left(1 - e^{-B_{ij}}\right) \qquad (1) $$

where

$$ B_{ij} = \frac{1}{8}\,(\mu_i - \mu_j)^{T} \left(\frac{C_i + C_j}{2}\right)^{-1} (\mu_i - \mu_j) + \frac{1}{2}\,\ln\left(\frac{\left|(C_i + C_j)/2\right|}{\sqrt{|C_i|\,|C_j|}}\right) \qquad (2) $$

and C_i is the covariance matrix of class i; μ_i is the mean vector of class i; ln is the natural logarithm function; T is the transposition function; and |C_i| is the determinant of C_i; the same meanings apply to the counterparts of class j. JMD ranges from 0 to 2.0: when it is below 1.0, the two classes of a class-pair are not separable; when it is between 1.0 and 1.5, the two classes are separable but with confusion; when it is between 1.5 and 1.9, the two classes are clearly separable; and only when JMD is above 1.9 is the class-pair completely separable.

For poorly separable classes, the accuracy of classification is the major problem in LC mapping. To address this, a number of authors have explored the possibility of improving mapping accuracy by taking into account texture (Gong et al., 1992; Haralick et al., 1973; Zhang, 2001), by object-based segmentation (Mao and Jain, 1992; Blaschke, 2010; Pu et al., 2011), or by combining pixel- and object-based approaches (Huth et al., 2012; Chen et al., 2015). In addition to the traditional unsupervised (e.g., IsoData, K-Means) and supervised algorithms, e.g., Mahalanobis Distance (MD) and Maximum Likelihood (ML), a number of authors have introduced machine learning algorithms that can capture the non-parametric signatures of classes, such as Artificial Neural Networks (ANNs, Atkinson and Tatnall, 1997; Benediktsson et al., 1990; Kavzoglu and Mather, 2003), Support Vector Machines (SVMs, Foody and Mathur, 2004; Huang et al., 2002; Kavzoglu and Colkesen, 2009; Vapnik and Lerner, 1963) and Random Forests (RFs, Breiman, 2001; Rodriguez-Galiano et al., 2012; Waske et al., 2012).
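As a hedged illustration of the JMD separability measure introduced above, the following Python sketch computes the distance for one class-pair from class mean vectors and covariance matrices, assuming the standard Bhattacharyya-based form of the distance given in Richards and Jia (2006). The function names (`jm_distance`, `separability_label`) and the small pure-Python linear solver are hypothetical helpers written for this illustration; they are not the IDL/ENVI routines used in the study.

```python
import math


def _solve_and_logdet(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting.

    Returns (x, ln|det a|). Assumes `a` is a square, non-singular matrix
    (covariance matrices are expected to be positive definite).
    """
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    log_det = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        log_det += math.log(abs(m[col][col]))
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x, log_det


def jm_distance(mu_i, cov_i, mu_j, cov_j):
    """JMD between two classes described by mean vectors and covariances."""
    n = len(mu_i)
    diff = [p - q for p, q in zip(mu_i, mu_j)]
    cov_avg = [[0.5 * (cov_i[r][c] + cov_j[r][c]) for c in range(n)]
               for r in range(n)]
    x, logdet_avg = _solve_and_logdet(cov_avg, diff)
    _, logdet_i = _solve_and_logdet(cov_i, [0.0] * n)
    _, logdet_j = _solve_and_logdet(cov_j, [0.0] * n)
    # Bhattacharyya term: Mahalanobis-like part plus covariance part.
    mahal = sum(d * v for d, v in zip(diff, x))
    b = mahal / 8.0 + 0.5 * (logdet_avg - 0.5 * (logdet_i + logdet_j))
    return 2.0 * (1.0 - math.exp(-b))


def separability_label(jmd):
    """Map a JMD value to the separability levels described in the text."""
    if jmd < 1.0:
        return "not separable"
    if jmd <= 1.5:
        return "separable with confusion"
    if jmd <= 1.9:
        return "clearly separable"
    return "completely separable"


if __name__ == "__main__":
    # Two made-up 2-band classes, for illustration only.
    d = jm_distance([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                    [10.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
    print(round(d, 4), separability_label(d))
```

In practice the class statistics would be estimated from the training regions of interest for each class, and the distance would be evaluated for every class-pair; a numerical library would normally replace the hand-written solver.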
For mixed pixels, various subpixel processing techniques have been proposed to decompose land cover fractions or to improve LC mapping accuracy, e.g., linear spectral unmixing (Adams et al., 1986; Foody and Cox, 1994; Hill and Schutt, 2000; Lu and Weng, 2004; Smith et al., 1990; Van Der Meer, 1995), linear optimization (Verhoeye and De Wulf, 2002), Hopfield neural network (Tatem et al., 2002), pixel-swapping (Atkinson, 2005), subpixel/pixel attraction (Mertens et al., 2006), etc. Some authors have also integrated a set of single or time-series vegetation indices (VIs), such as NDVI (Normalized Difference Vegetation Index) or EVI (Enhanced Vegetation Index), and land surface temperature (LST) to undertake LC mapping (Friedl et al., 2002; Loveland et al., 2000; Lu et al., 2014). Furthermore, topographic features have been employed in LC classification to improve accuracy (Benediktsson et al., 1990; Rodriguez-Galiano et al., 2012), particularly by the ESA-funded DesertWatch project (Pace et al., 2006; ESA, 2008), based on the assumption that landscape features constrain land use or land cover to a certain extent. For example, irrigated land generally occurs on flat to gently sloping terrain. Phenological patterns and features (Zhu and Wan, 1963) have also played a role in LC mapping (Friedl et al., 2002; Jia et al., 2014; Lu et al., 2014). The above-mentioned DesertWatch project and Rodriguez-Galiano et al. (2012) used paired, season-contrasted spring and summer images instead of time-series data to enhance LC classification.
The goal of this research is to demonstrate the performance of an LC mapping procedure based on the integrated use of phenology-contrasted information, including the multispectral (MS) bands of images, GDVI (Generalized Difference Vegetation Index), which is more sensitive than other VIs for dryland characterization (Wu, 2014), LST, and topographic features extracted from a Digital Elevation Model (DEM), and to compare it with that of some other widely adopted supervised approaches. The specific objective is to quantify the improvement achieved in terms of separability of classes, accuracy of the classification, and processing time by integrating multisource high resolution data for an area with a complex landscape.

2. Data and methods

2.1. Study area

The study area is located in the Eastern Mediterranean Region and coincides with the area covered by Landsat scenes with path/row numbers of 174/35–174/37 (Fig. 1). This area was chosen because it is a dryland characterized by steep climatic gradients, various landforms and complex LC patterns, and is thus a challenging site for remote sensing-based LC mapping. Two subset sites with contrasting LC and LU characteristics were also defined (Fig. 1) for experimental purposes, as explained below.

In the study area, rainfall is mostly concentrated between November and April and ranges from around 650 mm on the western coastal slopes to less than 100 mm in the eastern dry rangelands and deserts. The three main landforms are, from west to east, the coastal plains and piedmont, the mountain–valley–mountain sequence of the north-south stretching coastal ranges, and the eastern plateau. Natural vegetation cover mainly consists of coniferous and broadleaf forests in the highlands, shrublands and maquis on the mountain slopes, and herbaceous rangelands in the eastern hills and plateau (Wu, 2014).
Irrigation is mainly concentrated in the Aleppo Plain, the Orontes and Litani watersheds, and the Jordan River valley. The main spring crops are irrigated wheat and vegetables, and rainfed barley, whereas summer crops are irrigated cotton, maize, sunflower, sesame, watermelon and vegetables. Olive is widespread in rainfed areas, interleaved with fig and pistachio. Orchards, including citrus, apple, cherry, peach, etc., are mainly distributed in the western coastal plains and slopes. Date, banana and vineyards are mostly present in the Bekaa and Jordan River valleys. The major land use/cover classes of the study area are summarized in Table 1.

In Table 1, the category “Conifers” includes not only monospecific pine and/or cedar stands but also mixed formations including broadleaved species. The distinction between forests (Conifers and Broadleaf) and “Woodland” or “Woody Shrubland” is based on the FAO Land Cover Classification System (LCCS, Di Gregorio and Jansen, 2000): forests have tree canopy cover (CC) above 60%, whereas CC is between 20 and 60% for woodlands and less than 20% for sparse woodlands (Wu et al., 2013b). Since sparse woody formations are generally used as grazing land in the study area, this class was considered part of the “Rangelands”.

2.2. Data

Landsat 5 TM (Thematic Mapper) spring (01 May 2007) and summer (21 August 2007) images were acquired for the scene 174/35. Landsat 8 OLI (Operational Land Imager) and TIRS (Thermal Infrared Sensor) data were obtained for the scenes 174/36 (02 April 2014 and 24 August 2014) and 174/37 (18 April 2014 and 24 August 2014). The two dates represent, respectively, the spring vegetative maximum and the summer minimum, and are thus highly contrasted. Extensive ground-truthing work was conducted in the period 2007–2011 in Syria and in 2013–2014 in Lebanon and Jordan (GPS locations in Fig. 1). Google Earth was used as a complementary source for areas not covered by field work.
SRTM (Shuttle Radar Topography Mission) DEM data (90 m in resolution) were obtained and used to generate elevation (E), slope (S), and aspect (A) information.

Fig. 1. Location of the study area (defined by Landsat scenes 174/35–174/37) including the two subsets, and the field observation points.

2.3. Methods

2.3.1. Dataset preparation

The following major processing steps were undertaken for the scene 174/35 for developing and testing the approaches, while the other two scenes were used for regional-scale application as explained in section 2.3.5:

1) Atmospheric correction of the Landsat images was performed by means of the COST model (Chavez, 1996): the DOS (Dark-object Subtraction) technique (Chavez, 1988) was used to determine the haze of each band, then the spectral radiance recorded as digital numbers was converted into reflectance.

2) Spring and summer Landsat images and SRTM data were resized to the same image dimension (7156 × 6858) and pixel size (30 m). Nearest Neighbor resampling was used for the images and Cubic Convolution for the DEM data, to minimize the effect of bad pixels in the SRTM data.

3) Spring and summer GDVI with a power number of 2 was derived from the multispectral bands; LST was calculated following Chander et al. (2009) and USGS (2015):

$$ LST = \frac{K_2}{\ln\left(\frac{K_1}{L_\lambda} + 1\right)} \qquad (3) $$

where L_λ is the spectral radiance, and K_1 and K_2 are conversion constants. For the TM thermal band, K_1 and K_2 are respectively 607.76 and 1260.56 (Chander et al., 2009); for Landsat 8 TIRS, K_1 and K_2 are respectively 774.89 and 1321.08 for band 10, and 480.89 and 1201.84 for band 11 (USGS, 2015). It is worth mentioning that the LST values derived from Landsat 8 bands 10 and 11 differ by about 0.15–1.8 K; hence, the LST referred to below is their average.

4) E, S, and A were derived from the SRTM data.
5) Multisource and multifactor datasets were compiled as follows to test the effects of different degrees of integration of spectral, phenological and topographical information on the classification performance:
a. 3-band datasets: three uncorrelated bands (MS1, 4 and 7) in reflectance from both the spring and summer images were compiled into two 3-band datasets, because bands 1, 2 and 3 in the visible spectral region of Landsat TM and ETM+ data are mutually correlated (R² = 0.982–0.984), as are bands 5 and 7 in the shortwave infrared spectral region, and because the combination of MS741 as RGB constitutes the best pseudo-natural color composite after atmospheric correction (Wu, 2003). These uncorrelated bands were thus taken to avoid redundancy of spectral information and to save running time;
b. 6-band dataset: containing the two 3-band MS147 (spring + summer) datasets;
c. 8-band dataset: containing the 6-band dataset + two bands of GDVI (spring + summer);
d. 10-band dataset: the 8-band dataset + two bands of LST (spring + summer);
e. 12-band dataset: the 10-band dataset + E + S;
f. 13-band dataset: the 12-band dataset + A.

6) Subsetting
The performance of different supervised classifiers (listed below) was tested at different dataset sizes, i.e., at both subset and whole-scene levels. Hence, multisource datasets, Subset 1 (1384 × 1211 pixels) and Subset 2 (1943 × 1776 pixels), which differ in both location and land use/cover types, were prepared.

International Journal of Applied Earth Observations and Geoinformation xxx (2016) xxx-xxx

2.3.2. Ground-truth sampling in subsets and whole scene
Based on the field observation and visual interpretation of the acquired satellite images with reference to Google Earth, ground-truth samples represented by regions of interest (ROIs) were defined on the color composites for each land use/cover class in the Subset 1, Subset 2 and whole-scene datasets. For a number of classes such as Olives, Rangelands, Woodlands and Conifers, multiple subclasses were also defined and sampled to deal with the observed variability of classes with landscape. For example, “Rangeland 1” denotes herbaceous rangeland, “Rangeland 2” woody herbaceous rangeland, and “Rangeland 3” the latter mixed cover where it was observed on dark volcanic soils. Likewise, “Olive 1” denotes olive groves on brown soils, “Olive 2” those on light-colored, whitish soils containing lime and/or gypsum, “Olive 3” those cultivated on dark volcanic soils, “Olive 4” young groves with low canopy cover (e.g., CC < 5–10%, close to bare soil), and “Olive 5” mature olive groves with CC > 45–50%. In total, 13, 19 and 44 classes and subclasses (including clouds and shadows) were recognized for the Subset 1, Subset 2 and whole-scene datasets of scene 174/35, respectively, and as many ROIs as possible were selected to cover the whole datasets for each class and subclass. The training samples (ROIs-1) accounted for 11.72%, 6.02% and 6.79% of the total pixels to be classified in the Subset 1, Subset 2 and whole-scene datasets, respectively, broadly in line with the 5–10% rate recommended by Zhuang et al. (1994).
An independent set of samples (ROIs-2) was selected in the Subset 1, Subset 2 and whole-scene composites following the same principle as for ROIs-1 for cross-validation purposes, covering respectively 12.82%, 6.52% and 11.08% of the total pixels.

Table 1 Land use/cover classes defined in the study area based on the field observation with reference to the European CORINE classification scheme (CEC, 1994).
Land Use/Cover Classes | Description
Artificial areas
Built-Up | Mainly urban (industrial, residential, etc.), roads, infrastructures
Mining or Construction Sites | Mining areas or sites under construction
Agricultural areas
Spring Irrigation | Wheat and vegetables, locally barley
Summer Irrigation | Cotton, maize, melon, sesame, and vegetables
Rotated Irrigation | Perennial crops irrigated in both spring and summer
Spring Rainfed | Mainly barley, locally wheat, cultivated for harvesting
Orchards | Citrus, apple, cherry, pistachio, date palm, banana, etc. Can include olive groves.
Terraced Rainfed with Olives | In the mountainous terraces, rainfed crops (barley and wheat) cultivated under or mixed with olive groves or fruit trees
Vineyards | Vineyards
Olives | Olive groves
Greenhouse | Greenhouses
Fallows | Cropland not being cultivated at the date.
Rainfed Pastures | Mainly barley, locally wheat, cultivated for grazing, not for harvesting; managed grassland in Golan Heights
Natural and semi-natural areas
Broadleaf Forests | Deciduous and sclerophyll tree formations, and maquis, with tree CC > 60%
Conifers | Mainly pine, locally cedar, with tree CC > 60%
Hylophytes | Shrub formations in salt marsh in Jordan Valley with CC usually >60%
Woodlands | Woody shrublands including maquis, with tree CC between 20% and 60%
Rangelands | Unmanaged grassland locally including sparse trees or shrubs, with tree CC < 20%
Bare Lands | Bare soil, bare rocks, and deserts with vegetation cover generally <5%
Saline Land | Salt marshes
Beaches | Coastal sand deposits
Snow | Snow cover
Other areas
Water Bodies | Sea, lakes, artificial water bodies
Burnt Scars | Burnt areas

2.3.3. Separability of classes
The separability of the classes was investigated on the whole-scene datasets using JMD to fully consider the complexity in landforms and the variability in land use/cover. The impacts of different degrees of multisource data integration (from 3 bands and 6 bands up to 13 bands) on the separability of the problematic class-pairs were quantified.

2.3.4.
Performance of different classifiers
The conventional and machine learning supervised classifiers, namely MD, ML, ANNs, SVMs and RFs, were tested on the Subset 1, Subset 2 and whole-scene datasets, and their performance was evaluated using the Overall Accuracy (OA), Kappa Coefficient (KC) and processing time. MD is a direction-sensitive distance classifier that uses statistics for each class and assumes that all class covariances are equal, while ML assumes that the statistics for each class in each band are normally distributed and calculates the probability that a given pixel belongs to a specific class (Richards and Jia, 2006). These two classifications were conducted without setting a distance error or probability threshold so that all pixels were classified. ANNs are a layered feed-forward neural network classification technique in which the multilayer perceptron (MLP) backpropagation algorithm is commonly used (Mas and Flores, 2008). In our test, the logistic (or sigmoid) function was selected as the activation function with one hidden layer (the number of hidden layers representing the degree of non-linearity), as suggested; the training iterations were set to 100 (Subset 1), 150 (Subset 2) and 1000 (whole scene) to avoid overfitting. The training threshold contribution (θ) denotes the contribution of the internal weight with respect to the activation level of the node.

For performance evaluation purposes, the conventional classifications and the machine learning classifications by ANNs and SVMs were conducted with the software ENVI 5.2, and the RFs classification was realized with the EnMAP-Box (Waske et al., 2012; Van der Linden et al., 2015), installed on a PC equipped with 16 GB of RAM and an Intel(R) Core i7-4510 CPU (4 processors). All tests were undertaken in the IDL (Interactive Data Language) environment.

3. Results and discussion
3.1.
Subset scales
Among the conventional classifiers, MD and especially ML produced reasonable maps with high accuracy and reliability in both subsets (Table 2). As the band number increased (i.e., with a higher degree of multisource data integration), the OA and KC also increased, with the best accuracy obtained for the 13-band integration (e.g., 94.4–96.1% for the ML products at both subset sites).
Concerning the machine learning algorithms, the ANNs classifier produced maps with a high accuracy when applied to the 6-band dataset at Subset 1 (95.61%) but not at Subset 2 (89.18–90.41%, Table 3). It failed to generate satisfying results (low OA and KC) with datasets of more than 8 bands. Additional tests using different combinations of the model parameters, for example with θ, η and α set respectively to 0.3–0.6, 0.5–0.8 and 0.2–0.5, did not show better results. The RFs classifier showed consistently high accuracy in all cases, from the 3-band to the 13-band datasets, at both sites. Increasing the band number by adding T, E, S and A brought no substantial improvement in OA and KC, implying that the integration of phenology-contrasted information is sufficient to enhance the performance of RFs (Table 3). SVMs performed very well at both subsets, especially when the radial basis function was selected as the kernel. The linear kernel saved about 20–26% of the time for each run but had an OA about 0.5–4.0% lower than that of the radial basis kernel. The increase in band number also led to a further improvement in OA and KC, similar to the case of ML.
In summary, the conventional classifier ML performed quite well in categorizing land use/cover at both Subsets 1 and 2; its OA increased with the degree of integration of multisource information and was highest for the 13-band datasets. The popularly applied ANNs did not perform as expected in the complex LC area (e.g., Subset 2), whereas RFs and SVMs achieved highly reliable mapping with OAs from 95.66% to 96.94%.
The disadvantage of RFs and SVMs was, however, their long processing time, particularly for SVMs. With an increase of 105% in pixel number from Subset 1 to Subset 2, the time used by ML increased only from 17–18 s to 30–31 s (Table 2), whereas the increase was dramatic for SVMs, from 19–56 min to 178–543 min (Table 3).

The training threshold contribution (θ) ranges from 0 to 1.0; the training rate (η), the magnitude of the adjustment of the weights, also lies between 0 and 1.0 (a higher rate speeds up the training but also increases the risk of oscillations of the training result). The training momentum rate (α) varies from 0 to 1.0, and a higher α leads to training with larger steps than a lower one. The default values of θ, η and α are respectively 0.9, 0.2 and 0.9 within the ENVI (ENvironment for Visualizing Images) 5.2 package. Kavzoglu and Mather (2003) noted that a better classification accuracy can be reached by selecting 0.1–0.2 for η and 0.5–0.6 for α. Apart from the default setting, we also tested the classification accuracy obtained by tuning these parameters as indicated. For SVMs, which use support vectors to maximize the margin and find the optimal hyperplanes among the clusters (Huang et al., 2002; Kavzoglu and Colkesen, 2009), different kernel functions, namely linear, polynomial, radial basis and sigmoid, were tested. RFs are a combination of decision-tree classifiers such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest (Breiman, 2001). For this classification, the number of trees in training was set to 100, the number of features was set to the square root of the number of all features, and the Gini Index was used to define the impurity (Waske et al., 2012). After the classification, the LC subclasses (e.g., the Olive and Rangeland subclasses) were merged into their respective classes and checked against the ground-truth data (ROIs-2) to calculate the final OA and KC for evaluating the performance of the different classifiers.
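The subclass merging and the two accuracy measures described above can be sketched as follows. This is a minimal illustration, not the ENVI or EnMAP-Box implementation: it assumes that subclass labels are written as a class name followed by a number (e.g., "Olive 3"), that KC is the standard Cohen's kappa computed from the confusion counts, and the function names `merge_subclass` and `overall_accuracy_and_kappa` are hypothetical.

```python
from collections import Counter


def merge_subclass(label):
    """Map a subclass label such as 'Olive 3' to its parent class 'Olive'.

    Assumption: a subclass is a class name followed by a space and a number.
    Labels without a trailing number (e.g., 'Bare Lands') are returned as-is.
    """
    head, _, tail = label.rpartition(" ")
    return head if head and tail.isdigit() else label


def overall_accuracy_and_kappa(reference, predicted):
    """Return (OA, kappa) for two equally long sequences of labels."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    ref = [merge_subclass(r) for r in reference]
    pred = [merge_subclass(p) for p in predicted]
    n = len(ref)
    # Observed agreement: share of validation pixels labeled correctly.
    p_o = sum(r == p for r, p in zip(ref, pred)) / n
    # Chance agreement from the row and column totals of the confusion matrix.
    ref_totals, pred_totals = Counter(ref), Counter(pred)
    p_e = sum(ref_totals[c] * pred_totals[c] for c in ref_totals) / (n * n)
    kappa = (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0
    return p_o, kappa


# Toy validation pixels (hypothetical labels, not data from the paper).
reference = ["Olive 1", "Olive 2", "Rangeland 1", "Rangeland 3", "Built-Up", "Built-Up"]
predicted = ["Olive 3", "Olive 2", "Rangeland 2", "Olive 1", "Built-Up", "Built-Up"]
oa, kappa = overall_accuracy_and_kappa(reference, predicted)
print(f"OA = {oa:.4f}, KC = {kappa:.4f}")
```

Merging before scoring means that a pixel sampled as "Olive 1" and classified as "Olive 3" counts as correct, which is the intent of sampling at subclass level and reporting accuracy at class level.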
The subpixel mapping mentioned in Section 1 would also be promising. However, linear spectral unmixing requires the selection of endmembers, which is not straightforward in areas with complex landscapes, and the number of endmembers may not exceed the band number, e.g., 3–4 (Adams et al., 1995; Lu and Weng, 2004), which can hardly reflect the full spectrum of LC diversity in the study area. It would also be time-consuming if the class boundaries were defined by thresholding on the endmember components or on the endmember ternary diagram (Adams et al., 1995; Lu and Weng, 2004). Other soft algorithms were successfully tested in small sites with simple land cover, e.g., 2–3 classes (Verhoeye and De Wulf, 2002; Tatem et al., 2002; Atkinson, 2005; Mertens et al., 2006). It was unclear whether these techniques would be applicable in complex LC areas (>20 classes). We hence decided not to test these techniques in this study.

2.3.5. Regional-scale mapping
Based on the tests in Sections 2.3.2–2.3.4 above, the best-performing classifier, together with the phenologically contrasted multisource data integration and sampling scheme, was proposed and applied to the other two scenes (174/36 and 174/37) for regional-scale mapping. The ROIs-1 and ROIs-2 took up respectively 8.90% and 5.82% of the pixels, and 7.40% and 6.64% of the pixels, for the two scenes.

Table 2 Performance of the conventional supervised classifiers at Subsets 1 and 2. Time duration includes both training (TRN) and classification (CLS).
Datasets | MD OA | MD KC | MD Time (TRN+CLS) (s) | ML OA | ML KC | ML Time (TRN+CLS) (s)
Subset 1 (1384×1211)
3-Band (spring) | 55.05% | 0.4191 | 8 (2+6) | 88.75% | 0.8145 | 13 (4+9)
6-Band | 80.01% | 0.6763 | 14 (2+12) | 93.99% | 0.8992 | 14 (4+10)
8-Band | 90.40% | 0.8404 | 12 (3+9) | 93.85% | 0.8969 | 13 (4+9)
10-Band | 91.14% | 0.8527 | 12 (3+9) | 94.74% | 0.9118 | 16 (4+12)
12-Band | 91.06% | 0.8515 | 33 (8+25) | 96.01% | 0.9329 | 17 (4+13)
13-Band | 91.74% | 0.8627 | 24 (8+26) | 96.08% | 0.9342 | 18 (4+14)
Subset 2 (1943×1776)
3-Band (spring) | 71.25% | 0.6666 | 13 (4+9) | 80.53% | 0.7709 | 14 (3+11)
6-Band | 75.97% | 0.7216 | 25 (5+20) | 88.50% | 0.8641 | 17 (3+14)
8-Band | 80.27% | 0.77 | 31 (7+24) | 90.45% | 0.8869 | 18 (4+14)
10-Band | 82.99% | 0.7995 | 33 (8+25) | 93.34% | 0.9208 | 19 (4+15)
12-Band | 84.54% | 0.8176 | 37 (10+27) | 94.31% | 0.9323 | 30 (4+26)
13-Band | 84.87% | 0.8215 | 37 (8+29) | 94.41% | 0.9334 | 31 (4+27)

Table 3 Performance of the machine learning classifiers at Subset 1 and Subset 2.
Datasets | ANNs OA | ANNs KC | ANNs Time (TRN+CLA) (min) | RFs OA | RFs KC | RFs Time (TRN+CLA) (min)
Subset 1 (1384×1211)
3-Band (spring) | 93.18% | 0.8928 | 27 (26+1) | 96.39% | 0.9436 | 19 (16+3)
6-Band | 95.61% | 0.9315 | 29 (22+1) | 96.85% | 0.9507 | 17 (14+3)
8-Band | 90.49% | 0.8674 | 17 (15+2) | 96.84% | 0.9507 | 25 (22+3)
10-Band | 85.87% | 0.772 | 29 (28+1) | 96.38% | 0.9436 | 22 (19+3)
12-Band | 88.00% | 0.8131 | 26 (25+1) | 95.93% | 0.9368 | 22 (20+2)
13-Band | 87.49% | 0.8057 | 32 (30+2) | 95.23% | 0.9528 | 24 (21+3)
Subset 2 (1943×1776), ANNs only
3-Band (spring) | 74.01% | 0.6825 | 30 (28+2)
6-Band | 89.18% | 0.8701 | 31 (29+2)
8-Band | 90.41% | 0.885 | 31 (29+2)
10-Band | 64.04% | 0.5467 | 34 (32+2)
12-Band | 70.49% | 0.6369 | 24 (23+1)
13-Band | 77.46% | 0.7271 | 97 (95+2)
RFs entries for Subset 2, in their extracted order (OA | KC | Time): 95.71% | 0.9487 | 59 (50+9); 96.42% | 0.9572 | 49 (41+8); 95.66% | 0.9481 | 60 (54+6). One further entry of this table (OA 95.59%, KC 0.9307) has no recoverable row.

At Subset 2, the SVMs runs took 178–543 min. The maps produced by the different classifiers (ML, ANNs, RFs and SVMs) for Subsets 1 and 2 are presented in Figs. 2 and 3, respectively.

3.2. Whole-scene scale
3.2.2.
Classification accuracy
The test of the different classifiers on the whole-scene datasets (7156 × 6858 pixels, 3–13 bands), using the same whole-scene ROIs-1 and ROIs-2, revealed that MD ran fast and generated largely reasonable maps (OA of 83.52–84.8% for the 12- and 13-band datasets), except for the confusion between Olives and Built-Up, and between Beaches and Built-Up. ML showed much better performance, rapidly yielding LC maps with OAs of 94.67–95.26% for datasets of more than 8 bands (Table 5). Among the machine learning algorithms, the ANNs classifier accomplished the whole-scene classification in about 7–8 h with an OA reaching 91.44% for the 6-band dataset (Table 5); however, it misclassified Olives as Rangelands, Bare Lands and Clouds, and Beaches as Built-Up, among others. The best-performing classifiers at subset scales, RFs and SVMs, could not complete the whole-scene classification within the 5-day test period. Therefore, ML performed best at whole-scene scale; its map derived from the 13-band dataset is presented in Fig. 4 and the accuracies of the different LC types are shown in Table 6.

Table 3 (SVMs columns)
Datasets | SVMs OA | SVMs KC | SVMs Time (TRN+CLA) (min)
Subset 1
3-Band (spring) | | | 19 (16+3)
6-Band | 96.81% | 0.9501 | 45 (22+23)
8-Band | 96.94% | 0.9523 | 47 (23+24)
10-Band | 96.59% | 0.9469 | 53 (31+22)
12-Band | 96.76% | 0.9496 | 42 (20+22)
13-Band | 96.74% | 0.9494 | 56 (26+30)
Subset 2
3-Band (spring) | 83.56% | 0.8008 | 543
6-Band | 88.91% | 0.8664 | 532
8-Band | 95.71% | 0.9466 | 397
10-Band | 96.09% | 0.9485 | 486
12-Band | 96.21% | 0.9547 | 185
13-Band | 96.41% | 0.9571 | 178

3.3. Regional-scale mapping
To perform large-scale (regional to global) LC mapping using high resolution data such as Landsat, the capacity of the classification approach to process whole-scene datasets and to produce reliable maps of high accuracy within an acceptable time is critical. Both SVMs and RFs algorithms have a strong capacity for grouping clusters and can hence deliver accurate classification results.
However, they are time-consuming and more suited to local-scale LC mapping with small datasets, given that the very powerful processing facilities used by Gong et al. (2013), for example, are not available to most analysts and users. Large datasets can be processed by tiling, but one may then face abrupt seams between tiles after mosaicking and time-consuming work to amend them. The ML classifier is the most widely known and employed conventional classifier thanks to its robustness (Huang et al., 2002; Gong et al., 2013), though it has been questioned for its parametric assumption. As our tests revealed, after integrating the phenology-contrasted spectral and biophysical information and topographic features, the ML classification was able to overcome the low separability problem, and sampling at the subclass level helped to resolve the problem related to the non-parametric signatures of classes. Thus, the proposed integration procedure followed by subclass-level sampling and ML classification was considered a relevant approach and applied to the adjacent scenes 174/36 and 174/37 for regional-scale mapping. The results were satisfactory, with OAs of 96.4% and 94.2%, respectively. The accuracies of each LC class are given in Table 6 and the LC maps are shown in Fig. 4.

3.2.1. Separability of classes
The separability of the LC classes was analyzed at whole-scene scale (7156 × 6858 pixels) for different levels of multisource data integration (from 3 to 13 bands). The results show a remarkable improvement in the separability of the easily confused class-pairs, namely Bare Lands and Olives, Rangelands and Olives, Woodlands and Olives, Rangelands and Woodlands, and Built-Up and Bare Lands, etc. (Table 4), from a JMD mean of 1.10–1.25 (separable but with strong confusion) for the 3-band datasets to 1.94 (completely separable) for the 13-band dataset.
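To make the JMD scale used above concrete, the sketch below computes the Jeffries–Matusita distance between two classes from their mean vectors and covariance matrices. It assumes the common Gaussian formulation, JM = 2(1 − e^(−B)) with B the Bhattacharyya distance, which runs from 0 (identical signatures) towards 2 (fully separable); the paper defines its own JMD in a section not reproduced here, so this is an illustration rather than the authors' code. The helper names and the toy statistics are hypothetical.

```python
import math


def _solve_and_logdet(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting.

    Returns (x, log|det(matrix)|). Intended for small covariance matrices.
    """
    n = len(matrix)
    a = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    logdet = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        a[col], a[pivot] = a[pivot], a[col]
        logdet += math.log(abs(a[col][col]))
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        residual = a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = residual / a[i][i]
    return x, logdet


def jeffries_matusita(mean1, cov1, mean2, cov2):
    """Jeffries-Matusita distance between two Gaussian class signatures."""
    n = len(mean1)
    diff = [m1 - m2 for m1, m2 in zip(mean1, mean2)]
    pooled = [[(cov1[i][j] + cov2[i][j]) / 2 for j in range(n)] for i in range(n)]
    solved, logdet_pooled = _solve_and_logdet(pooled, diff)
    _, logdet1 = _solve_and_logdet(cov1, [0.0] * n)
    _, logdet2 = _solve_and_logdet(cov2, [0.0] * n)
    # Bhattacharyya distance: mean-separation term plus covariance-mismatch term.
    mean_term = sum(d * s for d, s in zip(diff, solved)) / 8
    cov_term = 0.5 * (logdet_pooled - 0.5 * (logdet1 + logdet2))
    return 2 * (1 - math.exp(-(mean_term + cov_term)))


# Toy two-band signatures (hypothetical values, not taken from the paper).
identity = [[1.0, 0.0], [0.0, 1.0]]
jm_close = jeffries_matusita([0.0, 0.0], identity, [1.0, 0.0], identity)
# Means 4 units apart with unit covariances: B = 2, so JM = 2(1 - e^-2), about 1.729.
jm_far = jeffries_matusita([0.0, 0.0], identity, [4.0, 0.0], identity)
print(round(jm_close, 4), round(jm_far, 4))
```

Adding bands in which two classes differ increases the mean-separation term, which is one way to read the rise of the JMD mean from the 3-band to the 13-band datasets.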
Phenology-contrasted multispectral (MS) and biophysical information (GDVI and LST) accounted for 90–92% of the improvement in separability, while topographic features contributed 8–10% (Table 4).

Table 3 column layout: Datasets (3-Band (spring), 6-Band, 8-Band, 10-Band, 12-Band and 13-Band, for Subset 1 (1384×1211) and Subset 2); ANNs, RFs and SVMs, each with OA, KC and Time (TRN+CLA) (min).

4. Conclusions
This research developed an operational approach to map LC in large areas using high resolution data. The effectiveness of the method, based on the integration of both spectral and ancillary information, was assessed by comparing the performance of the most popular classification algorithms. The innovative aspect of this research lies in the assessment of the gains derived from progressively integrating different pieces of information that were individually reported as relevant by various publications. The experiments revealed that the integration of phenology-contrasted multisource data (MS, GDVI, LST) and topographic features can significantly improve the separability of the problematic LC classes and the overall mapping accuracy. After such integration, the multispectral space (e.g., 3-dimensional) becomes a high-dimensional space (13-dimensional), and the non-separable clusters become separable. The ML classifier yielded the best performance at whole-scene scale.
Although the tests were conducted in the Mediterranean region, the proposed approach can be applied to areas with other climate types by taking the local phenological pattern into consideration. In humid areas, NDVI or EVI should be used instead of GDVI, which gets easily saturated in densely vegetated areas (Wu, 2014). ASTER imagery could also be used. Datasets not including a thermal band, such as SPOT, CBERS and RapidEye, may yield a lower accuracy, with a degradation of about 0.9–2.7% (see the difference in accuracy between the 8-band and 10-band datasets in Tables 2 and 5); nevertheless, an OA of >92% could still be achieved.

Acknowledgements
The field observation was conducted while the first author was working with ICARDA (International Center for Agricultural Research in the Dry Areas). The study was supported by the research fund of the State-Key Lab of Nuclear Resources and Environment, ECIT (No. NRE1501) for Weicheng Wu, and by the CGIAR Research Program in Dryland Systems (CRP-DS) fund for Claudio Zucca. We thank Dr Claudia Kunzer for her useful discussion and suggestions about the study. Landsat images were acquired from the USGS data server (http://glovis.usgs.gov/); SRTM data were obtained from the CGIAR-CSI (http://srtm.csi.cgiar.org/); and country borderline shapefiles were from the Natural Earth (http://www.naturalearthdata.com/).

Fig. 2. Land use/cover maps of Subset 1 produced from different supervised classifiers. (a) Pseudo natural color composite of MS 7, 4 and 1 of the spring images as RGB; (b) Result of ANNs (6-band dataset; though its OA reached 95.61%, Woody shrublands were missed and Olives misclassified as Orchards); (c) Result of RFs (8-band dataset; OA: 96.84%); (d) Result of ML (13-band dataset; OA: 96.08%); and (e) Result of SVMs (13-band dataset; OA: 96.74%).

Fig. 3. Land use/cover maps of Subset 2 produced by different supervised classifiers. (a) Pseudo natural color composite of the spring images MS741 as RGB; (b) Result of ANN (8-band dataset; OA: 90.41%); (c) Result of RFs (10-band; OA: 96.42%); (d) Result of ML (13-band; OA: 94.41%); and (e) Result of SVMs (13-band; OA: 96.41%).

Table 4 Separability improvement by integrating multisource data.
Class-Pairs (JMD) | 3-band: 3MS147 (spring) | 3-band: 3MS147 (summer) | 6-band: 6MS147 | 8-band: 6MS147 + 2GDVI | 10-band: 6MS147 + 2GDVI + 2T | 13-band: 6MS147 + 2GDVI + 2T + E + S + A
Built-Up ∧ Bare Lands | 1.1544 | 1.5386 | 1.6163 | 1.8669 | 1.9928 | 1.9966
Built-Up ∧ Olives | 1.2112 | 1.4819 | 1.5806 | 1.6496 | 1.8113 | 1.9234
Built-Up ∧ Rangelands | 1.2689 | 1.4297 | 1.7165 | 1.8215 | 1.9249 | 1.9742
Built-Up ∧ Beach | 1.2927 | 1.3691 | 1.7524 | 1.8488 | 1.954 | 1.9989
Bare Lands ∧ Olives | 0.9967 | 1.1824 | 1.3753 | 1.9009 | 1.9598 | 1.9903
Bare Lands ∧ Rangelands | 1.2077 | 1.3989 | 1.5868 | 1.7978 | 1.8851 | 1.9470
Rangelands ∧ Olives | 0.5848 | 1.0687 | 1.3989 | 1.6231 | 1.6813 | 1.7866
Rangelands ∧ Woodland | 1.1078 | 1.0346 | 1.3487 | 1.5461 | 1.7261 | 1.8405
Woodlands ∧ Olives | 1.0892 | 0.7737 | 1.3762 | 1.8548 | 1.9525 | 1.9721
JMD Mean | 1.1012 | 1.2531 | 1.5280 | 1.7677 | 1.8764 | 1.9366

Table 5 Performance of different classifiers on the whole-scene datasets of the scene 174/35.
Supervised Classifiers and Multisource Datasets | OA | KC | Time (TRN + CLA) (min) | Remark
Conventional Classification
MD (12-band) | 84.80% | 0.7882 | 6 | Result largely acceptable but there is confusion
MD (13-band) | 83.53% | 0.7704 | 6 | Same as the above
ML (3-band, spring) | 75.70% | 0.7193 | 8 | Result with strong confusion
ML (6-band) | 91.03% | 0.8742 | 8 | The majority of classes is clearly classified
ML (8-band) | 93.41% | 0.9074 | 8 | Result is rather good with minor confusion
ML (10-band) | 94.67% | 0.9223 | 9 | Similar to the above
ML (12-band) | 95.15% | 0.9319 | 10 | Results very good with very minor confusion between Built-Up and Beaches, and between Orchards and Olives, etc.
ML (13-band) | 95.26% | 0.9334 | 10 | Same as the above
Machine Learning Classification
ANNs (6-band) | 91.44% | 0.8798 | 441 (433 + 8) | Despite the rather high OA, there was strong confusion between Built-Up and Beaches, Rainfed Pastures and Rangelands, and so on
ANNs (8-band) | 81.87% | 0.7480 | 463 (450 + 13) | Result is spurious
RFs | N/A | N/A | N/A | Not finished during 5-day test period
SVMs | N/A | N/A | N/A | Not finished during 5-day test period

Fig. 4. Regional-scale land use/cover map of an Eastern Mediterranean area. Note: OAs are respectively 95.26%, 96.44% and 94.20% for scenes 174/35, 174/36 and 174/37. White color represents clouds and shadows in spring and/or summer images.

Table 6 Accuracies of the whole-scene land use/cover categorization in the study area. The results were obtained from the 13-band datasets with ML classification. PA: Producer’s Accuracy; UA: User’s Accuracy.
Whole-scene Accuracy | 174/35 | 174/36 | 174/37
OA% | 95.26 | 96.44 | 94.20
KC | 0.9334 | 0.9228 | 0.9171
Class Accuracy: columns are PA (%) and UA (%) for each of the scenes 174/35, 174/36 and 174/37. The flattened layout preserves only the first value next to each class; the remaining values are listed afterwards in their extracted order.
Artificial areas
Built-Up | 95.25
Mining or Construction Sites | 97.40
Agricultural areas
Spring Irrigation | 97.00
Summer Irrigation | 98.54
Rotated Irrigation | 95.54
Spring Rainfed | 97.37
Rainfed for Grazing (Pastures) | 83.34
Orchards | 94.07
Terraced Rainfed with Olives | /
Vineyards | /
Olives | 98.73
Greenhouse | /
Fallow | 95.43
Natural and semi-natural areas
Broadleaf Forests | 94.98
Conifers | 90.97
Hylophytes | /
Woodlands | 91.15
Rangelands | 91.04
Bare Lands | 36.80
Saline Land | 99.91
Beaches | 99.13
Snow | /
Other areas
Water Bodies | 99.87
Burnt Scars | /
Remaining values: 98.52 100.00 94.74 98.30 99.64 42.79 94.81 72.98 99.38 70.61 99.35 97.98 61.02 98.70 88.97 94.77 99.05 93.55 94.05 97.94 92.94 77.14 54.92 96.18 38.26 90.81 97.79 91.46 76.63 87.33 92.55 83.63 63.63 93.12 69.11 98.05 / 91.99 96.99 54.63 94.36 46.27 / 76.21 / / 95.16 / 25.78 / 91.11 98.22 87.65 / 81.01 21.86 72.05 86.43 86.91 100.00 92.90 74.32 79.96 1.45 78.77 53.67 96.19 / 83.01 94.48 88.43 12.67 99.47 / 97.80 90.85 / 93.75 77.43 89.79 99.77 99.15 99.95 50.95 95.28 / 70.45 97.44 86.69 100.00 100.00 100.00 94.38 85.52 99.44 65.97 88.20 93.84 99.96 99.23 100.00 27.42 94.53 36.65 93.76 78.66 95.73 100.00 38.92 100.00 99.98 / 100.00 100.00 100.00 73.94 99.99 99.47 99.99 100.00

References
Adams, J.B., Smith, M.O., Johnson, P.E., 1986. Spectral mixture modeling: a new analysis of rock and soil types at the Viking Lander 1 site. J. Geophys. Res. 91 (B8), 8098–8112.
Adams, J.B., Sabol, D.E., Kapos, V., Filho, R.A., Roberts, D.A., Smith, M.O., Gillespie, A.R., 1995. Classification of multispectral images based on fractions of endmembers: application to land-cover change in the Brazilian Amazon. Remote Sens. Environ. 52, 137–154.
Atkinson, P.M., Tatnall, A.R., 1997. Neural networks in remote sensing. Int. J. Remote Sens. 18, 699–709.
Atkinson, P.M., 2005. Sub-pixel target mapping from soft-classified remotely sensed imagery. Photogramm. Eng. Remote Sens. 71, 839–846.
Benediktsson, J.A., Swain, P.H., Ersoy, O.K., 1990. Neural network approaches versus statistical methods in classification of multisource remote sensing data. IEEE Trans. Geosci. Remote Sens. 28, 540–552.
Blaschke, T., 2010. Object based image analysis for remote sensing. ISPRS J. Photogramm. Remote Sens. 65, 2–16.
Breiman, L., 2001. Random forests. Mach. Learn. 45 (1), 5–32.
CEC (Commission of the European Communities), 1994. CORINE Land cover, Part 1: Methodology. http://www.eea.europa.eu/publications/COR0-landcover (accessed 18.09.15.).
Chander, G., Markham, B.L., Helder, D.L., 2009. Summary of current radiometric calibration coefficients for Landsat MSS, TM, ETM+, and EO-1 ALI sensors. Remote Sens. Environ. 113 (5), 893–903.
Chavez Jr., P.S., 1988. An improved dark-object subtraction technique for atmospheric scattering correction of multispectral data. Remote Sens. Environ. 24, 459–479.
Chavez Jr., P.S., 1996. Image-based atmospheric corrections—revisited and improved. Photogramm. Eng. Remote Sens. 62 (9), 1025–1036.
Chen, J., Chen, J., Liao, A., Cao, X., Chen, L., Chen, X., He, C., Han, G., Peng, S., Lu, M., Zhang, W., Tong, X., Mills, J., 2015. Global land cover mapping at 30 m resolution: a POK-based operational approach. ISPRS J. Photogramm. Remote Sens. 103, 7–27. http://dx.doi.org/10.1016/j.isprsjprs.2014.09.002.
DeFries, R.S., Hansen, M., Townshend, J.R.G., Sohlberg, R., 1998. Global land cover classification at 8 km spatial resolution: use of training data derived from Landsat imagery in decision tree classifiers. Int. J. Remote Sens. 19, 3141–3168.
Di Gregorio, A., Jansen, L.J.M., 2000. Land Cover Classification System (LCCS): Classification Concepts and User Manual for Software Version 1.0. FAO, Rome. http://www.fao.org/docrep/003/x0596e/x0596e00.htm (accessed 02.02.09.).
ESA, 2008. DesertWatch Final Technical Report. http://dup.esrin.esa.int/files/131-176-149-30_2009430103852.pdf (accessed 12.06.15.).
Foody, G.M., Cox, D.P., 1994. Sub-pixel land cover composition estimation using a linear mixture model and fuzzy membership functions. Int. J. Remote Sens. 15, 619–631.
Foody, G.M., Mathur, A., 2004. A relative evaluation of multiclass image classification by support vector machines. IEEE Trans. Geosci. Remote Sens. 42, 1335–1343.
Foody, G.M., Campbell, N.A., Trodd, N.M., Wood, T.F., 1992. Derivation and applications of probabilistic measures of class membership from the maximum-likelihood classification. Photogramm. Eng. Remote Sens. 58 (9), 1335–1341.
Friedl, M.A., McIver, D.K., Hodges, J.C.F., Zhang, X.Y., Muchoney, D., Strahler, A.H., Woodcock, C.E., Gopal, S., Schneider, A., Cooper, A., Baccini, A., Gao, F., Schaaf, C., 2002. Global land cover mapping from MODIS: algorithms and early results. Remote Sens. Environ. 83, 287–302.
Gong, P., Marceau, D.J., Howarth, P.J., 1992. A comparison of spatial feature extraction algorithms for land use classification with SPOT HRV data. Remote Sens. Environ. 40, 137–151.
Gong, P., Wang, J., Yu, L., Zhao, Y., Zhao, Y., Liang, L., et al., 2013. Finer resolution observation and monitoring of global land cover: first mapping results with Landsat TM and ETM+ data. Int. J. Remote Sens. 34 (7), 2607–2654. http://dx.doi.org/10.1080/01431161.2012.748992.
Hansen, M.C., DeFries, R.S., Townshend, J.R.G., Sohlberg, R., 2000. Global land cover classification at 1 km spatial resolution using a classification tree approach. Int. J. Remote Sens. 21, 1331–1364.
Haralick, R.M., Shanmugam, K., Dinstein, I., 1973. Textural features for image classification. IEEE Trans. Syst. Man Cybern. SMC-3 (6), 610–621.
Hill, J., Schutt, B., 2000. Mapping complex patterns of erosion and stability in dry Mediterranean ecosystems. Remote Sens. Environ. 74, 557–569.
Huang, C., Davis, L.S., Townshend, J.R.G., 2002. An assessment of support vector machines for land cover classification. Int. J. Remote Sens. 23 (4), 725–749.
Huth, J., Kuenzer, C., Wehrmann, T., Gebhardt, S., Tuan, V.Q., Dech, S., 2012. Land cover and land use classification with TWOPAC: towards automated processing for pixel- and object-based image classification. Remote Sens. 4, 2530–2553.
Jia, K., Liang, S., Wei, X., Yao, Y., Su, Y., Jiang, B., Wang, X., 2014. Land cover classification of Landsat data with phenological features extracted from time series MODIS NDVI data. Remote Sens. 6, 11518–11532. http://dx.doi.org/10.3390/rs61111518.
Kavzoglu, T., Colkesen, I., 2009. A kernel functions analysis for support vector machines for land cover classification. Int. J. Appl. Earth Obs. Geoinf. 11, 352–359.
Kavzoglu, T., Mather, P.M., 2003. The use of backpropagating artificial neural networks in land cover classification. Int. J. Remote Sens. 24 (23), 4907–4938.
Kent, M., Gill, W.J., Weaver, R.E., Armitage, R., 1997. Landscape and plant community boundaries in biogeography. Prog. Phys. Geogr. 21 (3), 315–354.
Loveland, T.R., Reed, B.C., Brown, J.F., Ohlen, D.O., Zhu, Z., Yang, L., Merchant, J.W., 2000. Development of a global land cover characteristics database and IGBP DISCover from 1 km AVHRR data. Int. J. Remote Sens. 21 (6–7), 1303–1330.
Lu, D., Weng, Q., 2004. Spectral mixture analysis of the urban landscape in Indianapolis with Landsat ETM+ imagery. Photogramm. Eng. Remote Sens. 70 (9), 1053–1062.
Lu, L., Kuenzer, C., Guo, H., Li, Q., Long, T., Li, X., 2014. A novel land cover classification map based on a MODIS time-series in Xinjiang, China. Remote Sens. 6, 3387–3408.
Mao, J., Jain, A., 1992. Texture classification and segmentation using multiresolution simultaneous autoregressive models. Pattern Recognit. 25 (2), 173–188.
Mas, J.F., Flores, J.J., 2008. The application of artificial neural networks to the analysis of remotely sensed data. Int. J. Remote Sens. 29 (3), 617–663. http://dx.doi.org/10.1080/01431160701352154.
Mertens, K.C., De Baets, B., Verbeke, L.P.C., De Wulf, R.R., 2006. A sub-pixel mapping algorithm based on sub-pixel/pixel spatial attraction models. Int. J. Remote Sens. 27, 3293–3310.
Pace, G., Zucca, C., Wu, W., 2006. Sviluppo di un algoritmo di classificazione automatico per la generazione di mappe di land cover finalizzate al monitoraggio della desertificazione. In: Atti della 10a Conferenza Nazionale ASITA, vol. II, pp. 1505–1510, Bolzano, Nov. 14–17.
Pu, R., Landry, S., Yu, Q., 2011. Object-based urban detailed land cover classification with high spatial resolution IKONOS imagery. Int. J. Remote Sens. 32 (12), 3285–3308.
Richards, J.A., Jia, X., 2006. Remote Sensing Digital Image Analysis: An Introduction, 4th ed. Springer-Verlag, Berlin.
Rodriguez-Galiano, V.F., Ghimire, B., Rogan, J., Chica-Olmo, M., Rigol-Sanchez, J.P., 2012. An assessment of the effectiveness of a random forest classifier for land-cover classification. ISPRS J. Photogramm. Remote Sens. 67, 93–104. http://dx.doi.org/10.1016/j.isprsjprs.2011.11.002.
Smith, M.O., Ustin, S.L., Adams, J.B., Gillespie, A.R., 1990. Vegetation in deserts: I. A regional measure of abundance from multispectral images. Remote Sens. Environ. 31, 1–26.
Swain, P.H., King, R.C., 1973. Two effective feature selection criteria for multispectral remote sensing (LARS Technical Note 042673). In: Proceedings of The International Joint Conference on Pattern Recognition. Washington, D.C., November. pp. 536–540.
Tatem, A.J., Lewis, H.G., Atkinson, P.M., Nixon, M.S., 2002. Super-resolution land cover pattern prediction using a Hopfield neural network. Remote Sens. Environ. 79 (1), 1–14.
Townshend, J.R.G., 1984. Agricultural land cover discrimination using thematic mapper spectral bands. Int. J. Remote Sens. 6, 681–698.
USGS, 2015. Landsat 8 (L8) data users handbook (V1.0). http://landsat.usgs.gov/documents/Landsat8DataUsersHandbook.pdf (accessed 30.06.15.).
Van Der Meer, F., 1995. Spectral unmixing of Landsat Thematic Mapper data. Int. J. Remote Sens. 16 (16), 3189–3194.
Van der Linden, S., Rabe, A., Held, M., Jakimow, B., Leitão, P.J., Okujeni, A., Schwieder, M., Suess, S., Hostert, P., 2015. The EnMAP-Box—a toolbox and application programming interface for EnMAP data processing. Remote Sens. 7, 11249–11266.
Vapnik, V., Lerner, A., 1963. Pattern recognition using generalized portrait method. Autom. Remote Control 24, 774–780.
Verhoeye, J., De Wulf, R.R., 2002. Land cover mapping at sub-pixel scales using linear optimization techniques. Remote Sens. Environ. 79 (1), 96–104.
Waske, B., van der Linden, S., Oldenburg, C., Jakimow, B., Rabe, A., Hostert, P., 2012. imageRF—a user-oriented implementation for remote sensing image analysis with Random Forests. Environ. Model. Softw. 35, 192–193. http://dx.doi.org/10.1016/j.envsoft.2012.01.014.
Wu, W., Zhang, W., 2003. Present land use and cover patterns and their development potential in North Ningxia, China. J. Geogr. Sci. 13 (1), 54–62.
Wu, W., De Pauw, E., Zucca, C., 2013. Use remote sensing to assess impacts of land management policies in the Ordos rangelands in China. Int. J. Digit. Earth 6 (Supplement 2), 81–102. http://dx.doi.org/10.1080/17538947.2013.825656.
Wu, W., De Pauw, E., Hellden, U., 2013. Assessing woody biomass in African tropical savannas by multiscale remote sensing. Int. J. Remote Sens. 34 (13), 4525–4549. http://dx.doi.org/10.1080/01431161.2013.777487.
Wu, W., 2003. Application de la géomatique au suivi de la dynamique environnementale en zones arides. PhD thesis. Université de Paris 1, Paris.
Wu, W., 2014. The generalized difference vegetation index (GDVI) for dryland characterization. Remote Sens. 6 (2), 1211–1233. http://dx.doi.org/10.3390/rs6021211.
Zhang, Y., 2001. Texture-integrated classification of urban treed areas in high-resolution color-infrared imagery. Photogramm. Eng. Remote Sens. 67 (12), 1359–1365.
Zhu, K.Z., Wan, W.M., 1963. Phenology. Science Press, Beijing. (in Chinese).
Zhuang, X., Engel, B.A., Lozano-Garcia, D.F., Fernandez, R.N., Johannsen, C.J., 1994. Optimization of training data required for neuro-classification. Int. J. Remote Sens. 15, 3271–3277.
