Abstract
Predictive mapping of the environment is an important means of environmental assessment and management. Selecting predictor variables (environmental covariates) is the first and a key step in predictive mapping. A number of machine learning and statistical models have been developed to determine which, and how many, environmental covariates to use across a wide range of predictive mapping applications. Nevertheless, those models require a large amount of field data for training and calibration, which is problematic in areas where little or no field data is available. To overcome this shortcoming, this paper proposes the most similar case method for selecting environmental covariates for predictive mapping. First, we describe the basic idea and the development procedure of the most similar case method; second, as an experimental test, we use the proposed method to select topographic covariates as inputs to predictive soil mapping; third, we evaluate the effectiveness of the proposed method in the designed experiment using leave-one-out cross-validation. In total, 191 evaluation cases are included in the experimental case base, and the test results show that 58.7% of the topographic covariates originally used in each evaluation case are correctly selected by the proposed method, which suggests that the most similar case method performs reasonably well even with a relatively small case base. Future work should include the selection of other types of environmental covariates (e.g., climate and organisms) and the development of an automatic method for extracting existing application cases from the literature.
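The idea summarised above can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the `Case` fields, the numeric `features`, the `similarity` function (one minus the mean absolute feature difference), and the three toy cases are all assumptions introduced here for illustration. The sketch retrieves the single most similar case from a case base, reuses its covariates, and scores the result with leave-one-out cross-validation as the share of each held-out case's original covariates that were recovered.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """One past mapping application (all fields are hypothetical)."""
    name: str
    features: tuple[float, ...]   # assumed normalised descriptors of the application
    covariates: frozenset[str]    # covariates that application actually used


def similarity(a: Case, b: Case) -> float:
    """Assumed similarity: 1 minus the mean absolute feature difference."""
    diffs = [abs(x - y) for x, y in zip(a.features, b.features)]
    return 1.0 - sum(diffs) / len(diffs)


def select_covariates(query: Case, case_base: list[Case]) -> frozenset[str]:
    """Return the covariates of the case most similar to the query."""
    best = max(case_base, key=lambda c: similarity(query, c))
    return best.covariates


def leave_one_out(case_base: list[Case]) -> float:
    """Mean share of each case's original covariates that are recovered."""
    scores = []
    for i, held_out in enumerate(case_base):
        rest = case_base[:i] + case_base[i + 1:]   # case base without the held-out case
        chosen = select_covariates(held_out, rest)
        scores.append(len(chosen & held_out.covariates) / len(held_out.covariates))
    return sum(scores) / len(scores)


# Toy case base; the values are invented purely for illustration.
CASES = [
    Case("A", (0.1, 0.2), frozenset({"slope", "elevation"})),
    Case("B", (0.2, 0.2), frozenset({"slope", "curvature"})),
    Case("C", (0.9, 0.8), frozenset({"elevation", "twi"})),
]

if __name__ == "__main__":
    print(f"leave-one-out share recovered: {leave_one_out(CASES):.3f}")
```

Reusing only the single best match keeps the sketch close to the "most similar case" wording; how the paper formalises cases and measures their similarity is not stated in the abstract, so those parts should be read as placeholders.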
Acknowledgements
The work reported here was supported by grants from the National Natural Science Foundation of China (Project Nos. 41431177, 41871300) and the National Key R&D Program of China (No. 2016YFC0500205). We are grateful for support from PAPD and from the Outstanding Innovation Team in Colleges and Universities in Jiangsu Province. Support for A-Xing Zhu through the Vilas Associate Award, the Hammel Faculty Fellow Award, and the Manasse Chair Professorship from the University of Wisconsin-Madison is greatly appreciated.
Additional information
Communicated by: H. Babaie
Electronic supplementary material
ESM 1
(DOCX 26 kb)
About this article
Cite this article
Liang, P., Qin, CZ., Zhu, AX. et al. Using the most similar case method to automatically select environmental covariates for predictive mapping. Earth Sci Inform 13, 719–728 (2020). https://doi.org/10.1007/s12145-020-00466-5
DOI: https://doi.org/10.1007/s12145-020-00466-5