DOI: 10.1145/3557915.3561475
research-article
Open access

A co-training approach for spatial data disaggregation

Published: 22 November 2022

    Abstract

    Socio-demographic information is usually accessible only at relatively coarse spatial resolutions. However, its availability at finer granularities is of substantial interest to many stakeholders, since it supports the formulation of informed hypotheses on the distribution of population indicators. Spatial disaggregation methods aim to compute these fine-grained estimates, often using regression algorithms that employ ancillary data to redistribute the aggregated information. However, since disaggregation tasks are ill-posed, and since examples of disaggregated data at the target geospatial resolution are seldom available, model training is particularly challenging. We propose to address this problem through a self-supervision framework that iteratively refines initial estimates from seminal disaggregation heuristics. Specifically, we propose to co-train two different models, using the results from one model to train and refine the other. By doing so, we are able to exploit complementary views of the data. We assessed the use of co-training with a fast regressor based on random forests that takes individual raster cells as input, together with a more expressive model, based on a fully-convolutional neural network, that takes raster patches as input. We also compared co-training against self-training with a single model. In experiments involving the disaggregation of a socio-demographic variable collected for Continental Portugal, our co-training approach outperforms alternative disaggregation approaches, including methods based on self-training or on co-training with two similar fully-convolutional models. Co-training is effective at exploiting the characteristics of both regression algorithms, leading to a consistent improvement across different types of error metrics.
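    The co-training loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a one-dimensional strip of cells instead of a raster, swaps in two simple least-squares regressors for the random forest and the fully-convolutional network, and assumes that estimates are rescaled to the known zone totals after every prediction step. All function names (`rescale`, `fit_line`, `patch_means`, `co_train`) are hypothetical.

```python
from collections import defaultdict

def rescale(est, zones, totals):
    """Mass-preserving step: scale cells so each zone sums to its known total."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, z in zip(est, zones):
        sums[z] += v
        counts[z] += 1
    out = []
    for v, z in zip(est, zones):
        if sums[z] > 0:
            out.append(v * totals[z] / sums[z])
        else:  # no signal in this zone: fall back to a uniform split
            out.append(totals[z] / counts[z])
    return out

def fit_line(x, y):
    """Least-squares fit of y ~ a + b*x (stand-in learner, an assumption)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    var = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / var if var else 0.0
    a = my - b * mx
    return lambda xi: max(0.0, a + b * xi)  # estimates are clipped at zero

def patch_means(x, radius=1):
    """Mean of each cell's neighbourhood: a crude 'patch' view of the covariate."""
    out = []
    for i in range(len(x)):
        window = x[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out

def co_train(ancillary, zones, totals, iterations=5):
    cell_view = ancillary                # view used by the per-cell model
    patch_view = patch_means(ancillary)  # view used by the patch model
    # Initial heuristic: spread each zone total uniformly over its cells.
    targets = rescale([1.0] * len(zones), zones, totals)
    for _ in range(iterations):
        # Model A learns from the current targets (model B's last output).
        model_a = fit_line(cell_view, targets)
        targets = rescale([model_a(v) for v in cell_view], zones, totals)
        # Model B learns from model A's rescaled output.
        model_b = fit_line(patch_view, targets)
        targets = rescale([model_b(v) for v in patch_view], zones, totals)
    return targets
```

    For example, `co_train([0.0, 0.1, 0.9, 1.0, 0.2, 0.8], ["A", "A", "A", "B", "B", "B"], {"A": 30.0, "B": 60.0})` returns six non-negative cell estimates whose sums per zone match the source totals. Rescaling after every step is what keeps the pseudo-labels consistent with the aggregated data, whichever model produced them.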

    Cited By

    • (2024) A systematic review of spatial disaggregation methods for climate action planning. Energy and AI 17 (100386). DOI: 10.1016/j.egyai.2024.100386. Online publication date: Sep-2024

    Published In

    SIGSPATIAL '22: Proceedings of the 30th International Conference on Advances in Geographic Information Systems
    November 2022
    806 pages
    ISBN:9781450395298
    DOI:10.1145/3557915
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. co-training
    2. convolutional neural networks
    3. dasymetric disaggregation
    4. deep learning
    5. encoder-decoder neural networks
    6. geospatial data disaggregation
    7. self-supervised learning

    Qualifiers

    • Research-article

    Funding Sources

    • Fundação para a Ciência e Tecnologia (FCT)
    • Thales Portugal
    • EU H2020 research and innovation program

    Conference

    SIGSPATIAL '22
    Acceptance Rates

    Overall Acceptance Rate 220 of 1,116 submissions, 20%

    Article Metrics

    • Downloads (Last 12 months): 120
    • Downloads (Last 6 weeks): 27
    Reflects downloads up to 28 Jul 2024
