DOI: 10.1145/3543507.3583862
research-article
Open access

Interpreting wealth distribution via poverty map inference using multimodal data

Published: 30 April 2023

Abstract

Poverty maps are essential tools for governments and NGOs to track socioeconomic changes and adequately allocate infrastructure and services to places in need. Sensor and online crowd-sourced data combined with machine learning methods have provided a recent breakthrough in poverty map inference. However, these methods do not capture local wealth fluctuations, and are not optimized to produce accountable results that guarantee accurate predictions for all sub-populations. Here, we propose a pipeline of machine learning models to infer the mean and standard deviation of wealth across multiple geographically clustered populated places, and illustrate their performance in Sierra Leone and Uganda. These models leverage seven independent and freely available feature sources based on satellite images, and metadata collected via online crowd-sourcing and social media. Our models show that combined metadata features are the best predictors of wealth in rural areas, outperforming image-based models, which are the best for predicting the highest wealth quintiles. Our results recover the local mean and variation of wealth, and correctly capture the positive yet non-monotonic correlation between them. We further demonstrate the capabilities and limitations of model transfer across countries and the effects of data recency and other biases. Our methodology provides open tools to build towards more transparent and interpretable models to help governments and NGOs make informed decisions based on data availability, urbanization level, and poverty thresholds.
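
As a rough illustration of the two quantities the abstract says the models infer, the sketch below aggregates household-level wealth scores into a mean and a standard deviation per geographic cluster. It is not the authors' pipeline: the function name `cluster_wealth_targets`, the toy data, and the choice of the population standard deviation are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def cluster_wealth_targets(households):
    """Group (cluster_id, wealth_score) pairs and return {cluster_id: (mean, std)}.

    Assumption: std is the population standard deviation (pstdev); the
    abstract does not say which estimator is used.
    """
    by_cluster = defaultdict(list)
    for cluster_id, wealth in households:
        by_cluster[cluster_id].append(wealth)
    return {cid: (mean(vals), pstdev(vals)) for cid, vals in by_cluster.items()}


# Hypothetical toy data: two clusters with made-up wealth scores.
households = [("A", 10.0), ("A", 20.0), ("A", 30.0), ("B", 50.0), ("B", 70.0)]
targets = cluster_wealth_targets(households)

for cid, (mu, sigma) in sorted(targets.items()):
    print(f"cluster {cid}: mean={mu:.2f}, std={sigma:.2f}")
```

In a setting like the one described, each cluster's (mean, std) pair would serve as the regression target, with the satellite-image and metadata features as inputs.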

Cited By

  • (2024) The Opportunities, Limitations, and Challenges in Using Machine Learning Technologies for Humanitarian Work and Development. Advances in Complex Systems 27:03. DOI: 10.1142/S0219525924400022. Online publication date: 3-May-2024.

Published In

WWW '23: Proceedings of the ACM Web Conference 2023
April 2023
4293 pages
ISBN:9781450394161
DOI:10.1145/3543507
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. deep learning
  2. high-resolution spatial inference
  3. machine learning
  4. online crowd-sourced data
  5. poverty maps
  6. satellite images

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • SoBigData++ H2020

Conference

WWW '23: The ACM Web Conference 2023
April 30 - May 4, 2023
Austin, TX, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

  • Downloads (Last 12 months): 424
  • Downloads (Last 6 weeks): 58
Reflects downloads up to 01 Jan 2025.
