An Open-Source Workflow for Spatiotemporal Studies with COVID-19 as an Example
Abstract
1. Introduction
- Open-source software describes a free, collaborative, and interoperable approach to software development.
- Open data are data that are freely accessible, shareable, and usable.
- Open hardware includes the machine, physical devices, and build environment.
- Open standards represent the open process, including the specifications for hardware, software, and data.
- Open education denotes the transfer of knowledge without restriction.
- Open science represents the promotion of scientific research and the dissemination of its discoveries across the globe.
2. Literature Review
2.1. Available Open-Source COVID-19 Digital Resources and Their Issues
2.2. Significance of Open and Collaborative Approach
2.3. Platforms That Support Open-Source Collaboration
2.4. Challenges in Open-Source Software Development and Collaboration
2.5. Open-Source Software Development Approach
3. Methods
3.1. Spatiotemporal Application Development
3.2. Sharing and Maintenance
3.3. Reproducible Research
4. Use Case—Assess the Impact of Air Quality during COVID-19 Based on the Proposed Open-Source Spatiotemporal Workflow
4.1. Hypothesis
4.2. Open Data Sources
- Ground-based air pollutants data required for this study are obtained from the US Environmental Protection Agency (EPA). The data are downloaded from this link: https://www.epa.gov/outdoor-air-quality-data/download-daily-data (accessed on 14 December 2021).
- Tropospheric nitrogen dioxide needed for this study is obtained from GES DISC: https://disc.gsfc.nasa.gov/datasets/OMNO2d_003/summary (accessed on 14 December 2021).
- The locations of major power plants in CA are obtained from Wikipedia: https://en.wikipedia.org/wiki/List_of_power_stations_in_California (accessed on 14 December 2021).
- The locations of major wildfires are downloaded from the California Department of Forestry and Fire Protection (CAL FIRE): https://www.fire.ca.gov/incidents/2020/ (accessed on 14 December 2021).
- The locations of national highways in California are obtained from the official website of the US Census Bureau, Department of Commerce: https://catalog.data.gov/dataset/tiger-line-shapefile-2016-nation-u-s-primary-roads-national-shapefile (accessed on 14 December 2021).
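As a rough illustration of how a downloaded ground-based table might be reduced to one value per county and day, the sketch below averages monitor readings with the Python standard library only. The column names (`date`, `county`, `site_id`, `no2_ppb`) and the sample rows are hypothetical placeholders, not the actual EPA export schema or real measurements; adapt them to the headers of the file you download.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def county_daily_means(csv_text, value_column):
    """Average all monitor readings that share a (county, date) pair."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = (row.get(value_column) or "").strip()
        if not raw:
            continue  # skip rows with a missing reading
        groups[(row["county"], row["date"])].append(float(raw))
    return {key: mean(values) for key, values in sorted(groups.items())}


# Made-up rows for illustration only; column names are assumptions.
SAMPLE = """date,county,site_id,no2_ppb
2020-03-01,Los Angeles,A,20.0
2020-03-01,Los Angeles,B,30.0
2020-03-01,Alameda,C,10.0
2020-03-02,Los Angeles,A,
"""

if __name__ == "__main__":
    for (county, date), value in county_daily_means(SAMPLE, "no2_ppb").items():
        print(county, date, round(value, 2))
```

Rows with an empty reading are dropped rather than treated as zero, so a day with no valid monitor data simply does not appear in the result.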
4.3. Spatiotemporal Data
4.4. Spatiotemporal Analytical Tool
4.5. Results/Visualization
4.6. Sharing and Maintenance
- The “Air Pollutants Data” folder contains “CA ground-based air pollution data” for carbon monoxide, ozone, nitrogen dioxide, PM10, PM2.5, and sulfur dioxide, and “Satellite-based NO2 data” for the 2020 study period.
- The “Air Quality Analytical Tool” folder contains the Python script (OMI_static_ca.py), which calculates the daily mean of nitrogen dioxide and the difference between the two study periods.
- The “Air Quality Results” folder contains a spreadsheet named california_counties_COVID_env_data.xlsx, which holds the daily average concentration of each pollutant.
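The kind of calculation attributed to the analytical tool (a mean over daily values, then a difference between two study periods) can be sketched as follows. This is not the contents of OMI_static_ca.py; it is a minimal illustration that assumes each day is a small 2-D grid stored as nested lists, with `None` marking missing cells. The function names and the toy values are hypothetical.

```python
from statistics import mean


def cell_means(daily_grids):
    """Per-cell mean over a list of daily 2-D grids, ignoring None cells."""
    rows, cols = len(daily_grids[0]), len(daily_grids[0][0])
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            valid = [g[r][c] for g in daily_grids if g[r][c] is not None]
            out_row.append(mean(valid) if valid else None)
        result.append(out_row)
    return result


def period_difference(period_a, period_b):
    """Mean of period_b minus mean of period_a, cell by cell."""
    mean_a, mean_b = cell_means(period_a), cell_means(period_b)
    return [
        [None if a is None or b is None else b - a for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(mean_a, mean_b)
    ]


if __name__ == "__main__":
    # Toy 1x2 grids (two days per period); values are illustrative only.
    before = [[[4.0, None]], [[6.0, 2.0]]]
    during = [[[3.0, None]], [[1.0, None]]]
    print(period_difference(before, during))
```

A cell that has no valid observation in either period yields `None` in the difference, which keeps data gaps visible instead of reporting a spurious change.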
4.7. User Guide and Video
- Specify the required Python packages.
- Provide instructions on how to set up the virtual environment.
- Install the Python packages using pip.
- Show where to obtain the input datasets.
- Execute the analytical tool and obtain the results.
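The setup steps listed above could be scripted roughly as shown here. The sketch only builds the command lines for creating a virtual environment, installing packages with pip, and running the tool; it executes nothing unless `run_commands` is called. The environment directory name and the `requirements.txt` file are assumptions for illustration; only the script name OMI_static_ca.py comes from the repository description.

```python
import os
import subprocess
import sys
from pathlib import Path


def build_setup_commands(env_dir="venv",
                         requirements="requirements.txt",
                         script="OMI_static_ca.py"):
    """Return the argv lists for: create venv, pip install, run the tool."""
    env_dir = Path(env_dir)
    # The interpreter lives in Scripts\ on Windows and bin/ elsewhere.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    env_python = str(env_dir / bin_dir / exe)
    return [
        [sys.executable, "-m", "venv", str(env_dir)],
        [env_python, "-m", "pip", "install", "-r", requirements],
        [env_python, script],
    ]


def run_commands(commands):
    """Execute each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in build_setup_commands():
        print(" ".join(command))
```

Calling pip through the environment's own interpreter (`python -m pip`) avoids installing into the system Python by accident when the environment has not been activated.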
5. Traffic and Commit Statistics
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- World Health Organization—Interactive Timeline. Available online: https://www.who.int/emergencies/diseases/novel-coronavirus-2019/interactive-timeline (accessed on 10 May 2021).
- Brovelli, M.; Ilie, C.M.; Coetzee, S. Openness and community geospatial science for monitoring SDGs—An example from Tanzania. In Sustainable Development Goals Connectivity Dilemma: Land and Geospatial Information for Urban and Rural Resilience; CRC Press: Boca Raton, FL, USA, 2019; pp. 313–324.
- Coetzee, S.; Ivánová, I.; Mitasova, H.; Brovelli, M.A. Open geospatial software and data: A review of the current state and a perspective into the future. ISPRS Int. J. Geo-Inf. 2020, 9, 90.
- Open Definition. Open Knowledge Foundation. Available online: http://opendefinition.org/ (accessed on 19 April 2021).
- Stallman, R. Free Software, Free Society: Selected Essays of Richard M. Stallman; Lulu.com: Morrisville, NC, USA, 2002.
- Steiniger, S.; Hunter, A.J.S. The 2012 free and open source GIS software map–A guide to facilitate research, development, and adoption. Comput. Environ. Urban Syst. 2013, 39, 136–150.
- Stallman, R. Viewpoint: Why open source misses the point of free software. Commun. ACM 2009, 52, 31–33.
- The Open Source Geospatial Foundation. Available online: https://www.osgeo.org/ (accessed on 19 April 2021).
- What Is Open Source? Available online: https://www.osgeo.org/about/what-is-open-source/ (accessed on 19 April 2021).
- Brovelli, M.A.; Minghini, M.; Moreno-Sanchez, R.; Oliveira, R. Free and open source software for geospatial applications (FOSS4G) to support Future Earth. Int. J. Digit. Earth 2017, 10, 386–404.
- Hall, G.B.; Leahy, M.G. Open Source Approaches in Spatial Data Handling, 2nd ed.; Springer: Berlin, Germany, 2008.
- Rao, K.V.; Govardhan, A.; Rao, K.C. Spatiotemporal data mining: Issues, tasks and applications. Int. J. Comput. Sci. Eng. Surv. 2012, 3, 39.
- Brunsdon, C.; Comber, A. Opening practice: Supporting reproducibility and critical spatial data science. J. Geogr. Syst. 2020, 23, 477–496.
- Shu, Y.; McCauley, J. GISAID: Global initiative on sharing all influenza data–from vision to reality. Eurosurveillance 2017, 22, 30494.
- Frazer, J.S.; Shard, A.; Herdman, J. Involvement of the open-source community in combating the worldwide COVID-19 pandemic: A review. J. Med. Eng. Technol. 2020, 44, 169–176.
- Hu, T.; Guan, W.W.; Zhu, X.; Shao, Y.; Liu, L.; Du, J.; Liu, H.; Zhou, H.; Wang, J.; She, B.; et al. Building an Open Resources Repositories for COVID-19 Research. Data Inf. Manag. 2020, 3, 130–147.
- Alamo, T.; Reina, D.G.; Mammarella, M.; Abella, A. Open data resources for fighting covid-19. arXiv preprint 2020, arXiv:2004.06111.
- Coronavirus Resource Center. Johns Hopkins University of Medicine. Available online: https://coronavirus.jhu.edu/map.html (accessed on 19 April 2021).
- COVID-19 Government Response Tracker. University of Oxford. Available online: https://www.bsg.ox.ac.uk/research/research-projects/covid-19-government-response-tracker (accessed on 19 April 2021).
- Harvard Dataverse. Available online: https://dataverse.harvard.edu/dataverse/2019ncov (accessed on 19 April 2021).
- Shuja, J.; Alanazi, E.; Alasmary, W.; Alashaikh, A. COVID-19 Datasets: A survey and Future Challenges. Development 2020, 11, 12.
- Cohen, J.P.; Morrison, P.; Dao, L.; Roth, K.; Duong, T.Q.; Ghassemi, M. COVID-19 image data collection: Prospective predictions are the future. arXiv preprint 2020, arXiv:2006.11988.
- Dong, E.; Du, H.; Gardner, L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect. Dis. 2020, 20, 533–534.
- Chen, E.; Lerman, K.; Ferrara, E. Tracking social media discourse about the covid-19 pandemic: Development of a public coronavirus twitter data set. JMIR Public Health Surveill. 2020, 6, e19273.
- Wang, L.L.; Lo, K.; Chandrasekhar, Y.; Reas, R.; Yang, J.; Eide, D.; Funk, K.; Kinney, R.M.; Liu, Z.; Merrill, W.; et al. CORD-19: The Covid-19 Open Research Dataset. arXiv 2020, arXiv:2004.10706v2.
- Liu, Q.; Liu, W.; Sha, D.; Kumar, S.; Chang, E.; Arora, V.; Lan, H.; Li, Y.; Wang, Z.; Zhang, Y.; et al. An environmental data collection for COVID-19 pandemic research. Data 2020, 5, 68.
- Marivate, V.; Combrink, H.M. Use of available data to inform the COVID-19 outbreak in South Africa: A case study. arXiv preprint 2020, arXiv:2004.04813.
- Wang, L.; Li, R.; Zhu, J.; Bai, G.; Wang, H. When the Open Source Community Meets COVID-19: Characterizing COVID-19 themed GitHub Repositories. arXiv preprint 2020, arXiv:2010.12218.
- Singleton, A.D.; Spielman, S.; Brunsdon, C. Establishing a framework for Open Geographic Information science. Int. J. Geogr. Inf. Sci. 2016, 30, 1507–1521.
- Dabbish, L.; Stuart, C.; Tsay, J.; Herbsleb, J. Social Coding in GitHub: Transparency and Collaboration in an Open Software Repository. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, New York, NY, USA, 11–15 February 2012.
- Zagalsky, A.; Feliciano, J.; Storey, M.-A.; Zhao, Y.; Wang, W. The emergence of github as a collaborative platform for education. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing, Vancouver, BC, Canada, 14–18 March 2015.
- Kemper, C.; Oxley, I. Foundation Version Control for Web Developers; APress: New York, NY, USA, 2012.
- Erenkrantz, J.R. Release management within open source projects. In Proceedings of the 3rd Workshop on Open Source Software Engineering, IEEE Computer Society, Portland, OR, USA, 3–10 May 2003.
- Stol, K.-J.; Babar, M.A. Challenges in using open source software in product development: A review of the literature. In Proceedings of the 3rd International Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development, Cape Town, South Africa, 8 May 2010.
- Ankolekar, A.; Herbsleb, J.D.; Sycara, K. Addressing challenges to open source collaboration with the semantic web. In Proceedings of the 3rd Workshop on Open Source Software Engineering, the 25th International Conference on Software Engineering (ICSE), IEEE Computer Society, Portland, OR, USA, 3–10 May 2003.
- Scacchi, W. Understanding the requirements for developing open source software systems. IEEE Proc.-Softw. 2002, 149, 24–39.
- Mockus, A.; Fielding, R.T.; Herbsleb, J. A case study of open source software development: The Apache server. In Proceedings of the 22nd International Conference on Software Engineering, Limerick, Ireland, 4–11 June 2000.
- Kon, F.; Meirelles, P.; Lago, N.; Terceiro, A.; Chavez, C.; Mendonça, M. Free and open source software development and research: Opportunities for software engineering. In Proceedings of the 2011 25th Brazilian Symposium on Software Engineering, Sao Paulo, Brazil, 28–30 September 2011.
- German, D.M. The GNOME project: A case study of open source, global software development. Softw. Process Improv. Pract. 2003, 8, 201–215.
- Dinh-Trong, T.; Bieman, J.M. Open source software development: A case study of FreeBSD. In Proceedings of the 10th International Symposium on Software Metrics, Chicago, IL, USA, 11–17 September 2004.
- Mitasova, H.; Neteler, M. Freedom in geoinformation science and software development: A GRASS GIS contribution. In Proceedings of the Open Source Free Software GIS-GRASS Users Conference, Trento, Italy, 11–13 September 2002.
- Allan, J.; Carbonell, J.G.; Doddington, G.; Yamron, J.; Yang, Y. Topic Detection and Tracking Pilot Study Final Report; Carnegie Mellon University: Pittsburgh, PA, USA, 1998.
- Yu, M.; Bambacus, M.; Cervone, G.; Clarke, K.; Duffy, D.; Huang, Q.; Li, J.; Li, W.; Li, Z.; Liu, Q.; et al. Spatiotemporal event detection: A review. Int. J. Digit. Earth 2020, 13, 1339–1365.
- Tryfona, N.; Price, R.; Jensen, C.S. Chapter 3: Conceptual Models for Spatio-temporal Applications. In Spatio-Temporal Databases: The CHOROCHRONOS Approach; Sellis, T.K., Koubarakis, M., Frank, A., Grumbach, S., Güting, R.H., Jensen, C., Lorentzos, N.A., Manolopoulos, Y., Nardelli, E., Pernici, B., et al., Eds.; Springer: Berlin/Heidelberg, Germany, 2003; pp. 79–116.
- Pfoser, D.; Tryfona, N. Requirements, Definitions, and Notations for Spatiotemporal Application Environments. In Proceedings of the 6th ACM International Symposium on Advances in Geographic Information Systems, New York, NY, USA, 2–7 November 1998.
- Peuquet, D. Time in GIS and Geographical Databases. Geogr. Inf. Syst. 2005, 1, 91–103.
- Yang, C.; Clarke, K.; Shekhar, S.; Tao, C.V. Big Spatiotemporal Data Analytics: A research and innovation frontier. Int. J. Geogr. Inf. Sci. 2020, 34, 1075–1088.
- Shekhar, S.; Jiang, Z.; Ali, R.Y.; Eftelioglu, E.; Tang, X.; Gunturi, V.M.V.; Zhou, X. Spatiotemporal Data Mining: A Computational Perspective. ISPRS Int. J. Geo-Inf. 2015, 4, 2306–2338.
- Fagan, M.E. Design and code inspections to reduce errors in program development. IBM Syst. J. 1999, 38, 258–287.
- Peng, R.D. Reproducible research in computational science. Science 2011, 334, 1226–1227.
- Tobler, W.R. A Computer Movie Simulating Urban Growth in the Detroit Region. Econ. Geogr. 1970, 46, 234–240.
- Benureau, F.C.Y.; Rougier, N.P. Re-run, Repeat, Reproduce, Reuse, Replicate: Transforming Code into Scientific Contributions. Front. Neuroinform. 2018, 11, 69.
- Liu, Q.; Harris, J.T.; Chiu, L.S.; Sun, D.; Houser, P.R.; Yu, M.; Duffy, D.Q.; Little, M.M.; Yang, C. Spatiotemporal impacts of COVID-19 on air pollution in California, USA. Sci. Total Environ. 2021, 750, 141592.
Free and Open-Source Category | Function | Examples |
---|---|---|
Desktop GIS | Organizing, analyzing, and visualizing spatial data | GRASS, QGIS, gvSIG, uDig, OpenJUMP |
Web GIS services | Web-based GIS services including searching, retrieving, and visualizing | GeoServer, MapServer, QGIS Server, MapCache, deegree |
Geospatial libraries | Accessing, analyzing, and processing spatial data | GDAL/OGR, GeoTools, GEOS, PROJ4 |
Spatial Data Storage | Database management system for spatial data | PostgreSQL/PostGIS, SpatiaLite, SQLite, MySQL Spatial, MongoDB |
Geovisualization | Interactive visualization to support spatial analysis | OpenLayers, Leaflet, Cesium, WebGL Earth, OpenWebGlobe |
Platform/Language | Provides tools to support handling and analysis of spatiotemporal data | R (Rspatial); Python (GeoPython, GeoPandas, PySAL, landlab); JavaScript (Leaflet, D3, MapBox, NodeJS) |
Study | COVID-19 Datasets | Data Type | Application | Link |
---|---|---|---|---|
[22] | CT scans and X-ray | Image | COVID-19 Diagnosis and infected area segmentation | https://github.com/ieee8023/COVID-chestxray-dataset (accessed on 14 December 2021) |
[23] | Daily cases | Textual | Reporting and visualizing | https://github.com/CSSEGISandData/COVID-19 (accessed on 14 December 2021) |
[24] | Tweets | Textual | Conversation dynamics | https://github.com/echen102/COVID-19-TweetIDs (accessed on 14 December 2021) |
[25] | Scholarly articles | Textual | Reporting | https://www.semanticscholar.org/cord19/download (accessed on 14 December 2021) |
[26] | Environmental factors | Textual | Reporting | https://github.com/stccenter/COVID-19-Data/tree/master/Environmental%20factors (accessed on 14 December 2021) |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Malarvizhi, A.S.; Liu, Q.; Sha, D.; Lan, H.; Yang, C. An Open-Source Workflow for Spatiotemporal Studies with COVID-19 as an Example. ISPRS Int. J. Geo-Inf. 2022, 11, 13. https://doi.org/10.3390/ijgi11010013