Comparison of missing value estimation techniques in rainfall data of Bangladesh

  • Original Paper
  • Theoretical and Applied Climatology

Abstract

Missing values in daily rainfall data can hamper analyses aimed at solving hydrological, agricultural, and climatological problems. This study attempts to select an appropriate method for estimating missing values in the daily rainfall data of Bangladesh. For this purpose, eight estimation methods and seven comparison techniques are employed. For each of five regions of the country, three sets of daily rainfall data (with 1, 5, and 10% missing values) are drawn at random, with 1000 repetitions. The sampled values are artificially set as missing and then imputed using the selected methods. The relative performance of the methods is examined using the comparison criteria. The following observations can be made regarding the choice of missing value estimation technique. For imputation of missing daily rainfall values, the arithmetic average method is found to be the best method for the rainfall stations Chittagong and Rajshahi, in the south-east and north-west regions, respectively. The single best estimator method is identified as the best method for the stations Sylhet and Dhaka, in the north-east and mid regions, respectively, and the EM-MCMC method for the station Khulna in the south-west region, with respect to the Kolmogorov-Smirnov test, the lowest bias of estimate, the value of the S index, etc.
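The evaluation design described in the abstract (mask a fixed fraction of observed days, impute them, compare against the held-out truth, repeat) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the rainfall series are synthetic, the function names are hypothetical, the arithmetic average is assumed here to mean the plain mean of neighbouring stations on the same day, and bias and RMSE stand in for the seven comparison criteria, which the abstract does not list in full.

```python
import numpy as np


def mask_at_random(n_days, frac, rng):
    """Pick day indices to treat as missing, uniformly and without replacement."""
    n_missing = max(1, int(round(frac * n_days)))
    return rng.choice(n_days, size=n_missing, replace=False)


def arithmetic_average(neighbours, idx):
    """Estimate the target station on days `idx` as the mean of its neighbours.

    Assumption: `neighbours` has shape (n_days, n_stations).
    """
    return neighbours[idx].mean(axis=1)


def evaluate(target, neighbours, frac, n_rep, seed=0):
    """Repeat the mask-and-impute experiment and average bias and RMSE."""
    rng = np.random.default_rng(seed)
    bias = np.empty(n_rep)
    rmse = np.empty(n_rep)
    for r in range(n_rep):
        idx = mask_at_random(len(target), frac, rng)
        err = arithmetic_average(neighbours, idx) - target[idx]
        bias[r] = err.mean()
        rmse[r] = np.sqrt((err ** 2).mean())
    return bias.mean(), rmse.mean()


# Synthetic stand-in for daily rainfall (mm): a shared regional signal with
# many dry days, plus station-level noise. Not real BMD data.
data_rng = np.random.default_rng(42)
n_days = 3650
wet = data_rng.random(n_days) < 0.4
regional = data_rng.gamma(shape=0.5, scale=10.0, size=n_days) * wet
target = np.clip(regional + data_rng.normal(0.0, 1.0, n_days), 0.0, None)
neighbours = np.clip(
    regional[:, None] + data_rng.normal(0.0, 1.0, (n_days, 3)), 0.0, None
)

# Three missingness levels, 1000 repetitions each, as in the abstract.
results = {
    frac: evaluate(target, neighbours, frac, n_rep=1000)
    for frac in (0.01, 0.05, 0.10)
}
for frac, (b, e) in results.items():
    print(f"{frac:.0%} missing: mean bias = {b:.3f}, mean RMSE = {e:.3f}")
```

Any other imputation method could be compared by swapping it in for `arithmetic_average` while keeping the same masking and scoring loop, so that all methods are judged on identical held-out days.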

Acknowledgements

This study is supported under the HEQEP sub-project, CP-3293, in the Department of Applied Statistics, East West University funded by World Bank and implemented by University Grants Commission of Bangladesh (UGC). The authors are also grateful to Bangladesh Meteorological Department (BMD) for providing the data. We acknowledge the critical comments from anonymous reviewers and editor.

Author information

Corresponding author

Correspondence to Farzana Jahan.

About this article

Cite this article

Jahan, F., Sinha, N.C., Rahman, M.M. et al. Comparison of missing value estimation techniques in rainfall data of Bangladesh. Theor Appl Climatol 136, 1115–1131 (2019). https://doi.org/10.1007/s00704-018-2537-y

  • DOI: https://doi.org/10.1007/s00704-018-2537-y