A Scalable Smart Meter Data Generator Using Spark

  • Conference paper
On the Move to Meaningful Internet Systems. OTM 2017 Conferences (OTM 2017)

Abstract

Smart meters are now deployed worldwide and produce large volumes of data, so smart meter data management and analytics systems must be able to process petabytes of data. Benchmarking and testing these systems requires data sets that scale; however, large real data sets can be hard to obtain because of privacy and data protection regulations. This paper presents a scalable smart meter data generator, built on Spark, that generates realistic data sets. The generator is based on a supervised machine learning method and can produce data of any size from small seed data sets. Moreover, it preserves the characteristics of the data with respect to consumption patterns and user groups. The paper evaluates the generator in a cluster-based environment to validate its effectiveness and scalability.
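
The idea of growing a small seed data set into an arbitrarily large one while keeping per-group consumption patterns can be illustrated with a minimal sketch. This is not the paper's supervised learning method, which the abstract does not detail; it swaps in a much simpler technique (per-group hourly mean and standard deviation with Gaussian noise), and the names `fit_hourly_profile` and `generate`, the tuple layout, and the group label are all assumptions made for illustration.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev


def fit_hourly_profile(seed_readings):
    """Summarise seed data as {(group, hour): (mean_kwh, std_kwh)}.

    seed_readings is an iterable of (group, hour, kwh) tuples (assumed layout).
    """
    buckets = defaultdict(list)
    for group, hour, kwh in seed_readings:
        buckets[(group, hour)].append(kwh)
    return {key: (mean(vals), pstdev(vals)) for key, vals in buckets.items()}


def generate(profile, group, n_meters, n_days, rng=None):
    """Yield synthetic (meter_id, day, hour, kwh) rows for one user group.

    Each reading is drawn around the seed mean for that hour and clipped at
    zero, so the output size depends only on n_meters and n_days.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    hours = sorted(h for g, h in profile if g == group)
    for meter_id in range(n_meters):
        for day in range(n_days):
            for hour in hours:
                mu, sigma = profile[(group, hour)]
                yield (meter_id, day, hour, max(0.0, rng.gauss(mu, sigma)))


# Hypothetical seed: two readings each for hour 0 and hour 18.
seed = [
    ("residential", 0, 0.2), ("residential", 0, 0.4),
    ("residential", 18, 1.0), ("residential", 18, 1.4),
]
profile = fit_hourly_profile(seed)
rows = list(generate(profile, "residential", n_meters=3, n_days=2))
```

Because each meter's rows depend only on the fitted profile, the meter IDs could in principle be split across workers in a cluster, which is one plausible reason such a generator suits a framework like Spark.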

Acknowledgement

This research is supported by UCN-FOU funding (Project-6/2016-17) and the CITIES project by Danish Innovation Fund (1035-00027B).

Author information

Corresponding author

Correspondence to Nadeem Iftikhar.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Iftikhar, N., Liu, X., Danalachi, S., Nordbjerg, F.E., Vollesen, J.H. (2017). A Scalable Smart Meter Data Generator Using Spark. In: Panetto, H., et al. On the Move to Meaningful Internet Systems. OTM 2017 Conferences. OTM 2017. Lecture Notes in Computer Science, vol 10573. Springer, Cham. https://doi.org/10.1007/978-3-319-69462-7_2

  • DOI: https://doi.org/10.1007/978-3-319-69462-7_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69461-0

  • Online ISBN: 978-3-319-69462-7

  • eBook Packages: Computer Science, Computer Science (R0)
