Scaling Posterior Distributions over Differently-Curated Datasets: A Bayesian-Neural-Networks Methodology

  • Conference paper
Foundations of Intelligent Systems (ISMIS 2022)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 13515))

Abstract

This paper introduces a methodology for scaling posterior distributions over differently-curated datasets. The methodology is based on Bayesian Neural Networks, enhanced with effective sampling algorithms that yield a model setup suited to improving the scaling effect. Theoretical results are presented and discussed in detail, together with a case study on stock quotation prediction that confirms the methodology's applicability to emerging big data analytics settings.

Author information

Corresponding author

Correspondence to Alfredo Cuzzocrea.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Cuzzocrea, A., Soufargi, S., Baldo, A., Fadda, E. (2022). Scaling Posterior Distributions over Differently-Curated Datasets: A Bayesian-Neural-Networks Methodology. In: Ceci, M., Flesca, S., Masciari, E., Manco, G., Raś, Z.W. (eds) Foundations of Intelligent Systems. ISMIS 2022. Lecture Notes in Computer Science, vol 13515. Springer, Cham. https://doi.org/10.1007/978-3-031-16564-1_19

  • DOI: https://doi.org/10.1007/978-3-031-16564-1_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16563-4

  • Online ISBN: 978-3-031-16564-1

  • eBook Packages: Computer Science, Computer Science (R0)