Abstract
Software bugs in cloud management systems often cause erratic behavior, hindering the detection of failures and the recovery from them. As a consequence, failures are not detected and reported in a timely manner, and can silently propagate through the system. To address these issues, we propose a lightweight approach to runtime verification for monitoring and failure detection in cloud computing systems. We performed a preliminary evaluation of the proposed approach on the OpenStack cloud management platform, an “off-the-shelf” distributed system, showing that the approach can be applied with high failure detection coverage.
Acknowledgements
This work has been supported by the COSMIC project, U-GOV 000010–PRD-2017-S-RUSSO_001_001.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Cotroneo, D., De Simone, L., Liguori, P., Natella, R., Scibelli, A. (2021). Towards Runtime Verification via Event Stream Processing in Cloud Computing Infrastructures. In: Hacid, H., et al. Service-Oriented Computing – ICSOC 2020 Workshops. ICSOC 2020. Lecture Notes in Computer Science, vol 12632. Springer, Cham. https://doi.org/10.1007/978-3-030-76352-7_19
DOI: https://doi.org/10.1007/978-3-030-76352-7_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-76351-0
Online ISBN: 978-3-030-76352-7
eBook Packages: Computer Science, Computer Science (R0)