research-article

Mitigating the impact of controller failures on QoS robustness for software-defined wide area networks

Published: 14 March 2024

Abstract

Emerging cloud services and applications impose diverse Quality of Service (QoS) requirements on the network. Software-Defined Wide Area Networks (SD-WANs) play a crucial role in QoS provisioning: they make network flows programmable, which enables dynamic flow routing and low data transmission latency for these applications. However, when a controller in an SD-WAN fails, every programmable flow it controlled goes offline and loses its programmability, which degrades QoS. Existing control recovery solutions remap offline switches or flows to the remaining active controllers, but they cannot promise good recovery performance for two reasons: (1) they either remap at a coarse granularity or introduce extra processing delays, and (2) their design does not guarantee QoS robustness. To this end, we propose Predator, a QoS-aware network programmability recovery scheme that uses the P4 Runtime supported by existing P4 switches to achieve fine-grained per-flow remapping without introducing extra delays. Specifically, Predator categorizes flows by their QoS requirements and recovers offline flows in priority order to guarantee QoS robustness for high-priority flows. Simulation results on a real-world topology show that, compared with baselines, Predator improves the recovered network programmability of high-priority flows by up to 505.5% and substantially reduces their communication overhead.
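
To make the idea of priority-ordered, per-flow remapping concrete, the following Python sketch assigns offline flows to surviving controllers greedily. It is only an illustration under stated assumptions, not Predator's actual algorithm, which the abstract does not specify: the `Flow` fields, the integer capacity model, the `delay` table, and the function name `remap_offline_flows` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    flow_id: str
    priority: int  # hypothetical scale: larger means stricter QoS requirement
    load: int      # hypothetical control load the flow places on a controller


def remap_offline_flows(flows, capacity, delay):
    """Greedy per-flow remapping (illustrative assumption, not the paper's method).

    flows:    offline Flow objects
    capacity: dict controller -> spare control capacity
    delay:    dict (flow_id, controller) -> control delay for that pairing
    Returns a dict flow_id -> controller for every flow that could be recovered.
    """
    remaining = dict(capacity)  # copy so the caller's dict is not modified
    mapping = {}
    # Serve high-priority flows first so they get the first pick of capacity.
    for flow in sorted(flows, key=lambda f: -f.priority):
        candidates = [c for c in remaining if remaining[c] >= flow.load]
        if not candidates:
            continue  # no active controller can absorb this flow; it stays offline
        # Among controllers with enough spare capacity, pick the lowest delay.
        best = min(candidates, key=lambda c: delay[(flow.flow_id, c)])
        mapping[flow.flow_id] = best
        remaining[best] -= flow.load
    return mapping
```

With two surviving controllers of capacity 2 each and three offline flows of load 2, this sketch recovers the two highest-priority flows and leaves the lowest-priority one offline. That is the trade-off the abstract describes: when capacity is scarce, programmability is restored for high-priority flows first.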

Published In

Computer Networks: The International Journal of Computer and Telecommunications Networking, Volume 238, Issue C
Jan 2024
268 pages

Publisher

Elsevier North-Holland, Inc.
United States

Publication History

Published: 14 March 2024

Author Tags

1. Software-defined wide area networks
2. Quality of Service
3. Network programmability

Qualifiers

• Research-article
