
Efficiency-First Fault-Tolerant Replica Scheduling Strategy for Reliability Constrained Cloud Application

  • Conference paper
Network and Parallel Computing (NPC 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 13152)


Abstract

Assuring reliability requirements is an important prerequisite for executing applications in the cloud. Although replica management can improve application reliability, it also introduces resource waste and overhead. We therefore propose the efficiency-first fault-tolerant algorithm (EFFT), which minimizes the execution cost of a cloud application under reliability constraints and addresses the excessive overhead caused by too many replicas. The EFFT algorithm is divided into two stages: initial allocation and dynamic adjustment. In the initial allocation stage, a sorting rule is defined to determine the priority of tasks and instances. In the adjustment stage, an actual efficiency ratio indicator is defined to measure the cost-effectiveness of an instance, which lets the EFFT algorithm trade off cost against reliability in order to minimize execution cost. We run the algorithm on randomly generated parallel applications of different scales and compare the experimental results with four advanced algorithms. The experiments show that the proposed algorithm outperforms the other algorithms in terms of execution cost and fault tolerance.
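The abstract does not give the sorting rule or the definition of the actual efficiency ratio, so the following is only a minimal sketch of the general idea of cost-aware replica selection under a reliability target, not the EFFT algorithm itself. It assumes that replica failures are independent, so a task with replicas of reliability r_i succeeds with probability 1 - ∏(1 - r_i), and it uses a hypothetical efficiency ratio of reliability divided by cost. The function names and the (name, reliability, cost) tuple layout are illustrative assumptions.

```python
from math import prod


def replicated_reliability(reliabilities):
    """Probability that at least one replica succeeds.

    Assumption (not from the paper): replicas fail independently.
    """
    return 1.0 - prod(1.0 - r for r in reliabilities)


def greedy_replicas(instances, target):
    """Add replicas in order of a hypothetical efficiency ratio until the
    reliability target is met.

    instances: list of (name, reliability, cost) tuples (illustrative layout).
    target:    required task reliability in [0, 1].
    Returns (chosen_instances, achieved_reliability, total_cost).
    """
    # Hypothetical ratio: reliability per unit cost, highest first.
    ranked = sorted(instances, key=lambda inst: inst[1] / inst[2], reverse=True)

    chosen = []
    for inst in ranked:
        # Stop as soon as the replicas chosen so far satisfy the target.
        if replicated_reliability([r for _, r, _ in chosen]) >= target:
            break
        chosen.append(inst)

    achieved = replicated_reliability([r for _, r, _ in chosen])
    if achieved < target:
        raise ValueError("reliability target cannot be met with these instances")

    total_cost = sum(c for _, _, c in chosen)
    return chosen, achieved, total_cost


if __name__ == "__main__":
    pool = [("a", 0.90, 1.0), ("b", 0.95, 2.0), ("c", 0.80, 0.5)]
    chosen, achieved, cost = greedy_replicas(pool, target=0.97)
    print([name for name, _, _ in chosen], round(achieved, 4), cost)
```

In this sketch a cheaper, less reliable instance can be preferred over a more reliable but expensive one, which illustrates why a cost-effectiveness measure can reduce the number and cost of replicas compared with always picking the most reliable instance.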



Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61772200) and the Shanghai Natural Science Foundation (No. 21ZR1416300).

Author information

Corresponding author

Correspondence to Guisheng Fan.

Copyright information

© 2022 IFIP International Federation for Information Processing

About this paper

Cite this paper

Zhang, Y., Fan, G., Yu, H., Chen, X. (2022). Efficiency-First Fault-Tolerant Replica Scheduling Strategy for Reliability Constrained Cloud Application. In: Cérin, C., Qian, D., Gaudiot, JL., Tan, G., Zuckerman, S. (eds) Network and Parallel Computing. NPC 2021. Lecture Notes in Computer Science, vol 13152. Springer, Cham. https://doi.org/10.1007/978-3-030-93571-9_11

  • DOI: https://doi.org/10.1007/978-3-030-93571-9_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93570-2

  • Online ISBN: 978-3-030-93571-9

  • eBook Packages: Computer Science, Computer Science (R0)
