Robust Network Supercomputing with Malicious Processes

  • Conference paper
Distributed Computing (DISC 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4167)

Abstract

Internet supercomputing is becoming a powerful tool for harnessing massive amounts of computational resources. However, in typical master-worker settings the reliability of the computation crucially depends on the ability of the master to rely on the results returned by the workers. Fernandez, Georgiou, Lopez, and Santos [12,13] considered a system consisting of a master process and a collection of worker processes that can execute tasks on behalf of the master and that may act maliciously by deliberately returning fallacious results. The master decides on the correctness of the results by assigning the same task to several workers. The master is charged one work unit for each task performed by a worker. The goal is to design an algorithm that enables the master to determine the correct result with high probability and at the least possible cost. Fernandez et al. assume that the number of faulty processes, or the probability of a process acting maliciously, is known to the master. In this paper this assumption is removed. In the setting with n processes and n tasks we consider two different failure models: model \({\mathcal F}_a\), where an f-fraction, \(0 < f < \frac{1}{2}\), of the workers provide faulty results with probability \(0 < p < \frac{1}{2}\), and the master has no a priori knowledge of the values of p and f; and model \({\mathcal F}_b\), where at most an f-fraction, \(0 < f < \frac{1}{2}\), of the workers can reply with arbitrary results and the rest reply with incorrect results with probability p, \(0 < p < \frac{1}{2}\), but the master knows the values of f and p. For model \({\mathcal F}_a\) we provide an algorithm, based on the Stopping Rule Algorithm by Dagum, Karp, Luby, and Ross [10], that can estimate f and p with (ε,δ)-approximation, for any 0 < δ < 1 and ε > 0. This algorithm runs in \(O(\log n)\) time, with \(O(\log^2 n)\) message complexity, \(O(\log^2 n)\) task-oriented work, and \(O(n \log n)\) total-work complexity.
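For orientation, the generic Stopping Rule Algorithm of [10] can be sketched as follows. This is a minimal illustration of the general estimator for the mean of a [0, 1]-valued random variable, not the paper's distributed procedure for estimating f and p, which the abstract does not spell out. The function name `stopping_rule_estimate` and the `sample` callback are hypothetical, and the constants follow the commonly stated form of the rule, which is usually given for 0 < ε < 1.

```python
import math
import random


def stopping_rule_estimate(sample, eps, delta):
    """Estimate the mean of a [0, 1]-valued random variable.

    `sample` is a hypothetical zero-argument callable returning one
    independent draw in [0, 1]. Returns (estimate, samples_used).
    The loop does not terminate if the true mean is 0.
    """
    if eps <= 0 or not (0 < delta < 1):
        raise ValueError("need eps > 0 and 0 < delta < 1")
    # Threshold constants of the stopping rule (as commonly stated).
    upsilon = 4 * (math.e - 2) * math.log(2 / delta) / eps ** 2
    upsilon1 = 1 + (1 + eps) * upsilon
    total = 0.0
    n = 0
    # Keep sampling until the running sum reaches the threshold.
    while total < upsilon1:
        total += sample()
        n += 1
    return upsilon1 / n, n


if __name__ == "__main__":
    rng = random.Random(2006)
    # Assumed toy setting: a reply is incorrect with probability 0.3.
    estimate, used = stopping_rule_estimate(
        lambda: 1.0 if rng.random() < 0.3 else 0.0, eps=0.1, delta=0.05
    )
    print(estimate, used)
```

The number of samples adapts to the unknown mean: the smaller the mean, the more draws are needed before the running sum crosses the threshold, which is why no prior knowledge of the estimated quantity is required.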
We also provide a randomized algorithm for detecting the faulty processes, i.e., identifying the processes that have non-zero probability of failure in model \({\mathcal F}_a\), with task-oriented work \(O(n)\) and time \(O(\log n)\). A lower bound on the total-work complexity of performing n tasks correctly with high probability is shown. Finally, two randomized algorithms to perform n tasks with high probability are given for both failure models, with closely matching upper bounds on total-work and task-oriented work complexities, and time \(O(\log n)\).
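The trade-off between reliability and cost when the master assigns the same task to several workers can be illustrated with a simple calculation. The sketch below is not one of the paper's algorithms. It assumes a simplified model in which each of k workers independently returns an incorrect result with the same probability p < 1/2, and it pessimistically counts the vote as failed whenever more than half of the replies are incorrect. The function names are hypothetical.

```python
from math import comb


def majority_error_probability(k, p):
    """Probability that more than half of k independent replies are wrong.

    Assumes each reply is wrong independently with probability p and that
    k is odd, so ties cannot occur.
    """
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    if not 0 <= p <= 1:
        raise ValueError("p must lie in [0, 1]")
    need = k // 2 + 1  # smallest number of wrong replies that wins the vote
    return sum(
        comb(k, i) * p ** i * (1 - p) ** (k - i) for i in range(need, k + 1)
    )


def replicas_needed(p, target, max_k=1001):
    """Smallest odd k with majority error at most `target`, or None.

    Since the master is charged one work unit per task execution, k is
    also the work spent on this single task.
    """
    for k in range(1, max_k + 1, 2):
        if majority_error_probability(k, p) <= target:
            return k
    return None


if __name__ == "__main__":
    for k in (1, 3, 5, 7):
        print(k, majority_error_probability(k, 0.25))
    print(replicas_needed(0.25, 0.01))
```

Under these assumptions the error probability falls exponentially in k for any fixed p < 1/2, so a replication factor logarithmic in n suffices to push the per-task error below 1/n. Choosing that factor without knowing p is the difficulty that estimating f and p addresses.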

This work is supported in part by the NSF Grants 0121277 and 0311368.

References

  1. Distributed.net, http://www.distributed.net/

  2. Internet PrimeNet server, http://mersenne.org/ips/stats.html

  3. SETI@home, http://setiathome.ssl.berkeley.edu/

  4. Aguilera, M.K., Chen, W., Toueg, S.: Heartbeat: a timeout-free failure detector for quiescent reliable communication. In: Mavronicolas, M. (ed.) WDAG 1997. LNCS, vol. 1320, pp. 126–140. Springer, Heidelberg (1997)

  5. Almeida, C., Verissimo, P.: Timing failure detection and real-time group communication in real-time systems. In: Proceedings of 8th Euromicro Workshop on Real-Time Systems (June 1996)

  6. Bollo, R., Narzul, J.P.L., Raynal, M., Tronel, F.: Probabilistic analysis of a group failure detection protocol. In: Proceedings of 4th International Workshop on Object-Oriented Real-Time Dependable Systems (WORDS 1999) (January 1999)

  7. Chandra, T.D., Toueg, S.: Unreliable failure detectors for reliable distributed systems. Journal of the ACM 43(2), 225–267 (1996)

  8. Chen, W., Toueg, S., Aguilera, M.K.: On the quality of service of failure detectors. In: Proceedings of 30th International Conference on Dependable Systems and Networks (ICDSN/FTCS-30) (June 2000)

  9. Chlebus, B.S., De Prisco, R., Shvartsman, A.A.: Performing tasks on restartable message-passing processors. In: Mavronicolas, M. (ed.) WDAG 1997. LNCS, vol. 1320, pp. 96–110. Springer, Heidelberg (1997)

  10. Dagum, P., Karp, R.M., Luby, M., Ross, S.: An optimal algorithm for Monte Carlo estimation. In: Proceedings of the Foundations of Computer Science, pp. 142–149 (1995)

  11. Dwork, C., Halpern, J.Y., Waarts, O.: Performing work efficiently in the presence of faults. In: Proceedings of the Eleventh Annual ACM Symposium on Principles of Distributed Computing, pp. 91–102 (1992)

  12. Fernández, A., Georgiou, C., López, L., Santos, A.: Reliably executing tasks in the presence of malicious processors. In: Fraigniaud, P. (ed.) DISC 2005. LNCS, vol. 3724, pp. 490–492. Springer, Heidelberg (2005)

  13. Fernandez, A., Georgiou, C., Lopez, L., Santos, A.: Reliably executing tasks in the presence of malicious processors. Technical Report Numero 9 (RoSaC-2005-9), Grupo de Sistemas y Comunicaciones, Universidad Rey Juan Carlos (2005), http://gsyc.escet.urjc.es/publicaciones/tr/RoSaC-2005-9.pdf

  14. Gao, L., Malewicz, G.: Toward maximizing the quality of results of dependent tasks computed unreliably. Theory of Computing Systems (to appear); preliminary version OPODIS 2004

  15. Karp, R.M.: Probabilistic recurrence relations. Journal of the Association for Computing Machinery 41(6), 1136–1150 (1994)

  16. Kedem, Z.M., Palem, K.V., Raghunathan, A., Spirakis, P.: Combining tentative and definite executions for dependable parallel computing, pp. 381–390 (1991)

  17. Konwar, K.M., Rajasekaran, S., Shvartsman, A.A.: Robust network supercomputing with malicious processes (2006), http://www.cse.uconn.edu/~kishori/KRS2006.pdf

  18. Korpela, E., Werthimer, D., Anderson, D., Cobb, J., Lebofsky, M.: SETI@home - massively distributed computing for SETI. Computing in Science & Engineering 3(1) (2001)

  19. Martel, C., Subramonian, R.: On the complexity of certified write-all algorithms. Journal of Algorithms 16(3), 361–387 (1994)

  20. Paquette, M., Pelc, A.: Optimal decision strategies in Byzantine environments. In: Kralovic, R., Sýkora, O. (eds.) SIROCCO 2004. LNCS, vol. 3104, pp. 245–254. Springer, Heidelberg (2004)

  21. De Prisco, R., Mayer, A., Yung, M.: Time-optimal message-efficient work performance in the presence of faults. In: Proceedings of the 13th ACM Symp. Principles of Distributed Computing, pp. 161–172 (1994)

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Konwar, K.M., Rajasekaran, S., Shvartsman, A.A. (2006). Robust Network Supercomputing with Malicious Processes. In: Dolev, S. (ed.) Distributed Computing. DISC 2006. Lecture Notes in Computer Science, vol 4167. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11864219_33

  • DOI: https://doi.org/10.1007/11864219_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44624-8

  • Online ISBN: 978-3-540-44627-9

  • eBook Packages: Computer Science, Computer Science (R0)