Review Article: Lifetime and availability of data stored on a P2P system: Evaluation of redundancy and recovery schemes

Published: 01 May 2014

Abstract

This paper studies the performance of Peer-to-Peer storage and backup systems (P2PSS). These systems are based on three pillars: data fragmentation and dissemination among the peers, redundancy mechanisms to cope with peer churn, and repair mechanisms to recover lost or temporarily unavailable data. Usually, redundancy is achieved either by replication or by erasure codes. A new class of network coding (regenerating codes) has been proposed recently. We therefore adapt our work to all three redundancy schemes. We introduce two mechanisms for recovering lost data and evaluate their performance by modeling them through absorbing Markov chains. Specifically, we evaluate the quality of service provided to users in terms of durability and availability of stored data for each recovery mechanism and deduce the impact of its parameters on the system performance. The first mechanism is centralized and based on the use of a single server that can recover multiple losses at once. The second mechanism is distributed: reconstruction of lost fragments is iterated sequentially on many peers until the required level of redundancy is attained. The key assumptions made in this work, in particular the assumptions on the recovery process and on the distribution of peer on-times, are in agreement with the analyses in [1] and [2], respectively. The models are thereby general enough to be applicable to many distributed environments, as shown through numerical computations. We find that, in stable environments such as local area or research institute networks where machines are usually highly available, the distributed-repair scheme in erasure-coded systems offers a reliable, scalable and cheap storage/backup solution. In highly dynamic environments, the distributed-repair scheme is in general inefficient, in particular for maintaining high data availability, unless the data redundancy is high. Using regenerating codes overcomes this limitation of the distributed-repair scheme. P2PSS with the centralized-repair scheme are efficient in any environment but have the disadvantage of relying on a centralized authority. However, the analysis of the overhead cost (e.g. computation, bandwidth and complexity cost) resulting from the different redundancy schemes with respect to their advantages (e.g. simplicity) is left for future work.
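
As a rough illustration of how an absorbing Markov chain yields a data-lifetime estimate, the sketch below computes the expected number of time steps before absorption (data loss) by solving (I - Q) t = 1, where Q holds the transition probabilities among transient states. It is a toy discrete-time chain and not the model of the paper: the state definition (number of redundant fragments currently available), the per-step probabilities `p_loss` and `p_repair`, and the function name `expected_lifetime` are all assumptions made for this example.

```python
from fractions import Fraction


def expected_lifetime(max_redundancy, p_loss, p_repair):
    """Expected steps until data loss, starting from each transient state.

    Toy model (an assumption, not the paper's chain): transient state r is
    the number of redundant fragments available, r = 0..max_redundancy.
    Each step, one fragment is lost with probability p_loss, one is repaired
    with probability p_repair (only if r < max_redundancy), else no change.
    Losing a fragment in state 0 leads to the absorbing "data lost" state.
    """
    p_loss, p_repair = Fraction(p_loss), Fraction(p_repair)
    if p_loss <= 0 or p_repair < 0 or p_loss + p_repair > 1:
        raise ValueError("need 0 < p_loss and p_loss + p_repair <= 1")

    n = max_redundancy + 1
    # Q: transition probabilities among the transient states only.
    q = [[Fraction(0)] * n for _ in range(n)]
    for r in range(n):
        up = p_repair if r < max_redundancy else Fraction(0)
        if r > 0:
            q[r][r - 1] = p_loss
        if r < max_redundancy:
            q[r][r + 1] = up
        q[r][r] = 1 - p_loss - up

    # Augmented system [I - Q | 1], solved exactly by Gauss-Jordan elimination.
    a = [[(1 if i == j else 0) - q[i][j] for j in range(n)] + [Fraction(1)]
         for i in range(n)]
    for col in range(n):
        pivot = next(row for row in range(col, n) if a[row][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        pv = a[col][col]
        a[col] = [x / pv for x in a[col]]
        for row in range(n):
            if row != col and a[row][col] != 0:
                f = a[row][col]
                a[row] = [x - f * y for x, y in zip(a[row], a[col])]
    return [a[i][n] for i in range(n)]


if __name__ == "__main__":
    # One redundant fragment, 10% loss and 50% repair probability per step.
    times = expected_lifetime(1, Fraction(1, 10), Fraction(1, 2))
    print([float(t) for t in times])  # [60.0, 70.0]
```

Exact fractions are used so that the result does not depend on floating-point rounding. With no redundancy at all (`max_redundancy = 0`) and the same loss probability, the expected lifetime drops to 10 steps, which shows in miniature how redundancy and repair extend durability.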

References

[1]
A. Dandoush, S. Alouf, P. Nain, Simulation analysis of download and recovery processes in P2P storage systems, in: Proc. of 21st International Teletraffic Congress (ITC), Paris, France, 2009.
[2]
D. Nurmi, J. Brevik, R. Wolski, Modeling machine availability in enterprise and wide-area distributed computing environments, in: Proc. of Euro-Par 2005, LNCS, vol. 3648, Lisbon, Portugal, 2005, pp. 432-441.
[3]
J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, W. Weimer, C. Wells, B. Zhao, Oceanstore: an architecture for global-scale persistent storage, in: Proc. of ACM ASPLOS, Boston, Massachusetts, 2000, pp. 190-201.
[4]
F. Dabek, F. Kaashoek, D. Karger, R. Morris, I. Stoica, Wide-area cooperative storage with CFS, in: Proc. of ACM SOSP '01, Banff, Canada, 2001, pp. 202-215.
[5]
R. Bhagwan, K. Tati, Y. Cheng, S. Savage, G. Voelker, Total Recall: system support for automated availability management, in: Proc. of ACM/USENIX NSDI '04, San Francisco, California, 2004, pp. 337-350.
[6]
Wuala, The Wuala Project, http://www.wuala.com.
[7]
UbiStorage, http://www.ubistorage.com.
[8]
I. Stoica, R. Morris, D. Karger, F. Kaashoek, H. Balakrishnan, Chord: a scalable peer-to-peer lookup service for internet applications, in: Proc. of ACM SIGCOMM, San Diego, California, 2001, pp. 149-160.
[9]
Dimakis, A., Prabhakaran, V. and Ramchandran, K., Decentralized erasure codes for distributed networked storage. IEEE Trans. Inform. Theory. 52
[10]
Reed, I. and Solomon, G., Polynomial codes over certain finite fields. J. SIAM. v8 i2. 300-304.
[11]
A. Rowstron, P. Druschel, Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility, in: Proc. of ACM SOSP '01, Banff, Canada, 2001, pp. 188-201.
[12]
H. Weatherspoon, J. Kubiatowicz, Erasure coding vs. replication: a quantitative comparison, in: Proc. of IPTPS '02, Cambridge, Massachusetts, Lecture Notes in Computer Science, vol. 2429, 2002, pp. 328-337.
[13]
Bhagwan, R., Moore, D., Savage, S. and Voelker, G., Replication strategies for highly available peer-to-peer storage. In: Lecture Notes in Computer Science, vol. 2584. Springer. pp. 153-158.
[14]
A.G. Dimakis, P.B. Godfrey, Y. Wu, M. Wainwright, K. Ramchandran, Network coding for distributed storage systems, in: Proc. of 26th IEEE Conference on Computer Communications (INFOCOM), Anchorage, Alaska, USA, 2007.
[15]
A. Dimakis, K. Ramchandran, Y. Wu, C. Suh, A survey on network codes for distributed storage, in: Proc. IEEE 99, IEEE Computer Society, 2011, pp. 476-489.
[16]
S. Ramabhadran, J. Pasquale, Analysis of long-running replicated systems, in: Proc. of IEEE Infocom, Barcelona, Spain, 2006.
[17]
S. Alouf, A. Dandoush, P. Nain, Performance analysis of peer-to-peer storage systems, in: Proc. of 20th International Teletraffic Congress (ITC), LNCS, vol. 4516, Ottawa, Canada, 2007, pp. 642-653.
[18]
O. Dalle, F. Giroire, J. Monteiro, S. Pérennes, Analysis of failure correlation impact on peer-to-peer storage systems, in: The Proceeding of P2P 2009, the Ninth IEEE International Conference on Peer-to-Peer Computing, Seattle, Washington, USA, 2009.
[19]
L. Taoyu, C. Minghua, C. Dah-Ming, C. Maoke, Queuing models for peer-to-peer systems, in: Proceedings of the 8th International Conference on Peer-to-peer Systems, IPTPS'09, USENIX Association, Berkeley, CA, USA, 2009, pp. 4-4. URL http://dl.acm.org/citation.cfm?id=1855663.1855667.
[20]
Yang, Z., Dai, Y. and Xiao, Z., Exploring the cost-availability tradeoff in p2p storage systems. In: ICPP'09: Proceedings of the 2009 International Conference on Parallel Processing, IEEE Computer Society. pp. 429-436.
[21]
Kermarrec, A., Merrer, E.L., Straub, G. and Kempen, A.V., Availability-based methods for distributed storage systems. In: 31st IEEE International Symposium on Reliable Distributed Systems, IEEE Computer Society. pp. 151-160.
[22]
D. Kondo, B. Javadi, A. Iosup, D. Epema, The failure trace archive: enabling comparative analysis of failures in diverse distributed systems, in: Proc. of the IEEE International Symposium on Cluster Computing and the Grid, 2010.
[23]
Martalo, M., Amoretti, M., Picone, M. and Ferrari, G., Sporadic decentralized resource maintenance for p2p distributed storage networks. Parallel Distribut. Comput. 74
[24]
A. Dandoush, S. Alouf, P. Nain, Performance analysis of centralized versus distributed recovery schemes in P2P storage systems, in: Proc. of IFIP/TC6 Networking 2009, LNCS, vol. 5550, Aachen, Germany, 2009, pp. 676-689.
[25]
S. Saroiu, P. Gummadi, S. Gribble, A measurement study of peer-to-peer file sharing systems, in: Proc. of Multimedia Computing and Networking (MMCN), San Jose, California, 2002, (Best Paper Award).
[26]
A. Guha, N. Daswani, R. Jain, An experimental study of the skype peer-to-peer VoIP system, in: Proc. of 5th IPTPS, Santa Barbara, California, 2006.
[27]
Harrison, P. and Zertal, S., Queueing models of RAID systems with maxima of waiting times. Perform. Evaluat. J. v64 i7-8. 664-689.
[28]
Baskett, F., Chandy, K., Muntz, R. and Palacios, F., Open, closed, and mixed networks of queues with different classes of customers. J. ACM. v22 i2. 248-260.
[29]
Kobayashi, H. and Mark, B.L., On queuing networks and loss networks. In: Proc. 1994 Annual Conference on Information Sciences and Systems, Princeton, NJ.
[30]
Neuts, M., Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach. Johns Hopkins University Press, Baltimore.
[31]
Grinstead, C. and Laurie Snell, J., Introduction to Probability. 1997. American Mathematical Society.
[32]
A. Dandoush, S. Alouf, P. Nain, Lifetime and Availability of Data Stored on a P2P System: Evaluation of Recovery Schemes, Tech. Rep. RR-7170, INRIA Sophia Antipolis, January 2010.
[33]
A. Dandoush, S. Alouf, P. Nain, A realistic simulation model for peer-to-peer storage systems, in: Proc. of 2nd International ICST Workshop on Network Simulation Tools (NSTOOLS09), in Conjunction with the 4th International Conference (VALUETOOLS'09), Pisa, Italy, 2009.
[34]
A. Dandoush, A. Jean-Marie, Flow-level modeling of parallel download in distributed systems, in: Third International Conference on Communication Theory, Reliability, and Quality of Service (CTRQ), 2010, pp. 92-97, (Best Paper Award).
[35]
Condor: High Throughput Computing, <http://www.cs.wisc.edu/condor/>, 2007.
[36]
J. Stribling, PlanetLab - All Pairs Pings, http://pdos.csail.mit.edu/~strib/pl_app, 2005.
[37]
PlanetLab, An open platform for developing, deploying, and accessing planetary-scale services, http://www.planet-lab.org/, 2007.
[38]
Bhagwan, R., Savage, S. and Voelker, G., Understanding availability. In: Proc. of 2nd IPTPS, Berkeley, California.
[39]
D. Caromel, Keynote lecture proactive parallel suite: multi-cores to clouds to autonomicity, in: IEEE 5th International Conference on Intelligent Computer Communication and Processing, 2009. ICCP 2009, 2009.
[40]
INRIA, Proactive Parallel Suite, http://proactive.activeeon.com/index.php.

Published In

Computer Networks: The International Journal of Computer and Telecommunications Networking, Volume 64, May 2014, 302 pages

Publisher

Elsevier North-Holland, Inc., United States

Author Tags

1. Absorbing Markov chain
2. Data availability
3. Distributed storage system
4. Peer-to-Peer network
5. Performance evaluation
6. System engineering
