research-article

Supporting Fault-Tolerant Parallel Programming in Linda

Published: 01 March 1995
    Abstract

    Linda is a language for programming parallel applications whose most notable feature is a distributed shared memory called tuple space. While suitable for a wide variety of programs, one shortcoming of the language as commonly defined and implemented is a lack of support for writing programs that can tolerate failures in the underlying computing platform. This paper describes FT-Linda, a version of Linda that addresses this problem by providing two major enhancements that facilitate the writing of fault-tolerant applications: stable tuple spaces and atomic execution of tuple space operations. The former is a type of stable storage in which tuple values are guaranteed to persist across failures, while the latter allows collections of tuple operations to be executed in an all-or-nothing fashion despite failures and concurrency. The design of these enhancements is presented in detail and illustrated by examples drawn from both the Linda and fault-tolerance domains. An implementation of FT-Linda for a network of workstations is also described. The design is based on replicating the contents of stable tuple spaces to provide failure resilience and then updating the copies using atomic multicast. This strategy allows an efficient implementation in which only a single multicast message is needed for each atomic collection of tuple space operations.

    Index Terms: Parallel programming, fault-tolerance, Linda, atomic execution, stable storage, atomic multicast.
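The replication strategy described in the abstract can be illustrated with a small sketch. It is not FT-Linda's implementation or syntax: the names `Replica`, `AtomicMulticast`, `ANY`, and the `("out", ...)` / `("in", ...)` batch encoding are hypothetical, the "multicast" is an in-process loop standing in for a real ordered, reliable multicast, and a batch whose `in` finds no match is simply aborted here rather than blocking. The point it shows is that one message carries a whole collection of operations, and every copy applies that collection all-or-nothing in the same order.

```python
ANY = object()  # wildcard field in a template (simplified stand-in for a Linda formal)


def matches(template, tup):
    """A template matches a tuple of equal length whose fields are equal or wildcards."""
    return len(template) == len(tup) and all(
        t is ANY or t == v for t, v in zip(template, tup)
    )


class Replica:
    """One copy of a tuple space (hypothetical class, for illustration only)."""

    def __init__(self):
        self.tuples = []

    def apply(self, batch):
        """Apply a batch of ("out", tuple) / ("in", template) operations.

        Work is done on a copy, so either every operation takes effect
        or the replica's state is left untouched. Returns True on commit.
        """
        working = list(self.tuples)
        for op, arg in batch:
            if op == "out":
                working.append(arg)
            elif op == "in":
                hit = next((t for t in working if matches(arg, t)), None)
                if hit is None:
                    return False  # abort: self.tuples was never modified
                working.remove(hit)
            else:
                raise ValueError(f"unknown operation: {op!r}")
        self.tuples = working  # commit
        return True


class AtomicMulticast:
    """Toy stand-in: delivers each message to every replica, in one global order."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.messages_sent = 0

    def send(self, batch):
        self.messages_sent += 1  # one message per atomic collection of operations
        results = [replica.apply(batch) for replica in self.replicas]
        # Replicas are deterministic and see the same order, so they must agree.
        assert len(set(results)) == 1
        return results[0]
```

With three replicas, sending `[("in", ("task", ANY)), ("out", ("in_progress", 1))]` as a single batch either moves the task to the in-progress state on every copy or changes none of them, and it costs one call to `send`.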

    Reviews

    Adina Magda Florea

    FT-Linda is a version of Linda. Linda is a parallel programming language whose most notable feature is a distributed shared memory called tuple space that can be used despite the lack of physical shared memory. FT-Linda is intended to support the programming of fault-tolerant parallel applications. The two main features of FT-Linda are stable tuple spaces and atomic guarded statements. The former provide protection against data loss, while the latter support synchronization and the execution of multiple tuple space operations despite failures or concurrency. After a brief presentation of Linda and the specific problems that failures can cause, the authors present the design decisions of FT-Linda and give examples of its use. Next, implementation details and some initial performance results are given. The paper ends with a short presentation of related work. The paper communicates research and implementation results in a well-organized and clear manner. It is intended for both parallel programming language designers and Linda programmers. The authors state, however, that “all parts of the system have been implemented, but the final integration awaits the porting of Consul to a new version of the x-kernel.” Consequently, prospective FT-Linda programmers still have to wait, and performance results are either unavailable or not very relevant.
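As an illustration of the atomic guarded statement idea mentioned in the review, the sketch below blocks on a guard until a matching tuple exists, then withdraws it and deposits the body's tuples as one indivisible step. It is a single-process approximation using a lock, not FT-Linda's syntax or implementation, and it gives no protection against machine failures; `TupleSpace`, `guarded`, `snapshot`, and `ANY` are hypothetical names chosen for this example.

```python
import threading

ANY = object()  # wildcard field in a template (simplified stand-in for a Linda formal)


def matches(template, tup):
    """A template matches a tuple of equal length whose fields are equal or wildcards."""
    return len(template) == len(tup) and all(
        t is ANY or t == v for t, v in zip(template, tup)
    )


class TupleSpace:
    """In-memory tuple space guarded by one condition variable (illustrative only)."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, tup):
        """Deposit a tuple and wake any waiting guards."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def guarded(self, guard, body, timeout=None):
        """Wait for a tuple matching `guard`, then withdraw it and deposit
        the tuples returned by body(matched) while still holding the lock.

        Returns the withdrawn tuple, or None if the timeout expires.
        """
        def find():
            return next((t for t in self._tuples if matches(guard, t)), None)

        with self._cond:
            if not self._cond.wait_for(lambda: find() is not None, timeout):
                return None
            hit = find()
            produced = list(body(hit))  # evaluate first: an exception here changes nothing
            self._tuples.remove(hit)
            self._tuples.extend(produced)
            return hit

    def snapshot(self):
        """Return a copy of the current contents."""
        with self._cond:
            return list(self._tuples)
```

A worker in a bag-of-tasks program could call `ts.guarded(("task", ANY), lambda t: [("in_progress", t[1])])`, so that no other process ever observes the task as withdrawn without its in-progress marker also being present.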

    Published In

    IEEE Transactions on Parallel and Distributed Systems  Volume 6, Issue 3
    March 1995
    110 pages
    ISSN:1045-9219

    Publisher

    IEEE Press

    Cited By

    • (2023) Fault-tolerance at your Finger Tips with the TeamPlay Coordination Language. Proceedings of the 35th Symposium on Implementation and Application of Functional Languages, pp. 1-13. doi:10.1145/3652561.3652571. Online publication date: 29-Aug-2023
    • (2018) Brief Announcement. Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, pp. 281-284. doi:10.1145/3212734.3212782. Online publication date: 23-Jul-2018
    • (2016) Coordinated Concurrent Programming in Syndicate. Proceedings of the 25th European Symposium on Programming Languages and Systems - Volume 9632, pp. 310-336. doi:10.1007/978-3-662-49498-1_13. Online publication date: 2-Apr-2016
    • (2015) Extensible distributed coordination. Proceedings of the Tenth European Conference on Computer Systems, pp. 1-16. doi:10.1145/2741948.2741954. Online publication date: 17-Apr-2015
    • (2014) Fault-tolerant dynamic task graph scheduling. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 719-730. doi:10.1109/SC.2014.64. Online publication date: 16-Nov-2014
    • (2013) Large-scale computation not at the cost of expressiveness. Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems, pp. 11-11. doi:10.5555/2490483.2490494. Online publication date: 13-May-2013
    • (2010) Selective Recovery from Failures in a Task Parallel Programming Model. Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, pp. 709-714. doi:10.1109/CCGRID.2010.34. Online publication date: 17-May-2010
    • (2009) A communication framework for fault-tolerant parallel execution. Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing, pp. 1-15. doi:10.1007/978-3-642-13374-9_1. Online publication date: 8-Oct-2009
    • (2008) DepSpace. ACM SIGOPS Operating Systems Review 42:4, pp. 163-176. doi:10.1145/1357010.1352610. Online publication date: 1-Apr-2008
    • (2008) DepSpace. Proceedings of the 3rd ACM SIGOPS/EuroSys European Conference on Computer Systems 2008, pp. 163-176. doi:10.1145/1352592.1352610. Online publication date: 1-Apr-2008