Beowulf Cluster Computing with Linux
Publisher: MIT Press, 55 Hayward St., Cambridge, MA, United States
ISBN: 978-0-262-69292-2
Published: 01 December 2003
Pages: 504
Abstract

From the Publisher: Use of Beowulf clusters (collections of off-the-shelf commodity computers programmed to act in concert, resulting in supercomputer performance at a fraction of the cost) has spread far and wide in the computational science community. Many application groups are assembling and operating their own private supercomputers rather than relying on centralized computing centers. Such clusters are used in climate modeling, computational biology, astrophysics, and materials science, as well as non-traditional areas such as financial modeling and entertainment. Much of this new popularity can be attributed to the growth of the open-source movement.

The second edition of Beowulf Cluster Computing with Linux has been completely updated; all three stand-alone sections have important new material. The introductory material in the first part now includes a new chapter giving an overview of the book and background on cluster-specific issues, including why and how to choose a cluster, as well as new chapters on cluster initialization systems (including ROCKS and OSCAR) and on network setup and tuning. The information on parallel programming in the second part now includes chapters on basic parallel programming and available libraries and programs for clusters. The third and largest part of the book, which describes software infrastructure and tools for managing cluster resources, has new material on cluster management and on the Scyld system.

Cited By

  1. Ang J, Carini G, Chen Y, Chuang I, Demarco M, Economou S, Eickbusch A, Faraon A, Fu K, Girvin S, Hatridge M, Houck A, Hilaire P, Krsulich K, Li A, Liu C, Liu Y, Martonosi M, McKay D, Misewich J, Ritter M, Schoelkopf R, Stein S, Sussman S, Tang H, Tang W, Tomesh T, Tubman N, Wang C, Wiebe N, Yao Y, Yost D and Zhou Y (2024). ARQUIN: Architectures for Multinode Superconducting Quantum Computers, ACM Transactions on Quantum Computing, 5:3, (1-59), Online publication date: 30-Sep-2024.
  2. Cayton P, Aguilar M and Pinto C Sunfish: An Open Centralized Composable HPC Management Framework Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, (1507-1511)
  3. Fotache M, Greavu-Şerban V, Hrubaru I and Tică A Big Data Technologies on Commodity Workstations Proceedings of the 19th International Conference on Computer Systems and Technologies, (110-115)
  4. Luttgau J, Kuhn M, Duwe K, Alforov Y, Betke E, Kunkel J and Ludwig T (2021). Survey of Storage Systems for High-Performance Computing, Supercomputing Frontiers and Innovations: an International Journal, 5:1, (31-58), Online publication date: 15-Mar-2018.
  5. Zerpa L The Message-Passing Interface and Parallel SAT-Solvers Proceedings of the International Conference on Future Networks and Distributed Systems, (1-7)
  6. (2017). Scheduling of online compute-intensive synchronized jobs on high performance virtual clusters, Journal of Computer and System Sciences, 85:C, (1-17), Online publication date: 1-May-2017.
  7. Rahman M, Islam N, Lu X and Panda D (2017). A Comprehensive Study of MapReduce Over Lustre for Intermediate Data Placement and Shuffle Strategies on HPC Clusters, IEEE Transactions on Parallel and Distributed Systems, 28:3, (633-646), Online publication date: 1-Mar-2017.
  8. Reza H, Aguilar M and Jalal S Regression testing of GPU/MIC systems for HPCC Proceedings of the 2015 International Workshop on Software Engineering for High Performance Computing in Science, (30-37)
  9. Islam N, Lu X, Wasi-ur-Rahman M, Shankar D and Panda D Triple-H Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, (101-110)
  10. Simons J and Buell J (2010). Virtualizing high performance computing, ACM SIGOPS Operating Systems Review, 44:4, (136-145), Online publication date: 13-Dec-2010.
  11. Yang C and Cheng L Implementation of a Performance-Based Loop Scheduling on Heterogeneous Clusters Proceedings of the 9th International Conference on Algorithms and Architectures for Parallel Processing, (44-54)
  12. Dongarra J, Sterling T, Simon H and Strohmaier E (2009). High-Performance Computing, Computing in Science and Engineering, 7:2, (51-59), Online publication date: 1-Mar-2009.
  13. He Y, Al-Azzoni I and Down D MARO - MinDrift affinity routing for resource management in heterogeneous computing systems Proceedings of the 2007 conference of the center for advanced studies on Collaborative research, (71-85)
  14. Bounanos S, Fleury M, Nicolas S and Vickers A (2007). Regular Paper, International Journal of High Performance Computing Applications, 21:2, (222-245), Online publication date: 1-May-2007.
  15. Hung S and Hsu Y DPCT Proceedings of the Second international conference on High Performance Computing and Communications, (320-329)
  16. Dongarra J, Bosilca G, Chen Z, Eijkhout V, Fagg G, Fuentes E, Langou J, Luszczek P, Pjesivac-Grbovic J, Seymour K, You H and Vadhiyar S (2006). Self-adapting numerical software (SANS) effort, IBM Journal of Research and Development, 50:2/3, (223-238), Online publication date: 1-Mar-2006.
  17. Chaisiri S, Pichitlamken J, Uthayopas P, Rojanapanpat T, Phakhawirotkul S and Vorakosit T Applying Web Service and Windows Clustering for High Volume Risk Analysis Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region
  18. Wilson B (2005). Introduction to parallel programming using message-passing, Journal of Computing Sciences in Colleges, 21:1, (207-211), Online publication date: 1-Oct-2005.
  19. Gunawi H, Agrawal N, Arpaci-Dusseau A, Arpaci-Dusseau R and Schindler J Deconstructing Commodity Storage Clusters Proceedings of the 32nd annual international symposium on Computer Architecture, (60-71)
  20. Javadi B, Khorsandi S and Akbari M Study of a cluster-based parallel system through analytical modeling and simulation Proceedings of the 2005 international conference on Computational Science and Its Applications - Volume Part IV, (1262-1271)
  21. Gunawi H, Agrawal N, Arpaci-Dusseau A, Arpaci-Dusseau R and Schindler J (2005). Deconstructing Commodity Storage Clusters, ACM SIGARCH Computer Architecture News, 33:2, (60-71), Online publication date: 1-May-2005.
  22. Haili X, Hong W, Xuebin C, Sungen D and Honghai Z An implementation of interactive jobs submission for grid computing portals Proceedings of the 2005 Australasian workshop on Grid computing and e-research - Volume 44, (67-70)
  23. Bilbao J and Garate G First step in a PC cluster development with openMosix Proceedings of the 4th WSEAS International Conference on Applied Informatics and Communications, (1-5)
  24. Lee L, Li K, Yang C, Tseng C, Liu K and Hung C On implementation of a scalable wallet-size cluster computing system for multimedia applications Proceedings of the 5th Pacific Rim conference on Advances in Multimedia Information Processing - Volume Part III, (697-704)
  25. DeBardeleben N, Ligon III W, Pandit S and Stanzione Jr. D Coven - A Framework for High Performance Problem Solving Environments Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing
  26. Chien A Architecture of the Entropia Distributed Computing System Proceedings of the 16th International Parallel and Distributed Processing Symposium
Contributors
  • Luddy School of Informatics, Computing, and Engineering
  • Argonne National Laboratory

Reviews

Balaraman Subbanaidu

Lately, there has been a lot of interest in parallel and distributed computing for scientific and commercial applications. Of all the relevant technologies, cluster computing with Linux has been gaining prominence. I did not have an opportunity to go through the first edition of this book [1], but, after reading this well-formatted and well-compiled second edition, I do not regret having missed the first. This edition delivers very comprehensively on what its title promises, and it is a good book for people working on, or planning to work on, cluster computing. The book's chapters are a collection of essays from different authors. Each chapter has a summary, and a discussion of future trends or a transition to the next chapter. There are many relevant and useful references, including Web links, and some chapters, in particular those covering programming, present coding examples. The overall plan and presentation of the book is admirable.

Many chapters of the book discuss hardware, software, installation, and constraints. The initial chapters present a very good introduction, and are suitable for beginners in the area of cluster computing. Reading these first chapters feels like going back to relearn the hard details of hardware, because the authors cover the central processing unit (CPU), memory, bus, basic input/output system (BIOS), storage, and so on. There is a brief chapter on Linux as well, which Linux experts can perhaps skip. The chapters on message passing interface (MPI) programming discuss parallel input/output (I/O), fault tolerance, and related topics, along with some discussion of improving the performance of such programs; how C++ and Fortran are used is also explained. Several useful tools are mentioned that will help readers working in cluster computing environments. Chapter 5 is quite promising, discussing the implementation of clusters; it thoroughly explains the basic backbone of the network infrastructure. Fault tolerance is described, and since this is an important concept, one hopes the next edition will cover it in greater depth. The two kinds of message passing for parallel programming are also discussed. Three chapters are dedicated to workload management tools, such as PBS and Condor, with useful hints for tuning them. The introduction to writing parallel programs for clusters in the seventh chapter is informative.

The book presents the technology coherently, which is a great achievement for both the editors and the authors; nowhere does one feel lost in the material. The material is not exhaustive, however; in a book of this kind, the coverage can only be broad rather than deep.
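The MPI material this review refers to is easiest to picture with a small example. The following is a minimal sketch of the point-to-point message-passing style in C, written for this page rather than taken from the book; the token-ring pattern and the values used are illustrative assumptions.

    /*
     * Minimal sketch of MPI point-to-point message passing (illustrative
     * only, not from the book). Build with an MPI wrapper compiler, e.g.
     * "mpicc ring.c -o ring", and run with "mpirun -np 4 ./ring".
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, token;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id   */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* number of processes */

        if (size < 2) {                         /* the ring needs 2+ ranks */
            MPI_Finalize();
            return 0;
        }

        if (rank == 0) {
            /* Rank 0 injects a token and waits for it to come back. */
            token = 0;
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&token, 1, MPI_INT, size - 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("token returned to rank 0 with value %d\n", token);
        } else {
            /* Other ranks receive from the left, increment, pass right. */
            MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            token++;
            MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

Run with four processes, the token returns to rank 0 with the value 3, having been incremented once on each of the other ranks; this is the same send/receive style on which the book's MPI chapters build.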

John P. Dougherty

The stated purpose of this book is to help the reader understand the Beowulf approach to parallel computing. This task is neither simple nor small; the 600-page volume only scratches the conceptual surface of cluster computing in the Beowulf world. The book is part of the Scientific and Engineering Computation Series from MIT Press, which many in scientific parallel computing have come to rely on for concise and practical information.

The book is a collection of threaded essays on topics in cluster computing. After an overview by one of the editors, the chapters are partitioned into those covering enabling technologies, parallel programming, and managing clusters. A short concluding chapter is provided for balance, and to address considerations for future changes in cluster computing. Appendices include a reading list and relevant links from the World Wide Web. Each chapter provides a starting point for some aspect of Beowulf cluster computing, often with small but useful examples. Many of the chapters provide code examples, as well as recipes for the access, installation, and execution of hardware and software components. This approach generally works under the constraints imposed; it is hard to balance clear conceptual treatment with the dense, detailed considerations encountered in actual cluster computing.

The overview chapter is appropriate for novices, as well as for more seasoned professionals who want a quick refresher. I appreciated the glossary, and the brief definitions of terms that seem to appear and mutate throughout this field. Chapters on hardware, Linux, and networks follow the initial overview; they describe the various components and their associated issues. The remainder of the first part, on enabling technologies, outlines the installation, configuration, and tuning of a Beowulf. I found the fifth chapter most promising in its walkthrough of the steps needed to configure a simple eight-node cluster with Red Hat Linux 9, especially the networking setup.

The second part, on parallel programming, is a concise treatment of how to design and implement the most basic and most popular parallel algorithms. Many platforms are considered, including C and sockets, Python, Perl, the message passing interface (MPI), and the parallel virtual machine (PVM). There is an advanced chapter describing how to improve the performance of MPI programs, and another on how to improve PVM fault tolerance and adaptability. These chapters are fairly complete, but readers may want to have other MPI or PVM references available.

The final part of the book covers cluster management. This part does not flow as well as the others, and is more a set (rather than a sequence) of articles on helpful management facilities and topics. The first two chapters provide background on cluster and workload management, reviewing such issues as monitoring, recovery from failure, and software upgrades. A collection of management tools is then discussed, including Condor, Maui, the Portable Batch System (PBS), Scyld, and the Parallel Virtual File System (PVFS). There is an interesting chapter, just before the conclusion, comparing two Beowulfs maintained at Argonne National Laboratory that are about three years apart in age. By comparing the experiences associated with these Beowulfs, the reader can glean some information about the expected usage of a cluster and its eventual path toward replacement. I would suggest the editors consider similar reports from other laboratories, perhaps where the scale is smaller (and budgets are a stronger driving issue), and/or the applications are not scientific (for example, economic computation or transaction-based processing for business).

There are a few typographical errors and other minor distractions in the text, and I have not had the time or resources to verify the code examples. I would also read this book one article at a time, when a specific answer or example is needed. Reading the articles helps get you ready to implement a Beowulf cluster, but, as with many projects, the actual construction experience cannot be captured in prose alone. This book is appropriate for people with a reasonable technical background who are involved in applications or projects where a Beowulf is advantageous. In other words, you really need to bring a few things to reading this book: experience with programming, Linux/Unix, and basic networking. The reader should be aware of a strongly related book from the same series [1]. That second book is the result of a workshop on Beowulf setup conducted by its authors, and reads more as a case study than this book does. Both books have merit, and, ideally, both are useful when managing a cluster. If the ideal is not possible, then my suggestion is to consider the alternate book first, see whether it contains what is needed for your application project, and refer to the current book as specific issues arise.
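The chapter on improving MPI performance that this review mentions is commonly illustrated with one idea: overlapping communication with computation using non-blocking calls. The sketch below is my own illustration of that technique, not code from the book; the buffer size N and the do_local_work helper are hypothetical placeholders.

    /*
     * Illustrative sketch (not from the book): overlap a neighbour exchange
     * with independent computation using non-blocking MPI calls.
     * The buffer size N and do_local_work() are hypothetical placeholders.
     */
    #include <mpi.h>
    #include <stdio.h>

    #define N 1024

    static void do_local_work(double *data, int n)
    {
        /* Stand-in for computation that does not need the incoming data. */
        for (int i = 0; i < n; i++)
            data[i] *= 2.0;
    }

    int main(int argc, char **argv)
    {
        int rank, size;
        double sendbuf[N], recvbuf[N], local[N];
        MPI_Request reqs[2];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        for (int i = 0; i < N; i++) {
            sendbuf[i] = (double) rank;
            local[i]   = (double) i;
        }

        int right = (rank + 1) % size;
        int left  = (rank - 1 + size) % size;

        /* Post the exchange first ...                                   */
        MPI_Irecv(recvbuf, N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Isend(sendbuf, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

        /* ... compute on data that does not depend on the message ...   */
        do_local_work(local, N);

        /* ... and only then wait for the transfers to complete.         */
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

        if (rank == 0)
            printf("exchange finished on %d rank(s)\n", size);

        MPI_Finalize();
        return 0;
    }

The design point is simply ordering: post MPI_Irecv/MPI_Isend before the independent computation, and call MPI_Waitall only when the received data is actually needed, so the network transfer can proceed while the CPU is busy.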
