Abstract
This paper describes a new cluster resource management system called Simple Linux Utility for Resource Management (SLURM). SLURM, initially developed for large Linux clusters at the Lawrence Livermore National Laboratory (LLNL), is a simple cluster manager that can scale to thousands of processors. SLURM is designed to be flexible and fault-tolerant, and it can be ported to clusters of different sizes and architectures with minimal effort. We are certain that SLURM will benefit both users and system architects by providing them with a simple, robust, and highly scalable parallel job execution environment for their cluster systems.
This document was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor the University of California nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or the University of California. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or the University of California, and shall not be used for advertising or product endorsement purposes. This work was performed under the auspices of the U.S. Department of Energy by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48. Document UCRL-JC-147996.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yoo, A.B., Jette, M.A., Grondona, M. (2003). SLURM: Simple Linux Utility for Resource Management. In: Feitelson, D., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2003. Lecture Notes in Computer Science, vol 2862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10968987_3
DOI: https://doi.org/10.1007/10968987_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20405-3
Online ISBN: 978-3-540-39727-4
eBook Packages: Springer Book Archive