Getting Started Guide For Windows HPC Server 2008 Beta 2
Abstract
This guide provides information for deploying an HPC cluster using the Beta 2 release of Windows HPC Server 2008. It includes step-by-step instructions to deploy and configure the head node, add compute nodes to the cluster, and verify that your cluster deployment has been successful. It also includes an overview of some of the new technologies available in Windows HPC Server 2008 Beta 2.
Copyright Information
This document supports a preliminary release of a software product that may be changed substantially prior to final commercial release, and is the confidential and proprietary information of Microsoft Corporation. It is disclosed pursuant to a non-disclosure agreement between the recipient and Microsoft. This document is provided for informational purposes only and Microsoft makes no warranties, either express or implied, in this document. Information in this document, including URL and other Internet Web site references, is subject to change without notice. The entire risk of the use or the results from the use of this document remains with the user. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation. Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property. 2008 Microsoft Corporation. All rights reserved. Microsoft, Active Directory, Excel, Windows, Windows PowerShell, Windows Server, and Windows Vista are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. All other trademarks are property of their respective owners.
Contents
Getting Started Guide for Windows HPC Server 2008 Beta 2
    What's New in Windows HPC Server 2008 Beta 2
    Compatibility with Previous Versions
    Additional Considerations
Checklist: Deploy an HPC Cluster - Overview
Step 1: Prepare for Your Deployment
    Checklist: Prepare for Your Deployment
    1.1. Review the System Requirements
    1.2. Decide How to Add Compute Nodes to Your Cluster
    1.3. Choose the Active Directory Domain for Your Cluster
    1.4. Choose a User Account for Installation and Diagnostics
    1.5. Choose a Network Topology for Your Cluster
    1.6. Prepare for Multicast (Optional)
Step 2: Deploy the Head Node
    Checklist: Deploy the Head Node
    2.1. Install Windows Server 2008 on the Head Node Computer
    2.2. Join the Head Node Computer to a Domain
    2.3. Install Microsoft HPC Pack 2008 on the Head Node Computer
Step 3: Configure the Head Node
    Checklist: Configure the Head Node
    3.1. Configure the HPC Cluster Network
    3.2. Provide Installation Credentials
    3.3. Configure the Naming of New Nodes
    3.4. Create a Node Template
    3.5. Add Drivers for the Operating System Images (Optional)
    3.6. Add or Remove Users (Optional)
Step 4: Add Compute Nodes to the Cluster
    4.1. Deploy Compute Nodes from Bare Metal
    4.2. Add Compute Nodes by Importing a Node XML File
    4.3. Add Preconfigured Compute Nodes
    4.4. Monitor Deployment Progress
Step 5: Run Diagnostic Tests on the Cluster
Step 6: Run a Test Job on the Cluster
    Checklist: Run a Test Job on the Cluster
    6.1. Create a Job Template
    6.2. Create and Submit a Job
    6.3. Create and Submit a Job Using the Command-Line Interface (Optional)
    6.4. Create and Submit a Job Using the HPC PowerShell (Optional)
HPC Cluster Security
    Groups and Authorized Operations
    HPC Cluster Credentials
    Application Firewall Exceptions
    Access for Managed Code Applications
Important Features in Windows HPC Server 2008 Beta 2
    The Service Oriented Application (SOA) Programming Model and Runtime
    Access Your HPC Cluster Using a Web Service
    HPC Cluster Reporting
    Using the HPC PowerShell
Additional Resources
Appendix 1: HPC Cluster Networking
    HPC Cluster Networks
    Supported HPC Cluster Network Topologies
    HPC Network Services
    Windows Firewall Configuration
Appendix 2: Creating a Node XML File
    Benefits of Using a Node XML File for Deployment
    How to Create a Node XML File
    Sample Node XML File
Feature Overview

Feature: Administrative tools
Compute Cluster Server 2003: The Compute Cluster Administrator snap-in enabled simpler completion of essential tasks such as configuring the cluster networking topology, setting user permissions, and monitoring jobs. Remote Installation Services (RIS) is used when deploying nodes with the automatic method.
Windows HPC Server 2008 Beta 2: The administrator console (HPC Cluster Manager) is now based on Microsoft System Center. The new interface integrates all aspects of cluster management and simplifies the completion of essential tasks such as the configuration of cluster networking, the setting of user permissions, and the monitoring of jobs and the operational health of the cluster. New features, such as node templates that use Windows Deployment Services, significantly improve the deployment of compute nodes. Cluster administrators can create node groups to view and manage collections of nodes, and a new heat map view lets them see cluster health at a glance. Other additions include new cluster and node diagnostic tests; built-in, extensible reporting; the ability to configure two head nodes for high availability, so that jobs can continue running if a head node fails and new jobs can be submitted within a short period of time after a failure; and support for cluster updates, including the ability to specify when nodes are updated so that running jobs are not interrupted.

Feature: Command-line interface
Compute Cluster Server 2003: Cluscfg command-line tool for configuration. Clusrun command-line tool for running commands remotely on one or more nodes.
Windows HPC Server 2008 Beta 2: Support for all Windows Compute Cluster Server 2003 (CCS 2003) command-line interface (CLI) commands; Windows HPC Server 2008 Beta 2 is fully compatible with existing CCS 2003 commands and also offers full support of PowerShell, including PowerShell commands for scheduling and managing jobs. More than 130 command-line tools enable the automation of system administration tasks.

Feature: Job scheduling
Compute Cluster Server 2003: The Compute Cluster Job Manager snap-in made submission and tracking of user jobs simple. Custom job view filtering.
Windows HPC Server 2008 Beta 2: A new user interface for managing jobs (HPC Job Manager), with improved support for parametric commands. Resource matching allows jobs to specify the types of resources on which they need to run. Job templates set constraints and requirements on jobs, including the ability to route jobs to specific sets of resources, and simplify end-user job submission by enabling cluster administrators to specify default job values; the default job template is customizable. Multilevel resource allocation allows jobs to take better advantage of multi-core systems by requesting resources at a granularity appropriate for their performance characteristics. Adaptive allocation of running jobs allows jobs with multiple tasks to shrink and grow as resources become available and work is completed. Pre-emption allows high-priority jobs to start sooner by taking resources away from lower-priority jobs.

Feature: Networking
Compute Cluster Server 2003: Wizard-driven configuration of network services such as Dynamic Host Configuration Protocol (DHCP), Internet Connection Sharing (ICS), Remote Installation Services (RIS), and firewall settings.
Windows HPC Server 2008 Beta 2: New wizard-driven configuration of network services such as DHCP, network address translation (NAT), Domain Name System (DNS), and firewall settings. Support for NetworkDirect, a new, high-speed Remote Direct Memory Access (RDMA) networking interface for Windows, built for speed and stability. MS-MPI is integrated with NetworkDirect, offers a programming interface identical to MPICH2, improves cluster efficiency to shorten solution time, and includes a new implementation of shared memory communications that provides better support for nodes with many cores.

Feature: Developer tools
Compute Cluster Server 2003: .NET (C#) and COM application programming interface (API) for the CCS 2003 Job Scheduler. Microsoft Message Passing Interface (MS-MPI).
Windows HPC Server 2008 Beta 2: Support for all COM APIs in CCS 2003. Service Oriented Application (SOA) programming platform. Scalable Microsoft HPC Pack 2008 API with eventing support. MS-MPI events are now integrated with Event Tracing for Windows; this enhancement helps you to troubleshoot and fine-tune the performance of your applications without having to use special trace or debug builds. For more information about the new MS-MPI tracing capability, see the Windows HPC Server 2008 Software and Driver Development Kit (SDK), which is available for download at the Windows HPC Server 2008 site on Connect (http://go.microsoft.com/fwlink/?LinkID=119579).
Additional Considerations
A side-by-side installation of Windows HPC Server 2008 Beta 2 and Windows Compute Cluster Server 2003 on the same computer is not supported. This includes the Windows HPC Server 2008 Beta 2 client utilities.
The upgrade of a Windows Compute Cluster Server 2003 head node to a Windows HPC Server 2008 Beta 2 head node is not supported.
The installation of HPC Pack 2008 adds the following server roles to the head node:
- DHCP
- Windows Deployment Services
- File Services
- Network Policy and Access Services, which enables Routing and Remote Access so that NAT services can be provided to the cluster nodes.
Before you start deploying your HPC cluster, review the list of prerequisites and initial considerations. (See Step 1: Prepare for Your Deployment.)
Deploy the head node by installing Windows Server 2008 and Microsoft HPC Pack 2008. (See Step 2: Deploy the Head Node.)
Configure the head node by following the steps in the configuration to-do list. (See Step 3: Configure the Head Node.)
Add nodes to the cluster by deploying them from bare metal, by importing an XML file, or by manually configuring them. (See Step 4: Add Compute Nodes to the Cluster.)
Run diagnostic tests to verify that the deployment of the cluster was successful. (See Step 5: Run Diagnostic Tests on the Cluster.)
Run some basic jobs on the cluster to verify that the cluster is operational. (See Step 6: Run a Test Job on the Cluster.)
Review the list of system requirements to ensure that you have all the necessary hardware and software components to deploy an HPC cluster. (See 1.1. Review the System Requirements.)
Decide if you will be adding compute nodes to your cluster from bare metal, from preconfigured nodes, or from an XML file. (See 1.2. Decide How to Add Compute Nodes to Your Cluster.)
Choose the Active Directory domain to which you will join the head node and compute nodes of your HPC cluster. (See 1.3. Choose the Active Directory Domain for Your Cluster.)
Choose an existing domain account with enough privileges to perform installation and diagnostics tasks. (See 1.4. Choose a User Account for Installation and Diagnostics.)
Choose how the nodes in your cluster will be connected, and how the cluster will be connected to your enterprise network. (See 1.5. Choose a Network Topology for Your Cluster.)
If you will be deploying nodes from bare metal and would like to multicast the operating system image that you will be using during deployment, configure your network switches appropriately. (See 1.6. Prepare for Multicast (Optional).)
Hardware Requirements
Hardware requirements for Windows HPC Server 2008 Beta 2 are very similar to those for the 64-bit editions of Windows Server 2008. Note For more information about installing Windows Server 2008, including system requirements, see http://go.microsoft.com/fwlink/?LinkID=119578.
Processor (x64-based):
Minimum: 1.4 GHz
Recommended: 2 GHz or faster
RAM:
Minimum: 512 MB
Recommended: 2 GB or more
Drive:
DVD-ROM drive
Network adapters:
The number of network adapters on the head node and on the compute nodes depends on the network topology that you choose for your cluster. For more information, see Appendix 1: HPC Cluster Networking.
Software Requirements
The following list outlines the software requirements for the head node and the compute nodes in a Windows HPC Server 2008 Beta 2 cluster:
- One of the 64-bit editions of Windows Server 2008.
- Microsoft HPC Pack 2008 Beta 2, which can be downloaded from the Windows HPC Server 2008 site on Connect (http://go.microsoft.com/fwlink/?LinkID=119579).
To enable users to submit jobs to your HPC cluster, you can install the utilities included with Microsoft HPC Pack 2008 Beta 2 on client computers. Those client computers must be running one of the following operating systems:
- Windows XP Professional with Service Pack 3 or later (x86- or x64-based)
- Windows Vista Enterprise, Windows Vista Business, Windows Vista Home, or Windows Vista Ultimate
- Windows Server 2003 Standard Edition or Windows Server 2003 Enterprise Edition with Service Pack 2 or later (x86- or x64-based)
- Windows Server 2003, Compute Cluster Edition
- Windows Server 2003 R2 Standard Edition or Windows Server 2003 R2 Enterprise Edition (x86- or x64-based)
The following is a list of things to take into consideration when choosing how to add nodes to your HPC cluster:
- When deploying nodes from bare metal, Windows HPC Server 2008 Beta 2 automatically generates computer names for your compute nodes. During the configuration process, you will be required to specify the naming convention to use when automatically generating computer names for the new nodes. Compute nodes are assigned their computer name in the order that they are deployed.
- If you want to add compute nodes from bare metal and assign computer names in a different way, you can use a node XML file. For more information about node XML files, see Appendix 2: Creating a Node XML File.
- If you want to add preconfigured nodes to your cluster, you will need to install one of the 64-bit editions of the Windows Server 2008 operating system on each node (if not already installed), as well as Microsoft HPC Pack 2008.
Important If you choose to install Active Directory Domain Services on the head node, consult with your network administrator about the correct way to isolate the new Active Directory domain from the enterprise network, or how to join the new domain to an existing Active Directory forest.
For more information about each network topology, see Appendix 1: HPC Cluster Networking. When you are choosing a network topology:
- Decide which network in the topology that you have chosen will serve as the enterprise network, which as the private network, and which as the application network.
- Ensure that the network adapter on the head node that is connected to the enterprise network is not using automatic private IP addressing (that is, the IP address for that adapter must not start with 169.254). That adapter must have a valid IP address, dynamically or manually assigned (static); a quick way to verify this is shown after this list.
- If you choose a topology that includes a private network, and you are planning to add nodes to your cluster from bare metal, ensure that there are no active DHCP or Pre-boot Execution Environment (PXE) servers on that network.
- Contact your system administrator to determine if IPSec is enforced on your domain through Group Policy. If IPSec is enforced on your domain through Group Policy, you may have problems during deployment. A workaround is to make your head node an IPSec boundary server so that compute nodes can communicate with the head node during PXE boot.
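To confirm how the enterprise-facing adapter on the head node obtained its address, you can run the following from a Command Prompt on the head node and check the adapter's IPv4 address; an address that starts with 169.254 indicates automatic private addressing and must be corrected before you continue:

ipconfig /all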
Install one of the 64-bit editions of the Windows Server 2008 operating system on the computer that will act as the head node. (See 2.1. Install Windows Server 2008 on the Head Node Computer.)
Join the computer that will act as the head node to a Microsoft Active Directory domain. (See 2.2. Join the Head Node Computer to a Domain.)
Install Microsoft HPC Pack 2008 on the computer that will act as the head node, using the installation media or from a network location. (See 2.3. Install Microsoft HPC Pack 2008 on the Head Node Computer.)
2.3. Install Microsoft HPC Pack 2008 on the Head Node Computer
After Windows Server 2008 is installed on the head node computer, and the head node is joined to an Active Directory domain, you can install Microsoft HPC Pack 2008 on the head node.

To install Microsoft HPC Pack 2008 on the head node computer
1. To start the Microsoft HPC Pack 2008 installation wizard on the computer that will act as the head node, run setup.exe from the HPC Pack 2008 installation media or from a network location.
2. On the Getting Started page, click Next.
3. On the Microsoft Software License Terms page, read or print the software license terms in the license agreement, and accept or reject the terms of that agreement. If you accept the terms, click Next.
4. On the Select Installation Type page, click Create New Compute Cluster, and then click Next.
5. On the Basic New Compute Cluster page, click Create new instance using SQLExpress, and then click Next.
6. On the Select Installation Location page, click Next.
7. On the Install Required Components page, click Install.
8. On the Installation Complete page, click Close.
Configure the cluster network by using the Network Configuration wizard. (See 3.1. Configure the HPC Cluster Network.)
Specify which credentials to use for system configuration and when adding new nodes to the cluster. (See 3.2. Provide Installation Credentials.)
Specify the naming convention to use when generating names automatically for new compute nodes. (See 3.3. Configure the Naming of New Nodes.)
Create a template that defines the steps to follow when configuring a compute node. (See 3.4. Create a Node Template.)
If you will be deploying compute nodes from bare metal and those nodes require special device drivers, add drivers for the operating system images that you created for your node template in the previous task. (See 3.5. Add Drivers for the Operating System Images (Optional).)
If you will be giving access to the cluster to other members of your organization, add or remove users. (See 3.6. Add or Remove Users (Optional).)
on the head node check box.
b. To enable DHCP services for the nodes connected to this network, select the Enable DHCP and define a scope check box, and then type the starting and ending IP addresses for the DHCP scope. If the Gateway and DNS server IP addresses have not been automatically detected, type each of these addresses.
Note
For more information about enabling NAT and DHCP on your cluster network, see HPC Network Services in Appendix 1: HPC Cluster Networking.
7. Click Next after you are done configuring the private network.
8. Repeat steps 4, 6, and 7 for the application network adapter. Click Next after you are done configuring the application network.
9. On the Firewall Setup page, select the firewall setting for the cluster:
a. To apply firewall settings automatically to head nodes and compute nodes on each network, click ON for that network.
b. To disable the firewall on a network, click OFF.
c. If you do not want to change any firewall settings, click Do not manage firewall settings.
Note
For more information about firewall settings for your cluster, see Windows Firewall Configuration in Appendix 1: HPC Cluster Networking.
10. On the Review page, verify your settings and click Configure. If you want to change any of the settings, navigate to the appropriate wizard page by clicking it on the navigation pane or by clicking Previous.
11. After the network configuration process is completed, on the Configuration Summary page, review the list of configuration items. If you want to save a report of the network configuration, click Save the configuration report.
12. To close the wizard, click Finish.
the domain user account you will use to deploy compute nodes and to run diagnostic tests. Important The account must be a domain account with enough privileges to create AD DS accounts for the compute nodes. If part of your deployment requires access to resources on the enterprise network, the account should have the necessary permissions to access those resources. If you want to restart nodes remotely from the cluster administration console (HPC Cluster Manager), the account must be a member of the local Administrators group on the head node. This requirement is only necessary if you do not have Intelligent Platform Management Interface (IPMI) tools that you can use to remotely restart the compute nodes.
To specify the compute node naming series
1. In the To do list, click Configure the naming of new nodes.
2. Type the naming series that you want to use. The preview helps you to see an example of how the naming series will be applied to the names of the compute nodes.
Note
You cannot specify a compute node naming series that consists only of numbers.
3. To save the compute node naming series that you have specified, click OK.
The type of template that you create for the initial deployment of your HPC cluster depends on how you decided to add compute nodes to your cluster. For more information, see 1.2. Decide How to Add Compute Nodes to Your Cluster.
Important
To complete the following procedure, you will need the installation media for one of the 64-bit editions of Windows Server 2008, or you must have the installation files available on a network location that is accessible from the head node computer.

To create a node template
1. In the To do list, click Create a node template.
2. On the Specify Template Name page, type a descriptive name for the template, and then click Next.
3. If you will be adding compute nodes to your cluster from bare metal:
a. On the Select Deployment Type page, click With operating system, and then click Next.
b. On the Select Operating System Image page, click Add Image.
c. On the Add Operating System Image window, click Create a new operating system image, and then type or browse to the location of the Windows setup file for one of the 64-bit editions of Windows Server 2008.
d. Type a descriptive name for the new operating system image, and then click OK.
e. After the image is added, in the Image Name list, click the image that you want to use with the template.
f. Optionally, specify if you want to multicast the operating system image during deployment. For more information, see 1.6. Prepare for Multicast (Optional) in Step 1: Prepare for Your Deployment.
g. Optionally, specify if you want to include a product key to activate the operating system on the compute nodes, and then type the product key that should be used.
h. Click Next to continue.
i. On the Specify Local Administrator Password for Compute Node page, click Use a specific password, and then type and confirm the password that you want to use.
j. Click Next to continue, and then jump to step 5 in this procedure.
4. If you will be adding preconfigured compute nodes to your cluster, on the Select Deployment Type page, click Without operating system, and then click Next.
5. On the Specify Windows Updates page, specify if you want to add a step in the template to download and install updates using Microsoft Update or the enterprise Windows Server Update Services (WSUS). Also, you can specify specific updates to be added to the template. Click Next to continue.
6. On the Review page, click Create.
5. Repeat step 4 for all the drivers that you want to add. 6. After you are done adding drivers, click Close.
After creating a node template, you can use the Add Node wizard to add compute nodes to your HPC cluster. There are three ways by which you can add compute nodes to your cluster:
4.1. Deploy Compute Nodes from Bare Metal
4.2. Add Compute Nodes by Importing a Node XML File
4.3. Add Preconfigured Compute Nodes
click Continue responding to all PXE requests. If you will not be deploying more nodes, click Respond only to PXE requests that come from existing compute nodes. 9. To track deployment progress, select the Go to Node Management to track progress check box, and then click Finish. 10. During the deployment process of a compute node, its state is set to Provisioning. When the deployment process is complete, the state changes to Offline. To bring online the nodes that have finished deploying: a. In Node Management, under By State, click Offline. b. Select all the nodes that you want to bring online. To select all the nodes that are currently offline, on the list of offline nodes, click any node and then press CTRL+A. c. In the Actions pane, click Bring Online.
Important The computers that you will add to your cluster as preconfigured compute nodes must already be running one of the 64-bit editions of the Windows Server 2008 operating system. For more information about installing Windows Server 2008, including system requirements, see http://go.microsoft.com/fwlink/?LinkID=119578. Important To complete this procedure, you must have a template that does not include a step to deploy an operating system image. If you do not have a template that does not include a step to deploy an operating system image, create one by following the steps in 3.4. Create a Node Template in Step 3: Configure the Head Node. To install Microsoft HPC Pack 2008 on a compute node computer 1. To start the HPC Pack 2008 installation wizard on the computer that will act as a compute node, run setup.exe from the HPC Pack 2008 installation media or from a network location. 2. On the Getting Started page, click Next. 3. On the Microsoft Software License Terms page, read or print the software license terms in the license agreement, and accept or reject the terms of that agreement. If you accept the terms, click Next. 4. On the Select Installation Type page, click Join Existing Compute Cluster, and then click Next. 5. On the Join Cluster page, type the computer name of the head node on your cluster, and then click Next. 6. On the Select Installation Location page, click Next. 7. On the Install Required Components page, click Install. 8. On the Installation Complete page, click Close. When HPC Pack 2008 is installed on all the compute nodes that you want to add to your cluster, follow the steps in the Add Node wizard on the compute node to add the preconfigured nodes to your cluster. To add preconfigured compute nodes to your cluster 1. If HPC Cluster Manager is not already open on the head node, open it. Click Start, point to All Programs, click Microsoft HPC Pack, and then click HPC Cluster Manager. 2. In Configuration, click To Do. 3. In the To do list, click Add compute nodes. 4. On the Select Deployment Method page, click Add compute nodes that have already been configured, and then click Next.
5. Turn on all the preconfigured nodes that you want to add to your cluster. 6. When all the preconfigured nodes are turned on, on the Before Deploying page, click Next. 7. On the Select New Nodes page, in the Node template list, click the name of a node template that does not include a step to deploy an operating system image. 8. Select the preconfigured compute nodes that you want to add to your cluster. To select all the preconfigured compute nodes, click Select all. 9. To add the selected compute nodes to your cluster, click Add. 10. On the Completing the Add Node Wizard page, click Respond only to PXE requests that come from existing compute nodes. 11. To track deployment progress, select the Go to Node Management to track progress check box. 12. To add the preconfigured nodes to your cluster, click Finish. 13. During the deployment process of a compute node, its state is set to Provisioning. When the deployment process is complete, the state changes to Offline. To bring online the nodes that have finished deploying: a. In Node Management, under By State, click Offline. b. Select all the nodes that you want to bring online. To select all nodes that are currently offline, on the list of offline nodes, click any node and then press CTRL+A. c. In the Actions pane, click Bring Online.
4. To bring online the nodes that have finished deploying: a. In Node Management, under By State, click Offline. b. Select all the nodes that you want to bring online. To select all nodes that are currently offline, on the list of offline nodes, click any node and then press CTRL+A. c. In the Actions pane, click Bring Online. 5. To see the list of nodes that were not deployed successfully: a. In Node Management, under By Health, click Provisioning Failed. b. To see the list of operations related to the deployment failure of a specific node, click that node on the list of nodes, and then click View operations in the details pane (Properties tab). The pivoted view will list all the operations related to that node. c. To see more information about a specific operation, click that operation on the list of operations. The details pane will list the log entries for that operation.
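If you prefer to work from a command line, you can also bring the deployed nodes online from HPC PowerShell on the head node. The following is a minimal sketch, assuming the Get-HpcNode and Set-HpcNodeState cmdlets that are included with HPC PowerShell (exact parameter names may differ in the Beta 2 release):

# List the nodes that have finished deploying and are still offline.
Get-HpcNode -State Offline

# Bring all of the offline nodes online in one step.
Get-HpcNode -State Offline | Set-HpcNodeState -State Online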
If you realize that the template that you assigned to a node is incorrect, or that you assigned it to the wrong node, you can cancel the assignment of the node template. To cancel the assignment of a node template 1. In Node Management, under By State, click Provisioning. 2. Click the node for which you want to cancel the template assignment, and then click Cancel in the details pane (Properties tab). The state of the node will be changed to Unknown, and the health to Provisioning Failed.
Related Documents
For more information about monitoring, see Step-by-Step Guide for Monitoring in Windows HPC Server 2008 Beta 2 (HPCSxSMonitoring.doc).
4. On the Run Diagnostics window, click Run all functional test, click All nodes, and then click OK. 5. To see the progress of the diagnostic tests and the test results, in Diagnostics, click Test Results. 6. To see detailed information about a test, double-click the test. To expand the information in a section of the test results, click the down arrow for that section.
Related Documents
For more information about running diagnostics on your HPC cluster, see Step-by-Step Guide for Diagnostics in Windows HPC Server 2008 Beta 2 (HPCSxSDiagnostics.doc).
Create a job template by running the Generate Job Template wizard in HPC Cluster Manager. (See 6.1. Create a Job Template.)
Create and submit a basic job in HPC Cluster Manager. (See 6.2. Create and Submit a Job.)
Create and submit a basic job by using the HPC command-line tools. (See 6.3. Create and Submit a Job Using the Command-Line Interface (Optional).)
Create and submit a basic job by using the cmdlets in the HPC PowerShell. (See 6.4. Create and Submit a Job Using the HPC PowerShell (Optional).)
To create a simple job template 1. If HPC Cluster Manager is not already open on the head node, open it. Click Start, point to All Programs, click Microsoft HPC Pack, and then click HPC Cluster Manager. 2. In Configuration, click Job Templates. 3. In the Actions pane, click New. 4. On the Welcome page, type Test Template for the name of the new job template, and optionally a description. Click Next to continue. 5. On the Job Run Times page, select the Jobs submitted to this template may not run for longer than check box, and then click Next without changing any settings. This will limit all jobs that are submitted using this template to run for no longer than 1 minute. 6. On the Job Priorities page, click Next without changing any settings. This will run jobs that are submitted using this template with Normal priority. 7. On the Project Names page, click Next without changing any settings. This will allow jobs from any project to be submitted using this template. 8. On the Node Groups page, click Next without changing any settings. This will allow jobs that are submitted using this template to run on any node group. 9. On the Finish page, click Finish.
c. In the Command line box, type dir.
d. In the Work directory box, type c:\Program Files. e. To add this task, click Save. 4. To limit the job so that it only runs on a specific compute node in your HPC cluster, click Resource Selection, and then specify the following resource parameters: a. Select the Run this job only on nodes in the following list check box. b. Select the check box for one of the nodes in your HPC cluster. 5. To submit the job, click Submit. 6. If you are prompted to enter your credentials, type your user name and password, and then click OK. 7. To see the progress and the results of the job that you submitted: a. In Job Management, click All Jobs. b. In the list of jobs, click the job that you submitted. c. When the state of the job is Finished, in the lower pane, double-click the task that you created in step 3.
d. In the Task Properties window, in the Results tab, the Output box will display the directory of c:\Program Files for the compute node that you selected in step 4. e. If you want to copy the results to the clipboard, click Copy output to clipboard.
6.3. Create and Submit a Job Using the Command-Line Interface (Optional)
You can create and submit a job similar to the job that you created and submitted in the previous section by using the command-line interface tools that are included with Windows HPC Server 2008 Beta 2.

To create and submit a job using the command-line interface
1. Open a Command Prompt window. Click Start, point to All Programs, click Accessories, and then click Command Prompt.
2. To create a new job, type the following command:
job new /jobname:"Directory Contents" /priority:"Lowest" /RunTime:0:0:1 /requestednodes:<ComputeNodeName>
Where <ComputeNodeName> is the name of a compute node in your HPC cluster.
3. To add a task to the job, type the following command:
job add <JobID> /workdir:"C:\Program Files" dir
Where <JobID> is the identification number for the job, as displayed on the command-line interface after typing the command in step 2.
4. To submit the job, type the following command:
job submit /id:<JobID>
Where <JobID> is the identification number for the job, as displayed on the command-line interface after typing the command in step 2.
5. If you are prompted to enter your credentials, type your password, and then press ENTER.
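If you want to check the status of the job from the same Command Prompt window, the job command can also display it. This is a brief sketch, assuming the job view subcommand that carries over from CCS 2003 (replace <JobID> with the identification number used above):

job view <JobID>

This lists the current state of the job (for example, Queued, Running, or Finished) together with its basic properties.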
6.4. Create and Submit a Job Using the HPC PowerShell (Optional)
You can also create and submit the same job that you created and submitted in the previous section by using HPC PowerShell.
Note
For more information about the HPC PowerShell, see Using the HPC PowerShell in Important Features in Windows HPC Server 2008 Beta 2.

To create and submit a job using the HPC PowerShell
1. On the head node, click Start, point to All Programs, and then click Microsoft HPC Pack.
2. Right-click HPC PowerShell, and then click Run as administrator.
3. If Windows PowerShell prompts you to confirm that you want to run the ccppsh.format.ps1xml script, type A and then press ENTER.
4. To create a new job, type the following cmdlet:
$j = New-HpcJob -Name "Directory Contents" -Priority Lowest -RunTime "0:0:1" -RequestedNodes <ComputeNodeName>
Where <ComputeNodeName> is the name of a compute node in your HPC cluster.
5. To add a task to the job, type the following cmdlet:
$j | Add-HpcTask -WorkDir "C:\Program Files" -CommandLine "dir"
6. To submit the job, type the following cmdlet:
$j | Submit-HpcJob
7. If you are prompted to enter your credentials, type your password, and then press ENTER.
Note
You can also type all three cmdlets in one line:
New-HpcJob -Name "Directory Contents" -Priority Lowest -RunTime "0:0:1" -RequestedNodes <ComputeNodeName> | Add-HpcTask -WorkDir "C:\Program Files" -CommandLine "dir" | Submit-HpcJob
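After the job is submitted, you can check on it from the same HPC PowerShell session. The following is a minimal sketch, assuming the Get-HpcJob and Get-HpcTask cmdlets that are included with HPC PowerShell (exact parameter names may differ in the Beta 2 release):

# Display the current state and properties of the job stored in $j.
Get-HpcJob -Id $j.Id

# List the tasks in that job, including the state of each task.
Get-HpcTask -JobId $j.Id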
Related Documents
For more information about creating and submitting jobs, see Step-by-Step Guide for Job Submission in Windows HPC Server 2008 Beta 2 (HPCSxSJobSubmission.doc). For more information about the job scheduler configuration, see Step-by-Step Guide for Configuring Job Submission and Scheduling Policies in Windows HPC Server 2008 Beta 2 (HPCSxSJobSchedulerConfig.doc).
Groups
Windows HPC Server 2008 Beta 2 uses the local settings for users and groups on the head node to assign administrator and user rights on the cluster. Local users and groups on the head node include the Administrators group and the Users group. When the head node is added to an Active Directory domain, the Domain Admins group is added to the Administrators group, and the Domain Users group is added to the Users group. All memberships in the Domain Users and Domain Admins groups are automatically propagated to all compute nodes and secondary head nodes on the cluster as part of the configuration process. To add a user to the cluster, you add the domain account for the user to the Users group, if it is not already part of the Domain Users group.
Note
The user account must be a domain account. A local user account on the head node is not sufficient.
Also, installation credentials are provided during cluster configuration and used to install software on compute nodes. To have the necessary permissions to add an Active Directory object (for example, a computer account), and to reboot compute nodes remotely, the user account that is associated with the installation credentials must be a member of the Domain Admins group. If the user account is not a member of the Domain Admins group, but a member of the Domain Users group instead, the domain administrator must give that user account specific permissions to add Active Directory objects, or the installation process will fail.
The following table lists the memberships and where they originate.
Membership: Local Users and Groups: Administrators
Origin: Created by default in Windows Server 2008.

Membership: Local Users and Groups: Administrators: Domain Admins
Origin: Added when the head node is joined to the Active Directory domain.

Membership: Local Users and Groups: Users
Origin: Created by default in Windows Server 2008.

Membership: Local Users and Groups: Users: Domain Users
Origin: Added when the head node is joined to the Active Directory domain.

Membership: Local Users and Groups: Users: Authenticated Users
Origin: Added by default in Windows Server 2008.
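For example, a domain user account can be added to the local Users group on the head node from an elevated Command Prompt. The domain and user name below are placeholders for illustration only:

net localgroup Users CONTOSO\jdoe /add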
Authorized Operations
The following table lists Job Scheduler operations and which of these are authorized for members of the Users and Administrators groups.
Job Scheduler operation (User / Administrator):
List jobs for every user (Yes / Yes)
List all compute nodes (Yes / Yes)
View own tasks (Yes / Yes)
Submit jobs for every user (Yes / Yes)
Cancel own job (Yes / Yes)
Modify jobs of other users (No / Yes)
View tasks for every user (No / Yes)
Run the clusrun command-line tool (No / Yes)
Note
You can provide more detailed permissions to submit jobs and use shared sessions by creating job templates. User and group rights for SOA applications and for the HPC Basic Profile Web service are the same as those for the Job Scheduler.
The following table lists HPC cluster management operations and which of these are authorized for members of the Users and Administrators groups.
HPC management operation (User / Administrator):
Cluster configuration (No / Yes)
Apply a template to a node (No / Yes)
Add a node to the cluster (No / Yes)
Run diagnostic tests on the cluster (No / Yes)
Create a computer account in Active Directory to be used during deployment (No / Yes)
Restart a node remotely (No / Yes)
Note
Windows HPC Server 2008 Beta 2 prevents unauthorized computers from being added to the cluster as compute nodes. If an unauthorized node is detected, it is marked as Unknown until a cluster administrator adds that node to the cluster by applying a node template to it.
Credentials for cluster management are provided in the following ways:
- Installation credentials entered in HPC Cluster Manager
- Installation credentials entered in HPC PowerShell
Keep the private and application networks behind the head node, and enable NAT if access to the enterprise network is required. Even when Windows Firewall is turned on, Windows HPC Server 2008 Beta 2 opens ports and application exceptions to enable internal services to run. It is the responsibility of the system administrator to create Windows Firewall exceptions for the executables of client applications.
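For example, an inbound program exception can be added from an elevated Command Prompt on the node where the client application runs. The rule name and program path below are placeholders for illustration only:

netsh advfirewall firewall add rule name="My HPC client application" dir=in action=allow program="C:\MyApp\MyClientApp.exe" enable=yes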
Examples of applications and tasks that can take advantage of this model include:
- Monte Carlo problems that simulate the behavior of various mathematical or physical systems; Monte Carlo methods are used in physics, physical chemistry, economics, and related fields
- BLAST searches (gene matching)
- Genetic algorithms (evolutionary computational meta-heuristics)
- Ray tracing (computational physics and rendering)
- Digital content creation (rendering frames)
- Microsoft Excel add-in calculations (calling add-in functions)
Related Documents
For more information about the SOA programming model and the Job Scheduler in Windows HPC Server 2008 Beta 2, see http://go.microsoft.com/fwlink/?LinkID=119581.
For more information about building a simple SOA application, see Step-by-Step Guide for Building, Deploying, and Running SOA-based Applications in Windows HPC Server 2008 Beta 2 (HPCSxSSOAApp.doc).
For more information about managing the SOA application infrastructure, see Step-by-Step Guide for Managing SOA Application Infrastructure in Windows HPC Server 2008 Beta 2 (HPCSxSSOAManaging.doc).
provide the basis for building tools and applications that can access high performance computing resources directly.
As mentioned before, there are several open source projects that use the HPC Basic Profile, among them:
- GridSAM (http://go.microsoft.com/fwlink/?LinkID=119584)
- BES++ (http://go.microsoft.com/fwlink/?LinkID=119585)
Extensions continue to be developed in response to community needs. More information can be obtained from the HPC Profile working group within the OGF (http://go.microsoft.com/fwlink/?LinkID=119586).
Related Documents
For more information about the HPC Basic Profile Web service and how to configure it for accessing your HPC cluster remotely, see Step-by-Step Guide for Configuring the Basic Profile Web Service in Windows HPC Server 2008 Beta 2 (HPCSxSBasicProfileConfig.doc). For more information about using a C# client to access the HPC Basic Profile Web service, see Step-by-Step Guide for Using the Basic Profile Web Service from C# in Windows HPC Server 2008 Beta 2 (HPCSxSUseBasicProfile.doc).
Determine how many jobs have been processed by the cluster, by generating a Job Throughput report. Determine how long jobs had to wait in the queue before they were processed, by generating a Job Turnaround report.
These reports can be generated in HPC Cluster Manager, in Charts and Reports. Also, in Charts and Reports you can monitor the current status and performance of your cluster by reviewing charts with real-time, aggregated data for node state, job throughput, network usage, and other cluster resources.
Related Documents
For more information about HPC cluster reporting, see Step-by-Step Guide for Reporting in Windows HPC Server 2008 Beta 2 (HPCSxSReporting.doc).
Additional Resources
An updated version of this Getting Started guide is available online, on the Windows HPC Server 2008 Library on TechNet: http://go.microsoft.com/fwlink/?LinkId=118024.
The Windows HPC Server Library on TechNet: http://go.microsoft.com/fwlink/?LinkId=119594.
The release notes for Windows HPC Server 2008 Beta 2: http://go.microsoft.com/fwlink/?LinkId=117922.
The Windows HPC Server 2008 site on Connect: http://go.microsoft.com/fwlink/?LinkID=119579.
Enterprise network: An organizational network to which the head node is connected, and optionally the compute nodes. The enterprise network is often the network that most users in an organization log on to when performing their job. All intra-cluster management and deployment traffic is carried on the enterprise network unless a private network (and optionally, an application network) also connects the cluster nodes.
Private network: A dedicated network that carries intra-cluster communication between nodes. This network carries management, deployment, and application traffic if no application network exists.
Application network: A dedicated network, preferably with high bandwidth and low latency. These characteristics are important so that this network can perform latency-sensitive tasks, such as carrying parallel MPI application communication between compute nodes.
The following table lists and describes details about the different components in this topology (compute nodes isolated on a private network):

Network adapters: The head node has two network adapters. Each compute node has one network adapter. The head node is connected to both an enterprise network and a private network. The compute nodes are connected only to the private network.
Traffic: The private network carries all communication between the head node and the compute nodes, including deployment, management, and application traffic (for example, MPI communication).
Network Services: The default configuration for this topology has NAT enabled on the private network in order to provide the compute nodes with address translation and access to services and resources on the enterprise network. DHCP is enabled by default on the private network to assign IP addresses to compute nodes. If a DHCP server is already installed on the private network, then both NAT and DHCP will be disabled by default.
Security: The default configuration on the cluster has the firewall turned ON for the enterprise network and turned OFF for the private network.
Considerations when selecting this topology: Cluster performance is more consistent because intra-cluster communication is routed onto the private network. Network traffic between compute nodes and resources on the enterprise network (such as databases and file servers) passes through the head node; for this reason, and depending on the amount of traffic, this might impact cluster performance. Compute nodes are not directly accessible by users on the enterprise network, which has implications when developing and debugging parallel applications for use on the cluster.
The following table lists and describes details about the different components in this topology (all nodes on both the enterprise network and a private network):

Network adapters: The head node has two network adapters. Each compute node has two network adapters. All nodes in the cluster are connected to both the enterprise network and a dedicated private cluster network.
Traffic: Communication between nodes, including deployment, management, and application traffic, is carried on the private network in this topology. Traffic from the enterprise network can be routed directly to a compute node.
Network Services: The default configuration for this topology has DHCP enabled on the private network, to provide IP addresses to the compute nodes. NAT is not required in this topology because the compute nodes are connected to the enterprise network, so this option is disabled by default.
Security: The default configuration on the cluster has the firewall turned ON for the enterprise network and OFF for the private network.
Considerations when selecting this topology: This topology offers more consistent cluster performance because intra-cluster communication is routed onto a private network. It is well suited for developing and debugging applications because all compute nodes are connected to the enterprise network. It provides easy access to compute nodes by users on the enterprise network, and faster access to enterprise network resources by the compute nodes.
The following table lists and describes details about the different components in this topology (compute nodes isolated on private and application networks):

Network adapters: The head node has three network adapters: one for the enterprise network, one for the private network, and a high-speed adapter that is connected to the application network. Each compute node has two network adapters: one for the private network and another for the application network.
Traffic: The private network carries deployment and management communication between the head node and the compute nodes. Jobs running on the cluster use the high-performance application network for cross-node communication.
Network Services: The default configuration for this topology has both DHCP and NAT enabled for the private network, to provide IP addressing and address translation for compute nodes. DHCP is enabled by default on the application network, but not NAT. If a DHCP server is already installed on the private network, then both NAT and DHCP will be disabled by default.
Security: The default configuration on the cluster has the firewall turned ON for the enterprise network and turned OFF on the private and application networks.
Considerations when selecting this topology: This topology offers more consistent cluster performance because intra-cluster communication is routed onto the private and application networks. Compute nodes are not directly accessible by users on the enterprise network in this topology.
The following table lists and describes details about the different components in this topology (all nodes on the enterprise, private, and application networks):

Network adapters: The head node has three network adapters. All compute nodes have three network adapters. The network adapters are for the enterprise network, the private network, and a high-speed adapter for the high-performance application network.
Traffic: The private cluster network carries only deployment and management traffic. The application network carries latency-sensitive traffic, such as MPI communication between nodes. Network traffic from the enterprise network reaches the compute nodes directly.
Network Services: The default configuration for this topology has DHCP enabled for the private and application networks to provide IP addresses to the compute nodes on both networks. NAT is disabled for the private and application networks because the compute nodes are connected to the enterprise network.
Security: The default configuration on the cluster has the firewall turned ON for the enterprise network and turned OFF on the private and application networks.
Considerations when selecting this topology: This topology offers more consistent cluster performance because intra-cluster communication is routed onto the private and application networks. It is well suited for developing and debugging applications because all cluster nodes are connected to the enterprise network. It provides easy access to compute nodes by users on the enterprise network, and faster access to enterprise network resources by the compute nodes.
The following table lists and describes details about the different components in this topology (all nodes on the enterprise network only):

Network adapters: The head node has one network adapter. All compute nodes have one network adapter. All nodes are on the enterprise network.
Traffic: All traffic, including intra-cluster, application, and enterprise traffic, is carried over the enterprise network. This maximizes access to the compute nodes by users and developers on the enterprise network.
Network Services: This topology does not require NAT or DHCP because the compute nodes are connected to the enterprise network.
Security: The default configuration on the cluster has the firewall turned ON for the enterprise network.
Considerations when selecting this topology: This topology offers easy access to compute nodes by users on the enterprise network, and access to resources on the enterprise network by individual compute nodes is faster. Like topologies 2 and 4, it is well suited for developing and debugging applications because all cluster nodes are connected to the enterprise network. Because all nodes are connected only to the enterprise network, you cannot use Windows Deployment Services to deploy compute node images using the new deployment tools in Windows HPC Server 2008 Beta 2.
DHCP Server
A DHCP server assigns IP addresses to network clients. Depending on the detected configuration of your HPC cluster and the network topology that you choose for your cluster, the compute nodes will receive IP addresses from the head node running DHCP, from a dedicated DHCP server on the private network, or from a DHCP server on the enterprise network.
Port: Required by
5969: Required by the client tools on the enterprise network to connect to the HPC Job Scheduler Service on the head node.
9892, 9893: Used by the HPC Management Service on the compute nodes to communicate with the HPC SDM Service on the head node, and for communication between the HPC Management Service on the compute nodes and the HPC Job Scheduler Service on the head node.
5970: Used for communication between ExecutionClient.exe on the compute nodes and the HPC Management Service on the head node. ExecutionClient.exe is used during the deployment process of a compute node. It performs tasks such as imaging the computer, installing all the necessary HPC components, and joining the computer to the domain.
9794: Used for communication between the client application on the enterprise network and the services provided by the WCF broker node.
1856: Used by the HPC Job Scheduler Service on the head node to communicate with the HPC Node Manager Service on the compute nodes.
8677: Used for communication between the HPC MPI Service on the head node and the HPC MPI Service on the compute nodes.
6729: Used for management services traffic coming from the compute nodes to the head node or WCF broker node.
5800: Used for communication between the HPC command-line tools on the enterprise network and the HPC Job Scheduler Service on the head node.
5801: Used by the remote node service on the enterprise network to enumerate nodes in a node group, or to bring a node online or take it offline.
5999: Used by HPC Cluster Manager on the enterprise network to communicate with the HPC Job Scheduler Service on the head node.
443: Used by the clients on the enterprise network to connect to the HPC Basic Profile Web service on the head node.
You can pre-stage a PXE deployment of compute nodes for your HPC cluster by importing a node XML file that lists all the computers that you will be adding to the cluster. Compute nodes can be deployed either from bare metal or as preconfigured nodes. Preconfigured nodes that are added to your HPC cluster using a node XML file do not need to be manually approved into the cluster, which makes the deployment process more efficient and streamlined. Importing a node XML file is also a simple and efficient way to associate properties with compute nodes. Properties that can be associated with compute nodes include location (data center, rack, and chassis), a Windows product key, a node template, and tags that are used to automatically create node groups. You can give specific computer names (NetBIOS names) to compute nodes that are deployed from bare metal, without having to worry about powering them on in a specific order, because in the node XML file each computer name is already associated with a specific SMBIOS GUID or MAC address (or both). A sample node XML file appears after the table that follows.
Location (Required: No)
Optional element. Contains attributes with information about the location of the compute node.

Location: DataCenter (Required: No)
Optional attribute of the Location element. Specifies the name of the data center where the compute node is located.

Location: Rack (Required: No)
Optional attribute of the Location element. Specifies the name or number of the server rack where the compute node is located.

Location: Chassis (Required: No)
Optional attribute of the Location element. Specifies the name or number of the chassis used for the compute node.

Template (Required: No)
Optional element. This element is required when deploying compute nodes from bare metal. Contains attributes with information about the node template that will be used to deploy the compute node.

Template: Name (Required: Yes)
Required attribute of the Template element. This attribute is required only when a Template element is included. Specifies the name of the node template that will be used to deploy the compute node. If the specified node template name does not exist on the head node, the deployment will fail. If you are deploying compute nodes from bare metal, this attribute must specify the name of a node template that includes a step to deploy an operating system image, or your deployment will fail.

Template: Provisioned (Required: No)
Optional attribute of the Template element. Specifies whether the node is a preconfigured node (True) or not (False).

MacAddress (Required: No)
Optional element. Specifies the MAC address of the network adapter used by the compute node. If you are deploying compute nodes from bare metal, you must specify this element or the MachineGuid attribute, or the deployment will fail. You must also specify this element if the cluster nodes in your system have SMBIOS GUIDs that are not unique (that is, two or more nodes in the node XML file have the same value for the MachineGuid attribute). There can be multiple instances of this element if the compute node uses more than one adapter. Ensure that you specify only MAC addresses that exist in the compute node; specifying a MAC address that does not exist in a compute node might cause the import of that node to fail.

Tag (Required: No)
Optional element. Specifies the name of the node group to which the compute node should be added during deployment. There can be multiple instances of this element if the compute node should be added to more than one node group.

Name (Required: Yes)
Required attribute. Specifies the computer name (NetBIOS name) of the compute node. If you are deploying compute nodes from bare metal, this attribute specifies the computer name that will be assigned to the node during deployment. If you are deploying preconfigured nodes, this attribute specifies the current computer name of the compute node. If the specified name is that of a preconfigured node that has already been added to the cluster (that is, it is not in the Unknown state), the node XML file will fail to import.

Domain (Required: No)
Optional attribute. Specifies the Active Directory domain to which the compute node should be added. If this attribute is not specified, the Active Directory domain of the head node is used.

ManagementIpAddress (Required: No)
Optional attribute. Specifies information required for Intelligent Platform Management Interface (IPMI) integration. You only need this attribute if you are using IPMI tools to manage power on your cluster.

MachineGuid (Required: No)
Optional attribute. Specifies the SMBIOS GUID of the computer where the compute node is deployed. If you are deploying compute nodes from bare metal, you must specify this attribute or the MacAddress element, or the node XML file will fail to import.

ProductKey (Required: No)
Optional attribute. Specifies the Windows product key that will be used to activate the operating system on the compute node. The product key is used during the activation task of a node template that includes a step to deploy an operating system image. The product key that you specify must match the edition of the operating system in the image that is used by the node template.
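The following is a minimal sketch of a node XML file for deploying two nodes from bare metal, built from the elements and attributes described in the table above. The root element, the absence of an XML namespace, and the exact placement of each value are assumptions, and all computer names, the domain, the GUID, the MAC addresses, the template name, the tag, and the location values are fictitious examples. To confirm the exact schema for your cluster, export a node XML file from an existing node as described later in this section, and use it as a starting point.

<?xml version="1.0" encoding="utf-8"?>
<!-- The root element name and any required namespace are placeholders here;
     copy them from a node XML file exported from your cluster. -->
<Nodes>
  <!-- Bare-metal node: a Template element, and a MacAddress element or a
       MachineGuid attribute, are required, or the deployment will fail. -->
  <Node Name="COMPUTE-001" Domain="CONTOSO"
        MachineGuid="{00000000-0000-0000-0000-000000000001}">
    <Location DataCenter="DC1" Rack="1" Chassis="1" />
    <Template Name="Default ComputeNode Template" Provisioned="False" />
    <MacAddress>00-15-5D-01-10-01</MacAddress>
    <Tag>Rack1</Tag>
  </Node>
  <Node Name="COMPUTE-002" Domain="CONTOSO">
    <Location DataCenter="DC1" Rack="1" Chassis="2" />
    <Template Name="Default ComputeNode Template" Provisioned="False" />
    <MacAddress>00-15-5D-01-10-02</MacAddress>
    <Tag>Rack1</Tag>
  </Node>
</Nodes>

Because the Template element for a bare-metal node must name a node template that includes a step to deploy an operating system image, verify the template name against the node templates listed on the head node before importing the file.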
Creating a Node XML File for Deployment from Preconfigured Compute Nodes
When creating a node XML file for a deployment from preconfigured compute nodes, you only need to specify the computer name of each compute node. You can also specify the SMBIOS GUID or the MAC address of the computer, but it is not necessary. When creating the file:
- Specify the computer name of the compute node in the Name attribute for each compute node.
- If a node template is not specified for a compute node in the node XML file, the compute node will be listed in the Unknown state in Node Management. To add that compute node to the cluster, you will need to manually assign a template to the node. For more information about node templates, see 3.4. Create a Node Template.
- Ensure that the node template names that are specified in the node XML file match the names of the node templates listed on the head node. The node templates that are specified in the node XML file do not have to include a step to deploy an operating system image. If the specified node template does not include a step to deploy an operating system image, the node will be added to the cluster successfully, but you will not be able to reimage the node later on. If the specified node template does include a step to deploy an operating system image, the deployment process will skip that step when adding the preconfigured compute node to the cluster.
- To specify that a compute node is preconfigured, specify a True value for the Provisioned attribute of that node.
- Specify any location information that you want to be attached to the node.
- If you want nodes to be automatically added to specific node groups during deployment, specify the Tag element with the name of the node group for each compute node.
- If you are using a retail Windows product key, you can specify it in the node XML file.
- If your IPMI integration requires a BMC IP address for each compute node, it can be added to the node XML file.
A minimal example follows this list.
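Here is a minimal sketch of a node XML file for two preconfigured compute nodes that illustrates these guidelines. As with the earlier example, the root element and the absence of a namespace are placeholders, and the computer names, template name, and node group tag are fictitious; confirm the exact schema against a node XML file exported from your cluster.

<?xml version="1.0" encoding="utf-8"?>
<!-- Root element and namespace are placeholders; copy them from an exported node XML file. -->
<Nodes>
  <!-- Preconfigured nodes: only the Name attribute is required.
       Provisioned="True" marks the node as preconfigured. -->
  <Node Name="COMPUTE-101">
    <Template Name="Preconfigured Node Template" Provisioned="True" />
    <Tag>Chemistry</Tag>
  </Node>
  <Node Name="COMPUTE-102">
    <Template Name="Preconfigured Node Template" Provisioned="True" />
    <Tag>Chemistry</Tag>
  </Node>
</Nodes>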
To export a node XML file for the nodes in your cluster:
1. Click Start, point to All Programs, click Microsoft HPC Pack, and then click HPC Cluster Manager.
2. In Node Management, click Nodes.
3. Select the nodes that you want to include in the node XML file. To select all nodes in your cluster, click any node on the list of nodes and then press CTRL+A.
4. On the Actions pane, click Export Node XML.
5. Type a name for the node XML file, and then click Save.