HPC Metering Protocol
Prepared for:
U.S. Department of Energy
Office of Energy Efficiency and Renewable Energy
Federal Energy Management Program
Prepared by:
Thomas Wenning
Michael MacDonald
Oak Ridge National Laboratory
September 2010
Introduction
Data centers in general are continually using more compact and energy intensive central processing units,
but the total number and size of data centers continues to increase to meet progressive computing
requirements. In addition, efforts are underway to consolidate smaller data centers across the country.
This consolidation is resulting in a growth of high-performance computing facilities (i.e. -
supercomputers) which consume large amounts of energy to support the numerically intensive
calculations they perform. The growth in electricity demand at individual data centers, coupled with the
increasing number of data centers nationwide, are causing a large increase in electricity demand
nationwide.
In the EPA’s Report to Congress on Server and Data Center Energy Efficiency Public Law 109-431,
2007, the report indicated that US data centers consumed about 61 billion kilowatt-hours (kWh) in 2006,
which equates to about 1.5% of all electricity used in the US at that time. The report then suggested that
the overall consumption would rise to about 100 billion kWh by 2011 or about 2.9% of total US
consumption. With this anticipated rapid increase in energy consumption, the U.S. Department of Energy
(DOE) is pursuing means of increasing energy efficiency in this rapidly transforming information
technology sector.
This report is part of the DOE effort to develop methods for measurement in High Performance
Computing (HPC) data center facilities and document system strategies that have been used in DOE data
centers to increase data center energy efficiency.
NOTICE
This manuscript has been authored by UT-Battelle, LLC, under Contract No. DE-AC05-00OR22725 with the U.S.
Department of Energy. The United States Government retains and the publisher, by accepting the article for
publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-
wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United
States Government purposes.
Contacts
Abbreviations and Acronyms
ASHRAE American Society of Heating, Refrigerating, and Air Conditioning Engineers
BAS Building automation system
CPU Central processing unit (computer)
CRAC Computer room air conditioner
CSB Computer Science Building — Building 5600
DCeP Data center energy productivity
DCiE Data center infrastructure efficiency
DOE US Department of Energy
EERE DOE Office of Energy Efficiency and Renewable Energy
EPA US Environmental Protection Agency
FEMP Federal Energy Management Program
FLOP Floating-point operation (computer calculation)
HVAC Heating, ventilating, and air conditioning
IT Information technology
kW Kilowatt
kWh Kilowatt-hour
LEED Leadership in Energy and Environmental Design
MFLOP Mega-FLOP
MRF Multiprogram Research Facility
NSF National Science Foundation
ORNL Oak Ridge National Laboratory
PDU or RDU Power distribution unit
PUE Power usage effectiveness
TGG The Green Grid
TVA Tennessee Valley Authority
UPS Uninterruptible power supply
VFD Variable frequency drive (or variable flow drive)
W Watt
Contents
1 Metering Background
2 Metering Purpose
3 Levels of Metering
4 Performance Metrics
5 Metering Equipment
6 Key Equipment in HPC Facilities
7 Oak Ridge National Laboratory Metering Case Study
7.1 Site Background
8 Metering Protocol
8.1 Electric System and Metering
8.2 Electric Data Measurement
8.3 Data Center Room Cooling and Measurement
8.4 Chilled Water System and Metering
8.5 Chiller Plant Measurements
9 First-Floor Performance Measurement Results
10 Second-Floor Performance Measurement Results
11 Future Plans for the Data Centers
12 References
Appendix A: Other Data Center Metrics and Benchmarking
1 Metering Background
Significant efforts to improve the understanding and benchmarking of data center energy efficiency have occurred over the past few years, driven by numerous regulatory and institutional initiatives. The U.S. Environmental Protection Agency (EPA) “Report to Congress on Server and Data Center Energy Efficiency” (2007) and the European Commission’s “Code of Conduct on Data Centres Energy Efficiency, Version 1” (2008) are only two examples of regulatory interest in data center efficiency.
Future rules and regulations regarding data center power consumption will be developed primarily by the
federal government. DOE is also pursuing multiple requirements and guidelines for its own facilities. DOE and its facility contractors invest more than $2 billion per year in information technology resources,
a large component of which includes desktop and laptop computers utilized by end users, and servers and
storage media maintained in data centers. The Energy Independence and Security Act of 2007 directed
agencies to improve energy efficiency, reduce energy costs, and reduce greenhouse gas emissions. In
addition, the Department issued Order 450.1A in 2008 that required programs and sites to implement a
number of environmental stewardship practices, including enabling power management features on
computers and other electronic equipment. Further goals have been spelled out in the Department of
Energy’s Strategic Sustainability Performance Plan recently released in 2010 to address the requirements
of Executive Order 13514, Federal Leadership in Environmental, Energy and Economic Performance,
signed by the President on 10/5/2009.
DOE’s Office of Energy Efficiency and Renewable Energy (EERE) is working to evaluate energy
efficiency opportunities in data centers. As part of this process, energy assessments of data center
facilities will be conducted under the Save Energy Now program. These assessments are designed to help
data center professionals identify energy saving measures that are most likely to yield the greatest energy
savings. The assessments are not intended to be a complete energy audit, but rather, the process is meant
to educate data center staff and managers on an approach that can be used to identify potential energy
saving opportunities that can be further investigated. The intent is that each site will continue to track its own energy performance over time, documenting its performance metrics and the actions implemented.
Practical performance metrics have been developed by The Green Grid (TGG), an industry consortium
active in developing metrics and standards for the IT industry. DOE and TGG have a Memorandum of
Understanding signed in 2007 to cooperate on multiple fronts to address improving data center energy
and water efficiency. In 2009, TGG and ASHRAE published a book entitled “Real-Time Energy Consumption Measurements in Data Centers,” a comprehensive resource discussing various data center measurements.
2 Metering Purpose
Metering projects are undertaken for a variety of reasons. Meters allow for a better understanding of a facility’s power and cooling systems. Metered data can be used to establish benchmarks for the current
state of operations. Benchmarking and baselining enables a facility to track performance and efficiency
improvements over time. In addition, benchmarking allows for management and operations to take a
proactive approach to identifying performance improvement projects and technologies. With meters in place, the performance of new equipment, retrofits, and system upgrades (HVAC, lighting, controls, etc.) can easily be tracked and documented over time. Meters also provide reliable measurement and verification of the expected energy benefits of system improvement projects.
Metering can be utilized as a diagnostic tool to continuously monitor, track, and improve facility
performance to ensure long-term efficient operation. If properly set up, meters can also enable a facility
to consistently track, report, and communicate various metrics.
From a facility management perspective, installed meters help establish system benchmarks that can be compared against other facilities. This information can also be used to set future performance goals, determine specific improvement targets, and verify that targets have been met, enabling management to offer incentives for meeting and exceeding established targets. Analysis of metered data can also improve planning for future utility needs.
Metering projects typically carry a substantial cost, so site managers often struggle to justify them solely from a capital cost-savings perspective. Installing meters does not directly save energy;
however, intelligent integration of meters into a system can allow for improved system performance,
which in turn can save energy. Building management systems often use the information from various
meters to directly influence and dictate operations and control of key systems. In high performance
computing facilities, intelligent integration of meters can play a vital role in efficient system operation.
In high performance computing facilities, there are a large number of potential measurements to be taken, including electrical power and energy, temperatures, relative humidity, flow rates, and pressures throughout the facility (see the table of measurable equipment and key measurements later in this report).
With all these possible measurements, it is easy to imagine how a metering project can quickly become a
costly and complex undertaking. Thus, it is important to establish the purpose at the outset of a metering project and to understand what data need to be acquired. Site managers must be able to set clear goals
for their metering project. This often requires resolving differences between desired and realistic
metering expectations. Several main considerations to account for during the planning stages of a
metering project include (ASHRAE):
• Project Goals – Establish the project goals and data requirements before selecting hardware.
• Project Cost and Resources – Determine the feasibility of the project given the available
resources.
• Data Products – Establish the desired final output data type and format before selecting data
measurement points.
• Data Management – Identify proper computer and personnel resources to handle data collection
needs.
• Data Quality Control – Identify the system to be implemented to check and validate data quality.
• Commitment – Projects often require long-term commitments of personnel and resources.
• Accuracy Requirements – Determine the required accuracy of the final data early in the project.
Developing a metering project often becomes an iterative process, cycling among budget constraints, equipment costs, and metering goals. The ASHRAE Handbook – HVAC Applications, 2007 presents a nine-step flowchart for this iterative planning process.
3 Levels of Metering
There are four general levels of utility data that can be captured with varying degrees of metering. The levels range from broad site-level consumption metering down to specific end-use measurements (ASTM E 1465). The four levels are:
1. Site Level
o General utility coming into a site. The site may have several buildings and other end use
equipment.
2. Building Level
o Metering at the utility feed to an individual building. This encompasses all energy used
within a given building.
3. System Level
o Sub-metering at the system level. This may include whole systems such as chiller
plants, lighting, and computer room air conditioners (CRACs).
4. Component Level
o Sub-metering within systems. This may include flow and temperatures within a chiller
system.
Detailed sub-metering at the component level provides the greatest resolution of energy consumption within a facility and allows for more control options. It also provides the most feedback for the user to analyze; however, it is often the most expensive option for permanent metering installations. Metering only at the site and building level is often the cheapest option, but it is generally insufficient for determining system and facility performance.
One method for minimizing metering costs is to monitor energy as high in the distribution system as
possible. Doing so minimizes the number of monitoring nodes and therefore reduces equipment needs.
However, measuring too high in a system leads to a poor understanding of end-use consumption and makes it difficult to assess system performance. A rule of thumb is not to separately meter an end use if its expected consumption is less than 10% of the consumption of the node above it (ASTM E 1465).
Multiple levels of nodes allow for some redundant metering which can be used to help identify
installation problems and can be used to facilitate a comparison between end-use and utility meter data.
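The comparison between levels and the 10% rule of thumb lend themselves to simple automated checks. The following minimal Python sketch illustrates the idea; the meter names, energy values, and the 5% closure tolerance are hypothetical and not drawn from any particular installation.

    # Sketch (hypothetical values): closure check between a parent meter and its
    # sub-meters, plus the ASTM E 1465 10% rule of thumb for separate metering.

    def submeters_close(parent_kwh, child_kwh, tolerance=0.05):
        """Return (ok, gap) where gap is the fractional mismatch between the
        parent meter and the sum of its sub-meters."""
        gap = abs(parent_kwh - sum(child_kwh.values())) / parent_kwh
        return gap <= tolerance, gap

    def worth_metering(end_use_kwh, parent_kwh, threshold=0.10):
        """Rule of thumb: meter an end use separately only if it is expected to
        be at least ~10% of the consumption of the node above it."""
        return end_use_kwh / parent_kwh >= threshold

    building_kwh = 1_000_000                      # building-level meter reading
    system_kwh = {"chiller plant": 420_000,
                  "CRAC units": 180_000,
                  "IT load": 350_000,
                  "lighting": 40_000}             # system-level sub-meters

    ok, gap = submeters_close(building_kwh, system_kwh)
    print(f"sub-meters close within tolerance: {ok} (gap = {gap:.1%})")
    for name, kwh in system_kwh.items():
        print(f"{name}: separate meter justified? {worth_metering(kwh, building_kwh)}")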
4 Performance Metrics
The primary metrics reported are energy related: the Power Usage Effectiveness (PUE) and the Data Center Infrastructure Efficiency (DCiE), both from The Green Grid (TGG, www.thegreengrid.org). These metrics are defined in several TGG white papers (see White Paper #6 on the website).
PUE is calculated as the Total Facility Power divided by the IT Equipment Power, and DCiE is the reciprocal of PUE expressed as a percentage. The Total Facility Power is defined as the power measured at the utility meter that is dedicated solely to the datacenter (an important distinction in mixed-use buildings that house datacenters as one of a number of consumers of power). The IT Equipment Power is the power drawn by the equipment used to manage, process, store, or route data within the data center. The components of the loads in these metrics can be described as follows:
IT EQUIPMENT POWER. This includes the load associated with all of the IT equipment, such as
compute, storage, and network equipment, along with supplemental equipment such as KVM
switches, monitors, and workstations/laptops used to monitor or otherwise control the datacenter.
TOTAL FACILITY POWER. This includes everything that supports the IT equipment load such as:
• Power delivery components such as UPS, switch gear, generators, PDUs, batteries, and
distribution losses external to the IT equipment.
• Cooling system components such as chillers, computer room air conditioning units
(CRACs), direct expansion air handler units, pumps, and cooling towers.
• Compute, network, and storage nodes.
• Other miscellaneous component loads such as datacenter lighting.
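The following short Python sketch shows the arithmetic of the two metrics using these load definitions; the kWh values are hypothetical.

    # Sketch of the PUE and DCiE arithmetic (hypothetical kWh totals over the
    # same measurement period).

    def pue(total_facility_kwh, it_equipment_kwh):
        return total_facility_kwh / it_equipment_kwh

    def dcie(total_facility_kwh, it_equipment_kwh):
        # DCiE is the reciprocal of PUE, expressed as a percentage.
        return 100.0 * it_equipment_kwh / total_facility_kwh

    it_kwh = 2_400_000        # compute, storage, network, and monitoring equipment
    facility_kwh = 3_200_000  # IT load plus power delivery, cooling, and lighting

    print(f"PUE  = {pue(facility_kwh, it_kwh):.2f}")    # 1.33
    print(f"DCiE = {dcie(facility_kwh, it_kwh):.0f}%")  # 75%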
One so-called “green” metric, used for the “Green500” list of computers, is MFLOP/W. This metric has some limited value for understanding how “green” a computer is, but it is unduly influenced by running with limited memory, which reduces the ability to handle certain types of important tasks.
A better data center productivity (DCP) metric has been identified by TGG as DCeP. DCeP is envisioned as one of a family of DCP metrics, designated generically as DCxP. The energy productivity metric, DCeP, is defined by TGG (White Paper #18) as the useful work produced by the data center divided by the total energy consumed by the data center over the same period. The major issue facing this productivity metric at this time is developing a meaningful measure of the numerator (useful work produced). Several proposals have been made, but no good solution has yet emerged (see TGG White Paper #24).
5 Metering Equipment
In the broadest sense, there are two types of data: time-dependent and time-independent. Time-dependent data include weather and energy consumption data; time-independent data include facility descriptive data and project cost data. Various time-dependent data can be measured using an array of different measurement devices, most of which can capture data at a range of time intervals. Capturing data at the smallest possible time interval provides the most resolution; however, it also produces exceedingly large quantities of data, which can become too much information to sort through. If data are to be captured at increments of less than an hour, an automated collection and processing system is very beneficial. Using outdoor air temperature as an example, data should be collected at least once daily. Shorter increments, such as hourly, provide more clarity when comparing the temperature to equipment electricity consumption. Increments of less than an hour can be used to continuously calculate near-instantaneous system performance for systems such as a chiller plant; however, this can only be achieved with an automatic collection system.
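As a simple illustration of that kind of automated roll-up, the Python sketch below aggregates hypothetical 15-minute average-kW readings into daily kWh totals.

    # Sketch: aggregate 15-minute average-kW samples into daily kWh totals,
    # the kind of roll-up an automated collection system performs.
    # The readings are hypothetical.

    from collections import defaultdict
    from datetime import datetime, timedelta

    INTERVAL_HOURS = 0.25   # 15-minute logging interval

    start = datetime(2010, 7, 1)
    readings = [(start + timedelta(minutes=15 * i), 450.0 + (i % 4))   # (timestamp, avg kW)
                for i in range(2 * 96)]                                # two days of samples

    daily_kwh = defaultdict(float)
    for timestamp, avg_kw in readings:
        daily_kwh[timestamp.date()] += avg_kw * INTERVAL_HOURS         # kW x h = kWh

    for day, kwh in sorted(daily_kwh.items()):
        print(day, round(kwh, 1))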
The following tables provide a cursory breakdown of various metering technologies in the marketplace, including the measurement type, sensor, application, and relative accuracy of each technology. For expanded information on the applications and limitations of the sensors listed, please refer to ASHRAE Handbook – Fundamentals, 2009 and Real-Time Energy Consumption Measurements in Data Centers, 2009.
Thermodynamic Measurements

Measurement   Sensor                             Application                   Accuracy
Temperature   Thermocouples                      Any                           1.0 - 5.0%
              Thermistors                        Any                           0.1 - 2.0%
              Resistance Temperature Detectors   Any                           0.01 - 1%
Pressure      Bourdon Tube                       Pressure in pipe              0.25 - 1.5%
              Strain Gage                        Pressure in pipe              0.1 - 1%
Humidity      Psychrometer                       Above freezing temperatures   3 - 7%
              Hygrometer                         Any                           Varies

Electrical Measurements

Measurement   Sensor                  Application               Accuracy
Current       Solid Core              Permanent Installations   Varies
              Split Core              Permanent Installations   Varies
              Clamp-on / Flex         Temporary                 Varies
Voltage       Potential Transformer   Any                       Varies
              Voltage Divider         Low Voltage AC or DC      Varies
Power         Portable Meter          Temporary                 Varies
              Panel Meter             Permanent Installations   Varies
              Revenue Meter           Permanent Installations   Varies
              Power Transducer        Monitoring                Varies
The following table from ASHRAE Handbook – HVAC Applications, 2007 explicitly calls out the accuracy and reliability issues of some of the most commonly used metering instrumentation. One important issue that is often overlooked in metering installations is the need to periodically recalibrate sensors. If sensors are out of calibration and are being used in system controls, they may be causing substantial inefficiencies in the system. Sensors that are used only for data collection, and not for control, still need to be recalibrated to ensure the accuracy of the calculated metrics and benchmarks.
Instrumentation Accuracy and Reliability
When it comes to capturing data from all the various sources in an HPC facility, it is best to do so using an automated system. Though some measurements can be captured manually, this approach often proves to be cumbersome over time and ultimately becomes an unsustainable practice. The best way to capture, store, and analyze the measurement information is by means of a data acquisition system (DAS), also known as a building automation system, building management system, energy monitoring and control system, or supervisory control and data acquisition system. These systems often serve multiple purposes: they can monitor and trend data, record system status, record energy consumption and demand, record hours of operation, control subsystem functions, produce summary reports, and print alarms when systems do not operate within specified limits. One example of the usefulness of controlling subsystems is room humidity control. A DAS allows the relative humidity of a computer room to be maintained through central processing and control instead of having numerous humidification and dehumidification units fighting one another in an effort to meet their localized sensor requests. Numerous platforms are already used in the marketplace; examples include PowerNet and Metasys. A breakdown of various data acquisition systems and their general purposes is described in the table below.
The following table from ASHRAE Handbook – HVAC Applications, 2007 lists details about some of the
concerns associated with using and installing data acquisition hardware in systems.
6 Key Equipment in HPC Facilities
Measurable Equipment and Key Measurements in HPC Facilities

• General Measurements: outdoor temperature, outdoor relative humidity, indoor temperatures, indoor relative humidities
• Servers / Storage / Networking (internal fans): current, voltage, power, air intake temperature
• Uninterruptible Power Supplies / Power Distribution Units: power, current, voltage
• Transformers: current, voltage
• Automatic Transfer Switches: power, current, voltage
• Computer Room Air Conditioners / Computer Room Air Handling Units (compressors, blowers/fans, pumps, humidifiers, reheaters): temperature, flow rate, power, voltage, current
• Chillers (compressor, heat exchangers): temperature, flow rate, power, voltage, current
• Cooling Towers (blowers/fans, pumps): current, voltage, power, flow rate, pressure
• Pumps / Fans / Blowers: current, voltage, power, flow rate, pressure
• Heat Exchangers: temperature, flow rate
• Lighting: current, voltage
• Distributed Energy Systems / Combined Heat & Power Systems (components vary): measurements vary by equipment and may include temperature, flow rate, power, voltage, current
7 Oak Ridge National Laboratory Metering Case Study
7.1 Site Background
areas in the facility. Subsequent upgrades to data centers in either of these buildings have incorporated
the latest energy efficiency ideas that were implemented in the other building.
With the creation of DOE in the 1970s, ORNL’s mission broadened to include a variety of energy
technologies and strategies. Today the laboratory supports the nation with a peacetime science and
technology mission that is just as important as, but very different from, its role during the Manhattan
Project. ORNL is DOE’s largest science and energy laboratory.
8 Metering Protocol
Determination of the energy parameters for the data centers in Building 5600 requires an extensive array
of meters. The original data center’s electrical and cooling infrastructure received major upgrades in the
last few years to allow installation of major new supercomputers. As of June 30, 2010, Building 5600
houses the #1, #4, #20, and #36 most powerful supercomputers in the world. In addition, ORNL is
currently installing and starting up a new supercomputer to support the National Oceanic and Atmospheric Administration (NOAA).
The three-story CSB contains offices and a 2nd floor computer center that houses a typical data center in
one half of the area and a supercomputer in the other half. A large raised-floor computer center currently
houses two supercomputers on the first floor. The NOAA supercomputer will be the third supercomputer
on the first floor once completed. The CSB data centers are part of the larger facility that includes other
functions.
In the continuous upgrading process, extensive electric metering has been installed in the building (about 100 meters in all). Of these meters, 56 were required to determine the energy benchmarking metrics. Chilled water meters were also installed to measure chilled water flows to the first-floor and second-floor centers. The electrical data metering network is Eaton (Cutler-Hammer) PowerNet, and the chilled water meters are handled by the Johnson Controls Metasys building automation system (BAS). The two metering and control systems are not currently integrated; however, the infrastructure is expected to evolve toward a higher-level integrative enterprise solution, such as one based on OLE for Process Control (OPC).
8.1 Electric System and Metering
480Y/277V transformers are located inside the building, close to the computer rooms, to reduce distribution losses. ORNL power is measured every half-hour by TVA and aggregated to the hourly level. Building 5600 electric meters collect data at different intervals, but the data are likewise aggregated to hourly values.
The uninterruptible power supply (UPS) requirements were minimized to support only dual-corded systems such as disk drives, networking and communications equipment, and business applications. In the event of a power outage, UPS ride-through power is supplied until the backup generator can start, come on line, and provide power. One UPS unit has battery energy storage and one has flywheel energy storage; both units are double-conversion types.
The PowerNet power distribution metering and control system can be used to manage energy cost, analyze harmonics, view waveforms for transient power quality events, trend metered equipment usage, maximize use of available capacity, and more. The system has a total of 567 monitored points and provides integrated metering from the 161 kV distribution level down to the 208 V end-user level. ORNL has one of the largest PowerNet installations in the Southeast.
Overall, PowerNet is used as an engineering design and operation tool, for real-time data monitoring of the electrical infrastructure, for internal power billing, for power quality analysis and system monitoring, and during medium-voltage switching to verify system operation. Networking is via an Ethernet interface to several dedicated data servers in Building 5600.
8.2 Electric Data Measurement
Even with the high level of metering, estimates are made for some equipment consumption and losses. Power losses from transformers are measured on a spot-check basis, and estimates are then made for continual operation. UPS losses are estimated using load information and manufacturers' specification sheets. Likewise, lighting energy consumption throughout the computer rooms is estimated based on fixture counts, power per fixture, and operating control; lighting operation includes dimming to half power at night. The lighting power estimate is expected to be very close to actual.
The 56 meters throughout the system, along with the system estimates, are used to calculate the PUE for the datacenters. The metering plan point list and the calculation methodology for the first- and second-floor supercomputer centers in Building 5600 at ORNL are shown below. Dxxx stands for PowerNet device number ‘xxx’; all the devices in this list are electric meters.
+ 1% xfmr loss
+ D97-5600 RPD1
+ D98-5600 RPD2
+ D99-5600 RPD3
+ D100-5600 RPD4
+ D101-5600 RPD6
+ D105-5600 RPD7
+ D106-5600 RPD8
+ D107-5600 RPD9
+ D108-5600 RPD10
+ 2% xfmr loss
subtotal
add subtotals
ERP Local power units on the UPS
D74-5600 ERP7
+ D75-5600 ERP8
+ D76-5600 ERP9
+ D77-5600 ERP10
+ D78-5600 ERP13
+ 7% xfmr / UPS loss
Disk drives
D135-5600 PDU-UPS2-1A MTR
+ 7% xfmr / UPS loss
Cray XT fan power has to be measured separately
The method of calculating electric power for the chiller plants is described under the Chiller Plant
Measurements section of this report. Cooling unit (AC unit) electricity is metered via the local power
units.
The electric meters are electronic programmable meters with extensive capabilities, but after initial analysis of data variability and chiller plant performance variations, a decision was made to calculate the performance metrics on a daily basis. Currently, PUE is reported monthly, based on the metering protocol described here. All the required meters were programmed to log daily kWh readings, and the daily data are used to calculate the daily values of PUE for each month.
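A minimal sketch of the daily roll-up is shown below. The meter groupings and kWh values are hypothetical placeholders; only the general structure (metered technical power, estimated transformer and UPS losses, chiller plant electricity from kW/ton and ton-hr, metered cooling unit power, and estimated lighting) follows the protocol described above.

    # Sketch of the daily PUE roll-up from logged daily kWh values.
    # All numbers are hypothetical placeholders; only the structure follows the
    # protocol described above.

    technical_kwh = 61_000.0                 # sum of the IT (technical) power meters
    xfmr_loss_kwh = 0.02 * technical_kwh     # estimated transformer losses
    ups_loss_kwh = 0.06 * 9_000.0            # ~6% of the UPS-supported load
    chiller_kwh = 0.75 * 19_000.0            # kW/ton x ton-hr/day for chillers 4 and 5
    crac_kwh = 2_500.0                       # cooling units metered via local power units
    lighting_kwh = 400.0                     # estimated from fixture counts and dimming

    total_facility_kwh = (technical_kwh + xfmr_loss_kwh + ups_loss_kwh
                          + chiller_kwh + crac_kwh + lighting_kwh)
    print(f"daily PUE = {total_facility_kwh / technical_kwh:.2f}")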
8.3 Data Center Room Cooling and Measurement
Data storage and networking systems do receive underfloor air cooling, but minimal underfloor air is
provided to the Jaguar and Kraken systems with the exception of 16 air cooled cabinets in the center of
the Jaguar system rows.
In the second floor center, underfloor air is provided to the previous-generation Cray systems, but the
other systems in the room do not have special cooling systems, so they are cooled by air from the
computer room AC units. The 2nd floor Cray cooling system utilizes supply air temperature control
instead of return air temperature control to ensure a constant temperature is delivered to the Cray systems.
This has reduced fan horsepower requirements and nearly eliminated issues with
hot spots.

To acquire a better understanding of the effects that the Cray XT5 fan power has on the PUE, manual frequency readings are taken on each of the 200 cabinets. The fan power consumption is characterized as a function of frequency, with the fan motor variable-speed frequencies being measured at each computer cabinet. The fan power relationship is shown in the table below; linear interpolation is used to calculate between points.

XT5 fan power
Hz    kW
40    1.2
45    1.6
50    2.2
55    2.8
60    3.7
65    4.7
Cray indicates that the fan frequencies are controlled by the inlet air temperature. Because the fans respond to inlet air temperature, and because most cabinets are cooled by the XDP system, it is assumed that the level of computing work does not greatly affect the frequencies. Total XT5 fan power for the supercomputers is obtained by ratioing the representative measured Jaguar fan power for its 200 cabinets up to 288 cabinets to include Kraken, as illustrated in the sketch below.
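The sketch below shows the interpolation and the 200-to-288 cabinet ratio; the per-cabinet frequency readings are hypothetical.

    # Sketch of the XT5 fan power estimate: linearly interpolate the measured
    # (Hz, kW) points for each cabinet frequency reading, sum over the 200
    # Jaguar cabinets, and ratio up to 288 cabinets to include Kraken.
    # The per-cabinet frequency readings below are hypothetical.

    FAN_TABLE = [(40, 1.2), (45, 1.6), (50, 2.2), (55, 2.8), (60, 3.7), (65, 4.7)]

    def cabinet_fan_kw(freq_hz):
        """Linear interpolation between the measured (Hz, kW) points."""
        for (f0, p0), (f1, p1) in zip(FAN_TABLE, FAN_TABLE[1:]):
            if f0 <= freq_hz <= f1:
                return p0 + (p1 - p0) * (freq_hz - f0) / (f1 - f0)
        raise ValueError("frequency outside the measured range")

    jaguar_freqs = [45] * 120 + [50] * 60 + [55] * 20    # 200 hypothetical readings
    jaguar_kw = sum(cabinet_fan_kw(f) for f in jaguar_freqs)
    total_kw = jaguar_kw * 288 / 200                     # scale up to include Kraken
    print(f"Jaguar fans: {jaguar_kw:.0f} kW, Jaguar + Kraken: {total_kw:.0f} kW")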
8.4 Chilled Water System and Metering
The chilled water system currently has an interconnection with the MRF complex. Only a limited amount of cooling can be supplied from the MRF complex systems, but the backup capability is important, and the chiller plants in the MRF complex are newer and more efficient. The chiller plants are controlled by the Building Automation System (BAS), which also controls the data center cooling overall. The next graphic shows the BAS main chiller plant display for Building 5600 as an example.
Chillers 1, 2, and 3 are the original 1200-ton chillers for the building (2002 vintage). Chillers 4 and 5 were added as part of the 2003–2008 upgrade (2006 vintage); because they are more efficient, they are the primary units and run at nearly full capacity most of the year. In the winter, the balance of the
5600-5700-5800 complex cooling load can be provided by the chiller input from the MRF complex
plants, which are newer and more efficient than Chillers 4 and 5. Thus, during the winter, Chillers 1, 2,
and 3 are not run, and total cooling load for the data centers and the balance of the 5600-5700-5800
complex is provided by chillers 4 and 5 and the MRF complex interconnection. A diagram of data center
chilled water flows and meters is shown below.
8.5 Chiller Plant Measurements
The cumulative ton-hr values are totalized for the key quantities needed to calculate chilled water electricity, and the chilled water electricity data are used in calculating the energy metrics.
Chillers 1, 2, and 3 have the lowest operating efficiency and thus are run as little as possible, typically only in the summer to meet overall building loads. Chillers 4 and 5 run almost continuously to
meet the data center loads. An interconnection to the MRF complex provides some chilled water that
typically is about equal to the general building loads much of the year. Chillers 4 and 5 are more efficient
than chillers 1, 2, and 3, and the MRF Chiller Complex is more efficient than any of the CSB chillers.
Chilled water energy metering currently provides a limited set of data points.
Chilled water energy from chillers 1, 2, and 3 is not metered, but future plans include installation of these
energy meters. The protocol for calculating chilled water energy uses chillers 4 and 5, together with the
balance of plant installed with chillers 4 and 5 as representative of the total plant energy use for all chilled
water delivered to the data centers. The chilled water interconnection from MRF is expected to be closed
when the 5800 chiller plant comes on line and the current CSB chiller plant serves only data center loads.
Electricity use for chillers 4 and 5 (including related tower, pump, and peripheral electricity) is measured separately. The daily kW/ton for chillers 4 and 5 and peripherals is calculated as the total daily kWh divided by the total daily ton-hr delivered.
Chillers 4 and 5 and peripherals in the CSB at ORNL have a daily kW/ton of 0.62–0.75 in cold to mild
weather, and 0.75–0.85 in hotter weather. The annual average appears to be around 0.75. Climate
adjustment might make the climate-normalized value about 0.72. Daily chiller plant electricity for each
of the data centers is calculated as: kW/ton x ton-hr/day.
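The following sketch shows this attribution arithmetic with hypothetical values for the plant electricity and the metered ton-hr delivered to each data center.

    # Sketch of the chilled water electricity attribution: a daily kW/ton for
    # chillers 4 and 5 plus peripherals, multiplied by the ton-hr delivered to
    # each data center. All values are hypothetical.

    plant_kwh_per_day = 14_300.0     # metered chillers 4 and 5, towers, pumps, peripherals
    plant_tonhr_per_day = 19_000.0   # total ton-hr delivered by the plant
    kw_per_ton = plant_kwh_per_day / plant_tonhr_per_day    # ~0.75 in this example

    datacenter_tonhr = {"first floor": 12_000.0, "second floor": 3_500.0}
    for name, tonhr in datacenter_tonhr.items():
        chiller_kwh = kw_per_ton * tonhr                    # kW/ton x ton-hr/day
        print(f"{name}: {chiller_kwh:.0f} kWh/day of chiller plant electricity")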
Pumping power for the refrigerant (XDP) cooling system is not measured separately and therefore remains in the “technical power” total. This approach brackets the performance metrics with a high and a low value: including the fans in the “technical power” suggests higher performance than actual, while removing the fan energy completely suggests lower performance than actual.
9 First-Floor Performance Measurement Results
Transformer losses have been measured on a spot basis and estimates should be close to actual. UPS
losses are estimated at 6% of UPS load. Lighting energy is also estimated, based on fixture counts, power
per fixture, and operating control. Estimated lighting power is expected to be close to actual.
A breakout of the July 2009 energy use is shown in the pie chart. Energy for Jaguar and Kraken includes refrigerant cooling system energy and XT5 fan energy.
The key performance metrics are summarized in the table below for three cases: XT5 fan power included in total technical power, XT5 fan power excluded completely, and a reasonable middle ground. The XT5 fan power is about 7.5% of total electricity use. Because supercomputer cabinet fan power is normally included in the technical power when calculating PUE, the “reasonable” case for the first-floor data center in Building 5600 is bracketed by the other two cases in the table below. Since
the XDP cooling system power is still included in the technical power total, a PUE = 1.33 appears most
appropriate. PUE is expected to decrease in cold weather as cooling tower related energy consumption
drops.
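The bracketing arithmetic can be illustrated with the short sketch below. The values are hypothetical, and the treatment of the "fans excluded completely" case (removing the fan energy from both the facility total and the technical power) is one plausible reading of that case, not the exact method used to produce the table.

    # Sketch of the bracketing cases (hypothetical values). The "fans excluded"
    # case here removes the fan energy from both the facility total and the
    # technical power; this is one plausible reading, not the exact method used
    # to produce the table above.

    technical_with_fans_kwh = 61_000.0
    overhead_kwh = 19_000.0                      # losses, chiller plant, CRACs, lighting
    total_kwh = technical_with_fans_kwh + overhead_kwh
    fan_kwh = 0.075 * total_kwh                  # fan power, roughly 7.5% of total use

    pue_fans_in = total_kwh / technical_with_fans_kwh
    pue_fans_out = (total_kwh - fan_kwh) / (technical_with_fans_kwh - fan_kwh)

    print(f"fans included in technical power: PUE = {pue_fans_in:.2f}")
    print(f"fans excluded completely:         PUE = {pue_fans_out:.2f}")
    # The "reasonable" case falls between these two values.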
10 Second-Floor Performance Measurement Results
The second-floor computer center houses a Cray XT4 cabinet system that is still the #20 supercomputer in the world as of June 2010. The second floor also houses
the enterprise data systems for ORNL.
All computer systems are air-cooled. XT4 fan energy is included in the technical power for the
breakdown below. Lighting energy is estimated for this data center also, based on fixture counts, power
per fixture, and operating control. The lighting power estimate is expected to be close to actual.
Transformer losses and UPS losses are also estimated, similar to the estimates for the first-floor data
center.
A breakout of 2nd-floor data center power is shown in the pie chart. The key performance metrics for the entire 2nd-floor center are summarized in the table below for three cases: XT4 fan power included in total technical power, XT4 fan power excluded completely, and a reasonable middle ground. The XT4 fan power is estimated, based on the measurements for the XT5 cabinets, at 4.6% of
total electricity use for the 2nd-floor datacenter. Similar to the first floor data center, when calculating
PUE, the “reasonable” case is bracketed by the other two cases in the table below. A PUE = 1.41 appears
most appropriate for the second floor. PUE is expected to decrease in cold weather as cooling tower
related energy consumption drops.
11 Future Plans for the Data Centers
Performance metrics will continue to be calculated and reported monthly, based on daily measurements of electric use from more than 50 electric meters and multiple chilled water energy meters. Internal management considers the current ability to report PUE only on a monthly basis less than desirable. There is a need to generate these values on a more real-time basis and to display the results on an electronic dashboard. Even if instantaneous values are not an option, a time-delayed display would still be beneficial; a lag of a couple of hours would still be useful for diagnosing system performance.
The ability to report real-time results will require major improvements in database configurations to allow consistent data queries at regular intervals. Such queries are not possible with the current database configurations for the electric metering and building automation systems. ORNL is actively seeking control products that would allow the integration of electric (PowerNet) and thermal (Metasys) data into one common source, which is expected to allow better control of the various datacenter support systems.
Efforts are being made to evaluate various energy technologies. For hot, humid climates such as Tennessee's, it would be desirable if computer cabinets could be made to function with only cooling tower water to cool the computers (or to cool an intermediate ebullient system). Some room air cooling would still be needed to keep room dewpoints at acceptable levels. Thus far, the design of such an approach remains challenging. LED lighting technology will also be tested at ORNL in the near future; small pilot testing will take place before deciding on large-scale implementation in the datacenters. Deployment of the technology is expected to reduce energy consumption and alleviate maintenance issues.
Another large push is being made to find a control system that will optimize chiller plant efficiency. It is
believed that there are large potential savings in being able to properly control and stage various cooling
equipment on and off depending upon IT load and outdoor weather. ORNL is actively searching out
various options to allow for continuous optimization of chiller plant operation.
In addition, ORNL is studying the benefits of using non-OEM metering to address issues experienced with the standard meters installed in OEM equipment. Facilities are generally plagued by problems with the standard metering that comes with purchased equipment, including inaccessibility, poor performance, and calibration difficulties. Future studies will highlight issues and lessons learned at ORNL during the installation and pilot testing of two new meters being installed on power distribution panels.
The computing industry continues to experience high rates of change, and DOE supercomputer data
centers also see high rates of change. The information in this report is intended to help others consider
possible means of handling data center design, to document the energy performance metrics for the two
data centers in Building 5600, and also to understand how DOE’s largest data center operates.
12 References
ASHRAE, 2007 ASHRAE HVAC Applications. American Society of Heating, Refrigerating and Air-
Conditioning Engineers, Inc., 2007. Chapter 40 – Building Energy Monitoring.
ASHRAE, 2009 ASHRAE Handbook – Fundamentals. American Society of Heating, Refrigerating and
Air-Conditioning Engineers, Inc., 2009. Chapter 36 – Measurement and Instrumentation.
ASHRAE & The Green Grid, Real-Time Energy Consumption Measurements in Data Centers. American
Society of Heating, Refrigerating and Air-Conditioning Engineers, Inc., 2009.
ASTM E 1465: Standard Guide for Developing Energy Monitoring Protocols for Commercial and
Institutional Buildings or Facilities. ASTM International, 2005.
European Commission, European Union Code of Conduct for Data Centres, Version 1.0, 2008.
http://re.jrc.ec.europa.eu/energyefficiency/html/standby_initiative_data%20centers.htm
MacDonald, M., Energy Performance of ORNL Supercomputer Data Centers in Building 5600. FEMP.
December 2009.
MacDonald, J.M., Sharp, T.R., Getting, M.B., A protocol for monitoring energy efficiency improvements
in commercial and related buildings. ORNL/Con-291, 1989.
NREL, Best Practices Guide for Energy-Efficient Data Center Design. FEMP. February 2010.
United States Department of Energy, Strategic Sustainability Performance Plan. September 2010.
United States Environmental Protection Agency, Report to Congress on Server and Data Center Energy
Efficiency, Public Law 109-431, ENERGY STAR Program, 2007.
http://www1.eere.energy.gov/femp/pdfs/epa_dc_report_congress.pdf
Appendix A: Other Data Center Metrics and Benchmarking
TGG has published several white papers on performance metrics for data centers. These and related
metrics are discussed below. Further metrics information can be found in the FEMP publication, Best
Practices Guide for Energy-Efficient Data Center Design.
Cooling System Efficiency
This metric indicates how efficiently the cooling plant delivers cooling, expressed as the average cooling system power per ton of cooling provided.
Cooling System Efficiency = Average Cooling System Power (kW) / Average Cooling Load (ton)
Benchmark Values -
Airflow Efficiency
This metric provides an understanding of how efficiently air is moved through a data center.
Airflow Efficiency = Total Fan Power (W) / Total Fan Airflow (cfm)
Benchmark Values -
Heating, Ventilation and Air-Conditioning (HVAC) System Effectiveness
This metric is simply the ratio of the annual IT equipment energy consumption to the annual HVAC system energy consumption.
HVAC System Effectiveness = Annual IT Equipment Energy (kWh) / Annual HVAC System Energy (kWh)
Benchmark Values –
Rack Cooling Index (RCI)
The Rack Cooling Index measures how well equipment intake air temperatures are kept within the recommended range: RCI HI reflects intake temperatures above the maximum recommended value, and RCI LO reflects intake temperatures below the minimum recommended value. The calculation uses the following quantities:
T x = Mean temperature at equipment intake x
n = Total number of intakes
Benchmark Values -
RCI HI = 100%       No temperature above the maximum recommended
RCI LO = 100%       No temperature below the minimum recommended
RCI HI/LO < 90%     Often considered poor operation
Return Temperature Index (RTI)
The Return Temperature Index compares the temperature drop across the air handlers to the temperature rise across the IT equipment, indicating the degree of bypass or recirculation of supply air.
RTI = (∆T AHU / ∆T EQUIP) x 100%
Where,
∆T AHU = Typical air handler temperature drop (airflow weighted)
∆T EQUIP = Typical IT equipment temperature rise (airflow weighted)
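The Appendix A metrics that are fully defined above can be computed directly from measured values, as in the hypothetical Python sketch below.

    # Sketch computing the metrics defined above from hypothetical measurements.

    avg_cooling_power_kw = 900.0
    avg_cooling_load_ton = 1_200.0
    cooling_system_eff = avg_cooling_power_kw / avg_cooling_load_ton     # kW/ton

    total_fan_power_w = 150_000.0
    total_fan_airflow_cfm = 600_000.0
    airflow_eff = total_fan_power_w / total_fan_airflow_cfm              # W/cfm

    annual_it_kwh = 21_000_000.0
    annual_hvac_kwh = 7_000_000.0
    hvac_effectiveness = annual_it_kwh / annual_hvac_kwh

    delta_t_ahu = 18.0      # typical air handler temperature drop (airflow weighted)
    delta_t_equip = 20.0    # typical IT equipment temperature rise (airflow weighted)
    rti_percent = 100.0 * delta_t_ahu / delta_t_equip                    # RTI

    print(f"Cooling system efficiency: {cooling_system_eff:.2f} kW/ton")
    print(f"Airflow efficiency:        {airflow_eff:.2f} W/cfm")
    print(f"HVAC system effectiveness: {hvac_effectiveness:.1f}")
    print(f"Return Temperature Index:  {rti_percent:.0f}%")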
EERE Information Center
1-877-EERE-INFO (1-877-337-3463)
www.eere.energy.gov/informationcenter