POWER7 Processors: The Beat Goes On: Joel M. Tendler, Executive IT Architect
[Figure: POWER processor roadmap. Processors shown: POWER1 (AMERICA's), RSC, POWER2, P2SC, POWER3 (630), POWER4 (Dual Core), POWER5 (SMT), POWER6 (Ultra High Frequency), POWER7 (Multi-core); 601, 603, 604e; Muskie A35, Cobra A10 (64 bit); RS64I Apache, RS64II North Star, RS64III Pulsar, RS64IV Sstar. Process technology labels range from 1.0um to 130nm.]
Major POWER® Innovation
-1990 RISC Architecture
-1994 SMP
-1995 Out of Order Execution
-1996 64 Bit Enterprise Architecture
-1997 Hardware Multi-Threading
-2001 Dual Core Processors
-2001 Large System Scaling
-2001 Shared Caches
-2003 On Chip Memory Control
-2003 SMT
-2006 Ultra High Frequency
-2006 Dual Scope Coherence Mgmt
-2006 Decimal Float/VSX
-2006 Processor Recovery/Sparing
-2009 Balanced Multi-core Processor
-2009 On Chip EDRAM
- Dynamic RAM
  Uses MOSFETs and capacitors
  Charge, i.e., "0" or "1", decays over time (capacitor discharges), losing its state
  To be useful, the cell needs periodic refreshing (dynamic)
  Volatile memory (data lost when memory is not powered)
- Each cell stores 1 bit
  Memory cell: 1xMOSFET + 1xCapacitor
- Static RAM
  Uses transistors only, e.g., MOSFETs
- Balanced Design
  Multiple optimization points
  Improved energy efficiency
  RAS improvements
- Improved Thread Performance
  Dynamic allocation of resources
  Shared L3
- Increased Core parallelism
  4 Way SMT
  Aggressive out of order execution
- Extreme Increase in Socket Throughput
- System
  Scalable interconnect
  Reduced coherence traffic
* Statements regarding SMP servers do not imply that IBM will introduce a system with this capability.
[Figure: "Balanced View" charts comparing POWER6 and POWER7 at the Thread, Core, Socket, and 32 Chip System levels; System Thruput plotted against Single Thread Performance, with axis ticks from 1 to 10000.]
Graphs for illustration purposes only (Not actual data)
POWER7: Core
- Execution Units
  2 Fixed point units
  2 Load store units
  4 Double precision floating point
  1 Vector unit
  1 Branch
  1 Condition register
  1 Decimal floating point unit
  6 Wide dispatch/8 Wide Issue
- Recovery Function Distributed
- 1,2,4 Way SMT Support
- Out of Order Execution
- 32KB I-Cache
- 32KB D-Cache
- 256KB L2
  Tightly coupled to core
[Figure: core floorplan with regions labeled DFU, ISU, VSX, FPU, FXU, IFU, CRU/BRU, LSU, and 256KB L2.]
[Figure: Microprocessor as an Integrated Circuit, with InputData, OutputData, and Addressable memory.]
POWER7™ is an 8-core, high-performance server chip. A solid chip is a good start.
But to win the race, you need a balanced system. POWER7 enables that balance.
Multi-core evolution
- Economics
[Figure: a 2 to 4 socket, 16 to 32-way SMP Server; 2-core chips in a 2 to 4 socket, 4 to 8-way SMP Server and in an 8 to 32 socket, 16 to 64-way SMP Server.]
* Statements regarding SMP servers do not imply that IBM will introduce a system with this capability.
Trends in Server Evolution
A simple matter of riding the multi-core trend?
- Add more cores to the die, beef up some interfaces, and scale to a large SMP?
Not so simple:
- Emerging entry servers have characteristics similar to large SMP servers
Achieving solid virtual machine performance requires a Balanced System Structure.
Enabled by:
- Technology
- Innovation
Driven by:
- IT Evolution
[Figure: server evolution over Time, from Single Image to Virtualized/Cloud. Traditional Entry Server (Single Image Platform): 2-core chips, 2 to 4 socket, 4 to 8-way SMP Server. Traditional High-End Server (Virtualized Consolidation Platform): 2-core chips, 8 to 32 socket, 16 to 64-way SMP Server. Emerging Entry Server (Virtualized/Cloud Platform): 8-core chips, 16 to 32-way SMP Server. The emerging entry server and the traditional high-end server are marked "Similar Challenge".]
* Statements regarding SMP servers do not imply that IBM will introduce a system with this capability.
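The server sizes on this slide are consistent with counting one "way" per core, so the way count is cores per chip times sockets. A small Python sketch of that arithmetic follows. The function name `smp_ways` is hypothetical, and the 2 to 4 socket range used for the 8-core entry server is an assumption, since the figure states only its 16 to 32-way size.

```python
def smp_ways(cores_per_chip: int, min_sockets: int, max_sockets: int) -> tuple[int, int]:
    """Return the (min, max) SMP way count, assuming one way per core."""
    return (cores_per_chip * min_sockets, cores_per_chip * max_sockets)


# (cores per chip, min sockets, max sockets) for each server class in the figure.
servers = {
    "Traditional Entry Server": (2, 2, 4),
    "Traditional High-End Server": (2, 8, 32),
    "Emerging Entry Server": (8, 2, 4),  # socket range assumed, see lead-in
}

for name, (cores, lo, hi) in servers.items():
    low_ways, high_ways = smp_ways(cores, lo, hi)
    print(f"{name}: {low_ways} to {high_ways}-way")
```

Under these assumptions the 8-core entry server spans 16 to 32 ways, which overlaps the 16 to 64-way range of the 2-core high-end server. That overlap is one way to read the "Similar Challenge" label.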
Trends in Server Evolution
[Figure: server evolution from Single Image to Virtualized/Cloud to UltraScale Cloud. Emerging Entry Server: Virtualized/Cloud Platform. Emerging High-End Server: UltraScale Cloud Platform. 2-core chips: 2 to 4 socket, 4 to 8-way SMP Server, and 8 to 32 socket, 16 to 64-way SMP Server.]
* Statements regarding SMP servers do not imply that IBM will introduce a system with this capability.
[Figure: Working Set Footprints, marked Private and Shared, mapped onto a Large, Shared 32M L3 Cache.]
[Figure: Large, Shared 32M L3 Cache divided into Fast, Local L3 Regions; working sets marked Private, Shared, and Cloned.]
POWER7 Requirements
- Core:
  10GB/s to 20GB/s sustained memory bandwidth per core
  16GB to 32GB of cache
- Socket:
  4 times growth in memory bandwidth & capacity
- System:
  Packaging more memory into similar volume, with similar energy and cooling constraints
[Figure labels: Core; 10-20GB/s sustained bandwidth per core; 16-32GB storage per core; Energy constraints.]
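To see what the per-core targets imply at the socket level, the following Python sketch multiplies them by the 8 cores per chip stated earlier in the deck. It assumes the per-core requirements scale linearly with core count and uses the figure's "16-32GB storage per core" label for capacity; the helper name `per_socket` is hypothetical, and the results are illustrative arithmetic, not figures from the slides.

```python
CORES_PER_CHIP = 8  # "8-core" chip, as stated in the deck


def per_socket(per_core_range: tuple[int, int], cores: int = CORES_PER_CHIP) -> tuple[int, int]:
    """Scale a (low, high) per-core requirement to a whole socket (linear scaling assumed)."""
    low, high = per_core_range
    return (low * cores, high * cores)


bandwidth_gb_s = per_socket((10, 20))  # sustained memory bandwidth, GB/s
storage_gb = per_socket((16, 32))      # storage per core from the figure label, GB

print(f"Bandwidth per socket: {bandwidth_gb_s[0]} to {bandwidth_gb_s[1]} GB/s")
print(f"Storage per socket:   {storage_gb[0]} to {storage_gb[1]} GB")
```

Under these assumptions a socket needs 80 to 160 GB/s of sustained bandwidth and 128 to 256 GB of storage.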
4) DDR3 DRAMs
- Supports 800, 1066, 1333, and 1600
* Statements regarding memory subsystem features do not imply that IBM will introduce a system with these capabilities.
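As a rough guide to what the listed DDR3 speeds mean, the Python sketch below converts each one to a peak transfer rate. It assumes the numbers are data rates in mega-transfers per second and that each transfer moves 8 bytes, as on a standard 64-bit DDR3 DIMM data bus. The slide states neither the units nor the width of POWER7's memory channels, so these are generic DDR3 figures, not POWER7 specifications.

```python
BYTES_PER_TRANSFER = 8  # assumption: standard 64-bit DDR3 DIMM data bus


def peak_gb_per_s(mega_transfers_per_s: int) -> float:
    """Peak bandwidth in GB/s for one 64-bit DDR3 channel at the given data rate."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


for rate in (800, 1066, 1333, 1600):
    print(f"DDR3-{rate}: {peak_gb_per_s(rate):.1f} GB/s peak per 64-bit channel")
```

Doubling the data rate from 800 to 1600 doubles the peak rate from 6.4 GB/s to 12.8 GB/s per channel; sustained bandwidth is lower than peak in practice.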
Using local and remote SMP links, up to 32 POWER7 chips are connected
[Figure: POWER6 High-End Server (Virtualized Consolidation Platform, 8 to 32 socket, 16 to 64-way SMP Server) compared with POWER7 High-End Server (UltraScale Cloud Platform, 8 to 32 socket, 64 to 256-way SMP Server), both with Global Scope Coherence Broadcast. Compute Throughput: 1X versus ~5X. Global Coherence Throughput: 320 GB/s versus 450 GB/s.]
* Statements regarding SMP servers do not imply that IBM will introduce a system with this capability.
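The gap this figure illustrates can be put into numbers with a short Python sketch. It reads the chart labels as POWER6 at 1X compute and 320 GB/s of global coherence throughput, and POWER7 at about 5X compute and 450 GB/s; that pairing is an interpretation of the figure, and the variable names are hypothetical.

```python
# Values as read from the figure labels (interpretation, see lead-in).
power6 = {"compute": 1.0, "coherence_gb_s": 320.0}
power7 = {"compute": 5.0, "coherence_gb_s": 450.0}  # "~5X" treated as exactly 5

compute_growth = power7["compute"] / power6["compute"]
coherence_growth = power7["coherence_gb_s"] / power6["coherence_gb_s"]

# Coherence throughput available per unit of compute, relative to POWER6.
coherence_per_compute = coherence_growth / compute_growth

print(f"Compute growth:        {compute_growth:.2f}x")
print(f"Coherence growth:      {coherence_growth:.2f}x")
print(f"Coherence per compute: {coherence_per_compute:.2f}x of POWER6")
```

With these readings, global broadcast coherence throughput grows about 1.4x while compute grows about 5x, leaving roughly 0.28x the coherence throughput per unit of compute. This is the kind of imbalance that a reduced-traffic coherence scheme would need to address.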
[Figure: Nodal Scope Speculative Coherence Broadcast. 8 to 32 socket, 16 to 64-way SMP Server compared with 8 to 32 socket, 64 to 256-way SMP Server.]
- Chip Performance Improved Greater than 4X:
  High performance on chip interconnect
  Improved storage architecture
  Dual high speed integrated memory controllers
- System
  Achieves extreme Multi-core throughput while providing Balance and SMP scaling by building on a foundation of solid innovation
  Advanced SMP links will provide near linear scaling for larger POWER7 systems.
[Figure: Chip Performance, POWER6 versus POWER7 SMT4, for Floating Pt., Integer, and Commercial workloads.]
* Performance estimates relate to processor only and should not be used to estimate projected server performance.
Reduce frequency to core
Caches and TLB remain coherent
Fast wake-up
[Figure labels: Wake-Up Latency, Sleep.]
Leverages excess energy capacity from:
  Non worst case workloads
  Idle cores
- Processor and Memory Energy Usage can be independently Balanced.
  Real time hardware performance monitors used.
  On board power proxy logic estimates power
- Power Capping Support
  Allows budgeting of power to different parts of system
[Figure: AC Power versus Load Level (%), from 100 down to 0, with Vmin marked.]
[Figure: bar chart with a 0 to 10 scale comparing Win2000, Win2003, RHEL, SOLARIS, HP-UX, SUSE, and AIX.]
The Yankee Group “2007-2008 Global Server Operating Systems Reliability Survey” as quoted in “Windows Server: The New King of Downtime” by Mark
Joseph Edwards at www.windowsitpro.com/article/articleid/98475/windows-server-the-new-king-of-downtime.html, March 5, 2008 and in
http://www.sunbeltsoftware.com/stu/Yankee-Group-2007-2008-Server-Reliability.pdf
ITIC Survey says Power Systems with AIX deliver 99.997% uptime
- 54% of IT executives and managers say that they require 99.99% or better availability for their applications
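To put the availability percentages above in concrete terms, this Python sketch converts them to expected downtime per year. It assumes a 365-day year and that availability is measured over all hours of the year; the function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Expected minutes of downtime per year at the given availability."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


for pct in (99.99, 99.997):
    print(f"{pct}% availability: about {downtime_minutes_per_year(pct):.1f} minutes/year")
```

Under these assumptions, 99.99% availability allows about 52.6 minutes of downtime per year, and 99.997% allows about 15.8 minutes.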
Core Recovery
- Leverage speculative execution resources to enable recovery
- Error detected in GPRs, FPRs, VSR: flushed and retried
- Stacked latches to improve SER
L3 eDRAM
- ECC protected
- SUE handling
- Line delete
- Spare rows and columns
GX IO Bus
- ECC protected
- Hot add
InfiniBand® Interface
- Redundant paths

- 64 Byte ECC on Memory
  Corrects full chip kill on X8 dimms
  Spare X8 devices implemented
- Dual memory chip failures do not cause outage
- Selective memory mirror capability to recover partition from dimm failures
- Hardware assisted scrubbing
- SUE handling
- Dynamic sparing on channel interface
- PowerVM Hypervisor protected from full DIMM failures
[Figure labels: BUF, X8 Dimms, IO Hub, PCI Bridge, PCI Adapter.]
* Statements regarding SMP servers do not imply that IBM will introduce a system with this capability.
Summary
Power Systems™ continue strong
7th Generation Power chip:
  Balanced Multi-Core design
  EDRAM technology
  SMT4
Greater than 4X performance in similar power envelope as previous generation
Scales to 32 socket, 1024 threads balanced system
Building block for peta-scale PERCS project
Achieves extreme Multi-core throughput while providing Balance and SMP scaling by building on a foundation of solid innovation
[Figure: Power7 High Volume Card]
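The 1024-thread figure is consistent with the other numbers in the deck: 32 sockets, 8 cores per chip, and 4-way SMT. A quick Python check, assuming one chip per socket (variable names are illustrative):

```python
SOCKETS = 32          # "Scales to 32 socket"
CORES_PER_CHIP = 8    # "8-core" chip
THREADS_PER_CORE = 4  # SMT4

# Assumes one POWER7 chip per socket.
total_cores = SOCKETS * CORES_PER_CHIP
total_threads = total_cores * THREADS_PER_CORE

print(f"{total_cores} cores, {total_threads} hardware threads")
```

This gives 256 cores, matching the 256-way upper bound quoted for the high-end server, and 1024 hardware threads.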
Trademarks
The following are trademarks of the International Business Machines Corporation in the United States, other countries, or both.
Not all common law marks used by IBM are listed on this page. Failure of a mark to appear does not mean that IBM does not use the mark nor does it mean that the product is not
actively marketed or is not significant within its relevant market.
Those trademarks followed by ® are registered trademarks of IBM in the United States; all others are trademarks or common law marks of IBM in the United States.
*, AS/400®, e business(logo)®, DBE, ESCO, eServer, FICON, IBM®, IBM (logo)®, iSeries®, MVS, OS/390®, pSeries®, RS/6000®, S/30, VM/ESA®, VSE/ESA,
WebSphere®, xSeries®, z/OS®, zSeries®, z/VM®, System i, System i5, System p, System p5, System x, System z, System z9®, BladeCenter®
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.
Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel
Corporation or its subsidiaries in the United States and other countries.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office.
IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency, which is now part of the Office of Government Commerce.
* All other products may be trademarks or registered trademarks of their respective companies.
Notes:
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will
experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed.
Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual
environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.
This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without
notice. Consult your local IBM business contact for information on the product or services available in your area.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance,
compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.