
Power Optimization Approach of ORCA Processor For 32/28nm Technology Node



Davit Babayan,
National Polytechnic University of Armenia
Synopsys Armenia CJSC
Yerevan, Armenia
e-mail: davitb@synopsys.com

ABSTRACT

This paper presents a power optimization method implemented on the RISC-architecture ORCA processor using power gating, aimed at a significant reduction of leakage power consumption. When used in combination with the multi-voltage power reduction method, the presented approach significantly decreases both the dynamic and the leakage power of the ORCA processor.

1. INTRODUCTION

The ORCA processor is a 32-bit microprocessor core. It has two main interfaces: a PCI interface and a source-synchronous double data rate (DDR) interface for SDRAM. The sub-block CLOCK_GEN contains two PLLs (Phase Locked Loops) and a clock multiplier for the functional clocks (Fig.1). The two PLLs cancel the clock tree insertion delay for the PCI I/O interface timing and for the SDRAM input interface timing. The sub-block RESET_BLOCK contains synchronizing reset circuitry for the global asynchronous prst_n signal; this circuitry is used in functional mode but bypassed in test mode. The SDRAM bus is capable of addressing PC266 type memory. The DDR data bus is synchronous with both edges of the incoming and outgoing clocks. The processor core consists of a high-speed RISC machine with a power save mode. The BLENDER block is shut down during power save mode and RISC_CORE is slowed down to half its frequency. All asynchronous interfaces between clock domains are isolated with dual-port FIFOs [3].

[Figure: block diagram of ORCA_TOP with the blocks PARSER, PCI_CORE, PCI_RFIFO, PCI_WFIFO, RISC_CORE, BLENDER, SDRAM_IF, SDRAM_RFIFO, SDRAM_WFIFO and RESET_BLOCK, the PCI and DDR buses, the clocks, and the reset signals prst_n, pci_rst_n, sdram_rst_n, sys_rst_n and sys_2x_rst_n]
Fig.1. ORCA TOP (functional block diagram)
Control: the PCI bus operates at 33 (0) or 66 (1) MHz; RISC_CORE operates at 200 (0) or 100 (1) MHz.

2. PREVIOUS RESEARCH

Previous research on ORCA processor power reduction used the multi-voltage method, with different supply voltages for different power domains (RISC core) [1]. As a result, power consumption decreased by about 15% compared with the standard design, at an area overhead of about 12%; timing characteristics were essentially unchanged (RISC core clock frequency 200 MHz).

Frequency                 200 MHz
Data required time        20.21 ns
Data arrival time         -20.20 ns
Slack (MET)               0.01 ns
Total power               75.46 mW (-15%)
Macro/Black Box area      16340.796387 µm²
Total cell area           661980.75374 µm² (+15%)
Total area                678321.550135 µm² (+12%)
Table1. Results of timing/power/area report with multi-voltage design method, ORCA/RISC core implementation

A detailed investigation of the ORCA processor structure showed that the RISC core contains more than 1000 registers and that about 60% of total power is spent on registers [2]. This suggested that power gating could be an efficient way to decrease the power of the RISC core. Replacing all registers with retention-type registers provides the power reduction, but at the same time increases area.

3. THE POWER GATING IMPLEMENTATION

Power gating is one of the main power reduction methods. Its implementation uses ISOLATION and RETENTION cells in the design. An ISOLATION cell usually consists of a 2-input NAND gate from the library and two transistors (a p-MOS connected to VDD and an n-MOS connected to ground), with the ENABLE signal connected to the gates of the transistors (Fig.2).

[Figure: isolation cell schematic with the labels VDD, EN, MP control transistor, VDDV, VSSV, !EN, MN control transistor and VSS]
Fig.2. ISOLATION cell structure
ISOLATION cells are placed around the borders of shut-down power domains; when the ENABLE signal is applied, they keep the outputs of the sub-block at a stable value during inactive mode [5].

During power-off (shut-down) mode, the state must be saved and then restored after wake-up. This is implemented with RETENTION registers (sometimes called SAVE/RESTORE registers) (Fig.3). These have a second, lower backup power supply (VDDG), which always stays active even when the main supply (VDD) is off.

[Figure: retention register schematic with the switchable supply (on/off VDD), the backup supply VDD_BACKUP, the pins CP, D, SI, SE, RR, LD, RS and Q, and the save, restore and Shut-Down signals]
Fig.3. RETENTION register structure

4. DESIGN PROCESS

The design flow of ORCA with the power gating method fits fully into the standard digital design flow with UPF integration, presented in Fig.4.

[Figure: flow chart SPECIFICATION -> Logic Design (DC) with LOW POWER INTEGRATION (UPF) -> Physical design (ICC) with LOW POWER INTEGRATION (UPF) -> STATIC TIMING ANALYSIS (PT)]
Fig.4. ORCA design steps with power gating method.

During implementation, power gating was chosen for the RISC sub-block because it contains both high- and low-performance parts. The design specification describes the differences between the two low-power optimization methods (power gating and multi-voltage design [1]). A Unified Power Format (UPF) description was developed for the power gating implementation in both the logic and physical design processes (Fig.5).

    ## CREATE POWER DOMAINS
    create_power_domain TOP
    create_power_domain RISC -elements RISC

    ## TOPLEVEL CONNECTIONS
    # VDD
    create_supply_port VDD
    create_supply_net VDD -domain TOP
    connect_supply_net VDD -ports VDD
    create_supply_net VDD -domain RISC -reuse
    # VSS
    create_supply_port VSS
    create_supply_net VSS -domain TOP
    create_supply_net VSS -domain RISC -reuse
    connect_supply_net VSS -ports VSS
    # VDDG
    create_supply_port VDDG
    create_supply_net VDDG -domain TOP
    create_supply_net VDDG -domain GPRS -reuse
    connect_supply_net VDDG -ports VDDG
    create_supply_net VDDGS -domain RISC

    ## PRIMARY POWER NETS
    set_domain_supply_net TOP -primary_power_net VDD -primary_ground_net VSS
    set_domain_supply_net RISC -primary_power_net VDDGS -primary_ground_net VSS

    ## RISC SETUP SWITCH
    create_power_switch risc_sw \
        -domain RISC \
        -input_supply_port {in VDDG} \
        -output_supply_port {out VDDGS} \
        -control_port {risc_sd PwrCtrl/risc_sd} \
        -on_state {state2002 in {risc_sd}}
    set_isolation risc_iso_out \
        -domain RISC \
        -isolation_power_net VDD -isolation_ground_net VSS \
        -clamp_value 1 \
        -applies_to outputs
    set_isolation_control risc_iso_out \
        -domain RISC \
        -isolation_signal PwrCtrl/risc_iso \
        -isolation_sense low \
        -location parent

    # RETAIN
    set_retention risc_ret -domain RISC \
        -retention_power_net VDDG -retention_ground_net VSS
    set_retention_control risc_ret -domain RISC \
        -save_signal {PwrCtrl/risc_restore low} \
        -restore_signal {PwrCtrl/risc_restore high}
    map_retention_cell risc_ret \
        -domain RISC \
        -lib_cells {RDFFNX1 RDFFARX2 }

    # ADD PORT STATE INFO
    add_port_state VDD -state {HV 0.95}
    add_port_state VDDG -state {LV 0.7}
    add_port_state risc_sw/out -state {LV 0.7} -state {OFF off}
    add_port_state VSS -state {GND 0}

    ## CREATE PST
    create_pst orca_pst -supplies {VDD VDDG VDDGS }
    add_pst_state function1 -pst orca_pst -state {HV LV LV }
    add_pst_state sleep -pst orca_pst -state {HV LV OFF }

Fig.5. Unified Power Format (UPF) for power gating

In the UPF diagram (Fig.6) two power domains are defined. ISOLATION cells were placed around the boundary of the chosen domain, and standard registers were replaced with RETENTION registers. Synthesis with UPF used the same design constraints as the multi-voltage design: clock frequencies (PCI clock at 75 MHz, system RISC clock at 200 MHz, SDRAM clock at 75 MHz) and a physical utilization of 30%. The power, timing and area values of the power gating and multi-voltage designs are shown in Table 2.
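The isolation and retention behaviour that the UPF of Fig.5 requests for the RISC domain can be illustrated with a small behavioural sketch. It is not the cell implementation from the SAED library. The class and function names are hypothetical. The save, shut-down, wake-up and restore ordering is an assumed typical sequence. The boolean `iso_active` abstracts the active-low isolation signal configured with `-isolation_sense low`.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RetentionRegister:
    """Hypothetical model of a retention flop.

    `value` lives on the switchable supply and is lost at shut-down;
    `shadow` lives on the always-on backup supply.
    """
    value: Optional[int] = 0
    shadow: int = 0
    powered: bool = True

    def save(self) -> None:
        # Copy the live state into the always-on shadow latch.
        if self.powered and self.value is not None:
            self.shadow = self.value

    def power_off(self) -> None:
        # Main supply removed: the live state becomes unknown.
        self.powered = False
        self.value = None

    def power_on(self) -> None:
        self.powered = True

    def restore(self) -> None:
        # Copy the shadow latch back into the live state after wake-up.
        if self.powered:
            self.value = self.shadow


def isolate(signal: Optional[int], iso_active: bool, clamp_value: int = 1) -> Optional[int]:
    """Domain-output isolation: pass-through normally, clamp while isolated."""
    return clamp_value if iso_active else signal


def sleep_cycle(reg: RetentionRegister) -> List[Tuple[str, Optional[int]]]:
    """Run one assumed save / shut-down / wake-up / restore sequence."""
    trace = [("active", isolate(reg.value, iso_active=False))]
    reg.save()
    reg.power_off()
    trace.append(("sleep", isolate(reg.value, iso_active=True)))
    reg.power_on()
    reg.restore()
    trace.append(("resumed", isolate(reg.value, iso_active=False)))
    return trace


if __name__ == "__main__":
    print(sleep_cycle(RetentionRegister(value=0)))
```

While the domain is off, the parent domain sees the clamp value 1 at the output instead of an unknown level, and the register content reappears after restore. A register that is not saved before shut-down comes back with an unknown value, which is why the standard registers are replaced with retention registers.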
Fig.6. Power gating UPF diagram for ORCA

                          Power gating               Multi-voltage
Frequency                 200 MHz                    200 MHz
Data required time        27.48 ns                   20.21 ns
Data arrival time         -24.86 ns                  -20.20 ns
Slack (MET)               2.62 ns                    0.01 ns
Total power               69.42 mW (~ -8%)           75.46 mW
Macro/Black Box area      16340.8 µm²                16340.8 µm²
Total cell area           807616.35 µm² (+22%)       661980.75 µm²
Total area                823956.35 µm² (~ +21%)     678321.55 µm²
Table2. Results of timing/power/area report with power gating design method and multi-voltage design method

The total power of the circuit was reduced by more than 8% compared with the multi-voltage design (Table 2), and by more than 23% compared with the standard design. However, the total area of the design increased by about 21%, mainly in register cell area (22%). The area increase is due to retention flip-flops being much bigger than standard flops, and to the additional isolation cells. The 200 MHz frequency is still supported (with an increase of 7 ns in input-to-output latency). Differences between the power gating, multi-voltage and standard designs are presented in Fig.7 for power and in Fig.8 for area.

[Figure: bar chart of POWER (axis from 0 to 90) for the power gating, multi voltage and standard designs]
Fig.7. Power consumption for power gating, multi-voltage and standard designs.

[Figure: bar chart of AREA (axis from 0 to 1000000) for the power gating, multi voltage and standard designs]
Fig.8. Die area of power gating, multi-voltage and standard designs.

5. CONCLUSION

Power gating is an efficient method of reducing the power consumption of the ORCA/RISC processor. Compared with the multi-voltage design [1], power gating reduces power by more than 8% under the same timing specification. Power gating is thus the more favorable method if the area increase can be neglected.

6. ACKNOWLEDGEMENTS

The design was implemented using the SAED 32/28nm EDK developed by Synopsys Armenia Educational Department, with the help of the Synopsys Design Compiler and IC Compiler tools made available by Synopsys Armenia Educational Department [4].

REFERENCES

[1] Melikyan V., Babayan D., Babayan E., Petrosyan P., Melkonyan V., 32/28 nm low power ORCA processor with multi-voltage supply, ICSMN-2015.
[2] Hart J. M., Cho H., Ge Y., Gruber G., Huang D., A 3.6 GHz 16-Core SPARC SoC Processor in 28 nm, IEEE Journal of Solid-State Circuits, Vol. 49, No. 1, January 2014.
[3] ORCA documentation, Synopsys Inc., 2008.
[4] Goldman R., Bartleson K., Wood T., Kranen K., Melikyan V., Babayan E., 32/28nm Educational Design Kit: Capabilities, deployment and future, 2013 IEEE Asia Pacific Conference on Postgraduate Research in Microelectronics and Electronics (PrimeAsia).
[5] Gourisetty V., et al., Low power design flow based on Unified Power Format and Synopsys tool chain, Interdisciplinary Engineering Design Education Conference (IEDEC), 2013 3rd, IEEE, 2013.
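As a cross-check on the relative figures quoted with Table 2, the short sketch below recomputes them from the raw totals reported for the power gating and multi-voltage designs. Only the table values come from the paper. The helper `percent_change` and the dictionary keys are illustrative names.

```python
def percent_change(new: float, ref: float) -> float:
    """Relative change of `new` with respect to `ref`, in percent."""
    return (new - ref) / ref * 100.0


# Raw values as reported in Table 2.
power_gating = {"power_mw": 69.42, "cell_area_um2": 807616.35, "total_area_um2": 823956.35}
multi_voltage = {"power_mw": 75.46, "cell_area_um2": 661980.75, "total_area_um2": 678321.55}

changes = {key: percent_change(power_gating[key], multi_voltage[key]) for key in power_gating}

# Slack is data required time minus data arrival time (both in ns).
slack_power_gating = round(27.48 - 24.86, 2)
slack_multi_voltage = round(20.21 - 20.20, 2)

if __name__ == "__main__":
    for key, value in changes.items():
        print(f"{key}: {value:+.1f}%")
    print("slack (ns):", slack_power_gating, slack_multi_voltage)
```

Under these inputs the power change evaluates to about -8.0%, the cell area change to about +22.0% and the total area change to about +21.5%, in line with the rounded figures in the table. The slack values come out as 2.62 ns and 0.01 ns.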
