
Eeiol 2007jan16 Pow Ems Ta 01


EMBEDDED DESIGNS

Power saving with dynamic voltage and frequency scaling
By Karl Lu,
Senior System and
Architecture Engineer,
Freescale Semiconductor
(China) Limited

EE Times-India | January 2007 | eetindia.com

Chip vendors are devoting considerable effort to new power-saving techniques that extend the battery life of portable devices such as cell phones, MP3 players, portable media players and notebook PCs. In general, these techniques fall into two classes: dynamic techniques and static techniques.

Static techniques include different low-power modes, on-demand gating of clocks and power domains, etc. The dynamic technique scales the CPU operating frequency (and the voltage, because a CPU requires a higher voltage when it runs at a higher frequency) according to the performance requirement of the applications currently running on the CPU, and thereby saves energy. In theory, this technique comes from the formulas below:

[Equations]

From the equations above, it can be seen that scaling down the frequency only reduces the power (in watts); it does not save the energy (in joules) consumed by a task, because for a given task F*t is a constant. To reduce the energy consumption effectively, the voltage should also be scaled down when the frequency is decreased.

Currently many chips support the dynamic voltage and frequency scaling (DVFS) feature. For example, Intel supports SpeedStep, while ARM supports IEM (Intelligent Energy Manager) and AVS (Adaptive Voltage Scaling). But chip support alone is not enough to make DVFS take effect and reach the real goal of energy saving. A comprehensive design of software and hardware is needed.

Figure 1: Workflow of Prediction Based DVFS

Figure 2: Architecture Diagram of Vertigo

A typical DVFS workflow is as follows:

Step 1: Monitor the signals related to the workload, acquire the workload data and calculate the current system workload. This job can be done by either software or hardware. Generally, software does this by installing hooks into the system calls in the kernel, especially the scheduler, and calculating the workload according to how frequently these system calls are invoked.

The hardware-based implementation, e.g., Freescale’s i.MX31, gets the workload data by gathering usage information from critical signals such as interrupt lines, cache lines, the memory bus and others.
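Step 1 leaves the actual workload calculation open. As a minimal sketch, assuming a hook on the idle thread reports how long the CPU was idle in each sampling window (the function name and the tick units are hypothetical and not taken from the article), the workload can be taken as the busy fraction of the window:

```python
def workload_from_idle(idle_ticks, window_ticks):
    """Return the busy fraction (0.0 to 1.0) of one sampling window.

    Assumption: the idle hook reports how many ticks of the window
    the idle thread was scheduled.
    """
    if window_ticks <= 0:
        raise ValueError("window_ticks must be positive")
    if not 0 <= idle_ticks <= window_ticks:
        raise ValueError("idle_ticks must lie within the window")
    return 1.0 - idle_ticks / window_ticks


# A window of 100 ticks with 25 idle ticks is 75% busy.
print(workload_from_idle(25, 100))
```

A real implementation would probably combine this with the call counts gathered by the other hooks; how to weight them is a design decision the article does not specify.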
Step 2: On the basis of the current workload, predict the performance requirement of the system in the next time slice. Many prediction algorithms can be used here; the choice depends on the actual application. This prediction can also be done by either software or hardware.

Step 3: Translate the predicted performance requirement into a frequency and adjust the CPU clock setting.
Step 4: Calculate the new voltage corresponding to the new frequency, notify the power source module and ask it to adjust the voltage supplied to the CPU. A special power management IC is needed here, such as Freescale’s MC13783, or ICs from National Semiconductor that support the PowerWise feature. They support small-step voltage adjustment and can complete the adjustment very quickly (~10 microseconds).

Additionally, the frequency and voltage should be adjusted in a specific order. When the frequency is adjusted from high to low, it should be scaled down before the voltage is decreased. Conversely, when the frequency is raised, the voltage should be increased before the frequency is scaled up. Figure 1 illustrates the simple workflow of DVFS [2].

DVFS Realization Based on Software

In the software-based implementation of DVFS, hooks are installed into the system calls in the kernel. They gather usage information about the system calls and estimate the system workload. The obvious location for the hooks is the scheduler. Other locations include read/write interfaces, timers, etc. For instance, in the Linux kernel, hooks are installed in the following places:

• kernel/sched.c
  o hack __schedule(), insert code before and after schedule(), record the execution time of a task
• fs/read_write.c
  o hack sys_read() and sys_write(), record the number of times they are called
• kernel/timer.c
  o hack sys_nanosleep() and msleep(), record the sleep time of a task
• fs/ioctl.c
  o hack sys_ioctl(), record the number of times it is called
• kernel/exit.c
  o hack do_exit(), record the time when a task exits voluntarily
• include/asm_xxx/system.h, arch/xxx/system.c
  o hack arch_idle(), calculate the time that the cpu_idle() thread is scheduled

When predicting the system workload of the next time slice, the workload data acquired in the previous several slices can be used. The predicted workload can be obtained from the formula below:

[Equation 3]

The prediction algorithm varies with different h. Following are some examples: (see Equations 4, 5, 6 and 7)

All the algorithms above have their own advantages and disadvantages. E.g., LMS is similar to an adaptive filter and can adjust its parameters automatically, but it faces convergence issues.

ARM developed Vertigo to demonstrate the DVS (Dynamic Voltage Scaling) feature. This software uses the following formulas to estimate the workload, deadline and performance: (see Equations 8, 9, 10 and 11)

This algorithm works well for those OS tasks whose workload changes slowly, e.g., an MPEG decoder.

In the architecture of Vertigo, once the predictor finishes performance estimation, it submits the result to a policy manager. It is the duty of the policy manager to decide whether to accept the prediction result and adjust the performance setting. [Please refer to (3) for the detailed implementation of Vertigo.] The architecture diagram of Vertigo is shown in Figure 2.

DVFS Realization Based on Hardware

As mentioned before, CPU load tracking and performance prediction can also be done by the hardware. This method not only improves the reliability of workload tracking and calculation, but also reduces the overhead of the CPU performing such calculation and estimation. Of course, it brings a disadvantage: the prediction algorithm cannot be selected freely. But this inconvenience can be compensated to some extent by adjusting the prediction parameters.

Freescale’s i.MX31 is a good example of this. It is an application processor targeting the mobile multimedia market, with powerful performance for audio and video processing. An ARM11 core is integrated into this chip, which inherits the DVS technique from ARM and derives DVFS from it.

In this chip, CPU workload tracking and performance prediction are completed by the hardware automatically. The CPU workload tracking module diagram is shown below.

Figure 3: i.MX31 DVFS Load Tracking Module Block Diagram

In Figure 3, 16 general-purpose load signals are sampled and weighted. The weighted sum is sent to the load adder, where it is added to the CPU idleness signal data (simply averaged to reduce the sample clock frequency).

The output of the load adder is fed to the Exponential Moving Average block, which performs the EMA (Exponential Moving Average) algorithm and predicts the performance requirement. The estimation data from the EMA block is compared with the predefined threshold values.



Figure 4: Workflow of i.MX31 DVFS

If the performance prediction is greater than the upper threshold, the frequency should be scaled up. Otherwise, if the prediction is less than the lower threshold, the frequency should be scaled down.

The interrupt request for frequency or performance adjustment is sent to the CPU itself or to an external processor. The processor handles the request in the ISR (interrupt service routine) and sets the correct frequency and voltage. In Figure 4, which illustrates the overall workflow, CCM means Clock Control Module, which is responsible for adjusting the CPU frequency. PMIC means Power Management IC, which is responsible for supplying the power needed by the CPU.

This chip provides two interfaces to the CPU: a normal SPI (Serial Peripheral Interface) and a dedicated DVS interface for dynamic voltage scaling. The DVS interface is made up of two lines, and the state of these two lines encodes the voltage adjustment request: 00 - no change, 01 - decrease the voltage by one step, 10 - increase the voltage by one step, 11 - increase the voltage to the top level.

DPTC in the figure above means Dynamic Process and Temperature Control. This technique can adjust the supply voltage according to the chip process and the current ambient temperature and, as a result, save energy effectively. It is also an attractive feature of the i.MX31.

Real Effect of DVFS

To verify the real effect of DVFS, the applications should be executed on the given CPU and the actual power consumption should be measured with DVFS enabled and disabled. The actual results of power consumption measurement are presented here for DVFS implementations based on software and hardware respectively.

Intrinsyc ported the IEM software developed by ARM to WinCE [5] and measured the power consumption of the CPU when IEM was enabled or disabled. The IEM software runs on the i.MX31 advanced development suite. It can be thought of as a software-based implementation of DVFS, since it doesn’t make use of the i.MX31 built-in DVFS. The moving average algorithm is used to estimate the workload [the h in equation (3) is always 1/N]. In addition, a GPIO is used to indicate whether the CPU enters the IDLE state (the cpu_idle() thread is scheduled). The smaller the IDLE portion is, the higher the CPU utilization is. The benchmarking result is presented below.

Table 1: Video File Information Used for IEM Testing

Table 2: Power Consumption with IEM Enabled or Disabled

In order to verify the real effect of the hardware-based DVFS implementation, the author measured the power consumption on the i.MX31 advanced development suite. The multimedia applications (audio or video player) run on Linux. The benchmarking result is presented in Table 3.

Table 3: Power Consumption with i.MX31 built-in DVFS Enabled or Disabled

It can be seen clearly from the tables above that a DVFS implementation based on either software or hardware can reduce the power consumption effectively.

Factors Affecting DVFS

The idea of dynamic voltage and/or frequency scaling has been around for a long time, and the open source project cpufreq is developing software for it. But this technique has not been widely applied so far. One of the key factors is the reliability of performance prediction.

No prediction algorithm is 100% reliable, nor does any single one work well for all applications. And for real-time applications such as audio or video, it is not acceptable if the prediction fails. If a deadline is missed, e.g., an audio or video frame misses its presentation time, the user will perceive the deterioration in audio or video quality. It will worsen the user experience greatly and undermine the user’s confidence in DVFS. The author met such problems when performing the DVFS test.

The moving average algorithm used by IEM only works well for simple use cases, such as only one application running on the CPU. And the EMA (exponential moving average) used by the i.MX31 built-in DVFS is not a panacea either. If the built-in DVFS is enabled, the CPU can’t play some songs of Pink Floyd smoothly (after some DVFS parameters such as the lower frequency threshold are modified, these songs can be played well).

But the author believes that DVFS will be applied more and more widely with the evolution of prediction algorithms and other techniques, because it has demonstrated great potential in power reduction. And power reduction is often the first requirement for many portable devices.

References
1. Freescale, 2/2006, i.MX31 Multimedia Application Processor Reference Manual, Rev 1
2. Freescale, Boris Bobrov & Michael Priel, 6/2005, i.MX31 Power Management White Paper, Rev 0
3. ARM, Krisztian Flautner, et al, OSDI 2002, Vertigo: Automatic Performance-Setting for Linux
4. ARM, Krisztian Flautner, et al, DesignCon 2003, A Combined Hardware-Software Approach for Low-Power SoCs: Applying Adaptive Voltage Scaling and Intelligent Energy Management Software
5. Intrinsyc, Suji Velupillai & Ken Tough, 9/2006, Intelligent Energy Manager (IEM) Benchmarking on a Freescale’s i.MX31 Multimedia Processor

