Hyper-Threading Technology: Presented by Nagarajender Rao Katoori

Introduction
To enhance performance:
• Increase in clock rate
o Involves reducing the clock cycle time
o Can increase performance by increasing the number of instructions finishing per second
o Hardware limitations restrict how far the clock rate can be raised
• Cache hierarchies
o Keeping frequently used data in the processor caches reduces the average access time
• Pipelining
o Implementation technique whereby multiple instructions are overlapped in execution
o Limited by the dependencies between instructions
o Affected by stalls, so the effective CPI (cycles per instruction) is greater than 1
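The effect of stalls on CPI can be made concrete with a small worked calculation. The sketch below assumes the usual textbook relation (effective CPI = ideal CPI + average stall cycles per instruction); the function names and the numbers are illustrative only and do not come from the slides.

```python
def effective_cpi(ideal_cpi: float, stall_cycles_per_instruction: float) -> float:
    """Effective CPI = ideal CPI + average stall cycles per instruction."""
    return ideal_cpi + stall_cycles_per_instruction


def instructions_per_second(clock_hz: float, cpi: float) -> float:
    """Throughput for a given clock rate and cycles-per-instruction figure."""
    return clock_hz / cpi


if __name__ == "__main__":
    # Hypothetical pipeline: ideal CPI of 1, half a stall cycle per instruction.
    cpi = effective_cpi(ideal_cpi=1.0, stall_cycles_per_instruction=0.5)
    print("effective CPI:", cpi)
    print("instructions/s at 3 GHz:", instructions_per_second(3e9, cpi))
```

Under these assumed numbers, raising the clock rate and removing stalls are two separate levers on the same throughput figure.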
• Instruction Level Parallelism
o Refers to techniques that increase the number of instructions executed in each clock cycle
o Exists whenever the machine instructions that make up a program are insensitive to the order in which they are executed; if no dependencies exist between them, they may be executed in parallel
Thread level parallelism
• Chip Multi Processing
o Two processors, each with a full set of execution and architectural resources, reside on a single die
• Time Slice Multi Threading
o A single processor executes multiple threads by switching between them
• Switch on Event Multi Threading
o Switches threads on long-latency events such as cache misses
Thread level parallelism (cont.)
• Simultaneous Multi Threading
o Multiple threads can execute on a single processor without switching
o The threads execute simultaneously and make much better use of the resources
o It maximizes performance relative to transistor count and power consumption
Hyper-Threading Technology
• Hyper-Threading Technology brings the simultaneous multi-threading approach to the Intel architecture.
• It makes a single physical processor appear as two or more logical processors.
• Hyper-Threading Technology was developed by Intel Corp.
• It provides thread-level parallelism (TLP) on each processor, resulting in increased utilization of processor and execution resources.
• Each logical processor maintains one copy of the architecture state.
Hyper-Threading Technology Architecture
[Figure: a processor without Hyper-Threading Technology (one architecture state, one set of processor execution resources) next to a processor with Hyper-Threading Technology (two architecture states sharing one set of processor execution resources)]
Ref: Intel Technology Journal, Volume 06 Issue 01, February 14, 2002
The following resources are duplicated to support Hyper-Threading Technology:
• Register Alias Tables
• Next-Instruction Pointer
• Instruction Streaming Buffers and Trace Cache Fill Buffers
• Instruction Translation Look-aside Buffer
[Figure: Intel Xeon processor pipeline]
Sharing of Resources
• The major sharing schemes are:
o Partition
o Threshold
o Full Sharing
Partition
• Each logical processor uses half the resources
• Simple and low in complexity
• Ensures fairness and progress
• Good for major pipeline queues
Partitioned Queue Example
• Yellow thread: the faster thread
• Green thread: the slower thread
Partitioned Queue Example (cont.)
• Partitioning the resource ensures fairness and ensures progress for both logical processors.
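The partitioning idea can be sketched as a toy software model. This is only an illustration under stated assumptions: the class name `PartitionedQueue`, its methods, and the queue size of 8 are hypothetical and do not describe the actual hardware queues.

```python
from collections import deque


class PartitionedQueue:
    """Toy model: a queue whose entries are split evenly between two logical processors."""

    def __init__(self, total_entries: int) -> None:
        self.limit = total_entries // 2          # each logical processor gets half
        self.entries = {0: deque(), 1: deque()}  # one partition per logical processor

    def try_enqueue(self, lp: int, item: str) -> bool:
        """Insert an item for logical processor `lp`; refuse if its half is full."""
        if len(self.entries[lp]) >= self.limit:
            return False
        self.entries[lp].append(item)
        return True

    def dequeue(self, lp: int):
        """Remove the oldest item of logical processor `lp`, or return None if empty."""
        return self.entries[lp].popleft() if self.entries[lp] else None


if __name__ == "__main__":
    q = PartitionedQueue(total_entries=8)
    # The slower (green) thread, modelled as logical processor 1, stops draining
    # its entries and keeps trying to insert more.
    accepted = sum(q.try_enqueue(1, f"green-{i}") for i in range(10))
    print("green entries accepted:", accepted)
    # The faster (yellow) thread, logical processor 0, can still make progress.
    print("yellow can enqueue:", q.try_enqueue(0, "yellow-0"))
```

In this model the stalled thread can never occupy more than its own half, which is the fairness and forward-progress property the slide describes.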
Threshold
• Puts a threshold on the number of resource entries a logical processor can use
• Limits maximum resource usage
• Suited to small structures where resource utilization comes in bursts and the time of utilization is short, uniform, and predictable
• E.g., the processor scheduler
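A threshold scheme differs from partitioning in that the entries form one shared pool, with only a cap on how many a single logical processor may hold. The sketch below is a toy model under that reading; `ThresholdResource`, its methods, and the sizes used are hypothetical names and values chosen for illustration.

```python
class ThresholdResource:
    """Toy model: a shared pool where each logical processor is capped at `threshold` entries."""

    def __init__(self, total_entries: int, threshold: int) -> None:
        if not 0 < threshold <= total_entries:
            raise ValueError("threshold must be between 1 and total_entries")
        self.total = total_entries
        self.threshold = threshold
        self.in_use = {0: 0, 1: 0}  # entries currently held by each logical processor

    def try_allocate(self, lp: int) -> bool:
        """Give one entry to logical processor `lp` if the cap and pool size allow it."""
        if self.in_use[lp] >= self.threshold:
            return False  # this logical processor has reached its cap
        if sum(self.in_use.values()) >= self.total:
            return False  # the shared pool is exhausted
        self.in_use[lp] += 1
        return True

    def release(self, lp: int) -> None:
        """Return one entry held by logical processor `lp` to the pool."""
        if self.in_use[lp] == 0:
            raise ValueError("nothing to release")
        self.in_use[lp] -= 1


if __name__ == "__main__":
    pool = ThresholdResource(total_entries=8, threshold=6)
    print("lp0 got:", sum(pool.try_allocate(0) for _ in range(8)))
    print("lp1 got:", sum(pool.try_allocate(1) for _ in range(3)))
```

With a cap above half the pool, one logical processor can use more than its even share during a burst, while the other is still guaranteed the remaining entries.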
Full Sharing
• Most flexible mechanism for resource sharing; it does not limit the maximum resource usage of a logical processor
• Good for large structures in which working set sizes are variable and there is no fear of starvation
• E.g., all processor caches are shared
o Some applications benefit from a shared cache because they share code and data, minimizing redundant data in the caches
[Figure: NetBurst microarchitecture’s execution pipeline]
SINGLE-TASK AND MULTI-TASK MODES
• Two modes of operation:
– single-task (ST)
– multi-task (MT)
• In MT-mode, there are two active logical processors and some of the resources are partitioned.
• There are two flavors of ST-mode: single-task logical processor 0 (ST0) and single-task logical processor 1 (ST1).
• In ST0- or ST1-mode, only one logical processor is active, and resources that were partitioned in MT-mode are recombined to give the single active logical processor use of all of the resources.
SINGLE-TASK AND MULTI-TASK MODES (cont.)
• HALT is an instruction that stops processor execution.
• On a processor with Hyper-Threading Technology, executing HALT transitions the processor from MT-mode to ST0- or ST1-mode, depending on which logical processor executed the HALT.
• In ST0- or ST1-mode, an interrupt sent to the halted logical processor causes a transition back to MT-mode.
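The mode transitions described above can be written down as a small state machine. This is a sketch, not a description of the hardware logic: the names `Mode`, `on_halt`, and `on_interrupt` are hypothetical, and the mapping "HALT on logical processor 1 leads to ST0" is an assumption read off the mode names (ST0 meaning that logical processor 0 is the one still active). The case where the only active logical processor halts is not covered by the slides and is left unmodelled.

```python
from enum import Enum


class Mode(Enum):
    MT = "multi-task: both logical processors active"
    ST0 = "single-task: only logical processor 0 active"
    ST1 = "single-task: only logical processor 1 active"


def on_halt(mode: Mode, lp: int) -> Mode:
    """Mode after logical processor `lp` executes HALT while in MT-mode."""
    if lp not in (0, 1):
        raise ValueError("lp must be 0 or 1")
    if mode is not Mode.MT:
        # Assumption: HALT outside MT-mode is not described in the slides.
        raise ValueError("only the MT-mode HALT transition is modelled")
    # The logical processor that did NOT halt remains active.
    return Mode.ST0 if lp == 1 else Mode.ST1


def on_interrupt(mode: Mode, target_lp: int) -> Mode:
    """Mode after an interrupt is sent to logical processor `target_lp`."""
    if target_lp not in (0, 1):
        raise ValueError("target_lp must be 0 or 1")
    halted_lp = {Mode.ST0: 1, Mode.ST1: 0}.get(mode)  # None while in MT-mode
    if halted_lp == target_lp:
        return Mode.MT  # waking the halted logical processor re-enters MT-mode
    return mode


if __name__ == "__main__":
    mode = on_halt(Mode.MT, lp=1)
    print(mode.name)
    mode = on_interrupt(mode, target_lp=1)
    print(mode.name)
```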
OPERATING SYSTEM
• For best performance, the operating system should implement two optimizations.
– The first is to use the HALT instruction if one logical processor is active and the other is not. HALT allows the processor to transition from MT-mode to either ST0- or ST1-mode.
– The second is in scheduling software threads to logical processors. The operating system should schedule threads to logical processors on different physical processors before scheduling two threads to the same physical processor.
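The second optimization amounts to a placement order: fill one logical processor on every physical processor before using any sibling. The sketch below illustrates only that ordering; `placement_order` and `assign` are hypothetical helpers, not an operating system API, and real schedulers weigh many more factors.

```python
def placement_order(num_physical: int, logical_per_physical: int = 2):
    """List (physical, logical) slots in the order threads should be placed.

    All physical processors receive their first thread before any physical
    processor receives a second one.
    """
    return [
        (physical, logical)
        for logical in range(logical_per_physical)
        for physical in range(num_physical)
    ]


def assign(threads, num_physical: int, logical_per_physical: int = 2):
    """Map thread names to slots; threads beyond the available slots are left unassigned."""
    slots = placement_order(num_physical, logical_per_physical)
    return dict(zip(threads, slots))


if __name__ == "__main__":
    # Two physical processors with two logical processors each (assumed topology).
    for thread, slot in assign(["t0", "t1", "t2"], num_physical=2).items():
        print(thread, "-> physical", slot[0], "logical", slot[1])
```

With two runnable threads on this assumed two-processor system, each thread gets a physical processor to itself, so neither has to share execution resources.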
Business Benefits of Hyper-Threading Technology
• Higher transaction rates for e-businesses
• Improved reaction and response times for end users and customers
• Increased number of users that a server system can support
• Ability to handle increased server workloads
• Compatibility with existing server applications and operating systems
[Figure: Performance increases from Web server benchmark]
[Figure: Hyper-Threading Technology performance on an OLTP workload]
Conclusion
• Intel’s Hyper-Threading Technology brings the concept of simultaneous multi-threading to the Intel Architecture.
• It will become increasingly important going forward, as it adds a new technique for obtaining additional performance at lower transistor and power costs.
• The goal was to implement the technology at minimum cost while ensuring forward progress on each logical processor, even if the other is stalled, and to deliver full performance even when there is only one active logical processor.
References
• “Hyper-Threading Technology Architecture and Microarchitecture” by Deborah T. Marr, Frank Binns, David L. Hill, Glenn Hinton, David A. Koufaty, J. Alan Miller, Michael Upton, Intel Technology Journal, Volume 06 Issue 01, published February 14, 2002. Pages: 4–15.
• “Hyper-Threading Technology in the NetBurst Microarchitecture” by David Koufaty, Deborah T. Marr, IEEE Micro, Vol. 23, Issue 2, March–April 2003. Pages: 56–65.
• http://cache-www.intel.com/cd/00/00/22/09/220943_220943.pdf
• http://www.cs.washington.edu/research/smt/papers/tlp2ilp.final.pdf
• http://mos.stanford.edu/papers/mj_thesis.pdf
Thank you
