Superscaling in Computer Architecture
Computer Architecture
Assignment | Date: June 5, 2021
What is Super Scaling in Computer Architecture?
Super scaling is the parallel execution of multiple independent instructions in
the same clock cycle, using more than one pipeline. A processor built with the
super scaling method is called a superscalar processor, and its architecture is
called a superscalar architecture.
History:
Seymour Cray’s CDC 6600, first delivered in 1964, is often mentioned as the first
superscalar design. IBM announced the System/360 Model 91 in 1964 as a
competitor to the CDC 6600; when it shipped in 1967, it was another superscalar
mainframe. The Motorola MC88100, the Intel i960CA, and the AMD 29000-series
29050 microprocessors were the first commercial single-chip superscalar
microprocessors. The P5 Pentium was the first superscalar x86 processor.
The Nx586, P6 Pentium Pro, and AMD K5 were among the first designs that
decode x86 instructions asynchronously into dynamic, microcode-like micro-op
sequences before actual execution on a superscalar microarchitecture.
The superscalar architecture was designed to improve performance by executing
instructions simultaneously in multiple independent pipelines. This approach
increases the complexity of the hardware design.
Superscalar Architecture:
Superscalar processor design generally refers to a set of techniques that enable
the central processing unit (CPU) of a computer to attain a throughput of over
one instruction per cycle while executing a single sequential program. It is an
advanced pipelining technique.
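
To make the "over one instruction per cycle" claim concrete, here is a minimal
Python sketch. It assumes an ideal pipeline with no stalls, dependencies, or
branches, and the function names (ideal_cycles, ideal_ipc) and parameters are
illustrative only, not taken from any real tool.

```python
import math


def ideal_cycles(n_instructions: int, stages: int, width: int) -> int:
    """Cycles needed to finish n instructions on an ideal, stall-free pipeline.

    The first issue group needs `stages` cycles to drain; every further
    group of up to `width` instructions completes one cycle later.
    """
    if n_instructions <= 0:
        return 0
    issue_groups = math.ceil(n_instructions / width)
    return stages + issue_groups - 1


def ideal_ipc(n_instructions: int, stages: int, width: int) -> float:
    """Average instructions completed per cycle under the same assumptions."""
    cycles = ideal_cycles(n_instructions, stages, width)
    return n_instructions / cycles if cycles else 0.0


if __name__ == "__main__":
    # Compare a scalar pipeline (width 1) with 2-wide and 4-wide designs.
    for width in (1, 2, 4):
        print(width, ideal_cycles(1000, 5, width),
              round(ideal_ipc(1000, 5, width), 3))
```

With 1000 instructions and five stages, the scalar case takes 1004 cycles (just
under one instruction per cycle), while the two-wide case takes 504 cycles (just
under two). Real programs fall short of these figures because of the
limitations described later.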
Superscalar Processor:
A superscalar processor dispatches multiple instructions during a single clock
cycle. Among microprocessors, the superscalar architecture was first implemented
in RISC processors, which use short and simple instructions to perform
calculations. This architecture is sometimes called “second-generation RISC.”
Because of their superscalar abilities, RISC processors have typically performed
better than CISC processors running at the same clock speed. Clock speed is
measured in cycles per second (hertz) and is usually quoted in megahertz (MHz).
However, most CISC-based processors (such as the Intel Pentium) now include some
RISC-style techniques as well, which enable them to execute instructions in
parallel. Nearly all general-purpose processors developed after 1998 are
superscalar processors.
In a superscalar architecture, the processor determines whether an instruction
depends on the output of an earlier instruction or whether it can be executed
independently; a compiler can help by ordering instructions so that independent
ones sit close together. Data dependencies between instructions are checked
dynamically by the CPU hardware at run time, and instruction scheduling is
likewise done dynamically, at run time, by the processor. Superscalar
architectures have mechanisms for fetching multiple instructions, determining
dependencies between instructions, and executing instructions in parallel.
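
The dependency check described above can be sketched in Python as follows. This
is a deliberately simplified model, not how any particular CPU works: it assumes
in-order issue, a result that becomes available one cycle after issue, and no
register renaming. The Instr type, the issue_groups function, and the register
names are hypothetical.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    name: str
    dest: str      # register written by the instruction
    srcs: tuple    # registers read by the instruction


def issue_groups(program, width=2):
    """Group instructions that may issue together in the same cycle.

    An instruction starts a new group if the current group is full, or if
    it reads or writes a register written by an instruction already in
    the group.
    """
    groups, current = [], []
    for ins in program:
        conflict = any(prev.dest in ins.srcs or prev.dest == ins.dest
                       for prev in current)
        if current and (conflict or len(current) >= width):
            groups.append(current)
            current = []
        current.append(ins)
    if current:
        groups.append(current)
    return groups


PROGRAM = [
    Instr("I1", "r1", ("r2", "r3")),
    Instr("I2", "r4", ("r1", "r5")),   # needs r1 from I1
    Instr("I3", "r6", ("r7", "r8")),   # independent of I2
    Instr("I4", "r9", ("r6", "r4")),   # needs r6 from I3 and r4 from I2
]

if __name__ == "__main__":
    for cycle, group in enumerate(issue_groups(PROGRAM), start=1):
        print(cycle, [ins.name for ins in group])
```

In this sketch only I2 and I3 share a cycle, so the four instructions need three
issue cycles instead of the two that a dependency-free program would need.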
Instruction-level Parallelism:
The superscalar processor implements a form of parallelism called
instruction-level parallelism (ILP) within a single processor. ILP is the
degree to which the instructions of a program can be executed in parallel; that
is, it is a measure of how many of the instructions in a computer program can be
executed simultaneously.
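
One common way to put a number on ILP is to divide the instruction count by the
length of the longest chain of dependent instructions. The Python sketch below
does this under stated assumptions: every instruction takes one step, only true
(read-after-write) dependencies count, and there are unlimited execution units.
The function name and the (dest, srcs) tuple format are illustrative.

```python
def ilp(program):
    """Estimate ILP as instruction count / longest dependency chain.

    `program` is a list of (dest, srcs) tuples in program order.
    """
    depth = []          # depth[i] = earliest step at which instruction i can run
    last_writer = {}    # register -> index of the latest instruction writing it
    for i, (dest, srcs) in enumerate(program):
        # An instruction runs one step after the latest producer it reads from.
        producer_depths = (depth[last_writer[s]] for s in srcs
                           if s in last_writer)
        depth.append(1 + max(producer_depths, default=0))
        last_writer[dest] = i
    critical_path = max(depth, default=0)
    return len(program) / critical_path if critical_path else 0.0


INDEPENDENT = [("r1", ("r8",)), ("r2", ("r9",)),
               ("r3", ("r10",)), ("r4", ("r11",))]
CHAIN = [("r1", ("r0",)), ("r2", ("r1",)),
         ("r3", ("r2",)), ("r4", ("r3",))]

if __name__ == "__main__":
    print(ilp(INDEPENDENT))   # four instructions, chain length 1
    print(ilp(CHAIN))         # four instructions, chain length 4
```

Four fully independent instructions give an ILP of 4.0, while a chain in which
each instruction needs the previous result gives 1.0, so a wider processor
would not help the second program at all.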
Pipelining in Superscalar Architecture:
• A superscalar processor consists of multiple independent pipelines.
• Each pipeline consists of several stages, so each pipeline can handle
multiple instructions at a time, one in each stage.
Example:
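As a minimal sketch of the idea, the following Python snippet draws a timing
chart for a pipeline that starts `width` instructions per cycle. The five stage
names (IF, ID, EX, MEM, WB) are the classic textbook stages and are an
assumption here, as are the function names; stalls and dependencies are ignored.

```python
STAGES = ("IF", "ID", "EX", "MEM", "WB")   # assumed five-stage pipeline


def pipeline_chart(n, width):
    """Return one row per instruction; each row lists the stage per cycle."""
    total_cycles = (n - 1) // width + len(STAGES)
    rows = []
    for i in range(n):
        start = i // width                 # cycle in which instruction i is fetched
        cells = ["--"] * total_cycles
        for offset, stage in enumerate(STAGES):
            cells[start + offset] = stage
        rows.append(cells)
    return rows


def render(rows):
    """Format the chart as text, one instruction per line."""
    return "\n".join(
        f"I{i + 1}: " + " ".join(f"{cell:>3}" for cell in row)
        for i, row in enumerate(rows)
    )


if __name__ == "__main__":
    print("Scalar (1 pipeline):")
    print(render(pipeline_chart(4, 1)))
    print("Superscalar (2 pipelines):")
    print(render(pipeline_chart(4, 2)))
```

For four instructions, the scalar chart spans eight cycles while the two-wide
chart spans six, because instructions I1/I2 and I3/I4 move through the stages
in pairs.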
Limitations of Superscalar:
1. True data dependency:
Example: I1 : ADD r1, r2
I2 : MOV r3, r1
In this case, the MOV in I2 depends on the value of r1 produced by I1. That is,
the second instruction (I2) needs data produced by the first instruction (I1),
so I2 cannot execute until I1 has produced its result.
2. Procedural dependency:
• Situation 1: Instructions after a branch cannot be executed in parallel
with instructions before the branch, because the outcome of the branch
is not yet known – this holds up multiple pipelines.
• Situation 2: Variable-length instructions – the first instruction must be
partially decoded for the first pipe before the second instruction can be
fetched for the second pipe.
3. Resource Conflict:
• Resource conflicts occur if two or more instructions compete for the same
resource (register, memory, functional unit) at the same time. They are
similar to the structural hazards discussed with pipelines.
• By introducing several parallel pipelined units, superscalar architectures
try to reduce some of the possible resource conflicts.
4. Output dependency:
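• Output dependency occurs when two instructions write their results to the
same destination (register or memory location).
• If the later write completes before the earlier one, the destination ends up
holding the wrong value, and any subsequent instruction that relies on that
result reads incorrect data.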
5. Antidependency:
• Antidependency is the reverse of true data dependency.
• Data dependency: instruction-2 depends on data from instruction-1.
Antidependency: instruction-1 depends on data that could be
destroyed by instruction-2.
Example:
I1 : R3= R3+ R5
I2 : R4= R3+1
I3 : R3= R5+1
I4 : R7= R3+ R4
Instruction I3 cannot complete before I2 has read its operands, because I2 needs
the value in R3 and I3 overwrites R3.
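
The three register dependencies discussed above (true data dependency, output
dependency, and antidependency) can be found mechanically. The Python sketch
below is one simple way to do it for straight-line code; the function name, the
tuple format, and the short labels RAW, WAW, and WAR (read-after-write,
write-after-write, write-after-read) are conventions chosen for this
illustration.

```python
from collections import defaultdict


def find_dependencies(program):
    """Classify register dependencies in a straight-line program.

    `program` is a list of (name, dest, srcs) tuples in program order.
    Returns (kind, earlier, later, register) tuples, where kind is
    "RAW" (true data dependency), "WAR" (antidependency) or
    "WAW" (output dependency).
    """
    last_writer = {}               # register -> name of the latest writer
    readers = defaultdict(list)    # register -> readers since that write
    found = []
    for name, dest, srcs in program:
        for reg in dict.fromkeys(srcs):        # de-duplicate, keep order
            if reg in last_writer:
                found.append(("RAW", last_writer[reg], name, reg))
            readers[reg].append(name)
        if dest in last_writer:
            found.append(("WAW", last_writer[dest], name, dest))
        for reader in readers[dest]:
            if reader != name:                 # reading then writing is fine
                found.append(("WAR", reader, name, dest))
        last_writer[dest] = name
        readers[dest] = []
    return found


# The four-instruction example from the text; immediates are not registers.
PROGRAM = [
    ("I1", "R3", ("R3", "R5")),
    ("I2", "R4", ("R3",)),
    ("I3", "R3", ("R5",)),
    ("I4", "R7", ("R3", "R4")),
]

if __name__ == "__main__":
    for dep in find_dependencies(PROGRAM):
        print(dep)
```

On this program the sketch reports a true dependency of I2 on I1 (R3), an
output dependency between I1 and I3 (R3), the antidependency between I2 and I3
(R3), and true dependencies of I4 on I3 (R3) and on I2 (R4).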