Advanced Computer Architecture 2
JIWAJI UNIVERSITY
PRIYANKA PAWAR
3RD YEAR(CSE)
171489954
Ques1: What is Pipeline?
Answer:
Pipelining is a technique in which the processor works on several
instructions at once by passing them through a pipeline. It allows
instructions to be stored and executed in an orderly, overlapping
fashion, and is also known as pipeline processing.
In pipelining, multiple instructions are overlapped during
execution. A pipeline is divided into stages, and these stages are
connected to one another to form a pipe-like structure.
Instructions enter at one end and exit at the other.
Pipelining increases the overall instruction throughput.
In pipeline system, each segment consists of an input register
followed by a combinational circuit. The register is used to hold data
and combinational circuit performs operations on it. The output of
combinational circuit is applied to the input register of the next
segment.
Pipeline system is like the modern day assembly line setup in
factories. For example in a car manufacturing industry, huge
assembly lines are setup and at each point, there are robotic arms to
perform a certain task, and then the car moves on ahead to the next
arm.
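The register-plus-combinational-circuit structure described above can be sketched in software. This is a minimal illustrative model (not from the text): each segment holds an input register and applies a function, and on every clock tick the results shift one segment to the right.

```python
# Minimal sketch of a pipeline: each segment = input register +
# combinational function; results shift right on every clock tick.

def make_pipeline(stages):
    """stages: list of single-argument functions, one per segment."""
    registers = [None] * len(stages)  # input register of each segment

    def clock(new_input):
        nonlocal registers
        # Each segment processes its register's contents this cycle.
        outputs = [f(r) if r is not None else None
                   for f, r in zip(stages, registers)]
        # Results move to the next segment's input register.
        registers = [new_input] + outputs[:-1]
        return outputs[-1]  # result leaving the last segment

    return clock

# Example: three stages that each transform an integer.
clock = make_pipeline([lambda x: x + 1, lambda x: x * 2, lambda x: x - 3])
results = [clock(i) for i in range(6)]  # feed inputs 0..5
# The first 3 ticks only fill the pipe (None leaves the last stage);
# afterwards one finished result exits per tick.
```

Note how, once the pipe is full, a result emerges every cycle even though each individual input still spends three cycles in the pipeline.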
Types of Pipeline:
It is divided into 2 categories:
1. Arithmetic Pipeline
2. Instruction Pipeline
Arithmetic Pipeline:
Arithmetic pipelines are found in most computers.
They are used for floating-point operations, multiplication of fixed-
point numbers, etc. For example, the input to a floating-point
adder pipeline is:
X = A*2^a
Y = B*2^b
Here A and B are mantissas (significant digit of floating point
numbers), while a and b are exponents.
The floating point addition and subtraction is done in 4 parts:
1. Compare the exponents.
2. Align the mantissas by shifting the one with the smaller exponent.
3. Add or subtract the mantissas.
4. Normalize the result.
Registers are used for storing the intermediate results between the
above operations.
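The four parts of the floating-point adder can be sketched as follows. This is an illustrative model using decimal mantissa/exponent pairs for readability (real hardware works in binary and handles rounding, signs and special values):

```python
# Sketch of the 4 floating-point adder stages, decimal for clarity.

def fp_add(A, a, B, b):
    """Add X = A * 10^a and Y = B * 10^b; return (mantissa, exponent)."""
    # Stage 1: compare exponents.
    diff = a - b
    # Stage 2: align mantissas by shifting the smaller operand right.
    if diff >= 0:
        B, b = B / (10 ** diff), a
    else:
        A, a = A / (10 ** -diff), b
    # Stage 3: add the aligned mantissas.
    M = A + B
    # Stage 4: normalize so that 0.1 <= |M| < 1.
    while abs(M) >= 1:
        M, a = M / 10, a + 1
    while M != 0 and abs(M) < 0.1:
        M, a = M * 10, a - 1
    return M, a

# Example: X = 0.9504 * 10^3, Y = 0.8200 * 10^2.
M, e = fp_add(0.9504, 3, 0.8200, 2)
# 0.9504e3 + 0.0820e3 = 1.0324e3, normalized to 0.10324 * 10^4
```

In a pipelined adder, each of these four stages is a separate hardware segment, so four different additions can be in flight at once.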
Instruction Pipeline:
In this technique, a stream of instructions is executed by
overlapping the fetch, decode and execute phases of the instruction
cycle. It is used to increase the throughput of the computer system.
An instruction pipeline reads instruction from the memory while
previous instructions are being executed in other segments of the
pipeline. Thus we can execute multiple instructions simultaneously.
The pipeline will be more efficient if the instruction cycle is divided
into segments of equal duration.
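The benefit of equal-duration segments can be quantified with the standard pipeline timing formula (a worked illustration, not from the text): a k-stage pipeline needs k cycles for the first instruction and one cycle for each of the remaining n - 1.

```python
# Cycle count and speedup of a k-stage pipeline over n instructions,
# assuming equal-duration segments and no stalls.

def pipelined_cycles(n, k):
    # k cycles to fill the pipe, then 1 result per cycle.
    return k + (n - 1)

def speedup(n, k):
    # Non-pipelined execution takes n * k cycles.
    return (n * k) / pipelined_cycles(n, k)

# Example: 100 instructions on a 4-stage pipeline.
print(pipelined_cycles(100, 4))    # 103 cycles instead of 400
print(round(speedup(100, 4), 2))   # approaches k = 4 as n grows
```

As n grows large, the speedup approaches k, which is why dividing the instruction cycle into equal segments matters: the slowest segment sets the clock period for every stage.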
Pipeline Conflicts:
There are some factors that cause the pipeline to deviate from its
normal performance. Some of these factors are given below:
1. Timing Variations: All stages cannot take the same amount of
time. This problem generally occurs in instruction processing, where
different instructions have different operand requirements and thus
different processing times.
2. Data Hazards: When several instructions are in partial execution
and they reference the same data, a problem arises. We must ensure
that the next instruction does not attempt to access the data before
the current instruction has finished with it, because this would lead
to incorrect results.
3. Branching: In order to fetch and execute the next instruction, we
must know what that instruction is. If the present instruction is a
conditional branch, its result determines which instruction comes
next, so the next instruction may not be known until the current one
is processed.
4. Interrupts: Interrupts insert unwanted instructions into the
instruction stream and disturb the normal flow of execution.
5. Data Dependency: It arises when an instruction depends upon
the result of a previous instruction but this result is not yet
available.
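The data-dependency case can be made concrete with a small sketch (illustrative, not from the text): a read-after-write (RAW) dependency exists when an instruction reads a register that the previous instruction writes.

```python
# Sketch: detecting a read-after-write (RAW) dependency between two
# adjacent instructions, each given as (dest_register, source_registers).

def has_raw_hazard(first, second):
    dest, _ = first
    _, sources = second
    # second depends on first if it reads first's destination register.
    return dest in sources

i1 = ("R1", ["R2", "R3"])   # ADD R1, R2, R3
i2 = ("R4", ["R1", "R5"])   # SUB R4, R1, R5 -- reads R1 before write-back
print(has_raw_hazard(i1, i2))  # True: i2 must stall or use forwarding
```

Real pipelines resolve such dependencies by stalling the dependent instruction or by forwarding the result directly between stages.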
Advantages of Pipelining:
• Instruction throughput increases, since several instructions are
processed simultaneously.
• The functional units of the processor are kept busy, so the
hardware is used more efficiently.
Disadvantages of Pipelining:
• The design of a pipelined processor is more complex, and the
pipeline registers add cost.
• Pipeline conflicts (data, control and timing hazards) can stall the
pipeline and reduce the expected speedup.
• The latency of a single instruction is not reduced; only the
throughput improves.
Non-Linear Pipeline :
Non-Linear pipeline is a pipeline which is made of different
pipelines that are present at different stages. The different pipelines
are connected to perform multiple functions. It also has feedback
and feed-forward connections. It is made such that it performs
various functions at different time intervals. In Non-Linear pipeline
the functions are dynamically assigned.
Linear vs Non-Linear pipeline:
1. Linear pipelines are static pipelines, because they are used to
perform fixed functions. Non-Linear pipelines are dynamic pipelines,
because they can be reconfigured to perform variable functions at
different times.
2. A linear pipeline allows only streamline connections. A non-linear
pipeline allows feed-forward and feedback connections in addition to
the streamline connection.
3. It is relatively easy to partition a given function into a sequence
of linearly ordered sub-functions. In a non-linear pipeline, function
partitioning is relatively difficult because the pipeline stages are
interconnected with loops in addition to streamline connections.
4. The output of a linear pipeline is produced from the last stage.
The output of a non-linear pipeline is not necessarily produced from
the last stage.
5. The reservation table of a linear pipeline is trivial, in the sense
that data flows in a linear streamline. The reservation table of a
non-linear pipeline is non-trivial, since there is no linear
streamline for data flow.
6. Static pipelining is specified by a single reservation table.
Dynamic pipelining is specified by more than one reservation table.
7. All initiations to a static pipeline use the same reservation
table. A dynamic pipeline may allow different initiations to follow a
mix of reservation tables.
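A reservation table can be written out explicitly, with one row per stage and one column per clock cycle. The sketch below (an illustrative example, not from the text) computes the forbidden latencies of a small non-linear pipeline: the distances between two uses of the same stage, at which initiating a new task would cause a collision.

```python
# A reservation table for a 3-stage non-linear pipeline:
# rows = stages, columns = clock cycles, 'X' = stage in use.
table = [
    "X..X",   # stage 1 used at cycles 0 and 3 (feedback reuse)
    ".X..",   # stage 2 used at cycle 1
    "..X.",   # stage 3 used at cycle 2
]

def forbidden_latencies(table):
    """Latencies at which a new initiation would collide in some stage."""
    forbidden = set()
    for row in table:
        cycles = [t for t, c in enumerate(row) if c == "X"]
        # Any distance between two marks in the same row is forbidden.
        for i in range(len(cycles)):
            for j in range(i + 1, len(cycles)):
                forbidden.add(cycles[j] - cycles[i])
    return sorted(forbidden)

print(forbidden_latencies(table))  # [3]
```

A linear pipeline has exactly one 'X' per row, so its set of forbidden latencies is empty; that is what makes its reservation table "trivial".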
Coherence mechanisms
The two most common mechanisms for ensuring coherency
are snooping and directory-based protocols, each having its own
benefits and drawbacks. Snooping-based protocols tend to be faster, if
enough bandwidth is available, since all transactions are a
request/response seen by all processors. The drawback is that
snooping isn't scalable. Every request must be broadcast to all nodes
in a system, meaning that as the system gets larger, the size of the
(logical or physical) bus and the bandwidth it provides must grow.
Directories, on the other hand, tend to have longer latencies (with a
3 hop request/forward/respond) but use much less bandwidth
since messages are point to point and not broadcast. For this reason,
many of the larger systems (>64 processors) use this type of cache
coherence.
Snooping: First introduced in 1983,[7] snooping is a process
where the individual caches monitor address lines for accesses
to memory locations that they have cached.[4] The write-
invalidate protocols and write-update protocols make use of this
mechanism.
For the snooping mechanism, a snoop filter reduces the snooping
traffic by maintaining a plurality of entries, each representing a
cache line that may be owned by one or more nodes. When
replacement of one of the entries is required, the snoop filter selects
for the replacement the entry representing the cache line or lines
owned by the fewest nodes, as determined from a presence vector in
each of the entries. A temporal or other type of algorithm is used to
refine the selection if more than one cache line is owned by the
fewest nodes.
Directory-based: In a directory-based system, the data
being shared is placed in a common directory that maintains
the coherence between caches. The directory acts as a filter
through which the processor must ask permission to load an
entry from the primary memory to its cache. When an entry is
changed, the directory either updates or invalidates the other
caches with that entry.
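The directory entry described above, including the presence vector used by snoop filters and directories alike, can be sketched as follows. This is an illustrative model (names and structure are assumptions, not a real protocol implementation):

```python
# Sketch of a directory entry per memory block: a presence bit vector
# records which caches hold a copy, so a write can send point-to-point
# invalidations to exactly those sharers instead of broadcasting.

class DirectoryEntry:
    def __init__(self, num_caches):
        self.presence = [False] * num_caches  # which caches have a copy
        self.dirty = False

    def read_miss(self, cache_id):
        # Loading a clean copy: just record the new sharer.
        self.presence[cache_id] = True

    def write(self, cache_id):
        # Invalidate every other sharer, then mark the writer as owner.
        invalidated = [i for i, p in enumerate(self.presence)
                       if p and i != cache_id]
        self.presence = [i == cache_id for i in range(len(self.presence))]
        self.dirty = True
        return invalidated  # list of caches sent an invalidation

entry = DirectoryEntry(4)
entry.read_miss(0)
entry.read_miss(2)
print(entry.write(2))  # [0]: only cache 0 needs an invalidation message
```

Because the directory knows exactly which caches share the block, the traffic stays point-to-point, which is the bandwidth advantage over snooping noted above.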
Distributed shared memory systems mimic these mechanisms in
an attempt to maintain consistency between blocks of memory in
loosely coupled systems.[9]
Coherence protocols
Write-update and write-invalidate:
When a write operation is observed to a location that a cache
has a copy of, the cache controller updates its own copy of the
snooped memory location with the new data.
If the protocol design states that whenever any copy of the
shared data is changed, all the other copies must be "updated"
to reflect the change, then it is a write-update protocol. If the
design states that a write to a cached copy by any processor
requires other processors to discard or invalidate their cached
copies, then it is a write-invalidate protocol.
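The contrast between the two policies can be sketched with caches modeled as address-to-value maps (an illustrative toy model, not a full protocol with states):

```python
# Sketch: on a write, an invalidate protocol discards other cached
# copies, while an update protocol pushes the new value into them.

def snoop_write(caches, writer, addr, value, policy):
    """caches: list of dicts mapping address -> value, one per processor."""
    caches[writer][addr] = value
    for cid, cache in enumerate(caches):
        if cid != writer and addr in cache:
            if policy == "invalidate":
                del cache[addr]          # other copies become invalid
            else:  # "update"
                cache[addr] = value      # other copies get the new data

caches = [{0x10: 5}, {0x10: 5}, {}]
snoop_write(caches, 0, 0x10, 9, "invalidate")
print(caches)  # [{16: 9}, {}, {}]: cache 1 must re-fetch on next read
```

Under write-update, cache 1 would instead keep a copy with the new value 9; either way every cache sees a consistent view, which is the point of the protocol.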
However, scalability is one shortcoming of broadcast
protocols.
Various models and protocols have been devised for
maintaining coherence, such as MSI, MESI (aka
Illinois), MOSI, MOESI, MERSI, MESIF, write-once, Synapse,
Berkeley, Firefly and Dragon protocols. In 2011, ARM
Ltd proposed the AMBA 4 ACE specification for handling coherency in SoCs.