Computer Architecture
CT-3221
By: Solomon S.
TEXT BOOK:
Computer Organization and Architecture: Designing for Performance,
6th, 7th, 8th, or 9th Ed.
– William Stallings
CHAPTER 1
INTRODUCTION TO COMPUTER ARCHITECTURE
Computer architecture and computer organization
In describing computers, a distinction is often made between
computer architecture and computer organization, although it is
difficult to give precise definitions for these terms.
Computer architecture
It refers to those attributes of a system visible to a
programmer or, put another way, those attributes that
have a direct impact on the logical execution of a
program.
Examples of architectural attributes include the
instruction set, the number of bits used to represent
various data types (e.g., numbers, characters), I/O
mechanisms, and techniques for addressing memory.
Cont..
Computer organization
It refers to the operational units and their
interconnections that realize the architectural
specifications.
Examples of Organizational attributes include those
hardware details transparent to the programmer, such
as
control signals;
interfaces between the computer and peripherals;
and the memory technology used.
Brief History of Computers
KEY POINT :- The evolution of computers has been characterized by increasing
processor speed, decreasing component size, increasing memory size, and
increasing I/O capacity and speed.
1. ENIAC
• Electronic Numerical Integrator And Computer
• Programmed manually by switches
• Eckert and Mauchly
• Decimal (not binary)
• 20 accumulators of 10 digits
• 18,000 vacuum tubes
• 15,000 square feet
• 140 kW power consumption
• 5,000 additions per second
Drawback of ENIAC: It had to be programmed manually by
setting switches and plugging and unplugging cables.
Cont…
2. Von Neumann/Turing
• Stored Program concept
• Main memory storing programs and data
• ALU operating on binary data
• Control unit interpreting instructions from memory and
executing
• Input and output equipment operated by control unit
• Princeton Institute for Advanced Studies
- IAS
Structure of von Neumann machine
It consists of:
A main memory, which stores both data and instructions
An arithmetic and logic unit (ALU) capable of operating on binary data
A control unit, which interprets the instructions in memory and
causes them to be executed
Input and output (I/O) equipment operated by the control unit
(Figure: Structure of the IAS computer)
IAS - details
• 1000 x 40 bit words
—Binary number
—2 x 20 bit instructions
• Set of registers (storage in CPU)
—Memory Buffer Register
—Memory Address Register
—Instruction Register
—Instruction Buffer Register
—Program Counter
—Accumulator
—Multiplier Quotient
Structure of IAS – detail
Cont…
3. Transistors
• Replaced vacuum tubes
• Smaller
• Cheaper
• Less heat dissipation
• Made from Silicon (Sand)
• Second generation machines
Cont…
4. Microelectronics
• Literally - “small electronics”
• A computer is made up of gates, memory cells
and interconnections
• These can be manufactured on a
semiconductor
• e.g. silicon wafer
Generations of Computers - Summary I
First generation (1946 - 1957)
Vacuum tubes were larger components and resulted in
first generation computers being quite large in size, taking
up a lot of space in a room.
Second generation (1958 - 1964)
Transistors were smaller than vacuum tubes and allowed
computers to be smaller in size, faster in speed, and
cheaper to build.
Third generation (1964 - 1971)
Using ICs in computers helped reduce the size of
computers even more compared to second-generation
computers, as well as make them faster.
Cont…
Fourth generation (1972 - 2010)
Microprocessors, along with integrated circuits,
helped make it possible for computers to fit easily
on a desk and for the introduction of the laptop.
Fifth generation (2010 to present)
This generation is based on AI (artificial intelligence), an
exciting technology that has many potential applications around
the world.
Generations of Computers - Summary II
Vacuum tube - 1946-1957
Transistor - 1958-1964
Small scale integration - 1965 on
Up to 100 devices on a chip
Medium scale integration - to 1971
100-3,000 devices on a chip
Large scale integration - 1971-1977
3,000 - 100,000 devices on a chip
Very large scale integration - 1978 -1991
100,000 - 100,000,000 devices on a chip
Ultra large scale integration – 1991 -
Over 100,000,000 devices on a chip
Moore’s Law
Growth in CPU Transistor Count
Hardware/Software/Firmware
Hardware
Hardware is physical. It is "real," sometimes breaks, and
eventually wears out.
Software
Software is virtual. It can be copied, changed, and destroyed.
Firmware
Firmware is also virtual. It is software specifically designed
for a piece of hardware.
Firmware is a software program permanently etched into a
hardware device such as a keyboard, hard drive, BIOS, or
video card.
Basics of Computer Architecture
The main components in a typical (happening in the usual way)
computer system are the processor, memory, input/output devices,
and the communication channels that connect them.
Processor
The processor is the workhorse (dependable person who does a lot
of work) of the system; it is the component that executes a
program by performing arithmetic and logical operations on data.
In a typical system there will be only one processor, known as
the central processing unit, or CPU.
Modern high-performance systems, for example vector and parallel
machines, may contain more than one processor.
Cont…
Memory
Memory is a passive component that simply stores information
(data or instructions) until it is requested by another part of
the system.
During normal operations it feeds instructions and data to
the processor, and at other times it is the source or
destination of data transferred by I/O devices.
Information in memory is accessed by its address.
Cont…
Input/output (I/O) devices
The term I/O describes any program, operation, or device that
transfers data between a computer and a peripheral device.
Input/output (I/O) devices transfer information, without
altering it, between the external world and one or more
internal components.
I/O devices can be secondary memories, for example disks and
tapes, or devices used to communicate directly with users,
such as video displays, keyboards, and mice.
Cont…
Communication channels
The communication channels tie the system together.
They can either be simple links that connect two devices or
more complex switches that interconnect several components and
allow any two of them to communicate at a given point in time.
When a switch is configured to allow two devices to exchange
information, all other devices that rely on the switch are
blocked, i.e. they must wait until the switch can be
reconfigured.
A stored-program computer
A computer with a von Neumann architecture stores
program and data in the same memory;
A computer with a Harvard architecture has separate
memories for storing program and data.
Stored-program computer is sometimes used as a synonym
for von Neumann architecture.
The von Neumann architecture
Since you cannot access program memory and data memory
simultaneously, the von Neumann architecture is susceptible
(easily affected or influenced) to bottlenecks, and system
performance is affected.
Cont…
The Harvard architecture
In this case, there are at least two memory address
spaces to work with, so there is a memory register
for machine instructions and another memory
register for data.
Computers designed with the Harvard architecture
are able to run a program and access data
independently, and therefore simultaneously.
Harvard architecture has a strict separation
between data and code.
Cont…
Thus, Harvard architecture is more complicated but separate
pipelines remove the bottleneck that Von Neumann creates.
Computer Structures
Structure: The way in which the components are
interrelated.
Accumulator based machines
Stack machine
General register machines
I. Accumulator based machines
An accumulator machine, also called a 1-operand machine, or a
CPU with accumulator-based architecture, is a kind of CPU in
which, although it may have several registers, the CPU mostly
stores the results of calculations in one special register,
typically called "the accumulator".
Cont…
Accumulator machines have a sharply limited number of data
accumulators (most of the time one Acc.), plus additional
address registers within the CPU.
The Acc. serves both as a source of one operand and as the
destination for arithmetic operations, as illustrated in the
sketch below.
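A minimal sketch (for illustration only; the instruction names,
memory contents, and the Python modelling are assumptions, not
taken from the slides) of how a hypothetical one-operand
accumulator machine computes Z = X + Y:
```python
# Minimal sketch of a hypothetical 1-operand (accumulator) machine.
# Each instruction names only a memory address; the accumulator is
# the implicit second operand and the implicit destination.

memory = {"X": 5, "Y": 7, "Z": 0}

def run(program):
    acc = 0                      # the single accumulator register
    for op, addr in program:
        if op == "LOAD":         # ACC <- M[addr]
            acc = memory[addr]
        elif op == "ADD":        # ACC <- ACC + M[addr]
            acc += memory[addr]
        elif op == "STORE":      # M[addr] <- ACC
            memory[addr] = acc

# Z = X + Y, written as one-address code
run([("LOAD", "X"), ("ADD", "Y"), ("STORE", "Z")])
print(memory["Z"])               # -> 12
```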
Cont…
II. Stack machine
The CPU registers are organized using a LIFO (last-in,
first-out) technique.
Operands are "PUSHed" onto the stack from memory and "POPped"
off the stack in reverse order.
Arithmetic operations remove their operands from the top of the
stack, and the result is placed back on the stack, replacing
the operands.
When expression evaluation is completed, the result is "POPped"
into a memory location to complete the process.
i.e. no operand addresses need to be specified during
arithmetic operations.
Hence it is referred to as a 0-address machine; see the sketch
below.
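A minimal sketch of zero-address evaluation (the instruction
names, memory contents, and the Python modelling are assumptions
for illustration, not from the slides): operands are pushed from
memory, arithmetic instructions pop their operands from the top
of the stack, and the final result is popped back to memory.
```python
# Minimal sketch of a hypothetical 0-address (stack) machine.
# Arithmetic instructions carry no operand addresses: they pop the
# top two stack entries and push the result back onto the stack.

def run(program, memory):
    stack = []
    for instr in program:
        op = instr[0]
        if op == "PUSH":                 # push M[addr] onto the stack
            stack.append(memory[instr[1]])
        elif op == "POP":                # pop the top of stack into M[addr]
            memory[instr[1]] = stack.pop()
        elif op == "ADD":                # pop two operands, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":                # pop two operands, push their product
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)

# Evaluate R = (X + Y) * Z with zero-address arithmetic
mem = {"X": 2, "Y": 3, "Z": 4, "R": 0}
run([("PUSH", "X"), ("PUSH", "Y"), ("ADD",),
     ("PUSH", "Z"), ("MUL",), ("POP", "R")], mem)
print(mem["R"])                          # -> 20
```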
Cont…
III. General register machine
These machines have a set of numbered registers within the CPU,
e.g. A, B, C, D, E, …
Unlike in accumulator machines and stack machines, the
registers in a general register machine can be used for almost
any purpose.
All modern machines have a set of general-purpose registers.
Integer registers hold both addresses and data; floating-point
registers hold only data. A sketch follows below.
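For comparison, the same computation on a hypothetical
general-register (three-address) machine; the register names and
instruction format are assumptions for illustration, not taken
from the slides:
```python
# Minimal sketch of a hypothetical general-register (3-address) machine:
# any named register can serve as a source or as the destination.

regs = {"A": 0, "B": 0, "C": 0, "D": 0}
memory = {"X": 2, "Y": 3, "Z": 4, "R": 0}

def run(program):
    for instr in program:
        op = instr[0]
        if op == "LOAD":                   # LOAD  Rd, addr : Rd <- M[addr]
            regs[instr[1]] = memory[instr[2]]
        elif op == "STORE":                # STORE Rs, addr : M[addr] <- Rs
            memory[instr[2]] = regs[instr[1]]
        elif op == "ADD":                  # ADD   Rd, Rs1, Rs2
            regs[instr[1]] = regs[instr[2]] + regs[instr[3]]
        elif op == "MUL":                  # MUL   Rd, Rs1, Rs2
            regs[instr[1]] = regs[instr[2]] * regs[instr[3]]

# R = (X + Y) * Z using explicitly named registers
run([("LOAD", "A", "X"), ("LOAD", "B", "Y"), ("ADD", "C", "A", "B"),
     ("LOAD", "D", "Z"), ("MUL", "C", "C", "D"), ("STORE", "C", "R")])
print(memory["R"])                         # -> 20
```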
Designing for performance Idea
The speed of your processor (central processing unit or CPU), the quantity and speed of your memory
(random access memory or RAM), and the capacity and performance of your hard disk are all significant.
Other hardware factors play a part in determining the speed of your computer.
The Processor Performance Equation
Essentially all computers are constructed using a clock
running at a constant rate.
These discrete time events are called ticks, clock ticks,
clock periods, clocks, cycles, or clock cycles.
Computer designers refer to the time of a clock period by
its duration (e.g., 1 ns) or by its rate (e.g., 1 GHz).
CPU time for a program can then be expressed in two ways:
1. CPU time = CPU clock cycles for a program x Clock cycle time
2. CPU time = CPU clock cycles for a program / Clock rate
Cont…
Processor performance depends on the following factors:
Instruction count,
clock cycles per instruction (CPI), and
clock cycle time.
Combining them gives the processor performance equation:
CPU time = Instruction count x CPI x Clock cycle time
N.B. An n% improvement in any one of these factors gives an n%
improvement in CPU time.
Cont…
o As this formula demonstrates, processor performance is
dependent upon three characteristics:
clock cycle (or rate),
clock cycles per instruction, and
instruction count.
o Furthermore, CPU time is equally dependent on these
three characteristics:
A 10% improvement in any one of them leads to a 10%
improvement in CPU time.
CPU time Example
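The original example slide is not reproduced here; the numbers
below are a purely hypothetical illustration of
CPU time = Instruction count x CPI x Clock cycle time.
```python
# Hypothetical numbers, for illustration only (not the slide's example).
instruction_count = 2_000_000_000   # 2 x 10^9 instructions executed
cpi = 1.5                           # average clock cycles per instruction
clock_rate = 1_000_000_000          # 1 GHz, i.e. a 1 ns clock cycle time

clock_cycles = instruction_count * cpi   # 3 x 10^9 clock cycles
cpu_time = clock_cycles / clock_rate     # same as IC x CPI x cycle time
print(cpu_time)                          # -> 3.0 seconds
```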
Cont…
Speedup (Overall Speedup) Equation
Suppose that we can make an enhancement to a computer that will
improve performance when it is used.
Speedup tells us how much faster a task will run using the
computer with the enhancement as opposed to the original
computer.
Speedup is the ratio:
SU = Performance for entire task using the enhancement when
possible / Performance for entire task without using the
enhancement
Alternatively,
SU = Execution time for entire task without using the
enhancement / Execution time for entire task using the
enhancement when possible
Cont…
Amdahl's Law
o Aim - simply to calculate the performance gain that can be
obtained by improving some portion of a computer.
Reflection of Amdahl's Law
Amdahl's Law gives us a quick way to find the speedup from some
enhancement, which depends on two factors:
1) The fraction of the computation time in the original
computer that can be converted to take advantage of the
enhancement.
Example. If 20 seconds of the execution time of a program that takes
60 seconds in total can use an enhancement, the fraction is 20/60. This
value, which is called Fraction_enhanced (FRenh), is always less than
or equal to 1.
Cont…
2) The improvement gained by the enhanced
execution mode; that is, how much faster the task
would run if the enhanced mode were used for the
entire program. This value, called Speedup_enhanced (SUenh),
is the time of the original mode over the time of the
enhanced mode.
Overall Speedup Example
Cont…
Solution
SUoverall = 1 / ((1 - FRenh) + FRenh / SUenh)
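A small sketch of the overall-speedup formula in Python; the
fraction 20/60 comes from the earlier example, while the 10x
enhanced-mode speedup is assumed purely for illustration.
```python
# Overall speedup from Amdahl's Law:
#   SU_overall = 1 / ((1 - FR_enh) + FR_enh / SU_enh)

def overall_speedup(fraction_enhanced, speedup_enhanced):
    return 1.0 / ((1.0 - fraction_enhanced)
                  + fraction_enhanced / speedup_enhanced)

# 20 s of a 60 s program can use the enhancement (FR_enh = 20/60);
# the enhanced mode is assumed (hypothetically) to be 10x faster.
print(overall_speedup(20 / 60, 10))      # -> about 1.43
```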
Cont…
AMAT Equation
AMAT stands for Average Memory Access Time. It refers
to the time necessary to perform a memory access on
average.
The AMAT of a simple system with only a single level of
cache may be calculated as:
AMAT = Hit time + Miss rate x Miss penalty
Where,
1. Hit time - the time to access data in the cache; the hope is
that this is less than about 1 nanosecond.
Cont…
2. Miss penalty - the time to replace the block from memory
(that is, the cost of a miss).
3. Miss rate - simply the fraction of cache accesses that
result in a miss.
Example of AMAT
Question:
Assume that the hit time is 1 cycle, the miss rate is 5% for an
8 KB data cache, and the miss penalty is 20 cycles. What is the
average memory access time (AMAT)?
Solution
Average memory access time = Hit time + Miss rate x Miss penalty
= 1 + 0.05 x 20 = 2 cycles
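The same calculation as a small Python sketch, using the numbers
from the question above:
```python
# AMAT = Hit time + Miss rate x Miss penalty
hit_time = 1        # cycles to access data that hits in the cache
miss_rate = 0.05    # 5% of accesses to the 8 KB data cache miss
miss_penalty = 20   # cycles to fetch the block from memory on a miss

amat = hit_time + miss_rate * miss_penalty
print(amat)         # -> 2.0 cycles
```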
Pentium and PowerPC Evolution
1. Pentium Evolution
Definition of Pentium. A family of 32- and 64-bit x86-based CPU chips from Intel. The term
may refer to the chip or to a PC that uses it. During their reign, Pentium chips were the most
widely used CPUs in the world for general-purpose computing.
Cont…
2. PowerPC Evolution