
CS6303-COMPUTER ARCHITECTURE

UNIT I OVERVIEW & INSTRUCTIONS


Eight ideas - Components of a computer system - Technology - Performance - Power wall -
Uniprocessors to multiprocessors; Instructions - operations and operands - representing
instructions - Logical operations - control operations - Addressing and addressing modes.
Classes of Computing Applications and Their Characteristics

Personal computers (PCs)
Personal computers emphasize delivery of good performance to single users at low cost and
usually execute third-party software. A computer designed for use by an individual, usually
incorporating a graphics display, a keyboard, and a mouse.
Servers
A computer used for running larger programs for multiple users, often simultaneously, and
typically accessed only via a network. Servers are the modern form of what were once much
larger computers. Servers are oriented to carrying large workloads, which may consist of either
single complex applications (usually a scientific or engineering application) or many small jobs,
such as would occur in building a large web server.
Low-end servers are typically used for file storage, small business applications, or simple
web serving. At the other extreme are supercomputers, which at the present consist of tens of
thousands of processors and many terabytes of memory, and cost tens to hundreds of millions
of dollars.
Applications of Supercomputers:
High-end scientific and engineering calculations, such as weather forecasting, oil exploration,
protein structure determination, and other large-scale problems.
Embedded computers
A computer inside another device used for running one predetermined application or collection
of software.
----------------------------------------------------------------
EIGHT GREAT IDEAS IN COMPUTER ARCHITECTURE
1. Design for Moore's Law
The number of transistors incorporated in a chip will approximately double every 24 months.
Moore's Law resulted from a 1965 prediction of such growth in IC capacity made by Gordon
Moore, co-founder of Intel.
2. Use Abstraction to Simplify Design
A major productivity technique for hardware and software is to use abstractions to represent the
design at different levels of representation; lower-level details are hidden to offer a simpler
model at higher levels.
3. Make the common case fast
Making the common case fast will tend to enhance performance better than optimizing the rare
case. Ironically, the common case is often simpler than the rare case and hence is often easier to
enhance.
4. Performance via Parallelism
Computer architects have offered designs that get more performance by performing operations in
parallel.
Parallel Requests: assigned to a computer, e.g., search "Garcia"
Parallel Threads: assigned to a core, e.g., lookup, ads
Parallel Instructions: >1 instruction at one time, e.g., 5 pipelined instructions
Parallel Data: >1 data item at one time, e.g., add of 4 pairs of words
5. Performance via Pipelining
Pipelining is an implementation technique in which multiple instructions are overlapped in
execution. The computer pipeline is divided into stages. Each stage completes a part of an
instruction in parallel. The stages are connected one to the next to form a pipe: instructions enter
at one end, progress through the stages, and exit at the other end.
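A minimal timing sketch of that overlap, assuming one clock cycle per stage and no stalls (the
5-stage, 1000-instruction figures are illustrative, not from this section):

    def unpipelined_cycles(n_instructions, n_stages):
        # Each instruction passes through every stage before the next begins.
        return n_instructions * n_stages

    def pipelined_cycles(n_instructions, n_stages):
        # The first instruction fills the pipe; afterward one finishes per cycle.
        return n_stages + (n_instructions - 1)

    print(unpipelined_cycles(1000, 5))   # 5000 cycles
    print(pipelined_cycles(1000, 5))     # 1004 cycles; speedup approaches n_stages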
6. Performance via Prediction
In some cases it can be faster on average to guess and start working rather than wait until you
know for sure, assuming that the mechanism to recover from a misprediction is not too expensive
and your prediction is relatively accurate.
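A quick expected-value check of that claim; the accuracy and penalty figures below are
illustrative assumptions only:

    def avg_cycles_with_prediction(accuracy, hit_cost, mispredict_penalty):
        # Expected cycles per guess: cheap when right, penalized when wrong.
        return accuracy * hit_cost + (1 - accuracy) * (hit_cost + mispredict_penalty)

    # 90% accurate guesses costing 1 cycle, with a 15-cycle recovery penalty:
    print(avg_cycles_with_prediction(0.90, 1, 15))   # 2.5 cycles on average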
7. Hierarchy of memories
Programmers want memory to be fast, large, and cheap, as memory speed often shapes
performance, capacity limits the size of problems that can be solved, and the cost of memory
today is often the majority of computer cost. Architects have found that they can address these
conflicting demands with a hierarchy of memories, with the fastest, smallest, and most expensive
memory per bit at the top of the hierarchy and the slowest, largest, and cheapest per bit at the
bottom. Caches give the programmer the illusion that main memory is nearly as fast as the top of
the hierarchy and nearly as big and cheap as the bottom of the hierarchy. A layered triangle icon
is used to represent the memory hierarchy. The shape indicates speed, cost, and size: the closer to
the top, the faster and more expensive per bit the memory; the wider the base of the layer, the
bigger the memory.
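One way to see the illusion quantitatively is the standard average-memory-access-time formula
(a supplementary sketch; the formula and numbers are not from this section):

    def amat(hit_time_ns, miss_rate, miss_penalty_ns):
        # Average access time: usually the fast level, occasionally the slow one.
        return hit_time_ns + miss_rate * miss_penalty_ns

    # A 1 ns cache hit, 5% miss rate, and 100 ns main-memory penalty:
    print(amat(1.0, 0.05, 100.0))   # 6.0 ns, far closer to cache speed than memory speed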
8. Dependability via Redundancy
Computers not only need to be fast; they need to be dependable. Since any physical
device can fail, we make systems dependable by including redundant components that
can take over when a failure occurs and help detect failures.
----------------------------------------------------------------
Components of a computer system
The underlying hardware in any computer performs the same basic functions: inputting
data, outputting data, processing data, and storing data.
Input device
A mechanism through which the computer is fed information, such as a keyboard.
Output device
A mechanism that conveys the result of a computation to a user, such as a display, or to another
computer.
Central processing unit (CPU)
Also called the processor. The active part of the computer, which contains the datapath and
control and which adds numbers, tests numbers, signals I/O devices to activate, and so on.
Datapath
The component of the processor that performs arithmetic operations.
Control
The component of the processor that commands the datapath, memory, and I/O
devices according to the instructions of the program.
Memory
The storage area in which programs are kept when they are running and that contains the data
needed by the running programs.
The memory is where the programs are kept when they are running; it also contains the data
needed by the running programs. The memory is built from DRAM chips. DRAM stands for
dynamic random access memory. Multiple DRAMs are used together to contain the
instructions and data of a program. In contrast to sequential access memories, such as magnetic
tapes, the RAM portion of the term DRAM means that memory accesses take basically the same
amount of time no matter what portion of the memory is read.
Dynamic random access memory (DRAM)
Memory built as an integrated circuit; it provides random access to any location. Access times
are 50 nanoseconds, and the cost per gigabyte in 2012 was $5 to $10.
----------------------------------------------------------------
Hierarchica a+ers !" har&3are an& s!"t3are
>igure shows that the layers of software are organied primarily in a hierarchical fashion% with
applications $eing the outermost ring and a !ariety of systems software sitting $etween the
hardware and applications software.
System software: Software that provides services that are commonly useful, including operating
systems, compilers, loaders, and assemblers.
There are many types of systems software, but two types are central to every computer
system today: an operating system and a compiler.
An operating system interfaces between a user's program and the hardware and provides a
variety of services and supervisory functions. Among the most important functions are:
Handling basic input and output operations
Allocating storage and memory
Providing for protected sharing of the computer among multiple applications using it
simultaneously.
Examples of operating systems in use today are Linux, iOS, and Windows.
Compiler: A program that translates high-level language statements into assembly language
statements.
Instruction: A command that computer hardware understands and obeys.
Assembler: A program that translates a symbolic version of instructions into the binary version.
Assembly language: A symbolic representation of machine instructions.
Machine language: A binary representation of machine instructions.
----------------------------------------------------------------
Technologies for Building Processors and Memory
Processors and memory have improved at an incredible rate, because computer designers have
long embraced the latest in electronic technology to try to win the race to design a better
computer.
A transistor is simply an on/off switch controlled by electricity. The integrated circuit (IC)
combined dozens to hundreds of transistors into a single chip. To describe the tremendous
increase in the number of transistors from hundreds to millions, the adjective very large scale is
added to the term, creating the abbreviation VLSI, for very large-scale integrated circuit.
The process starts with a silicon crystal ingot, which looks like a giant sausage. An ingot is finely
sliced into wafers no more than 0.1 inches thick. These wafers then go through a series of
processing steps, during which patterns of chemicals are placed on each wafer, creating the
transistors, conductors, and insulators.
Manufacturing Process of Integrated Circuits
The patterned wafer is then chopped up, or diced, into components called dies, more
informally known as chips.
Dicing enables you to discard only those dies that were unlucky enough to contain the flaws,
rather than the whole wafer. This concept is quantified by the yield of a process, which is defined
as the percentage of good dies from the total number of dies on the wafer.
The cost of an integrated circuit rises quickly as the die size increases, due both to the lower
yield and to the smaller number of dies that fit on a wafer. To reduce the cost, a large die is
shrunk by using the next-generation process, which uses smaller sizes for both transistors and
wires. This improves the yield and the die count per wafer.
The cost of an IC can be expressed in three simple equations:
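Cost per die = Cost per wafer / (Dies per wafer x Yield)
Dies per wafer ≈ Wafer area / Die area
Yield = 1 / (1 + (Defects per area x Die area / 2))^2

A small sketch that evaluates these equations; the wafer cost, areas, and defect density below are
made-up inputs, illustrative only:

    def die_yield(defects_per_area, die_area):
        # Yield model from the equations above.
        return 1.0 / (1.0 + defects_per_area * die_area / 2.0) ** 2

    def cost_per_die(cost_per_wafer, wafer_area, die_area, defects_per_area):
        dies_per_wafer = wafer_area / die_area   # first-order estimate; ignores edge loss
        return cost_per_wafer / (dies_per_wafer * die_yield(defects_per_area, die_area))

    # A $5000, 300 mm wafer (~70,685 mm^2), 100 mm^2 dies, 0.03 defects per mm^2:
    print(cost_per_die(5000.0, 70685.0, 100.0, 0.03))   # about $44 per good die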
----------------------------------------------------------------
Performance
Accurately measuring and comparing the performance of different computers is critical.
Performance can be determined in different ways.
Response time: Also called execution time. The total time required for the computer to
complete a task, including disk accesses, memory accesses, I/O activities, operating system
overhead, CPU execution time, and so on.
Datacenter managers are often interested in increasing throughput or bandwidth: the total
amount of work done in a given time.
Throughput and Response Time
Do the following changes to a computer system increase throughput, decrease response time, or
both?
1. Replacing the processor in a computer with a faster version
2. Adding additional processors to a system that uses multiple processors
for separate tasks, for example, searching the web
Decreasing response time almost always improves throughput. Hence, in case 1, both response
time and throughput are improved. In case 2, no one task gets work done faster, so only
throughput increases.
Performance of computers is primarily concerned with response time. To maximize performance,
minimize the response time or execution time for some task. Thus, performance and execution
time can be related for a computer X as:
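Performance_X = 1 / Execution time_X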
The performance of two different computers can be related quantitatively: "X is n times faster
than Y", or equivalently "X is n times as fast as Y", means:
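Performance_X / Performance_Y = Execution time_Y / Execution time_X = n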
Measuring Performance
Time is the measure of computer performance: the computer that performs the same
amount of work in the least time is the fastest. Program execution time is measured in seconds
per program.
The most straightforward definition of time is called wall clock time, response time, or elapsed
time. These terms mean the total time to complete a task, including disk accesses, memory
accesses, input/output (I/O) activities, operating system overhead - everything.
CPU execution time: Also called CPU time. The actual time the CPU spends computing
for a specific task.
User CPU time: The CPU time spent in the program itself.
System CPU time: The CPU time spent in the operating system performing tasks on behalf of the
program.
Almost all computers are constructed using a clock that determines when events take place in the
hardware. These discrete time intervals are called clock cycles (or ticks, clock ticks, clock
periods, clocks, cycles). Designers refer to the length of a clock period both as the time for a
complete clock cycle (e.g., 250 picoseconds, or 250 ps) and as the clock rate (e.g., 4 gigahertz, or
4 GHz), which is the inverse of the clock period.
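For example, a 250 ps clock period corresponds to a clock rate of
1 / (250 x 10^-12 s) = 4 x 10^9 Hz = 4 GHz.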
CPU Performance and Its Factors
Users and designers often examine performance using different metrics. A simple formula
relates the most basic metrics (clock cycles and clock cycle time) to CPU time:
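CPU execution time for a program = CPU clock cycles for a program x Clock cycle time
                                 = CPU clock cycles for a program / Clock rate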
This formula makes it clear that the hardware designer can improve performance by reducing the
number of clock cycles required for a program or the length of the clock cycle.
Instruction Performance
The number of clock cycles required for a program can be written as:
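CPU clock cycles = Instructions for a program x Average clock cycles per instruction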
The term clock cycles per instruction, which is the average number of clock cycles each
instruction takes to execute, is often abbreviated as CPI. Since different instructions may take
different amounts of time depending on what they do, CPI is an average over all the instructions
executed in the program. CPI provides one way of comparing two different implementations of
the same instruction set architecture, since the number of instructions executed for a program
will, of course, be the same.
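Combining these pieces gives the classic CPU performance equation:

CPU time = Instruction count x CPI x Clock cycle time
         = (Instruction count x CPI) / Clock rate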
The performance of a program depends on the algorithm, the language, the compiler, the
architecture, and the actual hardware. The following table summarizes how these components
affect the factors in the CPU performance equation.
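Hardware or software component   | Affects what?
---------------------------------+--------------------------------------
Algorithm                        | Instruction count, possibly CPI
Programming language             | Instruction count, CPI
Compiler                         | Instruction count, CPI
Instruction set architecture     | Instruction count, clock rate, CPI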
Amdahl's Law
Amdahl's Law states that the performance improvement to be gained from using some faster
mode of execution is limited by the fraction of the time the faster mode can be used.
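Stated as an equation:

Execution time after improvement =
    (Execution time affected by improvement / Amount of improvement)
    + Execution time unaffected

A minimal sketch of the consequence, assuming a program in which 80% of the execution time
benefits from the enhancement (the numbers are illustrative, not from this section):

    def amdahl_speedup(fraction_enhanced, speedup_enhanced):
        # Overall speedup when only a fraction of execution time is sped up.
        return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

    print(amdahl_speedup(0.80, 10.0))   # ~3.57x overall from a 10x enhancement
    print(amdahl_speedup(0.80, 1e9))    # approaches the 1 / (1 - 0.80) = 5x ceiling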
The Power wall
Both clock rate and power increased rapidly and grew together since they are correlated. Battery
life can trump performance in the personal mobile device, and the architects of warehouse-scale
computers try to reduce the costs of powering and cooling 100,000 servers, as the costs are high
at this scale. Just as measuring time in seconds is a safer measure of program performance than a
rate like MIPS, the energy metric joules is a better measure than a power rate like watts, which is
just joules/second.
The dominant technology for integrated circuits is called CMOS (complementary metal oxide
semiconductor). For CMOS, the primary source of energy consumption is so-called dynamic
energy, that is, energy that is consumed when transistors switch states from 0 to 1 and vice
versa. The dynamic energy depends on the capacitive loading of each transistor and the voltage
applied:
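Energy ∝ Capacitive load x Voltage^2

This is the energy of a 0 -> 1 -> 0 switching cycle; a single transition takes half that,
Energy ∝ 1/2 x Capacitive load x Voltage^2. The power required per transistor is the energy of a
transition multiplied by the frequency of transitions:

Power ∝ 1/2 x Capacitive load x Voltage^2 x Frequency switched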
Frequency switched is a function of the clock rate. The capacitive load per transistor is a
function of both the number of transistors connected to an output (called the fanout) and the
technology, which determines the capacitance of both wires and transistors.
----------------------------------------------------------------
The switch from Uniprocessors to Multiprocessors
Reasons for switching from single-core processors to multicore processors:
It is difficult to make single-core clock frequencies even higher.
Deeply pipelined circuits bring:
  heat problems
  speed-of-light problems
  difficult design and verification
  large design teams necessary
  server farms need expensive air-conditioning
Many new applications are multithreaded.
General trend in computer architecture: a shift toward more parallelism.
