1) The document discusses key concepts in computer architecture including components of a computer system, technologies for building processors and memory, and hierarchical layers of hardware and software.
2) It describes different types of computer applications like personal computers, servers, supercomputers, and embedded computers.
3) Eight great ideas in computer architecture are outlined, focusing on techniques like pipelining, parallelism, and memory hierarchies to improve performance.
Eight ideas; Components of a computer system; Technology; Performance; Power wall; Uniprocessors to multiprocessors; Instructions: operations and operands; Representing instructions; Logical operations; Control operations; Addressing and addressing modes.

Machine Structures

Classes of Computing Applications and Their Characteristics

Personal computers (PCs)
A computer designed for use by an individual, usually incorporating a graphics display, a keyboard, and a mouse. Personal computers emphasize delivery of good performance to single users at low cost and usually execute third-party software.

Servers
A computer used for running larger programs for multiple users, often simultaneously, and typically accessed only via a network. Servers are the modern form of what were once much larger computers. Servers are oriented to carrying large workloads, which may consist of either single complex applications (usually a scientific or engineering application) or handling many small jobs, such as would occur in building a large web server. Low-end servers are typically used for file storage, small business applications, or simple web serving.

Supercomputers
At the other extreme are supercomputers, which at present consist of tens of thousands of processors and many terabytes of memory, and cost tens to hundreds of millions of dollars.
Applications of supercomputers: high-end scientific and engineering calculations, such as weather forecasting, oil exploration, protein structure determination, and other large-scale problems.

Embedded computers
A computer inside another device used for running one predetermined application or collection of software.

EIGHT GREAT IDEAS IN COMPUTER ARCHITECTURE

1. Design for Moore's Law
The number of transistors incorporated in a chip will approximately double every 24 months. Moore's Law resulted from a 1965 prediction of such growth in IC capacity made by Gordon Moore, co-founder of Intel.

2. Use Abstractions to Simplify Design
A major productivity technique for hardware and software is to use abstractions to represent the design at different levels of representation; lower-level details are hidden to offer a simpler model at higher levels.

3. Make the Common Case Fast
Making the common case fast will tend to enhance performance better than optimizing the rare case. Ironically, the common case is often simpler than the rare case and hence is often easier to enhance.

4. Performance via Parallelism
Computer architects have offered designs that get more performance by performing operations in parallel.
- Parallel requests: assigned to a computer, e.g., search "Garcia"
- Parallel threads: assigned to a core, e.g., lookup, ads
- Parallel instructions: more than one instruction at one time, e.g., 5 pipelined instructions
- Parallel data: more than one data item at one time, e.g., add of 4 pairs of words

5. Performance via Pipelining
Pipelining is an implementation technique where multiple instructions are overlapped in execution. The computer pipeline is divided into stages. Each stage completes a part of an instruction in parallel. The stages are connected one to the next to form a pipe: instructions enter at one end, progress through the stages, and exit at the other end.

6. Performance via Prediction
In some cases it can be faster on average to guess and start working rather than wait until you know for sure, assuming that the mechanism to recover from a misprediction is not too expensive and your prediction is relatively accurate.

7. Hierarchy of Memories
Programmers want memory to be fast, large, and cheap, as memory speed often shapes performance, capacity limits the size of problems that can be solved, and the cost of memory today is often the majority of computer cost.
Architects have found that they can address these conflicting demands with a hierarchy of memories, with the fastest, smallest, and most expensive memory per bit at the top of the hierarchy and the slowest, largest, and cheapest per bit at the bottom. Caches give the programmer the illusion that main memory is nearly as fast as the top of the hierarchy and nearly as big and cheap as the bottom of the hierarchy. A layered triangle icon is used to represent the memory hierarchy. The shape indicates speed, cost, and size: the closer to the top, the faster and more expensive per bit the memory; the wider the base of the layer, the bigger the memory.

8. Dependability via Redundancy
Computers not only need to be fast; they need to be dependable. Since any physical device can fail, we make systems dependable by including redundant components that can take over when a failure occurs and that help detect failures.

Components of a computer system

The underlying hardware in any computer performs the same basic functions: inputting data, outputting data, processing data, and storing data.

Input device: A mechanism through which the computer is fed information, such as a keyboard.
Output device: A mechanism that conveys the result of a computation to a user, such as a display, or to another computer.
Central processor unit (CPU): Also called processor. The active part of the computer, which contains the datapath and control and which adds numbers, tests numbers, signals I/O devices to activate, and so on.
Datapath: The component of the processor that performs arithmetic operations.
Control: The component of the processor that commands the datapath, memory, and I/O devices according to the instructions of the program.
Memory: The storage area in which programs are kept when they are running and that contains the data needed by the running programs.
The memory is where the programs are kept when they are running; it also contains the data needed by the running programs. The memory is built from DRAM chips. DRAM stands for dynamic random access memory. Multiple DRAMs are used together to contain the instructions and data of a program. In contrast to sequential access memories, such as magnetic tapes, the RAM portion of the term DRAM means that memory accesses take basically the same amount of time no matter what portion of the memory is read.

Dynamic random access memory (DRAM): Memory built as an integrated circuit; it provides random access to any location. Access times are 50 nanoseconds and cost per gigabyte in 2012 was $5 to $10.

Hierarchical layers of hardware and software

The figure shows that the layers of software are organized primarily in a hierarchical fashion, with applications being the outermost ring and a variety of systems software sitting between the hardware and applications software.

Systems software: Software that provides services that are commonly useful, including operating systems, compilers, loaders, and assemblers.

There are many types of systems software, but two types are central to every computer system today: an operating system and a compiler.

Operating system: Interfaces between a user's program and the hardware and provides a variety of services and supervisory functions. Among the most important functions are:
- Handling basic input and output operations
- Allocating storage and memory
- Providing for protected sharing of the computer among multiple applications using it simultaneously
Examples of operating systems in use today are Linux, iOS, and Windows.

Compiler: A program that translates high-level language statements into assembly language statements.
Instruction: A command that computer hardware understands and obeys.
Assembler: A program that translates a symbolic version of instructions into the binary version.
Assembly language: A symbolic representation of machine instructions.
Machine language: A binary representation of machine instructions.

Technologies for Building Processors and Memory

Processors and memory have improved at an incredible rate, because computer designers have long embraced the latest in electronic technology to try to win the race to design a better computer. A transistor is simply an on/off switch controlled by electricity. The integrated circuit (IC) combined dozens to hundreds of transistors into a single chip. To describe the tremendous increase in the number of transistors from hundreds to millions, the adjective very large scale is added to the term, creating the abbreviation VLSI, for very large scale integrated circuit.

Manufacturing Process of Integrated Circuits

The process starts with a silicon crystal ingot, which looks like a giant sausage. An ingot is finely sliced into wafers no more than 0.1 inches thick. These wafers then go through a series of processing steps, during which patterns of chemicals are placed on each wafer, creating the transistors, conductors, and insulators. The patterned wafer is then chopped up, or diced, into components called dies, more informally known as chips. Dicing enables you to discard only those dies that were unlucky enough to contain flaws, rather than the whole wafer. This concept is quantified by the yield of a process, which is defined as the percentage of good dies from the total number of dies on the wafer.

The cost of an integrated circuit rises quickly as the die size increases, due both to the lower yield and the smaller number of dies that fit on a wafer. To reduce the cost, using the next generation process shrinks a large die as it uses smaller sizes for both transistors and wires.
This improves the yield and the die count per wafer. The cost of an IC can be expressed in three simple equations:

$$\text{Cost per die} = \frac{\text{Cost per wafer}}{\text{Dies per wafer} \times \text{Yield}}$$

$$\text{Dies per wafer} \approx \frac{\text{Wafer area}}{\text{Die area}}$$

$$\text{Yield} = \frac{1}{\left(1 + \text{Defects per area} \times \text{Die area}/2\right)^2}$$

Performance

Accurately measuring and comparing different computers is critical. Performance can be determined in different ways.

Response time: Also called execution time. The total time required for the computer to complete a task, including disk accesses, memory accesses, I/O activities, operating system overhead, CPU execution time, and so on.
Throughput: Also called bandwidth. The total amount of work done in a given time. Datacenter managers are often interested in increasing throughput.

Throughput and Response Time

Do the following changes to a computer system increase throughput, decrease response time, or both?
1. Replacing the processor in a computer with a faster version
2. Adding additional processors to a system that uses multiple processors for separate tasks, for example, searching the web
Decreasing response time almost always improves throughput. Hence, in case 1, both response time and throughput are improved. In case 2, no one task gets work done faster, so only throughput increases.

Computer performance is primarily concerned with response time. To maximize performance, minimize the response time or execution time for some task. Thus, performance and execution time can be related for a computer X as

$$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$$

The performance of two different computers can be related quantitatively: "X is n times faster than Y", or equivalently "X is n times as fast as Y", means

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$

Measuring Performance

Time is the measure of computer performance: the computer that performs the same amount of work in the least time is the fastest. Program execution time is measured in seconds per program. The most straightforward definition of time is called wall clock time, response time, or elapsed time.
These terms mean the total time to complete a task, including disk accesses, memory accesses, input/output (I/O) activities, and operating system overhead: everything.

CPU execution time: Also called CPU time. The actual time the CPU spends computing for a specific task.
User CPU time: The CPU time spent in a program itself.
System CPU time: The CPU time spent in the operating system performing tasks on behalf of the program.

Almost all computers are constructed using a clock that determines when events take place in the hardware. These discrete time intervals are called clock cycles (or ticks, clock ticks, clock periods, clocks, cycles). Designers refer to the length of a clock period both as the time for a complete clock cycle (e.g., 250 picoseconds, or 250 ps) and as the clock rate (e.g., 4 gigahertz, or 4 GHz), which is the inverse of the clock period.

CPU Performance and Its Factors

Users and designers often examine performance using different metrics. A simple formula relates the most basic metrics (clock cycles and clock cycle time) to CPU time:

$$\text{CPU time} = \text{CPU clock cycles} \times \text{Clock cycle time} = \frac{\text{CPU clock cycles}}{\text{Clock rate}}$$

This formula makes it clear that the hardware designer can improve performance by reducing the number of clock cycles required for a program or the length of the clock cycle.

Instruction Performance

The number of clock cycles required for a program is

$$\text{CPU clock cycles} = \text{Instruction count} \times \text{CPI}$$

The term clock cycles per instruction, which is the average number of clock cycles each instruction takes to execute, is often abbreviated as CPI. Since different instructions may take different amounts of time depending on what they do, CPI is an average of all the instructions executed in the program. CPI provides one way of comparing two different implementations of the same instruction set architecture, since the number of instructions executed for a program will, of course, be the same.

The performance of a program depends on the algorithm, the language, the compiler, the architecture, and the actual hardware.
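
The relations between instruction count, CPI, clock rate, and CPU time described above can be tried out numerically. The following is a minimal Python sketch, assuming the standard textbook form CPU time = instruction count × CPI / clock rate. The function names and the sample machines A and B are illustrative assumptions, not part of these notes.

```python
def cpu_clock_cycles(instruction_count: float, cpi: float) -> float:
    """Clock cycles for a program: instruction count times average CPI."""
    return instruction_count * cpi


def cpu_time_seconds(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time in seconds: clock cycles divided by clock rate (cycles per second)."""
    return cpu_clock_cycles(instruction_count, cpi) / clock_rate_hz


def times_as_fast(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that X is n times as fast as Y (ratio of execution times)."""
    return exec_time_y / exec_time_x


# Hypothetical example: two implementations of the same instruction set
# run the same program, so the instruction count is identical.
instructions = 10_000_000_000
time_a = cpu_time_seconds(instructions, cpi=2.0, clock_rate_hz=4e9)  # machine A
time_b = cpu_time_seconds(instructions, cpi=1.5, clock_rate_hz=2e9)  # machine B

print(f"A: {time_a} s, B: {time_b} s")
print(f"A is {times_as_fast(time_a, time_b)} times as fast as B")
```

In this made-up case machine B has the lower CPI but still loses, because its clock rate is half that of machine A; neither CPI nor clock rate alone determines performance.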
The following table summarizes how these components (algorithm, programming language, compiler, and architecture) affect the factors in the CPU performance equation.

Amdahl's Law

Amdahl's Law states that the performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used.

The Power Wall

Both clock rate and power increased rapidly and grew together, since they are correlated. Battery life can trump performance in the personal mobile device, and the architects of warehouse scale computers try to reduce the costs of powering and cooling 100,000 servers, as the costs are high at this scale. Just as measuring time in seconds is a safer measure of program performance than a rate like MIPS, the energy metric joules is a better measure than a power rate like watts, which is just joules/second.

The dominant technology for integrated circuits is called CMOS (complementary metal oxide semiconductor). For CMOS, the primary source of energy consumption is so-called dynamic energy, that is, energy that is consumed when transistors switch states from 0 to 1 and vice versa. The dynamic energy depends on the capacitive loading of each transistor and the voltage applied:

$$\text{Energy} \propto \tfrac{1}{2} \times \text{Capacitive load} \times \text{Voltage}^2$$

$$\text{Power} \propto \tfrac{1}{2} \times \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency switched}$$

Frequency switched is a function of the clock rate. The capacitive load per transistor is a function of both the number of transistors connected to an output (called the fanout) and the technology, which determines the capacitance of both wires and transistors.

The Switch from Uniprocessors to Multiprocessors

Reasons for switching from unicore processors to multicore processors:
- Difficult to make single-core clock frequencies even higher
- Deeply pipelined circuits:
  - heat problems
  - speed of light problems
  - difficult design and verification
  - large design teams necessary
  - server farms need expensive air-conditioning
- Many new applications are multithreaded
- General trend in computer architecture (shift towards more parallelism)
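
Amdahl's Law and the power relation from the Power Wall discussion both bear on the move to multiprocessors. The following is a minimal Python sketch, assuming the usual formulation of Amdahl's Law, speedup = 1 / ((1 - f) + f / s), and dynamic power proportional to capacitive load × voltage² × frequency switched. The function names, the 90% parallel fraction, and the 15% scaling figures are illustrative assumptions, not values from these notes.

```python
def amdahl_speedup(enhanced_fraction: float, enhancement_factor: float) -> float:
    """Overall speedup when a fraction of execution time is sped up by a factor."""
    if not 0.0 <= enhanced_fraction <= 1.0:
        raise ValueError("enhanced_fraction must be between 0 and 1")
    if enhancement_factor <= 0.0:
        raise ValueError("enhancement_factor must be positive")
    unaffected = 1.0 - enhanced_fraction
    return 1.0 / (unaffected + enhanced_fraction / enhancement_factor)


def dynamic_power_ratio(load_ratio: float, voltage_ratio: float, frequency_ratio: float) -> float:
    """New/old dynamic power; the constant 1/2 cancels in the ratio."""
    return load_ratio * voltage_ratio ** 2 * frequency_ratio


# Hypothetical program in which 90% of the time can use extra cores.
for cores in (2, 4, 16, 64):
    print(f"{cores:3d} cores -> speedup {amdahl_speedup(0.9, cores):.2f}")

# Hypothetical new design: load, voltage, and frequency each reduced by 15%.
print(f"power ratio: {dynamic_power_ratio(0.85, 0.85, 0.85):.2f}")
```

Under these assumptions the speedup can never exceed 10 however many cores are added, because the remaining 10% of the time is unaffected. The power ratio shows why lowering voltage is so effective: voltage enters the relation squared.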