
Carnegie Mellon

Introduction to Computer Systems

Lecture 1

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 1


Carnegie Mellon

Everything is bits
• Each bit is 0 or 1
• By encoding/interpreting sets of bits in various ways
  – Computers determine what to do (instructions)
  – … and represent and manipulate numbers, sets, strings, etc.
• Why bits? Electronic implementation
  – Easy to store with bistable elements
  – Reliably transmitted on noisy and inaccurate wires

[Figure: voltage on a wire over time; levels near 0.0–0.2V encode 0 and levels near 0.9–1.1V encode 1, shown for the bit sequence 0, 1, 0]

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 2


Carnegie Mellon

For example, can count in binary

• Base-2 number representation
  – Represent 15213₁₀ as 11101101101101₂
  – Represent 1.20₁₀ as 1.0011001100110011[0011]…₂
  – Represent 1.5213 × 10⁴ as 1.1101101101101₂ × 2¹³

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 3


Carnegie Mellon

Encoding Byte Values

• Byte = 8 bits
  – Binary: 00000000₂ to 11111111₂
  – Decimal: 0₁₀ to 255₁₀
  – Hexadecimal: 00₁₆ to FF₁₆
    ▪ Base-16 number representation
    ▪ Uses characters '0' to '9' and 'A' to 'F'
    ▪ Write FA1D37B₁₆ in C as 0xFA1D37B or 0xfa1d37b (see the sketch after the table)

Hex   Decimal   Binary
 0       0       0000
 1       1       0001
 2       2       0010
 3       3       0011
 4       4       0100
 5       5       0101
 6       6       0110
 7       7       0111
 8       8       1000
 9       9       1001
 A      10       1010
 B      11       1011
 C      12       1100
 D      13       1101
 E      14       1110
 F      15       1111
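As a quick illustration (a minimal C sketch, not from the original slides), the same value can be printed in decimal or hexadecimal; the 0x prefix is C's hexadecimal literal notation mentioned above:

    #include <stdio.h>

    int main(void) {
        unsigned int x = 0xFA1D37B;  /* hexadecimal literal from the slide */
        unsigned char b = 0xA5;      /* one byte: 1010 0101 in binary */

        printf("%u\n", x);           /* 262263675 (decimal) */
        printf("%X\n", x);           /* FA1D37B (hexadecimal) */
        printf("%d %X\n", b, b);     /* 165 A5 */
        return 0;
    }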

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 4


Carnegie Mellon

Example Data Representations (sizes in bytes)

C Data Type     Typical 32-bit   Typical 64-bit   x86-64
char                 1                1              1
short                2                2              2
int                  4                4              4
long                 4                8              8
float                4                4              4
double               8                8              8
long double          −                −             10/16
pointer              4                8              8
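These sizes can be checked directly with sizeof; a minimal C sketch (the numbers printed depend on the compiler and target, as the table indicates):

    #include <stdio.h>

    int main(void) {
        /* On a typical x86-64 Linux system this prints: 1 2 4 8 4 8 16 8 */
        printf("%zu %zu %zu %zu %zu %zu %zu %zu\n",
               sizeof(char), sizeof(short), sizeof(int), sizeof(long),
               sizeof(float), sizeof(double), sizeof(long double),
               sizeof(void *));
        return 0;
    }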

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 5


Carnegie Mellon

Understanding Performance
• Algorithm
  – Determines number of operations executed
• Programming language, compiler, architecture
  – Determine number of machine instructions executed per operation
• Processor and memory system
  – Determine how fast instructions are executed
• I/O system (including OS)
  – Determines how fast I/O operations are executed

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 6


Carnegie Mellon

Below Your Program

• Application software
  – Written in high-level language
• System software
  – Compiler: translates HLL code to machine code
  – Operating system: service code
    ▪ Handling input/output
    ▪ Managing memory and storage
    ▪ Scheduling tasks & sharing resources
• Hardware
  – Processor, memory, I/O controllers

[Figure: hardware abstraction layers, from transistors to logic gates to architecture; "computer architecture" here refers to the instruction set architecture]

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 7


Carnegie Mellon

Software

Application software, a program in C:

    swap (int v[], int k)
    {
      int temp;
      temp = v[k];
      v[k] = v[k+1];
      v[k+1] = temp;
    }

Compiler output, MIPS assembly language program:

    swap:
      muli $2, $5, 4
      add  $2, $4, $2
      lw   $15, 0($2)
      lw   $16, 4($2)
      sw   $16, 0($2)
      sw   $15, 4($2)
      jr   $31

Assembler output, MIPS binary machine code:

    00000000101000010000000000011000
    00000000000110000001100000100001
    10001100011000100000000000000000
    10001100111100100000000000000100
    10101100111100100000000000000000
    10101100011000100000000000000100
    00000011111000000000000000001000

[Figure: application software / systems software / hardware layers]

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 8


Carnegie Mellon

Components of a Computer

The BIG Picture
• Same components for all kinds of computer
  – Desktop, server, embedded
• Input/output includes
  – User-interface devices
    ▪ Display, keyboard, mouse
  – Storage devices
    ▪ Hard disk, CD/DVD, flash
  – Network adapters
    ▪ For communicating with other computers

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 9


Carnegie Mellon

The Hardware of a Computer

FIVE EASY PIECES

[Figure: the five classic components – input, output, memory, datapath, and control; the datapath and control together form the central processing unit (CPU), or "processor"]

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 10


Carnegie Mellon

Anatomy of a Computer

[Figure: a computer showing an output device, a network cable, and input devices]

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 11


Carnegie Mellon

Inside the Processor (CPU)

• Datapath: performs operations on data
• Control unit
• Cache memory
  – Small fast SRAM memory for immediate access to data

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 12



Carnegie Mellon

Instruction Set Architecture (ISA)

• A set of assembly language instructions (the ISA) provides the link between software and hardware.
• Given an instruction set, software programmers and hardware engineers can work more or less independently.
• The ISA is designed to extract the most performance out of the available hardware technology.

[Figure: the instruction set sits at the boundary between software (application and systems software) and hardware]

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 20


Carnegie Mellon

The Instruction Set: a Critical Interface

[Figure: software above, hardware below, with the instruction set as the interface between them]

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 21


Carnegie Mellon

ISA

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 22


Carnegie Mellon

What is Computer Architecture?


Easy Answer

Computer Architecture =
Instruction Set Architecture + Machine Organization

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 23


Carnegie Mellon

Must a Programmer Care About Hardware?

• Must know how to reason about program performance, energy, and security

• Makes sure you learn to work within architectural and resource constraints

• Memory management: if we understand how/where data is placed, we can speed up a program by ensuring that relevant data is nearby

• Understand the effect of instruction-level parallelism and out-of-order instruction scheduling, and what that means for branching
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 24
Carnegie Mellon

Why Learn Computer Organization?

• Decline of Moore's Law

• Multi-core processors

• Emergence of new platforms

• Also for students planning to do an MS…

• Embarrassing if you struggle with basic hardware knowledge as a CS graduate

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 25


Carnegie Mellon

Course Organization

• 20% midterm, 45% final, 15% assignments, 10% quizzes, 10% project

• ~4 quizzes

• ~4 assignments – assignments due at the start of class

• Co-operation policy: you may discuss, but you may not look at someone else's written material when writing your own solution
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 26
Carnegie Mellon

Microprocessor Performance

[Figure: microprocessor performance over time. Source: H&P textbook]

50% improvement every year! What contributes to this improvement?
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 27
Carnegie Mellon

Power Consumption Trends

• Dynamic power ∝ activity × capacitance × voltage² × frequency

• Voltage and frequency are somewhat constant now, while capacitance per transistor is decreasing and the number of transistors (activity) is increasing

• Leakage power is also rising (a function of transistor count and voltage)

Source: H&P Textbook


Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 28
Carnegie Mellon

Summary

• Increasing frequency led to power wall in early 2000s

• Frequency has stagnated since then

• End of voltage (Dennard) scaling in early 2010s

• Has led to dark silicon and dim silicon (occasional turbo)

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 29
Carnegie Mellon

Important Trends

• Running out of ideas to improve single thread performance

• Power wall makes it harder to add complex features

• Power wall makes it harder to increase frequency

• Additional performance provided by: more cores, occasional


spikes in frequency, accelerators

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 30
Carnegie Mellon

Important Trends

• Historical contributions to performance:
  1. Better processes (faster devices)      ~20%
  2. Better circuits/pipelines              ~15%
  3. Better organization/architecture       ~15%

• In the future, bullet 2 will help little and bullet 1 will eventually disappear!

              Pentium   P-Pro    P-II     P-III    P-4       Itanium   Montecito
Year          1993      1995     1997     1999     2000      2002      2005
Transistors   3.1M      5.5M     7.5M     9.5M     42M       300M      1720M
Clock Speed   60 MHz    200 MHz  300 MHz  500 MHz  1500 MHz  800 MHz   1800 MHz

Moore's Law in action. At this point, adding transistors to a core yields little benefit.
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 31
Carnegie Mellon

What Does This Mean to a Programmer?

• Today, one can expect only a 20% annual improvement; the improvement is even lower if the program is not multi-threaded

• A program needs many threads

• The threads need efficient synchronization and communication

• Data placement in the memory hierarchy is important

• Accelerators should be used when possible
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 32
Carnegie Mellon

Challenges for Hardware Designers

• Find efficient ways to
  – improve single-thread performance and energy
  – improve data sharing
  – boost programmer productivity
  – manage the memory system
  – build accelerators for important kernels
  – provide security

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 33


Carnegie Mellon

Wafers and Dies

Source: H&P Textbook

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 34


Carnegie Mellon

Manufacturing Process

• Silicon wafers undergo many processing steps so that


different parts of the wafer behave as insulators,
conductors, and transistors (switches)

• Multiple metal layers on the silicon enable connections


between transistors

• The wafer is chopped into many dies – the size of the die
determines yield and cost

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 35
Carnegie Mellon

Processor Technology Trends

• Shrinking of transistor sizes: 250nm (1997) → 130nm (2002) → 70nm (2008) → 35nm (2014) → 2019, start of transition from 14nm to 10nm

• Transistor density increases by 35% per year and die size increases by 10-20% per year… functionality improvements!

• Transistor speed improves linearly with size (complex equation involving voltages, resistances, capacitances)

• Wire delays do not scale down at the same rate as transistor delays
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 36
Carnegie Mellon

Memory and I/O Technology Trends

• DRAM density increases by 40-60% per year, latency has reduced by 33% in 10 years (the memory wall!), bandwidth improves twice as fast as latency decreases

• Disk density improves by 100% every year, latency improvement similar to DRAM

• Networks: primary focus on bandwidth; 10Mb → 100Mb in 10 years; 100Mb → 1Gb in 5 years
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 37
Carnegie Mellon

Performance Metrics

• Possible measures:
  – response time – time elapsed between start and end of a program
  – throughput – amount of work done in a fixed time

• The two measures are usually linked
  – A faster processor will improve both
  – More processors will likely only improve throughput
  – Some policies will improve throughput and worsen response time

• What influences performance?
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 38
Carnegie Mellon

Execution Time

Consider a system X executing a fixed workload W

Performance_X = 1 / Execution time_X

Execution time = response time = wall clock time
  - Note that this includes time to execute the workload as well as time spent by the operating system co-ordinating various events
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 39
Carnegie Mellon

Speedup and Improvement

• System X executes a program in 10 seconds, system Y executes the same program in 15 seconds

• System X is 1.5 times faster than system Y

• The speedup of system X over system Y is 1.5 (the ratio)
  = perf_X / perf_Y = exectime_Y / exectime_X

• The performance improvement of X over Y is
  1.5 − 1 = 0.5 = 50% = (perf_X − perf_Y) / perf_Y = speedup − 1

• The execution time reduction for system X, compared to Y, is (15−10) / 15 = 33%

• The execution time increase for Y, compared to X, is (15−10) / 10 = 50%
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 40
Carnegie Mellon

Performance Equation - I

CPU execution time = CPU clock cycles x Clock cycle time

Clock cycle time = 1 / Clock speed

If a processor has a frequency of 3 GHz, the clock ticks 3 billion times in a second – as we'll soon see, with each clock tick, one or more (or fewer) instructions may complete

If a program runs for 10 seconds on a 3 GHz processor, how many clock cycles did it run for?

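A worked answer, for reference: 10 seconds × 3 × 10⁹ cycles/second = 3 × 10¹⁰ clock cycles.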
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 41
Carnegie Mellon

• If a program runs for 2 billion clock cycles on a 1.5 GHz processor, what is the execution time in seconds?

Execution time = 2 × 10⁹ cycles × (1 / 1.5 × 10⁹ cycles/sec) = 1.33 seconds

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 42


Carnegie Mellon

Performance Equation - II

• CPU clock cycles = number of instrs x avg clock cycles per instruction (CPI)

Substituting in the previous equation,

Execution time = number of instrs x avg CPI x clock cycle time

If a 2 GHz processor graduates an instruction every third cycle, how many instructions are there in a program that runs for 10 seconds?

10 = n × 3 × (1 / 2 × 10⁹)  →  n = 6.67 × 10⁹ instructions
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 43
Carnegie Mellon

Factors Influencing Performance

Execution time = clock cycle time x number of instrs x avg CPI

• Clock cycle time: manufacturing process (how fast is each


transistor), how much work gets done in each pipeline stage
(more on this later)

• Number of instrs: the quality of the compiler and the


instruction set architecture

• CPI: the nature of each instruction and the quality of the


architecture implementation

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 44
Carnegie Mellon

Example

Execution time = clock cycle time x number of instrs x avg CPI

Which of the following two systems is better?

• A program is converted into 4 billion MIPS instructions by a


compiler ; the MIPS processor is implemented such that
each instruction completes in an average of 1.5 cycles and
the clock speed is 1 GHz

• The same program is converted into 2 billion x86 instructions;


the x86 processor is implemented such that each instruction
completes in an average of 6 cycles and the clock speed is
1.5 GHz
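As a check, here is a minimal C sketch that plugs both designs into the performance equation (the numbers are taken from the slide above; nothing else is assumed):

    #include <stdio.h>

    /* Execution time = number of instrs x avg CPI x clock cycle time */
    static double exec_time(double instrs, double cpi, double clock_hz) {
        return instrs * cpi / clock_hz;   /* cycle time = 1 / clock speed */
    }

    int main(void) {
        double mips = exec_time(4e9, 1.5, 1.0e9);  /* 4B instrs, CPI 1.5, 1 GHz   */
        double x86  = exec_time(2e9, 6.0, 1.5e9);  /* 2B instrs, CPI 6.0, 1.5 GHz */
        printf("MIPS system: %.1f s, x86 system: %.1f s\n", mips, x86);
        /* Prints: MIPS system: 6.0 s, x86 system: 8.0 s */
        return 0;
    }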
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 45
Carnegie Mellon

Benchmark Suites

• Each vendor announces a SPEC rating for their system
  – a measure of execution time for a fixed collection of programs
  – is a function of a specific CPU, memory system, I/O system, operating system, compiler
  – enables easy comparison of different systems

The key is coming up with a collection of relevant programs
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 46
Carnegie Mellon

SPEC CPU

• SPEC: System Performance Evaluation Corporation, an industry


consortium that creates a collection of relevant programs

• The 2006 version includes 12 integer and 17 floating-point applications

• The SPEC rating specifies how much faster a system is, compared to
a baseline machine – a system with SPEC rating 600 is 1.5 times
faster than a system with SPEC rating 400

• Note that this rating incorporates the behavior of all 29 programs – this
may not necessarily predict performance for your favorite program!

• Latest version: SPEC 2017

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 47
Carnegie Mellon

Deriving a Single Performance Number

How is the performance of 29 different apps compressed into a single performance number?

• SPEC uses the geometric mean (GM) – the execution times of the programs are multiplied together and the Nth root is derived

• Another popular metric is the arithmetic mean (AM) – the average of the programs' execution times

• Weighted arithmetic mean – the execution times of some programs are weighted to balance priorities
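A minimal C sketch contrasting the two means for a small set of hypothetical execution times (the times below are made up for illustration):

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        double t[] = { 2.0, 8.0, 4.0, 1.0 };   /* hypothetical execution times, in seconds */
        int n = sizeof(t) / sizeof(t[0]);

        double sum = 0.0, prod = 1.0;
        for (int i = 0; i < n; i++) {
            sum  += t[i];
            prod *= t[i];
        }
        printf("Arithmetic mean: %.2f s\n", sum / n);             /* 3.75 s */
        printf("Geometric mean:  %.2f s\n", pow(prod, 1.0 / n));  /* 2.83 s */
        return 0;
    }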
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 48
Carnegie Mellon

Amdahl’s Law

• Architecture design is very bottleneck-driven – make the


common case fast, do not waste resources on a component
that has little impact on overall performance/power

• Amdahl’s Law: the performance improvement from an enhancement is limited by the fraction of time the enhancement comes into play

• Example: a web server spends 40% of time in the CPU


and 60% of time doing I/O – a new processor that is ten
times faster results in a 36% reduction in execution time
(speedup of 1.56) – Amdahl’s Law states that maximum
execution time reduction is 40% (max speedup of 1.66)
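As a worked check of the numbers above: with a fraction f = 0.4 of the time enhanced by a factor s, the overall speedup is 1 / ((1 − f) + f/s). For s = 10 this gives 1 / (0.6 + 0.04) ≈ 1.56, and as s → ∞ it approaches 1 / 0.6 ≈ 1.66.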
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 49
Carnegie Mellon

Common Principles

• Amdahl’s Law

• Energy: performance improvements typically also result


in energy improvements – less leakage

• 90-10 rule: 10% of the program accounts for 90% of


execution time

• Principle of locality: the same data/code will be used


again (temporal locality), nearby data/code will be
touched next (spatial locality)

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 50
Carnegie Mellon

Power and Energy

• Total power = dynamic power + leakage power

• Dynamic power ∝ activity × capacitance × voltage² × frequency

• Leakage power ∝ voltage

• Energy = power × time
  (joules)   (watts)  (sec)
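For example (numbers chosen only for illustration): a processor drawing 50 watts for 10 seconds consumes 50 W × 10 s = 500 joules; halving the execution time at the same power halves the energy.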

Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 51
