Assessing and Understanding Performance

This document discusses methods for measuring and understanding computer performance, including:
- Elapsed time, response time, and CPU (execution) time for measuring overall and processor performance.
- Metrics like throughput and MIPS (millions of instructions per second).
- Factors that affect performance, like clock rate, clock cycles per instruction (CPI), and instruction count.
- The importance of using real applications and benchmarks like SPEC for evaluating overall system performance rather than individual metrics like MIPS.

Uploaded by

Hoang Anh Nguyen

Chapter 4 Assessing and Understanding Performance

Bo Cheng

Which One Is Good?

Airplane            Passengers   Range (mi)   Speed (mph)
Boeing 737-100      101          630          598
Boeing 747          470          4150         610
BAC/Sud Concorde    132          4000         1350
Douglas DC-8-50     146          8720         544

It depends on the measure of performance: cruising speed, longest range, or largest capacity.

Measuring Performance

Elapsed time (wall-clock time or response time)
- Total time to complete a task, including disk and memory accesses, I/O, etc.
- A useful number, but often not good for comparison purposes

CPU (execution) time
- Doesn't count I/O or time spent running other programs
- Can be broken up into system CPU time and user CPU time
- CPU time = user CPU time + system CPU time

Our focus: user CPU time
- Time spent executing the lines of code that are "in" our program
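The difference between elapsed time and CPU time can be observed directly. The following is a minimal sketch, assuming Python's standard `time.perf_counter` and `os.times` calls; the helper names `measure`, `busy_loop`, and `idle_wait` are hypothetical and chosen only for illustration.

```python
import os
import time


def measure(task):
    """Run task() once; return elapsed, user CPU, and system CPU time in seconds."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    task()
    cpu_end = os.times()
    wall_end = time.perf_counter()
    return {
        "elapsed": wall_end - wall_start,
        "user": cpu_end.user - cpu_start.user,
        "system": cpu_end.system - cpu_start.system,
    }


def busy_loop():
    # CPU-bound work: this time is charged to user CPU time.
    total = 0
    for i in range(200_000):
        total += i * i
    return total


def idle_wait():
    # Sleeping adds elapsed time but almost no CPU time.
    time.sleep(0.2)


if __name__ == "__main__":
    for task in (busy_loop, idle_wait):
        t = measure(task)
        print(f"{task.__name__}: elapsed={t['elapsed']:.3f}s "
              f"user={t['user']:.3f}s system={t['system']:.3f}s")
```

For `idle_wait`, elapsed time is about 0.2 s while user and system CPU time stay near zero, which is why elapsed time alone is a poor basis for comparing processors.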

CPU Performance Metrics

Response time: the time between the start and the completion of a task (in time units)

Throughput: the total amount of work done in a given time (in number of tasks per unit of time)

Performance

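$$\text{Performance}_X = \frac{1}{\text{execution\_time}_X}$$

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{execution\_time}_Y}{\text{execution\_time}_X} = n$$

Problem: Machine A runs a program in 10 sec. Machine B runs the same program in 15 sec. How much faster is A than B?

$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{execution\_time}_B}{\text{execution\_time}_A} = \frac{15}{10} = 1.5$$

A is 1.5 times faster than B.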
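As a quick check of the ratio above, here is a small sketch that computes relative performance from two execution times. The function name `times_faster` is an assumption made for this example.

```python
def times_faster(time_x, time_y):
    """Return n such that X is n times faster than Y.

    Performance is 1 / execution_time, so the performance ratio
    Performance_X / Performance_Y equals time_y / time_x.
    """
    return time_y / time_x


if __name__ == "__main__":
    # Machine A: 10 sec, Machine B: 15 sec
    print(times_faster(10, 15))
```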

Clock Rate Measurement

Clock cycle: the time for one clock period, running at a constant rate
Clock rate is given in Hz (= 1/sec)
clock_cycle_time = 1 / clock_rate (in sec)

Name          Measurement    Value (sec)
Millisecond   1 msec (ms)    1.E-03
Microsecond   1 usec (us)    1.E-06
Nanosecond    1 nsec (ns)    1.E-09
Picosecond    1 psec (ps)    1.E-12
Femtosecond   1 fsec (fs)    1.E-15

Example

10 nsec clock cycle    =>   100 MHz clock rate
1 nsec clock cycle     =>   1 GHz clock rate
500 psec clock cycle   =>   2 GHz clock rate
200 psec clock cycle   =>   5 GHz clock rate
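The conversions in this example follow from clock_cycle_time = 1 / clock_rate. A minimal sketch, with the hypothetical helper `clock_rate_hz`, reproduces them:

```python
def clock_rate_hz(cycle_time_s):
    """Clock rate in Hz for a given clock cycle time in seconds."""
    return 1.0 / cycle_time_s


if __name__ == "__main__":
    # Cycle times from the example: 10 ns, 1 ns, 500 ps, 200 ps
    for cycle_time in (10e-9, 1e-9, 500e-12, 200e-12):
        print(f"{cycle_time:.1e} s cycle => {clock_rate_hz(cycle_time) / 1e6:.0f} MHz")
```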

MHz

http://www.webopedia.com/TERM/M/MHz.html

One MHz represents one million cycles per second. The speed of microprocessors, called the clock speed, is measured in megahertz.

For example, a microprocessor that runs at 200 MHz executes 200 million cycles per second.

One GHz represents 1 billion cycles per second.

CPU Time or CPU Execution Time

The actual time the CPU spends computing for a specific task.

This time covers the CPU computing the given program, including operating system routines executed on the program's behalf. It does not include time spent waiting for I/O or running other programs.

Performance of processor/memory = 1 / CPU_time

CPU Execution Time Formula

$$E = N \times T = \frac{N}{R}$$

E = CPU execution time for a program
N = number of CPU clock cycles for a program
T = clock cycle time
R = clock rate

Example

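Computer A (4 GHz) runs the job in 10 seconds:

$$10 = \frac{N}{4 \times 10^9}$$

Computer B (X GHz) runs the same job in 6 seconds, using 1.2 times as many clock cycles:

$$6 = \frac{1.2 \times N}{R}$$

With N = 40 × 10^9 cycles from Computer A, solving for R gives R = 8 GHz.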
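The two-step calculation in this example (find N from Computer A, then solve for R on Computer B) can be sketched as follows. The function names are hypothetical; the numbers are the ones given on the slide.

```python
def clock_cycles(execution_time_s, clock_rate_hz):
    """N = E * R: cycles needed for a job, from its time and the clock rate."""
    return execution_time_s * clock_rate_hz


def required_clock_rate(cycles, target_time_s):
    """R = N / E: clock rate needed to finish `cycles` in `target_time_s`."""
    return cycles / target_time_s


if __name__ == "__main__":
    n_a = clock_cycles(10, 4e9)                  # Computer A: 10 s at 4 GHz
    r_b = required_clock_rate(1.2 * n_a, 6)      # Computer B: 1.2 * N cycles in 6 s
    print(f"N = {n_a:.2e} cycles, R = {r_b / 1e9:.1f} GHz")
```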

Clock cycles Per Instruction (CPI)

The average number of clock cycles per instruction for a program or program fragment

$$N = I \times C$$

N = number of CPU clock cycles for a program
I = total instructions for a program
C = CPI

The Big Picture

$$E = N \times T = \frac{N}{R}$$

$$E = I \times C \times T = \frac{I \times C}{R}$$

$$\text{Time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock\_cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock\_cycle}}$$

Instruction count depends on the architecture, but not on the exact implementation.

Average CPI depends on design details and on the mix of instruction types executed in an application.

Understanding Program Performance

                       Instruction Count   CPI        Clock Rate
Algorithm              X                   Possibly
Programming Language   X                   X
Compiler               X                   X
ISA                    X                   X

Using Performance Equation

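             Clock Cycle Time   CPI
Computer A   250 ps             2
Computer B   500 ps             1.2

Which computer is faster for this program, and by how much?

$$CPU_A = I \times 2 \times 250\ \text{ps} = 500\,I\ \text{ps}$$

$$CPU_B = I \times 1.2 \times 500\ \text{ps} = 600\,I\ \text{ps}$$

$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{CPU_B}{CPU_A} = \frac{600\,I}{500\,I} = 1.2$$

Computer A is 1.2 times faster than Computer B for this program.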
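The comparison can be reproduced with the performance equation E = I × CPI × T. This is a minimal sketch; `cpu_time` is a hypothetical helper, and the instruction count is set to 1 because it cancels in the ratio.

```python
def cpu_time(instruction_count, cpi, cycle_time_s):
    """E = I * C * T: CPU execution time in seconds."""
    return instruction_count * cpi * cycle_time_s


if __name__ == "__main__":
    i = 1  # same program on both computers, so I cancels out
    time_a = cpu_time(i, 2.0, 250e-12)   # Computer A: CPI 2, 250 ps cycle
    time_b = cpu_time(i, 1.2, 500e-12)   # Computer B: CPI 1.2, 500 ps cycle
    print(f"A is {time_b / time_a:.1f} times faster than B")
```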

Computing CPI
Done by looking at the different types of instructions and using their individual cycle counts:

$$\text{Clock\_Cycles} = \sum_{i=1}^{n} (\text{CPI}_i \times C_i)$$

C_i: the count of the number of instructions of class i executed
CPI_i: the average number of cycles per instruction for that instruction class
n: the number of instruction classes

Example
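CPI for each instruction class:

Instruction class   A   B   C
CPI                 1   2   3

Instruction counts for each instruction class:

Code sequence   A   B   C
1               2   1   2
2               4   1   1

$$CC_1 = (2 \times 1) + (1 \times 2) + (2 \times 3) = 10, \quad CPI_1 = \frac{10}{5} = 2$$

$$CC_2 = (4 \times 1) + (1 \times 2) + (1 \times 3) = 9, \quad CPI_2 = \frac{9}{6} = 1.5$$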
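The same sums can be computed programmatically. The sketch below uses the CPI values and instruction counts from this example; the function and variable names are assumptions for illustration.

```python
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def clock_cycles(counts, cpi_by_class=CPI_BY_CLASS):
    """Sum of CPI_i * C_i over all instruction classes."""
    return sum(cpi_by_class[cls] * n for cls, n in counts.items())


def average_cpi(counts, cpi_by_class=CPI_BY_CLASS):
    """Total clock cycles divided by total instruction count."""
    return clock_cycles(counts, cpi_by_class) / sum(counts.values())


if __name__ == "__main__":
    sequence_1 = {"A": 2, "B": 1, "C": 2}
    sequence_2 = {"A": 4, "B": 1, "C": 1}
    for name, seq in (("sequence 1", sequence_1), ("sequence 2", sequence_2)):
        print(name, clock_cycles(seq), average_cpi(seq))
```

Sequence 2 executes more instructions (6 versus 5) yet needs fewer clock cycles, because it leans on the cheaper class A instructions.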

Workload

A set of programs used for evaluating a computer or a system.

Benchmarks: programs specifically chosen to measure performance.

SPEC 2000 benchmarks (12 integer, 14 floating-point programs).

Performance results given by benchmarks may not be correct if the system (or the compiler of the system) is optimized for the benchmarks.

Benchmark

Programs specifically chosen to measure performance

Best determined by running a real application
- Use programs typical of the expected workload
- e.g., compilers/editors, scientific applications, graphics...

Small benchmarks
- Nice for architects and designers

SPEC (System Performance Evaluation Cooperative)
- Companies have agreed on a set of real programs and inputs

Simplest Approach

                  Computer A   Computer B
Program 1 (sec)   1            10
Program 2 (sec)   1000         100
Total (sec)       1001         110

$$\frac{\text{Performance}_B}{\text{Performance}_A} = \frac{\text{Execution\_Time}_A}{\text{Execution\_Time}_B} = \frac{1001}{110} = 9.1$$
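The total-execution-time comparison can be sketched in a few lines. The dictionary layout and the name `total_time` are assumptions for this example; the times are the ones in the table.

```python
# Execution times in seconds for Program 1 and Program 2 on each computer
TIMES = {
    "A": [1, 1000],
    "B": [10, 100],
}


def total_time(computer, times=TIMES):
    """Total execution time of the whole workload on one computer."""
    return sum(times[computer])


if __name__ == "__main__":
    ratio = total_time("A") / total_time("B")
    print(f"B is {ratio:.1f} times faster than A on the combined workload")
```

Note that A wins on Program 1 by a factor of 10 and B wins on Program 2 by a factor of 10; the total favors B only because Program 2 dominates the running time.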

Evaluating Performance
Different classes and applications of computers require different types of benchmarks.

Desktop
- CPU performance: SPEC CPU benchmark to measure CPU performance and response time
- Benchmarks focusing on a specific task: DVD playback or graphics performance of games
- The choice depends on the nature of the intended application

Server
- Throughput
- Requirements on response time to individual events: database queries and web page requests
- SPECweb99

Embedded Computing
- EEMBC

Reproducibility: list everything another experimenter needs to duplicate the results.

SPEC CPU2000 Benchmark

SPEC: CINT2000 and CFP2000

Relative Performance in Three Different Modes

Relative Energy Efficiency Comparison

Amdahl's Law

Execution Time After Improvement = (Execution Time Affected / Amount of Improvement) + Execution Time Unaffected

Principle: Make the common case fast

Example: Suppose a program runs in 100 seconds on a machine, with multiply operations responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 5 times faster?

$$ET_{after} = \frac{80}{n} + (100 - 80)$$

$$20\ \text{sec} = \frac{80}{n} + 20$$

This requires 80/n = 0, so no finite improvement n in multiplication alone can make the program run 5 times faster.
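A short sketch of this formula makes the limit easy to explore. The function names are hypothetical; `required_improvement` returns `None` when the target time is not reachable by improving only the affected part.

```python
def time_after_improvement(total_s, affected_s, improvement):
    """Execution time after speeding up the affected part by `improvement`."""
    return affected_s / improvement + (total_s - affected_s)


def required_improvement(total_s, affected_s, target_s):
    """Improvement factor needed to reach target_s, or None if unreachable."""
    unaffected_s = total_s - affected_s
    if target_s <= unaffected_s:
        return None  # the unaffected part alone already takes this long
    return affected_s / (target_s - unaffected_s)


if __name__ == "__main__":
    # 100 s program, 80 s in multiply
    print(time_after_improvement(100, 80, 4))   # 4x faster multiply
    print(required_improvement(100, 80, 40))    # target: 2.5x overall
    print(required_improvement(100, 80, 20))    # target: 5x overall
```

Even an arbitrarily fast multiplier leaves the 20 seconds of unaffected work, so the overall speedup is bounded below 100 / 20 = 5.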

MIPS (million instructions per second)

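Instruction class   CPI
A                   1
B                   2
C                   3

              Instruction counts (in billions)
Code from     A     B     C
Compiler 1    5     1     1
Compiler 2    10    1     1

$$\text{MIPS} = \frac{\text{Instruction\_Count}}{\text{Execution\_Time} \times 10^6}$$

Compiler 1 (clock rate 4 × 10^9 Hz):

$$CC_1 = (5 \times 1 + 1 \times 2 + 1 \times 3) \times 10^9 = 10 \times 10^9$$

$$E_1 = \frac{10 \times 10^9}{4 \times 10^9} = 2.5\ \text{sec}$$

$$\text{MIPS}_1 = \frac{(5 + 1 + 1) \times 10^9}{2.5 \times 10^6} = 2800$$

Compiler 2:

$$CC_2 = (10 \times 1 + 1 \times 2 + 1 \times 3) \times 10^9 = 15 \times 10^9$$

$$E_2 = \frac{15 \times 10^9}{4 \times 10^9} = 3.75\ \text{sec}$$

$$\text{MIPS}_2 = \frac{(10 + 1 + 1) \times 10^9}{3.75 \times 10^6} = 3200$$

Compiler 2 has the higher MIPS rating, but its code takes longer to run.

Always trust the execution time metric!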
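The disagreement between MIPS and execution time can be checked numerically. The sketch below uses the CPI values (A = 1, B = 2, C = 3), the 4 × 10^9 Hz clock rate, and the instruction counts from this example; the function names are assumptions for illustration.

```python
CLOCK_RATE_HZ = 4e9
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def execution_time(counts):
    """E = clock cycles / clock rate, in seconds."""
    cycles = sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())
    return cycles / CLOCK_RATE_HZ


def mips(counts):
    """MIPS = instruction count / (execution time * 10^6)."""
    return sum(counts.values()) / (execution_time(counts) * 1e6)


if __name__ == "__main__":
    compiler_1 = {"A": 5e9, "B": 1e9, "C": 1e9}
    compiler_2 = {"A": 10e9, "B": 1e9, "C": 1e9}
    for name, counts in (("Compiler 1", compiler_1), ("Compiler 2", compiler_2)):
        print(f"{name}: {execution_time(counts):.2f} s, {mips(counts):.0f} MIPS")
```

Compiler 2 scores higher on MIPS while producing the slower program, which is the case where MIPS varies inversely with performance.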

http://www.faculty.uaf.edu/ffdr/EE443/Handouts/Set5_Sp05_3pp.pdf

A Complete Example (I)

A Complete Example (II)

A Complete Example (III)

Three problems with using MIPS

MIPS specifies the instruction execution rate but does not take into account the capabilities of the instructions.

We cannot compare computers with different instruction sets using MIPS, since the instruction counts will certainly differ.

MIPS varies between programs on the same computer; a computer cannot have a single MIPS rating for all programs.

MIPS can vary inversely with performance.
