Lecture Wk4 Ch3

KIE3005
Week 5: Chapter 3 of textbook

• Recognize the distinction between truncation and round-off errors.
• Understand the concepts of significant figures, accuracy, and precision.
• Recognize the difference between true relative error, approximate relative
error, and acceptable error, and understand how approximate relative
error and acceptable error are used to terminate an iterative computation.
• Understand how numbers are represented in digital computers and how
this representation induces round-off error.
• Recognize how computer arithmetic can introduce and amplify round-off
errors in calculation.
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Chapter 3 1
Recall the falling parachutist example
• We determined the velocity of the falling parachutist both analytically and numerically.
• There is a discrepancy between the two methods:
  - Analytical: exact
  - Numerical: approximation
• Error = |numerical sol. - analytical sol.|
• However, no analytical solution is available for many engineering problems, so we must settle for estimates of the errors.
Approximations and Round-Off Errors
• Numerical methods yield approximate results, that is, results that are close to the exact analytical solution. We cannot compute the errors associated with numerical methods exactly if the analytical solution is not available.
• Given data are only rarely exact, since they originate from measurements. Therefore there is probably error in the input information.
• The algorithm itself usually introduces errors as well, e.g., unavoidable round-offs.
• The output information will then contain error from both of these sources.
• How confident are we in our approximate result?
• The question is: "How much error is present in our calculation, and is it tolerable?"
3.1 Significant Figures
• Numbers are represented approximately.
• The number of significant figures indicates precision. The significant digits of a number are those that can be used with confidence, i.e., the number of certain digits plus one estimated digit.
• Speedometer (3 significant figures): we are confident that the speed is between 48 and 49 km/h, and can estimate it as approximately 48.8 or 48.9 km/h.
• Odometer (7 significant figures).
Significant Figure Example: ZEROS
45,300: how many significant figures? The trailing zeros make this ambiguous. Scientific notation is used to avoid confusion:

Number           Significant figures
4.53 × 10^4      3
4.530 × 10^4     4
4.5300 × 10^4    5

Zeros are sometimes used only to locate the decimal point; such zeros are not significant figures:

Number           Significant figures
0.00001753       4
0.0001753        4
0.001753         4

Implications of significant figures for our study of numerical methods:

(1) They provide criteria to specify how confident we are in our approximate result (e.g., a result is acceptable only if it is correct to 4 significant figures).
(2) Mathematical quantities such as π = 3.141592653… (ad infinitum) cannot be expressed exactly by a limited number of digits. The omission of the remaining significant figures is called round-off error.
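The point about scientific notation can be tried out in a few lines of Python. This is only an illustrative sketch: the helper name `to_sig_figs` is an assumption made here, not something from the lecture. It relies on the standard `e` format specifier, which prints one digit before the decimal point and a chosen number after it.

```python
def to_sig_figs(value: float, n: int) -> str:
    """Format value in scientific notation with n significant figures.

    Hypothetical helper for illustration only.
    """
    # n significant figures = 1 digit before the point + (n - 1) after it.
    return f"{value:.{n - 1}e}"


three = to_sig_figs(45300, 3)       # '4.53e+04'
four = to_sig_figs(45300, 4)        # '4.530e+04'
five = to_sig_figs(45300, 5)        # '4.5300e+04'
small = to_sig_figs(0.00001753, 4)  # '1.753e-05' (leading zeros are not significant)

for text in (three, four, five, small):
    print(text)
```

Written this way, the trailing zeros in the mantissa state explicitly how many figures are claimed, which the bare number 45,300 cannot do.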
Source: http://en.wikipedia.org/wiki/Significant_figures
The rules for identifying significant digits when writing or interpreting numbers are as follows:
• All non-zero digits are considered significant. For example, 91 has two significant digits (9 and 1), while 123.45 has
five significant digits (1, 2, 3, 4 and 5).
• Zeros appearing anywhere between two non-zero digits are significant. Example: 101.12 has five significant digits:
1, 0, 1, 1 and 2.
• Leading zeros are not significant. For example, 0.00052 has two significant digits: 5 and 2.
• Trailing zeros in a number containing a decimal point are significant. For example, 12.2300 has six significant digits: 1, 2, 2, 3, 0
and 0. The number 0.000122300 still has only six significant digits (the zeros before the 1 are not significant). In
addition, 120.00 has five significant digits. This convention clarifies the precision of such numbers; for example, if a
result accurate to four decimal places (0.0001) is given as 12.23 then it might be understood that only two decimal
places of accuracy are available. Stating the result as 12.2300 makes clear that it is accurate to four decimal places.
• The significance of trailing zeros in a number not containing a decimal point can be ambiguous. For example, it may
not always be clear if a number like 1300 is accurate to the nearest unit (and just happens coincidentally to be an
exact multiple of a hundred) or if it is only shown to the nearest hundred due to rounding or uncertainty. Various
conventions exist to address this issue:
• A bar may be placed over the last significant digit; any trailing zeros following this are insignificant. For
example, $13\bar{0}0$ has three significant digits (and hence indicates that the number is accurate to the nearest ten).
• The last significant digit of a number may be underlined; for example, $2\underline{0}00$ has two significant digits.
• A decimal point may be placed after the number; for example "100." indicates specifically that three significant
digits are meant.
• However, these conventions are not universally used, and it is often necessary to determine from context whether
such trailing zeros are intended to be significant. If all else fails, the level of rounding can be specified explicitly.
The abbreviation s.f. is sometimes used, for example "20 000 to 2 s.f." or "20 000 (2 sf)". Alternatively, the
uncertainty can be stated separately and explicitly, as in 20 000 ± 1%, so that significant-figures rules do not
apply.
3.2 Accuracy and Precision
• The errors associated with calculations and measurements can be characterized by their accuracy and precision.
• Accuracy: how close a computed or measured value is to the true value.
• Precision (or reproducibility): how close a computed or measured value is to previously computed or measured values.
• Inaccuracy (or bias): a systematic deviation from the actual value.
• Imprecision (or uncertainty): the magnitude of the scatter.
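One possible way to put numbers on these definitions is sketched below. The choice of the sample mean for bias and the population standard deviation for scatter is an assumption made for illustration, as are the function name and the sample data; the slides do not prescribe a particular statistic.

```python
from statistics import mean, pstdev


def bias_and_scatter(samples, true_value):
    """Return (bias, scatter) for repeated measurements of one quantity.

    Hypothetical helper: bias measures inaccuracy, scatter measures imprecision.
    """
    bias = mean(samples) - true_value  # systematic deviation from the true value
    scatter = pstdev(samples)          # spread of the samples around their own mean
    return bias, scatter


true_value = 10.0
precise_but_inaccurate = [10.9, 11.0, 11.1, 11.0]  # tight cluster, off target
accurate_but_imprecise = [8.0, 12.0, 9.0, 11.0]    # centred on target, widely spread

bias_a, scatter_a = bias_and_scatter(precise_but_inaccurate, true_value)
bias_b, scatter_b = bias_and_scatter(accurate_but_imprecise, true_value)
print(f"precise but inaccurate: bias={bias_a:.3f}, scatter={scatter_a:.3f}")
print(f"accurate but imprecise: bias={bias_b:.3f}, scatter={scatter_b:.3f}")
```

The first data set has a large bias and small scatter, the second has no bias and large scatter, which is the distinction the definitions above draw.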
Fig. 3.2 An example from marksmanship illustrating the concepts of accuracy and precision. The bull's-eye represents the truth and the bullet holes represent the numerical results. (a) Inaccurate and imprecise; (b) accurate and imprecise; (c) inaccurate and precise; (d) accurate and precise.
3.3 Error Definitions
• Numerical errors arise from the use of approximations to represent exact mathematical operations and quantities.
• Truncation errors result when approximations are used to represent exact mathematical procedures.
• Round-off errors result when numbers with a limited number of significant figures are used to represent exact numbers. They arise mainly in computers.
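The relationship between the exact (true) value and the approximation can be formulated as:

$$\text{True value} = \text{Approximation} + \text{Error} \qquad (3.1)$$

True error (may be positive or negative):

$$E_t = \text{True value} - \text{Approximation} \qquad (3.2)$$

$$\text{True fractional relative error} = \frac{\text{true error}}{\text{true value}}$$

True percent relative error:

$$\varepsilon_t = \frac{\text{true error}}{\text{true value}} \times 100\% \qquad (3.3)$$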
Example 3.1:
Problem Statement. Suppose that you have the task of measuring the lengths of a bridge and a rivet and come up with 9999 cm and 9 cm, respectively. If the true values are 10,000 cm and 10 cm, respectively, compute (a) the error and (b) the true percent relative error for each case.
Solution.
(a) The error for measuring the bridge is [Eq. (3.2)]
$$E_t = 10{,}000 - 9999 = 1 \text{ cm}$$
and for the rivet it is
$$E_t = 10 - 9 = 1 \text{ cm}$$
(b) The percent relative error for measuring the bridge is [Eq. (3.3)]
$$\varepsilon_t = \frac{1}{10{,}000} \times 100\% = 0.01\%$$
and for the rivet it is
$$\varepsilon_t = \frac{1}{10} \times 100\% = 10\%$$
Thus, although both measurements have an error of 1 cm, the relative error for the rivet is much greater. We would conclude that we have done an adequate job of measuring the bridge, whereas our estimate for the rivet leaves something to be desired.
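The arithmetic of Example 3.1 can be reproduced with a short Python sketch of Eqs. (3.2) and (3.3). The function names are chosen here for illustration and are not part of the lecture.

```python
def true_error(true_value: float, approximation: float) -> float:
    """Eq. (3.2): E_t = true value - approximation."""
    return true_value - approximation


def true_percent_relative_error(true_value: float, approximation: float) -> float:
    """Eq. (3.3): eps_t = (true error / true value) * 100%."""
    return true_error(true_value, approximation) / true_value * 100.0


# Measurements from Example 3.1 (all lengths in cm).
bridge_error = true_error(10_000, 9_999)
rivet_error = true_error(10, 9)
bridge_rel = true_percent_relative_error(10_000, 9_999)
rivet_rel = true_percent_relative_error(10, 9)

print(f"bridge: E_t = {bridge_error} cm, eps_t = {bridge_rel:.2f}%")
print(f"rivet:  E_t = {rivet_error} cm, eps_t = {rivet_rel:.2f}%")
```

Both absolute errors come out as 1 cm, while the relative errors differ by a factor of 1000, which is why the relative measure is the more informative one here.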
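Approximate percent relative error:

$$\varepsilon_a = \frac{\text{approximate error}}{\text{approximation}} \times 100\% \qquad (3.4)$$

$$\varepsilon_a = \frac{\text{current approximation} - \text{previous approximation}}{\text{current approximation}} \times 100\% \qquad (3.5)$$

Eqs. (3.2) to (3.5) may be either positive or negative.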
• We are usually not concerned with the sign of the error, so the absolute value of Eq. (3.5) is normally used.
• In a numerical method, computations are repeated until the stopping criterion is satisfied:

$$|\varepsilon_a| < \varepsilon_s \qquad (3.6)$$

where $\varepsilon_s$ is a pre-specified percent tolerance based on your knowledge of the solution.
• We can be sure that the result is correct to at least n significant figures if the following criterion is met:

$$\varepsilon_s = (0.5 \times 10^{2-n})\% \qquad (3.7)$$
Example 3.2 Error estimations for iterative methods
Solution. First, Eq. (3.7) can be employed to determine the error criterion that ensures a result is correct to at least three significant figures:
$$\varepsilon_s = (0.5 \times 10^{2-3})\% = 0.05\%$$
Thus we will add terms to the series until $\varepsilon_a$ falls below this level. The first estimate is simply equal to Eq. (3.2.1) with a single term, so the first estimate is equal to 1. The second estimate is then generated by adding the second term, as in
$$e^{x} = 1 + x$$
or, for $x = 0.5$,
$$e^{0.5} = 1 + 0.5 = 1.5$$
This represents a true percent relative error of [Eq. (3.3)]
$$\varepsilon_t = \frac{1.648721 - 1.5}{1.648721} \times 100\% = 9.02\%$$
Equation (3.5) can be used to determine an approximate estimate of the error (the true value is $e^{0.5} = 1.648721\ldots$), as in
$$\varepsilon_a = \frac{1.5 - 1}{1.5} \times 100\% = 33.3\%$$
Because $\varepsilon_a$ is not less than the required value of $\varepsilon_s$, we continue the computation by adding another term, $x^2/2!$, and repeating the error calculation. The process is continued until $|\varepsilon_a| < \varepsilon_s$. The entire computation can be summarized as

Terms   Result         ε_t (%)    ε_a (%)
1       1              39.3
2       1.5            9.02       33.3
3       1.625          1.44       7.69
4       1.645833333    0.175      1.27
5       1.648437500    0.0172     0.158
6       1.648697917    0.00142    0.0158

Thus, after six terms are included, the approximate error falls below $\varepsilon_s = 0.05\%$ and the computation is terminated. However, notice that, rather than three significant figures, the result is accurate to five! This is because, for this case, both Eqs. (3.5) and (3.7) are conservative; that is, they ensure that the result is at least as good as they specify. Although, as discussed in Chap. 6, this is not always the case for Eq. (3.5), it is true most of the time.
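The iteration in Example 3.2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `exp_maclaurin`, the `max_terms` safeguard and the layout of the returned history are choices made here, not part of the textbook example. The stopping test uses Eq. (3.7) for the tolerance and the absolute value of Eq. (3.5) for the approximate error.

```python
import math


def exp_maclaurin(x: float, n_sig_figs: int, max_terms: int = 100):
    """Sum the Maclaurin series of e^x until |eps_a| < eps_s.

    Returns the final estimate and a list of (terms, estimate, eps_a) rows.
    """
    eps_s = 0.5 * 10 ** (2 - n_sig_figs)  # Eq. (3.7), in percent
    estimate = 1.0                        # the first term of the series
    term = 1.0
    history = [(1, estimate, None)]       # no eps_a for the first estimate
    for k in range(1, max_terms):
        term *= x / k                     # builds x^k / k! incrementally
        previous, estimate = estimate, estimate + term
        eps_a = abs((estimate - previous) / estimate) * 100.0  # |Eq. (3.5)|
        history.append((k + 1, estimate, eps_a))
        if eps_a < eps_s:                 # Eq. (3.6)
            break
    return estimate, history


estimate, history = exp_maclaurin(0.5, n_sig_figs=3)
for terms, value, eps_a in history:
    eps_t = abs(math.exp(0.5) - value) / math.exp(0.5) * 100.0  # Eq. (3.3)
    eps_a_text = "" if eps_a is None else f"{eps_a:.3g}"
    print(f"{terms}  {value:.9f}  {eps_t:.3g}  {eps_a_text}")
```

For x = 0.5 and three significant figures the loop stops after six terms, matching the table above. Building each term from the previous one avoids recomputing factorials and powers at every step.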
3.4 Round-off Errors
• Numbers such as π, e, or √7 cannot be expressed by a fixed number of significant figures.
• Computers use a base-2 representation, so they cannot precisely represent certain exact base-10 numbers.
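The second point above can be observed directly in Python, whose `float` type is a binary (base-2) double-precision number on common platforms. The sketch below uses only the standard `decimal` module to display the value that is actually stored for the literal 0.1; the variable names are illustrative.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, digit for digit.
stored = Decimal(0.1)
print(stored)  # a long decimal expansion close to, but not equal to, 0.1

# The tiny representation errors show up in ordinary arithmetic.
sum_matches = (0.1 + 0.2 == 0.3)
ten_tenths = sum([0.1] * 10)
print(sum_matches)        # False
print(ten_tenths == 1.0)  # False
```

The decimal fraction 0.1 has no finite base-2 expansion, so the nearest representable binary value is stored instead, and the small discrepancy propagates into sums.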
3.4.1 Computer Representation of Numbers
• Numerical round-off errors are directly related to the manner in which numbers are stored in a computer.
• The fundamental unit in which information is represented is called a word.
• Number systems: base-10 (decimal), base-8 (octal), base-2 (binary).
Figure 3.3 (a) Example Decimal number (b) Example of Binary number
Integer representation in a computer (signed magnitude method): the first bit of the word holds the sign.
  - 1 represents negative
  - 0 represents positive
Figure 3.4 Representation of the decimal integer -173 on a 16-bit computer using the signed magnitude method.
Example 3.3
Range of Integers.
Problem Statement. Determine the range of integers in base-10 that can be represented on a 16-bit computer.
Solution. Of the 16 bits, the first bit holds the sign. The remaining 15 bits can hold binary numbers from 0 to 111111111111111. The upper limit can be converted to a decimal integer as in
$$(1 \times 2^{14}) + (1 \times 2^{13}) + \cdots + (1 \times 2^{1}) + (1 \times 2^{0})$$
which equals 32,767 (note that this expression can be evaluated simply as $2^{15} - 1$). Thus, a 16-bit computer word can store decimal integers ranging from -32,767 to 32,767. In addition, because zero is already defined as 0000000000000000, it is redundant to use the number 1000000000000000 to define a "minus zero." Therefore, it is usually employed to represent an additional negative number, -32,768, and the range is from -32,768 to 32,767.
Note: The signed magnitude method in the above example is not used to represent integers on conventional computers. A preferred approach is the 2's complement technique.
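The integer layout of Fig. 3.4 and the range found in Example 3.3 can be checked with a small Python sketch. The function `signed_magnitude` is a hypothetical helper written for this illustration; the last lines show the 2's complement bit pattern of the same integer for comparison, obtained by masking to 16 bits.

```python
def signed_magnitude(value: int, bits: int = 16) -> str:
    """Return the signed-magnitude bit string of value (illustrative helper)."""
    magnitude_bits = bits - 1
    limit = 2 ** magnitude_bits - 1            # largest magnitude that fits
    if abs(value) > limit:
        raise OverflowError(f"{value} does not fit in {bits} bits")
    sign = "1" if value < 0 else "0"           # first bit holds the sign
    return sign + format(abs(value), f"0{magnitude_bits}b")


largest = 2 ** 15 - 1                          # upper limit from Example 3.3
word = signed_magnitude(-173)
twos_complement = format(-173 & 0xFFFF, "016b")

print(largest)          # 32767
print(word)             # 1000000010101101
print(twos_complement)  # 1111111101010011
```

In signed magnitude only the first bit differs between +173 and -173, whereas the 2's complement pattern differs in most bits, which is part of why the two schemes are not interchangeable.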
Floating-point representation
Fractional quantities are typically represented in a computer using "floating-point" form. The number is expressed as a fractional part, called the mantissa or significand, and an integer part, called the exponent:
$$m \cdot b^{e}$$
where m is the mantissa, b is the base of the number system being used, and e is the exponent.
Example: 156.78 is represented as 0.15678 × 10^3 in a floating-point base-10 system.
Figure 3.5 The manner in which a floating-point number is stored in a word.
  - First bit: sign of the number (1 represents negative, 0 represents positive)
  - Next series of bits: signed exponent
  - Last bits: mantissa
• The mantissa is usually normalized if it has leading zero digits.
• Example: suppose 1/34 = 0.029411765… is to be stored in a floating-point base-10 system that allows only four decimal places. Without normalization it would be stored as
$$\frac{1}{34} = 0.029411765\ldots \rightarrow 0.0294 \times 10^{0}$$
• The number is normalized to remove the leading zero: multiply the mantissa by 10 and lower the exponent by 1, thus
$$0.2941 \times 10^{-1}$$
so that an additional significant figure is retained.
Normalization limits the absolute value of m:
$$\frac{1}{b} \le m < 1 \qquad (3.8)$$
Therefore,
  - for a base-10 system, 0.1 ≤ m < 1
  - for a base-2 system, 0.5 ≤ m < 1
• Floating-point representation allows both fractions and very large numbers to be expressed on the computer. However,
  - floating-point numbers take up more room;
  - they take longer to process than integer numbers;
  - round-off errors are introduced because the mantissa holds only a finite number of significant figures.
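For the base-2 case, the normalized form m · 2^e with 0.5 ≤ m < 1 can be inspected in Python with the standard-library function `math.frexp`, which splits a float into exactly such a mantissa and exponent. The sketch below is only an illustration using the two numbers that appear in the slides.

```python
import math

# math.frexp(x) returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1.
m1, e1 = math.frexp(156.78)
m2, e2 = math.frexp(1 / 34)

print(m1, e1)  # mantissa in [0.5, 1) and exponent 8, since 128 <= 156.78 < 256
print(m2, e2)  # mantissa in [0.5, 1) and exponent -5

# math.ldexp is the inverse operation and recovers the original values exactly,
# because scaling by a power of two does not change the mantissa bits.
restored1 = math.ldexp(m1, e1)
restored2 = math.ldexp(m2, e2)
```

The mantissa always lands in the interval predicted by Eq. (3.8) for b = 2, regardless of how large or small the original number is.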
EXAMPLE 3.4
Hypothetical Set of Floating-Point Numbers
Problem Statement. Create a hypothetical floating-point number set for a machine that stores information using 7-bit words. Employ the first bit for the sign of the number, the next three for the sign and the magnitude of the exponent, and the last three for the magnitude of the mantissa (Fig. 3.6).
Figure 3.6 The smallest possible positive floating-point number from Example 3.4: the 7-bit word 0111100. Bit 1 is the sign of the number, bit 2 is the sign of the exponent, bits 3 and 4 are the magnitude of the exponent (weights 2^1 and 2^0), and bits 5 to 7 are the magnitude of the mantissa (weights 2^-1, 2^-2 and 2^-3).
Solution. The smallest possible positive number is depicted in Fig. 3.6. The initial 0 indicates that the quantity is positive. The 1 in the second place designates that the exponent has a negative sign. The 1's in the third and fourth places give a maximum value to the exponent of
$$1 \times 2^{1} + 1 \times 2^{0} = 3$$
Therefore, the exponent will be -3. Finally, the mantissa is specified by the 100 in the last three places, which conforms to
$$1 \times 2^{-1} + 0 \times 2^{-2} + 0 \times 2^{-3} = 0.5$$
Although a smaller mantissa is possible (e.g., 000, 001, 011), the value of 100 is used because of the limit imposed by normalization [Eq. (3.8)]. Thus, the smallest possible positive number for this system is +0.5 × 2^-3, which is equal to 0.0625 in the base-10 system.
The next highest numbers are developed by increasing the mantissa, as in
$$0111101 = (1 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}) \times 2^{-3} = (0.078125)_{10}$$
$$0111110 = (1 \times 2^{-1} + 1 \times 2^{-2} + 0 \times 2^{-3}) \times 2^{-3} = (0.093750)_{10}$$
$$0111111 = (1 \times 2^{-1} + 1 \times 2^{-2} + 1 \times 2^{-3}) \times 2^{-3} = (0.109375)_{10}$$
Notice that the base-10 equivalents are spaced evenly with an interval of 0.015625.
At this point, to continue increasing, we must decrease the magnitude of the (negative) exponent to binary 10, which gives a value of
$$1 \times 2^{1} + 0 \times 2^{0} = 2$$
The mantissa is decreased back to its smallest value of 100. Therefore, the next number is
$$0110100 = (1 \times 2^{-1} + 0 \times 2^{-2} + 0 \times 2^{-3}) \times 2^{-2} = (0.125000)_{10}$$
This still represents a gap of 0.125000 - 0.109375 = 0.015625. However, now when higher numbers are generated by increasing the mantissa, the gap is lengthened to 0.03125.
$$0110101 = (1 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}) \times 2^{-2} = (0.156250)_{10}$$
$$0110110 = (1 \times 2^{-1} + 1 \times 2^{-2} + 0 \times 2^{-3}) \times 2^{-2} = (0.187500)_{10}$$
$$0110111 = (1 \times 2^{-1} + 1 \times 2^{-2} + 1 \times 2^{-3}) \times 2^{-2} = (0.218750)_{10}$$
This pattern is repeated as each larger quantity is formulated until a maximum number is reached:
$$0011111 = (1 \times 2^{-1} + 1 \times 2^{-2} + 1 \times 2^{-3}) \times 2^{3} = (7)_{10}$$
The final number set is depicted graphically in Fig. 3.7.
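The whole number set of Example 3.4 is small enough to enumerate. The Python sketch below decodes 7-bit words using the layout described in the problem statement; the function name `decode` and the way the words are generated are assumptions made for this illustration. Only normalized mantissas (100 to 111) are included, as in the worked solution.

```python
def decode(word: str) -> float:
    """Decode a 7-bit word: sign, exponent sign, 2 exponent bits, 3 mantissa bits."""
    sign = -1.0 if word[0] == "1" else 1.0
    exponent_sign = -1 if word[1] == "1" else 1
    exponent = exponent_sign * int(word[2:4], 2)
    mantissa = int(word[4:7], 2) / 8.0     # bit weights 2^-1, 2^-2, 2^-3
    return sign * mantissa * 2.0 ** exponent


# All positive words with a normalized mantissa (binary 100..111 is 4..7).
positive = sorted({
    decode(f"0{exp_sign}{exp_mag:02b}{mant:03b}")
    for exp_sign in "01"
    for exp_mag in range(4)
    for mant in range(4, 8)
})

print(len(positive))               # 28 distinct positive values
print(positive[0], positive[-1])   # 0.0625 7.0
print(positive[1] - positive[0])   # 0.015625, the gap in the first interval
print(positive[5] - positive[4])   # 0.03125, the gap doubles at the next exponent
```

A set is used because an exponent of +0 and -0 encode the same values. The printed gaps reproduce the spacing described in the solution and drawn in Fig. 3.7.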
Figure 3.7 The hypothetical number system developed in Example 3.4.
Each value is indicated by a tick mark. Only the positive numbers are
shown. An identical set would also extend in the negative direction.
• From Figure 3.7, it can be seen that:
1. There is a limited range of quantities that may be represented. An attempt to employ numbers outside the acceptable range results in an overflow error (magnitude too large) or an underflow error (magnitude too small).
2. There are only a finite number of quantities that can be represented within the range, so the degree of precision is limited.
Example: π = 3.14159265358… is to be stored on a base-10 system carrying 7 significant digits, so it must be approximated.
  - 1st method, chopping: chop off the 8th and higher digits. π ≈ 3.141592, with a chopping error of E_t = 0.00000065.
  - 2nd method, rounding: π ≈ 3.141593, with a rounding error of magnitude E_t = 0.00000035.
The error introduced by these approximations is called quantizing error.
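The chopping and rounding example can be reproduced with Python's standard `decimal` module, which works in base 10 and lets the rounding mode be chosen explicitly. This is an illustrative sketch; the variable names are assumptions, and the value of π is truncated to the digits quoted in the slide.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

pi_value = Decimal("3.14159265358")
step = Decimal("0.000001")  # keeps 7 significant digits for a number of the form d.dddddd

chopped = pi_value.quantize(step, rounding=ROUND_DOWN)     # discard the extra digits
rounded = pi_value.quantize(step, rounding=ROUND_HALF_UP)  # round to nearest

chop_error = pi_value - chopped    # true value minus approximation, as in Eq. (3.2)
round_error = pi_value - rounded

print(chopped, chop_error)
print(rounded, round_error)
```

The rounding error comes out negative because the rounded value overshoots π, and its magnitude is roughly half that of the chopping error, in line with the factor of 2 between the chopping and rounding bounds.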
3. The interval between numbers, Δx, increases as the numbers grow in magnitude. This allows floating-point representation to preserve significant digits.
• The quantizing error is proportional to the magnitude of the number being represented.
• For normalized floating-point numbers with chopping employed, the proportionality is
$$\frac{|\Delta x|}{|x|} \le \varepsilon \qquad (3.9)$$
• For normalized floating-point numbers with rounding employed, the proportionality is
$$\frac{|\Delta x|}{|x|} \le \frac{\varepsilon}{2} \qquad (3.10)$$
where $\varepsilon$ is the machine epsilon,
$$\varepsilon = b^{1-t} \qquad (3.11)$$
b is the number base, and t is the number of significant digits in the mantissa.
EXAMPLE 3.5
Machine Epsilon
Problem Statement. Determine the machine epsilon and verify its effectiveness in characterizing the errors of the number system from Example 3.4. Assume that chopping is used.
Solution. The hypothetical floating-point system from Example 3.4 employed values of the base b = 2 and the number of mantissa bits t = 3. Therefore, the machine epsilon would be [Eq. (3.11)]
$$\varepsilon = 2^{1-3} = 0.25$$
Consequently, the relative quantizing error should be bounded by 0.25 for chopping. The largest relative errors should occur for those quantities that fall just below the upper bound of the first interval between successive equispaced numbers (Fig. 3.8). Those numbers falling in the succeeding higher intervals would have the same value of Δx but a greater value of x and, hence, would have a lower relative error. An example of a maximum error would be a value falling just below the upper bound of the interval between (0.125000)_10 and (0.156250)_10. For this case, the error would be less than
$$\frac{0.03125}{0.125000} = 0.25$$
Thus, the error is as predicted by Eq. (3.9).
Figure 3.8 The largest quantizing error will occur for those values falling just below the upper bound of the first of a series of equispaced intervals.
Example 3.6: Machine Epsilon

• Practical applications of the magnitude dependence of quantizing errors in numerical methods:
1. Testing for convergence of quantities.
2. Stopping mechanisms for iterative processes.
  - To test whether two quantities are equal, it is advisable to test whether their difference is less than an acceptably small tolerance.
  - Normalized (relative) rather than absolute differences should be compared, particularly for numbers of large magnitude.
  - Machine epsilon can be used in formulating stopping or convergence criteria. This ensures that programs are portable, that is, not dependent on the computer on which they are implemented.
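The advice above about normalized differences can be sketched in Python as follows. The helper `nearly_equal` and its default tolerance of a few machine epsilons are assumptions chosen for illustration, not a recommendation from the lecture; Python's standard library also provides `math.isclose`, which implements a similar relative test.

```python
import math
import sys


def nearly_equal(a: float, b: float, rel_tol: float = 4 * sys.float_info.epsilon) -> bool:
    """Compare two floats using a difference normalized by their magnitude."""
    scale = max(abs(a), abs(b))
    if scale == 0.0:
        return True                      # both values are exactly zero
    return abs(a - b) / scale <= rel_tol


small_case = nearly_equal(0.1 + 0.2, 0.3)      # True, although 0.1 + 0.2 != 0.3
large_case = nearly_equal(1e20 + 1e4, 1e20)    # True: the gap is tiny relative to 1e20
different = nearly_equal(1.0, 1.001)           # False: a genuine difference

# The absolute difference in the large case is in the thousands, so a fixed
# absolute tolerance suited to numbers near 1 would wrongly report a mismatch.
absolute_gap = abs((1e20 + 1e4) - 1e20)

print(small_case, large_case, different, absolute_gap)
print(math.isclose(0.1 + 0.2, 0.3, rel_tol=1e-9))   # standard-library alternative
```

Tying the tolerance to `sys.float_info.epsilon` rather than a hard-coded constant is one way to keep the test independent of the machine's floating-point format.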
Figure 3.9 Pseudocode to determine machine epsilon for a binary computer.
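The pseudocode of Fig. 3.9 did not survive in these notes, so the following Python function is only a sketch of the usual halving approach and may differ in detail from the figure. It halves a trial value until adding it to 1 no longer changes the result, then steps back by one factor of 2. The comparison with `sys.float_info.epsilon` assumes Python floats are IEEE 754 double precision, which is the case on common platforms.

```python
import sys


def machine_epsilon() -> float:
    """Estimate machine epsilon for the platform's binary float type."""
    epsilon = 1.0
    # Keep halving while 1 + epsilon is still distinguishable from 1.
    while epsilon + 1.0 > 1.0:
        epsilon /= 2.0
    # The loop overshoots by one halving, so undo it.
    return 2.0 * epsilon


eps = machine_epsilon()
print(eps)
print(eps == sys.float_info.epsilon)
```

For double precision the mantissa carries t = 53 significant bits, so Eq. (3.11) gives 2^(1-53) = 2^-52, which is the value this loop is expected to return.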
3.4.2 Arithmetic Manipulations of Computer Numbers
• Arithmetic manipulations cause round-off error.
• Example: a computer uses a 4-digit mantissa and a 1-digit exponent, and chopping is used.
Addition: 0.1557 × 10^1 + 0.4381 × 10^-1
The mantissa of the number with the smaller exponent is modified so that the exponents match: 0.4381 × 10^-1 = 0.004381 × 10^1

  0.1557   × 10^1
+ 0.004381 × 10^1
-----------------
  0.160081 × 10^1

The result is chopped to 0.1600 × 10^1. The last two digits of the second number that were shifted to the right have been lost from the computation.
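The addition above can be imitated with Python's `decimal` module by creating a context that keeps 4 significant digits and chops instead of rounding. This is an illustrative stand-in for the hypothetical 4-digit machine, not a model of any real hardware; the name `chop4` is an assumption.

```python
from decimal import Decimal, Context, ROUND_DOWN

# A 4-significant-digit decimal "machine" that chops (ROUND_DOWN).
chop4 = Context(prec=4, rounding=ROUND_DOWN)

a = Decimal("1.557")     # 0.1557 x 10^1
b = Decimal("0.04381")   # 0.4381 x 10^-1

exact = a + b                  # default context has ample precision: 1.60081
machine = chop4.add(a, b)      # result is chopped to 4 significant digits
lost = exact - machine         # the digits that fell off the end

print(exact, machine, lost)
```

The chopped result 1.600 corresponds to 0.1600 × 10^1 in the slide, and the lost part 0.00081 is exactly the two shifted digits mentioned above.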
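• Subtraction (the process is identical to addition, except that the sign of the subtrahend is reversed):
Example 1:
  0.3641 × 10^2
- 0.2686 × 10^2
---------------
  0.0955 × 10^2, which is normalized to 0.9550 × 10^1

Example 2:
  0.7642 × 10^3
- 0.7641 × 10^3
---------------
  0.0001 × 10^3, which is normalized to 0.1000 × 10^0

In Example 2, the three zeros appended to the mantissa during normalization are not significant digits.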
• Multiplication is performed by multiplying the mantissas and adding the exponents.
• Division is performed by dividing the mantissas and subtracting the exponents.
• In a large computation, individual round-off errors can accumulate and lead to a large total round-off error.
• Adding a large and a small number causes round-off error. This usually occurs in the computation of infinite series.
Example:
  0.4000    × 10^4
+ 0.0000001 × 10^4
------------------
  0.4000001 × 10^4
which is chopped to 0.4000 × 10^4; the addition might as well not have been performed.
• Subtractive cancellation: round-off error induced when subtracting two nearly equal floating-point numbers. Double precision can be used to address this problem.
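Both effects in the last two bullets can be illustrated with a 4-digit chopping context from Python's `decimal` module. This is a sketch under stated assumptions: the context `chop4` stands in for the hypothetical 4-digit machine, and the two nearly equal values 764.2468 and 764.1357 are made-up numbers chosen for illustration, not data from the lecture.

```python
from decimal import Decimal, Context, ROUND_DOWN

chop4 = Context(prec=4, rounding=ROUND_DOWN)   # 4-digit mantissa with chopping

# Adding a large and a small number: the small one disappears entirely.
large_plus_small = chop4.add(Decimal("4000"), Decimal("0.001"))

# Subtractive cancellation with two nearly equal (illustrative) values.
x_true, y_true = Decimal("764.2468"), Decimal("764.1357")
x = chop4.create_decimal(x_true)     # stored as 764.2
y = chop4.create_decimal(y_true)     # stored as 764.1
machine_diff = chop4.subtract(x, y)  # 0.1, only one meaningful digit remains
true_diff = x_true - y_true          # 0.1111
relative_error = abs(true_diff - machine_diff) / true_diff * 100

print(large_plus_small)
print(machine_diff, true_diff, f"{relative_error:.1f}%")
```

Each stored operand is accurate to 4 digits, yet the difference is accurate to only about one digit, since the leading digits cancel and what remains is mostly chopping error.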
Week 3 Exercise 1
The exponential function $e^x$ can be computed using the Maclaurin series:
$$e^{x} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!}$$
Estimate the value of $e^{-5}$, adding terms until the absolute value of the approximate error $\varepsilon_a$ falls below a pre-specified error criterion $\varepsilon_s$ conforming to 5 significant figures. Calculate the true error and the approximate error for each term added.
True value: $e^{-5} = 6.737947 \times 10^{-3}$
Week 3 Exercise 2
(a) Evaluate the polynomial $y = x^3 - 7x^2 + 8x - 0.35$ at x = 1.37. Use 3-digit arithmetic with chopping. Evaluate the percent relative error.
(b) Repeat (a) but express y as $y = ((x - 7)x + 8)x - 0.35$. Evaluate the error and compare with (a).
