
Name : Rafi Ahmad Fadhlan

NIM : 205150207111061

1. Problems Chapter 17 Parallel Processing


17.14 An application program is executed on a nine-computer cluster. A
benchmark program took time T on this cluster. Further, it was found that 25% of
T was time in which the application was running simultaneously on all nine
computers. The remaining time, the application had to run on a single computer.
a. Calculate the effective speedup under the aforementioned condition as
compared to executing the program on a single computer. Also calculate α,
the percentage of code that has been parallelized (programmed or compiled
so as to use the cluster mode) in the preceding program.
b. Suppose that we are able to effectively use 17 computers rather than 9
computers on the parallelized portion of the code. Calculate the effective
speedup that is achieved.

Answer :

a. Given values:
n = 9
Fraction of the cluster's execution time spent running on all nine computers = 25% = 0.25
If the program were run on a single computer, the work done in the parallel portion
would take 9 × 0.25T = 2.25T, and the serial portion would still take 0.75T, so the
single-computer time would be 0.75T + 2.25T = 3T.
Effective speedup = single-computer time / cluster time
= [9 × 0.25 – 0.25 + 1]
= [2 + 1]
= 3
Therefore, the effective speedup is 3.
Of the 3T of single-computer work, the parallelized portion accounts for 2.25T, so the
fraction of code that has been parallelized is
α = 2.25T / 3T = 0.75
Therefore, the percentage of code that has been parallelized is 75%.
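
As a quick arithmetic check, here is a minimal Python sketch (my own illustration, not part of the original assignment; the variable names are arbitrary) that derives the single-computer time, the effective speedup, and α from the given time breakdown:

# Sketch: verify part (a) from the measured time breakdown on the 9-computer cluster.
n = 9                 # computers in the cluster
parallel_frac = 0.25  # fraction of cluster time T spent running on all n computers
T = 1.0               # benchmark time on the cluster (arbitrary unit)

serial_time = (1 - parallel_frac) * T               # 0.75T, runs on one computer either way
parallel_work = n * parallel_frac * T               # 2.25T of work if done on one computer
single_computer_time = serial_time + parallel_work  # 3T

speedup = single_computer_time / T                  # 3.0
alpha = parallel_work / single_computer_time        # 0.75 -> 75% of the code is parallelized

print(f"speedup = {speedup:.2f}, alpha = {alpha:.2%}")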

b. Now suppose that 17 computers, rather than 9, can be used effectively on the
parallelized portion of the code.

n = 17
α = 0.75 (from part a)
The parallelized portion, which would take 2.25T on a single computer, now takes
2.25T / 17 ≈ 0.132T, so the cluster time becomes 0.75T + 0.132T ≈ 0.882T.
Effective speedup = single-computer time / cluster time
= 3T / 0.882T
≈ 3.4
Equivalently, by Amdahl's law, speedup = 1 / [(1 – α) + α/n] = 1 / [0.25 + 0.75/17] ≈ 3.4.
(Note that with α = 0.75 the speedup can never exceed 1 / 0.25 = 4, no matter how many
computers are used.)
Therefore, the new effective speedup is approximately 3.4.
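
Continuing the same illustrative Python sketch for part (b) (again my own check, with hypothetical variable names), the snippet below spreads the 2.25T of parallel work over 17 computers and cross-checks the result against Amdahl's law:

# Sketch: part (b) -- the parallelized portion now runs on 17 computers.
n_new = 17
alpha = 0.75                # parallelized fraction from part (a)
single_computer_time = 3.0  # 3T, from part (a)
serial_time = 0.75          # 0.75T of serial work
parallel_work = 2.25        # 2.25T of single-computer work in the parallel portion

cluster_time = serial_time + parallel_work / n_new   # ~0.882T
speedup = single_computer_time / cluster_time        # ~3.4

# Cross-check with Amdahl's law: 1 / ((1 - alpha) + alpha / n)
amdahl = 1 / ((1 - alpha) + alpha / n_new)
print(f"speedup = {speedup:.2f}, Amdahl = {amdahl:.2f}")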
17.7 An earlier version of the IBM mainframe, the S/390 G4, used three levels of
cache. As with the z990, only the first level was on the processor chip [called the
processor unit (PU)]. The L2 cache was also similar to the z990. An L3 cache was
on a separate chip that acted as a memory controller, and was interposed
between the L2 caches and the memory cards. Table 17.4 shows the
performance of a three-level cache arrangement for the IBM S/390. The purpose
of this problem is to determine whether the inclusion of the third level of cache
seems worthwhile. Determine the access penalty (average number of PU cycles)
for a system with only an L1 cache, and normalize that value to 1.0. Then
determine the normalized access penalty when both an L1 and L2 cache are
used, and the access penalty when all three caches are used. Note the amount of
improvement in each case and state your opinion on the value of the L3 cache.

Answer :

If only the L1 cache is used, then 89% of the accesses are satisfied by L1 and the remaining
11% of the accesses go to main memory. Therefore, the average penalty is
(1 × 0.89) + (32 × 0.11) = 4.41 PU cycles, which is normalized to 1.0.
If both L1 and L2 are present, the average penalty is
(1 × 0.89) + (5 × 0.05) + (32 × 0.06) = 3.06, which normalizes to 3.06/4.41 = 0.69.
Thus, with the addition of the L2 cache, the average penalty is reduced to 69% of
that with only one cache.
If all three caches are present, the average penalty is
(1 × 0.89) + (5 × 0.05) + (14 × 0.03) + (32 × 0.03) = 2.52, and the normalized average
penalty is 2.52/4.41 = 0.57.
The further reduction of the normalized penalty from 0.69 to 0.57 would seem to justify the
inclusion of the L3 cache.
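
The same numbers can be reproduced with a short Python sketch (illustrative only; the hit fractions and cycle counts are the ones quoted above from Table 17.4, and the helper function average_penalty is my own):

# Sketch: average access penalty (in PU cycles) for the S/390 G4 cache hierarchy,
# using the access fractions and cycle counts quoted above.
def average_penalty(levels):
    """levels: list of (fraction_of_accesses, cycles) pairs summing to 1.0."""
    return sum(frac * cycles for frac, cycles in levels)

l1_only  = average_penalty([(0.89, 1), (0.11, 32)])                         # 4.41
l1_l2    = average_penalty([(0.89, 1), (0.05, 5), (0.06, 32)])              # 3.06
l1_l2_l3 = average_penalty([(0.89, 1), (0.05, 5), (0.03, 14), (0.03, 32)])  # 2.52

for name, penalty in [("L1 only", l1_only), ("L1+L2", l1_l2), ("L1+L2+L3", l1_l2_l3)]:
    print(f"{name}: {penalty:.2f} cycles, normalized {penalty / l1_only:.2f}")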
2. Review Questions Chapter 18 Multicore Computers

18.2 Give several reasons for the choice by designers to move to a multicore
organization rather than increase parallelism within a single processor.

Answer :
Reasons to move to a multicore organization rather than increase parallelism within a
single processor:

1. Pipelining

• Pipeline depth has grown from 3 stages to 5 stages and on to a dozen or more stages.

• Adding still more pipeline stages is difficult in practice, because it requires more
logic, more interconnections, and more control signals.

2. Superscalar execution

• Performance can be improved by providing multiple parallel pipelines.

• Again, increasing the number of parallel pipelines requires more logic and more
interconnections.

3. Simultaneous multithreading (SMT)

• Managing multiple threads over the same set of pipelines adds complexity and limits
how effectively the threads and pipelines can be used.

4. To avoid the growing complexity of the above approaches, designers have moved to a
multicore organization, which

• improves overall system efficiency and performance when running multiple applications,

• improves performance for compute-intensive applications, and

• simplifies the infrastructure.


18.4 List some examples of applications that benefit directly from the ability to
scale throughput with the number of cores.

Answer :

• Multi-threaded native applications
Multi-threaded applications are characterized by having a small number of highly
threaded processes. Examples of threaded applications include Lotus Domino or
Siebel CRM (Customer Relationship Management).

• Multi-process applications
Multi-process applications are characterized by the presence of many single-threaded
processes. Examples of multi-process applications include the Oracle database, SAP,
and PeopleSoft.

• Java applications
Java applications embrace threading in a fundamental way. Not only does the
Java language greatly facilitate multithreaded applications, but the Java Virtual
Machine is a multi-threaded process that provides scheduling and memory
management for Java applications. Java applications that can benefit directly
from multicore resources include application servers such as Sun’s Java
Application Server, BEA’s WebLogic, IBM’s WebSphere, and the open-source
Tomcat application server. All applications that use a Java 2 Platform, Enterprise
Edition (J2EE platform) application server can immediately benefit from
multicore technology.

• Multi-instance applications
Even if an individual application does not scale to take advantage of a large
number of threads, it is still possible to gain from multicore architecture by
running multiple instances of the application in parallel. If multiple application
instances require some degree of isolation, virtualization technology (for the
hardware or the operating system) can be used to provide each of them with its
own separate and secure environment.
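
These are all workloads in which independent units of work can be spread across cores. As a rough illustration, here is a small Python sketch of my own (not from the original answer; the task size and worker counts are arbitrary) that times the same CPU-bound batch with different numbers of worker processes; on a multicore machine the batch finishes faster as more cores are used:

# Sketch: a fixed batch of CPU-bound tasks processed with varying numbers of worker
# processes, illustrating throughput scaling with the number of cores.
import time
from multiprocessing import Pool

def busy_task(n: int) -> int:
    """Deliberately CPU-bound work standing in for one request/transaction."""
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    jobs = [200_000] * 64          # a fixed batch of independent tasks
    for workers in (1, 2, 4, 8):   # number of worker processes (i.e., cores used)
        start = time.perf_counter()
        with Pool(processes=workers) as pool:
            pool.map(busy_task, jobs)
        print(f"{workers} worker(s): {time.perf_counter() - start:.2f} s")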
