COS 464 Concurrent Programming by Dr. Mrs. Asogwa
TABLE OF CONTENTS
1. Introduction
2. Basic Concepts
4. Parallel Architecture
5. Threads Analysis
1. Introduction
This is an introductory lecture note on Concurrent Programming. The note will lead the
student to an understanding of the foundational concepts of concurrent programming. It
contains the introduction, basic concepts related to concurrent programming,
an introduction to concurrency and parallel computing, the need for parallel computers,
parallel architecture, threads analysis, and basic software and operating systems.
The reader is advised to get a firm grip on each discussion rather than stumble on
without it. To make an ‘A’ in the course, the student is advised to attend classes, study
the notes after each day’s lecture, ask the lecturer questions if there are issues and
obtain answers, do the associated assignments, and take the examination confidently
without depending on ‘microchip’.
2. Basic Concepts
At this level, it is assumed that you are already familiar with concepts associated with
programming. The concepts below are described just to refresh your memory.
Program: A program is a set of ordered instructions given to a computer to follow in
order to execute a specified task.
Programming: Programming is the act of writing computer programs.
Sequential programming: Act of writing computer instructions (code) that are executed
one after another.
Structured Programming
This is the act of using structured programming constructs (such as IF, WHILE, FOR, etc.)
to write computer programs.
In this course, you will advance to writing programs whose parts are not limited to
running one after another, but can run at the same time. Writing programs that run in
this manner is referred to as concurrent programming.
Concurrent: This means more than one thing happening, or being done, at the same time,
such as running two segments of code together, printing while formatting, or browsing
while downloading.
Transistor: This is a semiconductor device that acts as a gate or switch in electronic
equipment (such as a computer): it controls the flow of current and can also amplify
electrical signals.
Si: silicon (Si), a nonmetallic chemical element in the carbon family. The name silicon
derives from the Latin silex or silicis, meaning “flint” or “hard stone.” Amorphous
elemental silicon was first isolated and described as an element in 1824 by Jöns Jacob
Berzelius, a Swedish chemist.
Pipelining: a means of beginning one instruction while another instruction is still
running or has not finished running; in other words, instructions overlap in time.
Formally, it is a form of computer organization in which successive steps of an
instruction sequence are executed in turn by a sequence of modules able to operate
concurrently, so that a new instruction can be begun before the previous one is finished.
For example, while one module is executing an instruction, another module can already be
fetching and decoding the next.
Parallel programming: Parallel computing refers to the process of breaking down larger
problems into smaller, independent, often similar parts that can be executed
simultaneously by multiple processors communicating via shared memory; the results are
then combined upon completion as part of the overall output.
Parallel programming covers how to use more than one processor or computer to complete a
task. It can also be done by dividing the problem or the data among different
processors and allowing them to exchange information. There are different tools and
methods for parallel programming, such as MPI, Pthreads, and OpenMP (a minimal Pthreads
sketch is given below). Parallel programming is useful for solving complex or
large-scale problems that require high performance or efficiency.
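To make the idea concrete, here is a minimal sketch in C using Pthreads (one of the tools named above): the data is divided among the threads, each thread sums its own slice, and the partial results are combined once all threads have finished. The names (NTHREADS, sum_slice, partial, etc.) are illustrative choices for this sketch, not part of the lecture material.

#include <pthread.h>
#include <stdio.h>

#define N        1000
#define NTHREADS 4

static int  data[N];
static long partial[NTHREADS];      /* one result slot per thread */

static void *sum_slice(void *arg) {
    long id = (long)arg;            /* which slice this thread owns */
    int  lo = id * (N / NTHREADS);
    int  hi = lo + (N / NTHREADS);
    long s  = 0;
    for (int i = lo; i < hi; i++)
        s += data[i];
    partial[id] = s;                /* no conflict: each thread writes its own slot */
    return NULL;
}

int main(void) {
    pthread_t tid[NTHREADS];
    for (int i = 0; i < N; i++) data[i] = 1;

    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&tid[t], NULL, sum_slice, (void *)t);

    long total = 0;
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL); /* wait for each thread, then combine */
        total += partial[t];
    }
    printf("total = %ld\n", total); /* prints 1000 */
    return 0;
}

Compile with cc -pthread. The same divide-the-work/combine-the-results pattern underlies MPI and OpenMP programs as well.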
In the past, many efforts have been made towards increasing the speed of computer
systems, such as multiprogramming and pipelining. These efforts have yielded good
results; however, the most effective way of increasing the speed of a computer system is
to use a multi-processor computer system, or parallel computer, and allow the various
processors to execute, concurrently or in parallel, the parts of a program that solve a
specific problem. The drive to increase the speed of computer systems indefinitely is
therefore the drive to develop high-performance computers.
Furthermore, Moore’s law can be used to explain the rationale for true concurrent
programming. In 1965, Gordon Moore predicted that the number of transistors on
computer chips would double every eighteen months. Although many computer scientists
regard this as an economic prediction rather than a physical law, its effects are as
follows:
- The speed of computers will double as the number of transistors doubles every
eighteen months, since increasing the number of transistors on a chip increases its
speed.
- The size of transistors will continue to shrink, halving every eighteen months.
Though Moore’s law and its effects held for some decades, recent experience shows that
there is a hardware limit to the continued reduction in the size of transistors in the
attempt to increase the number of transistors on a chip, and thereby its speed. This
limit, or threshold point, has now been reached; as a result, the size of transistors
cannot be reduced much further. The reason for this limit is that all semiconductor
devices, like transistors, are Si based. A circuit element must take up at least a
single Si atom, and the covalent bond in Si has a bond length of approximately 0.235 nm
(nanometres), so the limit of miniaturization is close at hand; indeed, that limit has
now been reached. Because of this limitation, further speed can no longer be realized by
increasing the number of transistors. Computer scientists are therefore looking into
other avenues for increasing the speed of computers. Two such possible ways are Parallel
Computing and Nanocomputing. Parallel Computing is promising, as it has recorded
tremendous success, while Nanocomputing remains a technology for the future, perhaps two
decades away.
The diagram below illustrates Moore’s Law and its interpretation.
[Figure: number of transistors plotted against year, rising to a threshold point]
At the threshold point, any further increase in the number of transistors will reduce
the speed of the computer. This justifies the need to introduce more than one processor
into the computer system.
Disadvantages of Concurrency
Though concurrency has the many benefits mentioned above, it also has some
disadvantages, such as:
i. It can lead to problems like deadlock (a minimal sketch is given below).
ii. It can result in resource starvation.
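The following sketch in C with Pthreads shows how deadlock (disadvantage i) can arise: two threads acquire the same two locks in opposite orders, so each may end up holding one lock while waiting forever for the other. The thread and lock names are illustrative.

#include <pthread.h>

static pthread_mutex_t lock1 = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock2 = PTHREAD_MUTEX_INITIALIZER;

static void *thread_a(void *arg) {
    pthread_mutex_lock(&lock1);
    pthread_mutex_lock(&lock2);     /* blocks forever if B already holds lock2 */
    pthread_mutex_unlock(&lock2);
    pthread_mutex_unlock(&lock1);
    return NULL;
}

static void *thread_b(void *arg) {
    pthread_mutex_lock(&lock2);
    pthread_mutex_lock(&lock1);     /* blocks forever if A already holds lock1 */
    pthread_mutex_unlock(&lock1);
    pthread_mutex_unlock(&lock2);
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, thread_a, NULL);
    pthread_create(&b, NULL, thread_b, NULL);
    pthread_join(a, NULL);          /* may never return if the threads deadlock */
    pthread_join(b, NULL);
    return 0;
}

Whether the program actually deadlocks depends on the scheduler; the standard remedy is to make every thread acquire locks in the same fixed order.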
Because each of the processors executes concurrently, or at the same time, with the aim
of solving a specific problem, they must cooperate by communicating with each other.
Data must be exchanged and communicated among the various processors. This means that
the processors must be linked by some form of interconnection network.
The concept of division of labour in Economics requires that a task be split into
various parts and allocated to various workers. The workers will perform some of the
parts at the same time, while others wait until some have been completed. In the course
of executing the work, the workers cooperate and communicate with each other. The main
advantage of this division of labour is an increase in the level of production, or in
the amount of work done. The same concept is used in parallel computing: the computing
task is split into parts, and the parts are allocated to the various processors of the
parallel computer. Some of the computing tasks are executed at the same time by the
parallel processors, while some wait until others have completed. The processors
communicate and cooperate with each other by exchanging data and synchronizing each
other’s activities. The result is an increase in throughput.
However, low productivity can be the result of poor facilities with which the workers
communicate. The same thing applies to parallel computing: throughput can be low as a
result of an inefficient means of communication between the various processors;
therefore, the most efficient means of communication between the processors is desired.
One way of realizing it is to devise the most efficient algorithm for routing messages
from one processor to another.
Parallel computing, which requires the use of a parallel computer to solve problems in
reasonable time, has defined new research areas that add a parallel dimension to
computing. Whatever we can do with traditional computing on a uni-processor computer
has a parallel dimension on a parallel computer system. As a result, the following
research areas fall under parallel computing:
Parallel Architecture, Parallel Algorithms, Parallel/True Concurrent Programming,
Parallel Programming Languages, Parallel Computer System Performance Evaluation,
etc. Each of these broad research areas will be discussed briefly.
4. PARALLEL ARCHITECTURE
4.0 Introduction
Writing a true concurrent program or a parallel program requires that we understand the
architecture of the computer system that can be used to write the program. Parallel
architecture refers to the various ways of organizing or arranging the various components
of a parallel computer with the aim of enhancing the performance of the system. These
components include the following: memory, parallel processors, peripheral devices, and
the interconnection network connecting the various processors. This chapter uses Flynn’s
Taxonomy to examine the various architectures of computers, including parallel computers.
[Figures: MIMD organizations — (a) shared memory: the MIMD processors are connected to a
shared memory through an interconnection network; (b) distributed memory: each processor
has its own memory, and the processors communicate through an interconnection network]
Chain/Bus
[Figure: an 8-processor chain — processors 0 to 7 connected in a line]
In a Chain/Bus configuration, the processors are arranged in a linear form. Each
processor is connected to two other processors, except the processors at the front and
back of the chain. The maximum number of connections in an n-processor chain (counting
each bidirectional link twice, once per direction) is 2*(n-1); for the 8-processor chain
above, 2*7 = 14.
Two-dimensional Mesh
[Figure: a 20-processor two-dimensional mesh — processors 0 to 19 arranged in 4 rows of
5 columns]
This configuration is also known as an array. It consists of processors that are
arranged in a grid. As an array configuration, a two-dimensional mesh of n processors
can be decomposed into i rows and j columns of processors. Each row of the array can be
regarded as a horizontal chain, while each column can be regarded as a vertical chain.
A two-dimensional mesh of i rows and j columns of processors has a total of i*j
processors and a maximum of 2*(i-1)*j + 2*(j-1)*i connections; for the 20-processor mesh
above (i = 4, j = 5), this gives 2*3*5 + 2*4*4 = 62 connections.
Ring
[Figure: a 12-processor ring — processors 0 to 11 arranged in a circle]
The ring is one of the most common configurations. As in a chain/bus, the processors are
arranged in a linear form, and each processor is connected to two other processors; in a
bi-directional ring, messages travel in both directions around the ring. Unlike a
chain/bus, the processors at the front and back ends are also connected to each other in
both directions. Therefore, for a bi-directional, n-processor ring, the maximum number
of connections is 2*n; for the 12-processor ring above, 2*12 = 24.
Torus
[Figure: a 42-processor torus — processors 0 to 41 arranged in 6 rows of 7 columns, with
wrap-around links in every row and column]
A torus configuration can be regarded as a two-dimensional mesh in which each row and
each column of the array of processors is a ring. The torus can therefore be regarded as
a ring of rings of processors. A torus configuration with i rows and j columns of
processors has a total of i*j processors and a total of 2*i*j + 2*j*i = 4*i*j
connections; for the 42-processor torus above (i = 6, j = 7), 4*6*7 = 168 connections.
The connection-count formulas for all four configurations are collected in the sketch
below.
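The following small C sketch evaluates the formulas derived above (the function names are illustrative; each bidirectional link is counted twice, once per direction, as in the text):

#include <stdio.h>

int chain_connections(int n)        { return 2 * (n - 1); }
int ring_connections(int n)         { return 2 * n; }
int mesh_connections(int i, int j)  { return 2*(i-1)*j + 2*(j-1)*i; }
int torus_connections(int i, int j) { return 4 * i * j; }

int main(void) {
    printf("8-processor chain:   %d\n", chain_connections(8));    /* 14  */
    printf("12-processor ring:   %d\n", ring_connections(12));    /* 24  */
    printf("4x5 mesh (20 CPUs):  %d\n", mesh_connections(4, 5));  /* 62  */
    printf("6x7 torus (42 CPUs): %d\n", torus_connections(6, 7)); /* 168 */
    return 0;
}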
5. THREADS ANALYSIS
To understand what a thread is, assume that there is a newspaper lying on top of a
table, presented for users to read. Before anybody reads it, that newspaper is like a
program that has not started running. Then you come in, see the newspaper and pick it up
to read. Starting to read a section of the newspaper is like starting to run a part of a
program. Note that you do not read all the news in the newspaper at once; you may just
pick the sports section on the African Cup of Nations (AFCON) and start reading. The
entire newspaper is likened to a program, and the sports section is likened to a thread,
which is a unit of execution cutting through the code and the data structures of the
program (that is, a thread of control running in the program, or the portion of the
program to which the CPU is allocating attention at a particular time). Then your friend
comes in, sees a section on politics in the same newspaper and starts reading. Now you
and your friend are reading different sections of the same newspaper, and this is
likened to running multiple threads of the same program. What do you think will happen
if the two of you want to read the same section at once? There will be conflict, right?
In the same way, if two or more threads want to update (read/write, etc.) the same data
structure at the same time, there will be a conflict, such as a race or a deadlock; the
operating system uses a scheduler and synchronization mechanisms to allocate resources
appropriately and ensure that there is no conflict. A minimal sketch of such a conflict
and its resolution with a lock is given below.
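Here is a minimal sketch in C with Pthreads: two threads update the same shared counter. Without the lock, interleaved updates can be lost (a race); with the lock, only one thread performs the update at a time. The names counter, lock and worker are illustrative.

#include <pthread.h>
#include <stdio.h>

static long counter = 0;                     /* the shared data structure */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);           /* only one thread updates at a time */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);      /* always 200000 with the lock */
    return 0;
}

Removing the lock/unlock pair makes the final count unpredictable — exactly the two-readers-one-section conflict of the newspaper analogy.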
Components of a thread
A thread comprises
- A thread ID
- A program counter
- A set of registers and,
- A stack
A thread can share program code, data and files with other threads within a program but
must be individually assigned a stack and a set of registers.
Process
A process can be defined as a combination of a program and all the states of the threads
executing in the program. The main difference between a process and a thread is that a
thread cannot run on its own but runs within a program, while a process is a program in
execution. In the context of a browser program, for example, one thread may be
downloading a page while another is handling browsing, another is fetching a page from a
remote server, and so on.
- Threads are dependent on one another, since they share some resources, and hence there
may be conflict; processes are independent.
- Memory space is not protected between threads, since threads share memory, but
processes run independently in separate address spaces.
A minimal sketch of this memory-protection difference is given after this list.
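The sketch below, in C and assuming a POSIX system, shows the difference: a thread created with pthread_create shares the parent's global variable, while a child process created with fork gets its own copy of it. The variable and function names are illustrative.

#include <pthread.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static int shared = 0;

static void *peek(void *arg) {
    printf("thread sees shared = %d\n", shared);        /* prints 42 */
    return NULL;
}

int main(void) {
    shared = 42;

    pthread_t t;                    /* a thread: same address space */
    pthread_create(&t, NULL, peek, NULL);
    pthread_join(t, NULL);

    pid_t pid = fork();             /* a process: separate address space */
    if (pid == 0) {
        shared = 99;                /* modifies only the child's copy */
        return 0;
    }
    wait(NULL);
    printf("parent still sees shared = %d\n", shared);  /* still 42 */
    return 0;
}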