Concurrency in Python Tutorial
Audience
This tutorial will be useful for graduates, postgraduates, and research students who either
have an interest in this subject or have this subject as a part of their curriculum. The
reader can be a beginner or an advanced learner.
Prerequisites
The reader must have basic knowledge of operating system concepts such as concurrency, multiprocessing, threads, and processes. He/she should also be aware of the basic terminology used in operating systems, along with Python programming concepts.
All the content and graphics published in this e-book are the property of Tutorials Point (I) Pvt. Ltd. The user of this e-book is prohibited from reusing, retaining, copying, distributing or republishing any contents or a part of the contents of this e-book in any manner without the written consent of the publisher.
We strive to update the contents of our website and tutorials as timely and as precisely as possible; however, the contents may contain inaccuracies or errors. Tutorials Point (I) Pvt. Ltd. provides no guarantee regarding the accuracy, timeliness or completeness of our website or its contents, including this tutorial. If you discover any errors on our website or in this tutorial, please notify us at contact@tutorialspoint.com.
1. Concurrency in Python – Introduction
In this chapter, we will understand the concept of concurrency in Python and learn about
the different threads and processes.
What is Concurrency?
In simple words, concurrency is the occurrence of two or more events at the same time.
Concurrency is a natural phenomenon because many events occur simultaneously at any
given time.
A thread is the smallest unit of execution within a process, and threads are not independent of one another. Each thread shares the code section, data section, etc. with other threads. They are also known as lightweight processes.
Each thread has the following components:
●	Program counter, which consists of the address of the next executable instruction
●	Stack
●	Set of registers
●	A unique id
Multithreading, on the other hand, is the ability of a CPU, managed by the operating system, to execute multiple threads concurrently. The main idea of multithreading is to achieve parallelism by dividing a process into multiple threads. The concept of
multithreading can be understood with the help of the following example.
Example
Suppose we are running a particular process wherein we open MS Word to type content into it. One thread will be assigned to open MS Word and another thread will be required to type content in it. Now, if we want to edit the existing content, then another thread will be required to do the editing task, and so on.
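The idea above can be sketched with Python's threading module. This is a minimal sketch: open_document and type_content are hypothetical stand-ins for the MS Word tasks described above, not real MS Word operations.

```python
import threading

results = []

# Hypothetical stand-ins for the MS Word tasks described above.
def open_document():
    results.append("document opened")

def type_content():
    results.append("content typed")

# One thread per task; both run concurrently within the same process.
t1 = threading.Thread(target=open_document)
t2 = threading.Thread(target=type_content)
t1.start()
t2.start()
t1.join()  # wait for both threads to finish
t2.join()

print(results)
```

Because the two threads run concurrently, the order of the two entries in results may vary from run to run.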
A process can have only one thread, called the primary thread, or multiple threads, each having its own set of registers, program counter and stack.
Multiprocessing, on the other hand, is the use of two or more CPU units within a single computer system. Our primary goal is to get the full potential from our hardware. To achieve this, we need to utilize the full number of CPU cores available in our computer system.
Multiprocessing is the best approach to do so.
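As a sketch, assuming nothing beyond the standard library, the multiprocessing module reports the number of available cores and can distribute work across them:

```python
import multiprocessing

def square(n):
    return n * n

if __name__ == '__main__':
    # How many CPU cores this system exposes.
    print("Available cores:", multiprocessing.cpu_count())

    # A pool of worker processes, one per core by default.
    with multiprocessing.Pool() as pool:
        print(pool.map(square, [1, 2, 3, 4]))  # [1, 4, 9, 16]
```

The `if __name__ == '__main__':` guard is required on platforms that start worker processes by re-importing the script.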
Python is one of the most popular programming languages. Following are some reasons that make it suitable for concurrent applications:
Syntactic sugar
Syntactic sugar is syntax within a programming language that is designed to make things
easier to read or to express. It makes the language “sweeter” for human use: things can
be expressed more clearly, more concisely, or in an alternative style based on preference.
Python comes with magic methods, which can be defined to act on objects. These magic methods are used as syntactic sugar and are bound to easy-to-understand keywords and operators.
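For example, defining the magic method __add__ on a class of our own (the Vector class here is purely illustrative) lets the familiar + operator work on its instances:

```python
class Vector:
    """Illustrative 2-D vector; __add__ is the magic method behind '+'."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"

# '+' is syntactic sugar for Vector.__add__(v1, v2).
print(Vector(1, 2) + Vector(3, 4))  # Vector(4, 6)
```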
Large Community
The Python language has witnessed a massive adoption rate amongst data scientists and mathematicians working in the fields of AI, machine learning, deep learning and quantitative analysis.
However, there are some libraries and alternative implementations of Python, such as NumPy, Jython and IronPython, that work without any interaction with the GIL.
2. Concurrency in Python – Concurrency vs Parallelism
Both concurrency and parallelism are used in relation to multithreaded programs, but there is a lot of confusion about the similarity and difference between them. The big question in this regard: is concurrency parallelism or not? Although the two terms appear quite similar, the answer to the above question is NO, concurrency and parallelism are not the same. Now, if they are not the same, then what is the basic difference between them?
In simple terms, concurrency deals with managing access to shared state from different threads, whereas parallelism deals with utilizing multiple CPUs or CPU cores to improve the performance of the hardware.
Concurrency in Detail
Concurrency is when two tasks overlap in execution. It could be a situation where an application is progressing on more than one task at the same time, i.e., multiple tasks are making progress during overlapping periods of time.
Levels of Concurrency
In this section, we will discuss the three important levels of concurrency in terms of
programming:
Low-Level Concurrency
At this level of concurrency, there is explicit use of atomic operations. We cannot use this kind of concurrency for application building, as it is very error-prone and difficult to debug. Even Python does not support this kind of concurrency directly.
Mid-Level Concurrency
In this kind of concurrency, there is no use of explicit atomic operations; it uses explicit locks instead. Python and other programming languages support this kind of concurrency, and it is what most application programmers use.
High-Level Concurrency
In this kind of concurrency, neither explicit atomic operations nor explicit locks are used. Python has the concurrent.futures module to support this kind of concurrency.
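A minimal sketch of this high-level style: a ThreadPoolExecutor runs tasks concurrently while our code contains no explicit locks or atomic operations.

```python
from concurrent.futures import ThreadPoolExecutor

def cube(n):
    return n ** 3

# The executor owns the worker threads; we never touch a lock ourselves.
with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(cube, [1, 2, 3, 4]))

print(results)  # [1, 8, 27, 64]
```

executor.map preserves the input order of the results, which is why no coordination code is needed here.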
Correctness property
The correctness property means that the program or the system must provide the desired correct answer. To keep it simple, we can say that the system must map the starting program state to the final state correctly.
Safety property
The safety property means that the program or the system must remain in a “good” or “safe” state and never do anything “bad”.
Liveness property
This property means that a program or system must “make progress” and eventually reach some desirable state.
Sharing of data
An important issue while implementing concurrent systems is the sharing of data among multiple threads or processes. The programmer must ensure that locks protect the shared data so that all accesses to it are serialized and only one thread or process can access the shared data at a time. When multiple threads or processes all try to access the same shared data, all but one of them will be blocked and will remain idle. In other words, only one process or thread can proceed at a time while the lock is in force. There can be some simple solutions to remove the above-mentioned barriers.
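The locking idea above can be sketched with threading.Lock; without the lock, the four threads below could interleave their read-modify-write steps on the shared counter and lose updates.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        # Only one thread at a time may hold the lock, so access to the
        # shared counter is serialized.
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000
```

While one thread holds the lock, the other three are blocked and idle, exactly as described above.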
The following Python script requests a web page and measures the time our network took to fetch the requested page:
import urllib.request
import time

ts = time.time()
req = urllib.request.urlopen('http://www.tutorialspoint.com')
pageHtml = req.read()
te = time.time()
print("Page Fetching Time: {} Seconds".format(te - ts))
After executing the above script, we can get the page fetching time as shown below.
Output
Page Fetching Time: 1.0991398811340332 Seconds
We can see that the time to fetch the page is more than one second. Now, if we want to fetch thousands of different web pages, you can imagine how much time our network would take.
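One way to cut that waiting time is to fetch the pages concurrently. The sketch below is an illustration, not a standard API: the fetch_all helper and its worker count are our own inventions. It overlaps the network waits using a thread pool:

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Download one page and return its body.
    with urllib.request.urlopen(url) as req:
        return req.read()

def fetch_all(urls, fetch_fn=fetch, workers=8):
    # Overlap the network waits: up to 'workers' pages download at once.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(fetch_fn, urls))
```

Because each thread spends most of its time waiting on the network, this is much faster than fetching the pages one by one.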
What is Parallelism?
Parallelism may be defined as the art of splitting tasks into subtasks that can be processed simultaneously. It differs from concurrency, discussed above, in which two or more events merely overlap in time. With parallelism, a task is broken into a number of subtasks that can be processed in parallel at the same instant.
To get more idea about the distinction between concurrency and parallelism, consider the
following points:
An application can be concurrent but not parallel, which means that it processes more than one task at the same time but the tasks are not broken down into subtasks.
Necessity of Parallelism
We can achieve parallelism by distributing the subtasks among different cores of a single CPU or among multiple computers connected within a network.
If we talk about a real-life example of parallelism, the graphics card of our computer is the example that highlights the true power of parallel processing, because it has hundreds of individual processing cores that work independently and can execute at the same time. For this reason, we are able to run high-end applications and games as well.
Understanding the processors gives us the benefit of taking informed decisions while designing the software. We have the following two kinds of processors:
Single-core processors
Single-core processors are capable of executing one thread at any given time. These processors use context switching to store all the necessary information for a thread at a specific time and then restore the information later. The context switching mechanism helps us make progress on a number of threads within a given second, and it looks as if the system is working on multiple things.
Single-core processors come with many advantages. These processors require less power, and there is no complex communication protocol between multiple cores. On the other hand, the speed of single-core processors is limited, and they are not suitable for larger applications.
Multi-core processors
Multi-core processors have multiple independent processing units also called cores.
Such processors do not need context switching mechanism as each core contains
everything it needs to execute a sequence of stored instructions.
Fetch-Decode-Execute Cycle
The cores of multi-core processors follow a cycle for executing. This cycle is called the
Fetch-Decode-Execute cycle. It involves the following steps:
Fetch
This is the first step of cycle, which involves the fetching of instructions from the program
memory.
Decode
Recently fetched instructions would be converted to a series of signals that will trigger
other parts of the CPU.
Execute
It is the final step in which the fetched and the decoded instructions would be executed.
The result of execution will be stored in a CPU register.
One advantage here is that execution in multi-core processors is faster than in single-core processors, which makes them suitable for larger applications. On the other hand, the complex communication protocol between multiple cores is an issue, and multiple cores require more power than single-core processors.
3. Concurrency in Python – System & Memory Architecture
There are different system and memory architecture styles that need to be considered while designing a program or concurrent system. This is very necessary because one system and memory style may be suitable for one task but error-prone for another.
Advantages of SISD
The advantages of SISD architecture are as follows:
Disadvantages of SISD
The disadvantages of SISD architecture are as follows:
The best example of SIMD is graphics cards, which have hundreds of individual processing units. If we talk about the computational difference between SISD and SIMD, then for adding the arrays [5, 15, 20] and [15, 25, 10], a SISD architecture would have to perform three different add operations. With a SIMD architecture, on the other hand, we can add them in a single add operation.
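NumPy exposes this data-parallel style in Python (assuming NumPy is installed): the single vectorized + below adds all three element pairs in one operation, rather than three separate scalar additions.

```python
import numpy as np

a = np.array([5, 15, 20])
b = np.array([15, 25, 10])

# One vectorized operation instead of three separate scalar adds.
print(a + b)  # [20 40 30]
```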
Advantages of SIMD
The advantages of SIMD architecture are as follows:
●	Same operation on multiple elements can be performed using only one instruction.
Disadvantages of SIMD
The disadvantages of SIMD architecture are as follows:
A normal multiprocessor uses the MIMD architecture. These architectures are basically
used in a number of application areas such as computer-aided design/computer-aided
manufacturing, simulation, modeling, communication switches, etc.
When all the processors have equal access to all the peripheral devices, the system is
called a symmetric multiprocessor. When only one or a few processors can access the
peripheral devices, the system is called an asymmetric multiprocessor.
4. Concurrency in Python – Threads
In general, as we know, a thread is a very thin twisted string, usually of cotton or silk fabric, used for sewing clothes and the like. The same term, thread, is also used in the world of computer programming. Now, how do we relate the thread used for sewing clothes to the thread used in computer programming? The roles performed by the two threads are similar. In clothes, a thread holds the cloth together; in computer programming, a thread holds the computer program together and allows the program to execute sequential actions or many actions at once.
A thread is the smallest unit of execution in an operating system. It is not in itself a program but runs within a program. In other words, threads are not independent of one another and share the code section, data section, etc. with other threads. These threads are also known as lightweight processes.
States of Thread
To understand the functionality of threads in depth, we need to learn about the lifecycle
of the threads or the different thread states. Typically, a thread can exist in five distinct
states. The different states are shown below:
New Thread
A new thread begins its life cycle in the new state. However, at this stage, it has not yet
started and it has not been allocated any resources. We can say that it is just an instance
of an object.
Runnable
As the newly born thread is started, the thread becomes runnable, i.e., waiting to run. In this state, it has all the resources, but the task scheduler has not yet scheduled it to run.
Running
In this state, the thread makes progress and executes the task, which has been chosen
by the task scheduler to run. Now, the thread can go to either the dead state or the non-runnable/waiting state.
Non-running/waiting
In this state, the thread is paused because it is either waiting for the response of some I/O request or waiting for the completion of the execution of another thread.
Dead
A runnable thread enters the terminated state when it completes its task or otherwise
terminates.
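The lifecycle above can be observed from Python with threading.Thread.is_alive(); the sleep is only there to keep the thread in the running state long enough to inspect it.

```python
import threading
import time

def task():
    time.sleep(0.2)  # pretend to do some work

t = threading.Thread(target=task)  # New: created, not yet started
print(t.is_alive())                # False

t.start()                          # Runnable/Running
print(t.is_alive())                # True

t.join()                           # block until the thread is Dead
print(t.is_alive())                # False
```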
Types of Thread
In this section, we will see the different types of thread. The types are described below:
User Level Threads
In this case, the thread management kernel is not aware of the existence of threads. The thread library contains code for creating and destroying threads, for passing messages and data between threads, for scheduling thread execution, and for saving and restoring thread contexts. The application starts with a single thread.
Examples of user-level threads include:
●	Java threads
●	POSIX threads
Kernel Level Threads
In this case, the kernel does the thread management. There is no thread management code in the application area. Kernel threads are supported directly by the operating system. Any application can be programmed to be multithreaded. All of the threads within an application are supported within a single process.
The Kernel maintains context information for the process as a whole and for individual
threads within the process. Scheduling by the Kernel is done on a thread basis. The Kernel
performs thread creation, scheduling and management in Kernel space. Kernel threads
are generally slower to create and manage than the user threads. The examples of kernel
level threads are Windows, Solaris.
● The kernel can simultaneously schedule multiple threads from the same process on multiple processors.
● If one thread in a process is blocked, the Kernel can schedule another thread of the
same process.
● Transfer of control from one thread to another within the same process requires a
mode switch to the Kernel.
Each thread carries the following control information:
●	Thread state: It contains information related to the state (Running, Runnable, Non-Running, Dead) of the thread.
●	Program Counter (PC): It points to the current program instruction of the thread.
●	Register set: It contains the thread’s register values assigned to it for computations.
●	Stack Pointer: It points to the thread’s stack in the process. It contains the local variables under the thread’s scope.
●	Pointer to PCB: It contains the pointer to the process that created that thread.
The following table shows the comparison between process and thread:
6.	Process: In multiple processes, each process operates independently of the others.
	Thread: One thread can read, write or change another thread's data.
7.	Process: If there is any change in the parent process, it does not affect the child processes.
	Thread: If there is any change in the main thread, it may affect the behavior of the other threads of that process.