
Python Multithreading and Multiprocessing

December 19, 2018

1 Python Multithreading and Multiprocessing Tutorial


• Why is parallelism tricky in Python? (Hint: it's because of the GIL - the Global Interpreter Lock)
• Threads vs Processes: different ways of achieving parallelism. When to use one over the other?
• Parallel vs Concurrent: why in some cases we can settle for concurrency rather than parallelism
• Building a simple but practical example using the various techniques discussed.

1.0.1 Global Interpreter Lock


The Global Interpreter Lock (GIL) is one of the most controversial subjects in the Python world.
In CPython, the most popular implementation of Python, the GIL is a mutex that makes things
thread-safe.
Thread-safe: an implementation is guaranteed to be free of race conditions when accessed by
multiple threads simultaneously.
A race condition occurs when two or more threads can access shared data and they try to
change it at the same time. Because the thread scheduling algorithm can swap between threads at
any time, you don’t know the order in which the threads will attempt to access the shared data.
Therefore, the result of the change in data is dependent on the thread scheduling algorithm, i.e.
both threads are "racing" to access/change the data.
Problems often occur when one thread does a "check-then-act" (e.g. "check" if the value is X,
then "act" to do something that depends on the value being X) and another thread does something
to the value in between the "check" and the "act".
In order to prevent race conditions from occurring, you would typically put a lock around the
shared data to ensure only one thread can access the data at a time.
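As a quick illustrative sketch (the counter, thread count, and iteration count here are made up, not part of the original example), incrementing a shared counter from several threads is exactly such a check-then-act situation, and a threading.Lock around the shared data fixes it:

import threading

counter = 0
counter_lock = threading.Lock()

def safe_increment(n):
    # Each += is really a read-modify-write; the lock makes sure
    # no other thread can interleave between the read and the write.
    global counter
    for _ in range(n):
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=safe_increment, args=(100000,)) for _ in range(4)]
[t.start() for t in threads]
[t.join() for t in threads]
print(counter)  # 400000; without the lock, some updates could be lost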
The GIL makes it easy to integrate with external libraries that are not thread-safe, and it makes
non-parallel code faster.
Due to the GIL, we can’t achieve true parallelism via multithreading.
But stuff that happens outside the GIL realm is free to be parallel.
In this category fall long-running tasks like I/O and, fortunately, libraries like numpy.
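For example (a hypothetical sketch, not part of the original notebook, and it assumes numpy is installed), a large matrix product releases the GIL while the underlying C code runs, so several threads doing this kind of work can actually occupy multiple cores:

import threading
import numpy as np

a = np.random.rand(1500, 1500)
b = np.random.rand(1500, 1500)

def multiply():
    # numpy releases the GIL during the dot product, so these
    # threads are not serialized the way pure-Python loops would be.
    np.dot(a, b)

threads = [threading.Thread(target=multiply) for _ in range(4)]
[t.start() for t in threads]
[t.join() for t in threads]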

1.0.2 Threads vs Processes


A process is a program in execution - in other words, code that is running. Multiple processes
are always running on a computer, and they may be executing in parallel.

A process can have multiple threads. They execute the same code belonging to the parent
process. Ideally, they run in parallel, but not necessarily.
A process is an executing instance of an application. What does that mean? Well, for example,
when you double-click the Microsoft Word icon, you start a process that runs Word. A thread is
a path of execution within a process. Also, a process can contain multiple threads. When you
start Word, the operating system creates a process and begins executing the primary thread of that
process.
It’s important to note that a thread can do anything a process can do. But since a process can
consist of multiple threads, a thread could be considered a ‘lightweight’ process.
Threads within the same process share the same address space, whereas different processes do
not. This allows threads to read from and write to the same data structures and variables, and also
facilitates communication between threads.
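A minimal sketch of this difference (the list and function names are made up for illustration): a thread mutating a module-level list changes the parent's data, while a separate process only changes its own copy of that data:

import threading
import multiprocessing

data = []

def append_item():
    data.append("hello")

if __name__ == "__main__":
    # The thread shares the parent's address space, so the change is visible here.
    t = threading.Thread(target=append_item)
    t.start()
    t.join()
    print(data)   # ['hello']

    # The child process works on its own copy, so the parent's list is unchanged.
    p = multiprocessing.Process(target=append_item)
    p.start()
    p.join()
    print(data)   # still ['hello']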
Threads, of course, allow for multi-threading. A common example of the advantage of mul-
tithreading is the fact that you can have a word processor that prints a document using a back-
ground thread, but at the same time another thread is running that accepts user input, so that you
can type up a new document.
If we were dealing with an application that uses only one thread, then the application would
only be able to do one thing at a time – so printing and responding to user input at the same time
would not be possible in a single threaded application.
Sections of code that modify data structures shared by multiple threads are called critical sec-
tions. When a critical section is running in one thread it’s extremely important that no other thread
be allowed into that critical section.

1.0.3 Parallel vs Concurrent


Concurrency implies scheduling independent pieces of code to be executed in a cooperative manner:
taking advantage of the fact that a piece of code is waiting on I/O operations, and during that time
running a different but independent part of the code.
Processes A & B

Concurrent (execution interleaves, only one runs at any instant):

A ---   ---   ---
B    ---   ---   ---

Parallel (both run at the same time):

A ----------------
B ----------------

In [2]: import os
        import time
        import threading
        import multiprocessing

        NUM_WORKERS = 4

        def only_sleep():
            """ Do nothing, wait for a timer to expire """
            print("PID: {}, Process Name: {}, Thread Name: {}".format(
                os.getpid(),
                multiprocessing.current_process().name,
                threading.current_thread().name))
            time.sleep(1)

        def crunch_numbers():
            """ Do some computations """
            print("PID: {}, Process Name: {}, Thread Name: {}".format(
                os.getpid(),
                multiprocessing.current_process().name,
                threading.current_thread().name))
            x = 0
            while x < 10000000:
                x += 1

We have created two tasks. Both of them are long-running, but only crunch_numbers actively
performs computations.
Let's run only_sleep serially, multithreaded, and using multiple processes, and compare the
results.

In [3]: # Run tasks serially
        start_time = time.time()
        for _ in range(NUM_WORKERS):
            only_sleep()
        end_time = time.time()

        print("Serial time= {} \n".format(end_time - start_time))

        # Run tasks using threads
        start_time = time.time()
        threads = [threading.Thread(target=only_sleep) for _ in range(NUM_WORKERS)]
        print([thread for thread in threads])
        [thread.start() for thread in threads]
        [thread.join() for thread in threads]
        end_time = time.time()

        print("Threads time= {} \n".format(end_time - start_time))

        # Run tasks using processes
        start_time = time.time()
        processes = [multiprocessing.Process(target=only_sleep) for _ in range(NUM_WORKERS)]
        print([process for process in processes])
        [process.start() for process in processes]
        [process.join() for process in processes]
        end_time = time.time()

        print("Parallel time= {}".format(end_time - start_time))

PID: 17451, Process Name: MainProcess, Thread Name: MainThread


PID: 17451, Process Name: MainProcess, Thread Name: MainThread
PID: 17451, Process Name: MainProcess, Thread Name: MainThread
PID: 17451, Process Name: MainProcess, Thread Name: MainThread
Serial time= 4.005737543106079

[<Thread(Thread-4, initial)>, <Thread(Thread-5, initial)>, <Thread(Thread-6, initial)>, <Thread(Thread-7, initial)>]

PID: 17451, Process Name: MainProcess, Thread Name: Thread-5
PID: 17451, Process Name: MainProcess, Thread Name: Thread-6
PID: 17451, Process Name: MainProcess, Thread Name: Thread-4
PID: 17451, Process Name: MainProcess, Thread Name: Thread-7

Threads time= 1.0192456245422363

[<Process(Process-1, initial)>, <Process(Process-2, initial)>, <Process(Process-3, initial)>, <Process(Process-4, initial)>]


PID: 18100, Process Name: Process-1, Thread Name: MainThread
PID: 18102, Process Name: Process-2, Thread Name: MainThread
PID: 18106, Process Name: Process-3, Thread Name: MainThread
PID: 18109, Process Name: Process-4, Thread Name: MainThread
Parallel time= 1.048060655593872

In the case of the serial approach, things are pretty obvious. We’re running the tasks one after
the other. All four runs are executed by the same thread of the same process.
Using processes we cut the execution time down to a quarter of the original time, simply
because the tasks are executed in parallel. Notice how each task is performed in a different process
and on the MainThread of that process.
Using threads we take advantage of the fact that the tasks can be executed concurrently. The
execution time is also cut down to a quarter, even though nothing is running in parallel. Here’s
how that goes: we spawn the first thread and it starts waiting for the timer to expire. We pause its
execution, letting it wait for the timer to expire, and in this time we spawn the second thread. We
repeat this for all the threads. At one moment the timer of the first thread expires so we switch
execution to it and we terminate it. The algorithm is repeated for the second and for all the other
threads. At the end, the result is as if things were run in parallel. You’ll also notice that the four
different threads branch out from and live inside the same process: MainProcess.

In [5]: start_time = time.time()
        for _ in range(NUM_WORKERS):
            crunch_numbers()
        end_time = time.time()

        print("Serial time=", end_time - start_time)

        start_time = time.time()
        threads = [threading.Thread(target=crunch_numbers) for _ in range(NUM_WORKERS)]
        [thread.start() for thread in threads]
        [thread.join() for thread in threads]
        end_time = time.time()

        print("Threads time=", end_time - start_time)

        start_time = time.time()
        processes = [multiprocessing.Process(target=crunch_numbers) for _ in range(NUM_WORKERS)]
        [process.start() for process in processes]
        [process.join() for process in processes]
        end_time = time.time()

        print("Parallel time=", end_time - start_time)

PID: 17451, Process Name: MainProcess, Thread Name: MainThread


PID: 17451, Process Name: MainProcess, Thread Name: MainThread
PID: 17451, Process Name: MainProcess, Thread Name: MainThread
PID: 17451, Process Name: MainProcess, Thread Name: MainThread
Serial time= 2.1661345958709717
PID: 17451, Process Name: MainProcess, Thread Name: Thread-12

PID: 17451, Process Name: MainProcess, Thread Name: Thread-13


PID: 17451, Process Name: MainProcess, Thread Name: Thread-14
PID: 17451, Process Name: MainProcess, Thread Name: Thread-15
Threads time= 2.9878089427948
PID: 18132, Process Name: Process-9, Thread Name: MainThread
PID: 18133, Process Name: Process-10, Thread Name: MainThread
PID: 18141, Process Name: Process-12, Thread Name: MainThread
PID: 18138, Process Name: Process-11, Thread Name: MainThread
Parallel time= 1.2481975555419922

The main difference here is in the result of the multithreaded approach. This time it performs
very similarly to the serial approach, and here's why: since the task performs computations and,
because of the GIL, Python threads can't run bytecode in parallel, the threads are basically running
one after the other, yielding execution to one another until they all finish.

1.1 Building a Practical Application


Build an application that checks the uptime of websites.

• The application goes frequently over a list of website URLs and checks if those websites are up
• Every website should be checked every 5-10 minutes so that the downtime is not significant
• Instead of performing a classic HTTP GET request, it performs a HEAD request so that it
does not affect your traffic significantly
• If the HTTP status is in the danger ranges (400+, 500+), the owner is notified
• The owner is notified by email, text message, or push notification

Why is it essential to take a parallel/concurrent approach to the problem?

As the list of websites grows, going through the list serially won't guarantee us that every
website is checked every five minutes or so. The website could be down for hours, and the owner
won’t be notified.
In [6]: import time
        import logging
        import requests

        class WebsiteDownException(Exception):
            pass

        def ping_website(address, timeout=20):
            '''
            Check if the website is down: if status_code >= 400
            or if the timeout expires.
            Throw a WebsiteDownException if any of the website
            down conditions are met.
            '''
            try:
                response = requests.head(address, timeout=timeout)
                if response.status_code >= 400:
                    logging.warning("Website {} returned status_code={}".format(
                        address, response.status_code))
                    raise WebsiteDownException()
            except requests.exceptions.RequestException:
                logging.warning("Timeout expired for website {}".format(address))
                raise WebsiteDownException()

        def notify_owner(address):
            '''
            Send the owner of the address a notification.
            For now, we're going to sleep for 0.5 seconds.
            '''
            logging.info("Notifying the owner of {} website".format(address))
            time.sleep(0.5)

        def check_website(address):
            '''
            Utility function: check if website is down
            '''
            try:
                ping_website(address)
            except WebsiteDownException:
                notify_owner(address)
In [7]: WEBSITE_LIST = [
'http://envato.com',
'http://amazon.co.uk',
'http://amazon.com',
'http://facebook.com',
'http://google.com',
'http://google.fr',
'http://google.es',
'http://google.co.uk',
'http://internet.org',
'http://gmail.com',
'http://stackoverflow.com',
'http://github.com',
'http://heroku.com',
'http://really-cool-available-domain.com',
'http://djangoproject.com',
'http://rubyonrails.org',
'http://basecamp.com',
'http://trello.com',
'http://yiiframework.com',
'http://shopify.com',
'http://another-really-interesting-domain.co',
'http://airbnb.com',
'http://instagram.com',
'http://snapchat.com',
'http://youtube.com',
'http://baidu.com',
'http://yahoo.com',
'http://live.com',
'http://linkedin.com',
'http://yandex.ru',
'http://netflix.com',
'http://wordpress.com',
'http://bing.com',
]
In [8]: import time

        start_time = time.time()
        for address in WEBSITE_LIST:
            check_website(address)
        end_time = time.time()

        print("Time for Serial: {} secs".format(end_time - start_time))
WARNING:root:Timeout expired for website http://really-cool-available-domain.com
WARNING:root:Timeout expired for website http://another-really-interesting-domain.co
WARNING:root:Website http://live.com returned status_code=405
WARNING:root:Website http://netflix.com returned status_code=405
WARNING:root:Website http://bing.com returned status_code=405

Time for Serial: 27.111411809921265 secs

1.1.1 1. Threading Approach


Use a queue to put the addresses in and create worker threads to get them out of the queue and
process them. We are going to wait for the queue to be empty.

In [9]: import time

        from queue import Queue
        from threading import Thread

        NUM_WORKERS = 4
        task_queue = Queue()

        def worker():
            # Constantly check the queue for addresses
            while True:
                address = task_queue.get()
                check_website(address)

                # Mark the processed task as done
                task_queue.task_done()

        start_time = time.time()

        # Create the worker threads
        threads = [Thread(target=worker) for _ in range(NUM_WORKERS)]

        # Add the websites to the task queue
        [task_queue.put(item) for item in WEBSITE_LIST]

        # Start all the workers
        [thread.start() for thread in threads]

        # Wait for all the tasks in the queue to be processed
        task_queue.join()

        end_time = time.time()

        print("Time for Thread: {} secs".format(end_time - start_time))

WARNING:root:Timeout expired for website http://another-really-interesting-domain.co


WARNING:root:Timeout expired for website http://really-cool-available-domain.com
WARNING:root:Website http://live.com returned status_code=405
WARNING:root:Website http://bing.com returned status_code=405
WARNING:root:Website http://netflix.com returned status_code=405

Time for Thread: 11.231960535049438 secs

• join() in threading
When join() is invoked from the main thread, the main thread waits until the child thread
on which join() was invoked exits. The significance of the join() method is that, if join() is not
invoked, the main thread may finish before the child thread, which can result in undetermined
program behaviour and affect program invariants and the integrity of the data on which the
program operates.
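A minimal sketch of that behaviour (the function name is made up for illustration): with join(), the main thread's final print only runs after the child has finished; without it, the last line would be printed before the child's work is done:

import threading
import time

def child():
    time.sleep(1)
    print("child finished")

t = threading.Thread(target=child)
t.start()
t.join()                     # comment this out and "main exiting" prints first
print("main exiting")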

1.1.2 2. concurrent.futures
concurrent.futures is a high-level API for using threads. We will use a ThreadPoolExecutor. We’re
going to submit tasks to the pool and get back the futures, which are results that will be available
to us in the future. Of course, we can wait for all futures to become actual results.

In [11]: import time
         import concurrent.futures

         NUM_WORKERS = 4
         start_time = time.time()

         with concurrent.futures.ThreadPoolExecutor(max_workers=NUM_WORKERS) as executor:
             futures = {executor.submit(check_website, address)
                        for address in WEBSITE_LIST}
             concurrent.futures.wait(futures)

         end_time = time.time()
         print("Time for Future: {}".format(end_time - start_time))

WARNING:root:Timeout expired for website http://really-cool-available-domain.com


WARNING:root:Timeout expired for website http://another-really-interesting-domain.co
WARNING:root:Website http://live.com returned status_code=405
WARNING:root:Website http://bing.com returned status_code=405
WARNING:root:Website http://netflix.com returned status_code=405

Time for Future: 14.673298120498657

1.1.3 3. The Multiprocessing Approach


The multiprocessing library provides an almost drop-in replacement API for the threading
library. In this case, we're going to take an approach similar to the concurrent.futures one,
submitting tasks by mapping a function over the list of addresses (think of the classic Python
map function).

In [12]: import time
         import socket
         import multiprocessing

         NUM_WORKERS = 4
         start_time = time.time()

         with multiprocessing.Pool(processes=NUM_WORKERS) as pool:
             results = pool.map_async(check_website, WEBSITE_LIST)
             results.wait()

         end_time = time.time()
         print("Time for MultiProcessing: {}".format(end_time - start_time))

WARNING:root:Timeout expired for website http://really-cool-available-domain.com


WARNING:root:Timeout expired for website http://another-really-interesting-domain.co
WARNING:root:Website http://live.com returned status_code=405
WARNING:root:Website http://netflix.com returned status_code=405
WARNING:root:Website http://bing.com returned status_code=405

Time for MultiProcessing: 12.136277437210083

1.1.4 Gevent
Gevent is a popular alternative for achieving massive concurrency. A few things to know:

• Code performed concurrently by greenlets is deterministic. As opposed to the other presented
alternatives, this paradigm guarantees that for any two identical runs, you'll get the same
results in the same order.

• You need to monkey-patch the standard functions so that they cooperate with gevent. What
this means is that normally a socket operation is blocking: we're waiting for the operation
to finish. In a multithreaded environment, the scheduler would simply switch to another
thread while the other one is waiting for I/O. Since we are not in a multithreaded
environment, gevent patches the standard functions so that they become non-blocking and
return control to the gevent scheduler.

In [14]: import time

         from gevent.pool import Pool
         from gevent import monkey

         NUM_WORKERS = 4

         # Monkey-patch the socket module for HTTP requests
         monkey.patch_socket()

         start_time = time.time()

         pool = Pool(NUM_WORKERS)
         for address in WEBSITE_LIST:
             pool.spawn(check_website, address)

         # Wait for stuff to finish
         pool.join()

         end_time = time.time()

         print("Time for Monkey {}".format(end_time - start_time))

WARNING:root:Timeout expired for website http://really-cool-available-domain.com


WARNING:root:Timeout expired for website http://another-really-interesting-domain.co
WARNING:root:Website http://live.com returned status_code=405
WARNING:root:Website http://netflix.com returned status_code=405
WARNING:root:Website http://bing.com returned status_code=405

Time for Monkey 15.155159950256348

1.1.5 Celery
Celery is an approach that mostly differs from what we've seen so far. It is battle-tested in the
context of very complex and high-performance environments. Setting up Celery will require a bit
more tinkering than all the above solutions.
First, we’ll need to install Celery:

pip install celery

Tasks are the central concept within the Celery project. Everything that you'll want to run
inside Celery needs to be a task.
Celery offers great flexibility for running tasks:
you can run them synchronously or asynchronously, real-time or scheduled, on the same machine or on
multiple machines, and using threads, processes, Eventlet, or gevent.
Celery uses other services for sending and receiving messages. These messages are usually
tasks or results from tasks. We’re going to use Redis in this tutorial for this purpose.
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and
message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries,
bitmaps, hyperloglogs, geospatial indexes with radius queries and streams.
Install Redis by following the Redis Quickstart guide.
Also, install the redis Python library:

pip install redis

And the bundle necessary for using Redis and Celery:

pip install celery[redis]

Start the Redis server by:

redis-server

To get started building stuff with Celery, we’ll first need to create a Celery application. After
that, Celery needs to know what kind of tasks it might execute. To achieve that, we need to register
tasks to the Celery application. We'll do this using the app.task decorator.

In [19]: # Make file selery.py

         import time
         import logging
         import requests
         from celery import Celery
         from celery.result import ResultSet

         class WebsiteDownException(Exception):
             pass

         def ping_website(address, timeout=20):
             '''
             Check if the website is down: if status_code >= 400
             or if the timeout expires.
             Throw a WebsiteDownException if any of the website
             down conditions are met.
             '''
             try:
                 response = requests.head(address, timeout=timeout)
                 if response.status_code >= 400:
                     logging.warning("Website {} returned status_code={}".format(
                         address, response.status_code))
                     raise WebsiteDownException()
             except requests.exceptions.RequestException:
                 logging.warning("Timeout expired for website {}".format(address))
                 raise WebsiteDownException()

         def notify_owner(address):
             '''
             Send the owner of the address a notification.
             For now, we're going to sleep for 0.5 seconds.
             '''
             logging.info("Notifying the owner of {} website".format(address))
             time.sleep(0.5)

         def check_website(address):
             '''
             Utility function: check if website is down
             '''
             try:
                 ping_website(address)
             except WebsiteDownException:
                 notify_owner(address)

         WEBSITE_LIST = [
             'http://envato.com',
             'http://amazon.co.uk',
             'http://amazon.com',
             'http://facebook.com',
             'http://google.com',
             'http://google.fr',
             'http://google.es',
             'http://google.co.uk',
             'http://internet.org',
             'http://gmail.com',
             'http://stackoverflow.com',
             'http://github.com',
             'http://heroku.com',
             'http://really-cool-available-domain.com',
             'http://djangoproject.com',
             'http://rubyonrails.org',
             'http://basecamp.com',
             'http://trello.com',
             'http://yiiframework.com',
             'http://shopify.com',
             'http://another-really-interesting-domain.co',
             'http://airbnb.com',
             'http://instagram.com',
             'http://snapchat.com',
             'http://youtube.com',
             'http://baidu.com',
             'http://yahoo.com',
             'http://live.com',
             'http://linkedin.com',
             'http://yandex.ru',
             'http://netflix.com',
             'http://wordpress.com',
             'http://bing.com',
         ]

         app = Celery('selery',
                      broker='redis://localhost:6379/0',
                      backend='redis://localhost:6379/0')

         @app.task
         def check_website_task(address):
             return check_website(address)

         if __name__ == "__main__":
             start_time = time.time()

             # Using `delay` runs the task async
             rs = ResultSet([check_website_task.delay(address) for address in WEBSITE_LIST])

             # Wait for the tasks to finish
             rs.get()

             end_time = time.time()

             print("Celery:", end_time - start_time)

In the same folder where our Python file resides, start the Celery worker:

>> celery worker -A selery --loglevel=INFO --concurrency=4

Then,

>> python selery.py


Celery: 4.989539623260498

One thing to pay attention to: notice how we passed the Redis address to our Celery application
twice. The broker parameter specifies where the tasks are passed to Celery, and backend is where
Celery puts the results so that we can use them in our app. If we don't specify a result backend,
there's no way for us to know when the task was processed and what the result was.
