Python Language 581 860
Unlike some other languages, Python has no do-while or do-until construct (which would allow the loop body to execute once before the condition is tested). However, you can combine a while True with a break to achieve the same purpose.
a = 10
while True:
    a = a - 1
    print(a)
    if a < 7:
        break
print('Done.')
9
8
7
6
Done.
collection = [('a', 'b', 'c'), ('x', 'y', 'z'), ('1', '2', '3')]
This will also work for most types of iterables, not just tuples.
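The loop this snippet belongs to did not survive here; a sketch of unpacking each tuple inside a for loop might be:

```python
collection = [('a', 'b', 'c'), ('x', 'y', 'z'), ('1', '2', '3')]

# each tuple is unpacked into three names on every iteration
for letter1, letter2, letter3 in collection:
    print(letter1, letter2, letter3)
```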
https://riptutorial.com/ 517
Chapter 98: Manipulating XML
Remarks
Not all elements of the XML input will end up as elements of the parsed tree. Currently, this
module skips over any XML comments, processing instructions, and document type declarations
in the input. Nevertheless, trees built using this module’s API rather than parsing from XML text
can have comments and processing instructions in them; they will be included when generating
XML output.
Examples
Opening and reading using an ElementTree
Import the ElementTree object, open the relevant .xml file and get the root tag:
import xml.etree.ElementTree as ET
tree = ET.parse("yourXMLfile.xml")
root = tree.getroot()
There are a few ways to search through the tree. First is by indexing into the children; find and findall search by tag:

print(root[0][1].text)             # index into children: second child of the first child
print(root.findall("myTag"))       # all matching direct children of root
print(root[0].find("myOtherTag"))  # first matching child, or None
Import the ElementTree module, open an XML file and get an XML element:

import xml.etree.ElementTree as ET
tree = ET.parse('sample.xml')
root = tree.getroot()
element = root[0]  # get the first child of the root element
An Element object can be manipulated by changing its fields, adding and modifying attributes, and adding and removing children:

element.set('attribute_name', 'attribute_value')  # set an attribute on the element
element.text = "string_text"
root.remove(element)  # remove the element from its parent
tree.write('output.xml')
import xml.etree.ElementTree as ET
p = ET.Element('parent')
c = ET.SubElement(p, 'child1')
ET.dump(p)
# Output will be like this
#<parent><child1 /></parent>
If you want to save the tree to a file, create an ElementTree with the ElementTree() function and use its write() method:
tree = ET.ElementTree(p)
tree.write("output.xml")
Opening and reading large XML files using iterparse (incremental parsing)
Sometimes we don't want to load the entire XML file in order to get the information we need. In
these instances, being able to incrementally load the relevant sections and then delete them when
we are finished is useful. With the iterparse function you can edit the element tree that is stored
while parsing the XML.
import xml.etree.ElementTree as ET
Open the .xml file and iterate over all the elements:
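The snippet for this step did not survive extraction; a minimal sketch (writing a tiny file first so it is self-contained, with a hypothetical file name) might be:

```python
import xml.etree.ElementTree as ET

# create a small file so the sketch is self-contained
with open("yourXMLfile.xml", "w") as f:
    f.write("<root><child>text</child></root>")

# iterparse yields (event, element) pairs as the file is read
events = []
for event, elem in ET.iterparse("yourXMLfile.xml"):
    events.append((event, elem.tag))
```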
Alternatively, we can only look for specific events, such as start/end tags or namespaces. If this
option is omitted (as above), only "end" events are returned:
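The corresponding code is missing here; a sketch that explicitly requests both "start" and "end" events might be:

```python
import xml.etree.ElementTree as ET

with open("yourXMLfile.xml", "w") as f:
    f.write("<root><child>text</child></root>")

# request both "start" and "end" events instead of the default ("end",)
seen = []
for event, elem in ET.iterparse("yourXMLfile.xml", events=("start", "end")):
    seen.append((event, elem.tag))
```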
Here is the complete example showing how to clear elements from the in-memory tree when we
are finished with them:
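The complete example is not reproduced here; a simplified sketch of clearing elements once processed might be (a production version would also periodically clear references held by the root element):

```python
import xml.etree.ElementTree as ET

with open("yourXMLfile.xml", "w") as f:
    f.write("<root><record>a</record><record>b</record></root>")

texts = []
for event, elem in ET.iterparse("yourXMLfile.xml"):
    if elem.tag == "record":
        texts.append(elem.text)  # use the element's data...
        elem.clear()             # ...then clear it to free memory
```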
Starting with version 2.7, ElementTree has better support for XPath queries. XPath is a syntax that lets you navigate through an XML document, much as SQL is used to search through a database. Both the find and findall functions support XPath. The XML below will be used for this example:
<Catalog>
    <Books>
        <Book id="1" price="7.95">
            <Title>Do Androids Dream of Electric Sheep?</Title>
            <Author>Philip K. Dick</Author>
        </Book>
        <Book id="5" price="5.95">
            <Title>The Colour of Magic</Title>
            <Author>Terry Pratchett</Author>
        </Book>
        <Book id="7" price="6.95">
            <Title>The Eye of The World</Title>
            <Author>Robert Jordan</Author>
        </Book>
    </Books>
</Catalog>
import xml.etree.cElementTree as ET
tree = ET.parse('sample.xml')
tree.findall('Books/Book')
tree.find("Books/Book[@id='5']")
# searches with xml attributes must have '@' before the name
tree.find("Books/Book[2]")
# indexes start at 1, not 0
tree.find("Books/Book[last()]")
# 'last' is the only xpath function allowed in ElementTree
tree.findall(".//Author")
# searches with // must use a relative path
Chapter 99: Map Function
Syntax
• map(function, iterable[, *additional_iterables])
• future_builtins.map(function, iterable[, *additional_iterables])
• itertools.imap(function, iterable[, *additional_iterables])
Remarks
Everything that can be done with map can also be done with comprehensions:
import operator
alist = [1,2,3]
list(map(operator.add, alist, alist)) # [2, 4, 6]
[i + j for i, j in zip(alist, alist)] # [2, 4, 6]
List comprehensions are efficient and can be faster than map in many cases, so test the times of
both approaches if speed is important for you.
Examples
Basic use of map, itertools.imap and future_builtins.map
The map function is the simplest one among Python built-ins used for functional programming.
map() applies a specified function to each element in an iterable:
Python 3.x3.0
names = ['Fred', 'Wilma', 'Barney']  # an example list; the original definition was not preserved

map(len, names)  # map in Python 3.x is a class; its instances are iterable
# Out: <map object at 0x00000198B32E2CF8>
Python 2.x2.6

map(len, names)  # map in Python 2.x returns a list
# Out: [4, 5, 6]

Alternatively, in Python 2 one can use imap from itertools to get a generator:

Python 2.x2.3

from itertools import imap
imap(len, names)
# Out: <itertools.imap object at ...>
The result can be explicitly converted to a list to remove the differences between Python 2 and 3:
list(map(len, names))
# Out: [4, 5, 6]
For example, you can take the absolute value of each element:
list(map(abs, (1, -1, 2, -2, 3, -3))) # the call to `list` is unnecessary in 2.x
# Out: [1, 1, 2, 2, 3, 3]
def to_percent(num):
    return num * 100
functools.partial is a convenient way to fix parameters of functions so that they can be used with
map instead of using lambda or creating customized functions.
For example, calculating the average of each i-th element of multiple iterables:
def average(*args):
    return float(sum(args)) / len(args)  # cast to float - only mandatory for python 2.x
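The calls that used average and partial did not survive here; a sketch of both (pow as a stand-in for any two-argument function) might be:

```python
from functools import partial

def average(*args):
    return float(sum(args)) / len(args)

# average of each i-th element across the iterables
averages = list(map(average, [1, 2, 3], [4, 5, 6], [7, 8, 9]))
# averages == [4.0, 5.0, 6.0]

# partial fixes the first argument of pow, avoiding a lambda
powers_of_two = list(map(partial(pow, 2), [1, 2, 3]))
# powers_of_two == [2, 4, 8]
```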
There are different requirements if more than one iterable is passed to map depending on the
version of python:
Python 2.x2.0.1
• map: The mapping iterates as long as one iterable is still not fully consumed, but assumes None from the fully consumed iterables:
import operator

alist = [1, 2, 3, 4]
blist = [1, 2, 3]
map(operator.add, alist, blist)
# TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
• itertools.imap and future_builtins.map: The mapping stops as soon as one iterable stops:
import operator
from itertools import imap

alist = [1, 2, 3, 4]
blist = [1, 2, 3]
list(imap(operator.add, alist, blist))  # stops at the shortest iterable
# Out: [2, 4, 6]
Python 3.x3.0.0
import operator

alist = [1, 2, 3, 4]
blist = [1, 2, 3]
list(map(operator.add, alist, blist))  # map in Python 3.x stops at the shortest iterable
# Out: [2, 4, 6]
Transposing with Map: Using "None" as function argument (python 2.x only)
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # an example matrix; the original definition was not preserved

list(map(None, *image))
# Out: [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
from future_builtins import map as fmap  # fmap here is assumed to be future_builtins.map
list(fmap(None, *image))
# Out: [(1, 4, 7), (2, 5, 8), (3, 6, 9)]

from itertools import imap
list(imap(None, *image))
# Out: [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
Python 3.x3.0.0
list(map(None, *image))
# TypeError: 'NoneType' object is not callable
def conv_to_list(*args):
    return list(args)
list(map(conv_to_list, *image))
# Out: [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
map() is a built-in function, which means that it is available everywhere without the need to use an 'import' statement. It is available everywhere, just like print(). If you look at Example 5 you will see that I had to use an import statement before I could use pretty print (import pprint). Thus pprint is not a built-in function.
Series mapping
In this case each argument of the iterable is supplied as argument to the mapping function in
ascending order. This arises when we have just one iterable to map and the mapping function
requires a single argument.
Example 1
results in
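Example 1's code and output did not survive extraction; a sketch consistent with the insects list implied by Example 2 (whose len output is [3, 3, 6, 10]) might be:

```python
insects = ['fly', 'ant', 'beetle', 'cankerworm']  # assumed list, matching Example 2's output

# str.capitalize takes a single argument, so it pairs with one iterable
capitalized = list(map(str.capitalize, insects))
print(capitalized)
```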
Example 2
print(list(map(len, insects)))  # the len function is executed for each item in the insects list
results in
[3, 3, 6, 10]
Parallel mapping
In this case each argument of the mapping function is pulled from across all iterables (one from
each iterable) in parallel. Thus the number of iterables supplied must match the number of
arguments required by the function.
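Example 3's code is not reproduced here; a minimal sketch of parallel mapping (with a hypothetical two-argument function) might be:

```python
def add(a, b):
    return a + b

# two iterables for a two-argument function, consumed in parallel
sums = list(map(add, [1, 2, 3], [10, 20, 30]))
# sums == [11, 22, 33]
```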
Example 3
results in
Example 4
results in
TypeError: animals() missing 3 required positional arguments: 'x', 'y', and 'z'
Example 5
# here map supplies w, x, y, z with one value from across each list
import pprint
pprint.pprint(list(map(animals, insects, carnivores, herbivores, omnivores)))
results in
['Fly, lion, african buffalo, and chicken ARE ALL ANIMALS',
'Ant, tiger, moose, and dove ARE ALL ANIMALS',
'Beetle, leopard, okapi, and mouse ARE ALL ANIMALS',
'Cankerworm, arctic fox, parakeet, and pig ARE ALL ANIMALS']
Chapter 100: Math Module
Examples
Rounding: round, floor, ceil, trunc
In addition to the built-in round function, the math module provides the floor, ceil, and trunc
functions.
x = 1.55
y = -1.55
# the second argument gives how many decimal places to round to (if omitted, rounds to the nearest integer)
round(x, 1) # 1.6
round(y, 1) # -1.6
Python 2.x2.7
round(1.3) # 1.0
round(0.5) # 1.0
round(1.5) # 2.0
Python 3.x3.0
floor, ceil, and trunc always return an Integral value, while round returns an Integral value if called with one argument.
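A quick sketch of the difference in Python 3.x:

```python
import math

math.floor(1.7)   # 1
math.ceil(1.3)    # 2
math.trunc(1.7)   # 1, rounds towards zero
math.trunc(-1.7)  # -1, rounds towards zero
round(1.7)        # 2, an int when round is called with one argument
```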
round(1.3) # 1
round(1.33, 1) # 1.3
round in Python 3.x breaks ties towards the nearest even number. This corrects the bias towards larger numbers when performing a large number of calculations.
round(0.5) # 0
round(1.5) # 2
Warning!
As with any floating-point representation, some fractions cannot be represented exactly. This can
lead to some unexpected rounding behavior.
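A classic illustration in Python 3.x:

```python
# 2.675 is actually stored as 2.67499999999999982..., so it rounds down
round(2.675, 2)  # 2.67, not 2.68
```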
Warning about the floor, trunc, and integer division of negative numbers
Python's floor and integer division round toward negative infinity, which for negative numbers means rounding away from zero (note that C++ and Java integer division instead truncate toward zero). Consider:
>>> math.floor(-1.7)
-2.0
>>> -5 // 2
-3
Logarithms
math.log(math.e) # 1.0
math.log(1) # 0.0
math.log(100) # 4.605170185988092
math.log can lose precision with numbers close to 1, due to the limitations of floating-point
numbers. In order to accurately calculate logs close to 1, use math.log1p, which evaluates the
natural logarithm of 1 plus the argument:
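A quick sketch of the difference:

```python
import math

math.log(1 + 1e-20)  # 0.0 -- the 1e-20 is lost when added to 1.0
math.log1p(1e-20)    # 1e-20
```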
math.log10(10) # 1.0
Python 2.x2.3.0
When used with two arguments, math.log(x, base) gives the logarithm of x in the given base (i.e. log(x) / log(base)).
math.log(100, 10) # 2.0
math.log(27, 3) # 3.0
math.log(1, 10) # 0.0
Copying signs
In Python 2.6 and higher, math.copysign(x, y) returns x with the sign of y. The returned value is
always a float.
Python 2.x2.6
math.copysign(-2, 3) # 2.0
math.copysign(3, -3) # -3.0
math.copysign(4, 14.2) # 4.0
math.copysign(1, -0.0) # -1.0, on a platform which supports signed zero
Trigonometry
All math functions expect radians so you need to convert degrees to radians:
All results of the inverse trigonometic functions return the result in radians, so you may need to
convert it back to degrees:
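The conversion snippets are missing here; a sketch using math.radians and math.degrees might be:

```python
import math

math.sin(math.radians(90))  # 1.0 -- sine of 90 degrees
math.degrees(math.asin(1))  # 90.0 (up to float rounding)
```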
math.asin(1)
# Out: 1.5707963267948966 # "= pi / 2"
math.asin(1) / math.pi
# Out: 0.5
math.cos(math.pi / 2)
# Out: 6.123233995736766e-17
# Almost zero but not exactly because "pi" is a float with limited precision!
math.acos(1)
# Out: 0.0
Python 3.x3.5
math.atan(math.inf)
# Out: 1.5707963267948966 # This is just "pi / 2"
math.atan(float('inf'))
# Out: 1.5707963267948966 # This is just "pi / 2"
Apart from the math.atan there is also a two-argument math.atan2 function, which computes the
correct quadrant and avoids pitfalls of division by zero:
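The atan2 example is not shown here; a sketch might be:

```python
import math

math.atan2(1, 1)    # pi / 4, i.e. about 0.785398
math.atan2(1, 0)    # pi / 2 -- no division by zero
math.atan2(-1, -1)  # -3 * pi / 4, the correct (third) quadrant
```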
Constants
>>> from math import pi, e
>>> pi
3.141592653589793
>>> e
2.718281828459045
>>>
Python 3.5 and higher have constants for infinity and NaN ("not a number"). The older syntax of
passing a string to float() still works.
Python 3.x3.5
math.inf == float('inf')
# Out: True
-math.inf == float('-inf')
# Out: True
Imaginary Numbers
Imaginary numbers in Python are represented by a "j" or "J" trailing the target number.
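The example for this heading did not survive extraction; a minimal sketch might be:

```python
z = 2 + 3j      # the trailing j marks the imaginary part
type(z)         # the complex type
z.real, z.imag  # (2.0, 3.0) -- both parts are floats
```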
In all versions of Python, we can represent infinity and NaN ("not a number") as follows:
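The snippet itself is missing here; judging by the names used below, it was presumably:

```python
pos_inf = float('inf')    # positive infinity
neg_inf = float('-inf')   # negative infinity
not_a_num = float('nan')  # NaN ("not a number")
```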
In Python 3.5 and higher, we can also use the defined constants math.inf and math.nan:
Python 3.x3.5
pos_inf = math.inf
neg_inf = -math.inf
not_a_num = math.nan
We can test for either positive or negative infinity with the isinf function:
math.isinf(pos_inf)
# Out: True
math.isinf(neg_inf)
# Out: True
We can test specifically for positive infinity or for negative infinity by direct comparison:

pos_inf == float('inf')
# Out: True

neg_inf == pos_inf
# Out: False
Python 3.x3.2
math.isfinite(pos_inf)
# Out: False
math.isfinite(0.0)
# Out: True
import sys
sys.float_info.max
# Out: 1.7976931348623157e+308 (this is system-dependent)
But if an arithmetic expression produces a value larger than the maximum that can be represented
as a float, it will become infinity:
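For example:

```python
import sys

# doubling the largest representable float overflows to infinity
sys.float_info.max * 2  # inf
```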
However division by zero does not give a result of infinity (or negative infinity where appropriate),
rather it raises a ZeroDivisionError exception.
try:
    x = 1.0 / 0.0
    print(x)
except ZeroDivisionError:
    print("Division by zero")
0.0 * pos_inf
# Out: nan
0.0 * neg_inf
# Out: nan
pos_inf / pos_inf
# Out: nan
NaN is never equal to anything, not even itself. We can test for it with the isnan function:
not_a_num == not_a_num
# Out: False
math.isnan(not_a_num)
# Out: True
NaN always compares as "not equal", but never less than or greater than:
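For example:

```python
not_a_num = float('nan')

not_a_num < 5.0   # False
not_a_num > 5.0   # False
not_a_num == 5.0  # False
not_a_num != 5.0  # True -- the only comparison that is True
```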
Arithmetic operations on NaN always give NaN. This includes multiplication by -1: there is no
"negative NaN".
5.0 * not_a_num
# Out: nan
float('-nan')
# Out: nan
Python 3.x3.5
-math.nan
# Out: nan
There is one subtle difference between the old float versions of NaN and infinity and the Python
3.5+ math library constants:
Python 3.x3.5
The built-in ** operator often comes in handy, but if performance is of the essence, use math.pow.
Be sure to note, however, that pow returns floats, even if the arguments are integers:
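For example:

```python
import math

math.pow(2, 2)  # 4.0 -- always a float
2 ** 2          # 4  -- stays an int for int arguments
```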
The cmath module is similar to the math module, but defines functions appropriately for the complex
plane.
First of all, complex numbers are a numeric type that is part of the Python language itself rather
than being provided by a library class. Thus we don't need to import cmath for ordinary arithmetic
expressions.
z = 1 + 3j
We must use 1j since j would be the name of a variable rather than a numeric literal.
1j * 1j
Out: (-1+0j)
1j ** 1j
# Out: (0.20787957635076193+0j) # "i to the i" == math.e ** -(math.pi/2)
We have the real part and the imag (imaginary) part, as well as the complex conjugate:
# real part and imaginary part are both float type
z.real, z.imag
# Out: (1.0, 3.0)
z.conjugate()
# Out: (1-3j) # z.conjugate() == z.real - z.imag * 1j
The built-in functions abs and complex are also part of the language itself and don't require any
import:
abs(1 + 1j)
# Out: 1.4142135623730951 # square root of 2
complex(1)
# Out: (1+0j)
complex(imag=1)
# Out: (1j)
complex(1, 1)
# Out: (1+1j)
The complex function can take a string, but it can't have spaces:
complex('1+1j')
# Out: (1+1j)
complex('1 + 1j')
# Exception: ValueError: complex() arg is a malformed string
But for most functions we do need the module, for instance sqrt:
import cmath
cmath.sqrt(-1)
# Out: 1j
Naturally the behavior of sqrt is different for complex numbers and real numbers. In non-complex
math the square root of a negative number raises an exception:
import math
math.sqrt(-1)
# Exception: ValueError: math domain error
cmath.polar(1 + 1j)
# Out: (1.4142135623730951, 0.7853981633974483) # == (sqrt(1 + 1), atan2(1, 1))
cmath.rect(math.sqrt(2), math.atan(1))
# Out: (1.0000000000000002+1.0000000000000002j)
The mathematical field of complex analysis is beyond the scope of this example, but many
functions in the complex plane have a "branch cut", usually along the real axis or the imaginary
axis. Most modern platforms support "signed zero" as specified in IEEE 754, which provides
continuity of those functions on both sides of the branch cut. The following example is from the
Python documentation:
cmath.phase(complex(-1.0, 0.0))
# Out: 3.141592653589793
cmath.phase(complex(-1.0, -0.0))
# Out: -3.141592653589793
The cmath module also provides many functions with direct counterparts from the math module.
In addition to sqrt, there are complex versions of exp, log, log10, the trigonometric functions and
their inverses (sin, cos, tan, asin, acos, atan), and the hyperbolic functions and their inverses (sinh,
cosh, tanh, asinh, acosh, atanh). Note however there is no complex counterpart of math.atan2, the
two-argument form of arctangent.
cmath.log(1+1j)
# Out: (0.34657359027997264+0.7853981633974483j)
cmath.exp(1j * cmath.pi)
# Out: (-1+1.2246467991473532e-16j) # e to the i pi == -1, within rounding error
The constants pi and e are provided. Note these are float and not complex.
type(cmath.pi)
# Out: <class 'float'>
The cmath module also provides complex versions of isinf, and (for Python 3.2+) isfinite. See "
Infinity and NaN". A complex number is considered infinite if either its real part or its imaginary part
is infinite.
cmath.isinf(complex(float('inf'), 0.0))
# Out: True
Likewise, the cmath module provides a complex version of isnan. See "Infinity and NaN". A complex
number is considered "not a number" if either its real part or its imaginary part is "not a number".
cmath.isnan(complex(0.0, float('nan')))
# Out: True
Note that in Python 3.5 there is no cmath counterpart of the math.inf and math.nan constants; cmath.inf and cmath.nan were only added in Python 3.6.
Python 3.x3.5
cmath.isinf(complex(0.0, math.inf))
# Out: True
cmath.isnan(complex(math.nan, 0.0))
# Out: True
cmath.inf
# Exception: AttributeError: module 'cmath' has no attribute 'inf'
In Python 3.5 and higher, there is an isclose method in both cmath and math modules.
Python 3.x3.5
z = cmath.rect(*cmath.polar(1+1j))
z
# Out: (1.0000000000000002+1.0000000000000002j)
cmath.isclose(z, 1+1j)
# True
Chapter 101: Metaclasses
Introduction
Metaclasses allow you to deeply modify the behaviour of Python classes (in terms of how they're
defined, instantiated, accessed, and more) by replacing the type metaclass that new classes use
by default.
Remarks
When designing your architecture, consider that many things which can be accomplished with
metaclasses can also be accomplished using simpler semantics:
Examples
Basic Metaclasses
When type is called with three arguments it behaves as the (meta)class it is, and creates a new instance, i.e. it produces a new class/type.
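For example, with a hypothetical class name and attribute:

```python
# name, bases tuple, attributes dict -> a brand new class
MyDynamicClass = type('MyDynamicClass', (object,), {'x': 5})

obj = MyDynamicClass()
obj.x                 # 5
type(MyDynamicClass)  # type itself is the metaclass here
```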
class mytype(type):
    def __init__(cls, name, bases, dict):
        # call the base initializer
        type.__init__(cls, name, bases, dict)
Now, we have a new custom mytype metaclass which can be used to create classes in the same
manner as type.
When we create a new class using the class keyword, the metaclass is by default chosen based upon the base classes.
>>> class Foo(object):
...     pass
>>> type(Foo)
type
In the above example the only baseclass is object, so our metaclass will be the type of object, which is type. It is possible to override the default; however, it depends on whether we use Python 2 or Python 3:
Python 2.x2.7
class MyDummy(object):
    __metaclass__ = mytype

type(MyDummy)  # <class '__main__.mytype'>
Python 3.x3.0
class MyDummy(metaclass=mytype):
    pass

type(MyDummy)  # <class '__main__.mytype'>
Any keyword arguments (except metaclass) in the class declaration will be passed to the
metaclass. Thus class MyDummy(metaclass=mytype, x=2) will pass x=2 as a keyword argument to the
mytype constructor.
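A sketch of receiving such keywords in Python 3 (KwargMeta is a hypothetical name; the kwargs must not be forwarded to type.__new__):

```python
class KwargMeta(type):
    def __new__(mcs, name, bases, namespace, **kwargs):
        # extra class keywords arrive here; don't forward them to type.__new__
        return super().__new__(mcs, name, bases, namespace)

    def __init__(cls, name, bases, namespace, **kwargs):
        super().__init__(name, bases, namespace)
        cls.extra = kwargs  # stash them on the class for demonstration

class MyDummy(metaclass=KwargMeta, x=2):
    pass

MyDummy.extra  # {'x': 2}
```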
A singleton is a pattern that restricts the instantiation of a class to one instance/object. For more
info on python singleton design patterns, see here.
class SingletonType(type):
    def __call__(cls, *args, **kwargs):
        try:
            return cls.__instance
        except AttributeError:
            cls.__instance = super(SingletonType, cls).__call__(*args, **kwargs)
            return cls.__instance
Python 2.x2.7
class MySingleton(object):
    __metaclass__ = SingletonType
Python 3.x3.0
class MySingleton(metaclass=SingletonType):
    pass
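Putting the pieces together (repeating the metaclass from above so the snippet is self-contained), both names refer to the same instance:

```python
class SingletonType(type):
    def __call__(cls, *args, **kwargs):
        try:
            return cls.__instance
        except AttributeError:
            cls.__instance = super(SingletonType, cls).__call__(*args, **kwargs)
            return cls.__instance

class MySingleton(metaclass=SingletonType):
    pass

a = MySingleton()
b = MySingleton()
a is b  # True -- both names refer to the single instance
```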
Using a metaclass
Metaclass syntax
Python 2.x2.7
class MyClass(object):
    __metaclass__ = SomeMetaclass
Python 3.x3.0
class MyClass(metaclass=SomeMetaclass):
    pass
For compatibility with both Python 2 and Python 3, the six library provides a helper:

import six

class MyClass(six.with_metaclass(SomeMetaclass)):
    pass
Functionality in metaclasses can be changed so that whenever a class is built, a string is printed
to standard output, or an exception is thrown. This metaclass will print the name of the class being
built.
class VerboseMetaclass(type):
    def __new__(cls, class_name, bases, attrs):
        print("Creating class", class_name)
        return super().__new__(cls, class_name, bases, attrs)

class Spam(metaclass=VerboseMetaclass):
    def eggs(self):
        print("[insert example string here]")

s = Spam()
s.eggs()

The standard output will be:

Creating class Spam
[insert example string here]
Introduction to Metaclasses
What is a metaclass?
In Python, everything is an object: integers, strings, lists, even functions and classes themselves
are objects. And every object is an instance of a class.
>>> type(5)
<type 'int'>
>>> type(str)
<type 'type'>
>>> type([1, 2, 3])
<type 'list'>
Most classes in Python are instances of type. type itself is also a class. Classes whose instances are themselves classes are called metaclasses.
class SimplestMetaclass(type):
    pass

class MyClass(object):
    __metaclass__ = SimplestMetaclass
That does not add any functionality, but it is a new metaclass; note that MyClass is now an instance of SimplestMetaclass:
>>> type(MyClass)
<class '__main__.SimplestMetaclass'>
A metaclass can also override __new__ to modify the attributes of the class to be created, before calling the original __new__ which creates the class:
class AnotherMetaclass(type):
    def __new__(cls, name, parents, dct):
        # cls is this metaclass
        # name is the name of the class to be created
        # parents is the tuple of the class's parent classes
        # dct is the dict of the class's attributes (methods, static variables)

        # here all of the attributes can be modified before creating the class, e.g.
        dct['an_attribute'] = 'added by the metaclass'  # a hypothetical example

        # return value is the new class. super will take care of that
        return super(AnotherMetaclass, cls).__new__(cls, name, parents, dct)
You may have heard that everything in Python is an object. It is true, and all objects have a class:

>>> type(1)
int

>>> class Foo(object):
...     pass
>>> bar = Foo()
>>> type(bar)
Foo
Nice, bar is an instance of Foo. But what is the class of Foo itself?
>>> type(Foo)
type
>>> type(type)
type
So what is a metaclass? For now let's pretend it is just a fancy name for the class of a class.
Takeaways:
• Everything is an object in Python, so everything has a class
• The class of a class is called a metaclass
• The default metaclass is type, and by far it is the most common metaclass
But why should you know about metaclasses? Well, Python itself is quite "hackable", and the
concept of metaclass is important if you are doing advanced stuff like meta-programming or if you
want to control how your classes are initialized.
Chapter 102: Method Overriding
Examples
Basic method overriding
Here is an example of basic overriding in Python (for the sake of clarity and compatibility with both
Python 2 and 3, using new style class and print with ()):
class Parent(object):
    def introduce(self):
        print("Hello!")

    def print_name(self):
        print("Parent")


class Child(Parent):
    def print_name(self):
        print("Child")


p = Parent()
c = Child()
p.introduce()
p.print_name()
c.introduce()
c.print_name()
$ python basic_override.py
Hello!
Parent
Hello!
Child
When the Child class is created, it inherits the methods of the Parent class. This means that any
methods that the parent class has, the child class will also have. In the example, the introduce is
defined for the Child class because it is defined for Parent, despite not being defined explicitly in
the class definition of Child.
In this example, the overriding occurs when Child defines its own print_name method. If this method were not declared, then c.print_name() would have printed "Parent". However, Child has overridden the Parent's definition of print_name, and so now upon calling c.print_name(), the word "Child" is printed.
Chapter 103: Mixins
Syntax
• class ClassName(MainClass, Mixin1, Mixin2, ...): # Used to declare a class with the name
ClassName, main (first) class MainClass, and mixins Mixin1, Mixin2, etc.
• class ClassName(Mixin1, MainClass, Mixin2, ...): # The 'main' class doesn't have to be the
first class; there's really no difference between it and the mixin
Remarks
Adding a mixin to a class looks a lot like adding a superclass, because it pretty much is just that. An object of a class with the mixin Foo will also be an instance of Foo, and isinstance(instance, Foo) will return True.
Examples
Mixin
A Mixin is a set of properties and methods that can be used in different classes, which don't come
from a base class. In Object Oriented Programming languages, you typically use inheritance to
give objects of different classes the same functionality; if a set of objects have some ability, you
put that ability in a base class that both objects inherit from.
For instance, say you have the classes Car, Boat, and Plane. Objects from all of these
classes have the ability to travel, so they get the function travel. In this scenario, they
all travel the same basic way, too; by getting a route, and moving along it. To
implement this function, you could derive all of the classes from Vehicle, and put the
function in that shared class:
class Vehicle(object):
    """A generic vehicle class."""

    def travel(self, destination):
        # calculate_route and move_along are hypothetical helpers;
        # the original body was not preserved
        route = calculate_route(self.position, destination)
        self.move_along(route)

class Car(Vehicle):
    ...

class Boat(Vehicle):
    ...

class Plane(Vehicle):
    ...
With this code, you can call travel on a car (car.travel("Montana")), a boat (boat.travel("Hawaii")), and a plane (plane.travel("France")).
However, what if you have functionality that's not available to a base class? Say, for instance, you
want to give Car a radio and the ability to use it to play a song on a radio station, with
play_song_on_station, but you also have a Clock that can use a radio too. Car and Clock could share
a base class (Machine). However, not all machines can play songs; Boat and Plane can't (at least in
this example). So how do you accomplish this without duplicating code? You can use a mixin. In Python, giving a class a mixin is as simple as adding it to the list of base classes, like this:

class Foo(main_super, mixin):
    ...

Foo will inherit all of the properties and methods of main_super, but also those of mixin as well.
So, to give the classes Car and Clock the ability to use a radio, you could override Car from the last example and write this:

class RadioUserMixin(object):
    def __init__(self):
        self.radio = Radio()  # Radio is a hypothetical class; its definition was not preserved

    def play_song_on_station(self, station):
        self.radio.set_station(station)
        self.radio.play_song()

class Car(Vehicle, RadioUserMixin):
    ...

class Clock(Machine, RadioUserMixin):
    ...
The important thing with mixins is that they allow you to add functionality to many different objects that don't share a "main" superclass with this functionality, but still share the code for it nonetheless.
Without mixins, doing something like the above example would be much harder, and/or might
require some repetition.
Mixins are a sort of class that is used to "mix in" extra properties and methods into a class. This is
usually fine because many times the mixin classes don't override each other's, or the base class'
methods. But if you do override methods or properties in your mixins this can lead to unexpected
results because in Python the class hierarchy is defined right to left.
class Mixin1(object):
    def test(self):
        print "Mixin1"

class Mixin2(object):
    def test(self):
        print "Mixin2"
class BaseClass(object):
    def test(self):
        print "Base"

class MyClass(BaseClass, Mixin1, Mixin2):
    pass
In this case the Mixin2 class is the base class, extended by Mixin1 and finally by BaseClass. Thus,
if we execute the following code snippet:
>>> x = MyClass()
>>> x.test()
Base
We see the result returned is from the Base class. This can lead to unexpected errors in the logic
of your code and needs to be accounted for and kept in mind
Chapter 104: Multidimensional arrays
Examples
Lists in lists
lst=[[1,2,3],[4,5,6],[7,8,9]]
Here the outer list lst has three things in it; each of those things is another list. The first one is [1,2,3], the second one is [4,5,6] and the third one is [7,8,9]. You can access these lists the same way you would access any other element of a list, like this:
print (lst[0])
#output: [1, 2, 3]
print (lst[1])
#output: [4, 5, 6]
print (lst[2])
#output: [7, 8, 9]
You can then access the different elements in each of those lists the same way:
print (lst[0][0])
#output: 1
print (lst[0][1])
#output: 2
Here the first number inside the [] brackets means get the list in that position. In the above example we used the number 0 to mean get the list in the 0th position, which is [1,2,3]. The second set of [] brackets means get the item in that position from the inner list. In this case we used both 0 and 1: the 0th position in the list we got holds the number 1, and the 1st position holds 2.
You can also set values inside these lists the same way:
lst[0]=[10,11,12]
Now the list is [[10,11,12],[4,5,6],[7,8,9]]. In this example we changed the whole first list to be a
completely new list.
lst[1][2]=15
Now the list is [[10,11,12],[4,5,15],[7,8,9]]. In this example we changed a single element inside
of one of the inner lists. First we went into the list at position 1 and changed the element within it at
position 2, which was 6 now it's 15.
[[[111,112,113],[121,122,123],[131,132,133]],[[211,212,213],[221,222,223],[231,232,233]],[[311,312,313],[321,322,323],[331,332,333]]]
As is probably obvious, this gets a bit hard to read. Use backslashes to break up the different
dimensions:
[[[111,112,113],[121,122,123],[131,132,133]],\
[[211,212,213],[221,222,223],[231,232,233]],\
[[311,312,313],[321,322,323],[331,332,333]]]
By nesting the lists like this, you can extend to arbitrarily high dimensions. With myarray bound to a nested list like the 3D one above, elements are accessed the same way:

print(myarray)
print(myarray[1])
print(myarray[2][1])
print(myarray[1][0][2])
etc.
myarray[1] = new_2d_list        # replace an (n-1)-dimensional sub-list
myarray[2][1] = new_1d_list     # replace an (n-2)-dimensional sub-list
myarray[1][0][2] = new_value    # or a single number if you're dealing with 3D arrays
etc.
Chapter 105: Multiprocessing
Examples
Running Two Simple Processes
A simple example of using multiple processes would be two processes (workers) that are
executed separately. In the following example, two processes are started:
import multiprocessing
import time
from random import randint
def countUp():
    i = 0
    while i <= 3:
        print('Up:\t{}'.format(i))
        time.sleep(randint(1, 3))  # sleep 1, 2 or 3 seconds
        i += 1

def countDown():
    i = 3
    while i >= 0:
        print('Down:\t{}'.format(i))
        time.sleep(randint(1, 3))  # sleep 1, 2 or 3 seconds
        i -= 1
if __name__ == '__main__':
    # Initiate the workers.
    workerUp = multiprocessing.Process(target=countUp)
    workerDown = multiprocessing.Process(target=countDown)

    # Start the workers.
    workerUp.start()
    workerDown.start()

    # Join the workers. This will block in the main (parent) process
    # until the workers are complete.
    workerUp.join()
    workerDown.join()
Up: 0
Down: 3
Up: 1
Up: 2
Down: 2
Up: 3
Down: 1
Down: 0
Using Pool and Map
from multiprocessing import Pool

def cube(x):
    return x ** 3

if __name__ == "__main__":
    pool = Pool(5)
    result = pool.map(cube, [0, 1, 2, 3])
    # result is [0, 1, 8, 27]
Pool is a class which manages multiple Workers (processes) behind the scenes and lets you, the programmer, use it.

Pool(5) creates a new Pool with 5 processes, and pool.map works just like map but uses multiple processes (the amount defined when creating the pool).
Similar results can be achieved using map_async, apply and apply_async which can be found in the
documentation.
Chapter 106: Multithreading
Introduction
Threads allow Python programs to handle multiple functions at once as opposed to running a
sequence of commands individually. This topic explains the principles behind threading and
demonstrates its usage.
Examples
Basics of multithreading
Using the threading module, a new thread of execution may be started by creating a new
threading.Thread and assigning it a function to execute:
import threading
def foo():
    print "Hello threading!"
my_thread = threading.Thread(target=foo)
The target parameter references the function (or callable object) to be run. The thread will not
begin execution until start is called on the Thread object.
Starting a Thread

my_thread.start()  # foo now runs in a separate thread

Now that my_thread has run and terminated, calling start again will produce a RuntimeError. If you'd like to run your thread as a daemon, passing the daemon=True kwarg, or setting my_thread.daemon to True before calling start(), causes your Thread to run silently in the background as a daemon.
Joining a Thread
In cases where you split up one big job into several small ones and want to run them concurrently,
but need to wait for all of them to finish before continuing, Thread.join() is the method you're
looking for.
For example, let's say you want to download several pages of a website and compile them into a
single page. You'd do this:
import requests
from threading import Thread
from queue import Queue
q = Queue(maxsize=20)
def put_page_to_q(page_num):
    q.put(requests.get('http://some-website.com/page_%s.html' % page_num))
def compile(q):
    # magic function that needs all pages before being able to be executed
    if not q.full():
        raise ValueError
    else:
        print("Done compiling!")
threads = []
for page_num in range(20):
    t = Thread(target=put_page_to_q, args=(page_num,))
    t.start()
    threads.append(t)
# Next, join all threads to make sure all threads are done running before
# we continue. join() is a blocking call (unless you pass a timeout in
# seconds, in which case it stops waiting after that long)
for t in threads:
    t.join()
By subclassing threading.Thread, we can create a custom thread class. The run method must be
overridden in the subclass.
import time
from threading import Thread

class Sleepy(Thread):
    def run(self):
        time.sleep(5)
        print("Hello from Thread")

if __name__ == "__main__":
    t = Sleepy()
    t.start() # the start method automatically calls the run method
# print 'The main program continues to run in foreground.'
t.join()
print("The main program continues to run in the foreground.")
There are multiple threads in your code and you need to safely communicate between them.
from queue import Queue
from threading import Thread

# create a data producer
def producer(output_queue):
    while True:
        data = data_computation()   # data_computation() stands in for your own work
        output_queue.put(data)

# create a consumer
def consumer(input_queue):
    while True:
        # retrieve data (blocking)
        data = input_queue.get()
        # do something with the data ...
q = Queue()
t1 = Thread(target=consumer, args=(q,))
t2 = Thread(target=producer, args=(q,))
t1.start()
t2.start()
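The snippet above is a fragment that runs forever; a self-contained version that terminates, using a None sentinel in place of the open-ended data_computation loop, might look like this:

```python
from queue import Queue
from threading import Thread

def producer(output_queue, items):
    # put each item on the queue, then a sentinel to signal the end
    for item in items:
        output_queue.put(item)
    output_queue.put(None)

def consumer(input_queue, results):
    while True:
        data = input_queue.get()   # blocks until data is available
        if data is None:           # sentinel received: stop consuming
            break
        results.append(data * 2)   # stand-in for real processing

q = Queue()
results = []
t1 = Thread(target=consumer, args=(q, results))
t2 = Thread(target=producer, args=(q, [1, 2, 3]))
t1.start()
t2.start()
t1.join()
t2.join()
print(results)  # [2, 4, 6]
```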
Using concurrent.futures.ThreadPoolExecutor:
from socket import AF_INET, SOCK_STREAM, socket
from concurrent.futures import ThreadPoolExecutor
def echo_server(addr):
    print('Echo server running at', addr)
    pool = ThreadPoolExecutor(128)
    sock = socket(AF_INET, SOCK_STREAM)
    sock.bind(addr)
    sock.listen(5)
    while True:
        client_sock, client_addr = sock.accept()
        # echo_client (not shown here) reads from the socket and writes
        # the data back until the client disconnects
        pool.submit(echo_client, client_sock, client_addr)
echo_server(('',15000))
Python Cookbook, 3rd edition, by David Beazley and Brian K. Jones (O’Reilly). Copyright 2013
David Beazley and Brian Jones, 978-1-449-34037-7.
This section will contain some of the most advanced examples realized using Multithreading.
#!/usr/bin/env python2
import threading
import Queue
import time
import sys
import subprocess
from backports.shutil_get_terminal_size import get_terminal_size
printq = Queue.Queue()
interrupt = False
lines = []
def main():
ww = line.split()
i = 0
while len(new_line) <= (cols - len(ww[i]) - 1):
new_line += ww[i] + ' '
i += 1
print len(new_line)
if new_line == '':
return (line, '')
def printer():
    while True:
        cols, rows = get_terminal_size() # Get the terminal dimensions
        msg = '#' + '-' * (cols - 2) + '#\n' # Create the
        try:
            new_line = str(printq.get_nowait())
            if new_line != '!@#EXIT#@!': # A nice way to turn the printer
                                         # thread out gracefully
                lines.append(new_line)
                printq.task_done()
            else:
                printq.task_done()
                sys.exit()
        except Queue.Empty:
            pass

        # Build the new message to show and split too long lines
        for line in lines:
            res = line # The following is to split lines which are
                       # longer than cols.
            while len(res) != 0:
                toprint, res = split_line(res, cols)
                msg += '\n' + toprint
import threading
import time

class StoppableThread(threading.Thread):
    """Thread class with a stop() method. The thread itself has to check
    regularly for the stopped() condition."""

    def __init__(self):
        super(StoppableThread, self).__init__()
        self._stop_event = threading.Event()

    def stop(self):
        self._stop_event.set()

    def run(self):
        while not self._stop_event.is_set():
            print("Still running!")
            time.sleep(2)
        print("stopped!")
Chapter 107: Mutable vs Immutable (and
Hashable) in Python
Examples
Mutable vs Immutable
There are two kinds of types in Python: immutable types and mutable types.
Immutables
An object of an immutable type cannot be changed. Operations that appear to modify the object
actually create and return a new object.
This category includes: integers, floats, complex, strings, bytes, tuples, ranges and frozensets.
To highlight this property, let's play with the id builtin. This function returns the unique identifier of
the object passed as parameter. If the id is the same, this is the same object. If it changes, then
this is another object. (Some say that this is actually the memory address of the object, but beware
of them, they are from the dark side of the force...)
>>> a = 1
>>> id(a)
140128142243264
>>> a += 2
>>> a
3
>>> id(a)
140128142243328
Okay, 1 is not 3... Breaking news... Maybe not. However, this behaviour is often forgotten when it
comes to more complex types, especially strings.
>>> stack = "Stack"
>>> id(stack)
140128123911472
>>> stack += "Overflow"
>>> stack
'StackOverflow'
>>> id(stack)
140128123911536
No. While it seems we can change the string named by the variable stack, what we actually do is
create a new object to contain the result of the concatenation. We are fooled because, in the
process, the old object becomes unreferenced and is destroyed. In another situation, that would
have been more obvious:
In this case it is clear that if we want to retain the first string, we need a copy. But is that so
obvious for other types?
Exercise
Now, knowing how immutable types work, what would you say about the piece of code below? Is
it wise?
s = ""
for i in range(1, 1000):
s += str(i)
s += ","
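It is not wise: because each += creates a brand-new string, the loop does quadratic work. A common alternative, assuming the same output is wanted, is to collect the pieces in a list and join them once:

```python
# quadratic: every += builds a completely new string object
s = ""
for i in range(1, 1000):
    s += str(i)
    s += ","

# linear: accumulate the parts in a list, concatenate once at the end
parts = []
for i in range(1, 1000):
    parts.append(str(i))
    parts.append(",")
fast = "".join(parts)

print(fast == s)  # True
```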
Mutables
An object of a mutable type can be changed, and it is changed in place. No implicit copies are made.
>>> b = bytearray(b'Stack')
>>> b
bytearray(b'Stack')
>>> b = bytearray(b'Stack')
>>> id(b)
140128030688288
>>> b += b'Overflow'
>>> b
bytearray(b'StackOverflow')
>>> id(b)
140128030688288
(As a side note, I use bytes containing ascii data to make my point clear, but remember that bytes
are not designed to hold textual data. May the force pardon me.)
What do we have? We create a bytearray, modify it and using the id, we can ensure that this is
the same object, modified. Not a copy of it.
Of course, if an object is going to be modified often, a mutable type does a much better job than
an immutable type. Unfortunately, the reality of this property is often forgotten when it hurts the
most.
>>> c = b
>>> c += b' rocks!'
>>> c
bytearray(b'StackOverflow rocks!')
Okay...
>>> b
bytearray(b'StackOverflow rocks!')
Waiiit a second...
Exercise
Now that you better understand the side effects implied by a mutable type, can you explain what
is going wrong in this example?
One of the major use cases where a developer needs to take mutability into account is when
passing arguments to a function. This is very important, because it determines the ability of the
function to modify objects that don't belong to its scope, or in other words whether the function
has side effects. It is also important for understanding where the result of a function has to be
made available.
>>> def list_add3(lin):
...     lin.append(3)
...     return lin
...
>>> a = [1, 2, 3]
>>> b = list_add3(a)
>>> b
[1, 2, 3, 3]
>>> a
[1, 2, 3, 3]
Here, the mistake is to think that lin, as a parameter to the function, can be modified locally.
Instead, lin and a reference the same object. As this object is mutable, the modification is done in-
place, which means that the object referenced by both lin and a is modified. lin doesn't really
need to be returned, because we already have a reference to this object in the form of a. a and b
end referencing the same object.
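If the caller needs a to stay untouched, the usual fix is to pass a copy explicitly. Here is a sketch using the same (assumed) helper that appends 3 in place:

```python
def list_add3(lin):
    # modifies the list in place and returns the same object
    lin.append(3)
    return lin

a = [1, 2, 3]
b = list_add3(a[:])   # a[:] makes a shallow copy, so a is not modified
print(a)  # [1, 2, 3]
print(b)  # [1, 2, 3, 3]
```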
>>> def tuple_add3(tin):
...     tin += (3,)
...     return tin
...
>>> a = (1, 2, 3)
>>> b = tuple_add3(a)
>>> b
(1, 2, 3, 3)
>>> a
(1, 2, 3)
At the beginning of the function, tin and a reference the same object. But this is an immutable
object. So when the function tries to modify it, tin receives a new object with the modification,
while a keeps a reference to the original object. In this case, returning tin is mandatory, or the
new object would be lost.
Exercise
What do you think of this function? Does it have side effects? Is the return necessary? After the
call, what is the value of saying? Of focused? What happens if the function is called again with the
same parameters?
Chapter 108: Neo4j and Cypher using Py2Neo
Examples
Importing and Authenticating
You have to make sure your Neo4j Database exists at localhost:7474 with the appropriate
credentials.
The graph object is your interface to the Neo4j instance in the rest of your Python code. Rather
than making this a global variable, you should keep it in a class's __init__ method.
results = News.objects.todays_news()
for r in results:
    article = graph.merge_one("NewsArticle", "news_id", r)
    article.properties["title"] = results[r]['news_title']
    article.properties["timestamp"] = results[r]['news_timestamp']
    article.push()
[...]
timestamp should be an integer and not a date string, as Neo4j doesn't really have a date
datatype. This causes sorting issues when you store dates as '05-06-1989'.
article.push() is the call that actually commits the operation to Neo4j. Don't forget this step.
results = News.objects.todays_news()
for r in results:
    article = graph.merge_one("NewsArticle", "news_id", r)
    if 'LOCATION' in results[r].keys():
        for loc in results[r]['LOCATION']:
            loc = graph.merge_one("Location", "name", loc)
            try:
                rel = graph.create_unique(Relationship(article, "about_place", loc))
            except Exception, e:
                print e
create_unique is important for avoiding duplicates. But otherwise it's a pretty straightforward
operation. The relationship name is also important, as you would use it in advanced cases.
def get_autocomplete(text):
    query = """
    start n = node(*) where n.name =~ '(?i)%s.*' return n.name,labels(n) limit 10;
    """
    query = query % (text)
    obj = []
    for res in graph.cypher.execute(query):
        # print res[0], res[1]
        obj.append({'name': res[0], 'entity_type': res[1]})
    return obj
This is a sample cypher query to get all nodes with the property name that starts with the argument
text.
def search_news_by_entity(location, timestamp):
    query = """
    MATCH (n)-[]->(l)
    where l.name='%s' and n.timestamp='%s'
    RETURN n.news_id limit 10
    """
    query = query % (location, timestamp)  # fill in the placeholders
    news_ids = []
    for res in graph.cypher.execute(query):
        news_ids.append(str(res[0]))
    return news_ids
You can use this query to find all news articles (n) connected to a location (l) by a relationship.
MATCH (n)-[]->(l)
where l.name='Donald Trump'
RETURN n.date,count(*) order by n.date
Search for other People / Locations connected to the same news articles as Trump with at least 5
total relationship nodes.
MATCH (n:NewsArticle)-[]->(l)
where l.name='Donald Trump'
MATCH (n:NewsArticle)-[]->(m)
with m,count(n) as num where num>5
return labels(m)[0],(m.name), num order by num desc limit 10
Read Neo4j and Cypher using Py2Neo online: https://riptutorial.com/python/topic/5841/neo4j-and-
cypher-using-py2neo
Chapter 109: Non-official Python
implementations
Examples
IronPython
Open-source implementation for .NET and Mono written in C#, licensed under Apache License
2.0. It relies on the DLR (Dynamic Language Runtime). It supports only version 2.7; version 3 is
currently being developed.
Hello World
print "Hello World!"
import clr
from System import Console
Console.WriteLine("Hello World!")
External links
• Official website
• GitHub repository
Jython
Open-source implementation for the JVM written in Java, licensed under the Python Software
Foundation License. It supports only version 2.7; version 3 is currently being developed.
• Strings are Unicode.
• Does not support extensions for CPython written in C.
• Does not suffer from Global Interpreter Lock.
• Performance is usually lower, though it depends on tests.
Hello World
print "Hello World!"
External links
• Official website
• Mercurial repository
Transcrypt
Transcrypt is a tool to precompile a fairly extensive subset of Python into compact, readable
Javascript. It has the following characteristics:
• Allows for classical OO programming with multiple inheritance using pure Python syntax,
parsed by CPython’s native parser
• Seamless integration with the universe of high-quality web-oriented JavaScript libraries,
rather than the desktop-oriented Python ones
• Hierarchical URL based module system allowing module distribution via PyPi
• Simple relation between Python source and generated JavaScript code for easy debugging
• Multi-level sourcemaps and optional annotation of target code with source references
• Compact downloads, kB’s rather than MB’s
• Optimized JavaScript code, using memoization (call caching) to optionally bypass the
prototype lookup chain
• Operator overloading can be switched on and off locally to facilitate readable numerical math
Integration with HTML
<script src="__javascript__/hello.js"></script>
<h2>Hello demo</h2>
<p>
<div id = "greet">...</div>
<button onclick="hello.solarSystem.greet ()">Click me repeatedly!</button>
<p>
<div id = "explain">...</div>
<button onclick="hello.solarSystem.explain ()">And click me repeatedly too!</button>
from itertools import chain

class SolarSystem:
    planets = [list (chain (planet, (index + 1,))) for index, planet in enumerate ((
        ('Mercury', 'hot', 2240),
        ('Venus', 'sulphurous', 6052),
        ('Earth', 'fertile', 6378),
        ('Mars', 'reddish', 3397),
        ('Jupiter', 'stormy', 71492),
        ('Saturn', 'ringed', 60268),
        ('Uranus', 'cold', 25559),
        ('Neptune', 'very cold', 24766)
    ))]

    lines = (
        '{} is a {} planet',
        'The radius of {} is {} km',
        '{} is planet nr. {} counting from the sun'
    )
Transcrypt can be used in combination with any JavaScript library without special measures or
syntax. In the documentation, examples are given for, among others, react.js, riot.js, fabric.js and
node.js.
class A:
    def __init__ (self, x):
        self.x = x

    def show (self, label):
        print ('A.show', label, self.x)

class B:
    def __init__ (self, y):
        alert ('In B constructor')
        self.y = y

    def show (self, label):
        print ('B.show', label, self.y)

class C (A, B):
    def __init__ (self, x, y):
        alert ('In C constructor')
        A.__init__ (self, x)
        B.__init__ (self, y)
        self.show ('constructor')

    def show (self, label):
        B.show (self, label)
        print ('C.show', label, self.x, self.y)

a = A (1001)
a.show ('america')
b = B (2002)
b.show ('russia')
c = C (3003, 4004)
c.show ('netherlands')
show2 = c.show
show2 ('copy')
JavaScript
});
var B = __class__ ('B', [object], {
get __init__ () {return __get__ (this, function (self, y) {
alert ('In B constructor');
self.y = y;
});},
get show () {return __get__ (this, function (self, label) {
print ('B.show', label, self.y);
});}
});
var C = __class__ ('C', [A, B], {
get __init__ () {return __get__ (this, function (self, x, y) {
alert ('In C constructor');
A.__init__ (self, x);
B.__init__ (self, y);
self.show ('constructor');
});},
get show () {return __get__ (this, function (self, label) {
B.show (self, label);
print ('C.show', label, self.x, self.y);
});}
});
var a = A (1001);
a.show ('america');
var b = B (2002);
b.show ('russia');
var c = C (3003, 4004);
c.show ('netherlands');
var show2 = c.show;
show2 ('copy');
External links
• Official website: http://www.transcrypt.org/
• Repository: https://github.com/JdeH/Transcrypt
Chapter 110: Operator module
Examples
Operators as alternative to an infix operator
For every infix operator, e.g. +, there is an operator function (operator.add for +):
1 + 1
# Output: 2
from operator import add
add(1, 1)
# Output: 2
Even though the main documentation states that for the arithmetic operators only numerical input
is allowed, other types work as well, for example string concatenation:

from operator import add
add('Hello', ' World')
# Output: 'Hello World'
See also: mapping from operation to operator function in the official Python documentation.
Methodcaller
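operator.methodcaller builds a callable that invokes a named method on its operand; a brief sketch:

```python
from operator import methodcaller

words = ['  lorem ', ' ipsum  ']
# methodcaller('strip') calls .strip() on each element
stripped = list(map(methodcaller('strip'), words))
print(stripped)  # ['lorem', 'ipsum']

# extra arguments are forwarded to the method call
split_on_i = methodcaller('split', 'i')
print(split_on_i('riptutorial'))  # ['r', 'ptutor', 'al']
```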
Itemgetter
Or sorting a list of tuples by the second element first and the first element as secondary:
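A small sketch of that sort, using operator.itemgetter with two indices as the key:

```python
from operator import itemgetter

pairs = [(2, 'b'), (1, 'b'), (2, 'a'), (1, 'a')]
# sort by the second element first, the first element as secondary
ordered = sorted(pairs, key=itemgetter(1, 0))
print(ordered)  # [(1, 'a'), (2, 'a'), (1, 'b'), (2, 'b')]
```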
Chapter 111: Operator Precedence
Introduction
Python operators have a set order of precedence, which determines what operators are
evaluated first in a potentially ambiguous expression. For instance, in the expression 3 * 2 + 7, first
3 is multiplied by 2, and then the result is added to 7, yielding 13. The expression is not evaluated
the other way around, because * has a higher precedence than +.
Below is a list of operators by precedence, and a brief description of what they (usually) do.
Remarks
From the Python documentation:
The following table summarizes the operator precedences in Python, from lowest
precedence (least binding) to highest precedence (most binding). Operators in the
same box have the same precedence. Unless the syntax is explicitly given, operators
are binary. Operators in the same box group left to right (except for comparisons,
including tests, which all have the same precedence and chain from left to right and
exponentiation, which groups from right to left).
Operator Description
or Boolean OR
| Bitwise OR
^ Bitwise XOR
** Exponentiation
Examples
Simple Operator Precedence Examples in python.
Python follows PEMDAS rule. PEMDAS stands for Parentheses, Exponents, Multiplication and
Division, and Addition and Subtraction.
Example:
>>> a, b, c, d = 2, 3, 5, 7
>>> a ** (b + c) # parentheses
256
>>> a * b ** c # exponent: same as `a * (b ** c)`
7776
>>> a + b * c / d # multiplication / division: same as `a + (b * c / d)`
4.142857142857142
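Two more cases worth noting from the remarks above: exponentiation groups right to left, and comparison operators chain.

```python
# exponentiation is right-associative
print(2 ** 3 ** 2)    # 512, evaluated as 2 ** (3 ** 2)
print((2 ** 3) ** 2)  # 64

# comparisons chain from left to right
print(1 < 2 < 3)      # True, same as (1 < 2) and (2 < 3)
print(1 < 2 > 0)      # True
```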
Chapter 112: Optical Character Recognition
Introduction
Optical Character Recognition is converting images of text into actual text. In these examples find
ways of using OCR in python.
Examples
PyTesseract
try:
import Image
except ImportError:
from PIL import Image
import pytesseract
#Basic OCR
print(pytesseract.image_to_string(Image.open('test.png')))
#In French
print(pytesseract.image_to_string(Image.open('test-european.jpg'), lang='fra'))
PyOCR
To initialize:
import pyocr
import pyocr.builders
tools = pyocr.get_available_tools()
# The tools are returned in the recommended order of usage
tool = tools[0]
langs = tool.get_available_languages()
lang = langs[0]
# Note that languages are NOT sorted in any way. Please refer
# to the system locale settings for the default language
# to use.
txt = tool.image_to_string(
Image.open('test.png'),
lang=lang,
builder=pyocr.builders.TextBuilder()
)
# txt is a Python string
word_boxes = tool.image_to_string(
Image.open('test.png'),
lang="eng",
builder=pyocr.builders.WordBoxBuilder()
)
# list of box objects. For each box object:
# box.content is the word in the box
# box.position is its position on the page (in pixels)
#
# Beware that some OCR tools (Tesseract for instance)
# may return empty boxes
line_and_word_boxes = tool.image_to_string(
Image.open('test.png'), lang="fra",
builder=pyocr.builders.LineBoxBuilder()
)
# list of line objects. For each line object:
# line.word_boxes is a list of word boxes (the individual words in the line)
# line.content is the whole text of the line
# line.position is the position of the whole line on the page (in pixels)
#
# Beware that some OCR tools (Tesseract for instance)
# may return empty boxes
Chapter 113: os.path
Introduction
This module implements some useful functions on pathnames. The path parameters can be
passed as either strings, or bytes. Applications are encouraged to represent file names as
(Unicode) character strings.
Syntax
• os.path.join(a, *p)
• os.path.basename(p)
• os.path.dirname(p)
• os.path.split(p)
• os.path.splitext(p)
Examples
Join Paths
To join two or more path components together, firstly import os module of python and then use
following:
import os
os.path.join('a', 'b', 'c')
The advantage of using os.path is that it allows code to remain compatible over all operating
systems, as this uses the separator appropriate for the platform it's running on.
In a Unix OS:
'a/b/c'
In a Windows OS:
'a\\b\\c'
Use os.path.abspath:
>>> os.getcwd()
'/Users/csaftoiu/tmp'
>>> os.path.abspath('foo')
'/Users/csaftoiu/tmp/foo'
>>> os.path.abspath('../foo')
'/Users/csaftoiu/foo'
>>> os.path.abspath('/foo')
'/foo'
os.path.abspath(os.path.join(PATH_TO_GET_THE_PARENT, os.pardir))
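The remaining functions from the syntax list above can be sketched with an illustrative path:

```python
import os.path

p = '/home/john/temp/data.csv'
print(os.path.basename(p))  # 'data.csv'
print(os.path.dirname(p))   # '/home/john/temp'
print(os.path.split(p))     # ('/home/john/temp', 'data.csv')
print(os.path.splitext(p))  # ('/home/john/temp/data', '.csv')
```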
path = '/home/john/temp'
os.path.exists(path)
#this returns false if path doesn't exist or if the path is a broken symbolic link
Check if the given path is a directory, file, symbolic link, mount point, etc.
dirname = '/home/john/python'
os.path.isdir(dirname)
symlink = '/home/john/link'  # path to a symbolic link
os.path.islink(symlink)
mount_path = '/home'
os.path.ismount(mount_path)
Chapter 114: Overloading
Examples
Magic/Dunder Methods
Magic (also called dunder as an abbreviation for double-underscore) methods in Python serve a
similar purpose to operator overloading in other languages. They allow a class to define its
behavior when it is used as an operand in unary or binary operator expressions. They also serve
as implementations called by some built-in functions.
import math

class Vector(object):
    # instantiation
    def __init__(self, x, y):
        self.x = x
        self.y = y

    # addition (v + u)
    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)

    # subtraction (v - u)
    def __sub__(self, other):
        return self + (-other)

    # negation (-v), needed by __sub__ above
    def __neg__(self):
        return Vector(-self.x, -self.y)

    # equality (v == u)
    def __eq__(self, other):
        return self.x == other.x and self.y == other.y

    # abs(v)
    def __abs__(self):
        return math.hypot(self.x, self.y)

    # str(v)
    def __str__(self):
        return '<{0.x}, {0.y}>'.format(self)

    # repr(v)
    def __repr__(self):
        return 'Vector({0.x}, {0.y})'.format(self)
Now it is possible to naturally use instances of the Vector class in various expressions.
v = Vector(1, 4)
u = Vector(2, 0)
u + v # Vector(3, 4)
print(u + v) # "<3, 4>" (implicit string conversion)
u - v # Vector(1, -4)
u == v # False
u + v == v + u # True
abs(u + v) # 5.0
It is possible to emulate container types, which support accessing values by key or index.
Consider this naive implementation of a sparse list, which stores only its non-zero elements to
conserve memory.
class sparselist(object):
    def __init__(self, size):
        self.size = size
        self.data = {}

    # l[index]
    def __getitem__(self, index):
        if index < 0:
            index += self.size
        if index >= self.size:
            raise IndexError(index)
        try:
            return self.data[index]
        except KeyError:
            return 0.0

    # l[index] = value
    def __setitem__(self, index, value):
        self.data[index] = value

    # del l[index]
    def __delitem__(self, index):
        if index in self.data:
            del self.data[index]

    # value in l
    def __contains__(self, value):
        return value == 0.0 or value in self.data.values()

    # len(l)
    def __len__(self):
        return self.size
l = sparselist(10 ** 6)  # a million entries, but only non-zero ones are stored
l[12345] = 10

10 in l # True
l[12345] # 10

for v in l:
    pass # 0, 0, 0, ... 10, 0, 0 ... 0
Callable types
class adder(object):
    def __init__(self, first):
        self.first = first

    # a(...)
    def __call__(self, second):
        return self.first + second

add2 = adder(2)
add2(1) # 3
add2(2) # 4
add2(2) # 4
If your class doesn't implement a specific overloaded operator for the argument types provided, it
should return NotImplemented (note that this is a special constant, not the same as
NotImplementedError). This will allow Python to fall back to trying other methods to make the
operation work:
When NotImplemented is returned, the interpreter will then try the reflected operation on
the other type, or some other fallback, depending on the operator. If all attempted
operations return NotImplemented, the interpreter will raise an appropriate exception.
class NotAddable(object):
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return NotImplemented

class Addable(NotAddable):
    def __add__(self, other):
        return Addable(self.value + other.value)

    __radd__ = __add__
As this is the reflected method we have to implement __add__ and __radd__ to get the expected
behaviour in all cases; fortunately, as they are both doing the same thing in this simple example,
we can take a shortcut.
In use:
>>> x = NotAddable(1)
>>> y = Addable(2)
>>> x + x
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'NotAddable' and 'NotAddable'
>>> y + y
<so.Addable object at 0x1095974d0>
>>> z = x + y
>>> z
<so.Addable object at 0x109597510>
>>> z.value
3
Operator overloading
Below are the operators that can be overloaded in classes, along with the method definitions that
are required, and an example of the operator in use within an expression.
N.B. The use of other as a variable name is not mandatory, but is considered the norm.
Operator Method Expression
The optional parameter modulo for __pow__ is only used by the pow built-in function.
Each of the methods corresponding to a binary operator has a corresponding "right" method which
start with __r, for example __radd__:
class A:
    def __init__(self, a):
        self.a = a

    def __add__(self, other):
        return self.a + other

    def __radd__(self, other):
        print("radd")
        return other + self.a

A(1) + 2 # Out: 3
2 + A(1) # prints radd. Out: 3
class B:
    def __init__(self, b):
        self.b = b

    def __iadd__(self, other):
        self.b += other
        print("iadd")
        return self
b = B(2)
b.b # Out: 2
b += 1 # prints iadd
b.b # Out: 3
Since there's nothing special about these methods, many other parts of the language, parts of the
standard library, and even third-party modules add magic methods on their own, like methods to
cast an object to a type or checking properties of the object. For example, the builtin str() function
calls the object's __str__ method, if it exists. Some of these uses are listed below.
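For instance, a brief sketch of built-ins dispatching to the corresponding magic methods:

```python
class Greeting(object):
    def __str__(self):
        # called by str() and print()
        return "a friendly greeting"

    def __len__(self):
        # called by len()
        return 42

g = Greeting()
print(str(g))  # a friendly greeting
print(len(g))  # 42
```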
There are also the special methods __enter__ and __exit__ for context managers, and many more.
Chapter 115: Pandas Transform: Preform
operations on groups and concatenate the
results
Examples
Simple transform
orders_df = pd.DataFrame()
orders_df['customer_id'] = [1,1,1,1,1,2,2,3,3,3,3,3]
orders_df['order_id'] = [1,1,1,2,2,3,3,4,5,6,6,6]
orders_df['item'] = ['apples', 'chocolate', 'chocolate', 'coffee', 'coffee', 'apples',
'bananas', 'coffee', 'milkshake', 'chocolate', 'strawberry',
'strawberry']
# First, define the logic that will be applied per group:
# count the number of unique orders for each customer
count_number_of_orders = lambda x: len(x.unique())

# And now, we can transform each group using the logic defined above
orders_df['number_of_orders_per_client'] = (  # put the results into a new column
                                              # called 'number_of_orders_per_client'
    orders_df                                 # take the original dataframe
    .groupby(['customer_id'])['order_id']     # create a separate group for each
                                              # customer_id & select the order_id
    .transform(count_number_of_orders))       # apply the function to each group
                                              # separately
# Let's try to see if the items were ordered more than once in each order
Read Pandas Transform: Preform operations on groups and concatenate the results online:
https://riptutorial.com/python/topic/10947/pandas-transform--preform-operations-on-groups-and-
concatenate-the-results
Chapter 116: Parallel computation
Remarks
Due to the GIL (Global interpreter lock) only one instance of the python interpreter executes in a
single process. So in general, using multi-threading only improves IO bound computations, not
CPU-bound ones. The multiprocessing module is recommended if you wish to parallelise CPU-
bound tasks.
GIL applies to CPython, the most popular implementation of Python, as well as PyPy. Other
implementations such as Jython and IronPython have no GIL.
Examples
Using the multiprocessing module to parallelise tasks
import multiprocessing

def fib(n):
    """computing the Fibonacci in an inefficient way
    was chosen to slow down the CPU."""
    if n <= 2:
        return 1
    else:
        return fib(n-1) + fib(n-2)

p = multiprocessing.Pool()
print(p.map(fib, [38, 37, 36, 35, 34, 33]))
As the execution of each call to fib happens in parallel, the time of execution of the full example is
1.8× faster than if done in a sequential way on a dual processor.
Python 2.2+
child.py
import time

def main():
    print "starting work"
    time.sleep(1)
    print "work work work work work"
    time.sleep(1)
    print "done working"

if __name__ == '__main__':
    main()
parent.py
import os

def main():
    for i in range(5):
        os.system("python child.py &")

if __name__ == '__main__':
    main()
This is useful for parallel, independent HTTP request/response tasks or Database select/inserts.
Command line arguments can be given to the child.py script as well. Synchronization between
scripts can be achieved by all scripts regularly checking a separate server (like a Redis instance).
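A variant of parent.py that uses subprocess.Popen instead of os.system, so the parent can launch the children without blocking and still wait for all of them; the child command here is a stand-in for invoking child.py:

```python
import subprocess
import sys

def run_children(n):
    # start n children without waiting for any of them
    children = [
        subprocess.Popen([sys.executable, '-c', 'print("child done")'])
        for _ in range(n)
    ]
    # wait() blocks until the child exits and returns its exit code
    return [child.wait() for child in children]

if __name__ == '__main__':
    print(run_children(5))  # [0, 0, 0, 0, 0]
```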
The idea here is to move the computationally intensive jobs to C (using special macros),
independent of Python, and have the C code release the GIL while it's working.
#include "Python.h"
...
PyObject *pyfunc(PyObject *self, PyObject *args) {
...
Py_BEGIN_ALLOW_THREADS
// Threaded C code
...
Py_END_ALLOW_THREADS
...
}
PyPar is a library that uses the message passing interface (MPI) to provide parallelism in Python.
A simple example in PyPar (as seen at https://github.com/daleroberts/pypar) looks like this:
import pypar as pp
ncpus = pp.size()
rank = pp.rank()
node = pp.get_processor_name()
if rank == 0:
    msg = 'P0'
    pp.send(msg, destination=1)
    msg = pp.receive(source=rank-1)
    print 'Processor 0 received message "%s" from rank %d' % (msg, rank-1)
else:
    source = rank-1
    destination = (rank+1) % ncpus
    msg = pp.receive(source)
    msg = msg + 'P' + str(rank)
    pp.send(msg, destination)
pp.finalize()
Chapter 117: Parsing Command Line
arguments
Introduction
Most command line tools rely on arguments passed to the program upon its execution. Instead of
prompting for input, these programs expect data or specific flags (which become booleans) to be
set. This allows both the user and other programs to run the Python file passing it data as it starts.
This section explains and demonstrates the implementation and usage of command line
arguments in Python.
Examples
Hello world in argparse
The following program says hello to the user. It takes one positional argument, the name of the
user, and can also be told the greeting.
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('name',
help='name of user'
)
parser.add_argument('-g', '--greeting',
default='Hello',
help='optional alternate greeting'
)
args = parser.parse_args()
print("{greeting}, {name}!".format(
greeting=args.greeting,
name=args.name)
)
$ python hello.py --help
usage: hello.py [-h] [-g GREETING] name

positional arguments:
name name of user
optional arguments:
-h, --help show this help message and exit
-g GREETING, --greeting GREETING
optional alternate greeting
$ python hello.py world
Hello, world!
$ python hello.py John -g Howdy
Howdy, John!
docopt turns command-line argument parsing on its head. Instead of parsing the arguments, you
just write the usage string for your program, and docopt parses the usage string and uses it to
extract the command line arguments.
"""
Usage:
script_name.py [-a] [-b] <path>
Options:
-a Print all the things.
-b Get more bees into the path.
"""
from docopt import docopt
if __name__ == "__main__":
    args = docopt(__doc__)
    import pprint; pprint.pprint(args)
Sample runs:
$ python script_name.py
Usage:
script_name.py [-a] [-b] <path>
$ python script_name.py something
{'-a': False,
'-b': False,
'<path>': 'something'}
$ python script_name.py something -a
{'-a': True,
'-b': False,
'<path>': 'something'}
$ python script_name.py -b something -a
{'-a': True,
'-b': True,
'<path>': 'something'}
If you want two or more arguments to be mutually exclusive, you can use the function
argparse.ArgumentParser.add_mutually_exclusive_group(). In the example below, either foo or bar
can exist but not both at the same time.
import argparse
parser = argparse.ArgumentParser()
group = parser.add_mutually_exclusive_group()
group.add_argument("-f", "--foo")
group.add_argument("-b", "--bar")
args = parser.parse_args()
print "foo = ", args.foo
print "bar = ", args.bar
If you try to run the script specifying both --foo and --bar arguments, the script will complain with
the below message.
Whenever a Python script is invoked from the command line, the user may supply additional
command line arguments which will be passed on to the script. These arguments will be
available to the programmer from the system variable sys.argv ("argv" is a traditional name used in
most programming languages, and it means "argument vector").
By convention, the first element in the sys.argv list is the name of the Python script itself, while the
rest of the elements are the tokens passed by the user when invoking the script.
# cli.py
import sys
print(sys.argv)
$ python cli.py
=> ['cli.py']
Here's another example of how to use argv. We first strip off the initial element of sys.argv
because it contains the script's name. Then we combine the rest of the arguments into a single
sentence, and finally print that sentence prepending the name of the currently logged-in user (so
that it emulates a chat program).
import getpass
import sys
words = sys.argv[1:]
sentence = " ".join(words)
print("[%s] %s" % (getpass.getuser(), sentence))
The algorithm commonly used when "manually" parsing a number of non-positional arguments is
to iterate over the sys.argv list. One way is to go over the list and pop each element of it:
import sys

# work on a copy so sys.argv itself is left untouched
argv = sys.argv[1:]

# stop iterating when there are no more args to pop()
while len(argv) > 0:
    arg = argv.pop(0)
    if arg in ('-f', '--foo'):
        print('seen foo!')
    elif arg in ('-b', '--bar'):
        print('seen bar!')
    elif arg in ('-a', '--with-arg'):
        # the option's value is the next token on the command line
        arg = argv.pop(0)
        print('seen value: {}'.format(arg))
You can create parser error messages tailored to your script's needs through the
argparse.ArgumentParser.error function. The example below shows the script printing a usage and
an error message to stderr when --foo is given but not --bar.
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("-f", "--foo")
parser.add_argument("-b", "--bar")
args = parser.parse_args()
if args.foo and args.bar is None:
parser.error("--foo requires --bar. You did not specify bar.")
Assuming your script name is sample.py, and we run python sample.py --foo ds_in_fridge, the script
will exit with the usage string and the error message:

usage: sample.py [-h] [-f FOO] [-b BAR]
sample.py: error: --foo requires --bar. You did not specify bar.
When you create an argparse ArgumentParser() and run your program with '-h' you get an
automated usage message explaining what arguments you can run your software with. By default,
positional arguments and optional arguments are separated into two categories; for example,
here is a small script (example.py) and the output when you run python example.py -h.
import argparse

parser = argparse.ArgumentParser(description="Simple example")
parser.add_argument("name", help="Who to greet")
parser.add_argument("--bar_this")
parser.add_argument("--bar_that")
parser.add_argument("--foo_this")
parser.add_argument("--foo_that")
args = parser.parse_args()
Simple example
positional arguments:
name Who to greet
optional arguments:
-h, --help show this help message and exit
--bar_this BAR_THIS
--bar_that BAR_THAT
--foo_this FOO_THIS
--foo_that FOO_THAT
There are some situations where you want to separate your arguments into further conceptual
sections to assist your user. For example, you may wish to have all the input options in one group,
and all the output formatting options in another. The above example can be adjusted to separate
the --foo_* args from the --bar_* args like so.
import argparse

parser = argparse.ArgumentParser(description="Simple example")
parser.add_argument("name", help="Who to greet")

# group the Foo options together in the help output
foo_group = parser.add_argument_group(title="Foo options")
foo_group.add_argument("--foo_this")
foo_group.add_argument("--foo_that")

# group the Bar options together in the help output
bar_group = parser.add_argument_group(title="Bar options")
bar_group.add_argument("--bar_this")
bar_group.add_argument("--bar_that")

args = parser.parse_args()
Simple example
positional arguments:
name Who to greet
optional arguments:
-h, --help show this help message and exit
Foo options:
  --foo_this FOO_THIS
  --foo_that FOO_THAT

Bar options:
  --bar_this BAR_THIS
  --bar_that BAR_THAT
As with docopt, with [docopt_dispatch] you craft your --help in the __doc__ variable of your
entry-point module. There, you call dispatch with the doc string as its argument, so it can run the
parser over it.
That being done, instead of handling the arguments manually (which usually ends up in a highly
cyclomatic if/else structure), you leave it to dispatch, giving only how you want to handle the set of
arguments.
This is what the dispatch.on decorator is for: you give it the argument or sequence of arguments
that should trigger the function, and that function will be executed with the matching values as
parameters.
"""
from docopt_dispatch import dispatch
@dispatch.on('--development')
def development(host, port, **kwargs):
print('in *development* mode')
@dispatch.on('--production')
def production(host, port, **kwargs):
    print('in *production* mode')
@dispatch.on('items', 'add')
def items_add(item, **kwargs):
print('adding item...')
@dispatch.on('items', 'delete')
def items_delete(item, **kwargs):
print('deleting item...')
if __name__ == '__main__':
dispatch(__doc__)
Chapter 118: Partial functions
Introduction
As you probably know if you came from an OOP school, specializing an abstract class and using it is
a practice you should keep in mind when writing your code.

What if you could define an abstract function and specialize it in order to create different versions
of it? Think of it as a sort of function inheritance, where you bind specific params to make them
suitable for a specific scenario.
Syntax
• partial(function, **params_you_want_to_fix)
Parameters
Param    details
x        the base
y        the exponent
Remarks
As stated in Python doc the functools.partial:
Return a new partial object which when called will behave like func called with the
positional arguments args and keyword arguments keywords. If more arguments are
supplied to the call, they are appended to args. If additional keyword arguments are
supplied, they extend and override keywords.
Examples
Raise the power
def raise_power(x, y):
return x**y
Let's suppose y can be one of [3, 4, 5] and let's say you don't want to offer the end user the
possibility to use such a function freely, since it is very computationally intensive. In fact, you
would check that the provided y assumes a valid value, and rewrite your function as:
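A sketch of what that validating rewrite might look like (the exact error handling is an assumption):

```python
def raise_power(x, y):
    # only a handful of exponents are considered valid
    if y in (3, 4, 5):
        return x ** y
    raise ValueError('y should be one of 3, 4 or 5; got {}'.format(y))
```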
Messy? Let's use the abstract form and specialize it to all three cases: let's implement them
partially.
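A minimal sketch of that partial specialization using functools.partial (the helper names raise_to_3/4/5 are illustrative):

```python
from functools import partial

def raise_power(x, y):
    return x ** y

# fix the y parameter, producing three specialized functions
raise_to_3 = partial(raise_power, y=3)
raise_to_4 = partial(raise_power, y=4)
raise_to_5 = partial(raise_power, y=5)

print(raise_to_3(2))  # 8
print(raise_to_4(2))  # 16
print(raise_to_5(2))  # 32
```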
What happens here? We fixed the y params and we defined three different functions.
There's no need to use the abstract function defined above (you could make it private), but you can
use partially applied functions to deal with raising a number to a fixed value.
Chapter 119: Performance optimization
Remarks
When attempting to improve the performance of a Python script, first and foremost you should be
able to find the bottleneck of your script and note that no optimization can compensate for a poor
choice in data structures or a flaw in your algorithm design. Identifying performance bottlenecks
can be done by profiling your script. Secondly do not try to optimize too early in your coding
process at the expense of readability/design/quality. Donald Knuth made the following statement
on optimization:
“We should forget about small efficiencies, say about 97% of the time: premature
optimization is the root of all evil. Yet we should not pass up our opportunities in that
critical 3%.”
Examples
Code profiling
First and foremost you should be able to find the bottleneck of your script and note that no
optimization can compensate for a poor choice in data structure or a flaw in your algorithm design.
Secondly do not try to optimize too early in your coding process at the expense of
readability/design/quality. Donald Knuth made the following statement on optimization:
"We should forget about small efficiencies, say about 97% of the time: premature
optimization is the root of all evil. Yet we should not pass up our opportunities in that
critical 3%"
To profile your code you have several tools: cProfile (or the slower profile) from the standard
library, line_profiler and timeit. Each of them serve a different purpose.
cProfile is a deterministic profiler: function call, function return, and exception events are monitored,
and precise timings are made for the intervals between these events (up to 0.001s). The library
documentation (https://docs.python.org/2/library/profile.html) provides us with a simple use
case:
import cProfile
def f(x):
return "42!"
cProfile.run('f(12)')
import cProfile, pstats, io

pr = cProfile.Profile()
pr.enable()
# ... do something ...
# ... long ...
pr.disable()

s = io.StringIO()
sortby = 'cumulative'
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
ps.print_stats()
print(s.getvalue())
This will create outputs looking like the table below, where you can quickly see where your
program spends most of its time and identify the functions to optimize.
$ kernprof -l script_to_profile.py
kernprof will create an instance of LineProfiler and insert it into the __builtins__ namespace with
the name profile. It has been written to be used as a decorator, so in your script, you decorate the
functions you want to profile with @profile.
@profile
def slow_function(a, b, c):
...
The default behavior of kernprof is to put the results into a binary file script_to_profile.py.lprof.
You can tell kernprof to immediately view the formatted results at the terminal with the [-v/--view]
option. Otherwise, you can view the results later like so:

$ python -m line_profiler script_to_profile.py.lprof
Finally, timeit provides a simple way to time one-liners or small expressions, both from the command
line and the Python shell. This module will answer questions such as: is it faster to do a list
comprehension or to use the built-in list() when transforming a set into a list? Look for the setup
keyword or -s option to add setup code.
from a terminal
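A minimal sketch of both usages, using the set-to-list comparison mentioned above (timings will vary by machine):

```python
import timeit

# from the Python shell / a script: time two ways of turning a set into a list
t_comprehension = timeit.timeit('[x for x in s]',
                                setup='s = set(range(100))', number=10000)
t_builtin = timeit.timeit('list(s)',
                          setup='s = set(range(100))', number=10000)
print(t_comprehension, t_builtin)

# the equivalent from a terminal would be something like:
#   python -m timeit -s "s = set(range(100))" "[x for x in s]"
#   python -m timeit -s "s = set(range(100))" "list(s)"
```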
Chapter 120: Pickle data serialisation
Syntax
• pickle.dump(object,file,protocol) #To serialize an object
Parameters
Parameter    Details
object       The object which is to be stored
file         The open file (or file-like object) which will contain the object
protocol     The protocol used for pickling the object (optional parameter)
Remarks
Pickleable types
The following objects are picklable:

• None, True, and False
• numbers (int, float, complex)
• strings and bytes objects
• tuples, lists, sets, and dictionaries containing only picklable objects
• functions and classes defined at the top level of a module
• instances of classes whose __dict__ is picklable
Examples
Using Pickle to serialize and deserialize an object
The pickle module implements an algorithm for turning an arbitrary Python object into a series of
bytes. This process is also called serializing the object. The byte stream representing the object
can then be transmitted or stored, and later reconstructed to create a new object with the same
characteristics.
For the simplest code, we use the dump() and load() functions.
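A minimal sketch of the file-based round trip with dump() and load() (the filename data.pkl is arbitrary):

```python
import pickle

data = {'a': [1, 2.0, 3], 'b': ('character string', b'byte string')}

# serialize the object to a file
with open('data.pkl', 'wb') as f:
    pickle.dump(data, f)

# read it back later
with open('data.pkl', 'rb') as f:
    restored = pickle.load(f)

assert restored == data
```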
deserialized_data = pickle.loads(serialized_data)
# deserialized_data == data
Some data cannot be pickled. Other data should not be pickled for other reasons.
What will be pickled can be defined in the __getstate__ method. This method must return something
that is picklable.

On the opposite side is __setstate__: it will receive what __getstate__ created and has to initialize
the object.
class A(object):
def __init__(self, important_data):
self.important_data = important_data
def __getstate__(self):
return [self.important_data] # only this is needed
The implementation here pickles a list with one value: [self.important_data]. That was just an
example; __getstate__ could have returned anything that is picklable, as long as __setstate__
knows how to do the opposite. A good alternative is a dictionary of all values: {'important_data':
self.important_data}.
Constructor is not called! Note that in the previous example the instance a2 was created in
pickle.loads without ever calling A.__init__, so A.__setstate__ has to initialize everything that
__init__ would have initialized if it were called.
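A minimal, complete round trip; the __setstate__ shown here is an assumed mirror of the dictionary variant of __getstate__ mentioned above:

```python
import pickle

class A(object):
    def __init__(self, important_data):
        self.important_data = important_data

    def __getstate__(self):
        # only this value survives pickling
        return {'important_data': self.important_data}

    def __setstate__(self, state):
        # called instead of __init__ when unpickling
        self.important_data = state['important_data']

a = A('spam')
a2 = pickle.loads(pickle.dumps(a))
print(a2.important_data)  # spam
```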
Chapter 121: Pillow
Examples
Read Image File
from PIL import Image

im = Image.open("Image.bmp")
Chapter 122: pip: PyPI Package Manager
Introduction
pip is the most widely-used package manager for the Python Package Index, installed by default
with recent versions of Python.
Syntax
• pip <command> [options] where <command> is one of:
    ○ install: Install packages
    ○ uninstall: Uninstall packages
    ○ freeze: Output installed packages in requirements format
    ○ list: List installed packages
    ○ show: Show information about installed packages
    ○ search: Search PyPI for packages
    ○ wheel: Build wheels from your requirements
    ○ zip: Zip individual packages (deprecated)
    ○ unzip: Unzip individual packages (deprecated)
    ○ bundle: Create pybundles (deprecated)
    ○ help: Show help for commands
Remarks
Sometimes, pip will perform a manual compilation of native code. On Linux, python will
automatically choose an available C compiler on your system. Refer to the table below for the
required Visual Studio/Visual C++ version on Windows (newer versions will not work).
Python Version    Visual Studio Version    Visual C++ Version
2.7               2008                     9.0
3.3, 3.4          2010                     10.0
3.5+              2015                     14.0
Source: wiki.python.org
Examples
Install Packages
If a command shows a permission-denied error on Linux/Unix, run it again with sudo.

Each line of a requirements file indicates something to be installed, just like arguments to pip
install. Details on the format of the files are here: Requirements File Format.
After install the package you can check it using freeze command:
$ pip freeze
Uninstall Packages
To uninstall a package:

$ pip uninstall SomePackage
$ pip list
# example output
docutils (0.9.1)
Jinja2 (2.6)
Pygments (1.5)
Sphinx (1.1.2)
Upgrade Packages
Running

$ pip install --upgrade SomePackage

will upgrade the package SomePackage and all its dependencies. Also, pip automatically removes the
older version of the package before the upgrade.
To upgrade pip itself, use

$ pip install -U pip

on Unix, or

> python -m pip install -U pip

on Windows machines.
pip doesn't currently contain a flag to allow a user to update all outdated packages in one shot.
However, this can be accomplished by piping commands together in a Linux environment:
pip list --outdated --local | grep -v '^\-e' | cut -d = -f 1 | xargs -n1 pip install -U
This command takes all packages in the local virtualenv and checks if they are outdated. From
that list, it gets the package name and then pipes that to a pip install -U command. At the end of
this process, all local packages should be updated.
pip doesn't currently contain a flag to allow a user to update all outdated packages in one shot.
However, this can be accomplished by piping commands together in a Windows environment:
for /F "delims= " %i in ('pip list --outdated --local') do pip install -U %i
This command takes all packages in the local virtualenv and checks if they are outdated. From
that list, it gets the package name and then pipes that to a pip install -U command. At the end of
this process, all local packages should be updated.
$ pip freeze > requirements.txt

This will save a list of all packages and their versions installed on the system to a file named
requirements.txt in the current folder.
The --local parameter will only output a list of packages and versions that are installed locally to a
virtualenv. Global packages will not be listed.
If you have both Python 3 and Python 2 installed, you can specify which version of Python you
would like pip to use. This is useful when packages only support Python 2 or 3 or when you wish
to test with both.
$ pip2 install SomePackage  # run the Python 2 pip

or:

$ pip3 install SomePackage  # run the Python 3 pip

You can also invoke installation of a package to a specific python installation with:

$ /path/to/specific/python -m pip install SomePackage
On OS-X/Linux/Unix platforms it is important to be aware of the distinction between the system
version of python (which upgrading may render your system inoperable) and the user version(s)
of python. You may, depending on which you are trying to upgrade, need to prefix these
commands with sudo and input a password.
Likewise on Windows, some python installations, especially those that are a part of another
package, can end up installed in system directories; those you will have to upgrade from a
command window running in Admin mode. If you find that it looks like you need to do this, it is a
very good idea to check which python installation you are trying to upgrade with a command such
as python -c"import sys;print(sys.path);" or py -3.5 -c"import sys;print(sys.path);"; you can also
check which pip you are trying to run with pip --version.
On Windows, if you have both python 2 and python 3 installed, and on your path and your python
3 is greater than 3.4 then you will probably also have the python launcher py on your system path.
You can then do tricks like:
If you are running & maintaining multiple versions of python, I would strongly recommend reading
up about the python virtualenv or venv virtual environments, which allow you to isolate both the
version of python and which packages are present.
Many, pure python, packages are not yet available on the Python Package Index as wheels but
still install fine. However, some packages on Windows give the dreaded vcvarsall.bat not found
error.
The problem is that the package that you are trying to install contains a C or C++ extension and is
not currently available as a pre-built wheel from the python package index, pypi, and on windows
you do not have the tool chain needed to build such items.
The simplest answer is to go to Christoph Gohlke's excellent site and locate the appropriate
version of the libraries that you need. By appropriate, the -cpNN- in the package name has to
match your version of python: if you are using 32-bit python, even on win64, the name must
include -win32-, and if using the 64-bit python it must include -win_amd64-; the python version
must match as well, i.e. for Python 3.4 the filename must include -cp34-, etc. This is basically the
magic that pip does for you on the pypi site.
Alternatively, you need to get the appropriate windows development kit for the version of python
that you are using, the headers for any library that the package you are trying to build interfaces
to, possibly the python headers for the version of python, etc.
Python 2.7 used Visual Studio 2008, Python 3.3 and 3.4 used Visual Studio 2010, and Python
3.5+ uses Visual Studio 2015.
• Install “Visual C++ Compiler Package for Python 2.7”, which is available from Microsoft’s
website or
• Install “Windows SDK for Windows 7 and .NET Framework 4” (v7.1), which is available from
Microsoft’s website or
• Install Visual Studio 2015 Community Edition (or any later version, when these are
released), ensuring you select the options to install C & C++ support, which is no longer the
default. I am told that this can take up to 8 hours to download and install, so make sure that
those options are set on the first try.
Then you may need to locate the header files, at the matching revision for any libraries that your
desired package links to and download those to an appropriate locations.
Finally you can let pip do your build - of course if the package has dependencies that you don't
yet have you may also need to find the header files for them as well.
Alternatives: it is also worth looking out, both on pypi and Christoph's site, for a slightly earlier
version of the package you are looking for that is either pure python or pre-built for your
platform and python version, and possibly using that, if found, until your package does become
available. Likewise, if you are using the very latest version of python, you may find that it takes the
package maintainers a little time to catch up, so for projects that really need a specific package
you may have to use a slightly older python for the moment. You can also check the package's
source site to see if there is a forked version that is available pre-built or as pure python, and
search for alternative packages that provide the functionality you require but are available.
One example that springs to mind is Pillow, the actively maintained, drop-in replacement for PIL,
which itself had not been updated in 6 years and was not available for python 3.
As an afterword, I would encourage anybody who is having this problem to go to the bug tracker
for the package and add to, or raise if there isn't one already, a ticket politely requesting that the
package maintainers provide a wheel on pypi for your specific combination of platform and python.
If this is done, then normally things will get better with time; some package maintainers don't
realise that they have missed a given combination that people may be using.
Since such code is in flux, it is very unlikely to have wheels built for it, so any impure packages will
require the presence of the build tools, and they may be broken at any time; the user is
strongly encouraged to only install such packages in a virtual environment.
1. Download a compressed snapshot. Most online version control systems have the option to
download a compressed snapshot of the code. This can be downloaded manually and then
installed with pip install path/to/downloaded/file; note that for most compression formats pip
will handle unpacking to a cache area, etc.
2. Let pip handle the download & install for you with: pip install URL/of/package/repository -
you may also need to use the --trusted-host, --client-cert and/or --proxy flags for this to
work correctly, especially in a corporate environment. e.g:
Using cached pytz-2017.2-py2.py3-none-any.whl
Collecting sqlalchemy>=0.9 (from sphinxcontrib-websupport->Sphinx==1.7.dev20170506)
Downloading SQLAlchemy-1.1.9.tar.gz (5.2MB)
100% |################################| 5.2MB 220kB/s
Collecting whoosh>=2.0 (from sphinxcontrib-websupport->Sphinx==1.7.dev20170506)
Downloading Whoosh-2.7.4-py2.py3-none-any.whl (468kB)
100% |################################| 471kB 1.1MB/s
Installing collected packages: six, MarkupSafe, Jinja2, Pygments, docutils,
snowballstemmer, pytz, babel, alabaster, imagesize, requests, typing, sqlalchemy, whoosh,
sphinxcontrib-websupport, colorama, Sphinx
Running setup.py install for MarkupSafe ... done
Running setup.py install for typing ... done
Running setup.py install for sqlalchemy ... done
Running setup.py install for Sphinx ... done
Successfully installed Jinja2-2.9.6 MarkupSafe-1.0 Pygments-2.2.0 Sphinx-1.7.dev20170506
alabaster-0.7.10 babel-2.4.0 colorama-0.3.9 docutils-0.13.1 imagesize-0.7.1 pytz-2017.2
requests-2.13.0 six-1.10.0 snowballstemmer-1.2.1 sphinxcontrib-websupport-1.0.0 sqlalchemy-
1.1.9 typing-3.6.1 whoosh-2.7.4
3. Clone the repository using git, mercurial or another acceptable tool, preferably a DVCS tool,
and use pip install path/to/cloned/repo. This will both process any requirements file and
perform the build and setup steps; you can also manually change directory to your cloned
repository, run pip install -r requirements.txt and then python setup.py install to get the
same effect. The big advantage of this approach is that, while the initial clone operation may
take longer than the snapshot download, you can update to the latest with, in the case of git:
git pull origin master; and if the current version contains errors you can use pip uninstall
package-name, then use git checkout commands to move back through the repository history
to earlier version(s) and re-try.
Chapter 123: Plotting with Matplotlib
Introduction
Matplotlib (https://matplotlib.org/) is a library for 2D plotting based on NumPy. Here are some basic
examples. More examples can be found in the official documentation (
https://matplotlib.org/2.0.2/gallery.html and https://matplotlib.org/2.0.2/examples/index.html) as
well as in http://www.riptutorial.com/topic/881
Examples
A Simple Plot in Matplotlib
This example illustrates how to create a simple sine curve using Matplotlib
import numpy as np
import matplotlib.pyplot as plt

# sample the sine curve over one full period
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

plt.plot(x, y)
plt.show()
Adding more features to a simple plot : axis labels, title, axis ticks, grid, and
legend
In this example, we take a sine curve plot and add more features to it; namely the title, axis labels,
axis ticks, grid and legend.
import numpy as np
import matplotlib.pyplot as plt
plt.show()
In this example, a sine curve and a cosine curve are plotted in the same figure by superimposing
the plots on top of each other.
import numpy as np
import matplotlib.pyplot as plt
plt.title("Plot of some trigonometric functions")
plt.xticks(xnumbers)
plt.yticks(ynumbers)
plt.legend(['sine', 'cosine'])
plt.grid()
plt.axis([0, 6.5, -1.1, 1.1]) # [xstart, xend, ystart, yend]
plt.show()
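A complete version of such a superimposed plot might be sketched as follows; the sampling range and colours are assumptions:

```python
import numpy as np
import matplotlib.pyplot as plt

# sample both curves on the same x range
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)
z = np.cos(x)

# a single plot call can draw several curves at once
plt.plot(x, y, 'r', x, z, 'g')  # red sine, green cosine
plt.xlabel("Angle in Radians")
plt.ylabel("Magnitude")
plt.title("Plot of some trigonometric functions")
plt.legend(['sine', 'cosine'])
plt.grid()
plt.show()
```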
Making multiple Plots in the same figure using plot superimposition with
separate plot commands
Similar to the previous example, here, a sine and a cosine curve are plotted on the same figure
using separate plot commands. This is more Pythonic and can be used to get separate handles for
each plot.
import numpy as np
import matplotlib.pyplot as plt
# values for making ticks in x and y axis
xnumbers = np.linspace(0, 7, 15)
ynumbers = np.linspace(-1, 1, 11)
In this example, we will plot a sine curve and a hyperbolic sine curve in the same plot, with a
common x-axis but different y-axes. This is accomplished by the use of the twinx() command.
# Note:
# Grid for second curve unsuccessful : let me know if you find it! :(
import numpy as np
import matplotlib.pyplot as plt
# plot the curves on axes 1, and 2, and get the curve handles
curve1, = ax1.plot(x, y, label="sin", color='r')
curve2, = ax2.plot(x, z, label="sinh", color='b')
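A minimal sketch of the twinx() pattern described above (data ranges and colours are assumptions):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 3, 100)
y = np.sin(x)
z = np.sinh(x)

# two y axes sharing one x axis
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()

# plot the curves on axes 1 and 2, and get the curve handles
curve1, = ax1.plot(x, y, label="sin", color='r')
curve2, = ax2.plot(x, z, label="sinh", color='b')

ax1.set_xlabel("x")
ax1.set_ylabel("sin(x)", color='r')
ax2.set_ylabel("sinh(x)", color='b')
plt.show()
```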
Plots with common Y-axis and different X-axis using twiny()
In this example, a plot with curves having common y-axis but different x-axis is demonstrated
using twiny() method. Also, some additional features such as the title, legend, labels, grids, axis
ticks and colours are added to the plot.
import numpy as np
import matplotlib.pyplot as plt
# Duplicate the axes with a different x axis
# and the same y axis
ax2 = ax1.twiny() # ax2 and ax1 will have common y axis and different x axis
# plot the curves on axes 1, and 2, and get the axes handles
curve1, = ax1.plot(x1, y, label="sin", color='r')
curve2, = ax2.plot(x2, y, label="sinh", color='b')
# set x ticks
ax1.set_xticks(xnumbers1)
ax2.set_xticks(xnumbers2)
# set y ticks
ax1.set_yticks(ynumbers)
# ax2.set_yticks(ynumbers) # also works
Read Plotting with Matplotlib online: https://riptutorial.com/python/topic/10264/plotting-with-
matplotlib
Chapter 124: Plugin and Extension Classes
Examples
Mixins
In object-oriented programming languages, a mixin is a class that contains methods for use by
other classes without having to be the parent class of those other classes. How those other
classes gain access to the mixin's methods depends on the language.
It provides a mechanism for multiple inheritance by allowing multiple classes to use the common
functionality, but without the complex semantics of multiple inheritance. Mixins are useful when a
programmer wants to share functionality between different classes. Instead of repeating the same
code over and over again, the common functionality can simply be grouped into a mixin and then
inherited into each class that requires it.
When we use more than one mixin, the order of the mixins is important. Here is a simple example:
class Mixin1(object):
    def test(self):
        print("Mixin1")

class Mixin2(object):
    def test(self):
        print("Mixin2")
The result must be Mixin1, because the order is left to right. This can produce unexpected results
when super classes are added to the mix, so the reverse order is often better, just like this:
Python 3.x3.0
class Base(object):
    def test(self):
        print("Base.")

class PluginA(object):
    def test(self):
        super().test()
        print("Plugin A.")

class PluginB(object):
    def test(self):
        super().test()
        print("Plugin B.")

# Compose the plugins with the base class; each plugin class must come
# first so that its test() runs and delegates to Base via super()
class PluginSystemA(PluginA, Base):
    pass

class PluginSystemB(PluginB, Base):
    pass

PluginSystemA().test()
# Base.
# Plugin A.

PluginSystemB().test()
# Base.
# Plugin B.
In Python 3.6, PEP 487 added the __init_subclass__ special method, which simplifies and extends
class customization without using metaclasses. Consequently, this feature allows for creating
simple plugins. Here we demonstrate this feature by modifying a prior example:
Python 3.x3.6
class Base:
    plugins = []
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.plugins.append(cls)
    def test(self):
        print("Base.")
class PluginA(Base):
    def test(self):
        super().test()
        print("Plugin A.")

class PluginB(Base):
    def test(self):
        super().test()
        print("Plugin B.")
Results:
PluginA().test()
# Base.
# Plugin A.
PluginB().test()
# Base.
# Plugin B.
Base.plugins
# [__main__.PluginA, __main__.PluginB]
Chapter 125: Polymorphism
Examples
Basic Polymorphism
Polymorphism is the ability to perform an action on an object regardless of its type. This is
generally implemented by creating a base class and having two or more subclasses that all
implement methods with the same signature. Any other function or method that manipulates these
objects can call the same methods regardless of which type of object it is operating on, without
needing to do a type check first. In object-oriented terminology, when class X extends class Y, then
Y is called the superclass or base class, and X is called the subclass or derived class.
class Shape:
    """
    This is a parent class that is intended to be inherited by other classes
    """
    def calculate_area(self):
        """
        This method is intended to be overridden in subclasses.
        If a subclass doesn't implement it but it is called, NotImplementedError will be raised.
        """
        raise NotImplementedError

class Square(Shape):
    """
    This is a subclass of the Shape class, and represents a square
    """
    side_length = 2  # in this example, the sides are 2 units long

    def calculate_area(self):
        """
        This method overrides Shape.calculate_area(). When an object of type
        Square has its calculate_area() method called, this is the method that
        will be called, rather than the parent class' version.
        """
        return self.side_length ** 2

class Triangle(Shape):
    """
    This is also a subclass of the Shape class, and it represents a triangle
    """
    base_length = 4
    height = 3

    def calculate_area(self):
        """
        This method also overrides Shape.calculate_area() and performs the area
        calculation for a triangle, returning the result.
        """
        return 0.5 * self.base_length * self.height
def get_area(input_obj):
    """
    This function accepts an input object, and will call that object's
    calculate_area() method. Note that the object type is not specified. It
    could be a Square, Triangle, or Shape object.
    """
    print(input_obj.calculate_area())

# Create an instance of each subclass
square_obj = Square()
triangle_obj = Triangle()

# Now pass each object, one at a time, to the get_area() function and see the
# result. (Passing a bare Shape() instance would raise NotImplementedError,
# since the base class does not implement calculate_area().)
get_area(square_obj)
get_area(triangle_obj)

4
6.0
class Square:
    side_length = 2

    def calculate_square_area(self):
        return self.side_length ** 2

class Triangle:
    base_length = 4
    height = 3

    def calculate_triangle_area(self):
        return (0.5 * self.base_length) * self.height

def get_area(input_obj):
    # Notice the type checks that are now necessary here. These type checks
    # could get very complicated for a more complex example, resulting in
    # duplicate and difficult to maintain code.
    if type(input_obj).__name__ == "Square":
        area = input_obj.calculate_square_area()
    elif type(input_obj).__name__ == "Triangle":
        area = input_obj.calculate_triangle_area()
    print(area)

# Create an instance of each class
square_obj = Square()
triangle_obj = Triangle()

# Now pass each object, one at a time, to the get_area() function and see the
# result.
get_area(square_obj)
get_area(triangle_obj)

4
6.0
Important Note
Note that the classes used in the counterexample are "new style" classes and implicitly inherit
from the object class if Python 3 is being used. Polymorphism will work in both Python 2.x and 3.x,
but the polymorphism counterexample code will raise an exception if run in a Python 2.x
interpreter, because type(input_obj).__name__ will return "instance" instead of the class name if
the classes do not explicitly inherit from object, resulting in area never being assigned to.
Duck Typing
Polymorphism without inheritance is available in Python in the form of duck typing, thanks to its
dynamic typing system. This means that as long as the classes contain the same methods, the
Python interpreter does not distinguish between them; the only checking of the calls occurs at
run-time.
class Duck:
    def quack(self):
        print("Quaaaaaack!")
    def feathers(self):
        print("The duck has white and gray feathers.")

class Person:
    def quack(self):
        print("The person imitates a duck.")
    def feathers(self):
        print("The person takes a feather from the ground and shows it.")
    def name(self):
        print("John Smith")

def in_the_forest(obj):
    obj.quack()
    obj.feathers()

donald = Duck()
john = Person()
in_the_forest(donald)
in_the_forest(john)
Quaaaaaack!
The duck has white and gray feathers.
The person imitates a duck.
The person takes a feather from the ground and shows it.
Chapter 126: PostgreSQL
Examples
Getting Started
PostgreSQL is an actively developed and mature open source database. Using the psycopg2
module, we can execute queries on the database.
Basic usage
Let's assume we have a table my_table in the database my_database defined as follows.
id first_name last_name
1 John Doe
We can use the psycopg2 module to run queries on the database in the following fashion.
import psycopg2

# Connect to the database (the connection parameters here are placeholders)
con = psycopg2.connect(database="my_database", user="user",
                       password="password", host="localhost")

# Create a cursor
cur = con.cursor()
Read PostgreSQL online: https://riptutorial.com/python/topic/3374/postgresql
Chapter 127: Processes and Threads
Introduction
Most programs execute line by line, running a single sequence of instructions at a time. Threads
allow multiple flows of execution to proceed independently of each other, and threading on
multiple processors permits programs to run several computations simultaneously. This topic
documents the implementation and usage of threads in Python.
Examples
Global Interpreter Lock
Python multithreading performance can often suffer due to the Global Interpreter Lock. In short,
even though you can have multiple threads in a Python program, only one bytecode instruction
can execute in parallel at any one time, regardless of the number of CPUs.
As such, multithreading in cases where operations are blocked by external events - like network
access - can be quite effective:
import threading
import time
def process():
    time.sleep(2)

start = time.time()
process()
print("One run took %.2fs" % (time.time() - start))

start = time.time()
threads = [threading.Thread(target=process) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("Four runs took %.2fs" % (time.time() - start))
Note that even though each call took 2 seconds to execute, the four threads were able to
effectively run in parallel, taking 2 seconds total.
However, multithreading in cases where intensive computations are being done in Python code
does not result in much improvement, and can even be slower than running the computations
sequentially:
import threading
import time

def somefunc(i):
    return i * i

def otherfunc(total, x):
    # stand-in: the original helper is not shown in this excerpt
    return total + x

def process():
    for j in range(100):
        result = 0
        for i in range(100000):
            result = otherfunc(result, somefunc(i))

start = time.time()
process()
print("One run took %.2fs" % (time.time() - start))

start = time.time()
threads = [threading.Thread(target=process) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("Four runs took %.2fs" % (time.time() - start))
In the latter case, multiprocessing can be effective as multiple processes can, of course, execute
multiple instructions simultaneously:
import multiprocessing
import time

def somefunc(i):
    return i * i

def otherfunc(total, x):
    # stand-in: the original helper is not shown in this excerpt
    return total + x

def process():
    for j in range(100):
        result = 0
        for i in range(100000):
            result = otherfunc(result, somefunc(i))

start = time.time()
process()
print("One run took %.2fs" % (time.time() - start))

start = time.time()
processes = [multiprocessing.Process(target=process) for _ in range(4)]
for p in processes:
    p.start()
for p in processes:
    p.join()
print("Four runs took %.2fs" % (time.time() - start))
import threading
import os

def process():
    print("Pid is %s, thread id is %s" % (os.getpid(), threading.current_thread().name))

import multiprocessing
import os

def process():
    print("Pid is %s" % (os.getpid(),))
As all threads are running in the same process, all threads have access to the same data.
However, concurrent access to shared data should be protected with a lock to avoid
synchronization issues.
import threading
obj = {}
obj_lock = threading.Lock()
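To make the locking pattern concrete, here is a self-contained sketch; the counter, thread count, and iteration count are illustrative, not from the original text:

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        with counter_lock:  # serialize the read-modify-write on the shared counter
            counter += 1

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000 -- no updates lost, thanks to the lock
```

Without the lock, the `counter += 1` read-modify-write could interleave between threads and lose updates.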
Code running in different processes does not, by default, share the same data. However, the
multiprocessing module contains primitives to help share values across multiple processes.
import multiprocessing
plain_num = 0
shared_num = multiprocessing.Value('d', 0)
lock = multiprocessing.Lock()
def increment():
    global plain_num
    with lock:
        # ordinary variable modifications are not visible across processes
        plain_num += 1
        # multiprocessing.Value modifications are
        shared_num.value += 1
    print("plain_num is %d, shared_num is %d" % (plain_num, shared_num.value))
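A runnable sketch of the same idea is shown below. It uses the Unix "fork" start method explicitly so the module-level shared objects are inherited without a __main__ guard; on Windows, a guard and the default start method would be needed. The process and iteration counts are illustrative:

```python
import multiprocessing

# Use the fork start method (Unix-only) so children inherit module-level state
ctx = multiprocessing.get_context("fork")

shared_num = ctx.Value('i', 0)
lock = ctx.Lock()

def increment(n):
    for _ in range(n):
        with lock:  # protect the read-modify-write across processes
            shared_num.value += 1

procs = [ctx.Process(target=increment, args=(100,)) for _ in range(4)]
for p in procs:
    p.start()
for p in procs:
    p.join()

print(shared_num.value)  # 400 -- all four processes' updates are visible
```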
Chapter 128: Profiling
Examples
%%timeit and %timeit in IPython
timeit() function
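The IPython magics themselves are elided in this excerpt; outside IPython, the standard-library timeit module provides the same measurements. A minimal sketch (the timed statements are illustrative):

```python
import timeit

# Total time (in seconds) for 10000 executions of the statement
total = timeit.timeit("sum(range(100))", number=10000)
print("10000 runs took %.4fs" % total)

# A callable can be timed directly as well
def square_list():
    return [x * x for x in range(1000)]

total = timeit.timeit(square_list, number=1000)
print("1000 runs took %.4fs" % total)
```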
line_profiler in command line

The source code, with the @profile decorator before the function we want to profile:
import requests

@profile
def slow_func():
    s = requests.session()
    html = s.get("https://en.wikipedia.org/").text
    sum([pow(ord(x), 3.1) for x in list(html)])

for i in range(50):
    slow_func()
Page request is almost always slower than any calculation based on the information on the page.
Python includes a profiler called cProfile. This is generally preferred over using timeit.
It breaks down your entire script and for each method in your script it tells you:
$ python -m cProfile main.py
To sort the returned list of profiled methods by the time taken in the method, pass the -s option
with a sort key such as time:

$ python -m cProfile -s time main.py
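The same sorting can also be done programmatically with the pstats module; a sketch, where the profiled function is purely illustrative:

```python
import cProfile
import io
import pstats

def slow():
    # deliberately busy function so it shows up in the profile
    total = 0
    for i in range(100000):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
slow()
profiler.disable()

# Sort profiled functions by cumulative time and print the top entries
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats('cumulative').print_stats(5)
report = stream.getvalue()
print(report)
```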
Chapter 129: Property Objects
Remarks
Note: In Python 2, make sure that your class inherits from object (making it a new-style class) in
order for all features of properties to be available.
Examples
Using the @property decorator
The @property decorator can be used to define methods in a class which act like attributes. One
example where this can be useful is when exposing information which may require an initial
(expensive) lookup and simple retrieval thereafter.
class Foo(object):
    def __init__(self):
        self.__bar = None

    @property
    def bar(self):
        if self.__bar is None:
            self.__bar = some_expensive_lookup_operation()
        return self.__bar
The first access to bar performs the expensive lookup; subsequent accesses return the cached
value.
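The usage snippet is elided in this excerpt; a self-contained sketch follows, with a stand-in for the expensive lookup (the lookup_calls counter exists only to show the lookup runs once):

```python
class Foo(object):
    lookup_calls = 0  # track how often the expensive operation runs

    def __init__(self):
        self.__bar = None

    @property
    def bar(self):
        if self.__bar is None:
            self.__bar = self.__expensive_lookup()
        return self.__bar

    def __expensive_lookup(self):
        # stand-in for some_expensive_lookup_operation()
        Foo.lookup_calls += 1
        return 42

foo = Foo()
print(foo.bar)           # 42 -- computed on first access
print(foo.bar)           # 42 -- returned from the cache
print(Foo.lookup_calls)  # 1
```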
If you want to use @property to implement custom behavior for setting and getting, use this pattern:
class Cash(object):
    def __init__(self, value):
        self.value = value

    @property
    def formatted(self):
        return '${:.2f}'.format(self.value)

    @formatted.setter
    def formatted(self, new):
        self.value = float(new[1:])
To use this:
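The usage snippet is elided in this excerpt; a self-contained sketch, restating the Cash class so it can run on its own:

```python
class Cash(object):
    def __init__(self, value):
        self.value = value

    @property
    def formatted(self):
        return '${:.2f}'.format(self.value)

    @formatted.setter
    def formatted(self, new):
        self.value = float(new[1:])

cash = Cash(101.1342)
print(cash.formatted)  # $101.13 -- getter formats the stored float

cash.formatted = '$123.45'  # setter parses the string back into a float
print(cash.value)           # 123.45
```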
When you inherit from a class with a property, you can provide a new implementation for one or
more of the property getter, setter or deleter functions, by referencing the property object on the
parent class:
class BaseClass(object):
    @property
    def foo(self):
        return some_calculated_value()

    @foo.setter
    def foo(self, value):
        do_something_with_value(value)

class DerivedClass(BaseClass):
    @BaseClass.foo.setter
    def foo(self, value):
        do_something_different_with_value(value)
You can also add a setter or deleter where there was not one on the base class before.
While using decorator syntax (with the @) is convenient, it is also a bit concealing. You can use
properties directly, without decorators. The following Python 3.x example shows this:
class A:
    p = 1234

    def getX(self):
        return self._x

    def getY2(self):
        return self._y

A.q = 5678

class B:
    def getZ(self):
        return self.z_

class C:
    def __init__(self):
        self.offset = 1234
a1 = A ()
a2 = A ()
a1.y2 = 1000
a2.y2 = 2000
a1.x = 5
a1.y = 6
a2.x = 7
a2.y = 8
a1.t = 77
a1.u = 88
print (a2.x, a2.y, a2.y2)
print (a1.p, a2.p, a1.q, a2.q)
b = B ()
c = C ()
b.z = 100100
c.z = 200200
c.w = 300300
c.w = 400400
c.z = 500500
b.z = 600600
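Much of the original example is elided above; the core of the non-decorator approach is simply assigning the result of property() to a class attribute, as in this minimal sketch (names are illustrative):

```python
class A:
    def get_x(self):
        return self._x

    def set_x(self, value):
        self._x = value

    # wire the getter and setter together without decorator syntax
    x = property(get_x, set_x)

a = A()
a.x = 5     # calls set_x
print(a.x)  # 5 -- calls get_x
```

property() also accepts a deleter and a docstring as its third and fourth arguments.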
Chapter 130: py.test
Examples
Setting up py.test
py.test is one of several third-party testing libraries that are available for Python. It can be
installed using pip with

pip install pytest
# projectroot/module/code.py
def add(a, b):
    return a + b

# projectroot/tests/test_code.py
from module import code

def test_add():
    assert code.add(1, 2) == 3
tests/test_code.py .

=========================== 1 passed in 0.01 seconds ===========================
Failing Tests
# projectroot/tests/test_code.py
from module import code

def test_add__failing():
    assert code.add(10, 11) == 33
Results:
$ py.test
================================ test session starts =================================
platform darwin -- Python 2.7.10, pytest-2.9.2, py-1.4.31, pluggy-0.3.1
rootdir: /projectroot, inifile:
collected 1 items

tests/test_code.py F

====================================== FAILURES ======================================
_________________________________ test_add__failing __________________________________

    def test_add__failing():
>       assert code.add(10, 11) == 33
E       assert 21 == 33
E        +  where 21 = <function add at 0x105d4d6e0>(10, 11)
E        +  where <function add at 0x105d4d6e0> = code.add

tests/test_code.py:5: AssertionError
============================== 1 failed in 0.01 seconds ==============================
More complicated tests sometimes need to have things set up before you run the code you want
to test. It is possible to do this in the test function itself, but then you end up with large test
functions doing so much that it is difficult to tell where the setup stops and the test begins. You can
also get a lot of duplicate setup code between your various test functions.
# projectroot/module/stuff.py
class Stuff(object):
    def prep(self):
        self.foo = 1
        self.bar = 2
Our test file:
# projectroot/tests/test_stuff.py
import pytest
from module import stuff

def test_foo_updates():
    my_stuff = stuff.Stuff()
    my_stuff.prep()
    assert 1 == my_stuff.foo
    my_stuff.foo = 30000
    assert my_stuff.foo == 30000

def test_bar_updates():
    my_stuff = stuff.Stuff()
    my_stuff.prep()
    assert 2 == my_stuff.bar
    my_stuff.bar = 42
    assert 42 == my_stuff.bar
These are pretty simple examples, but if our Stuff object needed a lot more setup, it would get
unwieldy. We see that there is some duplicated code between our test cases, so let's refactor that
into a separate function first.
# projectroot/tests/test_stuff.py
import pytest
from module import stuff

def get_prepped_stuff():
    my_stuff = stuff.Stuff()
    my_stuff.prep()
    return my_stuff

def test_foo_updates():
    my_stuff = get_prepped_stuff()
    assert 1 == my_stuff.foo
    my_stuff.foo = 30000
    assert my_stuff.foo == 30000

def test_bar_updates():
    my_stuff = get_prepped_stuff()
    assert 2 == my_stuff.bar
    my_stuff.bar = 42
    assert 42 == my_stuff.bar
This looks better, but we still have the my_stuff = get_prepped_stuff() call cluttering up our test
functions.

py.test provides fixtures for exactly this purpose. Fixtures can do a lot more than we're
leveraging here, but we'll take it one step at a time.
First we change get_prepped_stuff to a fixture called prepped_stuff. You want to name your fixtures
with nouns rather than verbs because of how the fixtures will end up being used in the test
functions themselves later. The @pytest.fixture indicates that this specific function should be
handled as a fixture rather than a regular function.
@pytest.fixture
def prepped_stuff():
    my_stuff = stuff.Stuff()
    my_stuff.prep()
    return my_stuff
Now we should update the test functions so that they use the fixture. This is done by adding a
parameter to their definition that exactly matches the fixture name. When py.test executes, it will
run the fixture before running the test, then pass the return value of the fixture into the test function
through that parameter. (Note that fixtures don't need to return a value; they can do other setup
things instead, like calling an external resource, arranging things on the filesystem, putting values
in a database, whatever the tests need for setup)
def test_foo_updates(prepped_stuff):
    my_stuff = prepped_stuff
    assert 1 == my_stuff.foo
    my_stuff.foo = 30000
    assert my_stuff.foo == 30000

def test_bar_updates(prepped_stuff):
    my_stuff = prepped_stuff
    assert 2 == my_stuff.bar
    my_stuff.bar = 42
    assert 42 == my_stuff.bar
Now you can see why we named it with a noun. But the my_stuff = prepped_stuff line is pretty
much useless, so let's just use prepped_stuff directly instead.
def test_foo_updates(prepped_stuff):
    assert 1 == prepped_stuff.foo
    prepped_stuff.foo = 30000
    assert prepped_stuff.foo == 30000

def test_bar_updates(prepped_stuff):
    assert 2 == prepped_stuff.bar
    prepped_stuff.bar = 42
    assert 42 == prepped_stuff.bar
Now we're using fixtures! We can go further by changing the scope of the fixture (so it only runs
once per test module or test suite execution session instead of once per test function), building
fixtures that use other fixtures, parametrizing the fixture (so that the fixture and all tests using that
fixture are run multiple times, once for each parameter given to the fixture), fixtures that read
values from the module that calls them... as mentioned earlier, fixtures have a lot more power and
flexibility than a normal setup function.
# projectroot/module/stuff.py
class Stuff(object):
    def prep(self):
        self.foo = 1
        self.bar = 2

    def finish(self):
        self.foo = 0
        self.bar = 0
We could add some code to call the clean up at the bottom of every test function, but fixtures
provide a better way to do this. If you add a function to the fixture and register it as a finalizer, the
code in the finalizer function will get called after the test using the fixture is done. If the scope of
the fixture is larger than a single function (like module or session), the finalizer will be executed
after all the tests in scope are completed, so after the module is done running or at the end of the
entire test running session.
@pytest.fixture
def prepped_stuff(request):  # we need to pass in the request to use finalizers
    my_stuff = stuff.Stuff()
    my_stuff.prep()

    def fin():  # finalizer function
        # do all the cleanup here
        my_stuff.finish()

    request.addfinalizer(fin)  # register fin() as a finalizer
    # you can do more setup here if you really want to
    return my_stuff
Using the finalizer function inside a function can be a bit hard to understand at first glance,
especially when you have more complicated fixtures. You can instead use a yield fixture to do the
same thing with a more human readable execution flow. The only real difference is that instead of
using return we use a yield at the part of the fixture where the setup is done and control should go
to a test function, then add all the cleanup code after the yield. We also decorate it as a
yield_fixture so that py.test knows how to handle it.
@pytest.yield_fixture
def prepped_stuff():  # it doesn't need request now!
    # do setup
    my_stuff = stuff.Stuff()
    my_stuff.prep()
    # setup is done, pass control to the test functions
    yield my_stuff
    # do cleanup
    my_stuff.finish()
For more information, see the official py.test fixture documentation and the official yield fixture
documentation.
Chapter 131: pyaudio
Introduction
PyAudio provides Python bindings for PortAudio, the cross-platform audio I/O library. With
PyAudio, you can easily use Python to play and record audio on a variety of platforms. PyAudio is
inspired by:
Remarks
Note: stream_callback is called in a separate thread (from the main thread). Exceptions that occur
in the stream_callback will:
1. print a traceback on standard error to aid debugging,
2. queue the exception to be thrown (at some point) in the main thread, and
3. return paAbort to PortAudio to stop the stream.
Note: Do not call Stream.read() or Stream.write() if using non-blocking operation.
See PortAudio's callback signature for additional details:
http://portaudio.com/docs/v19-doxydocs/portaudio_8h.html#a8a60fb2a5ec9cbade3f54a9c978e2710
Examples
Callback Mode Audio I/O
import pyaudio
import wave
import time
import sys
if len(sys.argv) < 2:
    print("Plays a wave file.\n\nUsage: %s filename.wav" % sys.argv[0])
    sys.exit(-1)
wf = wave.open(sys.argv[1], 'rb')
# open stream using callback (3)
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True,
                stream_callback=callback)
In callback mode, PyAudio will call a specified callback function (2) whenever it needs new audio
data (to play) and/or when there is new (recorded) audio data available. Note that PyAudio calls
the callback function in a separate thread. The function has the following signature
callback(<input_data>, <frame_count>, <time_info>, <status_flag>) and must return a tuple
containing frame_count frames of audio data and a flag signifying whether there are more frames to
play/record.
Start processing the audio stream using pyaudio.Stream.start_stream() (4), which will call the
callback function repeatedly until that function returns pyaudio.paComplete.
To keep the stream active, the main thread must not terminate, e.g., by sleeping (5).
import pyaudio
import wave
import sys
CHUNK = 1024
if len(sys.argv) < 2:
    print("Plays a wave file.\n\nUsage: %s filename.wav" % sys.argv[0])
    sys.exit(-1)
wf = wave.open(sys.argv[1], 'rb')
rate=wf.getframerate(),
output=True)
# read data
data = wf.readframes(CHUNK)
To use PyAudio, first instantiate PyAudio using pyaudio.PyAudio() (1), which sets up the
portaudio system.
To record or play audio, open a stream on the desired device with the desired audio parameters
using pyaudio.PyAudio.open() (2). This sets up a pyaudio.Stream to play or record audio.
Play audio by writing audio data to the stream using pyaudio.Stream.write(), or read audio data
from the stream using pyaudio.Stream.read(). (3)
Chapter 132: pyautogui module
Introduction
pyautogui is a module used to control the mouse and keyboard. It is mainly used to automate
mouse-click and key-press tasks. For the mouse, the screen coordinates (0, 0) start at the top-left
corner. If you lose control of the script, quickly move the mouse cursor to the top-left corner;
PyAutoGUI's fail-safe will stop the script and return control of the mouse and keyboard to you.
Examples
Mouse Functions
Keyboard Functions
These are some useful keyboard functions for automating key presses:

typewrite('')  # types the given string into the window that currently has focus
typewrite(['a', 'b', 'left', 'left', 'X', 'Y'])  # presses the listed keys in sequence
pyautogui.KEYBOARD_KEYS  # the list of all valid keyboard keys
pyautogui.hotkey('ctrl', 'o')  # presses a combination of keys
These functions will help you take a screenshot and also match an image against part of the
screen:
Chapter 133: pygame
Introduction
Pygame is the go-to library for making multimedia applications, especially games, in Python. The
official website is http://www.pygame.org/.
Syntax
• pygame.mixer.init(frequency=22050, size=-16, channels=2, buffer=4096)
• pygame.mixer.pre_init(frequency, size, channels, buffer)
• pygame.mixer.quit()
• pygame.mixer.get_init()
• pygame.mixer.stop()
• pygame.mixer.pause()
• pygame.mixer.unpause()
• pygame.mixer.fadeout(time)
• pygame.mixer.set_num_channels(count)
• pygame.mixer.get_num_channels()
• pygame.mixer.set_reserved(count)
• pygame.mixer.find_channel(force)
• pygame.mixer.get_busy()
Parameters
Parameter Details
Examples
Installing pygame
With pip:
With conda:
conda install -c tlatorre pygame=1.9.2
You can find suitable installers for Windows and other operating systems.
The pygame.mixer module helps control the music used in pygame programs. As of now, there are 15
different functions for the mixer module.
Initializing
Similar to how you have to initialize pygame with pygame.init(), you must initialize pygame.mixer as
well.
By using the first option, we initialize the module using the default values. You can though,
override these default options. By using the second option, we can initialize the module using the
values we manually put in ourselves. Standard values:
To check whether we have initialized it or not, we can use pygame.mixer.get_init(), which returns
True if it is and False if it is not. To quit/undo the initializing, simply use pygame.mixer.quit(). If you
want to continue playing sounds with the module, you might have to reinitialize the module.
Possible Actions
As your sound is playing, you can pause it temporarily with pygame.mixer.pause(). To resume
playing your sounds, simply use pygame.mixer.unpause(). You can also fade out the end of the
sound by using pygame.mixer.fadeout(). It takes one argument: the number of milliseconds over
which to finish fading out the music.
Channels
You can play as many songs as needed as long there are enough open channels to support them.
By default, there are 8 channels. To change the number of channels there are, use
pygame.mixer.set_num_channels(). The argument is a non-negative integer. If the number of
channels are decreased, any sounds playing on the removed channels will immediately stop.
To find how many playback channels exist, call pygame.mixer.get_num_channels(). You can also
reserve channels for sounds that must be played by using pygame.mixer.set_reserved(count). The
argument is again a non-negative integer. Any sounds already playing on the newly reserved
channels will not be stopped.
You can also find out which channel isn't being used by using pygame.mixer.find_channel(force). Its
argument is a bool: either True or False. If there are no idle channels and force is False, it
will return None. If force is True, it will return the channel that has been playing for the longest
time.
Chapter 134: Pyglet
Introduction
Pyglet is a Python module used for visuals and sound. It has no dependencies on other modules.
See [pyglet.org][1] for the official information. [1]: http://pyglet.org
Examples
Hello World in Pyglet
import pyglet

window = pyglet.window.Window()
label = pyglet.text.Label('Hello, world',
                          font_name='Times New Roman',
                          font_size=36,
                          x=window.width//2, y=window.height//2,
                          anchor_x='center', anchor_y='center')

@window.event
def on_draw():
    window.clear()
    label.draw()

pyglet.app.run()
Installation of Pyglet
Python 2:

pip install pyglet

Python 3:

pip3 install pyglet
sound = pyglet.media.load('sound.wav')
sound.play()
import pyglet
from pyglet.gl import *
win = pyglet.window.Window()
@win.event
def on_draw():
    # OpenGL goes here. Use OpenGL as normal.
    pass

pyglet.app.run()
import pyglet
from pyglet.gl import *
win = pyglet.window.Window()
glClear(GL_COLOR_BUFFER_BIT)

@win.event
def on_draw():
    glBegin(GL_POINTS)
    # x is the desired distance from the left side of the window,
    # y is the desired distance from the bottom of the window
    glVertex2f(x, y)
    # make as many vertexes as you want
    glEnd()
Chapter 135: PyInstaller - Distributing Python
Code
Syntax
• pyinstaller [options] script [script ...] | specfile
Remarks
PyInstaller is a module used to bundle python apps in a single package along with all the
dependencies. The user can then run the package app without a python interpreter or any
modules. It correctly bundles many major packages like numpy, Django, OpenCv and others.
Examples
Installation and Setup
Installation in Windows
For Windows, pywin32 or pypiwin32 is a prerequisite. The latter is installed automatically when
pyinstaller is installed using pip.
Installation in Mac OS X
PyInstaller works with the default Python 2.7 provided with current Mac OS X. If later versions of
Python are to be used or if any major packages such as PyQT, Numpy, Matplotlib and the like are
to be used, it is recommended to install them using either MacPorts or Homebrew.
Expand the archive and find the setup.py script. Execute python setup.py install with administrator
privilege to install or upgrade PyInstaller.
Using Pyinstaller
In the simplest use-case, just navigate to the directory your file is in, and type:
pyinstaller myfile.py
Options
There are several options that can be used with pyinstaller. A full list of the options can be found
here.
When PyInstaller is used without any options to bundle myscript.py, the default output is a single
folder (named myscript) containing an executable named myscript (myscript.exe on Windows) along
with all the necessary dependencies.
The app can be distributed by compressing the folder into a zip file.
One-folder mode can be explicitly set using the option -D or --onedir:
pyinstaller myscript.py -D
Advantages:
One of the major advantages of bundling to a single folder is that it is easier to debug problems. If
any modules fail to import, it can be verified by inspecting the folder.
Another advantage is felt during updates. If there are a few changes in the code but the
dependencies used are exactly the same, distributors can just ship the executable file (which is
typically smaller than the entire folder).
Disadvantages
The only disadvantage of this method is that the users have to search for the executable among a
large number of files.
Also users can delete/modify other files which might lead to the app not being able to work
correctly.
The options to generate a single file are -F or --onefile. This bundles the program into a single
myscript.exe file.
Single file executable are slower than the one-folder bundle. They are also harder to debug.
Chapter 136: Python and Excel
Examples
Put list data into an Excel file.

from datetime import datetime

dt = datetime.now()
list_values = [["01/01/2016", "05:00:00", 3],
               ["01/02/2016", "06:00:00", 4],
               ["01/03/2016", "07:00:00", 5],
               ["01/04/2016", "08:00:00", 6],
               ["01/05/2016", "09:00:00", 7]]
OpenPyXL
load_workbook() contains the parameter read_only, setting this to True will load the workbook as
read_only, this is helpful when reading larger xlsx files:
Once you have loaded the workbook into memory, you can access the individual sheets using
workbook.worksheets:

first_sheet = workbook.worksheets[0]

To get the names of the available sheets, use workbook.get_sheet_names().
Finally, the rows of the sheet can be accessed using sheet.rows. To iterate over the rows in a
sheet, use:
Since each row in rows is a list of Cells, use Cell.value to get the contents of the Cell.
Several tab properties may be changed through openpyxl, for example the tabColor:
ws.sheet_properties.tabColor = 'FFC0CB'
wb.save('filename.xlsx')
import xlsxwriter
# sample data
chart_data = [
{'name': 'Lorem', 'value': 23},
{'name': 'Ipsum', 'value': 48},
{'name': 'Dolor', 'value': 15},
{'name': 'Sit', 'value': 8},
{'name': 'Amet', 'value': 32}
]
# excel file path
xls_file = 'chart.xlsx'
# the workbook
workbook = xlsxwriter.Workbook(xls_file)
worksheet = workbook.add_worksheet()  # add a worksheet to write into

row_ = 0
col_ = 0
# write headers
worksheet.write(row_, col_, 'NAME')
col_ += 1
worksheet.write(row_, col_, 'VALUE')
row_ += 1
workbook.close()
Result:
Read the excel data using xlrd module
The Python xlrd library extracts data from Microsoft Excel (tm) spreadsheet files.
Installation:-
https://pypi.python.org/pypi/xlrd
Reading an Excel sheet: import the xlrd module and open the Excel file using the open_workbook()
method.

import xlrd
book = xlrd.open_workbook('sample.xlsx')

print(book.nsheets)        # number of sheets
print(book.sheet_names())  # sheet names

sheet = book.sheet_by_index(1)
num_rows = sheet.nrows
num_cols = sheet.ncols

sheets = book.sheet_names()
cur_sheet = book.sheet_by_name(sheets[0])
import xlsxwriter
# set column B to 20 and include the percent format we created earlier
worksheet.set_column('B:B', 20, percent_format)
workbook.close()
Chapter 137: Python Anti-Patterns
Examples
Overzealous except clause
Exceptions are powerful, but a single overzealous except clause can take it all away in a single
line.
try:
    res = get_result()
    res = res[0]
    log('got result: %r' % res)
except:
    if not res:
        res = ''
    print('got exception')
1. The except with no exception type (line 5) will catch even healthy exceptions, including
KeyboardInterrupt. That will prevent the program from exiting in some cases.
2. The except block does not reraise the error, meaning that we won't be able to tell if the
exception came from within get_result or because res was an empty list.
3. Worst of all, if we were worried about result being empty, we've caused something much
worse. If get_result fails, res will stay completely unset, and the reference to res in the
except block, will raise NameError, completely masking the original error.
Always think about the type of exception you're trying to handle. Give the exceptions page a read
and get a feel for what basic exceptions exist.
import traceback

try:
    res = get_result()
except Exception:
    log_exception(traceback.format_exc())
    raise

try:
    res = res[0]
except IndexError:
    res = ''
We catch more specific exceptions, reraising where necessary. A few more lines, but infinitely
more correct.
Looking before you leap with processor-intensive function
A program can easily waste time by calling a processor-intensive function multiple times.
For example, take a function which looks like this: it returns an integer if the input value can
produce one, else None:
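The function itself is elided in this excerpt; a purely illustrative stand-in (only the name intensive_f comes from the text, everything else is assumed):

```python
import time

def intensive_f(value):
    # Hypothetical stand-in: pretend this is expensive, and return an
    # integer if the input can produce one, else None.
    time.sleep(0.01)  # simulate the expensive work
    try:
        return int(value)
    except (TypeError, ValueError):
        return None

print(intensive_f("21"))   # 21
print(intensive_f("abc"))  # None
```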
x = 5
if intensive_f(x) is not None:
    print(intensive_f(x) / 2)
else:
    print(x, "could not be processed")
print(x)
Whilst this will work, it has the problem of calling intensive_f twice, which roughly doubles the
running time of the code. A better solution would be to get the return value of the function
beforehand.
x = 5
result = intensive_f(x)
if result is not None:
    print(result / 2)
else:
    print(x, "could not be processed")
However, a clearer and possibly more pythonic way is to use exceptions, for example:
x = 5
try:
    print(intensive_f(x) / 2)
except TypeError:  # the exception raised if None / 2 is attempted
    print(x, "could not be processed")
Here no temporary variable is needed. It may often be preferable to use an assert statement, and
to catch the AssertionError instead.
Dictionary keys
A common example of where this may be found is accessing dictionary keys. For example
compare:
bird_speeds = get_very_long_dictionary()

if "european swallow" in bird_speeds:
    speed = bird_speeds["european swallow"]
else:
    speed = input("What is the air-speed velocity of an unladen swallow?")

print(speed)
with:
bird_speeds = get_very_long_dictionary()

try:
    speed = bird_speeds["european swallow"]
except KeyError:
    speed = input("What is the air-speed velocity of an unladen swallow?")

print(speed)
The first example has to look through the dictionary twice, and as this is a long dictionary, it may
take a long time to do so each time. The second only requires one search through the dictionary,
and thus saves a lot of processor time.
An alternative to this is to use dict.get(key, default); however, many circumstances may require
more complex operations to be done in the case that the key is not present.
Chapter 138: Python concurrency
Remarks
The Python developers made sure that the API between threading and multiprocessing is similar
so that switching between the two variants is easier for programmers.
Examples
The threading module
import threading

t1 = threading.Thread(target=countdown, args=(10,))
t1.start()
t2 = threading.Thread(target=countdown, args=(20,))
t2.start()
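The countdown function is defined later in this chapter but not shown here; for a runnable version, it can be sketched as follows. The results list collects the counts instead of printing, purely so the example is self-contained (list.append is atomic in CPython, so sharing it between threads is safe):

```python
import threading

results = []  # shared between threads; list.append is thread-safe in CPython

def countdown(count):
    while count > 0:
        results.append(count)
        count -= 1

t1 = threading.Thread(target=countdown, args=(10,))
t1.start()
t2 = threading.Thread(target=countdown, args=(20,))
t2.start()

t1.join()
t2.join()

print(len(results))  # 30 -- both threads ran to completion
```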
In certain implementations of Python such as CPython, true parallelism is not achieved using
threads because of using what is known as the GIL, or Global Interpreter Lock.
import multiprocessing

def countdown(count):
    while count > 0:
        print("Count value", count)
        count -= 1
    return

if __name__ == "__main__":
    p1 = multiprocessing.Process(target=countdown, args=(10,))
    p1.start()
    p2 = multiprocessing.Process(target=countdown, args=(20,))
    p2.start()

    p1.join()
    p2.join()
Here, each function is executed in a new process. Since a new instance of Python VM is running
the code, there is no GIL and you get parallelism running on multiple cores.
The Process.start method launches this new process and run the function passed in the target
argument with the arguments args. The Process.join method waits for the end of the execution of
processes p1 and p2.
The new processes are launched differently depending on the version of Python and the platform
on which the code is running, e.g.:
After a fork in a multithreaded program, the child can safely call only async-signal-safe
functions until such time as it calls execve.
(see)
Using fork, a new process will be launched with the exact same state for all the current mutex but
only the MainThread will be launched. This is unsafe as it could lead to race conditions e.g.:
• If you use a Lock in MainThread and pass it to an other thread which is suppose to lock it at
some point. If the fork occures simultaneously, the new process will start with a locked lock
which will never be released as the second thread does not exist in this new process.
Actually, this kind of behavior should not occured in pure python as multiprocessing handles it
properly but if you are interacting with other library, this kind of behavior can occures, leading to
crash of your system (for instance with numpy/accelerated on macOS).
Because data is sensitive when shared between two threads (concurrent reads and concurrent
writes can conflict with one another, causing race conditions), a set of dedicated objects was
created to facilitate passing data back and forth between threads. Any truly
atomic operation can be used between threads, but it is always safe to stick with Queue.
import multiprocessing
import queue

my_Queue = multiprocessing.Queue()
# Creates a queue with an undefined maximum size.
# This can be dangerous: as the queue becomes increasingly large,
# it will take a long time to copy data to/from each read/write thread.
Most people will suggest that, when using a queue, you always place the queue data in a try: except:
block instead of using empty. However, for applications where it does not matter if you skip a scan
cycle (data can be placed in the queue while it is flipping states from queue.Empty==True to
queue.Empty==False), it is usually better to place read and write access in what I call an Iftry block,
because an 'if' statement is technically more performant than catching the exception.
import multiprocessing
import queue
'''Import necessary Python standard libraries: multiprocessing for classes and
queue for the queue exceptions it provides'''

def Queue_Iftry_Get(get_queue, default=None, use_default=False, func=None, use_func=False):
    '''This global method for the Iftry block is provided for its reuse and
    standard functionality; the if also saves on performance as opposed to catching
    the exception, which is expensive.
    It also allows the user to specify a function for the outgoing data to use,
    and a default value to return if the function cannot return the value from the queue'''
    if get_queue.empty():
        if use_default:
            return default
    else:
        try:
            value = get_queue.get_nowait()
        except queue.Empty:
            if use_default:
                return default
        else:
            if use_func:
                return func(value)
            else:
                return value
def Queue_Iftry_Put(put_queue, value):
    '''This global method for the Iftry block is provided because of its reuse and
    standard functionality; the if also saves on performance as opposed to catching
    the exception, which is expensive.
    Return True if placing value in the queue was successful. Otherwise, False.'''
    if put_queue.full():
        return False
    else:
        try:
            put_queue.put_nowait(value)
        except queue.Full:
            return False
        else:
            return True
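The same if-before-try pattern can be sketched on a plain queue.Queue (the size and message here are illustrative):

```python
import queue

q = queue.Queue(maxsize=1)

# Put side: check full() first, but still guard with try/except, since
# another thread could fill the queue between the check and the put.
if not q.full():
    try:
        q.put_nowait('hello')
    except queue.Full:
        pass

# Get side: same idea with empty().
received = None
if not q.empty():
    try:
        received = q.get_nowait()
    except queue.Empty:
        pass
print(received)  # hello
```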
Chapter 139: Python Data Types
Introduction
Data types describe the kind of value a variable holds and determine the space it reserves in
memory. Python variables do not need an explicit declaration to reserve memory space; the
declaration happens automatically when you assign a value to a variable.
Examples
Numbers data type
Numbers have four types in Python 2: int, float, complex, and long (in Python 3, long was merged into int).
Strings are identified as a contiguous set of characters represented within quotation marks. Python
allows either pairs of single or double quotes. Strings are an immutable sequence data type, i.e.
each time one makes any change to a string, a completely new string object is created.
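A quick sketch of that immutability:

```python
s = "hello"
t = s.upper()   # string methods return a brand-new object
print(s, t)     # hello HELLO
print(s is t)   # False: the original string is unchanged
```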
A list contains items separated by commas and enclosed within square brackets []. Lists are
similar to arrays in C; one difference is that the items belonging to a list can each be of a
different data type.
lst = [123, 'abcd', 10.2, 'd']  # can hold items of any data type, mixed or uniform
lst1 = ['hello', 'world']
print(lst)         # outputs the whole list: [123, 'abcd', 10.2, 'd']
print(lst[0:2])    # outputs the first two elements: [123, 'abcd']
print(lst1 * 2)    # repeats lst1 twice: ['hello', 'world', 'hello', 'world']
print(lst + lst1)  # concatenates both lists: [123, 'abcd', 10.2, 'd', 'hello', 'world']
Lists are enclosed in brackets [ ] and their elements and size can be changed, while tuples are
enclosed in parentheses ( ) and cannot be updated. Tuples are immutable.
tup = (123, 'hello')
tup1 = ('world',)  # note the trailing comma: ('world') without it is just a string
print(tup)         # outputs the whole tuple: (123, 'hello')
print(tup[0])      # outputs the first value: 123
print(tup + tup1)  # outputs (123, 'hello', 'world')
tup[1] = 'update'  # raises TypeError: tuples are immutable
A dictionary consists of key-value pairs. It is enclosed by curly braces {} and values can be
assigned and accessed using square brackets [].
dic = {'name': 'red', 'age': 10}
print(dic)          # outputs all the key-value pairs: {'name': 'red', 'age': 10}
print(dic['name'])  # outputs only the value with the 'name' key: 'red'
print(dic.values()) # outputs the values in dic: ['red', 10]
print(dic.keys())   # outputs the keys: ['name', 'age']
Sets are unordered collections of unique objects. There are two types of set:
1. Sets - they are mutable; new elements can be added once the set is defined
2. Frozen sets - they are immutable; new elements cannot be added after the set is defined
b = frozenset('asdfagsa')
print(b)
> frozenset({'f', 'g', 'd', 'a', 's'})
cities = frozenset(["Frankfurt", "Basel","Freiburg"])
print(cities)
> frozenset({'Frankfurt', 'Basel', 'Freiburg'})
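For contrast, a plain (mutable) set accepts new elements after creation:

```python
s = set('aabbc')  # duplicates collapse: {'a', 'b', 'c'}
s.add('d')        # allowed on a mutable set
s.discard('x')    # removing a missing element with discard is a no-op
print(sorted(s))  # ['a', 'b', 'c', 'd']
```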
Chapter 140: Python HTTP Server
Examples
Running a simple HTTP server
Python 2.x (2.3+):

$ python -m SimpleHTTPServer 9000

Python 3.x (3.0+):

$ python3 -m http.server 9000

Running this command serves the files of the current directory at port 9000.
If no port number is provided as an argument, the server will run on the default port 8000.
The -m flag will search sys.path for the corresponding .py file to run as a module.
If you want to only serve on localhost you'll need to write a custom Python program such as:
import sys
import BaseHTTPServer
from SimpleHTTPServer import SimpleHTTPRequestHandler
HandlerClass = SimpleHTTPRequestHandler
ServerClass = BaseHTTPServer.HTTPServer
Protocol = "HTTP/1.0"
if sys.argv[1:]:
port = int(sys.argv[1])
else:
port = 8000
server_address = ('127.0.0.1', port)
HandlerClass.protocol_version = Protocol
httpd = ServerClass(server_address, HandlerClass)
sa = httpd.socket.getsockname()
print "Serving HTTP on", sa[0], "port", sa[1], "..."
httpd.serve_forever()
Serving files
You can set up a web server to serve these files as follows:
Python 2.x2.3
import SimpleHTTPServer
import SocketServer
PORT = 8000
handler = SimpleHTTPServer.SimpleHTTPRequestHandler
httpd = SocketServer.TCPServer(("localhost", PORT), handler)
print "Serving files at port {}".format(PORT)
httpd.serve_forever()
Python 3.x3.0
import http.server
import socketserver
PORT = 8000
handler = http.server.SimpleHTTPRequestHandler
httpd = socketserver.TCPServer(("", PORT), handler)
print("serving at port", PORT)
httpd.serve_forever()
The SocketServer module provides the classes and functionality to set up a network server.
SocketServer's TCPServer class sets up a server using the TCP protocol. The constructor accepts a
tuple representing the address of the server (i.e. the IP address and port) and the class that
handles the server requests.
The SimpleHTTPRequestHandler class of the SimpleHTTPServer module allows the files at the current
directory to be served.
Python 2.x (2.3+):

$ python -m SimpleHTTPServer 9000

Python 3.x (3.0+):

$ python3 -m http.server 9000

The -m flag will search sys.path for the corresponding .py file to run as a module.
Firstly, Python invokes the SimpleHTTPServer module with 9000 as an argument. Now observing the
SimpleHTTPServer code,
if __name__ == '__main__':
test()
The test function is invoked, passing in the request handler and ServerClass; BaseHTTPServer.test
then runs:
def test(HandlerClass=BaseHTTPRequestHandler,
         ServerClass=HTTPServer, protocol="HTTP/1.0"):
    """Test the HTTP request handler class.

    This runs an HTTP server on port 8000 (or the first command line
    argument).
    """
    if sys.argv[1:]:
        port = int(sys.argv[1])
    else:
        port = 8000
    server_address = ('', port)

    HandlerClass.protocol_version = protocol
    httpd = ServerClass(server_address, HandlerClass)

    sa = httpd.socket.getsockname()
    print "Serving HTTP on", sa[0], "port", sa[1], "..."
    httpd.serve_forever()
Hence the port number, which the user passed as an argument, is parsed and bound to the
host address. The further basic steps of socket programming with the given port and protocol are
then carried out, and finally the socket server is initiated.
+------------+
| BaseServer |
+------------+
|
v
+-----------+ +------------------+
| TCPServer |------->| UnixStreamServer |
+-----------+ +------------------+
|
v
+-----------+ +--------------------+
| UDPServer |------->| UnixDatagramServer |
+-----------+ +--------------------+
# The class definition and _set_headers are reconstructed here (the listing was
# truncated); this handler is Python 2 (headers.getheader is the Python 2 API).
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

class HandleRequests(BaseHTTPRequestHandler):
    def _set_headers(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()

    def do_GET(self):
        self._set_headers()
        self.wfile.write("received get request")

    def do_POST(self):
        '''Reads post request body'''
        self._set_headers()
        content_len = int(self.headers.getheader('content-length', 0))
        post_body = self.rfile.read(content_len)
        self.wfile.write("received post request:<br>{}".format(post_body))

    def do_PUT(self):
        self.do_POST()

host = ''
port = 80
HTTPServer((host, port), HandleRequests).serve_forever()
Example output using curl:
$ curl http://localhost/
received get request%
Chapter 141: Python Lex-Yacc
Introduction
PLY is a pure-Python implementation of the popular compiler construction tools lex and yacc.
Remarks
Additional links:
1. Official docs
2. Github
Examples
Getting Started with PLY
To install PLY on your machine for python2/3, follow the steps outlined below:
If you completed all the above, you should now be able to use the PLY module. You can test it out
by opening a python interpreter and typing import ply.lex.
Note: Do not use pip to install PLY, it will install a broken distribution on your machine.
Let's demonstrate the power of PLY with a simple example: this program will take an arithmetic
expression as a string input, and attempt to solve it.
import ply.lex as lex
import ply.yacc as yacc

tokens = (
    'PLUS',
    'MINUS',
    'TIMES',
    'DIV',
    'LPAREN',
    'RPAREN',
    'NUMBER',
)
t_ignore = ' \t'
t_PLUS = r'\+'
t_MINUS = r'-'
t_TIMES = r'\*'
t_DIV = r'/'
t_LPAREN = r'\('
t_RPAREN = r'\)'
def t_NUMBER(t):
    r'[0-9]+'
    t.value = int(t.value)
    return t

def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

def t_error(t):
    print("Invalid Token:", t.value[0])
    t.lexer.skip(1)

lexer = lex.lex()

precedence = (
    ('left', 'PLUS', 'MINUS'),
    ('left', 'TIMES', 'DIV'),
    ('nonassoc', 'UMINUS')
)
def p_add(p):
    'expr : expr PLUS expr'
    p[0] = p[1] + p[3]

def p_sub(p):
    'expr : expr MINUS expr'
    p[0] = p[1] - p[3]

def p_expr2uminus(p):
    'expr : MINUS expr %prec UMINUS'
    p[0] = -p[2]

def p_mult_div(p):
    '''expr : expr TIMES expr
            | expr DIV expr'''
    if p[2] == '*':
        p[0] = p[1] * p[3]
    else:
        if p[3] == 0:
            print("Can't divide by 0")
            raise ZeroDivisionError('integer division by 0')
        p[0] = p[1] / p[3]

def p_expr2NUM(p):
    'expr : NUMBER'
    p[0] = p[1]
def p_parens(p):
    'expr : LPAREN expr RPAREN'
    p[0] = p[2]
def p_error(p):
    print("Syntax error in input!")

parser = yacc.yacc()
Output:
-8
There are two steps that the code from example 1 carried out: one was tokenizing the input, which
means it looked for symbols that constitute the arithmetic expression, and the second step was
parsing, which involves analysing the extracted tokens and evaluating the result.
This section provides a simple example of how to tokenize user input, and then breaks it down line
by line.
def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

# Tokenize
lexer.input(data)  # data is the input string to tokenize (assumed defined above)
while True:
    tok = lexer.token()
    if not tok:
        break  # No more input
    print(tok)
Save this file as calclex.py. We'll be using this when building our Yacc parser.
Breakdown
1. Import the module using import ply.lex
2. All lexers must provide a list called tokens that defines all of the possible token names that
can be produced by the lexer. This list is always required.
tokens = [
    'NUMBER',
    'PLUS',
    'MINUS',
    'TIMES',
    'DIVIDE',
    'LPAREN',
    'RPAREN',
]
tokens could also be a tuple of strings (rather than a list), where each string denotes a token as
before.
3. The regex rule for each string may be defined either as a string or as a function. In either
case, the variable name should be prefixed by t_ to denote it is a rule for matching tokens.
• For simple tokens, the regular expression can be specified as strings: t_PLUS = r'\+'
def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t
Note, the rule is specified as a docstring within the function. The function accepts one
argument, which is an instance of LexToken, performs some action and then returns the
token.
If you want to use an external string as the regex rule for the function instead of
specifying a doc string, consider the following example:
• An instance of LexToken object (let's call this object t) has the following attributes:
1. t.type which is the token type (as a string) (eg: 'NUMBER', 'PLUS', etc). By default,
t.type is set to the name following the t_ prefix.
2. t.value which is the lexeme (the actual text matched)
3. t.lineno which is the current line number (this is not automatically updated, as
the lexer knows nothing of line numbers). Update lineno using a function called
t_newline.
def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)
4. t.lexpos which is the position of the token relative to the beginning of the input
text.
• If nothing is returned from a regex rule function, the token is discarded. If you want to
discard a token, you can alternatively add t_ignore_ prefix to a regex rule variable
instead of defining a function for the same rule.
def t_COMMENT(t):
    r'\#.*'
    pass
    # No return value. Token discarded
t_ignore_COMMENT = r'\#.*'
This is of course invalid if you're carrying out some action when you see a comment. In which case, use
a function to define the regex rule.
If you haven't defined a token for some characters but still want to ignore it, use
t_ignore = "<characters to ignore>"
(these prefixes are necessary):
t_ignore_COMMENT = r'\#.*'
t_ignore = ' \t' # ignores spaces and tabs
• When building the master regex, lex will add the regexes specified in the file as follows:
1. Tokens defined by functions are added in the same order as they appear in the
file.
2. Tokens defined by strings are added in decreasing order of the string length of
the string defining the regex for that token.
If you are matching == and = in the same file, take advantage of these rules.
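The effect of that ordering can be sketched with plain regular expressions (PLY builds its master regex the same way; the token names here are illustrative):

```python
import re

# String-defined tokens are added longest-first, so '==' is tried before '='
rules = {'EQEQ': r'==', 'EQ': r'='}
ordered = sorted(rules.items(), key=lambda kv: len(kv[1]), reverse=True)
master = re.compile('|'.join('(?P<%s>%s)' % kv for kv in ordered))

m = master.match('==')
print(m.lastgroup)  # EQEQ, not EQ
```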
• Literals are tokens that are returned as they are. Both t.type and t.value will be set to
the character itself. Define a list of literals as such:
literals = ['+', '-', '*', '/']

or,

literals = "+-*/"
It is possible to write token functions that perform additional actions when literals are
matched. However, you'll need to set the token type appropriately. For example:
def t_lbrace(t):
    r'\{'
    t.type = '{'  # Set token type to the expected literal (ABSOLUTE MUST if this is a literal)
    return t
4. Final preparations: build the lexer using lexer = lex.lex().
You can also put everything inside a class and call use instance of the class to define the
lexer. Eg:
import ply.lex as lex

class MyLexer(object):
    ...              # everything relating to token rules and error handling comes here as usual

m = MyLexer()
m.build()            # Build the lexer
m.test("3 + 4")      # Test it out
To get the tokens, use lexer.token() which returns tokens matched. You can iterate over
lexer in a loop as in:
for i in lexer:
print(i)
This section explains how the tokenized input from Part 1 is processed - it is done using Context
Free Grammars (CFGs). The grammar must be specified, and the tokens are processed according
to the grammar. Under the hood, the parser uses an LALR parser.
# Yacc example

import ply.yacc as yacc

# Get the token map from the lexer (calclex.py from the lexer section)
from calclex import tokens

def p_expression_plus(p):
    'expression : expression PLUS term'
    p[0] = p[1] + p[3]

def p_expression_minus(p):
    'expression : expression MINUS term'
    p[0] = p[1] - p[3]

def p_expression_term(p):
    'expression : term'
    p[0] = p[1]

def p_term_times(p):
    'term : term TIMES factor'
    p[0] = p[1] * p[3]
def p_term_div(p):
    'term : term DIVIDE factor'
    p[0] = p[1] / p[3]

def p_term_factor(p):
    'term : factor'
    p[0] = p[1]

def p_factor_num(p):
    'factor : NUMBER'
    p[0] = p[1]

def p_factor_expr(p):
    'factor : LPAREN expression RPAREN'
    p[0] = p[2]
def p_error(p):
    print("Syntax error in input!")

# Build the parser
parser = yacc.yacc()

while True:
    try:
        s = raw_input('calc > ')
    except EOFError:
        break
    if not s:
        continue
    result = parser.parse(s)
    print(result)
Breakdown
• Each grammar rule is defined by a function where the docstring to that function contains the
appropriate context-free grammar specification. The statements that make up the function
body implement the semantic actions of the rule. Each function accepts a single argument p
that is a sequence containing the values of each grammar symbol in the corresponding rule.
The values of p[i] are mapped to grammar symbols as shown here:
def p_expression_plus(p):
'expression : expression PLUS term'
# ^ ^ ^ ^
# p[0] p[1] p[2] p[3]
• For tokens, the "value" of the corresponding p[i] is the same as the p.value attribute
assigned in the lexer module. So, PLUS will have the value +.
• For non-terminals, the value is determined by whatever is placed in p[0]. If nothing is placed,
the value is None. Also, p[-1] is not the same as p[3], since p is not a simple list (p[-1] can
specify embedded actions (not discussed here)).
Note that the function can have any name, as long as it is preceded by p_.
• The p_error(p) rule is defined to catch syntax errors (same as yyerror in yacc/bison).
• Multiple grammar rules can be combined into a single function, which is a good idea if
productions have a similar structure.
def p_binary_operators(p):
    '''expression : expression PLUS term
                  | expression MINUS term
       term : term TIMES factor
            | term DIVIDE factor'''
    if p[2] == '+':
        p[0] = p[1] + p[3]
    elif p[2] == '-':
        p[0] = p[1] - p[3]
    elif p[2] == '*':
        p[0] = p[1] * p[3]
    elif p[2] == '/':
        p[0] = p[1] / p[3]
If literals are used instead of named tokens, the same function can be written as:
def p_binary_operators(p):
    '''expression : expression '+' term
                  | expression '-' term
       term : term '*' factor
            | term '/' factor'''
    if p[2] == '+':
        p[0] = p[1] + p[3]
    elif p[2] == '-':
        p[0] = p[1] - p[3]
    elif p[2] == '*':
        p[0] = p[1] * p[3]
    elif p[2] == '/':
        p[0] = p[1] / p[3]
• To explicitly set the start symbol, use start = 'foo', where foo is some non-terminal.
• Setting precedence and associativity can be done using the precedence variable.
precedence = (
    ('nonassoc', 'LESSTHAN', 'GREATERTHAN'),  # Nonassociative operators
    ('left', 'PLUS', 'MINUS'),
    ('left', 'TIMES', 'DIVIDE'),
    ('right', 'UMINUS'),                      # Unary minus operator
)
Tokens are ordered from lowest to highest precedence. nonassoc means that those tokens do
not associate. This means that something like a < b < c is illegal whereas a < b is still legal.
• parser.out is a debugging file that is created when the yacc program is executed for the first
time. Whenever a shift/reduce conflict occurs, the parser always shifts.
Chapter 142: Python Networking
Remarks
(Very) basic Python client socket example
Examples
The simplest Python socket client-server example
Server side:
import socket

# The socket setup was missing here; this is a reconstruction (host/port assumed)
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(('localhost', 8089))
serversocket.listen(1)

while True:
    connection, address = serversocket.accept()
    buf = connection.recv(64)
    if len(buf) > 0:
        print(buf)
        break
Client Side:
import socket

# The client body was missing here; a minimal reconstruction
# (host/port must match the server above)
clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
clientsocket.connect(('localhost', 8089))
clientsocket.send(b'hello')
First run SocketServer.py and make sure the server is ready to listen/receive. Then the
client sends info to the server; after the server receives something, it terminates.
To share files or to host simple websites(http and javascript) in your local network, you can use
Python's builtin SimpleHTTPServer module. Python should be in your Path variable. Go to the
folder where your files are and type:
For Python 2:

$ python -m SimpleHTTPServer <portnumber>

For Python 3:

$ python3 -m http.server <portnumber>
If no port number is given, 8000 is the default port, and the output will be:

Serving HTTP on 0.0.0.0 port 8000 ...
You can access to your files through any device connected to the local network by typing
http://hostipaddress:8000/.
You can create a TCP server using the socketserver library. Here's a simple echo server.
Server side
from socketserver import BaseRequestHandler, TCPServer

class EchoHandler(BaseRequestHandler):
    def handle(self):
        print('connection from:', self.client_address)
        while True:
            msg = self.request.recv(8192)
            if not msg:
                break
            self.request.send(msg)

if __name__ == '__main__':
    server = TCPServer(('', 5000), EchoHandler)
    server.serve_forever()
Client side
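The client listing is missing from this copy; a minimal self-contained sketch follows (the echo server from above is run on a free port in a background thread so the round trip can be seen end to end):

```python
import threading
from socket import socket, AF_INET, SOCK_STREAM
from socketserver import BaseRequestHandler, TCPServer

class EchoHandler(BaseRequestHandler):
    def handle(self):
        while True:
            msg = self.request.recv(8192)
            if not msg:
                break
            self.request.send(msg)

# Start the echo server on a free port (port 0 asks the OS to pick one)
server = TCPServer(('localhost', 0), EchoHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: connect, send a message, and read the echo back
s = socket(AF_INET, SOCK_STREAM)
s.connect(('localhost', port))
s.send(b'Hello')
echoed = s.recv(8192)
s.close()
server.shutdown()
print(echoed)  # b'Hello'
```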
socketserver makes it relatively easy to create simple TCP servers. However, you should be aware
that, by default, the servers are single threaded and can only serve one client at a time. If you
want to handle multiple clients, instantiate a ThreadingTCPServer (or ForkingTCPServer) instead.
Creating a UDP Server
import time
from socketserver import BaseRequestHandler, UDPServer

class CtimeHandler(BaseRequestHandler):
    def handle(self):
        print('connection from: ', self.client_address)
        # Get message and client socket
        msg, sock = self.request
        resp = time.ctime()
        sock.sendto(resp.encode('ascii'), self.client_address)

if __name__ == '__main__':
    server = UDPServer(('', 5000), CtimeHandler)
    server.serve_forever()
Testing:
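The testing snippet is missing here; a self-contained sketch that exercises the time server above with a UDP client (a free port is picked automatically so nothing clashes):

```python
import threading
import time
from socket import socket, AF_INET, SOCK_DGRAM
from socketserver import BaseRequestHandler, UDPServer

class CtimeHandler(BaseRequestHandler):
    def handle(self):
        msg, sock = self.request  # for UDP, request is (payload, socket)
        sock.sendto(time.ctime().encode('ascii'), self.client_address)

# Run the server from the example above on a free port, in the background
server = UDPServer(('localhost', 0), CtimeHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# UDP client: any datagram triggers a reply containing the server's time
c = socket(AF_INET, SOCK_DGRAM)
c.settimeout(5)
c.sendto(b'time?', ('localhost', port))
resp = c.recvfrom(8192)[0]
c.close()
server.shutdown()
print(resp)
```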
Read Python Networking online: https://riptutorial.com/python/topic/1309/python-networking
Chapter 143: Python Persistence
Syntax
• pickle.dump(obj, file, protocol=None, *, fix_imports=True)
Parameters
Parameter   Details
protocol    an integer; tells the pickler to use the given protocol (0 - ASCII, 1 - old binary format)
file        the file argument must have a write() method (open with wb for dump) and a read() method (open with rb for load)
Examples
Python Persistence
Objects like numbers, lists, dictionaries, nested structures and class instance objects live in your
computer's memory and are lost as soon as the script ends.
The pickled representation of an object is always a bytes object, so one must open files in
wb mode to store data and rb mode to load data with pickle.
data={'a':'some_value',
'b':[9,4,7],
'c':['some_str','another_str','spam','ham'],
'd':{'key':'nested_dictionary'},
}
Store data
import pickle

file = open('filename', 'wb')  # file object in binary write mode
pickle.dump(data, file)        # dump the data into the file object
file.close()                   # close the file to flush the data to disk
Load data
import pickle

file = open('filename', 'rb')  # file object in binary read mode
data = pickle.load(file)       # load the data back
file.close()

>>> data
{'b': [9, 4, 7], 'a': 'some_value', 'd': {'key': 'nested_dictionary'},
 'c': ['some_str', 'another_str', 'spam', 'ham']}
import pickle

def save(filename, obj):
    file = open(filename, 'wb')
    pickle.dump(obj, file)
    file.close()

def load(filename):
    file = open(filename, 'rb')
    obj = pickle.load(file)
    file.close()
    return obj
>>> list_object = [1, 1, 2, 3, 5, 8, 'a', 'e', 'i', 'o', 'u']
>>> save('list_file', list_object)
>>> new_list = load('list_file')
>>> new_list
[1, 1, 2, 3, 5, 8, 'a', 'e', 'i', 'o', 'u']
Chapter 144: Python Requests Post
Introduction
Documentation for the Python Requests module in the context of the HTTP POST method and its
corresponding Requests function
Examples
Simple Post
Will perform a simple HTTP POST operation, e.g. foo = requests.post(url, data={'key': 'value'}).
Posted data can be in most formats; however, key-value pairs are the most prevalent.
Headers
print(foo.headers)
An example response:
headers = {'Cache-Control':'max-age=0',
'Upgrade-Insecure-Requests':'1',
'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML,
like Gecko) Chrome/54.0.2840.99 Safari/537.36',
'Content-Type':'application/x-www-form-urlencoded',
'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Referer':'https://www.groupon.com/signup',
'Accept-Encoding':'gzip, deflate, br',
'Accept-Language':'es-ES,es;q=0.8'
}
Encoding
print(foo.encoding)
'utf-8'
foo.encoding = 'ISO-8859-1'
SSL Verification
Redirection
Any redirection will be followed (e.g. http to https); this can also be changed by passing allow_redirects=False:
If the post operation has been redirected, this value can be accessed:
print(foo.url)
print(foo.history)
To pass form-encoded data with the post operation, data must be structured as a dictionary and
supplied as the data parameter.
If you do not want the data to be form-encoded, simply pass a string or integer to the data
parameter.
Supply the dictionary to the json parameter for Requests to format the data automatically:
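The call itself is omitted above; the serialization half of what json= does can be sketched offline with the standard json module (the url in the comment would be whatever endpoint you target):

```python
import json

payload = {'key': 'value'}
body = json.dumps(payload)  # what requests would send as the request body
print(body)                 # {"key": "value"}
# requests.post(url, json=payload) sends this body and sets the
# Content-Type: application/json header automatically.
```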
File Upload
With the Requests module, it is only necessary to provide a file handle, as opposed to the contents
retrieved with .read():
Strings can also be sent as a file, as long as they are supplied as the files parameter.
Multiple Files
Multiple files can be supplied in much the same way as one file:
multiple_files = [
('images', ('foo.png', open('foo.png', 'rb'), 'image/png')),
('images', ('bar.png', open('bar.png', 'rb'), 'image/png'))]
Responses
Returned Data
Raw Responses
In the instances where you need to access the underlying urllib3 response.HTTPResponse object,
this can be done by the following:
foo = post('http://httpbin.org/post', data={'data' : 'value'})
res = foo.raw
print(res.read())
Authentication
HTTP Digest Authentication is done in a very similar way; Requests provides a different object for
this:
Custom Authentication
In some cases the built in authentication mechanisms may not be enough, imagine this example:
A server is configured to accept authentication if the sender has the correct user-agent string, a
certain header value and supplies the correct credentials through HTTP Basic Authentication. To
achieve this a custom authentication class should be prepared, subclassing AuthBase, which is
the base for Requests authentication implementations:
from requests.auth import AuthBase

class CustomAuth(AuthBase):
    # __init__ and __call__ are reconstructed here; the original listing was
    # truncated, so the exact header/user-agent handling is an assumption
    def __init__(self, username, password):
        self.username = username
        self.password = password

    def __call__(self, r):
        # modify the outgoing request r (headers, credentials) before it is sent
        return r
Proxies
HTTP/S Proxies
proxies = {
'http': 'http://192.168.0.128:3128',
'https': 'http://192.168.0.127:1080',
}
SOCKS Proxies
The use of SOCKS proxies requires the 3rd party dependency requests[socks]; once installed, SOCKS
proxies are used in a very similar way to HTTP/S proxies:
proxies = {
'http': 'socks5://user:pass@host:port',
'https': 'socks5://user:pass@host:port'
}
Chapter 145: Python Serial Communication
(pyserial)
Syntax
• ser.read(size=1)
• ser.readline()
• ser.write()
Parameters
parameter   details
baudrate    type: int, default: 9600; standard values: 50, 75, 110, 134, 150, 200, 300, 600, 1200, 1800, 2400, 4800, 9600, 19200, 38400, 57600, 115200
Remarks
For more details check out pyserial documentation
Examples
Initialize serial device
import serial
#Serial takes these two parameters: serial device and baudrate
ser = serial.Serial('/dev/ttyUSB0', 9600)
Read from serial port
import serial
#Serial takes two parameters: serial device and baudrate
ser = serial.Serial('/dev/ttyUSB0', 9600)
data = ser.read()
To read a given number of bytes from the serial device:

data = ser.read(size=5)

To read one line from the serial device:

data = ser.readline()

To read the data from the serial device while something is being written to it:
# for python2.7
data = ser.read(ser.inWaiting())
# for python3
data = ser.read(ser.in_waiting)  # in_waiting is a property in pySerial 3.x (inWaiting() in older versions)
python -m serial.tools.list_ports
at a command prompt, or use serial.tools.list_ports.comports() from within Python.
Chapter 146: Python Server Sent Events
Introduction
Server Sent Events (SSE) is a unidirectional connection between a server and a client (usually a
web browser) that allows the server to "push" information to the client. It is much like websockets
and long polling. The main difference between SSE and websockets is that SSE is unidirectional:
only the server can send info to the client, whereas with websockets, both can send info to
each other. SSE is typically considered to be much simpler to use/implement than websockets.
Examples
Flask SSE
# Reconstructed sketch: the imports and Response wrapper were missing here.
# message_to_send is assumed to be supplied elsewhere (e.g. from a queue).
from flask import Flask, Response

app = Flask(__name__)

@app.route("/stream")
def stream():
    def event_stream():
        while True:
            if message_to_send:
                yield "data: {}\n\n".format(message_to_send)
    return Response(event_stream(), mimetype="text/event-stream")
Asyncio SSE
import asyncio
import sse

class Handler(sse.Handler):
    @asyncio.coroutine
    def handle_request(self):
        yield from asyncio.sleep(2)
        self.send('foo')
        yield from asyncio.sleep(2)
        self.send('bar', event='wakeup')
Chapter 147: Python speed of program
Examples
Notation
Basic Idea
The notation used when describing the speed of your Python program is called Big-O notation.
Let's say you have a function:
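The function itself is missing from this copy; a minimal sketch consistent with the description follows (the name and signature are assumed):

```python
def contains(to_search, item):
    # One comparison per element: the cost grows linearly with len(to_search)
    for element in to_search:
        if element == item:
            return True
    return False

print(contains([1, 2, 3], 2))  # True
print(contains([1, 2, 3], 9))  # False
```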
This is a simple function to check if an item is in a list. To describe the complexity of this function,
you will say O(n). This means "Order of n" as the O function is known as the Order function.
O(k) - generally k is the value of the parameter or the number of elements in the parameter
List operations
Append : O(1)
Copy : O(n)
Insert : O(n)
Iteration : O(n)
Extend : O(k)
Sort : O(n log n)
Multiply : O(nk)
x in s : O(n)
Deque operations
class Deque:
    def __init__(self):
        self.items = []

    def isEmpty(self):
        return self.items == []

    def removeFront(self):
        return self.items.pop()

    def removeRear(self):
        return self.items.pop(0)

    def size(self):
        return len(self.items)
Append : O(1)
Appendleft : O(1)
Copy : O(n)
Extend : O(k)
Extendleft : O(k)
Pop : O(1)
Popleft : O(1)
Remove : O(n)
Rotate : O(k)
Set operations
x in s : O(1)
Difference s - t : O(len(s))
s.symmetric_difference_update(t) : O(len(t))
Algorithmic Notations...
There are certain principles that apply to optimization in any computer language, and Python is no
exception. Don't optimize as you go: Write your program without regard to possible
optimizations, concentrating instead on making sure that the code is clean, correct, and
understandable. If it's too big or too slow when you've finished, then you can consider optimizing it.
Remember the 80/20 rule: In many fields you can get 80% of the result with 20% of the effort
(also called the 90/10 rule - it depends on who you talk to). Whenever you're about to optimize
code, use profiling to find out where that 80% of execution time is going, so you know where to
concentrate your effort.
Always run "before" and "after" benchmarks: How else will you know that your optimizations
actually made a difference? If your optimized code turns out to be only slightly faster or smaller
than the original version, undo your changes and go back to the original, clear code.
Use the right algorithms and data structures: Don't use an O(n^2) bubble sort algorithm to sort a
thousand elements when there's an O(n log n) quicksort available. Similarly, don't store a
thousand items in an array that requires an O(n) search when you could use an O(log n) binary
tree, or an O(1) Python hash table.
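For instance, the O(log n) search mentioned above is available on any sorted list via the standard bisect module:

```python
import bisect

def bin_contains(sorted_lst, x):
    # Binary search: O(log n) instead of the O(n) scan of `x in lst`
    i = bisect.bisect_left(sorted_lst, x)
    return i < len(sorted_lst) and sorted_lst[i] == x

print(bin_contains([1, 3, 5, 7], 5))  # True
print(bin_contains([1, 3, 5, 7], 4))  # False
```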
The following 3 asymptotic notations are mostly used to represent time complexity of algorithms.
1. Θ Notation: The theta notation bounds a function from above and below, so it defines
exact asymptotic behavior. A simple way to get the Theta notation of an expression is to drop
low-order terms and ignore leading constants. For example, consider the following
expression: 3n^3 + 6n^2 + 6000 = Θ(n^3). Dropping lower-order terms is always fine because
there will always be an n0 after which Θ(n^3) has higher values than Θ(n^2), irrespective of the
constants involved. For a given function g(n), we denote by Θ(g(n)) the following set of functions:
Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that 0 <= c1*g(n) <= f(n) <=
c2*g(n) for all n >= n0}. The above definition means: if f(n) is theta of g(n), then the value f(n)
is always between c1*g(n) and c2*g(n) for large values of n (n >= n0). The definition of theta
also requires that f(n) must be non-negative for values of n greater than n0.
2. Big O Notation: The Big O notation defines an upper bound of an algorithm; it bounds a
function only from above. For example, consider the case of insertion sort: it takes linear
time in the best case and quadratic time in the worst case. We can safely say that the time
complexity of insertion sort is O(n^2). Note that O(n^2) also covers linear time. If we used Θ
notation to represent the time complexity of insertion sort, we would have to use two statements for the
best and worst cases.
The Big O notation is useful when we only have an upper bound on the time complexity of an algorithm.
Many times we can easily find an upper bound simply by looking at the algorithm.
O(g(n)) = {f(n): there exist positive constants c and n0 such that 0 <= f(n) <= c*g(n) for all n >= n0}
Chapter 148: Python Virtual Environment -
virtualenv
Introduction
A Virtual Environment ("virtualenv") is a tool to create isolated Python environments. It keeps the
dependencies required by different projects in separate places, by creating virtual Python env for
them. It solves the “project A depends on version 2.xxx but project B needs version 3.xxx” dilemma, and
keeps your global site-packages directory clean and manageable.
"virtualenv" creates a folder which contains all the necessary libs and bins to use the packages
that a Python project would need.
Examples
Installation

$ pip install virtualenv

OR, on Debian/Ubuntu:

$ sudo apt-get install python-virtualenv
Usage
$ cd test_proj
$ virtualenv test_proj
$ source test_proj/bin/activate
$ deactivate
Install a package in your Virtualenv
If you look at the bin directory in your virtualenv, you’ll see easy_install which has been modified to
put eggs and packages in the virtualenv’s site-packages directory. To install an app in your virtual
environment:
$ source test_proj/bin/activate
$ pip install flask
At this time, you don't have to use sudo, since the files will all be installed in the local virtualenv
site-packages directory, which was created under your own user account.
cdvirtualenv : Navigate into the directory of the currently activated virtual environment, so you
can browse its site-packages, for example.
Chapter 149: Queue Module
Introduction
The Queue module implements multi-producer, multi-consumer queues. It is especially useful in
threaded programming when information must be exchanged safely between multiple threads.
There are three types of queues provided by the queue module:
1. Queue
2. LifoQueue
3. PriorityQueue
Exceptions which may be raised:
1. Full (queue overflow)
2. Empty (queue underflow)
Examples
Simple example
from queue import Queue # Python 3; in Python 2 use: from Queue import Queue

question_queue = Queue()

for x in range(1, 10):
    temp_tuple = ('key', x)
    question_queue.put(temp_tuple)

while not question_queue.empty():
    item = question_queue.get()
    print(str(item))
Output:
('key', 1)
('key', 2)
('key', 3)
('key', 4)
('key', 5)
('key', 6)
('key', 7)
('key', 8)
('key', 9)
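The example above only uses Queue; here is a brief sketch of the other two queue types mentioned in the introduction (the values are chosen for illustration):

```python
from queue import LifoQueue, PriorityQueue

# LifoQueue: last in, first out (a thread-safe stack)
lifo = LifoQueue()
for x in (1, 2, 3):
    lifo.put(x)
print([lifo.get() for _ in range(3)])
# [3, 2, 1]

# PriorityQueue: the lowest item is always retrieved first
pq = PriorityQueue()
for x in (3, 1, 2):
    pq.put(x)
print([pq.get() for _ in range(3)])
# [1, 2, 3]
```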
Chapter 150: Raise Custom Errors /
Exceptions
Introduction
Python has many built-in exceptions which force your program to output an error when something
in it goes wrong.
However, sometimes you may need to create custom exceptions that serve your purpose.
In Python, users can define such exceptions by creating a new class. This exception class has to
be derived, either directly or indirectly, from Exception class. Most of the built-in exceptions are
also derived from this class.
Examples
Custom Exception
Here, we have created a user-defined exception called CustomError which is derived from the
Exception class. This new exception can be raised, like other exceptions, using the raise
statement with an optional error message.
class CustomError(Exception):
    pass

x = 1
if x == 1:
    raise CustomError('This is custom error')

Output:

Traceback (most recent call last):
  ...
CustomError: This is custom error
class CustomError(Exception):
    pass

try:
    raise CustomError('Can you catch me ?')
except CustomError as e:
    print('Caught CustomError: {}'.format(e))
except Exception as e:
    print('Generic exception: {}'.format(e))

Output:

Caught CustomError: Can you catch me ?
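Custom exceptions can also carry extra data for the handler. A sketch (the class name ValidationError and its field attribute are made up for illustration):

```python
class ValidationError(Exception):
    def __init__(self, message, field):
        super().__init__(message)
        self.field = field  # extra context available to the handler

try:
    raise ValidationError('value out of range', field='age')
except ValidationError as e:
    print('{} (field: {})'.format(e, e.field))
# Out: value out of range (field: age)
```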
Chapter 151: Random module
Syntax
• random.seed(a=None, version=2) (version is only available for python 3.x)
• random.getstate()
• random.setstate(state)
• random.randint(a, b)
• random.randrange(stop)
• random.randrange(start, stop, step=1)
• random.choice(seq)
• random.shuffle(x, random=random.random)
• random.sample(population, k)
Examples
Random and sequences: shuffle, choice and sample
import random
shuffle()
You can use random.shuffle() to mix up/randomize the items in a mutable and indexable
sequence. For example a list:

laughs = ["Hi", "Ho", "He"]

random.shuffle(laughs)    # Shuffle the list in place

print(laughs)
# Out: ['He', 'Hi', 'Ho'] # Output may vary!
choice()
Takes a random element from an arbitrary sequence:
print(random.choice(laughs))
# Out: He # Output may vary!
sample()
Like choice it takes random elements from an arbitrary sequence, but you can specify how many:
# |--sequence--|--number--|
print(random.sample( laughs , 1 )) # Take one element
# Out: ['Ho'] # Output may vary!
Creating random integers and floats: randint, randrange, random, and uniform
import random
randint()
Returns a random integer between x and y (inclusive):
random.randint(x, y)
random.randint(1, 8) # Out: 8
randrange()
random.randrange has the same syntax as range and unlike random.randint, the last value is not
inclusive:
random
Returns a random floating point number between 0 and 1 (the upper bound 1 is excluded, i.e. the interval is [0.0, 1.0)):
uniform
Returns a random floating point number between x and y (inclusive):
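The corresponding calls are not shown in this excerpt; they would look like this (outputs vary between runs):

```python
import random

print(random.random())       # e.g. 0.7214651340155699 -- always in [0.0, 1.0)
print(random.uniform(1, 8))  # e.g. 3.726062641730108  -- in [1.0, 8.0]
```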
random.seed(5) # Create a fixed state
print(random.randrange(0, 10)) # Get a random integer between 0 and 9
# Out: 9
print(random.randrange(0, 10))
# Out: 4
Resetting the seed will create the same "random" sequence again:
Since the seed is fixed these results are always 9 and 4. If having specific numbers is not required
only that the values will be the same one can also just use getstate and setstate to recover to a
previous state:
random.seed(None)
random.seed()
By default the Python random module use the Mersenne Twister PRNG to generate random
numbers, which, although suitable in domains like simulations, fails to meet security requirements
in more demanding environments.
In order to create a cryptographically secure pseudorandom number, one can use SystemRandom
which, by using os.urandom, is able to act as a cryptographically secure pseudorandom number
generator (CSPRNG).
The easiest way to use it simply involves initializing the SystemRandom class. The methods provided
are similar to the ones exported by the random module.
from random import SystemRandom
secure_rand_gen = SystemRandom()
For example, to get a random integer in the range [0, 20], one can simply call randint():
print(secure_rand_gen.randint(0, 20))
# 5
and, accordingly for all other methods. The interface is exactly the same, the only change is the
underlying number generator.
You can also use os.urandom directly to obtain cryptographically secure random bytes.
In order to create a random user password we can use the symbols provided in the string module.
Specifically punctuation for punctuation symbols, ascii_letters for letters and digits for digits:
After this, we can use random.SystemRandom to generate a password. For a password of length 10:

import random
import string

symbols = string.ascii_letters + string.digits + string.punctuation

secure_random = random.SystemRandom()
password = "".join(secure_random.choice(symbols) for i in range(10))
print(password) # '^@g;J?]M6e'
Note that other routines made immediately available by the random module — such as
random.choice, random.randint, etc. — are unsuitable for cryptographic purposes.
Behind the curtains, these routines use the Mersenne Twister PRNG, which does not satisfy the
requirements of a CSPRNG. Thus, in particular, you should not use any of them to generate
passwords you plan to use. Always use an instance of SystemRandom as shown above.
Python 3.x3.6
Starting from Python 3.6, the secrets module is available, which exposes cryptographically safe
functionality.
Quoting the official documentation, to generate "a ten-character alphanumeric password with at
least one lowercase character, at least one uppercase character, and at least three digits," you
could:
import string
from secrets import choice

alphabet = string.ascii_letters + string.digits
while True:
    password = ''.join(choice(alphabet) for i in range(10))
    if (any(c.islower() for c in password)
            and any(c.isupper() for c in password)
            and sum(c.isdigit() for c in password) >= 3):
        break
import random
probability = 0.3
Chapter 152: Reading and Writing CSV
Examples
Writing a TSV file
Python
import csv
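The writer code itself is missing from this excerpt; a sketch that would produce the output file shown below (the row data is taken from that output) might be:

```python
import csv

# Rows matching the output file shown below
rows = [['name', 'field'],
        ['Dijkstra', 'Computer Science'],
        ['Shelah', 'Math'],
        ['Aumann', 'Economic Sciences']]

with open('/tmp/output.tsv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter='\t')  # a tab delimiter makes this a TSV
    writer.writerows(rows)
```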
Output file
$ cat /tmp/output.tsv
name field
Dijkstra Computer Science
Shelah Math
Aumann Economic Sciences
Using pandas
import pandas as pd
df = pd.read_csv("data.csv")
d = df.to_dict()
Chapter 153: Recursion
Remarks
Recursion needs a stop condition (a base case) in order to exit the recursion.
The original variable must be passed on to the recursive function so it becomes stored.
Examples
Sum of numbers from 1 to n
If I wanted to find out the sum of numbers from 1 to n where n is a natural number, I can do 1 + 2 +
3 + 4 + ... + (several hours later) + n. Alternatively, I could write a for loop:

n = 5
total = 0
for i in range(1, n + 1):
    total += i

Or I could use recursion:

def recursion(n):
    if n == 1:
        return 1
    return n + recursion(n - 1)
Recursion has advantages over the above two methods. Recursion takes less time than writing
out 1 + 2 + 3 for a sum from 1 to 3. For recursion(4), recursion works backwards
( 4 -> 4 + 3 -> 4 + 3 + 2 -> 4 + 3 + 2 + 1 -> 10 ), whereas the for loop works strictly forwards
( 1 -> 1 + 2 -> 1 + 2 + 3 -> 1 + 2 + 3 + 4 -> 10 ).
Sometimes the recursive solution is simpler than the iterative solution. This is evident when
implementing a reversal of a linked list.
Recursion occurs when a function call causes that same function to be called again before the
original function call terminates. For example, consider the well-known mathematical expression x!
(i.e. the factorial operation). The factorial operation is defined for all nonnegative integers as
follows:
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)
Recursion functions can be difficult to grasp sometimes, so let's walk through this step-by-step.
Consider the expression factorial(3). This and all function calls create a new environment. An
environment is basically just a table that maps identifiers (e.g. n, factorial, print, etc.) to their
corresponding values. At any point in time, you can access the current environment using locals()
. In the first function call, the only local variable that gets defined is n = 3. Therefore, printing
locals() would show {'n': 3}. Since n == 3, the return value becomes n * factorial(n - 1).
At this next step is where things might get a little confusing. Looking at our new expression, we
already know what n is. However, we don't yet know what factorial(n - 1) is. First, n - 1
evaluates to 2. Then, 2 is passed to factorial as the value for n. Since this is a new function call, a
second environment is created to store this new n. Let A be the first environment and B be the
second environment. A still exists and equals {'n': 3}, however, B (which equals {'n': 2}) is the
current environment. Looking at the function body, the return value is, again, n * factorial(n - 1).
Without evaluating this expression, let's substitute it into the original return expression. By doing
this, we're mentally discarding B, so remember to substitute n accordingly (i.e. references to B's n
are replaced with n - 1 which uses A's n). Now, the original return expression becomes n * ((n -
1) * factorial((n - 1) - 1)). Take a second to ensure that you understand why this is so.
Now, let's evaluate the factorial((n - 1) - 1)) portion of that. Since A's n == 3, we're passing 1
into factorial. Therefore, we are creating a new environment C which equals {'n': 1}. Again, the
return value is n * factorial(n - 1). So let's replace factorial((n - 1) - 1)) of the “original” return
expression similarly to how we adjusted the original return expression earlier. The “original”
expression is now n * ((n - 1) * ((n - 2) * factorial((n - 2) - 1))).
Almost done. Now, we need to evaluate factorial((n - 2) - 1). This time, we're passing in 0.
Therefore, this evaluates to 1. Now, let's perform our last substitution. The “original” return
expression is now n * ((n - 1) * ((n - 2) * 1)). Recalling that the original return expression is
evaluated under A, the expression becomes 3 * ((3 - 1) * ((3 - 2) * 1)). This, of course,
evaluates to 6. To confirm that this is the correct answer, recall that 3! == 3 * 2 * 1 == 6. Before
reading any further, be sure that you fully understand the concept of environments and how they
apply to recursion.
The statement if n == 0: return 1 is called a base case. This is because it exhibits no recursion.
A base case is absolutely required. Without one, you'll run into infinite recursion. With that said, as
long as you have at least one base case, you can have as many cases as you want. For example,
we could have equivalently written factorial as follows:
def factorial(n):
    if n == 0:
        return 1
    elif n == 1:
        return 1
    else:
        return n * factorial(n - 1)
You may also have multiple recursion cases, but we won't get into that since it's relatively
uncommon and is often difficult to mentally process.
You can also have “parallel” recursive function calls. For example, consider the Fibonacci
sequence which is defined as follows:
def fib(n):
    if n == 0 or n == 1:
        return n
    else:
        return fib(n - 2) + fib(n - 1)
I won't walk through this function as thoroughly as I did with factorial(3), but the final return value
of fib(5) is equivalent to the following (syntactically invalid) expression:
(
fib((n - 2) - 2)
+
(
fib(((n - 2) - 1) - 2)
+
fib(((n - 2) - 1) - 1)
)
)
+
(
(
fib(((n - 1) - 2) - 2)
+
fib(((n - 1) - 2) - 1)
)
+
(
fib(((n - 1) - 1) - 2)
+
(
fib((((n - 1) - 1) - 1) - 2)
+
fib((((n - 1) - 1) - 1) - 1)
)
)
)
• A tail call is simply a recursive function call which is the last operation to be performed
before returning a value. To be clear, return foo(n - 1) is a tail call, but return foo(n - 1) + 1
is not (since the addition is the last operation).
• Tail call optimization (TCO) is a way to automatically reduce recursion in recursive
functions.
• Tail call elimination (TCE) is the reduction of a tail call to an expression that can be
evaluated without recursion. TCE is a type of TCO.
• The interpreter can minimize the amount of memory occupied by environments. Since no
computer has unlimited memory, excessive recursive function calls would lead to a stack
overflow.
• The interpreter can reduce the number of stack frame switches.
Python has no form of TCO implemented for a number of reasons. Therefore, other techniques
are required to skirt this limitation. The method of choice depends on the use case. With some
intuition, the definitions of factorial and fib can relatively easily be converted to iterative code as
follows:
def factorial(n):
    product = 1
    while n > 1:
        product *= n
        n -= 1
    return product

def fib(n):
    a, b = 0, 1
    while n > 0:
        a, b = b, a + b
        n -= 1
    return a
This is usually the most efficient way to manually eliminate recursion, but it can become rather
difficult for more complex functions.
Another useful tool is Python's lru_cache decorator which can be used to reduce the number of
redundant calculations.
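For instance, the naive recursive Fibonacci becomes efficient once each result is cached (a sketch; lru_cache lives in functools):

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # memoize: each fib(n) is computed only once
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(100))
# Out: 354224848179261915075
```

Without the cache, fib(100) would make an astronomically large number of calls; with it, each value from 0 to 100 is computed exactly once.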
You now have an idea as to how to avoid recursion in Python, but when should you use
recursion? The answer is “not often”. All recursive functions can be implemented iteratively. It's
simply a matter of figuring out how to do so. However, there are rare cases in which recursion is
okay. Recursion is common in Python when the expected inputs wouldn't cause a significant
number of recursive function calls.
If recursion is a topic that interests you, I implore you to study functional languages such as
Scheme or Haskell. In such languages, recursion is much more useful.
Please note that the above example for the Fibonacci sequence, although good at showing how to
apply the definition in Python and the later use of lru_cache, has an inefficient running time since it
makes 2 recursive calls for each non-base case. The number of calls to the function grows
exponentially with n.
Rather non-intuitively a more efficient implementation would use linear recursion:
def fib(n):
    if n <= 1:
        return (n, 0)
    else:
        (a, b) = fib(n - 1)
        return (a + b, a)
But that one has the issue of returning a pair of numbers. This emphasizes that some functions
really do not gain much from recursion.
root
- A
- AA
- AB
- B
- BA
- BB
- BBA
Now, if we wish to list all the names of the elements, we could do this with a simple for-loop. We
assume there is a function get_name() to return a string of the name of a node, a function
get_children() to return a list of all the sub-nodes of a given node in the tree, and a function
get_root() to get the root node.
root = get_root(tree)
for node in get_children(root):
    print(get_name(node))
    for child in get_children(node):
        print(get_name(child))
        for grand_child in get_children(child):
            print(get_name(grand_child))
# prints: A, AA, AB, B, BA, BB, BBA
This works well and fast, but what if the sub-nodes, got sub-nodes of its own? And those sub-
nodes might have more sub-nodes... What if you don't know beforehand how many there will be?
A method to solve this is the use of recursion.
def list_tree_names(node):
    for child in get_children(node):
        print(get_name(child))
        list_tree_names(node=child)

list_tree_names(node=get_root(tree))
# prints: A, AA, AB, B, BA, BB, BBA
Perhaps you wish to not print, but return a flat list of all node names. This can be done by passing
a rolling list as a parameter.
def list_tree_names(node, lst=[]):
    for child in get_children(node):
        lst.append(get_name(child))
        list_tree_names(node=child, lst=lst)
    return lst

list_tree_names(node=get_root(tree))
# returns ['A', 'AA', 'AB', 'B', 'BA', 'BB', 'BBA']
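Be aware that a mutable default argument like lst=[] is created once and shared between top-level calls, so calling list_tree_names a second time would keep appending to the same list. A self-contained sketch of a safer variant (the tree representation and helper functions below are stand-ins for the ones assumed in the text):

```python
# Minimal stand-ins for the tree API assumed in the text
tree = ('root', [('A', [('AA', []), ('AB', [])]),
                 ('B', [('BA', []), ('BB', [('BBA', [])])])])

def get_name(node):
    return node[0]

def get_children(node):
    return node[1]

def get_root(t):
    return t

def list_tree_names(node, lst=None):
    if lst is None:  # a fresh list for every top-level call
        lst = []
    for child in get_children(node):
        lst.append(get_name(child))
        list_tree_names(node=child, lst=lst)
    return lst

print(list_tree_names(node=get_root(tree)))
# ['A', 'AA', 'AB', 'B', 'BA', 'BB', 'BBA']
```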
There is a limit to the depth of possible recursion, which depends on the Python implementation.
When the limit is reached, a RuntimeError exception is raised:
def cursing(depth):
    try:
        cursing(depth + 1) # actually, re-cursing
    except RuntimeError as RE:
        print('I recursed {} times!'.format(depth))

cursing(0)
# Out: I recursed 1083 times!
sys.setrecursionlimit(limit)
You can check the current value of the limit by running:

sys.getrecursionlimit()
Running the same method above with our new limit we get
sys.setrecursionlimit(2000)
cursing(0)
# Out: I recursed 1997 times!
From Python 3.5, the exception is a RecursionError, which is derived from RuntimeError.
When the only thing returned from a function is a recursive call, it is referred to as tail recursion.
def countdown(n):
    if n == 0:
        print("Blastoff!")
    else:
        print(n)
        countdown(n - 1)
Any computation that can be made using iteration can also be made using recursion. Here is a
version of find_max written using tail recursion:
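The find_max implementation is not included in this excerpt; a tail-recursive sketch (the signature and the -inf starting value are assumptions) could look like:

```python
def find_max(seq, max_so_far=float('-inf')):
    """Tail-recursive maximum: the recursive call is the last operation."""
    if not seq:
        return max_so_far
    return find_max(seq[1:], max(max_so_far, seq[0]))

print(find_max([3, 1, 4, 1, 5, 9, 2, 6]))
# Out: 9
```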
Tail recursion is considered a bad practice in Python, since the Python compiler does not handle
optimization for tail recursive calls. The recursive solution in cases like this uses more system
resources than the equivalent iterative solution.

By default Python's recursion stack cannot exceed 1000 frames. This can be changed by calling
sys.setrecursionlimit(15000), which avoids the error; however, this method consumes more memory.
Instead, we can also work around the limit using stack introspection.
#!/usr/bin/env python
# This program shows off a python decorator which implements tail call optimization. It
# does this by throwing an exception if it is its own grandparent, and catching such
# exceptions to recall the stack.

import sys

class TailRecurseException(BaseException):
    def __init__(self, args, kwargs):
        self.args = args
        self.kwargs = kwargs

def tail_call_optimized(g):
    """
    This function decorates a function with tail call
    optimization. It does this by throwing an exception
    if it is its own grandparent, and catching such
    exceptions to fake the tail call optimization.
    """
    def func(*args, **kwargs):
        f = sys._getframe()
        # if the grandparent frame runs the same code, we are in a tail call:
        # unwind the stack by raising an exception carrying the new arguments
        if f.f_back and f.f_back.f_back and f.f_back.f_back.f_code == f.f_code:
            raise TailRecurseException(args, kwargs)
        else:
            while True:
                try:
                    return g(*args, **kwargs)
                except TailRecurseException as e:
                    args = e.args
                    kwargs = e.kwargs
    func.__doc__ = g.__doc__
    return func
To optimize the recursive functions, we can use the @tail_call_optimized decorator to call our
function. Here's a few of the common recursion examples using the decorator described above:
Factorial Example:
@tail_call_optimized
def factorial(n, acc=1):
    "calculate a factorial"
    if n == 0:
        return acc
    return factorial(n - 1, n * acc)

print(factorial(10000))
# prints a big, big number,
# but doesn't hit the recursion limit.
Fibonacci Example:
@tail_call_optimized
def fib(i, current=0, next=1):
    if i == 0:
        return current
    else:
        return fib(i - 1, next, current + next)

print(fib(10000))
# also prints a big number,
# but doesn't hit the recursion limit.
Chapter 154: Reduce
Syntax
• reduce(function, iterable[, initializer])
Parameters
Parameter Details
function — the function that is used for reducing the iterable (must take two arguments). (positional-only)
Remarks
reduce might not always be the most efficient function. For some types there are equivalent
functions or methods:
• sum() for the sum of a sequence containing addable elements (not strings):
sum([1,2,3]) # = 6
Examples
Overview
# Python 2.x: reduce is a built-in, no import needed
# Python 3.x: reduce is no longer a built-in...
from functools import reduce # ... but it can be loaded from the functools module
from functools import reduce # mandatory
asequence = [1, 2, 3]
In this example, we defined our own add function. However, Python comes with a standard
equivalent function in the operator module:
import operator
reduce(operator.add, asequence)
# Out: 6
Using reduce
asequence = [1, 2, 3]
Given an initializer the function is started by applying it to the initializer and the first iterable
element:
Without initializer parameter the reduce starts by applying the function to the first two list
elements:
# Out: 1 * 2 = 2
# 2 * 3 = 6
print(cumprod)
# Out: 6
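A self-contained illustration of both behaviors, using multiplication as the reducing function (the numbers are chosen for illustration):

```python
from functools import reduce

asequence = [1, 2, 3]

# Without an initializer, reduce starts with the first two elements
print(reduce(lambda x, y: x * y, asequence))
# Out: 6

# With an initializer, the function is first applied to the
# initializer and the first element
print(reduce(lambda x, y: x * y, asequence, 10))
# Out: 60
```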
Cumulative product
import operator
reduce(operator.mul, [10, 5, -3])
# Out: -150
reduce will not terminate the iteration before the iterable has been completely iterated over, so it can
be used to create non-short-circuiting any() or all() functions:
import operator
# non short-circuit "all"
reduce(operator.and_, [False, True, True, True]) # = False
Chapter 155: Regular Expressions (Regex)
Introduction
Python makes regular expressions available through the re module.
Regular expressions are combinations of characters that are interpreted as rules for matching
substrings. For instance, the expression 'amount\D+\d+' will match any string composed of the
word amount plus an integral number, separated by one or more non-digits, such as: amount=100,
amount is 3, amount is equal to: 33, etc.
Syntax
• Direct Regular Expressions
• re.match(pattern, string, flag=0) # Out: match pattern at the beginning of string or None
• re.finditer(pattern, string, flag=0) # Out: same as re.findall, but returns iterator object
• re.sub(pattern, replacement, string, flag=0) # Out: string with replacement (string or function)
in place of pattern
Examples
Matching the beginning of a string
The first argument of re.match() is the regular expression, the second is the string to match:
import re

pattern = r"123"
string = "123zzb"

match = re.match(pattern, string)
match
# Out: <_sre.SRE_Match object; span=(0, 3), match='123'>

match.group()
# Out: '123'
You may notice that the pattern variable is a string prefixed with r, which indicates that the string is
a raw string literal.
A raw string literal has a slightly different syntax than a string literal, namely a backslash \ in a raw
string literal means "just a backslash" and there's no need to double up backslashes to escape
"escape sequences" such as newlines (\n), tabs (\t), backspaces (\b), form feeds (\f), and so on.
In normal string literals, each backslash must be doubled up to avoid being taken as the start of an
escape sequence.
Hence, r"\n" is a string of 2 characters: \ and n. Regex patterns also use backslashes, e.g. \d
refers to any digit character. We can avoid having to double escape our strings ("\\d") by using
raw strings (r"\d").
For instance:
string = "\\t123zzb" # here the backslash is escaped, so there's no tab, just '\' and 't'
pattern = "\\t123" # this will match \t (escaping the backslash) followed by 123

re.match(pattern, string) # no match (returns None, so calling .group() would fail)
re.match(pattern, "\t123zzb").group() # matches '\t123'

pattern = r"\\t123"
re.match(pattern, string).group() # matches '\\t123'
Matching is done from the start of the string only. If you want to match anywhere, use re.search
instead:

match = re.match(pattern, "zzb123") # pattern is not at the start, so no match
match is None
# Out: True

match = re.search(pattern, "zzb123")
match.group()
# Out: '123'
Searching
match.group(1)
# Out: 'your base'
Searching is done anywhere in the string unlike re.match. You can also use re.findall.
You can also search at the beginning of the string (use ^):
Grouping
Grouping is done with parentheses. Calling group() returns a string formed of the matching
parenthesized subgroups.
If there is a single argument, the result is a single string; if there are multiple
arguments, the result is a tuple with one item per argument.
Calling groups() on the other hand, returns a tuple containing the subgroups.
Named groups
match = re.search(r'My name is (?P<name>[A-Za-z ]+)', 'My name is John Smith')
match.group('name')
# Out: 'John Smith'
match.group(1)
# Out: 'John Smith'
Non-capturing groups
Using (?:) creates a group, but the group isn't captured. This means you can use it as a group,
but it won't pollute your "group space".
re.match(r'(\d+)(\+(\d+))?', '11+22').groups()
# Out: ('11', '+22', '22')
re.match(r'(\d+)(?:\+(\d+))?', '11+22').groups()
# Out: ('11', '22')
This example matches 11+22 or 11, but not 11+. This is since the + sign and the second term are
grouped. On the other hand, the + sign isn't captured.
Special characters (like the character class brackets [ and ] below) are not matched literally:
re.escape('a[b]c')
# Out: 'a\\[b\\]c'
match = re.search(re.escape('a[b]c'), 'a[b]c')
match.group()
# Out: 'a[b]c'
The re.escape() function escapes all special characters, so it is useful if you are composing a
regular expression based on user input:
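For instance (the username value is a made-up user input containing regex metacharacters):

```python
import re

username = 'A.C.'  # hypothetical user input with regex metacharacters

pattern = re.compile(re.escape(username))  # the dots are now matched literally

print(bool(pattern.search('my name is A.C.')))
# Out: True
print(bool(pattern.search('my name is AXCY')))
# Out: False -- without re.escape, 'A.C.' would also match 'AXCY'
```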
Replacing
Replacing strings
re.sub(r"t[0-9][0-9]", "foo", "my name t13 is t44 what t99 ever t44")
# Out: 'my name foo is foo what foo ever foo'
However, if you make a group ID like '10', this doesn't work: \10 is read as 'ID number 1 followed
by 0'. So you have to be more specific and use the \g<i> notation:
re.sub(r"t([0-9])([0-9])", r"t\g<2>\g<1>", "t13 t19 t81 t25")
# Out: 't31 t91 t18 t52'
Note that the r before "[0-9]{2,3}" tells python to interpret the string as-is; as a "raw" string.
You could also use re.finditer() which works in the same way as re.findall() but returns an
iterator with SRE_Match objects instead of a list of strings:
Precompiled patterns
import re
precompiled_pattern = re.compile(r"(\d+)")
matches = precompiled_pattern.search("The answer is 41!")
matches.group(1)
# Out: '41'
Compiling a pattern allows it to be reused later on in a program. However, note that Python
caches recently-used expressions (docs, SO answer), so "programs that use only a few regular
expressions at a time needn’t worry about compiling regular expressions".
import re
precompiled_pattern = re.compile(r"(.*\d+)")
matches = precompiled_pattern.match("The answer is 41!")
print(matches.group(1))
# Out: The answer is 41
If you want to check that a string contains only a certain set of characters, in this case a-z, A-Z,
0-9 and the period, you can do so like this:
import re

def is_allowed(string):
    character_regex = re.compile(r'[^a-zA-Z0-9.]')
    string = character_regex.search(string)
    return not bool(string)

print(is_allowed("abyzABYZ0099"))
# Out: True
print(is_allowed("#*@#$%^"))
# Out: False
You can also adapt the expression line from [^a-zA-Z0-9.] to [^a-z0-9.], to disallow uppercase
letters for example.
You can also use regular expressions to split a string. For example,
import re
data = re.split(r'\s+', 'James 94 Samantha 417 Scarlett 74')
print( data )
# Output: ['James', '94', 'Samantha', '417', 'Scarlett', '74']
Flags
For some special cases we need to change the behavior of the Regular Expression, this is done
using flags. Flags can be set in two ways, through the flags keyword or directly in the expression.
Flags keyword
Below an example for re.search but it works for most functions in the re module.
m = re.search("b", "ABC")
m is None
# Out: True

m = re.search("b", "ABC", flags=re.IGNORECASE)
m.group()
# Out: 'B'
Common Flags
re.MULTILINE, re.M Makes ^ match the beginning of a line and $ the end of a line
For the complete list of all available flags check the docs
Inline flags
From the docs:
(?iLmsux) (One or more letters from the set 'i', 'L', 'm', 's', 'u', 'x'.)
The group matches the empty string; the letters set the corresponding flags: re.I
(ignore case), re.L (locale dependent), re.M (multi-line), re.S (dot matches all), re.U
(Unicode dependent), and re.X (verbose), for the entire regular expression. This is
useful if you wish to include the flags as part of the regular expression, instead of
passing a flag argument to the re.compile() function.
Note that the (?x) flag changes how the expression is parsed. It should be used first in
the expression string, or after one or more whitespace characters. If there are non-
whitespace characters before the flag, the results are undefined.
You can use re.finditer to iterate over all matches in a string. This gives you (in comparison to
re.findall) extra information, such as information about the match location in the string (indexes):

import re

text = 'You can try to find an ant in this string'
pattern = r'an?\w' # find 'an' either with or without a following word character

for match in re.finditer(pattern, text):
    # Start and end index of the match (integers)
    sStart = match.start()
    sEnd = match.end()
    # The matching substring
    sGroup = match.group()
    # Print match
    print('Match "{}" found at: [{},{}]'.format(sGroup, sStart, sEnd))
Result:

Match "an" found at: [5,7]
Match "an" found at: [20,22]
Match "ant" found at: [23,26]
Often you want to match an expression only in specific places (leaving them untouched in others,
that is). Consider the following sentence:
Here the "apple" occurs twice which can be solved with so called backtracking control verbs which
are supported by the newer regex module. The idea is:
import regex as re
string = "An apple a day keeps the doctor away (I eat an apple everyday)."
rx = re.compile(r'''
\([^()]*\) (*SKIP)(*FAIL) # match anything in parentheses and "throw it away"
| # or
apple # match an apple
''', re.VERBOSE)
apples = rx.findall(string)
print(apples)
# only one
This matches "apple" only when it can be found outside of the parentheses.
• While looking from left to right, the regex engine consumes everything to the left, the
(*SKIP) acts as an "always-true-assertion". Afterwards, it correctly fails on (*FAIL) and
backtracks.
• Now it gets to the point of (*SKIP) from right to left (aka while backtracking) where it is
forbidden to go any further to the left. Instead, the engine is told to throw away anything to
the left and jump to the point where the (*SKIP) was invoked.
Chapter 156: Searching
Remarks
All searching algorithms on iterables containing n elements have O(n) complexity. Only specialized
algorithms like bisect.bisect_left() can be faster with O(log(n)) complexity.
Examples
Getting the index for strings: str.index(), str.rindex() and str.find(), str.rfind()
Strings also have an index method, but also more advanced options and the additional str.find. For
both of these there is a complementary reversed method.
astring = 'Hello on StackOverflow'

astring.find('o') # 4
astring.rfind('o') # 20
The difference between index/rindex and find/rfind is what happens if the substring is not found in
the string: index and rindex raise a ValueError, whereas find and rfind return -1.

All of them allow a start and end index:

astring.index('o', 5) # 6
astring.index('o', 6) # 6 - start is inclusive
astring.index('o', 5, 7) # 6
astring.index('o', 5, 6) # ValueError - end is not inclusive
astring.rindex('o', 20) # 20
astring.rindex('o', 19) # 20 - still from left to right
astring.rindex('o', 4, 7) # 6
All built-in collections in Python implement a way to check element membership using in.
List
alist = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
5 in alist # True
10 in alist # False
Tuple
String
Set
Dict
dict is a bit special: the normal in only checks the keys. If you want to search in values you need
to specify it. The same if you want to search for key-value pairs.
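For instance, membership sketches for the collection types listed above (the sample values are illustrative):

```python
atuple = ('a', 'b', 'c')
'a' in atuple          # True
'd' in atuple          # False

astring = 'i am a string'
'a' in astring         # True
'am' in astring        # True - for strings, `in` also matches substrings

aset = {(10, 10), (20, 20), (30, 30)}
(10, 10) in aset       # True
10 in aset             # False - the elements here are tuples, not ints

adict = {0: 'a', 1: 'b', 2: 'c'}
1 in adict             # True  - only checks the keys
'a' in adict           # False
'a' in adict.values()  # True  - search in the values explicitly
(0, 'a') in adict.items()  # True - search in key-value pairs
```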
list and tuple have an index-method to get the position of the element:
alist.index(15)  # ValueError: 15 is not in list
atuple = (10, 16, 26, 5, 2, 19, 105, 26)
atuple.index(26) # 2
atuple[2] # 26
atuple[7] # 26 - index 7 also holds 26, but index() always returns the first occurrence
dict has no builtin method for searching a value or key because dictionaries are unordered. You
can create a function that gets the key (or keys) for a specified value:
Such a function can return a list of all keys that have the specified value, or just the first matching key:
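The helpers and the dictionary used in the calls below are not shown; one possible sketch that matches the commented results (adict is assumed to be something like {'a': 10, 'b': 20, 'c': 10}):

```python
adict = {'a': 10, 'b': 20, 'c': 10}

def getKeysForValue(dictionary, value):
    """Return a list of all keys mapping to the given value."""
    return [key for key, val in dictionary.items() if val == value]

def getOneKeyForValue(dictionary, value):
    """Return the first key found for the given value.

    Raises StopIteration if no key maps to the value, because next()
    exhausts the generator expression without finding a match.
    """
    return next(key for key, val in dictionary.items() if val == value)
```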
getOneKeyForValue(adict, 10) # 'c' - depending on the circumstances this could also be 'a'
getOneKeyForValue(adict, 20) # 'b'
getOneKeyForValue(adict, 25) # StopIteration - no key maps to 25
import bisect
alist = [i for i in range(1, 100000, 3)] # Sorted list from 1 to 100000 with step 3
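The calls below rely on a helper that is not shown; one sketch, built on bisect.bisect_left (the standard "index" recipe from the bisect documentation):

```python
import bisect

def index_sorted(sorted_seq, value):
    """Locate the leftmost item exactly equal to value, or raise ValueError.

    Runs in O(log n) on a sorted sequence, versus O(n) for list.index.
    """
    i = bisect.bisect_left(sorted_seq, value)
    if i != len(sorted_seq) and sorted_seq[i] == value:
        return i
    raise ValueError('{} is not in the sequence'.format(value))
```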
index_sorted(alist, 97285) # 32428
index_sorted(alist, 4) # 1
index_sorted(alist, 97286) # ValueError - the value is not in the list
For very large sorted sequences the speed gain can be quite high - in the case of the first search
above, approximately 500 times as fast:
While it's a bit slower if the element is one of the very first:
%timeit index_sorted(alist, 4)
# 100000 loops, best of 3: 2.98 µs per loop
%timeit alist.index(4)
# 1000000 loops, best of 3: 580 ns per loop
Searching in nested sequences like a list of tuple requires an approach like searching the keys
for values in dict but needs customized functions.
The index of the outermost sequence if the value was found in the sequence:
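The helper and the data used in the calls below are not shown; a sketch consistent with the commented results (the sample data is chosen to reproduce them):

```python
def outer_inner_index(datastructure, value):
    """Return (outer, inner) index of the first occurrence of value."""
    for outer_index, outer in enumerate(datastructure):
        for inner_index, inner in enumerate(outer):
            if inner == value:
                return (outer_index, inner_index)
    raise ValueError('{} is not in any of the inner sequences'.format(value))

alist_of_tuples = [(4, 5, 6), (3, 1, 'a'), (7, 0, 4.3)]
```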
outer_inner_index(alist_of_tuples, 'a') # (1, 2)
alist_of_tuples[1][2] # 'a'
outer_inner_index(alist_of_tuples, 7) # (2, 0)
alist_of_tuples[2][0] # 7
In general (not always) using next and a generator expression with conditions to find the first
occurrence of the searched value is the most efficient approach.
To allow the use of in for custom classes the class must either provide the magic method
__contains__ or, failing that, an __iter__-method.
class ListList:
    def __init__(self, value):
        self.value = value
        # Create a set of all values for fast access
        self.setofvalues = set(item for sublist in self.value for item in sublist)

    def __iter__(self):
        print('Using __iter__.')
        # A generator over all sublist elements
        return (item for sublist in self.value for item in sublist)

    def __contains__(self, value):
        print('Using __contains__.')
        # Check membership in the precomputed set
        return value in self.setofvalues
        # Even without the set you could use the iter method for the contains-check:
        # return any(item == value for item in iter(self))
a = ListList([[1,1,1],[0,1,1],[1,5,1]])
10 in a # False
# Prints: Using __contains__.
5 in a # True
# Prints: Using __contains__.
del ListList.__contains__
5 in a # True
# Prints: Using __iter__.
Note: The looping in (as in for i in a) will always use __iter__ even if the class implements a
__contains__ method.
Chapter 157: Secure Shell Connection in
Python
Parameters
Parameter Usage
hostname This parameter tells the host to which the connection needs to be established
Examples
ssh connection
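A common approach uses the third-party paramiko library. A minimal sketch (the helper name and its parameters are illustrative; the import is deferred into the function so it can be defined even where paramiko is not installed):

```python
def run_remote_command(hostname, username, password, command):
    # Deferred import: paramiko is a third-party dependency (pip install paramiko)
    from paramiko import client

    ssh = client.SSHClient()
    # Automatically accept unknown host keys (convenient, but unsafe for production)
    ssh.set_missing_host_key_policy(client.AutoAddPolicy())
    ssh.connect(hostname, username=username, password=password)

    # Run the command and collect its standard output
    stdin, stdout, stderr = ssh.exec_command(command)
    output = stdout.read().decode()
    ssh.close()
    return output
```

Calling `run_remote_command('myserver', 'me', 'secret', 'ls -l')` would then return the remote directory listing as a string, assuming such a host and account exist.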
Chapter 158: Security and Cryptography
Introduction
Python, being one of the most popular languages in computer and network security, has great
potential in security and cryptography. This topic deals with the cryptographic features and
implementations in Python from its uses in computer and network security to hashing and
encryption/decryption algorithms.
Syntax
• hashlib.new(name)
• hashlib.pbkdf2_hmac(name, password, salt, rounds, dklen=None)
Remarks
Many of the methods in hashlib will require you to pass values interpretable as buffers of bytes,
rather than strings. This is the case for hashlib.new().update() as well as hashlib.pbkdf2_hmac. If
you have a string, you can convert it to a byte buffer by prepending the character b to the start of
the string:
"This is a string"
b"This is a buffer of bytes"
Examples
Calculating a Message Digest
The hashlib module allows creating message digest generators via the new method. These
generators will turn an arbitrary string into a fixed-length digest:
import hashlib
h = hashlib.new('sha256')
h.update(b'Nobody expects the Spanish Inquisition.')
h.digest()
# ==> b'.\xdf\xda\xdaVR[\x12\x90\xff\x16\xfb\x17D\xcf\xb4\x82\xdd)\x14\xff\xbc\xb6Iy\x0c\x0eX\x9eF-='
Note that you can call update an arbitrary number of times before calling digest which is useful to
hash a large file chunk by chunk. You can also get the digest in hexadecimal format by using
hexdigest:
h.hexdigest()
# ==> '2edfdada56525b1290ff16fb1744cfb482dd2914ffbcb649790c0e589e462d3d'
Available Hashing Algorithms

hashlib.new requires the name of an algorithm when you call it to produce a generator. To find out
what algorithms are available in the current Python interpreter, use hashlib.algorithms_available:
import hashlib
hashlib.algorithms_available
# ==> {'sha256', 'DSA-SHA', 'SHA512', 'SHA224', 'dsaWithSHA', 'SHA', 'RIPEMD160',
#      'ecdsa-with-SHA1', 'sha1', 'SHA384', 'md5', 'SHA1', 'MD5', 'MD4', 'SHA256', 'sha384',
#      'md4', 'ripemd160', 'sha224', 'sha512', 'DSA', 'dsaEncryption', 'sha', 'whirlpool'}
The returned list will vary according to platform and interpreter; make sure you check your
algorithm is available.
There are also some algorithms that are guaranteed to be available on all platforms and
interpreters, which are available using hashlib.algorithms_guaranteed:
hashlib.algorithms_guaranteed
# ==> {'sha256', 'sha384', 'sha1', 'sha224', 'md5', 'sha512'}
Secure Password Hashing

The PBKDF2 algorithm exposed by the hashlib module can be used to perform secure password
hashing. While this algorithm cannot prevent brute-force attacks in order to recover the original
password from the stored hash, it makes such attacks very expensive.
import hashlib
import os
salt = os.urandom(16)
hash = hashlib.pbkdf2_hmac('sha256', b'password', salt, 100000)
PBKDF2 can work with any digest algorithm, the above example uses SHA256 which is usually
recommended. The random salt should be stored along with the hashed password, you will need it
again in order to compare an entered password to the stored hash. It is essential that each
password is hashed with a different salt. As to the number of rounds, it is recommended to set it
as high as possible for your application.
If you want the result in hexadecimal, you can use the binascii module:
import binascii
hexhash = binascii.hexlify(hash)
Note: While PBKDF2 isn't bad, bcrypt and especially scrypt are considered stronger against brute-
force attacks. Neither is part of the Python standard library at the moment.
File Hashing
A hash is a function that converts a variable length sequence of bytes to a fixed length sequence.
Hashing files can be advantageous for many reasons. Hashes can be used to check if two files
are identical or verify that the contents of a file haven't been corrupted or changed.
import hashlib

hasher = hashlib.new('sha256')
with open('myfile', 'rb') as f:  # open in binary mode; hashing requires bytes
    contents = f.read()
    hasher.update(contents)
print(hasher.hexdigest())
For larger files, reading a fixed-size buffer at a time avoids loading the whole file into memory:

import hashlib

SIZE = 65536
hasher = hashlib.new('sha256')
with open('myfile', 'rb') as f:
    buffer = f.read(SIZE)
    while len(buffer) > 0:
        hasher.update(buffer)
        buffer = f.read(SIZE)
print(hasher.hexdigest())
Symmetric encryption using pycrypto

Python's built-in crypto functionality is currently limited to hashing. Encryption requires a third-party
module like pycrypto. For example, it provides the AES algorithm which is considered state of the
art for symmetric encryption. The following code will encrypt a given message using a passphrase:
import hashlib
import math
import os
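A sketch of the encryption step, matching the decryption snippet below (same constant names, same PBKDF2-derived IV and key). The AES call requires the third-party pycrypto or pycryptodome package, so it is guarded here:

```python
import hashlib
import os

IV_SIZE = 16    # 128-bit initialization vector, fixed by the AES block size
KEY_SIZE = 32   # 256-bit key, i.e. AES-256
SALT_SIZE = 16  # arbitrary, but must be stored alongside the ciphertext

cleartext = b'Lorem ipsum'
password = b'highly secure encryption password'

# Derive IV and key from the passphrase with a per-message random salt
salt = os.urandom(SALT_SIZE)
derived = hashlib.pbkdf2_hmac('sha256', password, salt, 100000,
                              dklen=IV_SIZE + KEY_SIZE)
iv = derived[0:IV_SIZE]
key = derived[IV_SIZE:]

try:
    from Crypto.Cipher import AES  # third-party: pycrypto or pycryptodome
    # Prepend the salt so decryption can re-derive the same IV and key
    encrypted = salt + AES.new(key, AES.MODE_CFB, iv).encrypt(cleartext)
except ImportError:
    encrypted = None  # pycrypto not installed; AES step skipped
```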
The AES algorithm takes three parameters: encryption key, initialization vector (IV) and the actual
message to be encrypted. If you have a randomly generated AES key then you can use that one
directly and merely generate a random initialization vector. A passphrase doesn't have the right
size however, nor would it be recommendable to use it directly given that it isn't truly random and
thus has comparably little entropy. Instead, we use the built-in implementation of the PBKDF2
algorithm to generate a 128 bit initialization vector and 256 bit encryption key from the password.
Note the random salt which is important to have a different initialization vector and key for each
message encrypted. This ensures in particular that two equal messages won't result in identical
encrypted text, but it also prevents attackers from reusing work spent guessing one passphrase on
messages encrypted with another passphrase. This salt has to be stored along with the encrypted
message in order to derive the same initialization vector and key for decrypting.
salt = encrypted[0:SALT_SIZE]
derived = hashlib.pbkdf2_hmac('sha256', password, salt, 100000,
dklen=IV_SIZE + KEY_SIZE)
iv = derived[0:IV_SIZE]
key = derived[IV_SIZE:]
cleartext = AES.new(key, AES.MODE_CFB, iv).decrypt(encrypted[SALT_SIZE:])
Generating RSA signatures using pycrypto

RSA can be used to create a message signature. A valid signature can only be generated with
access to the private RSA key, validating on the other hand is possible with merely the
corresponding public key. So as long as the other side knows your public key they can verify the
message to be signed by you and unchanged - an approach used for email for example. Currently,
a third-party module like pycrypto is required for this functionality.
import errno

from Crypto.Hash import SHA256
from Crypto.PublicKey import RSA
from Crypto.Signature import PKCS1_v1_5

message = b'This message is from me, I promise.'

try:
    with open('privkey.pem', 'r') as f:
        key = RSA.importKey(f.read())
except IOError as e:
    if e.errno != errno.ENOENT:
        raise
    # No private key, generate a new one. This can take a few seconds.
    key = RSA.generate(4096)
    with open('privkey.pem', 'wb') as f:
        f.write(key.exportKey('PEM'))
    with open('pubkey.pem', 'wb') as f:
        f.write(key.publickey().exportKey('PEM'))

hasher = SHA256.new(message)
signer = PKCS1_v1_5.new(key)
signature = signer.sign(hasher)
Verifying the signature works similarly but uses the public key rather than the private key:
Note: The above examples use PKCS#1 v1.5 signing algorithm which is very common. pycrypto
also implements the newer PKCS#1 PSS algorithm, replacing PKCS1_v1_5 by PKCS1_PSS in the
examples should work if you want to use that one. Currently there seems to be little reason to use
it however.
Asymmetric RSA encryption using pycrypto

Asymmetric encryption has the advantage that a message can be encrypted without exchanging a
secret key with the recipient of the message. The sender merely needs to know the recipients
public key, this allows encrypting the message in such a way that only the designated recipient
(who has the corresponding private key) can decrypt it. Currently, a third-party module like
pycrypto is required for this functionality.
The recipient can decrypt the message then if they have the right private key:
Note: The above examples use PKCS#1 OAEP encryption scheme. pycrypto also implements
PKCS#1 v1.5 encryption scheme, this one is not recommended for new protocols however due to
known caveats.
Chapter 159: Set
Syntax
• empty_set = set() # initialize an empty set
• literal_set = {'foo', 'bar', 'baz'} # construct a set with 3 strings inside it
• set_from_list = set(['foo', 'bar', 'baz']) # call the set function for a new set
• set_from_iter = set(x for x in range(30)) # use arbitrary iterables to create a set
• set_from_iter = {x for x in [random.randint(0,10) for i in range(10)]} # alternative notation
Remarks
Sets are unordered and have very fast lookup time (amortized O(1) if you want to get technical). It
is great to use when you have a collection of things, the order doesn't matter, and you'll be looking
up items by name a lot. If it makes more sense to look up items by an index number, consider
using a list instead. If order matters, consider a list as well.
Sets are mutable and thus cannot be hashed, so you cannot use them as dictionary keys or put
them in other sets, or anywhere else that requires hashable types. In such cases, you can use an
immutable frozenset.
The elements of a set must be hashable. This means that they have a correct __hash__ method,
that is consistent with __eq__. In general, mutable types such as list or set are not hashable and
cannot be put in a set. If you encounter this problem, consider using dict and immutable keys.
Examples
Get the unique elements of a list
Let's say you've got a list of restaurants -- maybe you read it from a file. You care about the unique
restaurants in the list. The best way to get the unique elements from a list is to turn it into a set:
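For instance (the list of restaurants is illustrative):

```python
restaurants = ["McDonald's", "Burger King", "McDonald's", "Chicken Chicken", "McDonald's"]
unique_restaurants = set(restaurants)
print(unique_restaurants)
# e.g. {'Burger King', "McDonald's", 'Chicken Chicken'} - iteration order may differ
```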
Note that the set is not in the same order as the original list; that is because sets are unordered,
just like dicts.
This can easily be transformed back into a List with Python's built in list function, giving another
list that is the same list as the original but without duplicates:
list(unique_restaurants)
# ['Chicken Chicken', "McDonald's", 'Burger King']
It's also common to see this as one line:
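For example (the input list is re-declared here so the snippet stands alone):

```python
restaurants = ["McDonald's", "Burger King", "McDonald's", "Chicken Chicken", "McDonald's"]
deduped = list(set(restaurants))  # duplicates removed in one line; order not preserved
```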
Now any operations that could be performed on the original list can be done again.
Operations on sets
# Intersection
{1, 2, 3, 4, 5}.intersection({3, 4, 5, 6}) # {3, 4, 5}
{1, 2, 3, 4, 5} & {3, 4, 5, 6} # {3, 4, 5}
# Union
{1, 2, 3, 4, 5}.union({3, 4, 5, 6}) # {1, 2, 3, 4, 5, 6}
{1, 2, 3, 4, 5} | {3, 4, 5, 6} # {1, 2, 3, 4, 5, 6}
# Difference
{1, 2, 3, 4}.difference({2, 3, 5}) # {1, 4}
{1, 2, 3, 4} - {2, 3, 5} # {1, 4}
# Superset check
{1, 2}.issuperset({1, 2, 3}) # False
{1, 2} >= {1, 2, 3} # False
# Subset check
{1, 2}.issubset({1, 2, 3}) # True
{1, 2} <= {1, 2, 3} # True
# Disjoint check
{1, 2}.isdisjoint({3, 4}) # True
{1, 2}.isdisjoint({1, 4}) # False
# Existence check
2 in {1,2,3} # True
4 in {1,2,3} # False
4 not in {1,2,3} # True
s = {1, 2, 3, 4}

s.discard(3) # s == {1,2,4}
s.discard(5) # s == {1,2,4}
s.remove(2) # s == {1,4}
s.remove(2) # KeyError!
Set operations return new sets, but have the corresponding in-place versions:

method                  in-place operation   in-place method
union                   s |= t               update
intersection            s &= t               intersection_update
difference              s -= t               difference_update
symmetric_difference    s ^= t               symmetric_difference_update
For example:
s = {1, 2}
s.update({3, 4}) # s == {1, 2, 3, 4}
Sets are unordered collections of distinct elements. But sometimes we want to work with
unordered collections of elements that are not necessarily distinct and keep track of the elements'
multiplicities.
By saving the strings 'a', 'b', 'b', 'c' into a set data structure we've lost the information on the
fact that 'b' occurs twice. Of course saving the elements to a list would retain this information
but a list data structure introduces an extra unneeded ordering that will slow down our
computations.
For implementing multisets Python provides the Counter class from the collections module
(starting from version 2.7):
Python 2.x2.7
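For example (the exact print formatting can vary between Python versions):

```python
from collections import Counter

counter = Counter(['a', 'b', 'b', 'c'])
print(counter)        # Counter({'b': 2, 'a': 1, 'c': 1})
print(counter['b'])   # 2

counter['b'] += 1     # counts are ordinary dictionary values and can be updated
print(counter['b'])   # 3
```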
Counter is a dictionary where elements are stored as dictionary keys and their counts are
stored as dictionary values. And as all dictionaries, it is an unordered collection.
>>> a = {1, 2, 2, 3, 4}
>>> b = {3, 3, 4, 4, 5}
NOTE: {1} creates a set of one element, but {} creates an empty dict. The correct way
to create an empty set is set().
Intersection
a.intersection(b) returns a new set with elements present in both a and b
>>> a.intersection(b)
{3, 4}
Union
a.union(b) returns a new set with elements present in either a and b
>>> a.union(b)
{1, 2, 3, 4, 5}
Difference
a.difference(b) returns a new set with elements present in a but not in b
>>> a.difference(b)
{1, 2}
>>> b.difference(a)
{5}
Symmetric Difference
a.symmetric_difference(b) returns a new set with elements present in either a or b but not in both
>>> a.symmetric_difference(b)
{1, 2, 5}
>>> b.symmetric_difference(a)
{1, 2, 5}
>>> c = {1, 2}
>>> c.issubset(a)
True
>>> a.issuperset(c)
True
Method                       Operator
a.intersection(b)            a & b
a.union(b)                   a | b
a.difference(b)              a - b
a.symmetric_difference(b)    a ^ b
a.issubset(b)                a <= b
a.issuperset(b)              a >= b
Disjoint sets
Sets a and d are disjoint if no element in a is also in d and vice versa.
>>> d = {5, 6}
>>> a.isdisjoint(b) # {3, 4} are in both sets
False
>>> a.isdisjoint(d)
True
Testing membership
The builtin in keyword searches for occurrences
>>> 1 in a
True
>>> 6 in a
False
Length
The builtin len() function returns the number of elements in the set
>>> len(a)
4
>>> len(b)
3
Set of Sets
{{1, 2}, {3, 4}}

leads to:

TypeError: unhashable type: 'set'

because sets are mutable and therefore not hashable. Use the immutable frozenset instead:

{frozenset({1, 2}), frozenset({3, 4})}
Chapter 160: setup.py
Parameters
Parameter Usage
packages      List of Python packages (that is, directories containing modules) to include. This
              can be specified manually, but a call to setuptools.find_packages() is typically
              used instead.
py_modules List of top-level Python modules (that is, single .py files) to include.
Remarks
For further information on python packaging see:
Introduction
Examples
Purpose of setup.py
The setup script is the centre of all activity in building, distributing, and installing modules using the
Distutils. Its purpose is the correct installation of the software.
If all you want to do is distribute a module called foo, contained in a file foo.py, then your setup
script can be as simple as this:
from distutils.core import setup

setup(name='foo',
      version='1.0',
      py_modules=['foo'],
      )
To create a source distribution for this module, you would create a setup script, setup.py,
containing the above code, and run this command from a terminal:

python setup.py sdist
sdist will create an archive file (e.g., tarball on Unix, ZIP file on Windows) containing your setup
script setup.py, and your module foo.py. The archive file will be named foo-1.0.tar.gz (or .zip), and
will unpack into a directory foo-1.0.
If an end-user wishes to install your foo module, all she has to do is download foo-1.0.tar.gz (or
.zip), unpack it, and—from the foo-1.0 directory—run

python setup.py install
Adding command line scripts to your python package

Command line scripts inside python packages are common. You can organise your package in
such a way that when a user installs the package, the script will be available on their path.

If you had the greetings package which had the command line script hello_world.py:

greetings/
   greetings/
      __init__.py
      hello_world.py

without installation the script has to be run with its full path:

python greetings/greetings/hello_world.py

You can achieve this by adding scripts to your setup() in setup.py like this:

from setuptools import setup
setup(
    name='greetings',
    packages=['greetings'],
    scripts=['greetings/greetings/hello_world.py']
)

When you install the greetings package now, hello_world.py will be added to your path, so it can
be invoked directly:

hello_world.py

Another possibility is to add an entry point:

entry_points={'console_scripts': ['greetings=greetings.hello_world:main']}

This way you just have to run it like:

greetings
setuptools_scm is an officially-blessed package that can use Git or Mercurial metadata to determine
the version number of your package, and find Python packages and package data to include in it.

from setuptools import setup, find_packages

setup(
    setup_requires=['setuptools_scm'],
    use_scm_version=True,
    packages=find_packages(),
    include_package_data=True,
)
This example uses both features; to only use SCM metadata for the version, replace the call to
find_packages() with your manual package list, or to only use the package finder, remove
use_scm_version=True.
But there are even more options, like installing the package and having the possibility to change
the code and test it without having to re-install it. This is done using:

python setup.py develop
If you want to perform specific actions like compiling a Sphinx documentation or building fortran
code, you can create your own option like this:
from setuptools import Command

cmdclasses = dict()

class BuildSphinx(Command):
    """Build Sphinx documentation."""
    description = 'Build Sphinx documentation'
    user_options = []

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        import sphinx
        sphinx.build_main(['setup.py', '-b', 'html', './doc', './doc/_build/html'])
        sphinx.build_main(['setup.py', '-b', 'man', './doc', './doc/_build/man'])

cmdclasses['build_sphinx'] = BuildSphinx

setup(
    ...
    cmdclass=cmdclasses,
)
initialize_options and finalize_options will be executed before and after the run function as their
names suggests it.
Chapter 161: shelve
Introduction
Shelve is a python module used to store objects in a file. The shelve module implements
persistent storage for arbitrary Python objects which can be pickled, using a dictionary-like API.
The shelve module can be used as a simple persistent storage option for Python objects when a
relational database is overkill. The shelf is accessed by keys, just as with a dictionary. The values
are pickled and written to a database created and managed by anydbm.
Remarks
Note: Do not rely on the shelf being closed automatically; always call close() explicitly when you
don’t need it any more, or use shelve.open() as a context manager:
Warning:
Because the shelve module is backed by pickle, it is insecure to load a shelf from an untrusted
source. Like with pickle, loading a shelf can execute arbitrary code.
Restrictions
1. The choice of which database package will be used (such as dbm.ndbm or dbm.gnu) depends
on which interface is available. Therefore it is not safe to open the database directly using dbm.
The database is also (unfortunately) subject to the limitations of dbm, if it is used — this means
that (the pickled representation of) the objects stored in the database should be fairly small, and in
rare cases key collisions may cause the database to refuse updates.
2. The shelve module does not support concurrent read/write access to shelved objects. (Multiple
simultaneous read accesses are safe.) When a program has a shelf open for writing, no other
program should have it open for reading or writing. Unix file locking can be used to solve this, but
this differs across Unix versions and requires knowledge about the database implementation used.
Examples
Sample code for shelve
To shelve an object, first import the module and then assign the object value as follows:
import shelve
database = shelve.open('filename.suffix')  # the filename is a string
object = Object()
database['key'] = object
To summarize the interface (key is a string, data is an arbitrary object):

import shelve

d = shelve.open(filename)  # open -- file may get suffix added by the low-level library

d[key] = data              # store data at key (overwrites old data if using an existing key)
data = d[key]              # retrieve a COPY of data at key (raise KeyError if no such key)
del d[key]                 # delete data stored at key (raises KeyError if no such key)

flag = key in d            # true if the key exists
klist = list(d.keys())     # a list of all existing keys (slow!)

d.close()                  # close it
The simplest way to use shelve is via the DbfilenameShelf class. It uses anydbm to store the
data. You can use the class directly, or simply call shelve.open():
import shelve
s = shelve.open('test_shelf.db')
try:
s['key1'] = { 'int': 10, 'float':9.5, 'string':'Sample data' }
finally:
s.close()
To access the data again, open the shelf and use it like a dictionary:
import shelve
s = shelve.open('test_shelf.db')
try:
existing = s['key1']
finally:
s.close()
print existing
$ python shelve_create.py
$ python shelve_existing.py
The dbm module does not support multiple applications writing to the same database at the same
time. If you know your client will not be modifying the shelf, you can tell shelve to open the
database read-only.
import shelve
s = shelve.open('test_shelf.db', flag='r')
try:
existing = s['key1']
finally:
s.close()
print existing
If your program tries to modify the database while it is opened read-only, an access error
exception is generated. The exception type depends on the database module selected by anydbm
when the database was created.
Write-back
Shelves do not track modifications to volatile objects, by default. That means if you change the
contents of an item stored in the shelf, you must update the shelf explicitly by storing the item
again.
import shelve
s = shelve.open('test_shelf.db')
try:
print s['key1']
s['key1']['new_value'] = 'this was not here before'
finally:
s.close()
s = shelve.open('test_shelf.db', writeback=True)
try:
print s['key1']
finally:
s.close()
In this example, the dictionary at ‘key1’ is not stored again, so when the shelf is re-opened, the
changes have not been preserved.
$ python shelve_create.py
$ python shelve_withoutwriteback.py
To automatically catch changes to volatile objects stored in the shelf, open the shelf with writeback
enabled. The writeback flag causes the shelf to remember all of the objects retrieved from the
database using an in-memory cache. Each cache object is also written back to the database when
the shelf is closed.
import shelve
s = shelve.open('test_shelf.db', writeback=True)
try:
print s['key1']
s['key1']['new_value'] = 'this was not here before'
print s['key1']
finally:
s.close()
s = shelve.open('test_shelf.db', writeback=True)
try:
print s['key1']
finally:
s.close()
Although it reduces the chance of programmer error, and can make object persistence more
transparent, using writeback mode may not be desirable in every situation. The cache consumes
extra memory while the shelf is open, and pausing to write every cached object back to the
database when it is closed can take extra time. Since there is no way to tell if the cached objects
have been modified, they are all written back. If your application reads data more than it writes,
writeback will add more overhead than you might want.
$ python shelve_create.py
$ python shelve_writeback.py
Chapter 162: Similarities in syntax,
Differences in meaning: Python vs.
JavaScript
Introduction
It sometimes happens that two languages put different meanings on the same or similar syntax
expression. When the both languages are of interest for a programmer, clarifying these bifurcation
points helps to better understand the both languages in their basics and subtleties.
Examples
`in` with lists
2 in [2, 3]
In Python this evaluates to True, but in JavaScript to false. This is because in Python in checks if a
value is contained in a list, so 2 is in [2, 3] as its first element. In JavaScript in is used with objects
and checks if an object contains the property with the name expressed by the value. So JavaScript
considers [2, 3] as an object or a key-value map like this:
{'0': 2, '1': 3}
and checks if it has a property or a key '2' in it. Integer 2 is silently converted to string '2'.
Chapter 163: Simple Mathematical Operators
Introduction
Python supports common mathematical operations natively, including integer and float division,
multiplication, exponentiation, addition, and subtraction. The math module (included in all standard
Python versions) offers expanded functionality like trigonometric functions, root operations,
logarithms, and many more.
Remarks
bool ✓ ✓ ✓ ✓ ✓
int ✓ ✓ ✓ ✓ ✓
fractions.Fraction ✓ ― ✓ ✓ ✓
float ✓ ― ― ✓ ✓
complex ✓ ― ― ― ✓
decimal.Decimal ✓ ― ― ― ―
Examples
Addition
a, b = 1, 2

a + b                    # = 3

import operator          # the operator module provides 2-argument arithmetic functions
operator.add(a, b)       # = 3

a += b                   # a = 3 (equivalent to a = a + b)

# The "+=" operator is equivalent to:
a = operator.iadd(a, b)  # a = 5 since a is set to 3 right before this line
Note: the + operator is also used for concatenating strings, lists and tuples:
Subtraction
a, b = 1, 2

a - b                 # = -1

import operator
operator.sub(a, b)    # = -1
Multiplication
a, b = 2, 3
a * b # = 6
import operator
operator.mul(a, b) # = 6
Possible combinations (builtin types):
Note: The * operator is also used for repeated concatenation of strings, lists, and tuples:
3 * 'ab' # = 'ababab'
3 * ('a', 'b') # = ('a', 'b', 'a', 'b', 'a', 'b')
Division
Python does integer division when both operands are integers. The behavior of Python's division
operators has changed between Python 2.x and 3.x (see also Integer Division).
a, b, c, d, e = 3, 2, 2.0, -3, 10
Python 2.x2.7
In Python 2 the result of the ' / ' operator depends on the type of the numerator and denominator.
a / b # = 1
a / c # = 1.5
d / b # = -2
b / a # = 0
d / e # = -1
Note that because both a and b are ints, the result is an int.
Python 2.x2.2
Recommended:
from __future__ import division # applies Python 3 style division to the entire module
a / b # = 1.5
a // b # = 1
a / (b * 1.0) # = 1.5
1.0 * a / b # = 1.5
a / b * 1.0 # = 1.0 (careful with order of operations)
float(a) / b # = 1.5
a / float(b) # = 1.5
Python 2.x2.2
The ' // ' operator in Python 2 forces floored division regardless of type.
a // b # = 1
a // c # = 1.0
Python 3.x3.0
In Python 3 the / operator performs 'true' division regardless of types. The // operator performs
floor division and maintains type.
a / b # = 1.5
e / b # = 5.0
a // b # = 1
a // c # = 1.0
See PEP 238 for more information.
Exponentiation
a, b = 2, 3
(a ** b) # = 8
pow(a, b) # = 8
import math
math.pow(a, b) # = 8.0 (always float; does not allow complex results)
import operator
operator.pow(a, b) # = 8
Another difference between the built-in pow and math.pow is that the built-in pow can accept three
arguments:

a, b, c = 2, 3, 2

pow(a, b, c)  # 0, computes (2 ** 3) % 2 via efficient modular exponentiation
Special functions
The function math.sqrt(x) calculates the square root of x.
import math
import cmath
c = 4
math.sqrt(c) # = 2.0 (always float; does not allow complex results)
cmath.sqrt(c) # = (2+0j) (always complex)
To compute other roots, such as a cube root, raise the number to the reciprocal of the degree of
the root. This could be done with any of the exponential functions or operator.
import math
x = 8
math.pow(x, 1/3) # evaluates to 2.0
x**(1/3) # evaluates to 2.0
The function math.exp(x) computes e ** x.

math.exp(0)  # 1.0
math.exp(1)  # 2.718281828459045 (e)
The function math.expm1(x) computes e ** x - 1. When x is small, this gives significantly better
precision than math.exp(x) - 1.
math.expm1(0) # 0.0
math.exp(1e-6) - 1 # 1.0000004999621837e-06
math.expm1(1e-6) # 1.0000005000001665e-06
# exact result # 1.000000500000166666708333341666...
Logarithms
By default, the math.log function calculates the logarithm of a number, base e. You can optionally
specify a base as the second argument.
import math
import cmath
math.log(5) # = 1.6094379124341003
# optional base argument. Default is math.e
math.log(5, math.e) # = 1.6094379124341003
cmath.log(5) # = (1.6094379124341003+0j)
math.log(1000, 10) # 3.0 (always returns float)
cmath.log(1000, 10) # (3+0j)
# Logarithm base 2
math.log2(8) # = 3.0
# Logarithm base 10
math.log10(100) # = 2.0
cmath.log10(100) # = (2+0j)
Inplace Operations
It is common within applications to need code like this:

a = a + 1

or

a = a * 2

There is an effective shortcut for these in place operations:

a += 1
# and
a *= 2
Any mathematic operator can be used before the '=' character to make an inplace operation :
• += increment the variable in place
• -= decrement the variable in place
• *= multiply the variable in place
• /= divide the variable in place
• //= floor divide the variable in place # Python 3
• %= return the modulus of the variable in place
• **= raise to a power in place
Other in place operators exist for the bitwise operators (^, | etc)
Trigonometric Functions
a, b = 1, 2
import math
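For example, the basic trigonometric functions operate in radians (values in the comments are approximate):

```python
import math

a, b = 1, 2

math.sin(a)       # 0.8414709848078965, sine of 'a' in radians
math.cos(a)       # 0.5403023058681398, cosine
math.tan(a)       # 1.5574077246549023, tangent
math.asin(1)      # 1.5707963267948966, arc sine (pi / 2)
math.hypot(a, b)  # 2.23606797749979, sqrt(a**2 + b**2)
```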
Note that math.hypot(x, y) is also the length of the vector (or Euclidean distance) from
the origin (0, 0) to the point (x, y).
To compute the Euclidean distance between two points (x1, y1) & (x2, y2) you can
use math.hypot as follows
math.hypot(x2-x1, y2-y1)
To convert from radians -> degrees and degrees -> radians respectively use math.degrees and
math.radians
math.degrees(a)
# Out: 57.29577951308232
math.radians(57.29577951308232)
# Out: 1.0
Modulus
Like in many other languages, Python uses the % operator for calculating modulus.
3 % 4 # 3
10 % 2 # 0
6 % 4 # 2
import operator
operator.mod(3 , 4) # 3
operator.mod(10 , 2) # 0
operator.mod(6 , 4) # 2
-9 % 7 # 5
9 % -7 # -5
-9 % -7 # -2
If you need to find the result of integer division and modulus, you can use the divmod function as a
shortcut:
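For instance:

```python
quotient, remainder = divmod(9, 4)
print(quotient, remainder)  # 2 1, since 9 == 4 * 2 + 1
```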
Chapter 164: Sockets
Introduction
Many programming languages use sockets to communicate across processes or between
devices. This topic explains proper usage of the socket module in Python to facilitate sending
and receiving data over common networking protocols.
Parameters
Parameter Description
socket.AF_INET IPv4
socket.AF_INET6 IPv6
socket.SOCK_STREAM TCP
socket.SOCK_DGRAM UDP
Examples
Sending data via UDP
UDP is a connectionless protocol. Messages to other processes or computers are sent without
establishing any sort of connection. There is no automatic confirmation if your message has been
received. UDP is usually used in latency-sensitive applications or in applications sending
network-wide broadcasts.
The following code sends a message to a process listening on localhost port 6667 using UDP
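A minimal sketch of such a sender (the message bytes here are chosen for illustration; nothing needs to be listening for the send itself to succeed):

```python
import socket

# SOCK_DGRAM selects UDP; no connection is established
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(b"Hello", ('localhost', 6667))
```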
Note that there is no need to "close" the socket after the send, because UDP is connectionless.
UDP is a connectionless protocol. This means that peers sending messages do not need to
establish a connection before sending messages. socket.recvfrom thus returns a tuple (msg [the
message the socket received], addr [the address of the sender]):
import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(('localhost', 6667))
while True:
    msg, addr = sock.recvfrom(8192)  # 8192 is the maximum number of bytes to read
    print("Got message from %s: %s" % (addr, msg))
from socketserver import BaseRequestHandler  # SocketServer in Python 2

class MyHandler(BaseRequestHandler):
    def handle(self):
        print("Got connection from: %s" % self.client_address)
        msg, sock = self.request
        print("It said: %s" % msg)
        sock.sendto("Got your message!".encode(), self.client_address)  # Send reply
By default, sockets block. This means that execution of the script will wait until the socket receives
data.
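If waiting indefinitely is undesirable, a timeout can be set on the socket; this sketch (the 0.5-second value is arbitrary) shows recvfrom raising socket.timeout when no data arrives in time:

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(('localhost', 0))  # port 0: let the OS pick a free port
sock.settimeout(0.5)         # wait at most 0.5 seconds for data

try:
    msg, addr = sock.recvfrom(8192)
except socket.timeout:
    print("No data received within 0.5 seconds")
```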
Sending data over the internet is made possible using multiple modules. The socket module
provides low-level access to the underlying operating system operations responsible for sending
or receiving data from other computers or processes.
The following code sends the byte string b'Hello' to a TCP server listening on port 6667 on the
host localhost and closes the connection when finished:
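A sketch of such a client; to keep the snippet self-contained and runnable, it first starts a throwaway server thread on an OS-chosen port (in the scenario described, a real server would already be listening on 6667):

```python
import socket
import threading

# throwaway server so the snippet runs on its own
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('localhost', 0))      # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

def accept_once():
    conn, _ = server.accept()
    print("server got:", conn.recv(1024))
    conn.close()

t = threading.Thread(target=accept_once)
t.start()

# the client part described in the text
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('localhost', port))
s.send(b'Hello')
s.close()

t.join()
server.close()
```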
Socket output is blocking by default; that means the program will wait in the connect and send
calls until the action is 'completed'. For connect that means the server actually accepting the
connection. For send it only means that the operating system has enough buffer space to queue
the data to be sent later.
When run with no arguments, this program starts a TCP socket server that listens for connections
to 127.0.0.1 on port 5000. The server handles each connection in a separate thread.
When run with the -c argument, this program connects to the server, reads the client list, and
prints it out. The client list is transferred as a JSON string. The client name may be specified by
passing the -n argument. By passing different names, the effect on the client list may be observed.
client_list.py
import argparse
import json
import socket
import threading
def server(client_list):
print "Starting server..."
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('127.0.0.1', 5000))
s.listen(5)
while True:
(conn, address) = s.accept()
t = threading.Thread(target=handle_client, args=(client_list, conn, address))
t.daemon = True
t.start()
def handle_client(client_list, conn, address):
    # handler referenced above: register the client, then return the list as JSON
    name = conn.recv(1024)
    entry = dict(zip(['name', 'address', 'port'], [name, address[0], address[1]]))
    client_list[name] = entry
    conn.sendall(json.dumps(client_list))
    conn.shutdown(socket.SHUT_RDWR)
    conn.close()
def client(name):
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('127.0.0.1', 5000))
s.send(name)
data = s.recv(1024)
result = json.loads(data)
print json.dumps(result, indent=4)
def parse_arguments():
parser = argparse.ArgumentParser()
parser.add_argument('-c', dest='client', action='store_true')
parser.add_argument('-n', dest='name', type=str, default='name')
result = parser.parse_args()
return result
def main():
client_list = dict()
args = parse_arguments()
if args.client:
client(args.name)
else:
try:
server(client_list)
except KeyboardInterrupt:
print "Keyboard interrupt"
if __name__ == '__main__':
main()
Server Output
$ python client_list.py
Starting server...
Client Output
The receive buffers are limited to 1024 bytes. If the JSON string representation of the client list
exceeds this size, it will be truncated. This will cause the following exception to be raised:
ValueError: Unterminated string starting at: line 1 column 1023 (char 1022)
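One way to avoid this limit (not part of the original program) is to loop over recv until the peer closes the connection, sketched here:

```python
def recv_all(sock):
    # Read 1024-byte chunks until the peer closes its end of the connection
    chunks = []
    while True:
        data = sock.recv(1024)
        if not data:
            break
        chunks.append(data)
    return b''.join(chunks)
```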
#!/usr/bin/env python
# Sending raw Ethernet frames requires root privileges
from socket import socket, AF_PACKET, SOCK_RAW
s = socket(AF_PACKET, SOCK_RAW)
s.bind(("eth1", 0))
# dst_addr, src_addr, ethertype, payload and checksum must be built beforehand
s.send(dst_addr + src_addr + ethertype + payload + checksum)
Chapter 165: Sockets And Message
Encryption/Decryption Between Client and
Server
Introduction
Cryptography is used for security purposes. There are not many examples of
Encryption/Decryption in Python using IDEA encryption in CTR mode. Aim of this documentation
:
Extend and implement the RSA Digital Signature scheme in station-to-station communication.
Use SHA-1 hashing for message integrity. Produce a simple key transport protocol.
Encrypt the key with IDEA encryption, with Counter Mode as the block cipher mode.
Remarks
Language Used: Python 2.7 (Download Link: https://www.python.org/downloads/ )
Libraries Used: PyCrypto, CryptoPlus (both are imported by the code below)
Library Installation:
PyCrypto: Unzip the file. Go to the directory and open a terminal (Alt+Ctrl+T) on Linux, or a
command prompt (Shift+right click, then "Open command window here") on Windows. Then run python
setup.py install (make sure the Python environment is set up properly on Windows)
Tasks Implementation: The task is separated into two parts. One is handshake process and
another one is communication process. Socket Setup:
• After creating the public and private keys and hashing the public key, we need to set up
the socket. To set up the socket, we need to import another module with "import
socket" and connect (for the client) or bind (for the server) the IP address and the port
obtained from the user.
----------Client Side----------
server = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
host = raw_input("Server Address To Be Connected -> ")
port = int(input("Port of The Server -> "))
server.connect((host, port))
----------Server Side---------
try:
#setting up socket
server = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
server.bind((host,port))
server.listen(5)
except BaseException:
    print "-----Check Server Address or Port-----"
Handshake Process:
• (CLIENT) The first task is to create the public and private keys. To create them, we
have to import some modules: from Crypto import Random and from
Crypto.PublicKey import RSA. To create the keys, we write a few simple lines of
code:
random_generator = Random.new().read
key = RSA.generate(1024,random_generator)
public = key.publickey().exportKey()
random_generator is derived from the "from Crypto import Random" module. key is derived from
"from Crypto.PublicKey import RSA", which creates a private key of size 1024 using the random
character generator. public is the public key exported from the previously generated private key.
• (CLIENT) After creating the public and private keys, we have to hash the public key to send
it over to the server using a SHA-1 hash. To use SHA-1 we need to import another
module by writing "import hashlib". Hashing the public key takes two lines of code:
hash_object = hashlib.sha1(public)
hex_digest = hash_object.hexdigest()
Here hash_object and hex_digest are our variables. After this, the client will send hex_digest and public
to the server, and the server will verify them by comparing the hash received from the client with a new
hash of the public key. If the new hash and the hash from the client match, it will move to the next
procedure. As the public key sent from the client is in the form of a string, it cannot be used as a
key on the server side directly. To convert the string public key to an RSA public key, we need
to write server_public_key = RSA.importKey(getpbk), where getpbk is the public key received from the client.
• (SERVER) The next step is to create a session key. Here, I have used the "os" module to create
a random key, key = os.urandom(16), which will give us a 16-byte key, and after that I
have encrypted that key in "AES.MODE_CTR" and hashed it again with SHA-1:
#encrypt CTR MODE session key
en = AES.new(key_128, AES.MODE_CTR, counter=lambda: key_128)
encrypto = en.encrypt(key_128)
#hashing sha1
en_object = hashlib.sha1(encrypto)
en_digest = en_object.hexdigest()
• (SERVER) The final part of the handshake process is to encrypt the public key received from
the client and the session key created on the server side.
After encrypting, the server will send the key to the client as a string.
• (CLIENT) After getting the encrypted string of (public and session key) from the server, the client
will decrypt it using the private key which was created earlier along with the public key. As
the encrypted (public and session key) was in the form of a string, we have to get it back as a
key by using eval(). Once the decryption is done, the handshake process is complete, as
both sides have confirmed that they are using the same keys. To decrypt:
en = eval(msg)
decrypt = key.decrypt(en)
# hashing sha1
en_object = hashlib.sha1(decrypt)
en_digest = en_object.hexdigest()
I have used the SHA-1 here so that it will be readable in the output.
Communication Process:
For the communication process, we have to use the session key from both sides as the KEY for IDEA
encryption in MODE_CTR. Both sides will encrypt and decrypt messages with IDEA.MODE_CTR
using the session key.
• (Encryption) IDEA encryption needs a 16-byte key and a counter that must be callable.
The counter is mandatory in MODE_CTR. The session key that we encrypted and hashed is now
40 characters long, which exceeds the key-size limit of IDEA. Hence, we need to reduce
the size of the session key. To reduce it, we can use the normal Python built-in slicing
syntax string[value:value], where the values can be chosen freely. In
our case, I have used key[:16], which takes the first 16 characters of the key. This
could also be done in other ways, such as key[1:17] or key[16:]. The next part is to create a new
IDEA cipher object by writing IDEA.new(), which takes 3 arguments for processing.
The first argument is the KEY, the second argument is the mode of IDEA encryption (in
our case, IDEA.MODE_CTR) and the third argument is counter=, which must be a
callable function. counter= returns a string of the required size. To define counter=, we
must use reasonable values; in this case, I have used the size of the KEY by defining a lambda.
Instead of a lambda, we could use
Counter.Util from the crypto module, which generates random values for counter=.
Hence, the code will be:
Once "ideaEncrypt" is defined as our IDEA cipher object, we can use its built-in encrypt
function to encrypt any message.
eMsg = ideaEncrypt.encrypt(whole)
#converting the encrypted message to HEXADECIMAL to make it readable
eMsg = eMsg.encode("hex").upper()
In this code segment, whole is the message to be encrypted and eMsg is the encrypted message.
After encrypting the message, I have converted it into hexadecimal to make it readable, and
upper() is the built-in function to uppercase the characters. After that, this encrypted
message is sent to the opposite station for decryption.
• (Decryption)
To decrypt the encrypted messages, we will need to create another cipher object using
the same arguments and the same key, but this time the object will decrypt the encrypted messages.
The code for this is the same as last time. However, before decrypting the messages, we need to
decode the message from hexadecimal, because in the encryption part we encoded the encrypted
message in hexadecimal to make it readable. Hence, the whole code will be:
decoded = newmess.decode("hex")
ideaDecrypt = IDEA.new(key, IDEA.MODE_CTR, counter=lambda: key)
dMsg = ideaDecrypt.decrypt(decoded)
These processes are carried out on both the server and the client side for encrypting and decrypting.
Examples
Server side Implementation
import socket
import hashlib
import os
import time
import itertools
import threading
import sys
import Crypto.Cipher.AES as AES
from Crypto.PublicKey import RSA
from CryptoPlus.Cipher import IDEA
done = False
def animate():
for c in itertools.cycle(['....','.......','..........','............']):
if done:
break
sys.stdout.write('\rCHECKING IP ADDRESS AND NOT USED PORT '+c)
sys.stdout.flush()
time.sleep(0.1)
sys.stdout.write('\r -----SERVER STARTED. WAITING FOR CLIENT-----\n')
#getting the hostname and port from the user (prompt wording is illustrative)
host = raw_input("Server Address -> ")
port = int(input("Port -> "))
try:
#setting up socket
server = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
server.bind((host,port))
server.listen(5)
check = True
except BaseException:
print "-----Check Server Address or Port-----"
check = False
if check is True:
# server Quit
shutdown = False
# printing "Server Started Message"
thread_load = threading.Thread(target=animate)
thread_load.start()
time.sleep(4)
done = True
#binding client and address
client,address = server.accept()
print ("CLIENT IS CONNECTED. CLIENT'S ADDRESS ->",address)
print ("\n-----WAITING FOR PUBLIC KEY & PUBLIC KEY HASH-----\n")
#receiving the public key from the client
getpbk = client.recv(2048)
#hashing the public key in server side for validating the hash from client
hash_object = hashlib.sha1(getpbk)
hex_digest = hash_object.hexdigest()
if getpbk != "":
print (getpbk)
client.send("YES")
gethash = client.recv(1024)
print ("\n-----HASH OF PUBLIC KEY----- \n"+gethash)
if hex_digest == gethash:
#converting the string public key from the client into an RSA key object
server_public_key = RSA.importKey(getpbk)
# creating session key
key_128 = os.urandom(16)
#encrypt CTR MODE session key
en = AES.new(key_128,AES.MODE_CTR,counter = lambda:key_128)
encrypto = en.encrypt(key_128)
#hashing sha1
en_object = hashlib.sha1(encrypto)
en_digest = en_object.hexdigest()
#encrypting session key and public key
E = server_public_key.encrypt(encrypto,16)
print ("\n-----ENCRYPTED PUBLIC KEY AND SESSION KEY-----\n"+str(E))
print ("\n-----HANDSHAKE COMPLETE-----")
client.send(str(E))
while True:
#message from client
newmess = client.recv(1024)
#decoding the message from HEXADECIMAL to decrypt the encrypted version of the message only
decoded = newmess.decode("hex")
#making en_digest(session_key) as the key
key = en_digest[:16]
print ("\nENCRYPTED MESSAGE FROM CLIENT -> "+newmess)
#decrypting message from the client
ideaDecrypt = IDEA.new(key, IDEA.MODE_CTR, counter=lambda: key)
dMsg = ideaDecrypt.decrypt(decoded)
print ("\n**New Message** "+time.ctime(time.time()) +" > "+dMsg+"\n")
mess = raw_input("\nMessage To Client -> ")
if mess != "":
ideaEncrypt = IDEA.new(key, IDEA.MODE_CTR, counter=lambda : key)
eMsg = ideaEncrypt.encrypt(mess)
eMsg = eMsg.encode("hex").upper()
if eMsg != "":
print ("ENCRYPTED MESSAGE TO CLIENT-> " + eMsg)
client.send(eMsg)
client.close()
else:
print ("\n-----PUBLIC KEY HASH DOES NOT MATCH-----\n")
import time
import socket
import threading
import hashlib
import itertools
import sys
from Crypto import Random
from Crypto.PublicKey import RSA
from CryptoPlus.Cipher import IDEA
#animating loading
done = False
def animate():
for c in itertools.cycle(['....','.......','..........','............']):
if done:
break
sys.stdout.write('\rCONFIRMING CONNECTION TO SERVER '+c)
sys.stdout.flush()
time.sleep(0.1)
#creating the public and private keys
random_generator = Random.new().read
key = RSA.generate(1024, random_generator)
public = key.publickey().exportKey()
#hashing the public key
hash_object = hashlib.sha1(public)
hex_digest = hash_object.hexdigest()
#Setting up socket
server = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
host = raw_input("Server Address To Be Connected -> ")
port = int(input("Port of The Server -> "))
server.connect((host, port))
time.sleep(4)
done = True
def send(t,name,key):
mess = raw_input(name + " : ")
key = key[:16]
#merging the message and the name
whole = name+" : "+mess
ideaEncrypt = IDEA.new(key, IDEA.MODE_CTR, counter=lambda : key)
eMsg = ideaEncrypt.encrypt(whole)
#converting the encrypted message to HEXADECIMAL to readable
eMsg = eMsg.encode("hex").upper()
if eMsg != "":
print ("ENCRYPTED MESSAGE TO SERVER-> "+eMsg)
server.send(eMsg)
def recv(t,key):
newmess = server.recv(1024)
print ("\nENCRYPTED MESSAGE FROM SERVER-> " + newmess)
key = key[:16]
decoded = newmess.decode("hex")
ideaDecrypt = IDEA.new(key, IDEA.MODE_CTR, counter=lambda: key)
dMsg = ideaDecrypt.decrypt(decoded)
print ("\n**New Message From Server** " + time.ctime(time.time()) + " : " + dMsg + "\n")
while True:
server.send(public)
confirm = server.recv(1024)
if confirm == "YES":
server.send(hex_digest)
#connected msg
msg = server.recv(1024)
en = eval(msg)
decrypt = key.decrypt(en)
# hashing sha1
en_object = hashlib.sha1(decrypt)
en_digest = en_object.hexdigest()
while True:
thread_send = threading.Thread(target=send, args=("------Sending Message------", alais, en_digest))
thread_recv = threading.Thread(target=recv, args=("------Receiving Message------", en_digest))
thread_send.start()
thread_recv.start()
thread_send.join()
thread_recv.join()
time.sleep(0.5)
time.sleep(60)
server.close()
Read Sockets And Message Encryption/Decryption Between Client and Server online:
https://riptutorial.com/python/topic/8710/sockets-and-message-encryption-decryption-between-
client-and-server
Chapter 166: Sorting, Minimum and Maximum
Examples
Getting the minimum or maximum of several values
min(7,2,1,5)
# Output: 1
max(7,2,1,5)
# Output: 7
If you want to find the minimum or maximum by a specific element in each sequence, use the key argument:
import operator
# The operator module contains efficient alternatives to the lambda function
list_of_tuples = [(0, 10), (1, 15), (2, 8)]
max(list_of_tuples, key=operator.itemgetter(0)) # Sorting by first element
# Output: (2, 8)
min([])
# Raises: ValueError: min() arg is an empty sequence
However, with Python 3, you can pass in the keyword argument default with a value that will be
returned if the sequence is empty, instead of raising an exception:
max([], default=42)
# Output: 42
max([], default=0)
# Output: 0
Getting the minimum or maximum or using sorted depends on iterations over the object. In the
case of dict, the iteration is only over the keys:
adict = {'a': 3, 'b': 5, 'c': 1}
min(adict)
# Output: 'a'
max(adict)
# Output: 'c'
To keep the dictionary structure, you have to iterate over the .items():
min(adict.items())
# Output: ('a', 3)
max(adict.items())
# Output: ('c', 1)
sorted(adict.items())
# Output: [('a', 3), ('b', 5), ('c', 1)]
For sorted, you could create an OrderedDict to keep the sorting while having a dict-like structure:
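For example, a sketch (reusing the adict values from above):

```python
from collections import OrderedDict

adict = {'a': 3, 'b': 5, 'c': 1}
# sort the items by key, then keep that order in a dict-like structure
ordered = OrderedDict(sorted(adict.items()))
print(list(ordered.items()))
# [('a', 3), ('b', 5), ('c', 1)]
```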
By value
Again this is possible using the key argument:
max(adict.items(), key=operator.itemgetter(1))
# Output: ('b', 5)
sorted(adict.items(), key=operator.itemgetter(1), reverse=True)
# Output: [('b', 5), ('a', 3), ('c', 1)]
sorted('bdca') # string
# Output: ['a','b','c','d']
The result is always a new list; the original data remains unchanged.
Getting the minimum of a sequence (iterable) is equivalent of accessing the first element of a
sorted sequence:
min([2, 7, 5])
# Output: 2
sorted([2, 7, 5])[0]
# Output: 2
The maximum is a bit more complicated, because sorted keeps order and max returns the first
encountered value. In case there are no duplicates the maximum is the same as the last element
of the sorted return:
max([2, 7, 5])
# Output: 7
sorted([2, 7, 5])[-1]
# Output: 7
But not if there are multiple elements that are evaluated as having the maximum value:
class MyClass(object):
def __init__(self, value, name):
self.value = value
self.name = name
def __repr__(self):
return str(self.name)
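Using the MyClass just defined, a sketch of the difference (the class is repeated so the block runs on its own; values and names are chosen for illustration): max returns the first element with the maximum key, while sorted, being stable, leaves the last tied element at the end:

```python
class MyClass(object):
    def __init__(self, value, name):
        self.value = value
        self.name = name

    def __repr__(self):
        return str(self.name)

collection = [MyClass(4, 'first'), MyClass(4, 'second'), MyClass(2, 'third')]

print(max(collection, key=lambda x: x.value))         # first
print(sorted(collection, key=lambda x: x.value)[-1])  # second
```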
Any iterable containing elements that support < or > operations is allowed.

min, max, and sorted all need the objects to be orderable. To be properly orderable, the class needs
to define all of the 6 methods __lt__, __gt__, __ge__, __le__, __ne__ and __eq__:
class IntegerContainer(object):
def __init__(self, value):
self.value = value
def __repr__(self):
return "{}({})".format(self.__class__.__name__, self.value)
Though implementing all these methods would seem unnecessary, omitting some of them will
make your code prone to bugs.
Examples:
res = max(alist)
# Out: IntegerContainer(3) - Test greater than IntegerContainer(5)
# IntegerContainer(10) - Test greater than IntegerContainer(5)
# IntegerContainer(7) - Test greater than IntegerContainer(10)
print(res)
# Out: IntegerContainer(10)
res = min(alist)
# Out: IntegerContainer(3) - Test less than IntegerContainer(5)
# IntegerContainer(10) - Test less than IntegerContainer(3)
# IntegerContainer(7) - Test less than IntegerContainer(3)
print(res)
# Out: IntegerContainer(3)
res = sorted(alist)
# Out: IntegerContainer(3) - Test less than IntegerContainer(5)
# IntegerContainer(10) - Test less than IntegerContainer(3)
# IntegerContainer(10) - Test less than IntegerContainer(5)
# IntegerContainer(7) - Test less than IntegerContainer(5)
# IntegerContainer(7) - Test less than IntegerContainer(10)
print(res)
# Out: [IntegerContainer(3), IntegerContainer(5), IntegerContainer(7), IntegerContainer(10)]
But min, max, and sorted can use __gt__ instead if __lt__ is not implemented:
res = min(alist)
# Out: IntegerContainer(5) - Test greater than IntegerContainer(3)
# IntegerContainer(3) - Test greater than IntegerContainer(10)
# IntegerContainer(3) - Test greater than IntegerContainer(7)
print(res)
# Out: IntegerContainer(3)
Sorting methods will raise a TypeError if neither __lt__ nor __gt__ are implemented:
res = min(alist)
The functools.total_ordering decorator can be used to simplify the effort of writing these rich
comparison methods. If you decorate your class with total_ordering, you need to implement only
__eq__ and one of __lt__, __le__, __ge__ or __gt__, and the decorator will fill in the rest:
import functools
@functools.total_ordering
class IntegerContainer(object):
def __init__(self, value):
self.value = value
def __repr__(self):
return "{}({})".format(self.__class__.__name__, self.value)
Notice how the > (greater than) now ends up calling the less than method, and in some cases
even the __eq__ method. This also means that if speed is of great importance, you should
implement each rich comparison method yourself.
To find some number (more than one) of largest or smallest values of an iterable, you can use the
nlargest and nsmallest of the heapq module:
import heapq
heapq.nlargest(5, range(10))
# Output: [9, 8, 7, 6, 5]
heapq.nsmallest(5, range(10))
# Output: [0, 1, 2, 3, 4]
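Both functions also accept a key argument; a sketch with made-up data:

```python
import heapq

words = ['python', 'is', 'a', 'versatile', 'language']

# the two longest and the two shortest words
print(heapq.nlargest(2, words, key=len))
# ['versatile', 'language']
print(heapq.nsmallest(2, words, key=len))
# ['a', 'is']
```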
This is much more efficient than sorting the whole iterable and then slicing from the end or
beginning. Internally these functions use the binary heap priority queue data structure, which is