T.R. Padmanabhan-Programming With Python-Springer (2016)
T.R. Padmanabhan
Amrita University
Coimbatore, Tamil Nadu
India
People, notwithstanding caste, creed, gender, ethnic diversities, and nationalities,
have been interacting intensely in recent decades, identifying commonalities,
accommodating differences, and making common cause. Python stands out as a shining
outcome of such distributed but focused co-ordination. It started with an idea,
"Simplicity at lofty heights" (my view), that occurred to Guido van Rossum, who
continues to be the accepted benevolent dictator for life (BDFL) for the Python
community. It is not that anyone can join this bandwagon and contribute; it is not
that easy. You can suggest a contribution, but its pros and cons are discussed in an
open forum through the net and (in the accepted shape) it enters the "Holy Book" as a
PEP (Python Enhancement Proposal). The (open) Holy Book continues to grow in size,
shedding better light. It is a thrill to know how well it is evolving and to feel or
participate in its lustre. Python shines with the layers for its use: simple for the
novice, versatile for the programmer, added facilities for the developer, openness for
a Python sculptor. It has a varied and versatile data structure, a vast library, a huge
collection of additional resources, and above all OPENNESS. So embrace Python,
the language by the people, of the people, for the people.
Definitely this is not justification enough for another book on Python. The
variety of data structures and the flexibility and vastness of the modules in the
Python library are daunting. The most common features of Python have been dealt
with in this book, bringing out their subtleties, their potential, and their
suitability for varied use through illustrations. Nothing is glossed over. One can go
through the illustrative examples, repeat them in toto, or run their variants at one's
own pace and progress. The matter has been presented in a logical and graded manner.
Some of the exercises at the ends of chapters are pedagogical. But many of them call
for more effort and are perhaps candidates for minor projects. Concepts associated with
constructs like yield, iterator, generator, decorator, super (inheritance), and format
(Python 3) are often considered to be abstract and difficult to digest. A conscious
effort has been made to explain these through apt examples. The associated exercises
complement these in different ways. Any feedback by way of corrections,
clarifications, or queries is welcome (blog: nahtap.blogspot.com).
I am grateful to Prof. K. Gangadharan of Amrita University for having opened my
eyes to the openness of open systems. This book is an offshoot of this. In many
ways, I am indebted to my students and colleagues over the decades; discussions
with them, often spurred by a query, have been immensely helpful in honing my
understanding and clarifying concepts. Implicitly the same is reflected in the book
as well. I thank Suvira Srivastav and Praveen Kumar for steering the book through
the Processes in Springer.
Lastly (but not priority wise) my thanks are due to my wife Uma for her
unwavering and sustained accommodation of my oddities.
Contents
1 Python: A Calculator
Reference
2 Algebra with Variables
2.1 Variables
2.2 Complex Quantities
2.3 Common Functions with Numbers
2.4 Logical Operators
2.5 Strings and Printing
2.6 Exercises
References
3 Simple Programs
3.1 Basic Program Structure
3.2 Flow Chart
3.3 Conditional Operations
3.4 Iterative Routines
3.5 Exercises
References
4 Functions and Modules
4.1 Functions
4.1.1 Lambda Function
4.1.2 Recursion
4.1.3 Nested Functions
4.1.4 Nested Scope
4.2 Modules
4.2.1 Built-in Modules
4.2.2 Math Module
4.3 Exercises
References
5 Sequences and Operations with Sequences
5.1 String
5.2 Tuple
5.3 List
5.4 Dictionary
5.5 Set
5.6 Operators with Sequences
5.6.1 All and Any
5.6.2 sum and eval
5.7 Iterator
5.8 Iterator Functions
5.9 Generators
5.10 Hashing
5.11 Input
5.12 Exercises
References
6 Additional Operations with Sequences
6.1 Slicing
6.2 Reversing
6.3 Sorting
6.4 Operations with Sequences
6.4.1 Max() and Min() Functions
6.4.2 Additional Operations with Sequences
6.5 Operations with Sets
6.6 Frozensets
6.7 Tests and Comparisons with Sets and Frozensets
6.8 Operations with Dictionaries
6.9 *Arg and **Kwarg
6.10 Exercises
References
7 Operations for Text Processing
7.1 Unicode
7.2 Coding
7.2.1 UTF-8
7.3 Operations with strings
7.4 Number Representations and Conversions
7.4.1 Integers
7.4.2 Floating Point Numbers
7.5 More String Operations
7.6 bytes and bytearrays
7.7 Other Operations with Sequences
7.8 string Module
7.9 Exercises
References
8 Operations with Files
8.1 Printing
8.2 String Formatting
8.2.1 Formatting: Version I
8.2.2 Formatting: Version II
8.3 Files and Related Operations
8.3.1 String/Text Files
8.4 Exercises
Reference
9 Application Modules
9.1 random Module
9.1.1 Distribution Functions
9.2 statistics Module
9.3 Array Module
9.4 bisect Module
9.5 heapq Module
9.6 Exercises
References
10 Classes and Objects
10.1 Objects
10.2 Classes
10.2.1 Instantiation
10.3 Functions with Attributes
10.4 pass: Place Holder
10.5 Overloading
10.5.1 Overloading in Basic Python
10.6 Inheritance
10.6.1 Multiple Inheritances
10.7 super()
10.8 Execution from Command Line
10.9 Exercises
Reference
11 Time Related Operations
11.1 Time Standards
11.2 time Module
11.3 datetime Module
11.3.1 time Objects
11.3.2 datetime Objects
11.3.3 Time Intervals
11.3.4 tzinfo
11.3.5 Algebra with Time Objects
11.4 Calendars
11.5 timeit Module
11.6 Exercises
References
12 Functional Programming Aids
12.1 operator Module
12.1.1 Generic Methods
12.1.2 Inplace Operators
12.2 itertools
12.2.1 Filtering
12.3 generator Using yield
12.4 iterator Formation
12.5 decorators
12.6 functools
12.6.1 total_ordering
12.6.2 single dispatch Generic Function
12.6.3 partial Objects
12.6.4 Reduction Functions
12.7 Exercises
References
Index
Chapter 1
Python: A Calculator
Computer languages have so far been of the interpreted or the compiled type.
Compiled languages (like C) have been more common. You prepare a program,
save it (the debugged version), and (when needed) call it for running (or execution).
Prior to running, the compiler compiles the program as a whole. In the interpreted
versions (like Basic) you give a command, and it is executed then and there
(interpreted).
Python functions in both forms; basically you run it in the interpreter mode.
When needed, written and ready-to-run modules/functions can be called up to
join the interpreted sequence.
Let us consider the interpreted functioning. The Python running environment can be
opened by typing python3 and pressing the Enter key. The Python environment opens
and the Python prompt >>> appears at the left end of the screen. The basic
information regarding the version of Python precedes this. We can safely ignore
this, at least for the present.
One of the simplest yet powerful uses of Python is to do calculations, as with a
calculator. Let us go through an interactive session in Python (Rossum and Drake
2014). The session details are reproduced in Fig. 1.1 in the same order. The
numerals in the sequence are not on the screen per se but have been added at the
right end to facilitate explanations. Throughout this book an integer within square
brackets, as in [1], refers to the line in the interpreted sequence under discussion.
Let us understand the sequence in Fig. 1.1 by going through it in the same order.
You keyed in 3 + 4 in [1], as you do with a calculator, and pressed the Enter key.
Python carried out the algebra you desired and returned the result as 7, which
appears in line [2], the next line. Having completed the assigned task as a
calculator, Python proceeds to the next line and outputs the prompt sign >>> [3]
as though it says "I am ready for the next assignment". You continue the session,
as a calculator, through the steps shown. The following can be understood from
the sequence shown:
trp@trp-Veriton-Series:~$ python3
Python 3.4.2 (default, Oct 30 2014, 15:27:09)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more
information.
>>> 3+4 [1]
7 [2]
>>> 4-7 [3]
-3 [4]
>>> 7*3 [5]
21 [6]
>>> 4-7*3 [7]
-17 [8]
>>> 21/3 [9]
7.0 [10]
>>> 8+4*2-2*7 [11]
2 [12]
>>> 8+4*2-4/2 [13]
14.0 [14]
>>> 4--2 [15]
6 [16]
>>> 4-3.0 [17]
1.0
>>> 4 + 9 [18]
13
>>> 4 +
9 [19]
13 [20]
>>> 4+
File "<stdin>", line 1
4+
^
SyntaxError: invalid syntax
>>> 4+9 [21]
File "<stdin>", line 1
4+9
^
Fig. 1.2 Extract from a Python Interpreter sequence showing an erroneous insertion of space at
start of a command line
Fig. 1.3 A Python Interpreter sequence bringing out additional calculator type operations
Python (as in any other computer language) are only for the user's
understanding.
The operator // returns the floor value of the quotient of a division operation.
Division of 23 by four in [3] yields five as the quotient, and 5 is an integer here.
Division of 23 by 4.2 in [4] also yields 5 as the quotient, returned as 5.0, which is
in floating point mode. The interpreter interprets the result to be in floating point
form due to the divisor being in floating point form.
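The behaviour of // described above can be tried out with a short sketch like the one below. The first two expressions are the ones discussed in the text; the true division and the negative case are additions made here for contrast, and the variable names are only illustrative.

```python
# Floor division (//) compared with true division (/).
q_int = 23 // 4        # both operands are integers -> integer quotient
q_float = 23 // 4.2    # a floating point divisor -> floating point quotient
true_div = 23 / 4      # true division always returns a float

print(q_int, type(q_int))        # 5 <class 'int'>
print(q_float, type(q_float))    # 5.0 <class 'float'>
print(true_div)                  # 5.75

# The floor is taken towards minus infinity, not towards zero.
neg_q = -23 // 4
print(neg_q)                     # -6
```

The last line is worth noting: with a negative operand the floored quotient is one less than what simple truncation would give.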
A clarification regarding number representation is in order here. Integers can be
entered and represented as such; +12 as well as 12 is taken as the positive integer
12, the positive sign before the integer being optional. But -12 is taken as a
negative number, the negative sign preceding a negative number being mandatory.
Reference
2.1 Variables
One can define variables, assign values to them, and do algebra. Consider the
sequence in Fig. 2.1. [1] has a variable with the assigned name a. It has been
assigned the integral value 3. There is no need to assign a variable tag or assign a
type to it. From the statement in [1] Python understands all these. In [2] we are
putting a query to Python: "What is a?" The Python interpreter returns the value
assigned to a. In [3] we are passing on the query "What type of object is this entity
a?". The interpreter returns with the clarification that a belongs to the class of
objects termed int (integer). Such type queries can be made whenever desired to
understand the class (identity) of any object.
In [4] a variable has been given the name Aa and it is assigned the integral value
4. In general a variable can be given a name, called an Identifier, as a sequence of
letters, digits, and underscores; other ASCII characters, such as $ and ?, cannot be
used. The preferred practice is to use Identifiers for variables as well as other
entities that we use in a language like Python such that the variable/entity can be
easily identified from it. The constraints in the selection here are
• The first character has to be a small or a capital letter or _ (underscore).
• The characters $ and ? cannot be used in an Identifier.
• The Identifier should not begin or end with a pair of underscores. In fact these
  are reserved for specific use (described later).
• A specific set of combinations of letters is used as keywords in Python (van
  Rossum and Drake 2014a). These are to be avoided as Identifiers. Table 2.1
  is the set of all the keywords in Python. Avoiding their use directly or in
  combinations is healthy programming practice. The same holds good of built-in
  function names such as abs, repr, chr, divmod, float, and so on.
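The constraints listed above can be checked mechanically. The sketch below is an illustration, not part of the original session: it uses the standard keyword module and the string method isidentifier(). The helper is_usable_name and the list of candidate names are hypothetical, chosen only to exercise each rule.

```python
import keyword

# Candidate names chosen for illustration; only a, Aa, and d_e appear in the text.
candidates = ["a", "Aa", "d_e", "_count", "2fast", "cost$", "why?", "for"]

def is_usable_name(name):
    """Return True if name is a syntactically valid identifier and not a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

results = {name: is_usable_name(name) for name in candidates}
for name, ok in results.items():
    print(name, ok)

# keyword.kwlist holds the keywords of the running Python version.
print(keyword.kwlist)
```

Names such as 2fast, cost$, and why? fail the syntax check, while for fails because it is a keyword. Built-in function names like abs pass both checks, which is why avoiding them remains a matter of good practice and is not enforced by the language.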
b in [5], c in [6], and d_e in [13] are other examples of such Identifiers.
Identifiers are case sensitive; a and A are different variables. [5] defines a variable
b and assigns the value 4.1 to it. Python automatically takes b as a floating point
variable. The same is clarified by the type(b) query and the clarification offered by
Python in the two lines following. Algebra with variables can be carried out as with
integers. In [6] the values of a and b are added and assigned to a new variable c.
Once again there is no need for a separate declaration, type clarification, and so
on. [7] and [8] in the following lines clarify this. Chain algebra can be carried out
and assigned to variables (if necessary, new ones) as can be seen from [8] and the
following lines. [9] and the following lines are further examples of this. [10] has
two variables assigned values in a sequence. Such sequential assignments can be done
for any number of variables. Python will decide the type of each variable and assign
values to them conforming to the sequence specified. The type queries [11] and [12]
and the Python responses in the lines that follow clarify this. In [13] d_e is
assigned the value (d/e). d being a floating point variable (with value 5.1, as can be
seen from [10]), d_e is automatically taken as a floating point variable and assigned
the result. The query and response that follow confirm this. d1 and e1 are assigned
values 4.2 and 5.3 in [14]. The following lines reassign values to them. Note that the
assignments to d1 and e1 have been interchanged without the use of an intermediate
temporary storage. This is not limited/restricted to numerical assignments alone. In
[15] the new value of e1 is d1-2 with d1 having the value prior to the present
assignment. [16] is another example of multiple assignments done concurrently. The
Python execution sequence following confirms this. In [17] d2 and e2 are assigned the
same value of 4.78. Such a sequence of assignments is also possible. The combination
operator += in [18] assigns a new value to d3 as d3 = d3 + 1. The same holds good of
the combination operators -=, *=, and /= as can be seen from [19], [20], [21], and
[22] and the query-response sequences following these.
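The kinds of assignment discussed above can be tried out with a sketch such as the following. It is not a copy of Fig. 2.1: the values given to e and d3, and the value 1.5 in the second multiple assignment, are assumptions made only for illustration.

```python
# Illustrative values; only some of them are taken from the text.
a = 3                 # Python infers int from the value
b = 4.1               # ... and float here
c = a + b             # mixed int/float algebra gives a float
print(type(a), type(b), type(c))

d, e = 5.1, 2         # several variables assigned in one statement
d_e = d / e           # a float, since d is a float

d1, e1 = 4.2, 5.3
d1, e1 = e1, d1       # values interchanged without a temporary variable
print(d1, e1)         # 5.3 4.2

d1, e1 = 1.5, d1 - 2  # the right side uses the old value of d1 (5.3)

d2 = e2 = 4.78        # the same value assigned to two names

d3 = 10
d3 += 1               # d3 = d3 + 1  -> 11
d3 -= 4               # d3 = d3 - 4  -> 7
d3 *= 3               # d3 = d3 * 3  -> 21
d3 /= 7               # d3 = d3 / 7  -> 3.0 (true division gives a float)
print(d3)             # 3.0
```

In a multiple assignment the whole right side is evaluated before any name on the left is rebound, which is what makes the interchange of d1 and e1 work without temporary storage.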
Table 2.2 Operators in Python: algebraic operators are listed in order of ascending
priorities (+ and - share one priority level; *, /, //, and % share the next)

Algebraic operators                  Logical/bit operators
Symbol   Operation performed         Symbol   Operation performed
+        Addition                    ~        Complement
-        Subtraction                 &        Logical AND
*        Multiplication              |        Logical OR
/        Division                    ^        Logical XOR
//       Floored quotient            >>       Right shift (bits)
%        Remainder                   <<       Left shift (bits)
**       Exponentiation

In addition the combination operators +=, -=, *=, /=, //=, %=, **=, &=, |=, ^=, >>=,
and <<= are also available
Fig. 2.2 A Python Interpreter sequence with algebraic operations and simple functions with
numbers
The operators used in algebra and the operations they signify are given in
Table 2.2 (van Rossum and Drake 2014b). The combination operators are also
given in the table.
The underscore symbol _ plays a useful role in interactive sessions. It is
assigned the last printed expression. Referring to the sequence in Fig. 2.2, in [1] it
is the numerical value (that is, 3.2) in the preceding line. b is assigned this value
of 3.2. b carries this value for subsequent algebraic steps. [2] is an instance of
using _ in an algebraic expression.
2.2 Complex Quantities
Python has the provision to handle complex numbers and variables. [3] assigns the
value 4 + 3j to e. Here 3j signifies the imaginary component of the number.
Algebra can be carried out with complex numbers with equal ease, in the same
manner as with real numbers. [4], [5], and [6] represent such algebra where the real
and imaginary parts of the variables/numbers, and the expected results, are all
integers. [7], [8], [9], and [10] show cases where the complex numbers involved have
the real and imaginary parts in floating point form. The results too have the real
and imaginary parts in floating point form. [11] is another example of the use of _
to use the last result in the current line without the need to retype. [12] confirms
the correctness of the computation with [11].
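A minimal sketch of algebra with complex numbers is given here for reference. The value of e is the one used in the text; the second operand f is a hypothetical value introduced for illustration and is not taken from Fig. 2.2.

```python
e = 4 + 3j            # 3j is the imaginary component
f = 1 - 2j            # hypothetical second operand

total = e + f         # (5+1j)
difference = e - f    # (3+5j)
product = e * f       # (10-5j)
quotient = e / f      # close to (-0.4+2.2j), up to floating point rounding

print(total, difference, product, quotient)
print(type(e))        # <class 'complex'>

# Even when entered as integers, the two parts are stored as floats.
print(e.real, e.imag) # 4.0 3.0
```

The product can be verified by hand: (4 + 3j)(1 - 2j) = 4 - 8j + 3j + 6 = 10 - 5j.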
2.3 Common Functions with Numbers
Python has a number of built-in functions (van Rossum and Drake 2014b); each
function accepts the specified arguments, executes the routines concerned, and
returns the result (if and as desired). Functions are discussed in detail later. Here
we introduce a few of the built-in functions useful directly in the calculator type of
work.
abs(a) returns the absolute value of a specified as argument. [13] is an instance of
the absolute value of -3.2 returned as 3.2. [14] returns the absolute value of the
complex number h, with the value 2.1 - 4.3j assigned in [7], that is
4.78539444560216. complex() is another built-in function. It takes two arguments
x and y, in that order, and returns the complex quantity x + yj. [15] is an
example of the direct use of the complex() function to form the complex number
3.1 + 4.2j taking 3.1 and 4.2 as the arguments. [16] is another example confirming
this. If only one argument is specified in the complex() function it is implicitly
taken as the real component and the imaginary part is automatically taken as zero.
[17] is an illustration of this usage as can be seen from the lines following. The
conjugate of a complex quantity is obtained as in [19], he representing the complex
conjugate of hc. In [20] and [21] hk.real and hk.imag return the real and
imaginary components of hk and assign them to the variables hr and hi respectively.
The following line confirms that hr is a floating point number (the same is true of hi
also). The function pow(a, b) returns a raised to the power b, the same as a ** b.
Here a and b can be integers or floating point numbers. The result is an integer if
and only if a and b are integers and b is not negative. These can be seen from [22] to
[25]. The function pow(a, b, c) returns (a ** b) % c as can be seen from [26]. [27] is
another illustration of this at a slightly longer integer level. The sequence computes
(13 ** 11) % 17 in two steps, a longer route. [28] achieves the same in a single step.
Figure 2.3 shows the possibilities and constraints in the use of pow() in a compact
form.
pow(x, y, [z])
  If z is present, y must be a positive integer
    x has to be an integer, positive or negative
    (x ** y) mod z is returned
  If z is absent, x ** y is returned
    If y is a positive integer and x is a (positive or negative) integer,
      x ** y is returned as an integer
    Else (i.e., y is a negative integer or a floating point number)
      x ** y is returned as a floating point number
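The built-in functions discussed in this section can be exercised together as in the sketch below. The numbers 2.1 - 4.3j, 3.1, 4.2, and (13 ** 11) % 17 come from the text; the remaining values and all variable names are assumptions made for illustration.

```python
h = 2.1 - 4.3j
magnitude = abs(h)            # sqrt(2.1**2 + 4.3**2), about 4.785394
print(abs(-3.2), magnitude)

z = complex(3.1, 4.2)         # (3.1+4.2j)
z_real_only = complex(3.1)    # imaginary part defaults to zero: (3.1+0j)
h_conj = h.conjugate()        # (2.1+4.3j)
hr, hi = h.real, h.imag       # both are floats
print(z, z_real_only, h_conj, hr, hi)

p_int = pow(2, 5)             # 32, an int: both arguments are integers, exponent >= 0
p_float = pow(2, -1)          # 0.5, a float: negative exponent
p_mod = pow(13, 11, 17)       # (13 ** 11) % 17 in a single step -> 4
two_step = (13 ** 11) % 17    # the longer route gives the same value
print(p_int, p_float, p_mod, two_step)
```

The three-argument form is preferable for large integers because it avoids forming the full power before taking the remainder.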
Fig. 2.4 Illustration of conversions between floating point numbers and integers
operation done by int() is not a rounding off. [9] confirms this. Table 2.3
summarizes the functions with numbers discussed here. Additional functions are
introduced later.
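That int() truncates and does not round off can be seen from a short sketch like the following. The numerical values are assumptions chosen for illustration, and the built-in round() is brought in here only for contrast.

```python
truncated_pos = int(3.7)     # 3: the fractional part is discarded
truncated_neg = int(-3.7)    # -3: truncation is towards zero
rounded = round(3.7)         # 4: round() goes to the nearest integer
as_float = float(3)          # 3.0: an integer converted to floating point form

print(truncated_pos, truncated_neg, rounded, as_float)
```

Note the difference from the // operator, which floors towards minus infinity; int() always truncates towards zero.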
The basic operators in Python for doing algebra as well as for forming algebraic
expressions are given in Table 2.2. Their priorities decide the order of evaluation:
** has the highest priority; *, /, //, and % share the next level; + and - share the
lowest level. Operators at the same level are evaluated from left to right. Thus in
any algebraic chain ** (if present) will be evaluated first, then the multiplications
and divisions in the order in which they appear, and finally the additions and
subtractions. The Python Interpreter sequence in Fig. 2.5 illustrates these. 3 * +4
in [1] is fairly clear; the integer 3 is multiplied by the positive integer +4 to
yield the integer 12 as the result. A clearer way of specifying this is shown to the
right (after the # symbol) as 3 * (+4). Similarly 12/-4 in [2] is interpreted as
division of the integer 12 by the negative integer -4 with -3.0 as the result. Once
again 12/(-4) shown at the right is clearer. With 12 + 5 * 8 in [3] the * operation
gets priority over the + operation; hence 5 * 8 is done first and the result (40)
added to 12 subsequently to yield 52 as the result. 12 + (5 * 8) shown at the right
avoids any ambiguity. Note that (12 + 5) * 8 is different from this. With 60/5 * 3
the division (/) and the multiplication (*) have the same priority and are carried
out from left to right; hence the expression is evaluated as (60/5) * 3 (=36.0) as
clarified at the right. Similarly 60/5 * 3//2 in [5] is evaluated as
((60/5) * 3)//2 to yield 18.0. With 4 * 5 - 77/11 + 7 * 2 in [7] the multiplications
4 * 5 (=20) and 7 * 2 (=14) and the division 77/11 (=7.0) are carried out before the
subtraction (20 - 7.0 = 13.0) and the addition 13.0 + 14 (=27.0). The expression
evaluated is (4 * 5) - (77/11) + (7 * 2) as shown at the right. Since the division
77/11 always returns a floating point number, the final result (algebra involving a
mixture of integers and floating point numbers) is a floating point number.
Fig. 2.5 Representative algebra involving multiple operations and their priorities
Algebraic expressions can be made compact by conforming to the priorities of
operators. However as a practice, it is better to use parentheses (though superfluous)
and clarify the desired sequence and avoid room for ambiguity.
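The expressions discussed above can be checked with a sketch like the one below. In Python the operators *, /, //, and % share one priority level and are evaluated from left to right, as do + and -; the expression 3 * 7 % 4 is an addition made here to show that. The variable names are only illustrative.

```python
r1 = 3 * +4                      # 12, same as 3 * (+4)
r2 = 12 / -4                     # -3.0, same as 12 / (-4)
r3 = 12 + 5 * 8                  # 52: * before +
r3_grouped = (12 + 5) * 8        # 136: parentheses change the order
r4 = 60 / 5 * 3                  # 36.0: left to right, (60 / 5) * 3
r5 = 60 / 5 * 3 // 2             # 18.0: ((60 / 5) * 3) // 2
r7 = 4 * 5 - 77 / 11 + 7 * 2     # 27.0: (4 * 5) - (77 / 11) + (7 * 2)

# * and % have equal priority, so evaluation runs left to right.
left_to_right = 3 * 7 % 4        # (3 * 7) % 4 = 21 % 4 = 1
forced_right = 3 * (7 % 4)       # 3 * 3 = 9

# ** binds more tightly than a unary minus on its left.
neg_square = -2 ** 2             # -(2 ** 2) = -4

print(r1, r2, r3, r3_grouped, r4, r5, r7)
print(left_to_right, forced_right, neg_square)
```

The pair left_to_right and forced_right shows why parentheses are worth the extra keystrokes whenever operators of the same level are mixed.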
2.4 Logical Operators
The logical operators in Table 2.2 operate bit-wise on integers. The Python
Interpreter sequence in Fig. 2.6 illustrates their use. a in [1] and b in [2] are
10110 in binary (the equivalent of the decimal number 22) and 10101 in binary (the
equivalent of the decimal number 21) respectively. a|b in [3] is 10111 in binary,
which is 23 in decimal form. The other operations too can be verified similarly.
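The remaining bit-wise operations can be verified with a sketch such as the following. The operands 22 and 21 are those of the text; the shift count of 2 and the variable names are assumptions made for illustration. The built-in bin() is used to display the binary forms.

```python
a = 0b10110    # 22
b = 0b10101    # 21

or_ab = a | b          # 0b10111 = 23
and_ab = a & b         # 0b10100 = 20
xor_ab = a ^ b         # 0b00011 = 3
not_a = ~a             # complement: -(a + 1) = -23
right = a >> 2         # 0b101 = 5
left = a << 2          # 0b1011000 = 88

print(bin(or_ab), bin(and_ab), bin(xor_ab))
print(not_a, right, left)
```

The complement of an integer n is always -(n + 1), because Python integers behave as if represented in two's complement with unlimited width.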
A string is a type of object in Python. Any character sequence can form a string.
Strings can be useful in taking output as printouts and in presenting any entity
in/from Python. Figure 2.7 is a Python interpreted sequence to demonstrate basic
operations with strings (van Rossum and Drake 2014c). String great is assigned to
the identier s1. [2] and the line following it conrm this. Similarly s2 in [3] is a
string. Strings can be combined conveniently using the addition operator+. s3
dened in [4] is such a combination as can be seen from [5] and the output
following. In Python the operator + is used in the sense of combining two entities
but not restricted to mean the addition of two numbers alone. Such an extended
concept is true of other operators as well as many functions as well. These will be
explained duly.
s4 in [6] is a more elegant 'good day' than s3. Here the white space, a
string of a single character, has been interposed between 'good' and 'day'; the three
strings s1 ('good'), the white space string ' ', and s2 ('day') have been combined
using the + operator to form the string s4 ('good day'). The * operator can be
used with a string to repeat a desired sequence (imposition!). s5 in [7] is an
example. It has been refined in [8] and the repeated sequence reproduced as s6 in a
more elegant manner, as can be seen from [8].
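A minimal sketch of the string operations just described follows. The values of s1 and s2 are taken from the text; the exact forms of s5 and s6 are assumptions, chosen only to show repetition with and without a separator.

```python
s1 = 'good'
s2 = 'day'

s3 = s1 + s2             # concatenation without a gap
s4 = s1 + ' ' + s2       # a single white space interposed
s5 = s4 * 3              # repetition (assumed form of s5)
s6 = (s4 + ', ') * 3     # repetition with a separator (assumed form of s6)

print(s3)
print(s4)
print(s5)
print(s6)
```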
Entities in Python (like strings) can be output/displayed by invoking the print()
function. The print(s1) in [9] is possibly the simplest form of use of the print()
function. Any string can be directly output in this manner. Objects which can be
output directly too can be printed in this manner. a has been assigned the integer
value 35 in [10] (hence a is of type int); it is output in [11] through print(a).
a, b, and c have been assigned values 10, 21.3 and True respectively in [12]; in
turn they are of type int, float, and Boolean. They are output directly in the
same sequence with the print command print(a, b, c) in [13]. When numbers, values of
variables, and the like are to be combined with strings using +, they have to be
converted into string form first. The function repr() achieves such a conversion
into a string form which can be directly used as input to the print() function. c as
specified in [14] forms the product of a (= 35) and b (= 11). To get it (value of 385)
printed out it is converted into string form through repr(c) in [15]. [16] confirms
this. Its value is printed out in [17]. a * b, or the value of any other such algebraic
sequence, can be directly converted into a string for output, avoiding the use of the
intermediate temporary variable c, as done in [18] and [19].
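The conversion through repr() can be sketched as follows. The values of a and b are those quoted in the text; the message string and the name text are assumptions made for illustration.

```python
a, b = 35, 11
c = a * b

text = repr(c)                           # the string '385', not the integer 385
print('The product is ' + text)          # a string can be concatenated with a string
print('The product is ' + repr(a * b))   # no temporary variable needed
print('The product is', a * b)           # print() also accepts the number directly
```

Concatenating 'The product is ' + c without repr() would raise a TypeError, since + does not combine a string with an integer.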
2.6 Exercises
References
van Rossum G, Drake FL Jr (2014a) The Python language reference. Python software foundation
van Rossum G, Drake FL Jr (2014b) The Python library reference. Python software foundation
van Rossum G, Drake FL Jr (2014c) Python tutorial. Python software foundation
Chapter 3
Simple Programs
As was seen in the last two chapters, basic algebraic operations are carried out in
Python as with a simple calculator. More involved operations call for preparation of
programs and working with them. The program structure in Python has to conform
to specific syntactic rules. These are to be religiously followed to ensure that Python
interprets the program for subsequent execution (van Rossum and Drake 2014).
... j += 1 [11]
...
104,
117,
130,
143,
156,
169,
182,
195, [12]
Fig. 3.1 Python Interpreter sequence for Examples 3.1 and 3.2
all successively added to the sum. Once the loop operation is over the program
proceeds to the following line [6] and continues execution. Here the values of nr
and sums are printed out. The simplicity of the loop structure is striking. There is no
need to put parentheses around the condition to be tested, no need to identify the
group by enclosing it within curly brackets and so on. The group may have as many
executable statements as desired. The whole condition is checked after every
execution of the group sequence.
Example 3.2 Identify all the numbers in the interval {100 to 200} which are
divisible by 13 and output them.
The 6-line program from [7] to [11] achieves this; it is a bit more involved
compared to the previous one. The program accepts three integers, l1 (lower
limit), l2 (upper limit), and a specified number n, and prints out all the numbers
between l1 and l2 which are divisible by n. j, a dummy/running variable, is
assigned the value of the lower limit to start with. Two successive checks are done
on j. First check whether j < l2; if so enter the loop/continue within the loop
execution. Within loop 1 check whether j is divisible by n. If so enter loop 2 and
execute it. Else do not enter loop 2/keep away from it. Loop 2 here demands a
single action: print the value of the number and proceed to the next line. Once you
exit loop 2, you are back in loop 1. Increment the value of j [11]. This forms the last
line of the program within loop 1. In the specific case here all numbers in the
interval 100 to 200 that are divisible by 13 are printed out in the same sequence as
they are encountered.
Fig. 3.2 Structure of the program for the sequence [1] to [6] in Fig. 3.1 (Example 3.1). Labels: primary prompt; secondary prompt; all inputs in the while i group given with the same indentation
The program brings out a number of additional aspects of Python.
• Loop 2 is within loop 1. All statements within loop 2 form a sub-group.
  (Incidentally loop 2 has only a single statement here.) They appear with the
  same indentation within loop 1 (see Fig. 3.3).
• if is a keyword. It is used to check a condition and execute a loop if the
  condition is satisfied.
• The operator == checks whether the values of quantities on either side are
  identical. Specifically here if j%n is zero, j%n == 0 is true (or 1); if it is
  non-zero, it is false (or 0).
>>> l1, l2, n = 100, 200, 13            [7]
>>> j = l1
>>> while j < l2:                       [8]
...     if j%n == 0:                    [9]
...         print(repr(j) + ', ')       [10]
...     j += 1                          [11]
Fig. 3.3 Structure of the program for the sequence [7] to [11] in Fig. 3.1 (Example 3.2), showing the main program and the two indented groups
Incidentally if a loop has only a single statement in it, the same can follow the
condition on the same line. The Python Interpreter sequence is as in Fig. 3.4.
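A sketch of the same-line form for Example 3.2 is given here. The list found is a hypothetical addition, used so that the result can be inspected after the loop; the single statement of the inner group follows the condition on the same line.

```python
l1, l2, n = 100, 200, 13
found = []            # hypothetical container for the numbers identified

j = l1
while j < l2:
    if j % n == 0: found.append(j)   # single statement on the line of the condition
    j += 1

print(found)
```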
Any normal/useful program will require a sequence of activities (representing
the corresponding executable statements) to be carried out. The sequence will be
linked through specic conditional decisions (as in the two small illustrations
above). It is necessary to conceive of the overall computation, fully understand the
same, represent it in a clear logical sequence and then do the program coding. Such
a structured representation is conveniently done in pseudo-code form. It is good
programming practice to represent the program in pseudo-code form and then
proceed with the coding proper. The pseudo code for a program with one conditional
loop within is shown in Fig. 3.5a; Example 3.1 can be seen to be of this type.
The pseudo code in Fig. 3.5b has two conditional loops, loop 2 being executed
within loop 1; Example 3.2 can be seen to be of this type. The following are
noteworthy here:
• Any sequence of executions which does not involve conditional checks is
  represented by one/a few statements. The suite of these statements together
  constitutes one logical block to be executed. 'begin' and 'end' signify the
  beginning and the end of the block/suite.
• Every logical block is entered after a conditional check. To clarify this, the
  logical block is identified through a definite indent on the left of the parent
  block.
• For successive logical checks followed by corresponding logical blocks similar
  indentations are used.
In general the pseudo code of a program may involve a number of conditional
loops in a sequence; some of these loops may have single or multiple loops within
in cascaded/sequential forms. A number of such pseudo code structures appear with
the examples to follow here as well as in subsequent chapters.
(a)
Statement 1
Statement 2
. . . .
Statement i
while (condition):
    Start loop
    Statement i+1
    Statement i+2
    . . . .
    Statement i+j
    End loop
Statement i+j+1
. . . .
Statement last

(b)
Begin
Statement 1
Statement 2
. . . .
Statement i
while (condition):
    Begin loop1
    Statement i+1
    Statement i+2
    . . . .
    Statement i+j
    Statement i+j+1
        Begin loop2
        Statement i+j+2
        Statement i+j+3
        . . . .
        Statement i+j+k
        Statement i+j+k+1
        End loop2
    Statement i+j+k+3
    . . . .
    Statement i+j+k+l
    Statement i+j+l+1
    End loop1
Statement i+j+l+2
. . . .
Statement last
End
Fig. 3.5 Pseudo codes for the sequence in Fig. 3.1: (a) first example and (b) second example
[Figure residue: pseudo code skeletons with begin, while (): and end blocks, followed by a generic flow chart with the blocks input, executable segment 1, decision block (condition) with Branch 1 and Branch 2, executable segments 2 and 3, combining, executable segment 4, output, end]
conforming to the logical process desired. The blocks are connected through lines
with arrows showing directions of program flow. In general a program represented
by a flow chart progresses downwards. The flowcharts for the two examples con-
sidered earlier are shown in Fig. 3.8.
Flow charts of programs encountered in practice can be much more involved
involving a number of decision blocks, and executable blocks. Well thought out
programs can be represented as well organized flow charts. In turn it helps the
coding and execution considerably.
The choice of a pseudo code or a flowchart for a program is purely a subjective
one. When embarking on preparing a program for a task, the need to clearly
conceive the program task, and represent it in the form of a flow chart or a pseudo
code with full logic flow and interlinking fully clarified, need hardly be stressed.
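[Fig. 3.8 Flowcharts for the two examples. (a) start; Sum, Nr = 0, 12; i = Nr; decision on i: if i is non-zero, increment sum, decrement i and test again; if i = 0, output Nr and sum; end. (b) start; l1, l2, n = 100, 200, 13; j = l1; decision j < l2: if yes, test j%n, output j when the test succeeds, increment j and test again; if no, end.]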
A select set of keywords helps to test conditions and steer the program. Their usage is
brought out through a set of examples here.
Example 3.3 Output the sum of squares of the first eight odd integers.
The segment [1] to [4] in the Python Interpreter sequence in Fig. 3.9 computes the
desired sum (the sum of squares of the eight odd integers starting with 1) and outputs
the same (= 680). Here n is a counter, initialized to 8, and counting is done
downwards until n is zero. The loop starting with while is executed as long as
n ≠ 0; while is a keyword here. n = 0 is interpreted as False and causes
termination of the loop execution. The flow chart for the program is shown in
Fig. 3.10; it can be seen to be similar to the flowchart in Fig. 3.8a as far as
functional blocks and program flow are concerned. In the print statement [3] the
function repr(sm) converts sm to a printable string. It is concatenated with the
string 'The required sum is ' and the combination output in a convenient form. [4] is
possibly a simpler print version. The print function outputs the string and sm
directly.
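The counting loop of Example 3.3 is repeated below as a plain script, together with a cross-check. The cross-check with sum() and range() and the name check are additions made here; they are not part of the interpreter sequence in the figure.

```python
n, m, sm = 8, 1, 0
while n:              # a non-zero n counts as True, n == 0 counts as False
    sm += m * m       # add the square of the current odd number
    m += 2            # move to the next odd number
    n -= 1            # count down

check = sum(k * k for k in range(1, 16, 2))   # 1, 3, ..., 15
print('The required sum is ' + repr(sm))
print(sm == check, bool(0), bool(8))
```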
True and False are keywords; they are the Boolean values equivalent to 1
and 0 respectively. Their use is illustrated through the following example.
Example 3.4 Identify the first eight positive integers divisible by 7 and output the
sum of their cubes.
The routine [5] to [8] in the sequence in Fig. 3.9 obtains the desired sum and
outputs the same. As in the previous program n counts down from 8. a is assigned
the True value initially. The loop execution continues as long as the status of a
>>> n, m, sm = 8, 1, 0                               [1]
>>> while n:          #Add squares of the first 8    [2]
...     sm += m*m     # odd numbers
...     m += 2        #Print out the sum
...     n -= 1
...
>>> print('The required sum is ' + repr(sm))         [3]
The required sum is 680
>>> print('The required sum is ', sm)                [4]
The required sum is  680
>>> a, n, m, sm = True, 8, 7, 0                      [5]
>>> while a:          #Get the sum of the cubes      [6]
...     sm += m*m*m   #of the first 8 positive
...     n -= 1        #integers divisible by 7
...     m += 7
...     if n == 0:a = False                          [7]
...
>>> print('The required sum is ' + repr(sm))         [8]
The required sum is 444528
Fig. 3.9 Python Interpreter sequence for Examples 3.3 and 3.4
[Fig. 3.10 Flow chart for Example 3.3: n, m, sm = 8, 1, 0; decision on n: if n is non-zero, sm = sm + m*m, m = m + 2, decrement n and test again; if n = 0, output sm; end.]
remains True. It is changed to False in the loop when n becomes zero [7]. [7]
illustrates the use of the keyword if. Like while, the if statement checks for a
condition; on the condition being satisfied, the statement (group of statements)
following is (are) executed. The condition being tested is whether n == 0. If the
Fig. 3.11 Python Interpreter sequence for Examples 3.5, 3.6, and 3.7
Here the condition a being true is being tested in [2] with the use of is. The
block of three executable statements up to [3] is executed as long as a is True. As
soon as a = False the loop execution stops. The operation is not can be used
similarly. a is not b has value True as long as a and b are not identical.
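A minimal sketch of a flag-controlled loop tested with is follows. It reuses the data of Example 3.4 as an assumption, since that routine has the same flag structure. The lists x and y are hypothetical and serve to show that is and is not compare identity of objects, whereas == compares values.

```python
a, n, m, sm = True, 8, 7, 0
while a is True:          # identity test on the flag
    sm += m ** 3
    m += 7
    n -= 1
    if n == 0:
        a = False         # flag reset: the loop stops at the next check

print('The required sum is', sm)

x, y = [1, 2], [1, 2]     # equal values, but two distinct objects
print(x == y, x is y, x is not y)
```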
Example 3.6 Identify all the numbers between 100 and 1000 which are divisible by
11 and 13 and output them.
The program sequence [5] to [8] in Fig. 3.11 identifies and outputs these numbers.
The condition that the number represented by b is divisible by 11 (b%11 == 0) as
well as by 13 (b%13 == 0) is tested using the single condition in [7]. It uses the
logical operator and, a keyword; p and q is True only if p is True and
simultaneously q is also True. The logical operator or can be used in a similar
manner. p or q is True if either p is True or q is True (or both are True).
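The combined test of Example 3.6 can be sketched as below. The variable name b follows the text; the list matches and the inclusive limits 100 and 1000 are assumptions made for illustration.

```python
b = 100
matches = []                      # hypothetical container for the results
while b <= 1000:
    if b % 11 == 0 and b % 13 == 0:   # both conditions must hold
        matches.append(b)
    b += 1

print(matches)
```

Since 11 and 13 have no common factor, the numbers identified are the multiples of 143.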
Example 3.7 Identify the smallest number greater than 10000 which is a power
of 29.
The routine follows from [9] in Fig. 3.11. Starting with 29 we take its successive
powers. The process is continued without break until the number crosses the value
10000. Once this value is crossed execution breaks out of the loop as specified by
[11]. break, a keyword, exits from the current loop on the specified condition
being true. while True is always true: hence in the absence of the conditional
break statement within, the loop execution will continue ad infinitum. Incidentally
the desired number here is 24,389 (= 29³).
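A sketch of the while True loop with a conditional break, following the description of Example 3.7; the names limit and p are assumptions.

```python
limit = 10000
p = 29
while True:            # would run for ever without the break below
    if p > limit:
        break          # leave the loop once the limit is crossed
    p *= 29            # next power of 29

print(p)
```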
Example 3.8 Obtain the sum of the cubes of all positive integers in the range [0, 10]
which have 3 as a factor.
The segment [1] to [4] in the Python Interpreter sequence in Fig. 3.12 is the
relevant program. This simple example also illustrates the use of the range()
function.
The scope of the range function is clarified in Fig. 3.13. range() specifies a
sequence of integers, often used to carry out an iteration as done here. The first
integer signifies the start for the range and the second one its termination; the
termination value itself is not included in the sequence. The first
can be absent in which case the default value is taken as zero. The third stands for
the interval for the range; if absent the default interval value is taken as unity. All
the three integers can be positive or negative. The range specification should be
realizable.
range(5), range(0, 5), and range(0, 5, 1) all specify the range {0, 1, 2, 3, 4}.
range(-2, 3, 1) and range(-2, 3) specify the same range {-2, -1, 0, 1, 2}.
range(10, -5, -2) implies the range {10, 8, 6, 4, 2, 0, -2, -4}.
A specification such as range(2, 4, -1) is erroneous: the interval leads away from the
termination value, and Python treats the range as an empty sequence.
In the context here the suite of executable statements, in fact the single one in
[2]: sm += i**3, is executed over a specified range, namely 0 to 10 with an interval of
3. The routine outputs the sum 3³ + 6³ + 9³ as can be seen from [3]. [4] also
achieves the same output. Both are shown here to bring out the fact that a string can
be specified through '...' or "...".
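The range specifications discussed above, and the loop of Example 3.8, can be tried out as follows. The wording of the printed message is an assumption; list() is used here only to display the members of each range.

```python
print(list(range(5)))             # start defaults to 0, interval to 1
print(list(range(-2, 3)))         # negative start
print(list(range(10, -5, -2)))    # negative interval, counting down
print(list(range(2, 4, -1)))      # not realizable: an empty sequence

sm = 0
for i in range(0, 10, 3):         # 0, 3, 6, 9
    sm += i ** 3

print('The required sum is', sm)  # string within single quotes
print("The required sum is", sm)  # same string within double quotes
```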
Fig. 3.12 Python Interpreter sequence for Examples 3.8 and 3.9
Optional: incrementing interval for the range; if absent, the default interval is 1
Fig. 3.13 Structure and scope of range() in Python: Note that a, b, and c should be integers or
functions which return an integer
Example 3.9 In the range [2, 10] identify the integers as odd or even and output
accordingly.
The sequence [5] to [8] in Fig. 3.12 executes the desired task. It also illustrates the
use of the keyword continue. The outer loop is executed for all n in the range
2 to 10. If (n%2 == 0), that is if n is an even number, the value is output as an even
number. Then the routine continues with the loop, ignoring the sequence
following. This is implied by continue. Of course if n is odd, (n%2 == 0) is not
satisfied and the loop execution continues with the rest of the executable lines. Here
the number concerned is output as an odd number. If the continue is replaced
by break the loop will terminate after the first execution following satisfaction of
condition (n%2 == 0), that is after the first even number (= 2) is output. The program has
been repeated as a sequence following [9] with continue being absent. Here the
even numbers are output with an even number tag. Apart from this they are also
output with tag number since line [8] is also executed here. The program is
included here to clarify the role of the keyword continue.
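The roles of continue and break can be compared in a short sketch. The lists labels and first, and the tags 'even' and 'odd', are assumptions used in place of the printed output of the figure.

```python
labels = []
for n in range(2, 11):
    if n % 2 == 0:
        labels.append((n, 'even'))
        continue                  # skip the rest of the group, take the next n
    labels.append((n, 'odd'))     # reached only when n is odd

first = []
for n in range(2, 11):
    if n % 2 == 0:
        first.append(n)
        break                     # leave the loop at the first even number
    first.append(n)

print(labels)
print(first)
```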
$$p x^2 + q x + r = 0$$
where p, q, and r are constants. The solutions can be directly obtained using the
formula
$$x = \frac{-q \pm \sqrt{q^2 - 4pr}}{2p}$$
6. Obtain bm³. If bm³ > a, we know that the cube root of a lies between b1 and
bm. Assign bm to b2 as b2 = bm. Proceed to step 8.
7. If bm³ ≤ a in step 6, we know that the cube root of a lies between bm and b2.
Assign bm to b1 as b1 = bm.
8. If bm and b1 (or bm and b2) are sufficiently close, we take bm as the cube
root value with the desired level of accuracy; else we go back to step 5.
9. The iterative procedure outlined above is depicted in flowchart form in
Fig. 3.14. The Python code for the example is in Fig. 3.15, [1] to [6]. The result
is in [7]. We have introduced a counter, n, in the routine to keep track of the
number of iterations gone through. If n exceeds a preset limit we terminate the
program. Here the limit has been set as 20. When execution is completed we get
the desired cube root of a as 2.1552734375; the solution is obtained after 10 iterative
cycles.
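[Fig. 3.14 Flowchart for the cube root routine: start; a = 10; b, b3, n = 1, 1, 0; while b3 < a, increment b; then b1, b2 = b-1, b; while b2-b1 > 0.001 and n < 20: increment n, bm = (b1+b2)/2, and if a > b3 set b1 = bm, else b2 = bm; when the condition fails, output bm; end.]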
>>> a = 10                                              [1]
>>> b, b3, n = 1, 1, 0                                  [2]
>>> while b3 < a:                                       [3]
...     b +=1
...     b3 = b**3
...
>>> b1, b2 = b-1, b
>>> while (b2 - b1 > 0.001) and (n < 20):               [4]
...     n += 1
...     bm = (b1 + b2)/2.0
...     if a > bm**3:b1 = bm                            [5]
...     else: b2 = bm                                   [6]
...
>>> print('bm = ', bm, ', n = ', n, ',a = ', a)         [7]
bm = 2.1552734375 , n = 10 ,a = 10
>>> b, b3, n = 1, 1, 0                                  [8]
>>> while b3 < a:                                       [9]
...     b +=1
...     b3 = b**3
...
>>> b1, b2 = b-1, b                                     [10]
>>> while (abs(b3-a)/a > 0.001) and (n < 20):           [11]
...     n += 1
...     bm = (b1 + b2)/2.0
...     b3 = bm**3
...     if a > b3: b1 = bm                              [12]
...     else: b2 = bm                                   [13]
...
>>> print('bm = ', bm, ', n = ', n, ',a = ', a)
bm = 2.154296875 , n = 9 ,a = 10
The routine also illustrates the use of the keyword else. else is used
in combination with if to steer a routine through one or the other alternative code
groups. The iteration process (the reduction in the search range of the solution in
successive iteration cycles gone through) is depicted in Fig. 3.16. The following are
noteworthy here:
• The approach followed is of a 'divide and conquer' type; with each successive
  step the search range is halved. In turn the % error or indecision in the value of
  the cube root obtained is also halved.
• Since 2^-10 = 1/1024, the condition b2 - b1 ≤ 0.001 (= 1/1000) is achieved in 10
  successive iterations. Hence n = 10 when the iteration stops.
• If ε denotes the accuracy specified (ε = 0.001 here), as ε reduces the number of
  iterative cycles required to achieve the specified accuracy increases. In fact the
  number of iterative cycles required is ⌈log2(1/ε)⌉.
• Beyond a limit the cumulative effect of truncation errors will dominate,
  preventing further improvement in accuracy (error propagation is not of direct
  interest to us here).
• 2.1552734375³ - 10 = 0.01168391. The fractional error in the cube value is
  0.001168391 = 0.117%.
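The halving procedure and the estimate of the number of cycles can be put together in one sketch. The function name cube_root_bisect and its parameters eps and max_iter are assumptions; the body follows the steps of the routine for Example 3.10.

```python
import math

def cube_root_bisect(a, eps=0.001, max_iter=20):
    """Cube root of a (a > 1) by repeated halving of the search range."""
    b = 1
    while b ** 3 < a:             # find the integer b with (b-1)**3 < a <= b**3
        b += 1
    b1, b2 = b - 1, b
    n, bm = 0, b2
    while (b2 - b1 > eps) and (n < max_iter):
        n += 1
        bm = (b1 + b2) / 2.0      # mid-point of the current range
        if a > bm ** 3:
            b1 = bm               # root lies in the upper half
        else:
            b2 = bm               # root lies in the lower half
    return bm, n

root, cycles = cube_root_bisect(10)
predicted = math.ceil(math.log2(1 / 0.001))   # cycles needed for eps = 0.001
print(root, cycles, predicted)
```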
Iteration   b1            b2
1           2.0           3.0
2           2.0           2.5
3           2.0           2.25
4           2.125         2.25
5           2.125         2.1875
6           2.125         2.15625
7           2.140625      2.15625
8           2.1484375     2.15625
9           2.15234375    2.15625
Fig. 3.16 Narrowing of the search range in successive iterations for Example 3.10
The program is run with the termination condition altered to that in [11] in
Fig. 3.15. The accuracy for termination is specified in terms of the cube value
(in contrast to the last case where it was in terms of the cube root value). The
program stops after nine iterations. The cube root value obtained is
2.154296875. Correspondingly the fractional error is |2.154296875³ - 10|/
10 = 0.0001919.
As mentioned earlier the successive bifurcation procedure outlined here can be
used to seek solution for a variety of equations. More often we seek solutions for
x such that f(x) = 0 where f(x) is a specified function of x. For the above case
f(x) = 10 - x³. In all these cases we should have a clear prior idea of the possible
range of solutions and the number of solutions in the range. This is to prevent a
wild-goose-chase situation.
The sequence in Fig. 3.17 is a slightly altered approach to the problem in
Example 3.10. Here the search interval in every iterative cycle is reduced to 1/3rd of
the preceding one ([1] and [4]). As in the preceding case, starting with one, b is
successively incremented until a value of b whose cube exceeds a is identified. The
desired cube root of a lies between b - 1 and b. This base interval is divided into
three equal intervals ([1] and [2]): ba to bb, bb to bc, and bc to b itself. a is
compared with ba³, bb³, and bc³ and the interval where it lies is identified. This
forms the base interval for the start of the next iteration. It is again divided into three
equal segments [4] (each of length 1/3rd of the previous case) and ba, bb, and bc
reassigned to the respective new segment boundary values. The iterative cyclic
process is continued until the interval is close enough to zero. The acceptable
interval limit specified to stop iteration here [3] is 0.0001. The cube root value is
obtained in eight iterations; its value is 2.154448000812884. 2.154448000812884³ -
10 = 0.00018535 is the error in the cube value.
The example here also illustrates the use of the keyword elif (stands for else
if) [5]. The condition chain if ... elif ... elif ... else can be
used judiciously to test multiple conditions and steer a routine to respective code
segments.
The iteration termination has been specified here in terms of accuracy in the root
value. If necessary it can be specified in terms of the same in the cube value as was
done earlier in the approach using bifurcation of the intervals.
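A minimal sketch of the three-way division using if, elif, and else is given below. It is not the listing of Fig. 3.17: the function name cube_root_trisect, the names lo, hi, tol, and max_iter, and the choice of returning the mid-point of the final interval are all assumptions, so the number of cycles and the last digits of the result need not agree with the values quoted above.

```python
def cube_root_trisect(a, tol=0.0001, max_iter=50):
    """Cube root of a (a > 1) by dividing the search interval into three."""
    b = 1
    while b ** 3 < a:
        b += 1
    lo, hi = b - 1, b                 # the root lies between b - 1 and b
    n = 0
    while (hi - lo > tol) and (n < max_iter):
        n += 1
        step = (hi - lo) / 3.0
        bb, bc = lo + step, lo + 2 * step   # the two inner boundaries
        if a < bb ** 3:
            hi = bb                   # root in the first segment
        elif a < bc ** 3:
            lo, hi = bb, bc           # root in the middle segment
        else:
            lo = bc                   # root in the last segment
    return (lo + hi) / 2.0, n

root, cycles = cube_root_trisect(10)
print(root, cycles)
```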
Fig. 3.17 Python Interpreter sequence with the altered approach for Example 3.10
3.5 Exercises
P 1=j
6
1. x i2 jij j : Write a Python program to evaluate x for:
yx x 1 3:1
yx x2 3:2
Start with a value for x, substitute it in (3.1) to get y. Use this value of y in (3.2)
and solve for next approximate value of x. Continue this iteratively until the
difference in successive values of x is within the acceptable limit. If x does not
converge within a specified number of iterations, give up! Write a program to
solve the given equation for x. Start with x = 0 as initial value. Solution of (3.2)
yields two values for x; proceed with both.
Write a program to solve for x starting with (3.2) and try it with initial value
x = 0.
6. Consider the cubic equation y(x) = x³ + ax² + bx + c = 0. Since y(0) = c, y(x)
has a real root with a sign opposite that of c. If c is positive, one can evaluate
y for different values of x (say -0.1, -1.0, -10.0, -100.0) until y is negative.
Then a negative root can be obtained following the algorithm in Example 3.10.
If c is negative a similar procedure can be followed with positive values of x to
extract a positive root. The remaining roots can be obtained by solving the
remaining quadratic factor. Write a program to solve a cubic polynomial. Solve
the cubic for the sets of values(1.0, 1.0, 1.0), (1.0, 1.0, 1.0), (1.0, 10.0,
10.0), (1.0, 10.0, 10.0), and (1.0, 10.0, 10.0), of the set (a, b, c).
7. Newton-Raphson method: the method solves y(x) = 0 for x using a first degree
polynomial approximation of y as
$$y_1 = y_0 + \left.\frac{dy}{dx}\right|_{x_0} dx$$
Solving this for y = 0 yields the solution x0. The procedure can be extended to
solve y(x) = 0 for x.
Get two points (x1, y1) and (x2, y2) as on the curve with y1 and y2 being of
opposite signs. Form (3.3) and get x3. Evaluate y3 by substituting in the given
function. If y1 and y3 are of different signs, form the equation similar to (3.3) for
next iteration using (x1, y1) and (x3, y3); else use (x2, y2) and (x3, y3) for it.
Continue the iteration until solution (or its failure!). Write a Python program for
the iterative method. Solve the equations in Exercise (7) above.
9. An amount of c rupees is deposited every month in a recurring deposit scheme
for a period of y years. Annual interest rate is p%. Write a program to get the
accumulated amount at the end of the deposit period with compounding done at
the end of every year. Write a program to get the accumulated amount if the
compounding is done monthly. Get the accumulated amounts for c = 100,
p = 8, and y = 10.
10. A bank advances an amount of d rupees to a customer at p% compound
interest. He has to repay the loan in equated monthly installments (EMI) for
y years. Write a program to compute the EMI (EMI based loan repayment is the
reverse of the recurring deposit scheme). Get the EMI for d = 10,000, p = 10,
and y = 10.
11. Depreciation:
a. In straight line depreciation if an item (of machinery) is bought for p rupees
and its useful life is y years the annual depreciation is p/y rupees.
b. In double declining balance method if the annual depreciation is d%, the
book value of the item at the end of y years is (1 - (d/100))^y times its
bought out value. The annual reduction in book value is the depreciation.
c. In 'Sum of years digit' method of depreciation, with y as the useful life in
years of an item, form s as the sum of integers up to (and including) y. The
depreciation in the xth year is (x/s) times the bought out value.
Write programs to compute depreciation by all the three methods. With
p = 80,000, y = 10, d = 10, get the depreciation and the book values at the end
of each year for five years.
12. Copper wire (tinned) is used as the fuse wire to protect electrical circuits.
With d mm as the diameter of the wire used, the fusing current I_f is 80 d^1.5
amperes. Adapt the program in Example 3.10 to get the fusing current for a
given d. For the set of values of d {0.4, 0.6, 0.8, 1.0, 1.2} get the fusing
currents.
13. Coffee Strength Equalization: Amla has three identical tumblers, A, B, and C.
Each is 80 % full. A has coffee decoction and B and C have milk. She has to
prepare three tumblers of coffee of equal amount and equal strength (accurate to
1%) without the aid of any other vessel. From A she pours coffee into B and C
and fills them; after this she goes through a similar cyclic pouring sequence (B
to C, C to A, A to B and so on). How many times does she have to do this to get
the required set? Solve this through a Python program.
14. Recurring Deposits and Equated Monthly Installment repayment of loan
amount: a fixed amount of Rs 100/- is deposited every month in a recurring
References
Guttag JV (2013) Introduction to computation and programming using Python. MIT Press,
Massachusetts
Kreyszig E (2006) Advanced engineering mathematics, 9th edn. Wiley, New Jersey
van Rossum G, Drake FL Jr (2014) The Python library reference. Python software foundation
Chapter 4
Functions and Modules
4.1 Functions
Functions are entities which accept one or more arguments as inputs, execute a
specific code block, and return the result of execution of the specified code block to
the parent.
Example 4.1 Form a Python function to return the harmonic mean of all the
numbers in the interval [100, 200] which are divisible by a. Get the harmonic mean
for a = 9.
m_h, the harmonic mean of n_1, n_2, n_3, ..., n_k, is given by (Sullivan 2008)
$$m_h = \frac{1}{\sum_{i=1}^{k} \frac{1}{n_i}} \qquad (4.1)$$
The desired code block for the function is in Fig. 4.1. The function definition has
its first statement starting with the keyword def. It is followed by the name of the
function (with a gap of one or two spaces for clarity). All the input arguments are
specified within the parentheses. The colon (:) at the end implies that the code
block of the defined function follows. The code block is indented with respect to
def by a definite amount (2 to 4 spaces). Preferably the first line in the code block
here is a string, a statement stating the scope of the function. This is not
mandatory. But its inclusion is preferred (for reasons to be clarified later). The code
body forming the function follows. return, a keyword, is typically the last statement in
a function. The completed output is returned to the calling program. In the
specific case here hm(a) returns the desired harmonic mean value, the harmonic
mean of all the numbers divisible by a in the interval [100, 200]. The function is
coded in Python interpreter as shown in Fig. 4.2. hm(9) signifies calling of the
function hm(a) with argument a = 9. It returns the desired hm value [1].
hm(a = 9) is a more flexible form of the above function [2]. The argument
a has been given a value of 9 (default value). Calling the function without
specifying any argument or its value returns the hm value for the default value of a (= 9
here) [3]. Calling it with any other value of the argument returns the
corresponding hm value, as can be seen from [4] and [5] which return the function
values for 11 and 13 respectively.
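The function with a default argument is repeated below as a script, with a docstring added as the first line of the code block. The comparison with statistics.harmonic_mean from the standard library is an addition made here: that function divides the number of items k by the sum of reciprocals, so its value is k times the value returned by hm(). The list nums is an assumption used for the comparison.

```python
import statistics

def hm(a=9):
    """Return 1 / (sum of reciprocals) of the multiples of a between 100 and 200."""
    h, b = 0, 100 + a - 100 % a   # b: first multiple of a above 100
    while b < 200:
        h += 1 / b
        b += a
    return 1 / h

print(hm())        # default value a = 9
print(hm(11))
print(hm(13))

nums = list(range(108, 200, 9))               # the multiples of 9 used by hm(9)
print(hm(9) * len(nums), statistics.harmonic_mean(nums))
```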
To bring out the function details in its versatile form let us consider another
example.
Example 4.2 Do the coding for a function that returns a^(1/p) to the desired accuracy.
Run it for a = 50 and p = 4. The accuracy in the computed value has to be better
than 0.2%.
A function root_1() has been defined and the code for it given in Fig. 4.3, along
with the results of specific interpreted runs. root_1() can be seen to be a versatile
variant of the cube root routine in Example 3.10. Here we obtain the pth root of a. i is
the initial search step size; that is, we start with b = 1 and increase it successively
by i until b^p exceeds a. We start the iteration cycles with the interval {b, b - i}
and proceed with successive bifurcation of the interval. The iterations stop when the
desired accuracy, delta, is achieved or the number of iteration cycles reaches the
specified limit nn. The iterative process is the same as in Example 3.10.
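A sketch of such a function is given here. It is not the listing of Fig. 4.3: the name root_p, the default values, and the treatment of delta as an absolute limit on the width of the search interval are assumptions. The argument names a, p, i, delta, and nn follow the description above.

```python
def root_p(a, p, i=1, delta=0.002, nn=30):
    """Return (approximate p-th root of a, number of cycles used); a > 1 assumed."""
    b = 1
    while b ** p < a:             # coarse search in steps of i
        b += i
    b1, b2 = b - i, b             # the root lies between b - i and b
    n, bm = 0, b2
    while (b2 - b1 > delta) and (n < nn):
        n += 1
        bm = (b1 + b2) / 2.0
        if a > bm ** p:
            b1 = bm
        else:
            b2 = bm
    return bm, n

value, cycles = root_p(50, 4)
print(value, cycles)
print(root_p(p=4, a=50))          # keyword form: the order does not matter
```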
...
>>> hm(9)                               [1]
13.414098020943065
>>> def hm(a=9):                        [2]
...     h, b = 0, 100 + a - 100%a
...     while b < 200:
...         h += 1/b
...         b += a
...     return 1/h
...
>>> hm()                                [3]
13.414098020943065
>>> hm(11)                              [4]
16.513049663029967
>>> hm(13)                              [5]
17.92184242238757
>>>
Fig. 4.2 A Python function to get the harmonic mean as in Example 4.1
When a function is called the argument list order is not a rigid constraint. If the
order remains unchanged only the values of the arguments need be fed as in [5]
here; else the arguments can be fed in any order, each along with its name, as done
in [7]. The corresponding results are in [8].
If the function is desired to compute a quantity and return it, this is done through
a return statement with the quantity, as was the case in Example 4.1 (return 1/h).
But if the function need not return anything specific, it is indicated by a bare return
statement as in [8] here. The
Interpreter will complete execution of the function and return to the main program.
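Both points can be seen in a small sketch. The function report and its arguments are hypothetical, used only to show positional calls, keyword calls in a different order, and a bare return.

```python
def report(a, p, delta=0.002):
    """Print the arguments received; nothing specific is returned."""
    print('a =', a, ', p =', p, ', delta =', delta)
    return                        # bare return: the caller receives None

r1 = report(50, 4)                      # values fed in the defined order
r2 = report(p=4, delta=0.01, a=50)      # named arguments in any order
print(r1, r2)
```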
Many situations call for the use of single line functions. The keyword lambda
facilitates this in a compact form. It defines an anonymous function of the specified
arguments. The function can be assigned to any desired object to suit the
context. The details are brought out through a simple example in Fig. 4.5. The
function here has a single argument, c. It (the one line function) evaluates c²
and is assigned to z. z is evaluated for the argument value of three (as nine) in the
following line. The Python Interpreter sequence in Fig. 4.6 further illustrates the
use of lambda. z is evaluated for the argument value four (as sixteen) in [5] and
[6]. z has been assigned to a and b and evaluated for argument values of two and
four (as four and sixteen) in [5] and [6] respectively.
[7] is an example where lambda is a function of two arguments, x and y. (x/
y) is evaluated and assigned to zz. The ratio (4/5) is evaluated as zz(4,5) in [8]
(= 0.8). The equivalent function definition in terms of yy and its evaluation for the
argument set (4,5), yy(4,5), follow in [9] to [11]. The simpler (and more compact)
implementation using lambda is more convenient in many situations.
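The lambda forms described above can be sketched as follows; the names z, a, b, zz, and yy are those used in the text.

```python
z = lambda c: c ** 2          # one line function of a single argument
print(z(3))

a = b = z                     # the same function under two more names
print(a(2), b(4))

zz = lambda x, y: x / y       # one line function of two arguments
print(zz(4, 5))

def yy(x, y):                 # equivalent definition with def
    return x / y

print(yy(4, 5))
```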
4.1.2 Recursion
routine for (1/n!) follows from [4]. If n = 0 the function returns unity (0!). For all
succeeding values of n the routine calls itself recursively with the preceding value
of n and evaluates (1/n!) as (1/(n - 1)!)/n.
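A minimal sketch of the recursive evaluation of 1/n! follows; the function name inv_fact is an assumption.

```python
def inv_fact(n):
    """Return 1/n! for a non-negative integer n, computed recursively."""
    if n == 0:
        return 1.0                # 1/0! = 1 ends the recursion
    return inv_fact(n - 1) / n    # 1/n! = (1/(n-1)!)/n

print(inv_fact(0), inv_fact(1), inv_fact(5))
```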
The code sequence from [7] illustrates an instructive aspect of functions. Two
functions can be defined separately in a Python Interpreter sequence. The latter can
call the former within it, as is done here. Of course, being an illustrative example, this
routine has only one executable statement within it: calling the former function and
printing out the output. But a practical situation can be more involved. A function
can call any of the previously defined functions within itself, any number of times.
This makes room for a well-structured programming approach. A main program can
be composed of a number of smaller programs, used repeatedly if necessary. Each
such smaller program can be coded separately and then called within the major
program. Such calls within calls can be done as many times as required.
The arguments used in a function definition can be of any type without
restrictions, as long as they are meaningfully used within the function. Example 4.4
is an illustration where a function is used as an argument in the definition of another
function.
Example 4.4 Obtain the sum of the cube roots of all the integers from 3 to 8
(inclusive).
The function demb_1 in the demb module returns the cube root of a given
number. The routine is reproduced in Fig. 4.8. The algorithm used in the function
root_1 for Example 4.2 is followed to extract the cube root here. The 0th element
of the returned list is the cube root [1]. The function aa() [2] in Fig. 4.8 accepts
three arguments: bb as a function, and c and d as two numbers. The function
bb(jj) is evaluated for all numbers from c to d at intervals of unity, and bb(jj)[0] (the
0th element of the returned list) is summed up and returned. The function aa() is
called in [3] with demb.demb_1 as the function argument and 3 and 8 as the two
numbers. 10.46875 is the desired sum. It is verified by direct computation in [3].
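A self-contained sketch of the idea in Example 4.4 follows. The names cube_root and sum_over are hypothetical stand-ins for demb_1 and aa(); the bisection mirrors the approach of Fig. 4.8 with the same tolerance of 0.001, which is an assumption carried over from that listing.

```python
def cube_root(a, tol=0.001, max_iter=20):
    """Bisection cube root of a > 1; returns [root, iterations, a]."""
    b = 1
    while b**3 < a:
        b += 1
    lo, hi, n = b - 1, b, 0          # the root lies between b-1 and b
    mid = (lo + hi) / 2.0
    while (hi - lo > tol) and (n < max_iter):
        n += 1
        mid = (lo + hi) / 2.0
        if a > mid**3:
            lo = mid
        else:
            hi = mid
    return [mid, n, a]

def sum_over(func, c, d):
    """Sum func(j)[0] for every integer j from c to d inclusive."""
    return sum(func(j)[0] for j in range(c, d + 1))

total = sum_over(cube_root, 3, 8)    # the function itself is the argument
print(total)
```

Passing cube_root without parentheses hands over the function object; sum_over then decides when and how often to call it.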
One function can have other functions defined within it. If necessary such a
daughter function can be returned for use outside. A few toy examples considered
in the Python Interpreter sequence in Fig. 4.9 illustrate some possibilities. snn(x) in
[1] is defined as a function which computes sin(x) using the series (Sullivan 2008)
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} - \cdots \qquad (4.2) \]
ntht(x, mm) has been defined as a function [2] inside snn(x). It computes a
term in the series of (4.2) recursively. $sx = x - x^3/3! + x^5/5!$ is computed as a first
approximation of sin(x) [5]. Subsequent terms are computed and added to sx
48 4 Functions and Modules
def demb_1(a):
    # get cube root of a through binary segmentation
    # a should be a number > 1
    # Termination on achieving root value with accuracy
    b, b3, n = 1, 1, 0
    while b3 < a:
        b += 1
        b3 = b**3
    # Cube root of a lies between b & b-1
    b1, b2 = b-1, b
    while (b2 - b1 > 0.001) and (n < 20):
        n += 1
        bm = (b1 + b2)/2.0
        if a > bm**3: b1 = bm
        else: b2 = bm
    return ([bm, n, a])                                   [1]
Fig. 4.8 A Python Interpreter sequence illustrating a function forming an argument input for
another function
repeatedly until the fractional addition from the next term is negligible [6]. sin(x)
and the number of terms used for the approximate computation, (nn + 1)/2,
are returned in [7]. sin(0.2), sin(0.3), and sin(0.4) are computed
using snn [8]. The values obtained here can be compared with those computed
directly using math.sin(x) [9]. As demanded by the context, any
number/type of functions can be defined within another function in this manner.
ffa() is defined in [10] to return function ffb [12]. Here ffb() has been defined
within ffa() itself as another function to return sin(xx), xx being the argument. [13]
has ffa assigned to bb. It is confirmed as a function in [14]. bb() in [15] is a
function local to ffa [16], consistent with def in [10]. With 0.3 as argument, bb()
(assigned to cc) is evaluated as sin(0.3) [17]. faa() in [19] returns a more
involved function fbb. Function fbb [20] as defined here accepts two arguments,
xx and nn. bb is sin(x) or cos(x) depending on whether nn = 1 or 2. For all other
values of nn, bb [21] is tan(x). As an illustration all these cases have been used in
[25] to return sin(0.3), cos(0.3), and tan(0.3) respectively. The example illustrates
the possibility of defining functions, doing a set of operations with them, and
returning a more comprehensive and encompassing function.
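The pattern of an outer function returning an inner one can be sketched as follows. The names make_trig and trig are hypothetical and only loosely modelled on faa and fbb; the selection rule (1 for sin, 2 for cos, anything else for tan) is taken from the description above.

```python
import math

def make_trig():
    """Return an inner function that selects sin, cos, or tan by nn."""
    def trig(xx, nn):
        if nn == 1:
            return math.sin(xx)
        elif nn == 2:
            return math.cos(xx)
        return math.tan(xx)      # every other value of nn
    return trig                  # the inner function itself, not a call

f = make_trig()
values = [f(0.3, nn) for nn in (1, 2, 3)]
print(values)
```

The inner function is invisible outside make_trig until it is returned and bound to a name such as f.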
Any object in its environment in Python (like a variable, a function, etc.) can be
read for its value using proper references. The value can be altered and reassigned
in the same environment when possible. Declaring an object as global or
nonlocal makes it possible to change the scope of access of the object for
reading or reassigning in different ways (Rossum and Drake 2014). The possibilities
are discussed here through small examples involving numbers.
a1 is a number with an assigned value (=3.1) in [1] in the Python Interpreter
sequence in Fig. 4.10. Function ff1 accepts b1 as an argument [2] and returns
a1*b1. With b1 = 2, ff1(2) returns 6.2 in [3]. When ff1 is called Python searches
for a1 within ff1 first; if it is not available there the scope of search is widened to the
immediate outer domain. In the specific case here a1 is available there with an
assigned value of 3.1. With this value of a1, a1*b1 is computed and returned. If
a1 is not available there either, the search continues in the next outer domain and so
on. If a1 is not available after all such possibilities are exhausted, an error (NameError) is
raised and execution terminated. The process of search for availability of any
object in this manner is automatic. It obviates the need for redefining or reassigning.
ff2 [4] has a1 = 3.0 as an assignment and returns a1*b2. Hence the function
ff2(2) in [5] returns 6.0. a1 within ff2() is different from a1 in [1]; the two have
separate identities. a1 within ff2() is automatically destroyed as one exits ff2(). a1
outside ff2() remains intact (with a value of 3.1) as seen from [6]. In case a variable
(or any object for that matter) defined within a function is to be available outside, it
has to be declared as global, as done in [8] within the function definition of ff3()
(see also Exercise 5.6). As many entities as desired can be declared as global in
this manner. With an assigned value of 4 for a2, ff3(2) has been evaluated as 8.0
and returned in [9]. a2 is accessed in the following line and its value confirmed as
4.0, the last value assigned to it. a3 = 44 in [10]. a3 has been declared as global
in [12] within function dem_f3() [11]. The value assigned there is 55. a3 is accessed in
[13] (outside the function definition of dem_f3()). Its value remains unaffected at
44, because the function has only been defined and not yet called. After dem_f3() is
called [14], the assignment within it acts on the global a3. Its value is 55, assigned
within the function call; [15] confirms this.
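The effect of a global declaration can be sketched as below. The names a3 and dem_f3 follow the text; the variables before and after are added here only to capture the two readings, and the snippet assumes it is run at module level (for example in a fresh interpreter session).

```python
a3 = 44

def dem_f3():
    global a3        # a3 inside the function is the module-level a3
    a3 = 55

before = a3          # function defined but not yet called
dem_f3()
after = a3           # the call has rebound the module-level name
print(before, after)
```

Defining the function changes nothing; the module-level value is altered only when the function is actually called.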
Fig. 4.10 A Python Interpreter sequence illustrating the use of global and nonlocal declarations
Table 4.1 The prints in the same sequence as in the execution sequence in Fig. 4.10; for brevity
only the flags are retained in the left column

Printed line | Details and reasons for the printed values of b1, b2, b3
f4 [27] | b1, b2, b3 values as assigned in [16], before function dem_b0() is called in [28]
f2 [29] | When dem_b0() is called, after b3 = 31, the only other executable statement within it [24] is executed; b3 = 31, its assigned value in [17]; b1 and b2 remain unaltered
f1 [30] | Subsequent to [24], dem_b1() is called within dem_b0(). b2 and b3 have been declared as global and nonlocal respectively. They are assigned values of 22 and 32 in [20] and [22]. These values are reflected here
f3 [20] | Since no new assignments have been made, b1, b2, and b3 retain their values as above
f5 [32] | As in all previous cases b1 remains unchanged at 10. b2 being a global object, the last assigned value (=22, though assigned two levels inside) is retained. b3 was nonlocal. The assigned value of 32 in [20] is valid only within dem_b0(). Once you come out, the b3 used there is destroyed. b3 (an altogether different object) as assigned in [16] has retained its value
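The behaviour summarized in Table 4.1 can be sketched in a few lines. The function names follow the table; the starting values 20 and 30 for b2 and b3 are assumptions (only b1 = 10 is stated in the text), and the snippet is meant to be run at module level.

```python
b1, b2, b3 = 10, 20, 30      # 20 and 30 are assumed starting values

def dem_b0():
    b3 = 31                  # local to dem_b0; module-level b3 untouched
    def dem_b1():
        global b2            # rebinds the module-level b2
        nonlocal b3          # rebinds b3 of the enclosing dem_b0
        b2, b3 = 22, 32
    dem_b1()
    return b3                # the value set through nonlocal

inner_b3 = dem_b0()
print(b1, b2, b3, inner_b3)
```

nonlocal reaches only as far as the nearest enclosing function, so the module-level b3 survives, while global jumps straight to the module level however deeply the statement is nested.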
4.2 Modules
A function defined and used in a Python Interpreter sequence is lost when you quit
the Python session. It is desirable to save a function developed, tested, and
debugged for later use. Such reuse can be direct, or indirect for use within another
function defined/used later. This is facilitated by the use of modules in Python. In
general a module is a file containing a set of definitions (of functions) and
statements. It is saved with the extension .py. A module can be generated in a text
editor and saved wherever desired. Let us consider the routine of Example 4.2. The
function defined for the pth root has been saved in a module in the current directory
with the name solun.py, as shown in Fig. 4.11. The only content of the
module solun.py is the function root_1. The Python Interpreter sequence in
Fig. 4.12 uses this module to run root_1. The module can be invoked with the
command import solun, as in [1]. With that the defined function is
available to the interpreter for execution. The command solun.root_1() in [2]
executes the function root_1 from the module (and outputs the cube root of 10), as can
be seen from the result in the following line. With the argument value for a as 100,
the function is again called and executed in [3] with p, nn, delta and i retaining
their default values. The cube root extraction with i = 8 in [4] takes 13 iterative
cycles to achieve the same accuracy.
Fig. 4.11 A Python module with the function in Example 4.2 as its content
Fig. 4.12 Python Interpreter sequence invoking the module in Fig. 4.11
The function has been assigned to rt in [6]. With this the whole function can be
accessed directly for execution using rt. The query solun.root_1 in [5] returns
the information <function root_1 at 0x7f4b60fcab70>, that is, that it is a function
(starting) at memory location 0x7f4b60fcab70. The query rt also returns
identical information, a clarification that rt too refers to (points to) the same
function. However the access to root_1 through rt is easier than using solun.root_1
(it involves fewer keystrokes). The cube root of 200 has been obtained
successively in [7], [8], and [9] using function rt. In [7] the search starts with the basic
range [5, 6] since 200 lies between 5^3 (=125) and 6^3 (=216). The corresponding
ranges for [8] and [9] are [1, 9] and [1, 25] respectively. In turn the number of iteration
cycles for completion of execution increases to n = 13 and n = 15 respectively.
Incidentally the error value with [9] is less (at 0.04563) than in [7] or [8] (at
0.06129).
If the module to be imported is not in the current directory it can be imported by
calling it from the source directory. alpha_1.py is a Python module in the
directory demo_s. It is imported with the command from demo_s import
alpha_1 in [1] in the Python sequence in Fig. 4.14. The module at the time of
import is in Fig. 4.13a. a has been assigned a value 11.3; b, c, and d are assigned
values in terms of a. a will be assigned the value (of 11.3) and b, c, and d
computed conforming to their definitions in Fig. 4.13a at the time of import; these are
done once for all. To access a, it has to be specified as alpha_1.a (a simple a
implies the entity called a in the main running sequence, if at all it exists). The
other quantities within alpha_1 can be accessed similarly [2]. An additional line
has been added to alpha_1.py and the file saved as shown in Fig. 4.13b. An
attempt to access dd as alpha_1.dd fails [3] since the import was effected prior
to the addition of the line involving dd.
The imp module in Python facilitates a renewal. To use this, the imp module
has to be imported [4]. A subsequent imp.reload(alpha_1) loads alpha_1
afresh [5]. In the present case a will be assigned the value 11.3 itself afresh. b, c,
and d too will be computed afresh. dd will be assigned the value of d**3 conforming
to its assignment in Fig. 4.13b. alpha_1.dd can be accessed and its value
displayed as can be seen from [6]. In Fig. 4.13c alpha_1.py has been enhanced
further. A function avr_var() to compute the average value and the variance of the
five quantities (a, b, c, d, and dd) and display them has been added. As earlier an
attempt to use the function [7] fails because the already imported version of the
module is not aware of this addition. Once again alpha_1.py has to be refreshed
through the command imp.reload(alpha_1) as in [8]. Subsequent access of
alpha_1.avr_var() is successful as can be seen from [9].
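The reload cycle described above can be tried out with the sketch below. It uses importlib.reload(), the replacement for imp.reload() in current Python versions (the imp module was deprecated in Python 3.4 and removed in Python 3.12). The temporary directory and the variable names had_dd_before and has_dd_after are assumptions made so that the example is self-contained; the file contents follow Fig. 4.13a and b.

```python
import importlib
import os
import sys
import tempfile

# Work in a throwaway directory so that no real files are touched.
workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)
sys.dont_write_bytecode = True              # avoid stale cached bytecode
path = os.path.join(workdir, "alpha_1.py")

with open(path, "w") as f:                  # stage (a) of the module
    f.write("a = 11.3\nb = a/3\nc = b*b\nd = c*2\n")

import alpha_1
had_dd_before = hasattr(alpha_1, "dd")      # dd is not yet in the file

with open(path, "a") as f:                  # stage (b): one more line
    f.write("dd = d**3\n")

importlib.invalidate_caches()
importlib.reload(alpha_1)                   # re-executes the edited file
has_dd_after = hasattr(alpha_1, "dd")
print(had_dd_before, has_dd_after)
```

A reload re-executes the module's statements in the existing module object, so names bound elsewhere to the old objects are not updated automatically.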
The function avr_var_1() added in Fig. 4.13d uses the values assigned to a, b,
c, d, and dd at the time of reload to compute the average value and variance. Here
the same function as the earlier one is repeated with the five inputs a1, a2, a3, a4,
and a5. alpha_1 is again reloaded [10]. The numerical values for a, b, c, d, and
dd obtained above are assigned to the set, and avr and var computed again
separately. The values here [11] are identical to those following [9] (Fig. 4.14).
The executable statements in a module are useful to initialize the variable
values/objects prior to their use in functions defined subsequently in the same
module, as was done in the trivial/illustrative example here.
An alternative to the reload operation explained above is to quit the running
Python sequence and start another one afresh. The new import imports the
(a)
a = 11.3
b = a/3
c = b*b
d = c*2

(b)
a = 11.3
b = a/3
c = b*b
d = c*2
dd = d**3

(c)
a = 11.3
b = a/3
c = b*b
d = c*2
dd = d**3
def avr_var():
    'Alternate average & variance of 5 numbers'
    av = (a + b + c + d + dd)/5
    # parentheses keep the two-line sum as one expression
    var = ((a-av)**2 + (b-av)**2 + (c-av)**2 + (d-av)**2
           + (dd-av)**2)
    var = var/5
    print ('average = ', av, ': variance = ', var)
    return

(d)
a = 11.3
b = a/3
c = b*b
d = c*2
dd = d**3
def avr_var():
    'Alternate average & variance of 5 numbers'
    av = (a + b + c + d + dd)/5
    # parentheses keep the two-line sum as one expression
    var = ((a-av)**2 + (b-av)**2 + (c-av)**2 + (d-av)**2
           + (dd-av)**2)
    var = var/5
    print ('average = ', av, ': variance = ', var)
    return

Fig. 4.13 Four successive stages in the development of the module alpha_1.py
Fig. 4.14 Python Interpreter sequence testing and developing successive stages of the module
alpha_1.py
updated version of alpha_1.py. This process is attractive only if one, two, or three
modules have been imported for the session. In the case of long running sequences
with a number of imported modules present, a reload is preferable.
Example 4.5 The infinite series for exp(x) (Sullivan 2008) is:
\[ \exp(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!} \]
(a)
def xprx(x):
    'compute exp(x)'
    y, z, i = 1.0, 1.0, 1
    while True:
        z *= x/i
        i += 1
        if abs(z)< 1.0e-10: break
        else: y += z
    return y

(b)
def xcsx(x):
    'compute cos(x)'
    y, z, i = 1, 1, 1
    while True:
        z *= -x*x/(i*(i+1))
        i += 2
        if abs(z)< 1.0e-10: break
        else: y += z
    return y

(c)
def xsnx(x):
    'compute sin(x)'
    y, z, i = x, x, 2
    while True:
        z *= -x*x/(i*(i+1))
        i += 2
        if abs(z)< 1.0e-10: break
        else: y += z
    return y

(d)
def xprx(x):
    'compute exp(x)'
    y, z, i = 1.0, 1.0, 1
    while True:
        z *= x/i
        i += 1
        if abs(z)< 1.0e-10: break
        else: y += z
    return y
def xsnx(x):
    'compute sin(x)'
    y, z, i = x, x, 2
    while True:
        z *= -x*x/(i*(i+1))
        i += 2
        if abs(z)< 1.0e-10: break
        else: y += z
    return y
def xcsx(x):
    'compute cos(x)'
    y, z, i = 1, 1, 1
    while True:
        z *= -x*x/(i*(i+1))
        i += 2
        if abs(z)< 1.0e-10: break
        else: y += z
    return y

Fig. 4.15 Routines for Example 4.5: a Routine for exp(x), b Routine for cos(x), c Routine for
sin(x), d Module trgf with the routines for exp(x), cos(x), and sin(x)
The nth term in the series for exp(x) can be expressed in terms of the (n-1)th
term as
\[ \frac{x^n}{n!} = \frac{x^{n-1}}{(n-1)!} \cdot \frac{x}{n} \]
Hence the nth term can be evaluated by multiplying the (n-1)th term by x/n. With this
the code for exp(x) is given in Fig. 4.15a. The summation is continued until the
magnitude of the new term is less than $10^{-10}$.
The series for cos x is
\[ \cos x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} \]
It has only the terms involving even powers of x, with the alternate terms being
negative. Each term is evaluated from the previous one by multiplying it by
$-x^2/(i(i+1))$. Here again the summation is continued until the contribution from the
new term becomes less than $10^{-10}$. The code for the function is in Fig. 4.15b. The
infinite series expansion for sin x is
\[ \sin x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!} \]
The code for sin x is done on the same lines as that for cos x; it is in Fig. 4.15c.
The functions xprx(), xcsx(), and xsnx() are for exp(x), cos(x), and sin(x)
respectively. They are all in the module trgf.py (trigonometric functions).
The module has been imported into the Python Interpreter session reproduced in
Fig. 4.16 in [1]. exp(0.5j), cos(0.5), and sin(0.5) have been evaluated using the
respective functions trgf.xprx, trgf.xcsx, and trgf.xsnx in [2], [4], and [6]
respectively. One can see that exp(0.5j) [3] is equal to cos(0.5) [5] + j sin(0.5) [7].
exp(0.5) has been evaluated in [8]. trgf.xprx has again been used to evaluate e [9]
as exp(1).
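The check that exp(0.5j) equals cos(0.5) + j sin(0.5) can be sketched as follows. The function series_exp is a hypothetical stand-in written along the lines of xprx() in Fig. 4.15a, and math.cos and math.sin are used for the comparison instead of the series routines.

```python
import math

def series_exp(x, tol=1.0e-10):
    """Sum x**n/n! term by term; each term is the previous one times x/n."""
    total, term, n = 1.0, 1.0, 1
    while True:
        term *= x / n
        n += 1
        if abs(term) < tol:      # abs() also works for complex terms
            break
        total += term
    return total

lhs = series_exp(0.5j)                        # complex argument
rhs = complex(math.cos(0.5), math.sin(0.5))   # Euler's formula
print(lhs, rhs, abs(lhs - rhs))
```

Because the loop relies only on multiplication, addition, and abs(), the same routine serves real and complex arguments without change.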
Fig. 4.17 Sketches of functions 0.8exp(x) and sin x showing the solutions for Example 4.6
def cff():
    '''print cos(x) for x such that cexp(x) - sin(x) = 0
    for 0< x < 1.6'''                                     [7]
    aa = solna()
    print('aa = ', aa)
    return
The default values for a and d have been used in this example. a exp(x) - sin(x) = 0
can be solved in a similar manner for any other value of the constant a. The desired
accuracy can be achieved by suitably redefining the value of d (Fig. 4.19).
Fig. 4.20 Python Interpreter sequence to illustrate the access details of exponential and related
functions in the math module
Python has a number of built-in functions. They are always available for use. Use of
some of them in a limited form has been explained earlier (more of this later).
A number of built-in modules, like math and random, are also available. They
can be imported and the functions within them used by programmers, as was done with
the user-defined modules here.
Table 4.2 Exponential and related functions: illustrations for use are in the Python Interpreter
sequence in Fig. 4.20

Access | Scope | Reference
math.exp(a) | Returns exp(a) | [1], [2]
math.expm1(a) | Returns (exp(a) - 1); preferable when a is close to zero. Results are compared in [3], [4], and [5] | [3], [4], [5], [6]
math.log(a) | Returns the natural log of a | [7]
math.log1p(a) | Returns the natural log of (1 + a); useful when a is close to 0. [10] and [11] compare results with use of log a | [10], [11]
math.log2(a) | Returns $\log_2 a$; $\lceil \log_2 a \rceil$ is the number of bits in a in binary form, as can be seen from [12] and [13] | [12], [13]
math.log10(a) | Returns $\log_{10} a$ | [7]
math.log(a,b) | Returns the log of a to the base b; $\log_e 11 = \log_{10} 11 \times \log_e 10$ is verified by [9] | [7], [8], [9]
math.sqrt(a) | Returns $\sqrt{a}$ | [15]
math.pow(a,b) | Returns $a^b$; [15] verifies $(\sqrt{2})^2 = 2$ | [14], [15]
Reference denotes the relevant lines in it
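The reason for preferring math.expm1 and math.log1p near zero can be seen in a short sketch. The test value 1.0e-12 and the variable names are chosen here only for illustration and are not taken from Fig. 4.20.

```python
import math

a = 1.0e-12
naive_exp = math.exp(a) - 1      # loses digits through cancellation
better_exp = math.expm1(a)       # avoids forming exp(a) first
naive_log = math.log(1 + a)      # 1 + a is rounded before the log
better_log = math.log1p(a)
print(naive_exp, better_exp)
print(naive_log, better_log)
```

For so small an argument both exp(a) - 1 and log(1 + a) should be very nearly a itself; the dedicated functions reproduce this, while the naive forms are limited by the rounding of a number close to 1.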
Table 4.3 Trigonometric and related functions: illustrations for use are in the Python Interpreter
sequence in Fig. 4.21

Access | Scope (all argument values are in radians) | Reference
math.cos(a) | Returns cos a | [1], [2]
math.sin(a) | Returns sin a | [3], [4]
math.tan(a) | Returns tan a | [5], [6], [7], [8]
math.hypot(a, b) | Returns the value of the hypotenuse of the right-angled triangle with a and b as sides | [9], [10]
math.degrees(a) | Returns value of a in degrees | [11], [12]
math.radians(a) | Returns value of a in radians | [13], [14]
math.acos(a) | Returns value of acos(a) in radians | [15], [16]
math.asin(a) | Returns value of asin(a) in radians |
math.atan(a) | Returns value of atan(a) in radians |
math.atan2(a, b) | Returns value of atan(a/b) in radians in the range $-\pi$ to $+\pi$, the quadrant being decided by the signs of a (sin) and b (cos) |
Reference denotes the relevant lines in it
Table 4.4 Hyperbolic functions: illustrations for use are in the Python Interpreter sequence in
Fig. 4.22

Access | Scope | Reference
math.cosh(a) | Returns cosh a | [1], [2]
math.sinh(a) | Returns sinh a | [5], [6]
math.tanh(a) | Returns tanh a; [11] and [12] verify tanh(a) = sinh(a)/cosh(a) | [9], [10], [11], [12]
math.acosh(a) | Returns value of acosh(a) | [3], [4]
math.asinh(a) | Returns value of asinh(a) | [7], [8]
math.atanh(a) | Returns value of atanh(a) | [14], [15]
Reference denotes the relevant lines in it
Table 4.5 Additional functions in math: illustrations for use are in the Python Interpreter
sequence in Fig. 4.23

Access | Scope | Reference
math.ceil(a) | Returns the ceiling, the smallest integer $\geq$ a | [1]
math.floor(a) | Returns the floor, the largest integer $\leq$ a | [2]
math.copysign(x,y) | Returns a number having the sign of y and absolute value of x | [3]
math.fabs(a) | Returns the absolute value of a | [4]
math.factorial(a) | Returns a! | [5]
math.fmod(x, y) | Returns x mod y; use this with floating point numbers and x%y with integers | [6]
math.frexp(a) | Returns a as a mantissa-exponent pair (m, e) such that a = m*(2**e) | [7]
math.ldexp(m, e) | Returns m*(2**e) | [8]
math.fsum(a) | Returns the sum of elements in a (a tuple/list or similar sequence of numbers). sum(a) is discussed later (in Chapter 5) | [9], [10], [11]
math.modf(a) | Returns the fractional and integer parts of a as a pair of floating point numbers, both carrying the sign of a | [12]
math.trunc(a) | Truncates a to an integer and returns the same | [13]
Reference denotes the relevant lines in it
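A few of the functions in Table 4.5 are easier to appreciate through a short sketch. The argument values below are chosen for illustration and are not those of Fig. 4.23.

```python
import math

m, e = math.frexp(10.0)            # 10.0 == m * 2**e with 0.5 <= m < 1
rebuilt = math.ldexp(m, e)         # the inverse operation
frac, whole = math.modf(-3.25)     # fractional and integer parts
plain = sum([0.1] * 10)            # ordinary left-to-right float sum
exact = math.fsum([0.1] * 10)      # keeps track of the lost low-order bits
print(m, e, rebuilt, frac, whole, plain, exact)
```

math.fsum is the safer choice when many floating point numbers of differing magnitudes are added, since the built-in sum() accumulates rounding errors.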
4.3 Exercises
Fig. 4.21 Python Interpreter sequence to illustrate the access details of trigonometric and related
functions in the math module
>>> l1, l2, l3, l4, l5 = 0, 1, 2, -1, -2
>>> m1, m2, m3, m4, m5 = math.cosh(l1), math.cosh(l2), math.cosh(l3), math.cosh(l4), math.cosh(l5)   [1]
>>> m1, m2, m3, m4, m5                                    [2]
(1.0, 1.5430806348152437, 3.7621956910836314, 1.5430806348152437, 3.7621956910836314)
>>> n1, n2, n3, n4, n5 = math.acosh(m1), math.acosh(m2), math.acosh(m3), math.acosh(m4), math.acosh(m5)   [3]
>>> n1, n2, n3, n4, n5                                    [4]
(0.0, 1.0, 2.0, 1.0, 2.0)
>>> o1, o2, o3, o4, o5 = math.sinh(l1), math.sinh(l2), math.sinh(l3), math.sinh(l4), math.sinh(l5)   [5]
>>> o1, o2, o3, o4, o5                                    [6]
(0.0, 1.1752011936438014, 3.626860407847019, -1.1752011936438014, -3.626860407847019)
>>> p1, p2, p3, p4, p5 = math.asinh(o1), math.asinh(o2), math.asinh(o3), math.asinh(o4), math.asinh(o5)   [7]
>>> p1, p2, p3, p4, p5                                    [8]
(0.0, 1.0, 2.0, -1.0, -2.0)
>>> q1, q2, q3, q4, q5 = math.tanh(l1), math.tanh(l2), math.tanh(l3), math.tanh(l4), math.tanh(l5)   [9]
>>> q1, q2, q3, q4, q5                                    [10]
(0.0, 0.7615941559557649, 0.9640275800758169, -0.7615941559557649, -0.9640275800758169)
>>> r1, r2, r3, r4                                        [11]
(1.1752011936438014, 1.8134302039235095, -1.1752011936438014, -1.8134302039235095)
>>> r1, r2, r3, r4, r5 = o1/m1, o2/m2, o3/m3, o4/m4, o5/m5   [12]
>>> r1, r2, r3, r4, r5                                    [13]
(0.0, 0.7615941559557649, 0.964027580075817, -0.7615941559557649, -0.964027580075817)
>>> s1, s2, s3, s4, s5 = math.atanh(q1), math.atanh(q2), math.atanh(q3), math.atanh(q4), math.atanh(q5)   [14]
>>> s1, s2, s3, s4, s5                                    [15]
(0.0, 0.9999999999999999, 2.0000000000000004, -0.9999999999999999, -2.0000000000000004)
>>>
Fig. 4.22 Python Interpreter sequence to illustrate the access details of hyperbolic functions in the
math module
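Leibniz Series:
\[ \frac{\pi}{8} = \frac{1}{1 \cdot 3} + \frac{1}{5 \cdot 7} + \frac{1}{9 \cdot 11} + \cdots \]
arctan(1.0):
\[ \pi = \sum_{n=0}^{\infty} \frac{8}{16n^2 + 16n + 3} \]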
Fig. 4.23 Python Interpreter sequence to illustrate the access details of functions in the math
module detailed in Table 4.5
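Nilakantha:
\[ \pi = 3 + \frac{4}{2 \cdot 3 \cdot 4} - \frac{4}{4 \cdot 5 \cdot 6} + \frac{4}{6 \cdot 7 \cdot 8} - \cdots \]
Spigot:
\[ \pi = \sum_{n=0}^{\infty} \frac{1}{16^n} \left( \frac{4}{8n+1} - \frac{2}{8n+4} - \frac{1}{8n+5} - \frac{1}{8n+6} \right) \]
Prepare programs to evaluate the value of $\pi$ using each of the above series and
test each.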
References
5.1 String
70 5 Sequences and Operations with Sequences
5.2 Tuple
Fig. 5.2 Element g [19] in the sequence in Fig. 5.1 showing the structure and identity of the different
elements referred to: g[0], g[1], g[0][0], g[0][0][1], g[0][0][2], and g[0][0][1][1:4]
5.3 List
Fig. 5.3 Python Interpreter sequence to explain the concepts of list, dictionary, and set
list and the second [7] a tuple, hll is a list of three elements (h1, h2, and the
integer 44) [8] and [9]. [10] confirms the third element in hll, hll[2], to be an
integer. Different operations with lists and the elements in them are discussed later.
5.4 Dictionary
5.5 Set
Example 5.1 Count the number of vowels and the number of words in
"Make me, oh God, the prey of the lion, ere You make the rabbit my prey."
(Gibran 1926).
Also count the number of occurrences of the letters a and b in the above quote,
irrespective of whether each is a small letter or a capital letter.
vow1(ss) in the module dem_wr reproduced in Fig. 5.4 is the Python program
for counting the number of vowels in the string ss. vls is a set with all the
vowels (small and capital letters together, ten in number) as its members [1]. The
function len(ss) represents (in [2]) the number of items in ss. In the present
context ss is a string and len(ss) is the number of characters in ss. Every
character in ss, starting from ss[0] up to ss[len(ss)-1] (the last one), is
examined successively. If it matches any entry in vls (that is, if it is a vowel), c (a
counter) is incremented [3]. The c value at the end of counting is returned. The
final count value of c is the number of vowels in ss.
The Python Interpreter sequence in Fig. 5.5 has s1 as the given sequence [1].
len(s1) gives the number of characters in s1 as 71 [2]. With s1 as input vow1()
is run [3]. The number of vowels in s1 is seen to be 21.
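A sketch of a vowel counter along the lines described above is given here. It is written from the description and is not a copy of the routine in the dem_wr module; the quote is assumed to end with a full stop, which makes its length 71.

```python
def vow1(ss):
    """Count the vowels in ss, examining one character at a time."""
    vls = set("aeiouAEIOU")          # ten members in all
    c = 0
    for i in range(len(ss)):
        if ss[i] in vls:             # membership test against the set
            c += 1
    return c

s1 = ("Make me, oh God, the prey of the lion, "
      "ere You make the rabbit my prey.")
print(len(s1), vow1(s1))
```

A set is used for vls because a membership test on a set does not depend on the order or number of its members.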
def wrd2(ss):
    'No of words in ss'
    noalpha = {' ', ',','.'}                                      [4]
    w, l = 0, len(ss)
    for i in range(l-1):
        if (ss[i] not in noalpha) and (ss[i+1] in noalpha):
            w += 1                                                [5]
    return w

def wrd1(ss):
    'No. of a /b in ss'
    na, nb = 0, 0
    aA, bB = {'a','A'}, {'b', 'B'}                                [6]
    for i in range(len(ss)):
        if ss[i] in aA: na += 1                                   [7]
        elif ss[i] in bB: nb += 1                                 [8]
    return (na, nb)                                               [9]
Fig. 5.5 Module dem_wr.py with the Python routines for Example 5.1
all and any functions facilitate repeated testing for being true or false.
x == a and x == b and x == c can be implemented compactly using all by
testing x == l for all l in {a, b, c}. Similarly x == a or x == b or
x == c can be tested compactly using any.
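The compact forms with all and any can be sketched as follows. The values used are arbitrary and serve only the illustration.

```python
x, a, b, c = 5, 5, 5, 5

# Same as: x == a and x == b and x == c
same_as_all = all(x == item for item in (a, b, c))

# Same as: x == 1 or x == 5 or x == 9
same_as_any = any(x == item for item in (1, 5, 9))

print(same_as_all, same_as_any)
```

Both functions stop as soon as the outcome is decided: all at the first false item and any at the first true one.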
Example 5.2 Identify all the numbers in the range {100, 200} which do not have
any of the numbers in {2, 3, 5, 7, 11, 13, 17} as a factor.
def anytst(lb):
    'test for any failures'
    # la is the list of students who failed in the class
    # lb: given list
    # Check whether any in given list has failed
    la = 'a', 'b', 'c', 'd', 'e'                                  [3]
    for j in lb:
        # == compares values; is would compare object identity
        if any( j == k for k in la):                              [4]
            print (j+' failed')
        else: print (j+' passed')                                 [5]
    return
Fig. 5.6 Module dem_all.py with the Python routines for Examples 5.2 and 5.3
The routine is the function alltst(ab, ae) in the module dem_all.py reproduced
in Fig. 5.6. s = (2, 3, 5, 7, 11, 13, 17) is a tuple of the given numbers [1] in
the function. For any j, the test all(j%l for l in s) in [2] tests whether j leaves a
nonzero remainder with every one of the elements in s. If none of them divides j,
all(j%l for l in s) is True. If this condition is satisfied this specific j value is added
(appended) to the list k. The appending is done at the right end of k. It increases
the number of elements in k by one. The test is done for all j values in the specified
range (ab, ae).
[2] also brings out the generality of for in its use. for l in s implies "for all
entries in s". Here s can be a tuple, a list and so on; but all the entries should be of
the same type. It should also match the type of j here. These are implied in the
use of for in the context.
[1] in the Python Interpreter sequence in Fig. 5.7 executes the routine for the
desired range, {100, 200}. The output is the list in [2].
Use of the method append() has been illustrated here. aa.append(b) is a
command to add item b to aa. Here aa is a list; b can be any entity. It will be
appended to aa, that is, added to aa as its last element. In turn the number of
elements in aa increases by one.
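A sketch of a routine on the lines of alltst() follows. It is written from the description above, not copied from dem_all.py, and the use of range(ab, ae) for the specified range is an assumption.

```python
def alltst(ab, ae):
    """List every j in range(ab, ae) that has no factor in s."""
    s = (2, 3, 5, 7, 11, 13, 17)
    k = []
    for j in range(ab, ae):
        # j % l is nonzero (treated as True) exactly when l does not divide j
        if all(j % l for l in s):
            k.append(j)              # added at the right end of k
    return k

result = alltst(100, 200)
print(result)
```

Since 19*19 exceeds 200, every number below 200 that survives this test is a prime.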
Example 5.3 A list of students who failed in an examination is given as la.
A second list of students lb is input. Check to see whether anyone in lb has failed.
The function anytst(lb) in the module dem_all.py in Fig. 5.6 serves the
purpose. [4] checks whether j matches any entry in la, that is, whether the name j is
5.6 Operators with Sequences 77
Fig. 5.7 Python Interpreter sequence for Examples 5.2 and 5.3
present in the failed list [3]; if so j is declared failed in the output; else j is declared
passed. It is done for every entry in lb as can be seen from [4].
The Python Interpreter sequence in Fig. 5.7 specifies a student list p in [3].
The status of all the students in p is tested and the desired results output in the following
lines.
The built-in function sum() takes a sequence and returns the numerical sum of the
items in it. All the items in the sequence are to be numbers. [2] in the Python
Interpreter sequence in Fig. 5.8 uses the sum function and computes the mean
value of the numbers in ll. The function sum() is a bit more general than the way it
is used here. It accepts two arguments; sum(a, b) should have a as a sequence of
numbers. b should be reducible to a number and forms the bias: b is added to the
sum of the elemental values of a. If b is absent the bias takes its default value of
zero, as in the example here. [3] illustrates the more generalized use of sum().
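The two forms of sum() can be sketched as below. The list ll here holds hypothetical data and is not the one used in Fig. 5.8.

```python
ll = [2, 4, 6, 8]            # hypothetical data
mean = sum(ll) / len(ll)     # bias omitted, so it defaults to zero
biased = sum(ll, 100)        # the second argument is added to the total
print(mean, biased)
```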
The eval() function in its simplest form accepts any expression as a string
and evaluates it. [4] is a trivial example; a1, as evaluated here, is sin 0.3. x1 has
been assigned the numerical value 1.2 in [5]. a2 in [6] does the evaluation of
sin(0.3 + 0.1*x1) using this value of x1. With x2 = 0.4 in [7], a3 is evaluated in [8] as
(x1**2 + x2**2). The expression to be evaluated can be any built-in function or a
user-defined (if necessary imported) function; all its arguments have to be assigned
values beforehand for eval() as used here. [9] illustrates a more general use of
eval(). All the arguments used in eval() are made available through a
dictionary forming the second argument of eval(). Here a4 is evaluated as
(x3**2 + x4**2), x3 and x4 being assigned numerical values of 0.2 and 0.4
respectively through the dictionary. In short eval(alpha, beta) evaluates and
returns the expression alpha as follows:
Fig. 5.8 Python Interpreter sequence illustrating use of sum() and eval()
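The two uses of eval() discussed above can be sketched as follows. The expressions are modelled on those described for Fig. 5.8 but the snippet is an independent illustration; math is imported here because the string refers to math.sin.

```python
import math

x1 = 1.2
# One-argument form: names in the string are looked up in the
# current environment, so x1 and math must already exist.
a2 = eval("math.sin(0.3 + 0.1*x1)")

# Two-argument form: the dictionary supplies the names used.
a4 = eval("x3**2 + x4**2", {"x3": 0.2, "x4": 0.4})
print(a2, a4)
```

eval() executes whatever expression the string contains, so it should be used only with strings from a trusted source.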
5.7 Iterator
We have seen that sequences like string, tuple, and list have a number of
elements within, each with its own positional identity. In a program one often has to
carry out a set of operations for each of the members of the sequence. Identifying
prime numbers in a sequence of numbers, checking for the presence of 'Ram' or 'imp'
in a name, counting the number of letters in a string, calculating the grade point
average of a student, and calculating the ex-factory cost of the products made in a factory
are examples. Any sequence with such elements in it, for which one or a set of
operations can be carried out, is an iterable. An iterator is associated with an
iterable; it points to a specific location in the iterable (Ramalho 2014). As and when
required the data in the specific location concerned is accessed and used for
processing. The access here is on an on-demand basis and the full data is not called except
when specifically demanded. The iter() function generates an iterator directly from
an iterable. The Python Interpreter sequence in Fig. 5.9 clarifies the concepts
associated with the iter() function. lt in [1] is a list with a set of five distinct elements in it. It
is an iterable. The function iter(lt) returns an iterator a1 from lt [2]. lt[0], lt[2],
and lt[4] are strings; each is an iterable with a distinct character set within. [3], [5],
and [7] return the respective iterators as a10, a12, and a14. lt[1] is an
integer and lt[3] is a floating point number; neither is iterable. Attempts to extract an
iterator from these fail ([4], [6]) and TypeError is raised.
The function next() returns the next item from the iterator, starting
from the 0th one. next(a10), next(a12), and next(a14) in [8] return the
respective values as 'R', '3', and '3'. [9] advances to the
subsequent set, that is, those with index 1. A repeat of the attempt to access the next
value fails in the case of a12 [10] and StopIteration is raised. One
more attempt to access the next set of values (4th one) will raise
StopIteration with a10 and a14 as well [11]. Being wiser with the above, we
make fresh attempts in [12] by assigning a10, a12, and a14 afresh. The repeated
accesses of the next element continue until the iterators are exhausted.
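The behaviour described above can be tried out with a short sketch. The list lt used here is only an illustrative stand-in (the actual contents of lt in Fig. 5.9 are not reproduced), and the names first, rest, exhausted, and int_is_iterable are introduced purely for this illustration.

```python
# Illustrative list: strings at indices 0, 2, 4; numbers at indices 1, 3
lt = ['Ram', 31, 'ab', 2.5, 'xyz']

a10 = iter(lt[0])               # a string is iterable, so iter() succeeds
first = next(a10)               # first character of 'Ram'
rest = [next(a10), next(a10)]   # the remaining two characters

try:
    next(a10)                   # nothing is left in the iterator
    exhausted = False
except StopIteration:
    exhausted = True

try:
    iter(lt[1])                 # an integer is not iterable
    int_is_iterable = True
except TypeError:
    int_is_iterable = False
```

Once an iterator is exhausted it stays exhausted; a fresh call to iter() on the same iterable is needed to start again from the 0th item.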
Example 5.4 Extract the number of words in the quote
'Tenderness and kindness are not signs of weakness and
despair, but manifestations of strength and resolution.' using
the iter() function.
The relevant program ctw() (module fp1) is in Fig. 5.10. The logic for word
extraction is the same as that in Example 5.1. The non-alphabetic character set
noalpha has been enhanced here by adding '?' to it [1]. Both x1 and x2 are
iterators of the string qt [2]. With x2 advanced by one position in [3], x1 and x2 return
successive characters of the iterable (the quote here). [4] tests for a word ending and
increments the word count when a word is identified. The quote of interest here is
assigned to a1 in [18] in Fig. 5.9. fp1.ctw(a1) [19] in Fig. 5.9 can be seen to
return the final word count in the given quote as 16.
As another illustration, the number of words in the string a2 is counted through
fp1.ctw(a2) and returned (=4) in [20].
The general version of next() takes two arguments. The first is the iterator. The
second one, a default element, is optional. If it is present, the default is returned
once the iterator is exhausted. The illustrations thus far omitted the second
argument. a1 in [21] in Fig. 5.9 is the iterator for lt. Line [22] returns the successive
elements of lt until lt is exhausted. Subsequent lines ([23] onwards) return 'z',
the default quantity specified.
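A minimal sketch of the two-argument form of next() follows. The two-element list and the name values are illustrative only; 'z' is used as the default, as in the discussion above.

```python
a1 = iter([10, 20])      # illustrative two-element list

# Four successive calls: the first two return list items,
# the last two return the default instead of raising StopIteration
values = [next(a1, 'z') for _ in range(4)]
```

Here values ends up as [10, 20, 'z', 'z'].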
def ctw(qt):
    'Count the no. of words in a quote - use iter()'
    noalpha = {' ', ',', '.', '?'}                     # [1]
    x1, x2, c = iter(qt), iter(qt), 0                  # [2]
    next(x2)                                           # [3]
    for j in range(len(qt) - 1):
        a, b = next(x1), next(x2)
        if a not in noalpha and b in noalpha: c += 1   # [4]
    return c
squares ([4, 9, 16]) in [3]; here the iterator map(a, (2, 3, 4)) is directly
converted into the list b. Function aa in [4] accepts two arguments, the x and y
components of a vector, and returns the vector magnitude (Euclidean norm).
In turn bb in [5] uses the x-component sequence (6, 5, 4) and the y-component
sequence (2, 6, 7) to compute and return the corresponding sequence of magnitudes
as a tuple in [6]. [7] illustrates the use of the built-in function pow() with two
arguments. The resulting list [7] is [$7^2$, $5^3$, $3^2$]. Similarly [8] (using three
arguments) returns $7^2$ % 11, $5^3$ % 11, and $3^4$ % 11 as the list [5, 4, 4]. Note that in all
these cases, all the arguments following the function must be iterables; map() stops
when the shortest of them is exhausted, so they are normally of the same
length. Further, the number of sequences should match the number of arguments for
the function.
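The use of map() with more than one sequence can be sketched as follows. math.hypot stands in for the magnitude function aa of the text, and the exponent sequence (2, 3, 4) is an assumption chosen to be consistent with the residues [5, 4, 4] quoted above; the names mag, powers, and residues are illustrative.

```python
import math

# Two sequences feed a two-argument function: vector magnitudes
mag = tuple(map(math.hypot, (6, 5, 4), (2, 6, 7)))

# pow() with two sequences: base ** exponent, element by element
powers = list(map(pow, (7, 5, 3), (2, 3, 4)))

# pow() with three sequences: (base ** exponent) % modulus
residues = list(map(pow, (7, 5, 3), (2, 3, 4), (11, 11, 11)))
```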
Example 5.5 Form the scalar product of the vectors a = [1,2,3,4,5], and c = [2, 3,
5, 7, 0.5].
In the Python interpreter sequence in Fig. 5.13, mul [10] is defined as a lambda
function which multiplies the arguments x and y. map(mul, a, c) in [11] is an
iterator of the products of the corresponding components of the vectors a and c. Their sum,
d (= sum(map(mul, a, c))), is the scalar product [11]. d has been evaluated as 36.5 [12].
The mapping can be useful in other ways also. fc in [3] in the module demap
(Fig. 5.14) is a tuple of two functions. They return the square and the cube of
their argument respectively. [4] accepts an argument z and returns vv with $z^2$ and $z^3$ as its
elements. The function set fc forms the sequence argument here. [14] in Fig. 5.13
uses the map for vv(2) to return the list [$2^2$, $2^3$].
Example 5.6 A set of numbers is given: (31, 42, 87, 55, 95, 68). Get their mean
and variance.
The Python program is meva() in Fig. 5.14 in the module demap. bb in [5] is
the number of elements in the input sequence dd. med in [6] gives the mean value
of the elements of the sequence dd. The function list(map(sq, dd)) forms a
list with the squares of the items in dd as its elements. The mean of these squares is
formed and $med^2$ subtracted from it to get the variance (vr). This conforms to the
definition of variance as (Decoursey 2003)
\[ \operatorname{var}(x_i) = \frac{\sum x_i^2}{n} - x_{\text{mean}}^2 \]
Fig. 5.14 Another Python Interpreter sequence to illustrate use of map() function
The mean and variance together are returned as a tuple (med, vr) in [8].
Reverting to the Python Interpreter sequence in Fig. 5.13, the tuple ll in [15] is
the given sequence of numbers. The function meva is called with this as argument
in [16] and the (mean, variance) pair is returned.
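A possible form of such a routine is sketched below. It follows the steps described for meva() (count, mean, mean of squares minus the square of the mean), but it is a reconstruction under that description and not the listing of Fig. 5.14; the names mean and var are illustrative.

```python
def meva(dd):
    'Return the (mean, variance) pair for the numbers in dd'
    bb = len(dd)                     # number of elements
    med = sum(dd) / bb               # mean value
    sq = lambda x: x * x             # squaring function used with map()
    vr = sum(map(sq, dd)) / bb - med ** 2   # mean of squares minus mean squared
    return med, vr

mean, var = meva((31, 42, 87, 55, 95, 68))
```

For the six numbers of Example 5.6 this gives a mean of 63.0; the variance is the population variance (division by n, not n - 1).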
The zip() function accepts a number of iterables as input and returns an iterator
of tuples. The jth returned element has the jth elements of all the iterable inputs. The
Python interpreter sequence of Fig. 5.15 illustrates its features. a is a list of five
integers [1] and b a string of five characters [2]. e [3] zips the two, treating the
characters in b as a sequence of characters, and outputs the list of all the five
tuples so formed [4].
[5] and [6] form another illustration of the use of zip(); p, q, and r are identical
iterables formed with range(0, 10), that is, 0 to 9 inclusive. zip(p, q, r) is the iterator of
corresponding tuples, the jth tuple being (j, j, j). This lot of ten tuples is
returned as a list in [6].
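The two uses of zip() described above can be sketched as follows. The contents of a and b are illustrative (the actual values in Fig. 5.15 are not reproduced), and a shorter range is used for brevity.

```python
a = [1, 2, 3, 4, 5]        # illustrative list of five integers
b = 'abcde'                # illustrative string of five characters

e = list(zip(a, b))        # pairs (integer, character)

p, q, r = range(0, 3), range(0, 3), range(0, 3)
triples = list(zip(p, q, r))   # jth tuple is (j, j, j)
```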
Example 5.7 The marks obtained by five students (Kishore, Sanjay, Siva,
Asha, and Nisha) in the subjects physics, chemistry, maths,
mechanics, and english are available as respective lists. Rearrange the list
with separate groups of names and marks in individual subjects.
The information given with the students' names and subjects is in Fig. 5.16. sa in
[7] in Fig. 5.15 constitutes a list of tuples (s1, s2, s3, s4, s5), each
representing the data for one student. This single element tuple (sa) is zipped [8]
and produced as a list [9], [10]. [11] has these five sequences as inputs. When
zipped as a list we get the names, marks in Physics, and so on as separate lists.
Figure 5.17 illustrates the process.
Fig. 5.17 Formation of tuples of names and marks in physics, from the tuple SS
(Example 5.7)
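The regrouping can be sketched with the unpacking form zip(*sa), which passes the individual student tuples as separate arguments to zip(). The records below are hypothetical (three students, two subjects) and are not the data of Fig. 5.16.

```python
# Hypothetical records: (name, physics, chemistry)
s1 = ('Kishore', 75, 66)
s2 = ('Sanjay', 81, 72)
s3 = ('Siva', 64, 90)
sa = [s1, s2, s3]

# zip(*sa) is the same as zip(s1, s2, s3): it regroups by position
names, physics, chemistry = zip(*sa)
```

Each of names, physics, and chemistry is a tuple with one entry per student, in the original order of the students.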
filter() as an iterator function can be used with any sequence to extract
subsequences conforming to specific conditions. filter(alpha, beta) takes two
arguments. The second one, beta, is a sequence to be filtered. The first, alpha,
is a function to decide the filtering; only if this function evaluates to True for an
item in beta is that item eligible for the iterative action. [1] and [2] in Fig. 5.18
form a simple illustration for the use of filter(). h in [1] is a tuple of names.
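A minimal sketch of filter(alpha, beta) follows. The tuple of names and the test function are hypothetical and chosen only to show the mechanism; they are not the contents of Fig. 5.18.

```python
h = ('Ram', 'Sita', 'Ramesh', 'Latha')   # hypothetical tuple of names

# alpha decides the filtering: True only for names containing 'Ram'
alpha = lambda name: 'Ram' in name

picked = list(filter(alpha, h))          # items for which alpha is True
```

filter() returns an iterator; list() is used here only to display all the selected items at once.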
5.9 Generators
same type. It returns specified string statements depending on the value of the
argument [15]. Note that the order of elements in the returned list is the same as
their order in the argument a1.
5.10 Hashing
Fig. 5.21 Python Interpreter sequence to illustrate application of the hash() function
assigned to bb in [4]. hash(bb) [5] returns the same hash value as in [3] obtained
with hash(aa), implying that aa and bb are the same entities. cc [7] is only
marginally different from aa ('c3' is changed to 'c4'); but its hash value [8] is
conspicuously different from that of aa [3]. Number 22 is represented in different
forms to form the tuple xx [9]. But all of them have the same hash value [10],
that of 22 itself [11]. This is different from the hash value of xx as a tuple [12].
All immutable objects (numbers, strings, tuples of hashable items, and so on) can be hashed. Since
mutable objects (list, dict, set) can be altered anytime, a hash value does not
make sense for them; they are not hashable.
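The two observations above (equal numbers hash alike, mutable containers are not hashable) can be checked with a short sketch. The particular representations of 22 chosen here are an assumption; the tuple xx of Fig. 5.21 may use other forms.

```python
from fractions import Fraction

# The number 22 in different forms (illustrative choice of forms)
xx = (22, 22.0, Fraction(44, 2), 22 + 0j)

# All numerically equal values share one hash value
same = len({hash(v) for v in xx}) == 1

try:
    hash([1, 2, 3])          # a list is mutable
    list_hashable = True
except TypeError:
    list_hashable = False
```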
5.11 Input
'What is your name please\n'. The same is displayed on the terminal [5] and the
system advances to the next line, waiting for an input line to be fed. 'Roshan' is
fed in [6] and duly displayed [7]. wish() is defined as a simple function starting at
[8]. It seeks an input through the prompt 'What is your name please?'. The name
of a person is expected as input. 'Good day to you (name)' is returned. An
execution sequence with wish() follows from [9].
With the math module imported, function fx_0() [11] displays the prompt 'Give x
value' and advances to the next line. Whatever is typed for x is received as a string; it is
converted into the corresponding floating point number and assigned to y [12]. sin(y)
is output. As an illustration, an execution of fx_0() follows (sin(π/4) ≈ sin(0.7854) =
0.7071080798594735) [15].
The input() function is useful in iterative programming sequences. A typical
example is an iteration where execution is interrupted and certain parameter values
are altered to ensure convergence or speed up the solution before resuming execution.
input() is also useful to debug programs at the development stage.
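The two routines discussed in this section can be sketched as below. They are reconstructions from the description, not the original listings; the optional parameter reader is an assumption added here so that the functions can also be exercised without a keyboard (it defaults to the built-in input()).

```python
import math

def wish(reader=input):
    'Ask for a name and return a greeting'
    name = reader('What is your name please? ')
    return 'Good day to you ' + name

def fx_0(reader=input):
    'Ask for x, convert the typed string to a float and return sin(x)'
    y = float(reader('Give x value\n'))   # input() always returns a string
    return math.sin(y)

# Non-interactive use: the lambda plays the role of the person typing
greeting = wish(lambda prompt: 'Roshan')
value = fx_0(lambda prompt: '0.7854')
```

Calling wish() or fx_0() with no argument prompts at the terminal in the usual way.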
5.12 Exercises
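\[ p_1(x) = (x - a)(x - b)(x - c) = 0, \quad \text{that is,} \quad x^3 - (a + b + c)x^2 + (ab + bc + ca)x - abc = 0 \qquad (5.1) \]
The polynomial with all the roots with their signs changed is
\[ p_2(x) = (x + a)(x + b)(x + c) \qquad (5.2) \]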
Comparing (5.3) with (5.1), p3(x) can be seen to be a polynomial in $x^2$ with $a^2$,
$b^2$, and $c^2$ as its roots. If a is real and b and c are also real or a complex conjugate
pair, the coefficients of $p_3(x^2)$ are all real numbers. If the roots are not known
but only the coefficients of the polynomial are known as l, m, and n, we have
\[ p_1(x) = x^3 + l x^2 + m x + n = 0 \quad \text{and} \]
\[ p_2(x) = x^3 - l x^2 + m x - n = 0 \]
\[ p_3(x) = (x^3 + m x + l x^2 + n)(x^3 + m x - l x^2 - n) = 0 \]
\[ x^6 + (2m - l^2)x^4 + (m^2 - 2ln)x^2 - n^2 = 0 \]
This process of forming polynomials with squares of the roots of the given
polynomial is continued as long as desired. With a > b > c, $a^n$ becomes orders
larger than $b^n$ and $c^n$ rapidly as n increases. If $a^n$ is large enough to make $b^n$ and
$c^n$ negligible compared to $a^n$, the coefficient of $(x^n)^2$ becomes equal to the
square of the coefficient of $x^n$ itself. This can be seen by comparing the
respective coefficients of (5.1) and (5.3). A similar pattern can be observed with
other coefficients also. At this stage, with $(x^n - a^n)$ as a factor, the polynomial can
be factorized to get $(x^{2n} - (b^n + c^n)x^n + b^n c^n)$ as the factor. Since $b^n \gg c^n$, the
coefficient of $x^n$ can be taken as $b^n$ itself. Division of the coefficient of $x^0$ by $b^n$
yields $c^n$ as the third root. If the two complex roots dominate over a they can be
extracted from the quadratic formed with the coefficients of $x^{3n}$, $x^{2n}$, and $x^n$.
With a calculator/Python try the method for a third degree polynomial. Write
a program to apply the root squaring method to identify n and decide when to
stop iteration to get the polynomial in $x^n$. The rest of the root
extraction procedure is manual (not easily amenable to being programmed). Apply
the method to get the roots of a few polynomials with degrees up to the 20th.
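As a possible starting point, one root-squaring step for a cubic can be sketched as follows. It assumes the monic form $x^3 + l x^2 + m x + n$ and uses the relation that the polynomial whose roots are the squares is $y^3 + (2m - l^2)y^2 + (m^2 - 2ln)y - n^2$ with $y = x^2$; the function name root_square_step is introduced here for illustration only.

```python
def root_square_step(l, m, n):
    '''One root-squaring step for x**3 + l*x**2 + m*x + n.

    Returns the coefficients (l2, m2, n2) of the monic cubic in y = x**2
    whose roots are the squares of the original roots.
    '''
    return 2 * m - l * l, m * m - 2 * l * n, -n * n

# (x - 1)(x - 2)(x - 3) = x**3 - 6x**2 + 11x - 6, roots 1, 2, 3
once = root_square_step(-6, 11, -6)     # roots 1, 4, 9
twice = root_square_step(*once)         # roots 1, 16, 81
```

Only a single step is shown; the stopping rule and the extraction of the roots are left as in the exercise.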
6. Two functions Ex_1 and Ex_2 have been defined in Fig. 5.23. The Python
Interpreter sequence in Fig. 5.23 is obtained by running them. Explain why
successive dd(3) and ee(3) output cumulative sums and not three itself. Delete
nonlocal aa in Ex_2; try running the routine and explain the result.
Fig. 5.23 The routines and the Python Interpreter sequence running them for Exercise 6 in
Sect. 5.3
7. Write a program which uses the input() function successively to accept a
sequence of numbers and output the sum. The end of the sequence will be
identified through the string 'over'.
8. 1-D random walk: xx starts at zero and steps in the positive direction by unity
or in the negative direction by unity. The choice of the first or the second
alternative is made randomly with equal likelihood (Papoulis and Unnikrishna
Pillai 2002). Write a program which will make xx step through 1000 consecutive
steps. The span of travel of xx is -1000 to +1000. Get the frequency
distribution of each position (that is, the number of times xx took each of these
values). Ideally the frequency distribution should follow a binomial distribution.
Compare the frequency distribution obtained with that of the binomial distribution
(the variance of the difference between the corresponding frequency distribution
values can be an index for the comparison).
9. Tower of Hanoi: three vertical rods, L (Left), C (Central), and R (Right), are
given. C carries a set of n annular discs {d1, d2, d3, …, dn} stacked on it. The
disc sizes (diameters) are such that size of d1 < size of d2 < size of
d3 < … < size of dn. C carries the discs in the same order as their sizes with
disc d1 on top. The disc set is to be moved to the L rod with two constraints:
a. Only one disc can be shifted at a time; but the shift can be from any rod to
any other rod.
b. A disc of larger size cannot be moved on to a disc of smaller size.
With two discs, follow the sequence of movements: d1 → R, d2 → L, d1 → L.
With three discs, follow the sequence of movements: d1 → L, d2 → R, d1 →
R, d3 → L, d1 → C, d2 → L, d1 → L.
If n is odd start with d1 → L and if it is even start with d1 → R. With n discs
$2^n - 1$ is the minimum number of moves required for the total shift. Write a
(recursive) program to effect the shift.
Use the input() function suitably and present this as an interactive game.
10. Numerical Integration: With y as a function of x, evaluation of the definite
integral $I = \int_a^b y\,dx$ has to be done numerically if the integral is not known in
closed form (e.g. error function, elliptic integral) or the functional relation
cannot be expressed with known functions. The set of relations below gives the
integral value to different approximations for equally spaced values of x (in
general the accuracy improves as the number of samples used in the expression
increases), h being the spacing (Zwillinger 2003).
With a single interval (trapezoidal rule):
\[ I = \frac{h}{2}\,(y_0 + y_1) \qquad (5.4) \]
With 2 intervals (Simpson's rule):
\[ I = \frac{h}{3}\,(y_0 + 4y_1 + y_2) \qquad (5.5) \]
With 3 intervals (Simpson's three-eighth rule):
\[ I = \frac{3h}{8}\,(y_0 + 3y_1 + 3y_2 + y_3) \qquad (5.6) \]
With 4 intervals (Milne's rule):
\[ I = \frac{2h}{45}\,(7y_0 + 32y_1 + 12y_2 + 32y_3 + 7y_4) \qquad (5.7) \]
With 5 intervals:
\[ I = \frac{5h}{288}\,(19y_0 + 75y_1 + 50y_2 + 50y_3 + 75y_4 + 19y_5) \qquad (5.8) \]
With 7 intervals:
\[ I = \frac{7h}{17280}\,(751y_0 + 3577y_1 + 1323y_2 + 2989y_3 + 2989y_4 + 1323y_5 + 3577y_6 + 751y_7) \qquad (5.9) \]
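For the general case with $h = (b - a)/n$ the following (Newton-Cotes)
formulae can be used, $y_j$ being the sample at $x = a + jh$:
\[ I = \frac{h}{2}\left( y_a + 2\sum_{j=1}^{n-1} y_j + y_b \right) \qquad (5.10) \]
\[ I = \frac{h}{3}\left( y_a + 2\sum_{j=1}^{n/2-1} y_{2j} + 4\sum_{j=1}^{n/2} y_{2j-1} + y_b \right), \quad n \text{ is even} \qquad (5.11) \]
Prepare routines for integration conforming to each of the relations above. Test
them with values for well known functions like sin(y), exp(y), and $y^{0.5}$.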
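As a possible starting point, the two general-case rules (the composite trapezoidal rule and the composite Simpson's rule) can be sketched as below, assuming samples taken at x = a + j*h with h = (b - a)/n. The function names trapezoid and simpson and the test integral of sin over [0, π] are illustrative choices.

```python
import math

def trapezoid(f, a, b, n):
    'Composite trapezoidal rule with n equal intervals'
    h = (b - a) / n
    inner = sum(f(a + j * h) for j in range(1, n))           # y1 .. y(n-1)
    return h / 2 * (f(a) + 2 * inner + f(b))

def simpson(f, a, b, n):
    'Composite Simpson rule; n must be even'
    if n % 2:
        raise ValueError('n must be even')
    h = (b - a) / n
    even = sum(f(a + 2 * j * h) for j in range(1, n // 2))            # y2, y4, ...
    odd = sum(f(a + (2 * j - 1) * h) for j in range(1, n // 2 + 1))   # y1, y3, ...
    return h / 3 * (f(a) + 2 * even + 4 * odd + f(b))

# The exact value of the integral of sin(x) from 0 to pi is 2
it = trapezoid(math.sin, 0.0, math.pi, 100)
isim = simpson(math.sin, 0.0, math.pi, 100)
```

With the same number of samples, the Simpson estimate is much closer to the exact value than the trapezoidal one for a smooth integrand such as this.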
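11. Numerical Differentiation: A select set of formulae is given here (Zwillinger 2003),
$y_j$ being the value of y at $x_0 + jh$:
First derivative:
2-point formula:
\[ y'_0 = \frac{1}{h}\,(y_1 - y_0) \qquad (5.12) \]
3-point formula:
\[ y'_0 = \frac{1}{2h}\,(-y_2 - 3y_0 + 4y_1) \qquad (5.13) \]
\[ y'_0 = \frac{1}{2h}\,(y_1 - y_{-1}) \qquad (5.14) \]
4-point formula:
\[ y'_0 = \frac{1}{12h}\,(y_{-2} - 8y_{-1} + 8y_1 - y_2) \qquad (5.15) \]
5-point formula:
\[ y'_0 = \frac{1}{12h}\,(-25y_0 + 48y_1 - 36y_2 + 16y_3 - 3y_4) \qquad (5.16) \]
Second derivative:
\[ y''_0 = \frac{1}{h^2}\,(y_{-1} - 2y_0 + y_1) \qquad (5.17) \]
\[ y''_0 = \frac{1}{h^2}\,(y_0 - 2y_1 + y_2) \qquad (5.18) \]
Third derivative:
\[ y'''_0 = \frac{1}{h^3}\,(y_3 - 3y_2 + 3y_1 - y_0) \qquad (5.19) \]
\[ y'''_0 = \frac{1}{2h^3}\,(y_2 - 2y_1 + 2y_{-1} - y_{-2}) \qquad (5.20) \]
Fourth derivative:
\[ y''''_0 = \frac{1}{h^4}\,(y_4 - 4y_3 + 6y_2 - 4y_1 + y_0) \qquad (5.21) \]
\[ y''''_0 = \frac{1}{h^4}\,(y_{-2} - 4y_{-1} + 6y_0 - 4y_1 + y_2) \qquad (5.22) \]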
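As a possible starting point, two of the central-difference formulae, the first derivative from the samples on either side of $x_0$ and the second derivative from three adjacent samples, can be sketched as below. The function names d1_central and d2_central and the test functions are illustrative choices.

```python
import math

def d1_central(f, x, h):
    'First derivative: (y1 - y(-1)) / (2h)'
    return (f(x + h) - f(x - h)) / (2 * h)

def d2_central(f, x, h):
    'Second derivative: (y(-1) - 2*y0 + y1) / h**2'
    return (f(x - h) - 2 * f(x) + f(x + h)) / h ** 2

slope = d1_central(math.sin, 0.0, 1e-3)   # exact value is cos(0) = 1
curv = d2_central(math.exp, 0.0, 1e-3)    # exact value is exp(0) = 1
```

Making h very small does not improve the result indefinitely, since the subtraction of nearly equal samples amplifies rounding errors.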
Write a program to get the power at intervals of h/RC for h in the set {0.0,
0.1, 0.2, 0.3, 0.6, 1.0, 1.5, 2.0, 2.5, 3.0}.
13. $\int_a^b R\,i^2\,dt$ is the energy dissipated in R in the interval [a, b] in the above case
(Toro 2015). Use the integration formulae for the general case (5.10) and (5.11)
and write programs for energy dissipation in R. With V = 1.0, get the energy
dissipated in the first five successive intervals of RC seconds each. Use
h = 0.02RC in each of the cases.
14. Different interpolation formulae are available to get the value of a signal from a
given set of its regularly spaced samples. The Hamming window is one of them
(Mitra 2013). With g(n) as the given set of samples over the range $-M \le n \le +M$, the
interpolated value of g(t) at time t is
\[ g(t) = \sum_{n=-M}^{M} g(n)\,w(t - n) \qquad (5.23) \]
where
\[ w(k) = 0.54 + 0.46\cos\left(\frac{2\pi k}{2M + 1}\right) \qquad (5.24) \]
[Figure: sketch with the labels Object, Edge, and Measured data]
Table 5.1 Measured values at intervals of 1 micron in the neighbourhood of the edge being
detected
Displacement in microns 0 1 2 3 4 5
Value 0.0001 0.0010 0.0048 0.0183 0.0559 0.1379
Displacement in microns 6 7 8 9 10 11
Value 0.2276 0.4640 0.5360 0.7724 0.8621 0.9441
Displacement in microns 12 13 14 15 16 17
Value 0.9817 0.9952 0.9990 0.9990 0.9999
given in Table 5.1. Determine the point where the received intensity is 50 % of
the maximum and take it as the measured edge.
Write a program for this using an interpolation formula and get the value of the
corresponding displacement.
An alternative is to determine the point of maximum derivative and take it as the
edge. Write a program to get the derivative using a formula for derivative and
determine the edge position.
References
Decoursey WJ (2003) Statistics and probability for engineering applications. Newnes (Elsevier
science) Massachusetts
Gibran K (1926) Sand and foam
Mcnamee JM, Pan VY (2013) Numerical methods for roots of polynomials. Elsevier science,
Massachusetts
Mitra SK (2013) Digital signal processing: a computer based approach, 4th edn. McGraw Hill,
New York
Padmanabhan TR (1999) Industrial instrumentation. Springer, London
Papoulis A, Unnikrishna Pillai S (2002) Random variables and stochastic processes, 4th edn.
McGraw Hill, New York
Ramalho L (2014) Fluent Python. O'Reilly Media Inc., California
Rossum Gv, Drake FL Jr (2014) The Python library reference. Python software foundation
Shyamala CK, Harini N, Padmanabhan TR (2011) Cryptography and security. Wiley India,
New Delhi
Toro VD (2015) Electrical engineering fundamentals, 2nd edn. Pearson, Noida
Zhang Y (2015) An Introduction to Python and computer programming. Springer, Singapore
Zwillinger D (ed) (2003) Standard mathematical tables and formulae. Chapman & Hall/CRC,
New York
Chapter 6
Additional Operations with Sequences
tuple, list, dictionary, and similar sequences in Python are also known as
container objects. A number of methods are available for them. They are all
aimed at culling out meaningful information in different ways for subsequent use.
A method is essentially a function whose first argument is the object itself. If a method mm is
associated with an object obob it is called as obob.mm(). It is equivalent to a
function mm(obob) with obob as its argument. With these preliminaries let us
examine the common methods and related operations with such container objects
(Rossum and Drake 2014).
6.1 Slicing
Slicing of a sequence can be done in different ways. It is carried out using the
indices of the elements in the sequence. The discussions here are with specific
reference to a sequence aa as
aa = ['a0', 'b1', 'c2', 'd3', 'e4', 'f5', 'g6', 'h7', 'i8', 'j9', 'k10', 'l11'].
The details and observations are general enough and valid for any sequence.
Two possible conventions of representation and use are shown in Fig. 6.1 for the
sequence. One can start with index 0 at the left end and proceed as 0, 1, 2, 3, … as
indices for successive elements. Alternately, start at the right end with -1 and
proceed to the left with indices -1, -2, -3, and so on. The basic slicing structure
is depicted in Fig. 6.2 along with its different options:
The most general form of usage specifies slicing as aa[start:stop:step], which implies a
slice of aa starting from index start and extending to index stop - 1; the slice is to include the
elements aa[start], aa[start + step], aa[start + 2 * step], aa[start + 3 * step], …, going no further
than aa[stop - 1], in the same order.
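The two index conventions and the general slicing form can be tried out with the sequence aa of this section. The particular start, stop, and step values below are illustrative.

```python
aa = ['a0', 'b1', 'c2', 'd3', 'e4', 'f5',
      'g6', 'h7', 'i8', 'j9', 'k10', 'l11']

last = aa[-1]          # same element as aa[11]
tail = aa[-3:]         # last three elements
part = aa[2:9:3]       # indices 2, 5, 8 (9 itself is excluded)
head = aa[:4]          # start omitted: begins at index 0
whole = aa[:]          # start and stop omitted: a copy of the full list
```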
[Fig. 6.1: the sequence aa (a0, b1, c2, …, l11) with indices 0, 1, 2, …, 11 progressing in the positive direction and -12, -11, …, -3, -2, -1 in the negative direction]
[Fig. 6.2: the slicing structure with start, separator 1, stop, separator 2, and step; if omitted, stop defaults to the end of the sequence]
The arguments for slicing can be separately defined through the slice() object.
Basically slice() has three forms as shown in Fig. 6.4:
1. In the simplest form aa[slice(stop)] returns a slice of aa from index zero
(default value) to index (stop - 1).
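The built-in slice() can be called with one, two, or three arguments, and the resulting object is used inside the square brackets in place of the colon notation. The sketch below uses the sequence aa of this section; the argument values are illustrative.

```python
aa = ['a0', 'b1', 'c2', 'd3', 'e4', 'f5',
      'g6', 'h7', 'i8', 'j9', 'k10', 'l11']

s1 = slice(4)          # stop only: same as aa[:4]
s2 = slice(2, 9)       # start and stop: same as aa[2:9]
s3 = slice(2, 9, 3)    # start, stop, and step: same as aa[2:9:3]

first4 = aa[s1]
middle = aa[s2]
stepped = aa[s3]
```

A slice object can be defined once and reused with different sequences, which is convenient when the same portion is to be extracted repeatedly.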
6.2 Reversing
The Python Interpreter sequence in Fig. 6.5 illustrates the use of the built-in
function reversed() and the corresponding method reverse(). a1 in [1] is a tuple of
integers. reversed(a1) produces an iterator with the elements of a1 in the
reversed sequence; [2] returns the corresponding list. b in [3]
is a list of assorted items. br in [4] is the tuple formed by reversing the sequence
b. In general reversed() can be used with any sequence to generate a
tuple or a list of the reversed sequence. The method reverse() operates on a
list. It reverses the list in place and returns None. The sequence b [3] is
reversed by b.reverse() in [5], as can be seen from the new value of b in the
following lines. b is restored with the subsequent b.reverse() in [6].
a2 ('information') in [7] is a string; tuple(reversed(a2)) in [8] creates a tuple
of the characters of 'information' in reverse order. a2 being immutable, the
method reverse() is not applicable to it. However, the end to end slicing in the
negative direction produces a new string with the letters in 'information' appearing
in the reverse order [9]. ulta() in [10] is the equivalent function, as demonstrated
through its application in [11] and [12], where a2 ('information') is reversed and
then restored.
>>> a1 = 1, 2, 3, 4 [1]
>>> list(reversed(a1)) [2]
[4, 3, 2, 1]
>>> b = ['Rama', 'Latha', 'Adarsh', 'Siddu', 31, 3.1] [3]
>>> br = tuple(reversed(b)) [4]
>>> br
(3.1, 31, 'Siddu', 'Adarsh', 'Latha', 'Rama')
>>> b.reverse() [5]
>>> b
[3.1, 31, 'Siddu', 'Adarsh', 'Latha', 'Rama']
>>> b.reverse() [6]
>>> b
['Rama', 'Latha', 'Adarsh', 'Siddu', 31, 3.1]
>>> a2 = 'information' [7]
>>> tuple(reversed(a2)) [8]
('n', 'o', 'i', 't', 'a', 'm', 'r', 'o', 'f', 'n', 'i')
>>> 'information'[::-1] [9]
'noitamrofni'
>>> def ulta(sst): [10]
...     u = ''
...     for jj in range(len(sst)-1, -1, -1):
...         u += sst[jj]
...     return u
...
>>> a0 = ulta(a2) [11]
>>> a0
'noitamrofni'
>>> a3 = ulta(a0) [12]
>>> a3
'information'
>>>
6.3 Sorting
Any sequence can be sorted using the sorted() function. sorted(sq) carries out a
sorting with the sequence sq and returns a list. The general form of the sorted()
function is shown in Fig. 6.6a. key and reverse are two optional arguments as
shown in the figure. The comparison operation (<, >) is the basis for carrying out
the sorting. In the absence of both the optional arguments (key and reverse) sq
is sorted by comparing its elements directly. In case only key is present as the
additional argument, it is a function applied to each element of sq, returning an item on
which the comparison for sorting is carried out. reverse is a Boolean; if it is
set to True, the comparison and sorting are done in the reverse (descending)
order. The sort() method shown in Fig. 6.6b has a similar structure with key and
reverse having the same roles and significance. It sorts a list in place. For a
list sqq, sqq.sort() leaves sqq sorted; the method itself returns None.
Fig. 6.6 Structure of a sorted() function and b sort() method along with their options
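The difference between the sorted() function and the sort() method, and the roles of key and reverse, can be sketched as below. The tuple of names is illustrative and its initial order is an assumption.

```python
ab = ('roshan', 'ram', 'zara', 'bala', 'arab', 'aanand')   # immutable tuple

ac = sorted(ab)                      # new list, ascending alphabetical order
by_len = sorted(ab, key=len)         # shortest names first
desc = sorted(ab, reverse=True)      # descending alphabetical order

sqq = list(ab)
result = sqq.sort()                  # sorts sqq in place; result is None
```

sorted() leaves its argument untouched, which is why it can be applied to a tuple; sort() is available only for lists.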
The Python Interpreter sequence in Fig. 6.7a illustrates the applications of the
sorted() function and Fig. 6.7b those of the sort() method; functionally both carry
out similar sorting. ab in [1] is a tuple of names; being immutable it cannot be
sorted in place. But sorted(ab) is returned as a list as ac (= ['aanand', 'arab',
'bala', 'ram', 'roshan', 'zara']) in [2]; the sorting is in ascending order with
a < b < c < … < y < z. The first, second, third, etc., characters, in
that order, are considered for the comparison in sorting. With dictionary ad,
sorted(ad) returns the list of keys in alphabetical order [3] (see the
following section for more on this). A similar sorted list is returned with the key
in the dictionary specified as the basis for sorting [4]. The length of the element is
specified as the key for sorting in [5]. In turn 'ram', being of the shortest length, is
the first in the list and 'roshan' of six characters is the last one.
The module marks1.py has the marks of a set of students (see Fig. 5.16). S6,
S7, S8, S9, and S10 represent the students' names and their marks as respective
tuples. The tuple dta gives the order details. st is a list of the five elements
S6, S7, S8, S9, S10. The items here have been arranged to facilitate illustration of
the functions, methods, and so on. marks1.py has been imported ([6] in Fig. 6.7a)
from the folder demo_5 and marks1.st assigned to b1 in [7]. The name of each
student in each entry is the basis for sorting b1 in [8]. The sorted list is assigned to
ai [9]. The sorting can be seen to have been done with the names arranged
alphabetically. b1 has been sorted with the marks scored in Physics as the basis for
comparison and returned as aj [10], [11]. ak in [13] is again the same list but with
sorting done with the length of the name of the students as the yardstick for comparison
[12]. al in [15] returns the list with sorting done based on the marks scored in
Maths; but the sorting is done in the reverse order (91 (Kala), 87 (Karun), 82
(Sarani), 79 (Karthik), 66 (Lan)). For am, marks in Chemistry and Physics are
specified in that order [16] for sorting. If the marks scored in Chemistry by two or more
candidates are equal, the marks in Physics are the basis for their comparison; with
Sarani and Karthik scoring 78 in Chemistry, Sarani is ahead of Karthik in the sorted
list since she gets only 76 in Physics in contrast to 77 by Karthik. Similar sorting
with marks in Chemistry and Physics (in that order) as basis is done in reverse order
[18] and returned [19].
Fig. 6.7 a Python Interpreter sequence illustrating use of sorted() function b Python Interpreter
sequence illustrating use of sort() method
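Sorting on more than one field can be sketched with a key function that returns a tuple; the tuples are compared element by element, so the second field is consulted only when the first ones are equal. The records below are illustrative and are not the contents of marks1.py: the Maths marks and the Chemistry and Physics marks of Sarani and Karthik follow the values quoted above, while the remaining marks are hypothetical.

```python
# (name, physics, chemistry, maths); marks partly hypothetical
st = [('Kala', 80, 85, 91),
      ('Karun', 70, 90, 87),
      ('Sarani', 76, 78, 82),
      ('Karthik', 77, 78, 79),
      ('Lan', 60, 65, 66)]

# Maths marks, highest first
al = sorted(st, key=lambda s: s[3], reverse=True)

# Chemistry first; Physics decides between equal Chemistry marks
am = sorted(st, key=lambda s: (s[2], s[1]))

al_names = [s[0] for s in al]
am_names = [s[0] for s in am]
```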
Fig. 6.7b, which illustrates the use of the sort() method with sequences, has
aa in [1] as a simple list of numbers. aa.sort() in [2] sorts aa in place.
Sorting is implicitly done in ascending order of magnitudes, as the subsequent value
of aa shows. As in the previous case the marks list has been imported (Fig. 5.16)
and assigned to b1 in [3]. The sort() method has been applied to b1 in different
ways and the respective sorted lists shown in the subsequent lines. The structure
and operations of the method can be seen to be identical to their counterparts
discussed with the sorted() function above.
The following are to be noted regarding all these rearrangements:
• The individual mark-lists (S6, S7, S8, S9, S10) remain intact since each is an
(immutable) tuple.
• The key used for sorting can be specified as any desired function which returns
a value/number that can be used for comparison. Being only for illustrative
purposes, all the functions used here are limited to single line (lambda type)
functions. If necessary the functions can be separately defined and used for
specifying the key.
6.4 Operations with Sequences
A number of operators and algebraic operations with numbers involving them were
discussed in Chaps. 1 and 2. The +, *, and comparison operations amongst them
are applicable to sequences, of course with proper reinterpretation. The Python
Interpreter sequence in Fig. 6.8 illustrates the use of the + and * operators. aa
('Good') in [1] is a string. aa3 in [2] is aa * 3, another string with aa
Fig. 6.8 Python Interpreter sequence illustrating use of + and * with sequences
The max() and min() built-in functions identify and return the maximum and
minimum elements from a given set. The criteria for identifying the
True
>>> aa < cc [15]
True
>>> aa[2] < cc[2] [16]
True
>>> qtt = '''What everybody echoes or in silence
passes by as true today may turn out to be falsehood
tomorrow, mere smoke of opinion, which some had
trusted for a cloud that would sprinkle fertilizing
rain on their fields'''
>>> qtt.count('or') [17]
3
>>> len(qtt)
209
>>> qtt.index('or') [18]
22
>>> qtt.index('or', 23) [19]
92
>>> qtt.index('or',93) [20]
146
>>>
Fig. 6.9 Python Interpreter sequence illustrating use of comparison and other operations with
sequences
Fig. 6.10 Python Interpreter sequence illustrating use of max() and min() functions
directly in [5] yields the maximum value (=3). The same holds good for the min()
function as well. As an extended use, the max/min criteria can be specified through a
key as done in [6].
ff in [7] is a tuple of names. max(ff) is decided with the criteria a < b < c <
… < x < y < z, the comparison being carried out starting with the first letter.
Sarani (=f1) is returned as the max, the choice being based on the first letter itself
[8]. Kala (=f2) is the minimum; the selection based on the first letter will have
karthik, kala, and karun. The second letter, being a in all of them, does not
change the scenario. At the third stage l in kala (=f2) decides the choice.
[9] uses the length of the name (the number of characters in it) as the basis to decide
the maximum (f3 = karthik) and minimum (f4 = lan). The marks-list from
demo_5 is imported [10] and assigned to bb in [11]. It is the same as that in
Fig. 5.16. The criteria for max()/min() can be specified in different ways and the items extracted. [12]
shows three examples. Karthik is the student with the longest name. His data is
assigned to e1. Alphabetically Sarani comes last; her data is assigned to e2. The
mark scored in Physics is the basis for selection of e3. The candidate with the
lowest marks scored in physics is lan; his data is assigned to e3.
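The selection rules described above can be sketched with a tuple of names. All names are written in lower case here, which is an assumption made so that the alphabetical comparison is not affected by the difference between upper and lower case letters.

```python
ff = ('karthik', 'sarani', 'kala', 'karun', 'lan')

f1 = max(ff)             # alphabetically last
f2 = min(ff)             # alphabetically first
f3 = max(ff, key=len)    # longest name
f4 = min(ff, key=len)    # shortest name
```

As with sorted(), the key can be any function returning a value that can be compared, for example a lambda picking one field of a record.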
The Python Interpreter sequence in Fig. 6.11 illustrates the use of some additional
operations with sequences. aa in [1] is a list of strings. del aa[3:9] in [2]
deletes all the elements from aa[3] to aa[8] (inclusive) from aa. aa[2] = '2C' in [3]
redefines the value of aa[2], as can be seen from the lines following. The usage here
is different from the assignment cc = aa[2], where cc is assigned the value of
aa[2] without altering aa in any way. Similarly ab[-1] (= '7H') in [12] replaces the
last element of ab (= 'h7') in [7]. The new value of ab is in [13]. ab.append(ab1)
in [6] appends ab1 [5] to list ab [4]; this is possible, ab being a mutable list. The
method append() appends a single element to the list; the appended element gets
inserted into the list as its last (rightmost) element. [7] shows ab with ab1
appended to it. ab is restored in [8] to its previous value in [4] by deleting the last
element in ab, that is, ab1.
ab.extend(ab1) in [10] enhances ab by appending all the elements of ab1 to
it in the same order at one go. In contrast ab.append(ab1) in [6] appends ab1 as a
single entity to ab.
Method append() is useful if the elements to be appended are formed one by
one (as in a loop). But if the elements (more than one) to be appended are known at
a stretch, method extend() is a better alternative.
aa is defined afresh in [14]. del aa[3:9:2] in [15] deletes the elements in aa
starting with aa[3] and stopping short of aa[9] at an interval of 2. The (start, stop, step) set is
similar to that in slicing (in Sect. 6.1). The truncated aa in [16] is defined afresh in
[17]. The elements of an, a list of three elements in [18], replace three elements in aa through
[19]. Here again the set of three elements in it, aa[3] to aa[9] at the interval of
Fig. 6.11 a Python Interpreter sequence illustrating use of some methods and operations with
sequences (continued in Fig.6.11b) b Python Interpreter sequence illustrating use of some methods
and operations with sequences (continued in Fig.6.11a)
(b)
>>> aa =['a0','b1','c2','d3', 'e4', 'f5', 'g6', 'h7',
'i8', 'j9', 'k10', 'l11'] [23]
>>> an = [] [24]
>>> for jk in range(3, 9, 2): an.append(aa.pop(jk)) [25]
...
>>> aa [26]
['a0', 'b1', 'c2', 'e4', 'f5', 'h7', 'i8', 'k10', 'l11']
>>> an [27]
['d3', 'g6', 'j9']
>>> aa =['a0','b1','c2','d3','e4','f5','g6', 'h7', 'i8',
'j9', 'k10', 'l11'] [28]
>>> aa.remove('d3') [29]
>>> aa [30]
['a0', 'b1', 'c2', 'e4', 'f5', 'g6', 'h7', 'i8', 'j9',
'k10', 'l11']
>>> bb = [2, 3, 5, 2, 7, 1] [31]
>>> bb.remove(2) [32]
>>> bb [33]
[3, 5, 2, 7, 1]
>>> bb.remove(2) [34]
>>> bb [35]
[3, 5, 7, 1]
>>> bb.remove(2) [36]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: list.remove(x): x not in list
two, that is aa[3], aa[5], and aa[7], is replaced; 'd3', 'f5', and 'h7' are replaced
by '3D', '5F', and '7H' respectively. Needless to say, the size of an here is to be the
same as that required for the substitution.
insert(jj, x) inserts x at index jj in aa. An alternate way of
inserting an into aa (done above) uses insert() in a loop [20]. But the
operation in [19] is more elegant and compact. an.pop(1) in [21] pops an[1]. With
that, an[2] (= '7H') in [18] takes the place of an[1] as can be seen from [22]. aa
and an have been redefined in [23] and [24] as lists, the latter being an empty one.
an.append(aa.pop(jk)) pops the element at index jk of the list aa and appends it to an.
With that aa[jk + 1] moves to aa[jk] and the number of elements in aa reduces by
one. For jk = 3, 5, and 7 in succession the
loop in [25] executes an.append(aa.pop(jk)). Table 6.1 clarifies the steps in the
process. The truncated aa and the new an are in [26] and [27] respectively.
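Because every pop() shifts the later elements one place to the left, the indices 3, 5, and 7 pick 'd3', 'g6', and 'j9' rather than 'd3', 'f5', and 'h7'. The sketch below reruns the loop in [25] and records each step; the steps list is a hypothetical addition made here for tracing.

```python
aa = ['a0', 'b1', 'c2', 'd3', 'e4', 'f5', 'g6', 'h7', 'i8', 'j9', 'k10', 'l11']
an = []
steps = []  # hypothetical trace: (index used, element popped, elements left in aa)

for jk in range(3, 9, 2):
    popped = aa.pop(jk)      # removes aa[jk]; later elements shift left by one
    an.append(popped)
    steps.append((jk, popped, len(aa)))

for jk, popped, remaining in steps:
    print(f'jk = {jk}: popped {popped!r}, {remaining} elements remain')
print(aa)
print(an)
```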
aa.remove(x) removes the first occurrence of element x from the list aa. If x is
not present in aa a ValueError is raised. The list aa has been restored in
[28]. aa.remove('d3') in [29] removes the element 'd3' from it as can be seen
from the updated value of aa in [30]. bb is a list of integers in [31]. bb.remove(2)
in [32] removes the first occurrence of the integer 2 from the list; the new
value of bb is [3, 5, 2, 7, 1] as in [33]. bb.remove(2) in [34] removes the
Fig. 6.12 a Python Interpreter sequence illustrating use of some methods and operations with sets
(continued) b Python Interpreter sequence illustrating use of some methods and operations with
sets (continued) c Python Interpreter sequence illustrating use of some methods and operations
with sets (continued) d Python Interpreter sequence illustrating use of some methods and
operations with sets (continued)
6.5 Operations with Sets
counted. aa1, aa2, and aa3 are lists of strings separately defined in [3], [4], and
[7]. aa1.extend(aa2) in [5] uses the method extend() to add all the elements
of aa2 to the list aa1. The enhanced aa1 in [6] has 'e4', 'f5', 'g6', and 'h7'
from aa2 added to aa1 in the same order, though they are already present in it. In
fact the two 'e4's in the enhanced aa1 are separate entities, each with its own
separate index, 4 and 8 respectively; the same holds good for 'f5', 'g6', and 'h7' also.
as1, as2, and as3 in [8] and [9] are the sets formed from the lists aa1, aa2, and
aa3 respectively. as1 does not have any duplicate entries: the list aa1 has 16
elements in it whereas the set formed from it (as1) has only 12 (distinct) elements
in it, as can be seen from [10] and [11]. as3.remove('b1') in [13] removes the
element 'b1' from the set as3, as can be seen by comparing as3 in [14] with that
in [12]. 'b1' has been added back to as3 through as3.add('b1') in [15]; the restored
version is in [16]. An attempt to add 'b1' again in [17] does not alter as3; 'b1'
Fig. 6.13 Venn diagrams for different operations with as1 and as3 (regions labelled as1, as3, as6 = as1 & as3, and as7 = as1 | as3)
[30] shows an example with the three sets as1, as2, and as3. as8 and as88 in
[30] and [31] hold the elements common to all three of them.
With the ^ operator, as1 ^ as3 in [32] forms the set as9 from as1 and as3;
as9 in [33] has all the elements of as1 and as3 which are not common to both.
For example 'a0' is present in as1 as well as as3; it is left out of as9. '1B' is
present only in as3; it is included in as9. Similarly 'k10' is present in as1 but not in
as3; it is included in as9. as1.symmetric_difference(as3) is the method
corresponding to the ^ operation (yielding as99).
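The relation between the three operators can be checked on a small scale. This is a sketch with hypothetical set contents (much smaller than the sets in the figure); only the names as1, as3, as6, as7, as9, and as99 follow the text.

```python
# Hypothetical, shortened sets; the figure uses larger ones.
as1 = {'a0', 'b1', 'c2', 'k10'}
as3 = {'a0', '1B', 'c2'}

as6 = as1 & as3                         # intersection: common elements
as7 = as1 | as3                         # union: elements in either set
as9 = as1 ^ as3                         # symmetric difference (operator form)
as99 = as1.symmetric_difference(as3)    # symmetric difference (method form)

print(sorted(as6), sorted(as7), sorted(as9))
print(as9 == as99, as9 == as7 - as6)
```

The last line confirms that the symmetric difference is the union with the intersection taken out.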
st1 ('interjection') in [34] and st2 ('interruption') in [35] respectively are strings
(see Fig. 6.12b). st1s and st2s in [35] and [37] are the sets formed out
of the elements (characters) forming the strings st1 and st2 respectively; that is,
st1s is the set of letters present in 'interjection' and st2s is the set of letters present in
'interruption'.
With pp as a set and qq a list, a string, or any other iterable,
rr = pp.union(qq) generates the set rr as the union of pp and the
elements of qq. Such a mixed mode of forming a set is possible only with
the method (.union()) and not with the operator (|). The union of the elements of the
set as1 and the list aa2 is assigned to the set au1 in [38]; au1 so formed is in [39].
[40] and [41] show the formation of a set su1 as a union of the set st1s and the
string st2. Similarly the intersection, difference, and symmetric_difference
methods use a set as the base; the other argument
(or arguments, where the method accepts several) can be iterables of other types. Examples are
in [42]. Set su2 is the intersection of set st1s and string st2; su3 is the
difference of set st1s and string st2; su4 is the symmetric difference
of set st1s and string st2. [43] gives the respective details.
pp.remove(qq) removes the element qq from the set pp. If qq is not present in
pp, Python raises a KeyError. Thus su2.remove('r') in [44]
removes 'r' as can be seen by comparing su2 in [43] and [45]. su2.remove('x')
in [46] raises a KeyError [47] since su2 does not have 'x' in it as a member. The
method discard() is similar to the method remove(), to a certain extent.
su2.discard('x') discards (removes) 'x' from su2, if it is present. If 'x' is not present
in su2, no action follows as can be seen from [48] and [49].
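The mixed-mode methods and the remove()/discard() contrast can be reproduced with the two strings named in the text. This is a minimal sketch: the variable names follow the text, while operator_error and remove_error are names assumed here to capture the exceptions.

```python
st1, st2 = 'interjection', 'interruption'
st1s = set(st1)

# The methods accept any iterable (here a string) as the other operand.
su1 = st1s.union(st2)
su2 = st1s.intersection(st2)
su3 = st1s.difference(st2)
su4 = st1s.symmetric_difference(st2)

# The operator form insists on sets on both sides.
try:
    st1s | st2
    operator_error = None
except TypeError as exc:
    operator_error = exc

su2.remove('r')            # 'r' is present: removed silently
try:
    su2.remove('x')        # 'x' is absent: KeyError
    remove_error = None
except KeyError as exc:
    remove_error = exc
su2.discard('x')           # 'x' is absent: no error, no change

print(sorted(su1), sorted(su2), sorted(su3), sorted(su4))
```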
The Python interpreter sequence continues in Fig. 6.12c. pp.pop() removes and returns an
arbitrary element of pp. su2.pop() in [50] pops 'i' from su2 leaving it with
'o', 'e', 't', and 'n' as its contents. pp.clear() clears the set pp. As an example
su2.clear() in [52] clears su2 and leaves it empty [53]. rr = pp.copy() creates
rr as a new set, a replica of pp itself. su2c is a copy of su2 [55, 56]. Being
copies they are equal as can be seen from [57]; but they are distinct objects [58].
For a given set as3 [59] list(as3) forms a list with all the elements of
as3 in it. The loop while as3: alt.append(as3.pop()) in [60]
achieves the same though in a roundabout manner. alt has been defined as an empty
list beforehand [59]. The loop pops elements
one at a time from as3 and appends each popped element to the list alt; this continues until
as3 is empty [62].
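The copy and the two ways of turning a set into a list can be sketched as follows. The contents of as3 are hypothetical, and the pop loop is run on a copy (as3c, a name assumed here) so that the original set survives for comparison.

```python
as3 = {'a0', 'b1', '1B', 'c2'}      # hypothetical contents
as3c = as3.copy()

same_content = (as3c == as3)        # equal as sets
same_object = (as3c is as3)         # but two distinct objects

direct = list(as3)                  # the direct route

alt = []
while as3c:                         # an empty set is falsy, which ends the loop
    alt.append(as3c.pop())          # pop() returns an arbitrary element

print(same_content, same_object)
print(sorted(direct) == sorted(alt), len(as3c))
```

The order of elements in direct and alt is not guaranteed, which is why the comparison is made on sorted copies.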
6.6 Frozensets
Any set is mutable; methods like remove(), add(), and pop() can be used with
a set to add elements to it or remove elements from it. In contrast a frozenset is a rigid set.
Once formed it remains frozen (as with a tuple). Additional elements cannot be
added to it; nor can elements be removed from it. The Python Interpreter sequence in
Fig. 6.14 illustrates its formation. aa1 in [1] is a list of strings. 'e4', 'f5',
'g6', and 'h7', as its elements, occur twice in it. frozenset(aa1) in [2] forms
a frozenset of its members and assigns it to afz1. afz1 in [3] has every element
as a unique one without duplication; order is not maintained either. [4] forms a set
as1 out of aa1. Content wise as1 is identical to afz1 as can be seen from [6].
The methods and operations with sets are applicable to frozensets also; the
only exceptions are the update-type methods and their operator counterparts. They
are not applicable to frozensets since they change the set by adding elements
to it or removing elements from it. The usage being similar to that with sets, the
methods/operations are not separately illustrated with frozensets.
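A short sketch of what "frozen" means in practice is given below. aa1 is the list from Fig. 6.14; the element 'm12' and the names add_error, combined, and lookup are assumptions made for this illustration.

```python
aa1 = ['a0', 'b1', 'c2', 'd3', 'e4', 'f5', 'g6', 'h7',
       'e4', 'f5', 'g6', 'h7', 'i8', 'j9', 'k10', 'l11']
afz1 = frozenset(aa1)
as1 = set(aa1)

# Update-type methods do not exist on a frozenset.
try:
    afz1.add('m12')
    add_error = None
except AttributeError as exc:
    add_error = exc

# Non-mutating operations work and return a new frozenset.
combined = afz1 | {'m12'}

# Being immutable, a frozenset is hashable and can serve as a dictionary key.
lookup = {afz1: 'twelve distinct labels'}

print(len(afz1), as1 == afz1, type(combined).__name__)
```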
>>> aa1 = ['a0', 'b1', 'c2', 'd3', 'e4', 'f5', 'g6', 'h7',
'e4', 'f5', 'g6', 'h7', 'i8', 'j9', 'k10', 'l11'] [1]
>>> afz1 = frozenset(aa1) [2]
>>> afz1 [3]
frozenset({'f5', 'c2', 'j9', 'd3', 'h7', 'a0', 'i8',
'l11', 'g6', 'e4', 'b1', 'k10'})
>>> as1 = set(aa1) [4]
>>> as1 [5]
{'f5', 'c2', 'j9', 'd3', 'h7', 'a0', 'i8', 'l11', 'g6',
'e4', 'b1', 'k10'}
>>> as1 == afz1 [6]
True
>>>
The Python Interpreter sequence in Fig. 6.16 illustrates the use of different functions
and methods with dictionaries. An empty dictionary dc is formed in
[1]. dc['z0'] = 'ZZ0' in [2] enters ('z0', 'ZZ0') as a (key, value) pair into dc.
Similarly ('y1', 'YY1') is also added to dc in [3] as can be seen from [4]. With any
dictionary a (key, value) pair can be added in this manner. In fact the value
associated with an existing key can also be changed with a similar
fresh assignment. The dictionary dd in [5] has ('b1': 'BB1') as a (key:
value) pair in it. The value is redefined as 'bb1' in [6]. The altered dd is in [7].
if 'z0' in dc in [8] checks whether 'z0' is a key in dc; it being true, 'yes' is
printed out as desired. Since 'yy' is not a key in dc [9], a 'no' is output. dd['a0']
in [10] looks up 'a0' as a key in dd; since it is present its value
(= 'AA0') is returned. Since 'aa' is not a key in dd, dd['aa'] in [11] raises a
KeyError.
For the dictionary dd the method dd.get('c2') in [12] searches for the
key 'c2' in dd. It being present the associated value 'CC2' is returned. If the key
is absent, as with dd.get('aa') in [13], None is returned and the interpreter displays nothing. The same is true of
the empty dictionary ddc in [15]: ddc.get('aa') in [16] also returns None. In this
respect dd.get() is different from dd[]. [14] is the use of the general form of the
get() method. dd.get(aa, bb) checks for the key aa in dd. If present the corresponding
value is returned. If not, the second argument specified (bb) is
returned. In the previous case the second argument was left out; since 'aa' is not a
key in dd, the default None was returned.
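The difference between indexing and get() can be seen by capturing the return values instead of relying on what the interpreter echoes. The following is a sketch using the dictionary dd of Fig. 6.16; the names present, absent, fallback, and index_error are assumed here.

```python
dd = {'a0': 'AA0', 'b1': 'BB1', 'c2': 'CC2', 'd3': 'DD3', 'e4': 'EE4'}
dd['b1'] = 'bb1'                                  # overwrite an existing value

present = dd.get('c2')                            # key exists: its value
absent = dd.get('aa')                             # key missing: None, no error
fallback = dd.get('aa', 'Sorry, no aa here')      # key missing: given default

try:
    dd['aa']                                      # indexing a missing key fails
    index_error = None
except KeyError as exc:
    index_error = exc

print(present, absent, fallback, type(index_error).__name__)
```

At the interactive prompt a None result is simply not displayed, which is why [13] and [16] show no output.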
>>> dc = {} [1]
>>> dc['z0']='ZZ0' [2]
>>> dc['y1']='YY1' [3]
>>> dc [4]
{'z0': 'ZZ0', 'y1': 'YY1'}
>>> dd = {'a0':'AA0','b1':'BB1','c2':'CC2','d3':'DD3',
'e4':'EE4'} [5]
>>> dd['b1']='bb1' [6]
>>> dd [7]
{'d3': 'DD3', 'e4': 'EE4', 'a0': 'AA0', 'b1': 'bb1',
'c2': 'CC2'}
>>> if 'z0' in dc: print('yes') [8]
... else:print('no')
...
yes
>>> if'yy' in dc: print('yes') [9]
... else:print('no')
...
no
>>> dd['a0'] [10]
'AA0'
>>> dd['aa'] [11]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'aa'
>>> dd.get('c2') [12]
'CC2'
>>> dd.get('aa') [13]
>>> dd.get('aa','Sorry, no aa here') [14]
'Sorry, no aa here'
>>> ddc= {} [15]
>>> ddc.get('aa') [16]
>>> dd.update({'f5':'FF5'}) [17]
>>> dd [18]
{'a0': 'AA0', 'c2': 'CC2', 'e4': 'EE4', 'b1': 'bb1',
'f5': 'FF5', 'd3': 'DD3'}
>>> from demo_5 import marks1
>>> a1 = marks1.st [19]
>>> d1 = dict() [20]
>>> d1
{}
>>> for jj in range(len(a1)): d1.update({a1[jj][0]:tuple(a1[jj][1:])}) [21]
...
>>> d1 [22]
{'Sarani': (76, 78, 82, 83, 84), 'Karthik': (77, 78, 79,
80, 81), 'Kala': (90, 86, 91, 92, 93), 'Lan': (65, 86,
66, 67, 68), 'Karun': (85, 86, 87, 88, 89)}
Fig. 6.17 Python Interpreter sequence illustrating more operations with dictionaries
form an iterator. But iter(dd), in a more compact form, does the same. [2] uses it to
form ee as a list of keys [3]. dd.pop('e4') in [4] pops the value ('EE4') for the
specified key ('e4') from the dictionary dd as in [5]. If such a key is absent in the
dictionary a KeyError will be raised. However if the popping is done with dd.pop
('e4', z), the second argument (z) will be returned if 'e4' is not a key in dd. In the
present case, after step [4] dd does not have ('e4': 'EE4') as an item in it. Hence
another attempt to pop with dd.pop('e4', 'sorry') in [6] returns 'sorry' [7].
dd.popitem() removes and returns a (key, value) pair from dd: an arbitrary one in older
Python versions, and the most recently inserted one from Python 3.7 onwards. With [8] the item
('a0': 'AA0') is returned. dd.popitem() is done here repeatedly until dd becomes
empty as in [9]. Another attempt to pop an item in [10] raises a KeyError.
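The behaviour of pop() and popitem() described above can be sketched as below. The dictionary is the dd used earlier; drained and empty_error are names assumed here, and no particular popping order is relied upon.

```python
dd = {'a0': 'AA0', 'b1': 'BB1', 'c2': 'CC2', 'd3': 'DD3', 'e4': 'EE4'}

value = dd.pop('e4')                 # key present: its value is returned
fallback = dd.pop('e4', 'sorry')     # key now absent: the default is returned

drained = []
while dd:                            # keep popping items until dd is empty
    drained.append(dd.popitem())     # each item is a (key, value) tuple

try:
    dd.popitem()                     # nothing left to pop
    empty_error = None
except KeyError as exc:
    empty_error = exc

print(value, fallback, len(drained), type(empty_error).__name__)
```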
For a dictionary d the method d.setdefault(x, y) does the following:
If x is present as a key in d, the associated value is returned. The dictionary
remains unaffected. dd is refreshed in [11]. dd.setdefault('b1')
in [12] returns 'BB1' since the corresponding dictionary item is
{'b1': 'BB1'}. The dictionary remains unchanged.
If x is not present as a key in d, the item {x: y} is entered into the dictionary
and the value y is returned. dd.setdefault('f5', 'FF5') in [14] adds the
item {'f5': 'FF5'} into the dictionary as in [15].
If the value y is not specified in the command, that is if the command is
d.setdefault(x), and x is not an existing key, the item (x: None) is
entered into the dictionary; that is, x is entered as a key with None as
the associated default value. 'g6' is not a key in dd as in [16]. Hence {'g6':
None} is added as an additional item in the dictionary.
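The three cases listed above can be exercised together. The following is a sketch with a shortened, hypothetical dd; the grouping loop at the end is an added illustration of a common use of setdefault() and does not come from the figure.

```python
dd = {'a0': 'AA0', 'b1': 'BB1', 'c2': 'CC2'}    # hypothetical, shortened dd

existing = dd.setdefault('b1')                  # key present: value returned
unchanged = dd.setdefault('b1', 'ignored')      # default is not used
added = dd.setdefault('f5', 'FF5')              # key absent: item inserted
added_none = dd.setdefault('g6')                # key absent, no default: None

# A common idiom: create the list for a key on first use, then append.
groups = {}
for word in ['apple', 'avocado', 'banana']:
    groups.setdefault(word[0], []).append(word)

print(dd)
print(groups)
```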
6.9 *Arg and **Kwarg
Fig. 6.19 Python Interpreter sequence illustrating additional operations with dictionaries
When a statement in a Python suite accepts a sequence (like a tuple or a list) its
length is normally to be known beforehand. Python offers a convenient facility to
accommodate sequences whose lengths are not known beforehand. Specifying
an argument as *a implies that a is a sequence whose length need not be fixed or
known beforehand (incidentally such arguments are mostly specified as *arg; but
this is only a convention and not mandatory). This type of usage finds application as
arguments in functions and the like. The structure and usage of such constructs are
illustrated through the Python Interpreter sequence in Fig. 6.20. (1, 2, c) in [1] is a
tuple. Its first element is assigned to a and the rest are assigned to b as a list
[2]. b is formed here using elements whose count is not known beforehand; hence
b is a list. a, *b in [3] assigns the first element of the list [(1, 2, 3), (4, 5, 6, 7)],
namely (1, 2, 3), to a. The rest, the tuple of four integers, is assigned to b as a
one-element list [4]. [(1, 2, 3), (4, 5, 6, 7)] is a list of two elements. Hence for a, *b in [(1, 2, 3),
(4, 5, 6, 7)]: print(a, b) in [5] does the printing for both the elements in [(1, 2,
3), (4, 5, 6, 7)] sequentially. Firstly (1, 2, 3) is assigned to (a, *b) as 1 and [2, 3]
respectively and printed out [6]. Subsequently (4, 5, 6, 7) is split in the same manner
as 4 and [5, 6, 7], again assigned to a and b, and the same printed out [7].
The same sequence of elements is assigned to a, *b, and c in [8]. Since the
sequence has only two elements, (1, 2, 3) and (4, 5, 6, 7), they are assigned to
a and c leaving an empty list for b [9].
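The starred assignments discussed above can be tried as follows. This is a sketch: the third element 'c' of the first tuple is assumed to be a string, and the names pairs, first, rest, unpacked, x, y, and z are introduced here for illustration.

```python
a, *b = (1, 2, 'c')                 # a takes the first item, b the rest as a list

pairs = [(1, 2, 3), (4, 5, 6, 7)]
first, *rest = pairs                # rest is a list holding the one remaining tuple

unpacked = []
for head, *tail in pairs:           # each tuple is split as it is visited
    unpacked.append((head, tail))

x, *y, z = pairs                    # two items go to x and z; y is left empty

print(a, b)
print(first, rest)
print(unpacked)
print(x, y, z)
```

The starred name always receives a list, even when that list holds a single item or none at all.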
vff() in [10] has been defined as a function (a simple illustrative example). The
elements of bb are summed up, and bb and this sum together are returned as a
tuple. vff(), called as vff(*v1) in [11], sums up the four numbers forming the
tuple v1 and returns v1 and the sum. The list v2 of three numbers is the
argument of vff() in [13]; once again v2 and the sum of the elements in it are
returned. The function vf1(*v3) in [14] accepts the argument v3 as a set of numbers
and returns the mean, the variance, and the argument vector itself [16]. For vectors
v1 and v2, vf1() is evaluated and returned in [17] and [18] respectively (in fact for
the specific case here def vf1(v3) would have been a simpler function definition
statement).
Function dpdt(v1, v2) in [19] has vectors v1 and v2 as its arguments and
returns their inner product [20]. Function ang_1() in [21] invokes the inner product
function repeatedly to get the magnitudes of the vectors vc and vd, and their inner
product. The ratio <vc, vd>/(||vc|| ||vd||) and, as its arc cosine, the angle between the vectors are
obtained. ang_1(va, vb) in [23] returns the angle between vectors va and vb as
2.300523983021863 radians. vec_m(*vv), a function defined in [24], returns the
magnitude of vector vv. Note that it suits vectors of any number of elements.
vec_m() is used in [26] along with the inner product function dpdt to obtain the
same angle.
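One possible shape for these three functions is sketched below. The function names follow the text, but the bodies are assumptions (the figure's own definitions are not reproduced here), and the vectors va and vb are hypothetical values chosen so that the angle is easy to verify by hand.

```python
import math

def dpdt(v1, v2):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(v1, v2))

def vec_m(*vv):
    """Magnitude of a vector whose components are passed as separate arguments."""
    return math.sqrt(sum(x * x for x in vv))

def ang_1(vc, vd):
    """Angle in radians between vc and vd: arc cosine of <vc, vd>/(||vc|| ||vd||)."""
    ratio = dpdt(vc, vd) / (vec_m(*vc) * vec_m(*vd))
    return math.acos(ratio)

va = (1.0, 0.0)     # hypothetical vectors at right angles
vb = (0.0, 2.0)
angle = ang_1(va, vb)
print(angle, math.pi / 2)
```

Calling vec_m(*vc) unpacks the vector into separate arguments, so the same function serves vectors of any dimension.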
When *aa is used in place of an argument in a function (or class and the like) the
set of items used in its place when calling the function (or instantiating the class) are
treated as though they represent a corresponding (elastic) sequence aa. **bb is
its counterpart for a dictionary. If it is present as an argument in a function all items
used in its place are treated as though they are entries (key-value pairs) in a
dictionary bb. In the Python Interpreter sequence in Fig. 6.21 dc1 is a
dictionary. The function sho_0(cc) [2] prints all the key-value pairs of cc
in succession, as can be seen from the call sho_0(dc1) in [3]. sho_1(**dd)
[4] is another function to do a similar printout. Here dd should consist of pairs of
the form key = value. sho_1() is called in [5] with four (key, value) pairs in
place of **dd. The desired printouts follow for all of them.
Arguments for function (or class) definitions can be a mix as in [6] for the
function Sho_2(). They can be direct arguments, sequence members, or dictionary
type members. However they should be specified and supplied in the same order. In
the function here the only argument directly specified, ee, is to be printed out
first [7]. It is followed by the prints of elements forming the sequence zz [8]. Lastly
the elements of yy, of the dictionary key-value pair type, are to be printed
out [9]. Sho_2() is called in [10] with a set of assorted arguments. The first one,
the string 'ss0', is identified with argument ee in [6] and printed out [11].
The four subsequent arguments, 'rr1', 'pp2' (two strings) and 67, 78
(two integers), are automatically identified as forming zz in [6]. They are printed out
in sequence from [12]. All the rest of the arguments are identified with yy in [6].
The prints from [13] onwards confirm this.
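The way a mixed argument list is divided among a direct argument, *zz, and **yy can be sketched as below. show_mixed() is a hypothetical stand-in for the function in the figure; it returns the lines it would print so that the split can be inspected, and the keyword arguments aa and bb are invented for the illustration.

```python
def show_mixed(ee, *zz, **yy):
    """Report how the supplied arguments are split among ee, zz, and yy."""
    lines = [f'ee: {ee}']
    lines += [f'zz[{i}]: {item}' for i, item in enumerate(zz)]
    lines += [f'yy[{key!r}]: {value}' for key, value in yy.items()]
    return lines

# One direct argument, four positional extras, two keyword extras.
report = show_mixed('ss0', 'rr1', 'pp2', 67, 78, aa='AA', bb=12)
for line in report:
    print(line)

# An existing dictionary can be supplied in place of **yy by unpacking it.
dc1 = {'k1': 'v1', 'k2': 'v2'}
from_dict = show_mixed('only', **dc1)
```

Inside the function zz is a tuple and yy is a dictionary, whatever the number of items supplied.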
6.10 Exercises
1. Marks information as in Fig. 5.16 is given. Prepare a program to form the list of
students who failed only in Physics. Prepare a program to get the list of students
who failed in two or more subjects. Test both the programs with the data in
Fig. 5.16.
2. Frequency of a letter pair occurring in a string of characters is called its bigram
frequency. In Example 6.3 use for jk in lo[:-2] to get bigram frequencies.
Prepare a program to make a dictionary of the most common ten bigram
frequencies for a given string (Shyamala et al. 2011).
3. a and b are two 4-dimensional vectors. Develop a function in Python to get the
dot product a . b. Evaluate a . b for a = [1.2, 2.3, 4.5, 6.7] and b = [9.8,
8.7, 7.8, 6.5].
4. Use *a, *b and develop the function in the exercise above. Evaluate a . b.
5. Define a Python function to get the arithmetic mean (ma), harmonic mean (mh),
geometric mean (mg), and weighted mean (mw) of a set of given numbers
x = {xi} (Sullivan 2008).
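\[ m_a = \frac{1}{n}\sum_{i=1}^{n} x_i \]
\[ m_h = \left(\frac{1}{n}\sum_{i=1}^{n} x_i^{-1}\right)^{-1} \]
\[ m_g = \left(\prod_{i=1}^{n} x_i\right)^{1/n} \]
\[ m_w = \sum_{i=1}^{n} w_i x_i \quad \text{where} \quad \sum_{i=1}^{n} w_i = 1 \]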
Evaluate the means for x = {9.6, 6.7, 5.4, 3.3, 2.8, 7.2} and w = {0.10, 0.16,
0.17, 0.18, 0.19, 0.20}.
6. The Fibonacci sequence is a sequence of numbers satisfying the property: the
ith number is $n_i = n_{i-1} + n_{i-2}$ (Sullivan 2008). Write a Python program to get $n_i$
given $n_0$ and $n_1$.
Get all $n_i$ up to $n_{10}$ for the set of ($n_0$, $n_1$) values (0, 1), (3, 4), and (1, 3).
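7. Binomial Distribution: The coefficients of $(a + b)^n$ for n = 1, 2, 3, … form
Pascal's triangle. $d_{n,k}$, the coefficient of $a^k b^{n-k}$, can be recursively expressed
as
\[ d_{n,k} = \begin{cases} 1 & \text{if } k = 1 \text{ or } n \\ d_{n-1,k} + d_{n-1,k-1} & \text{for all other } k. \end{cases} \]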
\[ c_{n,k} = \begin{cases} 1 & \text{for } k = 1 \\ 2^n & \text{for } k = n \\ c_{n,k-1} + c_{n-1,k} & \text{for all other } k. \end{cases} \]
With a = 1103515245, c = 12345, and m = $2^{31}$. The number has the range (0,
$2^{31} - 1$). Prepare a program to generate the nth random number $x_n$ from $x_{n-1}$
recursively. Use the values given here for a, c, and m as default values. Use 753
as the default value for $x_0$. For a given d, a similar (pseudo) random number in
the range (0, d - 1) is $x_n \bmod d$. Modify the program to output a random
number in the range (0, d - 1) if a value of d is specified.
9. Let x be a random number with uniform distribution in the range (0, $2^n - 1$). The
value of k such that $c_{n,k-1} < x < c_{n,k}$, with $c_{n,k}$ as in Exercise 7 above, represents
an integer conforming to binomial distribution. Combine the programs in
(7) and (8) above to generate a random number with binomial distribution over
a specified range.
10. {$x_i$} and {$y_i$} are two random sequences of length d each. The correlation
function between the two is defined as
\[ r_s = \sum_{i=0}^{d-1-s} x_i \, y_{i+s} \]
for any s (Papoulis and Unnikrishna Pillai 2002). Prepare a program to get r(s)
for s varying from 0 to d/10. With d = 1000 get two uniformly distributed
random sequences and get r(s) for them. When $x_i$ and $y_i$ are the same, r(s) is
the autocorrelation function. Get the autocorrelation function for both the
sequences.
11. Prepare a program to show a function as a bar graph. Depict the correlation
functions in Exercise 10 above as a bar graph.
12. A list of 200 students with marks in mathematics, physics, chemistry, and
English is to be made available for admission to an institution with four
branches of study. Each branch admits forty students based on a rank list. The
rank list for admission is to be prepared with marks in mathematics +0.5 times
the sum of marks in physics and chemistry combined as the basis. Further each
student has a preferred list of branches which is used for branch allotment. Use
the programs in the foregoing exercises for the following.
(a) Student list with marks: Prepare a program to assign marks in mathematics
at random in the range 80 to 89 conforming to binomial distribution. For this
prepare an array of 1024 numbers, their values being decided by the
cumulative binomial distribution (0th entry has 80, 1st to 11th entries have
81, 12th to 56th have 82, …, 1023rd has 89). Get 200 random
numbers in the range (0, 1023), with uniform distribution, obtained with
replacement. Use these as indices to allot marks from the above list.
(b) Follow the same procedure to allot marks in physics, chemistry, and
English also.
(c) Rank list: For each student get the weighted mark as M + 0.5 * (P + C)
where M, P, and C are the marks in Mathematics, Physics, and Chemistry
respectively.
(d) Form a list of student data with each entry being a list with the student
Serial Number (0 to 199), marks in Mathematics, Physics, Chemistry,
English, and the weighted marks as its elements.
(e) Based on the weighted marks in (c) above, rank the students and assign
ranks in descending order (student with the highest weighted marks having
first rank). The rank is added as the next item in the student's list.
(f) Branch preference list: The list of numbers [1, 2, 3, 4] represents the four
possible branches. Shuffle the list 200 times and allot to the 200 students
successively; this is the branch preference list for each student. Add this list
as an additional entry to the data list of each student. To shuffle a list of k
numbers, with j as a random integer in the range (0, k - 1) do a circular right
shift of the list by j positions (this is not a good algorithm for shuffling; it
suffices for the present context).
(g) Branch allotment: Allot branch of his/her choice to the top ranking student.
Do the same to the second rank holder and so on. Continue until all the
branches are full. Add the allotted branch as an additional item to each
student list.
(h) Wait list: Have a wait-list of ten students, continuing the rank list based
allotment.
(i) After allotment is over two students from each of the branches leave the
course. All these are selected randomly from the allotted sets. Continue
re-allotment maintaining ranks and accommodating the eight top wait-listed
students. Complete allotment. Add allotted branch as an additional item to
the student data.
13. A set of n integers is given. Write programs to arrange them in ascending
order; use the following algorithms (all are recursive):
(a) Form L0 as an empty list. Search the full set of integers for the smallest
integer. Append the result to L0. Repeat the search with the rest of the set;
continue to the finish (selection sort algorithm).
(b) Divide the set into n/2 groups of two elements each. Arrange each group in
ascending order. Merge the rst two groups (which are already arranged)
into a single group of four elements all being in ascending order. Repeat
with each pair of successive groups. Merge each pair of groups of four
elements at the next stage. Continue this to completion. Two groups of n/2
elements each, already arranged in ascending order, are combined at the
last stage. The merging at every stage is with two arranged sets. The
procedure outlined here reduces the number of comparisons to be carried
out substantially. Whenever necessary, pad up the groups with zero integers
(Merge sort algorithm) (Guttag 2013).
References
Guttag JV (2013) Introduction to computation and programming using Python. MIT Press,
Massachusetts
Papoulis A, Unnikrishna Pillai S (2002) Random variables and stochastic processes, 4th edn.
McGraw Hill, New York
van Rossum G, Drake FL Jr (2014) The Python library reference. Python Software Foundation
Shyamala CK, Harini N, Padmanabhan TR (2011) Cryptography and security. Wiley India, New
Delhi
Sullivan M (2008) Algebra and trigonometry, 8th edn. Pearson Prentice Hall, New Jersey
Chapter 7
Operations for Text Processing
7.1 Unicode
Unicode (The Unicode Standard) defines a code space of 1,114,112 code points in
the range 0 to 10FFFFh. All characters in all languages, formatting separators, control
characters, and mathematical symbols are represented by uniquely assigned binary
numbers in Unicode, assigned by the Unicode Consortium. A Unicode point is
represented as U+ah where ah is the hex number representing the point.
Representations for most of the commonly used characters call for the use of a
maximum of 16 bits. In Unicode the characters of the English language and the other
(European) groups of languages are assigned separate segments in the binary
sequence. This facilitates interface software development and speeds up software
based conversions between the characters and their Unicode representations. The
Unicode Consortium also identifies characters and symbols not assigned so far, and
assigns Unicode values to them. This is a continuing process.
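In Python the built-in functions ord() and chr() convert between a character and its code point, which gives a quick way to explore the code space. The following is a small sketch; the sample characters and the names code_space, labels, and round_trip are chosen here for illustration.

```python
import sys

# The code space runs from 0 to 10FFFFh inclusive.
code_space = 0x10FFFF + 1

samples = ['A', '~', '\u0394']                        # 'A', tilde, Greek capital delta
labels = [f'U+{ord(ch):04X}' for ch in samples]       # conventional U+ notation
round_trip = [chr(ord(ch)) for ch in samples]         # chr() inverts ord()

print(code_space, hex(sys.maxunicode))
print(labels)
```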
7.2 Coding
7.2.1 UTF-8
All characters can be coded in UTF-8. A basic set of 128 characters, forming the
ASCII set, has a 7-bit representation. This is accommodated within a byte. This
ensures backward compatibility with ASCII which has been the most widely used
code so far. A file consisting exclusively of ASCII characters is represented as a byte
sequence, each byte representing an ASCII character. The ASCII character set is
given in Table 7.1 (Padmanabhan 2007). The 26 capital letters (A, B, C, …, Y, Z),
the corresponding small letters (a, b, c, …, y, z), Arabic numerals (0, 1, 2, …, 9),
commonly used algebraic symbols (+, -, /, *), as well as other symbols (%, $, …)
are all part of it. A number of control characters (Next Line, Tab, End of Line,
End of Document, …) are additional to these; this also includes formatting separators
(comma, full stop, question mark, …). The byte value in decimal, octal, and
hex form for each of the characters is given in the table.
Any Unicode character (often referred to as a Unicode point) that can be represented
in UTF-8 has a bit representation whose length can extend up to 21 bits. This
possible set can be grouped into four distinct (and mutually exclusive) ranges as
shown in Table 7.2. The following characterize UTF-8:
The first set is for characters with the number of significant bits being seven or
less. It has a range of 0 to 127, comprising 128 characters. They are represented
by a single byte each with the most significant bit being zero. Such a representation
is identical to the ASCII set.
The rest of UTF-8 is for Unicode characters of length 8 bits or more. They
are represented as 2-, 3-, or 4-byte sets depending on the number of bits in the
Unicode point. If it is of 8 to 11 bits in length a 2-byte representation is used. If it
is of 12 to 16 bits a 3-byte representation is used. If it is of 17 or more bits in
size a 4-byte representation is used.
Unicode standard versions are updated over time, always an updated version
replacing an existing one. However all updates add new characters to the
Table 7.2 Group details of Unicode points and their UTF-8 representations
Sl. no.                        1          2          3          4
No. of bits in code point      7          11         16         21
First code point               U+0000     U+0080     U+0800     U+10000
Last code point                U+007F     U+07FF     U+FFFF     U+10FFFF
No. of bytes in the sequence   1          2          3          4
First byte                     0XXXXXXX   110XXXXX   1110XXXX   11110XXX
Second byte                    -          10XXXXXX   10XXXXXX   10XXXXXX
Third byte                     -          -          10XXXXXX   10XXXXXX
Fourth byte                    -          -          -          10XXXXXX
existing set but do not alter the character allocation done so far (the so-called
'Korean mess', a one-time relocation of Hangul characters in an early version of the
standard, is the exception and is ignored here). Hence any fully debugged
encoder/decoder will always remain valid.
All multi-byte representations have a leading byte and 1, 2, or 3 continuation
bytes. All continuation bytes are of the form 10xx xxxx with b7 and b6 being 1
and 0 respectively.
The leading byte has the form 110x xxxx for the 2-byte representation. b7b6b5 =
110 signifies the 2-byte structure. Up to 2^11 characters can be represented with
the 2- and 1-byte sets together.
1110 xxxx is the leading byte for the 3-byte character representation. b7b6b5b4 =
1110 signifies the 3-byte structure. Up to 2^16 characters can be represented with
the 3-, 2-, and 1-byte sets together.
1111 0xxx is the leading byte for the 4-byte characters, b7b6b5b4b3 = 1111 0
signifying the 4-byte structure. Up to 2^21 characters can be represented using all
these four sets together.
One can synchronize with any serial data stream and identify a character by
examining a maximum of four consecutive bytes.
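The byte layouts listed above are enough to write a small encoder by hand. The function utf8_encode() below is a hypothetical sketch for illustration (Python's own str.encode('utf-8') is what one would normally use); it does no validation beyond the overall range, so the surrogate range D800h to DFFFh is not rejected.

```python
def utf8_encode(code_point):
    """Encode one code point by hand, following the leading/continuation layouts."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError('code point outside the Unicode range')
    if code_point < 0x80:                     # up to 7 bits: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:                    # up to 11 bits: 110xxxxx 10xxxxxx
        return bytes([0b11000000 | (code_point >> 6),
                      0b10000000 | (code_point & 0x3F)])
    if code_point < 0x10000:                  # up to 16 bits: 1110xxxx + 2 continuation
        return bytes([0b11100000 | (code_point >> 12),
                      0b10000000 | ((code_point >> 6) & 0x3F),
                      0b10000000 | (code_point & 0x3F)])
    return bytes([0b11110000 | (code_point >> 18),      # up to 21 bits: 11110xxx + 3
                  0b10000000 | ((code_point >> 12) & 0x3F),
                  0b10000000 | ((code_point >> 6) & 0x3F),
                  0b10000000 | (code_point & 0x3F)])

# Compare with the built-in encoder for one code point from each group.
for cp in (0x41, 0x394, 0x2190, 0x1F600):
    print(f'U+{cp:04X}', utf8_encode(cp).hex(), chr(cp).encode('utf-8').hex())
```

Each continuation byte carries six payload bits, which is why the code point is shifted in steps of six and masked with 3Fh.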
The characters in Latin languages as well as some others can be represented with
the 2-byte set. It also includes characters with diacritical
marks. A substantial part of the rest of the character set can be represented by the
3-byte sets. 4-byte sets are needed only for the less common (CJK) characters,
mathematical symbols, and emojis.
Example 7.1 The characters A, ~, Δ, ←, and √ have the Unicode values
U+41, U+7e, U+394, U+2190, and U+221a respectively. Convert them into
respective UTF-8 byte sequences.
The binary value of 41h is 100 0001. Being a 7-bit number, its UTF-8 code is a
single byte with 0 as its MSB; it is 0100 0001. Similarly the binary value of 7eh is
111 1110; again being of 7 bits, the UTF-8 code is 0111 1110.
The binary value of 394h is 11 1001 0100, 10 bits long. From the second group (Sl. no. 2) in
Table 7.2 one can see that its UTF-8 representation is of 2 bytes, these being 1100
1110 and 1001 0100.
The binary value of 2190h is 10 0001 1001 0000. Being of 14 bits, the UTF-8
code is of 3 bytes (see the third group, Sl. no. 3, in Table 7.2). These are 1110 0010, 1000 0110,
and 1001 0000 respectively. Similarly the binary value of 221ah is 10 0010 0001
1010, again of 14 bits; the corresponding UTF-8 code is of 3 bytes: 1110 0010, 1000
1000, and 1001 1010 respectively.
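The conversions in Example 7.1 can be cross-checked with Python's built-in encoder. In the sketch below the dictionary name table is an assumption; each code point is encoded and its bytes are written out in binary.

```python
code_points = [0x41, 0x7E, 0x394, 0x2190, 0x221A]

table = {}
for cp in code_points:
    encoded = chr(cp).encode('utf-8')                     # bytes object
    table[cp] = ' '.join(f'{byte:08b}' for byte in encoded)

for cp, bits in table.items():
    print(f'U+{cp:04X} -> {bits}')
```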
Example 7.2 UTF-8 byte sequences of a set of four characters are given as 0011
1100, 0011 1110, (1110 0010, 1000 1000, 1001 1110), and (1110 0010, 1000 0110,
1001 0010) respectively. Obtain the respective Unicode values.
The codes 0011 1100 and 0011 1110 being single bytes, the respective Unicode
values are of less than 8 bits in length; they are U+3c and U+3e respectively. (They
represent < and > respectively.)
The code (1110 0010, 1000 1000, 1001 1110) is of 3 bytes; its Unicode value is
U+221e (it represents the symbol ∞). Similarly the code (1110 0010, 1000 0110,
1001 0010) is of 3 bytes; the corresponding Unicode value is U+2192 (it represents the
symbol →).
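The reverse conversion of Example 7.2 can be checked in the same way by decoding the byte sequences. The names sequences, decoded, and labels in this sketch are assumptions made for the illustration.

```python
sequences = [
    bytes([0b00111100]),
    bytes([0b00111110]),
    bytes([0b11100010, 0b10001000, 0b10011110]),
    bytes([0b11100010, 0b10000110, 0b10010010]),
]

decoded = [seq.decode('utf-8') for seq in sequences]      # one character each
labels = [f'U+{ord(ch):04X}' for ch in decoded]

print(labels)
```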
Python as a language will be called upon to do operations on strings of characters
(texts), groups of bytes (numerical), or on their combinations. A number of
operations (methods, functions and the like) are available with strings and byte
sequences (van Rossum and Drake 2014). Some of the latter category have already
been discussed in the preceding chapter. We shall focus on the methods/functions
with strings here and also on those to convert from one form to another. Input and
output schemes facilitate interface with strings, byte sequences and so on. These are
normally handled through print methods/functions discussed later.
(a)
>>> s1 = 'Good' ' Morning' [1]
>>> s1 [2]
'Good Morning'
>>> s2 = 'How' ' are' ' you' [3]
>>> s2
'How are you' [4]
>>> s3 = s1 + '! ' + s2 [5]
>>> s3 [6]
'Good Morning! How are you'
>>> s1[0] [7]
'G'
>>> s1[0:1] [8]
'G' [9]
>>> s1[:-1] [10]
'Good Mornin'
>>> s1[2:] [11]
'od Morning'
>>> s4 = 'hello how are you?' [12]
>>> s5 = s4.capitalize() [13]
>>> s5 [14]
'Hello how are you?'
>>> s5.casefold() [15]
'hello how are you?'
>>> s5.center(25) [16]
'    Hello how are you?   ' [17]
>>> s5.center(25,'*') [18]
'****Hello how are you?***'
>>> s5.center(30,' ') [19]
'      Hello how are you?      '
>>> s5.rjust(25) [20]
'       Hello how are you?'
>>> s5.rjust(25,'@') [21]
'@@@@@@@Hello how are you?'
>>> s5.ljust(25) [22]
'Hello how are you?       '
>>> s5.ljust(25,'@') [23]
'Hello how are you?@@@@@@@'
>>> s6 = 'One day there passed by a company of cats a wise
dog' [24]
>>> s6.count('th') [25]
1
>>> s6.count(' a ') [26]
2
>>> s1, s2 = 'a1b2c3', 'd 5 e 6' [27]
>>> s1.join(s2) [28]
'da1b2c3 a1b2c35a1b2c3 a1b2c3ea1b2c3 a1b2c36'
>>> s3 = ('zZ', 'yY', 'xX') [29]
>>> s1.join(s3) [30]
'zZa1b2c3yYa1b2c3xX'
Fig. 7.1 a Python Interpreter sequence illustrating string operations (continued in Fig. 7.1b),
b Python Interpreter sequence illustrating string operations (continued from Fig. 7.1a)
7.3 Operations with string S 143
(b)
>>> ''.join(s3) [31]
'zZyYxX'
>>> from demo_6 import twd
>>> s7 = twd.sa1 [32]
>>> len(s7) [33]
634
>>> s7.count('th') [34]
18
>>> s7.count('th', 100, 600) [35]
16
>>> s7.count('th', 100) [36]
16
And as he came near and saw that they were very intent and
heeded him not, he stopped.
And when the dog heard this he laughed in his heart and
turned from them saying, "O blind and foolish cats, has it
not been written and have I not known and my fathers before
me, that that which raineth for prayer and faith and
supplication is not mice but bones." '''
In all our day-to-day dealings and transactions numbers are represented and
processed in decimal form. In computers and computer-based schemes and
applications numbers are represented and processed in binary form. Methods and functions
available in Python carry out all representation and related algebra with numbers in binary
form. For convenience and compactness numbers are more often written in
octal or hexadecimal (hex) form. But for displays, printouts, and similar
human interfaces decimal numbers are used. Python has the flexibility to
represent numbers in different ways and to convert them from one form to another.
These are explained and illustrated here.
7.4 Number Representations and Conversions 145
7.4.1 Integers
Integers can be represented in decimal form directly. Binary, octal, and hex
representations can be done in simple and well accepted formats. Illustrative details are in
the Python Interpreter sequence in Fig. 7.3. n1 [1] is the hex number 43h. '0x' or
'0X' signifies the following digit sequence to be a hex number; the digits can be
from the set {0, 1, 2, ..., 8, 9, a(A), b(B), c(C), d(D), e(E), f(F)}. Small or capital
letters can be used for a, b, c, d, e, and f. 0x43 = 4 * 16**1 + 3 * 16**0 = 67 (in
decimal form) as can be seen from [2]. Octal numbers are represented as '0o' or
'0O' followed by the number, a sequence of digits from the set {0, 1, 2, ..., 6, 7}.
Fig. 7.3 Python Interpreter sequence illustrating number representations and conversions
146 7 Operations for Text Processing
The octal number 0O103 in [3] is again the decimal number 67 itself [4]. 0b or
0B followed by a binary sequence is a binary number. 0b1000011 in binary form
[5] is again the decimal number 67 [6].
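The three prefixed forms can be compared directly. The following is a minimal sketch; the names n_hex, n_oct, and n_bin are illustrative and need not match those used in Fig. 7.3.

```python
n_hex = 0x43        # hex literal
n_oct = 0o103       # octal literal
n_bin = 0b1000011   # binary literal
# All three literals denote the same integer; it is displayed in decimal form.
print(n_hex, n_oct, n_bin)
print(n_hex == 4 * 16**1 + 3 * 16**0)
```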
The function int(x) in its simplest form accepts x as a number and returns its
integral part as an integer. int() in [7] returns a zero [8]. int(501) in [9] returns
501 itself. int(67.8901) in [10] returns 67. The fractional part of the number is
ignored and the integral part returned as an integer. However if rounding-off is to be
done round() function can be used (discussed in the following section).
int(y, rr) is the general form of the int() function. Details of its use are as
follows:
If rr is omitted and y is a number, the integral part of y is returned as in the
foregoing cases. Here the number is implicitly taken to be a decimal number.
If rr is present it signifies the radix (base) of the number. It can take any value
from 2 to 36. Further y has to be a string representing the number to the base rr.
The characters in the string are from the set {0, 1, 2, ..., 8, 9, a, b, c, ..., y, z}
where a, b, c, ..., y, z represent the integer values 10 to 35 in the same order; only
characters of value less than rr can be present. The letters can be capital or small
versions. In addition strings carrying the binary, octal, or hex prefix (0b, 0o, 0x)
are also acceptable when rr is the matching base.
Negative integers have the negative sign at the left end of the string.
Lines [11] to [19] illustrate a few possible uses of the int() function. In [11] '592'
signifies an integer to base 36. Its decimal value is 6806 as seen from [12]. Similarly
'z', 'zxy', and 'XYZ', all to base 36, are shown in the succeeding lines along with
their decimal equivalent values. int('abc', 30) in [17] represents 'abc' as an
integer to base 30. Its equivalent decimal value is 9342. int('0xfe2', 16) in [19]
takes '0xfe2' as a hex integer having decimal value 4066 [20].
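A few of these conversions can be repeated as below. This is only a sketch; the names v1 to v4 are illustrative, and the last line (a negative number to base 36) is an added example not present in the figure.

```python
v1 = int('592', 36)      # 5*36**2 + 9*36 + 2
v2 = int('abc', 30)      # 10*30**2 + 11*30 + 12
v3 = int('0xfe2', 16)    # the 0x prefix is accepted with base 16
v4 = int('-z', 36)       # the sign is at the left end of the string
print(v1, v2, v3, v4)
```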
Example 7.4 The string '01110' is given. Treat it as an integer to bases 2 to 36
and obtain the respective decimal values.
l2 is formed as an empty list in [21]. All the required integers are successively
appended to it in the following lines. In subsequent lines l2 is displayed in three
convenient segments.
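The same list can also be formed in one step with a list comprehension. This is an alternative sketch and not the sequence used in the figure; the name values is illustrative. For base b the string '01110' stands for b**3 + b**2 + b.

```python
# '01110' read as a number to every base from 2 to 36.
values = [int('01110', base) for base in range(2, 37)]
print(values[:5])    # bases 2 to 6
print(values[-1])    # base 36
```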
The function bin(x) returns the binary equivalent of x as a string. Here x can be an
integer in decimal form, octal form, hex form or binary form itself as illustrated in [22].
hex(x) returns the hex value of integer x as a string. Similarly oct(x) returns the octal
value of x as a string. In all these cases x has to be an integer but its representation can
be in decimal, octal, binary, or hex form. These are illustrated in [23] to [26].
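A short sketch of the three functions follows, using the integer 67 of the earlier illustrations; the variable names are illustrative.

```python
b = bin(0x43)         # integer given in hex form
o = oct(67)           # integer given in decimal form
h = hex(0b1000011)    # integer given in binary form
print(b, o, h)
# The returned strings carry their prefixes and can be read back with int().
print(int(b, 2), int(o, 8), int(h, 16))
```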
The function float(ff) accepts a floating point number (or an integer) ff, given as a
number or as a string, and returns its equivalent as a floating point value. The Python
Interpreter sequences in
(a)
>>> f1, f2, f3, f4 = float('345.67'), float('0.34567e3'), float('3456.7E-1'), float('-3456700e-4') [1]
>>> f1, f2, f3, f4
(345.67, 345.67, 345.67, -345.67)
>>> g1, g2, g3, g4 = 345.67, 0.34567e3, 3456.7E-1, -3456700e-4 [2]
>>> float(g1), float(g2), float(g3), float(g4) [3]
(345.67, 345.67, 345.67, -345.67)
>>> h = (g1, g2, g3, g4) [4]
>>> list(map(float, h)) [5]
[345.67, 345.67, 345.67, -345.67] [6]
>>> j1 = g1.as_integer_ratio() [7]
>>> j1
(6081090949973279, 17592186044416) [8]
>>> j1[0]/j1[1] [9]
345.67
>>> g4.as_integer_ratio() [10]
(-6081090949973279, 17592186044416) [11]
>>> (2.00).is_integer() [12]
True
>>> (0.20e10).is_integer() [13]
True
>>> (2.00e-1).is_integer() [14]
False
>>> nn = 67.8901 [15]
>>> float.hex(nn) [16]
'0x1.0f8f765fd8adbp+6' [17]
>>> (j1[0]/j1[1]).hex() [18]
'0x1.59ab851eb851fp+8'
Fig. 7.4 a Python Interpreter sequence illustrating floating point number representations and
conversions (continued in Fig. 7.4b), b Python Interpreter sequence illustrating floating point
number representations and conversions (continued from Fig. 7.4a)
Fig. 7.4a, b illustrate the use of different operations related to floating point numbers.
The number 345.67 (and -345.67) is represented as strings in different forms in [1]
in Fig. 7.4a. float() returns the decimal value. The same set of numbers is assigned
to g1, g2, g3, and g4 in [2] and the float values obtained in [3]. With h = (g1, g2,
g3, g4) as a tuple the conversion is carried out using the map() function in [5] and
the result shown in [6]. A floating point number can be expressed in the rational form
as a ratio of two integers using the method .as_integer_ratio(). [7] illustrates this
for 345.67; the (numerator, denominator) pair is returned as a tuple in [8]. [9]
confirms this by evaluating the ratio (numerator/denominator) directly; [10] does the
conversion to ratio form for the negative number -345.67.
x.is_integer() tests whether the floating point number x has an integral value; if yes,
True is returned; else False is returned; the illustrations are in [12], [13], and [14].
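The two methods can be tried out as below. This is a minimal sketch; the names x, num, and den are illustrative. The denominator returned is a power of 2 since the number is stored in binary form.

```python
x = 345.67
num, den = x.as_integer_ratio()   # exact ratio of the stored binary value
print(num, den)
print(num / den == x)             # the ratio reproduces x exactly
tests = ((2.00).is_integer(), (0.20e10).is_integer(), (2.00e-1).is_integer())
print(tests)
```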
(b)
>>> hh1 = (345.67).hex() [19]
>>> hh1 [20]
'0x1.59ab851eb851fp+8'
>>> hh2 = float.fromhex(hh1) [21]
>>> hh2 [22]
345.67
>>> k1 = '0x2.0fp+3' [23]
>>> float.fromhex(k1) [24]
16.46875 [25]
>>> k3 = '0x2.0p+1' [26]
>>> k4 = float.fromhex(k3) [27]
>>> k4 [28]
4.0
>>> float.fromhex('0xfp-1') [29]
7.5
>>> round(57.654545) [30]
58
>>> round(57.654545,3) [31]
57.655
>>> b1 = 1.04555500000 [32]
>>> l2 = [] [33]
>>> for jj in range(1,9):l2.append(round(b1,jj))
...
>>> l2
[1.0, 1.05, 1.046, 1.0456, 1.04556, 1.045555, 1.045555,
1.045555] [34]
>>> a1 = 1.04555000000 [35]
>>> l1 = [] [36]
>>> for jj in range(1,9):l1.append(round(a1,jj)) [37]
...
>>> l1
[1.0, 1.05, 1.046, 1.0455, 1.04555, 1.04555, 1.04555,
1.04555] [38]
>>>
In algebra involving floating point numbers the numbers are normally present in
decimal form, represented either in the int.fraction form or in the (mantissa,
exponent) form.
In Python it is also possible to represent and display a floating point number in hex
(mantissa, exponent) form. Here the exponent is represented as 'pa' signifying 2**a
with a being the (positive/negative) exponent. With this convention a floating
point decimal number fn can be represented in hex form using float.hex(fn).
The floating point decimal number 67.8901 is assigned to nn in [15]. It is
represented in hex form as explained above using float.hex(nn) in [16] and [17].
The decimal number 345.67, expressed as an integer ratio in [9] (j1[0]/j1[1]), is
expressed as a floating point number in hex form in [18]. It is verified through direct
conversion in [19], [20] and again through reversal to decimal form in [21] and [22]
(in Fig. 7.4b). k1 in [23] is a hex number in floating point form (with binary value
of 10000.01111); it is converted to decimal form through float.fromhex(k1) in
[24] as 16.46875. Two additional examples of conversion from hex to decimal form
follow from [26] to [29].
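The round trip between the decimal and the hex forms can be sketched as follows; the names x, hx, back, k, and m are illustrative.

```python
x = 345.67
hx = x.hex()                       # same as float.hex(x)
back = float.fromhex(hx)           # the round trip is exact
print(hx, back)
k = float.fromhex('0x2.0fp+3')     # (2 + 15/256) * 2**3
m = float.fromhex('0xfp-1')        # 15 * 2**-1
print(k, m)
```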
For a floating point number x, round() rounds x to the desired accuracy. With
a single argument, round(x) rounds x to an integer as in [30]. round(x, d)
rounds x to d digits beyond the decimal point as in [31]. The rounding off is
carried out based on the actual (binary) representation of the number in memory.
Example 7.5 The numbers b1 = 1.0455550 and a1 = 1.045550 are represented as
1.04555500000000001215028078149771317839622497558593750 and
1.04554999999999997939426066295709460973739624023437500 respectively
in the computer. Round them off to different accuracies and explain any anomaly.
The list l2 is initialized as an empty list in [33]. b1 (=1.0455550) is rounded off
to different numbers of digits and appended to l2; the result is in [34]. In all cases the
rounding is done to the nearest level as is to be expected.
A similar rounding off is done with a1 = 1.045550 [35] and l1 is the list of
rounded numbers. [38] shows the respective values. Since a1 is stored as 1.045549...,
slightly below 1.04555, the roundings work out as follows:
1.045549... is rounded to 1.04555.
1.04554... is rounded to 1.0455 and not 1.0456.
1.0455... is rounded to 1.046.
1.045... is rounded to 1.05.
1.04... is rounded to 1.0 and not 1.1.
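The stored values and the rounded lists of Example 7.5 can be examined as below. This is a sketch that uses the decimal module (not used in the text at this point) only to display the exact binary values held in memory; the list comprehensions replace the for loops of the figure.

```python
from decimal import Decimal

a1, b1 = 1.04555, 1.045555
print(Decimal(a1))   # exact stored value, slightly below 1.04555
print(Decimal(b1))   # exact stored value, slightly above 1.045555
# Round to 1, 2, ..., 8 digits beyond the decimal point.
l1 = [round(a1, d) for d in range(1, 9)]
l2 = [round(b1, d) for d in range(1, 9)]
print(l1)
print(l2)
```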
The Python Interpreter sequence in Fig. 7.5 illustrates some additional operations
with strings. SS.endswith(sb) returns True if the string SS ends with the
sub-sequence sb; else it returns False. String s6 in [1] ends with 'dog'. [2]
confirms this. One has the option of checking for any element in a tuple to be at the
end of SS as illustrated by [3] and [4] with the string s6 itself. Similar checks can
be made over a selected segment of SS also. SS.endswith(sb, p, q) picks out
the slice SS[p:q] and checks whether it ends with the sub-string sb. [5] is an
illustration with s6. SS.startswith(sb, p, q) is similar to SS.endswith
(sb, p, q); it tests string SS for sb being at the start of the slice SS[p:q] in it.
One can also test for an element in a tuple (used in place of sb) to be at the start of
SS[p:q]. [5], [6], and [7] are illustrations of use of .startswith(). A select
number of methods is available to check for the presence of different categories of
characters in strings. SS.isalpha() returns True if the string SS is non-empty
Fig. 7.5 Python Interpreter sequence illustrating additional operations with strings
7.5 More String Operations 151
Table 7.3 Methods to test string content: if the specified string is non-empty and the
specified test is satisfied True is returned; else False is returned
Method                 Test
isalpha()              All are alphabetic characters
isnumeric()            All are numeric characters, i.e., digits and other characters
                       with a numeric value as defined in Unicode
isdecimal()            All are decimal characters; includes the digit characters
                       (0, 1, ..., 9) and others defined in Unicode
isdigit()              All are decimal characters or some others, like superscript
                       and subscript digits, and so on as defined in Unicode
isalnum()              Every character in the string satisfies one of SS.isalpha(),
                       SS.isdecimal(), SS.isdigit(), or SS.isnumeric()
isidentifier()         The string is a valid Python identifier
islower()              All cased characters are of lower case type
isprintable()          All are printable characters (excludes control, formatting
                       characters ...)
isspace()              All are white space characters
istitle()              The string has to be a title-cased string: the first letter in
                       every word is a capital letter
isupper()              Every cased character in the string is in upper case
startswith(sb, p, q)   Slice SS[p:q] starts with substring sb. One can also use a tuple
                       in place of sb and test whether SS starts with an element in it
endswith(sb, p, q)     Slice SS[p:q] ends with substring sb. One can also use a tuple
                       in place of sb and test whether SS ends with an element in it
and all its characters are alphabetic. Table 7.3 lists the different methods of this
category with details of the checks they make. Relevant illustrative examples of
usage are also shown in the sequence in Fig. 7.5.
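A few of these tests are sketched below with the string s6 used earlier. The tuples ends and starts and the sample characters are illustrative choices and need not match the entries in Fig. 7.5. The three digit-like characters show how isdecimal(), isdigit(), and isnumeric() differ.

```python
s6 = 'One day there passed by a company of cats a wise dog'
ends = (s6.endswith('dog'), s6.endswith(('cat', 'dog')),
        s6.endswith('cats', 0, 41))          # slice s6[0:41] ends with 'cats'
starts = (s6.startswith('One'), s6.startswith(('day', 'night'), 4),
          s6.startswith('day'))
print(ends, starts)
# Digit seven, superscript two, and the fraction one half.
for ch in ('7', '\u00b2', '\u00bd'):
    print(repr(ch), ch.isdecimal(), ch.isdigit(), ch.isnumeric())
others = ('abc'.isalpha(), ''.isalpha(), 'One Day'.istitle(), 'x_1'.isidentifier())
print(others)
```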
Strings can be combined and split in different ways. Illustrative examples are
in the Python Interpreter sequence in Fig. 7.6. SS.split() directly splits SS into a
list of all the words in it, words in the sense of groups of characters separated by
white spaces. The string ss in [1] is split to form the list sp shown in [2].
Any sequence of strings can be concatenated into one string using com.join(sqn).
Here sqn is a sequence (like a tuple or a list). The string com is
interposed between adjacent elements of sqn in forming the concatenated string.
The string sj in [4] is formed by combining the elements of list sp. A white
space is inserted between adjacent elements in it [3]. sjj in [5] is formed
similarly with three white spaces between adjacent elements. sj and sjj are again
split into respective word lists in [6] using the method split() itself. Note that
the intervening spaces are ignored whatever be their lengths. sj0 in [7] is formed by
concatenating the elements of sp with the single character string '$' as the
separator. With split() the argument need not be specified if white space is the
separator as was done in [2]. With other separators the separator has to be specified
as a string argument. sj0.split('$') splits string sj0 into its elements treating
'$' as the separator between adjacent elements [8]. The method .split() has two
arguments in its general form, SS.split(aa, b); here aa is the separator
string and b the integer signifying that SS is to be split into at most b + 1 elements. sp4 in
Fig. 7.6 Python Interpreter sequence illustrating use of split() method with strings
Fig. 7.7 Python Interpreter sequence illustrating use of variants of split() method with strings
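The split() and join() methods can be tried out as below. This is only a sketch with a short string of its own; it is not the sequence of Fig. 7.6 or Fig. 7.7, though the names ss, sp, sj, and sj0 follow the text.

```python
ss = 'One day there   passed by'
sp = ss.split()                  # runs of white space count as one separator
sj = ' '.join(sp)                # single white space between the words
sj0 = '$'.join(sp)               # '$' as the separator
print(sp)
print(sj0.split('$'))
limited = sj0.split('$', 2)      # at most 2 splits, so at most 3 elements
print(limited)
```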
(a)
>>> ss = ' One day there passed by a company of cats a wise dog ' [1]
>>> s6 = ss.strip() [2]
>>> s6
'One day there passed by a company of cats a wise dog'
>>> s6.strip() [3]
'One day there passed by a company of cats a wise dog'
>>> s6.strip('Ognd') [4]
'e day there passed by a company of cats a wise do'
>>> s6.lstrip('e nOd') [5]
'ay there passed by a company of cats a wise dog'
>>> s6.rstrip('a wise dog') [6]
'One day there passed by a company of cat'
>>> s6.replace('c', 'C') [7]
'One day there passed by a Company of Cats a wise dog'
>>> s6.replace('e', 'EE', 2) [8]
'OnEE day thEEre passed by a company of cats a wise dog'
>>> s6.partition('company') [9]
('One day there passed by a ', 'company', ' of cats a wise dog')
>>> s6.partition('Company') [10]
('One day there passed by a company of cats a wise dog', '', '')
>>> ss6.rpartition('EE') [11]
('OnEE day thEErEE pass', 'EE', 'd by a company of cats a wise dog')
>>> s6.find('e ') [12]
2
>>> s6.find('e ',8, 30) [13]
12
>>> s6.find('e ',15) [14]
47
>>> s6.find('e ',15, 45) [15]
-1
>>> s6[-8:] [16]
'wise dog'
>>> s6.rfind('e ', -8) [17]
47
>>> s6[-45:-8]
' there passed by a company of cats a '
>>> s6.rfind('e ',-45, -8)
12
>>> s6.index('e ',15) [18]
47
Fig. 7.8 a Python Interpreter sequence illustrating more operations with strings (continued in
Fig. 7.8b), b Python Interpreter sequence illustrating more operations with strings (continued from
Fig. 7.8a)
(b)
>>> s6.index('e ',15, 45) [19]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: substring not found
>>> s6.rindex('e ',15, 45) [20]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: substring not found
>>> s6.rindex('e ',15) [21]
47
>>> s6t = s6.title() [22]
>>> s6t
'One Day There Passed By A Company Of Cats A Wise Dog'
>>> s6t.swapcase() [23]
'oNE dAY tHERE pASSED bY a cOMPANY oF cATS a wISE dOG'
>>> s6t.lower() [24]
'one day there passed by a company of cats a wise dog'
>>> s6t.upper() [25]
'ONE DAY THERE PASSED BY A COMPANY OF CATS A WISE DOG'
>>> s6.zfill(70) [26]
'000000000000000000One day there passed by a company of cats a wise dog'
side of ss [1]. The stripped version of ss is s6 in [2]. In the absence of white spaces
at the ends, with s6.strip() the string remains untouched as in [3]. The generalized
version of the strip method is ss.strip(chr) where chr is a string treated as a set of
characters to be removed. Here leading and trailing characters are removed, in any
order and combination, as long as they belong to the set in chr; the removal stops at
the first character outside the set. s6.strip('Ognd') removes 'O' and 'n' at the left
and 'g' at the right in [4]. Note that 'o' at the right end, being the small letter, is
not removed. Hence 'd', which precedes 'o' and is not at the end, remains untouched.
s6.lstrip() and s6.rstrip() are similar to s6.strip(). s6.lstrip('e nOd') removes the
characters of 'e nOd' at the left (leading) end [5] and s6.rstrip('a wise dog') does a
similar thing at the right end [6]. Again all the respective character combinations are
removed in both the cases. ss.replace(oo, nn, m) replaces the substring
oo by the substring nn in the string ss in the first m occurrences of oo (m is
an integer here). The integer m is optional. If replacement is sought to be done in
the whole of the string ss, m is omitted. s6.replace('c', 'C') in [7] replaces
every 'c' in s6 by 'C'. s6.replace('e', 'EE', 2) replaces only the first two
occurrences of 'e' in s6 by 'EE' [8].
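The calls of [4] to [8] can be repeated as below to see that the argument of strip() acts as a set of characters and not as a prefix or suffix. This is a sketch; the names stripped, left, right, and first_two are illustrative.

```python
s6 = 'One day there passed by a company of cats a wise dog'
stripped = s6.strip('Ognd')           # 'O', 'n' go at the left; 'g' at the right
left = s6.lstrip('e nOd')             # stops at 'a', the first character outside the set
right = s6.rstrip('a wise dog')       # stops at 't' of 'cats'
first_two = s6.replace('e', 'EE', 2)  # only the first two occurrences
print(stripped)
print(left)
print(right)
print(first_two)
```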
ss.partition(prt) partitions ss into three segments. The first occurrence of
prt in the string ss is identified; ss is split into one substring up to this
occurrence, the substring prt itself, and the rest of ss as the third substring. As an
illustration s6.partition('company') in [9] splits s6 into three substrings in the line
following. If the specified substring prt is absent in ss, ss along with two empty
substrings following it are returned. [10] searches for 'Company' (with a capital
'C') in s6. It being absent, s6 and the two empty substrings are returned. The
method .rpartition() does partitioning similarly; but here the scanning to
identify the substring starts at the right end of ss. ss6.rpartition('EE') in [11]
is an illustration.
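The behaviour of the two methods, including the case of an absent substring, is sketched below. The separators ' a ' and 'zz' are illustrative choices and not from the figure.

```python
s6 = 'One day there passed by a company of cats a wise dog'
p1 = s6.partition('company')    # split at the first occurrence
p2 = s6.partition('Company')    # absent: the string and two empty strings
r1 = s6.rpartition(' a ')       # split at the last occurrence
r2 = s6.rpartition('zz')        # absent: two empty strings and the string
print(p1, p2, r1, r2, sep='\n')
```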
ss.find(aa, ns, ne) identifies the slice ss[ns:ne] in ss, scans it for the
presence of the substring aa in it and returns the index (in ss) of its first occurrence. If
the integer ne is omitted the scanning is done in the whole of ss from
ss[ns] onwards. If ns is also omitted the whole of ss is scanned to identify the first
occurrence of aa and its location. If aa is not present in ss, -1 is returned as the
index value. s6.find('e ') [12] identifies the index of the first occurrence of 'e ' in
s6 as 2. In fact it is the location of the end of the first word ending with 'e'.
s6.find('e ', 8, 30) [13] returns 12 as the index (corresponding to the word 'there').
The next occurrence is at index 47 (corresponding to the word 'wise') as can be seen from
[14] where the rest of s6 from index 15 onwards is scanned. s6.find('e ', 15, 45) in [15]
confirms the absence of any word ending in 'e' in this range by returning the index
-1. 'e ' in sb tests for the presence of 'e ' in sb but not for its location. In this
respect the .find() method is more demanding; the use of the 'in' operator suffices
when it serves the purpose adequately.
ss.rfind(aa, ns, ne) is the counterpart of ss.find(aa, ns, ne). Here the search
for the sub-string aa is from the right end. [16], [17], and the lines following them
are illustrations of its use. Functionally ss.index(aa, ns, ne) achieves the same
purpose as ss.find(aa, ns, ne). In case the sub-string aa is not present in ss, a
ValueError is raised. .index() differs from .find() only in this respect. A similar
identity holds good for ss.rindex() and ss.rfind() as well. s6.index('e ', 15) in
[18] returns the index value 47. But s6.index('e ', 15, 45) raises a ValueError
in [19] (Fig. 7.8b). s6.rindex('e ', 15, 45) in [20] also raises a ValueError.
s6.rindex('e ', 15) returns 47 as the index in [21]. ss.swapcase() swaps the cases of
all the letters in ss.
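The difference between .find() and .index() is sketched below with s6; the names hits, last, present, and found are illustrative.

```python
s6 = 'One day there passed by a company of cats a wise dog'
hits = [s6.find('e '), s6.find('e ', 8, 30), s6.find('e ', 15), s6.find('e ', 15, 45)]
last = s6.rfind('e ')            # search from the right end
present = 'e ' in s6             # presence only, no location
try:
    s6.index('e ', 15, 45)       # absent in the slice: raises ValueError
    found = True
except ValueError:
    found = False
print(hits, last, present, found)
```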
The following are additional methods with strings, all illustrated through the
string s6/s6t.
s6.title() converts the string s6 to title form; the leading letters of all the
words in it are turned into capital letters [22].
s6t.swapcase() swaps the cases of all the letters in s6t [23].
s6t.lower() returns a copy of s6t with all the characters converted to lower
case [24].
s6t.upper() returns a copy of s6t with all the characters converted to upper case
[25].
s6.zfill(70) returns a string of 70 characters as long as len(s6) is 70 or less.
The difference between 70 and len(s6) is made up by filling '0' at the leading
(i.e., left end) side [26]. If the length of s6 exceeds 70 (or the number in its place),
s6 is returned without any change.
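These methods can be tried out as below. This is a sketch; the names padded and unchanged are illustrative, and the call with width 10 is an added case showing that a string longer than the width is returned as it is.

```python
s6 = 'One day there passed by a company of cats a wise dog'
s6t = s6.title()
print(s6t)
print(s6t.swapcase())
print(s6t.lower() == s6.lower(), s6t.upper() == s6.upper())
padded = s6.zfill(70)       # 70 - 52 = 18 zeros at the left end
unchanged = s6.zfill(10)    # len(s6) exceeds 10
print(padded)
```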
7.6 bytes and bytearrays 157
Computers store data as bit sequences. But even at the elementary levels of storage
and processing data is treated as byte sequences. Files, on the other hand, are more
often formed, stored, and communicated as text in natural languages. To facilitate
exchange of data between these two classes two dedicated data types, bytes and
bytearray, are available in Python. The operations dealing with these are discussed here.
bytes is similar to a string with some restrictions. It is immutable and made up
of only bytes. Any ASCII character, being representable as a byte,
can also be an element in it. A bytearray is the mutable counterpart of bytes;
it is also made up of only bytes. Any ASCII character can also be an element in it.
When the bytes sequence is used as a whole in a program it is used as a
bytes object. But if its elements are to be altered in the program it is used as a
bytearray. A number of functions and methods are available to convert bytes/
bytearray from one form into another. We discuss these in different groups here.
Bytes and bytearrays can be formed in different ways. The Python Interpreter
sequences in Fig. 7.9a, b illustrate the methods of their formation and related
operations. A set of characters can be transformed into the bytes type by
preceding it with a 'b' or 'B' ([1], [2] in Fig. 7.9a). However all the characters here are
constrained to be of ASCII type. ASCII characters, due to their wide use in many
data/file representations/storages, enjoy this privilege. This is the simplest way of
forming a bytes type string. In general any string ss can be represented as its
equivalent bytes counterpart by encoding it. The method ss.encode() encodes
the string ss directly into a bytes string. The encoding is implicitly taken to be
of UTF-8 type, UTF-8 being the most widely used representation. Encoding to any
specific type can be effected by specifying it through an argument. The same
string ('Life of zest and verve') has been encoded without specifying the
encoding and with the encoding specified to be UTF-8 in [3]. All the characters in the
source string being of ASCII type, its UTF-8 form shows the same characters in both
cases; only the type changes to bytes.
A variety of encoding standards (UTF-16, UTF-32, ...) can be specified as the basis
to convert a given string of characters into the bytes form. Similarly a given bytes
object can be converted into a string using the method bytes.decode(). If the
encoding is not specified UTF-8 is taken as the default type. Otherwise the encoding
scheme has to be specified. s6 in [4] in Fig. 7.9a is a string. It is encoded into
UTF-8 form and assigned to s7 [5]. All the characters in s6 being of the ASCII type, they
remain unaltered. Subsequent decoding of s7 using the UTF-8 format itself confirms
this [7]. s6 is encoded into UTF-16 form and assigned to s9 [8]. In UTF-16 every
Unicode point is represented as a single 2-byte long set or a pair of 2-byte long sets.
With any UTF-16 string 2 bytes are added at the beginning (\xff and \xfe, the byte
order mark) signifying the byte order of the representation. With this s9 has a total of
106 bytes [9] (= 2 + 52 * 2); note that s6 has 52 characters in it. s9 is decoded with
the encoding specified as UTF-16 in [10] to confirm that the retrieved string is s6
itself. Similarly s6 is encoded into UTF-32 form [11] and decoded back into s6 itself
[13] (Fig. 7.9b). UTF-32 uses a
(a)
>>> l1 = b'Life of zest and verve' [1]
>>> type(l1) [2]
<class 'bytes'>
>>> 'Life of zest and verve'.encode(), 'Life of zest and verve'.encode(encoding ='utf-8') [3]
(b'Life of zest and verve', b'Life of zest and verve')
>>> s6 = 'One day there passed by a company of cats a wise dog' [4]
>>> s7 = s6.encode(encoding = 'utf-8') [5]
>>> s7
b'One day there passed by a company of cats a wise dog' [6]
>>> s7.decode(encoding = 'utf-8') [7]
'One day there passed by a company of cats a wise dog'
>>> s9 = s6.encode(encoding = 'utf-16') [8]
>>> s9
b'\xff\xfeO\x00n\x00e\x00 \x00d\x00a\x00y\x00 \x00t\x00
h\x00e\x00r\x00e\x00 \x00p\x00a\x00s\x00s\x00e\x00
d\x00 \x00b\x00y\x00 \x00a\x00 \x00c\x00o\x00m\x00p\x00
a\x00n\x00y\x00 \x00o\x00f\x00 \x00c\x00a\x00t\x00
s\x00 \x00a\x00 \x00w\x00i\x00s\x00e\x00 \x00d\x00o\x00g\x00'
>>> len(s6), len(s9) [9]
(52, 106)
>>> s9.decode(encoding = 'utf-16') [10]
'One day there passed by a company of cats a wise dog'
>>> s10 = s6.encode(encoding = 'utf-32') [11]
>>> s10
b'\xff\xfe\x00\x00O\x00\x00\x00n\x00\x00\x00
e\x00\x00\x00 \x00\x00\x00d\x00\x00\x00a\x00\x00\x00
y\x00\x00\x00 \x00\x00\x00t\x00\x00\x00h\x00\x00\x00
e\x00\x00\x00r\x00\x00\x00e\x00\x00\x00 \x00\x00\x00
p\x00\x00\x00a\x00\x00\x00s\x00\x00\x00s\x00\x00\x00
e\x00\x00\x00d\x00\x00\x00 \x00\x00\x00b\x00\x00\x00
y\x00\x00\x00 \x00\x00\x00a\x00\x00\x00 \x00\x00\x00c\x00\x00\x00
o\x00\x00\x00m\x00\x00\x00p\x00\x00\x00a\x00\x00\x00
n\x00\x00\x00y\x00\x00\x00 \x00\x00\x00o\x00\x00\x00
f\x00\x00\x00 \x00\x00\x00c\x00\x00\x00a\x00\x00\x00t\x00\x00\x00
s\x00\x00\x00 \x00\x00\x00a\x00\x00\x00 \x00\x00\x00w\x00\x00\x00
i\x00\x00\x00s\x00\x00\x00e\x00\x00\x00 \x00\x00\x00
d\x00\x00\x00o\x00\x00\x00g\x00\x00\x00'
>>> len(s10) [12]
212
Fig. 7.9 a Python Interpreter sequence illustrating methods with bytes and bytearray (continued
in Fig. 7.9b), b Python Interpreter sequence illustrating methods with bytes and bytearray
(continued from Fig. 7.9a)
(b)
>>> s10.decode(encoding = 'utf-32') [13]
'One day there passed by a company of cats a wise dog'
>>> cc = '如果你想成为我们的赞助商或广告商' [14]
>>> cc8 = cc.encode(encoding = 'utf-8') [15]
>>> cc8 [16]
b'\xe5\xa6\x82\xe6\x9e\x9c\xe4\xbd\xa0\xe6\x83\xb3\xe6\x8
8\x90\xe4\xb8\xba\xe6\x88\x91\xe4\xbb\xac\xe7\x9a\x84\xe8
\xb5\x9e\xe5\x8a\xa9\xe5\x95\x86\xe6\x88\x96\xe5\xb9\xbf\
xe5\x91\x8a\xe5\x95\x86'
>>> len(cc), len(cc8) [17]
(16, 48)
>>> cc16 = cc.encode(encoding = 'utf-16') [18]
>>> cc16 [19]
b'\xff\xfe\x82Y\x9cg`O\xf3`\x10b:N\x11b\xecN\x84v^\x8d\xa
9RFU\x16b\x7f^JTFU'
>>> cc16.decode(encoding = 'utf-16') [20]
'如果你想成为我们的赞助商或广告商' [21]
>>> cc32 = cc.encode(encoding = 'utf-32') [22]
>>> cc32 [23]
b'\xff\xfe\x00\x00\x82Y\x00\x00\x9cg\x00\x00`O\x00\x00\xf
3`\x00\x00\x10b\x00\x00:N\x00\x00\x11b\x00\x00\xecN\x00\x
00\x84v\x00\x00^\x8d\x00\x00\xa9R\x00\x00FU\x00\x00\x16b\
x00\x00\x7f^\x00\x00JT\x00\x00FU\x00\x00'
>>> cc32.decode(encoding = 'utf-32') [24]
fixed 4-byte representation for each character; additionally 4 bytes are prepended
to the lot at the beginning; with all this the encoded bytes object here is s10; it is of
212 bytes (4 + 4 * 52) [12].
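The byte counts quoted above can be verified as below. This is a minimal sketch along the lines of Fig. 7.9a; only the lengths are checked and not the byte order, which can differ from one machine to another with the 'utf-16' and 'utf-32' encodings.

```python
s6 = 'One day there passed by a company of cats a wise dog'
s7 = s6.encode()                      # UTF-8 is the default encoding
s9 = s6.encode(encoding='utf-16')     # 2-byte mark + 2 bytes per character
s10 = s6.encode(encoding='utf-32')    # 4-byte mark + 4 bytes per character
lengths = (len(s6), len(s7), len(s9), len(s10))
print(lengths)
print(s9.decode('utf-16') == s6, s10.decode('utf-32') == s6)
```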
The encoding and decoding done with the string of Chinese characters (cc in
[14]) bring out the generality of the encode() and decode() methods. cc has
sixteen characters as can be seen from [17]. cc.encode(encoding = 'utf-8') in
[15] encodes cc to a UTF-8-type bytes sequence [16]. Every character here has a
UTF-8 representation running into 3 bytes. Hence the encoded bytes sequence
is of 48 bytes [17]. With UTF-16 every character here encodes into 2 bytes (cc16
in [18], [19]). The bytes sequence here is 34 (2 + 2 * 16) bytes in length [25].
Similarly UTF-32 uses a four byte representation for every character to form cc32
in [22]. [23] gives the corresponding bytes sequence; it is of sixty-eight
(4 + 4 * 16) bytes [25]. The first character of cc (cc[0]) has been separately
encoded into the three forms in [26] and again back to the character itself in [27].
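The three lengths can be obtained together as below. This is a sketch; the string cc is the one recovered from the UTF-8 bytes displayed in [16], and the name sizes is illustrative.

```python
cc = '如果你想成为我们的赞助商或广告商'
# Length in bytes of cc under each of the three encodings.
sizes = {enc: len(cc.encode(enc)) for enc in ('utf-8', 'utf-16', 'utf-32')}
print(len(cc), sizes)
print(cc.encode('utf-8').decode('utf-8') == cc)
```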
The bytes()/bytearray() function basically returns a bytes/bytearray
type sequence. A few possibilities of its formation exist. The Python Interpreter
sequence in Fig. 7.10 illustrates these. at [1] is a string of a single Chinese
character, '字'. bytes(at, 'UTF-8') converts it into a bytes string in the UTF-8
form; the UTF-8 representation of '字' is 3 bytes long. The same is assigned
to a1 in [2]. It is confirmed in [3] by encoding '字' directly. In general any string
can be converted into a bytes object conforming to the desired encoding. For this
the bytes() function takes two arguments, the first being the string itself and the
second the encoding within quotes (as illustrated above). In fact bytes()
uses ss.encode() discussed earlier to do the conversion.
a1 in [2] has been slightly altered by changing its third byte (from \x97 to
\x98) and assigned to aa in [4]; it is decoded in [5] into a string of a single Chinese
character ('存'). The character string represented by at is converted into a bytes string
conforming to UTF-16 in [6]. Similarly bb representing the single character '存' is
converted to UTF-16 form in [9].
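The steps above can be sketched as follows. The names at, a1, and aa follow the text; the characters are given through their code points (U+5B57 and U+5B58), and 'utf-16-le' is used in place of 'utf-16' here, as an assumption, so that the result carries no byte order mark and does not depend on the machine.

```python
at = '\u5b57'                    # single Chinese character
a1 = bytes(at, 'utf-8')          # same as at.encode('utf-8')
aa = a1[:2] + b'\x98'            # third byte changed from \x97 to \x98
print(a1, aa.decode('utf-8'))
a2 = at.encode('utf-16-le')      # 2 bytes, low byte first, no byte order mark
print(a2)
```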
Every element in a bytes/bytearray being a byte, some common features
of representation of bytes/bytearray are noteworthy here:
If the byte value has a printable ASCII representation the character will be used
directly. Else the hex number will be used in the representation. However when
specifying a bytes/bytearray sequence one can use either representation.
In a sequence representing bytes/bytearray every byte value can be shown
as \xn1n2 where n1 and n2 are the MS and LS nibbles respectively.
a2 is a bytes sequence shown in [7]; the last 2 bytes here are \x57 and \x5b,
represented by the respective ASCII characters 'W' and '['. The first 2 bytes \xff
and \xfe (related to the type of sequence representation) are outside the ASCII
range and hence are retained as bytes. aa2 defined as a bytes and displayed in the
following line confirms this. Similarly ab in [9] has 'X' and '[' in place of their
respective ASCII values.
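The two ways of writing the same byte values can be compared as below. This is a sketch; the names same and nibbles are illustrative.

```python
a2 = b'\xff\xfe\x57\x5b'       # every byte given in the \x form
print(a2)                      # printable ASCII bytes are displayed as characters
same = a2 == b'\xff\xfeW['     # either form specifies the same sequence
# MS and LS nibbles of every byte.
nibbles = [(byte >> 4, byte & 0x0F) for byte in a2]
print(same, nibbles)
```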
cc in [10] is a string of Chinese characters. The UTF-8 representation of
cc[1] is cc8 [11]; it is a three-byte sequence. The decoded value is the character
itself as can be seen from the line following. cc16 in [12] is the UTF-16 value
of cc[1] itself. In UTF-16 every character has a 2-byte or 4-byte representation
(\x82Y here). The whole of cc is converted into UTF-8 form in [14] and decoded
back in [15]. Every character here has a 3-byte representation [16]. A sequence of
integers, all in the range 0 to 255, can be converted into a bytes sequence using
the bytes() function as in [17]. '\n' representing the new line (command) has
ASCII value 10 (decimal) and '(' has ASCII value 40 (decimal).
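The conversion of an integer sequence, and the limit on the integer values, can be sketched as below; the names data and accepted are illustrative.

```python
data = bytes([10, 20, 30, 40])
print(data)                   # 10 shows as \n, 40 as (, the others in \x form
try:
    bytes([10, 256])          # 256 is outside the range of a byte
    accepted = True
except ValueError:
    accepted = False
print(accepted)
```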
All functions and methods pertaining to bytes above are equally true of
bytearray as well, the sole difference being that the bytearray is a mutable
sequence. Figure 7.11 shows use of operations similar to those with bytes in
Fig. 7.10 above. As an illustrative example the list of integers aa (= [33, 34, 35,
36]) [4] has been converted into a bytearray using the function bytearray()
in [5]. The bytes version of aa is obtained in [6]. Both can be seen to be
composed of the same set of characters. bb in [7] is a sequence of integers; not all of
them have values less than 256. Hence bb cannot be represented as a bytes
sequence or as a bytearray as can be confirmed from the following lines. Datay
in [1] is the bytearray representation of the list [10, 20, 30, 40] itself. The
second byte in it has been reassigned the value 41 in [2] and the altered
bytearray is shown in [3]. (In [3] ')' is the ASCII representation of 41 in
b'\n)\x1e('.) Data in [17] in Fig. 7.10, being immutable, cannot be altered in this manner.
The functions bytes() and bytearray() can be used directly in two more
contexts. bytes(n)/bytearray(n) with n as an integer produces a bytes/
bytearray sequence of zeros, the sequence length being n. This is illustrated in
[8] for n = 5. In bytes(range(a, b, c))/bytearray(range(a, b, c)), range(a, b, c)
is the sequence of integers a, a + c, a + 2 * c, ..., a + ((b - 1 - a)//c) * c (for
positive c). This is converted to form a bytes/bytearray sequence. A call such as
bytes(range(48, 58, 2)) produces b'02468' as bytes in [9]. Here {48, 50, 52, 54, 56}
are the ASCII values of the numerals {0, 2, 4, 6, 8}. Similarly with ba1 in [9],
{65, 68, 71, ..., 89} are the ASCII values of {A, D, G, ..., Y} respectively.
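The points made about bytearray mutability and about the integer and range forms of bytes()/bytearray() can be checked with the following sketch. The variable names are illustrative assumptions and do not reproduce the session in Fig. 7.11.

```python
# Illustrative sketch; names are assumptions, not those of Fig. 7.11.
data = bytearray([10, 20, 30, 40])
data[1] = 41                       # allowed: a bytearray is mutable

frozen = bytes([10, 20, 30, 40])
try:
    frozen[1] = 41                 # not allowed: bytes is immutable
    bytes_mutable = True
except TypeError:
    bytes_mutable = False

zeros = bytes(5)                           # five zero bytes
digits = bytes(range(48, 58, 2))           # ASCII codes of 0, 2, 4, 6, 8
letters = bytearray(range(65, 90, 3))      # ASCII codes of A, D, G, ..., Y

try:
    bytes([33, 34, 256])           # 256 does not fit into one byte
    out_of_range_ok = True
except ValueError:
    out_of_range_ok = False

print(data, zeros, digits, letters, bytes_mutable, out_of_range_ok)
```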
The Python Interpreter sequence in Fig. 7.12 illustrates operations linking
bytes objects and integers. int() converts a string or a bytes object to the
corresponding integer to the base specified (see Sect. 7.4.1). [1] and [2] are
additional examples of this. The string '159a' as well as the bytes object b'159a'
has the decimal value 5530 (= 16^3 + 5 * 16^2 + 9 * 16^1 + 10 * 16^0). A bytes
sequence can be converted into a corresponding integer using the method
int.from_bytes(). The use of its variants is illustrated from [4] to [8]. With bb as a
bytes sequence, int.from_bytes(bb, byteorder = 'big') converts bb into
an integer taking the leftmost byte of bb as the MS byte. int.from_bytes(bb,
byteorder = 'little') does the conversion taking bb to be little-endian,
its MS byte being taken as the rightmost one; both are illustrated in [4] which
shows 0200h = 512 (decimal) and 0002h = 2 (decimal). If bb is formed in the system
beforehand, one can use the system byte order itself, by specifying it as in [6]. To
facilitate this, the sys module has to be imported prior to the conversion [5]. The
conversions into integers so far here have implicitly taken the byte sequence to
represent a positive integer. In case it is a negative integer in 2's complement
form, the same may be specified through the use of a third argument as in [7] and
[8]. If signed = False is specified as the third argument, the bytes sequence is
taken as representing a positive integer as in [7]. In [8] signed = True, as the third
argument, signifies a negative integer in 2's complement form. The byte
sequence b'\xfe\x00' is specified as big-endian in 2's complement form; the
converted integer is at4 (= -512). If the third argument is absent the integer
concerned is taken as a positive one by default, as was done in [4] and [6] above.
nn.to_bytes() converts the integer nn to a corresponding byte sequence. The
sequence length as the number of bytes and the type of representation as being
'big' or 'little' have to be specified as the two arguments. As an example the
integer at1 (= 512) is converted into a bytes sequence of both types and
assigned to b1 and b2 in [9]. In both cases the sequence length has been specified
as four (bytes). The signed argument, not being specified, is taken as False; that
is, nn is taken as a positive number by default. If the number is a negative one to
be represented in 2's complement form, the third argument may be specified as
signed = True. [10] illustrates both the cases. at4 (= -512) and at5 (=512)
obtained earlier are reconverted into 4-byte sequences, the number being in hex
form represented as a bytes sequence. b'\xff\xff\xfe\x00' has been converted to an
integer (= -512) with int.from_bytes() in [11].
Fig. 7.12 Python Interpreter sequence illustrating methods linking bytes objects and integers
>>> c1, c2, c3, c4, c5 = chr(0x41), chr(0x37e), chr(0x7e), chr(0x2190), chr(0x221a)  [1]
>>> c1, c2, c3, c4, c5
('A', ';', '~', '←', '√')
>>> chr(x03b4)  [2]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x03b4' is not defined
>>> d1, d2, d3, d4, d5 = ord('A'), ord(';'), ord('~'), ord('←'), ord('√')  [3]
>>> d1, d2, d3, d4, d5
(65, 894, 126, 8592, 8730)
>>> cc1, cc2, cc3, cc4, cc5 = '\u0041', '\u037e', '\u007e', '\u2190', '\u221a'  [4]
>>> cc1, cc2, cc3, cc4, cc5
('A', ';', '~', '←', '√')
>>> c01, c02, c03, c04, c05 = '\U00000041', '\U0000037e', '\U0000007e', '\U00002190', '\U0000221a'  [5]
>>> c01, c02, c03, c04, c05
('A', ';', '~', '←', '√')
>>> g1, g2 = '\N{GREEK CAPITAL LETTER OMEGA}', '\N{GREEK SMALL LETTER BETA}'  [6]
>>> g1, g2
('Ω', 'β')
As a check, the bytes sequence formed in [12] is converted back to a list in [13].
Note that {122, 123, 124, 130} are the decimal equivalents of the hex number set
{7a, 7b, 7c, 82}.
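The conversions between integers and bytes sequences discussed above can be summarized in a short sketch. The values are chosen to match the numbers quoted in the text (5530, 512, and -512); the variable names are illustrative assumptions and the lines do not reproduce Fig. 7.12.

```python
import sys

# int() accepts a string or a bytes object together with the base.
n_str = int("159a", 16)
n_bytes = int(b"159a", 16)

# The same two bytes read as big-endian and as little-endian.
bb = b"\x02\x00"
big = int.from_bytes(bb, byteorder="big")
little = int.from_bytes(bb, byteorder="little")
native = int.from_bytes(bb, byteorder=sys.byteorder)   # depends on the machine

# signed=True interprets the bytes as a 2's complement number.
unsigned = int.from_bytes(b"\xfe\x00", byteorder="big", signed=False)
negative = int.from_bytes(b"\xfe\x00", byteorder="big", signed=True)

# to_bytes() needs the length in bytes and the byte order.
b1 = (512).to_bytes(4, byteorder="big")
b2 = (512).to_bytes(4, byteorder="little")
b3 = (-512).to_bytes(4, byteorder="big", signed=True)

# A bytes sequence converts back into a list of integers.
back = list(bytes([0x7A, 0x7B, 0x7C, 0x82]))

print(n_str, big, little, unsigned, negative, b1, b2, b3, back)
```

Converting a negative integer with to_bytes() without signed=True raises an OverflowError, so the signed argument has to be given on both directions of the conversion.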
A few simple functions/methods are available to convert characters into
respective Unicode numbers and vice versa. The Python Interpreter sequence in
Fig. 7.13 illustrates their use. The function chr(nn) treats nn as the Unicode
code point of a character and returns the corresponding character. [1] shows a
few examples. The Unicode values of a set of characters are converted into
respective characters and assigned to c1, c2, ..., c5. x03b4, written without the
leading 0 of a hex literal, is not an integer but an undefined name; hence
chr(x03b4) in [2] returns an error (NameError). The function ord(cc) converts the
character string cc into the corresponding integer. Here cc has to be a single
Unicode character. The character set {A, ;, ~, ←, √} is converted back to the
corresponding set of integers in [3]. The integer values are output in decimal form
here.
The escape sequence \upqrs with pqrs as a hex number of four hex characters
(a 16-bit number) is treated as a Unicode character. The Unicode set used in [1] is
reproduced in [4] with this representation. Similarly a Unicode character can be
represented as \Upqrstuvw with a capital U preceding eight hex characters
(a 32-bit number). [5] illustrates this for the same set of characters as considered
above. \N{name} also can be used to represent Unicode characters. Here name
is the name for the character in the Unicode database. [6] is an illustration for the
character pair (Ω, β).
Fig. 7.14 A compact representation of the conversion possibilities between numbers and
sequences in Python
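The equivalence of chr(), ord(), and the three escape forms can be verified with a few lines. This is a minimal sketch with assumed variable names; the code points are the ones used in the text.

```python
# chr() maps a code point to a character; ord() is its inverse.
c_arrow = chr(0x2190)              # LEFTWARDS ARROW
c_root = chr(0x221A)               # SQUARE ROOT
codes = [ord(c) for c in ("A", "~", c_arrow, c_root)]

# The four-digit and eight-digit escapes name the same characters.
same_short = ("\u2190" == c_arrow)
same_long = ("\U0000221a" == c_root)

# Characters can also be given by their names in the Unicode database.
g1 = "\N{GREEK CAPITAL LETTER OMEGA}"
g2 = "\N{GREEK SMALL LETTER BETA}"

# Written as a proper hex literal, 0x03b4 is a valid code point.
delta = chr(0x03B4)

print(codes, same_short, same_long, g1, g2, delta)
```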
A number of functions and methods for conversions amongst numbers, strings,
bytes and so on have been discussed. The scheme shown in Fig. 7.14 is a compact
representation of all these together.
With the ASCII character set as basis a few character sets have been defined in the
module string. They can be of use in text processing; the string module can be
imported and any set within accessed as string.xx. The details are summarized
in Table 7.4.
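A minimal sketch of accessing these sets follows. Only a few of the constants of the string module are shown, and the two small uses (picking vowels, checking digits) are illustrative assumptions.

```python
import string

lowercase = string.ascii_lowercase     # 'abc...z'
letters = string.ascii_letters         # lowercase followed by uppercase
digits = string.digits                 # '0123456789'

# The sets are ordinary strings, so membership tests work directly.
vowels = [c for c in lowercase if c in "aeiou"]
is_number = all(c in digits for c in "2016")

print(lowercase, digits, vowels, is_number)
```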
7.9 Exercises
8. A letter set of three like ght, ion, the is called a trigram. Write a program
to get the frequencies of all the trigrams and retain the data for the most
frequent 20 of them. Get the most frequent 20 trigrams for the above plain text.
9. The normalized letter frequencies form the probabilities of occurrence of the
respective letters. With $p_i$ as the probability of occurrence of the ith letter,
$\sum_i p_i^2$ is called the Index of Coincidence (IC). The IC values for general texts in
different languages are known. For English texts IC = 0.0655; in contrast for a
completely random text it has the value of 0.0385. Write a program to get the
IC value and get it for the given text.
Armed with the letter frequencies, knowledge of the dominant bigrams and
trigrams, and the IC values one should be able to do cryptanalysis of most of
the common conventional ciphers.
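One possible starting point for the IC computation is sketched below. It follows the definition given in the exercise (the sum of the squared letter probabilities); the function name and the decision to ignore everything other than the letters a to z are assumptions.

```python
from collections import Counter

def index_of_coincidence(text):
    """Return the sum of squared letter probabilities for the letters a-z."""
    letters = [c for c in text.lower() if "a" <= c <= "z"]
    if not letters:
        raise ValueError("text contains no letters")
    total = len(letters)
    counts = Counter(letters)
    return sum((n / total) ** 2 for n in counts.values())

# A text using all 26 letters equally often gives 1/26, about 0.0385.
uniform_ic = index_of_coincidence("abcdefghijklmnopqrstuvwxyz")
print(round(uniform_ic, 4))
```

The uniform case reproduces the value quoted for completely random text, which is a convenient sanity check before the function is applied to a real plain text or cipher text.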
10. The Substitution Cipher uses a look-up table (LUT) to substitute every letter
in the plain text with the one in the LUT to generate the cipher text. Write a
program to generate the cipher text from the plain text using the LUT. Use it to
get the cipher text for the given plain text.
11. A cipher text obtained using the Substitution Cipher is given. One can get its
letter frequencies, compare with those of English text, and identify the
substitution used for the most common letters like e, s, t etc. Similarly one can
identify the substitution used for the letters with the least frequencies like z,
q, x, etc. Still some indecision remains. The most common bigrams and
trigrams can be identified and compared. With these the substitution used for
many of the letters can be identified. Armed with these and our familiarity with
common English words (eng?ish → english, re?ain → remain, ...) additional
identification can be done. Identifying the plain text in this manner constitutes
Cryptanalysis. Obtain the cipher text with a substitution cipher. Do
cryptanalysis and retrieve the plain text (with readily available programs and
cipher text of about 300 letters the exercise may take a few hours of effort for
completion).
12. With a, b, c, ..., z represented by 1, 2, 3, ..., 26, the Affine Cipher uses
the relation y = (ax + b) % 26 to substitute the letter represented by integer x by
the letter represented by integer y. a and b are integers (the two together form
the encryption/decryption key) with the constraint on a that its only common
factor with 26 is one. Write a program to get the cipher text for a given plain
text with the Affine Cipher (for given a and b values).
13. The Affine Cipher is a special case of a Substitution Cipher. For a given cipher
text one can obtain the letter frequencies; by comparison with the known
frequencies of common texts a few most dominant letters can be identified. By
substitution in the equation y = ax + b the same can be confirmed. Do
cryptanalysis of the crypto text in Exercise (12) above.
14. With a = 1, the Affine Cipher becomes a Shift Cipher. The Vigenere Cipher is a
generalized version of the Shift Cipher. It uses a set of m key values {b1, b2,
..., bm}. The plain text is split into successive blocks of m letters each (normally
m will be a single digit integer). The first letter of each block is shifted by b1,
the second by b2, and so on up to the mth letter (by bm). Do this successively for
all the blocks. This completes encryption. Prepare a program to do encryption
conforming to the Vigenere Cipher. Get the cipher text for the given plain text.
15. Cryptanalysis of the Vigenere Cipher is a more challenging affair. One has to
identify the value of m first and then the set {b1, b2, ..., bm}. The IC concept can
be used to identify the m value. With $c_i$ as the ith letter in the cipher text, the
sub-sequence of letters {$c_1$, $c_{1+m}$, $c_{1+2m}$, ...} forms a Shift Cipher type
crypto-text with b1 as the shift. It will have the characteristics of a normal text; its
IC value will be close to that of plain text (= 0.0655). The same is true of the other
(m - 1) sub-sequences also. With different values of m (2, 3, 4, ...), form the
sub-sets {$c_1$, $c_{1+m}$, $c_{1+2m}$, ...}. Get the character frequencies and the IC values.
The m-value which yields the IC closest to 0.0655 is the correct one. The
procedure can be repeated with successive sub-sequences to confirm the
m-value. Once the m-value is identified, with each of the m separate sub-sequences
the procedure in Exercise (13) above can be used to get the set {b1, b2, ..., bm}.
With the m value and the full set {b1, b2, ..., bm} known the plain text can be
recovered. For the cipher text in the last exercise, do cryptanalysis and retrieve
the plain text (with all the programs available, cryptanalysis and plain text
retrieval may take a few hours).
16. Huffman Coding: One of the earliest schemes of lossless data compression was
proposed by Huffman (Forouzan 2013). We shall go through a simplified
version of the scheme. A data transmission scheme uses a set of four symbols
{a, b, c, d} with probabilities of occurrence {0.45, 0.3, 0.15, 0.1} respectively.
The Huffman scheme for the set follows:
The symbols are arranged in descending order of probabilities. The most
probable symbol is assigned the code value 0, a single bit. For the rest the first
bit is taken as 1. The second most probable symbol is assigned the second bit
value 0 and its code is 10. For the rest the second bit is assigned the value 1; a
third bit is also assigned to them with values of 0 for the more probable one and
value of 1 for the less probable one respectively (see Fig. 7.15).
[Fig. 7.15: code assignment for the symbol set]
Symbol  Probability  Code
a       0.45         0
b       0.30         10
c       0.15         110
d       0.10         111
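The simplified assignment described in this exercise can be sketched as follows. Note that this implements only the chain-like scheme spelt out above (0, 10, 110, ..., with the least probable symbol getting all ones); it is not the general Huffman tree construction. The function name is an assumption.

```python
def simplified_codes(probabilities):
    """Assign 0, 10, 110, ... in descending order of probability."""
    if len(probabilities) < 2:
        raise ValueError("at least two symbols are needed")
    ordered = sorted(probabilities, key=probabilities.get, reverse=True)
    codes = {}
    for rank, symbol in enumerate(ordered):
        if rank < len(ordered) - 1:
            codes[symbol] = "1" * rank + "0"   # rank ones closed by a zero
        else:
            codes[symbol] = "1" * rank         # least probable: all ones
    return codes

probs = {"a": 0.45, "b": 0.3, "c": 0.15, "d": 0.1}
codes = simplified_codes(probs)

# Average number of bits per symbol for this code.
average_length = sum(probs[s] * len(codes[s]) for s in probs)
print(codes, average_length)
```

No code is a prefix of another, so a received bit stream can be split into symbols without separators.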
Table 7.5 The symbols, their probability values, and the cumulative probability values for the
Example in Exercise 7.17
Symbol                        A        B        C        D         E
Probability                   0.1      0.3      0.2      0.35      0.05
Cumulative probability range  0.0-0.1  0.1-0.4  0.4-0.6  0.6-0.95  0.95-1.0
The symbol sequence is identified by its probability and the probability value
forms the basis to decide the code to be assigned to it. The first symbol B is
assigned the probability range 0.1 to 0.4 (P1Q1) as shown in the first line in the
figure.
The second symbol C has the absolute probability range 0.4 to 0.6. Hence the
first and the second symbols together are assigned the absolute probability range
within the (P1Q1) band as 0.1 + (0.4 - 0.1) × 0.4 to 0.1 + (0.4 - 0.1) × 0.6,
that is, 0.22 to 0.28, shown blown up in the second line. This range is
represented by (P2Q2) in line 2 in the figure.
The third symbol A has the absolute probability range 0.0 to 0.1. Hence
the sequence BCA is assigned the probability range 0.22 to 0.22 +
(0.28 - 0.22) × 0.1, that is, 0.22 to 0.226, shown blown up in the third line.
This range is represented by (P3Q3) in line 3 in the figure.
Proceeding successively in the same vein the probability range formation for
the full sequence BCADDBE is shown in the figure. Finally the sequence
has the specific probability range 0.225142975 to 0.225154 assigned to it. The
corresponding binary range is
0.0011100110100010111110000101001010001100111 to
0.0011100110100011101100010010101010010000101; any binary value within this
range can be used to uniquely represent this sequence. Specifically
0.001110011010001 suffices here since this part is common for the full range. The
additional bits are discarded since they do not add any additional information of
interest to us here.
The code for the sequence BCADDBE is generated from the binary value
of the probability for it. It involves two changes:
a. Truncate the number of bits at a point where the value has crossed the point
P7 in Fig. 7.16, signifying that the next symbol in the sequence is E itself,
that is, the sequence has ended (this has been done above).
b. Ignore the '0.' part of the probability and use only the rest of the bit
sequence. '0.' is superfluous and does not add any information to the
sequence.
Any source sequence of characters from the set in Table 7.5 can be encoded in
the same manner. The encoding algorithm is summarized as follows:
a. Start with the table of probabilities and cumulative probabilities.
b. Identify the probability range of (P1Q1) for the first character.
c. Let the probability range for ($P_i Q_i$) be $D_{i,s}$ to $D_{i,e}$ where s and e
signify the Start and End of the range.
d. The jth character in the table has the absolute cumulative probability range
($C_{j-1}$, $C_j$).
e. For all i from 2 onwards up to the last character (nth) in the source sequence,
do the following recursively:
Fig. 7.16 Arithmetic coding procedure for the message sequence BCADDBE
f. Let $S_i$ be the ith character in the sequence. We have the recursive relations for
$D_{i,s}$ and $D_{i,e}$ as $D_{i,s} = D_{i-1,s} + (D_{i-1,e} - D_{i-1,s})\,C_{S_i-1}$ and
$D_{i,e} = D_{i-1,s} + (D_{i-1,e} - D_{i-1,s})\,C_{S_i}$, where ($C_{S_i-1}$, $C_{S_i}$) is the
cumulative probability range of $S_i$ in the table. Update the probability range for
the character sequence up to and inclusive of $S_i$ using these.
g. With a total of n characters in the sequence, ($D_{n,s}$ to $D_{n,e}$) is the probability
range for the sequence up to and inclusive of the last character (E). Truncate its
binary value such that the truncated value lies within the ($D_{n,s}$ to $D_{n,e}$) range.
Remove the '0.' part of the truncated number to get the code for the sequence.
h. Successive characters in the source sequence affect only the trailing bits of
the probability being evaluated. Hence the leading bits of the coded
sequence can be progressively taken out from the left end and added to the code
as soon as they stabilize in value.
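The range-narrowing steps a to g above can be sketched as follows for the symbol set of Table 7.5. Exact fractions are used instead of floats so that the range ends can be compared with the decimal values quoted in the text; the function names and the helper for binary expansion are assumptions, and the progressive output of step h is not attempted.

```python
from fractions import Fraction

# Probabilities from Table 7.5, kept as strings so that Fraction is exact.
PROBS = {"A": "0.1", "B": "0.3", "C": "0.2", "D": "0.35", "E": "0.05"}

def cumulative_ranges(probs):
    """Map each symbol to its (start, end) cumulative probability range."""
    ranges, start = {}, Fraction(0)
    for symbol, p in probs.items():
        end = start + Fraction(p)
        ranges[symbol] = (start, end)
        start = end
    return ranges

def encode_range(message, probs=PROBS):
    """Return the (start, end) probability range assigned to the message."""
    ranges = cumulative_ranges(probs)
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        c_low, c_high = ranges[symbol]
        width = high - low
        low, high = low + width * c_low, low + width * c_high
    return low, high

def to_binary(x, bits):
    """Return the first `bits` binary digits of x (0 <= x < 1) after '0.'."""
    digits = []
    for _ in range(bits):
        x *= 2
        bit = 1 if x >= 1 else 0
        digits.append(str(bit))
        x -= bit
    return "".join(digits)

low, high = encode_range("BCADDBE")
print(float(low), float(high), to_binary(low, 15))
```

Starting from the range (0, 1) makes the first character a regular step of the recursion, so step b needs no separate handling in the code.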
The decoder algorithm is as follows:
a. Prefix the received sequence with '0.' to form the cumulative probability pc
of the sequence.
b. Identify the (P1Q1) segment where pc lies. Identify the first symbol S1.
References
Forouzan B (2013) Data communications and networking, 5th edn. McGraw Hill, New York
Original UTF-8 paper. (http://doc.cat-v.org/plan_9/4th_edition/papers/utf)
Padmanabhan TR (2007) Introduction to microcontrollers and their applications. Alpha Science International Ltd, Oxford
Shyamala CK, Harini N, Padmanabhan TR (2011) Cryptography and security. Wiley India, New Delhi
The Unicode Standard: A Technical Introduction. (http://www.unicode.org/standard/principles.html)
van Rossum G, Drake FL Jr (2014) The Python library reference. Python Software Foundation
Chapter 8
Operations with Files
Information in different forms is stored in files and retrieved from them. A number
of file related operations are available in Python (van Rossum and Drake 2014);
their details and use are discussed here.
8.1 Printing
Information can be displayed on the monitor or written into files using the print()
function. The Python Interpreter sequence in Fig. 8.1 illustrates the use of
comparatively simpler forms of the print() function. A number or a sequence of
numbers like (nn, mm), in different forms of representation, can be displayed
directly on the monitor using print() as can be seen from [1]; the same is true of
any item which can be printed/output directly. a is assigned the result of 3.0 * 2
and b is computed as a/2.0 in the following line. Both are displayed using
print(a, b) in [3]. The print() function without any argument outputs a blank line
[4]. {a1, a2, a3, a4} is a set of strings [5]. print(a1) in [6] prints out the string a1
directly. A more general form of the print() function is print(e1, e2, e3, ...,
sep = 'fg', end = 'hj'). Three sets of arguments are present here.
• e1, e2, e3, ... is a comma separated set of entities which can be printed
  directly. These can be numbers, strings, or their combinations.
• The second argument is specified as sep = 'fg'; fg has to be a string. It is used
  as the separator between e1 and e2, e2 and e3, and so on in the printout. If this
  argument is absent, e1, e2, e3, ... are printed out with a single space as the
  separator between any two of these.
• The third argument, end = 'hj', has hj as a string. The print() is executed
  here with hj forming the end part of the printout. If this argument is absent, by
  default the Interpreter advances to the next line after executing the printout.
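The effect of sep and end can be seen in the following sketch. To make the result easy to inspect, the output is collected in an in-memory text buffer through the file argument of print() and displayed once at the end; the strings used are illustrative assumptions.

```python
import io

buffer = io.StringIO()             # in-memory stand-in for the monitor or a file

print("a1", "a2", file=buffer)                         # default sep (a space) and end (a new line)
print("a1", "a2", sep=", ", file=buffer)               # comma as the separator
print("a1", "a2", sep="\n", end="!\n", file=buffer)    # one item per line, ! at the end
for item in ("x", "y"):
    print(item, end=",", file=buffer)                  # each printout ends with a comma
print(file=buffer)                                     # no argument: only the new line

output = buffer.getvalue()
print(output, end="")
```

Leaving out file=buffer sends the same text directly to the monitor.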
a1 and a2 are printed out in [7] with the sep being specified as ','. a0 in [8] is a
tuple of strings. It is directly printed out in [8] as print(a0). The elements of a0
are successively printed out in a loop in [9]. Each print execution ends with a
comma. [10] is a more elegant realization of the same. The sequence \n (the
backslash followed by n) signifies a new line (as in the ASCII set). Its use in [11]
implies that at the end of printing a, the Interpreter advances to a new line. The
output can be seen in the following line (execution of [11] is in no way different
from that of [6] above, since in the absence of the end specification the Interpreter
advances to the next line by default). Python, like other computer languages, uses
a number of such escape sequences in strings; each such sequence is a single
character with an implied significance. The escape sequences and their respective
meanings are listed in Table 8.1. [12] prints out a1 and a2, as a1 followed by a2 in
a new line (in contrast both are on the same line in [7]). [13] specifies a1, a2, a3,
and a4 to be printed out on successive lines. Further, a ! mark is to be printed out
at the end of a4; then the Interpreter advances to the new line.
Successive elements of a0 are printed out in [14]. A tab (as \t) separates the
successive arguments output from a0. print() execution ends with a tab as can be
seen from the output. \v is the vertical tab. With the loop in [15] the Interpreter
prints out every element of a0 followed by the next after a vertical tab; that is, after
printing out every item the Interpreter advances after a vertical tab. In contrast with
[16] every printout ends with a vertical tab. The subtle difference between the two is
noteworthy.
The print() variety possible continues with Fig. 8.2. b0 in [1] has the single
quote (') as part of the string. The double quotes at either end impart the string
status to the sequence. [2] prints out the string; the single quote is retained here. The
same holds good of b1 in [3] and with print(b1) in the following line; all the
(three) single quotes are part of the string here. b2 in [4] has the backslash (\) as
part of the string. With the printout of b2 in [5], \n is misinterpreted as an escape
character as can be seen from the output in the two lines following. The backslash
8.2 String Formatting
Simple and unsophisticated use of the print() function was demonstrated in the
foregoing section. Present versions of Python (3 onwards) facilitate convenient
formatting of entities to form strings. These strings may be added/stored in
appropriate databases or printed as outputs. Thus formatting and printing become
delinked; it adds a level of flexibility to program execution. The string is formed
from a tuple of items. Basically the different elements in the tuple represent the
information to be formatted. They are all linked together and suitably padded with
additional literal text (if necessary) to form the formatted string. Two versions of
the formatting scheme are available. The version described first here is
comparatively rigid; it is essentially an earlier version (from C) retained for
continuity. The second one is more comprehensive and flexible; it is the one
recommended to be used.
8.2.1 Formatting: Version I
Details of formatting in the first version are summarized in Fig. 8.3. The scheme is
characterized by the following:
1. The string has as many replacement fields as the number of entities in the
tuple. The replacement fields appear in the same order as the entities in the
tuple.
2. The modulo operator % signifies the start of each replacement field.
3. The % character is followed by four optional components and a final
mandatory character signifying the type of conversion to be carried out. This
conversion character can be one from Table 8.2.
4. The first optional component is a mapping key. It is present only if the items to
be put in the replacement fields are specified through a dictionary instead
of a tuple.
5. An optional flag modifies the structure/orientation of the entity. The possible
flag types and their effects are given in Table 8.3.
6. An integer specifying the width/number of spaces to be allocated in the string to
the entity forms the next component.
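[Fig. 8.3: structure of the ith replacement field (Fi) in Version I formatting:
% (mapping key) flag width.precision conversion type. The number of fields is the
same as the number of elements in the source; they are replaced in the same order as
in the source. The conversion type is from Table 8.2. The precision is the number of
digits after the decimal point (optional, only if the element is a number).]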
Table 8.2 Conversion characters for the first version of string formatting
Conversion  Meaning
d           Signed integer decimal
i           Signed integer decimal
o           Signed octal value
x           Signed hexadecimal (lowercase)
X           Signed hexadecimal (uppercase)
e           Floating point exponential format (lowercase)
E           Floating point exponential format (uppercase)
f           Floating point decimal format
F           Floating point decimal format
g           Floating point format: uses lowercase exponential format if exponent is less
            than -4 or not less than precision, decimal format otherwise
G           Floating point format: uses uppercase exponential format if exponent is less
            than -4 or not less than precision, decimal format otherwise
c           Single character (accepts integer or single character string)
r           String (converts any Python object using repr())
s           String (converts any Python object using str())
a           String (converts any Python object using ascii())
%           No argument is converted, results in a % character in the result
Table 8.3 Conversion flags for the first version of string formatting
Flag       Meaning
#          Value conversion will use alternate form
0          Conversion will be zero padded for numeric values
-          Converted value is left adjusted (overrides the 0 conversion if both are given)
(a space)  A blank should be left before a positive number (or empty string) produced by
           a signed conversion
+          A sign character (+ or -) will precede the conversion (overrides a space flag)
Fig. 8.4 Python Interpreter sequence illustrating string formatting conforming to Version 1
single item it is directly fitted into the string and output [2]. The width specified
as six is for the whole number representation, inclusive of the sign and the decimal
point. az in [3] is a tuple of numbers selected to bring out the flexibility possible in
formatting. The numbers are printed out to different format specifications in the
following lines. The flag # in [4] demands output with a decimal point [4a] even
though the number (az[0] = 3) is an integer. With proper formatting specified the
numbers for [4], [5], and [6] are output [4a], [5a], and [6a] properly aligned. The
same is not true of the next two outputs. The flag 0 in [5] and [5a] ensures 0
padding to the left of the number in lieu of blank spaces. The blank space used as a
flag in [6] is a space provision for a sign (as if the number is negative). The flag
- in [7] and [7a] results in left adjusted output. The output uses the floating
point exponent in [8] and [8a] due to the use of the conversion character e. rm in
[9] is a dictionary of three entries, two strings followed by an integer. The
string 'Roshan gets in Maths 100 out of 100' [10a] is formed from it and
output. The keys Name and subject are replaced by the respective
strings (Roshan and Maths) and the third key, Marks, is replaced by
the corresponding integer 100. The integer value 91 is assigned to ab in [11]
and output in different formats in succeeding lines, that is, hex and octal values
with and without the respective prefixes (0x and 0o respectively). z1 in [16] is a
tuple of two elements, a single character string (h) and an integer (123). With 123
as the Unicode code point (in decimal) of a character ({), a string is formed in [17]
with the single character format. With * as the width allocated for the entity in
[19], the width value is specified (as 8) in the tuple that follows. Similarly the
asterisk in [20] signifies the precision desired. Its value is specified as two in the
tuple. In [21] the width as well as the precision is specified in the tuple itself. In
all these cases the assignments to the asterisks have to precede the concerned
element value in the same order.
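The flags, widths, and conversion characters of Version I can be tried out with the sketch below. The numbers and names are illustrative assumptions and the lines do not reproduce Fig. 8.4.

```python
value = 3.14159

fixed = "%6.2f" % value            # width 6, two digits after the decimal point
zero_padded = "%08.3f" % value     # flag 0: padded with zeros on the left
left = "%-8.2f|" % value           # flag -: left adjusted within the width
with_sign = "%+d" % 3              # flag +: the sign is always shown
alt_float = "%#.0f" % 3            # flag #: the decimal point is retained

hex_plain = "%x" % 91
hex_prefixed = "%#x" % 91          # flag #: adds the 0x prefix
octal_prefixed = "%#o" % 91        # flag #: adds the 0o prefix

chars = "%c%c" % ("h", 123)        # %c takes a character or its code point

star_width = "%*d" % (8, 42)                 # width taken from the tuple
star_both = "%*.*f" % (8, 2, value)          # width and precision from the tuple

# With a dictionary, the mapping key selects the entry for each field.
record = {"Name": "Roshan", "subject": "Maths", "Marks": 100}
sentence = "%(Name)s gets in %(subject)s %(Marks)d out of 100" % record

print(fixed, zero_padded, left, hex_prefixed, octal_prefixed, chars, sentence)
```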
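8.2.2 Formatting: Version II
[Figure: structure of the ith replacement field (Fi) in Version II formatting:
{ field name ! conversion : format_spec }. All three fields are optional; the closing
curly bracket is mandatory and signifies the end of the field. The conversion has three
options: !r for repr(), !s for str(), and !a for ascii(). The format_spec is composed
of fill, align, sign, #, 0, width, the comma (,), .precision, and type, all of them
optional (see Table 8.4); a string is identified by the type s.]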
a dot (in case the entity is a dictionary item, the field name is the key of
the entity concerned). If the replacement fields are specified in the same order as
the sequence of entities within the brackets, the serial number can be omitted;
the interpreter will assume them to be 0, 1, 2, and so on, in the same order.
2. The character ! identifies the conversion field, if the same is present. Three
forms of conversions, specified as !s, !r, and !a, are possible. With ee as
the specified item, these return str(ee), repr(ee), and ascii(ee)
respectively.
3. The character : signifies the start of the format specification, if the same is
present. The format specification here is again composed of a number of optional
fields:
The formatting examples in Fig. 8.7 illustrate the variety and flexibility possible
when formatting with Version II. The set of simple strings in the list az [1] are
directly formatted into a string in [2]. The desired replacement order being the
same as in az, the index values are not specified. ay in [3] is a dictionary of
four items. They are formatted into a string [4] by specifying the keys in the
Fig. 8.6 Python Interpreter sequence illustrating string formatting conforming to Version II; the
examples are the same as those in the sequence in Fig. 8.4
Fig. 8.7 Python Interpreter sequence illustrating the variety and flexibility in formatting possible
with Version II
formed in [9], [10], [11], and [12]: a number, a string of two different types of
items, a list of different types of items including one involving computation of an
algebraic expression, and the last one being a dictionary. All these form
arguments for forming ss1, the formatted string in [13]. The replacement fields
to form ss1 are also in different orders.
The math module is imported [14] and the value of π directly printed out in
[15]. v1, v2, and v3 in the formatted string in [16] represent π in three different
ways, all giving identical results. The field width specified for π in [17] (5 digits)
gives a corresponding approximate value of π. pi (π) has been defined as a
number in the math module (see math.__dict__); hence it is accessed as math.pi
here.
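A few representative uses of Version II formatting (the format() method of strings) are sketched below. The data and the names are illustrative assumptions; the lines do not reproduce Fig. 8.6 or Fig. 8.7.

```python
import math

words = ["simple", "versatile", "open"]
in_order = "{} {} {}".format(*words)           # serial numbers omitted
reordered = "{2} {0} {1}".format(*words)       # explicit serial numbers

person = {"name": "Roshan", "marks": 100}
by_key = "{name} scored {marks:d}".format(**person)
by_index = "{0[name]} scored {0[marks]}".format(person)

# !r, !s, and !a apply repr(), str(), and ascii() before formatting.
converted = "{!r} {!s} {!a}".format("pi", "pi", "\u03c0")

pi_short = "{:.4f}".format(math.pi)                               # precision only
pi_fields = "{0:>10.3f}|{0:<10.3f}|{0:^+10.2f}".format(math.pi)   # align, sign, width

grouped = "{:,}".format(1234567)               # comma as thousands separator
padded = "{:*^9}".format("mid")                # fill character with centring
bases = "{0:#x} {0:#o} {0:b}".format(91)       # hex, octal, and binary forms

print(in_order, reordered, by_key, converted, pi_short, grouped, padded, bases)
```

Since format() returns an ordinary string, the result can be stored, combined with other strings, or passed to print() later; formatting and printing stay delinked.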
Example 8.1 Marks earned by a set of students in different subjects are given as a
set of strings in demo_5.marks1 (see Fig. 5.16). The subject names and the
student names are also given there. Output the data as a well arranged formatted
table.
The program to present the information is in the suite from [19]. The output is
presented in the lines starting with [20]. The table is properly formatted (spaces
uniformly set out and aligned) since the length of every entity in the table is equal
to/less than the default tab size (8). If the length of any quantity exceeds this tab
size, the program has to be suitably changed.
Modules (see Sect. 4.2) serve as platforms to store Python code and functions in
conveniently organized form. When required, they can be retrieved and used
through importing. Data as a number sequence or as plain text can be stored as
files. In the Python environment, a file is a string or a bytes object stored in a
specified location. A set of associated methods provides access to the file to use
specified and selected parts from it or modify it in desired ways. We shall study
these in some detail here.
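As a first orientation, the basic cycle of opening a file, writing into it, closing it, and reading it back may be sketched as follows. The file name is hypothetical and a temporary directory is used so that nothing is left behind on the disk; the strings written are illustrative assumptions.

```python
import os
import tempfile

lines = ("a1", "b2", "c3", "d4")

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "ftb_demo.txt")    # hypothetical file name

    handle = open(path, "w")                       # 'w': open for writing
    counts = [handle.write(item + "\n") for item in lines]
    handle.close()                                 # frees the system resources

    handle = open(path, "r")                       # 'r': open for reading
    content = handle.read()                        # the whole content as one string
    second_read = handle.read()                    # nothing is left to be read
    handle.close()

print(counts, repr(content), repr(second_read))
```

Each write() returns the number of characters written (here two characters and the new line), and a second read() returns an empty string since the first one has already consumed the whole content.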
the current directory, ft has been opened as a new file. As mentioned earlier, all
items written into the file represented by ft will together be stored as a single
string or bytes object. s1 in [2] is a single string. d1.write(s1) in [3] writes s1 as
text into the file (through d1); write() is the method used to do the writing. On
execution, write() returns the number of characters written into the file (the
number of bytes, for a file opened in binary mode). Here it is 25, as seen from [4].
Once a file is opened in this manner, as many characters or character sequences as
desired can be written into it. When the desired writing is complete, the file can be
closed with d1.close() as in [5]. Whenever a file is opened for writing or other
related operations, it should be closed with the method close() to free the system
resources committed to the open file.
The file ft is opened again in [6], but this time in read mode, as the second
argument 'r' signifies. d2.read() in [7] uses the read() method to read the
contents of ft. Another read in [8] returns an empty string, since the content of ft
has already been read in [7]. d2.close() in [9] is the formal closing of ft, to avoid
the file being left open as well as to free the resources used by ft while it was
open. A new file, fta, has been opened in [10] with d3 representing the open file.
The file path has been specified within open() itself; this is mandatory when the
file opened is in another directory (and not the current one). fta has been opened
in the directory Documents. e1 is a tuple of strings [11]. They are all written in
the same sequence into fta in [12]. Since each string here ('a1', 'b2', 'c3', and
'd4') is of two characters, each write() returns 2 on completion of writing. The
resulting full content of fta is a single string of eight characters, 'a1b2c3d4'. The
same can be seen from [17] to [20], where fta is opened again, this time in read
mode, specifying the path [17]. The contents are read [18], displayed [19], and the
file is closed [20].
Another file, ftb, is opened in [14] with its path specified. The elements of e1 are
written into it in [15] in four separate and successive lines. Here each write()
writes three characters: two being the string and the third the newline character
'\n'. When ftb is closed after this write sequence, its content is a sequence of four
lines (a1, b2, c3, d4); all these four lines together make up the file content. The
file content has been reproduced in Fig. 8.9a; nevertheless it still remains a single
string, 'a1\nb2\nc3\nd4\n'.
(a)     (b)
a1      Maya: 'Roshan, How are you?'
b2      Roshan:'Fine, Maya, Thanks
c3      Maya:'Nice to know that, Roshan
d4
Fig. 8.9 Content of files ftb (a) and fmr (b) after each is written and closed
When file operations are desired to be done in a clear sequence, use of the
with keyword makes it elegant. The file is automatically closed as part of the
sequence, obviating the need for a separate close() command. ftb is opened in
[21] in this manner for reading, its contents (as a string) are assigned to mn, and
ftb is closed; [22] confirms this. mn, a single string of four lines (as explained
earlier), is shown in [23]. [21] constitutes a single operation; multiple commands
can also be executed in the same manner within a single suite.
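The automatic closing done by with can be checked with a small sketch like the following. The file name ftb_demo and the temporary directory are assumptions made for this illustration, so that no existing file is touched.

```python
import os
import tempfile

# A scratch file in a temporary directory (hypothetical name).
path = os.path.join(tempfile.mkdtemp(), 'ftb_demo')

# Several statements can share one with-suite; the file is closed on exit.
with open(path, 'w') as fw:
    for item in ('a1', 'b2', 'c3', 'd4'):
        fw.write(item + '\n')

with open(path) as fr:      # default mode is 'r'
    mn = fr.read()

print(fw.closed, fr.closed, repr(mn))
```

Both file objects report closed as True once their suites are over, even though close() was never called explicitly.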
Additional methods with files and the flexibility they offer are brought out
through the Python Interpreter sequence in Fig. 8.10. h1 in [1] represents a new file,
fmr, opened in the current directory in write mode. Three strings are
written in succession into the file ([2], [3], [4]) and the file is closed. Each of the
three strings ends with a newline, and the file contents at that stage look as in
Fig. 8.9b. However (in the Python environment) the file itself is a single string
comprising these lines. fmr is opened (as h2) in read mode in [6] and another
file, faa, again in the same directory, is opened in write mode [7] as h3. The
suite of statements from [8] reads fmr line by line and writes each line to faa.
readline() reads one line of the opened file and advances the file pointer to the
start of the following line. After all the lines are read from fmr, mm = h2.readline()
returns an empty string; this terminates the loop. From the number of characters
written successively (29, 27, and 32) one can see that all three lines in fmr have
been written into faa. Following this, h2 and h3 are closed. faa has been opened
again in [12], its contents read, and faa closed after completion of the read operation.
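The line-by-line copy described above can be sketched as follows. The three lines are those shown in Fig. 8.9b; the file names and the temporary directory are assumptions for this illustration and need not match the listing in Fig. 8.10.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
src = os.path.join(folder, 'fmr_demo')   # hypothetical source file
dst = os.path.join(folder, 'faa_demo')   # hypothetical destination file

lines = ["Maya: 'Roshan, How are you?'\n",
         "Roshan:'Fine, Maya, Thanks\n",
         "Maya:'Nice to know that, Roshan\n"]
with open(src, 'w') as h1:
    for line in lines:
        h1.write(line)

counts = []
with open(src) as h2, open(dst, 'w') as h3:
    while True:
        mm = h2.readline()            # '' signals the end of the file
        if mm == '':
            break
        counts.append(h3.write(mm))   # write() returns the characters written

with open(dst) as h4:
    copied = h4.read()
print(counts)
```

The list counts collects the values returned by the successive write() calls, one per line copied.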
(a)
>>> e1 = ('a1', 'b2', 'c3', 'd4') [1]
>>> p1 = open('ftz', 'w') [2]
>>> for jj in e1:p1.write(jj + '\n')
...
3
3
3
3
>>> p1.close()
>>> p2 = open('ftz','r+') [3]
>>> p2.readline() [4]
'a1\n'
>>> p2.write('e5') [5]
2
>>> p2.close()
>>> with open('ftz','r') as p3:p3.read() [6]
...
'a1\nb2\nc3\nd4\ne5'
>>> p4 = open('ftz','r+') [7]
>>> p4.write('f6')
2
>>> p4.seek(0) [8]
0
>>> p4.read() [9]
'f6\nb2\nc3\nd4\ne5'
>>> p4.write('g7')
2
>>> p4.seek(0) [10]
0
>>> p4.read() [11]
'f6\nb2\nc3\nd4\ne5g7'
>>> p4.close()
>>> with open('ftz','w') as p5: [12]
... for jj in e1:p5.write(jj + '\n')
...
3
3
3
3
(b)
>>> with open('ftz','a') as p6:p6.write('Aa\n') [13]
...
3
>>> p6.close()
>>> with open('ftz','r') as p7: p7.read() [14]
...
'a1\nb2\nc3\nd4\nAa\n'
>>> dd = b'Dhruva as a star is an eternal symbol of HOPE' [15]
>>> ee = b'\x65\x66\x67\x68' [16]
>>> with open('fty', 'w+b') as q1: [17]
... q1.write(dd)
... q1.write(b'\n')
... q1.write(ee)
...
45
1
4
>>> q2 = open('fty', 'r+b') [18]
>>> q2.read()
b'Dhruva as a star is an eternal symbol of HOPE\nefgh'
>>> q2.seek(0) [19]
0
>>> q2.readline() [20]
b'Dhruva as a star is an eternal symbol of HOPE\n'
>>> q2.readline() [21]
b'efgh'
>>> q2.close()
>>> d1 = open('Rubiayat') [22]
>>> while True: [23]
... mm = d1.readline()
... if mm == '3\n':break
...
>>> for jj in range(5):d1.readline() [24]
...
'\n'
'And, as the Cock crew, those who stood before\n'
'The Tavern shouted--"Open then the Door!\n'
'"You know how little while we have to stay,\n'
'"And, once departed, may return no more."\n'
>>> d1.close()
ftz closed. 'e5' can be seen to be appended at the end of the file, as the read contents
reveal in [6]. The file is again opened in 'r+' mode [7] and the string 'f6'
written into it. A fresh read of the file [9] shows that 'f6' has been written at the
beginning of the file, replacing the two characters 'a1' originally there. The
updating done can be seen to depend on the previous accesses after the file has been
opened. Again writing the string 'g7' into ftz and reading the file content [11]
confirms this. The tuple e1 has been written afresh into the file ftz [12].
Subsequently ftz has been opened for appending (mode 'a') [13] and the string
'Aa\n' appended to it. The appending can be seen to be done at the end of
the file [14] as expected.
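The difference between the 'w', 'r+', and 'a' modes can be seen in a compact sketch such as the one below. It uses a hypothetical scratch file in a temporary directory rather than ftz itself, and writes at the start of the file immediately after opening in 'r+' mode.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'ftz_demo')   # hypothetical scratch file

with open(path, 'w') as f:       # 'w' creates the file (or empties an existing one)
    f.write('a1\nb2\n')

with open(path, 'r+') as f:      # 'r+' reads and writes; the pointer starts at 0
    f.write('f6')                # overwrites the first two characters
    f.seek(0)                    # go back to the beginning before reading
    after_rplus = f.read()

with open(path, 'a') as f:       # 'a' always writes at the end of the file
    f.write('Aa\n')

with open(path) as f:
    after_append = f.read()
print(repr(after_rplus), repr(after_append))
```

Writing in 'r+' mode replaces existing characters at the current position; it does not insert new ones ahead of them.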
dd [15] and ee [16] are two bytes objects. A new file fty has been opened for
writing bytes (mode 'w+b') [17]. dd and ee are written into it in successive
lines; when read, the file content is output as a single bytes object [18]. A fresh
line by line reading of fty (starting at the beginning of the file) shows the file
content as bytes, written in two successive lines [19], [20], [21].
Example 8.2 Rubiayat is a text file (a few couplets from Rubiayat by
Omar Khayyam) in the current directory. The couplets are entered in it with the
serial number as the title. Print out the third couplet.
Rubiayat is opened in read mode in [22]. Successive lines are read using the
method readline() in a loop to identify the third couplet [23]. The loop is terminated
as soon as '3' is read in a line. The subsequent five lines (a blank line followed by the
four lines of the couplet), read with readline() in a following loop, are printed
out [24]; the file is then closed with d1.close().
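The same search can also be written by iterating over the file object and closing the file through with. The sketch below assumes a small stand-in file with numbered blocks of two lines each; the content, the file name, and the helper read_block() are hypothetical and only mirror the layout described for Rubiayat.

```python
import itertools
import os
import tempfile

text = '1\nline a\nline b\n2\nline c\nline d\n3\nline e\nline f\n'
path = os.path.join(tempfile.mkdtemp(), 'couplets_demo')   # hypothetical file
with open(path, 'w') as f:
    f.write(text)

def read_block(path, title, count):
    """Return the `count` lines that follow the line equal to `title`."""
    with open(path) as f:
        for line in f:
            if line.strip() == title:
                # Continue from the same iterator, just past the title line.
                return [ln.rstrip('\n') for ln in itertools.islice(f, count)]
    return []        # title not found

third = read_block(path, '3', 2)
print(third)
```

If the title is absent the function returns an empty list instead of looping forever, which the while True loop with readline() would do at the end of the file.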
8.4 Exercises
1. Prepare a Python program to print out the pyramid of integers as in Fig. 8.12.
Save it in a file, read it back and reproduce it.
2. In Fig. 8.12 replace every integer by its 9's complement (nine minus the integer)
and get a new pyramid of integers.
3. In the pyramid in Fig. 8.12 replace every integer by S % 10 where S is the sum
of all the integers to the left in the same row.
4. Prepare a program to replace an integer in the range (0, 25) by a corresponding
alphabetic character. Use this to prepare a pyramid of alphabetic characters as in
Fig. 8.13.
5. Replace every character in Fig. 8.13 by the next one; replace Z by A.
6. The Gregorian calendar has the following features:
Reference
van Rossum G, Drake FL Jr (2014) The Python library reference. Python Software Foundation
Chapter 9
Application Modules
The Python Standard Library has a set of application modules with a possible wide
spectrum of users (van Rossum and Drake 2014). These are of interest here.
The random module in Python provides a variety of options that go with random
variables, random processes, and their uses. As with random number generators in
computer-based systems, a pseudo-random number generator is at the base of the
module; its period (in number of bits) being orders larger than the size of numbers
and sequences used, the operations are essentially random. The method
random.random() returns a floating point number (with 53-bit precision) in the
range [0.0, 1.0), the number being a random selection based on a uniform
distribution over the range (see [2] in Fig. 9.1). All other methods/functions are
based on this basic selection. The basic methods available are the following; their
use is illustrated with the Python Interpreter sequence in Fig. 9.1.
random.seed(nn) initializes the generator with the seed integer nn. All subsequent
calls to different methods use this initialized generator as the basis; the result is a
deterministic sequence. Results of any random number based simulation/study
can be reproduced later by setting the seed to this number. This can be used to
confirm repeatability. Another application is to satisfy the need of using a
common (random data based) database for different simulations. The seed is set
to 253 in [3]. A set of random numbers (b1, b2, and c1) is generated
following this [4], [5].
random.getrandbits(bb) yields a sequence of bb random bits. The
sequence is returned as an integer [6] and [7].
Fig. 9.1 Python Interpreter sequence to illustrate the features of the random module
The seed for the random generator is reset to the earlier value (= 253) in [8] and
the command sequence repeated. One can see that bb1, bb2, and cc1 obtained
here [9], [10] have the same values as b1, b2, and c1 obtained earlier. The
random.getrandbits(24) that follows in [11] returns the same 24-bit set as
obtained earlier [6].
A fresh Python session is started (after closing the above one) and random
imported again. The seed is set to the value (= 253) [12] used in the foregoing
session. The previous command sequence is repeated. b10, b20, and c10
obtained here [13], [14] have the same values as b1, b2, and c1 in the last
session. The same holds for random.getrandbits() in [15] as well.
When the seed is specified as an integer it is directly used as the seed for the
pseudo-random generator. Alternately it can be a string, bytes, or
bytearray. In all these cases the equivalent binary string is used as the seed
for the pseudo-random generator. If the seed is not specified, the current system
time (or a randomness source provided by the operating system, when available)
is taken as the seed. The tuple as1 in [16] is taken as the seed in [17].
A set of four random numbers in the interval [0.0, 1.0) is generated in [18]. The
seed value is restored in [19] and a further set of four random numbers is
generated in [20]; these can be seen to be repetitions of the set obtained earlier.
random.choice(aa1) returns a randomly selected element from the sequence
aa1; aa1 remains unaltered. aa1 can be a list, bytes, tuple, and so on. In
the Python Interpreter sequence in Fig. 9.2, aa1 is a list of tuples. A random
element, hh, is selected from it and returned in [2]. In the
line following, a similar random selection is done successively three more times
(dd, bb, and cc are returned in turn).
random.sample(aa1, k) uses aa1 as a base; a sample of k elements from
aa1 is selected at random and returned. In [3] a sample of three elements is
returned (aa, cc, hh) since k = 3. The original set aa1 remains undisturbed.
Further, the samples are selected independently and randomly. Hence the
sample set can be subdivided further (if necessary) and used as independent
sample sets.
random.shuffle(AA) shuffles the sequence AA randomly in place. aa1 has
been assigned to bb in [4] and bb shuffled in [5]. The shuffled sequence can be
compared to the original one [1].
random.randrange(a, b, c) uses the sequence {a, a + c, a + 2*c, ...,
a + c*((b - a - 1)//c)} and returns a randomly selected element from it.
random.randrange(5, 2000, 15) returns 13115 in [6]. (13115 - 5)//15 = 874; thus the
element at index 874 of the sequence is selected at random and returned. If c is left
out, an integer in the range [a, b) is selected randomly and returned. If only b is
specified, a non-negative integer less than b is returned. [7] is an illustration.
random.randint(a, b) returns a randomly selected integer between a and b
(inclusive). [8] is an illustration. This is an alias for random.randrange(a,
b + 1). These essentially do random.choice(range(a, b + 1, c)) in the sense that it
Fig. 9.2 Python Interpreter sequence to illustrate the additional features of the random module
Fig. 9.3 Listing of a Python function using Gaussian distribution: A bar graph of number
frequencies is made and a list of random numbers returned
the range (mean ± 3σ); it is reproduced in Fig. 9.4. Each bar represents the
frequency of numbers in an interval of width 6σ/20.
The distribution functions supported through the random module along with the
parameters to specify the functions (Krishnan 2006; Zwillinger 2003) are given in
Table 9.1.
Table 9.1 Details of the distribution functions available in the random module in Python. In
each case the function (when called) returns a number n conforming to the specified distribution

Type of distribution  Calling function       Details of parameter(s)                       Range of n
Uniform               uniform(a, b)                                                        [a, b]
Triangular            triangular(a, b, c)    c is the mode; default values of a, b, c      [a, b]
                                             are 0, 1, (a + b)/2
Beta                  betavariate(α, β)      α and β are both greater than 0               [0, 1]
Exponential           expovariate(a)         1/a is the mean                               [0, ∞) if a > 0 and
                                                                                           (-∞, 0] if a < 0
Gamma                 gammavariate(α, β)     α and β are both greater than 0               (0, ∞)
Gaussian              gauss(μ, σ)            μ is the mean and σ the standard deviation    (-∞, ∞)
Log normal            lognormvariate(μ, σ)   σ > 0                                         (0, ∞)
Normal distribution   normalvariate(μ, σ)    μ is the mean and σ the standard deviation    (-∞, ∞)
Von Mises             vonmisesvariate(μ, κ)  μ is the mean angle and κ the
                                             concentration parameter
Pareto                paretovariate(a)       a is the shape parameter                      [1, ∞)
Weibull               weibullvariate(α, β)   α and β are the scale and shape parameters    (0, ∞)
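The seeding and selection methods of the random module discussed above can be exercised with a short sketch like the one below. The list aa1 and the numeric arguments are illustrative assumptions and are not the data used in Figs. 9.1 and 9.2; bb is made as a copy of aa1 so that shuffling bb leaves aa1 itself untouched.

```python
import random

# Same seed, same sequence: the basis of repeatable simulations.
random.seed(253)
first = [random.random() for _ in range(3)]
bits = random.getrandbits(24)
random.seed(253)
again = [random.random() for _ in range(3)]
bits_again = random.getrandbits(24)

aa1 = [('aa', 1), ('bb', 2), ('cc', 3), ('dd', 4), ('hh', 5)]   # hypothetical data
pick = random.choice(aa1)              # one element; aa1 is unchanged
trio = random.sample(aa1, 3)           # three distinct elements of aa1
bb = list(aa1)                         # copy, so that aa1 keeps its order
random.shuffle(bb)                     # shuffles bb in place
step = random.randrange(5, 2000, 15)   # one of 5, 20, 35, ... below 2000
die = random.randint(1, 6)             # 1 to 6, both inclusive

print(first == again, pick, trio, step, die)
```

Note that a plain assignment bb = aa1 would make both names refer to the same list, so shuffling bb would shuffle aa1 as well.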
The statistics module offers the facility to extract the key statistical information for
the given sample set. The sample values have to be real numbers. They need not be
ordered. The semantics of the methods are given in Table 9.2. The function
stsc(aa) (in module dst_aa.py) reproduced in Fig. 9.5 accepts a sequence of numbers
as input; it returns statistical information compiled using the statistics module, as a
dictionary. The sequence of 1000 random numbers (conforming to Gaussian
distribution) obtained earlier is used as input to stsc() and the statistical information
extracted is presented as a dictionary in Fig. 9.6. For a sufficiently large sample set the
mean and median values should be close to 3.0, the mean value used to generate the
sample set; the variance should be close to 4.0, since σ = 2.0 for the generated sample set.
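The quantities involved can be checked by hand on a small data set before trusting them on a large one. In the sketch below the six-element list dd and the seed value are arbitrary choices made for illustration; the Gaussian sample uses the mean 3.0 and standard deviation 2.0 mentioned above.

```python
import random
import statistics

dd = [2, 4, 4, 5, 7, 8]                     # small set, easy to verify by hand
mean = statistics.mean(dd)                  # 30 / 6
median = statistics.median(dd)              # average of the middle pair 4 and 5
low = statistics.median_low(dd)             # lower of the middle pair
high = statistics.median_high(dd)           # higher of the middle pair
pvar = statistics.pvariance(dd)             # 24 / 6 (divide by n)
svar = statistics.variance(dd)              # 24 / 5 (divide by n - 1)

random.seed(1)                              # arbitrary seed, for repeatability
sample = [random.gauss(3.0, 2.0) for _ in range(1000)]
print(mean, median, low, high, pvar, svar)
print(statistics.mean(sample), statistics.pvariance(sample))
```

For the large sample the printed mean and variance are only approximately 3.0 and 4.0; the agreement improves as the sample size grows.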
Table 9.2 Quantities that can be calculated using the statistics module. dd is the numerical input data
presented as a sequence (list, tuple, etc.)

Quantity                   Calling function     Returned quantity
Mean                       mean(dd)             Arithmetic mean
Median                     median(dd)           Median (middle) value; need not be an element of dd
Median low                 median_low(dd)       If len(dd) is odd, both return the median value; if len(dd)
Median high                median_high(dd)      is even, median low and median high return the lower and
                                                the higher of the median values
Median grouped             median_grouped(dd)   50th percentile of dd
Mode                       mode(dd)             Most common element in dd (only if it is unique)
Standard deviation         pstdev(dd [, μ])     Standard deviation of the population: if μ is given, it is
                                                taken as the mean; else it is computed and used
Population variance        pvariance(dd [, μ])  Population variance: if μ is given, it is taken as the
                                                mean; else it is computed and used
Sample standard deviation  stdev(dd [, x])      Standard deviation of the sample: if x is given, it is
                                                taken as the mean; else it is computed and used. With a
                                                sample set, this is preferred to pstdev()
Sample variance            variance(dd [, x])   Variance of the sample: if x is given, it is taken as the
                                                mean; else it is computed and used. With a sample set,
                                                this is preferred to pvariance()
import statistics

def stsc(aa):
    "Collection of statistical information for the sample set -- aa"
    dm = {}
    dm['mean'] = statistics.mean(aa)
    dm['median'] = statistics.median(aa)
    dm['median_low'] = statistics.median_low(aa)
    dm['median_high'] = statistics.median_high(aa)
    dm['median_grouped'] = statistics.median_grouped(aa)
    dm['pstdev'] = statistics.pstdev(aa)
    dm['pvariance'] = statistics.pvariance(aa)
    dm['stdev'] = statistics.stdev(aa)
    dm['variance'] = statistics.variance(aa)
    # Return the compiled information as a dictionary
    return dm
Fig. 9.5 Python function to list out the statistical information extracted from a given data
sequence
Fig. 9.6 Statistical information extracted from the data set obtained by executing the routine in
Fig. 9.3
(a)
>>> import array [1]
>>> c1 = b'98765abcde' [2]
>>> type(c1)
<class 'bytes'>
>>> c2 = array.array('B', c1) [3]
>>> c2
array('B', [57, 56, 55, 54, 53, 97, 98, 99, 100, 101])
>>> bb1 = bytearray.fromhex('2211 abcd effe aabb ccdd
cddc') [4]
>>> bb1
bytearray(b'"\x11\xab\xcd\xef\xfe\xaa\xbb\xcc\xdd\xcd\xdc'
)
>>> bb2 = array.array('B', bb1) [5]
>>> bb2
array('B', [34, 17, 171, 205, 239, 254, 170, 187, 204,
221, 205, 220])
>>> c2n = array.array('b', c1) [6]
>>> c2n
array('b', [57, 56, 55, 54, 53, 97, 98, 99, 100, 101])
>>> bb3 = array.array('b', bb1) [7]
>>> bb3
array('b', [34, 17, -85, -51, -17, -2, -86, -69, -52, -35,
-51, -36])
>>> a1 = [1.2, 22.3, 3.4, 4.5, 5.6] [8]
>>> a3 = array.array('f', a1) [9]
>>> a3
array('f', [1.2000000476837158, 22.299999237060547,
3.4000000953674316, 4.5, 5.599999904632568])
>>> a2 = array.array('d', a1) [10]
>>> a2
array('d', [1.2, 22.3, 3.4, 4.5, 5.6])
>>> b1 = [2, -3, 44, -55, 678, 8901, -87654] [11]
>>> b2 = array.array('i', b1) [12]
>>> b2
array('i', [2, -3, 44, -55, 678, 8901, -87654])
>>> b2[:3] [13]
array('i', [2, -3, 44])
>>> b2.buffer_info() [14]
(35327984, 7)
>>> a2.itemsize, b2.itemsize [15]
(8, 4)
>>> b2.byteswap() [16]
>>> b2
array('i', [33554432, -33554433, 738197504, -905969665, -
1509818368, -987627520, -1700135169])
>>> b3 = array.array('q', [2, -3, 44, -55, 678, 8901, -
87654]) [17]
>>> b3
array('q', [2, -3, 44, -55, 678, 8901, -87654])
Fig. 9.7 a Python Interpreter sequence to illustrate array formations (continued in Fig. 9.7b)
b Python Interpreter sequence to illustrate array formations (continued in Fig. 9.7c) c Python
Interpreter sequence to illustrate array formations (continued from Fig. 9.7b)
(b)
>>> b3.byteswap() [18]
>>> b3
array('q', [144115188075855872, -144115188075855873,
3170534137668829184, -3891110078048108545, -
6484620513460092928, -4241827899029585920, -
7302024945339465729])
>>> bb3 = array.array('l',b1) [19]
>>> bb3
array('l', [2, -3, 44, -55, 678, 8901, -87654])
>>> bb3.itemsize [20]
8
>>> bb3.byteswap() [21]
>>> bb3
array('l', [144115188075855872, -144115188075855873,
3170534137668829184, -3891110078048108545, -
6484620513460092928, -4241827899029585920, -
7302024945339465729])
>>> c2n.append(-107) [22]
>>> c2n
array('b', [57, 56, 55, 54, 53, 97, 98, 99, 100, 101, -
107])
>>> c2nn = array.array('b', b'f') [23]
>>> c2nn
array('b', [102])
>>> cc2 = c2n +c2nn [24]
>>> cc2
array('b', [57, 56, 55, 54, 53, 97, 98, 99, 100, 101, -
107, 102])
>>> cc2.reverse() [25]
>>> cc2
array('b', [102, -107, 101, 100, 99, 98, 97, 53, 54, 55,
56, 57])
>>> cc2.extend(c2nn) [26]
>>> cc2
array('b', [102, -107, 101, 100, 99, 98, 97, 53, 54, 55,
56, 57, 102])
>>> br1 = bytes([20, 61, 102, 143, 184, 225]) [27]
>>> br1
b'\x14=f\x8f\xb8\xe1'
>>> br2 = bytearray([254, 215, 186, 147, 108, 69, 30])[28]
>>> br2
bytearray(b'\xfe\xd7\xba\x93lE\x1e')
>>> c2nn.frombytes(br1) [29]
>>> c2nn [30]
array('b', [102, 20, 61, 102, -113, -72, -31])
>>> c2nn.frombytes(br2) [31]
>>> c2nn [32]
array('b', [102, 20, 61, 102, -113, -72, -31, -2, -41, -
70, -109, 108, 69, 30])
(c)
>>> c2nn.tobytes() [33]
b'f\x14=f\x8f\xb8\xe1\xfe\xd7\xba\x93lE\x1e'
>>> c2nn.tostring() [34]
b'f\x14=f\x8f\xb8\xe1\xfe\xd7\xba\x93lE\x1e'
>>> b0 = c2nn.tolist() [35]
>>> b0 [36]
[102, 20, 61, 102, -113, -72, -31, -2, -41, -70, -109,
108, 69, 30]
>>> b1 = [34, 17, -85, -51, -17, -2, -86, -69, -52, -35, -
51, -36] [37]
>>> for kk in b1:c2nn.append(kk) [38]
...
>>> c2nn
array('b', [102, 20, 61, 102, -113, -72, -31, -2, -41, -
70, -109, 108, 69, 30, 34, 17, -85, -51, -17, -2, -86, -
69, -52, -35, -51, -36])
>>> c2nn.count(-2) [39]
2
>>> c2nn.index(-51) [40]
17
>>> c2nn.insert(17, 102) [41]
>>> c2nn
array('b', [102, 20, 61, 102, -113, -72, -31, -2, -41, -
70, -109, 108, 69, 30, 34, 17, -85, 102, -51, -17, -2, -
86, -69, -52, -35, -51, -36])
>>> c2nn.pop(17) [42]
102
>>> c2nn [43]
array('b', [102, 20, 61, 102, -113, -72, -31, -2, -41, -
70, -109, 108, 69, 30, 34, 17, -85, -51, -17, -2, -86, -
69, -52, -35, -51, -36])
>>> len(c2nn)
26
>>> cn = array.array('b') [44]
>>> for kk in range(26):cn.append(c2nn.pop()) [45]
...
>>> cn
array('b', [-36, -51, -35, -52, -69, -86, -2, -17, -51, -
85, 17, 34, 30, 69, 108, -109, -70, -41, -2, -31, -72, -
113, 102, 61, 20, 102])
>>> c2nn [46]
array('b')
>>> cn.index(-2) [47]
6
>>> cn.remove(-2) [48]
>>> cn.index(-2) [49]
17
Table 9.3 Characters used to define arrays and their significance: all except u (already obsolete)
signify numbers

Type code  C type              Minimum size in bytes
b          Signed char         1
B          Unsigned char       1
u          Py_UNICODE          2
h          Signed short        2
H          Unsigned short      2
i          Signed int          2
I          Unsigned int        2
l          Signed long         4
L          Unsigned long       4
q          Signed long long    8
Q          Unsigned long long  8
f          Float               4
d          Double              8
The sequence operations like slicing, indexing, and concatenation can be used with
arrays. A number of other methods are also available with arrays. b2[:3] in [13]
in Fig. 9.7a forms a slice of the first three elements of the array b2 formed earlier.
The method b2.buffer_info() returns a tuple of two items [14] comprising
the memory address of b2 and the number of elements in b2. The attribute
itemsize gives the size of the elements of the concerned array in number of
bytes. a2.itemsize and b2.itemsize [15] return 8 and 4 as the respective
values.
byteswap() swaps the bytes of every element of the array concerned. Such swapping
may be called for with serial interface protocols which use the alternate byte
sequence representation. b2.byteswap() in [16] swaps the bytes of the elements of
array b2. With a 4-byte representation of integer 2, b2[0] (= 00 00 00 02h) when
swapped becomes 33,554,432 (= 02 00 00 00h = 2^25); similarly with the other
swapped elements. The array b3 in [17] is formed with the same values as the list
b1 [11] used to form b2 in [12]; hence its elemental values are the same as those of
b2. But here every element is of 8-byte type. [18] in Fig. 9.7b forms its swapped
version (for example, the first element becomes 2^57 = 144,115,188,075,855,872).
bb3 [19] has 'l' as the type code for its formation. It is again an array of signed
integers, each being 8 bytes long on the platform used here [20], the same as b3
which has the type code 'q' for its formation in [17] above.
In turn the swapped version of bb3 [21] is identical to the swapped version of b3
itself. c2n in [6] formed earlier in Fig. 9.7a is an array of signed single byte
integers formed from corresponding characters. The append() method is used in
[22] in Fig. 9.7b to append -107 to c2n. The single character b'f' is converted to a
corresponding array c2nn (= [102]) in [23]. Subsequently c2n and c2nn are
combined in [24] to form a single array cc2 of signed single byte integers.
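The effect of byteswap() can be cross-checked without assuming a particular item size, by reversing the bytes of each element with int.to_bytes() and int.from_bytes(). The following is a minimal sketch along these lines; the three values are taken from the start of b1, and the names expected and restored are introduced only for this illustration.

```python
import array

b2 = array.array('i', [2, -3, 44])
size = b2.itemsize                      # bytes per element on this platform

# Expected result: write each value in one byte order, read it in the other.
expected = [int.from_bytes(v.to_bytes(size, 'little', signed=True),
                           'big', signed=True) for v in b2]

b2.byteswap()                           # in place; returns None
print(size, b2.tolist())

restored = array.array('i', b2)         # a copy of the swapped array
restored.byteswap()                     # swapping twice gives back the original
```

With a 4-byte 'i' the first element becomes 33554432, as in the listing; on a platform with a different item size the swapped values differ, but they still agree with expected.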
The method reverse() reverses the sequence in the array in place.
cc2.reverse() in [25] is an example of its application. cc2.extend(c2nn) in [26]
extends the array by combining c2nn with it. append() appends an integer
(a single element) to the array. extend() appends to the array all the elements of
another array of the same type; extend() is the same as doing a set of successive
append() operations in a loop.
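The difference between append(), extend(), concatenation, and reverse() can be seen in a few lines. The arrays below are small stand-ins chosen for illustration, not the ones in Fig. 9.7.

```python
import array

cc2 = array.array('b', b'ab')        # signed bytes [97, 98]
cc2.append(-107)                     # one element at a time
tail = array.array('b', b'f')        # [102]

joined = cc2 + tail                  # concatenation builds a new array
cc2.extend(tail)                     # extend() changes cc2 in place
cc2.reverse()                        # reversal is also in place

# extend() accepts another array only if its type code is the same.
try:
    cc2.extend(array.array('i', [1]))
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(joined, cc2, mixed_ok)
```

The failed extend() leaves cc2 unchanged, which is why mixing type codes has to be handled by converting one of the arrays first.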
Conversion from an array to a list, bytes, or string is possible through methods
dedicated for the same, namely tolist(), tobytes(), and tostring()
respectively. Similarly fromlist(), frombytes(), and fromstring() can be
used to extend arrays by appending the set of elements from the respective
sequences. br1 in [27] and br2 in [28] are bytes and bytearray type
sequences. c2nn.frombytes(br1) [29] appends the full set of elements from
bytes br1 to c2nn [30]. c2nn.frombytes(br2) [31] extends c2nn further by
appending all elements of the bytearray br2 to it [32]. c2nn.tobytes() [33]
(Fig. 9.7c) returns the bytes sequence of elements of c2nn. Similarly
c2nn.tostring() [34] and c2nn.tolist() [35] return the respective string and list
sequences (in Python 3, tostring() returns the same bytes object as tobytes()).
The latter is assigned to b0 and displayed in [36].
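The conversions between arrays, bytes, and lists can be followed in a short sketch such as this one. The byte values are arbitrary examples chosen so that some of them exceed 127 and therefore appear as negative numbers in a signed array.

```python
import array

c = array.array('b', b'f')                   # [102]
c.frombytes(bytes([20, 61, 102, 143]))       # 143 is stored as 143 - 256 = -113
c.frombytes(bytearray([254, 108]))           # 254 is stored as 254 - 256 = -2

as_list = c.tolist()                         # ordinary Python list of ints
as_bytes = c.tobytes()                       # raw bytes of the array

round_trip = array.array('b')
round_trip.frombytes(as_bytes)               # rebuild the array from its bytes
print(as_list, as_bytes)
```

tobytes() and frombytes() are exact inverses for a given type code, so the rebuilt array compares equal to the original.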
b1 [37] is a list of integers, each within the range of a signed single byte; its
elements are appended to c2nn in the same sequence [38]. As mentioned earlier, this
needs the elements of b1 to be compatible in type and range with those of c2nn.
c2nn.count(-2) [39] returns the number of occurrences of -2 in c2nn.
c2nn.index(-51) [40] returns the index of the first occurrence of -51 in c2nn.
c2nn.insert(17, 102) [41] inserts 102 at index 17 of c2nn. The new value of c2nn
accessed in the following line confirms this.
c2nn.pop(17) in [42] pops the element at index 17 of c2nn. The last inserted element,
102, is popped out of c2nn here. c2nn is accessed and again output in [43], which
confirms this. As an exercise cn is initialized as an empty array in [44]. Elements of
c2nn are popped out successively and appended to cn [45]. cn is the reversed
version of c2nn and c2nn is left as an empty array [46].
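The counting, searching, insertion, and popping operations described above behave as they do for lists. A small sketch, with a short hypothetical array in place of c2nn:

```python
import array

c = array.array('b', [102, 20, -2, 61, -51, -2])   # hypothetical values

n = c.count(-2)            # number of occurrences of -2
i = c.index(-51)           # index of the first occurrence of -51
c.insert(4, 7)             # 7 goes to index 4; later elements shift right
popped = c.pop(4)          # removes and returns the element at index 4

cn = array.array('b')
while c:                   # pop() without an index takes the last element
    cn.append(c.pop())

print(n, i, popped, cn, c)
```

Draining c through pop() fills cn in reverse order and leaves c empty, as with c2nn and cn in the listing.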
cn.index(-2) [47] returns the index of the first occurrence of -2 in cn.
cn.remove(-2) [48] removes -2 at its first occurrence in cn. The subsequent
cn.index(-2) in [49] confirms this by showing the index of the first occurrence
of -2 as 17 in the new cn.
Files, being binary or bytes type sequences, can be converted to arrays and vice
versa. The Python Interpreter sequence in Fig. 9.8 illustrates use of the relevant
methods. File ft is opened in [1], its content assigned to ds1 [2], and the file
closed. An empty array gg1 (type 'B', unsigned single byte character) is formed
[3]. gg1.fromstring(ds1) in [4] fills up gg1 with the sequence of unsigned
integers representing string ds1. The methods tostring() and fromstring() were
retained for compatibility with older versions of Python; they have been removed
from Python 3.9 onwards. tobytes() and frombytes() should be used instead.
[5] returns the lengths of gg1 and ds1 as 25 and 25, showing that the full
string ds1 has been converted to form gg1. typecode returns the character used
to create the array. gg1.typecode in [6] returns the character representing the
elements of array gg1. array.typecodes [7] returns the full character set
possible to form arrays. The returned set conforms to the set of characters in
Table 9.3.
Fig. 9.8 Python Interpreter sequence illustrating data transfer between array and le
Transfer of data between arrays and files is facilitated by the methods tofile() and
fromfile(). File fty has 'Dhruva is a symbol of eternal hope' as its content [8].
gg3 in [9] is a bytes type sequence (b' A light that leads'). ga3 [10] is an array
formed from gg3. fty is opened [11] in the append mode, ga3 written to it [12],
and fty closed. Here ga3.tofile(d1) writes ga3 to the open file represented by d1.
fty is read afresh in [13] and its contents displayed. gg2 is declared as a new
(empty) array [14] and the content of file fty transferred to it as an array [15].
Subsequently gg2 is converted to a string [16] through gg2.tostring().
gg2.fromfile(d2, 35) in [15] fills the array gg2 with 35 characters from the open
file represented by d2. In general gf.fromfile(f0, n) accesses the open file f0,
reads n items and appends them to the array gf. The type of reading, writing, and
appending is decided here by the mode selected to open the file concerned.
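A minimal round trip between an array and a file may look like the sketch below. The file name, the shortened byte strings, and the temporary directory are assumptions for illustration; the files are opened in binary modes since tofile() and fromfile() deal with raw bytes.

```python
import array
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'fty_demo')   # hypothetical file

with open(path, 'wb') as d0:
    d0.write(b'Dhruva')                  # initial content, 6 bytes

ga3 = array.array('B', b' star')         # 5 more bytes, held as an array
with open(path, 'ab') as d1:             # append mode, binary
    ga3.tofile(d1)                       # write the array elements to the file

gg2 = array.array('B')
with open(path, 'rb') as d2:
    gg2.fromfile(d2, 11)                 # read exactly 11 items into gg2

text = gg2.tobytes().decode('ascii')
print(text)
```

fromfile() raises EOFError if fewer than the requested number of items are available, so the count passed to it should not exceed what the file holds.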
Fig. 9.9 Python Interpreter sequence illustrating the methods with bisect module
fits snugly in the index range (1 to 6) of ll, that is, between 256 and 410. However
ll1[7] (= 474) in the following line [10] is larger than the largest element ll[4] in the
specified range ll[2:5]. Hence it is fitted as the next element in ll. The segment
ll[2] to ll[5] remains sorted but not the whole of ll. The sorting range can be specified
through the lower (lo) and the higher (hi) limits with the other methods of bisect
as well.
la in [11] is a list of names. It has been sorted in [12]. A new name, 'Dhara', has
been inserted into the list in [13]. The methods in bisect can be used similarly
with any mutable sequence in Python that can be sorted using Python's data
structure.
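The insertion and search methods of bisect, with and without the lo and hi limits, can be tried out as follows. The numbers and names in the lists are hypothetical and are not those of Fig. 9.9.

```python
import bisect

ll = [120, 256, 301, 355, 410, 467]       # hypothetical sorted list
bisect.insort(ll, 330)                    # insert 330, keeping ll sorted
pos = bisect.bisect_left(ll, 355)         # index at which 355 is found
slot = bisect.bisect(ll, 474, 2, 5)       # search confined to ll[2:5]

la = sorted(['Maya', 'Roshan', 'Asha', 'Kiran'])
bisect.insort(la, 'Dhara')                # works for any orderable elements

print(ll, pos, slot, la)
```

Since 474 is larger than every element of ll[2:5], the position returned is hi itself, irrespective of the elements that lie beyond the specified range.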
In Python the heapq module pertains to a class of mutable sequences where the
individual elements are arranged in a binary tree fashion. The value of every parent
node in the binary tree is less than or equal to the values of its two daughter nodes.
More specifically, v[k] ≤ v[2 * k + 1] and v[k] ≤ v[2 * k + 2], where v[k] is the
value at the kth node, for all k values. Often such an organized entity is called a
heap. The methods available with the heapq module do operations conforming to
these inequalities. As such, in a heap v[0] is always the smallest element. If it is
popped out, the smaller one between v[1] and v[2] takes its place (being the new
smallest element); in turn the rest of the heap is automatically updated in a similar
fashion.
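The heap inequalities can be verified directly with a small helper. In the sketch below is_heap() is an illustrative function (it is not part of heapq); it compares every node with its parent at index (k - 1)//2, which is equivalent to the two inequalities stated above.

```python
import heapq

def is_heap(v):
    """True if every element is >= its parent, i.e. v satisfies the heap rule."""
    return all(v[(k - 1) // 2] <= v[k] for k in range(1, len(v)))

lb = list(range(16))             # 0 to 15; a sorted list is already a heap
heapq.heapify(lb)
smallest = heapq.heappop(lb)     # removes v[0] and restores the heap rule

print(smallest, lb, is_heap(lb))
print(is_heap([3, 1, 2]))        # 3 at the root is larger than its daughters
```

After the pop the list is no longer sorted, but it still satisfies the heap rule, which is all that the heapq methods guarantee.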
The Python Interpreter sequence in Fig. 9.10 facilitates understanding of the
facilities with heapq. heapq is imported in [1] in Fig. 9.10a. lb is formed as a list
[3] of the integers 0 to 15. The way it is presented, lb is a sorted list.
heapq.heapify(lb) [4] executes the method heapify() with lb; it rearranges the
elements of lb as a heap. Incidentally, lb being a sorted list, it is already a heap.
lb as a heap is depicted in the binary tree form in Fig. 9.11. Here the serial number
of each node is shown within brackets beside the integer value at the node.
heapq.heappop(lb) in [5] pops and returns the smallest element of the heap lb.
The heap is updated automatically. The updated heap is shown in Fig. 9.12a in
binary tree form. The update process can be understood by comparing the heap
here with that in Fig. 9.11. The nodes in Fig. 9.12a where the values are changed
are identified in block letters; the dotted arrow in each case shows the sequence of
changes in the content of the nodes.
The values at the two daughter nodes of node 0, N1 and N2 (1 and 2), are
compared and 1 (being the smaller of the two) occupies node N0. N2 and its
branches beneath remain untouched. Node 1 (N1) is filled by one of its daughters
(N3 and N4, having values 3 and 4); 3 (from N3) being the smaller value occupies
(a)
>>> import heapq [1]
>>> lb = [] [2]
>>> for gg in range (16):lb.append(gg)
...
>>> lb
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
[3]
>>> heapq.heapify(lb) [4]
>>> lb
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
>>> heapq.heappop(lb), lb [5]
(0, [1, 3, 2, 7, 4, 5, 6, 15, 8, 9, 10, 11, 12, 13, 14])
>>> heapq.heappop(lb), lb [6]
(1, [2, 3, 5, 7, 4, 11, 6, 15, 8, 9, 10, 14, 12, 13])
>>> heapq.heappop(lb), lb [7]
(2, [3, 4, 5, 7, 9, 11, 6, 15, 8, 13, 10, 14, 12])
>>> heapq.heappop(lb), lb [8]
(3, [4, 7, 5, 8, 9, 11, 6, 15, 12, 13, 10, 14])
>>> heapq.heappop(lb), lb [9]
(4, [5, 7, 6, 8, 9, 11, 14, 15, 12, 13, 10])
>>> heapq.heappop(lb), lb [10]
(5, [6, 7, 10, 8, 9, 11, 14, 15, 12, 13])
>>> heapq.heappop(lb), lb [11]
(6, [7, 8, 10, 12, 9, 11, 14, 15, 13])
>>> heapq.heappop(lb), lb [12]
(7, [8, 9, 10, 12, 13, 11, 14, 15])
>>> heapq.heappop(lb), lb [13]
(8, [9, 12, 10, 15, 13, 11, 14])
>>> heapq.heappop(lb), lb [14]
(9, [10, 12, 11, 15, 13, 14])
>>> heapq.heappop(lb), lb [15]
(10, [11, 12, 14, 15, 13])
>>> heapq.heappop(lb), lb [16]
(11, [12, 13, 14, 15])
>>> heapq.heappop(lb), lb [17]
(12, [13, 15, 14])
>>> heapq.heappop(lb), lb [18]
(13, [14, 15])
>>> import random [19]
>>> gg1 = []
>>> random.seed(2)
>>> for kk in range(16):gg1.append(random.randint(0, 1000)) [20]
...
>>>
Fig. 9.10 a Python Interpreter sequence illustrating the methods with heapq module (continued
in Fig. 9.10b) b Python Interpreter sequence illustrating the methods with heapq module
(continued from Fig. 9.10a)
(b)
>>> gg1
[978, 883, 970, 869, 57, 93, 86, 369, 855, 173, 753, 828,
685, 874, 315, 257]
>>> hp1 = [] [21]
>>> for kk in gg1:
... heapq.heappush(hp1, kk)
... print(hp1)
...
[978]
[883, 978]
[883, 978, 970]
[869, 883, 970, 978]
[57, 869, 970, 978, 883]
[57, 869, 93, 978, 883, 970]
[57, 869, 86, 978, 883, 970, 93]
[57, 369, 86, 869, 883, 970, 93, 978]
[57, 369, 86, 855, 883, 970, 93, 978, 869]
[57, 173, 86, 855, 369, 970, 93, 978, 869, 883]
[57, 173, 86, 855, 369, 970, 93, 978, 869, 883, 753]
[57, 173, 86, 855, 369, 828, 93, 978, 869, 883, 753, 970]
[57, 173, 86, 855, 369, 685, 93, 978, 869, 883, 753, 970,
828]
[57, 173, 86, 855, 369, 685, 93, 978, 869, 883, 753, 970,
828, 874]
[57, 173, 86, 855, 369, 685, 93, 978, 869, 883, 753, 970,
828, 874, 315]
[57, 173, 86, 257, 369, 685, 93, 855, 869, 883, 753, 970,
828, 874, 315, 978]
>>>
N0 0
N1 1   N2 2
N3 3   N4 4   N5 5   N6 6
N7 7   N8 8   N9 9   N10 10   N11 11   N12 12   N13 13   N14 14
N15 15
Fig. 9.11 Heap of integers in the range 0 to 15 showing the nodes and their respective contents:
for each node, Ni on the left denotes the ith node and the integer on the right is its content
N1. Node N4 and its daughter branches remain untouched. In the same vein N3 is
filled by the value 7 from its daughter node N7. Similarly 15 occupies node N7.
Node N15, being empty, gets deleted.
[Figure: panels (a) to (h), each a binary tree of nodes N0, N1, N2, ... showing the heap after a pop]
Fig. 9.12 Status of the heap in Fig. 9.11 after successive pops: in each case the arrows in
dotted lines show the movement of the node contents conforming to the heapq algorithm.
Figs. (a) to (c) show the status after popping 0, 1, and 2 respectively. Figs. (d) to (h) show the status
after popping integers 8 to 12
The resulting heap has 15 elements in it, with 1 at node N0 being the smallest
element. Note that the heap is no longer in sorted order though it still conforms to
the basic rule of a heap: any parent node holds a value not larger than those in its two
daughter nodes.
The heap lb is popped repeatedly in [6] to [18]. The heaps at some selected
stages are shown in Fig. 9.12. In every case the changed values are identified in
bold form.
With aa as a heap and bb an entity of the same type as the elements of aa, the
method heapq.heappush(aa, bb) pushes bb into the heap aa. gg1 [20] is a list
of random integers in the range 0 to 1000. Heap hp1 [21] (in Fig. 9.10b)
is formed by successively pushing elements from gg1 into hp1; this is done by the
for loop. Formation of the heap and migration of elements from node to node conform to
the basic heap rule mentioned earlier: v[k] ≤ v[2 * k + 1] and v[k] ≤ v[2 * k + 2]
for all k values for which these daughter nodes exist.
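The heap rule can be checked directly in code. The following is a minimal sketch; the helper is_min_heap() is a hypothetical name introduced here for illustration (it is not part of heapq), and the data are the first eight entries of gg1 in Fig. 9.10b.

```python
import heapq

def is_min_heap(v):
    """Return True if v[k] <= v[2*k + 1] and v[k] <= v[2*k + 2]
    wherever those daughter nodes exist (hypothetical helper)."""
    n = len(v)
    for k in range(n):
        for child in (2 * k + 1, 2 * k + 2):
            if child < n and v[k] > v[child]:
                return False
    return True

values = [978, 883, 970, 869, 57, 93, 86, 369]  # first eight entries of gg1
heap = []
for item in values:
    heapq.heappush(heap, item)
    assert is_min_heap(heap)   # the rule holds after every push

print(heap)   # [57, 369, 86, 869, 883, 970, 93, 978], as in Fig. 9.10b
```

The printed list is not sorted, yet every parent is no larger than its daughters, which is all that a heap guarantees.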
Different methods available with heapq are illustrated in the Python Interpreter
sequence in Fig. 9.13. la [2] in Fig. 9.13a is a list of random numbers. lb [2] is
another similar list of random numbers which has been rearranged as a heap.
heapq.heapreplace(lb, la[0]) [3] pops the smallest item from lb (=31) and pushes la[0]
(=881) into the heap lb. The updated lb has 881 taking up the position due to it in the
heap.
lc [5] is a new heap formed from la with all its elements except la[0].
heapq.merge(aa, bb, cc) merges the iterables aa, bb, and cc and returns an
iterator over the combined items. It assumes that each input is already sorted
from smallest to largest, in which case the output is sorted as well; with inputs that
are only heap-ordered, as here, the output need not be sorted and need not be a
valid heap (in ld, 687 precedes its daughter 650). There is no constraint on the
number of arguments for merge(). list(heapq.merge(lb, lc)) [6] merges lb
and lc and returns the combined items as a list ld.
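A short sketch of the two safe ways of combining heaps is given below. The lists are the first five entries of lb and lc from Fig. 9.13a; the names merged_sorted and combined are introduced here only for illustration.

```python
import heapq

lb = [65, 154, 687, 535, 308]   # a valid heap, but not a sorted list
lc = [15, 43, 163, 237, 155]    # a valid heap, but not a sorted list

# heapq.merge() expects sorted inputs; sorting first gives a sorted result.
merged_sorted = list(heapq.merge(sorted(lb), sorted(lc)))
print(merged_sorted)

# To obtain a single valid heap, concatenate and heapify.
combined = lb + lc
heapq.heapify(combined)
print(combined[0])              # 15, the smallest item overall
```

heapq.merge() is lazy: it returns an iterator and reads its inputs only as items are requested, which is why list() is needed to see the result.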
heapq.heappushpop() combines a push followed by a pop into a single
method. heapq.heappushpop(lb, 222) [7] pushes 222 into lb and pops its
smallest element, 65.
heapq.heapreplace(aa, bb) removes the smallest item from heap aa and
pushes bb into it. The subtle difference between heapreplace() and
heappushpop() is to be clearly understood. In heappushpop() the incoming item is
pushed into the heap and then the popping is done. heapreplace() does the same
in the reverse order, so the item it returns may be larger than the item pushed in.
a1, a list of integers in [8], is converted into a heap in [9].
heapq.heapreplace(a1, 12) [10] pops out a1[0] (=15) and then inserts 12 into
a1. The resulting a1 is in [11]. heapq.heappushpop(a1, 11) in [12] pushes 11
into the heap. Being the smallest element in the heap, it is popped out at once. The heap
in effect remains the same [13].
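The difference in the order of operations shows up clearly when the incoming item is smaller than everything in the heap. The sketch below uses the list a1 of Fig. 9.13a; h1, h2, out1, and out2 are names introduced here for illustration.

```python
import heapq

a1 = [881, 237, 155, 948, 399, 15, 795]
heapq.heapify(a1)                  # smallest item, 15, is now a1[0]

h1 = list(a1)
out1 = heapq.heapreplace(h1, 11)   # pop first, then push: returns 15, 11 stays

h2 = list(a1)
out2 = heapq.heappushpop(h2, 11)   # push first, then pop: 11 comes straight back

print(out1, h1[0])                 # 15 11
print(out2, h2 == a1)              # 11 True
```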
Two methods are available to extract a desired number of smallest and largest
values from a heap. heapq.nlargest(4, ld) [14] returns the largest four elements
of ld as a list. Similarly heapq.nsmallest(4, ld) returns the smallest four
elements of ld as a list ([15] in Fig. 9.13b). These methods are recommended only
for small values of n. Sorting the list and slicing it may be more attractive for larger
values.
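The two approaches give the same answers, as the following sketch shows. The list data holds the first twelve entries of ld in Fig. 9.13a; the other names are introduced here for illustration.

```python
import heapq

data = [15, 43, 65, 154, 163, 237, 155, 687, 535, 308, 795, 881]

top4 = heapq.nlargest(4, data)     # largest first
low4 = heapq.nsmallest(4, data)    # smallest first

# The same results by sorting and slicing.
by_sort = sorted(data)
assert low4 == by_sort[:4]
assert top4 == by_sort[::-1][:4]

print(top4)   # [881, 795, 687, 535]
print(low4)   # [15, 43, 65, 154]
```

Neither function requires its argument to be a heap; any iterable of mutually comparable items will do.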
In general the methods in heapq may be applied to any list of elements whose
values can be compared in Python. An example is considered here by way of
(a)
>>> import heapq, random [1]
>>> random.seed(3)
>>> la, lb = [], []
>>> for kk in range (12):
... la.append(random.randint(0, 1000))
... lb.append(random.randint(0, 1000))
...
>>> heapq.heapify(lb)
>>> la, lb [2]
([881, 237, 155, 948, 399, 15, 795, 163, 980, 43, 798,
843], [31, 154, 65, 535, 308, 687, 888, 776, 605, 650,
759, 886])
>>> heapq.heapreplace(lb, la[0]) [3]
31
>>> lb
[65, 154, 687, 535, 308, 881, 888, 776, 605, 650, 759,
886]
>>> lc = la[1:] [4]
>>> heapq.heapify(lc) [5]
>>> lc
[15, 43, 163, 237, 155, 795, 948, 980, 399, 798, 843]
>>> ld = list(heapq.merge(lb, lc)) [6]
>>> ld
[15, 43, 65, 154, 163, 237, 155, 687, 535, 308, 795, 881,
888, 776, 605, 650, 759, 886, 948, 980, 399, 798, 843]
>>> lb, lc
([65, 154, 687, 535, 308, 881, 888, 776, 605, 650, 759,
886], [15, 43, 163, 237, 155, 795, 948, 980, 399, 798,
843])
>>> heapq.heappushpop(lb, 222) [7]
65
>>> a1 = [881, 237, 155, 948, 399, 15, 795] [8]
>>> heapq.heapify(a1) [9]
>>> a1
[15, 237, 155, 948, 399, 881, 795]
Fig. 9.13 a Python Interpreter sequence illustrating methods with heapq (continued in Fig. 9.13b)
b Python Interpreter sequence illustrating methods with heapq (continued from Fig. 9.13a)
(b)
>>> heapq.nsmallest(4, ld) [15]
[15, 43, 65, 154]
>>> medals = [(2, 'Silver'), (3, 'Bronze'), (1, 'Gold'), (5, 'Next_trial'), (4, 'Certificate')] [16]
>>> medals[0] < medals[1] [17]
True
>>> hp = [] [18]
>>> for kk in medals:heapq.heappush(hp, kk) [19]
...
>>> heapq.heappop(hp)
(1, 'Gold')
>>> heapq.heappop(hp)
(2, 'Silver')
>>> medals [20]
[(2, 'Silver'), (3, 'Bronze'), (1, 'Gold'), (5,
'Next_trial'), (4, 'Certificate')]
>>>
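The medals sequence above relies on the fact that Python compares tuples element by element, so the leading integer acts as a priority. A compact sketch of the same idea follows; the names order and names are introduced here for illustration.

```python
import heapq

medals = [(2, 'Silver'), (3, 'Bronze'), (1, 'Gold'),
          (5, 'Next_trial'), (4, 'Certificate')]

hp = []
for entry in medals:
    heapq.heappush(hp, entry)      # ordered by the first tuple element

# Popping until the heap is empty yields the entries in priority order.
order = [heapq.heappop(hp) for _ in range(len(hp))]
names = [name for _, name in order]
print(names)
```

The original list medals is left untouched, since the tuples were pushed into a separate list hp.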
9.6 Exercises
2. A few sentences are reproduced below and assigned to SS. Get the set of
words in them. Form a list, a sorted list, and a heap of the set.
SS = 'The copper tube lines that carried pneumatic supply and signals were as
common in the industry as the electric power supply conduits. With the advent
of electronic schemes all these have become things of the past. Quantities
which were ignored (gas concentration) and those perceived as not measurable
(high temperature) have catapulted and fallen prey to sensors.'
3. Form aa as a list of 16 random numbers in the range [0.0, 1), all rounded to
four digits. Form bb as a sorted list from aa. Form cc as a heap from aa; do
these manually.
Continuously pop bb and cc. Observe how the list and the heap change.
4. An eatery offers four types of dishes: d[0], d[1], d[2], and d[3]. A set of 40
customers line up to buy the dishes. Each customer has his own order of
preferences for the dishes. Initially the customers queue up in front of the dish
counters, ten at each counter. The sale and queue progress take place at
intervals 1, 2, 3, … set by a counter Ct. When Ct = 1, the first set of sales takes
place; when Ct = 2, the second set of sales takes place, and so on. Whenever
Ct advances a maximum of four customers is allowed to change queue
(customers always observe the status of remaining stock and take decisions
accordingly). The change is allowed in the order: last person in the queues for
d[0], d[1], d[2], and d[3]; the last but one person in the queues for d[0], d[1],
d[2], and d[3]; and so on.
Assign the dish preferences randomly for the 40 customers; have it as a
tuple of numbers for each customer.
Write the necessary program and carry out the following:
Select the persons randomly and fill the four queues initially.
Track the progress of sales, queue movement, and the movement of people.
Give the queue status when one of the items is completely sold off, and when
two of them are completely sold off. Identify the people who did the maximum
number of queue jumps and show their queue jumping history.
5. Write a program to generate test marks data for a class of 30 students in six
subjects each in a particular semester. The subjects are 'b1', 'b2', 'b3', 'b4',
'b5', and 'b6'. The student names are 's1', 's2', 's3', …, and 's30'. In each
subject the mark is assigned randomly conforming to a Gaussian distribution with
a given mean and sigma value set. For the six subjects take these set values as
{(70, 15), (60, 12), (75, 8), (65, 11), (75, 9), (70, 14)} respectively. Write a
program and generate the marks data for all the students in all the subjects; in
each case have the mark correct to one decimal digit. Store the data in a
dictionary with the student's name being the key and the marks obtained
forming a corresponding array of six entries as the value. Use these as
the base data to test the programs in the following exercises:
a. Ten percent of the students fail in each subject. For each subject identify the
pass mark.
b. For each subject get the average mark of the students who have passed.
c. The students who have passed are to be given grades 'A', 'B', and 'C',
the number getting these grades being in the ratio 1:3:2. The top one sixth
of the students who have passed (the integer closest to that) is to be given
'A' grade; the next three sixths (integer closest to that) is to be given 'B'
grade. The rest are given 'C' grade. Assign the grades in all the subjects for
all the students who have passed. For each student assign the grades tuple of
six entries, each entry being 'A', 'B', 'C', or 'F' ('F' signifies failure).
d. Students who have failed in more than two subjects are to repeat the
semester.
e. The student who gets the maximum number of 'A's is the first rank holder. If
more than one student qualifies, use the sum of marks in all the subjects to
decide the first rank.
Write a Python program to do all the above and complete the exercise. All the
computed data is to be made available in the form of a dictionary with the student's
name as the key, marks in all the subjects forming another tuple, 'First
rank', 'Repeat semester', or 'semester completed' being an additional entry. All
these together as a list should form the value against the key.
6. The annual rainfall in the n districts of a state is given for 10 consecutive years.
The following data is to be generated:
a. The annual average rainfall for the state.
b. If the rainfall in a district exceeds the state average for three consecutive
years the district is termed 'rain-fed'.
c. If the rainfall in a district is below the state average for three consecutive
years the district is termed 'rain-deficient'.
d. If the rainfall in a district for the last 3 years is within m ± 0.2σ (m being
the average rainfall for that district in the last 10 years and σ² its variance),
predict the coming year's rainfall as m mm with a 'good' confidence level.
In all other districts predict the coming year's rainfall as the last year's itself
without an attached confidence level tag.
Simpson's Rule:
Prepare a program to compute the area using (a) the Trapezoidal rule and
(b) Simpson's rule.
$$\frac{x^2}{p^2} + \frac{y^2}{q^2} = 1$$
is the equation of an ellipse in the x-y plane; its part in the first
quadrant is shown in Fig. 9.14. Find the area enclosed by the axes and the
ellipse in the first quadrant (shaded area) for p = 4 and q = 3. Do it for 100 and
1000 as the values of n. The actual area is πpq/4. Find the percentage error in
both the cases.
8. The Monte Carlo method offers a radically different approach to get the area
(Guttag 2013). Obtain a sufficiently large number of random points within the
area enclosed by x = a, x = b, y = 0, and y = ym where ym is the maximum
value of y within the interval [a, b] of x (i.e., the rectangle enclosed by the
vertical lines at a and b and the horizontal lines at 0 and ym). If f is the fraction
of the points lying within the area of interest, f(b − a)ym is the desired area. The
random point (x, y) can be obtained with x as a random number in the interval
[a, b] with uniform distribution and y as another random number in the interval
[0, ym] again with uniform distribution. Write a program to get the area using
the Monte Carlo method. Estimate the area of the ellipse segment shown in
Fig. 9.14; do it with 10,000 and 1,00,000 random points. Find the
percentage error in both the cases.
9. With p = q = 1 in (7) above the ellipse reduces to a circle with unit radius. The
area shown in Fig. 9.14 for this case is π/4. Use this to estimate the value of π;
do this with 10,000 and 1,00,000 random points. Find the percentage
error in both the cases.
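As a starting point for Exercises 8 and 9, the Monte Carlo recipe described above may be sketched as follows. This is one possible arrangement and not a prescribed solution: the function name mc_area(), the predicate arguments, and the seed value 1 are all assumptions made here for illustration.

```python
import math
import random

def mc_area(inside, a, b, ym, n, rng):
    """Estimate an area as f * (b - a) * ym, where f is the fraction of n
    uniformly random points of the rectangle for which inside(x, y) is true."""
    hits = 0
    for _ in range(n):
        x = rng.uniform(a, b)       # x uniform in [a, b]
        y = rng.uniform(0.0, ym)    # y uniform in [0, ym]
        if inside(x, y):
            hits += 1
    return hits / n * (b - a) * ym

rng = random.Random(1)              # arbitrary seed, for repeatability only
p, q = 4.0, 3.0

def in_ellipse(x, y):
    return (x / p) ** 2 + (y / q) ** 2 <= 1.0

def in_circle(x, y):
    return x * x + y * y <= 1.0

area = mc_area(in_ellipse, 0.0, p, q, 10_000, rng)
exact = math.pi * p * q / 4
pct_error = 100 * abs(area - exact) / exact

pi_est = 4 * mc_area(in_circle, 0.0, 1.0, 1.0, 10_000, rng)
print(area, pct_error, pi_est)
```

The estimate changes with the seed and the number of points; repeating the run with more points should, on average, reduce the percentage error.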
10. Ram makes a request to Shyam: 'Can you please lend me one thousand rupees?'
(a) Shyam's answer has five possible values with probabilities as given in
Table 9.5. The answer is to be decided using a random number in the range
{1, 100} with uniform distribution; if the number is in the range {1, 15} the
decision is A and so on.
(b) The decision matrix in Table 9.6 is to be used to decide the answer. Use a
random number in the range {1, 10} to decide Shyam's mood: he is in
a good mood if the number is six or less; else he is in a bad mood.
Write programs to generate the answers and test them.
11. Sandeep has a farm of 110 coconut trees. The trees are planted in ten rows, the
odd rows having eleven trees and the even ones having nine trees. The
co-ordinates of the trees in the first row are (0, 0), (0, 2), (0, 4), …, and (0, 10);
the co-ordinates of the trees in the second row are (1, 1), (1, 3), (1, 5), …, and
(1, 9); similarly with other rows. He has assessed the quality of the trees with
values one, two, and three assigned to them, based on their age, health, and
type of fruit. 35, 35, and 40 trees have been assigned the values one, two,
and three respectively. He wants to partition the farm into three parts of equal
value, or as near to that as possible, and give them to his three sons.
(a) Select trees randomly and assign quality values one, two, and three to each
conforming to the constraint given above.
(b) The partition is to be done with vertical parting lines. Decide their positions
such that the value differences amongst the three partitions are the
minimum.
(c) Do partitions with two parallel lines of unit slope each conforming to the
condition in (b) above.
(d) With diagonally opposite corners of the plot as centres draw circular arcs
and do the partition again conforming to the condition in (b) above.
12. Her Majesty's Judgement: The Queen in Alice in Wonderland watches the
game of croquet. She doles out judgments at random: 'Cut off his head', 'Cut
off her head'. We have a more sophisticated queen here lording over a
population of 1000. At regular intervals she picks out any one of her subjects
randomly and declares him/her the accused. The accused is doled out a judgment
selected randomly from a set of four. Implement the scheme and help out the
computer-savvy queen for the first ten judgments.
13. Selective Institutional Admissions: 'Thou shalt obey the rules of the land', said
the Lord of the Land. The Land has two (disciplined and devoted to The Lord)
communities, the Elite (E) and the Committed (C), with 70 and 30% of
the population. The Land has a sacred Institution and there is a mad rush for
admissions. Every year for the fifty seats in the institution, 10,000 belonging to
both the communities apply, all being equally eligible. 'The admissions shalt
be proportional and equal injustice be done to all', had decreed the Lord. The
wise elders of both the communities (C and E) evolved the admission procedure
as follows:
The 6,000 applications from the E community be numbered E[0], E[1], E[2],
…, and E[5999]; similarly the 4000 applications of the C community be
numbered C[0], C[1], C[2], …, and C[3999]. Have a basket of 100 balls, 70
marked E and the remaining 30 marked C. Pick a ball at random from the
basket; if it is marked E, pick out a random number from the 6000 E-set and
assign a seat to that applicant. Do these 50 times to complete admissions. (The
random pickings from the applicants' set are to be without replacement.)
$$c_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj}$$
Write a program to get the product of two matrices and test it with specific data
generated with matrices of random numbers. Use arrays to represent vectors
and matrices of numbers.
15. Solution of matrix equation: The matrix equation Ax = b, with A being an
n × n matrix and b an n-dimensional vector, can be solved for the n unknowns
x by different methods. Gauss Elimination is one of the popular methods
of solution (Kreyszig 2006). The method is presented as an algorithm here:
Form the augmented matrix C of size n × (n + 1) where
    $C = [A \mid b]$
For j = 1 to n − 1
    For k = j + 1 to n
        $d_{kj} = c_{kj} / c_{jj}$
        For m = j + 1 to n + 1
            $c_{km} = c_{km} - d_{kj} c_{jm}$
$x_n = c_{n,n+1} / c_{nn}$
For j = n − 1 to 1
    $x_j = \frac{1}{c_{jj}} \left( c_{j,n+1} - \sum_{k=j+1}^{n} c_{jk} x_k \right)$
The vector [x_j] forms the solution. Use arrays to represent vectors and
matrices of numbers.
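A minimal sketch of the elimination and back-substitution steps is given below. It uses nested lists rather than arrays for brevity, does no pivoting (so it assumes that no zero turns up on the diagonal), and the function name gauss_solve() and the 3 × 3 test data are choices made here for illustration.

```python
def gauss_solve(A, b):
    """Solve Ax = b by Gauss elimination without pivoting (illustrative sketch)."""
    n = len(A)
    # Augmented matrix C = [A | b], of size n x (n + 1)
    C = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]

    # Forward elimination: clear the entries below each diagonal element
    for j in range(n - 1):
        for k in range(j + 1, n):
            d = C[k][j] / C[j][j]
            for m in range(j, n + 1):
                C[k][m] -= d * C[j][m]

    # Back substitution, from the last unknown to the first
    x = [0.0] * n
    for j in range(n - 1, -1, -1):
        s = sum(C[j][k] * x[k] for k in range(j + 1, n))
        x[j] = (C[j][n] - s) / C[j][j]
    return x

A = [[2.0, 1.0, -1.0],
     [-3.0, -1.0, 2.0],
     [-2.0, 1.0, 2.0]]
b = [8.0, -11.0, -3.0]
x = gauss_solve(A, b)
print(x)    # close to [2.0, 3.0, -1.0]
```

A production routine would normally interchange rows (pivoting) to avoid division by a zero or very small diagonal element.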
16. Form the tuple SS = 'all quality inner garments' (display in a shop window).
Get the collection of words in SS, jumble them, and form different combinations
of phrases through a program.
References
Guttag JV (2013) Introduction to computation and programming using Python. MIT Press
Kreyszig E (2006) Advanced engineering mathematics, 9th edn. Wiley, New Jersey
Krishnan V (2006) Probability and random processes. Wiley, New Jersey
van Rossum G, Drake FL Jr (2014) The Python library reference. Python Software Foundation
Zwillinger D (ed) (2003) Standard mathematical tables and formulae. Chapman & Hall/CRC, New
York
Chapter 10
Classes and Objects
10.1 Objects
In Python any entity which has a name (identity) assigned to it is an object (van
Rossum and Drake 2014). It can be an integer, a number (real or complex), a list, a
string, a function, and so on. There is no restriction on it. Further one does not have
to formally declare the type of the entity before assignment. Python does it and adapts
itself on the fly, transparent to the user. Every object in Python has its own id
(identity), a number which in the CPython interpreter is its address in memory.
Consider the Python Interpreter sequence in Fig. 10.1 which brings out the basic
characteristics of objects. With the assignment a = 3 in [1] Python understands a to
be an integer and assigns the numerical value 3 to it. These are evident from [2] and
[3] respectively. Further id(a) in [4] signifies its specific identification (address) in memory.
b = a in [5] makes b refer to the same object as a. In Python b is essentially
another tag/name for the given object with id 8978848. In other words
a and b are two different name tags for the same entity; [6] clarifies this. b is
assigned a different value in [7] and it acquires a different identity: it now refers to a
different object (though of the same type). [8] and [9] confirm this. b is
assigned the numerical value 3.0 in [10]. b is no longer an integer. It is a different
object with a different identity [11]. However a remains the integer object
committed earlier. c in [12] is a list of assorted items. With d = c in [13], d is
another name tag for this list object. d[2] = 4 + 3j in [14] changes an element in the
list. The change is reflected in c as well. c and d have the same identity [15] and
they point to the same object. d has been given a fresh assignment in [16]:
a tuple of two items. d is now a different (type of) object, as can be verified
from the ids of c and d [17].
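The same observations can be made without looking at id values, using the is operator, which tests whether two names refer to one and the same object. The sketch below follows the assignments of Fig. 10.1; the names same_before, same_after, and shared are introduced here for illustration.

```python
a = 3
b = a
same_before = a is b      # True: two name tags for one object

b = 5
same_after = a is b       # False: b now refers to a different object

c = [2, 5.0, 3 - 4j, 'cc']
d = c                     # d is another name for the same list
d[2] = 4 + 3j             # the change is visible through c as well
shared = (c is d) and (c[2] == 4 + 3j)

d = (9.0, 25.0)           # fresh assignment: d now names a new tuple
print(same_before, same_after, shared, c is d)   # True False True False
```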
10.2 Classes
>>> a = 3 [1]
>>> type(a) [2]
<class 'int'>
>>> a [3]
3
>>> id(a) [4]
8978848
>>> b = a [5]
>>> id(b) [6]
8978848
>>> b = 5 [7]
>>> a, id(a) [8]
(3, 8978848)
>>> id(b) [9]
8978912
>>> b = 3.0 [10]
>>> id(b) [11]
140381507170688
>>> c = [2, 5.0, 3-4j, 'cc'] [12]
>>> d = c [13]
>>> d[2] = 4+3j [14]
>>> c, d
([2, 5.0, (4+3j), 'cc'], [2, 5.0, (4+3j), 'cc'])
>>> id(c), id(d) [15]
(140381505950728, 140381505950728)
>>> d = (9.0, b*b) [16]
>>> id(c), id(d) [17]
(140381505950728, 140381505950600)
>>>
Fig. 10.1 Python Interpreter sequence to bring out the basic features of objects
(a)
class Teacher:                                     [1]
    "Teacher information"                          [2]
    tchn = 0                                       [3]
    def __init__(self, nm, ag):                    [4]
        self.name = nm                             [5]
        self.age = ag                              [6]
        print('New teacher with name: {} & of age: {}'.format(self.name, self.age))
        Teacher.tchn += 1                          [7]
    def Th_cnt(self):                              [8]
        "Give number of teachers on roll"
        if Teacher.tchn > 1:
            print('{} Teachers are on the rolls'.format(Teacher.tchn))
        elif Teacher.tchn == 1:
            print('There is only one teacher on the rolls')
        else:
            print('There is no teacher on the rolls')
(b)
>>> import school
>>> t1 = school.Teacher('Rakesh', 31) [9]
New teacher with name: Rakesh & of age: 31
>>> t2 = school.Teacher('Ramya', 24) [10]
New teacher with name: Ramya & of age: 24
>>> t3 = school.Teacher('Shyama', 32) [11]
New teacher with name: Shyama & of age: 32
>>> t4 = school.Teacher('Harini', 35) [12]
New teacher with name: Harini & of age: 35
>>> t1.tchn [13]
4
>>> t1.tchn, t2.tchn, t3.tchn, t4.tchn [14]
(4, 4, 4, 4)
>>> id(t1.tchn),id(t2.tchn),id(t3.tchn),id(t4.tchn) [15]
(8978880, 8978880, 8978880, 8978880)
>>> t1.tchn = 5 [16]
>>> id(t1.tchn),id(t2.tchn),id(t3.tchn),id(t4.tchn) [17]
(8978912, 8978880, 8978880, 8978880)
>>> t4.age [18]
35
>>> t4.age = 36 [19]
>>> t4.age
36
>>> t1.Th_cnt() [20]
4 Teachers are on the rolls
A number of functions which are valid and used within the class are also
defined within the suite. Each such function is a method belonging to the
class.
A class is said to encapsulate the data and the methods applicable to it. The
variables and methods belonging to a class are its attributes. They are
accessed using the dot convention.
The method defined as __init__(self, nm, ag) [4] is a special method of the
class. It initializes the object created. In the specific case here it has three
arguments. nm and ag are the arguments supplied at the time of
object formation. self represents the object itself; Python supplies it automatically.
Although any name can be used for it, self is the widely used (and
accepted) name. The __init__() method definition can occur anywhere
within the class definition, but it is customary to keep it as the first method in
line with its uniqueness/significance.
Teacher.__init__() defines two variables, self.name and self.age,
and assigns the values nm [5] and ag [6] to them. Then the name and the
age of the object are printed out. Following this the teacher count (tchn) is
incremented by one [7].
The class Teacher has one more method defined within it: Th_cnt(self)
[8]. If called, the method prints out the number of teachers in a convenient
format.
complex, list, tuple, dict, etc. are all available classes in Python
representing built-in data types. A user-defined class creates a new data type, a
custom data type which can be used along with the available ones.
10.2.1 Instantiation
In line with [6] another variable t1.age is generated and ag (the integer value
31) is assigned to it. Here again t1 takes the place of self.
A string involving t1.name and t1.age is formed and printed out; the line
following [9] in the figure confirms this.
Teacher.tchn is incremented in [7]. This has two ramifications: access and
assignment through Teacher.tchn imply that a class variable is accessed
and incremented. Further, t1 being an instance of the class Teacher, the
variable is also visible as t1.tchn, which automatically shows the incremented value.
Three additional objects of class Teacher (t2, t3, and t4) are formed in [10],
[11], and [12] respectively, all similar to t1; each has an assigned name and age.
t1, t2, t3, and t4, the objects conforming to the same class Teacher,
are instances of the class Teacher. All the four objects
formed here are identical in structure and properties, characterized by the
variables and methods associated with them. All conform to the class Teacher.
t1.tchn in [13] is a variable (again an object) and represents the value of tchn
seen through the instance t1. It has been accessed in [13] and it can be seen to have
the value 4, being the total count of the objects of class Teacher at this stage. The values of
t1.tchn, t2.tchn, t3.tchn, and t4.tchn are accessed and reproduced in [14]. tchn is
a class variable. Whenever a new instance of Teacher is created, it is accessed
within __init__() through Teacher and updated. As such t1.tchn, t2.tchn,
t3.tchn, and t4.tchn are different names for the same object. This is evident
from [15] where the respective ids (=8978880) are accessed and displayed. t1.tchn
alone is assigned a new value in [16]. The assignment creates an attribute tchn in
the instance t1 itself, which hides the class variable for t1 only. It has become a different
variable as can be seen by comparing its new id (=8978912) with those of t2.tchn,
t3.tchn, and t4.tchn (=8978880) as done in [17]; Teacher.tchn itself is unchanged,
which is why [20] still reports 4. t4.name and t4.age are two
attributes specific to the instance t4. t4.age has been accessed [18] and changed
from 35, assigned during instantiation, to 36 [19]. It confirms that the instance
variables can be accessed and changed anytime (as is the case with the common class
attribute/variable tchn) provided the proper syntax is adhered to.
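The behaviour of t1.tchn after the assignment in [16] can be reproduced with a trimmed version of the class. In the sketch below the print statement of __init__() and the method Th_cnt() are left out for brevity; before and after are names introduced here for illustration.

```python
class Teacher:
    "Teacher information"
    tchn = 0                          # class variable shared by all instances

    def __init__(self, nm, ag):
        self.name = nm
        self.age = ag
        Teacher.tchn += 1             # updated through the class

t1 = Teacher('Rakesh', 31)
t2 = Teacher('Ramya', 24)
before = (t1.tchn, t2.tchn, Teacher.tchn)    # (2, 2, 2)

t1.tchn = 5                           # creates an instance attribute on t1 only
after = (t1.tchn, t2.tchn, Teacher.tchn)     # (5, 2, 2)

print('tchn' in t1.__dict__, 'tchn' in t2.__dict__)   # True False

del t1.tchn                           # remove the instance attribute
print(t1.tchn)                        # 2: the class variable is visible again
```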
Teacher has one more method defined within it: Th_cnt() [8]. It takes self as
its only argument and outputs a statement showing the number of objects of class
Teacher. t1.Th_cnt() in [20] executes this method and displays the total number
of teachers on the rolls. In general any method defined in a class has self as its first
argument; it stands for the instance on which it is called to operate.
Student is another class defined within the module school; its listing is
reproduced in Fig. 10.3a. It is similar to the Teacher class. It has three attributes: the
class attribute stdn, the method __init__(), and a second method, St_cnt().
Four students have been instantiated as s1, s2, s3, and s4 (in [2], [3], [4], and [5])
in Fig. 10.3b. The student count has been accessed subsequently, as s1.stdn in
[6], and confirmed as 4. The instance method s1.St_cnt() prints the current
value of the number of students on the rolls in a specified format. It has been
accessed and displayed in [7].
As mentioned earlier the variables and method definitions within a class are
known as attributes of the class. (In Python any quantity bb accessible as aa.bb is
(a)
class Student:                                     [1]
    "Student information"
    stdn = 0
    def __init__(self, nm, ag):
        self.name = nm
        self.age = ag
        print('New Student with name: {} & of age: {}'.format(self.name, self.age))  [2]
        Student.stdn += 1
    def St_cnt(self):                              [3]
(b)
>>> from demo_9 import school [1]
>>> s1 = school.Student('Maria', 18) [2]
New Student with name: Maria & of age: 18
>>> s2 = school.Student('Adarsh', 19) [3]
New Student with name: Adarsh & of age: 19
>>> s3 = school.Student(' Rana', 17) [4]
New Student with name: Rana & of age: 17
>>> s4 = school.Student('Latha', 20) [5]
New Student with name: Latha & of age: 20
>>> s1.stdn [6]
4
>>> s1.St_cnt() [7]
4 students are on the rolls
>>> school.Teacher.__doc__ [8]
'Teacher information'
>>> school.Teacher.__name__ [9]
'Teacher'
>>> school.Student.__name__ [10]
'Student'
>>> t1.__dict__ [11]
{'tchn': 5, 'name': 'Rakesh', 'age': 31}
>>> t1.__class__ [12]
<class 'demo_9.school.Teacher'>
an attribute of aa.) Variables stdn and tchn are attributes of the classes Student
and Teacher respectively. The methods St_cnt and Th_cnt in Student and
Teacher respectively are also attributes of the respective classes. t1.name and t1.age
are assigned within the instance t1. They are attributes of the instance t1; the same
is true of t2.name, t2.age, t3.name, t3.age, t4.name, and t4.age also.
The attributes mentioned above are acquired by the class or instance concerned
by virtue of the class definition. Apart from these every class has a set of built-in
attributes which store the basic information regarding the class. These are
meant to be read for information rather than altered. All these have the form __xx__.
As mentioned earlier __doc__ returns the docstring (short for documentation
string) of the class if it is present. Thus school.Teacher.__doc__ in [8] returns
the string 'Teacher information' which is the docstring of the class Teacher
([2] in Fig. 10.2). __name__ returns the name of the class as in [9] and [10]. The
attributes of an instance and their values are stored as a dictionary within it. The
same can be accessed to know the status of the instance. t1.__dict__ in [11] returns
all the attributes of t1 and the respective assigned values. __class__ returns the
class to which the instance belongs, along with the module in which it is defined [12].
A few more built-in attributes are available; they are discussed later.
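The built-in attributes discussed above can be tried out on a trimmed version of the Teacher class. In this sketch the class is defined in the running script rather than in the module school, and the counter and print statements are omitted.

```python
class Teacher:
    "Teacher information"

    def __init__(self, nm, ag):
        self.name = nm
        self.age = ag

t1 = Teacher('Rakesh', 31)

print(Teacher.__doc__)        # Teacher information
print(Teacher.__name__)       # Teacher
print(t1.__dict__)            # {'name': 'Rakesh', 'age': 31}
print(t1.__class__ is Teacher)  # True
```

Printing t1.__class__ itself shows the class together with the module in which it was defined, so the exact text depends on where the class lives.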
A user-defined variable within a class whose name starts with two or more
underscores (and does not end with two underscores) has a special status in Python.
It is protected in the sense that it cannot be accessed from outside the class under
that name for reading or modification: Python stores it under a mangled name of
the form _ClassName__name. The class Pupil in the
module school is reproduced in Fig. 10.4; it is similar to the class Student. An
instance of Pupil has a tag associated with it (in addition to the name and age). It is
__rating [2]. It is assigned a value as part of initialization, but __rating is not
visible outside. The __rating is 20 if the Pupil's name is 'Sandhya', else it is 10
(one way of rating pupils!). Two instances of class Pupil (p1 and p2) have been
created in the Python Interpreter sequence in Fig. 10.4 ([4] and [5]). 'Renu' [4] is
assigned a rating of 10 while 'Sandhya' [5] is assigned a rating of 20 as can be seen
from [6] and [7]. An attempt to access p1.__rating in [8] returns an attribute error,
confirming its inaccessibility. p1.__dict__ in [9] returns a dictionary with all the
attributes of p1 and their assigned values. __rating does not appear there under
that name; it is held under the mangled name _Pupil__rating.
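The effect of the two leading underscores can be seen in the following sketch. The class here is a simplified stand-in for Pupil in the module school: the ages are made-up values and the method rating() is a hypothetical accessor added for illustration.

```python
class Pupil:
    "Pupil information (simplified stand-in)"

    def __init__(self, nm, ag):
        self.name = nm
        self.age = ag
        # Stored by Python under the mangled name _Pupil__rating
        self.__rating = 20 if nm == 'Sandhya' else 10

    def rating(self):                 # hypothetical accessor
        return self.__rating          # inside the class the short name works

p1 = Pupil('Renu', 17)                # ages are illustrative assumptions
p2 = Pupil('Sandhya', 18)

try:
    p1.__rating                       # not found under this name from outside
    blocked = False
except AttributeError:
    blocked = True

print(blocked)                        # True
print(p1.rating(), p2.rating())       # 10 20
print(sorted(p1.__dict__))            # ['_Pupil__rating', 'age', 'name']
```

The mangled name is a safeguard against accidental clashes rather than a lock: p1._Pupil__rating is still reachable from outside.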
The method __str__() returns a printable string [3]. In the specific case here it
is a formatted string giving out the name and the roll number of the Pupil. print(p2)
directly prints out the string [10]. This is a convenient way of providing any
key information about the instance. p1.mm = 80 (marks obtained by Renu in
Maths) in [11] introduces a new attribute exclusively for the instance p1 and
assigns the value 80 to it. mm as an attribute has been added to p1 alone and
not to any other instance of the class Pupil. In turn p1.__dict__ is also updated
(transparent to the user). The same is confirmed in the following line where
p1.__dict__ has been accessed again and shown [12].
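A small sketch of __str__() and of an instance-specific attribute follows. The class is again a simplified stand-in: the attribute roll, the roll numbers, and the wording of the string are assumptions made here, since the listing of Pupil is not reproduced in this passage.

```python
class Pupil:
    def __init__(self, nm, roll):
        self.name = nm
        self.roll = roll              # hypothetical roll-number attribute

    def __str__(self):
        # Assumed format; print() and str() both use this method
        return 'Pupil {} with roll number {}'.format(self.name, self.roll)

p1 = Pupil('Renu', 1)
p2 = Pupil('Sandhya', 2)

text = str(p2)
print(p2)                             # Pupil Sandhya with roll number 2

p1.mm = 80                            # new attribute for p1 alone
print(p1.__dict__)                    # {'name': 'Renu', 'roll': 1, 'mm': 80}
print(hasattr(p2, 'mm'))              # False
```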
10.3 Functions with Attributes
Fig. 10.4 Python Interpreter sequence to illustrate the special status of instance variables with two
leading underscores in their names
A set of functions available with objects relate to their attributes directly.
getattr(O, n) returns the value of attribute n of the object O; the attribute name n
is given as a string. Here O can be an instance
of a class. Figure 10.5 is a continuation of the Python Interpreter sequence in
Fig. 10.4. getattr(p1, 'age') [1] returns the value of p1.age. In fact
getattr(p1, 'age') is essentially an alternative for p1.age. setattr(O, n, v) sets the
value of attribute O.n to v if n is already an attribute of O. If n is not an existing
attribute of O, such a new attribute is created and v assigned as its value.
setattr(p1, 'branch', 'EEE') [2] adds the attribute branch to the object p1
and assigns the value 'EEE' (string) to it. The same is confirmed in [3], which
outputs p1.branch. hasattr(O, n) checks for the presence of attribute n for
the object O. True or False is returned depending on whether the attribute is
present or not. The query hasattr(p1, 'branch') [4] returns True, confirming
the presence of such an attribute. But p2 does not possess this attribute, as can be
seen from [5]. delattr(p1, 'branch') in [6] deletes the specified attribute for the
instance p1. A fresh query hasattr(p1, 'branch') in [7] returns False,
confirming this.
All the above four attribute related functions allow attributes to be examined,
created, modified, or deleted as the case may be, for the selected object (instance of the class
here). Other instances of the class remain untouched.
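The four functions can be exercised together as in the following sketch. The small Pupil class used here is a hypothetical stand-in with assumed attribute values; only the built-in functions themselves are standard Python.

```python
class Pupil:
    """Hypothetical stand-in class with two instance attributes."""

    def __init__(self, name, age):
        self.name = name
        self.age = age


p1 = Pupil('Renu', 16)
p2 = Pupil('Sandhya', 17)

print(getattr(p1, 'age'))            # same as p1.age
setattr(p1, 'branch', 'EEE')         # creates p1.branch; p2 is untouched
print(p1.branch)
print(hasattr(p1, 'branch'), hasattr(p2, 'branch'))
delattr(p1, 'branch')                # removes the attribute from p1 again
print(hasattr(p1, 'branch'))
```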
10.4 pass: Place Holder
The pass statement in Python does not signify any operation. It is a placeholder when
a statement is mandatory but no code need be executed. A class whose body is only
pass is useful when attributes are to be assigned to its instances dynamically. Guru in school (reproduced in
Fig. 10.5) is a class without any executable code in it [8]. school.Guru.__doc__
[9] returns the docstring of Guru. g1 in [10] is an instance of Guru. A new
attribute g1.name is created and 'Uma' (string) assigned to it [11]. g2 is
another instance of Guru [12]. [13] shows that name is an attribute of g1, but g2 is
not endowed with this attribute [14]. However a subsequent allocation of a name to
g2 [15] is confirmed in [16]. This emphasizes the possibility of dynamic creation
and enhancement of individual instances of a class.
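A minimal sketch of the same idea follows. The docstring text and the name given to g2 are assumptions for illustration; the class in the module school may differ in detail.

```python
class Guru:
    "Teacher in the school"          # assumed docstring
    pass                             # no executable code in the class body


g1 = Guru()
g1.name = 'Uma'                      # attribute created for g1 alone
g2 = Guru()
print(Guru.__doc__)
print(hasattr(g1, 'name'), hasattr(g2, 'name'))

g2.name = 'Kala'                     # hypothetical name, added later to g2
print(g2.name)
```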
Two additional examples for classes are considered here to stress the variety
possible. The module padha has a class, Mishram, defined in it (Fig. 10.6). It
accepts two arguments, xx and yy, evaluates an assorted set of functions involving
them, and assigns the results to a tuple. From the nature of the functions here one
can see that both the arguments are to be numbers; the set x, y (=0.2, 4.2) has
been taken as the argument set [1] in the Python Interpreter sequence in Fig. 10.7
and a1 is instantiated as an object of class Mishram [2]. [3] confirms this. a1.xy
has been output in [4].
The instance a2 of Mishram in [5] has complex numbers as arguments. In turn the
elements of a2.xy in [6] are also complex. As with all operators, functions, etc., in
Python, the type of argument need not be specified separately. But the need to use
argument types consistent with the operations remains.
padha.Mishram1 (Fig. 10.6) is another class, with two string arguments, xx and
yy, as inputs. Its method opt() prints out a string involving two objects (x1 and x2). x1 is a
concatenated string; x2 is the greater of the two strings (vide Sect. 6.3). a3 in
Fig. 10.7 is an instance of padha.Mishram1 [8] with 'Vidya', 'Lavanya' and
class Mishram:
    "return assorted functions"
    def __init__(self, xx, yy):
        self.xy = (xx*xx, 1.0/xx, xx/yy, xx+yy)

class Mishram1:
    "Play with strings"
    def __init__(self, xx, yy):
        self.x1 = xx + '--' + yy
        self.x2 = xx if xx > yy else yy
    def opt(self):
        print('The combined string is: ', self.x1,
              ': The boss is:', self.x2)
Fig. 10.6 Definitions of two classes to illustrate the varieties possible in the methods
Fig. 10.7 Python Interpreter sequence to illustrate use of classes in Fig. 10.6
10.5 Overloading
vng_n [3] (Fig. 10.8) has been defined to give $\sum_i |x_i|$ as the output (the $l_1$ norm of
a vector). v1am ([4] in Fig. 10.9) is an instance of class vng_n. abs(v1am)
(1.2 + 2.3 + 3 + 4 + 5.7 = 16.2) [5] gives the value of the norm. vng_x [5] in
Fig. 10.8 accepts two numbers as bytes objects, converts them into the respective
integers to base 16 [6], and returns a complex number with these as its real and
imaginary parts. v2 ([6] in Fig. 10.9) is a pair of bytes objects (representing 171 and 205). v2_x [7]
is an instance of vng_x formed as vng_x(v2). complex(v2_x) [8] returns the corresponding
complex number.
The class vng_i [7] in Fig. 10.8 accepts a bytes object and an integer (the base, in
the range 2 to 36) as a tuple argument and returns the bytes object as an integer to the base
given by the second element of the tuple [8]. v3 [9] in Fig. 10.9 is a tuple
combining a bytes object (b'tmf01') and an integer (= 30). v3_i is an instance of
vng_i with v3 as its argument. int(v3_i) [10] is the corresponding integer value.
Its value is confirmed by direct evaluation in [11].
Fig. 10.9 Python Interpreter sequence to illustrate overloading with classes in Python (Single
argument)
v4 ([12] in Fig. 10.9) is a tuple of two bytes-type integers. Each has been
converted into the corresponding integer by instantiation in [13]; the pair has
again been converted into the corresponding complex number [14]. The
conversion is confirmed in [15] directly, using the int() function itself.
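Since Fig. 10.8 is not reproduced here, the following is only a sketch of how such single-argument overloading can be arranged through the special methods __abs__(), __complex__(), and __int__(). The class names and the sample vector (including its signs) are assumptions; they are not the listing from the figure.

```python
class VngN:
    """abs() of an instance returns the l1 norm of the stored vector."""

    def __init__(self, v):
        self.v = v

    def __abs__(self):
        return sum(abs(x) for x in self.v)


class VngX:
    """complex() of an instance builds a complex number from two hex values."""

    def __init__(self, pair):
        self.pair = pair

    def __complex__(self):
        re, im = (int(b, 16) for b in self.pair)   # bytes -> integer, base 16
        return complex(re, im)


class VngI:
    """int() of an instance converts (bytes, base) into an integer."""

    def __init__(self, arg):
        self.text, self.base = arg

    def __int__(self):
        return int(self.text, self.base)


v1am = VngN([1.2, -2.3, 3, -4, 5.7])
print(abs(v1am))
print(complex(VngX((b'ab', b'cd'))))
print(int(VngI((b'tmf01', 30))))
```

The built-in functions abs(), complex(), and int() look up these special methods on the class of their argument, which is what makes the overloading possible.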
The overloading discussed thus far pertains to operations with a single operand.
Similar overloading is possible with operations involving two operands also.
A set of examples is considered to illustrate their use. The program, vct2, and
the relevant Python Interpreter sequence are in Figs. 10.10 and 10.11 respectively.
Class vc2 in Fig. 10.10 [1] accepts the arguments x1, x2, and x3, a set of three
numbers being the components of a vector. They are assigned to the components c1, c2,
and c3. __add__(self, ott) is defined as a method which accepts two such
vectors, self and ott (signifying 'other'), as arguments [2]; the set of three
sums is returned as a vector. These two tasks together constitute the __add__()
method within vct2. In the Python Interpreter sequence in Fig. 10.11, dd (3.3,
4.4, 1.1) [1] and ee (-2.3, -3.1, 4.3) [2] represent two such vectors. ff = dd + ee
in [3] signifies vector addition as defined in __add__. The sum vector (1.0,
1.3000000000000003, 5.4) is returned in [3].
Multiplication, subtraction, true division, floor division, the mod operation, and
power are successively defined in a similar manner as __mul__, __sub__,
__truediv__, __floordiv__, __mod__, and __pow__ respectively in vct2. They
correspond to *, -, /, //, %, and ** respectively. The operations dd * ee,
dd - ee, ee/ff, ee//ff, ee%ff, and gg ** hh have been carried out conforming to
these (Fig. 10.11). The vector components of ee, ff, gg, and hh have been taken as
integers here. (gg ** hh in vector form as defined here does not make sense; it
has been done here more as an exercise to highlight the overloading of the
** operator.)
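A compact sketch of such a three-component vector class is given below. It is not the vct2 listing of Fig. 10.10; the class name, the helper _pairs(), and the __repr__() method are assumptions added for illustration, and only four of the operators are shown.

```python
class Vc2:
    """Three-component vector with component-wise operator overloading."""

    def __init__(self, x1, x2, x3):
        self.c1, self.c2, self.c3 = x1, x2, x3

    def _pairs(self, ott):
        # Pair up the corresponding components of self and ott (other).
        return zip((self.c1, self.c2, self.c3), (ott.c1, ott.c2, ott.c3))

    def __add__(self, ott):          # self + ott
        return Vc2(*(a + b for a, b in self._pairs(ott)))

    def __sub__(self, ott):          # self - ott
        return Vc2(*(a - b for a, b in self._pairs(ott)))

    def __mul__(self, ott):          # self * ott
        return Vc2(*(a * b for a, b in self._pairs(ott)))

    def __floordiv__(self, ott):     # self // ott
        return Vc2(*(a // b for a, b in self._pairs(ott)))

    def __repr__(self):
        return 'Vc2({}, {}, {})'.format(self.c1, self.c2, self.c3)


gg = Vc2(7, 9, 8)
hh = Vc2(2, 3, 4)
print(gg + hh, gg - hh, gg * hh, gg // hh)
```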
Methods of the form __XYZ__ are predefined in Python (special methods).
In a class being created, any of these special methods can be defined to suit the
context. Conversely one need not define a new method like __XYZ__ but can use one
already available with the necessary definition. All the overloading examples discussed above
are of this category.
Fig. 10.11 Python Interpreter sequence to illustrate overloading with classes in Python (Two
arguments)
The concepts of classes, their instances, and objects are all basic to Python.
Assignments like x = 4, y = x**2, z = 'Sat', zz = 'Gamaya' automatically take x, y,
z, and zz as objects. All operations which make sense and are valid are interpreted suitably.
The following (typical) operations and their interpretations are to be viewed in this
light.
xx x x
x2 x 2
xy = 4 + y = 4 + x**2 = 20
xsx = z * 4 = 'SatSatSatSat'
xpzz = 'Sat' + ' ' + 'Gamaya' = 'Sat Gamaya'
Python automatically adapts and interprets operations +, *, and the like to suit
the context (overloading when necessary). But attempts to use the operators as
below (subtracting, dividing, or multiplying strings) do not make sense. Hence they
are not valid (in Python) either.
zmzz z zz
zdzz = z/zz
xsz 4 z
xdzz = x/zz
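The valid and invalid combinations above can be checked directly, as in this short sketch. The values follow the assignments in the text; the dictionary of attempted operations is only an illustrative device.

```python
x, z, zz = 4, 'Sat', 'Gamaya'
y = x ** 2

print(4 + y)               # integer addition
print(z * x)               # a string repeated x times
print(z + ' ' + zz)        # string concatenation

# Operations that are not defined for strings raise TypeError.
attempts = {
    'z - zz': lambda: z - zz,
    'z / zz': lambda: z / zz,
    'x / zz': lambda: x / zz,
}
failed = []
for label, operation in attempts.items():
    try:
        operation()
    except TypeError:
        failed.append(label)
print('Not valid:', failed)
```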
10.6 Inheritance
Inheritance is an important and useful feature that goes with classes. It pertains to a
class (child) having another class as an argument (parent); the child class
implicitly inherits the attributes of the parent class. This obviates the need for
redefining/assigning values for these attributes.
The listing of class Admn_a(Student) is reproduced in Fig. 10.12. It defines a
class Admn_a, which has the class Student (listed in Fig. 10.3) as its sole
argument. Such a class definition implies that Admn_a inherits class
Student. With this the attributes of Student (variables as well as methods)
become accessible from within Admn_a. The suite of class Admn_a in Fig. 10.12
(from module school) and the related Python Interpreter sequence in Fig. 10.13
bring out the key features of giving shape to inheritance and using it. Admn_a
has been set to keep track of details of Student instances and the branches of study
allotted to them. 'Administrator of school' is the docstring of Admn_a. rgstr1
carries the basic Student branch registration details in dictionary form. 'CE',
'EE', and 'ME', the three designated branches, are the keys; the associated values
(integers) represent the number of students allotted to each branch. With
Student as the base class (also known as parent class or superclass), Admn_a is a
derived class (also known as child class or subclass). def __init__(self, nm1, ag1,
branch1) [3] relates to the single argument class, Student, which is input to class
Admn_a. The first two arguments, nm1 and ag1, are assigned to the (instantiated)
Student as name and age in [4]. The third argument, branch1, is allotted as
the third attribute of the (instantiated) Student [5]. Such an assignment to a base class, in
addition to what was done within Student itself, is possible with derived classes.
With this a Student (base class) instantiated through Admn_a (derived class) has
all the three specified attributes attached to it. The register carrying details of the branches
allotted (rgstr1) is updated with the branch allocation in [6]. Details of the updated
branch alone are returned [7] as a string.
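The scheme described above can be sketched as follows. This is not the listing of Fig. 10.12: the class name AdmnA, the attribute name branch, the wording of the returned string, and the sample student are assumptions made for illustration.

```python
class Student:
    "Student information"

    def __init__(self, nm, ag):
        self.name = nm
        self.age = ag


class AdmnA(Student):
    "Administrator of school"
    rgstr1 = {'CE': 0, 'EE': 0, 'ME': 0}     # students allotted per branch

    def __init__(self, nm1, ag1, branch1):
        Student.__init__(self, nm1, ag1)     # name and age set by the parent
        self.branch = branch1                # attribute added by the child
        AdmnA.rgstr1[branch1] += 1           # update the branch register

    def __str__(self):
        return '{} branch has {} student(s)'.format(
            self.branch, AdmnA.rgstr1[self.branch])


st1 = AdmnA('Renu', 18, 'EE')
print(st1.name, st1.age, st1.branch)
print(st1)
print(isinstance(st1, Student))
```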
school has been imported; as mentioned earlier, classes Student and Admn_a
are in it. st1 is formed as an instance of Admn_a [1] in Fig. 10.13. One student
Multiple inheritances have many dimensions and raise many issues. We shall get
into these in depth through a series of examples.
class Student:
    "Student information"
    stdn = 0
    def __init__(self, nm, ag):
        self.name = nm
        self.age = ag
        print('New Student with name: {} & of age: {}'.format(self.name, self.age))
        Student.stdn += 1
    def St_cnt(self):
        "Give number of students on roll"
        if Student.stdn > 1:
            print('{} students are on the rolls'.format(Student.stdn))
        elif Student.stdn == 1:
            print('There is only one student on the rolls')
        else:
            print('There is no student on the rolls')
    #__str__() newly added
    def __str__(self):
        return "{} of age {} has registered as a student".format(self.name, self.age)
        self.nt, self.at, self.bt, self.ns, self.as1, self.bs = nmt, agt, brt, nms, ags, brs
        Student.__init__(self, self.ns, self.as1)  [9]
        Teacher.__init__(self, self.nt, self.at)  [10]
        Admn_b.rgstrS[self.bs] += 1  [11]
        Admn_b.rgstrT[self.bt] += 1  [12]
    def __str__(self):  [13]
        # return Teacher.__str__(self) + ';' + Student.__str__(self)
        return ("{} branch has {} students after {}'s registration & "
                "{} branch has {} Teachers after {}'s registration ").format(
                    self.bs, Admn_b.rgstrS[self.bs], self.ns,
                    self.bt, Admn_b.rgstrT[self.bt], self.nt)
instantiated [4] in Fig. 10.16 as stte; the printouts with parent implementations
(Student and Teacher) are in [6] and [7]. The __str__() in Admn_b has been
output in [8]. The updated status of rgstrS and rgstrT can be seen here. Once again the
definitions and assignments can be seen to be straightforward and applicable to cases
with more than one parent class as long as the structure of the classes remains similar.
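A self-contained sketch of a child class with two parents, each initialized by an explicit call, is given below. The class name AdmnB, the order of its six arguments, and the sample values are assumptions; the fragment of Admn_b reproduced above may differ in these details.

```python
class Student:
    stdn = 0

    def __init__(self, nm, ag):
        self.name, self.age = nm, ag
        Student.stdn += 1


class Teacher:
    tchn = 0

    def __init__(self, nm, ag):
        self.name, self.age = nm, ag
        Teacher.tchn += 1


class AdmnB(Student, Teacher):
    rgstrS = {'CE': 0, 'EE': 0, 'ME': 0}     # students per branch
    rgstrT = {'CE': 0, 'EE': 0, 'ME': 0}     # teachers per branch

    def __init__(self, nms, ags, brs, nmt, agt, brt):
        self.ns, self.as1, self.bs = nms, ags, brs
        self.nt, self.at, self.bt = nmt, agt, brt
        Student.__init__(self, nms, ags)     # explicit call to each parent
        Teacher.__init__(self, nmt, agt)     # overwrites self.name, self.age
        AdmnB.rgstrS[brs] += 1
        AdmnB.rgstrT[brt] += 1


stte = AdmnB('Renu', 18, 'EE', 'Uma', 40, 'ME')
print(AdmnB.rgstrS, AdmnB.rgstrT)
print(Student.stdn, Teacher.tchn, stte.name)
```

Because both parents assign self.name and self.age, the child keeps its own copies (ns, nt, and so on) to retain both sets of details.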
10.7 super()
(a)
>>> class AA(): [1]
...     def sat(self): print('AA is the God')  [2]
...
>>> pa = AA()  [3]
>>> pa.sat()  [4]
AA is the God
>>> class CA(AA):  [5]
...     def sat(self):  [6]
...         print('CC is the true God')
...         super().sat()  [7]
...
>>> pac = CA()  [8]
>>> pac.sat()  [9]
CC is the true God
AA is the God
>>> class BB():  [10]
...     def sat(self):
...         print('BB is the real God')
...         super().sat()  [11]
...
>>> class CC(BB):  [12]
...     def sat(self):  [13]
...         print('CC is the true God')
...         super().sat()  [14]
...
>>> class DD(CC, AA):  [15]
...     def sat(self):  [16]
...         print('DD is the supreme God')
...         super().sat()  [17]
...
>>> pp = DD()  [18]
>>> pp.sat()  [19]
DD is the supreme God
CC is the true God
BB is the real God
AA is the God
>>> DD.__mro__  [20]
(<class '__main__.DD'>, <class '__main__.CC'>, <class
'__main__.BB'>, <class '__main__.AA'>, <class 'object'>)
Fig. 10.17 a Python Interpreter illustrating the basic ideas of function super() (continued in
Fig. 10.17b) b Python Interpreter illustrating the basic ideas of function super() (continued from
Fig. 10.17a)
sat() has super() preceding the print() [23], [25], [27]. As a result the control is
transferred from Dd to Cc to Bb to Aa. After the print out in Aa control reverts to
Bb and sat() execution is continued and completed there; similarly from Bb
(b)
>>> class Aa(): [21]
...     def sat(self): print('Aa is the God')
...
>>> class Bb():  [22]
...     def sat(self):  [23]
...         super().sat()
...         print('Bb is the real God')
...
>>> class Cc(Bb):  [24]
...     def sat(self):  [25]
...         super().sat()
...         print('Cc is the true God')
...
>>> class Dd(Cc, Aa):  [26]
...     def sat(self):  [27]
...         super().sat()
...         print('Dd is the supreme God')
...
>>> pd = Dd()  [28]
>>> pd.sat()  [29]
Aa is the God
Bb is the real God
Cc is the true God
Dd is the supreme God
>>>
Fig. 10.19 A Python Interpreter sequence illustrating some aspects of multiple inheritance
Fig. 10.20 A Python Interpreter sequence illustrating the variety possible with multiple
inheritance
the same. The instantiation as dd and execution of dd.sat() yield the same results
as earlier. However the linkage through __init__() shows the path for
generalization for cases with multiple parents and more than one method, inherited in
different ways.
The Python Interpreter sequence in Fig. 10.20 has a set of classes (A2, B2, C2,
and D2) with two methods, sat() and asat(), linked in different ways. The
structure and linking of the classes are similar to the case considered above. dd [11]
is an instance of D2. The mro for dd is similar to that in Fig. 10.17 [20]; it is
reproduced below:
(<class '__main__.D2'>, <class '__main__.C2'>, <class '__main__.B2'>,
<class '__main__.A2'>, <class 'object'>)
The method sat() in D2 [9] continues through C2 [7] and B2 to A2 [2]. dd.sat()
[12] outputs the chained execution of sat() conforming to the mro. Since sat() is not
defined in B2, it is bypassed and the search continued to A2 [2]. Similarly dd.asat()
[11], [5], [3] bypasses C2 since asat() is not defined in C2. The logical execution of
dd.asat() continues up to (and terminates in) A2. The presence of def asat(self): pass
[3] in A2 ensures its completion; A2 has no executable statement in its
asat(). If def asat(self): pass is omitted in A2, the chained execution of asat()
conforming to the mro cannot be completed. The example brings out two more aspects
of super().
The class chain should provide for logical completion of the execution of the
chained methods.
If an intermediate class in the mro chain does not have a chained function
defined in it, the same will be bypassed in the execution chain.
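Both observations can be seen in the sketch below, which mimics the structure described for Fig. 10.20. The method bodies are assumptions (the figure is not reproduced here); a list named trace records the order of execution instead of printing inside each method.

```python
trace = []


class A2:
    def sat(self):
        trace.append('A2.sat')

    def asat(self):
        pass                         # terminates the asat() chain


class B2:
    def asat(self):                  # B2 defines asat() but not sat()
        trace.append('B2.asat')
        super().asat()


class C2(B2):
    def sat(self):                   # C2 defines sat() but not asat()
        trace.append('C2.sat')
        super().sat()


class D2(C2, A2):
    def sat(self):
        trace.append('D2.sat')
        super().sat()

    def asat(self):
        trace.append('D2.asat')
        super().asat()


dd = D2()
dd.sat()                             # D2 -> C2 -> (B2 bypassed) -> A2
dd.asat()                            # D2 -> (C2 bypassed) -> B2 -> A2
print([c.__name__ for c in D2.__mro__])
print(trace)
```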
We revisit the example with Admn(Student, Teacher) with the use of the super()
function. The module School_c with the classes Student, Teacher, and
Admn_sp(Student, Teacher) is reproduced in Fig. 10.21. All the three classes
have been curtailed in scope to focus only on inheritance where the parents are to be
supplied arguments in the desired order. When Admn_sp(Student, Teacher)
[12] is instantiated, the arguments have to be supplied in the order in which the
chain consumes them. def __init__(self, brs, brt, *Arg) [13] accepts brs and brt (Student
branch and Teacher branch) and assigns them to self.bs and self.bt [14]; the
rest of the arguments supplied are passed on to the next in line in the mro through
super().__init__(*Arg) [15]. Thus due to def __init__(self, nms, ags,
*Arg) [2], Student [1] accepts the next two arguments as nms and ags (name and
age of the Student respectively); they are assigned to self.name [3] and
self.age [4] (of the student) respectively. Once again the rest of the arguments are passed
on to the next in the mro through super().__init__(*Arg) [7]. Teacher (next in
the mro) accepts the rest of the arguments as nmt and agt [10] (name and age of the
Teacher respectively). This completes the __init__() chain. def __str__(self)
[16] in class Admn_sp passes control to the def __str__(self) [8] in Student;
super().__str__() [17] ensures this. The super().__str__() in Student [9] in turn passes
control to the def __str__(self) [11] in Teacher. The string formed there is
concatenated with the string in Student (self.stt); the combined string [9] is
returned through the child class Admn_sp [17].
Additionally the method nunt(self) [18] in class Admn_sp returns details of
the new entrants to the School as a string.
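The passing of arguments along the mro can be sketched as below. This is a simplified stand-in for School_c, not its listing: the class name AdmnSp, the attribute names sname, sage, tname, and tage, and the string formats are assumptions chosen so that the two parents do not overwrite each other's attributes.

```python
class Student:
    def __init__(self, nms, ags, *args):
        self.sname, self.sage = nms, ags
        super().__init__(*args)              # remaining arguments go up the mro

    def __str__(self):
        return ('{} ({}) is a student; '.format(self.sname, self.sage)
                + super().__str__())         # continues in Teacher


class Teacher:
    def __init__(self, nmt, agt):
        self.tname, self.tage = nmt, agt

    def __str__(self):
        return '{} ({}) is a teacher'.format(self.tname, self.tage)


class AdmnSp(Student, Teacher):
    def __init__(self, brs, brt, *args):
        self.bs, self.bt = brs, brt
        super().__init__(*args)              # next in the mro is Student

    def __str__(self):
        return super().__str__()


sta = AdmnSp('EE', 'ME', 'Renu', 18, 'Uma', 40)
print([c.__name__ for c in AdmnSp.__mro__])
print(sta)
```

Each __init__() takes the arguments it needs from the front of the list and hands the rest on, which is why the order of arguments at instantiation has to follow the mro.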
School_c has been imported in [1] in the Python Interpreter sequence in
Fig. 10.22. The child class Admn_sp() has been instantiated as sta in [2]. The
number of arguments, their types, and their sequence match the requirements
conforming to the mro (Admn_sp, Student, Teacher, in that order). The outputs in [3]
and [4] are the printouts demanded at instantiation of Student and Teacher
respectively. Details of the new entrants as the string sta.nunt() are in [5]. The
string returned by the __str__() chain for sta is in [6].
        self.name = nmt
        self.age = agt
        Teacher.tchn += 1
        print('New teacher with name: {} & of age: {}'.format(self.name, self.age))
    def __str__(self):  [11]
        return "{} of age {} has registered as a teacher".format(self.name, self.age)
Fig. 10.22 The Python Interpreter sequence instantiating the School in Fig. 10.21
In all cases the mro is central to the linking of arguments, methods, etc. Two
additional observations on the use of super() are in order here:
The number, sequence, and types of arguments used in instantiation should
match the requirements.
As many (diverse) parents as required for the inheritance scheme can be
accommodated in the chain.
The discussions and illustrations of Python program execution have been carried
out so far in the interactive mode. Here Python functions as an interpreted language
wherein each statement is executed and the system returns the prompt in the next
line of input. Alternatively, ready-made scripts (whole programs) can be run directly
from the command line without opening the Python interpreter. This mode and its
salient features are illustrated here.
term_tst_a.py is a simple Python module reproduced in Fig. 10.23. aa and bb
are assigned values [1] and their ratio printed out [2]. Further, a function prd(a1,
b1) with two arguments a1 and b1 is defined [3]; it prints out the product a1 * b1.
Fig. 10.24 Execution of Program in Fig. 10.23 in command line as well as in interpreter mode
after opening python environment
Fig. 10.25 A routine to demonstrate the use of __main__ to identify and use the execution
environment
Since prd(a1, b1) has been dened, calling it with arguments a0 and b0 (=2.1 and
3.2 respectively) as term_tst_a.prd(a0, b0) leads to its execution [7].
term_tst.py in Fig. 10.25 is another routine, written to illustrate a different
dimension of command line execution. As mentioned earlier, every
entity in Python (function, module, class, and the like) has a name associated
with it, available as its __name__ attribute. Further, during execution the
active execution environment is identified by assigning it the name '__main__'. The
module term_tst.py brings out a possible use of this. The compound statement
starting with [2] in it has two parts. If the module is invoked for execution directly,
__name__ == '__main__' is True and aa + bb is printed out [3]; else (that is, when
the module is imported by another program) the clause in [4] (print('Difference of
aa & bb = ', aa - bb)) is executed. The Python environment is closed and the
terminal started afresh in [8] in Fig. 10.24. term_tst.py is executed directly from the
open terminal [9].
if __name__ == '__main__': in [2] (Fig. 10.25) being True, the main clause
following (print('Sum of aa & bb = ', aa + bb)) is executed here, as can be seen
in [10] in Fig. 10.24. Following this, python3.5 is again opened in [11] and, in the
Python environment, the module term_tst imported [12]. (Note the need to retain the .py
extension for direct execution but not for importing.) [13] is the output resulting
from the execution of the zero indent level statements. Reverting to Fig. 10.25,
term_tst is not the execution environment here. Hence the else: print('Difference
of aa and bb = ', aa - bb) is executed [4]; [13] in Fig. 10.24 confirms this. Further,
the function prd() is defined as in [5] in Fig. 10.25; calling it as term_tst.prd(a0,
b0) [15] (in Fig. 10.24) with arguments (a0 = 2.1 and b0 = 3.2) results in its due
execution [16].
[17] is an illustration of another use of if __name__ == '__main__'. Since
the program execution environment is assigned the name '__main__', the set of
executable statements following [17] is parsed as a block, and the block is executed only when you
come out of it as in [20]. In contrast, each of the statements here, [18] to [19],
would normally have been executed one after another in the interpreter mode.
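A script along the lines described above could look like the following sketch. It is not the listing of Fig. 10.25: the file name in the docstring, the values of aa and bb, and the fact that prd() returns (rather than prints) the product are assumptions.

```python
"""term_tst_sketch.py: a hypothetical stand-in for the module in the text."""

aa, bb = 4.2, 2.1                    # assumed values


def prd(a1, b1):
    """Return the product of the two arguments."""
    return a1 * b1


if __name__ == '__main__':
    # Run directly, for example with: python3 term_tst_sketch.py
    print('Sum of aa & bb = ', aa + bb)
else:
    # Reached when the file is imported as a module by another program.
    print('Difference of aa & bb = ', aa - bb)
```

When the file is run from the command line only the first clause is executed; after an import, the second clause runs once and prd() remains available as an attribute of the imported module.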
10.9 Exercises
1. Materials Management: Efficient sourcing of all items required for
production in a manufacturing organization is the task of a Materials Manager. Let
us understand the role through a tangible example, that of the manufacture of an
electric fan. Insulated wire, stampings, insulating varnish, fins, capacitor,
connecting wires (harness), paint, and hardware items like shaft and casing castings
are the major items to be sourced. The manager's tasks are the
following:
Have an idea of the production schedule as a weekly production rate.
Ensure timely availability of all materials for uninterrupted production. For each
item he should have at least two vendors (to prevent supplier monopoly).
He should not stock too many numbers of an item; it can lead to inefficient use
of working capital. But he has to match supply to the production rate. This is to be
done for each item to minimize blocking of working capital.
Do vendor development when the situation demands.
Have a clear index of performance and try to optimize it continuously.
Have a provision to update the index of performance regularly/when required.
Develop a program for materials management. Materials manager, items in
stock, and item_used can possibly be the classes. Each item can be an instance of the
class item_used. Define inputs to classes and class functions. Identify
inheritance sequences and incorporate all these in the program.
Reference
van Rossum G, Drake FL Jr (2014) The Python library reference. Python Software Foundation
Chapter 11
Time Related Operations
The time related modules in Python have provision to express and convert
time/time intervals involving two common time Standards: UTC (ITU 2002) and
ISO 8601. Both of these are briefly discussed here to facilitate understanding of the
relevant provisions in Python.
The Coordinated Universal Time (acronym UTC) is the Standard which defines
times and time zones, and relates them. UTC is a refined version of GMT; it uses
the atomic clock as the basis. The proleptic Gregorian calendar is the basis for UTC.
00:00 h on the first of January 1970 at 0° longitude, called the epoch, is taken
as the starting point for time as well as calendar. Each year is of 365 days (366 days
in leap years), each day is of 24 h, each hour is of 60 min, and each minute is of
60 s. The time is adjusted by a maximum of one second at irregular intervals
(agreed by the competent International community) to account for the slowing
down of Earth. The Standard times in different countries are all tied to the UTC time
with offsets ranging from -12 h to +14 h in intervals of 30 min (occasionally
15 min). Network protocols, World Wide Web Standards, and the like use the UTC
system as the basis.
ISO 8601 specifies formats for representing time, date, and related information.
This is recommended for information exchange. The time related modules/classes
in Python have the provision to accommodate these in different contexts. The
salient features of the representation are discussed here.
Full UTC time representation: 2016-02-25T19:23.5Z (19 h 23.5 min UTC on the
25th of February 2016; Z signifies UTC time)
Time alone representation - alternatives: 19:23:37.123Z, 19:23:37Z, 19:23Z, 19Z;
with an offset from UTC: 13:53:37.123-05:30, 13:53:37-05:30, 13:53-05:30, 13-05:30
Date alone representation - alternatives: 2016-02-25, 2016-02
Year alone representation: 2016
Other figure labels: Date, No. of hours, No. of days, No. of years
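Representations of this kind can be produced with the datetime module (taken up later in this chapter). The sketch below is only an illustration; the instant chosen and the offset of -05:30 are taken from the examples above, and the timespec argument of isoformat() needs Python 3.6 or later.

```python
from datetime import datetime, timedelta, timezone

# 19:23:37.123 UTC on the 25th of February 2016
t = datetime(2016, 2, 25, 19, 23, 37, 123000, tzinfo=timezone.utc)

print(t.isoformat(timespec='milliseconds'))   # date and time with UTC offset
print(t.date().isoformat())                   # date alone
print(t.strftime('%Y-%m'))                    # year and month alone

# The same instant expressed with an offset of -05:30 from UTC
local = t.astimezone(timezone(timedelta(hours=-5, minutes=-30)))
print(local.isoformat(timespec='seconds'))
```

Note that isoformat() writes UTC as +00:00 rather than as the suffix Z.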
The module time is imported in the Python Interpreter sequence in Fig. 11.3. With
t1 = time.time() in [1], t1 is the total time elapsed since the epoch. It is in seconds, as a
floating point number with 1 µs precision. time.gmtime(t1) converts t1 into a
format in terms of year, month, day, hours, minutes, and seconds (items 0 to 5 in the
same order) [2]. It also includes the day of the week (with Monday being assigned
0) and the serial number of the day in the year (items 6 and 7). The last field is the
flag for daylight saving time whenever it is implemented. The indices, their
significance, and range are summarized in Table 11.1. The flag is 0 in the
absence of daylight saving time in the zone. This is referred to as the struct_time
representation. struct_time is a tuple of nine fields and any of these
can be accessed by proper indexing [3]. time.gmtime() without any argument
returns the current UTC time in the same 9-field format [4]. time.localtime(t1)
converts t1 into the 9-field format as local time. The difference between
time.gmtime(t1) and time.localtime(t1) is to be clearly understood. In the specific
case here the local time is 5 h 30 min ahead of UTC, which accounts for the
difference between the two.
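The nine fields can be examined as in the following sketch. The epoch itself (0 s) is used as an additional argument here only because its UTC representation is known beforehand; the printed local time depends on the machine on which the code is run.

```python
import time

t1 = time.time()                 # seconds since the epoch (float)
tg = time.gmtime(t1)             # the same instant as UTC, 9-field struct_time
tl = time.localtime(t1)          # the same instant as local time

print(len(tg), tg[0] == tg.tm_year)     # fields by index or by name
print(tg.tm_hour, tl.tm_hour)           # differ by the local offset (if any)

t0 = time.gmtime(0)              # the epoch itself
print(t0[:6])                    # year, month, day, hours, minutes, seconds
print(t0.tm_wday, t0.tm_yday, t0.tm_isdst)
```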
time.strftime(format, t) [5] accepts the 9-field time (struct_time) t as a
tuple and returns the corresponding time as a string in the specified compact format. The
formatting details conform to those in Table 11.2. Methods are available to convert
time amongst these three formats (seconds from epoch, struct_time, and the
compact format). These are summarized in Table 11.3. Their uses are illustrated in
the Python Interpreter sequence in Fig. 11.3. In all these cases the time argument
Fig. 11.3 Python Interpreter sequence illustrating the use of features in the time module
11.2 time Module 269
can be input in the desired format. In its absence the current time is used as the
default argument.
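A round trip between struct_time and the compact string format can be sketched as below. The format string chosen is only one possibility; the epoch is used as the sample time so that the result is the same on any machine.

```python
import time

fmt = '%Y-%m-%d %H:%M:%S'            # one possible compact format
tg = time.gmtime(0)                  # struct_time for the epoch (UTC)

s = time.strftime(fmt, tg)           # struct_time -> string
back = time.strptime(s, fmt)         # string -> struct_time
print(s)
print(back[:6] == tg[:6], back.tm_wday, back.tm_yday)

print(time.strftime(fmt))            # no time argument: current local time
```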
time.ctime(aa) accepts an argument aa, representing a time span from the
epoch, the unit of time being seconds. It is converted into a string in local time
and returned, as can be seen from [6]. The time t1 in [1] is converted and displayed
here. The string is in the form shown in Fig. 11.4. This string is
converted into the time.struct_time form (of 9 or fewer fields) by
time.strptime(string, format) [7]. time.asctime(tg) converts the struct_time that tg
represents to the format in Fig. 11.4 [8]. time.asctime() without an argument
returns the current time represented by time.localtime() in the same format.
time.mktime() takes the 9-field struct_time as argument and returns the
corresponding epoch time in seconds [9]. calendar.timegm(t1) accepts the
9-field struct_time as argument, treats it as gmtime(), and returns the
corresponding UTC time in seconds [10]. (timegm() is from the module calendar,
which has been imported here for the specific purpose of invoking this method;
calendar is discussed separately later.) time.mktime(tdb) treats tdb as the local
time tuple while calendar.timegm(tdb) treats it as the UTC time tuple.
For Coimbatore (India) the difference is 19 800 s, showing the local time to be
5 h 30 min ahead of UTC. In fact the attribute time.timezone returns the difference
between UTC time and local time: it is the offset of local time with
respect to UTC in seconds, with the offset to the West taken as positive. For India it is
negative (-19 800), the local time being 5 h 30 min ahead of UTC [11]. time.tzname
is the name of the time zone as a tuple. The first of these is the name of the local
non-DST time zone and the second one that of the local DST (Daylight Saving Time)
time zone. The latter may be ignored if not specified. It is not applicable
for India; the time zone name for Indian time is IST (Indian Standard Time) [13].
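The difference between the two conversions can be seen with a short sketch. The epoch value used (an instant in February 2016) is an arbitrary choice; the printed difference and zone names depend on the time zone of the machine.

```python
import calendar
import time

te = 1456428217                      # an arbitrary instant in February 2016
tdb = time.gmtime(te)                # 9-field tuple for that instant in UTC

utc_seconds = calendar.timegm(tdb)   # tdb treated as a UTC time tuple
local_seconds = time.mktime(tdb)     # tdb treated as a local time tuple

print(utc_seconds == te)
print(local_seconds - utc_seconds)   # local offset in seconds, West positive
print(time.timezone, time.tzname)
print(time.ctime(te))                # the instant as a local time string
```

On a machine set to Indian Standard Time the printed difference and time.timezone are both -19800.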
The functions time.monotonic(), time.perf_counter(), and
time.process_time() represent the system time with clear differences; all of them
Table 11.3 Time and date conversion possibilities with time module
Method              Arguments                    Returned quantity
.gmtime()           Nil                          Current UTC time in 9-field format
                    Epoch time in seconds, te    te as UTC time in 9-field format
.localtime()        Nil                          Local time in 9-field format
                    Time in seconds, ts          ts as local time in 9-field format
.strftime()         Format, 9-field tuple        Date-time string in specified format
.strptime()         String, format               Time in 9-field format
.asctime()          Nil                          Local time in specified format
                    Time as 9-field tuple        Time in specified format
.ctime()            Nil                          Current time returned in specified format
                    Epoch time in seconds, te    Local time returned in specified format
.mktime()           Time as 9-field tuple        Epoch time in seconds
calendar.timegm()   Time as 9-field tuple        Epoch time in seconds
[Figure: format of the time string; labelled fields: day of the week and month (three characters each), date, seconds]
Fig. 11.5 Python Interpreter sequence illustrating the use of features in the time module
can be used to ascertain and compare the time of execution of different routines.
The function aabb(a, b) [3] is to illustrate such an application. The arguments
a and b here are integers. kk is repeatedly incremented from 0 to b - 1, a times.
This defines the function. Its time of execution should be a * b times the time to
increment; overheads will increase it marginally. t1a and t2a are the values of
time.perf_counter() and time.process_time() at the start of execution of
aabb() [4]; t1b and t2b are their respective values at the end [6]. All these four
quantities are returned by the function [7]. After execution of the main routine,
time.sleep(1) [5] adds an additional idling second to the function. aabb() is
Fig. 11.6 Python Interpreter sequence displaying details of times in time module (continued
from Fig. 11.14)
called with a = 1000 and b = 1000 and the returned tuple (of four numbers)
assigned to cc [8]. cc[3] - cc[2] is the increase in the processing time, representing
the execution duration. cc[1] - cc[0] is essentially 1 s more than the specific process
time represented by cc[3] - cc[2]. The program mainly involves $10^6$ increments to a
number. With 64 727 088 ns (cc[3] - cc[2]) as execution time, each increment
takes 65 ns (approx.). The basic speed of the processor is 3 GHz (about 0.33 ns as basic
clock time), so each increment, with the interpreter overheads, takes about 200
processor clock periods.
The routine is executed again with a = 500 and b = 1000. The corresponding
execution time (to do 500 000 increments) is 35 833 471 ns, approximately half of the
time for the last case, as is to be expected.
process_time() can be used to compare the speed of performance of
different algorithms. When multiple processes are involved, process_time()
and perf_counter() can be put to similar use at a different level.
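A sketch of such a timing routine is given below. It follows the description of aabb() in the text but is not the listing from the figure: the order of the returned values, the smaller loop counts, and the shorter sleep (0.2 s instead of 1 s) are assumptions made to keep the run brief.

```python
import time


def aabb(a, b):
    """Increment kk from 0 to b - 1, a times over, and time the work."""
    t1a, t2a = time.perf_counter(), time.process_time()
    for _ in range(a):
        kk = 0
        while kk < b - 1:
            kk += 1
    time.sleep(0.2)                  # idling: counted by perf_counter() only
    t1b, t2b = time.perf_counter(), time.process_time()
    return t1a, t1b, t2a, t2b


cc = aabb(200, 1000)
elapsed = cc[1] - cc[0]              # wall-clock duration, sleep included
cpu = cc[3] - cc[2]                  # processor time used by this process
print('elapsed: {:.4f} s, process time: {:.4f} s'.format(elapsed, cpu))
```

The gap between the two durations is essentially the sleeping time, since process_time() does not advance while the process idles.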
The implementation details of individual clocks in the time module can be
obtained by invoking the function time.get_clock_info(name_of_clock).
The details for the five clocks discussed earlier are obtained and displayed in the
Python Interpreter sequence in Fig. 11.6 [1]. time.altzone stands for the offset
(in seconds) of the local DST time zone from UTC, with offsets to the West taken as
positive. India, where this PC is run, is
ahead of UTC by 5 h 30 min. Correspondingly time.altzone is -19 800 in
India [2].
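The clock details can be listed as in this sketch. The four clock names used are those accepted by current Python versions; the set shown in Fig. 11.6 may differ, and the printed details vary from one platform to another.

```python
import time

for name in ('time', 'monotonic', 'perf_counter', 'process_time'):
    info = time.get_clock_info(name)
    # implementation: underlying OS call; resolution: in seconds
    print(name, info.implementation, info.monotonic,
          info.adjustable, info.resolution)

print(time.timezone, time.altzone)   # offsets in seconds, West positive
```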
The datetime module has a set of classes and constants (attributes) defined in it.
These are useful to define specific dates, times, time intervals, and the like. They
facilitate working with different instances of time, their relations, and so on.
date, time, datetime, and timedelta are classes defined in
datetime; each has methods defined within it which follow a pattern. We shall
illustrate these and see how they are all closely related. The Python Interpreter
sequence in Fig. 11.7 has the class date (within the module datetime) in focus.
date(1990, 11, 22) in [1] represents a specific date (22nd Nov 1990) assigned to d1. The
three arguments of date are the year (four digits), month (two digits), and day (two
digits) respectively. date.today() assigned to d2 [2] represents the corresponding
data for today (the day this sequence was prepared).
date.timetuple(d1) and date.timetuple(d2) in [3] return the respective
9-field tuples of time instant discussed earlier (time.struct_time). The hour,
minute, and second values are set to zero here, the date alone being the concern. The three
items in a date can be accessed separately as the attributes year, month, and day; a
changed copy is obtained through the method replace() with the same names as keywords.
As an example the year in d1 is changed to 1992 and the
changed date assigned to d3 in [4]. date.timetuple(d3) is in [5]. d3.toordinal()
[6] returns the number of days elapsed from the start of the proleptic
Gregorian calendar (1st January of year 1 being day 1) up to the day of d3. It is
727 524 here: 1991 × 365 = 726 715 days for the completed years, 482 leap days within
them, and the 327 days of 1992 up to 22nd November.
date.fromordinal(727524) [7] converts the number back to the date we started with. The
date corresponding to any time from epoch in seconds (such as one returned by
time.time()) can be retrieved using date.fromtimestamp(). With t1 in [8] as
argument the corresponding date is retrieved in [9]. The residual hours etc., are
ignored here.
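The operations on date objects discussed above can be tried out with the short sketch below. The variable names follow the text; the use of timestamp 0 for fromtimestamp() is an arbitrary choice made here.

```python
from datetime import date

d1 = date(1990, 11, 22)
d3 = d1.replace(year=1992)          # dates are immutable; replace() returns a new one
tt = d3.timetuple()                 # time.struct_time with hour, minute, second = 0
n = d3.toordinal()                  # 1 January of year 1 has ordinal 1
back = date.fromordinal(n)          # the inverse conversion
from_stamp = date.fromtimestamp(0)  # local date of the epoch; depends on the time zone
print(d3, tt.tm_hour, tt.tm_yday, n, back, from_stamp)
```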
d3.weekday() [10] returns the weekday that d3 represents. Here Monday is
taken as 0 and Sunday as 6. However d3.isoweekday() represents the day with
Monday as 1 and Sunday as 7 (in line with the ISO Standard). This explains the
difference between the two integers (6 and 7) returned.
A week in the ISO calendar starts on a Monday. If the first of January in a year is
on a Friday, Saturday, or Sunday the first week of the year starts on the Monday
following; else it starts on the Monday preceding (or on the first of January itself
if that is a Monday). The first of January is on a
Monday in 1990, Tuesday in 1991, Wednesday in 1992 and Friday in 1993 [11]
and [12]. dd1.isocalendar() [13] returns a 3-tuple with year, week number,
and weekday as its three elements. All these three quantities conform to the ISO
Calendar. January 1st of 1993 being a Friday it is shown as being in the 53rd week
(of year 1992). In the other three cases January 1st is on Monday, Tuesday, and
Wednesday and it is in the first week of the year. d3.isoformat() and
d3.__str__() [14] also return the date information in d3, formatted
as YYYY-MM-DD. d3.ctime() [15] returns the time.ctime() string corresponding to
d3.timetuple(). In fact it is equivalent to time.ctime(time.mktime(d3.timetuple())).
11.3 datetime Module 275
Fig. 11.7 Python Interpreter sequence illustrating the use of features in the datetime module
date.min and date.max represent the earliest and the last date which can be
accommodated in the date class. They are 1-1-1 and 12-31-9999 respectively [15].
date.resolution is the smallest interval possible. It is one day [17].
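The weekday and ISO calendar conventions described above can be checked with a few lines. This is a minimal sketch; the four years are the ones quoted in the text, while the loop itself is an illustration added here.

```python
from datetime import date

d3 = date(1992, 11, 22)                      # a Sunday
print(d3.weekday(), d3.isoweekday())         # Monday = 0 ... versus Monday = 1 ...

for year in (1990, 1991, 1992, 1993):
    jan1 = date(year, 1, 1)
    # tuple() keeps the printed form uniform across Python versions
    print(year, jan1.isoweekday(), tuple(jan1.isocalendar()))

print(d3.isoformat(), str(d3), d3.ctime())
print(date.min, date.max, date.resolution)
```

Only 1st January 1993 (a Friday) falls in the last ISO week of the preceding year; the other three dates fall in week 1 of their own year.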
The datetime module has a class time defined in it (not to be confused with the
time module discussed at the beginning of the chapter). All arguments within it
implicitly use the local time as the basis unless separately specified (see
Sect. 11.3.4). The methods provided here and the arguments used are all in line with
similar ones in the date class in the foregoing section.
The object time can be defined with values for hour, minute, second,
microsecond, (and time zone if specified) as its arguments; they are in the same
order as here with the value of hour at the left end. All these are optional. If any one of
this set is left out, the rest are specified as keyword arguments with hour, minute,
second, and microsecond as the respective keywords. Two aspects of any instance of time are
noteworthy:
The default values for all the arguments are zero; hence only the non-zero values
need be specified.
If the order is maintained the time instance can be compactly set in terms of
numbers alone without resorting to the keyword format. Thus time(2), time(2, 3),
time(2, 3, 4), and time(2, 3, 4, 567899) are all time objects representing times
of (2 h), (2 h 3 min), (2 h 3 min 4 s), and (2 h 3 min 4 s 567 899 µs)
respectively.
In the Python Interpreter sequence in Fig. 11.8 t1 in [2] has been specified with
hour, minute, second, and microsecond being 11, 12, 13, and 145 678 respectively,
in the same order. The respective keywords are not given since the order is maintained.
tm in [3] has been specified as a time of 12 min. The rest of the quantities are
understood to be zero; the trailing zero values are not displayed. The tm value in the
following lines confirms this. A time object
like t1 can be edited and specific arguments in it changed using the method
replace(). t3 in [4] has been redefined in this manner. tt.isoformat() returns a
string representing the value of the time object tt. The string conforms to the ISO
8601 format HH:MM:SS.mmmmmm. The values of t1, tm and t3 are returned
as a tuple in [5]. The method .__str__() in [6] is functionally the same as the
method isoformat(). tt.strftime() returns the time object tt as a string. Its
format is specified as an argument string. The formatting details conform to
Table 11.2. t1 is displayed in this manner in [7]. .__format__() is functionally
equivalent to strftime(). .dst() returns the DST value if specified. In India,
where this PC is run, there is no DST. Hence t1.dst() and tz.dst() in [9] return
(None, None) as a tuple.
time.min and time.max are the respective minimum and maximum time
values that can be specified. They are (0, 0) and (23, 59, 59, 999 999) respectively
as can be seen from [10].
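The features of the time class discussed above can be sketched as follows. The values of t1 and tm are those quoted in the text; the replacement values used for t3 and the strftime() format string are assumptions made here for illustration.

```python
from datetime import time

t1 = time(11, 12, 13, 145678)        # positional: hour, minute, second, microsecond
tm = time(minute=12)                 # keyword form; everything else defaults to zero
t3 = t1.replace(hour=9, second=0)    # returns a new object; t1 is unchanged
print(t1.isoformat(), tm.isoformat(), str(t3))
print(t1.strftime("%H hours %M minutes %S seconds"))
print(t1.dst(), t1.tzinfo)           # both None for a naive time
print(time.min, time.max)
```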
The date class pertains to dates and related methods. They have the Gregorian
Calendar as the basis. The time class pertains to the time instances and intervals
with 24 h × 60 min × 60 s for a day (any change to the day by the addition of one
second, done when necessary to account for the slowing down of Earth, is not
considered). The datetime class offers methods which are meaningful combinations
of their counterparts in date and time. Similarly the objects on offer here are
meaningful combinations of their counterparts in date and time. The Python
Interpreter sequence in Fig. 11.9 brings out the salient features of datetime.
datetime is imported from the datetime module [1]. datetime.now() [2] returns
the current local date and time together as a single datetime object. Essentially it
represents the time since epoch in broken-down form. It has the year, the month, day
of the month, hour, minute,
second, and microsecond as its elements. datetime.today() also has the same
items on offer. But depending on the platform now() may offer relatively better
precision. datetime.utcnow() too represents the time since epoch but
conforming to the UTC. [2] returns all these three quantities. The differences are only
in microseconds (57 and 15 respectively), being the delay in the sequential executions.
Barring this difference, now() and today() are the same. utcnow(), being
UTC-based, is behind these by 5 h 30 min. Any time since epoch can be
(a)
>>> from datetime import datetime [1]
>>> n2, d2, nu = datetime.now(), datetime.today(),
datetime.utcnow() [2]
>>> n2, d2, nu
(datetime.datetime(2016, 3, 9, 19, 58, 50, 473443),
datetime.datetime(2016, 3, 9, 19, 58, 50, 473500),
datetime.datetime(2016, 3, 9, 14, 28, 50, 473515))
>>> dt1 = datetime(1990, 11, 22, 11, 12, 13, 145678) [3]
>>> dt1.timetuple() [4]
time.struct_time(tm_year=1990, tm_mon=11, tm_mday=22,
tm_hour=11, tm_min=12, tm_sec=13, tm_wday=3, tm_yday=326,
tm_isdst=-1)
>>> dt1.utctimetuple() [5]
time.struct_time(tm_year=1990, tm_mon=11, tm_mday=22,
tm_hour=11, tm_min=12, tm_sec=13, tm_wday=3, tm_yday=326,
tm_isdst=0)
>>> dt2 = dt1.replace(month = 9, minute = 33) [6]
>>> dt2
datetime.datetime(1990, 9, 22, 11, 33, 13, 145678)
>>> dt1.toordinal() [7]
726793
>>> dr = datetime.fromordinal(726793) [8]
>>> dr
datetime.datetime(1990, 11, 22, 0, 0)
>>> from datetime import date, time [9]
>>> d1, t1 = date(1990, 11, 22), time(11, 12, 13, 145678)
[10]
>>> datetime.combine(d1, t1) [11]
datetime.datetime(1990, 11, 22, 11, 12, 13, 145678)
>>> datetime.date(dt1), datetime.time(dt1) [12]
(datetime.date(1990, 11, 22), datetime.time(11, 12, 13,
145678))
>>> dt1.weekday(), dt1.isoweekday(), dt1.isocalendar(),
dt1.isoformat() [13]
(3, 4, (1990, 47, 4), '1990-11-22T11:12:13.145678')
>>> dt1.isoformat('&')
'1990-11-22&11:12:13.145678'
>>> dt1.__str__() [14]
'1990-11-22 11:12:13.145678'
>>> ds = datetime.strptime("Tue Mar 8 21:22:23 2016", "%a
%b %d %H:%M:%S %Y") [15]
>>> ds
datetime.datetime(2016, 3, 8, 21, 22, 23)
Fig. 11.9 a Python Interpreter sequence illustrating the use of features in the datetime class
(continued in Fig. 11.9b). b Python Interpreter sequence illustrating the use of features in the
datetime class (continued from Fig. 11.9a)
(b)
>>> dt1.__format__("%a %b %d %H:%M:%S %Y") [16]
'Thu Nov 22 11:12:13 1990'
>>> dt1.ctime() [17]
'Thu Nov 22 11:12:13 1990'
>>> t1 = 1455891360.1407747 [18]
>>> dp = datetime.fromtimestamp(t1) [19]
>>> dp
datetime.datetime(2016, 2, 19, 19, 46, 0, 140774)
>>> dup = datetime.utcfromtimestamp(t1) [20]
>>> dup
datetime.datetime(2016, 2, 19, 14, 16, 0, 140774)
>>> dt1.timestamp() [21]
659252533.145678
>>> datetime.min, datetime.max, datetime.resolution [22]
(datetime.datetime(1, 1, 1, 0, 0),
datetime.datetime(9999, 12, 31, 23, 59, 59, 999999),
datetime.timedelta(0, 0, 1))
>>>
Fig. 11.10 Python Interpreter sequence illustrating the use of features in the timedelta class
11.3.4 tzinfo
time and datetime objects discussed so far relate to the place of origin at the
formation time. Taking specific examples, datetime.now() as n2 in [2] in
Fig. 11.9 represents the time instant 19 h, 58 min, 50.473443 s on the 9th of
March 2016 in India, since the PC/system is operated in India. But the value that n2
represents does not carry any information regarding the origin of the object n2. n2
could as well have been a corresponding time instant in Bangladesh 30 min before
(since Bangladesh Standard Time is 30 min ahead of India). Such an object in
Python is called a naive object. Similarly nu (datetime.utcnow()) in [2] in
Fig. 11.9 represents the time instant 72 µs (473 515 - 473 443) after n2. But the
representation here is 5 h 30 min behind that of n2 since it has UTC as its reference.
nu too is a naive object. This is true of all the datetime/time related
objects discussed so far. All these are naive objects in the Pythonic sense. None of
them carry any information regarding their origin. Python has the provision to
incorporate the source-related information also into the time concerned. Time
instants and objects with such source information incorporated within them are
called aware objects. Extracting a naive object from an aware object and clamping
a naive object into a corresponding aware object are also possible. The tzinfo
class in the datetime module serves these purposes.
tzinfo is an abstract base class which is not meant to be instantiated directly. It has
three methods defined in it. These together represent the full time source information:
A convenient name that can be assigned to the time zone concerned, like IST (Indian
Standard Time). It is called the tzname.
The time offset from UTC: it is a timedelta whose value in minutes lies in the
range ±1439 (24 × 60 - 1).
The daylight saving time (DST) adjustment value in minutes (east of UTC is
implied): the DST value may normally extend up to 1 h. If DST is not in effect the
value is zero; if it is not known it is taken as None.
A typical implementation of tzinfo is shown in Fig. 11.11. It is designated
desitime [2]. utcoffset() is the method which returns the offset from UTC. It is
5 h 30 min here. tzname() returns a name given to the specific time zone. It is
Bhatta_time (Courtesy Arya Bhatta). dst() returns the DST value as timedelta(0)
(since IST does not have a daylight saving component in it). An instance of desitime is
assigned to dz1 in [3]. Its use to form aware objects has been illustrated using
(a)
>>> from datetime import tzinfo, datetime, timedelta [1]
>>> class desitime(tzinfo): [2]
... def utcoffset(self, dt0):return timedelta(hours =
5, minutes = 30)
... def tzname(self, dt0):return 'Bhatta_time'
... def dst(self, dt): return timedelta(0)
...
>>> dz1 = desitime() [3]
>>> ddnz, ddnu, ddnn = datetime.now(tz =
dz1),datetime.utcnow(), datetime.now() [4]
>>> ddnz, ddnu, ddnn [5]
(datetime.datetime(2016, 3, 11, 8, 8, 34, 75061,
tzinfo=<__main__.desitime object at 0x7efd18070da0>),
datetime.datetime(2016, 3, 11, 2, 38, 34, 75102),
datetime.datetime(2016, 3, 11, 8, 8, 34, 75105))
>>> ddnz.tzinfo, ddnu.tzinfo, ddnn.tzinfo [6]
(<__main__.desitime object at 0x7efd18070da0>, None,
None)
>>> ddnz.utcoffset(), ddnz.tzname(), ddnz.dst() [7]
(datetime.timedelta(0, 19800), 'Bhatta_time',
datetime.timedelta(0))
>>> class bdeshtime(tzinfo): [8]
... def utcoffset(self, dt0):return timedelta(hours =
6)
... def tzname(self, dt0):return 'Bdeshtime'
... def dst(self, dt0): return timedelta(0)
...
>>> dzb = bdeshtime(tzinfo) [9]
>>> ddne = ddnz.astimezone(tz = dzb) [10]
>>> ddne [11]
datetime.datetime(2016, 3, 11, 8, 38, 34, 75061,
tzinfo=<__main__.bdeshtime object at 0x7f11877d5fd0>)
Fig. 11.11 a Python Interpreter sequence illustrating the use of features in the tzinfo class
(continued in Fig. 11.11b). b Python Interpreter sequence illustrating the use of features in the
tzinfo class (continued from Fig. 11.11a)
(b)
>>> ddne.utcoffset(), ddne.tzname(), ddne.dst() [12]
(datetime.timedelta(0, 21600), 'Bdeshtime',
datetime.timedelta(0))
>>> ddn0 = ddnz.replace(tzinfo = None)
>>> ddn0
datetime.datetime(2016, 3, 11, 8, 8, 34, 75061) [13]
>>> from datetime import time [14]
>>> dtz = time(7, 38, 34, 75061, tzinfo = dz1) [15]
>>> dtz
datetime.time(7, 38, 34, 75061, tzinfo=<__main__.desitime
object at 0x7f11877d5da0>)
>>> dtz.utcoffset(), dtz.dst(), dtz.tzname()
(datetime.timedelta(0, 19800), datetime.timedelta(0),
'Bhatta_time') [16]
>>> dtz.isoformat() [17]
'07:38:34.075061+05:30'
>>> dtzb = dtz.replace(tzinfo = dzb)
>>> dtzb
datetime.time(7, 38, 34, 75061, tzinfo=<__main__.bdeshtime
object at 0x7f11877d5fd0>)
>>> dtzb.isoformat() [18]
'07:38:34.075061+06:00'
>>>
accessed and their values shown in [5]. The tz information for all the three objects has
been reproduced in [6]. ddnu.tzinfo and ddnn.tzinfo remain None and
ddnz.tzinfo returns the desitime object (displayed with its address). The three
quantities accessed as
ddnz.utcoffset(), ddnz.tzname(), and ddnz.dst() [7] are the values assigned in [2]
and [3] earlier.
The datetime represented by an aware object like ddnz can be changed to
another (in a different timezone) through the use of its tzinfo. .astimezone()
can be used for this with the new timezone information as argument. A new
tzinfo subclass has been defined as bdeshtime(tzinfo) in [8]. An instance of it is
assigned to dzb in [9].
ddne [10] is formed as a new datetime object [11] representing the same time instant
as ddnz itself. But it has been expressed as Bangladesh time. ddne too is an aware
object. Its time zone components are in [12], as defined for dzb in [9].
ddnz.replace(tzinfo = None) forms a copy of ddnz with tzinfo (the time zone
information) erased, that is, a naive object; it is assigned to ddn0. The value
of ddn0 in [13] confirms this.
The time objects discussed in Sect. 11.3.1 are all naive: the default tz value was
None for all of them. A naive time too can be converted into a corresponding aware
time by tagging the associated tzinfo into it. time has been imported in [14] from
the datetime module. dtz has been defined as an aware time with tzinfo = dz1
[15]. It represents the time 7 h 38 min 34.075061 s as Bhatta_time, confirmed in
[16]. dtz being an aware time, dtz.isoformat() [17] displays the time together with its
UTC offset, conforming to
the ISO format. By way of an exercise dtz.replace(tzinfo = dzb) changes the
tzinfo in a copy of dtz and assigns the new time to dtzb. dtzb is an aware object with
Bdeshtime as basis. The same is displayed in ISO format in [18].
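The conversions between naive and aware objects can be condensed into the sketch below. The class FixedZone and the zone names used are hypothetical helpers introduced here; they play the role of desitime and bdeshtime in Fig. 11.11 but are parameterised so that one class serves both zones.

```python
from datetime import datetime, timedelta, tzinfo

class FixedZone(tzinfo):
    """Hypothetical helper: a time zone with a constant UTC offset and no DST."""
    def __init__(self, hours, minutes, name):
        self._offset = timedelta(hours=hours, minutes=minutes)
        self._name = name
    def utcoffset(self, dt):
        return self._offset
    def tzname(self, dt):
        return self._name
    def dst(self, dt):
        return timedelta(0)          # no daylight saving component

ist = FixedZone(5, 30, "IST")
bst = FixedZone(6, 0, "BST")

naive = datetime(2016, 3, 11, 8, 8, 34, 75061)
aware = naive.replace(tzinfo=ist)    # attach zone information to a naive object
in_dhaka = aware.astimezone(bst)     # same instant, expressed in another zone
stripped = in_dhaka.replace(tzinfo=None)
print(aware.isoformat(), in_dhaka.isoformat(), stripped)
```

For fixed offsets like these, the standard library class datetime.timezone can be used in place of a hand-written tzinfo subclass.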
Fig. 11.12 Python Interpreter sequence illustrating the algebraic features with timedelta class
Fig. 11.13 Python Interpreter sequence illustrating the algebraic features with timedelta and
datetime classes
dt2 - td3 in [5]. Their values are accessed and shown in the following lines in the
figure.
The difference between two datetime objects represents a time interval; it is
a timedelta object. dt3 - dt2 is formed as a timedelta object in [6] and its
value accessed and shown in the following line. Any two datetime objects can
be compared since they are of the same type. The comparison operators can be used
here with proper reinterpretation. dt2 [4] of year 1993 is larger than dt1 [3] of year
1990; hence dt2 > dt1 in [7] returns True implying dt2 succeeds dt1. Other
comparison operations also can be used in a similar manner.
The algebra relating datetimes between themselves and datetimes with
timedeltas is applicable to dates as well. When dates and timedeltas are
combined only the days component of timedelta is significant. The time components
(seconds and microseconds), if present, are ignored. d1 and d2 in [8] are
two date objects. d1 + td3 [9] adds the days of td3 to
date d1 to form a new date (incidentally only two days
are to be added to d1 to form the new date). Similarly d2 - td0 is a new date
preceding d2. d1 - d2 [9] represents a time interval; it is automatically returned as
a timedelta object.
Comparison operators can be used with dates. A date succeeding another is
interpreted as the larger one; in this sense d1 > d2 is False [10]. Other com-
parisons can be done similarly.
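The algebra with timedelta, datetime, and date objects can be sketched as below. All the values used are assumptions chosen here for illustration; they are not the ones in Figs. 11.12 and 11.13.

```python
from datetime import date, datetime, timedelta

dt1 = datetime(1990, 11, 22, 11, 12, 13)
dt2 = datetime(1993, 1, 1, 8, 0, 0)
td = timedelta(days=2, hours=5)      # stored as days, seconds, microseconds

gap = dt2 - dt1                      # datetime - datetime gives a timedelta
later = dt1 + td                     # datetime + timedelta gives a datetime
print(gap, later, dt2 > dt1)

d1 = date(1990, 11, 22)
d2 = date(1993, 1, 1)
print(d1 + td)                       # only td.days is used with a date
print(d2 - d1, d1 > d2)              # date - date gives a timedelta
```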
11.4 Calendars
Fig. 11.14 Python Interpreter sequence illustrating the features of calendar module
Monday; w = 4 specifies the width allocated for each day in the display to be
equal to four. The default value is the minimal one: two per date with one
intervening space. If necessary a specified number of blank lines can be added between
rows (by assigning the number to l). l = 0 by default implies no blank line
between the rows of a week.
cdr1.formatmonth(2016, 3, w = 4) [4] returns the calendar data for March
2016 as a formatted string. In fact this is the formatted string counterpart of the
printed information above.
cdr2 [5] has been defined as another text calendar with weeks beginning on
Sundays. cdr2.pryear(2016, m = 2) prints the entire calendar for the year 2016 in
a specified format. The same is shown separately in Fig. 11.15 (only the part
pertaining to January to August). cdr2.pryear(y, w = 2, l = 1, c = 6, m = 3) is the
general form for it. Values given are the default values for the arguments for the
                     2016

      January                   February
Su Mo Tu We Th Fr Sa      Su Mo Tu We Th Fr Sa
                1  2          1  2  3  4  5  6
 3  4  5  6  7  8  9       7  8  9 10 11 12 13
10 11 12 13 14 15 16      14 15 16 17 18 19 20
17 18 19 20 21 22 23      21 22 23 24 25 26 27
24 25 26 27 28 29 30      28 29
31

       March                     April
Su Mo Tu We Th Fr Sa      Su Mo Tu We Th Fr Sa
       1  2  3  4  5                      1  2
 6  7  8  9 10 11 12       3  4  5  6  7  8  9
13 14 15 16 17 18 19      10 11 12 13 14 15 16
20 21 22 23 24 25 26      17 18 19 20 21 22 23
27 28 29 30 31            24 25 26 27 28 29 30

        May                       June
Su Mo Tu We Th Fr Sa      Su Mo Tu We Th Fr Sa
 1  2  3  4  5  6  7                1  2  3  4
 8  9 10 11 12 13 14       5  6  7  8  9 10 11
15 16 17 18 19 20 21      12 13 14 15 16 17 18
22 23 24 25 26 27 28      19 20 21 22 23 24 25
29 30 31                  26 27 28 29 30

        July                     August
Su Mo Tu We Th Fr Sa      Su Mo Tu We Th Fr Sa
                1  2          1  2  3  4  5  6
 3  4  5  6  7  8  9       7  8  9 10 11 12 13
10 11 12 13 14 15 16      14 15 16 17 18 19 20
17 18 19 20 21 22 23      21 22 23 24 25 26 27
24 25 26 27 28 29 30      28 29 30 31
31
Fig. 11.15 The calendar returned by executing cdr2.pryear(2016, m = 2) in Fig. 11.14: only
the part January to August is shown
year y. w is the width allocated for each date, l is the number of lines per row,
m is the number of months in each row. The entire calendar of 12 months is split
into groups of m months and displayed as blocks of m months in each row. c is
the spacing between adjacent months.
cdr2.formatyear(2016) returns the full calendar for 2016 as a formatted
string [7]. The formatting details are identical to the pryear() method discussed
above. The string (cfr2016) is of length 2056 characters [8]. The characters
representing the first few lines of the calendar are accessed and reproduced in [9]
and [10]. The formatting details can be related to the calendar in Fig. 11.15 above.
When a calendar is defined, by default a week is taken as starting on a Monday; this
is true of cdr1 in [1] and the following few executable statements in the figure.
cdr2 in [5] has the first day of the week defined as Sunday (with firstweekday);
any other day can be defined as the first day of the week in the same manner. cc1
[11] has the first day defined as Wednesday (firstweekday = 2).
A set of methods are available which return the calendar for a chosen year or month as a
list. The ways in which the date, weekday, and the month are included in the list
differ. cc1.monthdayscalendar(2016, 3) returns the calendar for the month of
March in 2016 [12] as a list (cc1iw); here the week starts on a Wednesday. The
calendar is arranged into weeks, every week starting on a Wednesday and ending on
the following Tuesday. The dates are arranged accordingly. 1st March 2016 being a
Tuesday the set of dates for the first week forms the list [0, 0, 0, 0, 0, 0, 1]. For
the second week it continues as [2, 3, 4, 5, 6, 7, 8]. This can be seen from cc1iw[:2]
accessed and reproduced in [13].
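The text calendars discussed so far can be reproduced with the sketch below. The names cdr1, cdr2, and cc1 follow the text; the slice of 200 characters printed from the year string is an arbitrary choice made here.

```python
import calendar

cdr1 = calendar.TextCalendar()                       # weeks start on Monday by default
cdr2 = calendar.TextCalendar(firstweekday=calendar.SUNDAY)
cc1 = calendar.TextCalendar(firstweekday=2)          # 2 stands for Wednesday

march = cdr1.formatmonth(2016, 3, w=4)               # string; prmonth() prints the same
year_text = cdr2.formatyear(2016, m=2)               # string; pryear() prints the same
print(march)
print(year_text[:200])

weeks = cc1.monthdayscalendar(2016, 3)               # list of weeks, zeros for other months
print(weeks[:2])
```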
The Python Interpreter sequence in Fig. 11.14 is continued in Fig. 11.16. cc in
[14] is a calendar starting on a Sunday as earlier. cciw in [15] is the calendar for
March 2016, similar to cc1 in [11]. With the week starting on a Sunday (and 1st
March falling on a Tuesday) the first 2 weeks look as follows (reproduced from
cciw[:2] in [16]): [[0, 0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 11, 12]].
cc.monthdays2calendar(2016, 3) (=ccmc) [17] returns the calendar for
the month of March 2016 as a list; it is again arranged week wise. [18] accesses the
data for the first two weeks. The data for each day is a tuple of two numbers: the
date and the day of the week respectively.
cc.monthdatescalendar(2016, 3) [19] returns the datetime.date
objects for the dates of the month concerned (March) as the object ccmt. These are
also arranged week wise. The first week starting on a Sunday is represented by
datetime.date(2016, 2, 28), the 28th of February 2016 being a Sunday. ccmt[:2]
[20] accesses the list of datetime.date objects for the first two weeks
starting with 28th of February 2016 (Sunday) and ending with 12th March 2016.
cc.yeardatescalendar(2016) (=ccyd [21]) returns the full year's calendar
as a list of datetime.date objects. The list is segmented into groups of three
months each (default value; can be changed if desired). The data for each group is
again arranged month wise and that for each month week wise (similar to .pryear()
in [6] in Fig. 11.14). The week starts on the set day, Sunday here. [22] accesses
the 0th group of the calendar (months of January, February, and March), the 0th
month in it (January), and the first 3 weeks of this month (as ccyd[0][0][:3]).
cc.yeardays2calendar(2016) (=ccy2) returns the similar calendar as a list
for the year 2016 itself [23]. Here every day is represented as a tuple of date and
weekday (similar to ccmc [18]). ccy2[0][0][:3] in [24] shows the segment of the
calendar for the first three weeks; these tuples represent the same set of
dates as the datetime.date objects in [22] above.
cc.yeardayscalendar(2016) (=ccyt) in [25] returns the full year's calendar
arranged in the same form (default groups of three months each, each group of
Fig. 11.16 Python Interpreter sequence illustrating the features of calendar module
three separate months, each month of weeks starting on a Sunday). Here only the
dates are included. The data for the first three weeks is accessed as ccyt[0][0][:3]
in [26] and shown; these again correspond to the first three weeks in January 2016
(see also the calendar structure for January in Fig. 11.15).
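The list-returning methods described above differ only in how each day is represented. A minimal sketch follows; the variable names are those of the text, while the particular items printed are chosen here for illustration.

```python
import calendar
from datetime import date

cc = calendar.Calendar(firstweekday=calendar.SUNDAY)

cciw = cc.monthdayscalendar(2016, 3)      # day numbers only
ccmc = cc.monthdays2calendar(2016, 3)     # (day number, weekday) tuples
ccmt = cc.monthdatescalendar(2016, 3)     # datetime.date objects
ccyd = cc.yeardatescalendar(2016)         # groups of 3 months, then months, weeks, dates
ccyt = cc.yeardayscalendar(2016)          # same nesting, day numbers only

print(cciw[:2])
print(ccmc[0])
print(ccmt[0][0], ccmt[1][-1])
print(len(ccyd), ccyd[0][0][0][0])
print(ccyt[0][0][:3])
```

Note that the first week of January 2016 in ccyd begins on Sunday, 27th December 2015, since the date objects of adjoining months are retained while the day-number forms show zeros instead.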
There can be situations/applications where one has to scan a calendar to identify a
specific date/day like festivities or decide on a celebration of events and so on. In
Fig. 11.17 Python Interpreter sequence illustrating the use of iterators in the calendar module
such cases the full calendar need not be formed and used. An iterator is an effective
alternative. A set of iterators are available in the calendar module which can be
used in such cases. The Python Interpreter sequence in Fig. 11.17 accesses the
iterators for specific cases and illustrates their use.
As earlier cc has been defined as a calendar in [2] with the week starting on a
Sunday. cc.iterweekdays() (=ai0) is an iterator for week days starting with
Sunday [3]. list(ai0) in [4] is the full list of the corresponding week days.
cc.itermonthdates(2016, 3) (=aimd) [5] is the iterator counterpart of
monthdatescalendar(2016, 3) [19] in Fig. 11.16. The calendar for the month of
March in 2016 is in focus here with aimd as the iterator for the datetime.date
objects for all the dates of the month. It starts on the Sunday of the first week of the
month. datetime.date(2016, 2, 28) being the data for the first day of the week
concerned, it forms the first element yielded by the iterator. An example
illustrating the use of the iterator (though trivial) follows.
Example 11.1 Nevan who runs Royal Restaurant left the station on the first day of
March 2016 and returned on the 10th of March. How many days was he out of
station?
With aimd as iterable the routine from [6] counts the total number of days
from the first up to the 10th of March 2016 (inclusive). 11 being the returned value
[7], Nevan has been out of station for 12 days (the count starts with 0).
cc.itermonthdays2(2016, 3) in [8] is the iterator of all the (date, day)
tuples formed in the list in [17] in Fig. 11.16. The (date, day) pairs for all the
dates from the 1st to the 7th of March 2016 have been formed as a list and returned
in [9]. It is done through a small routine here. The routine scans for the (date, day)
tuples; as soon as the date becomes non-zero (=1) the routine starts appending the
tuples; it continues until the 7th of March 2016. The list of tuples collected is
returned. The first element in the list is for the 1st of March on a Tuesday and the
last (the 7th) for the following Monday.
itermonthdays(2016, 3) in [10] is the iterable representing the
monthdayscalendar(2016, 3) in [15] in Fig. 11.16 (list of all the dates of the
month of March starting on the first Sunday of the month). The routine from [10] in
Fig. 11.17 returns the list of the first 10 dates represented by the iterable.
All the illustrations above are centered on the calendar for the month of March
2016. Instead any month of any year in the admissible range can be used here.
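The iterators discussed above can be exercised as in the sketch below. The small routines here are assumptions written for illustration; they are not the routines of Fig. 11.17.

```python
import calendar
from itertools import islice

cc = calendar.Calendar(firstweekday=calendar.SUNDAY)

print(list(cc.iterweekdays()))                     # weekday numbers for one week

aimd = cc.itermonthdates(2016, 3)                  # datetime.date objects, whole weeks
print(next(aimd))                                  # first day of the first week

# (day, weekday) pairs for the 1st to the 7th; zero days belong to other months
first_seven = [(d, wd) for d, wd in cc.itermonthdays2(2016, 3) if 1 <= d <= 7]
print(first_seven)

first_ten = list(islice(cc.itermonthdays(2016, 3), 10))
print(first_ten)
```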
A set of additional simple functions are available in the calendar module. The
Python Interpreter sequence in Fig. 11.18 has the calendar module imported in [1].
calendar.setfirstweekday(calendar.WEDNESDAY) [2] directly sets the
first week day as Wednesday. calendar.firstweekday() [3] returns the integer
representing the first weekday. It has been reset to Sunday in [4].
calendar.isleap(y) confirms whether y as a year is a leap year. 2016 in [5], as an
example, is confirmed as a leap year. calendar.leapdays(y1, y2) returns
the total number of leap years in the range from y1 up to y2 (y2 itself excluded).
calendar.leapdays(2001, 2017) returns the number as four years [6] (2004, 2008, 2012,
and 2016).
The weekday of any specified date can be obtained directly.
calendar.weekday(2016, 3, 21) returns the day as an integer with Monday as 0 [7];
21st March 2016 being a Monday, the value is 0. calendar.weekheader(5)
returns the header of the calendar as a string having the abbreviated weekday names
[8]. 5 is the width set for each day. This can be changed to any other desired value.
calendar.monthcalendar(2016, 3) returns the calendar for the month of March
2016 [9] as a matrix with only the dates; successive weeks (starting with the first)
are in successive rows. Days of the first week in the previous month and those of
the last week in the succeeding month have zeros in place of the actual dates
(similar to a typical month sheet in the calendar): compare this with
.formatmonth() in [4] in Fig. 11.14. calendar.month(2016, 3) [10] returns the
Fig. 11.18 Python Interpreter sequence illustrating the usage of the additional methods/functions
in calendar module
calendar for the whole month of March 2016 as a formatted string.
calendar.prmonth(2016, 3) [11] is its printed counterpart; compare this with [3] in
Fig. 11.14 where the formatting has been changed by allocating a larger width for
each date. calendar.prcal(2016, m = 2) [12] prints the calendar for the entire
year, similar to pryear(2016, m = 2) [6] in Fig. 11.14 and shown in Fig. 11.15.
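The module-level functions can be tried out as below. This is a minimal sketch; restoring the first weekday to Monday at the end is a precaution added here, since setfirstweekday() changes a module-wide setting.

```python
import calendar

calendar.setfirstweekday(calendar.SUNDAY)
print(calendar.firstweekday())                 # integer code of the first weekday
print(calendar.isleap(2016))
print(calendar.leapdays(2001, 2017))           # leap years from 2001 up to, not including, 2017
print(calendar.weekday(2016, 3, 21))           # Monday = 0 ... Sunday = 6
print(calendar.weekheader(5))
sunday_first = calendar.monthcalendar(2016, 3) # matrix of weeks, zeros outside the month
print(sunday_first[0])
print(calendar.month(2016, 3))                 # formatted string; prmonth() prints it
calendar.setfirstweekday(calendar.MONDAY)      # restore the module-wide default
```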
(a)
timeit.timeit(stmt, setup, timer, number)
Number of times the test is to be executed default value = 1000000
(b)
timeit.repeat(stmt, setup, timer, repeat, number)
Number of times the test is to be executed default value = 1000000
The test execution is carried out 10 000 times. The execution takes 844.11 µs, that
is 84.41 ns per cycle. No setup is required here. time.perf_counter() is
the timer clock (by default) used. If number = 10 000 were left out the default
value of 1 000 000 would be used in its place. The same test is carried out in [4] with
a setup operation preceding. a1 and b1 are assigned the strings "g" and "Are you
staring" respectively as setup. The test a1 in b1 is carried out 10 000 times. The
setup is done once at start and the time for it is not included in the test run. The test
is repeated in the following line.
The time module is imported [5] and time.process_time() used as the
timer clock in [6]. The function timeit.repeat() is similar to timeit.timeit()
but the whole testing is repeated a specified number of times. The structure of
timeit.repeat() is shown in Fig. 11.19b. The number of repeats has the default
value three. If necessary it can be specified separately. Barring this
timeit.repeat() is similar to timeit.timeit(). The timing test "g" in "Are you
staring" is carried out three times (default value) and then five times in [7] and [8]
respectively.
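The timing calls described above can be collected into a short script. This is a sketch along the lines of the figure; the conversion to nanoseconds per execution is added here for convenience, and the timings obtained will differ from machine to machine.

```python
import time
import timeit

stmt = "a1 in b1"
setup = 'a1 = "g"; b1 = "Are you staring"'

total = timeit.timeit(stmt, setup=setup, number=10000)       # seconds for 10 000 runs
print("per execution: %.1f ns" % (total / 10000 * 1e9))

# The same test with process time as the timer, and then repeated five times
cpu = timeit.timeit(stmt, setup=setup, timer=time.process_time, number=10000)
runs = timeit.repeat(stmt, setup=setup, repeat=5, number=10000)
print(cpu, runs)
```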
timeit.Timer() is a class defined in the module timeit. The defining
statement with arguments and default options is shown in Fig. 11.21a. It is
instantiated as tt in [9]. Timer.timeit(number = xxx) is a method in Timer
which executes the test specified xxx times and returns the execution time. Default
value for xxx is 1 000 000. tt.timeit() is returned (for the default value of
number = 1 000 000) in [10]. The method tt.repeat(repeat = 3, number = 1 000 000)
executes the test repeatedly a specified number of times and returns the timings. Here
again the default values are repeat = 3 and number = 1 000 000. tt.repeat() has been
carried out for the default values in [11].
All the above timing tests have been done for testing whether "g" in "Are you
staring" is True. The time per execution of the loop varies from 42.01 to 86.08 ns
here. Even for the case of repeated tests (where the execution sequence and timing
should be identical) the time durations vary. These variations are due to the
interruptions in the program execution in the regular functioning of the processor.
Hence the minimum execution time for the lot should be taken as the guiding value
in any decision regarding comparison of the codes and their performance.
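Taking the minimum of repeated timings as the guiding value can be sketched as follows. The number of executions is reduced from the default here, as an assumption made only to keep the run short.

```python
import timeit

tt = timeit.Timer("a1 in b1", setup='a1 = "g"; b1 = "Are you staring"')
number = 100000                                   # kept small so the sketch runs quickly
timings = tt.repeat(repeat=3, number=number)      # list of total times in seconds
best = min(timings)                               # least disturbed by other activity
print("best: %.1f ns per execution" % (best / number * 1e9))
```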
As another example a, b, c, and d are assigned values and a square-root expression
in a, b, c, and d is
obtained in different ways in the lines following, with 1 000 000 runs (default
number) in [12] and the same in the following line; the execution times per loop are
238.9 and 237.6 ns respectively. The function chk() in [13] (Fig. 11.20b) evaluates
the same 1000 times repeatedly without invoking timeit or any of the functions
within it. The increases in time.process_time() and time.perf_counter()
represent the respective execution times. They are 538.5 and
538.4 ns respectively in the execution of chk() in [14]. The overheads here in terms
of decrementing the counter variable e and checking its value in every loop add to
the execution time; the increase in loop execution times here is mainly due to
these. The sqrt function in the math module is used to evaluate the same expression
(a)
trp@trp-Veriton-Series:~$ python3.5 -q [1]
>>> import timeit [2]
>>> timeit.timeit('"g" in "Are you staring"', number = 10000) [3]
0.0008441140000172709
>>> timeit.timeit('a1 in b1', setup = 'a1 = "g"; b1 = "Are you staring"', number = 10000) [4]
0.0008608150000100068
>>> timeit.timeit('a1 in b1', setup = 'a1 = "g"; b1 = "Are you staring"', number = 10000)
0.000842470999998568
>>> import time [5]
>>> timeit.timeit('a1 in b1', setup = 'a1 = "g"; b1 = "Are you staring"', timer = time.process_time, number = 10000) [6]
0.000843211
>>> timeit.repeat('a1 in b1', setup = 'a1 = "g"; b1 = "Are you staring"', number = 10000) [7]
[0.0008424019997619325, 0.0008400020005865372, 0.0008397960000365856]
[0.000861885999995593, 0.0008400990000154707, 0.000836861999971461]
>>> timeit.repeat('a1 in b1', setup = 'a1 = "g"; b1 = "Are you staring"', number = 10000, repeat = 5) [8]
[0.000842976000001272, 0.0008368370000084724, 0.0008468860000334644,
0.0008398329999863563, 0.000839723000012782]
>>> tt = timeit.Timer('a1 in b1', setup = 'a1 = "g"; b1 = "Are you staring"') [9]
>>> tt.timeit() [10]
0.050677361999987625
>>> tt.repeat() [11]
[0.05228787999976703, 0.04205966100016667, 0.04201785000032032]
>>> timeit.timeit('(a+b+c+d)**0.5', setup = 'a=2; b=3; c=4; d=5') [12]
0.23894656800001712
>>> timeit.timeit('(a+b+c+d)**0.5', setup = 'a=2; b=3; c=4; d=5')
0.237641372999974
Fig. 11.20 a Python Interpreter sequence to illustrate features of timeit module (continued in
Fig. 11.20b). b Python Interpreter sequence to illustrate features of timeit module (continued from
Fig. 11.20a)
in [15]. sqrt is imported from the math module in setup prior to execution of the
loop. The test returns 318 ns as timing per loop execution.
The timeit module offers the facility to test code snippets for their execution
times directly from the command line. The general format for the same is shown in
(b)
>>> def chk(): [13]
...     t1 = time.process_time(), time.perf_counter()
...     a, b, c, d, e = 2, 3, 4, 5, 10000
...     while e:
...         aa = (a+b+c+d)**0.5
...         e -= 1
...     t2 = time.process_time(), time.perf_counter()
...     return t2[0]-t1[0], t2[1]-t1[1]
...
>>> chk() [14]
(0.005385477999999999, 0.005383988000062345)
>>> timeit.timeit('sqrt(a+b+c+d)', setup = 'from math import sqrt; a, b, c, d = 2, 3, 4, 5', number = 1000) [15]
0.0003181270000141012
(a)
timeit.Timer(stmt, setup, timer)
    timer: timer used to compute timing; time.perf_counter() is the default timer
(b)
python -m timeit -n -r -s statement
    statement: statement whose timing is to be tested; default - nil
Fig. 11.21 Structures of a timeit.Timer() class and b Command line execution of timeit
Fig. 11.21b. A setup option and options for the number of runs and repetitions of
the whole test are available. Timings of $\sqrt{a + b + c + d}$ are obtained in
different ways in Fig. 11.22, all from the command line. The time for one execution
is calculated and displayed directly here. In every case the best of three
successive runs of the test is returned. a, b, c, and d are assigned values and
$\sqrt{a + b + c + d}$ computed in [1] in every execution loop. The best execution
time is 257 ns. math.sqrt is imported in every execution loop in [2]; in turn the
best execution time extends to 1.98 µs. The assignment a, b, c, d = 2, 3, 4, 5 is
done in setup in [3] and $\sqrt{a + b + c + d}$ computed in the execution loop,
done 10 000 times; the test is repeated three times. The best execution time is
246 ns per loop; the same is 229 ns in [4] when done again.
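The effect of placing the import inside the timed statement rather than in setup can also be checked from within Python. The following is only a sketch under stated assumptions: it uses timeit.repeat() in place of the command line, and the names runs, in_loop, and in_setup are illustrative.

```python
import timeit

runs = 10000

# Import repeated inside the timed statement: its cost is paid on every loop.
in_loop = timeit.repeat(
    'from math import sqrt; sqrt(a + b + c + d)',
    setup='a, b, c, d = 2, 3, 4, 5',
    repeat=3, number=runs)

# Import done once in setup: its cost is excluded from the timing.
in_setup = timeit.repeat(
    'sqrt(a + b + c + d)',
    setup='from math import sqrt; a, b, c, d = 2, 3, 4, 5',
    repeat=3, number=runs)

for label, timings in (('import in loop ', in_loop),
                       ('import in setup', in_setup)):
    best = min(timings) / runs * 1e9
    print('{}: best of 3 = {:.1f} ns per loop'.format(label, best))
```

The absolute values vary from machine to machine; only the comparison between the two lines is of interest.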
11.6 Exercises
Evaluation of $5.1^{7.2}$ as in Exercise 3 (Sect. 5.3) and that using the pow() function
sin(x), cos(x), and exp(x) as in Example 4.6 and those using the respective
functions in the math module
Values of π using the four series in Exercise 6 (Sect. 4.3) and the value obtained
using pi in the math module.
Scalar product of vectors with vectors given as lists as in Example 5.5 and the
same taking the vectors as arrays.
Variance of a sample set as in Example 5.6 and the variance obtained using
variance() in the statistics module. Do the same taking the sample set
as an array.
Sorting of a data set using sort() and that using heapq() in a loop.
12 Functional Programming Aids
The basic built-in functions for algebra and logic operations in Python have been
defined as functions in the operator module. They can be used with appropriate
arguments for compact coding in Python. The operations follow a pattern for each set
in the lot. Hence only representative usages are illustrated. All items in the
operator module have been imported [1] in Fig. 12.1. lt(a, b) returns True if
a < b (Table 12.1).
Table 12.1 Methods for comparing objects in operator module; a and b are objects here.
In each case if the condition is satisfied True is returned; else False is returned
Method          Alternate form       Direct equivalent   Condition tested
lt(a, b)        __lt__(a, b)         a < b               a < b
le(a, b)        __le__(a, b)         a <= b              a ≤ b
gt(a, b)        __gt__(a, b)         a > b               a > b
ge(a, b)        __ge__(a, b)         a >= b              a ≥ b
eq(a, b)        __eq__(a, b)         a == b              a = b
ne(a, b)        __ne__(a, b)         a != b              a ≠ b
is_(a, b)       No alternate form    a is b              a and b are the same?
is_not(a, b)    No alternate form    a is not b          a and b are different?
Table 12.3 Methods for logical and bit-wise operations in operator module
Method          Alternate form                       Direct equivalent   Operation done
not_(a)         __not__(a)                           not a               Negation (Logical)
truth(obj)                                                               Truth test
invert(a)       __invert__(a), inv(a), __inv__(a)    ~a                  Bitwise inversion
and_(a, b)      __and__(a, b)                        a & b               Bitwise AND
or_(a, b)       __or__(a, b)                         a | b               Bitwise OR
xor(a, b)       __xor__(a, b)                        a ^ b               Bitwise exclusive OR
lshift(a, b)    __lshift__(a, b)                     a << b              Left shift
rshift(a, b)    __rshift__(a, b)                     a >> b              Right shift
of a. If a is a Boolean variable its inverse (False or True as the case may be) is
returned. Else True or False is returned depending on whether a is zero or not.
With aa = 1 and bb = 0 as in [8] not_(aa) and not_(bb) are False and True
respectively [9]. 0b1100001 being non-zero, not_(0b1100001) is False.
truth(expression) tests whether expression is True or not (similar to the
expression following if). a, b, and c have been assigned values in [10].
truth(a == b) and truth(a == c) are returned as True and False respectively in [11].
The other operators in Table 12.3 are applicable only for integers. invert(a)
returns the bit-wise inverse of a (that is ~a). With a as an integer, -(a + 1) is
returned. The inverses of the numbers considered in [6] above are returned in [12].
and_(), or_(), and xor() are the respective bit-wise Boolean operations. [13] is a
set of illustrative examples.
lshift(a, b) shifts integer a by b bits to the left; it is the equivalent of
multiplying a by $2^b$. Similarly rshift(a, b) shifts a by b bits to the right; it
is equivalent to the a // $2^b$ operation; [14] are representative examples.
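A short sketch of these logical and bit-wise functions follows. The values of a and b are illustrative and are not those of the figure.

```python
from operator import not_, truth, invert, and_, or_, xor, lshift, rshift

a, b = 0b1100, 0b1010                # 12 and 10, illustrative values

print(not_(1), not_(0))              # False True
print(truth(a == 12), truth([]))     # True False
print(invert(a))                     # -(12 + 1) = -13
print(bin(and_(a, b)), bin(or_(a, b)), bin(xor(a, b)))   # 0b1000 0b1110 0b110
print(lshift(a, 3), a * 2 ** 3)      # 96 96
print(rshift(a, 2), a // 2 ** 2)     # 3 3
```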
Table 12.4 has all the methods for operation on sequences. The possible alternate
syntax as well as the operational details are also given in the table. Illustrative
examples of their use are in Fig. 12.2. concat(a, b) concatenates a and b, which
can be sequences of any type; but both are to be of the same type. s1 and s2 in [2]
are two lists. concat(s1, s2) concatenates them in the same order [3]. s3 and s4
are two strings [4]. The concatenated string s0 is formed as concat(s3, s4); it is
in [5]. contains(a, b) checks for the presence of b in the sequence a and returns
True or False accordingly. The method is the equivalent of the 'b in a' test. [6]
is an illustrative example (testing for the presence of the letter 'V' in s0
both ways). countOf(s0, 'e') returns the number of occurrences of the letter 'e'
in s0 [7]. With string s0 as a sequence each of its characters is an element here.
Hence 'ee' has no presence in s0. This explains countOf(s0, 'ee') being zero in
[8]. s4 is a list of strings [9]. String 'aa' is present in it three times. Hence
countOf(s4, 'aa') returns 3 in [10].
indexOf(a, b) returns the index of the first occurrence of element b in sequence
a. Illustrative examples are in [11]. getitem(a, b) does the reverse operation:
the element at index b of sequence a is returned here. Here b, being an index, has
to be an integer (or an object evaluating to an integer). The third item of string
s0 ('p') and the fourth item of list s4 ('dd') are returned in [12]. delitem(a, b)
deletes the item with index b (an integer) from sequence a. s4[3] is deleted in [13].
The truncated s4 is in [14]. String s0 is immutable and such selective deletion is
not possible. Hence delitem(s0, 3) [15] raises an error. setitem(a, b, c) has three
arguments. a is a mutable sequence. The element a[b] is replaced by the element
c (object c) in it. setitem(s4, 3, 'DD') replaces s4[3] ('aa') by the new object
'DD' [16]. The modified s4 is in [17].
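The sequence functions described above can be tried out as in the sketch below. The names s0 and s4 follow the text, but the strings and lists assigned to them are illustrative assumptions, not the values of Fig. 12.2.

```python
from operator import (concat, contains, countOf, indexOf,
                      getitem, setitem, delitem)

s0 = concat('Hello ', 'Viewer')                  # two strings
s4 = concat(['aa', 'bb'], ['aa', 'dd', 'aa'])    # two lists

print(contains(s0, 'V'), 'V' in s0)   # True True
print(countOf(s0, 'e'))               # 3: single characters are the elements
print(countOf(s0, 'ee'))              # 0: no single element equals 'ee'
print(countOf(s4, 'aa'))              # 3
print(indexOf(s4, 'dd'), getitem(s4, 3))   # 3 dd

setitem(s4, 3, 'DD')                  # same as s4[3] = 'DD'
delitem(s4, 0)                        # same as del s4[0]
print(s4)                             # ['bb', 'aa', 'DD', 'aa']

try:
    delitem(s0, 3)                    # strings are immutable
except TypeError as err:
    print('TypeError:', err)
```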
Three generic methods with varied possible uses are available in operator
module. The Python Interpreter sequence in Fig. 12.3 brings out their uses.
Operations corresponding to the composite operations +=, -=, and the like also have
the corresponding operators defined in the operator module. These are given in
Table 12.5. As an illustration iadd(a, b) (representing +=) adds b to a in place;
a has the enhanced value. a and b in [1] in Fig. 12.4 are two lists. iadd(a, b) in
[2] adds list b to list a and a is the new and augmented list [3]. This is
feasible only if a is of mutable type; else iadd(a, b) returns the added result
but a remains unaltered. As an illustration c and d in [4] are numbers (immutable).
iadd(c, d) returns the added value (4.7 + 2.1 = 6.8) [5] but c and d remain the
same [6]. br1 and br2 in [7] are two bytearrays (mutable). iadd(br1, br2) in
[8] returns the added bytearray in [9], analogous to adding lists. Note the
difference between adding numbers and bytearrays as done by the iadd() method.
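The difference between the mutable and the immutable cases can be seen in the following sketch. The variable names follow the text; the list and bytearray values are illustrative.

```python
from operator import iadd

a, b = [1, 2], [3, 4]
r = iadd(a, b)              # list: a itself is extended in place
print(r is a, a)            # True [1, 2, 3, 4]

c, d = 4.7, 2.1
s = iadd(c, d)              # float: a new object is returned
print(s, c, d)              # the sum; c and d are unchanged

br1, br2 = bytearray(b'ab'), bytearray(b'cd')
iadd(br1, br2)              # bytearray: extended in place, like a list
print(br1)                  # bytearray(b'abcd')
```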
12.2 itertools
A number of iterator functions are available in the itertools module. These are
adaptations of their popular counterparts in other languages. The Python Interpreter
sequence in Fig. 12.5a-c illustrates their usage. The three of them in Table 12.6
continue ad infinitum and have to be stopped through other executable statements.
count(a, b) is an iterator to run a counter. The count starts at number a and at
every step increments the count by number b. a0 in [2] is an iterator to count
starting with value 5 and incrementing the count by 3 at every step. The first two
values can be seen to be 5 [3] and 8 [4]. The subsequent five values are returned
as list b0 [7].
(a)
>>> from itertools import * [1]
>>> a0 = count(5,3) [2]
>>> next(a0) [3]
5
>>> next(a0) [4]
8
>>> b0 = [] [5]
>>> for jj in range(5):b0.append(next(a0)) [6]
...
>>> b0 [7]
[11, 14, 17, 20, 23]
>>> a1 = cycle('a0b1c2d3') [8]
>>> c1 = ([],[]) [9]
>>> for jj in range(25): [10]
...     c1[0].append(next(a1))
...     c1[1].append(next(a1))
...
>>> c1 [11]
(['2', '3', '0', '1', '2', '3', '0', '1', '2', '3', '0',
'1', '2', '3', '0', '1', '2', '3', '0', '1', '2', '3',
'0', '1', '2'], ['d', 'a', 'b', 'c', 'd', 'a', 'b', 'c',
'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b',
'c', 'd', 'a', 'b', 'c', 'd'])
>>> list(repeat('z1z2', 5)) [12]
['z1z2', 'z1z2', 'z1z2', 'z1z2', 'z1z2']
>>> list(map(pow, range(10), repeat(2))) [13]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> dd = [2, 4, 6, 8, 10] [14]
>>> d0 = list(accumulate(dd))
>>> d0 [15]
[2, 6, 12, 20, 30]
>>> import operator
>>> d1 = list((accumulate(dd, operator.mul))) [16]
>>> d1 [17]
[2, 8, 48, 384, 3840]
>>> d2 = list((accumulate(dd, operator.truediv))) [18]
>>> d2 [19]
[2, 0.5, 0.08333333333333333, 0.010416666666666666,
0.0010416666666666667]
Fig. 12.5 a Illustrative usages of iterators in itertools module (continued in Fig. 12.5b).
b Illustrative usages of iterators in itertools module (continued from Fig. 12.5a). c Illustrative
usages of iterators in itertools module (continued from Fig. 12.5b)
cycle(a) takes any iterable (a sequence) as an argument. Its elements are
returned in succession until exhaustion. Then the cycle repeats. As an illustration,
with the string 'a0b1c2d3', a1 is formed as a cyclic iterator [8]. With this as
basis c1 is returned as a pair of sequences [11]: one of digits ('0', '1', '2',
'3', '0', '1', …) and the other of letters ('a', 'b', 'c', 'd', 'a', 'b', …).
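A compact sketch of count() and cycle() is given below. islice() from the same module is used here (as an assumption about convenient usage, not as part of the figure) to stop the endless counter; the names letters and digits are illustrative.

```python
from itertools import count, cycle, islice

a0 = count(5, 3)                     # 5, 8, 11, ... without end
first_two = [next(a0), next(a0)]
next_five = list(islice(a0, 5))      # islice stops the endless counter
print(first_two, next_five)          # [5, 8] [11, 14, 17, 20, 23]

a1 = cycle('a0b1c2d3')               # 'a', '0', 'b', '1', ... then repeats
letters, digits = [], []
for _ in range(6):
    letters.append(next(a1))         # 1st, 3rd, 5th, ... items of the cycle
    digits.append(next(a1))          # 2nd, 4th, 6th, ... items of the cycle
print(letters)                       # ['a', 'b', 'c', 'd', 'a', 'b']
print(digits)                        # ['0', '1', '2', '3', '0', '1']
```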
(b)
>>> zz, yy = (1, 3, 5), ('a0', 'b1', 'c2', 'd3') [20]
>>> list(chain(zz, yy)) [21]
[1, 3, 5, 'a0', 'b1', 'c2', 'd3']
>>> xx = [zz, yy] [22]
>>> bb = chain.from_iterable(xx) [23]
>>> list(bb) [24]
[1, 3, 5, 'a0', 'b1', 'c2', 'd3']
>>> tt = combinations('mnop', 2) [25]
>>> list(tt) [26]
[('m', 'n'), ('m', 'o'), ('m', 'p'), ('n', 'o'), ('n',
'p'), ('o', 'p')]
>>> list(combinations(yy, 3)) [27]
[('a0', 'b1', 'c2'), ('a0', 'b1', 'd3'), ('a0', 'c2',
'd3'), ('b1', 'c2', 'd3')]
>>> ww = list(combinations_with_replacement(yy, 3)) [28]
>>> ww [29]
[('a0', 'a0', 'a0'), ('a0', 'a0', 'b1'), ('a0', 'a0',
'c2'), ('a0', 'a0', 'd3'), ('a0', 'b1', 'b1'), ('a0',
'b1', 'c2'), ('a0', 'b1', 'd3'), ('a0', 'c2', 'c2'),
('a0', 'c2', 'd3'), ('a0', 'd3', 'd3'), ('b1', 'b1',
'b1'), ('b1', 'b1', 'c2'), ('b1', 'b1', 'd3'), ('b1',
'c2', 'c2'), ('b1', 'c2', 'd3'), ('b1', 'd3', 'd3'),
('c2', 'c2', 'c2'), ('c2', 'c2', 'd3'), ('c2',
'd3', 'd3'), ('d3', 'd3', 'd3')]
(c)
>>> yy0 = ('b1', 'c2', 'a0', 'd3') [30]
>>> list(combinations(yy0, 3)) [31]
[('b1', 'c2', 'a0'), ('b1', 'c2', 'd3'), ('b1', 'a0',
'd3'), ('c2', 'a0', 'd3')]
>>> yy1 = ('b1', 'c2', 'a0', 'b1') [32]
>>> list(combinations(yy1, 3)) [33]
[('b1', 'c2', 'a0'), ('b1', 'c2', 'b1'), ('b1', 'a0',
'b1'), ('c2', 'a0', 'b1')]
>>>
>>> list(permutations(yy, 3)) [34]
[('a0', 'b1', 'c2'), ('a0', 'b1', 'd3'), ('a0', 'c2',
'b1'), ('a0', 'c2', 'd3'), ('a0', 'd3', 'b1'), ('a0', 'd3',
'c2'), ('b1', 'a0', 'c2'), ('b1', 'a0', 'd3'), ('b1', 'c2',
'a0'), ('b1', 'c2', 'd3'), ('b1', 'd3', 'a0'), ('b1', 'd3',
'c2'), ('c2', 'a0', 'b1'), ('c2', 'a0', 'd3'), ('c2', 'b1',
'a0'), ('c2', 'b1', 'd3'), ('c2', 'd3', 'a0'), ('c2', 'd3',
'b1'), ('d3', 'a0', 'b1'), ('d3', 'a0', 'c2'), ('d3', 'b1',
'a0'), ('d3', 'b1', 'c2'), ('d3', 'c2', 'a0'), ('d3', 'c2',
'b1')]
>>> xx = lambda x:x*x [35]
>>> zz = map(xx, count(2,3))
>>> list(next(zz) for jj in range(20)) [36]
[4, 25, 64, 121, 196, 289, 400, 529, 676, 841, 1024, 1225,
1444, 1681, 1936, 2209, 2500, 2809, 3136, 3481]
>>> list(islice(map(xx, count(2,3)), 10)) [37]
[4, 25, 64, 121, 196, 289, 400, 529, 676, 841]
>>> list(islice(map(xx, count(2,3)),4, 40, 5)) [38]
[196, 841, 1936, 3481, 5476, 7921, 10816, 14161]
>>> list(islice(cycle('a0b1c2d3'), 20)) [39]
['a', '0', 'b', '1', 'c', '2', 'd', '3', 'a', '0', 'b',
'1', 'c', '2', 'd', '3', 'a', '0', 'b', '1']
(Fig. 12.5b); they have been chained together in [21] to form a single list.
chain.from_iterable() is a slightly altered version of chain. It takes a single
argument: a sequence of sequences (a list of strings, a list of lists, and so
on). The elements are chained to form the iterator for a single sequence. Elements of
the first sequence are chained first; after the same is exhausted chaining continues
with the elements of the second sequence and so on until all the sequences are
chained (the elements being produced only on demand, called lazy evaluation). As an
example xx in [22] is a list of the two tuples zz and yy. bb [23] is formed as an
iterator as chain.from_iterable(xx). The corresponding composite chain is returned
in [24] as list(bb).
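The two forms of chaining can be compared as in this sketch, which reuses zz, yy, xx, and bb from the figure; the last line with two short strings is an added illustration.

```python
from itertools import chain

zz, yy = (1, 3, 5), ('a0', 'b1', 'c2', 'd3')

print(list(chain(zz, yy)))        # sequences passed as separate arguments
xx = [zz, yy]
bb = chain.from_iterable(xx)      # one argument: a sequence of sequences
first = next(bb)                  # elements are produced only on demand
rest = list(bb)
print(first, rest)                # 1 [3, 5, 'a0', 'b1', 'c2', 'd3']
print(list(chain.from_iterable(['ab', 'cd'])))   # ['a', 'b', 'c', 'd']
```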
combinations(a, n) has a as a sequence and n as an integer. All combinations of
elements of a taken n at a time, $\binom{\mathrm{len}(a)}{n}$ in number, together
form the object of the iterator here. With 'mnop' as a sequence tt in [25] is an
iterator for the set of $\binom{4}{2} = 6$ such combinations. list(tt) is in [26].
yy [20] being a tuple of four strings, all the four combinations of yy with three
elements taken at a time ($\binom{\mathrm{len}(yy)}{3} = 4$ in number) are
returned in [27].
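The counts quoted above can be verified with a small sketch. The helper n_choose_r() is a hypothetical name introduced here for the binomial coefficient; the count for combinations_with_replacement() uses the standard formula C(n + r - 1, r).

```python
from itertools import combinations, combinations_with_replacement
from math import factorial

def n_choose_r(n, r):
    """Number of ways of choosing r items out of n (binomial coefficient)."""
    return factorial(n) // (factorial(r) * factorial(n - r))

yy = ('a0', 'b1', 'c2', 'd3')
pairs = list(combinations('mnop', 2))
triples = list(combinations(yy, 3))
print(len(pairs), n_choose_r(4, 2))        # 6 6
print(len(triples), n_choose_r(4, 3))      # 4 4

# With replacement an item may recur within a tuple.
ww = list(combinations_with_replacement(yy, 3))
print(len(ww), n_choose_r(4 + 3 - 1, 3))   # 20 20
```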
combinations_with_replacement(a, n) is the counterpart of
combinations(a, n) with replacement of each item in a. ww [28] is the
(a)
trp@trp-Veriton-Series:~$ python3.5 -q
>>> from itertools import * [1]
>>> pp = product('abc', '98') [2]
>>> list(pp) [3]
[('a', '9'), ('a', '8'), ('b', '9'), ('b', '8'), ('c',
'9'), ('c', '8')]
>>> list(product('abc', '98', repeat = 2)) [4]
[('a', '9', 'a', '9'), ('a', '9', 'a', '8'), ('a', '9',
'b', '9'), ('a', '9', 'b', '8'), ('a', '9', 'c', '9'),
('a', '9', 'c', '8'), ('a', '8', 'a', '9'), ('a', '8',
'a', '8'), ('a', '8', 'b', '9'), ('a', '8', 'b', '8'),
('a', '8', 'c', '9'), ('a', '8', 'c', '8'), ('b', '9',
'a', '9'), ('b', '9', 'a', '8'), ('b', '9', 'b', '9'),
('b', '9', 'b', '8'), ('b', '9', 'c', '9'), ('b', '9',
'c', '8'), ('b', '8', 'a', '9'), ('b', '8', 'a', '8'),
('b', '8', 'b', '9'), ('b', '8', 'b', '8'), ('b', '8',
'c', '9'), ('b', '8', 'c', '8'), ('c', '9', 'a', '9'),
('c', '9', 'a', '8'), ('c', '9', 'b', '9'), ('c', '9',
'b', '8'), ('c', '9', 'c', '9'), ('c', '9', 'c', '8'),
('c', '8', 'a', '9'), ('c', '8', 'a', '8'), ('c', '8',
'b', '9'), ('c', '8', 'b', '8'), ('c', '8', 'c', '9'),
('c', '8', 'c', '8')]
>>> list(product([1, 2, 3, 4], repeat = 2)) [5]
[(1, 1), (1, 2), (1, 3), (1, 4), (2, 1), (2, 2), (2, 3),
(2, 4), (3, 1), (3, 2), (3, 3), (3, 4), (4, 1), (4, 2),
(4, 3), (4, 4)]
>>> a0 = 'pqrstuvwxyz' [6]
>>> list(zip_longest(a0, range(6))) [7]
[('p', 0), ('q', 1), ('r', 2), ('s', 3), ('t', 4), ('u',
5), ('v', None), ('w', None), ('x', None), ('y', None),
('z', None)]
>>> list(zip_longest('elephant', 'marshall', 'beautiful')) [8]
[('e', 'm', 'b'), ('l', 'a', 'e'), ('e', 'r', 'a'), ('p',
's', 'u'), ('h', 'h', 't'), ('a', 'a', 'i'), ('n', 'l',
'f'), ('t', 'l', 'u'), (None, None, 'l')]
Fig. 12.6 a Python Interpreter sequence demonstrating usage of some iterators in the
itertools module (continued in Fig. 12.6b). b Python Interpreter sequence demonstrating
usage of some iterators in the itertools module (continued from Fig. 12.6a)
(b)
>>> list(enumerate(list(zip_longest('elephant', 'marshall', 'beautiful')))) [9]
[(0, ('e', 'm', 'b')), (1, ('l', 'a', 'e')), (2, ('e',
'r', 'a')), (3, ('p', 's', 'u')), (4, ('h', 'h', 't')),
(5, ('a', 'a', 'i')), (6, ('n', 'l', 'f')), (7, ('t',
'l', 'u')), (8, (None, None, 'l'))]
>>> list(zip(range(2, 6), range(1, 5))) [10]
[(2, 1), (3, 2), (4, 3), (5, 4)]
>>> import operator [11]
>>> list(starmap(operator.sub,list(zip(range(2, 6), range(1, 5))))) [12]
[1, 1, 1, 1]
>>> list(starmap(pow,list(zip(range(12, 16),range(2, 6),list(repeat(7, 4)))))) [13]
[4, 6, 0, 1]
>>> e1= enumerate('Elephant', 1) [14]
>>> list(starmap(operator.mul, e1)) [15]
['E', 'll', 'eee', 'pppp', 'hhhhh', 'aaaaaa', 'nnnnnnn',
'tttttttt']
>>> ai = tee(a0, 3) [16]
>>> for jj in range(len(a0)):print(next(ai[0]), next(ai[1]), next(ai[2])) [17]
...
p p p
q q q
r r r
s s s
t t t
u u u
v v v
w w w
x x x
y y y
z z z
>>>
the iterator for the sequence of elements 4 (= $2^2$), 25 (= $(2 + 3)^2$),
64 (= $(2 + 2 * 3)^2$), 121 (= $(2 + 3 * 3)^2$), and so on.
list(islice(map(xx, count(2, 3)), 10)) returns a list of the first 10 elements
here [37]. The generalized version of the islice() iterator is islice(a, b, c, d)
where a is the sequence and the slice comprises the elements at indices b, b + d,
b + 2 * d, b + 3 * d, …, b + k * d, where b + k * d < c ≤ b + (k + 1) * d. With b,
c, and d being 4, 40, and 5 respectively the list in [38] starts with the element
at index 4 (corresponding to 2 + 4 * 3) and includes every subsequent 5th element
up to index 39: $(2 + 4 * 3)^2$, $(2 + 9 * 3)^2$, …, $(2 + 39 * 3)^2$.
As another example with a as a string ('a0b1c2d3'),
list(islice(cycle('a0b1c2d3'), 20)) [39] yields a list of 20 elements; the members
of 'a0b1c2d3' are cyclically repeated in sequence to produce a slice of length 20.
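The index arithmetic of islice() can be cross-checked as in the sketch below. squares() is a hypothetical helper that rebuilds the endless iterator afresh for each slice; expected is computed independently from the index formula.

```python
from itertools import count, cycle, islice

def squares():
    """Endless iterator for 4, 25, 64, 121, ... (squares of 2, 5, 8, ...)."""
    return map(lambda x: x * x, count(2, 3))

first_four = list(islice(squares(), 4))        # stop only
stepped = list(islice(squares(), 4, 40, 5))    # start, stop, step
expected = [(2 + 3 * i) ** 2 for i in range(4, 40, 5)]
print(first_four)                              # [4, 25, 64, 121]
print(stepped == expected)                     # True
print(''.join(islice(cycle('a0b1c2d3'), 10)))  # a0b1c2d3a0
```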
All the items in the itertools module have been imported into the Python
Interpreter sequence in Fig. 12.6 [1]. product(*aa, repeat = 1) is an iterator
for the Cartesian product of elements in the set of sequences represented by
*aa. If the sequences are aa1, aa2, aa3, …, the items represented are all the
tuples (aa1[i], aa2[j], aa3[k], …), generated with the rightmost index advancing
fastest: (aa1[0], aa2[0], aa3[0], …), (aa1[0], aa2[0], aa3[1], …), and so on.
This is true for the default value of the last argument (repeat = 1). If repeat
is specified as 2 the set of sequences is duplicated, as if each sequence were given
twice in the same order; the same is true of other repeat values also. A few
illustrative examples are shown in the figure. pp [2] is the iterator for the
Cartesian product generated by 'abc' and '98' without repetition. list(pp) [3]
confirms this. With repeat = 2 in [4] the set of sequences is duplicated.
list(product('abc', '98', repeat = 2)) is a list of 36 items; each item has four
elements, formed by joining every ordered pair of the six items in list(pp). If
product() has a single sequence as argument the iterator is for the product of the
same sequence repeated repeat times. list(product([1, 2, 3, 4], repeat = 2)) [5]
is the sequence of all possible integer pairs using [1, 2, 3, 4].
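The behaviour of product() with and without repeat can be confirmed with a short sketch. The names pp, p2, and nested are illustrative (pp is held as a list here, not as an iterator).

```python
from itertools import product

pp = list(product('abc', '98'))
print(pp[:3])                 # [('a', '9'), ('a', '8'), ('b', '9')]

# The same items from explicit nested loops (rightmost varies fastest).
nested = [(x, y) for x in 'abc' for y in '98']
print(nested == pp)           # True

# repeat = 2 behaves like product('abc', '98', 'abc', '98').
p2 = list(product('abc', '98', repeat=2))
print(len(p2))                                          # 36
print(p2 == list(product('abc', '98', 'abc', '98')))    # True
print(p2 == [x + y for x in pp for y in pp])            # True
```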
zip_longest(*ai, fillvalue = None), as an iterator, is a variant of zip (see
Sect. 5.8). It points to the set of tuples formed by combining the corresponding
elements of all the iterable arguments. The sequence continues until the longest
iterable is exhausted. All deficiencies are filled by the specified fillvalue
(with default None). Note that in contrast zip() stops when the shortest iterable
is exhausted. With a0 (= 'pqrstuvwxyz') as a string [6]
list(zip_longest(a0, range(6))) produces a list of tuples [('p', 0), ('q', 1),
('r', 2), ('s', 3), ('t', 4), ('u', 5), ('v', None), ('w', None), ('x', None),
('y', None), ('z', None)] [7].
list(zip_longest('elephant', 'marshall', 'beautiful')) [8] produces a list of
tuples combining letters from all the three words in the same order. The last of
them is (None, None, 'l'), accommodating the last letter ('l') of 'beautiful' with
two Nones preceding it. Continuing with Fig. 12.6b,
list(enumerate(list(zip_longest('elephant', 'marshall', 'beautiful')))) enumerates
the above set of tuples starting at 0 and forms a corresponding list [9].
starmap(ff, aa), as an iterator, is a variant of map. With ff as a function, it
represents the sequence of function values ff(*aa[0]), ff(*aa[1]), … Here aa[i] is
the ith element of aa; it constitutes the set of arguments for ff for the specific
i value. Contrast with map(ff, a, b, c, …) which is the iterator for the function
values ff(a[0], b[0], c[0], …), ff(a[1], b[1], c[1], …), …
In [10] list(zip(range(2, 6), range(1, 5))) represents the list [(2, 1), (3, 2),
(4, 3), (5, 4)]; in turn list(starmap(operator.sub, list(zip(range(2, 6),
range(1, 5))))) is a list of four ones representing [2 - 1, 3 - 2, 4 - 3, 5 - 4]
respectively [12]. Similarly list(starmap(pow, list(zip(range(12, 16), range(2, 6),
list(repeat(7, 4)))))) leads to the list of integers [4 (= $12^2$ mod 7),
6 (= $13^3$ mod 7), 0 (= $14^4$ mod 7), 1 (= $15^5$ mod 7)] [13]. With e1
(= enumerate('Elephant', 1)) [14], same as in [6] above,
list(starmap(operator.mul, e1)) forms the list ['E' (= 'E' * 1), 'll' (= 'l' * 2),
…, 'tttttttt' (= 't' * 8)] [15].
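The contrast between zip() and zip_longest(), and between map() and starmap(), is sketched below with short illustrative inputs (the variable names short, padded, diffs, and powers are assumptions made for the example).

```python
from itertools import zip_longest, starmap
import operator

short = list(zip('pqrs', range(2)))              # stops at the shorter one
padded = list(zip_longest('pqrs', range(2)))     # gaps filled with None
filled = list(zip_longest('pqrs', range(2), fillvalue=-1))
print(short)     # [('p', 0), ('q', 1)]
print(padded)    # [('p', 0), ('q', 1), ('r', None), ('s', None)]
print(filled)    # [('p', 0), ('q', 1), ('r', -1), ('s', -1)]

pairs = list(zip(range(2, 6), range(1, 5)))      # [(2, 1), (3, 2), (4, 3), (5, 4)]
diffs = list(starmap(operator.sub, pairs))       # each tuple is unpacked
print(diffs)                                     # [1, 1, 1, 1]
# map takes the arguments as separate parallel sequences instead.
print(list(map(operator.sub, range(2, 6), range(1, 5))))   # [1, 1, 1, 1]
powers = list(starmap(pow, [(12, 2, 7), (13, 3, 7)]))      # pow(x, y, z)
print(powers)                                    # [4, 6]
```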
tee(aa, n = 2) accepts argument aa as a sequence. With its elements n iterators
are formed, all of them identical and each representing the elements of aa itself;
the default value of n is two. ai (= tee(a0, 3)) forms a set of three identical
iterators for the elements of a0 [16]. The corresponding full sets of elements are
printed out in [17].
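That the iterators returned by tee() advance independently of one another can be seen in this sketch; the shortened string and the names i1, i2, and i3 are illustrative.

```python
from itertools import tee

a0 = 'pqrst'
i1, i2, i3 = tee(a0, 3)       # three independent iterators over a0

head = (next(i1), next(i1))   # only i1 advances
print(head)                   # ('p', 'q')
print(next(i2))               # p: i2 still starts from the beginning
full = list(i3)
tail = list(i1)
print(full)                   # ['p', 'q', 'r', 's', 't']
print(tail)                   # ['r', 's', 't']
```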
12.2.1 Filtering
A set of four iterator functions is available in the itertools module to filter
data out of sequences in different ways. The Python Interpreter sequence in
Fig. 12.7 illustrates their use. compress(aa, bb) has aa as a sequence. In the
(a)
trp@trp-Veriton-Series:~$ python3.5 -q
>>> from demo_5 import marks2 [1]
>>> from operator import itemgetter [2]
>>> from itertools import groupby [3]
>>> nm = itemgetter('Name') [4]
>>> for ky, mk in groupby(marks2.st, nm): [5]
...     print(ky) [6]
...     for mr in mk:print(' ', mr['Marks']) [7]
...
Karthik
[77, 78, 79, 80, 81]
Sarani
[76, 78, 82, 83, 84]
Karun
[85, 86, 87, 88, 89]
Kala
[90, 86, 91, 92, 93]
Lan
[65, 86, 66, 67, 68]
>>> for ky, mk in groupby(marks2.st, nm): [8]
...     print('NAME: ', ky) [9]
...     for rr in mk:print(rr) [10]
...
NAME: Karthik
{'Marks': [77, 78, 79, 80, 81], 'Name': 'Karthik'}
NAME: Sarani
{'Marks': [76, 78, 82, 83, 84], 'Name': 'Sarani'}
NAME: Karun
{'Marks': [85, 86, 87, 88, 89], 'Name': 'Karun'}
NAME: Kala
{'Marks': [90, 86, 91, 92, 93], 'Name': 'Kala'}
NAME: Lan
{'Marks': [65, 86, 66, 67, 68], 'Name': 'Lan'}
>>> nms, mks = [], []
>>> for ky, mk in groupby(marks2.st, nm): [11]
...     nms.append(ky) [12]
...     for mr in mk:mks.append(mr['Marks']) [13]
...
>>> nms [14]
['Karthik', 'Sarani', 'Karun', 'Kala', 'Lan']
>>> mks [15]
[[77, 78, 79, 80, 81], [76, 78, 82, 83, 84], [85, 86, 87,
88, 89], [90, 86, 91, 92, 93], [65, 86, 66, 67, 68]]
Fig. 12.8 a Python Interpreter sequence illustrating the use of groupby (continued in
Fig. 12.8b). b Python Interpreter sequence illustrating the use of groupby (continued from
Fig. 12.8a)
(b)
>>> nm0, mk0 = [], []
>>> for ky,mk in groupby(marks2.mm,lambda nn:nn[0]): [16]
...     for mb in mk: [17]
...         nm0.append(mb[0]) [18]
...         mk0.append(sum(mb[1])) [19]
...
>>> nm0 [20]
['Kishore', 'Sanjay', 'Siva', 'Asha', 'Nisha']
>>> mk0 [21]
[395, 391, 228, 418, 300]
>>> q0 = ['a0', 'b1', 'b1', 'b1', 'c2', 'c2', 'a0', 'a0',
'a0'] [22]
>>> ql1 = [(x, list(y)) for x, y in groupby(q0)] [23]
>>> ql1 [24]
[('a0', ['a0']), ('b1', ['b1', 'b1', 'b1']), ('c2', ['c2',
'c2']), ('a0', ['a0', 'a0', 'a0'])]
>>> q1 = sorted(q0) [25]
>>> q1
['a0', 'a0', 'a0', 'a0', 'b1', 'b1', 'b1', 'c2', 'c2']
>>> ql2 = [(x, list(y)) for x, y in groupby(q1)] [26]
>>> ql2 [27]
[('a0', ['a0', 'a0', 'a0', 'a0']), ('b1', ['b1', 'b1',
'b1']), ('c2', ['c2', 'c2'])]
Fig. 12.9 Marks details of a few students used to illustrate use of groupby()
exp_f(a0) (with a0 = 1.2) is assigned afresh to a1 in [5]. The program suite in the
following lines evaluates exp(1.2) to an accuracy of $1.0 \times 10^{-10}$ by
successively invoking next(a1) until the required accuracy is achieved (if
achievable in 30 iterative cycles) [7]. The value obtained for exp(1.2) is
3.320116922735597 (compare with the value from the calculator as 3.320116923) [8].
Use of a generator with yield in exp_f makes the computation elegant. Every
new term, the (i + 1)th, is evaluated by multiplying the previous one (the ith) by
x/(i + 1). Every recurring series can be evaluated similarly (recursively).
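The recurrence between successive terms can be sketched with a generator as below. This is not the exp_f of the figure (which is not reproduced here); exp_terms() and exp_series() are hypothetical names, and the stopping rule (difference between successive partial sums below tol, within max_cycles steps) is an assumption modelled on the description above.

```python
import math

def exp_terms(x):
    """Yield successive partial sums of the power series for exp(x)."""
    term, total, i = 1.0, 1.0, 0
    while True:
        yield total
        i += 1
        term *= x / i            # term i obtained from term i - 1
        total += term

def exp_series(x, tol=1.0e-10, max_cycles=30):
    """Return the partial sum once successive sums differ by less than tol."""
    gen = exp_terms(x)
    prev = next(gen)
    for _ in range(max_cycles):
        cur = next(gen)
        if abs(cur - prev) < tol:
            return cur
        prev = cur
    return None                  # accuracy not reached within max_cycles

print(exp_series(1.2), math.exp(1.2))
```

Returning None when the limit on cycles is exceeded keeps the caller aware that the series has not converged, as would happen for a large argument with few cycles allowed.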
Example 12.2 Realize the random number generator considered in Exercise 6.8
using yield.
The random number generator is defined as function rny in Fig. 12.12; its
arguments a, c, m, and sd are assigned default values as specified (1 103 515 245,
12 345, 2 147 483 648 and 753 respectively). xn is defined recursively in [2] and
yield xn [3] returns the next xn from the generator. The function is assigned
to ay in [4] with the default seed value (sd = 753). next(ay) in [5] returns the
first random number value; the subsequent three random numbers are in [6];
additional random numbers can be obtained in the same manner. With the seed set
to 234, the first four random numbers obtained are in [8]. ryd() in [9] is an
enhanced version of rny() considered above. If a d value is specified the generator
corresponds to the random number sequence in the range (0, d - 1) [10]. If d is not
specified, by default ryd() returns a random number in the range (0, m - 1)
itself [11]. ryd() has been assigned to cy in [13]. When d is not specified, with the
default seed value (753) the sequence returned [14] is identical to that in [5] and
[6] with ay. Similarly with sd = 234 as seed [15] the random number sequence
returned [16] is the same as in [8]. With d = 257 [17], the random number sequence
has range (0, 256). The first four values are in [18].
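A generator of this kind may be sketched as follows. The function lcg() is a hypothetical stand-in, not the rny() or ryd() of Fig. 12.12: it assumes the usual linear congruential recurrence x(n+1) = (a * x(n) + c) mod m with the default values quoted above, and it assumes that the reduced range is obtained with a plain modulo by d.

```python
def lcg(a=1103515245, c=12345, m=2147483648, sd=753, d=None):
    """Yield x(n+1) = (a * x(n) + c) mod m, optionally reduced to 0 .. d - 1."""
    xn = sd
    while True:
        xn = (a * xn + c) % m
        # Assumption: the reduced range is obtained with a plain modulo.
        yield xn if d is None else xn % d

ay = lcg()                                   # default seed
first_four = [next(ay) for _ in range(4)]
small = lcg(d=257)                           # values in the range 0 .. 256
small_four = [next(small) for _ in range(4)]
print(first_four)
print(small_four)
```

Since the state is carried inside the generator, two generators started with the same seed return the same sequence independently of each other.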
The generator functions formed through yield can be chained together to
accommodate multiple sequences. The Python interpreter sequence in Fig. 12.13
shows two illustrative examples. nn1, nn2, and nn3 in [1], [2], and [3]
(Fig. 12.13a) are three tuples of names. Correspondingly wish_nn [4] defines a
sequence of generator loops to wish the sets in nn1, nn2, and nn3
respectively. The set in nn1 is wished first [5] and the yield in [6] returns the
wished name ('Lava'). Similar wishes follow for those in nn2 and nn3 in the same
order. list(wish_nn()) in [9] prints the wishes in the desired sequence. 'Hello
Lava, Good day to you!' is printed out in the first call since 'Lava' is the first
element in nn1. Since yield returns 'Lava' [6], the same is added to the list
[9]. Similarly at the second call 'Hello Kusha, Good day to you!' is printed out
since 'Kusha' is the next element in nn1. In turn 'Kusha' is added to the list [9].
With nn1 exhausted, control is transferred to the next set [7]. Execution continues
in the same manner with the third set in [8] and then to completion. The list in
[10] has seven names corresponding to their successive returns in sequence: the
first two are from nn1, the following three from nn2, and the remaining two from nn3.
(a)
>>> nn1 = 'Lava', 'Kusha' [1]
>>> nn2 = 'Tom', 'Dick', 'Harry' [2]
>>> nn3 = 'Queen of Spades', 'King of Hearts' [3]
>>> def wish_nn(): [4]
...     for jj in range(len(nn1)): [5]
...         print('Hello {}, Good day to you!'.format(nn1[jj]))
...         yield nn1[jj] [6]
...     for jj in range(len(nn2)): [7]
...         print('Hello {}, Good day to you!'.format(nn2[jj]))
...         yield nn2[jj]
...     for jj in range(len(nn3)): [8]
...         print('Hello {}, Good day to you too, dear!'.format(nn3[jj]))
...         yield nn3[jj]
...
>>> list(wish_nn()) [9]
Hello Lava, Good day to you!
Hello Kusha, Good day to you!
Hello Tom, Good day to you!
Hello Dick, Good day to you!
Hello Harry, Good day to you!
Hello Queen of Spades, Good day to you too, dear!
Hello King of Hearts, Good day to you too, dear!
['Lava', 'Kusha', 'Tom', 'Dick', 'Harry', 'Queen of
Spades', 'King of Hearts'] [10]
Fig. 12.13 a Illustration of direct use of yield to chain generator functions (continued in
Fig. 12.13b). b Illustration of direct use of yield to chain generator functions (continued from
Fig. 12.13a)
The three generator loops in wish_nn() have been separated out as wish_1()
[11], wish_2() [12], and wish_3() [13] respectively (Fig. 12.13b); all these three
are combined into a composite generator function wish_g() in [14].
list(wish_g()) in [15] invokes the full set. The output can be seen to be identical
to that obtained earlier.
The use of yield from, as done in the generator wish_g() in [14], facilitates
chaining of generators and transfer of execution from one generator to another.
Further (if desired) the generator functions wish_1(), wish_2(), and wish_3() can
be altered at a later date without the need to touch the master generator wish_g().
With this (when the application demands), one can decide the overall structure of a
program and get into the details separately later.
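What yield from does can be seen by comparing it with the explicit loops it replaces. The generators countdown(), letters(), combined(), and combined_explicit() below are hypothetical examples introduced only for this comparison.

```python
def countdown(n):
    """Yield n, n - 1, ..., 1."""
    while n > 0:
        yield n
        n -= 1

def letters(word):
    """Yield the characters of word one by one."""
    for ch in word:
        yield ch

def combined():
    yield from countdown(3)      # delegates until countdown is exhausted
    yield from letters('go')     # then continues with the next generator

def combined_explicit():
    for item in countdown(3):    # the same chaining written out in full
        yield item
    for item in letters('go'):
        yield item

print(list(combined()))          # [3, 2, 1, 'g', 'o']
print(list(combined()) == list(combined_explicit()))   # True
```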
12.4 iterator Formation 325
(b)
>>> def wish_1(): [11]
...     for jj in range(len(nn1)):
...         print('Hello {}, Good day to you!'.format(nn1[jj]))
...         yield nn1[jj]
...
>>> def wish_2(): [12]
...     for jj in range(len(nn2)):
...         print('Hello {}, Good day to you!'.format(nn2[jj]))
...         yield nn2[jj]
...
>>> def wish_3(): [13]
...     for jj in range(len(nn3)):
...         print('Hello {}, Good day to you too, dear!'.format(nn3[jj]))
...         yield nn3[jj]
...
>>> def wish_g(): [14]
...     yield from wish_1()
...     yield from wish_2()
...     yield from wish_3()
...
>>> list(wish_g()) [15]
Hello Lava, Good day to you!
Hello Kusha, Good day to you!
Hello Tom, Good day to you!
Hello Dick, Good day to you!
Hello Harry, Good day to you!
Hello Queen of Spades, Good day to you too, dear!
Hello King of Hearts, Good day to you too, dear!
['Lava', 'Kusha', 'Tom', 'Dick', 'Harry', 'Queen of Spades', 'King of Hearts']
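As a compact cross-check of what yield from does, the following sketch compares it with explicit delegation loops. The generator names (letters, digits, and the two chaining functions) are hypothetical and are not part of Fig. 12.13; the sketch only assumes that the sub-generators yield plain values.

```python
# Hypothetical sub-generators, standing in for wish_1(), wish_2(), ...
def letters():
    yield 'a'
    yield 'b'

def digits():
    yield 1
    yield 2

# Master generator written with yield from
def chained_with_yield_from():
    yield from letters()
    yield from digits()

# The same chaining written with explicit loops
def chained_with_loops():
    for item in letters():
        yield item
    for item in digits():
        yield item

result_a = list(chained_with_yield_from())
result_b = list(chained_with_loops())
print(result_a, result_a == result_b)
```

For generators that only yield values, the two forms produce the same sequence; yield from is simply the shorter way of writing the delegation.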
The __iter__() and __next__() methods can be built into a user-defined class to impart iterator behaviour to it. Starting with a basic set of parameters, an iterator can be generated in this manner. The Python Interpreter sequence in Fig. 12.14 carries a few illustrative examples.
As a class, Ff_0 returns an iterator for a sequence: an arithmetic progression in a finite field [1]. An integer sequence starts at ms and continues with successive increments of me until the integer value reaches md. With v10 as an element of this sequence, v10 (mod 11) is returned for every v10 to form the desired progression. def __init__() [2] assigns the argument values ms, me, and md to the corresponding instance variables. If the __next__() method is present in the class definition, as here [4], the __iter__() method [3] need return only self or a desired variant of self. Once the class is instantiated, each call of the __next__() method computes the next element value (transparent to the user of the instance) and returns it. In the example here, if the value of v10 exceeds the set limit the iteration is stopped [5]; else the next element value is computed as explained above and returned [7]. The __iter__() and the __next__() methods together play the role of next(iter()) with a sequence. z1 [8] is an instance of Ff_0. list(z1) [8] returns the full list of the integer sequence desired. As another instance, Ff_0(2, 27, 3) is assigned to z0 in [10]; functionally z0 is identical to z1 in [8]. The loop for jj in z0: print(jj, end = ' ') outputs the same set of numbers [11]. The example also brings out the basic operations with for. The for statement calls iter() on the sequence. Following this, each element of the concerned object is accessed through the next() method. Once the items in the sequence are exhausted (signalled by StopIteration), the for loop terminates.
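The iterator protocol described above can be sketched as follows. This is not the Ff_0 class of Fig. 12.14: the class name ModProgression, the argument order (start, limit, step), and the default modulus of 11 are assumptions made only for illustration.

```python
class ModProgression:
    """Arithmetic progression reduced modulo a fixed integer.

    Hypothetical stand-in for Ff_0; argument order is an assumption.
    """

    def __init__(self, start, limit, step, modulus=11):
        self.value = start      # current (unreduced) integer
        self.limit = limit      # iteration stops once value exceeds this
        self.step = step        # increment between successive integers
        self.modulus = modulus

    def __iter__(self):
        # __next__() is defined in the class, so the object is its own iterator
        return self

    def __next__(self):
        if self.value > self.limit:
            raise StopIteration  # tells for / list() that the items are exhausted
        current = self.value % self.modulus
        self.value += self.step
        return current

z1 = ModProgression(2, 27, 3)
seq = list(z1)        # consumes the iterator
seq_again = list(z1)  # the same iterator is already exhausted
print(seq, seq_again)
```

Since the instance is its own iterator, it can be traversed only once; a second pass needs a fresh instance.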
Class Ff_1 [12] has been defined with an __iter__() method in it but without the associated __next__() method. Hence the __iter__() method yields the desired value (it does not return it) [14]; that is, __iter__() is itself a generator function. With an instance of Ff_1, every call of iter() returns a fresh generator that supplies the elements one by one. z2 [15] and zz2 [17] are instances of Ff_1, similar to z1 and z0 above. The for based loops in [16] and [18] return the corresponding full sequences. In both cases the for loop calls the next method of the iterator returned by __iter__(). This obviates the need for defining a separate __next__() method in the class (as was done with Ff_0). z3 [19] is again an instance of Ff_1. The attempt to extract next(z3) [20] fails because z3 is an iterable but has not been defined as an iterator. Iterator formation using iter(z3) as in [21] and its use in the subsequent lines can be seen to be the correct usage.
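The alternative form, with __iter__() written as a generator and no __next__() in the class, can be sketched in the same spirit. Again this is not the Ff_1 of Fig. 12.14; the class name ModProgressionIterable and its argument order are assumptions.

```python
class ModProgressionIterable:
    """Iterable (not an iterator): __iter__() is a generator function.

    Hypothetical stand-in for Ff_1; argument order is an assumption.
    """

    def __init__(self, start, limit, step, modulus=11):
        self.start = start
        self.limit = limit
        self.step = step
        self.modulus = modulus

    def __iter__(self):
        value = self.start
        while value <= self.limit:
            yield value % self.modulus   # yields, does not return
            value += self.step

z3 = ModProgressionIterable(2, 27, 3)

# next() on the instance itself fails: it has no __next__() method
try:
    next(z3)
    next_failed = False
except TypeError:
    next_failed = True

it = iter(z3)          # iter() calls __iter__() and returns a generator
first, second = next(it), next(it)

full_once = list(z3)   # each pass gets a fresh generator ...
full_twice = list(z3)  # ... so the instance can be traversed repeatedly
print(next_failed, first, second, full_once)
```

A side effect of this design is that the instance can be looped over any number of times, unlike an object that acts as its own iterator.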
12.5 Decorators
Nested functions, a function forming an argument to another being defined, and one function returning another function: all these have been discussed in Sect. 4.1. Decorators provide a template with an elegant and flexible syntax for many nested function structures.
The Python Interpreter sequence in Fig. 12.15 has a few examples to facilitate understanding of decorators. Two functions jj1() [1] and ff1() [2] have been defined; function jj1(yy) [1] returns yy * yy. Function ff1(gg1, xx) has gg1 (a function) and xx (a number or an object returning a number) as its arguments [2]. It returns (gg1(xx) + 5.0) ** 0.5 [3]. ff1(jj1, 3.0) [4] uses jj1 as the function argument, invokes function ff1 for the number 3.0, and returns (3.0 * 3.0 + 5.0) ** 0.5 (= 3.7416573867739413). In a typical application functions jj1() and ff1() can be more involved and ff1 can be defined independently of jj1 itself. The decorator based implementation of this function pair follows from [5] to [11], with ff2, jj2, and hh2 used in place of ff1, jj1, and gg1 respectively. @ff2 [9] implies that the function following (jj2()) has been decorated by function ff2; here ff2 is the decorator function. Thanks to the @ff2 usage the linked pair, ff2 and jj2, behaves in the same manner as the pair jj1 and ff1 above. Functionally the two pairs are equivalent. [13] and [14] clarify this further. Function jj2 takes the place of function hh2 within ff2 and hence it is local to ff2.
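The equivalence of the two pairs can be sketched as below. The bodies follow the description in the text; the name of the inner function (inner) and the exact layout are assumptions, since Fig. 12.15 is not reproduced here.

```python
# Plain pair: ff1 receives the function as an explicit argument
def jj1(yy):
    return yy * yy

def ff1(gg1, xx):
    return (gg1(xx) + 5.0) ** 0.5

# Decorator pair: ff2 receives the function once and returns a wrapper
def ff2(hh2):
    def inner(xx):               # 'inner' is a hypothetical name
        return (hh2(xx) + 5.0) ** 0.5
    return inner

@ff2                             # same as: jj2 = ff2(jj2)
def jj2(yy):
    return yy * yy

direct = ff1(jj1, 3.0)
decorated = jj2(3.0)
print(direct, decorated)
```

After decoration the name jj2 refers to the wrapper returned by ff2, so a call jj2(3.0) plays the role of ff1(jj1, 3.0).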
As an additional illustration fn11() [22] has been decorated by pnt0() [21], [15]. The two together decide the name of a newborn baby conforming to traditional family customs. Parents and grandparents contribute their own to the name, and all these are combined into a single string to form the name and decide a pet name too. Function fn11() [22] prompts the parents [23] to return their contribution to the name as n0 [24]. The function pnt0() [15] forms the full name [20], forms the pet name, and returns these through the function nn() [16] defined within it. With the decoration through @pnt0 [21] the queries are made in the desired sequence to form the name and the pet name. As an example fn11() is invoked in [25] and the full name and pet name are formed and returned (as 'Veera Venkata Kumar' [26] and 'Veeku' [27] respectively).
12.6 functools
12.6.1 total_ordering
len() and iter() are examples of generic functions in Python. The actual functional implementation differs depending on the argument types, but the function returns an object with a predictable uniform pattern. The singledispatch decorator transforms a function into such a generic one. The type of the first argument of the function decides the alternate definition implemented, hence the term 'Single Dispatch Generic Function'. The register attribute of the generic function is used to do the desired overloading. The procedure is illustrated through an example in the Python Interpreter sequence in Fig. 12.17.
Function ff(aa, bb) [3] has aa (a string) and bb (a list of strings) as its two arguments. It returns cc as a single string joining aa to the string formed by joining the elements of bb. An illustrative invocation of ff follows in [4], where the string 'pp' is joined to the string formed by joining the set of strings 'qq', 'rr', and 'ss'; the result is the single string 'ppqqrrss'.
The presence of the singledispatch decorator [2] transforms ff into a Single Dispatch Generic Function. For an integer type argument the desired overloading is enabled [6] through the register attribute. With cc as an integer, the second argument (dd) is taken to be a list of integers; the elements of dd are summed up to form an integer which is added to cc, and the sum is returned in hex form. [7] is an illustration of its use.
Another example of a similar overloading follows from [8]. With gg as a floating point number and hh as a list of similar floating point numbers, the elements of hh are summed up and added to gg, and the sum is expressed in hex form and returned [9]. An illustrative example is in [10].
Fig. 12.17 Single dispatch decorator transforming a function into the generic form
In all the above three invocations of ff, [4], [7], and [10], the dispatch mechanism of ff steers the call to the appropriate function. ff.registry.keys() [11] returns the types for which implementations of ff have been registered. The items ff.registry[object], ff.registry[int], and ff.registry[float] show the respective distinct function objects [12]. ff.dispatch() in [13] confirms the steering done based on the type of argument. The keys also point to the respective dispatch locations.
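A minimal sketch of the scheme described above follows. It is written from the prose only, so the function bodies are assumptions (in particular, how Fig. 12.17 forms the hex strings is not known here); the functools calls used (singledispatch, register, registry, dispatch) are the standard ones.

```python
from functools import singledispatch

@singledispatch
def ff(aa, bb):
    # Default implementation: string aa joined to the strings in bb
    return aa + ''.join(bb)

@ff.register(int)
def _(cc, dd):
    # Chosen when the first argument is an int
    return hex(cc + sum(dd))

@ff.register(float)
def _(gg, hh):
    # Chosen when the first argument is a float
    return (gg + sum(hh)).hex()

as_str = ff('pp', ['qq', 'rr', 'ss'])
as_int = ff(1, [2, 3])
as_float = ff(1.5, [2.5])

print(as_str, as_int, as_float)
print(list(ff.registry.keys()))                 # registered types
print(ff.dispatch(int) is ff.registry[int])     # steering for an int argument
```

Only the type of the first argument takes part in the selection; the second argument is passed through unchanged to whichever implementation is chosen.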
A function which has been defined elaborately can be used to yield a curtailed function, with the arguments of the curtailed portion assigned implicitly. The curtailed function can be used separately. The partial function in functools facilitates this. The Python Interpreter sequence in Fig. 12.18 has an illustrative application. With p, q, and r as the components of a 3-D vector, the function VVCC [1] returns its Euclidean magnitude $\sqrt{p^2 + q^2 + r^2}$. The vector 2.0i + 3.1j + 4.2k has the magnitude 5.5901699437494745. partial has been imported in [3] from functools. vc2_0 has r in VVCC frozen at zero [4]. This makes vc2_0 a function that returns the 2-D vector magnitude $\sqrt{p^2 + q^2}$. vc2_0(2.0, 3.1) in [5] returns $\sqrt{2.0^2 + 3.1^2}$ as 3.6891733491393435.
The general form of partial usage is partial(ff, *agg, **kka); (*agg, **kka) forms the frozen set here.
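The freezing of r can be sketched as below. The body of VVCC is an assumption (the text gives only the formula it evaluates), and freezing r by keyword is one of several ways of writing it.

```python
from functools import partial
from math import sqrt

def VVCC(p, q, r):
    # Euclidean magnitude of the 3-D vector (p, q, r); body assumed from the formula
    return sqrt(p * p + q * q + r * r)

# Freeze r at zero: vc2_0 now needs only p and q
vc2_0 = partial(VVCC, r=0.0)

mag3 = VVCC(2.0, 3.1, 4.2)
mag2 = vc2_0(2.0, 3.1)
print(mag3, mag2)
print(vc2_0.func is VVCC, vc2_0.keywords)   # what has been frozen
```

The partial object keeps the original function in its func attribute and the frozen values in args and keywords, so the curtailed function can be inspected as well as called.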
The reduce function in the functools module has the functional form reduce(ff, aa, bb). Here ff is a function and aa an iterable; bb is optional. ff() has to be a function of two arguments. The function is applied cumulatively to all the elements of aa in succession. If bb is specified, ff(bb, aa[0]) is done first; the result forms the first argument for the next ff(), and ff(ff(bb, aa[0]), aa[1]) is done next. The sequence of such reduction operations is continued with all the elements of aa and the final reduced result returned. If bb (the initializer) is not specified, the function starts with aa[0] and aa[1] as the two arguments. The resulting reduction sequence is ff(aa[0], aa[1]), ff(ff(aa[0], aa[1]), aa[2]), ff(ff(ff(aa[0], aa[1]), aa[2]), aa[3]), and so on until completion. As an illustration, reduce is imported [6] in Fig. 12.18 to compute and return 6!. The lambda function in [7] returns the product of its two arguments; the initializer is not specified, so the reduction starts with unity, the first element of range(1, 7). Multiplication is done in succession up to six (range(1, 7) stops short of seven), and 6! (= 720) is computed and returned.
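The reduction order and the role of the initializer can be checked with a short sketch. The variable names are illustrative only; the string-building lambda is added merely to make the nesting of the calls visible.

```python
from functools import reduce

product = lambda xx, yy: xx * yy

# No initializer: starts with the first two elements of range(1, 7)
fact6 = reduce(product, range(1, 7))        # 1*2*3*4*5*6

# With an initializer: ff(1, 1) is evaluated first, then the rest
fact7 = reduce(product, range(1, 8), 1)     # 1*1*2*...*7

# The initializer is also what is returned for an empty iterable
empty_case = reduce(product, [], 1)

# Showing the left-to-right nesting of the calls
nesting = reduce(lambda acc, item: '({}, {})'.format(acc, item),
                 ['a0', 'a1', 'a2'])

print(fact6, fact7, empty_case, nesting)
```

Without an initializer, reduce raises TypeError on an empty iterable, which is one practical reason to supply bb.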
12.7 Exercises
1. The suite starting with [1] in Fig. 12.19 is a generator, a modified version of the random number generator program for Example 12.2. Run the program and explain its behaviour. Do the same with the modified version starting with [2].
2. In Fig. 12.13 replace yield nn1[jj], yield nn2[jj], and yield nn3[jj] by a bare yield in each case; run the programs and explain the respective outputs.
3. Modify pnt0() in Fig. 12.15 to include contributions from great grandparents.
Also accommodate separate print output for the baby being a boy or a girl
through an additional query for it.
4. Repeat the above with two decorator functions: one for grandparents and the other for great grandparents.
5. The perf_counter() function in the time module can be used to ascertain the time duration for the execution of routines invoking decoration.
Define function timing() as
def timing(ff, *arg, **karg):
    from time import perf_counter
    t0 = perf_counter()
    gg = ff(*arg, **karg)
    t1 = perf_counter()
    del_t = t1 - t0
    return del_t
Use timing() inside a decorator function tdr. With
@tdr
def jj(. . )
...
return the time of execution of jj.
Write a program to generate Fibonacci numbers recursively from f(n) = f(n - 1) + f(n - 2) and get the execution time for f(6) with f(0) = 2 and f(1) = 5.
6. Marks data for a set of students is given as in Fig. 5.16. Rank the students with
the following criteria:
The rank is decided by the total marks obtained in all the subjects.
In case of a tie the student with higher marks in mathematics has the higher
rank.
In case of a tie with equal total marks and marks in mathematics as well,
marks obtained in physics is taken as the next one for comparison and rank
assignment.
Similar resolution of multiple ties is carried out with priority in the order:
total marks, marks in mathematics, physics, chemistry, and then English.
Prepare student mark lists using random numbers as in Exercise 5 in
Chap. 9. Test the program with this data.
Adapt the merge-sort algorithm (Exercise 13(b) in Chap. 6) to carry out the ranking.
7. The decision to procure a refrigerator of a given size is to be made by comparing the closely similar products offered by 10 companies. The refrigerators differ marginally in their specifications, price, etc. Write a program to carry out a comparison and do the ranking. The criteria to be used, in the order of decreasing importance, are: cooling size (volume), warranty period, warranty for the cooling system, input voltage range, price, and brand name (assign an index on a scale of one to five for brand).
Modify the above by assigning weights to the individual criteria.
Do a comparison with actual data.
(A similar comparison may be done with other products. The procedure can be adopted to compare strategies, compare candidates for selection to a position, etc.)
8. The overload to a generic function can be used with different types of arguments as discussed in Sect. 12.6.2. With $(a_1, a_2, a_3, a_4)$ as a 4-D vector the Euclidean vector magnitude is $v_m = (a_1^2 + a_2^2 + a_3^2 + a_4^2)^{0.5}$.
Given a vector as a tuple write a program to get the magnitude of the vector. As illustrated in Sect. 12.6.2 use the tuple as the argument and the function for $v_m$ as the additional argument in the overloading scheme. Use the program to get the vector magnitude for the vector (9.1, 8.2, 7.3, 6.4).
9. With $v_a = (a_1, a_2, a_3)$ and $v_b = (b_1, b_2, b_3)$ as two vectors, the angle $\theta$ between $v_a$ and $v_b$ can be obtained as follows (Anton and Rorres 2005):
Get the unit vectors along $v_a$ and $v_b$ as
$a_u = (a_{1u}, a_{2u}, a_{3u})$
and
$b_u = (b_{1u}, b_{2u}, b_{3u})$
where
$a_{ju} = \dfrac{a_j}{\sqrt{a_1^2 + a_2^2 + a_3^2}}$
for all j, and similarly for all $b_{ju}$. Then $\cos\theta = a_{1u} b_{1u} + a_{2u} b_{2u} + a_{3u} b_{3u}$. Given $v_a$ and $v_b$ write a program to get $\cos\theta$.
Make the two vectors as a list of the two tuple vectors. Get $\cos\theta$ by overloading as illustrated in Sect. 12.6.2. Evaluate $\cos\theta$ for va = [2.3, 3.4, 4.5, 6.8] and vb = [7.9, 9.1, 1.2, 3.4].
10. Let va and vb be two four dimensional vectors, each being given as a list of numbers representing the component magnitudes. Define a single dispatch generic function which will add the two vectors and return the sum vector of four components. Test it with va = [2.3, 3.4, 4.5, 6.8] and vb = [7.9, 9.1, 1.2, 3.4].
11. va and vb are two four dimensional vectors, each being given as a tuple of the four vector component magnitudes. Define a single dispatch generic function to return the dot product of the two vectors (the dot product is $\sum_j a_j b_j$). Test it with va = (11.2, 12.3, 9.4, 8.7) and vb = (22.4, 35.6, 42.3, 9.87). Note that the dot product is also equal to $v_{ma} v_{mb} \cos\theta$, with $v_{ma}$ and $v_{mb}$ being the vector magnitudes (as in Exercise 8 above) and $\theta$ the angle between the vectors (as in Exercise 9 above).
12. Let v represent a 5-D vector (v1, v2, v3, v4, v5). With va, vb, vc, vd, and ve as five 5-D vectors define functions to get the vector magnitude, unit vectors, and the angle between two vectors. Define a 5-D orthogonal reference base unit vector set; for this start with va and form the rest of the unit vector set with vb, vc, vd, and ve in that order as follows (Gram-Schmidt procedure) (Anton and Rorres 2005):
Form the unit vector along va as u1; this is the first component of the reference base unit vector set.
u1 · vb (the dot product of u1 and vb) is the projection of vb on va. Form the vector vb - (u1 · vb) u1 and the unit vector u2 along this. This is the second component of the reference base unit vector set.
Starting with vc subtract the components along u1 and u2 from it; form the unit vector u3 along this as the third component of the reference base unit vector set.
Proceed similarly to get u4 and u5 to complete the 5-D orthogonal reference base unit vector set.
Write a program to get the 5-D orthogonal reference base unit vector set. Test it with a specific vector set.
Use the *arg and **kwarg constructs to generalize the program to an n-D vector set.
13. Start with the program for the 5-D vector set in the foregoing exercise. With partial define a new 3-D limited vector set and its corresponding functions. Test it with va = [2.3, 3.4, 4.5] and vb = [7.9, 9.1, 1.2], making the limited set with the fourth and the fifth components as (2, 2).
14. Let $\{x_i\}$ be a given sequence of numbers (these can be regular samples of a continuous signal). A corresponding smoothened (filtered) sequence $\{y_i\}$ can be obtained from $\{x_i\}$ by using a moving window filter. The simplest moving window filter of length l (an odd integer) takes the average of the sample set $\{x_{i-b}, x_{i-b+1}, \ldots, x_i, x_{i+1}, \ldots, x_{i+b}\}$ where $b = (l - 1)/2$ and assigns it to $y_i$. Note that if $x_i$ has the range zero to n (inclusive), $y_i$ has the range $-b$ to $n + b$. Two routines to do window filtering are given in Fig. 12.20 (yy0() and yy2()) where l is taken as 11. The former program computes $y_i$ directly. In the latter case $y_i$ for any i is computed using the already computed $y_{i-1}$ and modifying it. Get $\{y_i\}$ for $\{x_i\}$ = list(range(15)) and $\{x_i\}$ = list(2 for j in range(15)). The function mtr() is a monitoring function to find the time for execution of the function inside it. It is used as a decorator for function yy0(). Assign it as a decorator for yy2() also and find the time for execution for both sets of $\{x_i\}$. Explain why execution takes conspicuously less time with yy2().
15. Use the timeit module (vide Sect. 11.5) to measure the execution times for yy0() and yy2() in the programs of the last exercise and compare the results.
import time
def mtr(ff0):
    def spvr0(*args):
        ts = time.perf_counter()
        rr = ff0(*args)
        dt = time.perf_counter() - ts
        print('[%0.6fs]'%(dt))
        return rr
    return spvr0
@mtr
def yy0(xx):
    'Moving window filter of length ll'
    aa, ll = len(xx), 11
    bb = (ll-1)//2
    y0 = []
    for jj in range(-bb-2,aa+bb+2):
        sp = 0
        for kk in range(-bb, bb+1):
            if 0 <= jj+kk < aa:sp += xx[kk+jj]
        y0.append((round(sp/ll, 4), jj))
    return y0
def yy2(xx):
    'Moving window filter of length ll'
    aa, ll = len(xx), 11
    bb = (ll-1)//2
    y1 = [0]*(aa+ll)
    for jj in range(aa+ll):
        if jj == 0:y1[jj] = xx[0]/ll
        elif jj < ll:y1[jj] = y1[jj-1]+ xx[jj]/ll
        elif jj < aa:y1[jj] = y1[jj-1] + (xx[jj]-xx[jj-ll])/ll
        else:y1[jj] = y1[jj-1] - xx[jj-ll]/ll
    y2 = [round(jj, 4) for jj in y1]
    return [(y2[kk], kk - bb) for kk in range(len(y2))]
References