100 Page Python Intro
This book is a short, introductory guide for the Python programming language. This book is well suited:
Prerequisites
You should be already familiar with basic programming concepts. If you are new to programming, I'd highly recommend
my comprehensive curated list on Python (https://learnbyexample.github.io/py_resources/) to get started.
Why is it called 100 Page Python Intro when it has more than 100 pages?
There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors — Leon
Bambrick
The material I was using for my workshops was 56 pages. I had more chapters to add, but I thought it would be a
struggle to reach 100 pages, instead of overshooting the goal in the end. The measurement also depends on a few
factors. The main content will be less than 100 pages if I reduce the font size from 12 to 11, exclude cover, TOC, Preface,
etc.
Conventions
The examples presented here have been tested with Python version 3.9.5 and include features that are not
available in earlier versions.
Code snippets that are copy pasted from the Python REPL shell have been modified for presentation purposes.
For example, comments have been added to provide context and explanations, and blank lines and shortened error
messages are used to improve readability.
A comment with filename will be shown as the first line for program files.
External links are provided for further exploration throughout the book. They have been chosen with care to
provide more detailed resources on those topics as well as resources on related topics.
The 100_page_python_intro repo
(https://github.com/learnbyexample/100_page_python_intro/tree/main/programs) has all the programs and files
presented in this book, organized by chapter for convenience.
Visit Exercises.md
(https://github.com/learnbyexample/100_page_python_intro/blob/main/exercises/Exercises.md) to view all the
exercises from this book.
Acknowledgements
Official Python website (https://docs.python.org/3/) — documentation and examples
stackoverflow (https://stackoverflow.com/) and unix.stackexchange (https://unix.stackexchange.com/) — for
getting answers to pertinent questions on Python, Shell and programming in general
/r/learnpython (https://www.reddit.com/r/learnpython) and /r/learnprogramming
(https://www.reddit.com/r/learnprogramming) — helpful forum for beginners
/r/Python/ (https://www.reddit.com/r/Python/) — general Python discussion
tex.stackexchange (https://tex.stackexchange.com/) — for help on pandoc (https://github.com/jgm/pandoc/) and
tex related questions
Cover image:
Ilsa Olson (https://ko-fi.com/profetessaoscura) — cover art
LibreOffice Draw (https://www.libreoffice.org/discover/draw/) — title/author text
pngquant (https://pngquant.org/) and svgcleaner (https://github.com/RazrFalcon/svgcleaner) for optimizing
images
Warning (https://commons.wikimedia.org/wiki/File:Warning_icon.svg) and Info
(https://commons.wikimedia.org/wiki/File:Info_icon_002.svg) icons by Amada44
(https://commons.wikimedia.org/wiki/User:Amada44) under public domain
Dean Clark and Elijah for catching a few typos
E-mail: learnbyexample.net@gmail.com
Author info
Sundeep Agarwal is a freelance trainer, author and mentor. His previous experience includes working as a Design
Engineer at Analog Devices for more than 5 years. You can find his other works, primarily focused on Linux command
line, text processing, scripting languages and curated lists, at https://github.com/learnbyexample. He has also
been a technical reviewer for Command Line Fundamentals
(https://www.packtpub.com/application-development/command-line-fundamentals) book and video course published by
Packt.
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
(https://creativecommons.org/licenses/by-nc-sa/4.0/)
Images mentioned in Acknowledgements section above are available under original licenses.
Book version
1.2
Introduction
Wikipedia (https://en.wikipedia.org/wiki/Python_(programming_language)) does a great job of describing Python in
a few words. So, I'll just copy-paste relevant information here:
Python is an interpreted, high-level and general-purpose programming language. Python's design philosophy
emphasizes code readability with its notable use of significant whitespace. Its language constructs and object-
oriented approach aim to help programmers write clear, logical code for small and large-scale projects.
Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including
structured (particularly, procedural), object-oriented, and functional programming. Python is often described as a
"batteries included" language due to its comprehensive standard library.
Since 2003, Python has consistently ranked in the top ten most popular programming languages in the TIOBE
Programming Community Index where, as of February 2021, it is the third most popular language (behind Java,
and C).
Installation
On modern Linux distributions, you are likely to find Python already installed. It may be a few versions behind, but should
work just fine for most of the topics covered in this book. To get the exact version used here, visit Python downloads page
(https://www.python.org/downloads/) and install using the appropriate source for your operating system.
Using the installer from the downloads page is the easiest option to get started on Windows and macOS. See
docs.python: Python Setup and Usage (https://docs.python.org/3/using/index.html) for more information.
For Linux, check your distribution repository first. You can also build it from source, here's what I use on Ubuntu:
$ wget https://www.python.org/ftp/python/3.9.5/Python-3.9.5.tar.xz
$ tar -Jxf Python-3.9.5.tar.xz
$ cd Python-3.9.5
$ ./configure --enable-optimizations
$ make
$ sudo make altinstall
You may have to install dependencies first, see this stackoverflow thread (https://stackoverflow.com/q/8097161/4082052)
for details. Should you face any issues in installing, search online for a solution. Yes, that is something I expect you
should be able to do as a prerequisite for this book, i.e. you should have prior experience with basic programming and
computer usage.
Online tools
In case you are facing installation issues, or do not want to (or cannot) install Python on your computer for some reason,
there are plenty of options to execute Python programs using online tools. Some of the popular ones are listed below:
Repl.it (https://repl.it/languages/python3) — Interactive playground. Code, collaborate, compile, run, share, and
deploy Python and more online from your browser
Pythontutor (http://www.pythontutor.com/visualize.html#mode=edit) — Visualize code execution, also has
example codes and ability to share sessions
PythonAnywhere (https://www.pythonanywhere.com/) — Host, run, and code Python in the cloud
The official Python website (https://www.python.org/) also has a Launch Interactive Shell option
(https://www.python.org/shell/), which gives access to a REPL session.
First program
It is customary to start learning a new programming language by printing a simple phrase. Create a new directory, say
Python/programs for this book. Then, create a plain text file named hello.py with your favorite text editor and type
the following piece of code.
# hello.py
print('*************')
print('Hello there!')
print('*************')
If you are familiar with using command line on a Unix-like system, run the script as shown below (use py hello.py if
you are using Windows CMD). Other options to execute a Python program will be discussed in the next section.
$ python3.9 hello.py
*************
Hello there!
*************
A few things to note here. The first line is a comment, used here to indicate the name of the Python program. print() is
a built-in function, which can be used without having to load some library. A single string argument has been used for
each of the three invocations. print() automatically appends a newline character by default. The program ran without a
compilation step. As quoted earlier, Python is an interpreted language. More details will be discussed in later chapters.
info
All the Python programs discussed in this book, along with related text files,
can be accessed from my GitHub repo learnbyexample: 100_page_python_intro
(https://github.com/learnbyexample/100_page_python_intro/tree/main/programs). However, I highly recommend
typing the programs manually by yourself.
If you install Python on Windows, it will automatically include IDLE, an IDE built using Python's tkinter module. On
Linux, you might already have the idle3.9 program if you installed Python manually. Otherwise you may have to install
it separately, for example, sudo apt install idle-python3.9 on Ubuntu.
When you open IDLE, you'll get a Python shell (discussed in the next section). For now, click the New File option under
File menu to open a text editor. Type the short program hello.py discussed in the previous section. After saving the
code, press F5 to run it. You'll see the results in the shell window.
Screenshots from the text editor and the Python shell are shown below.
Thonny (https://thonny.org/) — Python IDE for beginners, lots of handy features like viewing variables, debugger,
step through, highlight syntax errors, name completion, etc
Pycharm (https://www.jetbrains.com/pycharm/) — smart code completion, code inspections, on-the-fly error
highlighting and quick-fixes, automated code refactorings, rich navigation capabilities, support for frameworks, etc
Spyder (https://www.spyder-ide.org/) — typically used for scientific computing
Jupyter (https://jupyter.org/) — web application that allows you to create and share documents that contain live
code, equations, visualizations and narrative text
VSCodium (https://vscodium.com/) — community-driven, freely-licensed binary distribution of VSCode
Vim (https://github.com/vim/vim), Emacs (https://www.gnu.org/software/emacs/), Geany (https://www.geany.org/),
Gedit (https://wiki.gnome.org/Apps/Gedit) — text editors with support for syntax highlighting and more
REPL
One of the best features of Python is the interactive shell. Such shells are also referred to as REPL, which is an
abbreviation for Read Evaluate Print Loop. The Python REPL makes it easy for beginners to try out code snippets for
learning purposes. Beyond learning, it is also useful for developing a program in small steps, debugging a large program
by trying out a few lines of code at a time and so on. REPL will be used frequently in this book to show code snippets.
When you launch Python from the command line, or open IDLE, you get a shell that is ready for user input after the >>>
prompt.
$ python3.9
Python 3.9.5 (default, May 4 2021, 09:12:57)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
Try the below instructions. The first one displays a greeting using the print() function. Then, a user defined variable is
used to store a string value. To display the value, you can either use print() again or just type the variable name.
Expression results are immediately displayed in the shell. Name of a variable by itself is a valid expression. This behavior
is unique to the REPL and an expression by itself won't display anything when used inside a script.
>>> print('have a nice day')
have a nice day
# use exit() to close the shell, can also use Ctrl+D shortcut
>>> exit()
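The variable steps described above can be sketched as a short script. The variable name `username` and its value are assumptions chosen for illustration. At the `>>>` prompt, typing `username` alone would echo the value with quotes, whereas a script has to ask for that form explicitly.

```python
# a user defined variable that stores a string value
# (the name and value are illustrative assumptions)
username = 'learnbyexample'

# print() shows the text without quotes
print(username)

# typing just `username` in the REPL echoes this quoted form
# a script displays nothing for a bare expression, so use print() with repr()
print(repr(username))
```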
I'll stress again the importance of following along the code snippets by manually typing them on your computer.
Programming requires hands-on experience too, reading alone isn't enough. As an analogy, can you learn to drive a car
by just reading about it? Since one of the prerequisites is that you should already be familiar with programming basics, I'll
extend the analogy to learning to drive a different car model. Or, perhaps a different vehicle such as a truck or a bus
might be more appropriate here.
info
Depending on the command line shell you are using, you might have the
readline library that makes it easier to use the REPL. For example, up and down arrow keys to browse code
history, re-execute them (after editing if necessary), search history, autocomplete based on first few characters
and so on. See wikipedia: GNU readline (https://en.wikipedia.org/wiki/GNU_Readline) and wiki.archlinux: readline
(https://wiki.archlinux.org/index.php/readline) for more information.
If you get stuck with a problem, there are several ways to get it resolved. For example:
You can also ask for help on forums. Make sure to read the instructions provided by the respective forums before asking
a question. See also how to ask smart-questions (http://catb.org/%7Eesr/faqs/smart-questions.html#before). Here's some
forums you can use:
int
Integer numbers are made up of digits 0 to 9 and can be prefixed with unary operators like + or -. There is no restriction
to the size of numbers that can be used, only limited by the memory available on your system. Here's some examples:
>>> 42
42
>>> 0
0
>>> +100
100
>>> -5
-5
For readability purposes, you can use underscores in between the digits.
>>> 1_000_000_000
1000000000
float
Here's some examples for floating-point numbers.
>>> 3.14
3.14
>>> -1.12
-1.12
Python also supports the exponential notation. See wikipedia: E scientific notation
(https://en.wikipedia.org/wiki/Scientific_notation#E_notation) for details about this form of expressing numbers.
>>> 543.15e20
5.4315e+22
>>> 1.5e-5
1.5e-05
Unlike integers, floating-point numbers have a limited precision. While displaying very small or very large floating-point
numbers, Python will automatically convert them to the exponential notation.
>>> 0.0000000001234567890123456789
1.2345678901234568e-10
>>> 31415926535897935809629384623048923.649234324234
3.1415926535897936e+34
info
You might also get seemingly strange results as shown below. See
docs.python: Floating Point Arithmetic Issues and Limitations (https://docs.python.org/3/tutorial/floatingpoint.html)
and stackoverflow: Is floating point math broken? (https://stackoverflow.com/q/588004/4082052) for details and
workarounds.
>>> 3.14 + 2
5.140000000000001
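As a rough sketch of the kind of workaround those links describe, comparisons can be made within a tolerance instead of testing for exact equality. The values used here are only illustrative.

```python
import math

total = 0.1 + 0.2

# direct comparison fails because 0.1 and 0.2 have no exact binary representation
print(total == 0.3)               # False

# compare within a tolerance instead
print(math.isclose(total, 0.3))   # True

# or round to the precision you care about before comparing
print(round(total, 2) == 0.3)     # True
```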
Arithmetic operators
All arithmetic operators you'd typically expect are available. If any operand is a floating-point number, result will be of
float data type. Use + for addition, - for subtraction, * for multiplication and ** for exponentiation. As mentioned
before, REPL is quite useful for learning purposes. It makes for a good calculator for number crunching as well. You can
also use _ to refer to the result of the previous expression (this is applicable only in the REPL, not in Python scripts).
>>> 25 + 17
42
>>> 10 - 8
2
>>> 25 * 3.3
82.5
>>> 32 ** 42
1645504557321206042154969182557350504982735865633579863348609024
>>> 5 + 2
7
>>> _ * 3
21
There are two operators for division. Use / if you want a floating-point result. Using // between two integers will give
an integer result rounded down towards negative infinity (floor division), so 5 // 3 is 1 whereas -5 // 3 is -2.
>>> 4.5 / 1.5
3.0
>>> 5 / 3
1.6666666666666667
>>> 5 // 3
1
Use modulo operator % to get the remainder. Sign of the result is same as the sign of the second operand.
>>> 5 % 3
2
>>> -5 % 3
1
>>> 5 % -3
-1
>>> 6.5 % -3
-2.5
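The two operators are tied together: for a non-zero divisor, the floor division result multiplied by the divisor plus the remainder gives back the original number. Here's a small sketch using the `divmod()` built-in function, which returns both results at once. The variable names are illustrative.

```python
# // rounds towards negative infinity
print(5 // 3)     # 1
print(-5 // 3)    # -2

# for any non-zero b:  a == (a // b) * b + (a % b)
a, b = -5, 3
quotient, remainder = divmod(a, b)
print(quotient, remainder)               # -2 1
print(a == quotient * b + remainder)     # True
```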
Operator precedence
Arithmetic operator precedence follows the familiar PEMDAS or BODMAS abbreviations. Precedence, higher to lower is
listed below:
Expression is evaluated left-to-right when operators have the same precedence, except for exponentiation, which groups
right-to-left (2 ** 3 ** 2 is 2 ** 9). Unary operator precedence is between exponentiation and multiplication/division
operators. See docs.python: Operator precedence
(https://docs.python.org/3/reference/expressions.html#operator-precedence) for complete details.
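A few quick checks can make these rules concrete. The expressions below are illustrative and not taken from the book's examples.

```python
# multiplication binds tighter than addition, parentheses override that
print(2 + 3 * 4)        # 14
print((2 + 3) * 4)      # 20

# unary minus has lower precedence than **
print(-2 ** 2)          # -4
print((-2) ** 2)        # 4

# same precedence: evaluated left-to-right
print(10 - 4 - 3)       # 3
print(100 / 5 * 2)      # 40.0

# ** is the exception, it groups right-to-left
print(2 ** 3 ** 2)      # 512
```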
Integer formats
The integer examples so far have been coded using base 10, i.e. decimal format. Python has provision for representing
binary, octal and hexadecimal formats as well. To distinguish between these different formats, a prefix is used:
0b or 0B for binary
0o or 0O for octal
0x or 0X for hexadecimal
All four formats fall under the int data type. Python displays them in decimal format by default. Underscores can be
used for readability for any of these formats.
>>> 0b1000_1111
143
>>> 0o10
8
>>> 0x10
16
>>> 5 + 0xa
15
>>> 00000
0
>>> 09
File "<stdin>", line 1
09
^
SyntaxError: leading zeros in decimal integer literals are not permitted;
use an 0o prefix for octal integers
If code execution hits a snag, you'll get an error message along with the code snippet that the interpreter thinks caused
the issue. In Python parlance, an exception has occurred. The exception has a name (SyntaxError in the above
example) followed by the error message. See Exception handling chapter for more details.
Other numeric types
Python's standard data types also include the complex type (imaginary part is suffixed with the character j). Others like
decimal and fractions are provided as modules.
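Here's a brief sketch of these three types in action. The values are arbitrary and only meant to give a feel for each type.

```python
from decimal import Decimal
from fractions import Fraction

# complex numbers are built-in, imaginary part has the j suffix
z = 3 + 4j
print(z.real, z.imag)     # 3.0 4.0
print(abs(z))             # 5.0

# Decimal avoids binary floating-point surprises when created from strings
print(Decimal('0.1') + Decimal('0.2'))    # 0.3

# Fraction keeps results as exact ratios
print(Fraction(1, 3) + Fraction(1, 6))    # 1/2
```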
warning
Some of the numeric types can have alphabets like e, b, j, etc in their
values. As Python is a dynamically typed language, you cannot use variable names beginning with a number.
Otherwise, it would be impossible to evaluate an expression like result = input_value + 0x12 - 2j.
info
There are many third-party libraries that are useful for number crunching in
mathematical context, engineering applications, etc. See my list py_resources: Scientific computing
(https://learnbyexample.github.io/py_resources/domain.html#scientific-computing) for curated resources.
REPL will again be used predominantly in this chapter. One important detail to note is that the result of an expression is
displayed using the syntax of that particular data type. Use print() function when you want to see how a string literal
looks visually.
>>> 'hello'
'hello'
>>> print("world")
world
If the string literal itself contains single or double quote characters, the other form can be used.
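For instance (the sentences are made up for illustration):

```python
# single quote inside a double quoted string
s1 = "I'd like some tea"
print(s1)       # I'd like some tea

# double quotes inside a single quoted string
s2 = 'She said, "maybe"'
print(s2)       # She said, "maybe"
```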
What to do if a string literal has both single and double quotes? You can use the \ character to escape the quote
characters. In the below examples, \' and \" will evaluate to ' and " characters respectively, instead of prematurely
terminating the string definition. Use \\ if a literal backslash character is needed.
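A small sketch of escaping quotes and the backslash itself. The strings and variable names are illustrative.

```python
# both kinds of quotes in a single quoted string literal
mixed = 'It\'s a "tricky" case'
print(mixed)              # It's a "tricky" case

# same value, written with double quotes as the delimiter
same = "It's a \"tricky\" case"
print(mixed == same)      # True

# \\ gives a single literal backslash
with_backslash = 'a\\b'
print(with_backslash)         # a\b
print(len(with_backslash))    # 3
```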
In general, the backslash character is used to construct escape sequences. For example, \n represents the newline
character, \t is for tab character and so on. You can use \ooo and \xhh to represent 256 characters in octal and
hexadecimal formats respectively. For Unicode characters, you can use \N{name}, \uxxxx and \Uxxxxxxxx formats.
See docs.python: String and Bytes literals (https://docs.python.org/3/reference/lexical_analysis.html#strings) for full list of
escape sequences and details about undefined ones.
>>> greeting = 'hi there.\nhow are you?'
>>> greeting
'hi there.\nhow are you?'
>>> print(greeting)
hi there.
how are you?
>>> print('item\tquantity')
item    quantity
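The octal, hexadecimal and Unicode escape forms mentioned above can be checked with a few comparisons. The specific characters are chosen only for illustration.

```python
# octal and hexadecimal escapes for the character 'A' (code point 65)
print('\101' == 'A')      # True
print('\x41' == 'A')      # True

# Unicode escapes: by code point and by name
print('\u00e9' == '\N{LATIN SMALL LETTER E WITH ACUTE}')    # True
print('\U0001f600' == chr(0x1f600))                         # True
```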
# triple_quotes.py
print('''hi there.
how are you?''')
student = '''\
Name:\tlearnbyexample
Age:\t25
Dept:\tCSE'''
print(student)
$ python3.9 triple_quotes.py
hi there.
how are you?
Name:   learnbyexample
Age:    25
Dept:   CSE
Raw strings
For certain cases, escape sequences would be too much of a hindrance to work around. For example, filepaths in
Windows use \ as the delimiter. Another would be regular expressions, where the backslash character has yet another
special meaning. Python provides a raw string syntax, where all the characters are treated literally. This form, also known
as r-strings for short, requires an r or R character prefix to quoted strings. Forms like triple quoted strings and raw strings
are for user convenience. Internally, there's just a single representation for string literals.
>>> print(r'item\tquantity')
item\tquantity
>>> r'item\tquantity'
'item\\tquantity'
>>> r'C:\Documents\blog\monsoon_trip.txt'
'C:\\Documents\\blog\\monsoon_trip.txt'
Here's an example with the re built-in module. The import statement used below will be discussed in the Importing and
creating modules chapter. See my book Python re(gex)? (https://github.com/learnbyexample/py_regular_expressions) for
details on regular expressions.
>>> import re
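As a sketch of why raw strings matter for regular expressions (this pattern and input are illustrative, not the book's own example), compare how `\b` behaves with and without the `r` prefix:

```python
import re

text = 'cat concat cat'

# raw string: \b reaches the re module as a word boundary
print(re.findall(r'\bcat\b', text))     # ['cat', 'cat']

# normal string: \b is converted to the backspace character first,
# so the pattern no longer matches anything in the text
print(re.findall('\bcat\b', text))      # []
```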
String operators
Python provides a wide variety of features to work with strings. This chapter introduces some of them, like the + and *
operators in this section. Here's some examples to concatenate strings using the + operator. The operands can be any
expression that results in a string value and you can use any of the different ways to specify a string literal.
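A minimal sketch of concatenation with the + operator; the variable names and values are assumptions for illustration.

```python
str1 = 'hello'
str2 = "world"

# any expression that results in a string can be an operand
msg = str1 + ' ' + str2
print(msg)                        # hello world

# different kinds of string literals can be mixed
print('path: ' + r'C:\temp')      # path: C:\temp
```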
Another way to concatenate is to simply place any kind of string literal next to each other. You can use zero or more
whitespaces between the two literals. But you cannot mix an expression and a string literal. If the strings are inside
parentheses, you can also use newline to separate the literals and optionally use comments.
# note that ... is REPL's indication for multiline statements, blocks, etc
>>> print('hi '
... 'there')
hi there
You can repeat a string by using the * operator between a string and an integer.
>>> style_char = '-'
>>> print(style_char * 50)
--------------------------------------------------
>>> word = 'buffalo '
>>> print(8 * word)
buffalo buffalo buffalo buffalo buffalo buffalo buffalo buffalo
String formatting
As per PEP 20: The Zen of Python (https://www.python.org/dev/peps/pep-0020/),
There should be one-- and preferably only one --obvious way to do it.
However, there are several approaches for formatting strings. This section will focus mostly on formatted string literals
(f-strings for short), and then show alternate approaches.
f-strings allow you to embed an expression within {} characters as part of the string literal. Like raw strings, you need to
use a prefix, which is f or F in this case. Python will substitute the embeds with the result of the expression, converting it
to string if necessary (such as numeric results). See docs.python: Format String Syntax
(https://docs.python.org/3/library/string.html#formatstrings) and docs.python: Formatted string literals
(https://docs.python.org/3/reference/lexical_analysis.html#formatted-string-literals) for documentation and more
examples.
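A basic sketch of embedding expressions, using made-up variable names. Note that literal `{` and `}` characters have to be doubled inside an f-string.

```python
fruit = 'apple'
count = 5

# expressions inside {} are evaluated and converted to strings
print(f'{count} {fruit}s')                   # 5 apples
print(f'{count} * 2 = {count * 2}')          # 5 * 2 = 10

# F prefix works too, and function calls are allowed as well
print(F'{fruit} has {len(fruit)} characters')    # apple has 5 characters
```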
>>> f'{{hello'
'{hello'
>>> f'world}}'
'world}'
Since Python 3.8, you can add = after an expression to get both the expression and the result in the output.
>>> num1 = 42
>>> num2 = 7
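With those two variables, the `=` form might be used as sketched here (requires Python 3.8 or later). Whitespace around the `=` is preserved in the output.

```python
num1 = 42
num2 = 7

# the expression text, the = and the result all appear in the output
print(f'{num1 + num2 = }')      # num1 + num2 = 49
print(f'{num1 * num2=}')        # num1 * num2=294
```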
Optionally, you can provide a format specifier along with the expression after a : character. These specifiers are similar
to the ones provided by printf() function in C language, printf built-in command in Bash and so on. Here's some
examples for numeric formatting.
>>> appx_pi = 22 / 7
# rounding is applied
>>> f'{appx_pi:.3f}'
'3.143'
# exponential notation
>>> f'{32 ** appx_pi:.2e}'
'5.38e+04'
# fill with =, align using > (right), < (left) or ^ (center)
>>> fruit = 'apple'
>>> f'{fruit:=>10}'
'=====apple'
>>> f'{fruit:=<10}'
'apple====='
>>> f'{fruit:=^10}'
'==apple==='
>>> num = 42
>>> f'{num:b}'
'101010'
>>> f'{num:o}'
'52'
>>> f'{num:x}'
'2a'
>>> f'{num:#x}'
'0x2a'
str.format() method, format() function and % operator are alternate approaches for string formatting.
>>> num1 = 22
>>> num2 = 7
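The three alternatives could be used with these variables roughly as follows. This is a sketch; the particular format strings are illustrative.

```python
num1 = 22
num2 = 7

# str.format() method, placeholders are filled from the arguments
print('{} / {} = {:.2f}'.format(num1, num2, num1 / num2))    # 22 / 7 = 3.14

# format() built-in function applies a format specifier to a single value
print(format(num1 / num2, '.3f'))                            # 3.143

# % operator, similar to printf-style formatting
print('%d / %d = %.2f' % (num1, num2, num1 / num2))          # 22 / 7 = 3.14
```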
info
In case you don't know what a method is, see stackoverflow: What's the
difference between a method and a function? (https://stackoverflow.com/q/155609/4082052)
User input
The input() (https://docs.python.org/3/library/functions.html#input) built-in function can be used to get data from the user.
It also allows an optional string to make it an interactive process. It always returns a string data type, which you can
convert to another type (explained in the next section).
# Python will wait until you type your data and press the Enter key
# the blinking cursor is represented by a rectangular block shown below
>>> name = input('what is your name? ')
what is your name? █
# note that newline isn't part of the value saved in the 'name' variable
>>> print(f'pleased to meet you {name}.')
pleased to meet you learnbyexample.
Type conversion
The type() (https://docs.python.org/3/library/functions.html#type) built-in function can be used to know what data type you
are dealing with. You can pass any expression as an argument.
>>> num = 42
>>> type(num)
<class 'int'>
>>> type(22 / 7)
<class 'float'>
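Built-in functions such as int(), float() and str() convert a value from one type to another. Here's a short sketch; the variable name `num_str` is hypothetical and stands in for a string received from input().

```python
# input() always gives a string, so convert before doing arithmetic
num_str = '42'
print(int(num_str) + 5)       # 47
print(float(num_str) / 4)     # 10.5

# converting a float to int drops the fractional part
print(int(3.99))              # 3

# numbers have to be converted before concatenating with a string
print(str(42) + '!')          # 42!
```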
Exercises
Read about Bytes literals from docs.python: String and Bytes literals
(https://docs.python.org/3/reference/lexical_analysis.html#strings). See also stackoverflow: What is the difference
between a string and a byte string? (https://stackoverflow.com/q/6224052/4082052)
If you check out docs.python: int() function (https://docs.python.org/3/library/functions.html#int), you'll see that the
int() function accepts an optional argument. As an example, write a program that asks the user for a
hexadecimal number as input. Then, use int() function to convert the input string to an integer (you'll need the
second argument for this). Add 5 and display the result in hexadecimal format.
Write a program to accept two input values. First can be either a number or a string value. Second is an integer
value, which should be used to display the first value in centered alignment. You can use any character you prefer
to surround the value, other than the default space character.
What happens if you use a combination of r, f and other such valid prefix characters while declaring a string
literal? What happens if you use raw strings syntax and provide only a single \ character? Does the
documentation describe these cases?
Try out at least two format specifiers not discussed in this chapter.
Given a = 5, get '{5}' as the output using f-strings.
Defining functions
This chapter will discuss how to define your own functions, pass arguments to them and get back results. You'll also learn
more about the print() built-in function.
def
Use the def keyword to define a function. The function name is specified after the keyword, followed by arguments
inside parentheses and finally a : character to end the definition. It is a common mistake for beginners to miss the :
character. Arguments are optional, as shown in the below program.
# no_args.py
def greeting():
    print('-----------------------------')
    print('         Hello World         ')
    print('-----------------------------')

greeting()
The above code defines a function named greeting and contains three statements as part of the function. Unlike many
other programming languages, whitespaces are significant in Python. Instead of a pair of curly braces, indentation is
used to distinguish the body of the function and statements outside of that function. Typically, 4 spaces are used, as shown
above. The function call greeting() has the same indentation level as the function definition, so it is not part of the
function. For readability, an empty line is used to separate the function definition and subsequent statements.
$ python3.9 no_args.py
-----------------------------
         Hello World
-----------------------------
Accepting arguments
Functions can accept one or more arguments, specified as comma separated variable names.
# with_args.py
def greeting(ip):
    op_length = 10 + len(ip)
    styled_line = '-' * op_length
    print(styled_line)
    print(f'{ip:^{op_length}}')
    print(styled_line)

greeting('hi')
weather = 'Today would be a nice, sunny day'
greeting(weather)
In this script, the function from the previous example has been modified to accept an input string as the sole argument.
The len() (https://docs.python.org/3/library/functions.html#len) built-in function is used here to get the length of a string
value. The code also showcases the usefulness of variables, string operators and string formatting.
$ python3.9 with_args.py
------------
     hi
------------
------------------------------------------
     Today would be a nice, sunny day
------------------------------------------
As an exercise, modify the above program as suggested below and observe the results you get.
add print statements for ip, op_length and styled_line variables at the end of the program (after the
function calls)
pass a numeric value to the greeting() function
don't pass any argument while calling the greeting() function
info
The argument variables, and those that are defined within the body, are
local to the function and would result in an exception if used outside the function. See also docs.python: Scopes
and Namespaces (https://docs.python.org/3/tutorial/classes.html#scopes-and-namespaces-example) and
docs.python: global statement (https://docs.python.org/3/reference/simple_stmts.html#the-global-statement).
greeting('hi')
greeting('bye', spacing=5)
greeting('hello', style='=')
greeting('good day', ':', 2)
There are various ways in which you can call functions with default values. If you specify the argument name, they can
be passed in any order. But, if you pass values positionally, the order has to be same as the declaration.
$ python3.9 default_args.py
------------
     hi
------------
--------
  bye
--------
===============
     hello
===============
::::::::::
 good day
::::::::::
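The four calls and the output above are consistent with a function whose `style` argument defaults to '-' and whose `spacing` argument defaults to 10. The following is a sketch reconstructed from that behavior, not the original listing, so treat the exact signature and body as assumptions.

```python
# default_args.py (reconstructed sketch, assumed from the calls and output)
def greeting(ip, style='-', spacing=10):
    op_length = spacing + len(ip)
    styled_line = style * op_length
    print(styled_line)
    print(f'{ip:^{op_length}}')
    print(styled_line)

greeting('hi')                  # both defaults used
greeting('bye', spacing=5)      # only spacing overridden, by name
greeting('hello', style='=')    # only style overridden, by name
greeting('good day', ':', 2)    # both passed positionally
```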
As another exercise, what do you think will happen if you use greeting(spacing=5, ip='Oh!') to call the function
shown above?
Return value
The default return value of a function is None, which is typically used to indicate the absence of a meaningful value. The
print() function, for example, has a None return value. Functions like int(), len() and type() have specific return
values, as seen in prior examples.
>>> print('hi')
hi
>>> value = print('hi')
hi
>>> value
>>> print(value)
None
>>> type(value)
<class 'NoneType'>
Use the return statement to explicitly give back a value when the function is called. You can use this keyword by itself
as well (default value is None).
>>> def num_square(n):
...     return n * n
...
>>> num_square(5)
25
>>> num_square(3.14)
9.8596
>>> op = num_square(-42)
>>> type(op)
<class 'int'>
The help(print) documentation shows the signature print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False).
As you can see, there are four default valued arguments. But, what does value, ..., mean? It indicates that the
print() function can accept an arbitrary number of arguments. Here's some examples:
>>> print('hi')
hi
>>> print('hi', 5)
hi 5
If you observe closely, you'll notice that a space character is inserted between the arguments. That separator can be
changed by using the sep argument.
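For example, here's a quick sketch with arbitrary values:

```python
# default separator is a single space
print('hi', 5)                              # hi 5

# empty string means no separator at all
print('hi', 5, sep='')                      # hi5

# any string can be used as the separator
print('apple', 'fig', 'kiwi', sep=' : ')    # apple : fig : kiwi
```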
Similarly, you can change the string that gets appended to something else.
>>> print('hi', end='----\n')
hi----
>>> print('hi', 'bye', sep='-', end='\n======\n')
hi-bye
======
info
The file argument will be discussed later. Writing your own function to
accept arbitrary number of arguments will also be discussed later.
Docstrings
Triple quoted strings are also used for multiline comments and to document various parts of a Python script. The latter is
achieved by adding help content as string literals (but without being assigned to a variable) at the start of a function,
class, etc. Such literals are known as documentation strings, or docstrings for short. Idiomatically, triple quoted strings
are used for docstrings. The help() function reads these docstrings to display the documentation. There are also
numerous third-party tools that make use of docstrings.
Here's an example:
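A minimal sketch of a function with a docstring follows. The function and the wording of its documentation are assumptions for illustration rather than the book's original listing.

```python
def num_square(n):
    """
    Return the square of the given number.
    """
    return n * n

# help(num_square) would display the docstring along with the signature
# the same text is also available via the __doc__ attribute
print(num_square.__doc__.strip())     # Return the square of the given number.
print(num_square(5))                  # 25
```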
Control structures
This chapter will discuss various operators used in conditional expressions, followed by control structures.
Comparison operators
These operators yield True or False boolean values as a result of comparison between two values.
>>> 0 != '0'
True
>>> 0 == int('0')
True
>>> 'hi' == 'Hi'
False
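The ordering operators work along the same lines. Here's a brief sketch with illustrative values; strings are compared character by character based on their code points.

```python
print(3 < 5)                  # True
print(3 >= 5)                 # False

# int and float values are compared by their numeric value
print(4.0 == 4)               # True

# string comparison is based on code points
print('apple' < 'banana')     # True
# uppercase letters have smaller code points than lowercase ones
print('Zebra' < 'apple')      # True
```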
Numerical value zero, empty string and None are Falsy. Non-zero numbers and non-empty strings are Truthy. See
docs.python: Truth Value Testing (https://docs.python.org/3/library/stdtypes.html#truth-value-testing) for a complete list.
>>> type(True)
<class 'bool'>
>>> type(False)
<class 'bool'>
>>> bool(4)
True
>>> bool(0)
False
>>> bool(-1)
True
>>> bool('')
False
>>> bool('hi')
True
>>> bool(None)
False
Boolean operators
Use and and or boolean operators to combine comparisons. The not operator is useful to invert a condition.
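A short sketch of these three operators, with an arbitrary value for `num`:

```python
num = 5

# both comparisons have to be True
print(num > 3 and num <= 5)          # True

# at least one comparison has to be True
print(num < 3 or num > 10)           # False

# not has lower precedence than comparison operators
print(not num > 3)                   # False
print(not (num < 3 or num > 10))     # True
```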
The and and or operators are also known as short-circuit operators. The and operator evaluates the second expression only if
the first one evaluates to True, whereas or evaluates the second expression only if the first one evaluates to False. Also, these
operators return the result of the expressions used, which can be a non-boolean value. The not operator always returns a boolean value.
>>> num = 5
# here, num ** 2 will NOT be evaluated
>>> num < 3 and num ** 2
False
# here, num ** 2 will be evaluated as the first expression is True
>>> num < 10 and num ** 2
25
# not operator always gives a boolean value
>>> not (num < 10 and num ** 2)
False
>>> 0 or 3
3
>>> 1 or 3
1
Comparison chaining
You can chain comparison operators, which is similar to mathematical notations. Apart from terser conditional expression,
this also has the advantage of having to evaluate the middle expression only once.
>>> num = 5
# comparison chaining
>>> 3 < num <= 5
True
>>> 4 < num > 3
True
>>> 'bat' < 'cat' < 'cater'
True
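To see that the middle expression is evaluated only once, here's a small sketch. The middle() helper and the calls list are hypothetical names used only to count evaluations.

```python
calls = []


def middle():
    """Hypothetical helper that records each time it is called."""
    calls.append(1)
    return 5


# chained form: middle() is evaluated only once
chained = 3 < middle() <= 5
chained_calls = len(calls)

# expanded form: middle() is evaluated twice
calls.clear()
expanded = 3 < middle() and middle() <= 5
expanded_calls = len(calls)

print(chained, chained_calls)
print(expanded, expanded_calls)
```

Both forms give the same result here, but the difference matters if the middle expression is expensive or has side effects.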
Membership operator
The in comparison operator checks if a given value is part of a collection of values. Here's an example with range()
function:
>>> num = 5
# range() will be discussed in detail later in this chapter
# this checks if num is present among the integers 3 or 4 or 5
>>> num in range(3, 6)
True
You can build your own collection of values using various data types like list, set, tuple etc. These data types will be
discussed in detail in later chapters.
>>> num = 21
>>> num == 10 or num == 21 or num == 33
True
# RHS value here is a tuple data type
>>> num in (10, 21, 33)
True
if-elif-else
Similar to function definition, control structures require indenting their body of code. And, there's a : character after you
specify the conditional expression. You should be already familiar with if and else keywords from other programming
languages. Alternate conditional branches are specified using the elif keyword. You can nest these structures and
each branch can have one or more statements.
Here's an example of if-else structure within a user defined function. Note the use of indentation to separate different
structures. Examples with elif keyword will be seen later.
# odd_even.py
def isodd(n):
    if n % 2:
        return True
    else:
        return False

print(f'{isodd(42) = }')
print(f'{isodd(-21) = }')
print(f'{isodd(123) = }')
$ python3.9 odd_even.py
isodd(42) = False
isodd(-21) = True
isodd(123) = True
As an exercise, reduce the isodd() function body to a single statement instead of four. This is possible with features
already discussed in this chapter. The ternary operator discussed in the next section would be overkill.
Ternary operator
Python doesn't support the traditional ?: ternary operator syntax. Instead, it uses if-else keywords in the same line as
illustrated below.
def absolute(num):
    if num >= 0:
        return num
    else:
        return -num
The above if-else structure can be rewritten using ternary operator as shown below:
def absolute(num):
    return num if num >= 0 else -num
Or, just use the abs() (https://docs.python.org/3/library/functions.html#abs) built-in function, which has support for
complex numbers, fractions, etc. Unlike the above program, abs() will also handle -0.0 correctly.
for loop
Counter based loop can be constructed using the range() (https://docs.python.org/3/library/functions.html#func-range)
built-in function and the in operator. The range() function can be called in the following ways:
range(stop)
range(start, stop)
range(start, stop, step)
Both ascending and descending order arithmetic progression can be constructed using these variations. When skipped,
default values are start=0 and step=1. For understanding purposes, a C like code snippet is shown below:
# ascending order
for(i = start; i < stop; i += step)
# descending order
for(i = start; i > stop; i += step)
>>> num = 9
>>> for i in range(1, 5):
...     print(f'{num} * {i} = {num * i}')
...
9 * 1 = 9
9 * 2 = 18
9 * 3 = 27
9 * 4 = 36
The range, list, tuple, str data types (and some more) fall under sequence types. There are multiple operations
that are common to these types (see docs.python: Common Sequence Operations
(https://docs.python.org/3/library/stdtypes.html#common-sequence-operations) for details). For example, you could
iterate over these types using the for loop. The start:stop:step slicing operation is another commonality among
these types. You can test your understanding of slicing syntax by converting range to list or tuple type.
>>> list(range(5))
[0, 1, 2, 3, 4]
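Here are a few more range() variations as a sketch, including a descending progression and the slicing operation mentioned above:

```python
# descending order needs a negative step
print(list(range(10, 0, -3)))    # [10, 7, 4, 1]

# an empty sequence results if the step moves away from stop
print(list(range(1, 5, -1)))     # []

# slicing a range gives another range, convert to list to view the values
evens = range(10)[2:8:2]
print(list(evens))               # [2, 4, 6]
```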
As an exercise, create this arithmetic progression -2, 1, 4, 7, 10, 13 using the range() function. Also, see what
value you get for each iteration of for c in 'hello'.
while loop
Use while loop when you want to execute statements as long as the condition evaluates to True. Here's an example:
# countdown.py
count = int(input('Enter a positive integer: '))
while count > 0:
    print(count)
    count -= 1

print('Go!')
$ python3.9 countdown.py
Enter a positive integer: 3
3
2
1
Go!
When continue is used, further statements are skipped and the next iteration of the loop is started, if any. This is
frequently used in file processing when you need to skip certain lines like headers, comments, etc.
As an exercise, use appropriate range() logic so that the if statement is no longer needed.
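The file-processing use of continue mentioned above could look like the sketch below. The lines list is hypothetical data standing in for the contents of a file, and lists are discussed in detail in a later chapter.

```python
# hypothetical input: lines as they might be read from a config file
lines = ['# settings', 'width=80', '', '# colors', 'fg=white']

settings = []
for line in lines:
    # skip comment lines and empty lines
    if line.startswith('#') or not line:
        continue
    settings.append(line)

print(settings)
```

Only the two key=value lines reach the append() call, since continue moves on to the next iteration for everything else.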
Assignment expression
Quoting from docs.python: Assignment expressions (https://docs.python.org/3/reference/expressions.html#assignment-
expressions):
An assignment expression (sometimes also called a “named expression” or “walrus”) assigns an expression to an
identifier, while also returning the value of the expression.
The while loop snippet from the previous section can be re-written using the assignment expression as shown below:
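A minimal sketch of the idea is shown here. The values iterator is hypothetical data standing in for numbers that arrive one at a time, so this is not the book's own snippet. Assignment expressions need Python 3.8 or later.

```python
# hypothetical data standing in for values arriving one at a time
values = iter([7, 3, 12, 5])

# the value is assigned to num and tested in the same expression,
# so there's no need for a 'while True' loop with an inner 'break'
while (num := next(values)) % 4 != 0:
    print(f'{num} is not divisible by 4')

print(f'found {num}')
```

The parentheses around the assignment expression are needed here, otherwise num would be assigned the result of the whole comparison.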
Exercises
If you don't already know about FizzBuzz, read Using FizzBuzz to Find Developers who Grok Coding
(https://imranontech.com/2007/01/24/using-fizzbuzz-to-find-developers-who-grok-coding/) and implement it in
Python. See also Why Can't Programmers.. Program? (https://blog.codinghorror.com/why-cant-programmers-
program/)
Print all numbers from 1 to 1000 (inclusive) which reads the same in reversed form in both binary and decimal
format. For example, 33 in decimal is 100001 in binary and both of these are palindromic. You can either
implement your own logic or search online for palindrome testing in Python.
Write a function that returns the maximum nested depth of curly braces for a given string input. For example,
'{{a+2}*{{b+{c*d}}+e*d}}' should give 4. Unbalanced or wrongly ordered braces like '{a}*b{' and
'}a+b{' should return -1.
If you'd like more exercises to test your understanding, check out these excellent resources:
A module is a file containing Python definitions and statements. The file name is the module name with the suffix
.py appended.
Random numbers
Say you want to generate a random number from a given range for a guessing game. You could write your own random
number generator. Or, you could save development/testing time, and make use of the random
(https://docs.python.org/3/library/random.html) built-in module.
import random will load this built-in module for use in this script. You'll see more details about import later in this
chapter. The randrange() method follows the same start/stop/step logic as the range() function and returns a
random integer from the given range. The for loop is used here to get the user input for a maximum of 4 attempts. The
loop body doesn't need to know the current iteration count. In such cases, _ is used to indicate a throwaway variable
name.
As mentioned in the previous chapter, else clause is supported by loops too. It is used to execute code if the loop is
completed normally. If the user correctly guesses the random number, break will be executed, which is not a normal
loop completion. In that case, the else clause will not be executed.
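The logic described above can be sketched as follows. The play() helper, its messages and the iterator of guesses are assumptions made so that the sketch runs without user interaction; a real program would read each guess with int(input()). It is also assumed that both 0 and 10 are valid numbers.

```python
import random


def play(secret, guesses):
    """Hypothetical helper: check at most 4 guesses against secret."""
    # _ is a throwaway name, the iteration count isn't needed
    for _ in range(4):
        guess = next(guesses, None)
        if guess == secret:
            print('Wow, you guessed it right!')
            break
        print('Oops, wrong guess')
    else:
        # reached only if the loop wasn't terminated by break
        print(f'The number I thought of was {secret}')
        return False
    return True


# randrange(11) gives a random integer from 0 to 10 (inclusive)
secret = random.randrange(11)
play(secret, iter([0, 3, 7, 10]))
```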
$ python3.9 rand_number.py
I have thought of a number between 0 and 10
Can you guess it within 4 attempts?
$ python3.9 rand_number.py
I have thought of a number between 0 and 10.
Can you guess it within 4 attempts?
# num_funcs.py
def sqr(n):
    return n * n

def fact(n):
    total = 1
    for i in range(2, n+1):
        total *= i
    return total

num = 5
print(f'square of {num} is {sqr(num)}')
print(f'factorial of {num} is {fact(num)}')
The above program defines two functions, one variable and calls the print() function twice. After you've written this
program, open an interactive shell from the same directory. Then, load the module using import num_funcs where
num_funcs is the name of the program without the .py extension.
>>> import num_funcs
square of 5 is 25
factorial of 5 is 120
So what happened here? Not only did the sqr and fact functions get imported, the code outside of these functions got
executed as well. That isn't what you'd expect on loading a module. Next section will show how to prevent this behavior.
For now, continue the REPL session.
>>> num_funcs.sqr(12)
144
>>> num_funcs.fact(0)
1
>>> num_funcs.num
5
As an exercise,
add docstrings for the above program and check the output of help() function using num_funcs,
num_funcs.fact, etc as arguments.
check what would be the output of num_funcs.fact() for negative integers and floating-point numbers. Then
import the math built-in module and repeat the process with math.factorial(). Go through the Exception
handling chapter and modify the above program to gracefully handle negative integers and floating-point
numbers.
How does Python know where a module is located? Quoting from docs.python: The Module Search Path
(https://docs.python.org/3/tutorial/modules.html#the-module-search-path):
When a module named spam is imported, the interpreter first searches for a built-in module with that name. If not
found, it then searches for a file named spam.py in a list of directories given by the variable sys.path.
sys.path is initialized from these locations:
• The directory containing the input script (or the current directory when no file is specified).
• PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
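You can inspect the search path yourself, as in this short sketch (the output depends on your installation, so it isn't shown):

```python
import sys

# sys.path is a list of directory names, searched in order
for directory in sys.path:
    print(repr(directory))
```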
# num_funcs_module.py
def sqr(n):
    return n * n

def fact(n):
    total = 1
    for i in range(2, n+1):
        total *= i
    return total

if __name__ == '__main__':
    num = 5
    print(f'square of {num} is {sqr(num)}')
    print(f'factorial of {num} is {fact(num)}')
When you run the above program as a standalone application, the if condition will get evaluated to True.
$ python3.9 num_funcs_module.py
square of 5 is 25
factorial of 5 is 120
On importing, the above if condition will evaluate to False as num_funcs_module.py is no longer the main program.
In the below example, the REPL session is the main program.
>>> __name__
'__main__'
>>> import num_funcs_module
>>> num_funcs_module.sqr(12)
144
>>> num_funcs_module.fact(0)
1
info
In the above example, there are three statements that'll be executed if the
program is run as the main program. It is common to put such statements under a main() user defined function
and then call it inside the if block.
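A minimal sketch of that main() convention, reusing the sqr() function from the above program:

```python
def sqr(n):
    return n * n


def main():
    # statements meant only for standalone execution go here
    num = 5
    print(f'square of {num} is {sqr(num)}')


if __name__ == '__main__':
    main()
```

One benefit of this layout is that num becomes a local variable of main() instead of a module-level name that gets exposed on import.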
info
There are many such special variables and methods with double
underscores around their names. They are also called dunder variables and methods. See stackoverflow:
__name__ special variable (https://stackoverflow.com/q/419163/4082052) for a detailed discussion and strange
use cases.
Different ways of importing
When you use import <module> statement, you'll have to prefix the module name whenever you need to use its
features. If this becomes cumbersome, you can use alternate ways of importing.
First up, removing the prefix altogether as shown below. This will load all names from the module except those beginning
with a _ character. Use this feature only if needed, as one of the other alternatives might suit better.
You can also alias the name being imported using the as keyword. You can specify multiple aliases with comma
separation.
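Here's a sketch of these alternatives using the math and random built-in modules. The alias names are arbitrary choices for illustration. The prefix-free form for all names would be from math import * and it isn't used in this sketch.

```python
# import only the names you need, no prefix required afterwards
from math import sqrt, pi
print(sqrt(16))
print(pi)

# alias a module name
import random as rd
print(rd.randrange(3) in (0, 1, 2))

# alias specific names, separated by commas
from math import sin as sine, cos as cosine
print(sine(0), cosine(0))
```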
__pycache__ directory
If you notice the __pycache__ directory after you import your own module, don't panic. Quoting from docs.python:
Compiled Python files (https://docs.python.org/3/tutorial/modules.html#compiled-python-files):
To speed up loading modules, Python caches the compiled version of each module in the __pycache__ directory
under the name module.version.pyc, where the version encodes the format of the compiled file; it generally
contains the Python version number. For example, in CPython release 3.3 the compiled version of spam.py would
be cached as __pycache__/spam.cpython-33.pyc. This naming convention allows compiled modules from
different releases and different versions of Python to coexist.
You can use python3.9 -B if you do not wish the __pycache__ directory to be created.
Explore modules
docs.python: The Python Standard Library (https://docs.python.org/3/library/index.html)
github: awesome-python (https://github.com/vinta/awesome-python) — curated list of awesome Python
frameworks, libraries, software and resources
github: best-of-python (https://github.com/ml-tooling/best-of-python) — awesome Python open-source libraries &
tools, updated weekly
Turtle examples (https://michael0x2a.com/blog/turtle-examples) — a fun module to create graphical shapes,
inspired from Logo
The Python Package Index (PyPI) is a repository of software for the Python programming language. PyPI helps
you find and install software developed and shared by the Python community.
This chapter will discuss how to use pip for installing modules. You'll also see how to create virtual environments using
the venv module.
pip
Modern Python versions come with the pip installer program. The below code shows how to install the latest version of a
module (along with dependencies, if any) from PyPI. See pip user guide (https://pip.pypa.io/en/stable/user_guide/) for
documentation and other details like how to use pip on Windows. --user option limits the availability of the module to
the current user, see packaging.python: Installing to the User Site (https://packaging.python.org/tutorials/installing-
packages/#installing-to-the-user-site) for more details.
warning
Make sure that the package you want to install supports your Python
version.
Here's an example with regex module that makes use of possessive quantifiers, a feature not yet supported by the re
module.
warning
Unless you really know what you are doing, do NOT ever use pip as a
root/admin user. Problematic packages are an issue, see Malicious packages found to be typo-squatting
(https://snyk.io/blog/malicious-packages-found-to-be-typo-squatting-in-pypi/) and Hunting for Malicious Packages
on PyPI (https://news.ycombinator.com/item?id=25081937) for examples. See also security.stackexchange: PyPI
security measures (https://security.stackexchange.com/questions/79326/which-security-measures-does-pypi-and-
similar-third-party-software-repositories).
venv
Virtual environments allow you to work with specific Python and package versions without interfering with other projects.
Modern Python versions come with built-in module venv to easily create and manage virtual environments.
The flow I use is summarized below. If you are using an IDE, it will likely have options to create and manage virtual
environments.
# this is needed only once
# 'new_project' is the name of the folder, can be new or already existing
$ python3.9 -m venv new_project
$ cd new_project/
$ source bin/activate
(new_project) $ # pip install <modules>
(new_project) $ # do some scripting
(new_project) $ deactivate
$ # you're now out of the virtual environment
Here, new_project is the name of the folder containing the virtual environment. If the folder doesn't already exist, a
new folder will be created. source bin/activate will enable the virtual environment. Among other things, python or
python3 will point to the version you used to create the environment, which is python3.9 in the above example. The
prompt will change to the name of the folder, which is an easy way to know that you are inside a virtual environment
(unless your normal prompt is something similar). pip install will be restricted to this environment.
Once your work is done, use deactivate command to exit the virtual environment. If you delete the folder, your
installed modules in that environment will be lost as well. See also:
Exception handling
This chapter will discuss different types of errors and how to handle some of them within the program gracefully. You'll
also see how to raise exceptions programmatically.
Syntax errors
Quoting from docs.python: Errors and Exceptions (https://docs.python.org/3/tutorial/errors.html):
There are (at least) two distinguishable kinds of errors: syntax errors and exceptions
# syntax_error.py
print('hello')
def main():
    num = 5
    total = num + 09
    print(total)
main)
The above code is using an unsupported syntax for a numerical value. Note that the syntax check happens before any
code is executed, which is why you don't see the output for the print('hello') statement. Can you spot the rest of
the syntax issues in the above program?
$ python3.9 syntax_error.py
File "/home/learnbyexample/Python/programs/syntax_error.py", line 5
total = num + 09
^
SyntaxError: leading zeros in decimal integer literals are not permitted;
use an 0o prefix for octal integers
try-except
Exceptions happen when something goes wrong during the code execution. For example, passing a wrong data type to a
function, dividing a number by 0 and so on. Such errors are typically difficult or impossible to determine just by looking at
the code.
>>> int('42')
42
>>> int('42x')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '42x'
>>> 3.14 / 0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ZeroDivisionError: float division by zero
When an exception occurs, the program stops executing and displays the line that caused the error. You also get an error
type, such as ValueError and ZeroDivisionError seen in the above example, followed by a message. This may
differ for user defined error types.
You could implement alternatives to be followed for certain types of errors instead of premature end to the program
execution. For example, you could allow the user to correct their input data. In some cases, you want the program to end,
but display a user friendly message instead of developer friendly traceback.
Put the code likely to generate an exception inside try block and provide alternate path(s) inside one or more except
blocks. Here's an example to get a positive integer number from the user, and continue doing so if the input was invalid.
# try_except.py
from math import factorial
while True:
    try:
        num = int(input('Enter a positive integer: '))
        print(f'{num}! = {factorial(num)}')
        break
    except ValueError:
        print('Not a positive integer, try again')
It so happens that both int() and factorial() generate ValueError in the above example. If you wish to take the
same alternate path for multiple errors, you can pass a tuple to except instead of a single error type. Here's a sample
run:
$ python3.9 try_except.py
Enter a positive integer: 3.14
Not a positive integer, try again
Enter a positive integer: hi
Not a positive integer, try again
Enter a positive integer: -2
Not a positive integer, try again
Enter a positive integer: 5
5! = 120
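As a sketch of passing a tuple to except, the hypothetical safe_divide() helper below takes the same alternate path for two different error types:

```python
def safe_divide(a, b):
    """Hypothetical helper, returns None for invalid input."""
    try:
        return int(a) / int(b)
    # same alternate path for both error types
    except (ValueError, ZeroDivisionError):
        print('Invalid input, expected integers with a non-zero divisor')
        return None


print(safe_divide('10', '4'))
print(safe_divide('10', 'x'))
print(safe_divide('10', '0'))
```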
You can also capture the error message using the as keyword (which you have seen previously with import statement,
and will come up again in later chapters). Here's an example:
>>> try:
...     num = 5 / 0
... except ZeroDivisionError as e:
...     print(f'oops something went wrong! the error msg is:\n"{e}"')
...
oops something went wrong! the error msg is:
"division by zero"
else
The else clause behaves similarly to the else clause seen with loops. If there's no exception raised in the try block,
then the code in the else block will be executed. This block should be defined after the except block(s). As per the
documentation (https://docs.python.org/3/tutorial/errors.html#handling-exceptions):
The use of the else clause is better than adding additional code to the try clause because it avoids accidentally
catching an exception that wasn’t raised by the code being protected by the try ... except statement.
# try_except_else.py
while True:
    try:
        num = int(input('Enter an integer number: '))
    except ValueError:
        print('Not an integer, try again')
    else:
        print(f'Square of {num} is {num ** 2}')
        break
$ python3.9 try_except_else.py
Enter an integer number: hi
Not an integer, try again
Enter an integer number: 3.14
Not an integer, try again
Enter an integer number: 42x
Not an integer, try again
Enter an integer number: -2
Square of -2 is 4
raise
You can also manually raise exceptions if needed. The raise statement accepts an optional error type, which can be either a built-in or a
user defined one (see docs.python: User-defined Exceptions (https://docs.python.org/3/tutorial/errors.html#user-defined-exceptions)).
And you can optionally specify an error message. raise by itself re-raises the currently active exception, if
any (RuntimeError otherwise).
>>> def sum2nums(n1, n2):
...     types_allowed = (int, float)
...     if type(n1) not in types_allowed or type(n2) not in types_allowed:
...         raise TypeError('Argument should be an integer or a float value')
...     return n1 + n2
...
>>> sum2nums(3.14, -2)
1.1400000000000001
>>> sum2nums(3.14, 'a')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 4, in sum2nums
TypeError: Argument should be an integer or a float value
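The following sketch combines a user defined error type with a bare raise. NegativeNumberError and sqrt_checked() are hypothetical names chosen for illustration.

```python
class NegativeNumberError(ValueError):
    """Hypothetical user defined error type."""


def sqrt_checked(n):
    if n < 0:
        raise NegativeNumberError(f'negative value not allowed: {n}')
    return n ** 0.5


try:
    try:
        sqrt_checked(-4)
    except NegativeNumberError as e:
        print(f'logging: {e}')
        # raise by itself re-raises the currently active exception
        raise
except ValueError as e:
    # also matches, since NegativeNumberError is a subclass of ValueError
    caught = e
    print(f'outer handler got {type(e).__name__}')
```

Deriving from a built-in type like ValueError means existing handlers written for that type continue to work with the new error.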
finally
You can add code in finally block that should always be the last thing done by the try statement, irrespective of
whether an exception has occurred. This should be declared after except and the optional else blocks.
# try_except_finally.py
try:
    num = int(input('Enter a positive integer: '))
    if num < 0:
        raise ValueError
except ValueError:
    print('Not a positive integer, run the program again')
else:
    print(f'Square root of {num} is {num ** 0.5:.3f}')
finally:
    print('\nThanks for using the program, have a nice day')
Here are some sample runs when the user enters some value:
$ python3.9 try_except_finally.py
Enter a positive integer: -2
Not a positive integer, run the program again
Here's an example where something goes wrong, but not handled by the try statement. Note that finally block is still
executed.
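A minimal sketch of such a case, using a hypothetical reciprocal() helper instead of user input: the ZeroDivisionError isn't handled inside the function, yet the finally block runs before the error reaches the caller.

```python
def reciprocal(text):
    """Hypothetical helper to show finally with an unhandled error."""
    try:
        return 1 / int(text)
    except ValueError:
        print('Not an integer')
    finally:
        print('finally block executed')


# ZeroDivisionError isn't handled inside reciprocal(), so it propagates
# to the caller, but only after the finally block has run
try:
    reciprocal('0')
except ZeroDivisionError as e:
    print(f'caller got: {e}')
```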
Exercises
Identify the syntax errors in the following code snippets. Try to spot them manually.
# snippet 1:
def greeting()
    print('hello')
# snippet 2:
num = 5
if num = 4:
    print('what is going on?!')
# snippet 3:
greeting = “hi”
In case you didn't complete the exercises from Importing your own module section, you should be able to do it
now.
Write a function num(ip) that accepts a single argument and returns the corresponding integer or floating-point
number contained in the argument. Only int, float and str should be accepted as valid input data type.
Provide custom error message if the input cannot be converted to a valid number. Examples are shown below:
>>> num(0x1f)
31
>>> num(3.32)
3.32
>>> num(' \t 52 \t')
52
>>> num('3.982e5')
398200.0
Debugging
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as
possible, you are, by definition, not smart enough to debug it. — Brian W. Kernighan
There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors. — Leon
Bambrick
Debuggers don't remove bugs. They only show them in slow motion. — Unknown
General tips
Knowing how to debug your programs is crucial and should be ideally taught right from the start instead of a chapter at
the end of a beginner's learning resource. Think Python (https://greenteapress.com/wp/think-python-2e/) is an awesome
example for such a resource material.
Debugging is often a frustrating experience. Taking a break helps. It is common to find or solve issues in your dreams too
(I've had my share of these, especially during college and intense work days).
If you are stuck with a problem, reduce the code as much as possible so that you are left with minimal code necessary to
reproduce the issue. Talking about the problem to a friend/colleague/inanimate-objects/etc can help too — famously
termed as Rubber duck debugging (https://rubberduckdebugging.com/). I have often found the issue while formulating a
question to be asked on forums like stackoverflow/reddit because writing down your problem brings more clarity than
just having a vague idea in your mind.
Here's an interesting snippet (modified to keep it small) from a collection of interesting bug stories
(https://stackoverflow.com/q/169713/4082052).
A jpeg parser choked whenever the CEO came into the room, because he always had a shirt with a square
pattern on it, which triggered some special case of contrast and block boundary algorithms.
See also this curated list of absurd software bug stories (https://500mile.email/).
Common beginner mistakes
The previous chapter already covered syntax errors. This section will discuss more Python gotchas.
Python allows you to redefine built-in functions, modules, classes etc (see stackoverflow: metaprogramming
(https://stackoverflow.com/q/514644/4082052)). Unless that's your intention, do not use keywords
(https://docs.python.org/3/reference/lexical_analysis.html#keywords), built-in functions
(https://docs.python.org/3/library/functions.html) and modules as your variable name, function name, program filename,
etc. Here's an example:
# normal behavior
>>> str(2)
'2'
>>> len = 5
>>> len('hi')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'int' object is not callable
As an exercise, create an empty file named as math.py. In the same directory, create another program file that imports
the math module and then uses some feature, print(math.pi) for example. What happens if you execute this
program?
See also:
l prints code around the current statement the debugger is at, useful to visualize the progress of debug effort
ll prints entire code for the current function or frame
s execute current line, steps inside function calls
n execute current line, treats function call as a single execution step
c continue execution until the next breakpoint
p expression print value of an expression in the current context, usually used to see the current value of a
variable
h list of available commands
h c help on c command
q quit the debugger
Here's an example invocation of the debugger for the num_funcs.py program seen earlier in the Importing your own
module section. Only the n command is used below. The lines with > prefix tell you about the program file being
debugged, current line number, function name and return value when applicable. The lines with -> prefix show the code
present at the current line. (Pdb) is the prompt for this interactive session. You can also see the output of print()
function for the last n command in the illustration below.
Continuation of the above debugging session is shown below, this time with s command to step into the function. Use r
while you are still inside the function to skip until the function encounters a return statement. Examples for p and ll
commands are also shown below.
(Pdb) s
--Call--
> /home/learnbyexample/Python/programs/num_funcs.py(4)fact()
-> def fact(n):
(Pdb) ll
4 -> def fact(n):
5 total = 1
6 for i in range(2, n+1):
7 total *= i
8 return total
(Pdb) n
> /home/learnbyexample/Python/programs/num_funcs.py(5)fact()
-> total = 1
(Pdb) p n
5
(Pdb) r
--Return--
> /home/learnbyexample/Python/programs/num_funcs.py(8)fact()->120
-> return total
(Pdb) n
factorial of 5 is 120
--Return--
> /home/learnbyexample/Python/programs/num_funcs.py(12)<module>()->None
-> print(f'factorial of {num} is {fact(num)}')
If you continue beyond the last instruction, you can restart from the beginning if you wish. Use q to end the session.
(Pdb) n
--Return--
> <string>(1)<module>()->None
(Pdb) n
The program finished and will be restarted
> /home/learnbyexample/Python/programs/num_funcs.py(1)<module>()
-> def sqr(n):
(Pdb) q
See also:
IDLE debugging
Sites like Pythontutor (http://www.pythontutor.com/visualize.html#mode=edit) allow you to visually debug a program —
you can execute a program step by step and see the current value of variables. Similar feature is typically provided by
IDEs as well. Under the hood, these visualizations would likely be using the pdb module discussed in the previous
section.
This section will show an example with IDLE. Before you can run the program, first select Debugger option under
Debug menu. You can also use idle3.9 -d to launch IDLE in debug mode directly. You'll see a new window pop up as
shown below:
Debug window.png
Then, with debug mode active, run the program. Use the buttons and options to go over the code. Variable values will be
automatically available, as shown below.
info
You can right-click on a line from the text editor to set/clear breakpoints.
Testing
Testing can only prove the presence of bugs, not their absence. — Edsger W. Dijkstra
General tips
Another crucial aspect in the programming journey is knowing how to write tests. In bigger projects, usually there are
separate engineers (often in much larger number than developers) to test the code. Even in those cases, writing a few
sanity test cases yourself can help you develop faster knowing that the changes aren't breaking basic functionality.
When I start a project, I usually try to write the programs incrementally. Say I need to iterate over files from a directory. I
will make sure that portion is working (usually with print() statements), then add another feature — say file reading
and test that and so on. This reduces the burden of testing a large program at once at the end. And depending upon the
nature of the program, I'll add a few sanity tests at the end. For example, for my command_help
(https://github.com/learnbyexample/command_help) project, I copy pasted a few test runs of the program with different
options and arguments into a separate file and wrote a program to perform these tests programmatically whenever the
source code is modified.
assert
For simple cases, the assert statement is good enough. If the expression passed to assert evaluates to False, the
AssertionError exception will be raised. You can optionally pass a message, separated by a comma after the
expression to be tested. See docs.python: assert (https://docs.python.org/3/reference/simple_stmts.html#the-assert-
statement) for documentation.
# passing case
>>> assert 2 < 3
# failing case
>>> num = -2
>>> assert num >= 0, 'only positive integer allowed'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AssertionError: only positive integer allowed
Here's a sample program (solution for one of the exercises from Control structures chapter).
# nested_braces.py
def max_nested_braces(expr):
    max_count = count = 0
    for char in expr:
        if char == '{':
            count += 1
            if count > max_count:
                max_count = count
        elif char == '}':
            if count == 0:
                return -1
            count -= 1
    if count != 0:
        return -1
    return max_count

def test_cases():
    assert max_nested_braces('a*b') == 0
    assert max_nested_braces('a*b+{}') == 1
    assert max_nested_braces('a*{b+c}') == 1
    assert max_nested_braces('{a+2}*{b+c}') == 1
    assert max_nested_braces('a*{b+c*{e*3.14}}') == 2
    assert max_nested_braces('{{a+2}*{b+c}+e}') == 2
    assert max_nested_braces('{{a+2}*{b+{c*d}}+e}') == 3
    assert max_nested_braces('{{a+2}*{{b+{c*d}}+e*d}}') == 4
    assert max_nested_braces('a*b{') == -1
    assert max_nested_braces('a*{b+c}}') == -1
    assert max_nested_braces('}a+b{') == -1
    assert max_nested_braces('a*{b+c*{e*3.14}}}') == -1
    assert max_nested_braces('{{a+2}*{{b}+{c*d}}+e*d}}') == -1

if __name__ == '__main__':
    test_cases()
    print('all tests passed')
max_count = count = 0 is a terse way to initialize multiple variables to the same value. This is okay to use for immutable
types (see Mutability chapter) like int, float and str.
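A short sketch of why the distinction matters. The first and second names are hypothetical, and the list type is covered in a later chapter.

```python
# immutable values: rebinding one name doesn't affect the other
max_count = count = 0
count += 1
print(max_count, count)

# mutable values: both names refer to the same list object
first = second = []
second.append('x')
print(first, second)
print(first is second)
```

With the list, the change made through second is visible through first as well, which is rarely what you want when initializing two separate variables.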
As an exercise, randomly change the logic of max_nested_braces function and see if any of the tests fail.
Writing tests helps you in many ways. It could help you guard against typos and accidental editing. Often, you'll need to
tweak a program in future to correct some bugs or add a feature — tests would again help to give you confidence that
you haven't messed up already working cases. Another use case is refactoring, where you rewrite a portion of the
program (sometimes entire) without changing its functionality.
Here's an alternate implementation of max_nested_braces(expr) function from the above program using regular
expressions.
# nested_braces_re.py
# only the function is shown below
import re

def max_nested_braces(expr):
    count = 0
    while True:
        expr, no_of_subs = re.subn(r'\{[^{}]*\}', '', expr)
        if no_of_subs == 0:
            break
        count += 1

    if re.search(r'[{}]', expr):
        return -1
    return count
pytest
For larger projects, simple assert statements aren't enough to adequately write and manage tests. You'll require built-in
module unittest (https://docs.python.org/3/library/unittest.html) or popular third-party modules like pytest
(https://doc.pytest.org/en/latest/contents.html). See python test automation frameworks
(https://github.com/atinfo/awesome-test-automation/blob/master/python-test-automation.md) for more resources.
This section will show a few introductory examples with pytest. If you visit a project on PyPI, the pytest page
(https://pypi.org/project/pytest/) for example, you can copy the installation command as shown in the image below. You
can also check out the statistics link (https://libraries.io/pypi/pytest for example) as a
minimal sanity check that you are installing the correct module.
# virtual environment
$ pip install pytest
# normal environment
$ python3.9 -m pip install --user pytest
After installation, you'll have pytest usable as a command line application by itself. The two programs discussed in the
previous section can be run without any modification as shown below. This is because pytest will automatically use
function names starting with test for its purpose. See doc.pytest: Conventions for Python test discovery
(https://doc.pytest.org/en/latest/goodpractices.html#test-discovery) for full details.
# -v is verbose option, use -q for quiet version
$ pytest -v nested_braces.py
=================== test session starts ====================
platform linux -- Python 3.9.5, pytest-6.2.3, py-1.10.0,
pluggy-0.13.1 -- /usr/local/bin/python3.9
cachedir: .pytest_cache
rootdir: /home/learnbyexample/Python/programs
collected 1 item
# exception_testing.py
import pytest

# assumes sum2nums() is defined above or imported (definition not shown)

def test_valid_values():
    assert sum2nums(3, -2) == 1
    # see https://stackoverflow.com/q/5595425
    from math import isclose
    assert isclose(sum2nums(-3.14, 2), -1.14)

def test_exception():
    with pytest.raises(AssertionError) as e:
        sum2nums('hi', 3)
    assert 'only int/float allowed' in str(e.value)

    with pytest.raises(AssertionError) as e:
        sum2nums(3.14, 'a')
    assert 'only int/float allowed' in str(e.value)
pytest.raises() allows you to check if exceptions are raised for the given test cases. You can optionally check the
error message as well. The with context manager (https://docs.python.org/3/reference/compound_stmts.html#with) will
be discussed in a later chapter. Note that the above program doesn't actually call any executable code, since pytest will
automatically run the test functions.
$ pytest -v exception_testing.py
=================== test session starts ====================
platform linux -- Python 3.9.5, pytest-6.2.3, py-1.10.0,
pluggy-0.13.1 -- /usr/local/bin/python3.9
cachedir: .pytest_cache
rootdir: /home/learnbyexample/Python/programs
collected 2 items
The above illustrations are trivial examples. Tests are typically organized in different files/folders from the program(s)
being tested. Here are some advanced learning resources:
An object capable of returning its members one at a time. Examples of iterables include all sequence types (such
as list, str, and tuple) and some non-sequence types like dict, file objects...
Some of the operations behave differently or do not apply for certain types, see docs.python: Common Sequence
Operations (https://docs.python.org/3/library/stdtypes.html#common-sequence-operations) for details.
Initialization
Tuples are declared as a collection of zero or more objects, separated by a comma within () parentheses characters.
Each element can be specified as a value by itself or as an expression. The outer parentheses are optional if comma
separation is present. Here's some examples:
# note the trailing comma, otherwise it will result in a 'str' data type
# parentheses are optional here, so this is the same as: one_element = 'apple',
>>> one_element = ('apple',)
You can use the tuple() (https://docs.python.org/3/library/functions.html#func-tuple) built-in function to create a tuple
from an iterable (described in the previous section).
>>> chars = tuple('hello')
>>> chars
('h', 'e', 'l', 'l', 'o')
>>> tuple(range(3, 10, 3))
(3, 6, 9)
Slicing
One or more elements can be retrieved from a sequence using the slicing notation (this wouldn't work for an iterable like
dict or set). It works similarly to the start/stop/step logic seen with the range() function. The default step is 1.
Default value for start and stop depends on whether the step is positive or negative.
>>> primes = (2, 3, 5, 7, 11)
You can use a negative index to get elements from the end of the sequence. This is especially helpful when you don't know
the size of the sequence. Given a positive integer n, the expression seq[-n] is evaluated as
seq[len(seq) - n].
# same as primes[0:5:2]
>>> primes[::2]
(2, 5, 11)
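A few more slices on the same primes tuple may help, as a minimal sketch of the start/stop/step rules and negative indexing described above. The particular slices are chosen only for illustration.

```python
primes = (2, 3, 5, 7, 11)

# elements from index 1 up to, but not including, index 4
middle = primes[1:4]             # (3, 5, 7)

# a negative index counts from the end: same as primes[len(primes) - 1]
last = primes[-1]                # 11

# stop is omitted, so the slice runs to the end
last_two = primes[-2:]           # (7, 11)

# with a negative step, the defaults for start and stop are flipped
reversed_primes = primes[::-1]   # (11, 7, 5, 3, 2)
```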
Sequence unpacking
You can assign the individual elements of an iterable to multiple variables. This is known as sequence unpacking and it
is handy in many situations.
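Here's a minimal sketch of sequence unpacking. The record tuple and the variable names are made up for illustration.

```python
# hypothetical record: (name, year, topic)
record = ('learnbyexample', 2021, 'Python')

# the number of variables must match the number of elements
name, year, topic = record

# unpacking also gives a compact way to swap values
a, b = 3, 5
a, b = b, a
```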
Unpacking isn't limited to single value assignments. You can use a * prefix to assign all the remaining values, if any are
left, to a list variable.
>>> values = ('first', 6.2, -3, 500, 'last')
>>> x, *y = values
>>> x
'first'
>>> y
[6.2, -3, 500, 'last']
As an exercise, what do you think will happen for these cases, given nums = (1, 2):
a, b, c = nums
a, *b, c = nums
*a, *b = nums
The min_max(iterable) user-defined function in the above snippet returns both minimum and maximum values of a
given iterable input. min() and max() are built-in functions. You can either save the output as a tuple or unpack into
multiple variables. You'll see built-in functions that return tuple as output later in this chapter.
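A function like min_max() could be sketched as follows. The body is an assumption based on the description above, not necessarily the original definition.

```python
def min_max(iterable):
    # materialize once so that one-shot iterators also work (a design choice here)
    values = tuple(iterable)
    # the comma creates a tuple, so two values are returned together
    return min(values), max(values)

# save the output as a tuple
result = min_max([3, -8, 12, 5])      # (-8, 12)

# or unpack into multiple variables
low, high = min_max(range(5))         # 0 and 4
```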
Iteration
You have already seen examples with for loop that iterates over a sequence data type. Here's a refresher:
You can write your own functions to accept arbitrary number of arguments as well. The packing syntax is similar to the
sequence unpacking examples seen earlier in the chapter. A * prefix to an argument name will allow it to accept zero or
more values. Such an argument will be packed as a tuple data type and it should always be specified after positional
arguments (if any). Idiomatically, args is used as the variable name. Here's an example:
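A function accepting an arbitrary number of arguments might look like the sketch below. The name sum_nums and the total variable follow the exercise text; the body itself is an assumption.

```python
def sum_nums(*args):
    # args is a tuple holding zero or more positional values
    total = 0
    for n in args:
        total += n
    return total

no_args = sum_nums()             # 0
two_args = sum_nums(3, -8)       # -5
many_args = sum_nums(1, 2, 3, 4, 5)   # 15
```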
As an exercise,
add a default valued argument initial which should be used to initialize total instead of 0 in the above
sum_nums() function. For example, sum_nums(3, -8) should give -5 and sum_nums(1, 2, 3, 4, 5,
initial=5) should give 20.
what would happen if you call the above function like sum_nums(initial=5, 2)?
what would happen if you have nums = (1, 2) and call the above function like sum_nums(*nums,
initial=3)?
in what ways does this function differ from the sum() (https://docs.python.org/3/library/functions.html#sum) built-in
function?
zip
Use zip() (https://docs.python.org/3/library/functions.html#zip) to iterate over two or more iterables simultaneously. Every
iteration, you'll get a tuple with an item from each of the iterables. Iteration will stop when any of the iterables is
exhausted. See itertools.zip_longest() (https://docs.python.org/3/library/itertools.html#itertools.zip_longest) and
stackoverflow: Zipped Python generators with 2nd one being shorter (https://stackoverflow.com/q/61126284/4082052) for
alternatives.
Here's an example:
>>> odd = (1, 3, 5)
>>> even = (2, 4, 6)
>>> for i, j in zip(odd, even):
...     print(i + j)
...
3
7
11
As an exercise, write a function that returns the sum of product of corresponding elements of two sequences. For
example, the result should be 44 for (1, 3, 5) and (2, 4, 6).
Tuple methods
While this book won't discuss Object-Oriented Programming (OOP) (https://en.wikipedia.org/wiki/Object-oriented_programming)
in any detail, you'll still see plenty of examples that use objects and their methods. You've already seen a few examples
with modules. See Practical Python Programming (https://dabeaz-course.github.io/practical-python/Notes/Contents.html)
and Fluent Python (https://www.oreilly.com/library/view/fluent-python/9781491946237/) if you want to learn about Python
OOP in depth. See also docs.python: Data model (https://docs.python.org/3/reference/datamodel.html).
Data types in Python are all internally implemented as classes. You can use the dir()
(https://docs.python.org/3/library/functions.html#dir) built-in function to get a list of valid attributes for an object.
# you can also use tuple objects such as 'odd' and 'even' declared earlier
>>> dir(tuple)
['__add__', '__class__', '__class_getitem__', '__contains__', '__delattr__',
'__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__',
'__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__',
'__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__',
'__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'count', 'index']
The non-dunder names (last two items) in the above listing will be discussed in this section. But first, a refresher on the
in membership operator is shown below.
>>> num = 5
>>> num in (10, 21, 33)
False
>>> num = 21
>>> num in (10, 21, 33)
True
The count() method returns the number of times a value is present in the tuple object.
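For instance, with a made-up tuple named sample (a hypothetical name, chosen to avoid clashing with other examples):

```python
sample = (1, 3, 5, 3, 3, 7)

threes = sample.count(3)      # 3
missing = sample.count(31)    # 0, no error for absent values
```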
The index() method will give the index of the first occurrence of a value. It will raise ValueError if the value isn't
present, which you can avoid by using the in operator first. Or, you can use the try-except statement to handle the
exception as needed.
>>> nums.index(3)
4
>>> n = 31
>>> nums.index(n)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: tuple.index(x): x not in tuple
>>> if n in nums:
...     print(nums.index(n))
... else:
...     print(f'{n} not present in "nums" tuple')
...
31 not present in "nums" tuple
info
The list and str sequence types have many more methods and they will
be discussed separately in later chapters.
List
A list is a container data type similar to tuple, but it is mutable and comes with a lot more functionality. Lists are
typically used to store and manipulate an ordered collection of values.
# 1D example
>>> vowels = ['a', 'e', 'i', 'o', 'u']
>>> vowels[0]
'a'
# same as vowels[4] since len(vowels) - 1 = 4
>>> vowels[-1]
'u'
# 2D example
>>> student = ['learnbyexample', 2021, ['Linux', 'Vim', 'Python']]
>>> student[1]
2021
>>> student[2]
['Linux', 'Vim', 'Python']
>>> student[2][-1]
'Python'
Since list is a mutable data type, you can modify the object after initialization. You can either change a single element
or use slicing notation to modify multiple elements.
>>> nums = [1, 4, 6, 22, 3, 5]
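Continuing with the same nums list, here's a minimal sketch of both kinds of modification. The replacement values are arbitrary.

```python
nums = [1, 4, 6, 22, 3, 5]

# change a single element
nums[0] = 100
# nums is now [100, 4, 6, 22, 3, 5]

# change multiple elements using slicing notation
nums[-3:] = [-1, -2, -3]
# nums is now [100, 4, 6, -1, -2, -3]

# slice assignment can also change the length of the list
nums[1:3] = [0]
# nums is now [100, 0, -1, -2, -3]
```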
Use the append() method to add a single element to the end of a list object. If you need to append multiple items,
you can pass an iterable to the extend() method. As an exercise, check what happens if you pass an iterable to the
append() method and a non-iterable value to the extend() method. What happens if you pass multiple values to both
these methods?
>>> books = []
>>> books.append('Cradle')
>>> books.append('Mistborn')
>>> books
['Cradle', 'Mistborn']
The count() method will give the number of times a value is present.
The index() method will give the index of the first occurrence of a value. As seen with tuple, this method will raise
ValueError if the value isn't present.
The pop() method removes the last element of a list by default. You can pass an index to delete that specific item
and the list will be automatically re-arranged. Return value is the element being deleted.
>>> primes = [2, 3, 5, 7, 11]
>>> last = primes.pop()
>>> last
11
>>> primes
[2, 3, 5, 7]
>>> primes.pop(2)
5
>>> primes
[2, 3, 7]
To remove multiple elements using slicing notation, use the del statement. Unlike the pop() method, there is no return
value.
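A short sketch of the del statement with an index and with a slice, using a fresh primes list:

```python
primes = [2, 3, 5, 7, 11]

# delete a single element by index
del primes[1]
# primes is now [2, 5, 7, 11]

# delete multiple elements using slicing notation
del primes[1:3]
# primes is now [2, 11]
```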
The pop() method deletes an element based on its index. Use the remove() method to delete an element based on its
value. You'll get ValueError if the value isn't found.
>>> even_numbers = [2, 4, 6, 8, 10]
>>> even_numbers.remove(8)
>>> even_numbers
[2, 4, 6, 10]
The clear() method removes all the elements. You might wonder why not just assign an empty list? If you have
observed closely, all of the methods seen so far modified the list object in-place. This is useful if you are passing a
list object to a function and expect the function to modify the object itself instead of returning a new object. See
Mutability chapter for more details.
You've already seen how to add element(s) at the end of a list using append() and extend() methods. The
insert() method is the opposite of pop() method. You can provide a value to be inserted at the given index. As an
exercise, check what happens if you pass a list value. Also, what happens if you pass more than one value?
The reverse() method reverses a list object in-place. Use slicing notation if you want a new object.
>>> primes[::-1]
[2, 3, 5, 7, 11]
>>> primes
[11, 7, 5, 3, 2]
Collections that support order comparison are ordered the same as their first unequal elements (for example,
[1,2,x] <= [1,2,y] has the same value as x <= y). If a corresponding element does not exist, the shorter
collection is ordered first (for example, [1,2] < [1,2,3] is true).
# ascending order
>>> nums.sort()
>>> nums
[0, 1, 1, 2, 5.3, 321]
# descending order
>>> nums.sort(reverse=True)
>>> nums
[321, 5.3, 2, 1, 1, 0]
>>> sorted('fuliginous')
['f', 'g', 'i', 'i', 'l', 'n', 'o', 's', 'u', 'u']
The key argument accepts the name of a built-in/user-defined function (i.e. function object) for custom sorting. If two
elements are deemed equal based on the result of the function, the original order will be maintained (known as stable
sorting). Here's some examples:
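As a minimal sketch with made-up data, the built-in len and abs functions can serve as the key. Note how elements that compare equal under the key keep their original relative order.

```python
words = ['fox', 'cat', 'morello', 'tiger', 'ox']

# sort by length; 'fox' stays before 'cat' since both have length 3
by_length = sorted(words, key=len)
# ['ox', 'fox', 'cat', 'tiger', 'morello']

nums = [-3, 1, -2, 2, 0]

# sort by absolute value; -2 stays before 2
by_abs = sorted(nums, key=abs)
# [0, 1, -2, 2, -3]
```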
If the custom user-defined function required is just a single expression, you can create anonymous functions with lambda
expressions (https://docs.python.org/3/tutorial/controlflow.html#lambda-expressions) instead of a full-fledged function. As
an exercise, read docs.python HOWTOs: Sorting (https://docs.python.org/3/howto/sorting.html) and implement the below
examples using operator module instead of lambda expressions.
# based on second element of each item
>>> items = [('bus', 10), ('car', 20), ('jeep', 3), ('cycle', 5)]
>>> sorted(items, key=lambda e: e[1], reverse=True)
[('car', 20), ('bus', 10), ('cycle', 5), ('jeep', 3)]
You can use sequence types like list or tuple to specify multiple sorting conditions. Make sure to read the sequence
comparison examples from previous section before trying to understand the following examples.
As an exercise, given nums = [1, 4, 5, 2, 51, 3, 6, 22], determine and implement the sorting condition
based on the required output shown below:
First up, getting a random element from a non-empty sequence using the choice() method.
>>> random.shuffle(items)
>>> items
['car', 3, -3.14, 'jeep', 'hi', 20]
Use the sample() method to get a list of specified number of random elements. As an exercise, see what happens if
you pass a sample size greater than the number of elements present in the input sequence.
separating out even numbers is Filter (i.e. only elements that satisfy a condition are retained)
square of such numbers is Map (i.e. each element is transformed by a mapping function)
final sum is Reduce (i.e. you get one value out of multiple values)
One or more of these operations may be absent depending on the problem statement. A function for the first of these
steps could look like:
>>> def get_evens(iterable):
...     op = []
...     for n in iterable:
...         if n % 2 == 0:
...             op.append(n)
...     return op
...
>>> get_evens([100, 53, 32, 0, 11, 5, 2])
[100, 32, 0, 2]
And finally, the function after the third step could be:
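One way to combine all three steps is sketched below. The function name sum_sqr_evens is hypothetical.

```python
def sum_sqr_evens(iterable):
    total = 0
    for n in iterable:
        # filter: keep only even numbers
        if n % 2 == 0:
            # map (square) and reduce (running sum) in a single step
            total += n * n
    return total

result = sum_sqr_evens([100, 53, 32, 0, 11, 5, 2])
# 10000 + 1024 + 0 + 4 = 11028
```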
Exercises
Write a function that returns the product of a sequence of numbers. Empty sequence or sequence containing non-
numerical values should raise TypeError.
product([-4, 2.3e12, 77.23, 982, 0b101]) should give -3.48863356e+18
product(range(2, 6)) should give 120
product(()) and product(['a', 'b']) should raise TypeError
>>> remove_dunder(list)
['append', 'clear', 'copy', 'count', 'extend', 'index',
'insert', 'pop', 'remove', 'reverse', 'sort']
>>> remove_dunder(tuple)
['count', 'index']
Mutability
int, float, str and tuple are examples of immutable data types. On the other hand, types like list and dict are
mutable. This chapter will discuss what happens when you pass a variable to a function or when you assign them to
another value/variable.
id
The id() (https://docs.python.org/3/library/functions.html#id) built-in function returns the identity (reference) of an object.
Here's some examples to show what happens when you assign a variable to another value/variable.
>>> num1 = 5
>>> id(num1)
140204812958128
# here, num1 gets a new identity
>>> num1 = 10
>>> id(num1)
140204812958288
Pass by reference
Variables in Python store references to an object, not their values. When you pass a list object to a function, you are
passing the reference to this object. Since list is mutable, any in-place changes made to this object within the function
will also be reflected in the original variable that was passed to the function. Here's an example:
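The sketch below uses a hypothetical append_one() function to illustrate the idea. The id() check confirms that the function worked on the very same object.

```python
def append_one(ip):
    # in-place modification of whatever object 'ip' refers to
    ip.append(1)

my_list = [10, 20]
identity_before = id(my_list)

append_one(my_list)
# my_list is now [10, 20, 1] and its identity hasn't changed
same_object = id(my_list) == identity_before
```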
This is true even for slices of a sequence containing mutable objects. Also, as shown in the example below, tuple
doesn't prevent mutable elements from being changed.
>>> nums_2d = ([1, 3, 2, 10], [1.2, -0.2, 0, 2], [100, 200])
>>> last_two = nums_2d[-2:]
>>> last_two
([1.2, -0.2, 0, 'apple'], [100, 'ball'])
>>> nums_2d
([1, 3, 2, 10], [1.2, -0.2, 0, 'apple'], [100, 'ball'])
As an exercise, use id() function to verify that the identity of last two elements of nums_2d variable in the above
example is the same as the identity of both the elements of last_two variable.
Here's an example where all the elements are immutable. In this case, using slice notation is safe for copying.
>>> id(items)
140204765864256
>>> id(items_copy)
140204765771968
On the other hand, if the sequence has mutable objects, a shallow copy made using slicing notation won't stop the copy
from modifying the original.
>>> nums_2d = [[1, 3, 2, 10], [1.2, -0.2, 0, 2], [100, 200]]
>>> nums_2d_copy = nums_2d[:]
>>> nums_2d_copy
[['oops', 3, 2, 10], [1.2, -0.2, 0, 2], [100, 200]]
>>> nums_2d
[['oops', 3, 2, 10], [1.2, -0.2, 0, 2], [100, 200]]
copy.deepcopy
The copy (https://docs.python.org/3/library/copy.html#module-copy) built-in module has a deepcopy() method if you
wish to recursively create new copies of all the elements of a mutable object.
>>> nums_2d_deepcopy
[['yay', 3, 2, 10], [1.2, -0.2, 0, 2], [100, 200]]
>>> nums_2d
[[1, 3, 2, 10], [1.2, -0.2, 0, 2], [100, 200]]
As an exercise, create a deepcopy of only the first two elements of nums_2d object from the above example.
Dict
Dictionaries can be thought of as a collection of key-value pairs or a named list of items. They used to be unordered, but
insertion order is guaranteed to be maintained since Python 3.7. See this tutorial
(https://sharats.me/posts/the-python-dictionary/) for a more detailed discussion on dict usage.
>>> marks = {'Rahul': 86, 'Ravi': 92, 'Rohit': 75, 'Rajan': 79}
>>> marks['Rohit']
75
>>> marks['Rahul'] += 5
>>> marks['Ram'] = 67
>>> del marks['Rohit']
>>> items = {('car', 2): 'honda', ('car', 5): 'tesla', ('bike', 10): 'hero'}
>>> items[('bike', 10)]
'hero'
You can also use the dict() (https://docs.python.org/3/library/functions.html#func-dict) function for initialization in various
ways. If all the keys are of str data type, you can use the same syntax as keyword arguments seen earlier with function
definitions. You can also pass a container type having two values per element, such as a list of tuples as shown
below.
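A minimal sketch of both forms, with made-up data:

```python
# keyword argument syntax, usable when all keys are valid identifier strings
marks = dict(Rahul=86, Ravi=92, Rohit=75)
# {'Rahul': 86, 'Ravi': 92, 'Rohit': 75}

# a list of two-element tuples
pairs = [('jeep', 20), ('car', 3), ('cycle', 5)]
items = dict(pairs)
# {'jeep': 20, 'car': 3, 'cycle': 5}
```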
>>> dict.fromkeys(colors)
{'red': None, 'blue': None, 'green': None}
>>> dict.fromkeys(colors, 255)
{'red': 255, 'blue': 255, 'green': 255}
>>> marks['Ron']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'Ron'
>>> marks.get('Ravi')
92
>>> value = marks.get('Ron')
>>> print(value)
None
>>> marks.get('Ron', 0)
0
Iteration
The default for loop over a dict object will give you a key for each iteration.
# you'll also get only the keys if you apply list(), tuple() or set()
>>> list(fruits)
['banana', 'papaya', 'mango', 'fig']
As an exercise,
given fruits dictionary as defined in the above code snippet, what do you think will happen when you use a,
*b, c = fruits?
given nums = [1, 4, 6, 22, 3, 5, 4, 3, 6, 2, 1, 51, 3, 1], keep only first occurrences of a
value from this list without changing the order of elements. You can do it with dict features presented so far. [1,
4, 6, 22, 3, 5, 2, 51] should be the output. See Using dict to eliminate duplicates while retaining order
(https://twitter.com/raymondh/status/944125570534621185) if you are not able to solve it.
Dict methods and operations
The in operator checks if a key is present in the given dictionary. The keys() method returns all the keys and
values() method returns all the values. These methods return custom view objects that maintain insertion order. The
object returned by keys() is also set-like.
>>> marks.keys()
dict_keys(['Rahul', 'Ravi', 'Rohit', 'Rajan'])
>>> marks.values()
dict_values([86, 92, 75, 79])
The items() method can be used to get a key-value tuple for each iteration.
# set-like object
>>> fruits.items()
dict_items([('banana', 12), ('papaya', 5), ('mango', 10), ('fig', 100)])
The del statement example seen earlier removes the given key without returning the value associated with it. You can
use the pop() method to get the value as well. The popitem() method removes the last added item and returns the
key-value pair as a tuple.
>>> marks = dict(Rahul=86, Ravi=92, Rohit=75, Rajan=79)
>>> marks.pop('Ravi')
92
>>> marks
{'Rahul': 86, 'Rohit': 75, 'Rajan': 79}
>>> marks.popitem()
('Rajan', 79)
>>> marks
{'Rahul': 86, 'Rohit': 75}
The update() method allows you to add/update items from another dictionary or a container with key-value pair
elements.
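Here's a small sketch of update() with both kinds of input. The names and values are made up.

```python
marks = {'Rahul': 86, 'Ravi': 92}

# from another dictionary: existing keys are updated, new keys are added
marks.update({'Ravi': 95, 'Ram': 67})

# from a container with key-value pair elements
marks.update([('Rohit', 75)])

# marks is now {'Rahul': 86, 'Ravi': 95, 'Ram': 67, 'Rohit': 75}
```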
The | operator is similar to the update() method, except that you get a new dict object instead of in-place
modification.
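A minimal sketch with two made-up dictionaries (the | operator for dict needs Python 3.9 or later):

```python
d1 = {'p': 1, 'x': 10}
d2 = {'x': 20, 'y': 30}

# for common keys, the value from the right-hand operand wins
merged = d1 | d2
# {'p': 1, 'x': 20, 'y': 30}

# d1 and d2 themselves are left unchanged
```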
Turning it around, when you have a function defined with keyword arguments, you can unpack a dictionary while calling
the function.
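For illustration, assume a hypothetical describe() function. The ** prefix unpacks the dictionary so that each key is matched to an argument name.

```python
def describe(name, marks=0):
    return f'{name} scored {marks}'

details = {'name': 'Ravi', 'marks': 92}

# same as describe(name='Ravi', marks=92)
message = describe(**details)
# 'Ravi scored 92'
```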
Set
set is a mutable, unordered collection of objects. frozenset is similar to set, but immutable. See docs.python: set,
frozenset (https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset) for documentation.
Initialization
Sets are declared as a collection of objects separated by a comma within {} curly brace characters. The set()
(https://docs.python.org/3/library/functions.html#func-set) function can be used to initialize an empty set and to convert
iterables.
>>> empty_set = set()
>>> empty_set
set()
Here's some examples for set operations like union, intersection, etc. You can either use methods or operators, both will
give you a new set object instead of in-place modification. The difference is that set methods can accept any iterable,
whereas the operators can work only with set or set-like objects.
>>> color_1 = {'teal', 'light blue', 'green', 'yellow'}
>>> color_2 = {'light blue', 'black', 'dark green', 'yellow'}
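With the two sets above, the common operations could be sketched as follows. The result names are hypothetical, and since sets are unordered the displayed order of elements may vary.

```python
color_1 = {'teal', 'light blue', 'green', 'yellow'}
color_2 = {'light blue', 'black', 'dark green', 'yellow'}

# union: elements present in either set
all_colors = color_1 | color_2

# intersection: elements common to both sets
common = color_1 & color_2

# difference: elements present only in color_1
only_1 = color_1 - color_2

# symmetric difference: elements present in exactly one of the sets
either = color_1 ^ color_2

# methods accept any iterable, not just sets
with_list = color_1.intersection(['teal', 'black'])
```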
As mentioned in Dict chapter, methods like keys() and items() return a set-like object (the values() view isn't
set-like, since values need not be unique). You can apply set operators on them.
# union
>>> color_1.update(color_2)
>>> color_1
{'light blue', 'green', 'dark green', 'black', 'teal', 'yellow'}
The pop() method removes and returns an arbitrary element. Use the remove() method if you want to delete an
element based on its value. The discard() method is similar to remove(), but it will not generate an error if the
element doesn't exist. The clear() method will delete all the elements.
>>> colors.pop()
'blue'
>>> colors
{'green', 'red'}
>>> colors.clear()
>>> colors
set()
Exercises
Write a function that checks whether an iterable has duplicate values or not.
>>> has_duplicates('pip')
True
>>> has_duplicates((3, 2))
False
Text processing
This chapter will discuss str methods and introduce a few examples with the string and re modules.
join
The join() method is similar to what the print() function does with the sep option, except that you get a str object
as the result. The iterable you pass to join() can only have string elements. On the other hand, print() uses an
object's __str__() method (https://docs.python.org/3/reference/datamodel.html#object.__str__) to get its string
representation (__repr__() method (https://docs.python.org/3/reference/datamodel.html#object.__repr__) is used as a
fallback).
>>> print(1, 2)
1 2
>>> ' '.join((1, 2))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected str instance, int found
>>> ' '.join(('1', '2'))
'1 2'
As an exercise, check what happens if you pass multiple string values separated by comma to join() instead of an
iterable.
Transliteration
The translate() method accepts a table of codepoints (numerical value of a character) mapped to another
character/codepoint or None (if the character has to be deleted). You can use the ord()
(https://docs.python.org/3/library/functions.html#ord) built-in function to get the codepoint of characters. Or, you can use
the str.maketrans() method to generate the mapping for you.
>>> ord('a')
97
>>> ord('A')
65
>>> import string
>>> para = '"Hi", there! How *are* you? All fine here.'
>>> para.translate(str.maketrans('', '', string.punctuation))
'Hi there How are you All fine here'
As an exercise, read the documentation for features covered in this section. See also stackoverflow: character
translation examples (https://stackoverflow.com/q/555705/4082052).
>>> '"Hi",'.strip(string.punctuation)
'Hi'
The removeprefix() and removesuffix() methods will delete a substring from the start/end of the input string.
>>> 'spare'.removeprefix('sp')
'are'
>>> 'free'.removesuffix('e')
'fre'
>>> sentence.capitalize()
'This is a sample string'
>>> sentence.title()
'This Is A Sample String'
>>> sentence.lower()
'this is a sample string'
>>> sentence.upper()
'THIS IS A SAMPLE STRING'
>>> sentence.swapcase()
'THiS Is A SAmPLE sTRiNG'
The string.capwords() method is similar to title() but also allows a specific word separator (whose default is
whitespace).
>>> phrase = 'this-IS-a:colon:separated,PHRASE'
>>> phrase.title()
'This-Is-A:Colon:Separated,Phrase'
>>> string.capwords(phrase, ':')
'This-is-a:Colon:Separated,phrase'
is methods
The islower(), isupper() and istitle() methods check if the given string conforms to the specific case pattern.
Characters other than letters do not influence the result, but at least one cased letter needs to be present for a True
output.
>>> 'αλεπού'.islower()
True
>>> '123'.isupper()
False
>>> 'ABC123'.isupper()
True
Here's some examples with isnumeric() and isascii() methods. As an exercise, read the documentation for the
rest of the is methods.
# checks if string has numeric characters only, at least one
>>> '153'.isnumeric()
True
>>> ''.isnumeric()
False
>>> '1.2'.isnumeric()
False
>>> '-1'.isnumeric()
False
The count() method gives the number of times the given substring is present (non-overlapping).
>>> sentence = 'This is a sample string'
>>> sentence.count('is')
2
>>> sentence.count('w')
0
Match start/end
The startswith() and endswith() methods check for the presence of substrings only at the start/end of the input
string.
>>> sentence.startswith('This')
True
>>> sentence.startswith('is')
False
>>> sentence.endswith('ing')
True
>>> sentence.endswith('ly')
False
split
The split() method splits a string based on the given substring and returns a list. By default, whitespace is used for
splitting. You can also control the number of splits.
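A minimal sketch of the three behaviors, using made-up strings:

```python
# split on a given substring
fruits = 'apple,mango,fig'.split(',')
# ['apple', 'mango', 'fig']

# default: split on runs of whitespace, ignoring leading/trailing whitespace
words = ' have   a nice\tday '.split()
# ['have', 'a', 'nice', 'day']

# limit the number of splits
first, rest = 'a:b:c:d'.split(':', maxsplit=1)
# 'a' and 'b:c:d'
```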
replace
Use replace() method for substitution operations. Optional third argument allows you to specify number of
replacements to be made.
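For example, with a made-up phrase:

```python
phrase = '2 be or not 2 be'

# replace all occurrences; a new string is returned since str is immutable
all_replaced = phrase.replace('2', 'to')
# 'to be or not to be'

# replace only the first occurrence
first_only = phrase.replace('2', 'to', 1)
# 'to be or not 2 be'
```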
re module
Regular expressions are a versatile tool for text processing. You'll find them included as part of the standard library of most
programming languages that are used for scripting purposes. If not, you can usually find a third-party library. Syntax and
features of regular expressions vary from language to language though. re module
(https://docs.python.org/3/library/re.html) is the built-in library for Python.
What's so special about regular expressions and why would you need them? They form a mini programming language,
specialized for text processing. Parts of a regular expression can be saved for future use, analogous to variables and
functions. There are ways to perform AND, OR, NOT conditionals. There are also operations similar to the range() function,
the string repetition operator and so on. Here are some common use cases:
Sanitizing a string to ensure that it satisfies a known set of rules. For example, to check if a given string matches
password rules.
Filtering or extracting portions on an abstract level like alphabets, numbers, punctuation and so on.
Qualified string replacement. For example, at the start or the end of a string, only whole words, based on
surrounding text, etc.
>>> import re
Exercises
Write a function that checks if two strings are anagrams irrespective of case (assume input is made up of
alphabets only).
Read the documentation and implement these formatting examples with equivalent str methods.
>>> f'{fruit:=>10}'
'=====apple'
>>> f'{fruit:=<10}'
'apple====='
>>> f'{fruit:=^10}'
'==apple==='
>>> f'{fruit:^10}'
'  apple   '
Write a function that returns a list of words present in the input string.
Comprehensions and Generator expressions
This chapter will show how to use comprehensions and generator expressions for map, filter and reduce operations.
You'll also learn about iterators and the yield statement.
Comprehensions
As mentioned earlier, Python provides map() (https://docs.python.org/3/library/functions.html#map) and filter()
(https://docs.python.org/3/library/functions.html#filter) built-in functions. Comprehensions provide a terser and a (usually)
faster way to implement them. However, the syntax can take a while to understand and get comfortable with.
The minimal requirement for a comprehension is a mapping expression (which could include a function call) and a loop.
Here's an example:
# manual implementation
>>> sqr_nums = []
>>> for n in nums:
...     sqr_nums.append(n * n)
...
>>> sqr_nums
[103041, 1, 1, 0, 28.09, 4]
# list comprehension
>>> [n * n for n in nums]
[103041, 1, 1, 0, 28.09, 4]
The general form of the above list comprehension is [expr loop]. Comparing with the manual implementation, the
difference is that append() is automatically performed, which is where most of the performance benefit comes from.
Note that list comprehension is defined based on the output being a list, input to the for loop can be any iterable
(like tuple in the above example).
# manual implementation
def remove_dunder(obj):
    names = []
    for n in dir(obj):
        if '__' not in n:
            names.append(n)
    return names
The general form of the above comprehension is [expr loop condition]. If you can write the manual
implementation, it is easy to derive the comprehension version. Put the expression (the argument passed to append()
method) first, and then put the loops and conditions in the same order as the manual implementation. With practice, you'll
be able to read and write the comprehension versions naturally.
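Applying that recipe to the manual remove_dunder() implementation gives the following sketch: the expression n comes first, then the loop, then the condition.

```python
def remove_dunder(obj):
    # [expr loop condition]
    return [n for n in dir(obj) if '__' not in n]

tuple_methods = remove_dunder(tuple)
# ['count', 'index']
```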
>>> p = [1, 3, 5]
>>> q = [3, 214, 53]
>>> [i + j for i, j in zip(p, q)]
[4, 217, 58]
>>> [i * j for i, j in zip(p, q)]
[3, 642, 265]
Similarly, you can build dict and set comprehensions by using {} instead of [] characters. Comprehension syntax
inside () characters becomes a generator expression (discussed later in this chapter), so you'll need to use tuple() for
tuple comprehension. You can use list(), dict() and set() instead of [] and {} respectively as well.
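As a rough illustration of these variations, the sketch below builds a dict, a set and a tuple from the same input. The words data and the variable names are made up for this example.

```python
words = ('apple', 'fig', 'kiwi', 'fig')

# dict comprehension: key and value are separated by a colon
lengths = {w: len(w) for w in words}
# set comprehension: duplicate results are dropped
unique_lengths = {len(w) for w in words}
# generator expression passed to tuple() acts as a "tuple comprehension"
upper_words = tuple(w.upper() for w in words)

print(lengths)         # {'apple': 5, 'fig': 3, 'kiwi': 4}
print(unique_lengths)  # element order of a set isn't guaranteed
print(upper_words)     # ('APPLE', 'FIG', 'KIWI', 'FIG')
```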
Iterator
Partial quote from docs.python glossary: iterator (https://docs.python.org/3/glossary.html#term-iterator):
An object representing a stream of data. Repeated calls to the iterator’s __next__() method (or passing it to the
built-in function next()) return successive items in the stream. When no more data are available a
StopIteration exception is raised instead.
The filter() example in the previous section required further processing, such as passing to the list() function to
get the output as a list object. This is because the filter() function returns an object that behaves like an iterator.
You can pass iterators anywhere iterables are allowed, such as the for loop. Here's an example:
One of the differences between an iterable and an iterator is that you can iterate over iterables any number of times
(quite the tongue twister, if I may say so myself). Also, the next() function can be used on an iterator, but not on iterables.
Once you have exhausted an iterator, calling next() on it will result in a StopIteration exception, while a for loop over it
simply ends without producing any items. Iterators are lazy and memory efficient (https://en.wikipedia.org/wiki/Lazy_evaluation) since
the results are evaluated only when needed, instead of lying around in a container.
You can convert an iterable to an iterator using the iter() (https://docs.python.org/3/library/functions.html#iter) built-in
function.
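The differences between iterables and iterators described in this section can be sketched as follows. The variable names are illustrative only.

```python
nums = [10, 20, 30]

# a list is an iterable, iter() returns an iterator over it
it = iter(nums)
first = next(it)       # 10
rest = list(it)        # remaining items: [20, 30]

# the iterator is exhausted now
leftover = list(it)    # []
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

# the list itself can be iterated over again any number of times
again = [n for n in nums]
print(first, rest, leftover, exhausted, again)
```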
yield
Functions that use the yield statement instead of return to create an iterator are known as generators. Quoting from
docs.python: Generators (https://docs.python.org/3/tutorial/classes.html#generators):
Each time next() is called on it, the generator resumes where it left off (it remembers all the data values and
which statement was last executed).
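Here's a small sketch of a generator function. The countdown() name is a made-up example and not part of any library.

```python
def countdown(n):
    # each yield hands back one value and pauses the function
    # execution resumes from here on the next call to next()
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)
print(next(gen))   # 3
print(next(gen))   # 2
print(list(gen))   # [1]
```

Since a generator is an iterator, it can also be passed directly to a for loop or to functions like list() and sum().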
# inner product
>>> sum(i * j for i, j in zip((1, 3, 5), (2, 4, 6)))
44
Exercises
Write a function that returns a dictionary sorted by values in ascending order.
return the input string as the only element if its length is less than 3 characters
otherwise, return all slices that have 2 or more characters
>>> word_slices('i')
['i']
>>> word_slices('to')
['to']
>>> word_slices('table')
['ta', 'tab', 'tabl', 'table', 'ab', 'abl', 'able', 'bl', 'ble', 'le']
Square even numbers and cube odd numbers. For example, [321, 1, -4, 0, 5, 2] should give you
[33076161, 1, 16, 0, 125, 4] as the output.
Calculate sum of squares of the numbers, only if the square value is less than 50. Output for (7.1, 1, -4, 8,
5.1, 12) should be 43.01.
The mode argument specifies what kind of processing you want. Only text mode, which is the default, will be covered in this
chapter. You can combine options, for example, rb means read in binary mode. Here are the relevant details from the
documentation:
The encoding argument is meaningful only in the text mode. You can check the default encoding for your environment
using the locale (https://docs.python.org/3/library/locale.html) module as shown below. See docs.python: standard
encodings (https://docs.python.org/3/library/codecs.html#standard-encodings) and docs.python HOWTOs: Unicode
(https://docs.python.org/3/howto/unicode.html) for more details.
>>> import locale
>>> locale.getpreferredencoding()
'UTF-8'
Here's how Python handles line separation by default (see the documentation for more details):
On input, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or
'\r\n', and these are translated into '\n' before being returned to the caller.
On output, if newline is None, any '\n' characters written are translated to the system default line separator,
os.linesep.
Context manager
Quoting from docs.python: Reading and Writing Files (https://docs.python.org/3/tutorial/inputoutput.html#reading-and-
writing-files):
It is good practice to use the with keyword when dealing with file objects. The advantage is that the file is
properly closed after its suite finishes, even if an exception is raised at some point. Using with is also much
shorter than writing equivalent try-finally blocks.
# read_file.py
with open('ip.txt', mode='r', encoding='ascii') as f:
    for ip_line in f:
        op_line = ip_line.rstrip('\n').capitalize() + '.'
        print(op_line)
Recall that as keyword was seen before in Different ways of importing and try-except sections. Here's the output of the
above program:
$ python3.9 read_file.py
Hi there.
Today is sunny.
Have a nice day.
>>> open('ip.txt').read()
'hi there\ntoday is sunny\nhave a nice day\n'
>>> fh = open('ip.txt')
# readline() is similar to next()
# but returns empty string instead of StopIteration exception
>>> fh.readline()
'hi there\n'
>>> fh.readlines()
['today is sunny\n', 'have a nice day\n']
>>> fh.readline()
''
write
# write_file.py
with open('op.txt', mode='w', encoding='ascii') as f:
    f.write('this is a sample line of text\n')
    f.write('yet another line\n')
You can call the write() method on a filehandle to add contents to that file (provided the mode you have set supports
writing). Unlike print(), the write() method doesn't automatically add newline characters.
$ python3.9 write_file.py
$ cat op.txt
this is a sample line of text
yet another line
$ file op.txt
op.txt: ASCII text
warning
If the file already exists, the w mode will overwrite the contents (i.e. existing
content will be lost).
info
You can also use the print() function for writing by passing the filehandle
to the file argument. The fileinput module (https://docs.python.org/3/library/fileinput.html) supports in-place
editing and other features (see In-place editing with fileinput section for examples).
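Here's a minimal sketch of writing with print() and the file argument. It uses a temporary directory (an assumption made for this example) so that no existing file is overwritten.

```python
import os
import tempfile

# a temporary directory keeps this sketch from touching real files
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'op.txt')
    with open(path, mode='w', encoding='ascii') as f:
        # unlike write(), print() adds the newline character by default
        print('this is a sample line of text', file=f)
        print('yet another line', file=f)

    with open(path, encoding='ascii') as f:
        contents = f.read()

print(contents, end='')
```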
This module provides a portable way of using operating system dependent functionality.
>>> import os
# file size
>>> os.stat('ip.txt').st_size
40
The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix
shell, although results are returned in arbitrary order. No tilde expansion is done, but *, ?, and character ranges
expressed with [] will be correctly matched.
The shutil module offers a number of high-level operations on files and collections of files. In particular,
functions are provided which support file copying and removal.
This module offers classes representing filesystem paths with semantics appropriate for different operating
systems. Path classes are divided between pure paths, which provide purely computational operations without
I/O, and concrete paths, which inherit from pure paths but also provide I/O operations.
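The glob and shutil modules quoted above, along with pathlib (which provides the path classes described in the last quote), can be tried out with a sketch like the one below. The file names and the temporary directory are assumptions made for this illustration.

```python
import glob
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    src = Path(tmp_dir) / 'ip.txt'
    src.write_text('hi there\n', encoding='ascii')

    # shutil: high-level file copy
    shutil.copy(src, Path(tmp_dir) / 'backup.txt')

    # glob: shell-style wildcard matching
    # results are sorted here since they are returned in arbitrary order
    txt_files = sorted(os.path.basename(p)
                       for p in glob.glob(os.path.join(tmp_dir, '*.txt')))

    # pathlib: pure path operations don't need file access
    suffix = src.suffix
    stem = src.stem
    # concrete path operations work with the filesystem
    found = src.exists()

print(txt_files, suffix, stem, found)
```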
There are specialized modules for structured data processing as well, for example:
Exercises
Write a program that reads a known filename f1.txt which contains a single column of numbers in Python
syntax. Your task is to display the sum of these numbers, which is 10485.14 for the given example.
$ cat f1.txt
8
53
3.14
84
73e2
100
2937
Read the documentation for glob.glob() and write a program to list all files ending with .txt in the current
directory as well as sub-directories, recursively.
Using os module
The last chapter showed a few examples with the os module for file processing. The os module is a feature-rich module with a lot
of other uses, like providing an interface for working with external commands. Here's an example:
>>> import os
Similar to the print() function, the output of the external command, if any, is displayed on the screen. The return value
is the exit status of the command, which gets displayed by default on the REPL. 0 means the command executed
successfully, while any other value indicates some kind of failure. As per docs.python: os.system
(https://docs.python.org/3/library/os.html#os.system):
On Unix, the return value is the exit status of the process encoded in the format specified for wait().
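As a small sketch of this behavior, the script below runs an echo command (chosen here only because it is available on most systems) and saves the return value.

```python
import os

# the command's output goes straight to the screen, it isn't captured
status = os.system('echo hello')

# 0 indicates that the command executed successfully
print(status)
```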
You can use the os.popen() function to save the results of an external command. It provides a file-object-like interface
for both read (default) and write. To check the status, call the close() method on the filehandle (None means success).
>>> fh = os.popen('wc -w <ip.txt')
>>> op = fh.read()
>>> op
'9\n'
>>> status = fh.close()
>>> print(status)
None
subprocess.run
The subprocess module provides a more flexible and secure option to execute external commands, at the cost of being
more verbose.
The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and
obtain their return codes.
The recommended approach to invoking subprocesses is to use the run() function for all use cases it can
handle. For more advanced use cases, the underlying Popen interface can be used directly.
>>> import subprocess
>>> subprocess.run('pwd')
/home/learnbyexample/Python/programs
CompletedProcess(args='pwd', returncode=0)
See also:
shell=True
You can also construct a single string command, similar to os.system(), if you set shell keyword argument to True.
While this is convenient, use it only if you have total control over the command being executed such as your personal
scripts. Otherwise, it can lead to security issues, see stackoverflow: why not use shell=True
(https://stackoverflow.com/q/13491392/4082052) for details.
If shell is True, the specified command will be executed through the shell. This can be useful if you are using
Python primarily for the enhanced control flow it offers over most system shells and still want convenient access to
other shell features such as shell pipes, filename wildcards, environment variable expansion, and expansion of ~
to a user's home directory
>>> p = subprocess.run(('echo', '$HOME'))
$HOME
>>> p = subprocess.run('echo $HOME', shell=True)
/home/learnbyexample
If shell=True cannot be used but shell features as quoted above are needed, you can use modules like os, glob,
shutil and so on as applicable. See also docs.python: Replacing Older Functions with the subprocess Module
(https://docs.python.org/3/library/subprocess.html#replacing-older-functions-with-the-subprocess-module).
Changing shell
By default, /bin/sh is the shell used for POSIX systems. You can change that by setting the executable argument to
the shell of your choice.
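Here's a rough sketch for a POSIX system. It assumes bash is installed and looks up its location with shutil.which(), otherwise the default shell is used. The echo $0 command is only an illustration, it displays the name of the shell running the command.

```python
import shutil
import subprocess

# assumption: bash is available, None falls back to the default shell
bash_path = shutil.which('bash')

# $0 expands to the name of the shell executing the command
p = subprocess.run('echo $0', shell=True, executable=bash_path,
                   capture_output=True, text=True)
print(p.stdout, end='')
```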
Capture output
If you use capture_output=True, the CompletedProcess object will provide stdout and stderr results as well.
These are provided as bytes data type by default. You can change that by setting text=True.
>>> p = subprocess.run(('date', '-u', '+%A'), capture_output=True, text=True)
>>> p
CompletedProcess(args=('date', '-u', '+%A'), returncode=0,
stdout='Monday\n', stderr='')
>>> p.stdout
'Monday\n'
You can also use the subprocess.check_output() function to directly get the output.
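A minimal sketch of check_output() is shown below. The command being run is the current Python interpreter itself (via sys.executable), which is an assumption made here so that the example doesn't depend on any particular external tool.

```python
import subprocess
import sys

# run the current interpreter with a one-line program
cmd = (sys.executable, '-c', 'print("hello")')

# text=True gives a string instead of bytes
op = subprocess.check_output(cmd, text=True)
print(repr(op))   # 'hello\n'
```

Unlike run(), a CalledProcessError exception is raised if the command exits with a non-zero status.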
sys.argv
Command line arguments passed when executing a Python program can be accessed as a list of strings via
sys.argv. The first element (index 0) contains the name of the Python script or -c or an empty string, depending upon
how the Python interpreter was called. The rest of the elements will have the command line arguments, if any were passed
along with the script to be executed. See docs.python: sys.argv (https://docs.python.org/3/library/sys.html#sys.argv) for more
details.
Here's a program that accepts two numbers passed as CLI arguments and displays the sum only if the input was passed
correctly.
# sum_two_nums.py
import ast
import sys
try:
    num1, num2 = sys.argv[1:]
    total = ast.literal_eval(num1) + ast.literal_eval(num2)
except ValueError:
    sys.exit('Error: Please provide exactly two numbers as arguments')
else:
    print(f'{num1} + {num2} = {total}')
As an exercise, modify the above program to handle TypeError exceptions. Instead of the output shown below, inform
the user about the error using sys.exit() method.
$ python3.9 sum_two_nums.py 2 [1]
Traceback (most recent call last):
File "/home/learnbyexample/Python/programs/sum_two_nums.py", line 6, in <module>
total = ast.literal_eval(num1) + ast.literal_eval(num2)
TypeError: unsupported operand type(s) for +: 'int' and 'list'
As another exercise, accept one or more numbers as input arguments. Calculate and display the following details about
the input — sum, product and average.
# inplace_edit.py
import fileinput
with fileinput.input(inplace=True) as f:
    for ip_line in f:
        op_line = ip_line.rstrip('\n').capitalize() + '.'
        print(op_line)
Note that unlike open(), the FileInput object doesn't support write() method. However, using print() is enough.
Here's a sample run:
argparse
sys.argv is good enough for simple use cases. If you wish to create a CLI application with various kinds of flags and
arguments (some of which may be optional/mandatory) and so on, use a module such as the built-in argparse or a
third-party solution like click (https://pypi.org/project/click/).
The argparse module makes it easy to write user-friendly command-line interfaces. The program defines what
arguments it requires, and argparse will figure out how to parse those out of sys.argv. The argparse module
also automatically generates help and usage messages and issues errors when users give the program invalid
arguments.
Here's a CLI application that accepts a file containing a list of filenames that are to be sorted by their extension. Files with
the same extension are further sorted in ascending order. The program also implements an optional flag to remove
duplicate entries.
# sort_ext.py
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-f', '--file', required=True,
                    help="input file to be sorted")
parser.add_argument('-u', '--unique', action='store_true',
                    help="sort uniquely")
args = parser.parse_args()
ip_lines = open(args.file).readlines()
if args.unique:
    ip_lines = set(ip_lines)
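The sorting step described earlier (by extension first, then by the whole line for ties) can be done with sorted() and a key function. Here's a minimal sketch, assuming a hypothetical helper named sort_by_ext() and a hardcoded sample list instead of a file:

```python
def sort_by_ext(lines):
    # rsplit with maxsplit=1 isolates the text after the last '.'
    # ties on the extension are broken by comparing the whole line
    return sorted(lines, key=lambda s: (s.rsplit('.', 1)[-1], s))

sample = ['input.log\n', 'basic.test\n', 'input.log\n', 'out.put.txt\n',
          'sync.py\n', 'input.log\n', 'async.txt\n']

# set() removes the duplicate entries before sorting
for line in sort_by_ext(set(sample)):
    print(line, end='')
```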
The documentation for the CLI application is generated automatically based on the information passed to the parser. You
can use the help option (which is added automatically too) to view the documentation, as shown below:
$ python3.9 sort_ext.py -h
usage: sort_ext.py [-h] -f FILE [-u]

optional arguments:
  -h, --help            show this help message and exit
  -f FILE, --file FILE  input file to be sorted
  -u, --unique          sort uniquely

$ python3.9 sort_ext.py
usage: sort_ext.py [-h] -f FILE [-u]
sort_ext.py: error: the following arguments are required: -f/--file
The add_argument() method allows you to add details about an option/argument for the CLI application. The first
parameter names an argument or option (starts with -). The help keyword argument lets you add documentation for that
particular option/argument. See docs.python: add_argument
(https://docs.python.org/3/library/argparse.html#argparse.ArgumentParser.add_argument) for documentation and details
about other keyword arguments.
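As an illustration of a few other keyword arguments, the sketch below uses type, default and choices. The option names (--count and --mode) are made up for this example and aren't part of the sorting program.

```python
import argparse

parser = argparse.ArgumentParser()
# type converts the string from the command line
# default is used when the option isn't passed
parser.add_argument('-n', '--count', type=int, default=1,
                    help="number of repetitions")
# choices restricts the accepted values
parser.add_argument('-m', '--mode', choices=('asc', 'desc'), default='asc',
                    help="sort order")

# passing a list to parse_args() avoids depending on sys.argv
args = parser.parse_args(['-n', '3', '--mode', 'desc'])
print(args.count, args.mode)   # 3 desc

defaults = parser.parse_args([])
print(defaults.count, defaults.mode)   # 1 asc
```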
The above program adds two options, one to store the filename to be sorted and the other to act as a flag for sorting
uniquely. Here's a sample text file that needs to be sorted based on the extension.
$ cat sample.txt
input.log
basic.test
input.log
out.put.txt
sync.py
input.log
async.txt
Here's the output with both types of sorting supported by the program.
# default sort
$ python3.9 sort_ext.py -f sample.txt
input.log
input.log
input.log
sync.py
basic.test
async.txt
out.put.txt
# unique sort
$ python3.9 sort_ext.py -uf sample.txt
input.log
sync.py
basic.test
async.txt
out.put.txt
args.file is now a positional argument instead of an option. nargs='?' indicates that this argument is optional.
type=argparse.FileType('r') allows you to automatically get a filehandle in read mode for the filename supplied
as an argument. If filename isn't provided, default=sys.stdin kicks in and you get a filehandle for the stdin data.
# sort_ext_stdin.py
import argparse, sys
parser = argparse.ArgumentParser()
parser.add_argument('file', nargs='?',
                    type=argparse.FileType('r'), default=sys.stdin,
                    help="input file to be sorted")
parser.add_argument('-u', '--unique', action='store_true',
                    help="sort uniquely")
args = parser.parse_args()
ip_lines = args.file.readlines()
if args.unique:
    ip_lines = set(ip_lines)
$ python3.9 sort_ext_stdin.py -h
usage: sort_ext_stdin.py [-h] [-u] [file]

positional arguments:
  file          input file to be sorted

optional arguments:
  -h, --help    show this help message and exit
  -u, --unique  sort uniquely
Here's a sample run showing both stdin and filename argument functionality.
# 'cat' is used here for illustration purposes only
$ cat sample.txt | python3.9 sort_ext_stdin.py
input.log
input.log
input.log
sync.py
basic.test
async.txt
out.put.txt
$ python3.9 sort_ext_stdin.py -u sample.txt
input.log
sync.py
basic.test
async.txt
out.put.txt
As an exercise, add an -o, --output optional argument to the above program to store the output in a file.