Python For Econometrics
Applications
Time series
Moving window
Financial applications
Optimization
© 2022 PyEcon.org
Learning Python for econometrics
Knowledge after completing this course:
You have acquired a basic understanding of programming in general, of
programming with Python, and specific knowledge of working with standard
numerical packages.
Learning Python for econometrics
What you should not expect from this course:
A guide on how to install or maintain an application.
An introduction to programming for beginners.
An introduction to professional development tools.
Non-scientific, general purpose programming (beyond the language
essentials).
Little content and little effort...
Course organisation
This course can be seen as an applied lecture:
Lecture:
We try to explain the partly theoretical knowledge of Python through
simple, easy-to-understand examples. You can learn the programming
language’s subtleties by reading literature.
Exercises:
Digital worksheets in the form of Jupyter notebooks with applied
tasks are available for each chapter. Sample solutions for all exercises
are available in separate notebooks.
Self-tests:
At the end of each of the five chapters there are typical exam questions.
Literature
The programming language Python is well established and very much in
trend for numerical applications. Some keywords:
Data science,
Data wrangling,
Software: Python 3
We are using Python 3. The migration from Python 2 to version 3 was a
major revision, and the new version is not backwards compatible with
the old one.
Python 3 running [command line]
python --version
## Python 3.9.10
In the normal execution mode, the Python interpreter processes the
instructions in the background – in other numeric programming
languages such as R this is known as batch mode. It executes program
code that is usually located in a source code file.
The interpreter can also be started in an interactive mode. It is used
for testing and analytic purposes in order to obtain fast results when
performing simple applications.
Software: IDEs
For everyday work with Python it would be extremely tedious to make
all edits in interactive mode.
There are a number of excellent integrated development environments
(IDEs) for Python, with three being emphasized here:
Of course, you can also use a simple text editor. However, you would
probably miss the comfort of an IDE.
Installing, extending and maintaining Python is not trivial at the beginning.
Therefore, as a beginner, you are well advised to download and install
the Python distribution Anaconda. Bonus: Many standard packages
are included directly or can be conveniently installed afterwards.
Following this course
In this course – in a numeric and analytic context – we use only Jupyter
with the IPython kernel.
That is why we have
1 combined all the code from the slides, and
2 set up a cloud-based Jupyter-Hub for you.
You can access the working environment with your university credentials
at
https://jupyter-cloud.gwdg.de/
create a profile and get started right away – even using your smart
devices. However, so far you still have to upload the course
notebooks yourself or rewrite the code from scratch.
Notebook workflow
A Jupyter notebook is divided into individual, vertically arranged cells,
which can be executed separately:
The notebook approach is not novel and comes from the field of
computer algebra software.
Notebook workflow
Actually, an interactive Python interpreter called IPython is started “in
the core”.
IPython running [command line]
ipython --version
Following this course
Finally, we wish you a lot of fun and success with and in this course!
Practice makes perfect!
Contribution and credits:
Fabian H. C. Raters
Eike Manßen
GWDG for the Jupyter-Hub
Table of contents
1 Essential concepts
1.1 Getting started
1.2 Procedural programming
1.3 Object-orientation
4 Visual illustrations
4.1 Matplotlib package
4.2 Figures and subplots
4.3 Plot types and styles
Chapter 1
Essential concepts
1.1 Getting started
1.2 Procedural programming
1.3 Object-orientation
Section 1.1
Essential concepts
Getting started
Motivation for learning Python
Python can be described as
a dynamic, strongly typed, multi-paradigm and object-oriented
programming language,
for versatile, powerful, elegant and clear programming,
Moreover, Python is relatively easy to learn, and its well-designed
language supports everyone from novices to professional developers. Much
of Python’s success is due to a high degree of standardization and a huge
community that develops and collectively recognizes conventions and
paradigms.
A short history of time
... of the Python era:
The language was first released in 1991 by Guido van Rossum.
Its name is a reference to Monty Python’s Flying Circus. Its main
identifying feature is the novel markup of code blocks – by indentation:
[Figure: timeline from 1990 to 2025 with the release labels Python 2.7,
Python 3.6 and Python 3.9]
In comparison
Comparing the way Python works with common programming languages,
we briefly discuss a selection of popular competitors:
C/C++:
CPython is interpreted, not compiled.
MATLAB: you write matrix-based programs.
In the numerical context, the matrix view and syntax are very
similar to those of MATLAB.
Reference semantics
An extremely important difference separates the first two languages,
C/C++ and Java, together with Python itself, from the last three
languages: the former follow reference semantics (in Python, a function
receives a reference to the caller's object, not a copy), while MATLAB,
R and Stata are call-by-copy.
Further specific differences and similarities to MATLAB and R will be
addressed in other parts of this course.
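The reference semantics described above can be made visible with a small experiment. This is a minimal sketch, not taken from the slides; the function names `append_marker` and `rebind` are hypothetical and chosen only for illustration.

```python
def append_marker(values):
    """Mutate the list that the caller passed in (no copy is made)."""
    values.append("changed")


def rebind(values):
    """Rebinding the local name does not affect the caller's object."""
    values = ["new list"]
    return values


data = ["original"]
append_marker(data)      # the caller's list is modified in place
rebound = rebind(data)   # the caller's list stays as it is

print(data)      # ['original', 'changed']
print(rebound)   # ['new list']

# An explicit copy protects the original from in-place changes
safe = list(data)
append_marker(safe)
print(data)      # ['original', 'changed']
```

In a call-by-copy language the first call would leave `data` untouched; in Python an explicit copy such as `list(data)` is needed to get that behaviour.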
Versatility – diversity
Python has become extremely popular:
Source: https://stackoverflow.blog/2017/09/06/incredible-growth-python/
Versatility – diversity
So, you’re on the right track – because who wants to bet on the wrong
hoRse?
Source: https://stackoverflow.blog/2017/09/06/incredible-growth-python/
Versatility – diversity
Areas in which Python is used with great success:
Scripts,
Console applications,
GUI applications,
Game development,
Website development, and
Numerical programming.
Places where Python is used:
Yet another outline
In this course we will successively gain the following insights:
1 General basics of the language.
2 Numerical programming and handling of data sets.
3 Application to economic and analytical questions.
Section 1.2
Essential concepts
Procedural programming
The first program
Programs can be implemented very quickly – this is a pretty minimal
example. You can write this command to a text file of your choice and
run it directly on your system:
Hello there
print("Hello there!")
## Hello there!
Only one function print() (shown here as a keyword),
Function displays argument (a string) on screen,
Arguments are passed to the function in parentheses,
User input
Let’s add a user input to the program:
Hello you
name = input("Please enter your name: ")
## Please enter your name: Angela Merkel
The function input() is used for interactive text input,
You can use the equal sign = to assign variables (here: name),
Strings can be joined by the (overloaded) operator +.
Determining weekdays
We are now trying to find out on which weekday a person was born
(Merkel’s birthday is 17-07-1954):
Weekday of birth
from datetime import datetime
It is really easy to import functionality from other modules,
Function strptime() is a method of class datetime,
Both methods, strptime() and strftime(), are used to convert
between strings and date time specifications.
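Only the import line of the slide's example is preserved here, so the following is a sketch of how the two methods could be combined, not the original code. The variable name `birthday` is taken from the next slide; the format strings are an assumption that matches the day-month-year date given above.

```python
from datetime import datetime

# Parse the date string; the format codes mirror the day-month-year layout
birthday = datetime.strptime("17-07-1954", "%d-%m-%Y")

# strftime() converts back to text; "%A" is the full weekday name
# (the spelling depends on the active locale)
print(birthday.strftime("%A"))

# weekday() returns an integer instead: Monday is 0, Sunday is 6
print(birthday.weekday())  # 5, i.e. a Saturday
```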
Time since birth
And how many days have passed since then (until Merkel’s 4th swearing-in
as Federal Chancellor)?
Age in days
someday = datetime.strptime("14-03-2018", "%d-%m-%Y")
print("You are " + str((someday - birthday).days) + " days old!")
## You are 23251 days old!
You can create time differences, i.e., the operator - is overloaded,
The difference represents a new object, with its own attributes,
such as days,
When using the overloaded operator +, you have to explicitly
convert the number of days by means of str() into a string.
Time since birth
How many years, months and days do you think that is?
Human readable age
from dateutil.relativedelta import relativedelta
delta = relativedelta(someday, birthday)
print(f"That's {delta.years} years, {delta.months} months "
      f"and {delta.days} days!!")
## That's 63 years, 7 months and 25 days!!
You don’t have to keep reinventing the wheel – a wealth of packages
and individual modules are freely available,
A lowercase f before "..." provides convenient formatting – there
are other options as well,
Two strings in sequence are implicitly joined together – "That"
"’s nice"!
Getting help
When working with the interactive interpreter, i.e., in a notebook, you
can quickly get useful information about Python objects:
Help system
help(len)
Alternatively, e.g., for more complex problems, it is best to search
directly with your preferred internet search engine.
You can find neat solutions to conventional challenges in literature.
Lexical structure
As with natural language, programming languages have a lexical
structure. Source code consists of the smallest possible, indivisible
elements, the tokens. In Python you can find the following groups of
elements:
Literals
Literals and variables
Basically, we distinguish between literals and variables:
Assigning variables with literals
myint = 7
myfloat = 4.0
myboat = "nice"
mybool = True
myfloat = myboat
In this course, we will work with four kinds of literals: integer (7),
float (4.0), string ("nice") and boolean (True),
Literals are assigned to variables at runtime,
In Python the data type is derived from the literal and does not
have to be declared explicitly,
It is allowed to assign values of different data types to the same
variable (name) sequentially,
If we don’t assign a literal to any variable, it is lost.
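That the data type follows the assigned value rather than the variable name can be checked with the built-in functions type() and isinstance(). A minimal sketch; the variable name `value` is chosen only for illustration:

```python
value = 7
print(type(value))    # <class 'int'>

value = 4.0           # the same name now refers to a float
print(type(value))    # <class 'float'>

value = "nice"        # ... and now to a string
print(type(value))    # <class 'str'>

# isinstance() checks the type of the object a name currently refers to
print(isinstance(value, str))  # True
```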
Operators and delimiters
Most operators and delimiters will be introduced to you during this
course. Here is an overview of the operators:
Overview of operators
## + - * / ** //
## % @ << >> & |
## ^ ~ == != < >
## <= >= and or not in
## not in is is not
An overview of the delimiters follows:
Overview of delimiters
## ( ) [ ] { }
## , : . = ; ->
## += -= *= /= **= //=
## %= @= <<= >>= &= |=
## ^= ' " \ @ SPACE
Arithmetic operators
All regular arithmetic operations involving numbers are possible:
Pocket calculator
10 + 5
100 - 20
8 / 2
4 * (10 + 20)
2**3
## 15
## 80
## 4.0
## 120
## 8
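The result 4.0 for 8 / 2 above shows that the operator / always returns a float. The following sketch, with numbers chosen only for illustration, contrasts it with floor division and the remainder operator from the operator overview:

```python
print(7 / 2)     # 3.5  true division always returns a float
print(7 // 2)    # 3    floor division
print(7 % 2)     # 1    remainder (modulo)
print(-7 // 2)   # -4   floor division rounds towards minus infinity
print(2 ** 0.5)  # square root of 2 via exponentiation

# divmod() returns quotient and remainder in one step
quotient, remainder = divmod(7, 2)
print(quotient, remainder)  # 3 1
```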
Boolean operators
In order to demonstrate the use of logical operators (and formatted
strings and for-loops), we create a handy table summarizing some
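The table announced on this slide is not preserved in this text. As a stand-in, a truth table for and, or and not might be built as follows; the layout and the column widths are assumptions, not the original slide code.

```python
rows = []
for a in (True, False):
    for b in (True, False):
        rows.append((a, b, a and b, a or b, not a))

# Header line, then one formatted line per combination of a and b.
# The conversion !s turns each boolean into text before it is padded.
print(f"{'a':<6}{'b':<6}{'a and b':<9}{'a or b':<8}{'not a':<6}")
for a, b, both, either, negated in rows:
    print(f"{a!s:<6}{b!s:<6}{both!s:<9}{either!s:<8}{negated!s:<6}")
```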
Keywords and comments
The programmer explains the structure of their program to the
interpreter via a restricted set of short commands, the keywords:
Overview of keywords
## and as assert break class continue
## def del elif else except False
## finally for from global if import
## in is lambda None nonlocal not
## or pass raise return True try
## while with yield
There are two ways to make comments:
Provide some comments
# Set variable to something - or nothing?
something = None

"""
I am a docstring!
A multiline string comment hybrid.
I will be useful for describing classes and methods.
"""
Data types
Python offers the following basic data types, which we will use in this
course:
Data type   Description
int()       Integers
float()     Floating point numbers
str()       Strings, i.e., unicode (UTF-8) texts
bool()      Boolean, i.e., True or False
list()      List, an ordered array of objects
tuple()     Tuple, an ordered, immutable array of objects
dict()      Dictionary, an associative array of objects (keeps insertion
            order since Python 3.7)
set()       Set, an unordered collection of unique objects
None        Nothing, emptiness, the void...
Each data type has its own methods, that is, functions that are
applicable specifically to an object of this type.
You will gradually get to know new and more complex data types or
object classes.
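The names in the table are also callable: they create or convert objects of the respective type. A short sketch with arbitrary example values, not taken from the slides:

```python
print(int("7") + 1)           # 8
print(float("4.0"))           # 4.0
print(str(7) + "!")           # 7!
print(bool(0), bool("nice"))  # False True
print(list("abc"))            # ['a', 'b', 'c']
print(tuple([1, 2]))          # (1, 2)
print(dict(a=1, b=2))         # {'a': 1, 'b': 2}
print(set([1, 1, 2]))         # {1, 2}

# Methods belong to the type of the object they are called on
print("nice".upper())         # NICE

# None is a single object, not a callable constructor
print(type(None))             # <class 'NoneType'>
```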
Lists
A list is an ordered array of objects, accessible via an index:
Listing tech companies
stocks = ["Google", "Amazon", "Facebook", "Apple"]
stocks[1]
stocks.append("Twitter")
stocks.insert(2, "Microsoft")
stocks.sort()
## ['Google', 'Amazon', 'Facebook', 'Apple']
## Amazon
## ['Google', 'Amazon', 'Facebook', 'Apple', 'Twitter']
## ['Google', 'Amazon', 'Microsoft', 'Facebook', 'Apple', 'Twitter']
## ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft', 'Twitter']
The constructor for new lists is [ ],
The first element has the index 0,
The data type list() possesses its own methods.
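List methods such as sort() change the list in place and return None, which is a common source of confusion. The sketch below contrasts this with the built-in function sorted(); it reuses the list from the slide, while the names `ordered`, `result` and `last` are only illustrative.

```python
stocks = ["Google", "Amazon", "Facebook", "Apple"]

ordered = sorted(stocks)   # returns a new, sorted list
print(ordered)             # ['Amazon', 'Apple', 'Facebook', 'Google']
print(stocks)              # ['Google', 'Amazon', 'Facebook', 'Apple']

result = stocks.sort()     # sorts in place ...
print(result)              # None
print(stocks)              # ['Amazon', 'Apple', 'Facebook', 'Google']

last = stocks.pop()        # removes and returns the last element
print(last, len(stocks))   # Google 3
```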
Tuples
Tuples are immutable sequences related to lists that cannot be extended,
for example. The drawbacks in flexibility are compensated by the
advantages in speed and memory usage:
Selecting elements in sequences
lottery = (1, 8, 9, 12, 24, 28)
len(lottery)
lottery[1:3]
lottery[:4]
lottery[-1]
lottery[-2:]
## (1, 8, 9, 12, 24, 28)
## 6
## (8, 9)
## (1, 8, 9, 12)
## 28
## (24, 28)
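What immutability means in practice can be tried out directly. A minimal sketch using the tuple from the slide; the names `extended`, `first`, `second` and `rest` are illustrative.

```python
lottery = (1, 8, 9, 12, 24, 28)

try:
    lottery[0] = 2          # item assignment is not supported
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# A "changed" tuple is always a new object built from the old one
extended = lottery + (33,)
print(extended)             # (1, 8, 9, 12, 24, 28, 33)
print(lottery)              # (1, 8, 9, 12, 24, 28)

# Tuple unpacking assigns the elements to individual names
first, second, *rest = lottery
print(first, second, rest)  # 1 8 [9, 12, 24, 28]
```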
Dictionaries
Dictionaries are associative collections of key-value pairs. The key must
be immutable and unique:
Internet slang dictionary
slang = {"imho": "in my humble opinion",
         "lol": "laughing out loud",
         "tl;dr": "too long; didn't read"}
slang["lol"]
slang["gl&hf"] = "good luck & have fun"
slang.keys()
slang.values()
## {'imho': 'in...ion', 'lol': 'la...oud', 'tl;dr': 'to...ead'}
## laughing out loud
## good luck & have fun
## dict_keys(['imho', 'lol', 'tl;dr', 'gl&hf'])
## dict_values([... & have fun'])
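The requirement that keys be immutable and unique can be illustrated with a small sketch. The dictionary `grid` and its contents are made up for this purpose and do not appear in the slides.

```python
# Tuples are immutable, so they work as keys
grid = {(0, 0): "origin", (1, 0): "east"}

try:
    grid[[0, 1]] = "north"          # a list is mutable and cannot be a key
except TypeError as error:
    print("TypeError:", error)

# Assigning to an existing key overwrites the value, keys stay unique
grid[(0, 0)] = "start"
print(len(grid))                    # 2

# get() returns a default instead of raising KeyError for missing keys
print(grid.get((5, 5), "unknown"))  # unknown

# items() yields the key-value pairs
for key, value in grid.items():
    print(key, "->", value)
```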
Set operations
x = {"o", "n", "y", "t"}
y = {"p", "h", "o", "n"}
x & y
x | y
x - y
## {'y', 'n', 't', 'o'}
## {'p', 'n', 'h', 'o'}
## {'n', 'o'}
## {'p', 'y', 'o', 't', 'n', 'h'}
## {'y', 't'}
Comparison operators
The <, <=, >, >=, ==, != operators compare the values of two objects
and return True or False.
Op.   True, only if the value of the left operand is
<     less than the value of the right operand
<=    less than or equal to the value of the right operand
>     greater than the value of the right operand
>=    greater than or equal to the value of the right operand
==    equal to the right operand
!=    not equal to the right operand
The comparison depends on the data type of the objects. For example,
"7" == 7 will return False, while 7.0 == 7 will return True.
Comparison operators
Comparing examples
x, y = 5, 8
print("x < y is", x < y)
## x < y is True
## x != y is True
Chaining comparison examples
x = 5
5 >= x > 4
## True
12 < x < 20
## False
2 < x < 10
## True
2 < x and x < 10  # unchained expression
## True
Logical operators
(x == 5) or (y == 8)
## True
not(x == 4) or (y == 9)
## True
Exclusive or
In some situations, you need a logical operation that is True only when
the operands differ (one is True, the other is False). This task can
be solved by using the logical operators not, and, or or simply !=.
Exclusive or
x, y = 5, 8
((x == 5) and not (y == 8)) or (not (x == 5) and (y == 8))
## False
x = 4
((x == 5) and not (y == 8)) or (not (x == 5) and (y == 8))
## True
(x == 5) != (y == 8)
## True
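The != variant can be wrapped in a small helper and checked against all four combinations of truth values. This is a sketch; the function name `xor` is hypothetical and not part of the slides.

```python
def xor(a, b):
    """Return True only if exactly one of the two truth values is True."""
    return bool(a) != bool(b)


for a in (True, False):
    for b in (True, False):
        # For bool operands the bitwise operator ^ gives the same result
        print(a, b, xor(a, b), a ^ b)
```

Converting with bool() first makes the helper usable for arbitrary objects, whereas ^ is only meaningful here for genuine boolean operands.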
Binary numbers
How to convert binary numbers to integers (the unknown keywords and
language structures will be introduced soon):
Binary to integer
def bintoint(binary):
    binary = binary[::-1]
    num = 0
    for i in range(len(binary)):
        num += int(binary[i]) * 2**i
    return num

bintoint("1101001")
## 105
Binary numbers
How to convert integers to binary numbers:
Integers to binary
def inttobin(num):
    binary = ""
    if num != 0:
        while num >= 1:
            if num % 2 == 0:
                binary += "0"
                num = num // 2  # floor division keeps num an integer
            else:
                binary += "1"
                num = (num - 1) // 2
    else:
        binary = "0"
    return binary[::-1]

inttobin(105)
## '1101001'
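The two hand-written conversions are useful for learning loops and slicing. For everyday work, Python's built-in functions cover the same ground, as this short sketch with the number from the slides shows:

```python
print(int("1101001", 2))     # 105  second argument is the base
print(bin(105))              # 0b1101001  string with the prefix "0b"
print(format(105, "b"))      # 1101001    without the prefix
print(f"{105:010b}")         # 0001101001 zero-padded to ten digits
print(0b1101001)             # 105  binary integer literal
```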
Bitwise operators
Python offers distinct bitwise operators. Some of them are redefined
with an entirely different meaning by extensions, e.g., for vectorization.
Bit. op.   Description
x >> y     Returns x with the bits shifted to the right by y places
x << y     Returns x with the bits shifted to the left by y places
x & y      Does a bitwise and
x | y      Does a bitwise or
~x         Returns the complement of x
x ^ y      Does a bitwise exclusive or
Bitwise operators
a, b = 5, 7
c = a & b  # bitwise and
## a: 101
## b: 111
## c: 101
print(c)
## 5
Bitwise operators
Bitwise operators
a, b = 5, 7
c = a | b  # bitwise or
## a: 101
## b: 111
## c: 111
print(c)
## 7
a = 13
b = a << 2  # bitwise shift
## a: 1101
## b: 110100
a, b = 35, 37
c = a ^ b  # bitwise exclusive or
## a: 100011
## b: 100101
## c: 000110
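The direction of the two shift operators is easy to mix up, so a quick numerical check may help. A sketch using the value 13 from the example above:

```python
a = 13                       # binary 1101
print(a << 2)                # 52, binary 110100: shifted left, i.e. times 2**2
print(a >> 2)                # 3,  binary 11: shifted right, i.e. floor division by 2**2
print(~a)                    # -14, the complement equals -(a + 1)
print(35 ^ 37)               # 6,  binary 000110
print(format(a << 2, "b"))   # 110100
```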
Control flow: Conditional statements
Computer data sizes
# num_bytes avoids shadowing the built-in type bytes
num_bytes = 100000000 / 8  # e.g. DSL 100000
if num_bytes >= 1e9:
    print(f"{num_bytes/1e9:6.2f} GByte")
elif num_bytes >= 1e6:
    print(f"{num_bytes/1e6:6.2f} MByte")
elif num_bytes >= 1e3:
    print(f"{num_bytes/1e3:6.2f} KByte")
else:
    print(f"{num_bytes:6.2f} Byte")
## 12.50 MByte
if b > 2:
    pass  # a special keyword for empty blocks
Control flow: The for loop

Total sum
numbers = [7, 3, 4, 5, 6, 15]
y = 0
for i in numbers:
    y += i
print(f"The sum of 'numbers' is {y}.")

## The sum of 'numbers' is 40.
Control flow: continue and break

Loops can skip iterations (continue):

Continue the loop
for x in ["a", "b", "c"]:
    a = x.upper()
    continue
    print(x)
print(a)

## C
Have you already noticed the keyword else? Python executes the else branch only if the loop was not terminated by break:
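The behaviour of else after a loop can be sketched as follows. The function name and the sample lists are hypothetical and serve only as an illustration.

```python
def find_first_negative(values):
    """Return a message describing the first negative entry, if any."""
    for v in values:
        if v < 0:
            result = f"found {v}"
            break              # leaving via break skips the else branch
    else:
        # Executed only if the loop ran to completion without break.
        result = "no negative value"
    return result

print(find_first_negative([3, 1, -4, 1]))   # found -4
print(find_first_negative([3, 1, 4]))       # no negative value
```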
Functions

Functions are defined using the keyword def. The function signature and body are likewise structured by indentation:

Drawing lottery numbers
import random

def draw_sample(n, first=1, last=49):
    numbers = list(range(first, last + 1))
    sample = []
    for i in range(n):
        ind = random.randint(0, len(numbers) - 1)
        sample.append(numbers.pop(ind))
    sample.sort()
    return sample

draw_sample(6)
draw_sample(6, 80, 100)
draw_sample(3, first=5)

## [2, 3, 4, 16, 23, 28]
## [82, 84, 94, 95, 99, 100]
## [5, 12, 16]
Functions

## [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

Seems weird? We discuss namespaces in the next section.
Section 1.3
Essential concepts
Python is object-oriented

There are three widely known programming paradigms: procedural, functional and object-oriented programming (OOP). Python supports them all.

You have learned how to handle predefined data types in Python. In fact, we have already encountered classes and instances; take, for example, dict().

In this section you will learn the basics of dealing with (your own) classes:
References

When you assign a variable, a reference to an object is set:

Equal but not identical
a = ["Star", "Trek"]
b = ["Star", "Trek"]
c = a
a == b
a == c
a is b
a is c

## ['Star', 'Trek']
## ['Star', 'Trek']
## ['Star', 'Trek']
## True
## True
## False
## True
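A practical consequence of shared references can be sketched as follows: a change made through one name is visible through every other name bound to the same object. The appended string is an arbitrary illustration.

```python
a = ["Star", "Trek"]
b = ["Star", "Trek"]   # equal content, but a separate object
c = a                  # second reference to the same object as a

c.append("Voyager")    # modifies the one object that a and c share

print(a)               # ['Star', 'Trek', 'Voyager']
print(b)               # ['Star', 'Trek']
print(id(a) == id(c), id(a) == id(b))   # True False
```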
Side effects
def last_element(x):
    return x.pop(-1)
Copying objects

We are able to make an exact copy of the object:

Copying
def last_element(x):
    y = x.copy()
    return y.pop(-1)

a = stocks
last_element(a)
a

## ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft']
## Microsoft
## ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft']

We receive a new object,
The new object is not identical to the old one.
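The difference between the two versions of the function can be placed side by side. This is a minimal sketch; the list contents are taken from the output above, while the two function names are hypothetical.

```python
# Sample data as shown in the slide output.
stocks = ["Amazon", "Apple", "Facebook", "Google", "Microsoft"]

def last_element_inplace(x):
    return x.pop(-1)        # side effect: shortens the caller's list

def last_element_copy(x):
    y = x.copy()            # work on a (shallow) copy instead
    return y.pop(-1)

print(last_element_copy(stocks), len(stocks))      # Microsoft 5
print(last_element_inplace(stocks), len(stocks))   # Microsoft 4
```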
Deep and shallow copying

## [['risotto', 'pasta']]
## [['burgers', 'hot dogs']]

Both approaches, copy() and list(), create new list objects containing new references to the original sub-lists. For a deep copy, however, you have to recursively create duplicates of all nested objects.
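The standard library already provides such a recursive duplication in copy.deepcopy(). The sketch below contrasts it with a shallow copy; the nested list is a hypothetical example in the spirit of the output above.

```python
import copy

menu = [["risotto", "pasta"]]       # hypothetical nested list

shallow = menu.copy()               # new outer list, same inner list
deep = copy.deepcopy(menu)          # inner list is duplicated as well

menu[0].append("pizza")             # mutate the shared inner list

print(shallow)   # [['risotto', 'pasta', 'pizza']]
print(deep)      # [['risotto', 'pasta']]
print(shallow[0] is menu[0], deep[0] is menu[0])   # True False
```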
Classes

In Python everything is an object, and more complex objects consist of several other objects.

In OOP, we create objects according to patterns. These blueprints are called classes and are characterized by two categories of elements:

Attributes:
Variables that represent the properties of
    an object, named object attributes, or
    a class, named class attributes.

Methods:
Functions that are defined within a class:
    (non-static) methods can access all attributes, while
    static methods can only access class attributes.
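The two kinds of attributes and methods can be sketched in a few lines. The class below is a hypothetical illustration, not the class developed on the following slides; note that a static method receives no self, so it reaches the class attribute via the class name.

```python
class Rectangle:
    sides = 4                         # class attribute, shared by all instances

    def __init__(self, width, height):
        self.width = width            # object attributes, one set per instance
        self.height = height

    def describe(self):
        # Non-static method: object and class attributes are reachable via self.
        return f"{self.width}x{self.height}, {self.sides} sides"

    @staticmethod
    def number_of_sides():
        # Static method: no self, only the class attribute is available.
        return Rectangle.sides

r = Rectangle(10, 20)
print(r.describe())                   # 10x20, 4 sides
print(Rectangle.number_of_sides())    # 4
```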
Class definition

Specifically, we want to create a “rectangle object” and define a separate Rectangle class for it:

Rectangle class
class Rectangle:
    width = 0
    height = 0

    def area(self):
        return self.width * self.height

myrectangle = Rectangle()
myrectangle.width = 10
myrectangle.height = 20
myrectangle.area()

## 200

New classes are defined using the keyword class,
The variable self always refers to the instance itself.
Class constructor

an object of Rectangle:

Rectangle class with constructor
class Rectangle:
    width = 0
    height = 0

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

myrectangle = Rectangle(15, 30)
myrectangle.area()

## 450
Garbage collection

You do not have to worry about memory management in Python. The garbage collector will tidy up for you.

If there are no more references to an object, it is automatically disposed of by the garbage collector:
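One way to observe this is a weak reference from the standard weakref module, which does not keep its target alive. The sketch assumes CPython, where an object is freed as soon as its last reference disappears; other interpreters may free it later. The class name is hypothetical.

```python
import weakref

class Node:
    """Hypothetical placeholder class; instances can be weakly referenced."""

obj = Node()
probe = weakref.ref(obj)    # weak reference: does not keep obj alive

print(probe() is obj)       # True while a normal reference exists
del obj                     # remove the last normal reference
print(probe() is None)      # True in CPython: the object is gone
```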
error. In Python you often find the important information at the end of the error message.

Errors are often oversights: In most cases the error message will give you the line in your code where the error occurred.

Search the web: If you are not able to fix the errors on your own, copy the error message into a search engine and read through the results. Probably someone else has already had this problem and the community has found a solution.
Exceptions versus syntax errors

A Python program terminates as soon as it encounters an unhandled error. In Python, errors can be either syntax errors or exceptions. Syntax errors occur when the parser detects a wrong sequence in the Python code. An arrow indicates the exact position of the syntax error:

Syntax Error
## print("Hello World"))
## File "<stdin>", line 1
##     print("Hello World"))
##                         ^
## SyntaxError: invalid syntax

An exception occurs whenever syntactically correct Python code results in an error at runtime:

Exception
a = 0 / 0
## <stdin> in <module>()
## ----> 1 a = 0 / 0
## ZeroDivisionError: division by zero
Exceptions

Exceptions come in different types, and the type is printed as part of the error message. The next example shows three common built-in exceptions:

Frequent exception
0 / 0

## <stdin> in <module>()
## ----> 1 0 / 0
## ZeroDivisionError: division by zero

3 + a

## <stdin> in <module>()
## ----> 1 3 + a
## NameError: name 'a' is not defined

3 + "2"

## <stdin> in <module>()
## ----> 1 3 + "2"
## TypeError: unsupported operand type(s) for +: 'int' and 'str'

A list of all built-in exception classes can be found in the Python documentation.
Exception handling

When an exception occurs, the Python interpreter raises an error and exits. But in most situations, you do not want your whole program to stop.

The try block can test a block of code for errors.
The except block lets you handle the error.

Try and except
try:
    print(abc)
except:
    print("An exception occurred")

## An exception occurred

The try block above raises an error, because the variable abc is not defined.
Exception handling

You can define multiple except blocks, for example, if you want to execute special code when you expect a particular kind of error to occur:

Multiple exception blocks
try:
    print(abc)
except NameError:
    print("Variable abc is not defined")
except:
    print("Something else went wrong")
Exception handling

The finally block will be executed regardless of whether the try block raises an error or not. Hence, you can make sure that its code is always run:

Finally exception
try:
    print(abc)
except:
    print("Something went wrong")
finally:
    print("This will always be displayed")

## Hello World
## This will always be displayed
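How the blocks interact can be sketched with a small function that records which blocks were executed. The function name and the log list are hypothetical; the optional else block of a try statement runs only if no exception was raised.

```python
def safe_divide(a, b):
    log = []
    try:
        result = a / b
    except ZeroDivisionError:
        log.append("except")      # runs only if the division fails
        result = None
    else:
        log.append("else")        # runs only if no exception was raised
    finally:
        log.append("finally")     # runs in both cases
    return result, log

print(safe_divide(6, 3))   # (2.0, ['else', 'finally'])
print(safe_divide(6, 0))   # (None, ['except', 'finally'])
```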
Raise exception

Built-in exceptions are raised whenever pre-defined interpreter errors occur. In some situations you might want to raise exceptions on your own:

The raise keyword is used to raise an exception.
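A minimal sketch of raise in use follows. The helper function and its error message are hypothetical; ValueError is one of Python's built-in exception classes.

```python
def growth_rate(old, new):
    """Relative change from old to new (hypothetical helper)."""
    if old == 0:
        # Raise a built-in exception with an explanatory message.
        raise ValueError("old value must be non-zero")
    return (new - old) / old

try:
    growth_rate(0, 5)
except ValueError as err:
    print(f"Caught: {err}")     # Caught: old value must be non-zero
```

Raising early with a descriptive message usually makes the cause easier to find than a ZeroDivisionError from deeper inside the calculation.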
EAFP versus LBYL

LBYL: Look before you leap.
EAFP: It is easier to ask forgiveness than it is to get permission.

LBYL and EAFP are two techniques to deal with (i.e., avoid) exceptions. In short, in LBYL you first check whether something will succeed and only proceed if it does. EAFP means that you simply do what you intend and, if an exception occurs, you deal with it:

LBYL
if x != 0:
    print(10 / x)

EAFP
try:
    print(10 / x)
except ZeroDivisionError:
    pass
EAFP versus LBYL

So, why use EAFP although it needs more lines of code?

Often, the code is more readable and straightforward.
Explicit is better than implicit (Zen of Python, see below).
Best performance in case no exception is raised.
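Both styles can be compared on a dictionary lookup. This is a sketch only; the dictionary and the function names are hypothetical.

```python
prices = {"AAPL": 150.0}          # hypothetical data

def price_lbyl(ticker):
    if ticker in prices:          # check first ...
        return prices[ticker]     # ... then act
    return None

def price_eafp(ticker):
    try:
        return prices[ticker]     # just act ...
    except KeyError:
        return None               # ... and handle the failure

print(price_lbyl("AAPL"), price_eafp("AAPL"))   # 150.0 150.0
print(price_lbyl("MSFT"), price_eafp("MSFT"))   # None None
```

In the EAFP version the key is looked up only once when it exists, which relates to the performance point above.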
Built-in versus user-defined exceptions

Python has multiple built-in exceptions which terminate your program when something goes wrong. But you can also create custom exceptions that serve specific purposes.

Your own exception can be implemented by defining a new class which derives from the Exception class or one of its subclasses:

User-defined exception
class ValueTooLargeError(Exception):
    """Raised when the input value is too large"""
    pass

x = 3
try:
    if x > 2:
        raise ValueTooLargeError
except ValueTooLargeError:
    print("The number is too large.")

## The number is too large.
Namespaces

We have already come into contact with namespaces in Python many times. These are hierarchically linked layers in which the references to objects are defined. A rough distinction is made between

the global namespace, and

On the other hand, locally defined references are only known in a local, i.e., internal environment.
Namespaces

Reference names from the local namespace mask the same names in an outer or in the global namespace:

Namespaces
def multiplier(x):
    x = 4 * x
    return x

x = "OH"
multiplier("AH")
multiplier(x)
x

## OH
## AHAHAHAH
## OHOHOHOH
## OH
Namespaces

In fact, functions defined in Python are themselves objects that remember and can access the context in which they were created. This concept comes from functional programming and is called closure:

Closures
def gen_multiplier(a):
    def fun(x):
        return a * x
    return fun

multi1 = gen_multiplier(4)
multi2 = gen_multiplier(5)

multi1
multi1("EH")
multi2("EH")

## <function gen_multiplier.<locals>.fun at 0x127fc4ee0>
## EHEHEHEH
## EHEHEHEHEH
Managing code

In order to provide, maintain and extend modular functionality with Python, its code-containing components can be described hierarchically:

Packages
date.today()
timedelta.days
datetime.now()

In the latter case, all classes and functions, but no instances, are imported from the datetime namespace.
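Common import forms can be compared on the standard datetime module. Which forms the original slide showed is not recoverable from the fragment above, so the following is only an illustrative sketch with arbitrary dates.

```python
import datetime                       # qualified access: datetime.date
import datetime as dt                 # same module under an alias
from datetime import date, timedelta  # selected names, used unqualified

d1 = datetime.date(2022, 1, 1)
d2 = dt.date(2022, 1, 31)
span = d2 - d1                        # subtracting dates yields a timedelta

print(span.days)                                     # 30
print(date(2022, 1, 1) + timedelta(days=30) == d2)   # True
```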
Built-in modules

A Python installation ships with a standard library consisting of built-in modules. These modules provide standardized solutions for many problems that occur in everyday programming - “batteries included”. For example, they provide access to system functionality such as file management. The Python Docs give an overview of all built-in modules.

Usage of built-in modules
import math
from random import randint

## 18
Installing modules

Often you might want to use extended functionality. Python has a large and active community of users who make their developments publicly available under open source license terms. Packages are containers of modules which can be imported and used within your Python code.

These third-party packages can be installed conveniently with the (command line) package manager pip. The Python Package Index provides an overview of the thousands of packages available. Basic commands for maintaining, for example, the installation of the package “numpy”:

Installing the package: pip install numpy
Upgrading the package: pip install --upgrade numpy
Installing the package locally for the current user: pip install --user numpy
Uninstalling the package: pip uninstall numpy
Installing modules

Example: OpenCV is a package for image processing in Python. Here you can see how the installation proceeds in a Unix terminal.
Writing modules

Your Python projects will become complex and you will need to maintain the code properly. Therefore, one can break a large, unwieldy programming task into separate, more manageable modules. Modules can be written in Python itself or in C, but here we keep focussing on the Python language.

Creating modules in Python is very straightforward - a Python module is a file containing Python code, for example:

s = "Hello world!"
l = [1, 2, 3, 5, 5]

def add_one(n):
    return n + 1

File: mymodule.py
Working with modules

If you import the module mymodule, the interpreter looks in the current working directory for a file mymodule.py, reads and interprets its contents and makes its namespace available:

Usage of own modules
import mymodule
mymodule.s
mymodule.l
mymodule.add_one(5)

## Hello world!
## [1, 2, 3, 5, 5]
## 6
Python packages

Large projects could require more than one module. Packages allow you to structure the modules and their namespaces hierarchically by using the dot notation. They are simple folders containing modules and (sub-)packages. Consider the following structure:

The directory mypackage contains two modules which we can import separately:

## 5
Package initialization

If a package directory contains a file __init__.py, its code is invoked when the package gets imported. The directory mypackage, now, contains the two modules and the initialization file:

The file __init__.py can be empty but can also be used for package initialization purposes.
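The mechanism can be tried out without touching your project by building a throwaway package in a temporary directory. Everything in this sketch is hypothetical: the package name demo_package, the module mymodule1 and the variable initialized are chosen for illustration only.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk; all names here are hypothetical.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_package"
pkg.mkdir()
(pkg / "__init__.py").write_text("initialized = True\n")
(pkg / "mymodule1.py").write_text("def add_one(n):\n    return n + 1\n")

sys.path.insert(0, str(root))         # make the parent folder importable
importlib.invalidate_caches()

import demo_package                   # runs demo_package/__init__.py
from demo_package import mymodule1    # dot notation: package.module

print(demo_package.initialized)       # True
print(mymodule1.add_one(4))           # 5
```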
The Zen of Python

The Zen of Python
import this

## The Zen of Python, by Tim Peters
##
## Beautiful is better than ugly.
## Explicit is better than implicit.
## Simple is better than complex.
## Complex is better than complicated.
## Flat is better than nested.
## Sparse is better than dense.
## Readability counts.
## Special cases aren't special enough to break the rules.
## Although practicality beats purity.
## Errors should never pass silently.
## Unless explicitly silenced.
## In the face of ambiguity, refuse the temptation to guess.
## ...
Further topics

A selection of exciting topics that are among the advanced basics but are not covered in this lecture:

Dynamic language concepts, such as duck typing,
Further complex type classes, such as ChainMap or OrderedDict,
Chapter 2
Numerical programming

2.1 NumPy package
2.2 Array basics
2.3 Linear algebra
Section 2.1
Numerical programming
The NumPy package

Built-in mathematical functions on arrays without writing loops,
Built-in linear algebra functions.

Import NumPy
import numpy as np
Motivation

Element-wise addition
vec1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
vec2 = np.array(vec1)
vec1 + vec1

## [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]

vec2 + vec2

## array([ 2,  4,  6,  8, 10, 12, 14, 16, 18])
Motivation

Matrix multiplication
mat1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
mat2 = np.array(mat1)
np.dot(mat2, mat2)

mat3 = np.zeros([3, 3])
for i in range(3):
    for k in range(3):
        for j in range(3):
            mat3[i][k] = mat3[i][k] + mat1[i][j] * mat1[j][k]
mat3

## array([[ 30.,  36.,  42.],
##        [ 66.,  81.,  96.],
##        [102., 126., 150.]])
Motivation

Time comparison
import time

mat1 = np.random.rand(50, 50)
mat2 = np.array(mat1)

t = time.time()
mat3 = np.dot(mat2, mat2)
nptime = time.time() - t

mat3 = np.zeros([50, 50])
t = time.time()
for i in range(50):
    for k in range(50):
        for j in range(50):
            mat3[i][k] = mat3[i][k] + mat1[i][j] * mat1[j][k]
pytime = time.time() - t

times = str(pytime / nptime)
print("NumPy is " + times + " times faster!")

## NumPy is 19.49091343854615 times faster!
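A variant of this comparison also confirms that both approaches return the same matrix. It is a sketch under a few assumptions: time.perf_counter() replaces time.time() because it is intended for measuring short durations, the matrix size n and the seed are arbitrary, and the measured ratio depends on the machine.

```python
import time
import numpy as np

n = 30
rng = np.random.default_rng(0)        # seeded generator for reproducibility
mat = rng.random((n, n))

t = time.perf_counter()
fast = mat @ mat                      # same result as np.dot for 2-D arrays
nptime = time.perf_counter() - t

slow = np.zeros((n, n))
t = time.perf_counter()
for i in range(n):
    for k in range(n):
        for j in range(n):
            slow[i, k] += mat[i, j] * mat[j, k]
pytime = time.perf_counter() - t

print(np.allclose(fast, slow))        # True: both give the same product
print(f"loop time / NumPy time: {pytime / nptime:.1f}")
```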
Section 2.2
Numerical programming
Creating NumPy arrays

## 1

arr3.shape

## (3, 3)
Array creation functions

np.full((row, column), k): Creates array with all values set to k.

Array creation
np.linspace(0, 80, 5)

## array([ 0., 20., 40., 60., 80.])

np.full((5, 4), 7)
Copy arrays

Reference
arr3

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr = arr3
arr[1, 1] = 777
arr3

arr = arr3 binds arr to the existing arr3. They both refer to the same object.
Copy array

Copy
arr3
arr = arr3.copy()
arr[1, 1] = 777
arr3

## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

Reference
arr3
arr = arr3
arr[1, 1] = 777
arr3

## array([[  4,   8,   5],
##        [  9, 777,   4],
##        [  1,   0,   6]])

arr3[1, 1] = 3
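Whether two names refer to the same data can also be checked directly. The sketch below rebuilds arr3 with the values shown above and uses np.shares_memory(); the names ref and cpy are hypothetical.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

ref = arr3                  # second name for the same array object
cpy = arr3.copy()           # independent array with its own data

print(ref is arr3, np.shares_memory(ref, arr3))   # True True
print(cpy is arr3, np.shares_memory(cpy, arr3))   # False False

cpy[1, 1] = 777             # leaves arr3 untouched
print(arr3[1, 1])           # 3
```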
Overview: Array creation functions
Data types of arrays

Data types
arr1.dtype
arr2.dtype

## dtype('float64')

arr1 = arr1 * 2.5
arr1.dtype

## dtype('float64')

arr1 = (arr1 / 2.5).astype(np.int64)
arr1.dtype

## dtype('int64')
Array operations

Element-wise operations
Integer indexing

Indexing with an integer
arr = np.arange(10)
arr
Slicing

Slicing in one dimension
arr = np.arange(10)
arr
Slicing

Slicing in one dimension with steps
arr[:7]

## array([0, 1, 2, 3, 4, 5, 6])

arr[-3:]

## array([7, 8, 9])

arr[::-1]
Slicing

Slicing in higher dimensions
vec = arr3[1]
vec

## array([9, 3, 4])

arr3[-1]

## array([1, 0, 6])
Slicing

Slicing in two dimensions
arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr3[0:2, 0:2]
## array([[4, 8],
##        [9, 3]])

arr3[2:, :]
## array([[1, 0, 6]])
Views on arrays

So far, selecting by index numbers or slicing belongs to basic indexing in NumPy. With basic indexing you get NO COPY of your data but a so-called view on the existing data set, i.e., a different perspective on the same data.

A view on an array can be seen as a reference to a rectangular memory area of its values. The view is intended to
edit a rectangular part of a matrix, e.g., a sub-matrix, a column, or a single value,
change the shape of the matrix or the arrangement of its elements, e.g., transpose or reshape a matrix,
change the interpretation of the values, e.g., view the memory of a float array as an int array (a cast with astype returns a copy instead),
map the values in other program areas.

The crucial point here is efficiency: data arrays in your working memory do not have to be copied again and again for simple index operations, which would require excessive additional writes to the computer memory.
Creating views implicitly

A view is created automatically when you do basic indexing such as slicing:

Create a view by slicing
column = arr3[:, 1]
column
## array([8, 3, 0])

column.base
Creating views implicitly

Create a view by slicing
elem = column[1:2]
elem.base
## array([[  4,   8,   5],
##        [  9, 100,   4],
##        [  1,   0,   6]])

The middle column is a view of the base array referenced by arr3,
Any changes to the values of a view directly affect the base data,
A view of a view is another view on the same base matrix.
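The three statements about views can be checked directly. The following sketch rebuilds `arr3` with the values used throughout this section and writes through a view of a view; `np.shares_memory` is used here as an additional check that is not part of the slides.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

column = arr3[:, 1]   # basic indexing (slice) -> view on arr3
elem = column[1:2]    # view of a view, still backed by arr3

elem[0] = 100         # writing through the view changes the base data

print(arr3[1, 1])                     # 100
print(column.base is arr3)            # True
print(np.shares_memory(elem, arr3))   # True
```

If an independent array is needed, an explicit `arr3[:, 1].copy()` avoids this coupling.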
Obtaining views explicitly

In addition, an array has methods and attributes that return a view of its data:

Obtain a view
arr3_t = arr3.T
arr3_t
## array([[4, 9, 1],
##        [8, 3, 0],
##        [5, 4, 6]])

arr3_t.flags.owndata
## False

arr3_r = arr3.reshape(1, 9)
arr3_r
## array([[4, 8, 5, 9, 3, 4, 1, 0, 6]])

arr3_r.flags.owndata
## False
Obtaining views explicitly

Obtain a view
arr3_v = arr3.view()
arr3_v.flags.owndata
## False
Fancy indexing

The behavior described above changes with advanced indexing, i.e., if at least one component of the index tuple is not a scalar index number or slice. The case of fancy indexing is described below:

Advanced and basic indexing
arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])
Fancy indexing

Advanced and basic indexing
arr = arr3[0:3:2, 0:3:2]
arr
## array([[4, 5],
##        [1, 6]])
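To make the difference between basic and advanced indexing concrete, the following sketch selects the same four corner values once with slices and once with index lists. The use of `np.ix_` to build the index lists is an assumption of this sketch, not something shown on the slides.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

basic = arr3[0:3:2, 0:3:2]             # slices only -> view on arr3
fancy = arr3[np.ix_([0, 2], [0, 2])]   # index lists -> independent copy

print(np.array_equal(basic, fancy))    # True, same values
print(np.shares_memory(basic, arr3))   # True, view
print(np.shares_memory(fancy, arr3))   # False, copy
```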
A boolean array is a NumPy array with boolean True and False values. Such an array can be created by applying a comparison operator to NumPy arrays.

Boolean arrays
bool_arr = (arr3 < 5)
bool_arr
## array([[ True, False, False],
##        [False,  True,  True],
##        [ True,  True, False]])

bool_arr1 = (arr3 == 0)
bool_arr1
a = np.array([3, 8, 4, 1, 9, 5, 2])
b = np.array([2, 3, 5, 6, 11, 15, 17])
c = (a % 2 == 0) | (b % 3 == 0)  # or
c
## array([False,  True,  True,  True, False,  True,  True])

d = (a > b) ^ (a % 2 == 1)  # exclusive or
d

c ^ d  # exclusive or
## array([False, False,  True, False,  True, False,  True])
Boolean arrays

x[y] selects all the elements of x for which the corresponding value (at the same position) of y is True.

Indexing with boolean arrays
arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

y = arr3 % 2 == 0
y

arr3[y]
## array([4, 8, 4, 0, 6])
Conditional indexing

Conditional indexing allows you to use boolean arrays to select subsets of values and to avoid loops. When a comparison operator is applied to an array, every element of the array is tested against the logical condition. Consider an application that sets all even numbers to 5:

Find and replace values in arrays
a, b = arr3.copy(), arr3.copy()
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        if a[i, j] % 2 == 0:
            a[i, j] = 5

b[b % 2 == 0] = 5
b
## array([[5, 5, 5],
##        [9, 3, 5],
##        [1, 5, 5]])

np.allclose(a, b)
## True
Conditional indexing

Find and replace values in arrays, condition: equal
arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr = arr3.copy()
arr[arr == 4] = 100
arr

In this example, arr == 4 creates a boolean array as described before, which is then used to index the array arr.
dimensional array with n integer indices returns the single value at this position.

Best practice Step 1a
mat = np.arange(12).reshape((3, 4))
mat
## array([[ 0,  1,  2,  3],
##        [ 4,  5,  6,  7],
##        [ 8,  9, 10, 11]])

mat[2, 2]
## 10

mat[0, -1]
## 3

Keep in mind that, in this case only, the results are not arrays but values!
Best practice: Indexing arrays

Step 1b
Integer indexing array[row index]: In n-dimensional arrays, the ele-

By specifying the row index only, we create arrays which are views.
Best practice: Indexing arrays

Step 2a
Slicing array[start : stop : step]: Slicing can be used separately
Best practice: Indexing arrays

Step 2b
A frequent task is to get a specific row or column of an array. This can be done easily by slicing.

Best practice Step 2b
mat

Slicing with [:] means to take every element from the first to the last.
Best practice: Indexing arrays

Step 3
Fancy indexing array[rows list, columns list]: Returns a one-dimensional array with the values at the index tuples specified element-wise by the index lists.

Best practice Step 3
mat = np.arange(12).reshape((3, 4))
mat
## array([[ 0,  1,  2,  3],
##        [ 4,  5,  6,  7],
##        [ 8,  9, 10, 11]])

mat[[1, 2], [1, 2]]

mat[bool_mat] = 111  # equivalent to mat[mat > 0] = 111
mat
Best practice: Indexing arrays

Step 5
Replacing values in arrays. When assigning new values to a slice of an array, the shape of the slice must be taken into account.

Best practice Step 5
mat[0] = np.array([3, 2, 1])  # Fails because the shapes do not fit
## Error: could not broadcast array from shape (3) into shape (4)

mat[2, 3] = 100
mat[:, 0] = np.array([3, 3, 3])
mat
## array([[  3, 111, 111, 111],
##        [  3, 111, 111, 111],
##        [  3, 111, 111, 100]])

mat[1:3, 1:3] = np.array([[0, 0], [0, 0]])
mat
## array([[  3, 111, 111, 111],
##        [  3,   0,   0, 111],
##        [  3,   0,   0, 100]])
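The indexing variants from the best-practice steps can be put side by side in one sketch. It starts from a fresh `mat`; the mask `mat > 5` and the names on the left are chosen for illustration, and `np.shares_memory` is used as an extra check for view versus copy.

```python
import numpy as np

mat = np.arange(12).reshape((3, 4))

value = mat[2, 2]              # n integer indices -> single value
row = mat[1]                   # row index only -> 1-D view on row 1
col = mat[:, 0]                # slice -> 1-D view on column 0
picked = mat[[1, 2], [1, 2]]   # fancy indexing -> copy of mat[1, 1] and mat[2, 2]
mask = mat > 5                 # boolean array with the shape of mat
large = mat[mask]              # boolean indexing -> 1-D copy

print(value)                                  # 10
print(picked.tolist())                        # [5, 10]
print(np.shares_memory(row, mat))             # True
print(np.shares_memory(picked, mat))          # False
```

As a rule of thumb, integer indices and slices give views, while index lists and boolean masks give copies.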
Reshaping arrays
Adding and removing elements of arrays
Combining and splitting
Transposing array

Transpose
arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr3.T
## array([[4, 9, 1],
##        [8, 3, 0],
##        [5, 4, 6]])

np.eye(3).T
## array([[1., 0., 0.],
##        [0., 1., 0.],
##        [0., 0., 1.]])
Matrix multiplication

Matrix multiplication
res = np.dot(arr3, np.arange(18).reshape((3, 6)))
res
## array([[108, 125, 142, 159, 176, 193],
##        [ 66,  82,  98, 114, 130, 146],
##        [ 72,  79,  86,  93, 100, 107]])

## True
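For two-dimensional arrays, the `@` operator computes the same matrix product as `np.dot`. A short sketch with the values from above (the name `other` is illustrative):

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])
other = np.arange(18).reshape((3, 6))

res_dot = np.dot(arr3, other)   # function form
res_at = arr3 @ other           # operator form, same result for 2-D arrays

print(res_at.shape)                       # (3, 6)
print(np.array_equal(res_dot, res_at))    # True
```

The inner dimensions must match: a (3, 3) matrix times a (3, 6) matrix gives a (3, 6) result.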
Array functions

Element-wise functions
arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

np.sqrt(arr3)
## array([[2.        , 2.82842712, 2.23606798],
##        [3.        , 1.73205081, 2.        ],
##        [1.        , 0.        , 2.44948974]])

np.exp(arr3)
## array([[5.45981500e+01, 2.98095799e+03, 1.48413159e+02],
##        [8.10308393e+03, 2.00855369e+01, 5.45981500e+01],
##        [2.71828183e+00, 1.00000000e+00, 4.03428793e+02]])
Overview: Element-wise array functions

Function                                       Description
abs                                            Absolute value of integer and floating point values
sqrt                                           Square root
exp                                            Exponential function
log, log10, log2                               Natural logarithm, log base 10, log base 2
sign                                           Sign (1: positive, 0: zero, -1: negative)
ceil                                           Round up to integer
floor                                          Round down to integer
rint                                           Round to nearest integer
modf                                           Returns fractional and integral parts
sin, cos, tan, sinh, cosh, tanh, arcsin, ...   Trigonometric and hyperbolic functions
Binary functions

Binary
x = np.array([3, -6, 8, 4, 3, 5])
y = np.array([3, 5, 7, 3, 5, 9])
np.maximum(x, y)
## array([3, 5, 8, 4, 5, 9])

## array([0, 4, 1, 1, 3, 5])
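A few more binary functions applied to the same `x` and `y`, as a sketch. Which calls were shown on the original slide besides `np.maximum` is not recoverable here; the selection below is an assumption. Note that `np.mod` follows the sign of the divisor, so `-6 mod 5` is 4.

```python
import numpy as np

x = np.array([3, -6, 8, 4, 3, 5])
y = np.array([3, 5, 7, 3, 5, 9])

largest = np.maximum(x, y)    # element-wise maximum: 3, 5, 8, 4, 5, 9
smallest = np.minimum(x, y)   # element-wise minimum: 3, -6, 7, 3, 3, 5
remainder = np.mod(x, y)      # element-wise modulus: 0, 4, 1, 1, 3, 5
is_greater = np.greater(x, y) # boolean array, same as x > y

print(remainder)
print(is_greater)
```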
Overview: Binary functions

Function               Description
add                    Add elements of arrays
subtract               Subtract elements in the second from the first array
multiply               Multiply elements
divide                 Divide elements
power                  Raise elements in first array to powers in second
maximum                Element-wise maximum
minimum                Element-wise minimum
mod                    Element-wise modulus
greater, less, equal   Element-wise comparison, returns a boolean array
Data processing

Evaluate the function $f(x, y) = \sqrt{x^2 + y^2}$ on a 10 x 10 grid.
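One possible way to carry out this evaluation without loops is `np.meshgrid`. This is a sketch under assumptions: the grid range from -2 to 2 and the variable names are chosen here for illustration, since the original code and range are not shown.

```python
import numpy as np

# Assumption: the grid spans [-2, 2] in both directions (range not given in the source)
points = np.linspace(-2, 2, 10)        # 10 grid points per axis
x, y = np.meshgrid(points, points)     # two 10 x 10 coordinate arrays
z = np.sqrt(x ** 2 + y ** 2)           # evaluated element-wise on the whole grid

print(z.shape)   # (10, 10)
```

Each entry `z[i, j]` is the distance of the grid point `(x[i, j], y[i, j])` from the origin.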
plt.show()

[Figure: plot of the function values on the grid]
Conditional logic

otherwise returns b.

Conditional logic
a = np.array([4, 7, 5, -7, 9, 0])
b = np.array([-1, 9, 8, 3, 3, 3])
cond = np.array([True, True, False, True, False, False])
res = np.where(cond, a, b)
res
## array([ 4,  7,  8, -7,  3,  3])

res = np.where(a <= b, b, a)
res
## array([4, 9, 8, 3, 9, 3])
Conditional logic

Conditional logic, examples
arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

res = np.where(arr3 < 5, 0, arr3)
res
## array([[0, 8, 5],
##        [9, 0, 0],
##        [0, 0, 6]])

even = np.where(arr3 % 2 == 0, arr3, arr3 + 1)
even
## array([[ 4,  8,  6],
##        [10,  4,  4],
##        [ 2,  0,  6]])
Statistical methods

Statistical methods
arr3
## 40

arr3.argmin()
## 7
Overview: Statistical methods

Method           Description
sum              Sum of all array elements
mean             Mean of all array elements
std, var         Standard deviation, variance
min, max         Minimum and maximum value in array
argmin, argmax   Indices of minimum and maximum value
Axis

Axes are defined for arrays with more than one dimension. A two-dimensional array has two axes. The first one runs vertically downwards across the rows (axis=0), the second one runs horizontally across the columns (axis=1).

Axis
arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr3.sum(axis=0)
## array([14, 11, 15])

arr3.sum(axis=1)
## array([17, 16, 7])
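The `axis` argument works the same way for the other statistical methods: the given axis is the one that is collapsed. A sketch with `arr3`; the use of `np.unravel_index` to turn the flat `argmin` position into a (row, column) pair is an addition of this sketch.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

total = arr3.sum()             # no axis -> all elements: 40
col_sums = arr3.sum(axis=0)    # collapse the rows -> one value per column: 14, 11, 15
row_sums = arr3.sum(axis=1)    # collapse the columns -> one value per row: 17, 16, 7
row_max = arr3.max(axis=1)     # maximum of each row: 8, 9, 6

flat_pos = arr3.argmin()                          # position in the flattened array: 7
pos = np.unravel_index(flat_pos, arr3.shape)      # (row, column) = (2, 1)

print(total, col_sums, row_sums, row_max, pos)
```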
Sorting

Sorting one-dimensional arrays
arr2
## array([24.3 ,  0.  ,  8.9 ,  4.4 ,  1.65, 45.  ])

arr2.sort()
arr2
Sorting

Sorting two-dimensional arrays
arr3
## array([[4, 8, 5],
##        [9, 3, 4],
##        [1, 0, 6]])

arr3.sort()
arr3
## array([[4, 5, 8],
##        [3, 4, 9],
##        [0, 1, 6]])

arr3.sort(axis=0)
arr3
## array([[0, 1, 6],
##        [3, 4, 8],
##        [4, 5, 9]])

The default axis of sort() is -1, which means to sort along the last axis (in this case axis 1).
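The method `sort()` works in place and therefore overwrites the array, as the changed `arr3` above shows. The following sketch contrasts it with `np.sort`, which returns a sorted copy, and with `np.argsort`, which can be used to reorder whole rows. Both functions are additions of this sketch and are not shown on the slides.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

sorted_copy = np.sort(arr3, axis=0)   # sorted copy, arr3 itself is unchanged
order = np.argsort(arr3[:, 0])        # row order that sorts the first column: 2, 0, 1
by_first_col = arr3[order]            # reorder complete rows, rows stay intact

arr3.sort()                           # in place, along the last axis

print(sorted_copy)
print(by_first_col)
print(arr3)
```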
Section 2.3
Numerical programming
Inverse matrix

Import numpy.linalg
import numpy.linalg as nplin

nplin.inv(array): Computes the inverse matrix.
np.allclose(array1, array2): Returns True if two arrays are element-wise equal within a tolerance.

Inverse
inv = nplin.inv(arr3)
inv
## array([[  4., -21.,  16.],
##        [ -5.,  24., -18.],
##        [  1.,  -4.,   3.]])
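A computed inverse can be verified by multiplying it with the original matrix and comparing the result to the identity matrix with `np.allclose`. The sketch below uses `arr3` in its state after the sorting steps, which is the matrix the inverse above belongs to.

```python
import numpy as np
import numpy.linalg as nplin

# arr3 as it looks after the two sort() calls
arr3 = np.array([[0, 1, 6], [3, 4, 8], [4, 5, 9]])

inv = nplin.inv(arr3)

# arr3 times its inverse should be the identity matrix (up to rounding)
print(np.allclose(np.dot(arr3, inv), np.eye(3)))   # True
```

`np.allclose` is preferable to `==` here because floating point rounding usually leaves tiny deviations from exact zeros and ones.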
Matrix functions

## 13

np.diag(arr3)
## array([0, 4, 9])
Eigenvalues and eigenvectors

Check eigenvalues and eigenvectors
eigenval * eigenvec
## array([[-0.00000000e+00, -4.08248290e-01, -1.41421356e+00],
##        [-0.00000000e+00, -8.16496581e-01, -1.41421356e+00],
##        [-1.00000000e+00, -4.08248290e-01,  2.34055565e-17]])

np.dot(A, eigenvec)
## array([[ 0.00000000e+00, -4.08248290e-01, -1.41421356e+00],
##        [ 0.00000000e+00, -8.16496581e-01, -1.41421356e+00],
##        [-1.00000000e+00, -4.08248290e-01, -1.17027782e-17]])

$$
\begin{pmatrix} 3 & -1 & 0 \\ 2 & 0 & 0 \\ -2 & 2 & -1 \end{pmatrix}
\cdot
\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
= (-1) \cdot
\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ -1 \end{pmatrix}
$$
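The check above can be run end to end as follows. The matrix `A` is taken from the equation on this slide; computing `eigenval` and `eigenvec` with `nplin.eig` is an assumption based on the function overview of this section, since the original call is not shown.

```python
import numpy as np
import numpy.linalg as nplin

A = np.array([[3, -1, 0], [2, 0, 0], [-2, 2, -1]])

# eig returns the eigenvalues and a matrix whose COLUMNS are the eigenvectors
eigenval, eigenvec = nplin.eig(A)

# A v_i = lambda_i v_i for every column i;
# broadcasting scales column i of eigenvec by eigenval[i]
print(np.allclose(np.dot(A, eigenvec), eigenval * eigenvec))   # True
print(np.sort(eigenval))
```

The order in which `eig` returns the eigenvalues is not guaranteed, so sort them before comparing against expected values.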
QR decomposition

R
## array([[ -5.        ,  -6.4       , -12.        ],
##        [  0.        ,   1.0198039 ,   6.07960019],
##        [  0.        ,   0.        ,   0.19611614]])
Linear system

Solve linear systems
b = np.array([7, 4, 8])
x = nplin.solve(A, b)
x
## array([  2.,  -1., -14.])

np.allclose(np.dot(A, x), b)
## True

$$
\begin{aligned}
3x_1 - 1x_2 + 0x_3 &= 7 \\
2x_1 - 0x_2 + 0x_3 &= 4 \\
-2x_1 + 2x_2 - 1x_3 &= 8
\end{aligned}
\quad \rightarrow \quad
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix} 2 \\ -1 \\ -14 \end{pmatrix}
$$
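A self-contained version of this example, with `A` taken from the coefficients of the equation system. The comparison with the route via the inverse matrix is an addition of this sketch; `nplin.solve` is generally the preferred call because it avoids forming the inverse explicitly.

```python
import numpy as np
import numpy.linalg as nplin

A = np.array([[3, -1, 0], [2, 0, 0], [-2, 2, -1]])
b = np.array([7, 4, 8])

x = nplin.solve(A, b)                  # solves A x = b directly
x_via_inv = np.dot(nplin.inv(A), b)    # same solution via the inverse, less efficient

print(x)
print(np.allclose(np.dot(A, x), b))    # True
print(np.allclose(x, x_via_inv))       # True
```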
Overview: Linear algebra

Function      Description
np.dot        Matrix multiplication
np.trace      Sum of the diagonal elements
np.diag       Diagonal elements as an array
nplin.det     Matrix determinant
nplin.eig     Eigenvalues and eigenvectors
nplin.inv     Inverse matrix
nplin.qr      QR decomposition
nplin.solve   Solve a linear system
Chapter 3
Data formats and handling

3.1 Pandas package
3.2 Series
3.3 DataFrame
3.4 Import/Export data
Section 3.1
Data formats and handling
Pandas
Motivation

With pandas you can import and visualize financial data in only a few lines of code.

Motivation
import pandas as pd
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
dow = pd.read_csv("data/dji.csv", index_col=0, parse_dates=True)
close = dow["Close"]
close.plot(ax=ax)
ax.set_xlabel("Date")
ax.set_ylabel("Price")
ax.set_title("DJI")
fig.savefig("out/dji.pdf", format="pdf")
Motivation

[Figure: line plot titled "DJI"; x-axis: Date]
Section 3.2
Data formats and handling
Series

A Series is a data structure in pandas.
One-dimensional array-like object,
Containing a sequence of values and a corresponding array of labels, called the index,
The string representation of a Series displays the index on the left and the values on the right,
The default index consists of the integers 0 through N-1.

String representation of a Series
## 0     3
## 1     7
## 2    -8
## 3     4
## 4    26
## dtype: int64
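A Series with this string representation can be created from a plain list, as sketched below. The variable name `obj` is an assumption based on its later use in `obj.values`; the original creation code is not shown here.

```python
import pandas as pd

# No index is passed, so pandas assigns the default index 0 through N-1
obj = pd.Series([3, 7, -8, 4, 26])

print(obj)          # index on the left, values on the right
print(obj.index)    # RangeIndex(start=0, stop=5, step=1)
print(obj.values)   # the underlying NumPy array
```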
Create Series

Series indexing vs. NumPy indexing
obj2 = pd.Series([2, -5, 9, 4], index=["a", "b", "c", "d"])
npobj = np.array([2, -5, 9, 4])
obj2
## a    2

npobj[1]
## -5
Create Series

Series from dicts
dictdata = {"Göttingen": 117665, "Northeim": 28920,
            "Hannover": 532163, "Berlin": 3574830}
obj3 = pd.Series(dictdata)
obj3

In contrast to a NumPy array, you can use the assigned index to select single values,
Data contained in a dict can be passed to a Series. The index of the resulting Series consists of the dict’s keys.
Create Series

Dict to Series with manual index
cities = ["Hamburg", "Göttingen", "Berlin", "Hannover"]
obj4 = pd.Series(dictdata, index=cities)
obj4
## Hamburg            NaN
## Göttingen     117665.0
## Berlin       3574830.0
## Hannover      532163.0
## dtype: float64

Passing a dict to a Series, the index can be set manually,
NaN (not a number) marks missing values where the index and the
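The effect of a manually set index can be inspected with `isnull` and `dropna`. This sketch repeats the data from above; `dropna` is an addition that is not part of the slides.

```python
import pandas as pd

dictdata = {"Göttingen": 117665, "Northeim": 28920,
            "Hannover": 532163, "Berlin": 3574830}
cities = ["Hamburg", "Göttingen", "Berlin", "Hannover"]

# Only labels listed in cities are kept; labels without data become NaN
obj4 = pd.Series(dictdata, index=cities)

print(obj4.isnull())   # True only for Hamburg
print(obj4.dropna())   # Series without the missing entry
```

Northeim is dropped because it is not in `cities`, and the values become floats because NaN is a floating point value.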
Series properties

Series properties
obj.values

obj2.index
## Index(['a', 'b', 'c', 'd'], dtype='object')

The values and the index of a Series can be printed separately.
The default index, if none was explicitly specified, is a RangeIndex.
RangeIndex inherits from the Index class.
Selecting and manipulating values

Series manipulation
obj2[["c", "d", "a"]]
## c    9
## d    4
## a    2
Selecting and manipulating values

Series functions
obj2 * 2
## a     4
## b   -10
## c    18
## d     8
## dtype: int64

np.exp(obj2)["a":"c"]

"c" in obj2
## True

## Hamburg      1900000.0
## Göttingen     117665.0
## Berlin       3600000.0
## Hannover     1100000.0
## dtype: float64

NaN
pd.isnull(obj4)
Align differently indexed data

Data 1: obj3                 Data 2: obj4
## Göttingen   117665        ## Hamburg     1900000.0
## Northeim     28920        ## Göttingen    117665.0
## Hannover    532163        ## Berlin      3600000.0
## Berlin     3574830        ## Hannover    1100000.0
## dtype: int64              ## dtype: float64

Align data
obj3 + obj4
## Berlin       7174830.0
## Göttingen     235330.0
## Hamburg            NaN
## Hannover     1632163.0
## Northeim           NaN
## dtype: float64

Hamburg and Northeim each appear in only one of the two Series, so there are not two values to align and they are marked as NaN.

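
The alignment above can be reproduced with the following sketch. The two Series are rebuilt from the printed values. Constructing them from dicts is an assumption, since the original construction is not shown here.

```python
import pandas as pd

obj3 = pd.Series({"Göttingen": 117665, "Northeim": 28920,
                  "Hannover": 532163, "Berlin": 3574830})
obj4 = pd.Series({"Hamburg": 1900000.0, "Göttingen": 117665.0,
                  "Berlin": 3600000.0, "Hannover": 1100000.0})

total = obj3 + obj4          # labels are matched, not positions
missing = pd.isnull(total)   # True where only one operand had the label

print(total)
print(total[missing].index.tolist())
```

The result contains the union of both indexes. Labels that occur in only one operand yield NaN.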
Naming Series

Naming
obj4.name = "population"
obj4.index.name = "city"
obj4
## city
## Hamburg      1900000.0
## Göttingen     117665.0
## Berlin       3600000.0
## Hannover     1100000.0
## Name: population, dtype: float64

Assigning to the attribute name changes the name of the existing Series,
By default, neither the Series nor its index has a name.

Series vs. NumPy arrays

NumPy arrays are accessed by their integer positions,

Section 3.3

Data formats and handling

DataFrame

DataFrames are the primary structure of pandas,

String representation of a DataFrame
##    company   price   volume
## 0  Daimler   69.20  4456290
## 1     E.ON    8.11  3667975
## 2  Siemens  110.92  3669487
## 3     BASF   87.28  1778058
## 4      BMW   87.81  1824582

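
One way to obtain the DataFrame printed above is to build it from a dict of lists. This is a sketch under the assumption that the names data and frame (which appear on later slides) refer to such a dict and the resulting DataFrame. The values are taken from the printed table.

```python
import pandas as pd

data = {
    "company": ["Daimler", "E.ON", "Siemens", "BASF", "BMW"],
    "price": [69.20, 8.11, 110.92, 87.28, 87.81],
    "volume": [4456290, 3667975, 3669487, 1778058, 1824582],
}
frame = pd.DataFrame(data)   # dict keys become the column labels

print(frame)
print(frame.dtypes)          # each column has its own dtype
```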
DataFrame

"price", "change"])
frame2
##    company   volume   price change
## 0  Daimler  4456290   69.20    NaN
## 1     E.ON  3667975    8.11    NaN
## 2  Siemens  3669487  110.92    NaN
## 3     BASF  1778058   87.28    NaN
## 4      BMW  1824582   87.81    NaN

If a column is passed that is not contained in the dict, it is filled with NaN,
As with Series, a default index is assigned automatically.

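
The constructor call on this slide is only partly visible. The sketch below shows one call that would produce the printed table, assuming the columns argument of pd.DataFrame was used. The dict data is rebuilt from the printed values.

```python
import pandas as pd

data = {
    "company": ["Daimler", "E.ON", "Siemens", "BASF", "BMW"],
    "price": [69.20, 8.11, 110.92, 87.28, 87.81],
    "volume": [4456290, 3667975, 3669487, 1778058, 1824582],
}
# columns fixes the column order; "change" is not a key of data,
# so that column is created and filled with NaN
frame2 = pd.DataFrame(data, columns=["company", "volume", "price", "change"])

print(frame2)
```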
Inputs to DataFrame constructor

Type                                Description
2D NumPy arrays                     A matrix of data
dict of arrays, lists, or tuples    Each sequence becomes a column
dict of Series                      Each value becomes a column
dict of dicts                       Each inner dict becomes a column
List of dicts or Series             Each item becomes a row
List of lists or tuples             Treated like a 2D NumPy array
Another DataFrame                   Same indexes

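
A few rows of the table can be illustrated as follows. All variable names, labels and values are made up for the illustration.

```python
import numpy as np
import pandas as pd

# 2D NumPy array: a matrix of data
from_array = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["x", "y"])
# dict of lists: each sequence becomes a column
from_dict_of_lists = pd.DataFrame({"x": [0, 2, 4], "y": [1, 3, 5]})
# dict of dicts: inner keys become the row index
from_dict_of_dicts = pd.DataFrame({"x": {"r1": 0, "r2": 2},
                                   "y": {"r1": 1, "r2": 3}})
# list of dicts: each item becomes a row
from_list_of_dicts = pd.DataFrame([{"x": 0, "y": 1}, {"x": 2, "y": 3}])
# list of lists: treated like a 2D array
from_list_of_lists = pd.DataFrame([[0, 1], [2, 3]], columns=["x", "y"])
```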
Indexing and adding DataFrames

Add data to DataFrame
frame2["change"] = [1.2, -3.2, 0.4, -0.12, 2.4]
frame2["change"]
## 0    1.20
## 1   -3.20
## 2    0.40
## 3   -0.12
## 4    2.40
## Name: change, dtype: float64

Selecting a single column of a DataFrame returns a Series,
Attribute-like access, e.g., frame2.change, is also possible,
The returned Series has the same index as the initial DataFrame.

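
The points above can be checked with a small sketch. The DataFrame is rebuilt with only two of the original columns to keep it short. The extra column "up" is a made-up example.

```python
import pandas as pd

frame2 = pd.DataFrame({
    "company": ["Daimler", "E.ON", "Siemens", "BASF", "BMW"],
    "price": [69.20, 8.11, 110.92, 87.28, 87.81],
})
frame2["change"] = [1.2, -3.2, 0.4, -0.12, 2.4]   # length must match the rows

col = frame2["change"]                 # a Series sharing the DataFrame's index
same = frame2.change                   # attribute access reads the same column
frame2["up"] = frame2["change"] > 0    # new columns need the bracket syntax
```

Attribute-like access is convenient for reading, but creating a new column only works with the bracket syntax.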
Indexing DataFrames

Indexing DataFrames
frame2[["company", "change"]]
##    company  change
## 0  Daimler    1.20
## 1     E.ON   -3.20

Indexing with a list of several columns returns a DataFrame,
The returned DataFrame has the same index as the initial one.

Changing DataFrames

DataFrame delete column
del frame2["volume"]
frame2

Naming DataFrames

Naming properties
frame2.index.name = "number:"
frame2.columns.name = "feature:"
frame2
## feature:  company   price  change
## number:
## 0         Daimler   69.20    1.20
## 1            E.ON    8.11   -3.20
## 2         Siemens  110.92    0.40
## 3            BASF   87.28   -0.12
## 4             BMW   87.81    2.40

In DataFrames there is no default name for the index or the columns.

Reindexing

Filling missing values
frame4 = frame.reindex(index=[0, 2, 3, 4, 5], fill_value=0,
                       columns=["company", "price", "market cap"])
frame4

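
The effect of reindex with fill_value can be sketched as follows. To keep the example simple, the DataFrame here contains only the numeric columns from the earlier slides, which is a deviation from the call above.

```python
import pandas as pd

frame = pd.DataFrame({
    "price": [69.20, 8.11, 110.92, 87.28, 87.81],
    "volume": [4456290, 3667975, 3669487, 1778058, 1824582],
})
# Row 1 is dropped, row 5 and the column "market cap" are new;
# everything that did not exist before is filled with 0
frame4 = frame.reindex(index=[0, 2, 3, 4, 5],
                       columns=["price", "market cap"],
                       fill_value=0)
print(frame4)
```

Without fill_value, the new entries would be NaN instead of 0.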
Fill NaN

Dropping entries

Dropping column
frame5[:2]
##    company  price   volume
## 0  Daimler  69.20  4456290
## 1     E.ON   8.11  3667975

Indexing, selecting and filtering

##    company   price   volume
## 2  Siemens  110.92  3669487
## 3     BASF   87.28  1778058
## 4      BMW   87.81  1824582

Indexing, selecting and filtering

Indexing
frame6 = pd.DataFrame(data, index=["a", "b", "c", "d", "e"])
frame6
##    company  price   volume
## a  Daimler  69.20  4456290

Indexing, selecting and filtering

Selection with loc and iloc
frame6.loc[["c", "d", "e"], ["volume", "price", "company"]]
##     volume   price  company
## c  3669487  110.92  Siemens
## d  1778058   87.28     BASF
## e  1824582   87.81      BMW

frame6.iloc[2:, ::-1]
##     volume   price  company
## c  3669487  110.92  Siemens
## d  1778058   87.28     BASF
## e  1824582   87.81      BMW

Both indexers work with slices or lists: loc takes labels, iloc takes integer positions,
There are many ways to select and rearrange pandas objects.

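
The two selections above return the same result, which the sketch below verifies. The DataFrame is rebuilt from the values printed on the earlier slides. The variables label_slice and position_slice are additional illustrations of how slices behave.

```python
import pandas as pd

data = {
    "company": ["Daimler", "E.ON", "Siemens", "BASF", "BMW"],
    "price": [69.20, 8.11, 110.92, 87.28, 87.81],
    "volume": [4456290, 3667975, 3669487, 1778058, 1824582],
}
frame6 = pd.DataFrame(data, index=["a", "b", "c", "d", "e"])

by_label = frame6.loc[["c", "d", "e"], ["volume", "price", "company"]]
by_position = frame6.iloc[2:, ::-1]   # rows from position 2, columns reversed

label_slice = frame6.loc["b":"d"]     # label slices include the end label
position_slice = frame6.iloc[1:3]     # position slices exclude the end position
```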
DataFrame indexing options

Hierarchical indexing

Hierarchical indexing enables you to have multiple index levels.

Multiindex
ind = [["a", "a", "a", "b", "b"], [1, 2, 3, 1, 2]]
frame6 = pd.DataFrame(np.arange(15).reshape((5, 3)), index=ind,
                      columns=["first", "second", "third"])
frame6
##      first  second  third
## a 1      0       1      2
##   2      3       4      5
##   3      6       7      8
## b 1      9      10     11
##   2     12      13     14

frame6.index.names = ["index1", "index2"]
frame6.index
## MultiIndex([('a', 1),
##             ('a', 2),
##             ('a', 3),
##             ('b', 1),
##             ('b', 2)],
##            names=['index1', 'index2'])

Hierarchical indexing

Selecting from a MultiIndex
frame6.loc["a"]
##         first  second  third
## index2
## 1           0       1      2

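
Beyond selecting on the outer level, a MultiIndex can be addressed with tuples or by cross-section. The sketch below reuses frame6 from the slide. The variables single, cell and inner show selections that go beyond what the slide itself demonstrates.

```python
import numpy as np
import pandas as pd

ind = [["a", "a", "a", "b", "b"], [1, 2, 3, 1, 2]]
frame6 = pd.DataFrame(np.arange(15).reshape((5, 3)), index=ind,
                      columns=["first", "second", "third"])
frame6.index.names = ["index1", "index2"]

outer = frame6.loc["a"]                # all rows with index1 == "a"
single = frame6.loc[("b", 2)]          # one row, addressed by a tuple
cell = frame6.loc[("b", 2), "third"]   # one value
inner = frame6.xs(1, level="index2")   # cross-section on the inner level
```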
Operations between DataFrame and Series

Series and DataFrames
frame7 = frame[["price", "volume"]]
frame7.index = ["Daimler", "E.ON", "Siemens", "BASF", "BMW"]
series = frame7.iloc[2]
frame7
## price         110.92
## volume    3669487.00
## Name: Siemens, dtype: float64

Here the Series was generated from the third row of the DataFrame (position 2, Siemens).

Operations between DataFrames and Series

Operations between Series and DataFrames down the rows
frame7 + series
##           price     volume
## Daimler  180.12  8125777.0
## E.ON     119.03  7337462.0
## Siemens  221.84  7338974.0
## BASF     198.20  5447545.0
## BMW      198.73  5494069.0

By default, arithmetic operations between DataFrames and Series match the index of the Series on the DataFrame's columns,
The operation is then broadcast down the rows.

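
The default behaviour can be reproduced with the sketch below. frame7 is rebuilt directly from the printed values instead of being derived from frame, which is a simplification.

```python
import pandas as pd

frame7 = pd.DataFrame(
    {"price": [69.20, 8.11, 110.92, 87.28, 87.81],
     "volume": [4456290, 3667975, 3669487, 1778058, 1824582]},
    index=["Daimler", "E.ON", "Siemens", "BASF", "BMW"],
)
series = frame7.iloc[2]     # the Siemens row, labelled "price" and "volume"

# The Series labels are matched against the column labels,
# then the Series is added to every row
result = frame7 + series
print(result)
```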
Operations between DataFrames and Series

Operations between Series and DataFrames down the columns
series2 = frame7["price"]
frame7.add(series2, axis=0)
##           price      volume
## Daimler  138.40  4456359.20

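
To see why axis=0 is needed here, the following sketch compares the method call with the plain + operator. frame7 is rebuilt from the printed values. The variable names by_rows and misaligned are illustrative.

```python
import pandas as pd

frame7 = pd.DataFrame(
    {"price": [69.20, 8.11, 110.92, 87.28, 87.81],
     "volume": [4456290, 3667975, 3669487, 1778058, 1824582]},
    index=["Daimler", "E.ON", "Siemens", "BASF", "BMW"],
)
series2 = frame7["price"]   # labelled by company, like the row index

# axis=0 matches the Series labels against the row index,
# so each price is added to every column of its own row
by_rows = frame7.add(series2, axis=0)

# The plain operator matches against the columns instead;
# no label fits, so every entry becomes NaN
misaligned = frame7 + series2
```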
Operations between DataFrames and Series

Pandas vs Numpy
nparr = np.arange(12.).reshape((3, 4))
row = nparr[0]
nparr - row

Operations between DataFrames and Series are similar to operations between two- and one-dimensional NumPy arrays,
As with DataFrames and Series, the arithmetic operation is broadcast down the rows.

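
The NumPy side of the comparison can be checked directly. The second half of the sketch, subtracting a column, is an addition for illustration and not part of the slide.

```python
import numpy as np

nparr = np.arange(12.).reshape((3, 4))
row = nparr[0]
diff = nparr - row                       # row is subtracted from every row

col = nparr[:, 0]
# A column has to be reshaped to (3, 1) before it broadcasts across columns
diff_cols = nparr - col[:, np.newaxis]

print(diff)
print(diff_cols)
```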
NumPy functions on DataFrames

Grouping DataFrames

Groupby
res = vote.groupby(["Party", "Vote"]).count()
res
##             Member
## Party Vote

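
The data behind vote is not shown here, so the sketch below uses made-up records. Only the column names Member, Party and Vote follow the slide.

```python
import pandas as pd

# Made-up voting records for illustration only
vote = pd.DataFrame({
    "Member": ["Ann", "Bob", "Cem", "Dee", "Eve", "Fay"],
    "Party": ["A", "A", "A", "B", "B", "B"],
    "Vote": ["yes", "yes", "no", "no", "no", "yes"],
})

# Group by two keys; count() reports the non-missing entries
# of every remaining column per group
res = vote.groupby(["Party", "Vote"]).count()
print(res)
```

Grouping by two keys produces a result with a hierarchical index, one level per key.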
Section 3.4

Data formats and handling

Reading data in text format

ex1.csv
a, b, c, d, hello
1, 2, 3, 4, world
5, 6, 7, 8, python
2, 3, 5, 7, pandas

pd.read_csv("file"): Reads CSV into DataFrame.
##    a  b  c  d   hello
## 0  1  2  3  4   world
## 1  5  6  7  8  python
## 2  2  3  5  7  pandas

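
The sketch below reads the same content from a string instead of a file, so it runs without ex1.csv on disk. Using skipinitialspace=True is an assumption on my part: it strips the blank after each comma as the file is displayed above.

```python
import io
import pandas as pd

csv_text = """a, b, c, d, hello
1, 2, 3, 4, world
5, 6, 7, 8, python
2, 3, 5, 7, pandas
"""
# The first line is used as the header by default
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
print(df)
```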
Reading data in text format

tab.txt
a| b| c| d| hello
1| 2| 3| 4| world
5| 6| 7| 8| python

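
The call used for tab.txt is not visible on this slide. One possible way to read such a file is the sep parameter of pd.read_csv, sketched here on an in-memory copy of the lines shown above.

```python
import io
import pandas as pd

txt = """a| b| c| d| hello
1| 2| 3| 4| world
5| 6| 7| 8| python
"""
# sep sets the field delimiter; skipinitialspace drops the blank after it
df = pd.read_csv(io.StringIO(txt), sep="|", skipinitialspace=True)
print(df)
```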
Reading data in text format

ex2.csv
1, 2, 3, 4, world
5, 6, 7, 8, python
2, 3, 5, 7, pandas

CSV file without header row:

Read CSV and header settings

Reading data in text format

ex2.csv
1, 2, 3, 4, world
5, 6, 7, 8, python
2, 3, 5, 7, pandas

Specify header:

Read CSV and header names
df = pd.read_csv("data/ex2.csv",
                 names=["a", "b", "c", "d", "hello"])
df
##    a  b  c  d   hello
## 0  1  2  3  4   world
## 1  5  6  7  8  python
## 2  2  3  5  7  pandas

Reading data in text format

ex2.csv
1, 2, 3, 4, world
5, 6, 7, 8, python
2, 3, 5, 7, pandas

Use hello-column as the index:

Read CSV and specify index

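
The three header variants for ex2.csv can be compared side by side. The calls with header=None and index_col are my assumption of what the slides without visible code demonstrate. Only the names variant is shown in the original.

```python
import io
import pandas as pd

csv_text = """1, 2, 3, 4, world
5, 6, 7, 8, python
2, 3, 5, 7, pandas
"""
cols = ["a", "b", "c", "d", "hello"]

# No header row in the file: columns are numbered 0, 1, 2, ...
no_header = pd.read_csv(io.StringIO(csv_text), header=None,
                        skipinitialspace=True)
# Supply the column names explicitly
named = pd.read_csv(io.StringIO(csv_text), names=cols,
                    skipinitialspace=True)
# Additionally use the column "hello" as the row index
indexed = pd.read_csv(io.StringIO(csv_text), names=cols,
                      index_col="hello", skipinitialspace=True)
```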
Reading data in text format

ex3.csv
1, 2, 3, 4, world
#+#-.,.-'*'-.,
5, 6, 7, 8, python
87646756754456978
2, 3, 5, 7, pandas

Skip rows while reading:

Read CSV and choose rows
df = pd.read_csv("data/ex3.csv", skiprows=[1, 3])
df
##    1  2  3  4   world
## 0  5  6  7  8  python
## 1  2  3  5  7  pandas

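
The same call can be tried on an in-memory copy of ex3.csv. The skipinitialspace=True argument is an assumption that removes the blank after each comma.

```python
import io
import pandas as pd

csv_text = """1, 2, 3, 4, world
#+#-.,.-'*'-.,
5, 6, 7, 8, python
87646756754456978
2, 3, 5, 7, pandas
"""
# skiprows takes zero-based line numbers; lines 1 and 3 are the junk lines
df = pd.read_csv(io.StringIO(csv_text), skiprows=[1, 3],
                 skipinitialspace=True)
print(df)
```

Since no header option is set, the first remaining line serves as the header, which explains the column labels in the output above.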
Writing data to text file

Write to CSV
df = pd.read_csv("data/ex3.csv", skiprows=[1, 3])
df.to_csv("out/out1.csv")

,1, 2, 3, 4, world
0,5,6,7,8, python
1,2,3,5,7, pandas

The .csv file includes both the index and the header, which is why the first line starts with ",1".

Writing data to text file

Write to CSV and settings
df = pd.read_csv("data/ex3.csv", skiprows=[1, 3])
df.to_csv("out/out2.csv", index=False, header=False)

out2.csv
5,6,7,8, python
2,3,5,7, pandas

Writing data to text file

Write to CSV and specify header
df = pd.read_csv("data/ex3.csv", skiprows=[1, 3, 4])
df.to_csv("out/out3.csv", index=False,
          header=["a", "b", "c", "d", "e"])

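
The effect of the index and header options can be inspected without writing files: when called without a path, to_csv returns the CSV text as a string. The small DataFrame below is a shortened stand-in for the data read from ex3.csv.

```python
import pandas as pd

df = pd.DataFrame({"1": [5, 2], "2": [6, 3], "world": ["python", "pandas"]})

with_all = df.to_csv()                                    # index and header
bare = df.to_csv(index=False, header=False)               # values only
renamed = df.to_csv(index=False, header=["a", "b", "e"])  # new column names

print(with_all)
print(bare)
print(renamed)
```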
Reading Excel files

Excel as a DataFrame
xls_frame[["Adj Close", "Volume", "High"]]
##      Adj Close   Volume         High
## 0  1169.939941  1538700  1173.000000
## 1  1167.699951  2412100  1174.000000
## 2  1111.900024  4857900  1123.069946
## 3  1055.800049  3798300  1110.000000
## 4  1080.599976  3448000  1081.709961
## 5  1048.579956  2341700  1081.780029

Remote data access

Extract financial data from Internet sources into a DataFrame. Different sources offer different kinds of data. Some sources are:
Robinhood
IEX
Yahoo Finance
World Bank
OECD
Eurostat
A complete list of the sources and their usage can be found in the documentation of pandas-datareader.

Import pandas-datareader
from pandas_datareader import data

Data access: Yahoo Finance

Explore Ford dataset
ford.index
## DatetimeIndex(['2020-01-02', '2020-01-03',...
## ...dtype='datetime64[ns]', name='Date',...

ford.loc["2020-01-28"]
## High         9.000000e+00
## Low          8.860000e+00
## Open         8.940000e+00
## Close        8.970000e+00
## Volume       8.516340e+07
## Adj Close    8.730923e+00
## Name: 2020-01-28 00:00:00, dtype: float64

DataFrame index
The index of the DataFrame differs between sources. Always check DataFrame.index!

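
Selecting by date works on any DataFrame with a DatetimeIndex, which can be tried without network access. The prices below are synthetic placeholders, not real market data. The names idx, prices, row and window are illustrative.

```python
import pandas as pd

# Synthetic values on three consecutive days (not real market data)
idx = pd.date_range("2020-01-27", periods=3, freq="D", name="Date")
prices = pd.DataFrame({"Open": [1.0, 2.0, 3.0],
                       "Close": [1.5, 2.5, 3.5]}, index=idx)

print(prices.index)              # check the index type before selecting
row = prices.loc["2020-01-28"]   # a date string selects the matching row
window = prices.loc["2020-01-28":"2020-01-29"]   # both ends are included
```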
Data access: Yahoo Finance

Download and explore SAP data
sap = data.DataReader("SAP", "yahoo", "2020-01-01", "2020-06-30")
sap[25:27]
##       High  Low  ...  Volume  Adj Close
## Date             ...

Data access: Eurostat

Eurostat
population = data.DataReader("tps00001", "eurostat", "2010-01-01",
                             "2020-01-01")
population.columns
## MultiIndex(levels=[[Population on 1 January - total], [Albania,
## Andorra, Armenia, Austria, Azerbaijan, Belarus, Belgium, ...

population["Population on 1 January - total", "France"][-5:]
## FREQ             Annual
## TIME_PERIOD
## 2016-01-01   66638391.0
## 2017-01-01   66809816.0
## 2018-01-01   66918941.0
## 2019-01-01   67012883.0
## 2020-01-01   67098824.0

Eurostat Database

Read data from HTML

Website used for the example: Econometrics

Beautiful Soup
from bs4 import BeautifulSoup
import requests
url = "www.uni-goettingen.de/de/applied-econometrics/412565.html"
r = requests.get("https://" + url)
d = r.text
soup = BeautifulSoup(d, "lxml")
soup.title

Motivation

Bollinger
sap = data.DataReader("SAP", "yahoo", "2019-01-01", "2020-08-31")
sap.index = pd.to_datetime(sap.index)
boll = sap["Close"].rolling(window=20, center=False).mean()
std = sap["Close"].rolling(window=20, center=False).std()
upp = boll + std * 2
low = boll - std * 2
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
boll.plot(ax=ax, label="20 days Rolling mean")
upp.plot(ax=ax, label="Upper Band")
low.plot(ax=ax, label="Lower Band")
sap["Close"].plot(ax=ax, label="SAP Price")
ax.legend(loc="best")
fig.savefig("out/boll.pdf")

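
The band computation itself does not depend on downloaded data. The sketch below applies the same rolling mean and standard deviation to a synthetic random walk, which stands in for sap["Close"]. The series close and the random seed are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic random-walk "prices" on business days (not real SAP data)
close = pd.Series(100 + rng.standard_normal(60).cumsum(),
                  index=pd.date_range("2020-01-01", periods=60, freq="B"))

boll = close.rolling(window=20).mean()   # 20-day rolling mean
std = close.rolling(window=20).std()     # 20-day rolling standard deviation
upp = boll + 2 * std                     # upper band
low = boll - 2 * std                     # lower band

print(pd.DataFrame({"mean": boll, "upper": upp, "lower": low}).tail())
```

The first 19 entries of each band are NaN, because a window of 20 observations is not yet complete.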
Motivation

[Figure: SAP price with 20 days rolling mean, upper band and lower band; x-axis: Date]

Chapter 4

Visual illustrations

4.1 Matplotlib package
4.2 Figures and subplots
4.3 Plot types and styles
4.4 Pandas layers

Section 4.1

Visual illustrations

matplotlib

The package matplotlib is a free software library for Python that offers the following features:
Image plots, Contour plots, Scatter plots, Polar plots, Line plots, 3D plots,
Variety of hardcopy formats,
Works in Python scripts, the Python and IPython shell and the Jupyter notebook,
Interactive environments.

matplotlib

Usage of matplotlib
matplotlib has a vast number of functions and options, which are hard to remember. For almost every task, however, there is an example you can take code from. A great source of information is the examples gallery on the matplotlib homepage. Also note the best practice quick start guide.

Simple plot

Import matplotlib and simple example
import matplotlib.pyplot as plt
import numpy as np
plt.plot(np.arange(10))
plt.savefig("out/list.pdf")

[Figure: line plot of the values of np.arange(10)]

Section 4.2

Visual illustrations

Figures

parameters.
plt.gcf(): Returns a reference to the active figure.

Create Figures
fig = plt.figure(figsize=(16, 8))
print(plt.gcf())
## Figure(1600x800)

A Figure object can be considered as an empty window,
The Figure object has a number of options, such as the size or the aspect ratio,
You cannot draw a plot in a blank figure. There has to be a subplot in the Figure object.

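
The relation between a Figure and its subplots can be sketched as follows. Selecting the "Agg" backend is an assumption that lets the code run without a display. The plotted numbers are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")            # non-interactive backend (assumption)
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(16, 8))   # an empty window, 16 x 8 inches
active = plt.gcf()                  # the figure just created is the active one

ax = fig.add_subplot(1, 1, 1)       # a subplot to draw into
ax.plot([0, 1, 2], [0, 1, 4])

print(active)
```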
Saving plots to file

Subplots

The Figure object is filled with subplots in which the plots reside,
If plt.plot() is called without creating a subplot in advance, matplotlib creates a Figure object and a subplot automatically,
The Figure object and its subplots can be created in one line.

Subplots

[Figure: 2 × 2 grid of empty subplots, each with both axes ranging from 0.0 to 1.0]

Subplots

Filling subplots with content
from numpy.random import randn
ax1.plot([5, 7, 4, 3, 1])
ax2.hist(randn(100), bins=20, color="r")
ax3.scatter(np.arange(30), np.arange(30) * randn(30))
ax4.plot(randn(40), "k--")
fig.savefig("out/content.pdf")

The subplots in one Figure object can be filled with different plot types,
If only plt.plot() is used, matplotlib draws the plot in the last Figure object and the last subplot selected.

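
The code above relies on ax1 to ax4, whose creation is not shown on this slide. The sketch below assumes they were created with fig.add_subplot on a 2 × 2 grid. It also uses a seeded NumPy generator instead of randn and the "Agg" backend so that it runs reproducibly without a display.

```python
import matplotlib
matplotlib.use("Agg")            # non-interactive backend (assumption)
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)   # seeded stand-in for randn

fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 1)   # 2 x 2 grid, positions 1 to 4
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)
ax4 = fig.add_subplot(2, 2, 4)

ax1.plot([5, 7, 4, 3, 1])                                  # line plot
ax2.hist(rng.standard_normal(100), bins=20, color="r")    # histogram
ax3.scatter(np.arange(30), np.arange(30) * rng.standard_normal(30))
ax4.plot(rng.standard_normal(40), "k--")                   # dashed black line
```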
Subplots

[Figure: 2 × 2 grid of subplots containing a line plot, a histogram, a scatter plot and a dashed line plot]

Standard creation of plots

subplots in one line. If sharex or sharey are True, all subplots share

[Figure: 2 × 3 grid of subplots]

Section 4.3

Visual illustrations

Plot types

Adjusting the spacing around subplots

the figure width and figure height, respectively, to use as spacing between subplots.

Adjust spacing
fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
for i in range(2):
    for j in range(2):
        axes[i][j].plot(randn(10))
plt.subplots_adjust(wspace=0, hspace=0)
fig.savefig("out/spacing.pdf")

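
A self-contained variant of the example is given below. It uses a seeded generator instead of randn and the "Agg" backend (both assumptions), and calls the equivalent method fig.subplots_adjust on the Figure object.

```python
import matplotlib
matplotlib.use("Agg")            # non-interactive backend (assumption)
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)   # seeded stand-in for randn

# Figure and a 2 x 2 array of subplots in one line, with shared axes
fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
for ax in axes.flat:
    ax.plot(rng.standard_normal(10))

# wspace and hspace of 0 remove the gaps between the subplots
fig.subplots_adjust(wspace=0, hspace=0)
```

With shared axes, all four subplots use the same axis limits, so the tick labels only need to appear once per row and column.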
Adjusting the spacing around subplots

[Figure: 2 × 2 grid of line plots with shared axes and no spacing between the subplots]

Colors, markers and line styles

The arguments linestyle, color and marker of ax.plot() set the style of the line plot of subplot ax.

Styles
fig, ax = plt.subplots(1, figsize=(15, 6))
ax.plot(randn(10), linestyle="--", color="darkcyan", marker="p")
fig.savefig("out/style.pdf")

[Figure: dashed dark cyan line plot with pentagon markers, saved as out/style.pdf]
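The keyword arguments end up as properties of the Line2D object that ax.plot() returns, so they can be inspected afterwards. The sketch below assumes the Agg backend and seeded random data; the second call uses matplotlib's format string shorthand, which does not appear on the slides and is shown only for comparison.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: run without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
data = rng.standard_normal(10)

fig, ax = plt.subplots(1, figsize=(15, 6))

# Explicit keyword arguments, as in the Styles example
(explicit,) = ax.plot(data, linestyle="--", color="darkcyan", marker="p")

# Format string shorthand: "g" = green, "s" = square marker, ":" = dotted line
(shorthand,) = ax.plot(data + 1, "gs:")

for line in (explicit, shorthand):
    print(line.get_linestyle(), line.get_color(), line.get_marker())
plt.close(fig)
```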
Plot colors
Plot line styles
Plot markers

Marker  Description
"."     point
","     pixel
"o"     circle
"v"     triangle_down
"8"     octagon
"s"     square
"p"     pentagon
"P"     plus (filled)
"*"     star
"h"     hexagon1
"H"     hexagon2
"+"     plus
"x"     x
"X"     x (filled)
"D"     diamond
Ticks and labels

Here, we create a Figure object as well as a subplot and fill it with a line plot of a random walk.
By default, matplotlib distributes the ticks evenly along the data range. Individual ticks can be set explicitly.
By default, there is no axis label or title.
[Figure: line plot of the random walk with default x ticks 0, 200, 400, 600, 800, 1000 and no labels or title]
Set ticks and labels
ax.set_xticks([0, 250, 500, 750, 1000])
ax.set_xlabel("Days", fontsize=20)
ax.set_ylabel("Change", fontsize=20)
ax.set_title("Simulation", fontsize=30)
fig.savefig("out/labels.pdf")

The individual ticks are given as a list to ax.set_xticks().
The label and title can be set to an individual size using the argument fontsize.
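Every setter used above has a matching getter, which is handy for checking what a subplot currently shows. A minimal sketch, assuming the Agg backend and a seeded random walk of 1000 steps (the slides do not preserve how the walk was generated):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: run without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)
walk = rng.standard_normal(1000).cumsum()  # assumed construction of the random walk

fig, ax = plt.subplots(1, 1)
ax.plot(walk)

default_ticks = ax.get_xticks()            # ticks chosen by matplotlib
ax.set_xticks([0, 250, 500, 750, 1000])    # individual ticks
ax.set_xlabel("Days", fontsize=20)
ax.set_ylabel("Change", fontsize=20)
ax.set_title("Simulation", fontsize=30)

custom_ticks = [int(t) for t in ax.get_xticks()]
print(custom_ticks, ax.get_xlabel(), ax.get_ylabel(), ax.get_title())
plt.close(fig)
```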
[Figure: random walk titled "Simulation" with x-axis label "Days" and x ticks 0, 250, 500, 750, 1000, saved as out/labels.pdf]
Legends

When several plots share one subplot, a legend is needed to tell them apart.
ax.legend(loc): Shows the legend at location loc.

The legend displays the label and the color of the associated plot.
With the option "best", the legend is placed where it interferes least with the plots.
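The legend takes its entries from the label argument of each plot call. The following sketch assumes the Agg backend and three seeded random walks; the labels "first", "second" and "third" are the ones visible in the figure of this section.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: run without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(3)

fig, ax = plt.subplots(1, 1)
for name in ("first", "second", "third"):
    # label is what the legend displays for this line
    ax.plot(rng.standard_normal(1000).cumsum(), label=name)

legend = ax.legend(loc="best")
entries = [text.get_text() for text in legend.get_texts()]
print(entries)
plt.close(fig)
```

Plots created without a label are not listed in the legend, so a line can be kept out of it by omitting the argument.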
[Figure: three line plots in one subplot with a legend showing "first", "second" and "third"]
Annotations on a subplot

With ax.annotate(), the arrow head points at xy and the bottom left corner of the text is placed at xytext.
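A minimal sketch of xy and xytext, assuming the Agg backend and a seeded random walk; the annotated position 500 and the offsets are arbitrary illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: run without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(4)
walk = rng.standard_normal(1000).cumsum()

fig, ax = plt.subplots(1, 1)
ax.plot(walk)

x = 500  # hypothetical point of interest
note = ax.annotate("here",
                   xy=(x, walk[x]),                 # arrow head
                   xytext=(x + 100, walk[x] + 10),  # position of the text
                   arrowprops=dict(facecolor="red", shrink=0.03))

print(note.get_text(), note.xy, note.xyann)
plt.close(fig)
```

Both coordinates are interpreted in data units by default, so the text moves with the data when the axis limits change.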
[Figure: three line plots labelled "first", "second" and "third" with the annotations "here" and "there"]
Annotations

Annotation Lehman
import pandas as pd
from datetime import datetime
date = datetime(2008, 9, 15)
fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
dow = pd.read_csv("data/dji.csv", index_col=0, parse_dates=True)
close = dow["Close"]
close.plot(ax=ax)
ax.annotate("Lehman Bankruptcy",
            fontsize=30,
            xy=(date, close.loc[date] + 400),
            xytext=(date, 22000),
            arrowprops=dict(facecolor="red",
                            shrink=0.03))
ax.set_title("Dow Jones Industrial Average", size=40)
fig.savefig("out/lehman.pdf")
[Figure: Dow Jones Industrial Average with a red arrow labelled "Lehman Bankruptcy", saved as out/lehman.pdf]
Drawing on a subplot

[Figure: subplot with both axes ranging from 0 to 5]
Best practice: Visual illustrations

Step 1
Create a Figure object and subplots

Best practice Step 1
fig, ax = plt.subplots(1, 1, figsize=(16, 8))

Best practice Step 2
x = np.arange(0, 10, 0.1)
y = np.sin(x)
ax.scatter(x, y)
[Figure: scatter plot of y = sin(x) with default styles]
Step 3
Set colors, markers and line styles

Best practice Step 3
ax.scatter(x, y, color="green", marker="s")
[Figure: scatter plot titled "Sine wave" with green square markers and x-axis label "x-value"]
Step 5
Set labels

Best practice Step 5
ax.scatter(x, y, color="green", marker="s", label="Sine")

Step 7
Save plot to file

Best practice Step 7
fig.savefig("out/sinewave.pdf")
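The steps can be put together in one script. This is only a sketch: steps 4 and 6 are assumptions inferred from the figures of this section (title "Sine wave", x-axis label "x-value", a legend), the Agg backend is assumed, and the plot is written to an in-memory buffer instead of out/sinewave.pdf so that no output directory is required.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: run without a display
import matplotlib.pyplot as plt
import numpy as np

# Step 1: create a Figure object and a subplot
fig, ax = plt.subplots(1, 1, figsize=(16, 8))

# Step 2: plot the data
x = np.arange(0, 10, 0.1)
y = np.sin(x)

# Steps 3 and 5: colors, markers and the label in a single call
ax.scatter(x, y, color="green", marker="s", label="Sine")

# Step 4 (assumed): title and axis label as visible in the figure
ax.set_title("Sine wave")
ax.set_xlabel("x-value")

# Step 6 (assumed): show the legend
legend = ax.legend(loc="best")

# Step 7: save the plot; a buffer stands in for the file on disk
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
print(f"{buffer.getbuffer().nbytes} bytes written")
plt.close(fig)
```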
[Figure: scatter plot titled "Sine wave" with x-axis label "x-value" and a legend showing "Sine" and "Linear", saved as out/sinewave.pdf]
Section 4.4
Visual illustrations: Pandas layers
Plotting with layers

pandas provides a convenient layer of frequently used plotting methods for its objects, such as Series and DataFrames.
Seaborn is a powerful graphics framework that allows you to easily create beautiful, complex graphics using a simple interface.

Simple line plot
plt.close("all")
p = pd.Series(np.random.rand(10).cumsum(),
              index=np.arange(0, 1000, 100))
p

## 0      0.669761
## 100    0.989702
## 200    1.655715
## 300    1.966073
## 400    2.151883
## 500    2.776987
## 600    2.839751
## 700    3.188431
## 800    4.169061
## 900    4.923286
## dtype: float64

p.plot()
plt.savefig("out/line.pdf")
[Figure: line plot of the Series p over the index 0 to 900, saved as out/line.pdf]
Line plots

Line plots
df = pd.DataFrame(np.random.randn(10, 3), index=np.arange(10),
                  columns=["a", "b", "c"])
df

##    a    b    c

df.plot(figsize=(15, 12))
plt.savefig("out/line2.pdf")
[Figure: line plot of the columns a, b and c with an automatic legend, saved as out/line2.pdf]
Plotting and pandas

The plot method applied to a DataFrame plots each column as a separate line and shows the legend automatically. When plotting DataFrames, several arguments change the style of the plot:

Argument   Description
kind       "line", "bar", etc.
logy       Logarithmic scale on the y-axis
use_index  If True, use the index for tick labels
rot        Rotation of tick labels
xticks     Values for x ticks
yticks     Values for y ticks
grid       Set grid True or False
xlim       X-axis limits
ylim       Y-axis limits
subplots   Plot each DataFrame column in a new subplot
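A few of these arguments in action, as a minimal sketch: it assumes the Agg backend and strictly positive made-up data so that the logarithmic scale is well defined. Note the return value: one Axes object by default, an array of Axes objects with subplots=True.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: run without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
# Made-up, strictly positive data so that logy=True is meaningful
df = pd.DataFrame(rng.random((10, 3)) + 1.0,
                  index=np.arange(10), columns=["a", "b", "c"])

# All columns in one subplot, log scale on the y-axis, grid and rotated tick labels
ax = df.plot(kind="line", logy=True, grid=True, rot=45)

# One subplot per column; pandas returns an array of Axes objects
axes = df.plot(subplots=True, figsize=(15, 10))

print(ax.get_yscale(), len(axes))
plt.close("all")
```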
Pandas plot

Separated line plots
df.plot(grid=True, rot=45, subplots=True, title="Example",
        figsize=(15, 10))
plt.savefig("out/pandas.pdf")
Standard creation of plots and pandas

Standard creation
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(1, 1, 1)
guests = np.array([[1334, 456], [1243, 597], [1477, 505],
                   [1502, 404], [854, 512], [682, 0]])
canteen = pd.DataFrame(guests,
                       index=["Mon", "Tue", "Wed",
                              "Thu", "Fri", "Sat"],
                       columns=["Zentral", "Turm"])
canteen

##      Zentral  Turm
## Mon     1334   456
## Tue     1243   597
## Wed     1477   505
## Thu     1502   404
## Fri      854   512
## Sat      682     0
Bar plot
canteen.plot(ax=ax, kind="bar")
ax.set_ylabel("guests", fontsize=20)
ax.set_title("Canteen use in Göttingen", fontsize=20)
fig.savefig("out/canteen.pdf")

The bar plot resides in the subplot ax.
The label and title are set as shown before, without using pandas.
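That the pandas plot really lives in the subplot ax can be verified directly: plot() returns the Axes it drew on, and the bars show up as patches of that Axes. A minimal sketch with the canteen data from above, assuming the Agg backend:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: run without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

guests = np.array([[1334, 456], [1243, 597], [1477, 505],
                   [1502, 404], [854, 512], [682, 0]])
canteen = pd.DataFrame(guests,
                       index=["Mon", "Tue", "Wed", "Thu", "Fri", "Sat"],
                       columns=["Zentral", "Turm"])

fig, ax = plt.subplots(figsize=(6, 6))
returned = canteen.plot(ax=ax, kind="bar")  # pandas draws into the given subplot

# From here on, plain matplotlib methods work on the same object
ax.set_ylabel("guests", fontsize=20)

n_bars = len(ax.patches)  # one rectangle per day and canteen
print(returned is ax, n_bars)
plt.close(fig)
```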
Bar plot

[Figure: grouped bar plot of guests per weekday for the canteens Zentral and Turm, y-axis label "guests", saved as out/canteen.pdf]
Bar plot - stacked
canteen.plot(ax=ax, kind="bar", stacked=True)
ax.set_ylabel("guests", fontsize=20)
ax.set_title("Canteen use in Göttingen", fontsize=20)
fig.savefig("out/canteenstacked.pdf")
[Figure: stacked bar plot of guests per weekday with legend entries Zentral and Turm, saved as out/canteenstacked.pdf]
Plot financial data

BTC chart
fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
ax.set_ylabel("price", fontsize=20)
ax.set_xlabel("Date", fontsize=20)
BTC = pd.read_csv("data/btc-eur.csv", index_col=0, parse_dates=True)
BTCclose = BTC["Close"]
BTCclose.plot(ax=ax)
ax.set_title("BTC-EUR", fontsize=20)
fig.savefig("out/btc.pdf")
[Figure: line plot titled "BTC-EUR" with y-axis label "price" (0 to 15000) and x-axis label "Date", saved as out/btc.pdf]
Compare - bad illustration
amazon = pd.read_csv("data/amzn.csv", index_col=0,
                     parse_dates=True)["Close"]
siemens = pd.read_csv("data/sie.de.csv", index_col=0,
                      parse_dates=True)["Close"]
fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
ax.set_ylabel("price")
amazon.plot(ax=ax, label="Amazon")
siemens.plot(ax=ax, label="Siemens")
ax.legend(loc="best")
fig.savefig("out/compare.pdf")

In this illustration, the trends of the two stocks can hardly be compared because their price levels differ.
With pandas, each series can be standardized in a single line.
[Figure: closing prices of Amazon and Siemens in one subplot, y-axis label "price", x-axis label "Date", saved as out/compare.pdf]
Compare - good illustration
# rebase both series to 100 at their first observation;
# .iloc[0] selects by position, independent of the date index
amazon = amazon / amazon.iloc[0] * 100
siemens = siemens / siemens.iloc[0] * 100
fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
ax.set_ylabel("percentage")
amazon.plot(ax=ax, label="Amazon")
siemens.plot(ax=ax, label="Siemens")
ax.legend(loc="best")
fig.savefig("out/comparenew.pdf")
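The rebasing step can be tried without the CSV files. The sketch below uses made-up prices (not the data from data/amzn.csv or data/sie.de.csv) and a hypothetical helper rebase that wraps the one-line standardization.

```python
import pandas as pd

dates = pd.date_range("2017-03-01", periods=5, freq="D")
# Made-up closing prices for illustration only
amazon = pd.Series([850.0, 860.0, 845.0, 900.0, 935.0], index=dates)
siemens = pd.Series([120.0, 121.0, 119.0, 126.0, 123.0], index=dates)


def rebase(prices, base=100.0):
    """Scale a price series so that its first observation equals base."""
    return prices / prices.iloc[0] * base


amazon_idx = rebase(amazon)
siemens_idx = rebase(siemens)

# Both series now start at 100 and show the change relative to the first day
print(pd.DataFrame({"Amazon": amazon_idx, "Siemens": siemens_idx}))
```

After rebasing, the two series share one scale, so their relative performance can be read directly from a common y-axis.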
[Figure: Amazon and Siemens rebased to 100 at the first observation, y-axis label "percentage", x-axis label "Date", saved as out/comparenew.pdf]
Chapter 5
Applications

5.1 Time series
5.2 Moving window
5.3 Financial applications
5.4 Optimization
Section 5.1
Applications: Time series
Date and time data types

Data types for date and time are included in the Python standard library.

Datetime creation
from datetime import datetime
now = datetime.now()
now

## datetime.datetime(2022, 2, 14, 0, 36, 9, 153276)

now.day

## 14

now.hour

## 0

A datetime object provides the attributes year, month, day, hour, minute, second and microsecond.
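Because datetime.now() returns a different value on every call, the attributes are easier to study on a fixed time stamp. A minimal sketch that rebuilds the value printed above by hand; the variable names stamp and parts are illustrative.

```python
from datetime import datetime

# Fixed time stamp equal to the output shown for now
stamp = datetime(2022, 2, 14, 0, 36, 9, 153276)

parts = (stamp.year, stamp.month, stamp.day,
         stamp.hour, stamp.minute, stamp.second, stamp.microsecond)
print(parts)

# date() and time() split the object into its calendar and clock parts
print(stamp.date().isoformat())
print(stamp.time().isoformat())

# weekday() counts from Monday = 0
print(stamp.weekday())
```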
Set datetime

Time difference
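As a minimal sketch of both topics, assuming only the standard library: a datetime can be set to a fixed point in time through its constructor or changed with replace(), and subtracting two datetime objects yields a timedelta. The dates below are arbitrary illustrative values.

```python
from datetime import datetime, timedelta

# Set a datetime explicitly: year, month, day, then optional time fields
start = datetime(2022, 2, 10)
end = datetime(2022, 2, 14, 12, 0)

# replace() returns a copy with selected fields changed
midnight = end.replace(hour=0, minute=0)

# The difference of two datetime objects is a timedelta
delta = end - start
print(delta.days, delta.seconds, delta.total_seconds())

# A timedelta can be added to a datetime again
later = start + timedelta(days=10)
print(later)
```

A timedelta stores days and seconds separately, which is why delta.seconds only holds the part of the difference that is shorter than one day.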
Convert string and datetime

val = "2020-5-5"
d = datetime.strptime(val, "%Y-%m-%d")
d

## datetime.datetime(2020, 5, 5, 0, 0)
Converting examples
val = "31.01.2012"
d = datetime.strptime(val, "%d.%m.%Y")
d

## datetime.datetime(2012, 1, 31, 0, 0)

now.strftime("Today is %A and we are in week %W of the year %Y.")

## 'Today is Monday and we are in week 07 of the year 2022.'
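strptime() and strftime() are inverse operations, so differently formatted date strings can be brought into one common format by parsing and re-formatting. The sketch below reuses the two strings from above; the dictionary formats and the variable mismatch are illustrative names.

```python
from datetime import datetime

# Each input string paired with the format that describes it
formats = {"2020-5-5": "%Y-%m-%d", "31.01.2012": "%d.%m.%Y"}

parsed = {text: datetime.strptime(text, fmt) for text, fmt in formats.items()}

# Re-format every date in one common layout
iso = [d.strftime("%Y-%m-%d") for d in parsed.values()]
print(iso)

# A format that does not match the string raises a ValueError
mismatch = None
try:
    datetime.strptime("31.01.2012", "%Y-%m-%d")
except ValueError as err:
    mismatch = str(err)
print(mismatch)
```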
Overview: Datetime formats

Type  Description
%Y    4-digit year
%m    2-digit month [01, 12]
%d    2-digit day [01, 31]
%H    Hour (24-hour clock) [00, 23]
%I    Hour (12-hour clock) [01, 12]
%M    2-digit minute [00, 59]
%S    Second [00, 61]
%W    Week number of the year [00, 53]
%F    Shortcut for %Y-%m-%d
Type  Description
%a    Abbreviated weekday name
%A    Full weekday name
%b    Abbreviated month name
%B    Full month name
%c    Full date and time
%x    Locale-appropriate formatted date
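The numeric format codes can be tried on a fixed time stamp. This sketch sticks to the numeric codes, since the name-based codes (%a, %A, %b, %B, %c, %x) depend on the locale of the machine; the time stamp is an arbitrary illustrative value.

```python
from datetime import datetime

stamp = datetime(2022, 2, 14, 15, 5, 9)

clock_24 = stamp.strftime("%H:%M:%S")   # 24-hour clock, zero padded
hour_12 = stamp.strftime("%I")          # same hour on the 12-hour clock
week = stamp.strftime("%W")             # week number of the year
date_iso = stamp.strftime("%Y-%m-%d")   # year, month and day

print(clock_24, hour_12, week, date_iso)
```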