PChem3 Python Tutorial5

This document discusses Python modules and libraries for scientific computing. It introduces the os, subprocess, time and datetime modules, covering functions for file/directory management, running system commands, and working with dates/times. Examples demonstrate basic usage of these modules.

Uploaded by

Suhyun Lee

Physical Chemistry 3 Spring 2024, SNU

2.5 Modules and libraries

Modules are files containing functions and classes that can be imported by other Python programs. A
collection of related modules is referred to as a library. You can import modules and libraries written
by other users or write your own for use in other programs. One of Python's strengths is its extensive
library ecosystem, comprising both the built-in standard library and third-party packages. These
libraries significantly enhance Python's capability for scientific calculations, providing ready-made
solutions for common tasks such as numerical computation, data analysis, and visualization. Whether
you are doing basic arithmetic or running complex scientific simulations, Python's libraries provide
the tools you need.

Modules and libraries are imported with the import statement. In the upcoming sections, we introduce
several frequently used modules. We will not cover every function and class within each library; for
detailed information, refer to the library documentation. Always check the version of the library you
are using: libraries often depend on specific versions of other libraries, and inconsistent versions
can lead to errors that are hard to track down.
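As a hedged illustration of the version check recommended above, the following sketch queries installed versions through the standard-library importlib.metadata module, without importing the packages themselves. The helper name installed_version is hypothetical and not part of the original tutorial.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

# Packages used later in this section; a missing package is reported as None
for name in ['numpy', 'numba', 'joblib']:
    print(name, installed_version(name))
```

Returning None instead of raising keeps the loop going when one package is missing, which is convenient for a quick environment check.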

Proficiency in Python coding often involves the ability to search for and utilize libraries that provide the
necessary functions and classes. Becoming adept at effectively navigating library documentation and
leveraging existing resources is a key skill for Python programmers.

In this section, we will maintain consistency by using the following versions of Python and libraries.

Code 2.77: Python and libraries’ versions.

import sys
# os, time, datetime modules are built-in modules
print('Current Python version:\n', sys.version)

# Un-comment this line if you haven't installed numpy
# Comment/Un-comment with ctrl + / (Windows) and cmd + / (Mac)
# !pip install numpy
import numpy as np
print('Current numpy version:', np.__version__)

# Un-comment this line if you haven't installed numba
# !pip install numba
import numba
print('Current numba version:', numba.__version__)

# Un-comment this line if you haven't installed joblib
# !pip install joblib
import joblib
print('Current joblib version:', joblib.__version__)

Output 2.77

Current Python version:

3.11.8 | packaged by conda-forge | (main, Feb 16 2024, 20:49:36) [Clang 16.0.6 ]
Current numpy version: 1.26.4
Current numba version: 0.59.0
Current joblib version: 1.3.2


2.5.1 os and subprocess module

❗ For the documentation of the os and subprocess modules in Python 3.11.8, see
https://docs.python.org/3.11/library/os.html and https://docs.python.org/3.11/library/subprocess.html, respectively.

The os module provides access to operating-system-dependent functionality. The os.getcwd() function
returns the current working directory as a string. This is equivalent to the pwd command in the bash shell.

Code 2.78: os.getcwd() function.

import os

os.getcwd()

Output 2.78

'/Users/(...)/Downloads/python_tutorial'

Directory management

os.mkdir() (equivalent to the mkdir command in bash ) creates a new directory, while os.listdir()
lists all directories and files in a directory (the current directory by default). Since os.listdir()
returns the entries in arbitrary order, the built-in sorted() function is used here to sort the list.

Code 2.79: os.mkdir() and os.listdir() function.

os.mkdir('./dir') # Equiv. to 'mkdir dir'
sorted(os.listdir()) # sorted() sorts os.listdir()

Output 2.79

['README.md',
'dir',
'file.txt',
'tutorial1.ipynb',
'tutorial2.ipynb',
'tutorial3.ipynb',
'tutorial4.ipynb',
'tutorial5.ipynb',
'tutorial6.ipynb',
'tutorial7.ipynb']

Note that os.mkdir() raises FileExistsError if you try to create an already existing directory.

Code 2.80: FileExistsError

os.mkdir('./dir') # Error

Output 2.80
---------------------------------------------------------------------------
FileExistsError Traceback (most recent call last)
Cell In[4], line 1
----> 1 os.mkdir('./dir')

FileExistsError: [Errno 17] File exists: './dir'


Unlike os.mkdir() , os.makedirs() can avoid this error through its exist_ok option. In addition,
os.makedirs() creates directories recursively: if parent directories are missing, it creates them
along the way until the destination directory is reached. Conversely, the os.rmdir() function, which
is equivalent to the rmdir command in bash , removes a directory. It works only on empty directories
and raises OSError otherwise; to remove a directory together with its contents (like rm -r ), use
shutil.rmtree() instead.

Code 2.81: os.makedirs() and os.rmdir() functions.

os.makedirs('./dir', exist_ok = True) # No error
os.rmdir('./dir') # Equiv. to 'rmdir dir' (dir must be empty)
sorted(os.listdir())

Output 2.81

['README.md',
'file.txt',
'tutorial1.ipynb',
'tutorial2.ipynb',
'tutorial3.ipynb',
'tutorial4.ipynb',
'tutorial5.ipynb',
'tutorial6.ipynb',
'tutorial7.ipynb']
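The following sketch, which is not part of the original tutorial, illustrates the recursive behavior of os.makedirs() and the difference between os.rmdir() and shutil.rmtree(). It works inside a temporary directory created with tempfile, and the directory names a, b and c are made up for illustration.

```python
import os
import shutil
import tempfile

# Work inside a throwaway directory so nothing in the real working directory is touched
with tempfile.TemporaryDirectory() as base:
    nested = os.path.join(base, 'a', 'b', 'c')   # hypothetical directory names
    os.makedirs(nested, exist_ok=True)           # creates a, a/b and a/b/c
    os.makedirs(nested, exist_ok=True)           # second call does nothing, no error

    top = os.path.join(base, 'a')
    try:
        os.rmdir(top)                            # fails: 'a' still contains 'b'
        rmdir_failed = False
    except OSError:
        rmdir_failed = True

    shutil.rmtree(top)                           # removes 'a' and everything below it
    still_exists = os.path.exists(top)

print(rmdir_failed, still_exists)
```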

File management

The os.rename() function (equivalent to the mv command in bash ) renames a file. To check whether a
path exists, use the os.path.exists() function.

Code 2.82: os.rename() and os.path.exists() functions.

os.rename('./file.txt', './file2.txt') # Equiv. to 'mv file.txt file2.txt'
print(os.path.exists('./file.txt'))
print(os.path.exists('./file2.txt'))

Output 2.82

False
True

To remove a file, use the os.remove() function (equivalent to the rm command in bash ).

Code 2.83: os.remove() function.

os.remove('./file2.txt') # Equiv. to 'rm file2.txt'
print(os.path.exists('./file2.txt'))

Output 2.83

False

os.path.isdir() and os.path.isfile() check whether a path exists and is a directory or a regular file,
respectively. os.path.join() is a useful tool for managing paths in Python: it joins path components
with the separator appropriate for your operating system. Code 2.84 joins a base path with the path
of a file inside that directory.

❗ In Linux, . means the current directory and .. means the parent directory. So ./dir means the
directory (or file) named dir in the current directory.


Code 2.84: Methods of os.path module.

print(os.path.isdir('./dir'))
print(os.path.isfile('./file.txt'))

pwd = os.getcwd()
file = './file.txt'
PATH = os.path.join(pwd, file)
print(PATH)

print(os.path.split(PATH))
print(os.path.splitext(PATH))

Output 2.84

False
False
/Users/(...)/Downloads/python_tutorial/./file.txt
('/Users/(...)/Downloads/python_tutorial/.', 'file.txt')
('/Users/(...)/Downloads/python_tutorial/./file', '.txt')

os.path.split() and os.path.splitext() split the path string. os.path.split() separates the last
component (here, the file name) from the rest of the path, while os.path.splitext() separates the file
extension from everything before it.
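The joined path in Output 2.84 still contains a redundant ./ component. As a small sketch (the directory name data is a hypothetical placeholder, not from the tutorial), os.path.normpath() collapses such components, while os.path.dirname() and os.path.basename() return the two halves of os.path.split() individually.

```python
import os

# 'data' is a made-up directory name used only for illustration
raw = os.path.join('data', '.', 'file.txt')
clean = os.path.normpath(raw)            # drops the redundant '.' component

print(raw)
print(clean)
print(os.path.dirname(clean), os.path.basename(clean))
print(os.path.splitext(os.path.basename(clean)))
```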

subprocess module

The subprocess module starts new processes, connects to their standard input/output/error pipes, and
obtains their return codes. This module replaces some old functions in the os module. Here we only
introduce one class in the subprocess module: the subprocess.Popen class. Code 2.85 executes the
ls -la command, as you would type it in bash .

Code 2.85: subprocess.Popen() class usage.

1 import subprocess
2
3 # Valid for UNIX operating systems
4 proc = subprocess.Popen(['ls', '-la'], stdout = subprocess.PIPE, stderr = subprocess.PIPE)
5 out = proc.communicate()
6 print(out)

Output 2.85

(b'total 6400\ndrwxr-xr-x@ 11 kadryjh1724 staff 352 Feb 19 19:56 ...

(output truncated)

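You can execute external commands with the subprocess.Popen class: create a process and "communicate"
with it using the communicate() method. In line 5, communicate() waits for the process to finish and
returns a tuple (stdout, stderr) holding the standard output and standard error as bytes objects; this
tuple is stored in the variable out. Because the command is passed as a list without shell = True,
ls is run directly rather than through a shell.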
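For simple cases, the standard library also offers subprocess.run(), which combines creating the process and communicating with it in a single call. The following sketch is not from the original tutorial: it runs the current Python interpreter (sys.executable) instead of ls so that it also works outside UNIX, and it uses text=True to receive str instead of bytes.

```python
import subprocess
import sys

# Run a child Python process that prints one line; sys.executable is the running interpreter
result = subprocess.run(
    [sys.executable, '-c', 'print("hello from a child process")'],
    capture_output=True,   # collect stdout and stderr instead of printing them
    text=True,             # decode the captured bytes into str
    check=True,            # raise CalledProcessError on a non-zero exit code
)

print(result.returncode)
print(result.stdout.strip())
```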


2.5.2 time and datetime module

The time and datetime modules are standard Python libraries for handling time information and time
calculations. For the official documentation, see https://docs.python.org/3.11/library/time.html and
https://docs.python.org/3.11/library/datetime.html.

time.time() function

The time.time() function returns the time elapsed since the "epoch" in seconds, as a floating-point
number. On most platforms, the epoch is 1970/01/01 00:00:00 UTC.

Code 2.86: time.time() function usage.

import time

now = time.time()
print(now)
print(time.strftime('%Y/%m/%d %H:%M:%S', time.localtime(now)))

Output 2.86

1708780414.4803193
2024/02/24 22:13:34

The time.strftime() function converts time information into a readable string. It takes a format string
and a struct_time object as arguments. Since time.time() returns a plain number of seconds, that number
must first be converted into a struct_time: time.localtime() converts it into your local time (such as
KST in our case), while time.gmtime() converts it into UTC.
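To make the role of the conversion step concrete, here is a short sketch (not from the original tutorial) that formats the same instant in UTC and in local time. It assumes a platform where the epoch is 1970/01/01 00:00:00 UTC, in which case the first print shows exactly that date.

```python
import time

fmt = '%Y/%m/%d %H:%M:%S'

# time.gmtime() interprets the seconds in UTC, so 0 seconds maps to the epoch itself
epoch_utc = time.strftime(fmt, time.gmtime(0))
print(epoch_utc)

# The same instant, expressed in UTC and in the local time zone of this machine
now = time.time()
print('UTC:  ', time.strftime(fmt, time.gmtime(now)))
print('Local:', time.strftime(fmt, time.localtime(now)))
```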

Measuring elapsed time

You can measure elapsed time by executing time.time() before and after your code. However, Python
provides more functionalities, such as time.perf_counter() and time.process_time() .

Code 2.87: Measuring execution time.

start_time = time.perf_counter()
start_proc_time = time.process_time()

sum = 0
for i in range(10000000):
    sum += i
time.sleep(5)

end_time = time.perf_counter()
end_proc_time = time.process_time()

print(f'Elapsed time (real): {end_time - start_time}')
print(f'Elapsed time (cpu): {end_proc_time - start_proc_time}')

Output 2.87

Elapsed time (real): 6.273692643968388

Elapsed time (cpu): 1.275854185


The time.perf_counter() (performance counter) function measures real (wall-clock) elapsed time, while
time.process_time() returns the CPU time consumed by the current process. In Code 2.87, the time
measured by time.perf_counter() includes the sleeping time caused by the time.sleep() function, whereas
the time measured by time.process_time() does not.

datetime.datetime class

The datetime.datetime class provides convenient processing of time information. The current time can
be retrieved with the datetime.now() method. Printing a datetime.datetime object already gives a
readable representation, and the strftime() method converts it into any other format you want.

Code 2.88: The datetime module.

from datetime import datetime # From the module datetime, import datetime.datetime

now = datetime.now()
print(f'now has {type(now)} and its value is {now}')
print(now.strftime('%Y/%m/%d %H:%M:%S'))

Output 2.88

now has <class 'datetime.datetime'> and its value is 2024-02-24 22:13:47.179442

2024/02/24 22:13:47

You can compute a time difference by subtracting two datetime.datetime objects (recall magic methods).
This operation results in a datetime.timedelta object.

Note that a datetime.timedelta object is distinct from a datetime.datetime object: it represents a
duration rather than a point in time, and it has different attributes and methods.

Code 2.89: The datetime.timedelta object.

future = datetime.now()
dt = future - now
print(f'dt has {type(dt)} and its value is {dt}')

Output 2.89

dt has <class 'datetime.timedelta'> and its value is 0:00:02.103660

Code 2.90 shows another example, in which a timedelta object is added to a datetime object.

Code 2.90: Adding a timedelta object to the datetime object.

from datetime import timedelta

now = datetime.now()
future = now + timedelta(days = 30, hours = 5, minutes = 17, seconds = 23)
print(now)
print(future)

Output 2.90

2024-02-24 22:13:55.997377
2024-03-26 03:31:18.997377
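To make the difference between the two types more concrete, here is a small sketch with fixed, made-up dates (so the result is reproducible). It shows the days and seconds attributes and the total_seconds() method of a timedelta object.

```python
from datetime import datetime, timedelta

# Fixed, made-up points in time so the result does not depend on when this runs
start = datetime(2024, 3, 5, 15, 0, 0)
end = datetime(2024, 3, 7, 16, 30, 0)

dt = end - start                       # a timedelta of 2 days, 1 hour, 30 minutes
print(dt)
print(dt.days, dt.seconds)             # whole days, and the remaining seconds
print(dt.total_seconds())              # the full duration in seconds

# A timedelta can be compared with another timedelta
print(dt == timedelta(days=2, hours=1, minutes=30))
```

Note that dt.seconds holds only the part of the duration beyond whole days, so total_seconds() is usually what you want for elapsed-time calculations.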


Parsing and formatting time strings

The datetime.strptime() (string parse time) class method parses a string into a datetime object, while
the strftime() (string format time) method formats a datetime object into a string.

Code 2.91: datetime.strptime() and datetime.strftime() functions.

date_str = '2024-03-05 15:00:00'
first_class = datetime.strptime(date_str, '%Y-%m-%d %H:%M:%S')
print(first_class)
print(first_class.strftime('Year %Y, Month %m, Day %d |%H|:|%M|:|%S|'))

Output 2.91

2024-03-05 15:00:00
Year 2024, Month 03, Day 05 |15|:|00|:|00|

Decorators

A decorator is a function that takes another function as an argument and returns a new function that
wraps it. By decorating a function, new functionality can be added without modifying the function
itself. Decorators can be used in various situations, but at an introductory level, one of the easiest
ways to use them is for measuring time. In Code 2.92, the sum() function computes the sum of the
integers from 0 to N - 1. The time_wrapper() function takes another function as an argument and returns
a wrapper that calls datetime.now() before and after the original function, prints the elapsed time,
and returns the original function's result.

Code 2.92: The concept of decorators.

def time_wrapper(func):

    def wrapper(*args, **kwargs):

        start = datetime.now()
        ret = func(*args, **kwargs)
        end = datetime.now()
        print(f'Time elapsed: {(end - start).total_seconds()} s')
        return ret

    return wrapper

def sum(N):

    ret = 0
    for i in range(N):
        ret += i

    return ret

fn = time_wrapper(sum)

print(fn(1000000))

Output 2.92

Time elapsed: 0.082543 s

499999500000


Python provides a simpler syntax for decorating a function: write @decorator_name on the line before
the function definition. Code 2.93 is equivalent to Code 2.92, except that the name sum itself now
refers to the decorated function.

Code 2.93: Decorator syntax.

@time_wrapper
def sum(N):

    ret = 0
    for i in range(N):
        ret += i

    return ret

print(sum(1000000))

Output 2.93

Time elapsed: 0.084964 s

499999500000
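One detail that Code 2.92 and Code 2.93 leave out: a decorated function loses its own name and docstring, because its name now refers to wrapper. A common remedy is functools.wraps from the standard library. The sketch below is a variant, not the tutorial's own code: it uses time.perf_counter() instead of datetime.now(), and the hypothetical function name sum_below avoids shadowing the built-in sum().

```python
import functools
import time

def time_wrapper(func):
    @functools.wraps(func)             # copy func's name and docstring onto wrapper
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        ret = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f'{func.__name__} took {elapsed:.6f} s')
        return ret
    return wrapper

@time_wrapper
def sum_below(N):
    """Return 0 + 1 + ... + (N - 1)."""
    ret = 0
    for i in range(N):
        ret += i
    return ret

result = sum_below(1000)
print(result)
print(sum_below.__name__)              # the original name, not 'wrapper'
```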

2.5.3 numpy module

numpy is a fundamental package for fast scientific computing in Python. It supports multi-dimensional
arrays, along with a collection of mathematical functions designed to operate efficiently on these arrays.

One of the major drawbacks of pure Python is its speed. Despite its convenience, Python is known to be
slow. The high performance of numpy can be attributed to its C and C++ backends. While basic Python
can be slower due to its "interpreted" nature, numpy's core functionality is primarily implemented in C
and C++, allowing it to execute array operations much faster than equivalent Python code.

However, although numpy is fast, it runs only on the CPU. Many linear algebra operations on large
arrays, which run faster on GPUs, can therefore still be slow. To address this issue, packages like
jax have emerged.

numpy provides many more features than are introduced here. If you require additional features,
please refer to numpy's official documentation at https://numpy.org/doc/stable/.

numpy arrays

Python lists can be converted into one-dimensional numpy arrays with np.array() . You can set the data type with the dtype keyword argument.

Code 2.94: One-dimensional numpy array.

import numpy as np

a = np.array([1, 2, 3], dtype = np.int64)
print(a.shape)

Output 2.94

(3,)

Every numpy array has a shape attribute, a tuple that gives the length of each dimension.


Multidimensional arrays can be indexed and sliced like lists.

Code 2.95: Two-dimensional numpy array.

b = np.array([[4., 5., 6.], [7., 8., 9.]], dtype = np.float32)
print(b[0], b[1])
print(b[1][2])
print(b.shape)

Output 2.95

[4. 5. 6.] [7. 8. 9.]

9.0
(2, 3)

In Code 2.95, an array with shape (2, 3) was introduced. The first dimension, 2, indicates that the
outermost pair of square brackets [] contains two elements. Similarly, the second dimension, 3,
signifies that each inner pair of brackets contains three elements. This rule applies consistently to
higher-dimensional arrays as well.

Code 2.96: Multi-dimensional numpy array.

c = np.array([[[1, -1, 1], [3, 4, 7]],
              [[5, 1, 1], [-2, 0, 3]],
              [[0, -2, -4], [3, 1, 3]]])
print(c.shape)

for i in range(2):

    print(f'{i}:', '-' * 10)
    print('c[i, :, 0]:', c[i, :, 0])
    print('c[:, i:, 0]:', c[:, i:, 0])
    print('c[0, i, :]:', c[0, i, :])
    print('\n')

Output 2.96

(3, 2, 3)
0: ----------
c[i, :, 0]: [1 3]
c[:, i:, 0]: [[ 1  3]
 [ 5 -2]
 [ 0  3]]
c[0, i, :]: [ 1 -1  1]

1: ----------
c[i, :, 0]: [ 5 -2]
c[:, i:, 0]: [[ 3]
 [-2]
 [ 3]]
c[0, i, :]: [3 4 7]
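The second print statement in Code 2.96 uses the slice i: rather than the integer i, which is why its results remain two-dimensional. The following sketch (with made-up values from np.arange, not the tutorial's array) shows the general rule: an integer index removes an axis, while a slice keeps it.

```python
import numpy as np

c = np.arange(18).reshape(3, 2, 3)     # same shape as c in Code 2.96, different values

# An integer index removes that axis; a slice keeps it (here with length 1)
print(c[:, 1, 0].shape)
print(c[:, 1:, 0].shape)

# b[1][2] and b[1, 2] select the same element; the second form indexes in one step
b = np.array([[4., 5., 6.], [7., 8., 9.]])
print(b[1][2] == b[1, 2])
```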


Functions for creating certain shapes of arrays

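np.zeros() and np.ones() create arrays of the given shape filled with zeros and ones, respectively, and np.arange(start, stop, step) creates evenly spaced values from start up to, but not including, stop.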

Code 2.97: np.zeros(), np.ones() and np.arange() functions.

a = np.zeros((3, 2))
b = np.ones(5)
c = np.arange(1, 11, 1)

print(a, b, c)

Output 2.97

[[0. 0.]
[0. 0.]
[0. 0.]] [1. 1. 1. 1. 1.] [ 1 2 3 4 5 6 7 8 9 10]

Reshaping arrays

numpy provides array reshaping functions, which rearrange the elements of an array into a new shape.
For example, in Code 2.98, an array with shape (4, 2, 2) is created using the np.random.randn
function. The np.random module provides various random number generators, and the randn function
generates random numbers following a normal (or Gaussian) distribution with mean 0 and standard
deviation 1.

Code 2.98: Reshaping of arrays.

x = np.random.randn(4, 2, 2)
print(x)
print(x.reshape(8, 2))
# print(x.reshape(-1, 2))

Output 2.98

[[[-0.06248479 -2.79681374]
[ 1.05793301 -0.29597319]]

[[-0.50820033 0.3952909 ]
[ 0.71850602 -1.03609737]]

[[ 0.18801837 0.59544032]
[ 0.85238323 -0.03165555]]

[[-0.69334143 -1.1885556 ]
[-2.82748787 -0.43549003]]]
[[-0.06248479 -2.79681374]
[ 1.05793301 -0.29597319]
[-0.50820033 0.3952909 ]
[ 0.71850602 -1.03609737]
[ 0.18801837 0.59544032]
[ 0.85238323 -0.03165555]
[-0.69334143 -1.1885556 ]
[-2.82748787 -0.43549003]]


Using the reshape function, one can rearrange the elements to match a new shape. It’s important to
ensure that the reshaped dimensions are compatible with the original shape. In some cases, you can use
a wildcard -1 as an input to the reshape function. If -1 is used, numpy automatically determines the
dimension corresponding to -1 based on the other dimensions.
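A short sketch of the -1 wildcard and of an incompatible reshape follows. It uses np.arange instead of random numbers so that the shapes are easy to follow; the array x here is an illustration, not the array from Code 2.98.

```python
import numpy as np

x = np.arange(16).reshape(4, 2, 2)     # 16 elements with deterministic values
print(x.reshape(8, 2).shape)
print(x.reshape(-1, 2).shape)          # numpy infers 16 / 2 = 8 for the -1
print(x.reshape(-1).shape)             # flatten into one dimension

# 16 elements cannot be arranged into rows of 3, so this raises ValueError
try:
    x.reshape(-1, 3)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)
```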

Adding and removing additional dimensions are commonly referred to as unsqueezing and squeezing. In
Code 2.99, we add one additional dimension to the one-dimensional array.

Code 2.99: Expanding dimensions.

y = np.random.randn(10)
print(y)
print(np.expand_dims(y, axis = 0))
print(np.expand_dims(y, axis = 0).shape)

# import torch
# y = torch.randn(10)
# print(y.unsqueeze(0))

Output 2.99

[ 1.74403849 -0.05739269 (...) -1.66095744]

[[ 1.74403849 -0.05739269 (...) -1.66095744]]
(1, 10)

Note that the shape changed from (10,) to (1, 10) when the additional dimension was added. You can
change where the new dimension is inserted using the axis argument. If you set axis=1 , the new
dimension is added as the second axis, resulting in a shape of (10, 1).

Code 2.100: np.squeeze() function.

z = np.random.randn(4, 2, 1)
print(z, '\n')
print(np.squeeze(z), np.squeeze(z).shape)

Output 2.100

[[[-2.41010067]
[ 0.75490811]]

[[-1.62621492]
[ 0.2833764 ]]

[[ 0.11003475]
[ 1.34817212]]

[[ 0.80623919]
[ 0.3242343 ]]]

[[-2.41010067 0.75490811]
[-1.62621492 0.2833764 ]
[ 0.11003475 1.34817212]
[ 0.80623919 0.3242343 ]] (4, 2)

In contrast, np.squeeze() removes dimensions of length 1: here the shape changes from (4, 2, 1) to (4, 2).


Broadcasting

Broadcasting is a powerful tool for performing operations between numpy arrays with different shapes.
Let’s begin with a somewhat trivial example of broadcasting.

❗ Examples and figures from this section are retrieved from:

https://numpy.org/doc/stable/user/basics.broadcasting.html

Code 2.101: Multiplying a scalar and a 1D numpy array.

a = np.array([1.0, 2.0, 3.0])
b = 2.0
print(a * b)

Output 2.101

[2. 4. 6.]

The operands a and b have different shapes: a has shape (3,), while b is a scalar (which behaves like
an array of shape (1,)). Nevertheless, we naturally expect that these two operands can be multiplied.
Conceptually, the multiplication proceeds as follows:

Code 2.102: Multiplying a scalar and a 1D numpy array, detailed process.

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 2.0, 2.0])
print(a * b)

Output 2.102

[2. 4. 6.]

In Code 2.102, the array b with shape (1,) was stretched (or repeated) to match the shape of the array
a. The two arrays can then be multiplied elementwise. In a similar way, operations between
higher-dimensional arrays with different shapes are defined whenever their shapes are compatible.

Figure 2.2: Broadcasting in Code 2.101 and Code 2.102.

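numpy compares the shapes dimension by dimension, starting from the trailing (rightmost) dimension.
Two dimensions are compatible when:

» The two dimensions are the same (trivial).

» One of the dimensions is 1 (stretchable).

If one array has fewer dimensions than the other, its missing leading dimensions are treated as 1.
These rules also apply to the example in Code 2.101 and Code 2.102.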


Examples of compatible and incompatible arrays

» A (shape (5, 4)) and B (shape (1,), or a scalar): A*B (shape (5, 4))

» A (shape (256, 256, 3), a RGB image) and B (shape (3,)): A*B (shape (256, 256, 3))

» A (shape (3,)) and B (shape (4,)), or A (shape (4, 3)) and B (shape (4,)): incompatible

One example of broadcasting

A (shape (4, 3)) is compatible with B (shape (3,)), but not with C (shape (4,)).

Figure 2.3: Compatible arrays.

Figure 2.4: Incompatible arrays.

If we unsqueeze (or expand the dimensions of) C into np.expand_dims(C, axis=1) (shape (4, 1)), then A
and np.expand_dims(C, axis=1) become compatible. Lastly, array a (shape (4, 1)) and array b (shape (3,))
are compatible, and the result has shape (4, 3).

Figure 2.5: Compatible arrays, example 2.
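The shapes discussed above can be checked directly. The sketch below is an illustration with made-up array contents (np.ones and np.arange); only the shapes matter. It also uses np.broadcast_shapes, which reports the resulting shape without building any arrays.

```python
import numpy as np

A = np.ones((4, 3))
B = np.arange(3)                       # shape (3,)
C = np.arange(4)                       # shape (4,)

print((A * B).shape)                   # trailing dimensions 3 and 3 match

try:
    A * C                              # trailing dimensions 3 and 4 do not match
    failed = False
except ValueError:
    failed = True
print(failed)

C_col = np.expand_dims(C, axis=1)      # shape (4, 1)
print((A * C_col).shape)
print((C_col + B).shape)               # (4, 1) combined with (3,)

# Result shape for the RGB image example, computed from the shapes alone
print(np.broadcast_shapes((256, 256, 3), (3,)))
```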


Code 2.103: Incompatible arrays.

a = np.array([[1, 1, 1, 1], [2, 2, 2, 2]]) # (2, 4) array
b = np.array([[1, 1], [2, 2], [3, 3]]) # (3, 2) array

print(a + b)

Output 2.103
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[27], line 4
1 a = np.array([[1, 1, 1, 1], [2, 2, 2, 2]]) # (2, 4) array
2 b = np.array([[1, 1], [2, 2], [3, 3]]) # (3, 2) array
----> 4 print(a + b)

ValueError: operands could not be broadcast together with shapes (2,4) (3,2)

Code 2.104: Compatible arrays.

a = np.array([[1, 1, 1, 1], [2, 2, 2, 2]]) # (2, 4) array
b = np.array([[7], [7]]) # (2, 1) array

print(a + b)
print((a + b).shape)

Output 2.104

[[8 8 8 8]
[9 9 9 9]]
(2, 4)

Vectorizing with numpy

Python lists may contain data of different types. For instance, the built-in sum() function, which
calculates the sum of all elements in a given list, has to check the type of each element every time
it adds the next one.

Meanwhile, numpy arrays are homogeneous (all array elements have the same data type); this allows
numpy to use compiled C code for operations on array elements, which provides a substantial speed-up.
In Code 2.105, we first generate a numpy array with the np.linspace() function, which takes the
starting point, the ending point, and the total number of points (both end points included) as
arguments.

Code 2.105: Vectorized np.sin() function.

1 a = np.linspace(0, np.pi / 2, 100)
2 b = np.sin(a)
3 print(b)
4 print(np.average(np.power(b, 2)))

Output 2.105

[0. 0.01586596 0.03172793 0.04758192 0.06342392 0.07924996

0.09505604 0.1108382 0.12659245 0.14231484 0.1580014 0.17364818
(...)
0.98982144 0.99195481 0.99383846 0.99547192 0.99685478 0.99798668
0.99886734 0.99949654 0.99987413 1. ]
0.5


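The np.sin() function is a vectorized function: it applies the operation to every element of the array
in compiled code, which is typically much faster than a pure Python for loop. Note that line 4 in Code
2.105 numerically evaluates the average of sin^2 x over the interval [0, π/2], that is,

$$\frac{2}{\pi} \int_0^{\pi/2} \sin^2 x \, dx = \frac{1}{2}$$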
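As a hedged check of the statements above, the following sketch compares an element-by-element loop using math.sin with the vectorized np.sin, and then repeats the averaging step of Code 2.105. The variable names b_loop and b_vec are made up for this illustration.

```python
import math
import numpy as np

a = np.linspace(0, np.pi / 2, 100)

# Pure Python: call math.sin once per element
b_loop = np.array([math.sin(x) for x in a])

# Vectorized: one call on the whole array
b_vec = np.sin(a)

print(np.allclose(b_loop, b_vec))      # both approaches give the same values
print(np.average(b_vec ** 2))          # discrete average of sin^2 x on [0, pi/2]
```

Both versions produce the same numbers; the difference lies in where the loop runs (in the Python interpreter or inside numpy's compiled code).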

The np.average() function computes the average along a specified axis of the array; the averaged axis disappears from the shape of the result.

Code 2.106: np.average() function.

x = np.random.randn(4, 3, 2)
print(np.average(x, axis = 0)) # (3, 2) array
print(np.average(x, axis = 1)) # (4, 2) array
print(np.average(x, axis = 2)) # (4, 3) array

Output 2.106

[[-0.50325378 0.73180525]
[ 0.55506121 -0.51120121]
[ 0.09804889 0.40828138]]
[[ 0.21118092 0.59686133]
[ 0.36192477 -0.25026971]
[-0.62335483 -0.48032833]
[ 0.25005756 0.9722506 ]]
[[ 0.10923951 0.54430973 0.55851413]
[ 0.04499412 -0.47082109 0.59330956]
[-0.18002283 -1.25462955 -0.22087235]
[ 0.48289215 1.26886091 0.08170919]]

Linear algebra with numpy

The numpy.linalg module provides various linear algebra functions. Take a look at the documentation
if you are interested. In this tutorial, we introduce just a few useful functions.

Matrix multiplication is done with the @ operator, not with the * operator (which performs elementwise multiplication).

Code 2.107: Matrix multiplication.

A = np.random.randn(4, 3)
x = np.random.randn(3)
b = np.random.randn(4)

print(A @ x + b)

Output 2.107

[ 4.56414939 -0.03387167 0.37951254 -0.56519003]

Determinants, eigenvalues, the singular value decomposition (SVD), and the inverse matrix can be calculated easily!

Code 2.108: Linear algebra operations.

A = np.array([[4, 2], [3, 5]], dtype = np.int32)
print(np.linalg.det(A))
print(np.linalg.eigvals(A))
print(np.linalg.svd(A))
print(np.linalg.inv(A))


Output 2.108

14.000000000000004
[2. 7.]
SVDResult(U=array([[-0.59025263, -0.80721858],
[-0.80721858, 0.59025263]]), S=array([7.07720233, 1.97818281]), Vh=array([[-0.67578487,
-0.73709891], [-0.73709891, 0.67578487]]))
[[ 0.35714286 -0.14285714]
[-0.21428571 0.28571429]]
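A few consistency checks on these results can be sketched as follows. The right-hand side b is a made-up vector, and np.linalg.solve and np.linalg.eig are additional numpy.linalg functions not used in Code 2.108. For solving a linear system A x = b, calling np.linalg.solve is generally preferable to computing the inverse first.

```python
import numpy as np

A = np.array([[4, 2], [3, 5]], dtype=np.float64)   # same matrix as in Code 2.108
b = np.array([2.0, 1.0])                           # made-up right-hand side

# Solve A x = b directly instead of forming the inverse first
x = np.linalg.solve(A, b)
print(x)
print(np.allclose(A @ x, b))

# The inverse satisfies A @ inv(A) = I up to rounding error
print(np.allclose(A @ np.linalg.inv(A), np.eye(2)))

# Each eigenpair satisfies A v = lambda v (eigenvectors are the columns of eigvecs)
eigvals, eigvecs = np.linalg.eig(A)
print(np.allclose(A @ eigvecs, eigvecs * eigvals))
```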

numpy offers far more functionality than is covered here. Do not limit yourself to this tutorial:
explore the official documentation.

2.5.4 numba module

When used effectively, the numba module offers significantly faster computations. For more information,
refer to the official documentation of numba at https://numba.readthedocs.io/en/stable/index.html.

Code 2.109: Numba example.

# !pip install scipy
import time
from numba import jit

def f1():
    sum = 0
    for i in range(100000000):
        sum += i
    return sum

@jit(nopython = True) # Equivalent to @njit
def f2():
    sum = 0
    for i in range(100000000):
        sum += i
    return sum

start = time.time()
f1()
end = time.time()
print('Without numba jit:', end - start)

start = time.time()
f2()
end = time.time()
print('With numba jit (first compile):', end - start)

start = time.time()
f2()
end = time.time()
print('With numba jit:', end - start)


Output 2.109

Without numba jit: 4.764200687408447

With numba jit (first compile): 0.44391965866088867
With numba jit: 6.175041198730469e-05

numba is a just-in-time (JIT) compiler that translates Python functions into optimized machine code.
It provides the @jit decorator, which stands for just-in-time compilation. When writing array-based or
math-heavy code, decorating your function with @jit can significantly enhance performance (compare
the execution times of f1() and f2() ). numba compiles the function upon its first call, so the first
call is slower than later ones (compare the execution times of the first and second calls of f2() ).
The nopython=True argument of the @jit decorator requests nopython mode, in which the whole function
is compiled without falling back to the Python interpreter. This works with integers, floating-point
numbers, strings, numpy arrays, and other data with fixed types, but not with Python-ic objects such
as lists and dictionaries that contain elements with different data types. If your decorated function
needs such objects, you will have to use object mode, which may not yield performance enhancements.
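As a small hedged sketch of the array-based use case mentioned above, the function below loops over a numpy array inside a function decorated with @njit. The function name sum_of_squares is hypothetical, and the try/except fallback is an assumption added here so that the example still runs (as plain Python) where numba is not installed.

```python
import numpy as np

try:
    from numba import njit             # shorthand for jit(nopython=True)
except ImportError:
    def njit(func):                    # fallback: leave the function undecorated
        return func

@njit
def sum_of_squares(values):
    total = 0.0
    for v in values:                   # an explicit loop over a 1D numpy array
        total += v * v
    return total

x = np.arange(10, dtype=np.float64)
print(sum_of_squares(x))               # with numba, the first call triggers compilation
print(float(np.sum(x ** 2)))           # numpy reference value
```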

2.5.5 multiprocessing and joblib module

This subsection is only useful if your machine has multiple processors (or CPU cores). If not, you can
skip it. The multiprocessing module in Python provides a way to run code in multiple processes (not
multiple threads), while the joblib package wraps this kind of process-based parallelism in a more
user-friendly interface. You can find the documentation here:
https://docs.python.org/3.11/library/multiprocessing.html and https://joblib.readthedocs.io/en/stable/.
We will focus on the joblib package here.

If you have more of a computer science background, the Python GIL (Global Interpreter Lock) is an
interesting topic to look up: it explains why CPU-bound Python code is usually parallelized with
processes rather than threads.

The multiprocessing.cpu_count() function returns the number of CPU cores in your machine (on a Linux
machine, cat /proc/cpuinfo gives similar and more detailed information).

Code 2.110: How many CPU cores does your machine have?
import time
import multiprocessing as mp
from joblib import Parallel, delayed

n_cores = mp.cpu_count()
print(n_cores)

Output 2.110

Assume that you have 32 independent tasks (none of them depends on the other 31). Then you do not
need to run them sequentially on a single CPU. If you can use 32 CPUs simultaneously and assign one
task to each, a 32-fold speed-up is theoretically achievable. In Code 2.111, we use 4 CPU cores for
the parallelization.


Code 2.111: Example joblib usage.

import math

def parallel_function(i):
    return math.factorial(int(math.sqrt(i ** 3)))

start = time.time()
for i in range(100, 1000):
    parallel_function(i)
end = time.time()
print('Serial execution:', end - start)

start = time.time()
with Parallel(n_jobs = 4) as parallel:
    parallel(delayed(parallel_function)(i) for i in range(100, 1000))
end = time.time()
print('(Embarrassingly) Parallel execution:', end - start)

Output 2.111

Serial execution: 5.801486968994141

(Embarrassingly) Parallel execution: 1.9781787395477295

If you try to parallelize tasks that are too simple, assigning the tasks and copying the data takes
more time than the computation itself. In that case, using joblib worsens the performance.

Code 2.112: Bad joblib usage.

def joblib_worsens_this(i):
    return np.sqrt(i * i)

start = time.time()
for i in range(1000000):
    joblib_worsens_this(i)
end = time.time()
print('Serial execution:', end - start)

start = time.time()
with Parallel(n_jobs = n_cores) as parallel:
    parallel(delayed(joblib_worsens_this)(i) for i in range(1000000))
end = time.time()
print('(Embarrassingly) Parallel execution:', end - start)

Output 2.112

Serial execution: 1.4704248905181885

(Embarrassingly) Parallel execution: 7.743301868438721
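One common way to reduce this overhead is to give each task more work, for example by letting one task handle a whole range of inputs. The sketch below is an illustration under assumptions, not a benchmark: the function name chunk_sum, the chunk size, and n_jobs=2 are made up, and whether it actually beats the serial loop depends on your machine. It also shows that calling a Parallel object returns the list of results in the same order as the inputs.

```python
import math
from joblib import Parallel, delayed

def chunk_sum(start, stop):
    # One task handles a whole range of cheap operations
    return sum(math.sqrt(i * i) for i in range(start, stop))

n_total = 100000
chunk = 25000                          # made-up chunk size for illustration
bounds = [(s, min(s + chunk, n_total)) for s in range(0, n_total, chunk)]

# The results come back as a list, ordered like the inputs
partial_sums = Parallel(n_jobs=2)(delayed(chunk_sum)(s, e) for s, e in bounds)

total = sum(partial_sums)
print(len(partial_sums), total)
```

Here only four tasks are dispatched instead of 100000, so the cost of assigning tasks and copying data is paid far less often.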

No ratings yet
Red: What You'd Actually Put in Python File: Common Differences Between Python and MATLAB, Ways To Approach Python
19 pages
Advance Python
No ratings yet
Advance Python
202 pages
EE250Unit1_Technologies
No ratings yet
EE250Unit1_Technologies
61 pages
unit V
No ratings yet
unit V
12 pages
Python Class On Files
No ratings yet
Python Class On Files
19 pages
UNIT-03 FILES
No ratings yet
UNIT-03 FILES
17 pages
Python Setup and Usage: Release 3.8.6rc1
No ratings yet
Python Setup and Usage: Release 3.8.6rc1
86 pages
Python Unit 5
No ratings yet
Python Unit 5
7 pages
Python Notes - 4
No ratings yet
Python Notes - 4
23 pages
CP Presentation
No ratings yet
CP Presentation
15 pages
Python History and Versions: o o o o o o o o o o
No ratings yet
Python History and Versions: o o o o o o o o o o
52 pages
FILE 2
No ratings yet
FILE 2
6 pages
Python Notes Unit1
No ratings yet
Python Notes Unit1
62 pages
Casting Workshop
No ratings yet
Casting Workshop
5 pages
Becoming AI Engineer Learning Path
No ratings yet
Becoming AI Engineer Learning Path
4 pages
0231 - PM Internship Report
No ratings yet
0231 - PM Internship Report
38 pages
MCE 4423 Experiment 4 - Efficiency and Performance of A Pelton Turbine
No ratings yet
MCE 4423 Experiment 4 - Efficiency and Performance of A Pelton Turbine
16 pages
Landscape Lecture
No ratings yet
Landscape Lecture
34 pages
Arihant Mathematics Engineering Solved Papers - Watermark
100% (5)
Arihant Mathematics Engineering Solved Papers - Watermark
1,136 pages
Kinematics Clicker Questions PDF
No ratings yet
Kinematics Clicker Questions PDF
49 pages
Samontina JJD Ice03 Act2
No ratings yet
Samontina JJD Ice03 Act2
1 page
Calibration of Sensors
No ratings yet
Calibration of Sensors
5 pages
Citect With SV
No ratings yet
Citect With SV
18 pages
IITD-Courses-of-Study MEM
No ratings yet
IITD-Courses-of-Study MEM
1 page
04 Clutches
No ratings yet
04 Clutches
26 pages
Questions Based On Chemical Reactions: Page 1 of 2
No ratings yet
Questions Based On Chemical Reactions: Page 1 of 2
2 pages
Cursor
No ratings yet
Cursor
18 pages
Biological Microscopes 2005 - 2
No ratings yet
Biological Microscopes 2005 - 2
24 pages
CBSE Sample Paper 2023 Class 12 Physics Marking Scheme
No ratings yet
CBSE Sample Paper 2023 Class 12 Physics Marking Scheme
9 pages
At 1314 Sample Paper IX - Going To X - IQ+S&M - AIEEE
No ratings yet
At 1314 Sample Paper IX - Going To X - IQ+S&M - AIEEE
15 pages
802D Opm
No ratings yet
802D Opm
354 pages
Poster Presentation
No ratings yet
Poster Presentation
1 page
Basic Concepts of Thermodynamics
100% (1)
Basic Concepts of Thermodynamics
19 pages
da ds notes
No ratings yet
da ds notes
27 pages
PPS Reexam Synoptic answer 24-25
No ratings yet
PPS Reexam Synoptic answer 24-25
18 pages
BT Graphite 2100 User Guide
No ratings yet
BT Graphite 2100 User Guide
39 pages
Categorizing Traditional Chinese Painting Images: Lecture Notes in Computer Science October 2004
No ratings yet
Categorizing Traditional Chinese Painting Images: Lecture Notes in Computer Science October 2004
9 pages
Answer Key Elecs Superbook
100% (1)
Answer Key Elecs Superbook
46 pages
Power System Analysis: Dr. M. Varadarajan Eee, Sce
No ratings yet
Power System Analysis: Dr. M. Varadarajan Eee, Sce
21 pages
Datasheet Din 7991
No ratings yet
Datasheet Din 7991
6 pages
Lab-02 Declarations and Initialization of Data Variables, Data Types, Escape Sequence
No ratings yet
Lab-02 Declarations and Initialization of Data Variables, Data Types, Escape Sequence
4 pages
GEAR SHIFTING FINAL REPORT - pdf12
No ratings yet
GEAR SHIFTING FINAL REPORT - pdf12
27 pages
EIL Document On Motor, Panel
100% (1)
EIL Document On Motor, Panel
62 pages