Pycse
Engineering
John Kitchin
jkitchin@andrew.cmu.edu
http://kitchingroup.cheme.cmu.edu
Twitter: @johnkitchin
2015-04-25
Contents
1 Overview
2.9.3 struct
2.9.4 dictionaries
2.9.5 Summary
2.10 Indexing vectors and arrays in Python
2.10.1 2d arrays
2.10.2 Using indexing to assign values to rows and columns
2.10.3 3D arrays
2.10.4 Summary
2.11 Controlling the format of printed variables
2.12 Advanced string formatting
3 Math
3.1 Numeric derivatives by differences
3.2 Vectorized numeric derivatives
3.3 2-point vs. 4-point numerical derivatives
3.4 Derivatives by polynomial fitting
3.5 Derivatives by fitting a function and taking the analytical derivative
3.6 Derivatives by FFT
3.7 A novel way to numerically estimate the derivative of a function - complex-step derivative approximation
3.8 Vectorized piecewise functions
3.9 Smooth transitions between discontinuous functions
3.9.1 Summary
3.10 Smooth transitions between two constants
3.11 On the quad or trapz'd in ChemE heaven
3.11.1 Numerical data integration
3.11.2 Combining numerical data with quad
3.11.3 Summary
3.12 Polynomials in python
3.12.1 Summary
3.13 Wilkinson's polynomial
3.14 The trapezoidal method of integration
3.15 Numerical Simpson's rule
3.16 Integrating functions in python
3.16.1 double integrals
3.16.2 Summary
3.17 Integrating equations in python
3.18 Function integration by the Romberg method
3.19 Symbolic math in python
3.19.1 Solve the quadratic equation
3.19.2 differentiation
3.19.3 integration
3.19.4 Analytically solve a simple ODE
3.20 Is your ice cream float bigger than mine
4 Linear algebra
4.1 Potential gotchas in linear algebra in numpy
4.2 Solving linear equations
4.3 Rules for transposition
4.3.1 The transpose in Python
4.3.2 Rule 1
4.3.3 Rule 2
4.3.4 Rule 3
4.3.5 Rule 4
4.3.6 Summary
4.4 Sums products and linear algebra notation - avoiding loops where possible
4.4.1 Old-fashioned way with a loop
4.4.2 The numpy approach
4.4.3 Matrix algebra approach.
4.4.4 Another example
4.4.5 Last example
4.4.6 Summary
4.5 Determining linear independence of a set of vectors
4.5.1 another example
4.5.2 Near deficient rank
4.5.3 Application to independent chemical reactions.
4.6 Reduced row echelon form
4.7 Computing determinants from matrix decompositions
4.8 Calling lapack directly from scipy
5 Nonlinear algebra
5.1 Know your tolerance
5.2 Solving integral equations with fsolve
5.2.1 Summary notes
5.3 Method of continuity for nonlinear equation solving
5.4 Method of continuity for solving nonlinear equations - Part II
5.5 Counting roots
5.5.1 Use roots for this polynomial
5.5.2 method 1
5.5.3 Method 2
5.6 Finding the nth root of a periodic function
5.7 Coupled nonlinear equations
6 Statistics
6.1 Introduction to statistical data analysis
6.2 Basic statistics
6.3 Confidence interval on an average
6.4 Are averages different
6.4.1 The hypothesis
6.4.2 Compute the t-score for our data
6.4.3 Interpretation
6.5 Model selection
6.6 Numerical propagation of errors
6.6.1 Addition and subtraction
6.6.2 Multiplication
6.6.3 Division
6.6.4 exponents
6.6.5 the chain rule in error propagation
6.6.6 Summary
6.7 Another approach to error propagation
6.7.1 Summary
6.8 Random thoughts
6.8.1 Summary
7.12 Reading in delimited text files
8 Interpolation
8.1 Better interpolate than never
8.1.1 Estimate the value of f at t=2.
8.1.2 improved interpolation?
8.1.3 The inverse question
8.1.4 A harder problem
8.1.5 Discussion
8.2 Interpolation of data
8.3 Interpolation with splines
9 Optimization
9.1 Constrained optimization
9.2 Finding the maximum power of a photovoltaic device.
9.3 Using Lagrange multipliers in optimization
9.3.1 Construct the Lagrange multiplier augmented function
9.3.2 Finding the partial derivatives
9.3.3 Now we solve for the zeros in the partial derivatives
9.3.4 Summary
9.4 Linear programming example with inequality constraints
9.5 Find the minimum distance from a point to a curve.
10.1.15 Solving Bessel's Equation numerically
10.1.16 Phase portraits of a system of ODEs
10.1.17 Linear algebra approaches to solving systems of constant coefficient ODEs
10.2 Delay Differential Equations
10.3 Differential algebraic systems of equations
10.4 Boundary value equations
10.4.1 Plane Poiseuille flow - BVP solve by shooting method
10.4.2 Plane Poiseuille flow solved by finite difference
10.4.3 Boundary value problem in heat conduction
10.4.4 BVP in pycse
10.4.5 A nonlinear BVP
10.4.6 Another look at nonlinear BVPs
10.4.7 Solving the Blasius equation
10.5 Partial differential equations
10.5.1 Modeling a transient plug flow reactor
10.5.2 Transient heat conduction - partial differential equations
10.5.3 Transient diffusion - partial differential equations
11 Plotting
11.1 Plot customizations - Modifying line, text and figure properties
11.1.1 setting all the text properties in a figure.
11.2 Plotting two datasets with very different scales
11.2.1 Make two plots!
11.2.2 Scaling the results
11.2.3 Double-y axis plot
11.2.4 Subplots
11.3 Customizing plots after the fact
11.4 Fancy, built-in colors in Python
11.5 Picasso's short lived blue period with Python
11.6 Interactive plotting
11.6.1 Basic mouse clicks
11.7 key events not working on Mac/org-mode
11.7.1 Mouse movement
11.7.2 key press events
11.7.3 Picking lines
11.7.4 Picking data points
11.8 Peak annotation in matplotlib
12 Programming
12.1 Some of this, sum of that
12.1.1 Nested lists
12.2 Sorting in python
12.3 Unique entries in a vector
12.4 Lather, rinse and repeat
12.4.1 Conclusions
12.5 Brief intro to regular expressions
12.6 Working with lists
12.7 Making word files in python
12.8 Interacting with Excel in python
12.8.1 Writing Excel workbooks
12.8.2 Updating an existing Excel workbook
12.8.3 Summary
12.9 Using Excel in Python
12.10 Running Aspen via Python
12.11 Using an external solver with Aspen
12.12 Redirecting the print function
12.13 Getting a dictionary of counts
12.14 About your python
12.15 Automatic, temporary directory changing
13 Miscellaneous
13.1 Mail merge with python
14.5.4 Summary
14.6 The Gibbs free energy of a reacting mixture and the equilibrium composition
14.6.1 Summary
14.7 Water gas shift equilibria via the NIST Webbook
14.7.1 hydrogen
14.7.2 H_{2}O
14.7.3 CO
14.7.4 CO_{2}
14.7.5 Standard state heat of reaction
14.7.6 Non-standard state ΔH and ΔG
14.7.7 Plot how the ΔG varies with temperature
14.7.8 Equilibrium constant calculation
14.7.9 Equilibrium yield of WGS
14.7.10 Compute gas phase pressures of each species
14.7.11 Compare the equilibrium constants
14.7.12 Summary
14.8 Constrained minimization to find equilibrium compositions
14.8.1 summary
14.9 Using constrained optimization to find the amount of each phase present
14.10 Conservation of mass in chemical reactions
14.11 Numerically calculating an effectiveness factor for a porous catalyst bead
14.12 Computing a pipe diameter
14.13 Reading parameter database text files in python
14.14 Calculating a bubble point pressure of a mixture
14.15 The equal area method for the van der Waals equation
14.15.1 Compute areas
14.16 Time dependent concentration in a first order reversible reaction in a batch reactor
14.17 Finding equilibrium conversion
14.18 Integrating a batch reactor design equation
14.19 Uncertainty in an integral equation
14.20 Integrating the batch reactor mole balance
14.21 Plug flow reactor with a pressure drop
14.22 Solving CSTR design equations
14.23 Meet the steam tables
14.23.1 Starting point in the Rankine cycle in condenser.
14.23.2 Isentropic compression of liquid to point 2
14.23.3 Isobaric heating to T3 in boiler where we make steam
14.23.4 Isentropic expansion through turbine to point 4
14.23.5 To get from point 4 to point 1
14.23.6 Efficiency
14.23.7 Entropy-temperature chart
14.23.8 Summary
14.24 What region is a point in
15 Units
15.1 Using units in python
15.1.1 scimath
15.2 Handling units with the quantities module
15.3 Units in ODEs
15.4 Handling units with dimensionless equations
17 References
18 Index
1 Overview
This is a collection of examples of using python in the kinds of scientific
and engineering computations I have used in classes and research. They are
organized by topics.
I recommend the Continuum IO Anaconda python distribution (https://www.continuum.io). This distribution is free for academic use, and cheap
otherwise. It is pretty complete in terms of mathematical, scientific and
plotting modules. All of the examples in this book were created and run with
the Anaconda python distribution.
A float number has a decimal in it. The following are all floats: 1.0, -9., and
3.56. Note the trailing zero is not required, although it is good style.
print(2 + 4)
print(8.1 - 5)

6
3.0999999999999996

print(5 * 4)
print(3.1 * 2)

20
6.2

print(4.0 / 2.0)
print(1.0/3.1)

2.0
0.3225806451612903

print(4 / 2)
print(1/3)

2.0
0.3333333333333333
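The first result is probably what you expected, but the second may come
as a surprise if you are used to Python 2. In Python 3 the / operator always
performs true division and returns a float, even when both operands are
integers. Integer (floor) division, in which the remainder is discarded and
the result is an integer, is written with the // operator.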
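As a brief illustration of the difference between the two operators (a minimal sketch, assuming Python 3; the variable names are only for illustration):

```python
# True division always returns a float in Python 3
true_quotient = 1 / 3

# Floor division discards the remainder; with two ints the result is an int
floor_quotient = 7 // 2

# The remainder itself is available from the % operator
remainder = 7 % 2

print(true_quotient)
print(floor_quotient, remainder)
```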
Exponentiation is also a basic math operation that python supports directly.

print(3.**2)
print(3**2)
print(2**0.5)

9.0
9
1.4142135623730951
import numpy as np
print(np.sqrt(2))

1.41421356237

import numpy as np
print(np.exp(1))

2.71828182846
There are two logarithmic functions commonly used, the natural log
function numpy.log and the base10 logarithm numpy.log10.
import numpy as np
print(np.log(10))
print(np.log10(10)) # base10

2.30258509299
1.0
There are many other intrinsic functions available in numpy which we
will eventually cover. First, we need to consider how to create our own
functions.
import numpy as np
x = 3
print(x**3 - np.log(x))

25.9013877113
It would be tedious to type this out each time. Next, we learn how to
express this equation as a new function, which we can call with different
values.
import numpy as np
def f(x):
    return x**3 - np.log(x)

print(f(3))
print(f(5.1))

25.9013877113
131.02175946
It may not seem like we did much there, but this is the foundation for
solving equations in the future. Before we get to solving equations, we have
a few more details to consider. Next, we consider evaluating functions on
arrays of values.
def f(x):
    "return the inverse square of x"
    return 1.0 / x**2

print(f(3))
print(f([4,5]))
Note that functions are not automatically vectorized. The second call above
fails with a TypeError because a Python list does not support the **
operator. There are a few ways to make a function work on sequences. One is
to "cast" the input variables to objects that support vectorized operations,
such as numpy.array objects.
import numpy as np

def f(x):
    "return the inverse square of x"
    x = np.array(x)
    return 1.0 / x**2

print(f(3))
print(f([4,5]))

0.111111111111
[ 0.0625 0.04 ]

import numpy as np

def func(x, y):
    "return product of x and y"
    return x * y

print(func(2, 3))
print(func(np.array([2, 3]), np.array([3, 4])))

6
[ 6 12]
You can define "lambda" functions, which are also known as inline or
anonymous functions. The syntax is lambda var:f(var). I think these
are hard to read and discourage their use. Here is a typical usage where
you have to define a simple function that is passed to another function, e.g.
scipy.integrate.quad to perform an integral.
(4.0, 4.440892098500626e-14)
def wrapper(x):
    a = 4
    def func(x, a):
        return a * x

    return func(x, a)

print(wrapper(4))
16
16
2.5 Advanced function creation
Python has some nice features in creating functions. You can create default
values for variables, have optional variables and optional keyword variables.
In this function f(a,b), a and b are called positional arguments, and they are
required, and must be provided in the same order as the function defines.
If we provide a default value for an argument, then the argument is called
a keyword argument, and it becomes optional. You can combine positional
arguments and keyword arguments, but positional arguments must come
first. Here is an example.
4
8
16
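A minimal sketch consistent with the three values printed above, assuming a function named func with one positional argument a and one keyword argument n (the names follow the surrounding discussion; the body is an assumption):

```python
def func(a, n=2):
    "compute the nth power of a"
    return a**n

# three different ways to call the function
print(func(2))        # only the positional argument; n takes its default
print(func(2, 3))     # both arguments given positionally
print(func(2, n=4))   # n given as a keyword argument
```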
In the first call to the function, we only define the argument a, which
is a mandatory, positional argument. In the second call, we define a and n,
in the order they are defined in the function. Finally, in the third call, we
define a as a positional argument, and n as a keyword argument.
If all of the arguments are optional, we can even call the function with no
arguments. If you give arguments as positional arguments, they are used in
the order defined in the function. If you use keyword arguments, the order
is arbitrary.
1
16
16
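A sketch consistent with the values printed above, assuming both arguments of a hypothetical func are given default values (a=1 and n=2 are assumptions):

```python
def func(a=1, n=2):
    "compute the nth power of a"
    return a**n

print(func())            # no arguments: both defaults are used
print(func(2, 4))        # positional: a=2, n=4
print(func(n=4, a=2))    # keywords: the order does not matter
```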
def func(*args):
    total = 0
    for arg in args:
        total += arg
    return total

print(func(1, 2, 3, 4))

10

import functools, operator
def func(*args):
    return functools.reduce(operator.add, args)
print(func(1, 2, 3, 4))

10

def func(**kwargs):
    for kw in kwargs:
        print('{0} = {1}'.format(kw, kwargs[kw]))

func(t1=6, color='blue')

t1 = 6
color = blue
In that example we wrap the matplotlib plotting commands in a function,
which we can call the way we want to, with arbitrary optional arguments.
In this example, you cannot pass keyword arguments that are illegal to the
plot command or you will get an error.
It is possible to combine all the options at once. I admit it is hard to
imagine where this would be really useful, but it can be done!
import numpy as np

def func(a, b=2, *args, **kwargs):
    "return a**b + sum(args) and print kwargs"
    for kw in kwargs:
        print('kw: {0} = {1}'.format(kw, kwargs[kw]))

    return a**b + np.sum(args)

print(func(2, 3, 4, 5, mysillykw='hahah'))
and it is inconvenient to have to use def to create a named function. Lambda
functions solve this problem. Let us look at some examples. First, we create
a lambda function, and assign it to a variable. Then we show that variable
is a function, and that we can call it with an argument.
f = lambda x: 2*x
print(f)
print(f(2))

f = lambda x,y: x + y
print(f)
print(f(2, 3))

f = lambda x, y=3: x + y
print(f)
print(f(2))
print(f(4, 1))
<function <lambda> at 0x10077f378>
1
3
6
You can also make arbitrary keyword arguments. Here we make a function
that simply returns the kwargs as a dictionary. This feature may be
helpful in passing kwargs to other functions.
{'b': 3, 'a': 1}
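A minimal sketch of such a function (the name f and the argument values are assumptions chosen to match the dictionary printed above; on Python 3.7 and later the keys print in the order they were passed):

```python
# a lambda that collects arbitrary keyword arguments into a dictionary
f = lambda **kwargs: kwargs

result = f(a=1, b=3)
print(result)
```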
Of course, you can combine these options. Here is a function with all
the options.
6.25
Another time to use lambda functions is if you want to set a particular
value of a parameter in a function. Say we have a function with an independent
variable, x, and a parameter a, i.e. f(x; a). If we want to find a
solution f(x; a) = 0 for some value of a, we can use a lambda function to
make a function of the single variable x. Here is an example.
1.5625
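The idea can be sketched without any external solver. In the sketch below, the function f(x; a) = a sqrt(x) - 4 and the value a = 3.2 are assumptions, chosen only because they have a root at 1.5625; the bisect helper is a hypothetical stand-in for a library root finder such as scipy.optimize.fsolve.

```python
import math

def func(x, a):
    "f(x; a) = a * sqrt(x) - 4"
    return a * math.sqrt(x) - 4.0

def bisect(g, lo, hi, tol=1e-12):
    "find a root of g in [lo, hi] by bisection; g(lo) and g(hi) must differ in sign"
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_lo * g_mid <= 0:
            hi = mid                  # the root lies in the lower half
        else:
            lo, g_lo = mid, g_mid     # the root lies in the upper half
    return 0.5 * (lo + hi)

# the lambda fixes a = 3.2, leaving a function of x only
sol = bisect(lambda x: func(x, 3.2), 0.0, 3.0)
print(sol)
```

The solver only ever sees a function of one variable; the parameter is captured by the lambda.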
Any function that takes a function as an argument can use lambda functions.
Here we use a lambda function that adds two numbers in the reduce
function to sum a list of numbers.

import functools as ft
print(ft.reduce(lambda x, y: x + y, [0, 1, 2, 3, 4]))

10

We can evaluate the integral $\int_0^2 x^2 dx$ with a lambda function.
(2.666666666666667, 2.960594732333751e-14)
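The tuple above has the (value, error estimate) form returned by scipy.integrate.quad. As a dependency-free sketch of the same idea, the hypothetical simpson helper below accepts any function, including a lambda, and applies the composite Simpson's rule. It is a stand-in for illustration, not the routine that produced the output above.

```python
def simpson(f, a, b, n=100):
    "composite Simpson's rule for the integral of f from a to b with n (even) intervals"
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # odd interior points get weight 4, even interior points get weight 2
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# integrate x**2 from 0 to 2; the exact answer is 8/3
integral = simpson(lambda x: x**2, 0, 2)
print(integral)
```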
2.6.2 Summary
Lambda functions can be helpful. They are never necessary. You can always
define a function using def, but for some small, single-use functions,
a lambda function could make sense. Lambda functions have some limitations,
including that they are limited to a single expression, and they lack
documentation strings.
2.7 Creating arrays in python
Often, we will have a set of 1-D arrays, and we would like to construct a 2D
array with those vectors as either the rows or columns of the array. This may
happen because we have data from different sources we want to combine, or
because we organize the code with variables that are easy to read, and then
want to combine the variables. Here are examples of doing that to get the
vectors as the columns.
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(np.column_stack([a, b]))

# this means stack the arrays vertically, e.g. on top of each other
print(np.vstack([a, b]).T)

[[1 4]
 [2 5]
 [3 6]]
[[1 4]
 [2 5]
 [3 6]]
Or rows:
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# row_stack is an alias for vstack
print(np.row_stack([a, b]))

# this means stack the arrays vertically, e.g. on top of each other
print(np.vstack([a, b]))

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
from a calculation for further analysis, or plotting for example. There are
splitting functions in numpy. They are somewhat confusing, so we examine
some examples. The numpy.hsplit command splits an array "horizontally".
The best way to think about it is that the "splits" move horizontally across
the array. In other words, you draw a vertical split, move over horizontally,
draw another vertical split, etc. You must specify the number of splits
that you want, and the array must be evenly divisible by the number of
splits.
import numpy as np

A = np.array([[1, 2, 3, 5],
              [4, 5, 6, 9]])

# split into two parts
p1, p2 = np.hsplit(A, 2)
print(p1)
print(p2)

# split into 4 parts
p1, p2, p3, p4 = np.hsplit(A, 4)
print(p1)
print(p2)
print(p3)
print(p4)

[[1 2]
 [4 5]]
[[3 5]
 [6 9]]
[[1]
 [4]]
[[2]
 [5]]
[[3]
 [6]]
[[5]
 [9]]
import numpy as np

A = np.array([[1, 2, 3, 5],
              [4, 5, 6, 9]])

# split into two parts
p1, p2 = np.vsplit(A, 2)
print(p1)
print(p2)
print(p2.shape)

[[1 2 3 5]]
[[4 5 6 9]]
(1, 4)

import numpy as np

A = np.array([[1, 2, 3, 5],
              [4, 5, 6, 9]])

# split into two parts
p1, p2 = A
print(p1)
print(p2)

[1 2 3 5]
[4 5 6 9]

import numpy as np

A = np.array([[1, 2, 3, 5],
              [4, 5, 6, 9]])

# split into four parts, one per column
p1, p2, p3, p4 = A.T
print(p1)
print(p2)
print(p3)
print(p4)
print(p4.shape)

[1 4]
[2 5]
[3 6]
[5 9]
(2,)
Note that now, we have 1D arrays.
You can also access rows and columns by indexing. We index an array
by [row, column]. To get a row, we specify the row number, and all the
columns in that row like this [row, :]. Similarly, to get a column, we specify
that we want all rows in that column like this: [:, column]. This approach
is useful when you only want a few columns or rows.
import numpy as np

A = np.array([[1, 2, 3, 5],
              [4, 5, 6, 9]])

# get row 1
print(A[1])
print(A[1, :]) # row 1, all columns

print(A[:, 2]) # get third column
print(A[:, 2].shape)

[4 5 6 9]
[4 5 6 9]
[3 6]
(2,)
import numpy as np
print(np.linspace(0, np.pi, 10))

The main point of using the numpy functions is that they work element-wise
on elements of an array. In this example, we compute the cos(x) for
each element of x.

import numpy as np
x = np.linspace(0, np.pi, 10)
print(np.cos(x))
You can already see from this output that there is a root to the equation
cos(x) = 0, because there is a change in sign in the output. This is not a
very convenient way to view the results; a graph would be better. We use
matplotlib to make figures. Here is an example.
This figure illustrates graphically what the numbers above show. The
function crosses zero at approximately x = 1.5. To get a more precise
value, we must actually solve the equation numerically. We use the function
scipy.optimize.fsolve to do that. More precisely, we want to solve the
equation f(x) = cos(x) = 0. We create a function that defines that equation,
and then use scipy.optimize.fsolve to solve it.
1.57079632679
1.5707963267948966
benzene
[-16, 104]
benzene 6.9056
[-16, 104]
Lists are "mutable", which means you can change their values.
3 cat
[3, 4, 5, [7, 8], 'dog']
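A short sketch of what mutability means in practice (the list contents are assumptions chosen to match the output above):

```python
a = [3, 4, 5, [7, 8], 'cat']
print(a[0], a[-1])   # first and last elements

a[-1] = 'dog'        # lists are mutable: replace the last element in place
print(a)
```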
2.9.2 tuples
Tuples are immutable; you cannot change their values. This is handy in
cases where it is an error to change the value. A tuple is like a list but it is
enclosed in parentheses.
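To illustrate (a minimal sketch; the tuple contents are hypothetical), indexing a tuple works as it does for a list, but item assignment raises a TypeError:

```python
t = (3, 4, 5, [7, 8], 'cat')
print(t[0], t[-1])   # indexing works the same way as for a list

try:
    t[-1] = 'dog'    # tuples are immutable, so item assignment fails
    assignment_failed = False
except TypeError as err:
    assignment_failed = True
    print('TypeError:', err)
```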
2.9.3 struct
Python does not exactly have the same thing as a struct in Matlab. You
can achieve something like it by defining an empty class and then defining
attributes of the class. You can check if an object has a particular attribute
using hasattr.
class Antoine:
    pass

a = Antoine()
a.name = 'benzene'
a.Trange = [-16, 104]

print(a.name)
print(hasattr(a, 'Trange'))
print(hasattr(a, 'A'))

benzene
True
False
2.9.4 dictionaries
The analog of the containers.Map in Matlab is the dictionary in python.
Dictionaries are enclosed in curly brackets, and are composed of key:value
pairs.
s = {'name':'benzene',
     'A':6.9056,
     'B':1211.0}

s['C'] = 220.79
s['Trange'] = [-16, 104]

print(s)
print(s['Trange'])

s = {'name':'benzene',
     'A':6.9056,
     'B':1211.0}

print('C' in s)
# default value for keys not in the dictionary
print(s.get('C', None))

print(s.keys())
print(s.values())

False
None
dict_keys(['B', 'name', 'A'])
dict_values([1211.0, 'benzene', 6.9056])
2.9.5 Summary
We have examined four data structures in python. Note that none of these
types are arrays/vectors with defined mathematical operations. For those,
you need to consider numpy.array.
example, maybe you want to plot column 1 vs column 2, or you want the
integral of data between x = 4 and x = 6, but your vector covers 0 < x <
10. Indexing is the way to do these things.
A key point to remember is that in python array/vector indices start at
0. Unlike Matlab, which uses parentheses to index an array, we use brackets
in python.

import numpy as np

x = np.linspace(-np.pi, np.pi, 10)
print(x)

print(x[0]) # first element
print(x[2]) # third element
print(x[-1]) # last element
print(x[-2]) # second to last element

We can select a range of elements too. The syntax a:b extracts the elements
from index a up to index b-1. The syntax a:b:n starts at index a and takes
every nth element, stopping before index b.
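The slice rules can be checked on a small sequence. This sketch uses a plain Python list (an assumption made to keep it dependency-free); numpy arrays follow the same a:b and a:b:n rules.

```python
x = list(range(10))   # [0, 1, ..., 9], so each value equals its index

first_part = x[2:5]   # indices 2, 3 and 4; index 5 is excluded
strided = x[1:8:3]    # start at index 1, take every 3rd element, stop before 8
reversed_x = x[::-1]  # a negative step walks backwards

print(first_part)
print(strided)
print(reversed_x)
```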
Suppose we want the part of the vector where x > 2. We could do that
by inspection, but there is a better way. We can create a mask of boolean
(True or False) values that specify whether x > 2 or not, and then use the
mask as an index.

print(x[x > 2])

[ 2.44346095 3.14159265]
You can use this to analyze subsections of data, for example to integrate
the function y = sin(x) where x > 2.
y = np.sin(x)

# np.trapz expects the integrand first, then the sample points
print(np.round(np.trapz(y[x > 2], x[x > 2]), 4))

0.2244
2.10.1 2d arrays
In 2d arrays, we use row, column notation. We use a : to indicate all rows
or all columns.
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

print(a[0, 0])
print(a[-1, -1])

print(a[0, :])  # row one
print(a[:, 0])  # column one
print(a[:])

1
9
[1 2 3]
[1 4 7]
[[1 2 3]
 [4 5 6]
 [7 8 9]]
2.10.2 Using indexing to assign values to rows and columns
b = np.zeros((3, 3))
print(b)

b[:, 0] = [1, 2, 3] # set column 0
b[2, 2] = 12 # set a single element
print(b)

b[2] = 6 # sets everything in row 2 to 6!
print(b)

[[ 0. 0. 0.]
 [ 0. 0. 0.]
 [ 0. 0. 0.]]
[[ 1. 0. 0.]
 [ 2. 0. 0.]
 [ 3. 0. 12.]]
[[ 1. 0. 0.]
 [ 2. 0. 0.]
 [ 6. 6. 6.]]
Python does not have the linear assignment method like Matlab does.
You can achieve something like that as follows. We flatten the array to 1D,
do the linear assignment, and reshape the result back to the 2D array.
c = b.flatten()
c[2] = 34
b[:] = c.reshape(b.shape)
print(b)

[[ 1. 0. 34.]
 [ 2. 0. 0.]
 [ 6. 6. 6.]]
2.10.3 3D arrays
A 3D array is like a book of 2D matrices. Each page has a 2D matrix on it.
Think about the indexing like this: (row, column, page).
[[[ 0.17900461 0.24477532 0.75963967]
[ 0.5595659 0.43535773 0.88449451]
[ 0.8169282 0.67361582 0.31123476]]
2.10.4 Summary
The most common place to use indexing is probably when a function returns
an array with the independent variable in column 1 and solution in column
2, and you want to plot the solution. Second is when you want to analyze
one part of the solution. There are also applications in numerical methods,
for example in assigning values to the elements of a matrix or vector.
a = 2./3
print(a)
print(1/3)
print(1./3.)
print(10.1)
print("Avogadro's number is ", 6.022e23, '.')

0.6666666666666666
0.3333333333333333
0.3333333333333333
10.1
Avogadro's number is  6.022e+23 .
In that example, the 0 in {0:1.3f} refers to the first (and only) argument
to the format function. If there is more than one argument, we can refer to
them like this:
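The listing that belonged here is missing from this copy. A minimal sketch of the idea, with an illustrative string that is not from the original text, might look like this:

```python
# Positional arguments are numbered from 0 in the order they are
# passed to format().
s = 'The {0} is {1:1.3f}; again, the {0} is {1:1.1e}'.format('answer', 1. / 3.)
print(s)
```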
Note you can refer to the same argument more than once, and in arbi-
trary order within the string.
Suppose you have a list of numbers you want to print out, like this:
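The listing for this step is also missing here. The sketch below assumes the list holds 1/3, 1/6 and 1/9, which is consistent with the three lines of output that follow.

```python
numbers = [1. / 3., 1. / 6., 1. / 9.]

# format each number with two decimal places
lines = ['The answer is {0:1.2f}'.format(x) for x in numbers]
for line in lines:
    print(line)
```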
The answer is 0.33
The answer is 0.17
The answer is 0.11
The "g" format specifier is a general format that can be used to indicate a
precision, or to indicate significant digits. To print a number with a specific
number of significant digits we do this:
print('{0:1.3g}'.format(1./3.))
print('{0:1.3g}'.format(4./3.))
0.333
1.33
We can also specify plus or minus signs. Compare the next two outputs.
-1.00
1.00
You can see the decimals do not align. That is because there is a minus
sign in front of one number. We can specify to show the sign for positive
and negative numbers, or to pad positive numbers with a leading space so
they line up with negative numbers.
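The listings for the sign options did not survive in this copy. A minimal sketch of the three variants, using the standard format specification mini-language:

```python
# default: only negative numbers get a sign, so decimals do not align
default = ['{0:1.2f}'.format(v) for v in (-1.0, 1.0)]

# '+' shows the sign for positive and negative numbers
signed = ['{0:+1.2f}'.format(v) for v in (-1.0, 1.0)]

# ' ' pads positive numbers with a leading space
padded = ['{0: 1.2f}'.format(v) for v in (-1.0, 1.0)]

for group in (default, signed, padded):
    for s in group:
        print(s)
```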
-1.00
+1.00
-1.00
1.00
import numpy as np
eps = np.finfo(np.double).eps
print(eps)
print('{0}'.format(eps))
print('{0:1.2f}'.format(eps))
print('{0:1.2e}'.format(eps))  # exponential notation
print('{0:1.2E}'.format(eps))  # exponential notation with capital E
2.22044604925e-16
2.220446049250313e-16
0.00
2.22e-16
2.22E-16
There are many other options for formatting strings. See
http://docs.python.org/2/library/string.html#formatstrings for a full
specification of the options.
speed = 'slow'
color = 'blue'

print('The {speed} {color} fox'.format(**locals()))
class A:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

mya = A(3, 4, 5)

print('a = {obj.a}, b = {obj.b}, c = {obj.c:1.2f}'.format(obj=mya))
a = 3, b = 4, c = 5.00
And, you can access elements of a list. Note, however you cannot use -1
as an index in this case.
L = [4, 5, 'cat']

print('element 0 = {obj[0]}, and the last element is {obj[2]}'.format(obj=L))
class A:
    def __init__(self, a, b):
        self.a = a; self.b = b

    def __format__(self, format):
        s = 'a={{0:{0}}} b={{1:{0}}}'.format(format)
        return s.format(self.a, self.b)

    def __str__(self):
        return 'str: class A, a={0} b={1}'.format(self.a, self.b)

    def __repr__(self):
        return 'representing: class A, a={0}, b={1}'.format(self.a, self.b)

mya = A(3, 4)

print('{0}'.format(mya))    # uses __format__
print('{0!s}'.format(mya))  # uses __str__
print('{0!r}'.format(mya))  # uses __repr__
a=3 b=4
str: class A, a=3 b=4
representing: class A, a=3, b=4
3 Math
3.1 Numeric derivatives by differences
numpy has a function called numpy.diff() that is similar to the one found
in Matlab. It calculates the differences between adjacent elements of an
array, and returns an array that is one element shorter, which makes it
awkward for plotting the derivative of a function against the original x
values.
Loops in Python are pretty slow (relatively speaking) but they are usually
trivial to understand. In this script we show some simple ways to construct
derivative vectors using loops. It is implied in these formulas that the data
points are equally spaced. If they are not evenly spaced, you need a different
approach.
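If a derivative array of the same length as the data is all that is needed, one option (not used in this section) is numpy.gradient. It uses centered differences in the interior and one-sided differences at the end points, and it accepts the x coordinates as a second argument. The grid and the test function below are assumptions chosen for illustration.

```python
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

# same length as x and y, unlike np.diff
dy = np.gradient(y, x)

print(dy.shape)
print(np.max(np.abs(dy - np.cos(x))))  # largest error on this grid
```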
import numpy as np
import matplotlib.pyplot as plt
import time

# These are the brainless way to calculate numerical derivatives. They
# work well for very smooth data. They are surprisingly fast even up to
# 10000 points in the vector.

x = np.linspace(0.78, 0.79, 100)
y = np.sin(x)
dy_analytical = np.cos(x)

# Let's use a forward difference method:
# that works up until the last point, where there is not
# a forward difference to use. There, we use a backward difference.

tf1 = time.time()
dyf = [0.0] * len(x)
for i in range(len(y) - 1):
    dyf[i] = (y[i+1] - y[i]) / (x[i+1] - x[i])
# set last element by backwards difference
dyf[-1] = (y[-1] - y[-2]) / (x[-1] - x[-2])

print('Forward difference took %f seconds' % (time.time() - tf1))

# and now a backwards difference
tb1 = time.time()
dyb = [0.0] * len(x)
# set first element by forward difference
dyb[0] = (y[0] - y[1]) / (x[0] - x[1])
for i in range(1, len(y)):
    dyb[i] = (y[i] - y[i-1]) / (x[i] - x[i-1])

print('Backward difference took %f seconds' % (time.time() - tb1))

# and now, a centered formula
tc1 = time.time()
dyc = [0.0] * len(x)
dyc[0] = (y[0] - y[1]) / (x[0] - x[1])
for i in range(1, len(y) - 1):
    dyc[i] = (y[i+1] - y[i-1]) / (x[i+1] - x[i-1])
dyc[-1] = (y[-1] - y[-2]) / (x[-1] - x[-2])

print('Centered difference took %f seconds' % (time.time() - tc1))

# the centered formula is the most accurate formula here

plt.plot(x, dy_analytical, label='analytical derivative')
plt.plot(x, dyf, '--', label='forward')
plt.plot(x, dyb, '--', label='backward')
plt.plot(x, dyc, '--', label='centered')

plt.legend(loc='lower left')
plt.savefig('images/simple-diffs.png')
Forward difference took 0.000094 seconds
Backward difference took 0.000084 seconds
Centered difference took 0.000088 seconds
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)
dy_analytical = np.cos(x)

# we need to specify the size of dy ahead because diff returns
# an array of n-1 elements
dy = np.zeros(y.shape, float)  # we know it will be this size
dy[0:-1] = np.diff(y) / np.diff(x)
dy[-1] = (y[-1] - y[-2]) / (x[-1] - x[-2])

# calculate dy by center differencing using array slices
dy2 = np.zeros(y.shape, float)  # we know it will be this size
dy2[1:-1] = (y[2:] - y[0:-2]) / (x[2:] - x[0:-2])

# now the end points
dy2[0] = (y[1] - y[0]) / (x[1] - x[0])
dy2[-1] = (y[-1] - y[-2]) / (x[-1] - x[-2])

plt.plot(x, y)
plt.plot(x, dy_analytical, label='analytical derivative')
plt.plot(x, dy, label='forward diff')
plt.plot(x, dy2, 'k--', lw=2, label='centered diff')
plt.legend(loc='lower left')
plt.savefig('images/vectorized-diffs.png')
Here is an example of a 4-point centered difference of some noisy data:
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2*np.pi, 100)
y = np.sin(x) + 0.1 * np.random.random(size=x.shape)
dy_analytical = np.cos(x)

# 2-point formula
dyf = [0.0] * len(x)
for i in range(len(y) - 1):
    dyf[i] = (y[i+1] - y[i]) / (x[i+1] - x[i])
# set last element by backwards difference
dyf[-1] = (y[-1] - y[-2]) / (x[-1] - x[-2])

# calculate dy by 4-point center differencing using array slices
#
# \frac{y[i-2] - 8y[i-1] + 8y[i+1] - y[i+2]}{12h}
#
# y[0] and y[1] must be defined by lower order methods
# and y[-1] and y[-2] must be defined by lower order methods

dy = np.zeros(y.shape, float)  # we know it will be this size
h = x[1] - x[0]  # this assumes the points are evenly spaced!
dy[2:-2] = (y[0:-4] - 8 * y[1:-3] + 8 * y[3:-1] - y[4:]) / (12.0 * h)

# simple differences at the end-points
dy[0] = (y[1] - y[0]) / (x[1] - x[0])
dy[1] = (y[2] - y[1]) / (x[2] - x[1])
dy[-2] = (y[-2] - y[-3]) / (x[-2] - x[-3])
dy[-1] = (y[-1] - y[-2]) / (x[-1] - x[-2])

plt.plot(x, y)
plt.plot(x, dy_analytical, label='analytical derivative')
plt.plot(x, dyf, 'r-', label='2pt-forward diff')
plt.plot(x, dy, 'k--', lw=2, label='4pt-centered diff')
plt.legend(loc='lower left')
plt.savefig('images/multipt-diff.png')
3.4 Derivatives by polynomial fitting
One way to reduce the noise inherent in derivatives of noisy data is to fit
a smooth function through the data, and analytically take the derivative of
the curve. Polynomials are especially convenient for this. The challenge is to
figure out what an appropriate polynomial order is. This requires judgment
and experience.
import numpy as np
import matplotlib.pyplot as plt
from pycse import deriv

tspan = [0, 0.1, 0.2, 0.4, 0.8, 1]
Ca_data = [2.0081, 1.5512, 1.1903, 0.7160, 0.2562, 0.1495]

p = np.polyfit(tspan, Ca_data, 3)
plt.figure()
plt.plot(tspan, Ca_data)
plt.plot(tspan, np.polyval(p, tspan), 'g-')
plt.savefig('images/deriv-fit-1.png')

# compute derivatives
dp = np.polyder(p)

dCdt_fit = np.polyval(dp, tspan)

dCdt_numeric = deriv(tspan, Ca_data)  # 2-point deriv

plt.figure()
plt.plot(tspan, dCdt_numeric, label='numeric derivative')
plt.plot(tspan, dCdt_fit, label='fitted derivative')

t = np.linspace(min(tspan), max(tspan))
plt.plot(t, np.polyval(dp, t), label='resampled derivative')
plt.legend(loc='best')
plt.savefig('images/deriv-fit-2.png')
You can see a third order polynomial is a reasonable fit here. There
are only 6 data points here, so any higher order risks overfitting. Here is
the comparison of the numerical derivative and the fitted derivative. We
have "resampled" the fitted derivative to show the actual shape. Note the
derivative appears to go through a maximum near t = 0.9. In this case,
that is probably unphysical as the data is related to the consumption of
species A in a reaction. The derivative should increase monotonically to
zero. The increase is an artefact of the fitting process. End points are
especially sensitive to this kind of error.
3.5 Derivatives by fitting a function and taking the analytical derivative
A variation of a polynomial fit is to fit a model with reasonable physics.
Here we fit a nonlinear function to the noisy data. The model is for the con-
centration vs. time in a batch reactor for a first order irreversible reaction.
Once we fit the data, we take the analytical derivative of the fitted function.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from pycse import deriv

tspan = np.array([0, 0.1, 0.2, 0.4, 0.8, 1])
Ca_data = np.array([2.0081, 1.5512, 1.1903, 0.7160, 0.2562, 0.1495])

def func(t, Ca0, k):
    return Ca0 * np.exp(-k * t)


pars, pcov = curve_fit(func, tspan, Ca_data, p0=[2, 2.3])

plt.plot(tspan, Ca_data)
plt.plot(tspan, func(tspan, *pars), 'g-')
plt.savefig('images/deriv-funcfit-1.png')

# analytical derivative
# the parameters come back in the order of the arguments of func
Ca0, k = pars
dCdt = -k * Ca0 * np.exp(-k * tspan)
t = np.linspace(0, 2)
dCdt_res = -k * Ca0 * np.exp(-k * t)

plt.figure()
plt.plot(tspan, deriv(tspan, Ca_data), label='numerical derivative')
plt.plot(tspan, dCdt, label='analytical derivative of fit')
plt.plot(t, dCdt_res, label='extrapolated')
plt.legend(loc='best')
plt.savefig('images/deriv-funcfit-2.png')
Visually this fit is about the same as a third order polynomial. Note
the difference in the derivative though. We can readily extrapolate this
derivative and get reasonable predictions of the derivative. That is true in
this case because we fitted a physically relevant model for concentration vs.
time for an irreversible, first order reaction.
3.6 Derivatives by FFT
import numpy as np
import matplotlib.pyplot as plt

N = 101  # number of points
L = 2 * np.pi  # interval of data

x = np.arange(0.0, L, L/float(N))  # this does not include the endpoint

# add some random noise
y = np.sin(x) + 0.05 * np.random.random(size=x.shape)
dy_analytical = np.cos(x)

# http://sci.tech-archive.net/Archive/sci.math/2008-05/msg00401.html
# you can use fft to calculate derivatives!

# integer wavenumbers, in the order that np.fft.fft uses
if N % 2 == 0:
    # the Nyquist mode is set to zero
    k = np.asarray(list(range(0, N // 2)) + [0] + list(range(-N // 2 + 1, 0)), np.float64)
else:
    # odd N has no Nyquist mode: 0..(N-1)/2 followed by -(N-1)/2..-1
    k = np.asarray(list(range(0, (N - 1) // 2 + 1)) + list(range(-(N - 1) // 2, 0)), np.float64)

k *= 2 * np.pi / L

fd = np.real(np.fft.ifft(1.0j * k * np.fft.fft(y)))

plt.plot(x, y, label='function')
plt.plot(x, dy_analytical, label='analytical der')
plt.plot(x, fd, label='fft der')
plt.legend(loc='lower left')

plt.savefig('images/fft-der.png')
plt.show()
The new way is: $f'(x) = \mathrm{imag}(f(x + i\Delta x)/\Delta x)$ where the
function $f$ is evaluated in imaginary space with a small $\Delta x$ in the
complex plane. The derivative is miraculously equal to the imaginary part of
the result in the limit of $\Delta x \rightarrow 0$!
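A short sketch of why this works, assuming f is analytic near x and writing h for the small step: expanding f in a Taylor series along the imaginary direction puts the first derivative in the imaginary part, and the leading error term is of order h squared.

```latex
f(x + ih) = f(x) + i h f'(x) - \frac{h^2}{2} f''(x) - \frac{i h^3}{6} f'''(x) + \ldots
\quad\Rightarrow\quad
\frac{\mathrm{Im}\, f(x + ih)}{h} = f'(x) - \frac{h^2}{6} f'''(x) + \ldots
```

Unlike a finite difference, no two nearly equal numbers are subtracted, so h can be made very small without losing digits to roundoff.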
This example comes from the first link. The derivative must be evaluated
using the chain rule. We compare a forward difference, central difference and
complex-step derivative approximations.
import numpy as np
import matplotlib.pyplot as plt

def f(x): return np.sin(3*x)*np.log(x)

x = 0.7
h = 1e-7

# analytical derivative
dfdx_a = 3 * np.cos(3*x)*np.log(x) + np.sin(3*x) / x

# finite difference
dfdx_fd = (f(x + h) - f(x))/h

# central difference
dfdx_cd = (f(x+h)-f(x-h))/(2*h)

# complex method (np.complex was removed from numpy; use the builtin)
dfdx_I = np.imag(f(x + complex(0, h))/h)

print(dfdx_a)
print(dfdx_fd)
print(dfdx_cd)
print(dfdx_I)
1.77335410624
1.77335393925
1.77335410495
1.7733541062373848
These all agree to at least five decimal places. The simple finite
difference is the least accurate, the central difference agrees with the
analytical result to about eight decimal places, and the complex-step
result matches it to all of the digits printed.
Let us use this method to verify the fundamental theorem of calculus,
i.e. to evaluate the derivative of an integral function. Let
$f(x) = \int_1^{x^2} \tan(t^3)\,dt$, and we now want to compute $df/dx$.
Of course, this can be done analytically, but it is not trivial!
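import numpy as np
from scipy.integrate import quad

def f_(z):
    # scipy's quad needs real limits, so substitute t = z**2 * s and
    # integrate s from 0 to 1. The real and imaginary parts are
    # integrated separately. The lower limit does not affect df/dx;
    # 0 is used here.
    def integrand(s):
        return np.tan((z**2 * s)**3) * z**2
    re, _ = quad(lambda s: np.real(integrand(s)), 0, 1)
    im, _ = quad(lambda s: np.imag(integrand(s)), 0, 1)
    return re + 1j * im

f = np.vectorize(f_)

x = np.linspace(0, 1)

h = 1e-7

dfdx = np.imag(f(x + complex(0, h)))/h
dfdx_analytical = 2 * x * np.tan(x**6)

import matplotlib.pyplot as plt

plt.plot(x, dfdx, x, dfdx_analytical, 'r--')
plt.show()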
def f1(x):
    if x < 0:
        return 0
    elif (x >= 0) & (x < 1):
        return x
    elif (x >= 1) & (x < 2):
        return 2.0 - x
    else:
        return 0

print(f1(-1))
#print(f1([0, 1, 2, 3])) # does not work!
0
This works, but the function is not vectorized, i.e. f1([-1, 0, 2, 3]) does
not evaluate properly (it should give a list or array). You can get vectorized
behavior by using list comprehension, or by writing your own loop. This
does not fix all limitations, for example you cannot use the f1 function in
the quad function to integrate it.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-1, 3)
y = [f1(xx) for xx in x]

plt.plot(x, y)
plt.savefig('images/vector-piecewise.png')
def f2(x):
    'fully vectorized version'
    x = np.asarray(x)
    y = np.zeros(x.shape)
    y += ((x >= 0) & (x < 1)) * x
    y += ((x >= 1) & (x < 2)) * (2 - x)
    return y

print(f2([-1, 0, 1, 2, 3, 4]))
x = np.linspace(-1, 3)
plt.plot(x, f2(x))
plt.savefig('images/vector-piecewise-2.png')
[ 0. 0. 1. 0. 0. 0.]
[<matplotlib.lines.Line2D object at 0x10b3a6358>]
def heaviside(x):
    x = np.array(x)
    if x.shape != ():
        y = np.zeros(x.shape)
        y[x > 0.0] = 1
        y[x == 0.0] = 0.5
    else:  # special case for 0d array (a number)
        if x > 0: y = 1
        elif x == 0: y = 0.5
        else: y = 0
    return y

def f3(x):
    x = np.array(x)
    y1 = (heaviside(x) - heaviside(x - 1)) * x  # first interval
    y2 = (heaviside(x - 1) - heaviside(x - 2)) * (2 - x)  # second interval
    return y1 + y2

from scipy.integrate import quad
print(quad(f3, -1, 3))
(1.0, 1.1102230246251565e-14)
plt.plot(x, f3(x))
plt.savefig('images/vector-piecewise-3.png')
simplicity and speed; loops in python are usually very slow compared to
vectorized functions.
import numpy as np
from scipy.optimize import fsolve
import matplotlib.pyplot as plt

def fF_laminar(Re):
    return 16.0 / Re

def fF_turbulent_unvectorized(Re):
    # Nikuradse correlation for turbulent flow
    # 1/np.sqrt(f) = (4.0*np.log10(Re*np.sqrt(f))-0.4)
    # we have to solve this equation to get f
    def func(f):
        return 1/np.sqrt(f) - (4.0*np.log10(Re*np.sqrt(f))-0.4)
    fguess = 0.01
    f, = fsolve(func, fguess)
    return f

# this enables us to pass vectors to the function and get vectors as
# solutions
fF_turbulent = np.vectorize(fF_turbulent_unvectorized)
plt.ylabel('$f_F$')
plt.legend()
plt.savefig('images/smooth-transitions-1.png')
x = np.linspace(-4, 4)
y = 1.0 / (1 + np.exp(-x / 0.1))
plt.figure(2)
plt.clf()
plt.plot(x, y)
plt.xlabel('x'); plt.ylabel('y'); plt.title(r'$\sigma(x)$')
plt.savefig('images/smooth-transitions-sigma.png')
If we have two functions, $f_1(x)$ and $f_2(x)$ we want to smoothly join,
we do it like this: $f(x) = (1 - \sigma(x)) f_1(x) + \sigma(x) f_2(x)$.
There is no formal justification for this form of joining, it is simply a
mathematical convenience to get a numerically smooth function. Other
functions besides the sigmoid function could also be used, as long as they
smoothly transition from 0 to 1, or from 1 to zero.
def fanning_friction_factor(Re):
    '''combined, continuous correlation for the fanning friction factor.
    the alpha parameter is chosen to provide the desired smoothness.
    The transition region is about +- 4*alpha. The value 450 was
    selected to reasonably match the shape of the correlation
    function provided by Morrison (see last section of this file)'''
    sigma = 1. / (1 + np.exp(-(Re - 3000.0) / 450.0))
    f = (1 - sigma) * fF_laminar(Re) + sigma * fF_turbulent(Re)
    return f

Re = np.linspace(500, 10000)
f = fanning_friction_factor(Re)

# add data to figure 1
plt.figure(1)
plt.plot(Re, f, label='smooth transition')
plt.xlabel('Re')
plt.ylabel('$f_F$')
plt.legend()
plt.savefig('images/smooth-transitions-3.png')
You can see that away from the transition the combined function is
practically equivalent to the original two functions. That is because away
from the transition the sigmoid function is 0 or 1. Near Re = 3000 is a
smooth transition from one curve to the other curve.
Morrison derived a single function for the friction factor correlation over
all Re: $f = \frac{0.0076\left(\frac{3170}{Re}\right)^{0.165}}{1 + \left(\frac{3171}{Re}\right)^{7.0}} + \frac{16}{Re}$.
Here we show the comparison with the approach used above. The friction
factor differs slightly at high Re, because Morrison's is based on the
Prandtl correlation, while the work here is based on the Nikuradse
correlation. They are similar, but not the same.
3.9.1 Summary
The approach demonstrated here allows one to smoothly join two discon-
tinuous functions that describe physics in different regimes, and that must
transition over some range of data. It should be emphasized that the method
has no physical basis, it simply allows one to create a mathematically smooth
function, which could be necessary for some optimizers or solvers to work.
import numpy as np
import matplotlib.pyplot as plt

x0 = 2.0 / 3.0
x1 = 1.5

w = 0.05

D = np.linspace(0, 2, 500)

sigmaD = 1.0 / (1.0 + np.exp(-(1 - D) / w))

x = x0 + (x1 - x0)*(1 - sigmaD)

plt.plot(D, x)
plt.xlabel('D'); plt.ylabel('x')
plt.savefig('images/smooth-transitions-constants.png')
method.
Let us look at some examples. We consider the example of computing
$\int_0^2 x^3 dx$. The analytical integral is $\frac{1}{4} x^4$, so we know
the integral evaluates to 16/4 = 4. This will be our benchmark for
comparison to the numerical methods.
We use the scipy.integrate.quad command to evaluate this $\int_0^2 x^3 dx$.
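The listing for this step is missing from this copy. A minimal sketch of the quad call, with the function name integrand chosen here for illustration:

```python
from scipy.integrate import quad

def integrand(x):
    return x**3

# quad returns the estimated integral and an estimate of the error
ans, err = quad(integrand, 0, 2)
print(ans)
```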
4.0
4.0
import numpy as np

x = np.array([0, 0.5, 1, 1.5, 2])
y = x**3

i2 = np.trapz(y, x)

error = (i2 - 4) / 4

print(i2, error)
4.25 0.0625
Note the integral of these vectors is greater than 4! You can see why
here.
import numpy as np
import matplotlib.pyplot as plt
x = np.array([0, 0.5, 1, 1.5, 2])
y = x**3

x2 = np.linspace(0, 2)
y2 = x2**3

plt.plot(x, y, label='5 points')
plt.plot(x2, y2, label='50 points')
plt.legend()
plt.savefig('images/quad-1.png')
import numpy as np

x2 = np.linspace(0, 2, 100)
y2 = x2**3

print(np.trapz(y2, x2))
4.00040812162
3.11.2 Combining numerical data with quad
You might want to combine numerical data with the quad function if you
want to perform integrals easily. Let us say you are given this data:
x = [0 0.5 1 1.5 2]; y = [0 0.1250 1.0000 3.3750 8.0000];
and you want to integrate this from x = 0.25 to 1.75. We do not have
data in those regions, so some interpolation is going to be needed. Here is
one approach.
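The listing for this step is missing here. The sketch below shows one way it could be done, assuming linear interpolation with np.interp wrapped in a function that quad can call. The helper name f and the use of the points argument are choices made for this sketch, not necessarily the original approach.

```python
import numpy as np
from scipy.integrate import quad

x = np.array([0, 0.5, 1, 1.5, 2])
y = np.array([0, 0.1250, 1.0000, 3.3750, 8.0000])

def f(xx):
    # linear interpolation of the tabulated data
    return np.interp(xx, x, y)

# points= tells quad where the interpolant has kinks
ans, err = quad(f, 0.25, 1.75, points=[0.5, 1.0, 1.5])
print(ans)
```

Because the interpolant is piecewise linear, the result is the exact trapezoid area under the interpolated data, about 2.53125.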
2.53199187838
2.5312499999999987
These approaches are very similar, and both rely on linear interpolation.
The second approach is simpler, and uses fewer lines of code.
3.11.3 Summary
trapz and quad are functions for getting integrals. Both can be used with
numerical data if interpolation is used. The syntax for the quad and trapz
function is different in scipy than in Matlab.
Finally, see this post for an example of solving an integral equation using
quad and fsolve.
Polynomials can be represented as a list of coefficients. For example, the
polynomial $4x^3 + 3x^2 - 2x + 10 = 0$ can be represented as [4, 3, -2,
10]. Here are some ways to create a polynomial object, and evaluate it.
import numpy as np

ppar = [4, 3, -2, 10]
p = np.poly1d(ppar)

print(p(3))
print(np.polyval(ppar, 3))

x = 3
print(4*x**3 + 3*x**2 - 2*x + 10)
139
139
139
import numpy as np

p = np.poly1d([2, 0, -1])
p2 = np.polyder(p)
print(p2)
print(p2(4))
4 x
16
import numpy as np

p = np.poly1d([2, 0, -1])
p2 = np.polyint(p)
print(p2)
print(p2(4) - p2(2))
        3
0.6667 x - 1 x
35.3333333333
One reason to use polynomials is the ease of finding all of the roots using
numpy.roots.
import numpy as np
print(np.roots([2, 0, -1]))  # roots are +- 1/sqrt(2)

# note that imaginary roots exist, e.g. x^2 + 1 = 0 has two roots, +-i
p = np.poly1d([1, 0, 1])
print(np.roots(p))
[ 0.70710678 -0.70710678]
[-0.+1.j 0.-1.j]
import numpy as np
# numerical values of the constants
a = 3.49e4
b = 1.45
p = 679.7  # pressure in psi
T = 683    # T in Rankine
n = 1.136  # lb-moles
R = 10.73  # ft^3 * psi / R / lb-mol

ppar = [1.0, -(p*n*b + n*R*T)/p, n**2*a/p, -n**3*a*b/p]
print(np.roots(ppar))
Note that only one root is real (and even then, we have to interpret 0.j
as not being imaginary). Also, in a cubic polynomial with real coefficients,
complex roots come in conjugate pairs, so there can be at most two of them.
In this case that means there is only one phase present.
3.12.1 Summary
Polynomials in numpy are even better than in Matlab, because you get
a polynomial object that acts just like a function. Otherwise, they are
functionally equivalent.
3.13 Wilkinson's polynomial
Wilkinson's polynomial is defined as
$w(x) = \prod_{i=1}^{20} (x - i) = (x - 1)(x - 2) \ldots (x - 20)$.
This innocent looking function has 20 roots, which are 1, 2, 3, ..., 19, 20.
Here is a plot of the function.
from sympy import Symbol
from sympy.polys.polytools import poly_from_expr

x = Symbol('x')
W = 1
for i in range(1, 21):
    W = W * (x - i)

print(W.expand())

P, d = poly_from_expr(W.expand())
print(P)
The coefficients are orders of magnitude apart in size. This should make
you nervous, because the roots of this equation are between 1-20, but there
are numbers here that are O(10^19). This is likely to make any rounding
errors in the number representations very significant, and may lead to
issues with accuracy of the solution. Let us explore that.
We will get the roots using numpy.roots.
import numpy as np
from sympy import Symbol
from sympy.polys.polytools import poly_from_expr

x = Symbol('x')
W = 1
for i in range(1, 21):
    W = W * (x - i)

P, d = poly_from_expr(W.expand())
p = P.all_coeffs()
x = np.arange(1, 21)
print('\nThese are the known roots\n', x)

# evaluate the polynomial at the known roots
print('\nThe polynomial evaluates to {0} at the known roots'.format(np.polyval(p, x)))

# find the roots ourselves
roots = np.roots(p)
print('\nHere are the roots from numpy:\n', roots)

# evaluate solution at roots
print('\nHere is the polynomial evaluated at the calculated roots:\n', np.polyval(p, roots))
Here are the roots from numpy:
[ 20.00060348 18.99388894 18.02685247 16.91622268 16.14133991
14.77906016 14.22072943 12.85642119 12.08967018 10.96640641
10.01081017 8.99768263 8.00033976 6.99997228 6.0000001
5.00000024 3.99999998 3. 2. 1. ]
The roots are not exact. Even more to the point, the polynomial does
not evaluate to zero at the calculated roots! Something is clearly wrong
here. The polynomial function is fine, and it does evaluate to zero at the
known roots which are integers. It is subtle, but up to that point, we are
using only integers, which can be represented exactly. The roots function
is evidently using some float math, and the floats are not the same as the
integers.
If we simply change the roots to floats, and reevaluate our polynomial,
we get dramatically different results.
import numpy as np
from sympy import Symbol
from sympy.polys.polytools import poly_from_expr

x = Symbol('x')
W = 1
for i in range(1, 21):
    W = W * (x - i)

P, d = poly_from_expr(W.expand())
p = P.all_coeffs()
x = np.arange(1, 21, dtype=float)
print('\nThese are the known roots\n', x)

# evaluate the polynomial at the known roots
print('\nThe polynomial evaluates to {0} at the known roots'.format(np.polyval(p, x)))
The polynomial evaluates to [0 0 55296.0000000000 425984.000000000 1024000.00000000 9
35825664.0000000 75235328.0000000 198567936.000000 566272000.000000
757796864.000000 1418231808.00000 2411708416.00000 3807354880.00000
6303744000.00000 11150557184.0000 17920108544.0000 16046678016.0000
38236565504.0000 54726656000.0000] at the known roots
import numpy as np
from sympy import Symbol
from sympy.polys.polytools import poly_from_expr

x = Symbol('x')
W = 1
for i in range(1, 21):
    W = W * (x - i)

P, d = poly_from_expr(W.expand())
p = [float(x) for x in P.all_coeffs()]
x = np.arange(1, 21)
print('\nThese are the known roots\n', x)

# evaluate the polynomial at the known roots
print('\nThe polynomial evaluates to {0} at the known roots'.format(np.polyval(p, x)))
Let us try to understand what is happening here. It turns out that the
integer and float representations of the numbers are different! It is known
that you cannot exactly represent every number as a float.
import numpy as np
from sympy import Symbol
from sympy.polys.polytools import poly_from_expr

x = Symbol('x')
W = 1
for i in range(1, 21):
    W = W * (x - i)

P, d = poly_from_expr(W.expand())
p = P.all_coeffs()
print(p)
print('{0:<30s}{1:<30s}{2}'.format('Integer', 'Float', r'\delta'))
for pj in p:
    print('{0:<30d}{1:<30f}{2:3e}'.format(int(pj), float(pj), int(pj) - float(pj)))
Now you can see the issue. Many of these numbers are identical in integer
and float form, but some of them are not. The integer cannot be exactly
represented as a float, and there is a difference in the representations. It is
a small difference compared to the magnitude, but these kinds of differences
get raised to high powers, and become larger. You may wonder why I used
"0:<30s>" to print the integer? That is because pj in that loop is an object
from sympy, which prints as a string.
This is a famous, and well known problem that is especially bad for this
case. This illustrates that you cannot simply rely on what a computer tells
you the answer is, without doing some critical thinking about the problem
and the solution. Especially in problems where there are coefficients that
vary by many orders of magnitude you should be cautious.
There are a few interesting webpages on this topic, which inspired me
to work this out in python. These webpages go into more detail on this
problem, and provide additional insight into the sensitivity of the solutions
to the polynomial coefficients.
1. http://blogs.mathworks.com/cleve/2013/03/04/wilkinsons-polynomials/
2. http://www.numericalexpert.com/blog/wilkinson_polynomial/
3. http://en.wikipedia.org/wiki/Wilkinson%27s_polynomial
import numpy as np
import time

a = 0.0; b = np.pi
N = 1000  # this is the number of points, so there are N - 1 intervals

h = (b - a)/(N - 1)  # this is the width of each interval
x = np.linspace(a, b, N)
y = np.sin(x)  # the sin function is already vectorized

t0 = time.time()
f = 0.0
for k in range(len(x) - 1):
    f += 0.5 * ((x[k+1] - x[k]) * (y[k+1] + y[k]))

tf = time.time() - t0
print('time elapsed = {0} sec'.format(tf))

print(f)
t0 = time.time()
# the slices must cover every interval, including the last one
Xk = x[1:] - x[:-1]  # vectorized version of (x[k+1] - x[k])
Yk = y[1:] + y[:-1]  # vectorized version of (y[k+1] + y[k])

f = 0.5 * np.sum(Xk * Yk)  # vectorized version of the loop above
tf = time.time() - t0
print('time elapsed = {0} sec'.format(tf))

print(f)
In the last example, there may be a loop buried in the sum command. Let
us do one final method, using linear algebra, in a single line. The key to
understanding this is to recognize the sum is just the result of a dot product
of the x differences and y sums.
t0 = time.time()
f = 0.5 * np.dot(Xk, Yk)
tf = time.time() - t0
print('time elapsed = {0} sec'.format(tf))

print(f)
The loop method is straightforward to code, and looks a lot like the
formula that defines the trapezoid method. The vectorized methods are not
as easy to read, although they take fewer lines of code to write. They are
also much faster than the loop, so the loss of readability could be worth it
for very large problems.
The times here are considerably slower than in Matlab. I am not sure if
that is a totally fair comparison. Here I am running Python through Emacs,
which may result in slower performance. I also used a very crude way of
timing the performance which lumps some system performance in too.
3.15 Numerical Simpson's rule
A more accurate numerical integration than the trapezoid method is
Simpson's rule. The syntax is similar to trapz, but the method is in
scipy.integrate.
import numpy as np
from scipy.integrate import simpson  # named simps in older SciPy releases

a = 0.0; b = np.pi / 4.0
N = 10  # this is the number of points

x = np.linspace(a, b, N)
y = np.cos(x)

t = np.trapz(y, x)
s = simpson(y, x=x)
analytical = np.sin(b) - np.sin(a)

print('trapz = {0} ({1:%} error)'.format(t, (t - analytical)/analytical))
print('simps = {0} ({1:%} error)'.format(s, (s - analytical)/analytical))
print('analy = {0}'.format(analytical))
You can see that Simpson's method is more accurate than the trapezoid
method.
$y = x^2$
from x=0 to x=1. You should be able to work out that the answer is
1/3.
from scipy.integrate import quad

def integrand(x):
    return x**2

ans, err = quad(integrand, 0, 1)
print(ans)
0.33333333333333337
-9.869604401089358
from scipy.integrate import tplquad
import numpy as np

def integrand(z, y, x):
    return y * np.sin(x) + z * np.cos(x)

ans, err = tplquad(integrand,
                   0, np.pi,  # x limits
                   lambda x: 0,
                   lambda x: 1,  # y limits
                   lambda x, y: -1,
                   lambda x, y: 1)  # z limits

print(ans)
1.9999999999999998
3.16.2 Summary
scipy.integrate offers the same basic functionality as Matlab does. The syn-
tax differs significantly for these simple examples, but the use of functions
for the limits enables freedom to integrate over non-constant limits.
from scipy.integrate import quad

nu0 = 5      # L/min
alpha = 1.0  # L/min^2
def integrand(t):
    return nu0 + alpha * t

t0 = 0.0
tfinal = 10.0
V, estimated_error = quad(integrand, t0, tfinal)
print('{0:1.2f} L flowed into the tank over 10 minutes'.format(V))
3.18 Function integration by the Romberg method
An alternative to the scipy.integrate.quad function is the Romberg method.
This method is not likely to be more accurate than quad, and it does not
give you an error estimate.
import numpy as np

# note: romberg is deprecated in recent SciPy releases
from scipy.integrate import quad, romberg

a = 0.0
b = np.pi / 4.0

print(quad(np.sin, a, b))
print(romberg(np.sin, a, b))
(0.2928932188134524, 3.2517679528326894e-15)
0.292893218813
3.19.2 differentiation
you might find this helpful!
2*a*x + b
2*a
x**2
3.19.3 integration
f(x) == C1*exp(x)
f(4) == C1*exp(4)
[f(0) == C1, f(0.5) == 1.64872127070013*C1, f(1) == E*C1]
It is not clear you can solve the initial value problem to get C1.
The symbolic math in sympy is pretty good. It is not up to the capability
of Maple or Mathematica, (but neither is Matlab) but it continues to be
developed, and could be helpful in some situations.
3.20 Is your ice cream float bigger than mine
Float numbers (i.e. the ones with decimals) cannot be perfectly represented
in a computer. This can lead to some artifacts when you have to compare
float numbers that on paper should be the same, but in silico are not. Let
us look at some examples. In this example, we do some simple math that
should result in an answer of 1, and then see if the answer is "equal" to one.
print(3.0 * (1.0/3.0))
print(1.0 == 3.0 * (1.0/3.0))
1.0
True
print(49.0 * (1.0/49.0))
print(1.0 == 49.0 * (1.0/49.0))
0.9999999999999999
False
The first line shows the result is not 1.0, and the equality fails! You can
see here why the equality statement fails. We will print the two numbers to
sixteen decimal places.
0.9999999999999999
1.0000000000000000
1.1102230246251565e-16
The two numbers actually are not equal to each other because of float
math. They are very, very close to each other, but not the same.
This leads to the idea of asking if two numbers are equal to each other
within some tolerance. The question of what tolerance to use requires
thought. Should it be an absolute tolerance? a relative tolerance? How
large should the tolerance be? We will use the distance between 1 and the
nearest floating point number (this is eps in Matlab). numpy can tell us this
number with the np.spacing command.
Below, we implement a comparison function from Acta Crystallographica
A60, 1-6 (2003), doi:10.1107/S010876730302186X.
# Implemented from Acta Crystallographica A60, 1-6 (2003). doi:10.1107/S010876730302186X

import numpy as np
print(np.spacing(1))

def feq(x, y, epsilon):
    'x == y'
    return not((x < (y - epsilon)) or (y < (x - epsilon)))

print(feq(1.0, 49.0 * (1.0/49.0), np.spacing(1)))
2.22044604925e-16
True
For completeness, here are the other float comparison operators from
that paper. We also show a few examples.
import numpy as np

def flt(x, y, epsilon):
    'x < y'
    return x < (y - epsilon)

def fgt(x, y, epsilon):
    'x > y'
    return y < (x - epsilon)

def fle(x, y, epsilon):
    'x <= y'
    return not(y < (x - epsilon))

def fge(x, y, epsilon):
    'x >= y'
    return not(x < (y - epsilon))

print(fge(1.0, 49.0 * (1.0/49.0), np.spacing(1)))
print(fle(1.0, 49.0 * (1.0/49.0), np.spacing(1)))

print(fgt(1.0 + np.spacing(1), 49.0 * (1.0/49.0), np.spacing(1)))
print(flt(1.0 - 2 * np.spacing(1), 49.0 * (1.0/49.0), np.spacing(1)))
True
True
True
True
As you can see, float comparisons can be tricky. You have to give a lot
of thought to how to make the comparisons, and the functions shown above
are not the only way to do it. You need to build in testing to make sure
your comparisons are doing what you want.
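One other way to do it is with the tolerance-based comparisons that
ship with Python and numpy. The sketch below is not from the paper
cited above; the tolerance values are assumptions picked for
illustration. It also shows why a relative tolerance alone is not
enough when comparing against zero.

```python
import math
import numpy as np

a = 1.0
b = 49.0 * (1.0 / 49.0)

exact = (a == b)                                        # strict equality
close_rel = math.isclose(a, b, rel_tol=1e-9)            # relative tolerance
close_np = bool(np.isclose(a, b, rtol=1e-9, atol=0.0))  # numpy equivalent

# a relative tolerance scales with the inputs, so it fails near zero
near_zero = math.isclose(1e-17, 0.0, rel_tol=1e-9)
near_zero_abs = math.isclose(1e-17, 0.0, rel_tol=1e-9, abs_tol=1e-12)

print(exact, close_rel, close_np, near_zero, near_zero_abs)
```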
4 Linear algebra
4.1 Potential gotchas in linear algebra in numpy
Numpy has some gotcha features for linear algebra purists. The first is that
a 1d array is neither a row, nor a column vector. That is, $a = a^T$ if a is a
1d array. That means you can take the dot product of a with itself, without
transposing the second argument. This would not be allowed in Matlab.
import numpy as np

a = np.array([0, 1, 2])
print(a.shape)
print(a)
print(a.T)


print(np.dot(a, a))
print(np.dot(a, a.T))
(3,)
[0 1 2]
[0 1 2]
5
5
Compare the syntax to the new Python 3.5 syntax:
print(a @ a)
5
Compare the previous behavior with this 2d array. In this case, you
cannot take the dot product of b with itself, because the dimensions are
incompatible. You must transpose the second argument to make it
dimensionally consistent. Also, the result of the dot product is not a
simple scalar, but a $1 \times 1$ array.
b = np.array([[0, 1, 2]])
print(b.shape)
print(b)
print(b.T)

print(np.dot(b, b))    # this is not ok, the dimensions are wrong.
print(np.dot(b, b.T))
print(np.dot(b, b.T).shape)
(1, 3)
[[0 1 2]]
[[0]
[1]
[2]]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)
[[5]]
(1, 1)
[[ 3 3 3 3 3 4]
[ 5 5 5 5 5 6]
[ 7 7 7 7 7 8]
[ 9 9 9 9 9 10]]
x = np.array([2, 4, 6, 8])
y = np.array([1, 1, 1, 1, 1, 1, 2])

print(x[:, np.newaxis] + y)
[[ 3 3 3 3 3 3 4]
[ 5 5 5 5 5 5 6]
[ 7 7 7 7 7 7 8]
[ 9 9 9 9 9 9 10]]
a = np.array([1, 2, 3])
b = np.array([10, 20, 30, 40])

print(a * b[:, np.newaxis])
[[ 10 20 30]
[ 20 40 60]
[ 30 60 90]
[ 40 80 120]]
$x_1 - x_2 + x_3 = 0$ (5)
$10 x_2 + 25 x_3 = 90$ (6)
$20 x_1 + 10 x_2 = 80$ (7)
import numpy as np
A = np.array([[1, -1, 1],
              [0, 10, 25],
              [20, 10, 0]])

b = np.array([0, 90, 80])

x = np.linalg.solve(A, b)
print(x)
print(np.dot(A,x))

# Let us confirm the solution.
# an exact comparison can report an element as not equal because of
# float tolerance
print(np.dot(A,x) == b)

# here we use a tolerance comparison to show the difference is less
# than a defined tolerance.
TOLERANCE = 1e-12
print(np.abs((np.dot(A, x) - b)) <= TOLERANCE)
[ 2. 4. 2.]
[ 0. 90. 80.]
[ True True True]
[ True True True]
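The same tolerance check can be written in one call with np.allclose.
This is a sketch; the rtol and atol values are assumptions chosen to
mirror the tolerance used above.

```python
import numpy as np

A = np.array([[1, -1, 1],
              [0, 10, 25],
              [20, 10, 0]])
b = np.array([0, 90, 80])

x = np.linalg.solve(A, b)

# allclose tests |Ax - b| <= atol + rtol * |b| for every element
matches = np.allclose(np.dot(A, x), b, rtol=1e-12, atol=1e-12)
print(matches)
```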
import numpy as np
A = np.array([[1, -1, 1],
              [0, 10, 25],
              [20, 10, 0]])

b = np.array([0, 90, 80])

# to determine the number of independent rows in A, we get the singular
# values and count the number greater than 0.
TOLERANCE = 1e-12
u, s, v = np.linalg.svd(A)
print('Singular values: {0}'.format(s))
print('# of independent rows: {0}'.format(np.sum(np.abs(s) > TOLERANCE)))

# to illustrate a case where there are only 2 independent rows
# consider this case where row3 = 2*row2.
A = np.array([[1, -1, 1],
              [0, 10, 25],
              [0, 20, 50]])

u, s, v = np.linalg.svd(A)

print('Singular values: {0}'.format(s))
print('# of independent rows: {0}'.format(np.sum(np.abs(s) > TOLERANCE)))
Matlab comparison
4.3 Rules for transposition
Matlab comparison
Here are the four rules for matrix multiplication and transposition
1. $(A^T)^T = A$
2. $(A + B)^T = A^T + B^T$
3. $(cA)^T = cA^T$
4. $(AB)^T = B^T A^T$
reference: Chapter 7.2 in Advanced Engineering Mathematics, 9th edition,
by E. Kreyszig.
import numpy as np
A = np.array([[5, -8, 1],
              [4, 0, 0]])

# function
print(np.transpose(A))

# notation
print(A.T)
[[ 5 4]
[-8 0]
[ 1 0]]
[[ 5 4]
[-8 0]
[ 1 0]]
4.3.2 Rule 1
import numpy as np

A = np.array([[5, -8, 1],
              [4, 0, 0]])

print(np.all(A == (A.T).T))
True
4.3.3 Rule 2
import numpy as np
A = np.array([[5, -8, 1],
              [4, 0, 0]])

B = np.array([[3, 4, 5], [1, 2, 3]])

print(np.all(A.T + B.T == (A + B).T))
True
4.3.4 Rule 3
import numpy as np
A = np.array([[5, -8, 1],
              [4, 0, 0]])

c = 2.1

print(np.all((c*A).T == c*A.T))
True
4.3.5 Rule 4
import numpy as np
A = np.array([[5, -8, 1],
              [4, 0, 0]])

B = np.array([[0, 2],
              [1, 2],
              [6, 7]])

print(np.all(np.dot(A, B).T == np.dot(B.T, A.T)))
True
4.3.6 Summary
That wraps up showing numerically that the transpose rules work for these
examples.
4.4 Sums products and linear algebra notation - avoiding
loops where possible
Matlab comparison
Today we examine some methods of linear algebra that allow us to avoid
writing explicit loops in Python for some kinds of mathematical operations.
Consider the operation on two vectors a and b.
$y = \sum_{i=1}^{n} a_i b_i$
a = [1 2 3 4 5]
b = [3 6 8 9 10]
a = [1, 2, 3, 4, 5]
b = [3, 6, 8, 9, 10]

sum = 0
for i in range(len(a)):
    sum = sum + a[i] * b[i]
print(sum)
125
a = [1, 2, 3, 4, 5]
b = [3, 6, 8, 9, 10]

sum = 0
for x, y in zip(a, b):
    sum += x * y
print(sum)
125
4.4.2 The numpy approach
The most compact method is to use the methods in numpy.
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([3, 6, 8, 9, 10])

print(np.sum(a * b))
125
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([3, 6, 8, 9, 10])

print(np.dot(a, b))
125
162.39
import numpy as np
w = np.array([0.1, 0.25, 0.12, 0.45, 0.98])
x = np.array([9, 7, 11, 12, 8])
y = np.sum(w * x**2)
print(y)
162.39
We can also express this in matrix algebra form. The operation is
equivalent to $y = \vec{x} D_w \vec{x}^T$ where $D_w$ is a diagonal matrix
with the weights on the diagonal.
import numpy as np
w = np.array([0.1, 0.25, 0.12, 0.45, 0.98])
x = np.array([9, 7, 11, 12, 8])
y = np.dot(x, np.dot(np.diag(w), x))
print(y)
162.39
This last form avoids explicit loops and sums, and relies on fast linear
algebra routines.
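One possible refinement, sketched below, is to skip building the
n x n diagonal matrix, which may matter for long vectors. The
variable names y_diag, y_dot and y_einsum are introduced here for
illustration; all three should give the same weighted sum.

```python
import numpy as np

w = np.array([0.1, 0.25, 0.12, 0.45, 0.98])
x = np.array([9, 7, 11, 12, 8])

y_diag = np.dot(x, np.dot(np.diag(w), x))  # builds an n x n matrix
y_dot = np.dot(w * x, x)                   # same sum using only 1d arrays
y_einsum = np.einsum('i,i,i->', w, x, x)   # explicit index notation

print(y_diag, y_dot, y_einsum)
```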
import numpy as np

w = np.array([0.1, 0.25, 0.12, 0.45, 0.98])
x = np.array([9, 7, 11, 12, 8])
y = np.array([2, 5, 3, 8, 0])

print(np.sum(w * x * y))
print(np.dot(w, np.dot(np.diag(x), y)))
57.71
57.71
4.4.6 Summary
We showed examples of the following equalities between traditional sum
notations and linear algebra
$\vec{a} \vec{b} = \sum_{i=1}^{n} a_i b_i$
$\vec{x} D_w \vec{x}^T = \sum_{i=1}^{n} w_i x_i^2$
$\vec{x} D_w \vec{y}^T = \sum_{i=1}^{n} w_i x_i y_i$
These relationships enable one to write the sums as a single line of python
code, which utilizes fast linear algebra subroutines, avoids the construction
of slow loops, and reduces the opportunity for errors in the code. Admittedly,
it introduces the opportunity for new types of errors, like using the wrong
relationship, or linear algebra errors due to matrix size mismatches.
import numpy as np
v1 = [6, 0, 3, 1, 4, 2]
v2 = [0, -1, 2, 7, 0, 5]
v3 = [12, 3, 0, -19, 8, -11]

A = np.vstack([v1, v2, v3])  # stack the vectors as rows

# matlab definition
eps = np.finfo(np.linalg.norm(A).dtype).eps
TOLERANCE = max(eps * np.array(A.shape))

U, s, V = np.linalg.svd(A)
print(s)
print(np.sum(s > TOLERANCE))

TOLERANCE = 1e-14
print(np.sum(s > TOLERANCE))
You can see that if you choose too small a TOLERANCE, nothing looks
like zero. The result with TOLERANCE=1e-14 suggests the rows are not
linearly independent. Let us show that one row can be expressed as a linear
combination of the other rows.
The number of rows is greater than the rank, so these vectors are not
independent. Let us demonstrate that one vector can be defined as a linear
combination of the other two vectors. Mathematically we represent this as:
$x_1 v_1 + x_2 v_2 = v_3$
or
$[x_1 \; x_2] [v_1; v_2] = v_3$
This is not the usual linear algebra form of Ax = b. To get there, we
transpose each side of the equation to get:
$[v_1^T \; v_2^T] [x_1; x_2] = v_3^T$
which is the form Ax = b. We solve it in a least-squares sense.
A = np.column_stack([v1, v2])
# rcond=None selects the current default cutoff and avoids a warning
x = np.linalg.lstsq(A, v3, rcond=None)
print(x[0])
[ 2. -3.]
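As a cross-check, numpy can report the rank directly, and the
least-squares coefficients can be verified by hand. This is a sketch;
matrix_rank picks its own default tolerance, which is an assumption
that happens to suit this example.

```python
import numpy as np

v1 = [6, 0, 3, 1, 4, 2]
v2 = [0, -1, 2, 7, 0, 5]
v3 = [12, 3, 0, -19, 8, -11]

# rank from the singular values, using numpy's default tolerance
rank = np.linalg.matrix_rank(np.vstack([v1, v2, v3]))

# the coefficients found above: v3 = 2*v1 - 3*v2
combo = 2 * np.array(v1) - 3 * np.array(v2)

print(rank, np.array_equal(combo, v3))
```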
4.5.1 another example
[ 7.57773162 5.99149259]
You can tell by inspection the rank is 2 because there are no near-zero
singular values.
import numpy as np

A = [[1, 2, 3],
     [0, 2, 3],
     [0, 0, 1e-6]]

U, s, V = np.linalg.svd(A)
print(s)
print(np.sum(np.abs(s) > 1e-15))
print(np.sum(np.abs(s) > 1e-5))
4.5.3 Application to independent chemical reactions.
reference: Exercise 2.4 in Chemical Reactor Analysis and Design Funda-
mentals by Rawlings and Ekerdt.
The following reactions are proposed in the hydrogenation of bromine:
Let this be our species vector: v = [H2 H Br2 Br HBr].T
The reactions are then defined by M*v, where M is a stoichiometric matrix
in which each row represents a reaction with negative stoichiometric
coefficients for reactants, and positive stoichiometric coefficients for
products. A stoichiometric coefficient of 0 is used for species not
participating in the reaction.
import numpy as np

#    [H2  H Br2 Br HBr]
M = [[-1,  0, -1,  0,  2],  # H2 + Br2 == 2HBr
     [ 0,  0, -1,  2,  0],  # Br2 == 2Br
     [-1,  1,  0, -1,  1],  # Br + H2 == HBr + H
     [ 0, -1, -1,  1,  1],  # H + Br2 == HBr + Br
     [ 1, -1,  0,  1, -1],  # H + HBr == H2 + Br
     [ 0,  0,  1, -2,  0]]  # 2Br == Br2

U, s, V = np.linalg.svd(M)
print(s)
print(np.sum(np.abs(s) > 1e-15))

import sympy
M = sympy.Matrix(M)
reduced_form, inds = M.rref()

print(reduced_form)

labels = ['H2', 'H', 'Br2', 'Br', 'HBr']
for row in reduced_form.tolist():
    s = '0 = '
    for nu, species in zip(row, labels):
        if nu != 0:
            s += ' {0:+d}{1}'.format(int(nu), species)
    if s != '0 = ':
        print(s)
Six reactions are given, but the rank of the matrix is only 3, so there are
only three independent reactions. You can see that reaction 6 is just the
opposite of reaction 2, so it is clearly not independent. Also, reactions 3 and
5 are just the reverse of each other, so one of them can also be eliminated.
Finally, reaction 4 is equal to reaction 1 minus reaction 3.
There are many possible independent reactions. In the code above, we
use sympy to put the matrix into reduced row echelon form, which enables
us to identify three independent reactions, and shows that three rows are all
zero, i.e. they are not independent of the other three reactions. The choice
of independent reactions is not unique.
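The relationships between the reactions stated above can be checked
numerically. This is a sketch that restates the stoichiometric matrix
from the listing; the names r4_from_others, r6_reverses_r2 and
r5_reverses_r3 are introduced here for illustration.

```python
import numpy as np

#             [H2  H Br2 Br HBr]
M = np.array([[-1,  0, -1,  0,  2],   # reaction 1
              [ 0,  0, -1,  2,  0],   # reaction 2
              [-1,  1,  0, -1,  1],   # reaction 3
              [ 0, -1, -1,  1,  1],   # reaction 4
              [ 1, -1,  0,  1, -1],   # reaction 5
              [ 0,  0,  1, -2,  0]])  # reaction 6

rank = np.linalg.matrix_rank(M)

# rows are zero-indexed, so reaction 4 is M[3]
r4_from_others = np.array_equal(M[3], M[0] - M[2])
r6_reverses_r2 = np.array_equal(M[5], -M[1])
r5_reverses_r3 = np.array_equal(M[4], -M[2])

print(rank, r4_from_others, r6_reverses_r2, r5_reverses_r3)
```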
import numpy as np
from sympy import Matrix

A = np.array([[3, 2, 1],
              [2, 1, 1],
              [6, 2, 4]])

rA, pivots = Matrix(A).rref()
print(rA)
This rref form is a bit different from what you might get from doing it
by hand. The rows are also normalized.
Based on this, we conclude the A matrix has a rank of 2 since one row
of the reduced form contains all zeros. That means the determinant will be
zero, and it should not be possible to compute the inverse of the matrix,
and there should be no unique solution to linear equations of Ax = b. Let
us check it out.
import numpy as np
from sympy import Matrix

A = np.array([[3, 2, 1],
              [2, 1, 1],
              [6, 2, 4]])

print(np.linalg.det(A))
print(np.linalg.inv(A))

b = np.array([3, 0, 6])

print(np.linalg.solve(A, b))
6.66133814775e-16
[[ 3.00239975e+15 -9.00719925e+15 1.50119988e+15]
[ -3.00239975e+15 9.00719925e+15 -1.50119988e+15]
[ -3.00239975e+15 9.00719925e+15 -1.50119988e+15]]
[ 1.80143985e+16 -1.80143985e+16 -1.80143985e+16]
There are "solutions", but there are a couple of red flags that should
catch your eye. First, the determinant is within machine precision of zero.
Second the elements of the inverse are all "large". Third, the solutions are all
"large". All of these are indications of or artifacts of numerical imprecision.
import numpy as np
from scipy.linalg import lu

A = np.array([[6, 2, 3],
              [1, 1, 1],
              [0, 4, 9]])

P, L, U = lu(A)

# P is a permutation matrix, so its determinant is +1 or -1
# depending on the number of row swaps
detP = np.linalg.det(P)
detL = np.prod(np.diag(L))
detU = np.prod(np.diag(U))

print(detP * detL * detU)

print(np.linalg.det(A))
24.0
24.0
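$$A = \begin{pmatrix}
-1.65 + 2.26i & -2.05 - 0.85i & 0.97 - 2.84i & 0 \\
6.30i & -1.48 - 1.75i & -3.99 + 4.01i & 0.59 - 0.48i \\
0 & -0.77 + 2.83i & -1.06 + 1.94i & 3.33 - 1.04i \\
0 & 0 & 4.48 - 1.09i & -0.46 - 1.72i
\end{pmatrix} \quad (8)$$
and
$$b = \begin{pmatrix}
-1.06 + 21.50i \\
-22.72 - 53.90i \\
28.24 - 38.60i \\
-34.56 + 16.73i
\end{pmatrix}. \quad (9)$$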
The A matrix has one lower diagonal (kl = 1) and two upper diagonals
(ku = 2), four equations (n = 4) and one right-hand side.
import scipy.linalg.lapack as la

# http://www.nag.com/lapack-ex/node22.html
import numpy as np
A = np.array([[-1.65 + 2.26j, -2.05 - 0.85j,  0.97 - 2.84j,  0.0         ],
              [6.30j,         -1.48 - 1.75j, -3.99 + 4.01j,  0.59 - 0.48j],
              [0.0,           -0.77 + 2.83j, -1.06 + 1.94j,  3.33 - 1.04j],
              [0.0,            0.0,           4.48 - 1.09j, -0.46 - 1.72j]])

# construction of Ab is tricky. Fortran indexing starts at 1, not
# 0. This code is based on the definition of Ab at
# http://linux.die.net/man/l/zgbsv. First, we create the Fortran
# indices based on the loops, and then subtract one from them to index
# the numpy arrays.
# np.complex was removed from numpy; the builtin complex is equivalent
Ab = np.zeros((5, 4), dtype=complex)
n, kl, ku = 4, 1, 2

for j in range(1, n + 1):
    for i in range(max(1, j - ku), min(n, j + kl) + 1):
        Ab[kl + ku + 1 + i - j - 1, j - 1] = A[i-1, j-1]

b = np.array([[-1.06 + 21.50j],
              [-22.72 - 53.90j],
              [28.24 - 38.60j],
              [-34.56 + 16.73j]])

# current scipy exposes the wrappers directly on scipy.linalg.lapack
lub, piv, x, info = la.zgbsv(kl, ku, Ab, b)

# compare to results at http://www.nag.com/lapack-ex/examples/results/zgbsv-ex.r
print('x = ', x)
print('info = ', info)

# check solution (compare magnitudes, since the residual is complex)
print('solved: ', np.all(np.abs(np.dot(A, x) - b) < 1e-12))

# here is the easy way!!!
print('\n\nbuilt-in solver')
print(np.linalg.solve(A, b))
x = [[-3.+2.j]
[ 1.-7.j]
[-5.+4.j]
[ 6.-8.j]]
info = 0
solved: True
built-in solver
[[-3.+2.j]
[ 1.-7.j]
[-5.+4.j]
[ 6.-8.j]]
Some points of discussion.
1. Kind of painful! But, nevertheless, possible. You have to do a lot
more work figuring out the dimensions of the problem, how to set up
the problem, keeping track of indices, etc.
But, one day it might be helpful to know this can be done, e.g. to debug
an installation, to validate an approach against known results, etc.
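A middle ground between calling LAPACK directly and the dense solver
is scipy.linalg.solve_banded. The sketch below is not the approach
used above; it reuses the same A and b and builds the band storage
with zero-based indices, where ab[ku + i - j, j] = A[i, j].

```python
import numpy as np
from scipy.linalg import solve_banded

A = np.array([[-1.65 + 2.26j, -2.05 - 0.85j,  0.97 - 2.84j,  0.0],
              [6.30j,         -1.48 - 1.75j, -3.99 + 4.01j,  0.59 - 0.48j],
              [0.0,           -0.77 + 2.83j, -1.06 + 1.94j,  3.33 - 1.04j],
              [0.0,            0.0,           4.48 - 1.09j, -0.46 - 1.72j]])
b = np.array([-1.06 + 21.50j, -22.72 - 53.90j, 28.24 - 38.60j, -34.56 + 16.73j])

n, kl, ku = 4, 1, 2

# diagonal ordered form: one row per diagonal, upper diagonals first
ab = np.zeros((kl + ku + 1, n), dtype=complex)
for i in range(n):
    for j in range(n):
        if -ku <= i - j <= kl:
            ab[ku + i - j, j] = A[i, j]

x_banded = solve_banded((kl, ku), ab, b)
x_dense = np.linalg.solve(A, b)
print(np.allclose(x_banded, x_dense))
```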
5 Nonlinear algebra
Nonlinear algebra problems are typically solved using an iterative process
that terminates when the solution is found within a specified tolerance.
This process is hidden from the user. The canonical standard form to solve
is f (X) = 0.
Cao = 2*u.mol/u.L;
V = 10*u.L;
nu = 0.5*u.L/u.s;
k = 0.23 * u.L/u.mol/u.s;
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

# unit definitions
m = 1.0
L = m**3 / 1000.0
mol = 1.0
s = 1.0

# provide data
Cao = 2.0 * mol / L
V = 10.0 * L
nu = 0.5 * L / s
k = 0.23 * L / mol / s

def func(Ca):
    return V - nu * (Cao - Ca)/(k * Ca**2)
c = np.linspace(0.001, 2) * mol / L

plt.clf()
plt.plot(c, func(c))
plt.xlabel('C (mol/m^3)')
plt.ylim([-0.1, 0.1])
plt.savefig('images/nonlin-tolerance.png')
Now let us solve the equation. It looks like an answer is near C=500.
559.583745606
-1.73472347598e-18
-1.73472347598e-21
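The listing that produced the numbers above is not shown here. A
minimal sketch of how such a solution might be obtained with fsolve
follows; the names Ca_guess and Ca_sol are assumptions introduced for
illustration, and the function repeats the definition given earlier.

```python
from scipy.optimize import fsolve

# unit definitions, as in the listing above
m = 1.0
L = m**3 / 1000.0
mol = 1.0
s = 1.0

Cao = 2.0 * mol / L
V = 10.0 * L
nu = 0.5 * L / s
k = 0.23 * L / mol / s

def func(Ca):
    return V - nu * (Cao - Ca) / (k * Ca**2)

Ca_guess = 500.0  # mol/m^3, read from the plot
Ca_sol, = fsolve(func, Ca_guess)

print(Ca_sol)
print(func(Ca_sol))  # residual of the equation at the solution
```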
know the reactor volume is 100 L, the inlet molar flow of A is 1 mol/L, the
volumetric flow is 10 L/min, and $r_a = -k C_a$, with k = 0.23 1/min. What
is the exit molar flow rate? We need to solve the following equation:
$100 = \int_{F_a(V=0)}^{F_a} \frac{1}{-k F_a / \nu} \, dF_a$
We start by creating a function handle that describes the integrand. We
can use this function in the quad command to evaluate the integral.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import fsolve

k = 0.23
nu = 10.0
Fao = 1.0

def integrand(Fa):
    return -1.0 / (k * Fa / nu)

def func(Fa):
    integral, err = quad(integrand, Fao, Fa)
    return 100.0 - integral

vfunc = np.vectorize(func)
Now we can see a zero is near Fa = 0.1, so we proceed to solve the
equation.
Fa_guess = 0.1
Fa_exit, = fsolve(vfunc, Fa_guess)
print('The exit concentration is {0:1.2f} mol/L'.format(Fa_exit / nu))
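This particular integral can also be done by hand, which gives a
check on the numerical answer: the exit flow is Fao * exp(-k V / nu).
The sketch below uses brentq with a bracket instead of fsolve, which
is a swapped-in root finder and not the method used above; the
bracket [0.01, Fao] is an assumption.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

k = 0.23
nu = 10.0
Fao = 1.0
V = 100.0

def integrand(Fa):
    return -1.0 / (k * Fa / nu)

def func(Fa):
    integral, err = quad(integrand, Fao, Fa)
    return V - integral

# func changes sign between these two flows, so a root lies between them
Fa_exit = brentq(func, 0.01, Fao)
Fa_analytical = Fao * np.exp(-k * V / nu)

print(Fa_exit, Fa_analytical)
```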
many solutions. The equations are implicit, so it is not easy to graph them,
but let us give it a shot, starting on the x range -5 to 5. The idea is set a
value for x, and then solve for y in each equation.
import numpy as np
from scipy.optimize import fsolve

import matplotlib.pyplot as plt

def f(x, y):
    return 2 + x + y - x**2 + 8*x*y + y**3

def g(x, y):
    return 1 + 2*x - 3*y + x**2 + x*y - y*np.exp(x)

x = np.linspace(-5, 5, 500)

@np.vectorize
def fy(x):
    x0 = 0.0
    def tmp(y):
        return f(x, y)
    y1, = fsolve(tmp, x0)
    return y1

@np.vectorize
def gy(x):
    x0 = 0.0
    def tmp(y):
        return g(x, y)
    y1, = fsolve(tmp, x0)
    return y1


plt.plot(x, fy(x), x, gy(x))
plt.xlabel('x')
plt.ylabel('y')
plt.legend(['fy', 'gy'])
plt.savefig('images/continuation-1.png')
/Users/jkitchin/anaconda3/lib/python3.5/site-packages/scipy/optimize/minpack.py:161:
improvement from the last ten iterations.
warnings.warn(msg, RuntimeWarning)
/Users/jkitchin/anaconda3/lib/python3.5/site-packages/scipy/optimize/minpack.py:161:
improvement from the last five Jacobian evaluations.
warnings.warn(msg, RuntimeWarning)
You can see there is a solution near x = -1, y = 0, because both functions
equal zero there. We can even use that guess with fsolve. It is
disappointingly easy! But, keep in mind that in 3 or more dimensions, you
cannot perform this visualization, and another method could be required.
def func(X):
    x, y = X
    return [f(x, y), g(x, y)]

print(fsolve(func, [-2, -2]))
[ -1.00000000e+00 1.28730858e-15]
Now, at $\lambda = 0$ we have the simple linear equations:
$x + y = -2$
$2x - 3y = -1$
These equations are trivial to solve:
[-1.4 -0.6]
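The listing for this step is not shown here. One way the result above
might be obtained is sketched below, assuming the linear system is
x + y = -2 and 2x - 3y = -1, which is consistent with the printed
solution.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [2.0, -3.0]])
b = np.array([-2.0, -1.0])

x0 = np.linalg.solve(A, b)
print(x0)
```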
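$\frac{\partial x}{\partial \lambda} = \frac{\partial f/\partial y \; \partial g/\partial \lambda - \partial f/\partial \lambda \; \partial g/\partial y}{\partial f/\partial x \; \partial g/\partial y - \partial f/\partial y \; \partial g/\partial x}$ (11)
$\frac{\partial y}{\partial \lambda} = \frac{\partial f/\partial \lambda \; \partial g/\partial x - \partial f/\partial x \; \partial g/\partial \lambda}{\partial f/\partial x \; \partial g/\partial y - \partial f/\partial y \; \partial g/\partial x}$ (12)
$\partial f/\partial x = 1 - 2\lambda x + 8\lambda y$ (13)
$\partial f/\partial y = 1 + 8\lambda x + 3\lambda y^2$ (15)
$\partial g/\partial x = 2 + 2\lambda x + \lambda y - \lambda y e^x$ (17)
$\partial g/\partial y = -3 + \lambda x - \lambda e^x$ (19)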
Now, we simply set up those two differential equations on
$\partial x/\partial \lambda$ and $\partial y/\partial \lambda$,
with the initial conditions at $\lambda = 0$, which is the solution of the
simpler linear equations, and integrate to $\lambda = 1$, which is the final
solution of the original equations!
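The listing for the integration is not shown here. A sketch of how it
might look follows. It assumes the lambda-weighted form in which the
linear terms of f and g are kept and the nonlinear terms are
multiplied by lambda, which is consistent with the linear equations
at lambda = 0 above; the name lam and the grid of 100 points are
choices made here.

```python
import numpy as np
from scipy.integrate import odeint

def f(x, y):
    return 2 + x + y - x**2 + 8*x*y + y**3

def g(x, y):
    return 1 + 2*x - 3*y + x**2 + x*y - y*np.exp(x)

def ode(X, lam):
    x, y = X
    # partial derivatives of the lambda-weighted equations (assumed form)
    pfpx = 1.0 - 2.0 * lam * x + 8.0 * lam * y
    pfpy = 1.0 + 8.0 * lam * x + 3.0 * lam * y**2
    pfpl = -x**2 + 8.0 * x * y + y**3
    pgpx = 2.0 + 2.0 * lam * x + lam * y - lam * y * np.exp(x)
    pgpy = -3.0 + lam * x - lam * np.exp(x)
    pgpl = x**2 + x * y - y * np.exp(x)
    denom = pfpx * pgpy - pfpy * pgpx
    dxdl = (pfpy * pgpl - pfpl * pgpy) / denom
    dydl = (pfpl * pgpx - pfpx * pgpl) / denom
    return [dxdl, dydl]

X0 = [-1.4, -0.6]                # solution of the linear problem
lam_span = np.linspace(0, 1, 100)
X = odeint(ode, X0, lam_span)

xsol, ysol = X[-1]
print(xsol, ysol)
print(f(xsol, ysol), g(xsol, ysol))  # residuals of the original equations
```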
You can see the solution is somewhat approximate; the true solution is x
= -1, y = 0. The approximation could be improved by lowering the tolerance
on the ODE solver. The functions evaluate to a small number, close to
zero. You have to apply some judgment to determine if that is sufficiently
accurate. For instance if the units on that answer are kilometers, but you
need an answer accurate to a millimeter, this may not be accurate enough.
This is a fair amount of work to get a solution! The idea is to solve a
simple problem, and then gradually turn on the hard part by the lambda
parameter. What happens if there are multiple solutions? The answer you
finally get will depend on your $\lambda = 0$ starting point, so it is
possible to miss solutions this way. For problems with lots of variables,
this would be a good approach if you can identify the easy problem.
5.4 Method of continuity for solving nonlinear equations -
Part II
Matlab post
Yesterday in Post 1324 we looked at a way to solve nonlinear
equations that takes away some of the burden of initial guess generation.
The idea was to reformulate the equations with a new variable $\lambda$, so
that at $\lambda = 0$ we have a simpler problem we know how to solve, and at
$\lambda = 1$ we have the original set of equations. Then, we derive a set of
ODEs on how the solution changes with $\lambda$, and solve them.
Today we look at a simpler example and explain a little more about what
is going on. Consider the equation: $f(x) = x^2 - 5x + 6 = 0$, which has two
roots, x = 2 and x = 3. We will use the method of continuity to solve this
equation to illustrate a few ideas. First, we introduce a new variable
$\lambda$ as: $f(x; \lambda) = 0$. For example, we could write
$f(x; \lambda) = \lambda x^2 - 5x + 6 = 0$. Now, when $\lambda = 0$, we have
the simpler equation $-5x + 6 = 0$, with the solution $x = 6/5$. The question
now is, how does x change as $\lambda$ changes? We get that from the total
derivative of how $f(x, \lambda)$ changes with $\lambda$. The total derivative
is:
$\frac{df}{d\lambda} = \frac{\partial f}{\partial \lambda} + \frac{\partial f}{\partial x} \frac{\partial x}{\partial \lambda} = 0$
We can calculate two of those quantities, $\frac{\partial f}{\partial \lambda}$
and $\frac{\partial f}{\partial x}$, analytically from our equation and solve
for $\frac{\partial x}{\partial \lambda}$ as
$\frac{\partial x}{\partial \lambda} = -\frac{\partial f}{\partial \lambda} / \frac{\partial f}{\partial x}$
That defines an ordinary differential equation that we can solve by
integrating from $\lambda = 0$, where we know the solution, to $\lambda = 1$,
which is the solution to the real problem. For this problem:
$\frac{\partial f}{\partial \lambda} = x^2$ and
$\frac{\partial f}{\partial x} = -5 + 2 \lambda x$.
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

def dxdL(x, Lambda):
    return -x**2 / (-5.0 + 2 * Lambda * x)

x0 = 6.0/5.0
Lspan = np.linspace(0, 1)
x = odeint(dxdL, x0, Lspan)

plt.plot(Lspan, x)
plt.xlabel(r'$\lambda$')
plt.ylabel('x')
plt.savefig('images/nonlin-contin-II-1.png')
We found one solution at x=2. What about the other solution? To get
that we have to introduce $\lambda$ into the equations in another way. We
could try: $f(x; \lambda) = x^2 + \lambda(-5x + 6)$, but this leads to an ODE
that is singular at the initial starting point. Another approach is
$f(x; \lambda) = x^2 + 6 + \lambda(-5x)$, but now the solution at
$\lambda = 0$ is imaginary, and we do not have a way to integrate that! What
we can do instead is add and subtract a number like this:
$f(x; \lambda) = x^2 - 4 + \lambda(-5x + 6 + 4)$. Now at $\lambda = 0$, we
have a simple equation with roots at $\pm 2$, and we already know that x = 2
is a solution. So, we create our ODE on $dx/d\lambda$ with initial condition
x(0) = -2.
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

def dxdL(x, Lambda):
    return (5 * x - 10) / (2 * x - 5 * Lambda)

x0 = -2
Lspan = np.linspace(0, 1)
x = odeint(dxdL, x0, Lspan)

plt.plot(Lspan, x)
plt.xlabel(r'$\lambda$')
plt.ylabel('x')
plt.savefig('images/nonlin-contin-II-2.png')
Now we have the other solution. Note if you choose the other root,
x = 2, you find that 2 is a root, and learn nothing new. You could choose
other values to add, e.g., if you chose to add and subtract 16, then you
would find that one starting point leads to one root, and the other starting
point leads to the other root. This method does not solve all problems
associated with nonlinear root solving, namely, how many roots are there,
and which one is "best" or physically reasonable? But it does give a way
to solve an equation where you have no idea what an initial guess should
be. You can see, however, that just like you can get different answers from
different initial guesses, here you can get different answers by setting up the
equations differently.
$f(x) = x^3 + 6x^2 - 4x - 24$
5.5.1 Use roots for this polynomial
This only works for a polynomial; it does not work for any other nonlinear
function.
import numpy as np
print(np.roots([1, 6, -4, -24]))
[-6. 2. -2.]
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-8, 4)
y = x**3 + 6 * x**2 - 4*x - 24
plt.plot(x, y)
plt.savefig('images/count-roots-1.png')
5.5.2 Method 1
Count the number of times the sign changes in the interval. What we have to
do is multiply neighboring elements together, and look for negative values.
That indicates a sign change. For example the product of two positive or
negative numbers is a positive number. You only get a negative number
from the product of a positive and negative number, which means the sign
changed.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-8, 4)
y = x**3 + 6 * x**2 - 4*x - 24

# y[:-1] * y[1:] pairs every element with its right-hand neighbor,
# including the last pair
print(np.sum(y[:-1] * y[1:] < 0))
This method gives us the number of roots, but not where the roots are.
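The sign changes do bracket the roots, though, so one possible
follow-up is to refine each bracket with a bracketing solver. The
sketch below uses scipy.optimize.brentq for that; this is an addition
for illustration and not part of the counting method itself.

```python
import numpy as np
from scipy.optimize import brentq

def f(x):
    return x**3 + 6 * x**2 - 4 * x - 24

x = np.linspace(-8, 4)
y = f(x)

# indices i where y[i] and y[i + 1] have opposite signs
idx = np.where(y[:-1] * y[1:] < 0)[0]

# each neighboring pair brackets one root
roots = [brentq(f, x[i], x[i + 1]) for i in idx]
print(roots)
```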
5.5.3 Method 2
Using events in an ODE solver. Python can identify events in the solution to
an ODE, for example, when a function has a certain value, e.g. f(x) = 0.
We can take advantage of this to find the roots and number of roots in this
case. We take the derivative of our function, and integrate it from an initial
starting point, and define an event function that counts zeros.
import numpy as np
from pycse import odelay

def fprime(f, x):
    return 3.0 * x**2 + 12.0*x - 4.0

def event(f, x):
    value = f  # we want f = 0
    isterminal = False
    direction = 0
    return value, isterminal, direction

xspan = np.linspace(-8, 4)
f0 = -120

X, F, TE, YE, IE = odelay(fprime, f0, xspan, events=[event])
for te, ye in zip(TE, YE):
    print('root found at x = {0: 1.3f}, f={1: 1.3f}'.format(te, float(ye)))
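If pycse is not available, a similar event search can be sketched
with scipy.integrate.solve_ivp. This is an alternative and not the
odelay function used above. Note the argument order is (x, f) here,
and max_step is set as an assumption so that the solver does not
step over two roots at once.

```python
import numpy as np
from scipy.integrate import solve_ivp

def fprime(x, f):
    # derivative of f(x) = x**3 + 6 x**2 - 4 x - 24
    return [3.0 * x**2 + 12.0 * x - 4.0]

def zero_crossing(x, f):
    # the solver records every x where this value changes sign
    return f[0]

sol = solve_ivp(fprime, (-8.0, 4.0), [-120.0], events=zero_crossing,
                max_step=0.25, rtol=1e-8, atol=1e-10)

roots = sol.t_events[0]
print(roots)
```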
You can see there are many roots to this equation, and we want to be
sure we get the nth root. This function is pretty well behaved, so if you
make a good guess about the solution you will get an answer, but if you
make a bad guess, you may get the wrong root. We examine next a way
to do it without guessing the solution. What we want is the solution to
$f(x) = 0$, but we want all the solutions in a given interval. We derive a new
equation for $f'(x)$, with initial condition $f(0) = f_0$, and integrate the
ODE with an event function that identifies all zeros of f for us. The
derivative of our function is $df/dx = d/dx(x J_1(x)) - Bi J_0'(x)$. It is
known (http://www.markrobrien.com/besselfunct.pdf) that
$d/dx(x J_1(x)) = x J_0(x)$, and $J_0'(x) = -J_1(x)$. All we have to do now
is set up the problem and run it.
def fprime(f, x):
    "df/dx"
    return x * jn(0, x) - Bi * (-jn(1, x))

def e1(f, x):
    "event function to find zeros of f"
    isterminal = False
    value = f
    direction = 0
    return value, isterminal, direction

f0 = f(0)
xspan = np.linspace(0, 30, 200)

x, fsol, XE, FE, IE = odelay(fprime, f0, xspan, events=[e1])

plt.plot(x, fsol, '.-', label='Numerical solution')
plt.plot(xspan, f(xspan), '--', label='Analytical function')
plt.plot(XE, FE, 'ro', label='roots')
plt.legend(loc='best')
plt.savefig('images/heat-transfer-roots-2.png')

for i, root in enumerate(XE):
    print('root {0} is at {1}'.format(i, root))
root 0 is at 1.2557837640729235
root 1 is at 4.079477427934495
root 2 is at 7.15579903773092
root 3 is at 10.270985121143715
root 4 is at 13.398397381859922
root 5 is at 16.53115870938385
root 6 is at 19.66672767595721
root 7 is at 22.80395034851908
root 8 is at 25.94222881771057
root 9 is at 29.081221488671474
You can work this out once, and then you have all the roots in the
interval and you can select the one you want.
$y = x^2$ (20)
$y = 8 - x^2$ (21)
x0, y0 = 1, 1  # initial guesses
guess = [x0, y0]
sol = fsolve(objective, guess)
print(sol)

# of course there may be more than one solution
x0, y0 = -1, -1  # initial guesses
guess = [x0, y0]
sol = fsolve(objective, guess)
print(sol)
[ 2. 4.]
[-2. 4.]
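The definition of objective is not shown in the listing above. A
sketch of what such a function might look like for equations (20)
and (21) follows; the exact form of the residuals is an assumption.

```python
from scipy.optimize import fsolve

def objective(X):
    x, y = X
    # both residuals are zero where y = x**2 and y = 8 - x**2 intersect
    return [y - x**2, y - (8 - x**2)]

sol = fsolve(objective, [1, 1])
print(sol)
```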
6 Statistics
6.1 Introduction to statistical data analysis
Matlab post
Given several measurements of a single quantity, determine the average
value of the measurements, the standard deviation of the measurements and
the 95% confidence interval for the average.
import numpy as np

y = [8.1, 8.0, 8.1]

ybar = np.mean(y)
s = np.std(y, ddof=1)

print(ybar, s)
8.06666666667 0.057735026919
4. Compute the student-t multiplier. This is a function of the confidence
interval you specify, and the number of data points you have minus 1.
You subtract 1 because one degree of freedom is lost from calculating
the average.
T_multiplier = 4.302652729911275
ci95 = 0.14342175766370865
The true average is between 7.9232449090029595 and 8.210088424330376 at a 95% confide
The confidence interval is defined as ybar +- T_multiplier*std/sqrt(n).
import numpy as np
from scipy.stats.distributions import t

y = [8.1, 8.0, 8.1]

ybar = np.mean(y)
s = np.std(y)  # population form (ddof=0); paired with sqrt(n-1) below

ci = 0.95
alpha = 1.0 - ci

n = len(y)
T_multiplier = t.ppf(1-alpha/2.0, n-1)

# s / sqrt(n-1) with ddof=0 equals std(ddof=1) / sqrt(n)
ci95 = T_multiplier * s / np.sqrt(n-1)

print([ybar - ci95, ybar + ci95])
[7.9232449090029595, 8.210088424330376]
We are 95% certain the next measurement will fall in the interval above.
import numpy as np
from scipy.stats.distributions import t

n = 10         # number of measurements
dof = n - 1    # degrees of freedom
avg_x = 16.1   # average measurement
std_x = 0.01   # standard deviation of measurements

# Find 95% prediction interval for next measurement
# (std_x / sqrt(n) is the standard error of the mean)

alpha = 1.0 - 0.95

pred_interval = t.ppf(1-alpha/2.0, dof) * std_x / np.sqrt(n)

s = ['We are 95% confident the next measurement',
     ' will be between {0:1.3f} and {1:1.3f}']
print(''.join(s).format(avg_x - pred_interval, avg_x + pred_interval))
We are 95% confident the next measurement will be between 16.093 and 16.107
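A caution on the interval above: std_x / sqrt(n) is the standard
error of the average. Many statistics texts write the interval for a
single new measurement with std_x * sqrt(1 + 1/n) instead, which is
wider. The sketch below compares the two half-widths; it is offered
as a hedged alternative, not as a correction taken from the original
post.

```python
import numpy as np
from scipy.stats import t

n = 10
dof = n - 1
avg_x = 16.1
std_x = 0.01

alpha = 1.0 - 0.95
tmult = t.ppf(1 - alpha / 2.0, dof)

mean_half_width = tmult * std_x / np.sqrt(n)            # interval for the average
pred_half_width = tmult * std_x * np.sqrt(1 + 1.0 / n)  # interval for one new measurement

print(avg_x - mean_half_width, avg_x + mean_half_width)
print(avg_x - pred_half_width, avg_x + pred_half_width)
```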
6.4 Are averages different
Matlab post
Adapted from http://stattrek.com/ap-statistics-4/unpaired-means.aspx
Class A had 30 students who received an average test score of 78, with
a standard deviation of 10. Class B had 25 students who received an average
test score of 85, with a standard deviation of 15. We want to know if the
difference in these averages is statistically significant. Note that we only
have estimates of the true average and standard deviation for each class,
and there is uncertainty in those estimates. As a result, we are unsure if
the averages are really different. It could have just been luck that a few
students in class B did better.
1 import numpy as np
2
3 n1 = 30 # students in class A
4 x1 = 78.0 # average grade in class A
5 s1 = 10.0 # std dev of exam grade in class A
6
7 n2 = 25 # students in class B
8 x2 = 85.0 # average grade in class B
9 s2 = 15.0 # std dev of exam grade in class B
10
11 # the standard error of the difference between the two averages.
12 SE = np.sqrt(s1**2 / n1 + s2**2 / n2)
13
14 # compute DOF
15 DF = (n1 - 1) + (n2 - 1)
The t-score is the difference between the observed difference of the
means and the hypothesized difference of the means, normalized by the
standard error. We compute the absolute value of the t-score to make
sure it is positive for convenience later.
1.99323179108
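The listing that produced this number is not shown here. A minimal sketch that reproduces it from the class statistics above, assuming the hypothesized difference of the means is zero (the name hypothesized_difference is introduced here for illustration):

```python
import numpy as np

n1, x1, s1 = 30, 78.0, 10.0   # class A: size, average, std dev
n2, x2, s2 = 25, 85.0, 15.0   # class B: size, average, std dev

# standard error of the difference between the two averages
SE = np.sqrt(s1**2 / n1 + s2**2 / n2)

hypothesized_difference = 0.0  # null hypothesis: the true averages are equal

# absolute value so the t-score is positive
tscore = np.abs(((x1 - x2) - hypothesized_difference) / SE)
print(tscore)
```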
6.4.3 Interpretation
A way to approach determining if the difference is significant or not is
to ask, does our computed average fall within a confidence range of the
hypothesized value (zero)? If it does, then we can attribute the difference
to statistical variations at that confidence level. If it does not, we can say
that statistical variations do not account for the difference at that confidence
level, and hence the averages must be different.
Let us compute the t-value that corresponds to a 95% confidence level
for a mean of zero with the degrees of freedom computed earlier. This means
that 95% of the t-scores we expect to get will fall within t95.
2.00574599354
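The value above can be obtained from the inverse cumulative t-distribution. This is a sketch, assuming a two-tailed test with the 53 degrees of freedom computed earlier:

```python
from scipy.stats import t

DF = 53            # (n1 - 1) + (n2 - 1) from the example above
ci = 0.95
alpha = 1 - ci

# two-tailed: put alpha/2 in each tail
t95 = t.ppf(1.0 - alpha / 2.0, DF)
print(t95)
```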
Since tscore < t95, we conclude that at the 95% confidence level we
cannot say these averages are statistically different because our computed
t-score falls in the expected range of deviations. Note that our t-score is
very close to the 95% limit. Let us consider a slightly lower confidence level.
ci = 0.94
alpha = 1 - ci
t94 = t.ppf(1.0 - alpha/2.0, DF)

print(t94)
1.92191364181
At the 94% confidence level, however, tscore > t94, which means we
can say with 94% confidence that the two averages are different; class B
performed better than class A did. Alternatively, there is only about a 6%
chance we are wrong about that statement.
Another way to get there
An alternative way to get the confidence that the averages are different
is to compute it directly from the cumulative t-distribution function. We
take the difference between the cumulative probability at tscore and the
cumulative probability at -tscore, which is the fraction of t-values
expected to fall between them. You can see here that we are practically
95% sure that the averages are different.
0.948605075732
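A short sketch of that calculation, reusing the t-score and degrees of freedom from above (the name confidence is introduced here):

```python
from scipy.stats import t

tscore = 1.99323179108  # t-score computed earlier
DF = 53                 # degrees of freedom

# fraction of the t-distribution between -tscore and +tscore
confidence = t.cdf(tscore, DF) - t.cdf(-tscore, DF)
print(confidence)
```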
We need to read the data in, and perform a regression analysis on P vs.
T. In python we start counting at 0, so we actually want columns 3 and 4.
import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt('data/PT.txt', skiprows=2)
T = data[:, 3]
P = data[:, 4]

plt.plot(T, P, 'k.')
plt.xlabel('Temperature')
plt.ylabel('Pressure')
plt.savefig('images/model-selection-1.png')
It appears the data is roughly linear, and we know from the ideal gas law
that PV = nRT, or P = nR/V*T, which says P should be linearly correlated
with T. Note that the temperature data is in degC, not in K, so it is not
expected that P=0 at T = 0. We will use linear algebra to compute the line
coefficients.
A = np.vstack([T**0, T]).T
b = P

# rcond=None selects the current default cutoff and avoids a warning
x, res, rank, s = np.linalg.lstsq(A, b, rcond=None)
intercept, slope = x
print('b, m =', intercept, slope)

n = len(b)
k = len(x)

# estimated variance of the residuals
sigma2 = np.sum((b - np.dot(A,x))**2) / (n - k)

C = sigma2 * np.linalg.inv(np.dot(A.T, A))  # covariance matrix
se = np.sqrt(np.diag(C))  # standard errors of the parameters

from scipy.stats.distributions import t
alpha = 0.05

sT = t.ppf(1-alpha/2., n - k) # student T multiplier
CI = sT * se

print('CI = ',CI)
for beta, ci in zip(x, CI):
    print('[{0} {1}]'.format(beta - ci, beta + ci))
b, m = 7.74899739238 3.93014043824
CI = [ 4.76511545 0.1026405 ]
[2.9838819463763695 12.514112838386989]
[3.8274999407885466 4.032780935691978]
The confidence interval on the intercept is large, but it does not contain
zero at the 95% confidence level.
The R^2 value accounts roughly for the fraction of variation in the data
that can be described by the model. Hence, a value close to one means nearly
all the variations are described by the model, except for random variations.
ybar = np.mean(P)
SStot = np.sum((P - ybar)**2)
SSerr = np.sum((P - np.dot(A, x))**2)
R2 = 1 - SSerr/SStot
print(R2)
0.993715411798
plt.figure(); plt.clf()
plt.plot(T, P, 'k.', T, np.dot(A, x), 'b-')
plt.xlabel('Temperature')
plt.ylabel('Pressure')
plt.title('R^2 = {0:1.3f}'.format(R2))
plt.savefig('images/model-selection-2.png')
The fit looks good, and R^2 is near one, but is it a good model? There
are a few ways to examine this. We want to make sure that there are no
systematic trends in the errors between the fit and the data, and we want
to make sure there are not hidden correlations with other variables. The
residuals are the error between the fit and the data. The residuals should
not show any patterns when plotted against any variables, and they do not
in this case.
residuals = P - np.dot(A, x)

# plt.subplots creates its own figure, so no separate plt.figure() is needed
f, (ax1, ax2, ax3) = plt.subplots(3)

ax1.plot(T, residuals, 'ko')
ax1.set_xlabel('Temperature')

run_order = data[:, 0]
ax2.plot(run_order, residuals, 'ko')
ax2.set_xlabel('run order')

ambientT = data[:, 2]
ax3.plot(ambientT, residuals, 'ko')
ax3.set_xlabel('ambient temperature')

plt.tight_layout() # make sure plots do not overlap
plt.savefig('images/model-selection-3.png')
There may be some correlations in the residuals with the run order. That
could indicate an experimental source of error.
We assume all the errors are uncorrelated with each other. We can use a
lag plot to assess this, where we plot residual[i] vs residual[i-1], i.e. we look
for correlations between adjacent residuals. This plot should look random,
with no correlations if the model is good.
plt.figure(); plt.clf()
# plot each residual against the one before it (lag of one)
plt.plot(residuals[1:], residuals[:-1], 'ko')
plt.xlabel('residual[i]')
plt.ylabel('residual[i-1]')
plt.savefig('images/model-selection-correlated-residuals.png')
A = np.vstack([T**0, T, T**2]).T
b = P

x, res, rank, s = np.linalg.lstsq(A, b, rcond=None)
print(x)

n = len(b)
k = len(x)

# estimated variance of the residuals
sigma2 = np.sum((b - np.dot(A,x))**2) / (n - k)

C = sigma2 * np.linalg.inv(np.dot(A.T, A))
se = np.sqrt(np.diag(C))

from scipy.stats.distributions import t
alpha = 0.05

sT = t.ppf(1-alpha/2., n - k) # student T multiplier
CI = sT * se

print('CI = ',CI)
for beta, ci in zip(x, CI):
    print('[{0} {1}]'.format(beta - ci, beta + ci))


ybar = np.mean(P)
SStot = np.sum((P - ybar)**2)
SSerr = np.sum((P - np.dot(A,x))**2)
R2 = 1 - SSerr/SStot
print('R^2 = {0}'.format(R2))
You can see that the confidence intervals on the constant and the T^2
term include zero. That is a good indication this additional parameter is
not significant. You can see also that the R^2 value is not better than the
one from a linear fit, so adding a parameter does not increase the goodness
of fit. This is an example of overfitting the data. Since the constant in
this model is apparently not significant, let us consider the simplest
model with a fixed intercept of zero.
Let us consider a model with intercept = 0, P = alpha*T.
A = np.vstack([T]).T
b = P

x, res, rank, s = np.linalg.lstsq(A, b, rcond=None)

n = len(b)
k = len(x)

# estimated variance of the residuals
sigma2 = np.sum((b - np.dot(A,x))**2) / (n - k)

C = sigma2 * np.linalg.inv(np.dot(A.T, A))
se = np.sqrt(np.diag(C))

from scipy.stats.distributions import t
alpha = 0.05

sT = t.ppf(1-alpha/2.0, n - k) # student T multiplier
CI = sT * se

for beta, ci in zip(x, CI):
    print('[{0} {1}]'.format(beta - ci, beta + ci))

plt.figure()
plt.plot(T, P, 'k.', T, np.dot(A, x))
plt.xlabel('Temperature')
plt.ylabel('Pressure')
plt.legend(['data', 'fit'])

ybar = np.mean(P)
SStot = np.sum((P - ybar)**2)
SSerr = np.sum((P - np.dot(A,x))**2)
R2 = 1 - SSerr/SStot
plt.title('R^2 = {0:1.3f}'.format(R2))
plt.savefig('images/model-selection-no-intercept.png')
[4.056801244949384 4.123083498991817]
The fit is visually still pretty good, and the R^2 value is only slightly
worse. Let us examine the residuals again.
residuals = P - np.dot(A,x)

plt.figure()
plt.plot(T, residuals, 'ko')
plt.xlabel('Temperature')
plt.ylabel('residuals')
plt.savefig('images/model-selection-no-incpt-resid.png')
You can see a slight trend of decreasing value of the residuals as the
temperature increases. This may indicate a deficiency in the model with
no intercept. For the ideal gas law in degC: PV = nR(T + 273), or
P = (nR/V)*T + 273*nR/V, so the intercept is expected to be non-zero in
this case. Specifically, we expect the intercept to be 273*R*n/V. Since the
molar density of a gas is pretty small, the intercept may be close to, but
not equal to zero. That is why the fit still looks ok, but is not as good
as letting the intercept be a fitting parameter. That is an example of the
deficiency in our model.
In the end, it is hard to justify a model more complex than a line in this
case.
import numpy as np
import matplotlib.pyplot as plt

N = 10000 # number of samples of parameters (size must be an integer)

A_mu = 2.5; A_sigma = 0.4
B_mu = 4.1; B_sigma = 0.3

A = np.random.normal(A_mu, A_sigma, size=N)
B = np.random.normal(B_mu, B_sigma, size=N)

p = A + B
m = A - B

plt.hist(p)
plt.show()

print(np.std(p))
print(np.std(m))

print(np.sqrt(A_sigma**2 + B_sigma**2)) # the analytical std dev
0.491531788514
0.495333805189
0.5
6.6.2 Multiplication
11.8683624528
11.8726576637
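The listing for this subsection is missing here. The sketch below reproduces the analytical value printed above, assuming F = 25 +/- 1 and x = 6.4 +/- 0.4 (the values used for F and x in the uncertainties example later in this chapter) and a seeded generator, which is an assumption made so the run is repeatable. The Monte Carlo value will differ slightly from the one printed above.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for repeatability (assumption)
N = 10000

F_mu, F_sigma = 25.0, 1.0
x_mu, x_sigma = 6.4, 0.4

F = rng.normal(F_mu, F_sigma, size=N)
x = rng.normal(x_mu, x_sigma, size=N)

t = F * x
mc_std = np.std(t)  # Monte Carlo estimate

# rule for products: relative errors add in quadrature
analytical_std = np.sqrt((F_sigma / F_mu)**2 + (x_sigma / x_mu)**2) * F_mu * x_mu

print(mc_std)
print(analytical_std)
```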
6.6.3 Division
This is really like multiplication: F / x = F * (1 / x).
d = F / x
print(np.std(d))
print(np.sqrt((F_sigma / F_mu)**2 + (x_sigma / x_mu)**2) * F_mu / x_mu)
0.295540277244
0.289859806243
6.6.4 Exponents
This rule is different than multiplication (A^2 = A*A) because in the
previous examples we assumed the errors in A and B for A*B were
uncorrelated. In A*A, the errors are not uncorrelated, so there is a
different rule for error propagation.
t_mu = 2.03; t_sigma = 0.01*t_mu # 1% error
A_mu = 16.07; A_sigma = 0.06

t = np.random.normal(t_mu, t_sigma, size=(1, N))
A = np.random.normal(A_mu, A_sigma, size=(1, N))

# Compute t^5 and sqrt(A) with error propagation
print(np.std(t**5))
print((5 * t_sigma / t_mu) * t_mu**5)
1.7454605614
1.7236544062149992
print(np.std(np.sqrt(A)))
print(1.0 / 2.0 * A_sigma / A_mu * np.sqrt(A_mu))
0.00747831865757
0.00748364738749
3.62167603078
3.61801050303
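The two numbers above appear to belong to a chained example whose listing is missing here. The sketch below reproduces the analytical value under the assumption that the quantity is v = v0 + a*t with v0 = 1.2 +/- 0.02, a = 3.0 +/- 0.3 and t = 12.0 +/- 0.12, the values used in the uncertainties example later in this chapter. The seeded generator is also an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for repeatability (assumption)
N = 10000

v0_mu, v0_sigma = 1.2, 0.02
a_mu, a_sigma = 3.0, 0.3
t_mu, t_sigma = 12.0, 0.12

v0 = rng.normal(v0_mu, v0_sigma, size=N)
a = rng.normal(a_mu, a_sigma, size=N)
t = rng.normal(t_mu, t_sigma, size=N)

v = v0 + a * t
mc_std = np.std(v)  # Monte Carlo estimate

# chain the rules: first the product a*t, then the sum with v0
at_sigma = a_mu * t_mu * np.sqrt((a_sigma / a_mu)**2 + (t_sigma / t_mu)**2)
analytical_std = np.sqrt(v0_sigma**2 + at_sigma**2)

print(mc_std)
print(analytical_std)
```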
6.6.6 Summary
You can numerically perform error propagation analysis if you know the
underlying distribution of errors on the parameters in your equations. One
benefit of the numerical propagation is you do not have to remember the
error propagation rules, and you directly look at the distribution in nonlinear
cases. Some limitations of this approach include
1. You have to know the distribution of the errors in the parameters
import uncertainties as u

# ufloat takes the nominal value and the standard deviation as two arguments
A = u.ufloat(2.5, 0.4)
B = u.ufloat(4.1, 0.3)
print(A + B)
print(A - B)
6.6+/-0.5
-1.6+/-0.5
F = u.ufloat(25, 1)
x = u.ufloat(6.4, 0.4)

t = F * x
print(t)

d = F / x
print(d)
160+/-12
3.91+/-0.29
Exponentiation
t = u.ufloat(2.03, 0.0203)
print(t**5)

from uncertainties.umath import sqrt
A = u.ufloat(16.07, 0.06)
print(sqrt(A))
# print(np.sqrt(A)) # this does not work

from uncertainties import unumpy as unp
print(unp.sqrt(A))
34.5+/-1.7
4.009+/-0.007
4.009+/-0.007
Note in the last example, we had to either import a function from
uncertainties.umath or import a special version of numpy that handles
uncertainty. This may be a limitation of the uncertainties package as not
all functions in arbitrary modules can be covered. Note, however, that you
can wrap a function to make it handle uncertainty like this.
import numpy as np

wrapped_sqrt = u.wrap(np.sqrt)
print(wrapped_sqrt(A))
4.009+/-0.007
import numpy as np
import uncertainties as u

x = np.array([u.ufloat(1, 0.01),
              u.ufloat(2, 0.1),
              u.ufloat(3, 0.1)])

y = 2 * x

# np.trapz takes the integrand first and the sample points second
print(np.trapz(x, y))
8.0+/-0.6
v0 = u.ufloat(1.2, 0.02)
a = u.ufloat(3.0, 0.3)
t = u.ufloat(12.0, 0.12)

v = v0 + a * t
print(v)
37+/-4
A real example? This is what I would set up for a real working example.
We try to compute the exit concentration from a CSTR. The idea is to
wrap the "external" fsolve function using the uncertainties.wrap function,
which handles the uncertainties. Unfortunately, it does not work, and it is
not clear why. But see the following discussion for a fix.
not know how to deal with uncertainties. The idea is to create a function
that returns a float, when everything is given as a float. Then, we wrap the
fsolve call, and finally wrap the wrapped fsolve call!
Step 1. Write the function to solve with arguments for all quantities
that carry uncertainty. This function may be called with uncertainties,
or with floats.
Step 2. Wrap the call to fsolve in a function that takes all the
parameters as arguments, and that returns the solution.
Step 3. Use uncertainties.wrap to wrap the function in Step 2 to get
the answer with uncertainties.
import uncertainties as u
from scipy.optimize import fsolve

Fa0 = u.ufloat(5.0, 0.05)
v0 = u.ufloat(10., 0.1)

V = u.ufloat(66000.0, 100.0) # reactor volume L^3
k = u.ufloat(3.0, 0.2) # rate constant L/mol/h

# Step 1
def func(Ca, v0, k, Fa0, V):
    "Mole balance for a CSTR. Solve this equation for func(Ca)=0"
    Fa = v0 * Ca # exit molar flow of A
    ra = -k * Ca**2 # rate of reaction of A L/mol/h
    return Fa0 - Fa + V * ra

# Step 2
def Ca_solve(v0, k, Fa0, V):
    'wrap fsolve to pass parameters as float or units'
    # this line is a little fragile. You must put [0] at the end or
    # you get the NotImplemented result
    guess = 0.1 * Fa0 / v0
    sol = fsolve(func, guess, args=(v0, k, Fa0, V))[0]
    return sol

# Step 3
print(u.wrap(Ca_solve)(v0, k, Fa0, V))
0.00500+/-0.00017
It would take some practice to get used to this, but the payoff is that
you have an "automatic" error propagation method.
Being ever the skeptic, let us compare the result above to the Monte
Carlo approach to error estimation below.
import numpy as np
from scipy.optimize import fsolve

N = 10000
Fa0 = np.random.normal(5, 0.05, (1, N))
v0 = np.random.normal(10.0, 0.1, (1, N))
V = np.random.normal(66000, 100, (1,N))
k = np.random.normal(3.0, 0.2, (1, N))

SOL = np.zeros((1, N))

for i in range(N):
    def func(Ca):
        return Fa0[0,i] - v0[0,i] * Ca + V[0,i] * (-k[0,i] * Ca**2)
    SOL[0,i] = fsolve(func, 0.1 * Fa0[0,i] / v0[0,i])[0]

print('Ca(exit) = {0}+/-{1}'.format(np.mean(SOL), np.std(SOL)))
Ca(exit) = 0.005007316200125377+/-0.00017141142140602455
6.7.1 Summary
The uncertainties module is pretty amazing. It automatically propagates
errors through a pretty broad range of computations. It is a little tricky for
third-party packages, but it seems doable.
Read more about the package at http://pythonhosted.org/uncertainties/index.html.
import numpy as np

n = np.random.uniform()
print('n = {0}'.format(n))

if n > 0.49:
    print('You win!')
else:
    print('you lose.')
n = 0.2932019329044865
you lose.
The odds of you winning the last bet are slightly stacked in your favor.
There is only a 49% chance your friend wins, but a 51% chance that you
win. Let's play the game many times and see how many times you win, and
how many times your friend wins. First, let's generate a bunch of numbers
and look at the distribution with a histogram.
import numpy as np

N = 10000
games = np.random.uniform(size=N)

wins = np.sum(games > 0.49)
losses = N - wins

print('You won {0} times ({1:%})'.format(wins, float(wins) / N))

import matplotlib.pyplot as plt
count, bins, ignored = plt.hist(games)
plt.savefig('images/random-thoughts-1.png')
As you can see, you won slightly more often than you lost.
It is possible to get random integers. Here are a few examples of getting
a random integer between 1 and 100. You might do this to get random
indices of a list, for example.
import numpy as np

# randint excludes the upper bound, so use 101 to include 100
print(np.random.randint(1, 101))
print(np.random.randint(1, 101, 3))
print(np.random.randint(1, 101, (2, 2)))
11
[50 72 79]
[[14 37]
[77 92]]
The normal distribution is defined by
$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$
where $\mu$ is the mean value, and $\sigma$ is the standard deviation. In
the standard distribution, $\mu = 0$ and $\sigma = 1$.
import numpy as np

mu = 1
sigma = 0.5
print(np.random.normal(mu, sigma))
print(np.random.normal(mu, sigma, 2))
0.9794466646232775
[ 1.58062379 0.71593225]
import numpy as np
import matplotlib.pyplot as plt

mu = 0; sigma = 1

N = 5000
samples = np.random.normal(mu, sigma, N)

# density=True normalizes the histogram (it replaces the older normed=True)
counts, bins, ignored = plt.hist(samples, 50, density=True)

plt.plot(bins, 1.0/np.sqrt(2 * np.pi * sigma**2)*np.exp(-((bins - mu)**2)/(2*sigma**2)))
plt.savefig('images/random-thoughts-2.png')
What fraction of points lie between plus and minus one standard deviation
of the mean?
samples >= mu-sigma will return a boolean vector that is True where the
inequality is true, and False where it is not. (samples >= mu-sigma) &
(samples <= mu+sigma) will return a vector that is True only where both
inequalities are true. Finally, we can sum the vector (True counts as one
and False as zero) to get the number of elements where the two
inequalities are true, and normalize by the total number of samples to get
the fraction of samples that are greater than mu-sigma and less than
mu+sigma.
import numpy as np
import matplotlib.pyplot as plt

mu = 0; sigma = 1

N = 5000
samples = np.random.normal(mu, sigma, N)

a = np.sum((samples >= (mu - sigma)) & (samples <= (mu + sigma))) / float(N)
b = np.sum((samples >= (mu - 2*sigma)) & (samples <= (mu + 2*sigma))) / float(N)
print('{0:%} of samples are within +- 1 standard deviation of the mean'.format(a))
print('{0:%} of samples are within +- 2 standard deviations of the mean'.format(b))
6.8.1 Summary
We only considered the numpy.random functions here, and not all of them.
There are many distributions of random numbers to choose from. There are
also random numbers in the python random module. Remember these are
only pseudorandom numbers, but they are still useful for many applications.
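Because the numbers are pseudorandom, the same seed gives the same sequence, which is handy when you want a calculation to be repeatable. A minimal sketch, where the seed value 42 is an arbitrary choice:

```python
import random

import numpy as np

# two numpy generators started from the same seed
rng1 = np.random.default_rng(42)
rng2 = np.random.default_rng(42)

a = rng1.normal(0.0, 1.0, 5)
b = rng2.normal(0.0, 1.0, 5)
same = np.array_equal(a, b)
print(same)  # True

# the standard library random module behaves the same way
random.seed(42)
c = random.random()
random.seed(42)
d = random.random()
print(c == d)  # True
```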
7 Data analysis
7.1 Fit a line to numerical data
Matlab post
We want to fit a line to this data:
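import numpy as np

# data to fit (the same values are used in the next section)
x = np.array([0, 0.5, 1, 1.5, 2.0, 3.0, 4.0, 6.0, 10])
y = np.array([0, -0.157, -0.315, -0.472, -0.629, -0.942, -1.255, -1.884, -3.147])

p = np.polyfit(x, y, 1)
print(p)
slope, intercept = p
print(slope, intercept)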
[-0.31452218 0.00062457]
-0.3145221843 0.000624573378839
To show the fit, we can use numpy.polyval to evaluate the fit at many
points.
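The listing for this step is missing here. A minimal sketch of evaluating the fit, assuming the same data as the linear algebra example in the next section; the names xfit and yfit are introduced for illustration:

```python
import numpy as np

x = np.array([0, 0.5, 1, 1.5, 2.0, 3.0, 4.0, 6.0, 10])
y = np.array([0, -0.157, -0.315, -0.472, -0.629, -0.942, -1.255, -1.884, -3.147])

p = np.polyfit(x, y, 1)  # [slope, intercept]

# evaluate the fitted line at many points across the data range
xfit = np.linspace(x.min(), x.max(), 50)
yfit = np.polyval(p, xfit)

print(yfit[0], yfit[-1])
```

Plotting x, y as points and xfit, yfit as a line with matplotlib then shows the fit on top of the data.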
7.2 Linear least squares fitting with linear algebra
Matlab post
The idea here is to formulate a set of linear equations that is easy to solve.
We can express the equations in terms of our unknown fitting parameters
p_i as:
x1^0*p0 + x1*p1 = y1
x2^0*p0 + x2*p1 = y2
x3^0*p0 + x3*p1 = y3
etc...
import numpy as np
x = np.array([0, 0.5, 1, 1.5, 2.0, 3.0, 4.0, 6.0, 10])
y = np.array([0, -0.157, -0.315, -0.472, -0.629, -0.942, -1.255, -1.884, -3.147])

A = np.column_stack([x**0, x])

M = np.dot(A.T, A)
b = np.dot(A.T, y)

i1, slope1 = np.dot(np.linalg.inv(M), b)
i2, slope2 = np.linalg.solve(M, b) # an alternative approach.

print(i1, slope1)
print(i2, slope2)

# plot data and fit
import matplotlib.pyplot as plt

plt.plot(x, y, 'bo')
plt.plot(x, np.dot(A, [i1, slope1]), 'r--')
plt.xlabel('x')
plt.ylabel('y')
plt.savefig('images/la-line-fit.png')
0.00062457337884 -0.3145221843
0.000624573378839 -0.3145221843
confidence interval for each parameter. Data from example 5-1 in Fogler,
Elements of Chemical Reaction Engineering.
We want the equation Ca(t) = b0 + b1*t + b2*t^2 + b3*t^3 + b4*t^4 fit
to the data in the least squares sense. We can write this in a linear algebra
form as: T*p = Ca where T is a matrix of columns [1 t t^2 t^3 t^4], and
p is a column vector of the fitting parameters. We want to solve for the p
vector and estimate the confidence intervals.
pycse now has a regress function similar to Matlab. That function just
uses the code in the next example.
[[ 4.90747574e-02 5.09057619e-02]
[ -3.49867288e-04 -2.45825348e-04]
[ 5.40268291e-07 2.14670133e-06]
[ -7.67338615e-09 7.03689639e-10]
[ -3.23368790e-12 1.06276264e-11]]
import numpy as np
from scipy.stats.distributions import t

time = np.array([0.0, 50.0, 100.0, 150.0, 200.0, 250.0, 300.0])
Ca = np.array([50.0, 38.0, 30.6, 25.6, 22.2, 19.5, 17.4])*1e-3

T = np.column_stack([time**0, time, time**2, time**3, time**4])

p, res, rank, s = np.linalg.lstsq(T, Ca, rcond=None)
# the parameters are now in p

# compute the confidence intervals
n = len(Ca)
k = len(p)

sigma2 = np.sum((Ca - np.dot(T, p))**2) / (n - k) # estimated error variance

C = sigma2 * np.linalg.inv(np.dot(T.T, T)) # covariance matrix
se = np.sqrt(np.diag(C)) # standard error

alpha = 0.05 # 100*(1 - alpha) confidence level

sT = t.ppf(1.0 - alpha/2.0, n - k) # student T multiplier
CI = sT * se

for beta, ci in zip(p, CI):
    print('{2: 1.2e} [{0: 1.4e} {1: 1.4e}]'.format(beta - ci, beta + ci, beta))

SS_tot = np.sum((Ca - np.mean(Ca))**2)
SS_err = np.sum((np.dot(T, p) - Ca)**2)

# http://en.wikipedia.org/wiki/Coefficient_of_determination
Rsq = 1 - SS_err/SS_tot
print('R^2 = {0}'.format(Rsq))

# plot fit
import matplotlib.pyplot as plt
plt.plot(time, Ca, 'bo', label='raw data')
plt.plot(time, np.dot(T, p), 'r-', label='fit')
plt.xlabel('Time')
plt.ylabel('Ca (mol/L)')
plt.legend(loc='best')
plt.savefig('images/linregress-conf.png')
A fourth order polynomial fits the data well, with a good R^2 value. Note,
however, that the confidence intervals listed earlier for the t^3 and t^4
coefficients include zero, so those two parameters are not clearly
significant at the 95% level. A good fit does not mean this is the best
model for the data, just that the model fits well.
def objective(pars, y, x):
    # we will minimize this function
    err = y - Murnaghan(pars, x)
    return err

x0 = [-56.0, 0.54, 2.0, 16.5] # initial guess of parameters

plsq = leastsq(objective, x0, args=(energies, vols))

print('Fitted parameters = {0}'.format(plsq[0]))

import matplotlib.pyplot as plt
plt.plot(vols, energies, 'ro')

# plot the fitted curve on top
x = np.linspace(min(vols), max(vols), 50)
y = Murnaghan(plsq[0], x)
plt.plot(x, y, 'k-')
plt.xlabel('Volume')
plt.ylabel('Energy')
plt.savefig('images/nonlinear-curve-fitting.png')
See additional examples at http://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html.
Iterations: 137
Function evaluations: 240
parameters = [-56.46932645 0.59141447 1.9044796 16.59341303]
import numpy as np
import matplotlib.pyplot as plt

x = np.array([0.0, 1.1, 2.3, 3.1, 4.05, 6.0])
y = np.array([0.0039, 1.2270, 5.7035, 10.6472, 18.6032, 42.3024])

plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.savefig('images/nonlin-minsse-1.png')
We are going to fit the function y = x^a to the data. The best a will
minimize the summed squared error between the model and the data.
def errfunc_(a):
    return np.sum((y - x**a)**2)

errfunc = np.vectorize(errfunc_)

arange = np.linspace(1, 3)
sse = errfunc(arange)

plt.figure()
plt.plot(arange, sse)
plt.xlabel('a')
plt.ylabel(r'$\Sigma (y - y_{pred})^2$')
plt.savefig('images/nonlin-minsse-2.png')
Based on the graph above, you can see a minimum in the summed
squared error near a = 2.1. We use that as our initial guess. Since we
know the answer is bounded, we use scipy.optimize.fminbound
2.09004838933
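The listing that produced this value is missing here. A minimal sketch of the bounded minimization, assuming the search interval 1 to 3 that was used for the graph above (the name a_opt is introduced for illustration):

```python
import numpy as np
from scipy.optimize import fminbound

x = np.array([0.0, 1.1, 2.3, 3.1, 4.05, 6.0])
y = np.array([0.0039, 1.2270, 5.7035, 10.6472, 18.6032, 42.3024])

def errfunc_(a):
    # summed squared error between the data and the model y = x**a
    return np.sum((y - x**a)**2)

# fminbound takes the lower and upper bounds rather than an initial guess
a_opt = fminbound(errfunc_, 1.0, 3.0)
print(a_opt)
```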
We can do nonlinear fitting by directly minimizing the summed squared
error between a model and data. This method lacks some of the features
of other methods, notably the simple ability to get the confidence interval.
However, this method is flexible and may offer more insight into how the
solution depends on the parameters.
95% confidence intervals on the parameters.
You can see by inspection that the fit looks pretty reasonable. The
parameter confidence intervals are not too big, so we can be pretty confident
of their values.
    return c0 * np.exp(-x) + c1*x

pars, pcov = curve_fit(func, x, y, p0=[4.96, 2.11])

alpha = 0.05 # 95% confidence interval

n = len(y) # number of data points
p = len(pars) # number of parameters

dof = max(0, n-p) # number of degrees of freedom

tval = t.ppf(1.0 - alpha / 2.0, dof) # student-t value for the dof and confidence level

for i, p, var in zip(range(n), pars, np.diag(pcov)):
    sigma = var**0.5
    print('c{0}: {1} [{2} {3}]'.format(i, p,
                                       p - sigma*tval,
                                       p + sigma*tval))

import matplotlib.pyplot as plt
plt.plot(x, y, 'bo')
xfit = np.linspace(0, 1)
yfit = func(xfit, pars[0], pars[1])
plt.plot(xfit, yfit, 'b-')
plt.legend(['data', 'fit'], loc='best')
plt.savefig('images/nonlin-fit-ci.png')
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Figure 3: Nonlinear fit to data.
    return f

@np.vectorize
def errfunc(a, b):
    # function for the summed squared error
    fit = model(X, a, b)
    sse = np.sum((fit - f)**2)
    return sse

SSE = errfunc(A, B)

plt.clf()
plt.contourf(A, B, SSE, 50)
plt.plot([3.2], [2.1], 'ro')
# plt.text places the label in data coordinates, next to the marker
plt.text(3.4, 2.2, 'Minimum near here', color='r')

plt.savefig('images/graphical-mulvar-2.png')

guesses = [3.18, 2.02]

from scipy.optimize import curve_fit

popt, pcov = curve_fit(model, X, f, guesses)
print(popt)

plt.plot([popt[0]], [popt[1]], 'r*')
plt.savefig('images/graphical-mulvar-3.png')

print(model(X, *popt))

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

ax.plot(x1, x2, f, 'ko', label='data')
ax.plot(x1, x2, model(X, *popt), 'r-', label='fit')
ax.set_xlabel('x1')
ax.set_ylabel('x2')
ax.set_zlabel('f(x1,x2)')

plt.savefig('images/graphical-mulvar-4.png')
[ 3.21694798 1.9728254 ]
[ 3.25873623 6.59792994 10.29473657 13.68011436 17.29161001
23.62366445]
It can be difficult to figure out initial guesses for nonlinear fitting
problems. For one and two dimensional systems, graphical techniques may be
useful to visualize how the summed squared error between the model and
data depends on the parameters.
7.11 Fitting a numerical ODE solution to data
Matlab post
Suppose we know the concentration of A follows this differential
equation: $\frac{dC_A}{dt} = -k C_A$, and we have data we want to fit to
it. Here is an example of doing that.
import numpy as np
from scipy.optimize import curve_fit
from scipy.integrate import odeint

# given data we want to fit
tspan = [0, 0.1, 0.2, 0.4, 0.8, 1]
Ca_data = [2.0081, 1.5512, 1.1903, 0.7160, 0.2562, 0.1495]

def fitfunc(t, k):
    'Function that returns Ca computed from an ODE for a k'
    def myode(Ca, t):
        return -k * Ca

    Ca0 = Ca_data[0]
    Casol = odeint(myode, Ca0, t)
    return Casol[:,0]

k_fit, kcov = curve_fit(fitfunc, tspan, Ca_data, p0=1.3)
print(k_fit)

tfit = np.linspace(0, 1)
fit = fitfunc(tfit, k_fit)

import matplotlib.pyplot as plt
plt.plot(tspan, Ca_data, 'ro', label='data')
plt.plot(tfit, fit, 'b-', label='fit')
plt.legend(loc='best')
plt.savefig('images/ode-fit.png')
[ 2.58893455]
7.12 Reading in delimited text files
Matlab post
Sometimes you will get data in a delimited text file format, e.g.
separated by commas or tabs. NumPy can read these in easily. Suppose we
have a file containing this data:
1 3
3 4
5 6
4 8
import numpy as np

x, y = np.loadtxt('data/testdata.txt', unpack=True)

print(x, y)
[ 1. 3. 5. 4.] [ 3. 4. 6. 8.]
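The example above relies on the default whitespace delimiter. For comma-separated data you can pass the delimiter explicitly. A small sketch, using an in-memory string in place of a file so it is self-contained (the comma-separated content is a made-up variant of the data above):

```python
import io

import numpy as np

# stand-in for a comma-separated file with the same numbers as above
text = io.StringIO("1,3\n3,4\n5,6\n4,8\n")

x, y = np.loadtxt(text, delimiter=',', unpack=True)

print(x, y)
```

For tab-separated files the same call works with delimiter='\t'.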
8 Interpolation
8.1 Better interpolate than never
Matlab post
We often have some data that we have obtained in the lab, and we want
to solve some problem using the data. For example, suppose we have this
data that describes the value of f at time t.
from scipy.interpolate import interp1d

g = interp1d(t, f) # default is linear interpolation

print(g(2))
print(g([2, 3, 4]))
0.20885
[ 0.20885 0.0498 0.03403333]
import numpy as np
print(np.exp(-2))
0.135335283237
g2 = interp1d(t, f, 'cubic')
print(g2(2))
print(g2([2, 3, 4]))
0.10848181818181851
[ 0.10848182 0.0498 0.08428727]
plt.figure()
plt.plot(t, f)
plt.xlabel('t')
plt.ylabel('f(t)')

x = np.linspace(0.5, 6)
fit = g2(x)
plt.plot(x, fit, label='fit')
plt.savefig('images/interpolation-2.png')
Wow. That is a weird looking fit. Very different from what Matlab
produces. This is a good teaching moment not to rely blindly on
interpolation! We will rely on the linear interpolation from here out,
which behaves predictably.
Method 1
We set up a function that we can use fsolve on. The function will be equal
to zero at the time we seek, so it will look like 0 = 0.2 - f(t). The
answer for 0.2 = exp(-t) is t = 1.6094. Since we use interpolation here,
we will get an approximate answer.
from scipy.optimize import fsolve

def func(t):
    return 0.2 - g(t)

initial_guess = 2
ans, = fsolve(func, initial_guess)
print(ans)
2.0556428796
g3 = interp1d(f[::-1], t[::-1])

print(g3(0.2))
2.055642879597611
def func(t):
    'objective function. we do some error bounds because we cannot interpolate out of the range.'
    if t < 0.5: t = 0.5
    if t > 6: t = 6
    return 100 - 1.0 / g(t)

initial_guess = 4.5
a1, = fsolve(func, initial_guess)
print(a1)
print('The %error is {0:%}'.format((a1 - 4.6052)/4.6052))
5.52431289641
The %error is 19.958154%
3.6310782241
The %error is -21.152649%
8.1.5 Discussion
In this case you get different errors, one overestimates and one
underestimates the answer, and by a lot: 20%. Let us look at what is
happening.
You can see that the 1/interpolated f(x) underestimates the value, while
interpolated (1/f(x)) overestimates the value. This is an example of where
you clearly need more data in that range to make good estimates. Neither
interpolation method is doing a great job. The trouble in reality is that you
often do not know the real function to do this analysis. Here you can say the
time is probably between 3.6 and 5.5 where 1/f(t) = 100, but you can not
read much more than that into it. If you need a more precise answer, you
need better data, or you need to use an approach other than interpolation.
For example, you could fit an exponential function to the data and use that
to estimate values at other times.
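As a sketch of that last suggestion, the block below fits f(t) = a*exp(-k*t) and solves 1/f(t) = 100 analytically. The data values are an assumption: they are chosen to be consistent with the interpolated values printed earlier in this section (for example g(2) = 0.20885), since the original data listing is not shown here. The names model, a_fit, k_fit and t100 are introduced for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

# assumed data, consistent with the interpolation results printed earlier
t = np.array([0.5, 1.0, 3.0, 6.0])
f = np.array([0.6065, 0.3679, 0.0498, 0.0025])

def model(t, a, k):
    # exponential decay model
    return a * np.exp(-k * t)

(a_fit, k_fit), pcov = curve_fit(model, t, f, p0=[1.0, 1.0])

# solve 1/f(t) = 100, i.e. a*exp(-k*t) = 0.01, for t
t100 = np.log(100.0 * a_fit) / k_fit

print(a_fit, k_fit)
print(t100)
```

With a model that matches the underlying function, the estimate lands much closer to 4.6052 than either interpolation approach did.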
So which is the best to interpolate? I think you should interpolate the
quantity that is linear in the problem you want to solve, so in this case I
think interpolating 1/f(x) is better. When you use an interpolated function
in a nonlinear function, strange, unintuitive things can happen. That is why
the blue curve looks odd. Between data points are linear segments in the
original interpolation, but when you invert them, you cause the curvature
to form.
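The suggestion above to fit an exponential instead of interpolating can be sketched as follows. This is only an illustration: the sample times in `t_data` are hypothetical (the data points used in this section are not shown here), and the model f(t) = a exp(b t) is an assumption that happens to match the reference answer 4.6052 = ln(100).

```python
import numpy as np

# Hypothetical samples of f(t) = exp(-t); these values are assumptions,
# not the data used in the text.
t_data = np.array([0.5, 1.0, 3.0, 6.0])
f_data = np.exp(-t_data)

# Fit ln(f) = ln(a) + b * t with a first-order polynomial.
b, ln_a = np.polyfit(t_data, np.log(f_data), 1)

# Solve 1/f(t) = 100, i.e. ln(a) + b * t = -ln(100), for t.
t_100 = (-np.log(100.0) - ln_a) / b
print(t_100)
```

Because the fit is linear in ln(f), no initial guess or nonlinear solver is needed. With noisy data the estimate would of course carry the uncertainty of the fit.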
2.675
[ 2.675 2.495 2.9 nan]
import numpy as np
print(np.array(xi)**2)
In this case the cubic spline interpolation is more accurate than the
linear interpolation. That is because the underlying data was polynomial in
nature, and a spline is like a polynomial. That may not always be the case,
and you need some engineering judgement to know which method is best.
Iterations: 12
Function evaluations: 24
Figure 4: Illustration of a spline fit to data and finding the maximum point.
9 Optimization
9.1 Constrained optimization
Matlab post
adapted from http://en.wikipedia.org/wiki/Lagrange_multipliers.
Suppose we seek to minimize the function $f(x, y) = x + y$ subject to
the constraint that $x^2 + y^2 = 1$. The function we seek to minimize is an
unbounded plane, while the constraint is a unit circle. We could set up a
Lagrange multiplier approach to solving this problem, but we will use a
constrained optimization approach instead.
from scipy.optimize import fmin_slsqp

def objective(X):
    x, y = X
    return x + y

def eqc(X):
    'equality constraint'
    x, y = X
    return x**2 + y**2 - 1.0

X0 = [-1, -1]
X = fmin_slsqp(objective, X0, eqcons=[eqc])
print(X)
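As a cross-check, the same problem can be posed through the general `scipy.optimize.minimize` interface with `method='SLSQP'`. This is a minimal sketch, not the approach used in the text; the constraint dictionary format is the one `minimize` expects, and the variable name `res` is only illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def objective(X):
    x, y = X
    return x + y

def eqc(X):
    # equality constraint, satisfied when the return value is zero
    x, y = X
    return x**2 + y**2 - 1.0

res = minimize(objective, [-1.0, -1.0], method='SLSQP',
               constraints=[{'type': 'eq', 'fun': eqc}])
print(res.x, res.fun)
```

By symmetry the minimum on the unit circle is at x = y = -1/sqrt(2), which gives a quick way to judge the numerical result.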
You can see in fact there is a maximum, near V=0.6. We could solve
this problem analytically by taking the appropriate derivative and solving
it for zero. That still might require solving a nonlinear problem though. We
will directly set up and solve the constrained optimization.
plt.plot(V, i(V), Vmax, imax, 'ro')
plt.savefig('images/P2.png')
You can see the maximum power is approximately 0.2 (unspecified units),
at the conditions indicated by the red dot in the figure above.
import numpy as np

x = np.linspace(-1.5, 1.5)

[X, Y] = np.meshgrid(x, x)

import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

fig = plt.figure()
# add_subplot replaces fig.gca(projection='3d'), which newer Matplotlib no longer accepts
ax = fig.add_subplot(111, projection='3d')

ax.plot_surface(X, Y, X + Y)

theta = np.linspace(0, 2*np.pi)
R = 1.0
x1 = R * np.cos(theta)
y1 = R * np.sin(theta)

ax.plot(x1, y1, x1 + y1, 'r-')
plt.savefig('images/lagrange-1.png')
function. Since g(x, y) = 0, we are not really changing the original function,
provided that the constraint is met!
import numpy as np

def func(X):
    x = X[0]
    y = X[1]
    L = X[2] # this is the multiplier. lambda is a reserved keyword in python
    return x + y + L * (x**2 + y**2 - 1)

def dfunc(X):
    dLambda = np.zeros(len(X))
    h = 1e-3 # this is the step size used in the finite difference.
    for i in range(len(X)):
        dX = np.zeros(len(X))
        dX[i] = h
        dLambda[i] = (func(X+dX)-func(X-dX))/(2*h)
    return dLambda

X2 = fsolve(dfunc, [-1, -1, 0])
print(X2, func(X2))
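For this small problem the partial derivatives of the augmented function can also be written by hand, which avoids the finite-difference step. The sketch below is an alternative to the numerical `dfunc`, not the method used in the text. The function name `grad_lagrangian` and the starting guess are assumptions; the guess uses a nonzero multiplier because the analytic Jacobian is singular when the multiplier is exactly zero.

```python
import numpy as np
from scipy.optimize import fsolve

def grad_lagrangian(X):
    # partial derivatives of x + y + L * (x**2 + y**2 - 1)
    x, y, L = X
    return [1.0 + 2.0 * L * x,      # d/dx
            1.0 + 2.0 * L * y,      # d/dy
            x**2 + y**2 - 1.0]      # d/dL, which recovers the constraint

X_min = fsolve(grad_lagrangian, [-1.0, -1.0, 1.0])
print(X_min)
```

The third equation shows why the method works: setting the derivative with respect to the multiplier to zero is the same as enforcing the constraint.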
9.3.4 Summary
Three dimensional plots in matplotlib are a little more difficult than in Mat-
lab (where the code is almost the same as 2D plots, just different commands,
e.g. plot vs plot3). In Matplotlib you have to import additional modules in
the right order, and use the object oriented approach to plotting as shown
here.
To solve this problem, we cast it as a linear programming problem, which
minimizes a function f(X) subject to some constraints. We create a proxy
function for the negative of profit, which we seek to minimize.
f = -(143*x + 60*y)
This code is not exactly the same as the original post, but we get to
the same answer. The linear programming capability in scipy is currently
somewhat limited in 0.10. It is a little better in 0.11, but probably not as
advanced as Matlab. There are some external libraries available:
1. http://abel.ee.ucla.edu/cvxopt/
2. http://openopt.org/LP
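More recent SciPy releases than the ones mentioned above include `scipy.optimize.linprog`. The sketch below shows how the negative-profit objective f = -(143*x + 60*y) could be passed to it. The constraint coefficients in `A_ub` and `b_ub` are illustrative assumptions, since the constraints of the problem are not reproduced in this section.

```python
from scipy.optimize import linprog

# minimize c @ [x, y], i.e. the negative of the profit 143*x + 60*y
c = [-143.0, -60.0]

# Hypothetical inequality constraints A_ub @ [x, y] <= b_ub (assumed values)
A_ub = [[120.0, 210.0],
        [110.0, 30.0],
        [1.0, 1.0]]
b_ub = [15000.0, 4000.0, 75.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)  # optimal (x, y) and the corresponding profit
```

Note that `linprog` always minimizes, which is why the profit is negated in `c` and negated again when it is reported.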
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import fmin_cobyla

P = (0.5, 2)

def f(x):
    return x**2

def objective(X):
    x,y = X
    return np.sqrt((x - P[0])**2 + (y - P[1])**2)

def c1(X):
    x,y = X
    return f(x) - y

X = fmin_cobyla(objective, x0=[0.5,0.5], cons=[c1])

print('The minimum distance is {0:1.2f}'.format(objective(X)))

# Verify the vector to this point is normal to the tangent of the curve
# position vector from curve to point
v1 = np.array(P) - np.array(X)
# tangent vector to the curve at X: (1, f'(x)) = (1, 2x)
v2 = np.array([1, 2.0 * X[0]])
print('dot(v1, v2) = ', np.dot(v1, v2))

x = np.linspace(-2, 2, 100)

plt.plot(x, f(x), 'r-', label='f(x)')
plt.plot(P[0], P[1], 'bo', label='point')
plt.plot([P[0], X[0]], [P[1], X[1]], 'b-', label='shortest distance')
plt.plot([X[0], X[0] + 1], [X[1], X[1] + 2.0 * X[0]], 'g-', label='tangent')
plt.axis('equal')
plt.xlabel('x')
plt.ylabel('y')
plt.legend(loc='best')
plt.savefig('images/min-dist-p-func.png')
In the code above, we demonstrate that the point we find on the curve
that minimizes the distance satisfies the property that a vector from that
point to our other point is normal to the tangent of the curve at that point.
This is shown by the fact that the dot product of the two vectors is very
close to zero. It is not exactly zero because the accuracy criterion used to
stop the minimization is not tight enough.
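One way to see that the nonzero dot product is only a tolerance effect is to solve the same problem without an iterative minimizer. For the curve y = x**2, setting the derivative of the squared distance to zero gives the cubic 4x^3 - 6x - 1 = 0, whose roots can be found with `np.roots`. This is a sketch of a cross-check under that derivation, not the method used in the text; the names `x_candidates` and `x_min` are illustrative.

```python
import numpy as np

P = np.array([0.5, 2.0])

# d/dx [(x - 0.5)**2 + (x**2 - 2)**2] = 4x^3 - 6x - 1
roots = np.roots([4.0, 0.0, -6.0, -1.0])
x_candidates = roots[np.abs(roots.imag) < 1e-9].real

# keep the stationary point with the smallest distance to P
dist = np.sqrt((x_candidates - P[0])**2 + (x_candidates**2 - P[1])**2)
x_min = x_candidates[np.argmin(dist)]
d_min = dist.min()

v1 = P - np.array([x_min, x_min**2])  # from the curve to the point
v2 = np.array([1.0, 2.0 * x_min])     # tangent to the curve at x_min
print('distance = {0:1.2f}, dot(v1, v2) = {1:1.2e}'.format(d_min, np.dot(v1, v2)))
```

Here the dot product vanishes to within floating point roundoff, because the stationarity condition is solved directly rather than approached iteratively.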
10 Differential equations
The key to successfully solving many differential equations is correctly clas-
sifying the equations, putting them into a standard form and then picking
the appropriate solver. You must be able to determine if an equation is:
The following sections will illustrate the methods for solving these kinds
of equations.
$$\frac{dy}{dt} = y(t)$$
over the time span of 0 to 2. The initial condition is y(0) = 1.
To solve this equation, you need to create a function of the form dydt
= f(y, t) and then use one of the ODE solvers, e.g. odeint.
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

def fprime(y,t):
    return y

tspan = np.linspace(0, 2)  # time span of 0 to 2, as stated in the problem
y0 = 1
ysol = odeint(fprime, y0, tspan)
plt.figure(figsize=(4,3))
plt.plot(tspan, ysol, label='numerical solution')
plt.plot(tspan, np.exp(tspan), 'r--', label='analytical solution')
plt.xlabel('time')
plt.ylabel('y(t)')
plt.legend(loc='best')
plt.savefig('images/simple-ode.png')
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

def fprime(y,t):
    return y

tspan = np.linspace(0, 2)
y0 = 1
ysol = odeint(fprime, y0, tspan)

from scipy.interpolate import interp1d

ip = interp1d(ysol[:,0], tspan) # reverse interpolation
print('y = 3 at x = {0}'.format(ip(3)))
y = 3 at x = 1.098547805640928
import numpy as np
from scipy.integrate import odeint

def dfdt(F, t):
    rho, theta, z = F
    drhodt = 0   # constant radius
    dthetadt = 1 # constant angular velocity
    dzdt = -1    # constant dropping velocity
    return [drhodt, dthetadt, dzdt]

# initial conditions
rho0 = 1
theta0 = 0
z0 = 100

tspan = np.linspace(0, 50, 500)
sol = odeint(dfdt, [rho0, theta0, z0], tspan)

rho = sol[:,0]
theta = sol[:,1]
z = sol[:,2]

# convert cylindrical coords to cartesian for plotting.
X = rho * np.cos(theta)
Y = rho * np.sin(theta)

from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
fig = plt.figure()
# add_subplot replaces fig.gca(projection='3d'), which newer Matplotlib no longer accepts
ax = fig.add_subplot(111, projection='3d')
ax.plot(X, Y, z)
plt.savefig('images/ode-cylindrical.png')
10.1.3 ODEs with discontinuous forcing functions
Matlab post
Adapted from http://archives.math.utk.edu/ICTCM/VOL18/S046/paper.pdf
A mixing tank initially contains 300 g of salt mixed into 1000 L of water.
At t=0 min, a solution of 4 g/L salt enters the tank at 6 L/min. At t=10
min, the solution is changed to 2 g/L salt, still entering at 6 L/min. The
tank is well stirred, and the tank solution leaves at a rate of 6 L/min. Plot
the concentration of salt (g/L) in the tank as a function of time.
A mass balance on the salt in the tank leads to this differential equation:
$\frac{dM_S}{dt} = \nu C_{S,in}(t) - \nu M_S/V$ with the initial condition
that $M_S(t = 0) = 300$, where
$$C_{S,in}(t) = \begin{cases} 0 & t \le 0, \\ 4 & 0 < t \le 10, \\ 2 & t > 10. \end{cases}$$
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

V = 1000.0 # L
nu = 6.0   # L/min

def Cs_in(t):
    'inlet concentration'
    if t <= 0:  # matches the piecewise definition, which is 0 for t <= 0
        Cs = 0.0 # g/L
    elif (t > 0) and (t <= 10):
        Cs = 4.0
    else:
        Cs = 2.0
    return Cs

def mass_balance(Ms, t):
    r'$\frac{dM_S}{dt} = \nu C_{S,in}(t) - \nu M_S/V$'
    dMsdt = nu * Cs_in(t) - nu * Ms / V
    return dMsdt

tspan = np.linspace(0.0, 15.0, 50)

M0 = 300.0 # gm salt
Ms = odeint(mass_balance, M0, tspan)

plt.plot(tspan, Ms/V, 'b.-')
plt.xlabel('Time (min)')
plt.ylabel('Salt concentration (g/L)')
plt.savefig('images/ode-discont.png')
You can see the change in slope of the salt concentration at 10 minutes due
to the discontinuous change in the entering salt concentration.
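A common way to make sure the solver does not step across the switch is to integrate each constant-inlet segment separately and restart at t = 10 min. The sketch below does this and compares against the analytical solution for a constant inlet concentration, M(t) = Cs V + (M_start - Cs V) exp(-nu (t - t_start) / V). Passing the inlet concentration through `args` and the helper name `analytic` are choices made for this illustration, not part of the example above.

```python
import numpy as np
from scipy.integrate import odeint

V = 1000.0   # L
nu = 6.0     # L/min
M0 = 300.0   # g salt

def mass_balance(Ms, t, Cs):
    # Cs is the (constant) inlet concentration for the current segment
    return nu * Cs - nu * Ms / V

def analytic(t, M_start, t_start, Cs):
    # closed-form solution of the mass balance for constant Cs
    return Cs * V + (M_start - Cs * V) * np.exp(-nu * (t - t_start) / V)

t1 = np.linspace(0.0, 10.0, 25)
M1 = odeint(mass_balance, M0, t1, args=(4.0,))[:, 0]

# restart from the end of the first segment with the new inlet concentration
t2 = np.linspace(10.0, 15.0, 25)
M2 = odeint(mass_balance, M1[-1], t2, args=(2.0,))[:, 0]

print(M1[-1] / V, M2[-1] / V)  # concentrations (g/L) at t = 10 and t = 15
```

Restarting at the switch time means each call sees a smooth right-hand side, so the default tolerances apply to a problem the solver handles well.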
10.1.4 Simulating the events feature of Matlab's ode solvers
The ode solvers in Matlab allow you to create functions that define events that
can stop the integration, detect roots, etc. We will explore how to get a
similar effect in python. Here is an example that somewhat does this, but it
is only an approximation. We will manually integrate the ODE, adjusting
the time step in each iteration to zero in on the solution. When the desired
accuracy is reached, we stop the integration.
It does not appear that events are supported in scipy. A solution is at
http://mail.scipy.org/pipermail/scipy-dev/2005-July/003078.html,
but it does not appear integrated into scipy yet (8 years later ;).
import numpy as np
from scipy.integrate import odeint

def dCadt(Ca, t):
    "the ode function"
    k = 0.23
    return -k * Ca**2

Ca0 = 2.3

# create lists to store time span and solution
tspan = [0, ]
sol = [Ca0,]
i = 0

while i < 100: # take max of 100 steps
    t1 = tspan[i]
    Ca = sol[i]

    # pick the next time using a Newton-Raphson method
    # we want f(t, Ca) = (Ca(t) - 1)**2 = 0
    # df/dt = df/dCa dCa/dt
    #       = 2*(Ca - 1) * dCadt
    t2 = t1 - (Ca - 1.0)**2 / (2 * (Ca - 1) *dCadt(Ca, t1))

    f = odeint(dCadt, Ca, [t1, t2])

    if np.abs(Ca - 1.0) <= 1e-4:
        print('Solution reached at i = {0}'.format(i))
        break

    tspan += [t2]
    sol.append(f[-1][0])
    i += 1

print('At t={0:1.2f} Ca = {1:1.3f}'.format(tspan[-1], sol[-1]))

import matplotlib.pyplot as plt
plt.plot(tspan, sol, 'bo')
plt.savefig('images/event-i.png')
Solution reached at i = 15
At t=2.46 Ca = 1.000
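SciPy releases newer than the one this text was written against provide `scipy.integrate.solve_ivp`, which accepts an `events` argument. The sketch below applies it to the same reaction as a point of comparison; it is not the approach developed in this section. Note that `solve_ivp` uses the argument order `f(t, y)`, the opposite of `odeint`. The event function name `reached_target` and the tolerances are choices made for this illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

k = 0.23
Ca0 = 2.3

def dCadt(t, Ca):
    # note the (t, y) argument order used by solve_ivp
    return -k * Ca**2

def reached_target(t, Ca):
    # the event is located where this function crosses zero
    return Ca[0] - 1.0

reached_target.terminal = True   # stop the integration at the event
reached_target.direction = -1    # only trigger when Ca is decreasing

sol = solve_ivp(dCadt, (0.0, 10.0), [Ca0], events=reached_target,
                rtol=1e-8, atol=1e-10)
t_event = sol.t_events[0][0]
print('Ca = 1 at t = {0:1.4f}'.format(t_event))
```

For this rate law the answer can be checked by hand: 1/Ca = 1/Ca0 + k t, so t = (1 - 1/2.3) / 0.23, about 2.457.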
10.1.5 Mimicking ode events in python
The ODE functions in scipy.integrate do not directly support events like the
functions in Matlab do. We can achieve something like it though, by digging
into the guts of the solver, and writing a little code. In a previous example I
used an event to count the number of roots in a function by integrating the
derivative of the function.
import numpy as np
from scipy.integrate import odeint

def myode(f, x):
    return 3*x**2 + 12*x -4

def event(f, x):
    'an event is when f = 0'
    return f

# initial conditions
x0 = -8
f0 = -120

# final x-range and step to integrate over.
xf = 4        # final x value
deltax = 0.45 # xstep

# lists to store the results in
X = [x0]
sol = [f0]
e = [event(f0, x0)]
events = []
x2 = x0
# manually integrate at each time step, and check for event sign changes at each step
while x2 <= xf: # stop integrating when we get to xf
    x1 = X[-1]
    x2 = x1 + deltax
    f1 = sol[-1]

    f2 = odeint(myode, f1, [x1, x2]) # integrate from x1,f1 to x2,f2
    X += [x2]
    sol += [f2[-1][0]]

    # now evaluate the event at the last position
    e += [event(sol[-1], X[-1])]

    if e[-1] * e[-2] < 0:
        # Event detected where the sign of the event has changed. The
        # event is between xPt = X[-2] and xLt = X[-1]. run a modified bisect
        # function to narrow down to find where event = 0
        xLt = X[-1]
        fLt = sol[-1]
        eLt = e[-1]

        xPt = X[-2]
        fPt = sol[-2]
        ePt = e[-2]

        j = 0
        while j < 100:
            if np.abs(xLt - xPt) < 1e-6:
                # we know the interval to a prescribed precision now.
                print('x = {0}, event = {1}, f = {2}'.format(xLt, eLt, fLt))
                events += [(xLt, fLt)]
                break # and return to integrating

            # slope of line connecting points bracketing zero
            m = (ePt - eLt)/(xPt - xLt)

            # estimated x where the zero is
            new_x = -ePt / m + xPt

            # now get the new value of the integrated solution at that new x
            f = odeint(myode, fPt, [xPt, new_x])
            new_f = f[-1][-1]
            new_e = event(new_f, new_x)

            # now check event sign change
            if eLt * new_e > 0:
                xPt = new_x
                fPt = new_f
                ePt = new_e
            else:
                xLt = new_x
                fLt = new_f
                eLt = new_e

            j += 1


import matplotlib.pyplot as plt
plt.plot(X, sol)

# add event points to the graph
for x,e in events:
    plt.plot(x, e, 'bo')
plt.savefig('images/event-ode-1.png')
That was a lot of programming to do something like find the roots of the
function! Below is an example of using a function coded into pycse to solve
the same problem. It is a bit more sophisticated because you can define
whether an event is terminal, and the direction of the approach to zero for
each event.

import matplotlib.pyplot as plt
plt.plot(X, F, '.-')

# plot the event locations. use a different color for each event
colors = 'rg'

for x,y,i in zip(TE, YE, IE):
    plt.plot([x], [y], 'o', color=colors[i])

plt.savefig('images/event-ode-2.png')
print(TE, YE, IE)
Given that the concentration of a species A in a constant volume, batch
reactor obeys this differential equation $\frac{dC_A}{dt} = -k C_A^2$ with the initial con-
You can see the solution is near two seconds. Now we create an inter-
polating function to evaluate the solution. We will plot the interpolating
function on a finer grid to make sure it seems reasonable.
That looks pretty reasonable. Now we solve the problem.
tguess = 2.0
tsol, = fsolve(lambda t: 1.0 - ca_func(t), tguess)
print(tsol)

# you might prefer an explicit function
def func(t):
    return 1.0 - ca_func(t)

tsol2, = fsolve(func, tguess)
print(tsol2)
2.4574668235
2.4574668235
That is it. Interpolation can provide a simple way to evaluate the nu-
merical solution of an ODE at other values.
For completeness we examine a final way to construct the function. We
can actually integrate the ODE in the function to evaluate the solution at
the point of interest. If it is not computationally expensive to evaluate
the ODE solution this works fine. Note, however, that the ODE will get
integrated from 0 to the value t for each iteration of fsolve.
import numpy as np

def func(t):
    # fsolve passes t as a 1-element array, so pull out the scalar
    tspan = [0, np.ravel(t)[0]]
    sol = odeint(dCadt, Ca0, tspan)
    return 1.0 - sol[-1, 0]

tsol3, = fsolve(func, tguess)
print(tsol3)
2.45746688202
$$\frac{dy}{dt} = y(t)$$
The initial condition is y(0) = 1.
Given that the concentration of a species A in a constant volume, batch
reactor obeys this differential equation $\frac{dC_A}{dt} = -k C_A^2$ with the initial con-
    direction = 1
    isterminal = False
    return value, isterminal, direction

def maxima(y, x):
    '''Approaching a maximum, dydx is positive and going to zero. our event function is decreasing'''
    value = ode(y, x)
    direction = -1
    isterminal = False
    return value, isterminal, direction

xspan = np.linspace(0, 20, 100)

y0 = 0

X, Y, XE, YE, IE = odelay(ode, y0, xspan, events=[minima, maxima])
print(IE)
import matplotlib.pyplot as plt
plt.plot(X, Y)

# blue is maximum, red is minimum
colors = 'rb'
for xe, ye, ie in zip(XE, YE, IE):
    plt.plot([xe], [ye], 'o', color=colors[ie])

plt.savefig('./images/ode-events-min-max.png')
[0 1 0 1 0 1 0]
10.1.10 Error tolerance in numerical solutions to ODEs
Matlab post
Usually, the numerical ODE solvers in python work well with
the standard settings. Sometimes they do not, and it is not always obvious
they have not worked! Part of using a tool like python is checking how well
your solution really worked. We use an example of integrating an ODE that
defines the van der Waals equation of state for a gas here.
We plot the analytical solution to the van der Waals equation in reduced
form here.
import numpy as np
import matplotlib.pyplot as plt

Tr = 0.9
Vr = np.linspace(0.34,4,1000)

# analytical equation for Pr
Prfh = lambda Vr: 8.0 / 3.0 * Tr / (Vr - 1.0 / 3.0) - 3.0 / (Vr**2)
Pr = Prfh(Vr) # evaluated on our reduced volume vector.

# Plot the EOS
plt.clf()
plt.plot(Vr,Pr)
plt.ylim([0, 2])
plt.xlabel('$V_R$')
plt.ylabel('$P_R$')
plt.savefig('images/ode-vw-1.png')
We want an equation for dPdV, which we will integrate. We use symbolic
math to do the derivative for us.
Now, we solve the ODE. We will specify a large relative tolerance criterion
(note the default is much smaller than what we show here).
plt.plot(Vspan, P[:,0], 'r.')
plt.ylim([0, 2])
plt.xlabel('$V_R$')
plt.ylabel('$P_R$')
plt.savefig('images/ode-vw-2.png')
You can see there is disagreement between the analytical solution and
numerical solution. The origin of this problem is accuracy at the initial
condition, where the derivative is extremely large.
print(myode(Po, 0.34))
-53847.34378179728
We can tighten the tolerance criteria to get a better answer. The
defaults in odeint are actually set to 1.49012e-8.
Vspan = np.linspace(0.334, 4)
Po = Prfh(Vspan[0])
P = odeint(myode, Po, Vspan)

# Plot the EOS
plt.clf()
plt.plot(Vr,Pr) # analytical solution
plt.plot(Vspan, P[:,0], 'r.')
plt.ylim([0, 2])
plt.xlabel('$V_R$')
plt.ylabel('$P_R$')
plt.savefig('images/ode-vw-3.png')
The problem here was the derivative value varied by four orders of
magnitude over the integration range, so the default tolerances were insufficient
to accurately estimate the numerical derivatives over that range. Tightening
the tolerances helped resolve that problem. Another approach might be
to split the integration up into different regions. For instance, if instead of
starting at Vr = 0.34, which is very close to a singularity in the van der Waals
equation at Vr = 1/3, you start at Vr = 0.5, the solution integrates just
fine with the standard tolerances.
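The effect of the tolerance settings can be checked directly, because the analytical pressure is known. The sketch below passes explicit `rtol` and `atol` values to `odeint` and measures the largest deviation from the analytical curve. The derivative function `dPdV` is written out by hand from the reduced equation above as a stand-in for the symbolically derived function, and the tolerance values 1e-10 are arbitrary choices for illustration.

```python
import numpy as np
from scipy.integrate import odeint

Tr = 0.9

def Pr_exact(Vr):
    # reduced van der Waals equation
    return 8.0 / 3.0 * Tr / (Vr - 1.0 / 3.0) - 3.0 / Vr**2

def dPdV(P, Vr):
    # derivative of Pr_exact with respect to Vr, written out by hand
    return -8.0 / 3.0 * Tr / (Vr - 1.0 / 3.0)**2 + 6.0 / Vr**3

Vspan = np.linspace(0.34, 4.0, 200)
P0 = Pr_exact(Vspan[0])

# explicit, tight tolerances instead of the odeint defaults
P_tight = odeint(dPdV, P0, Vspan, rtol=1e-10, atol=1e-10)[:, 0]

max_err = np.max(np.abs(P_tight - Pr_exact(Vspan)))
print('derivative at Vr = 0.34: {0:1.2f}'.format(dPdV(P0, 0.34)))
print('largest deviation from the analytical solution: {0:1.2e}'.format(max_err))
```

Comparing `max_err` for a few different tolerance settings is a simple way to decide how tight the tolerances need to be for a given problem.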
$$\frac{dCa}{dt} = -k\,Ca(t)$$
where k is a parameter, and we want to solve the equation for a couple
of values of k to test the sensitivity of the solution on the parameter. Our
question is, given Ca(t = 0) = 2, how long does it take to get Ca = 1, and
how sensitive is the answer to small variations in k?
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

def myode(Ca, t, k):
    'ODE definition'
    dCadt = -k * Ca
    return dCadt

tspan = np.linspace(0, 0.5)
k0 = 2
Ca0 = 2

plt.figure(); plt.clf()

for k in [0.95 * k0, k0, 1.05 * k0]:
    sol = odeint(myode, Ca0, tspan, args=(k,))
    plt.plot(tspan, sol, label='k={0:1.2f}'.format(k))
    print('At t=0.5 Ca = {0:1.2f} mol/L'.format(sol[-1][0]))

plt.legend(loc='best')
plt.xlabel('Time')
plt.ylabel('$C_A$ (mol/L)')
plt.savefig('images/parameterized-ode1.png')
You can see there are some variations in the concentration at t = 0.5.
You could over or underestimate the concentration if you have the wrong
estimate of $k$! You have to use some judgement here to decide how long
to run the reaction to ensure a target goal is met.
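For this first-order rate law the question of how long it takes to reach Ca = 1 can also be answered in closed form, which gives a useful check on the numerical results. Since Ca(t) = Ca0 exp(-k t), the time is t = ln(Ca0 / Ca_target) / k. The variable names in this sketch are illustrative.

```python
import numpy as np

Ca0, Ca_target = 2.0, 1.0
k0 = 2.0
k_values = [0.95 * k0, k0, 1.05 * k0]

# Ca(t) = Ca0 * exp(-k t)  =>  t = ln(Ca0 / Ca_target) / k
t_target = [np.log(Ca0 / Ca_target) / k for k in k_values]

for k, t in zip(k_values, t_target):
    print('k = {0:1.2f}: Ca = 1 at t = {1:1.4f}'.format(k, t))
```

Because t is inversely proportional to k, a 5% error in k shifts the required time by roughly 5% in the opposite direction.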
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

def ode(F, t):
    Ca, k = F
    dCadt = -k * Ca
    dkdt = 0.0
    return [dCadt, dkdt]

tspan = np.linspace(0, 4)

Ca0 = 1
K = [2.0, 3.0]
for k in K:
    F = odeint(ode, [Ca0, k], tspan)
    Ca = F[:,0]
    plt.plot(tspan, Ca, label='k={0}'.format(k))
plt.xlabel('time')
plt.ylabel('$C_A$')
plt.legend(loc='best')
plt.savefig('images/ode-parameterized-1.png')
Here we define the ODE function in a loop. Since the nested function is
in the namespace of the main function, it can "see" the values of the variables
in the main function. We will use this method to look at the solution to the
van der Pol equation for several different values of mu.
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

MU = [0.1, 1, 2, 5]
tspan = np.linspace(0, 100, 5000)
Y0 = [0, 3]

for mu in MU:
    # define the ODE
    def vdpol(Y, t):
        x,y = Y
        dxdt = y
        dydt = -x + mu * (1 - x**2) * y
        return [dxdt, dydt]

    Y = odeint(vdpol, Y0, tspan)

    x = Y[:,0]; y = Y[:,1]
    plt.plot(x, y, label='mu={0:1.2f}'.format(mu))

plt.axis('equal')
plt.legend(loc='best')
plt.savefig('images/ode-nested-parameterization.png')
plt.savefig('images/ode-nested-parameterization.svg')
You can see the solution changes dramatically for different values of mu.
The point here is not to understand why, but to show an easy way to study
a parameterized ODE with a nested function. Nested functions can be a great
way to "share" variables between functions, especially for ODE solving,
nonlinear algebra solving, or any other application where you need a lot of
parameters defined in one function in another function.
$$\frac{d^2x}{dt^2} - \mu(1 - x^2)\frac{dx}{dt} + x = 0$$
where $\mu$ is a constant. If we let $y = x - x^3/3 - \frac{1}{\mu}\frac{dx}{dt}$
(http://en.wikipedia.org/wiki/Van_der_Pol_oscillator), then we arrive at this
set of equations:
$$\frac{dx}{dt} = \mu(x - 1/3 x^3 - y)$$
$$\frac{dy}{dt} = x/\mu$$
Here is how we solve this set of equations. Let $\mu = 1$.
Here is the phase portrait. You can see that a limit cycle is approached,
indicating periodicity in the solution.
10.1.15 Solving Bessel's Equation numerically
Matlab post
Reference Ch 5.5 Kreyszig, Advanced Engineering Mathematics, 9th ed.
Bessel's equation $x^2 y'' + x y' + (x^2 - \nu^2) y = 0$ comes up often in engineering
problems such as heat transfer. The solutions to this equation are the Bessel
functions. To solve this equation numerically, we must convert it to a system
of first order ODEs. This can be done by letting $z = y'$ and $z' = y''$ and
performing the change of variables:
$$y' = z$$
$$z' = \frac{1}{x^2}\left(-x z - (x^2 - \nu^2) y\right)$$
If we take the case where $\nu = 0$, the solution is known to be the Bessel
function $J_0(x)$, which is represented in Matlab as besselj(0,x). The initial
conditions for this problem are: $y(0) = 1$ and $y'(0) = 0$.
There is a problem with our system of ODEs at x=0. Because of the
$1/x^2$ term, the ODEs are not defined at x=0. If we start very close to zero
instead, we avoid the problem.
import numpy as np
from scipy.integrate import odeint
from scipy.special import jn # bessel function
import matplotlib.pyplot as plt

def fbessel(Y, x):
    nu = 0.0
    y = Y[0]
    z = Y[1]

    dydx = z
    dzdx = 1.0 / x**2 * (-x * z - (x**2 - nu**2) * y)
    return [dydx, dzdx]

x0 = 1e-15
y0 = 1
z0 = 0
Y0 = [y0, z0]

xspan = np.linspace(1e-15, 10)
sol = odeint(fbessel, Y0, xspan)

plt.plot(xspan, sol[:,0], label='numerical soln')
plt.plot(xspan, jn(0, xspan), 'r--', label='Bessel')
plt.legend()
plt.savefig('images/bessel.png')
You can see the numerical and analytical solutions overlap, indicating
they are at least visually the same.
$$y'' + \sin(y) = 0$$
We reduce this to standard matlab form of a system of first order ODEs
by letting $y_1 = y$ and $y_2 = y_1'$. This leads to:
$$y_1' = y_2$$
$$y_2' = -\sin(y_1)$$
The phase portrait is a plot of a vector field which qualitatively shows
how the solutions to these equations will go from a given starting point.
Here is our definition of the differential equations:
To generate the phase portrait, we need to compute the derivatives $y_1'$
and $y_2'$ at t = 0 on a grid over the range of values for y1 and y2 we are
interested in. We will plot the derivatives as a vector at each (y1, y2) which
will show us the initial direction from each point. We will examine the
solutions over the range -2 < y1 < 8, and -2 < y2 < 2, and create a
grid of 20 x 20 points.
import numpy as np
import matplotlib.pyplot as plt

def f(Y, t):
    y1, y2 = Y
    return [y2, -np.sin(y1)]

y1 = np.linspace(-2.0, 8.0, 20)
y2 = np.linspace(-2.0, 2.0, 20)

Y1, Y2 = np.meshgrid(y1, y2)

t = 0

u, v = np.zeros(Y1.shape), np.zeros(Y2.shape)

NI, NJ = Y1.shape

for i in range(NI):
    for j in range(NJ):
        x = Y1[i, j]
        y = Y2[i, j]
        yprime = f([x, y], t)
        u[i,j] = yprime[0]
        v[i,j] = yprime[1]


Q = plt.quiver(Y1, Y2, u, v, color='r')

plt.xlabel('$y_1$')
plt.ylabel('$y_2$')
plt.xlim([-2, 8])
plt.ylim([-4, 4])
plt.savefig('images/phase-portrait.png')
Let us plot a few solutions on the vector field. We will consider the
solutions where y1(0)=0, and values of y2(0) = [0 0.5 1 1.5 2 2.5], in other words
we start the pendulum at an angle of zero, with some angular velocity.
What do these figures mean? For starting points near the origin, and
small velocities, the pendulum goes into a stable limit cycle. For others, the
trajectory appears to fly off into y1 space. Recall that y1 is an angle that
has values from $-\pi$ to $\pi$. The y1 data in this case is not wrapped around
to be in this range.
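If wrapped angles are wanted, for example before plotting, the unwrapped y1 values can be mapped back into the interval from -pi to pi with modular arithmetic. This is a small sketch; the helper name `wrap_angle` is hypothetical and not part of the example above.

```python
import numpy as np

def wrap_angle(y1):
    'map angles (in radians) into the interval [-pi, pi)'
    return (np.asarray(y1) + np.pi) % (2.0 * np.pi) - np.pi

# a few unwrapped angles and their wrapped equivalents
angles = np.array([0.0, 1.5 * np.pi, 2.0 * np.pi, 7.0])
print(wrap_angle(angles))
```

Wrapping introduces jumps wherever a trajectory crosses the ends of the interval, so a line plot of wrapped data will show vertical segments unless the trajectory is split at those points.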
10.1.17 Linear algebra approaches to solving systems of constant
coefficient ODEs
Matlab post
Today we consider how to solve a system of first order, constant
coefficient ordinary differential equations using linear algebra. These
equations could be solved numerically, but in this case there are analytical
solutions that can be derived. The equations we will solve are:
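$$y_1' = -0.02 y_1 + 0.02 y_2$$
$$y_2' = 0.02 y_1 - 0.02 y_2$$
We can express this set of equations in matrix form as:
$$\begin{bmatrix} y_1' \\ y_2' \end{bmatrix} = \begin{bmatrix} -0.02 & 0.02 \\ 0.02 & -0.02 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$$
The general solution to this set of equations is
$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} v_1 & v_2 \end{bmatrix} \begin{bmatrix} c_1 & 0 \\ 0 & c_2 \end{bmatrix} \exp\left( \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} t \\ t \end{bmatrix} \right)$$
where $\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$ is a diagonal matrix of the eigenvalues of the constant
coefficient matrix, $\begin{bmatrix} v_1 & v_2 \end{bmatrix}$ is a matrix of eigenvectors where the $i^{th}$
column corresponds to the eigenvector of the $i^{th}$ eigenvalue, and
$\begin{bmatrix} c_1 & 0 \\ 0 & c_2 \end{bmatrix}$ is a matrix determined by the initial conditions.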
In this example, we evaluate the solution using linear algebra. The initial
conditions we will consider are y1 (0) = 0 and y2 (0) = 150.
import numpy as np

A = np.array([[-0.02, 0.02],
              [ 0.02, -0.02]])

# Return the eigenvalues and eigenvectors of a Hermitian or symmetric matrix.
evals, evecs = np.linalg.eigh(A)
print(evals)
print(evecs)
[-0.04 0. ]
[[ 0.70710678 0.70710678]
[-0.70710678 0.70710678]]
The eigenvectors are the columns of evecs.
Compute the c matrix from the initial conditions, V*c = Y0:

Y0 = [0, 150]

c = np.diag(np.linalg.solve(evecs, Y0))
print(c)
[[-106.06601718 0. ]
[ 0. 106.06601718]]
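With the eigenvalues, eigenvectors and the c matrix in hand, the general solution can be evaluated at any time. The sketch below does this and compares against the matrix exponential from `scipy.linalg.expm`, which is used here only as an independent check. The helper name `Y` and the comparison time are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-0.02, 0.02],
              [ 0.02, -0.02]])
Y0 = np.array([0.0, 150.0])

evals, evecs = np.linalg.eigh(A)
c = np.diag(np.linalg.solve(evecs, Y0))

def Y(t):
    'evaluate [y1, y2] at a scalar time t from the eigen-decomposition'
    return evecs @ c @ np.exp(evals * t)

t_check = 50.0
print(Y(0.0))                   # should reproduce the initial conditions
print(Y(t_check))               # solution from the eigen-decomposition
print(expm(A * t_check) @ Y0)   # independent check with the matrix exponential
```

Since one eigenvalue is zero and the other is negative, the solution relaxes toward the eigenvector of the zero eigenvalue, which here means both components approach 75.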
10.2 Delay Differential Equations
In Matlab you can solve Delay Differential equations (DDE) (Matlab post).
I do not know of a solver in scipy at this time that can do this.
derivatives are so that we can do an integration. If our guess was good, then
the solution will go through the known second boundary point. If not, we
guess again, until we get the answer we need. In this example we repeat the
pressure driven flow example, but illustrate the shooting method.
In the pressure driven flow of a fluid with viscosity $\mu$ between two
stationary plates separated by distance $d$ and driven by a pressure drop $\Delta P/\Delta x$,
the governing equations on the velocity $u$ of the fluid are (assuming flow in
the x-direction with the velocity varying only in the y-direction):
$$\frac{\Delta P}{\Delta x} = \mu \frac{d^2u}{dy^2}$$
with boundary conditions u(y = 0) = 0 and u(y = d) = 0, i.e. the
no-slip condition at the edges of the plate.
We convert this second order BVP to a system of ODEs by letting $u_1 = u$,
$u_2 = u_1'$ and then $u_2' = u_1''$. This leads to:
$$\frac{du_1}{dy} = u_2$$
$$\frac{du_2}{dy} = \frac{1}{\mu}\frac{\Delta P}{\Delta x}$$
First guess
We need u_1(0) and u_2(0), but we only have u_1(0). We
need to guess a value for u_2(0) and see if the solution goes through the
u_1(d)=0 boundary value.
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

d = 0.1 # plate thickness

def odefun(U, y):
    u1, u2 = U
    mu = 1
    Pdrop = -100
    du1dy = u2
    du2dy = 1.0 / mu * Pdrop
    return [du1dy, du2dy]

u1_0 = 0 # known
u2_0 = 1 # guessed

dspan = np.linspace(0, d)

U = odeint(odefun, [u1_0, u2_0], dspan)

plt.plot(dspan, U[:,0])
plt.plot([d],[0], 'ro')
plt.xlabel('d')
plt.ylabel('$u_1$')
plt.savefig('images/bvp-shooting-1.png')
Second guess
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

d = 0.1 # plate thickness

def odefun(U, y):
    u1, u2 = U
    mu = 1
    Pdrop = -100
    du1dy = u2
    du2dy = 1.0 / mu * Pdrop
    return [du1dy, du2dy]

u1_0 = 0 # known
u2_0 = 10 # guessed

dspan = np.linspace(0, d)

U = odeint(odefun, [u1_0, u2_0], dspan)

plt.plot(dspan, U[:,0])
plt.plot([d],[0], 'ro')
plt.xlabel('d')
plt.ylabel('$u_1$')
plt.savefig('images/bvp-shooting-2.png')
Now we have clearly overshot. Let us now make a function that will
iterate for us to find the right value.
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import fsolve
import matplotlib.pyplot as plt

d = 0.1 # plate thickness
Pdrop = -100
mu = 1

def odefun(U, y):
    u1, u2 = U
    du1dy = u2
    du2dy = 1.0 / mu * Pdrop
    return [du1dy, du2dy]

u1_0 = 0 # known
dspan = np.linspace(0, d)

def objective(u2_0):
    dspan = np.linspace(0, d)
    U = odeint(odefun, [u1_0, u2_0], dspan)
    u1 = U[:,0]
    return u1[-1]

u2_0, = fsolve(objective, 1.0)

# now solve with optimal u2_0
U = odeint(odefun, [u1_0, u2_0], dspan)

plt.plot(dspan, U[:,0], label='Numerical solution')
plt.plot([d],[0], 'ro')

# plot an analytical solution
u = -(Pdrop) * d**2 / 2 / mu * (dspan / d - (dspan / d)**2)
plt.plot(dspan, u, 'r--', label='Analytical solution')


plt.xlabel('d')
plt.ylabel('$u_1$')
plt.legend(loc='best')
plt.savefig('images/bvp-shooting-3.png')
You can see the agreement is excellent!
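As an independent check on the shooting result, more recent SciPy releases include a dedicated boundary value solver, `scipy.integrate.solve_bvp`. The sketch below applies it to the same system. It is not the method developed in this section; note that `solve_bvp` calls the ODE function as `fun(x, y)` with array-valued arguments, so the function is written differently from the `odeint` version. The mesh size and the zero initial guess are arbitrary choices.

```python
import numpy as np
from scipy.integrate import solve_bvp

d = 0.1       # plate separation
Pdrop = -100.0
mu = 1.0

def odefun(y, U):
    # U has shape (2, n): row 0 is u1, row 1 is u2, evaluated on the mesh y
    u1, u2 = U
    return np.vstack([u2, np.full_like(u1, Pdrop / mu)])

def bc(Ua, Ub):
    # residuals of the boundary conditions u1(0) = 0 and u1(d) = 0
    return np.array([Ua[0], Ub[0]])

y = np.linspace(0.0, d, 11)
U_guess = np.zeros((2, y.size))

sol = solve_bvp(odefun, bc, y, U_guess)

u_numeric = sol.sol(y)[0]
u_exact = -Pdrop * d**2 / 2.0 / mu * (y / d - (y / d)**2)
print(np.max(np.abs(u_numeric - u_exact)))
```

No guess for the initial slope is needed here, only a guess for the solution on the mesh, which for a linear problem like this one can simply be zero.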
The shooting procedure also seems like a useful bit of code to not have to
reinvent regularly, so it has been added to pycse as BVP_sh. Here is an example usage.
10.4.2 Plane Poiseuille flow solved by finite difference
Matlab post
Adapted from http://www.physics.arizona.edu/~restrepo/475B/Notes/sourcehtml/node24.html
We want to solve a linear boundary value problem of the form:
$y'' = p(x)y' + q(x)y + r(x)$ with boundary conditions y(x1) = alpha and y(x2)
= beta.
For this example, we solve the plane Poiseuille flow problem using a finite
difference approach. An advantage of the approach we use here is we do not
have to rewrite the second order ODE as a set of coupled first order ODEs,
nor do we have to provide guesses for the solution. We do, however, have to
discretize the derivatives and formulate a linear algebra problem.
We want to solve $u'' = \frac{1}{\mu}\frac{\Delta P}{\Delta x}$ with u(0)=0 and u(0.1)=0. For
this problem we let the plate separation be d=0.1, the viscosity $\mu = 1$, and
$\Delta P/\Delta x = -100$.
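$$A = \begin{bmatrix}
2 + h^2 q_1 & -1 + \frac{h}{2} p_1 & 0 & 0 & 0 \\
-1 - \frac{h}{2} p_2 & 2 + h^2 q_2 & -1 + \frac{h}{2} p_2 & 0 & 0 \\
0 & \ddots & \ddots & \ddots & 0 \\
0 & 0 & -1 - \frac{h}{2} p_{N-1} & 2 + h^2 q_{N-1} & -1 + \frac{h}{2} p_{N-1} \\
0 & 0 & 0 & -1 - \frac{h}{2} p_N & 2 + h^2 q_N
\end{bmatrix}$$
$$y = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix}$$
$$b = \begin{bmatrix}
-h^2 r_1 + (1 + \frac{h}{2} p_1)\alpha \\
-h^2 r_2 \\
\vdots \\
-h^2 r_{N-1} \\
-h^2 r_N + (1 - \frac{h}{2} p_N)\beta
\end{bmatrix}$$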
import numpy as np

# we use the notation for y'' = p(x)y' + q(x)y + r(x)
def p(x):
    return 0

def q(x):
    return 0

def r(x):
    return -100

# we use the notation y(x1) = alpha and y(x2) = beta

x1 = 0; alpha = 0.0
x2 = 0.1; beta = 0.0

npoints = 100

# compute interval width
h = (x2-x1)/npoints

# preallocate and shape the b vector and A-matrix
b = np.zeros((npoints - 1, 1))
A = np.zeros((npoints - 1, npoints - 1))
X = np.zeros((npoints - 1, 1))

# now we populate the A-matrix and b vector elements
for i in range(npoints - 1):
    X[i,0] = x1 + (i + 1) * h

    # get the value of the BVP Odes at this x
    pi = p(X[i])
    qi = q(X[i])
    ri = r(X[i])

    if i == 0:
        # first boundary condition
        b[i] = -h**2 * ri + (1 + h / 2 * pi)*alpha
    elif i == npoints - 2:
        # second boundary condition (the last row has index npoints - 2)
        b[i] = -h**2 * ri + (1 - h / 2 * pi)*beta
    else:
        b[i] = -h**2 * ri # intermediate points

    for j in range(npoints - 1):
        if j == i: # the diagonal
            A[i,j] = 2 + h**2 * qi
        elif j == i - 1: # left of the diagonal
            A[i,j] = -1 - h / 2 * pi
        elif j == i + 1: # right of the diagonal
            A[i,j] = -1 + h / 2 * pi
        else:
            A[i,j] = 0 # off the tri-diagonal

# solve the equations A*y = b for Y
Y = np.linalg.solve(A,b)

x = np.hstack([x1, X[:,0], x2])
y = np.hstack([alpha, Y[:,0], beta])

import matplotlib.pyplot as plt

plt.plot(x, y)

mu = 1
d = 0.1
x = np.linspace(0,0.1)
Pdrop = -100 # this is DeltaP/Deltax
u = -(Pdrop) * d**2 / 2.0 / mu * (x / d - (x / d)**2)
plt.plot(x,u,'r--')

plt.xlabel('distance between plates')
plt.ylabel('fluid velocity')
plt.legend(('finite difference', 'analytical soln'))
plt.savefig('images/pp-bvp-fd.png')
You can see excellent agreement here between the numerical and analytical
solution.
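The double loop above fills the tridiagonal matrix one element at a time. As a rough sketch of the same discretization, the system can also be assembled with np.diag. The helper name solve_linear_bvp is hypothetical and is used only for illustration; it is not the pycse implementation.

```python
import numpy as np

def solve_linear_bvp(p, q, r, x1, x2, alpha, beta, npoints=100):
    """Solve y'' = p(x) y' + q(x) y + r(x), y(x1) = alpha, y(x2) = beta.

    Hypothetical helper for illustration (central differences, dense matrix).
    """
    h = (x2 - x1) / npoints
    X = x1 + h * np.arange(1, npoints)  # interior nodes
    P = np.array([p(x) for x in X], dtype=float)
    Q = np.array([q(x) for x in X], dtype=float)
    R = np.array([r(x) for x in X], dtype=float)

    A = (np.diag(2 + h**2 * Q)
         + np.diag(-1 + h / 2 * P[:-1], k=1)    # right of the diagonal
         + np.diag(-1 - h / 2 * P[1:], k=-1))   # left of the diagonal

    b = -h**2 * R
    b[0] += (1 + h / 2 * P[0]) * alpha    # first boundary condition
    b[-1] += (1 - h / 2 * P[-1]) * beta   # second boundary condition

    Y = np.linalg.solve(A, b)
    return np.hstack([x1, X, x2]), np.hstack([alpha, Y, beta])

# plane Poiseuille flow: u'' = -100, u(0) = u(0.1) = 0
x, y = solve_linear_bvp(lambda x: 0, lambda x: 0, lambda x: -100,
                        0, 0.1, 0.0, 0.0)

mu, d, Pdrop = 1, 0.1, -100
u = -Pdrop * d**2 / 2.0 / mu * (x / d - (x / d)**2)  # analytical solution
max_error = np.max(np.abs(y - u))
print(max_error)
```

Because the analytical solution is a quadratic, the central difference approximation has no truncation error for it, so the remaining difference should be at the level of floating point roundoff.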
$\nabla^2 T = 0$
with boundary conditions $T(x = a) = T_A$ and $T(x = L) = T_B$.
The analytical solution is not difficult here: $T = T_A - \frac{T_A - T_B}{L} x$, but we
will solve this by finite differences.
For this problem, let us consider a slab that is defined by x=0 to x=L,
with T(x = 0) = 100, and T(x = L) = 200. We want to find the function
T(x) inside the slab.
We approximate the second derivative by finite differences as
$f''(x) \approx \frac{f(x - h) - 2 f(x) + f(x + h)}{h^2}$
Since the second derivative in this case is equal to zero, we have at each
discretized node $0 = T_{i-1} - 2 T_i + T_{i+1}$. We know the values of $T_{x=0} = \alpha$
and $T_{x=L} = \beta$.
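$$A = \begin{bmatrix}
-2 & 1 & 0 & 0 & 0 \\
1 & -2 & 1 & 0 & 0 \\
0 & \ddots & \ddots & \ddots & 0 \\
0 & 0 & 1 & -2 & 1 \\
0 & 0 & 0 & 1 & -2
\end{bmatrix}$$

$$x = \begin{bmatrix} T_1 \\ \vdots \\ T_N \end{bmatrix}$$

$$b = \begin{bmatrix} -T(x = 0) \\ 0 \\ \vdots \\ 0 \\ -T(x = L) \end{bmatrix}$$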
These are linear equations in the unknowns x that we can easily solve.
Here, we evaluate the solution.
import numpy as np

# we use the notation T(x1) = alpha and T(x2) = beta
x1 = 0; alpha = 100
x2 = 5; beta = 200

npoints = 100

# preallocate and shape the b vector and A-matrix
b = np.zeros((npoints, 1))
b[0] = -alpha
b[-1] = -beta

A = np.zeros((npoints, npoints))

# now we populate the A-matrix and b vector elements
for i in range(npoints):
    for j in range(npoints):
        if j == i:  # the diagonal
            A[i, j] = -2
        elif j == i - 1:  # left of the diagonal
            A[i, j] = 1
        elif j == i + 1:  # right of the diagonal
            A[i, j] = 1

# solve the equations A*y = b for Y
Y = np.linalg.solve(A, b)

x = np.linspace(x1, x2, npoints + 2)
y = np.hstack([alpha, Y[:, 0], beta])

import matplotlib.pyplot as plt

plt.plot(x, y)

plt.plot(x, alpha + (beta - alpha) / (x2 - x1) * x, 'r--')

plt.xlabel('X')
plt.ylabel('T(X)')
plt.legend(('finite difference', 'analytical soln'), loc='best')
plt.savefig('images/bvp-heat-conduction-1d.png')
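The matrix in this problem is tridiagonal, so storing all of its zeros is not necessary. As a sketch of one alternative (not the approach used in the text), scipy.linalg.solve_banded accepts only the three diagonals. The variable names follow the example above; the banded array ab is introduced here for illustration.

```python
import numpy as np
from scipy.linalg import solve_banded

x1, alpha = 0.0, 100.0
x2, beta = 5.0, 200.0
npoints = 100  # number of interior nodes

# banded storage: row 0 = superdiagonal, row 1 = diagonal, row 2 = subdiagonal
ab = np.zeros((3, npoints))
ab[0, 1:] = 1.0
ab[1, :] = -2.0
ab[2, :-1] = 1.0

b = np.zeros(npoints)
b[0] = -alpha
b[-1] = -beta

# (1, 1) means one subdiagonal and one superdiagonal
T_interior = solve_banded((1, 1), ab, b)

x = np.linspace(x1, x2, npoints + 2)
T = np.hstack([alpha, T_interior, beta])
T_exact = alpha + (beta - alpha) / (x2 - x1) * (x - x1)
print(np.max(np.abs(T - T_exact)))
```

For a problem this small the dense solver is fine; the banded form mainly matters when the number of nodes becomes large.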
def p(x): return 0
def q(x): return 0
def r(x): return -100

# we use the notation y(x1) = alpha and y(x2) = beta

x1 = 0; alpha = 0.0
x2 = 0.1; beta = 0.0

npoints = 100

x, y = bvp_L0(p, q, r, x1, x2, alpha, beta, npoints=100)
print(len(x))

import matplotlib.pyplot as plt
plt.plot(x, y)
plt.savefig('images/bvp-pycse.png')
100

# we use the notation y(x1) = alpha and y(x2) = beta

x1 = 0; alpha = 100
x2 = 1; beta = 200

npoints = 100

x, y = bvp_L0(p, q, r, x1, x2, alpha, beta, npoints=100)
print(len(x))

import matplotlib.pyplot as plt
plt.plot(x, y)
plt.xlabel('X')
plt.ylabel('T')
plt.savefig('images/ht-example.png')
100
$y''(x) \approx \frac{y_{i-1} - 2 y_i + y_{i+1}}{h^2}$
$y'(x) \approx \frac{y_{i+1} - y_{i-1}}{2 h}$
We define a function $y''(x) = F(x, y, y')$. At each node in our discretized
region, we will have an equation that looks like $y''(x) - F(x, y, y') = 0$, which
will be nonlinear in the unknown solution y. The set of equations to solve
is:
\begin{align}
y_0 - \alpha &= 0 \tag{22} \\
\frac{y_{i-1} - 2 y_i + y_{i+1}}{h^2} + (3 y_i) \left(\frac{y_{i+1} - y_{i-1}}{2 h}\right) &= 0 \tag{23} \\
y_L - \beta &= 0 \tag{24}
\end{align}
import numpy as np
from scipy.optimize import fsolve
import matplotlib.pyplot as plt

x1 = 0.0
x2 = 2.0

alpha = 0.0
beta = 1.0

N = 11
X = np.linspace(x1, x2, N)
h = (x2 - x1) / (N - 1)

def Ypp(x, y, yprime):
    """define y'' = -3*y*y'"""
    return -3.0 * y * yprime

def residuals(y):
    """When we have the right values of y, this function will be zero."""

    res = np.zeros(y.shape)

    res[0] = y[0] - alpha

    for i in range(1, N - 1):
        x = X[i]
        YPP = (y[i - 1] - 2 * y[i] + y[i + 1]) / h**2
        YP = (y[i + 1] - y[i - 1]) / (2 * h)
        res[i] = YPP - Ypp(x, y[i], YP)

    res[-1] = y[-1] - beta
    return res

# we need an initial guess
init = alpha + (beta - alpha) / (x2 - x1) * X

Y = fsolve(residuals, init)

plt.plot(X, Y)
plt.savefig('images/bvp-nonlinear-1.png')
That code looks useful, so I put it in the pycse module in the function
BVP_nl. Here is an example usage. We have to create two functions, one
for the differential equation and one for the boundary conditions, and
provide an initial guess.
import numpy as np
from pycse import BVP_nl
import matplotlib.pyplot as plt

x1 = 0.0
x2 = 2.0

alpha = 0.0
beta = 1.0

def Ypp(x, y, yprime):
    """define y'' = -3*y*y'"""
    return -3.0 * y * yprime

def BC(X, Y):
    return [alpha - Y[0], beta - Y[-1]]

X = np.linspace(x1, x2)
init = alpha + (beta - alpha) / (x2 - x1) * X

x, y = BVP_nl(Ypp, X, BC, init)

plt.plot(x, y)
plt.savefig('images/bvp-nonlinear-2.png')
Boundary value problems may have more than one solution. Let us
consider the BVP:
\begin{align}
y'' + |y| &= 0 \tag{25} \\
y(0) &= 0 \tag{26} \\
y(4) &= -2 \tag{27}
\end{align}
We will see this equation has two answers, depending on your initial
guess. We convert this to the following set of coupled equations:
\begin{align}
y_1' &= y_2 \tag{28} \\
y_2' &= -|y_1| \tag{29} \\
y_1(0) &= 0 \tag{30} \\
y_1(4) &= -2 \tag{31}
\end{align}
This BVP is nonlinear because of the absolute value. We will have to
guess solutions to get started. We will guess two different solutions, both of
which will be constant values. We will use pycse.bvp to solve the equation.
import numpy as np
from pycse import bvp
import matplotlib.pyplot as plt

def odefun(Y, x):
    y1, y2 = Y
    dy1dx = y2
    dy2dx = -np.abs(y1)
    return [dy1dx, dy2dx]

def bcfun(Y):
    y1a, y2a = Y[0][0], Y[1][0]
    y1b, y2b = Y[0][-1], Y[1][-1]

    # residuals for y1(0) = 0 and y1(4) = -2
    return [y1a, -2 - y1b]

x = np.linspace(0, 4, 100)

y1 = 1.0 * np.ones(x.shape)
y2 = 0.0 * np.ones(x.shape)

Yinit = np.vstack([y1, y2])

sol = bvp(odefun, bcfun, x, Yinit)

plt.plot(x, sol[0])

# another initial guess
y1 = -1.0 * np.ones(x.shape)
y2 = 0.0 * np.ones(x.shape)

Yinit = np.vstack([y1, y2])

sol = bvp(odefun, bcfun, x, Yinit)

plt.plot(x, sol[0])
plt.legend(['guess 1', 'guess 2'])
plt.savefig('images/bvp-another-nonlin-1.png')
This example shows that a nonlinear BVP may have different solutions,
and which one you get depends on the guess you make for the solution. This
is analogous to solving nonlinear algebraic equations (which is what is done
in solving this problem!).
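For this particular BVP the two solutions can also be written in closed form, which gives an independent check on the numerical results. The sketch below is a derivation check and not part of pycse: where y is negative the equation reduces to y'' = y (hyperbolic sine), and where y is positive it reduces to y'' = -y (sine). The residual is estimated with central differences, so it is small but not exactly zero.

```python
import numpy as np

x = np.linspace(0, 4, 2001)
h = x[1] - x[0]

# Solution 1: y <= 0 everywhere, so y'' = y and y = A sinh(x) with y(4) = -2
ya = -2.0 * np.sinh(x) / np.sinh(4.0)

# Solution 2: y >= 0 on [0, pi] (y'' = -y), then y <= 0 on [pi, 4] (y'' = y).
# Matching the value and slope at x = pi gives the same amplitude B on both pieces.
B = 2.0 / np.sinh(4.0 - np.pi)
yb = np.where(x <= np.pi, B * np.sin(x), -B * np.sinh(x - np.pi))

def residual(y):
    """Central-difference estimate of y'' + |y| at the interior nodes."""
    ypp = (y[:-2] - 2 * y[1:-1] + y[2:]) / h**2
    return ypp + np.abs(y[1:-1])

for name, y in [('solution 1', ya), ('solution 2', yb)]:
    print(name, 'y(0) =', y[0], 'y(4) =', y[-1],
          'max |residual| =', np.max(np.abs(residual(y))))
```

Both functions satisfy the same boundary conditions, which is why the initial guess decides which one a numerical solver finds.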
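\begin{align}
f''' + \frac{1}{2} f f'' &= 0 \tag{32} \\
f(0) &= 0 \tag{33} \\
f'(0) &= 0 \tag{34} \\
f'(\infty) &= 1 \tag{35}
\end{align}
\begin{align}
f_1' &= f_2 \tag{36} \\
f_2' &= f_3 \tag{37} \\
f_3' &= -\frac{1}{2} f_1 f_3 \tag{38} \\
f_1(0) &= 0 \tag{39} \\
f_2(0) &= 0 \tag{40} \\
f_2(\infty) &= 1 \tag{41}
\end{align}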
import numpy as np
from pycse import bvp

def odefun(F, x):
    f1, f2, f3 = F
    return [f2,
            f3,
            -0.5 * f1 * f3]

def bcfun(F):
    return [F[0][0],          # f1(0) = 0
            F[1][0],          # f2(0) = 0
            1.0 - F[1][-1]]   # f2(inf) = 1

eta = np.linspace(0, 6, 100)
f1init = eta
f2init = np.exp(-eta)
f3init = np.exp(-eta)

Finit = np.vstack([f1init, f2init, f3init])

sol = bvp(odefun, bcfun, eta, Finit)

print("f''(0) = f_3(0) = {0}".format(sol[2, 0]))

import matplotlib.pyplot as plt
plt.plot(eta, sol[0])
plt.xlabel(r'$\eta$')
plt.ylabel(r'$f(\eta)$')
plt.savefig('images/blasius.png')
This leads to the following set of equations:
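\begin{align}
\frac{dC_0}{dt} &= 0 \quad \text{(entrance concentration never changes)} \tag{42} \\
\frac{dC_1}{dt} &= -\nu_0 \frac{C_1 - C_0}{V_1 - V_0} - k C_1^2 \tag{43} \\
\frac{dC_2}{dt} &= -\nu_0 \frac{C_2 - C_1}{V_2 - V_1} - k C_2^2 \tag{44} \\
&\vdots \tag{45} \\
\frac{dC_4}{dt} &= -\nu_0 \frac{C_4 - C_3}{V_4 - V_3} - k C_4^2 \tag{46}
\end{align}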
Last, we need initial conditions for all the nodes in the discretization.
Let us assume the reactor was full of empty solvent, so that Ci = 0 at t = 0.
In the next block of code, we get the transient solutions, and the steady
state solution.
import numpy as np
from scipy.integrate import odeint

Ca0 = 2      # Entering concentration
vo = 2       # volumetric flow rate
volume = 20  # total volume of reactor, spacetime = 10
k = 1        # reaction rate constant

N = 100      # number of points to discretize the reactor volume on

init = np.zeros(N)  # Concentration in reactor at t = 0
init[0] = Ca0       # concentration at entrance

V = np.linspace(0, volume, N)  # discretized volume elements
tspan = np.linspace(0, 25)     # time span to integrate over

def method_of_lines(C, t):
    """coupled ODES at each node point"""
    D = -vo * np.diff(C) / np.diff(V) - k * C[1:]**2
    return np.concatenate([[0],  # C0 is constant at entrance
                           D])

sol = odeint(method_of_lines, init, tspan)

# steady state solution
def pfr(C, V):
    return 1.0 / vo * (-k * C**2)

ssol = odeint(pfr, Ca0, V)
After approximately one space time, the steady state solution is reached
at the exit. For completeness, we also examine the steady state solution.
plt.figure()
plt.plot(V, ssol, label='Steady state')
plt.plot(V, sol[-1], label='t = {}'.format(tspan[-1]))
plt.xlabel('Volume')
plt.ylabel('$C_A$')
plt.legend(loc='best')
plt.savefig('images/transient-pfr-2.png')
takes slightly longer to compute then, since the number of coupled odes is
equal to the number of nodes.
We can also create an animated gif to show how the concentration of
A throughout the reactor varies with time. Note, I had to install ffmpeg
(http://ffmpeg.org/) to save the animation.
http://kitchingroup.cheme.cmu.edu/media/transient_pfr.mp4
You can see from the animation that after about 10 time units, the
solution is not changing further, suggesting steady state has been reached.
in the spatial dimension as $\frac{\partial^2 u}{\partial x^2} \approx (u(x + h) - 2 u(x) + u(x - h))/h^2$ at each
node. This leads to a set of coupled ordinary differential equations that is
easy to solve.
Let us say the rod has a length of 1, k = 0.02, and solve for the time-
dependent temperature profiles.
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

N = 100  # number of points to discretize
L = 1.0
X = np.linspace(0, L, N)  # position along the rod
h = L / (N - 1)

k = 0.02

def odefunc(u, t):
    dudt = np.zeros(X.shape)

    dudt[0] = 0  # constant at boundary condition
    dudt[-1] = 0

    # now for the internal nodes
    for i in range(1, N - 1):
        dudt[i] = k * (u[i + 1] - 2 * u[i] + u[i - 1]) / h**2

    return dudt

init = 150.0 * np.ones(X.shape)  # initial temperature
init[0] = 100.0   # one boundary condition
init[-1] = 200.0  # the other boundary condition

tspan = np.linspace(0.0, 5.0, 100)
sol = odeint(odefunc, init, tspan)

for i in range(0, len(tspan), 5):
    plt.plot(X, sol[i], label='t={0:1.2f}'.format(tspan[i]))

# put legend outside the figure
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5))
plt.xlabel('X position')
plt.ylabel('Temperature')

# adjust figure edges so the legend is in the figure
plt.subplots_adjust(top=0.89, right=0.77)
plt.savefig('images/pde-transient-heat-1.png')

# Make a 3d figure
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

SX, ST = np.meshgrid(X, tspan)
ax.plot_surface(SX, ST, sol, cmap='jet')
ax.set_xlabel('X')
ax.set_ylabel('time')
ax.set_zlabel('T')
ax.view_init(elev=15, azim=-124)  # adjust view so it is easy to see
plt.savefig('images/pde-transient-heat-3d.png')

# animated solution. We will use imagemagick for this

# we save each frame as an image, and use the imagemagick convert command to
# make an animated gif
for i in range(len(tspan)):
    plt.clf()
    plt.plot(X, sol[i])
    plt.xlabel('X')
    plt.ylabel('T(X)')
    plt.title('t = {0}'.format(tspan[i]))
    plt.savefig('___t{0:03d}.png'.format(i))

import glob
import os
import subprocess

# expand the wildcard in Python, since no shell is involved in these calls
frames = sorted(glob.glob('___t*.png'))
print(subprocess.call(['convert', '-quality', '100'] + frames
                      + ['images/transient_heat.gif']))
for f in frames:
    os.remove(f)  # remove temp files
This version of the graphical solution is not that easy to read, although
with some study you can see the solution evolves from the initial condition
which is flat, to the steady state solution which is a linear temperature ramp.
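The loop over the interior nodes in the ODE function can also be written with array slicing, which is usually faster for large N. This is a sketch under the same setup as the example (N, L, k, and the fixed boundary nodes); the names odefunc_loop and odefunc_vectorized and the test profile u_test are introduced here only to compare the two forms.

```python
import numpy as np

N = 100
L = 1.0
X = np.linspace(0, L, N)
h = L / (N - 1)
k = 0.02

def odefunc_loop(u, t):
    dudt = np.zeros(X.shape)  # boundary nodes stay at zero rate of change
    for i in range(1, N - 1):
        dudt[i] = k * (u[i + 1] - 2 * u[i] + u[i - 1]) / h**2
    return dudt

def odefunc_vectorized(u, t):
    dudt = np.zeros_like(u)   # boundary nodes stay at zero rate of change
    dudt[1:-1] = k * (u[2:] - 2 * u[1:-1] + u[:-2]) / h**2
    return dudt

# arbitrary test profile, used only to compare the two functions
u_test = 150.0 + 50.0 * np.sin(np.pi * X)
same = np.allclose(odefunc_loop(u_test, 0.0), odefunc_vectorized(u_test, 0.0))
print(same)
```

Either function can be passed to odeint in the same way.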
The 3d version may be easier to interpret. The temperature profile starts
10.5.3 Transient diffusion - partial differential equations
We want to solve for the concentration profile of a component that diffuses
into a 1D rod, with an impermeable barrier at the end. The PDE governing
this situation is:
$\frac{\partial C}{\partial t} = D \frac{\partial^2 C}{\partial x^2}$
Note that we cannot use the method of lines as we did before because
we have the derivative-based boundary condition at one of the boundaries.
We approximate the time derivative as:
$\left.\frac{\partial C}{\partial t}\right|_{i,j} \approx \frac{C_{i,j+1} - C_{i,j}}{\Delta t}$
and the second spatial derivative as:
$\left.\frac{\partial^2 C}{\partial x^2}\right|_{i,j} \approx \frac{C_{i+1,j} - 2 C_{i,j} + C_{i-1,j}}{h^2}$
We define $\alpha = \frac{D \Delta t}{h^2}$, and from these two approximations and the PDE,
we solve for the unknown solution at a later time step as:
$C_{i,j+1} = \alpha C_{i+1,j} + (1 - 2\alpha) C_{i,j} + \alpha C_{i-1,j}$
We know $C_{i,j=0}$ from the initial conditions, so we simply need to iterate
to evaluate $C_{i,j}$, which is the solution at each time step.
See also: http://www3.nd.edu/~jjwteach/441/PdfNotes/lecture16.pdf
import numpy as np
import matplotlib.pyplot as plt

N = 20  # number of points to discretize
L = 1.0
X = np.linspace(0, L, N)  # position along the rod
h = L / (N - 1)           # discretization spacing

C0t = 0.1  # concentration at x = 0
D = 0.02

tfinal = 50.0
Ntsteps = 1000
dt = tfinal / (Ntsteps - 1)
t = np.linspace(0, tfinal, Ntsteps)

alpha = D * dt / h**2
print(alpha)

C_xt = []  # container for all the time steps

# initial condition at t = 0
C = np.zeros(X.shape)
C[0] = C0t

C_xt += [C.copy()]  # store a copy, because C is updated in place below

for j in range(1, Ntsteps):
    Cnew = np.zeros(C.shape)
    Cnew[0] = C0t
    Cnew[1:-1] = alpha * C[2:] + (1 - 2 * alpha) * C[1:-1] + alpha * C[0:-2]
    Cnew[-1] = Cnew[-2]  # derivative boundary condition flux = 0
    C[:] = Cnew
    C_xt += [Cnew]

    # plot selective solutions
    if j in [1, 2, 5, 10, 20, 50, 100, 200, 500]:
        plt.plot(X, Cnew, label='t={0:1.2f}'.format(t[j]))

plt.xlabel('Position in rod')
plt.ylabel('Concentration')
plt.title('Concentration at different times')
plt.legend(loc='best')
plt.savefig('images/transient-diffusion-temporal-dependence.png')

C_xt = np.array(C_xt)
plt.figure()
plt.plot(t, C_xt[:, 5], label='x={0:1.2f}'.format(X[5]))
plt.plot(t, C_xt[:, 10], label='x={0:1.2f}'.format(X[10]))
plt.plot(t, C_xt[:, 15], label='x={0:1.2f}'.format(X[15]))
plt.plot(t, C_xt[:, 19], label='x={0:1.2f}'.format(X[19]))
plt.legend(loc='best')
plt.xlabel('Time')
plt.ylabel('Concentration')
plt.savefig('images/transient-diffusion-position-dependence.png')
0.36136136136136143
The solution is somewhat sensitive to the choices of time step and spatial
discretization. If you make the time step too big, the method is not stable,
and large oscillations may occur.
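For this explicit update, the standard stability condition is that alpha = D*dt/h**2 stays at or below 1/2. The sketch below reruns the same update rule with two values of alpha to illustrate the difference; the function name ftcs and the particular values 0.36 and 0.60 are choices made here for illustration.

```python
import numpy as np

def ftcs(alpha, nsteps=200, N=20, C0t=0.1):
    """Run the explicit update for nsteps and return the final profile."""
    C = np.zeros(N)
    C[0] = C0t
    for _ in range(nsteps):
        Cnew = np.empty_like(C)
        Cnew[0] = C0t
        Cnew[1:-1] = alpha * C[2:] + (1 - 2 * alpha) * C[1:-1] + alpha * C[:-2]
        Cnew[-1] = Cnew[-2]  # zero-flux boundary
        C = Cnew
    return C

stable = ftcs(alpha=0.36)    # below the 1/2 limit
unstable = ftcs(alpha=0.60)  # above the 1/2 limit

print(np.max(np.abs(stable)), np.max(np.abs(unstable)))
```

In the stable case the concentration stays between 0 and the boundary value, while in the unstable case the oscillations grow far beyond it. In practice this means that refining the spatial grid also requires a smaller time step.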
11 Plotting
11.1 Plot customizations - Modifying line, text and figure
properties
Matlab post
Here is a vanilla plot.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi)
plt.plot(x, np.sin(x))
plt.savefig('images/plot-customization-1.png')
Let us increase the line thickness, change the line color to red, and make
the markers blue circles with black outlines. I also like figures in presentations
to be 6 inches high, and 4 inches wide.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi)

plt.figure(figsize=(4, 6))
plt.plot(x, np.sin(x), lw=2, color='r', marker='o', mec='k', mfc='b')

plt.xlabel('x data', fontsize=12, fontweight='bold')
plt.ylabel('y data', fontsize=12, fontstyle='italic', color='b')
plt.tight_layout()  # auto-adjust position of axes to fit figure.
plt.savefig('images/plot-customization-2.png')
11.1.1 Setting all the text properties in a figure
You may notice the axis tick labels are not consistent with the labels now.
If you have many plots it can be tedious to try setting each text property.
Python to the rescue! With these commands you can find all the text
instances, and change them all at one time! Likewise, you can change all
the lines, and all the axes.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi)

plt.figure(figsize=(4, 6))
plt.plot(x, np.sin(x), lw=2, color='r', marker='o', mec='k', mfc='b')

plt.xlabel('x data', fontsize=12, fontweight='bold')
plt.ylabel('y data', fontsize=12, fontstyle='italic', color='b')

# set all font properties
fig = plt.gcf()
for o in fig.findobj(lambda x: hasattr(x, 'set_fontname')
                     or hasattr(x, 'set_fontweight')
                     or hasattr(x, 'set_fontsize')):
    o.set_fontname('Arial')
    o.set_fontweight('bold')
    o.set_fontsize(14)

# make anything you can set linewidth to be lw=2
def myfunc(x):
    return hasattr(x, 'set_linewidth')

for o in fig.findobj(myfunc):
    o.set_linewidth(2)

plt.tight_layout()  # auto-adjust position of axes to fit figure.
plt.savefig('images/plot-customization-3.png')
There are many other things you can do!
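One of those other things is to set defaults before plotting instead of changing objects afterwards. The following is a minimal sketch using Matplotlib's rcParams; the particular values are arbitrary, and the Agg backend line is an assumption made here so the sketch runs without a display (drop it for interactive use).

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# defaults applied to every figure created after this point
plt.rcParams.update({'font.size': 14,
                     'font.weight': 'bold',
                     'axes.labelweight': 'bold',
                     'lines.linewidth': 2})

x = np.linspace(0, 2 * np.pi)
fig, ax = plt.subplots(figsize=(4, 6))
line, = ax.plot(x, np.sin(x))
xlabel = ax.set_xlabel('x data')
fig.tight_layout()

print(line.get_linewidth(), xlabel.get_fontsize())
```

The findobj approach changes one existing figure, whereas rcParams changes the defaults for all later figures in the session.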
11.2 Plotting two datasets with very different scales
Matlab plot
Sometimes you will have two datasets you want to plot together, but
the scales will be so different it is hard to see them both in the same plot.
Here we examine a few strategies for plotting this kind of data.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi)
y1 = np.sin(x)
y2 = 0.01 * np.cos(x)

plt.plot(x, y1, x, y2)
plt.legend(['y1', 'y2'])
plt.savefig('images/two-scales-1.png')
# in this plot y2 looks almost flat!
11.2.1 Make two plots!
This certainly solves the problem, but you have two full size plots, which
can take up a lot of space in a presentation or report. Often your goal
in plotting both data sets is to compare them, and it is easiest to compare
plots when they are perfectly lined up. Doing that manually can be tedious.
plt.figure()
plt.plot(x, y1)
plt.legend(['y1'])
plt.savefig('images/two-scales-2.png')

plt.figure()
plt.plot(x, y2)
plt.legend(['y2'])
plt.savefig('images/two-scales-3.png')
11.2.2 Scaling the results
Sometimes you can scale one dataset so it has a similar magnitude as the
other data set. Here we could multiply y2 by 100, and then it will be similar
in size to y1. Of course, you need to indicate that y2 has been scaled in the
graph somehow. Here we use the legend.
plt.figure()
plt.plot(x, y1, x, 100 * y2)
plt.legend(['y1', '100*y2'])
plt.savefig('images/two-scales-4.png')
11.2.3 Double-y axis plot
Using two separate y-axes can solve your scaling problem. Note that each
y-axis is color coded to the data. It can be difficult to read these graphs
when printed in black and white.
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(x, y1)
ax1.set_ylabel('y1')

ax2 = ax1.twinx()
ax2.plot(x, y2, 'r-')
ax2.set_ylabel('y2', color='r')
for tl in ax2.get_yticklabels():
    tl.set_color('r')

plt.savefig('images/two-scales-5.png')
11.2.4 Subplots
An alternative approach to double y axes is to use subplots.
# plt.subplots creates its own figure, so no separate plt.figure() is needed
f, axes = plt.subplots(2, 1)
axes[0].plot(x, y1)
axes[0].set_ylabel('y1')

axes[1].plot(x, y2)
axes[1].set_ylabel('y2')
plt.savefig('images/two-scales-6.png')
11.3 Customizing plots after the fact
Matlab post Sometimes it is desirable to make a plot that shows the data
you want to present, and to customize the details, e.g. font size/type and
line thicknesses afterwards. It can be tedious to try to add the customization
code to the existing code that makes the plot. Today, we look at a way to
do the customization after the plot is created.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2)
y1 = x
y2 = x**2
y3 = x**3

plt.plot(x, y1, x, y2, x, y3)
xL = plt.xlabel('x')
yL = plt.ylabel('f(x)')
plt.title('plots of y = x^n')
plt.legend(['x', 'x^2', 'x^3'], loc='best')
plt.savefig('images/after-customization-1.png')

fig = plt.gcf()

plt.setp(fig, 'size_inches', (4, 6))
plt.savefig('images/after-customization-2.png')

# set lines to dashed
from matplotlib.lines import Line2D
for o in fig.findobj(Line2D):
    o.set_linestyle('--')

# Matlab equivalent:
# set(allaxes,'FontName','Arial','FontWeight','Bold','LineWidth',2,'FontSize',14);

import matplotlib.text as text
for o in fig.findobj(text.Text):
    plt.setp(o, 'fontname', 'Arial', 'fontweight', 'bold', 'fontsize', 14)

plt.setp(xL, 'fontstyle', 'italic')
plt.setp(yL, 'fontstyle', 'italic')
plt.savefig('images/after-customization-3.png')
11.4 Fancy, built-in colors in Python
Matlab post
Matplotlib has a lot of built-in colors. Here is a list of them, and an
example of using them.
import numpy as np
import matplotlib.pyplot as plt

# this plots horizontal lines for each y value of m.
for m in np.linspace(1, 50, 100):
    plt.plot([0, 50], [m, m])

plt.savefig('images/blues-1.png')
import numpy as np
import matplotlib.pyplot as plt

c = {}
with open('color.table') as f:
    for line in f:
        fields = line.split('\t')
        colorname = fields[0].lower()
        hexcode = fields[1]
        c[colorname] = hexcode

names = c.keys()
names = sorted(names)

print(names)

blues = [c['alice blue'],
         c['light blue'],
         c['baby blue'],
         c['light sky blue'],
         c['maya blue'],
         c['cornflower blue'],
         c['bleu de france'],
         c['azure'],
         c['blue sapphire'],
         c['cobalt'],
         c['blue'],
         c['egyptian blue'],
         c['duke blue']]

ax = plt.gca()
# older Matplotlib versions used ax.set_color_cycle(blues)
ax.set_prop_cycle(color=blues)

# this plots horizontal lines for each y value of m.
for i, m in enumerate(np.linspace(1, 50, 100)):
    plt.plot([0, 50], [m, m])

plt.savefig('images/blues-2.png')
[aero, aero blue, african violet, air force blue (raf), air force blue (usaf
11.6 Interactive plotting
11.6.1 Basic mouse clicks
One basic event a figure can react to is a mouse click. Let us make a graph
with a parabola in it, and draw the shortest line from a point clicked on to
the graph. Here is an example of doing that.
Normal return from subroutine COBYLA
Here is the result from two clicks. For some reason, this only works when
you click inside the parabola. It does not work outside the parabola.
We can even do different things with different mouse clicks. A left click
corresponds to event.button = 1, a middle click is event.button = 2, and a
right click is event.button = 3. You can detect if a double click occurs too.
Here is an example of these different options.
                                   event.dblclick))

    ms = 5  # marker size
    if event.dblclick:  # make marker bigger
        ms = 10

    ax.plot([event.xdata], [event.ydata], 'o', color=colors[event.button], ms=ms)
    ax.figure.canvas.draw()  # this line is critical to change the title
    plt.savefig('images/interactive-button-click.png')

cid = fig.canvas.mpl_connect('button_press_event', onclick)
plt.show()
Finally, you may want to have key modifiers for your clicks, e.g. Ctrl-
click is different than a click.
from __future__ import print_function
import sys
import numpy as np
import matplotlib.pyplot as plt


def press(event):
    print('press', event.key)
    sys.stdout.flush()
    if event.key == 'x':
        visible = xl.get_visible()
        xl.set_visible(not visible)
        fig.canvas.draw()

fig, ax = plt.subplots()

fig.canvas.mpl_connect('key_press_event', press)

ax.plot(np.random.rand(12), np.random.rand(12), 'go')
xl = ax.set_xlabel('easy come, easy go')

plt.show()
MPL MouseEvent: xy=(337,263) xydata=(4.66330645161,0.559895833333) button=2 dblclick=
MPL MouseEvent: xy=(367,305) xydata=(5.20766129032,0.669270833333) button=1 dblclick=
You can have almost every key-click combination imaginable. This allows
you to have many different things that can happen when you click on a
graph. With this method, you can get the coordinates close to a data point,
but you do not get the properties of the point. For that, we need another
mechanism.

# make the figure
fig = plt.figure()

ax = fig.add_subplot(111)
line, = ax.plot(x, y, 'ro-')
marker, = ax.plot([0.5], [0.5], 'go', ms=15)

ax.set_title('Move the mouse around')

def onmove(event):

    xe = event.xdata
    ye = event.ydata
    if xe is None:
        return  # the mouse is outside the axes

    ax.set_title('at x={0} y={1}'.format(xe, p(xe)))
    marker.set_xdata([xe])
    marker.set_ydata([p(xe)])

    ax.figure.canvas.draw()  # this line is critical to change the title

cid = fig.canvas.mpl_connect('motion_notify_event', onmove)
plt.show()

cid = fig.canvas.mpl_connect('key_press_event', onpress)
plt.show()
import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_title('click on a line')

x = np.linspace(0, 2 * np.pi)

L1, = ax.plot(x, np.sin(x), picker=5)
L2, = ax.plot(x, np.cos(x), picker=5)

def onpick(event):
    thisline = event.artist

    # reset all lines to thin
    for line in [L1, L2]:
        line.set_lw(1)

    thisline.set_lw(5)  # make selected line thick
    ax.figure.canvas.draw()  # this line is critical to change the linewidth

fig.canvas.mpl_connect('pick_event', onpick)

plt.show()
import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_title('click on a point')

x = [0, 1, 2, 3, 4, 5]
labels = ['a', 'b', 'c', 'd', 'e', 'f']
ax.plot(x, 'bo', picker=5)

# this is the transparent marker for the selected data point
marker, = ax.plot([0], [0], 'yo', visible=False, alpha=0.8, ms=15)

def onpick(event):
    ind = event.ind[0]  # event.ind is an array; use the first picked point
    ax.set_title('Data point {0} is labeled "{1}"'.format(ind, labels[ind]))
    marker.set_visible(True)
    marker.set_xdata([x[ind]])
    marker.set_ydata([x[ind]])

    ax.figure.canvas.draw()  # this line is critical to update the figure
    plt.savefig('images/interactive-labeled-points.png')

fig.canvas.mpl_connect('pick_event', onpick)

plt.show()
import numpy as np
import matplotlib.pyplot as plt

w, i = np.loadtxt('data/raman.txt', usecols=(0, 1), unpack=True)

plt.plot(w, i)
plt.xlabel('Raman shift (cm$^{-1}$)')
plt.ylabel('Intensity (counts)')

ax = plt.gca()

# put a shaded rectangle over a region
ax.annotate('Some typical region', xy=(550, 15500), xycoords='data')
ax.fill_between([700, 800], 0, [16000, 16000], facecolor='red', alpha=0.25)

# shade the region in the spectrum
ind = (w > 1019) & (w < 1054)
ax.fill_between(w[ind], 0, i[ind], facecolor='gray', alpha=0.5)
area = np.trapz(i[ind], w[ind])
x, y = w[ind][np.argmax(i[ind])], i[ind][np.argmax(i[ind])]
ax.annotate('Area = {0:1.2f}'.format(area), xy=(x, y),
            xycoords='data',
            xytext=(x + 50, y + 5000),
            textcoords='data',
            arrowprops=dict(arrowstyle="->",
                            connectionstyle="angle,angleA=0,angleB=90,rad=10"))

# find a max in this region, and annotate it
ind = (w > 1250) & (w < 1252)
x, y = w[ind][np.argmax(i[ind])], i[ind][np.argmax(i[ind])]
ax.annotate('A peak', xy=(x, y),
            xycoords='data',
            xytext=(x + 350, y + 2000),
            textcoords='data',
            arrowprops=dict(arrowstyle="->",
                            connectionstyle="angle,angleA=0,angleB=90,rad=10"))

# find max in this region, and annotate it
ind = (w > 1380) & (w < 1400)
x, y = w[ind][np.argmax(i[ind])], i[ind][np.argmax(i[ind])]
ax.annotate('Another peak', xy=(x, y),
            xycoords='data',
            xytext=(x + 50, y + 2000),
            textcoords='data',
            arrowprops=dict(arrowstyle="->",
                            connectionstyle="angle,angleA=0,angleB=90,rad=10"))

# indicate a region with connected arrows
ax.annotate('CH bonds', xy=(2780, 6000), xycoords='data')
ax.annotate('', xy=(2800., 5000.), xycoords='data',
            xytext=(3050, 5000), textcoords='data',
            # the arrows connect the xy to xytext coordinates
            arrowprops=dict(arrowstyle="<->",
                            connectionstyle="bar",
                            ec="k",  # edge color
                            shrinkA=0.1, shrinkB=0.1))

plt.savefig('images/plot-annotes.png')
plt.show()
12 Programming
12.1 Some of this, sum of that
Matlab plot
Python provides a sum function to compute the sum of a list. However,
the sum function does not work on every arrangement of numbers, and it
certainly does not work on nested lists. We will solve this problem with
recursion.
Here is a simple example.
v = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # a list
print(sum(v))

v = (1, 2, 3, 4, 5, 6, 7, 8, 9)  # a tuple
print(sum(v))
45
45
If you have data in a dictionary, sum works by default on the keys. You
can give the sum function the values like this.
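A minimal sketch of that behavior follows; the dictionary d is made-up example data.

```python
d = {1: 10, 2: 20, 3: 30}  # hypothetical example data

keys_total = sum(d)             # iterating a dictionary yields its keys
values_total = sum(d.values())  # pass the values explicitly to sum them

print(keys_total)
print(values_total)
```

Summing the keys only works here because they are numbers; with string keys, sum(d) raises a TypeError.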
v = [1,
     [2, 3],
     [4, [5, 6]],
     7,
     [8, 9]]

def recursive_sum(X):
    """compute sum of arbitrarily nested lists"""
    s = 0  # initial value of the sum

    for i in range(len(X)):
        if isinstance(X[i], (int, float, complex)):
            # this is the terminal step
            s += X[i]
        else:
            # we did not get a number, so we recurse
            s += recursive_sum(X[i])
    return s

print(recursive_sum(v))
print(recursive_sum([1, 2, 3, 4, 5, 6, 7, 8, 9]))  # test on non-nested list
45
45
a = [4, 5, 1, 6, 8, 3, 2]
print(a)
a.sort()  # inplace sorting
print(a)

a.sort(reverse=True)
print(a)
[4, 5, 1, 6, 8, 3, 2]
[1, 2, 3, 4, 5, 6, 8]
[8, 6, 5, 4, 3, 2, 1]
If you do not want to modify your list, but rather get a copy of a sorted
list, use the sorted command.
a = [4, 5, 1, 6, 8, 3, 2]
print('sorted a = ', sorted(a))  # no change to a
print('sorted a = ', sorted(a, reverse=True))  # no change to a
print('a = ', a)
sorted a = [1, 2, 3, 4, 5, 6, 8]
sorted a = [8, 6, 5, 4, 3, 2, 1]
a = [4, 5, 1, 6, 8, 3, 2]
a = ['b', 'a', 'c', 'tree']
print(sorted(a))
['a', 'b', 'c', 'tree']
a = ['B', 'a', 'c', 'tree']
print(sorted(a))

# sort by lower case letter
print(sorted(a, key=str.lower))
['B', 'a', 'c', 'tree']
['a', 'B', 'c', 'tree']
12.3 Unique entries in a vector
Matlab post
It is surprising how often you need to know only the unique entries in
a vector of entries. In python, we create a "set" from a list, which only
contains unique entries. Then we convert the set back to a list.
a = [1, 1, 2, 3, 4, 5, 3, 5]

b = list(set(a))
print(b)
[1, 2, 3, 4, 5]
a = ['a',
     'b',
     'abracadabra',
     'b',
     'c',
     'd',
     'b']

print(list(set(a)))
['d', 'b', 'abracadabra', 'c', 'a']
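Note that a set does not keep the original order of the entries, as the output above shows. If the order matters, here is a short sketch of two alternatives: dict.fromkeys keeps the first-seen order (guaranteed in Python 3.7 and later), and np.unique returns the unique entries sorted.

```python
import numpy as np

a = ['a', 'b', 'abracadabra', 'b', 'c', 'd', 'b']

# dictionary keys are unique and keep insertion order (Python 3.7+)
ordered_unique = list(dict.fromkeys(a))

# np.unique returns a sorted array of the unique entries
sorted_unique = np.unique(a)

print(ordered_unique)
print(sorted_unique)
```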
def recursive_factorial(n):
    """compute the factorial recursively. Note if you put a negative
    number in, this function will never end. We also do not check if
    n is an integer."""
    if n == 0:
        return 1
    else:
        return n * recursive_factorial(n - 1)

print(recursive_factorial(5))
120
120.0
n = 5
factorial_loop = 1
for i in range(1, n + 1):
    factorial_loop *= i

print(factorial_loop)
120
1. the syntax of the for loop is quite different with the use of the in
operator.
3. We have to loop from 1 to n+1 because the last number in the range
is not returned.
12.4.1 Conclusions
Recursive functions have a special niche in mathematical programming.
There is often another way to accomplish the same goal. That is not always
true though, and in a future post we will examine cases where recursion is
the only way to solve a problem.
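The recursive factorial shown earlier does not stop for negative input and does not check that n is an integer. As a sketch of how those cases might be guarded, the function below validates its argument first; the name checked_factorial is hypothetical, and math.factorial from the standard library is used only as a cross-check.

```python
import math

def checked_factorial(n):
    """Recursive factorial that rejects negative or non-integer input."""
    if not isinstance(n, int) or n < 0:
        raise ValueError('n must be a non-negative integer')
    if n == 0:
        return 1  # terminal step
    return n * checked_factorial(n - 1)

print(checked_factorial(5) == math.factorial(5))

try:
    checked_factorial(-1)
except ValueError as e:
    print('rejected:', e)
```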
12.5 Brief intro to regular expressions
Matlab post
This example shows how to use a regular expression to find strings
matching the pattern :cmd:`datastring`. We want to find these strings, and
then replace them with something that depends on what cmd is, and what
datastring is.
Let us define some commands that will take datastring as an argument,
and return the modified text. The idea is to find all the cmds, and then
run them. We use Python's eval function to get the function object from
a string, and the cmd functions all take a datastring argument (we define
them that way). We will create commands to replace :cmd:`datastring` with
html code for a light gray background, and :red:`some text` with html code
making the text red.
text = r'''Here is some text. use the :cmd:`open` to get the text into
a variable. It might also be possible to get a multiline
:red:`line
2` directive.'''

print(text)
print('---------------------------------')
Here is some text. use the :cmd:`open` to get the text into
a variable. It might also be possible to get a multiline
:red:`line
2` directive.
---------------------------------
def cmd(datastring):
    'replace :cmd:`datastring` with html code with light gray background'
    s = '<FONT style="BACKGROUND-COLOR: LightGray">%{0}</FONT>'
    html = s.format(datastring)
    return html

def red(datastring):
    'replace :red:`datastring` with html code to make datastring in red font'
    html = '<font color=red>{0}</font>'.format(datastring)
    return html
everything between :*: as the directive. ([^:]*) matches everything not a
:. :([^:]*): matches the stuff between two :.
2. then we want everything between `*`. ([^`]*) matches everything not a `.
3. The () makes a group that python stores so we can refer to them later.
import re
regex = ':([^:]*):`([^`]*)`'
matches = re.findall(regex, text)
for directive, datastring in matches:
    directive = eval(directive)  # get the function
    text = re.sub(regex, directive(datastring), text)

print('Modified text:')
print(text)
Modified text:
Here is some text. use the <FONT style="BACKGROUND-COLOR: LightGray">%open</FONT> to
a variable. It might also be possible to get a multiline
<FONT style="BACKGROUND-COLOR: LightGray">%open</FONT> directive.
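Note that re.sub with a fixed replacement string replaces every match of the pattern, which is why both directives in the output above were rendered with the :cmd: markup. One alternative is to pass a function to re.sub, so that each match is rendered by its own directive. The sketch below assumes the same backtick-delimited directive syntax. The names DIRECTIVES and render are illustrative, and a dictionary lookup stands in for eval.

```python
import re

def cmd(datastring):
    return '<FONT style="BACKGROUND-COLOR: LightGray">%{0}</FONT>'.format(datastring)

def red(datastring):
    return '<font color=red>{0}</font>'.format(datastring)

# explicit table of allowed directives (hypothetical name), instead of eval
DIRECTIVES = {'cmd': cmd, 'red': red}
regex = re.compile(':([^:]*):`([^`]*)`')

def render(match):
    'called by re.sub once per match; returns the replacement text'
    directive, datastring = match.groups()
    func = DIRECTIVES.get(directive)
    if func is None:
        return match.group(0)  # unknown directive: leave the text unchanged
    return func(datastring)

text = 'use the :cmd:`open` to get a multiline\n:red:`line\n2` directive.'
result = regex.sub(render, text)
print(result)
```

The lookup table also avoids evaluating arbitrary strings found in the text.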
text = '''
As we have seen, handling units with third party functions is fragile, and often requires additional code to wrap

Before doing the examples, let us consider how the quantities package handles dimensionless numbers.

import quantities as u

a = 5 * u.m
L = 10 * u.m # characteristic length

print a/L
print type(a/L)

'''

words = text.split()
print(words)
[As, we, have, seen,, handling, units, with, third, party, functio
Let us get the length of each word.
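The listing for this step is not reproduced here. A minimal sketch, assuming a short stand-in list in place of the full words list computed above:

```python
# stand-in for the first few entries of the words list
words = ['As', 'we', 'have', 'seen,', 'handling', 'units']

# list comprehension: one length per word
lengths = [len(word) for word in words]
print(lengths)

# the same thing in a functional style
print(list(map(len, words)))
```

Note that punctuation attached to a word, as in 'seen,', counts toward its length.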
Now let us get all the words that start with the letter "a". This is
sometimes called filtering a list. We use a string function startswith to
check for upper and lower-case letters. We will use list comprehension with
a condition.
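A minimal sketch of this filter, again with a short stand-in list rather than the full words list. str.startswith accepts a tuple of prefixes, which covers the upper- and lower-case letters in one call.

```python
# stand-in word list; the real list comes from text.split()
words = ['As', 'we', 'have', 'seen,', 'a', 'apple', 'Before', 'and']

# keep only the words whose first letter is 'a' or 'A'
a_words = [word for word in words if word.startswith(('a', 'A'))]
print(a_words)
```

An equivalent condition would be word.lower().startswith('a').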
A slightly harder example is to find all the words that are actually num-
bers. We could use a regular expression for that, but we will instead use a
function we create. We use a function that tries to cast a word as a float.
If this fails, we know the word is not a float, so we return False.
def float_p(word):
    try:
        float(word)
        return True
    except ValueError:
        return False

print([word for word in words if float_p(word)])

# here is a functional approach
print(list(filter(float_p, words)))
['5', '10']
['5', '10']
Finally, we consider filtering the list to find all words that contain certain
symbols, say any character in this string "./=*#". Any of those characters
will do, so we search each word for one of them, and return True if it contains
it, and False if none are contained.
def punctuation_p(word):
    S = './=*#'
    for s in S:
        if s in word:
            return True
    return False

print([word for word in words if punctuation_p(word)])
print(list(filter(punctuation_p, words)))  # filter returns an iterator in Python 3
In this section we examined a few ways to interact with lists using list
comprehension and functional programming. These approaches make it pos-
sible to work on arbitrary size lists, without needing to know in advance how
big the lists are. New lists are automatically generated as results, without
the need to preallocate lists, i.e. you do not need to know the size of the
output. This can be handy as it avoids needing to write loops in some cases
and leads to more compact code.
import os
from win32com.client import Dispatch

word = Dispatch('Word.Application')
word.Visible = True

document = word.Documents.Add()
selection = word.Selection

selection.TypeText('Hello world. \n')
selection.TypeText('My name is Professor Kitchin\n')
selection.TypeParagraph
selection.TypeText('How are you today?\n')
selection.TypeParagraph
selection.Style = 'Normal'


selection.TypeText('Big Finale\n')
selection.Style = 'Heading 1'
selection.TypeParagraph

H1 = document.Styles.Item('Heading 1')
H1.Font.Name = 'Garamond'
H1.Font.Size = 20
H1.Font.Bold = 1
H1.Font.TextColor.RGB = 60000  # some ugly color green

selection.TypeParagraph
selection.TypeText('That is all for today!')


document.SaveAs2(os.getcwd() + '/test.docx')
word.Quit()
./test.docx
That is it! I would not call this extra convenient, but if you have a need
to automate the production of Word documents from a program, this is an
approach that you can use. You may find
http://msdn.microsoft.com/en-us/library/kw65a0we%28v=vs.80%29.aspx
a helpful link for documentation of what you can do.
I was going to do this by docx, which does not require Windows, but it
appears broken. It is missing a template directory, and it does not match
the github code. docx is not actively maintained anymore either.

document.append(paragraph('That is all for today.'))

document.save('test.doc')
import xlrd

wb = xlrd.open_workbook('data/example.xlsx')
sh1 = wb.sheet_by_name(u'Sheet1')

print(sh1.col_values(0))  # column 0
print(sh1.col_values(1))  # column 1

sh2 = wb.sheet_by_name(u'Sheet2')

x = sh2.col_values(0)  # column 0
y = sh2.col_values(1)  # column 1

import matplotlib.pyplot as plt
plt.plot(x, y)
plt.savefig('images/excel-1.png')
['value', 'function']
[2.0, 3.0]
12.8.1 Writing Excel workbooks
Writing data to Excel sheets is pretty easy. Note, however, that this over-
writes the worksheet if it already exists.
import xlwt
import numpy as np

x = np.linspace(0, 2)
y = np.sqrt(x)

# save the data
book = xlwt.Workbook()

sheet1 = book.add_sheet('Sheet 1')

for i in range(len(x)):
    sheet1.write(i, 0, x[i])
    sheet1.write(i, 1, y[i])

book.save('data/example2.xls')  # maybe can only write .xls format
from xlrd import open_workbook

from xlutils.copy import copy

rb = open_workbook('data/example2.xls', formatting_info=True)
rs = rb.sheet_by_index(0)

wb = copy(rb)

ws = wb.add_sheet('Sheet 2')
ws.write(0, 0, "Appended")

wb.save('data/example2.xls')
12.8.3 Summary
Matlab has better support for interacting with Excel than Python does right
now. You could get better Excel interaction via COM, but that is Windows
specific, and requires you to have Excel installed on your computer. If you
only need to read or write data, then xlrd/xlwt or the openpyxl modules
will serve you well.
ws.Range("B1").Value = x
V = ws.Range("B6").Value
print('at X = {0} V = {1:1.2f} L'.format(x, V))

# we tell Excel the workbook is saved, even though it is not, so it
# will quit without asking us to save.
excel.ActiveWorkbook.Saved = True
excel.Application.Quit()
at X = 0.1 V = 22.73 L
at X = 0.5 V = 113.64 L
at X = 0.9 V = 204.55 L
This was a simple example (one that did not actually need Excel at
all) that illustrates the feasibility of communicating with Excel via a COM
interface.
Some links I have found that help figure out how to do this are:
http://www.numbergrinder.com/2008/11/pulling-data-from-excel-using-python-com/
http://www.numbergrinder.com/2008/11/closing-excel-using-python/
http://www.dzone.com/snippets/script-excel-python
This is an example that just illustrates it is possible to access data from
a simulation that has been run. You have to know quite a bit about the
Aspen flowsheet before writing this code. In particular, you need to open
the Variable Explorer to find the "path" to the variables that you want,
and to know what the units of those variables are.
import os
import win32com.client as win32
aspen = win32.Dispatch('Apwn.Document')

aspen.InitFromArchive2(os.path.abspath(r'data\Flash_Example.bkp'))

## Input variables
feed_temp = aspen.Tree.FindNode(r'\Data\Streams\FEED\Input\TEMP\MIXED').Value
print('Feed temperature was {0} degF'.format(feed_temp))

ftemp = aspen.Tree.FindNode(r'\Data\Blocks\FLASH\Input\TEMP').Value
print('Flash temperature = {0}'.format(ftemp))

## Output variables
eL_out = aspen.Tree.FindNode(r"\Data\Streams\LIQUID\Output\MOLEFLOW\MIXED\ETHANOL").Value
wL_out = aspen.Tree.FindNode(r"\Data\Streams\LIQUID\Output\MOLEFLOW\MIXED\WATER").Value

eV_out = aspen.Tree.FindNode(r"\Data\Streams\VAPOR\Output\MOLEFLOW\MIXED\ETHANOL").Value
wV_out = aspen.Tree.FindNode(r"\Data\Streams\VAPOR\Output\MOLEFLOW\MIXED\WATER").Value

tot = aspen.Tree.FindNode(r"\Data\Streams\FEED\Input\TOTFLOW\MIXED").Value

print('Ethanol vapor mol flow: {0} lbmol/hr'.format(eV_out))
print('Ethanol liquid mol flow: {0} lbmol/hr'.format(eL_out))

print('Water vapor mol flow: {0} lbmol/hr'.format(wV_out))
print('Water liquid mol flow: {0} lbmol/hr'.format(wL_out))

print('Total = {0}. Total in = {1}'.format(eV_out + eL_out + wV_out + wL_out,
                                           tot))

aspen.Close()
import os
import numpy as np
import matplotlib.pyplot as plt
import win32com.client as win32

aspen = win32.Dispatch('Apwn.Document')
aspen.InitFromArchive2(os.path.abspath(r'data\Flash_Example.bkp'))

T = np.linspace(150, 200, 10)

x_ethanol, y_ethanol = [], []

for temperature in T:
    aspen.Tree.FindNode(r'\Data\Blocks\FLASH\Input\TEMP').Value = temperature
    aspen.Engine.Run2()

    x_ethanol.append(aspen.Tree.FindNode(r'\Data\Streams\LIQUID\Output\MOLEFRAC\MIXED\ETHANOL').Value)
    y_ethanol.append(aspen.Tree.FindNode(r'\Data\Streams\VAPOR\Output\MOLEFRAC\MIXED\ETHANOL').Value)

plt.plot(T, y_ethanol, T, x_ethanol)
plt.legend(['vapor', 'liquid'])
plt.xlabel('Flash Temperature (degF)')
plt.ylabel('Ethanol mole fraction')
plt.savefig('images/aspen-water-ethanol-flash.png')
aspen.Close()
It takes about 30 seconds to run the previous example. Unfortunately,
the way it is written, if you want to change anything, you have to run all
of the calculations over again. How to avoid that is moderately tricky, and
will be the subject of another example.
In summary, it seems possible to do a lot with Aspen automation via
Python. This can also be done with Matlab, Excel, and other programming
languages where COM automation is possible. The COM interface is not
especially well documented, and you have to do a lot of digging to figure
out some things. It is not clear how committed Aspen is to maintaining or
improving the COM interface
(http://www.chejunkie.com/aspen-plus/aspen-plus-activex-automation-server/).
Hopefully they can keep it alive for power users who do not want to
program in Excel!
import os
import win32com.client as win32
aspen = win32.Dispatch('Apwn.Document')

aspen.InitFromArchive2(os.path.abspath(r'data\Flash_Example.bkp'))

from scipy.optimize import fsolve

def func(flashT):
    flashT = float(flashT)  # COM objects do not understand numpy types
    aspen.Tree.FindNode(r'\Data\Blocks\FLASH\Input\TEMP').Value = flashT
    aspen.Engine.Run2()
    y = aspen.Tree.FindNode(r'\Data\Streams\VAPOR\Output\MOLEFRAC\MIXED\ETHANOL').Value
    return y - 0.8

sol, = fsolve(func, 150.0)
print('A flash temperature of {0:1.2f} degF will have y_ethanol = 0.8'.format(sol))
One unexpected detail was that the Aspen COM objects cannot be as-
signed numpy number types, so it was necessary to recast the argument as
a float. Otherwise, this worked about as expected for an fsolve problem.
def debug():
    print('step 1')
    print(3 + 4)
    print('finished')

debug()
step 1
7
finished
Now, let us redirect the printed lines to a file. We create a file object,
and set sys.stdout equal to that file object.
import sys
print('__stdout__ before = {0}'.format(sys.__stdout__), file=sys.stdout)
print('stdout before = {0}'.format(sys.stdout), file=sys.stdout)

f = open('data/debug.txt', 'w')
sys.stdout = f

# note that sys.__stdout__ does not change, but stdout does.
print('__stdout__ after = {0}'.format(sys.__stdout__), file=sys.stdout)
print('stdout after = {0}'.format(sys.stdout), file=sys.stdout)

debug()

# reset stdout back to console
sys.stdout = sys.__stdout__

print(f)
f.close()  # try to make it a habit to close files
print(f)
Note it can be important to close files. If you are looping through large
numbers of files, you will eventually run out of file handles, causing an error.
We can use a context manager to automatically close the file like this
import sys

# use the open context manager to automatically close the file
with open('data/debug.txt', 'w') as f:
    sys.stdout = f
    debug()
    print(f, file=sys.__stdout__)

# reset stdout
sys.stdout = sys.__stdout__
print(f)
See, the file is closed for us! We can see the contents of our file like
this.
cat data/debug.txt
step 1
7
finished
The approaches above are not fault safe. Suppose our debug function
raised an exception. Then, it could be possible the line to reset the stdout
would not be executed. We can solve this with try/finally code.
import sys

print('before: ', sys.stdout)
try:
    with open('data/debug-2.txt', 'w') as f:
        sys.stdout = f
        # print to the original stdout
        print('during: ', sys.stdout, file=sys.__stdout__)
        debug()
        raise Exception('something bad happened')
finally:
    # reset stdout
    sys.stdout = sys.__stdout__

print('after: ', sys.stdout)
print(f)  # verify it is closed
print(sys.stdout)  # verify this is reset
cat data/debug-2.txt
step 1
7
finished
want to change sys.stdout to a new value inside our context, and change it
back when we exit the context. We will store the value of sys.stdout going
in, and restore it on the way out.
import sys

class redirect:
    def __init__(self, f=sys.stdout):
        "redirect print statement to f. f must be a file-like object"
        self.f = f
        self.stdout = sys.stdout
        print('init stdout: ', sys.stdout, file=sys.__stdout__)
    def __enter__(self):
        sys.stdout = self.f
        # the keyword argument of print is file, not f
        print('stdout in context-manager: ', sys.stdout, file=sys.__stdout__)
    def __exit__(self, *args):
        sys.stdout = self.stdout
        print('__stdout__ at exit = ', sys.__stdout__)

# regular printing
with redirect():
    debug()

# write to a file
with open('data/debug-3.txt', 'w') as f:
    with redirect(f):
        debug()

# mixed regular and
with open('data/debug-4.txt', 'w') as f:
    with redirect(f):
        print('testing redirect')
        with redirect():
            print('temporary console printing')
            debug()
        print('Now outside the inner context. This should go to data/debug-4.txt')
        debug()
        raise Exception('something else bad happened')

print(sys.stdout)
cat data/debug-3.txt
The contents of the other debug file have some additional lines, because
we printed some things while in the redirect context.
cat data/debug-4.txt
See http://www.python.org/dev/peps/pep-0343/ (number 5) for an-
other example of redirecting using a function decorator. I think it is harder
to understand, because it uses a generator.
There were a couple of points in this section:
1. You can control where things are printed in your programs by modi-
fying the value of sys.stdout
2. You can use try/except/finally blocks to make sure code gets executed
in the event an exception is raised
3. You can use context managers to make sure files get closed, and code
gets executed if exceptions are raised.
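The standard library also provides a ready-made context manager for the third point, contextlib.redirect_stdout (available since Python 3.4). A minimal sketch follows, capturing the output in an in-memory io.StringIO buffer instead of a file; the variable names are illustrative.

```python
import contextlib
import io
import sys

def debug():
    print('step 1')
    print(3 + 4)
    print('finished')

original = sys.stdout
buffer = io.StringIO()  # file-like object that collects the printed text

with contextlib.redirect_stdout(buffer):
    debug()

captured = buffer.getvalue()
restored = sys.stdout is original  # stdout is put back on exit
print(captured, end='')
```

sys.stdout is restored on exit even if the body raises an exception.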
L = ['a', 'a', 'b', 'd', 'e', 'b', 'e', 'a']

d = {}
for el in L:
    if el in d:
        d[el] += 1
    else:
        d[el] = 1

print(d)
{'b': 2, 'a': 3, 'e': 2, 'd': 1}
That seems like too much code; there must be a list comprehension
approach combined with a dictionary constructor.
L = ['a', 'a', 'b', 'd', 'e', 'b', 'e', 'a']

print(dict((el, L.count(el)) for el in L))
{'b': 2, 'a': 3, 'd': 1, 'e': 2}
Wow, that is a lot simpler! I suppose for large lists this might be slow,
since count must look through the list for each element, whereas the longer
code looks at each element once, and does one conditional analysis.
Here is another example of much shorter and cleaner code.
from collections import Counter
L = ['a', 'a', 'b', 'd', 'e', 'b', 'e', 'a']
print(Counter(L))
print(Counter(L)['a'])
Counter({'a': 3, 'e': 2, 'b': 2, 'd': 1})
3
import sys

print(sys.version)

print(sys.executable)

print(sys.platform)

# where the platform independent Python files are installed
print(sys.prefix)
import platform

print(platform.uname())
print(platform.system())
print(platform.architecture())
print(platform.machine())
print(platform.node())
print(platform.platform())
print(platform.processor())
print(platform.python_build())
print(platform.python_version())
Johns-MacBook-Air.local
Darwin-13.4.0-x86_64-i386-64bit
i386
(default, Dec 7 2015 11:24:55)
3.5.1
Now, let us look at an example where the directory does exist. We will
change into the directory, run some code, and then raise an Exception.
import os, sys

CWD = os.getcwd()  # store initial position
print('initially inside {0}'.format(os.getcwd()))
TEMPDIR = 'data'

try:
    os.chdir(TEMPDIR)
    print('inside {0}'.format(os.getcwd()))
    print(os.listdir('.'))
    raise Exception('boom')
except:
    print('Exception caught: ', sys.exc_info()[0])
finally:
    print('Running final code')
    os.chdir(CWD)
    print('finally inside {0}'.format(os.getcwd()))
You can see that we changed into the directory, ran some code, and then
caught an exception. Afterwards, we changed back to our original directory.
This code works fine, but it is somewhat verbose, and tedious to write over
and over. We can get a cleaner syntax with a context manager. The context
manager uses the with keyword in python. In a context manager some code
is executed on entering the "context", and code is run on exiting the context.
We can use that to automatically change directory, and when done, change
back to the original directory. We use the contextlib.contextmanager
decorator on a function. With a function, the code up to a yield statement
is run on entering the context, and the code after the yield statement is run
on exiting. We wrap the yield statement in try/except/finally block to make
sure our final code gets run.
import contextlib
import os, sys

@contextlib.contextmanager
def cd(path):
    print('initially inside {0}'.format(os.getcwd()))
    CWD = os.getcwd()

    os.chdir(path)
    print('inside {0}'.format(os.getcwd()))
    try:
        yield
    except:
        print('Exception caught: ', sys.exc_info()[0])
    finally:
        print('finally inside {0}'.format(os.getcwd()))
        os.chdir(CWD)

# Now we use the context manager
with cd('data'):
    print(os.listdir('.'))
    raise Exception('boom')

print()
with cd('data/run2'):
    print(os.listdir('.'))
One case that is not handled well with this code is if the directory you
want to change into does not exist. In that case an exception is raised on
entering the context when you try to change into a directory that does not
exist. An alternative class based context manager can be found here.
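A minimal sketch of what a class based version might look like is given below. It is not the linked implementation. Unlike the generator version above, it lets exceptions from the body propagate instead of catching and printing them. The class name cd is reused for illustration, and a temporary directory stands in for the data directory.

```python
import os
import tempfile

class cd:
    'context manager that changes directory and always changes back'
    def __init__(self, path):
        self.path = path
        self.cwd = None

    def __enter__(self):
        self.cwd = os.getcwd()
        os.chdir(self.path)  # raises here if path does not exist
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        os.chdir(self.cwd)
        return False  # do not swallow exceptions from the body

start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with cd(tmp):
        inside_ok = os.path.samefile(os.getcwd(), tmp)
    after = os.getcwd()

    # a missing directory fails on entry, before the directory has changed
    try:
        with cd(os.path.join(tmp, 'does-not-exist')):
            pass
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```

Because os.chdir fails inside __enter__, the body never runs and the working directory is left where it was.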
13 Miscellaneous
13.1 Mail merge with python
Suppose you are organizing some event, and you have a mailing list of email
addresses and people you need to send a mail to telling them what room
they will be in. You would like to send a personalized email to each person,
and you do not want to type each one by hand. Python can automate this
for you. All you need is the mailing list in some kind of structured format,
and then you can go through it line by line to create and send emails.
We will use an org-table to store the data in.
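| First name | Last name | email address        | Room number |
|------------+-----------+----------------------+-------------|
| Jane       | Doe       | jane-doe@gmail.com   |           1 |
| John       | Doe       | john-doe@gmail.com   |           2 |
| Jimmy      | John      | jimmy-john@gmail.com |           3 |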
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formatdate

template = '''
Dear {firstname:s},

I am pleased to inform you that your talk will be in room {roomnumber:d}.

Sincerely,
John
'''

for firstname, lastname, emailaddress, roomnumber in data:
    msg = MIMEMultipart()
    msg['From'] = "youremail@gmail.com"
    msg['To'] = emailaddress
    msg['Date'] = formatdate(localtime=True)

    msgtext = template.format(**locals())
    print(msgtext)

    msg.attach(MIMEText(msgtext))

    ## Uncomment these lines and fix
    #server = smtplib.SMTP('your.relay.server.edu')
    #server.sendmail('your_email@gmail.com', # from
    #                emailaddress,
    #                msg.as_string())
    #server.quit()

    print(msg.as_string())
    print('------------------------------------------------------------------')
14 Worked examples
14.1 Peak finding in Raman spectroscopy
Raman spectroscopy is a vibrational spectroscopy. The data typically comes
as intensity vs. wavenumber, and it is discrete. Sometimes it is necessary
to identify the precise location of a peak. In this post, we will use spline
smoothing to construct an interpolating function of the data, and then use
fminbnd to identify peak positions.
This example was originally worked out in Matlab at
http://matlab.cheme.cmu.edu/2012/08/27/peak-finding-in-raman-spectroscopy/
numpy:loadtxt
Let us take a look at the raw data.
import numpy as np
import matplotlib.pyplot as plt

w, i = np.loadtxt('data/raman.txt', usecols=(0, 1), unpack=True)

plt.plot(w, i)
plt.xlabel('Raman shift (cm$^{-1}$)')
plt.ylabel('Intensity (counts)')
plt.savefig('images/raman-1.png')
The next thing to do is narrow our focus to the region we are interested
in between 1340 cm$^{-1}$ and 1360 cm$^{-1}$.
Next we consider a scipy:UnivariateSpline. This function "smooths" the
data.
Note that the UnivariateSpline function returns a "callable" function!
Our next goal is to find the places where there are peaks. This is defined by
the first derivative of the data being equal to zero. It is easy to get the first
derivative of a UnivariateSpline with a second argument as shown below.
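A minimal sketch of this idea follows, using synthetic data in place of the Raman spectrum. The Gaussian peak, its center at 1350.3, and the spline settings k=4 and s=0 are assumptions made for illustration. The sketch locates the extrema with the roots of the derivative spline rather than fminbnd.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# synthetic stand-in for the data: one Gaussian peak on a flat background
w = np.linspace(1340, 1360, 41)
i = 100.0 * np.exp(-((w - 1350.3) / 3.0) ** 2) + 10.0

# k=4 so that the derivative is a cubic spline, which supports roots()
sp = UnivariateSpline(w, i, k=4, s=0)  # s=0 interpolates the data

d1 = sp(w, 1)  # first derivative, requested with the second argument

# extrema are where the first derivative is zero
extrema = sp.derivative().roots()
peak = extrema[np.argmax(sp(extrema))]
print(peak)
```

With noisy data a nonzero smoothing factor s would be needed, and several extrema may be returned.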
plt.plot(minmax, sp(minmax), 'ro ')

plt.savefig('images/raman-4.png')
14.1.1 Summary notes
Using org-mode with :session allows a large script to be broken up into mini
sections. However, it only seems to work with the default python mode in
Emacs, and it does not work with emacs-for-python or the latest python-
mode. I also do not really like the output style, e.g. the output from the
plotting commands.
import numpy as np
import matplotlib.pyplot as plt

datafile = 'data/gc-data-21.txt'

i = 0
with open(datafile) as f:
    lines = f.readlines()

for i, line in enumerate(lines):
    if '# of Points' in line:
        npoints = int(line.split()[-1])
    elif 'R.Time\tIntensity' in line:
        i += 1
        break

# now get the data
t, intensity = [], []
for j in range(i, i + npoints):
    fields = lines[j].split()
    t += [float(fields[0])]
    intensity += [int(fields[1])]

t = np.array(t)
intensity = np.array(intensity, float)  # np.float was removed in NumPy 1.24

# now plot the data in the relevant time frame
plt.plot(t, intensity)
plt.xlim([4, 6])
plt.xlabel('Time (s)')
plt.ylabel('Intensity (arb. units)')
plt.savefig('images/deconvolute-1.png')
You can see there is a non-zero baseline. We will normalize that by the
average between 4 and 4.4 seconds.
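The listing for this step is not reproduced here. A minimal sketch on synthetic data follows, reading "normalize" as subtracting the mean intensity of the 4 to 4.4 second window, which is an assumption. The stand-in signal and the names mask, baseline and corrected are illustrative.

```python
import numpy as np

# synthetic stand-in: a constant baseline of 50 plus one narrow peak
t = np.linspace(4, 6, 201)
intensity = 50.0 + 1000.0 * np.exp(-((t - 5.0) / 0.05) ** 2)

# boolean mask selecting the flat region before any peak appears
mask = (t > 4.0) & (t < 4.4)
baseline = np.mean(intensity[mask])

corrected = intensity - baseline
print(baseline)
```

After the subtraction the flat region averages to zero, so peak areas are no longer inflated by the offset.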
The peaks are asymmetric, decaying gaussian functions. We define a
function for this
    p1 = asym_peak(t, [a10, a11, a12, a13])
    p2 = asym_peak(t, [a20, a21, a22, a23])
    return p1 + p2
from scipy.optimize import curve_fit

popt, pcov = curve_fit(two_peaks, t, intensity, parguess)
print(popt)

plt.plot(t, two_peaks(t, *popt), 'r-')
plt.legend(['data', 'initial guess', 'final fit'])

plt.savefig('images/deconvolution-4.png')
The fits are not perfect. The small peak is pretty good, but there is an
unphysical tail on the larger peak, and a small mismatch at the peak. There
is not much to do about that; it means the model peak we are using is not
a good model for the peak. We will still integrate the areas though.
pars1 = popt[0:4]
pars2 = popt[4:8]

peak1 = asym_peak(t, pars1)
peak2 = asym_peak(t, pars2)

area1 = np.trapz(peak1, t)
area2 = np.trapz(peak2, t)

print('Area 1 = {0:1.2f}'.format(area1))
print('Area 2 = {0:1.2f}'.format(area2))

print('Area 1 is {0:1.2%} of the whole area'.format(area1/(area1 + area2)))
print('Area 2 is {0:1.2%} of the whole area'.format(area2/(area1 + area2)))

plt.figure()
plt.plot(t, intensity)
plt.plot(t, peak1, 'r-')
plt.plot(t, peak2, 'g-')
plt.xlim([4, 6])
plt.xlabel('Time (s)')
plt.ylabel('Intensity (arb. units)')
plt.legend(['data', 'peak 1', 'peak 2'])
plt.savefig('images/deconvolution-5.png')
Area 1 = 1310.48
Area 2 = 5325.71
Area 1 is 19.75% of the whole area
Area 2 is 80.25% of the whole area
<matplotlib.figure.Figure object at 0x11116fd30>
[<matplotlib.lines.Line2D object at 0x111410ac8>]
[<matplotlib.lines.Line2D object at 0x111410c50>]
[<matplotlib.lines.Line2D object at 0x1114184e0>]
(4, 6)
<matplotlib.text.Text object at 0x1113d95f8>
<matplotlib.text.Text object at 0x1113d9588>
<matplotlib.legend.Legend object at 0x11141d550>
This sample was air, and the first peak is oxygen, and the second peak
is nitrogen. We come pretty close to the actual composition of air, although
it is low on the oxygen content. To do better, one would have to use a
calibration curve.
In the end, the overlap of the peaks is pretty small, but it is still difficult
to reliably and reproducibly deconvolute them. By using an algorithm like
we have demonstrated here, it is possible at least to make the deconvolution
reproducible.
I got distracted looking for Shomate parameters for ethane today, and
came across this website on predicting the boiling point of water using the
Shomate equations. The basic idea is to find the temperature where the
Gibbs energy of water as a vapor is equal to the Gibbs energy of the liquid.
import numpy as np

T = np.linspace(0, 200) + 273.15
t = T / 1000.0

sTT = np.vstack([np.log(t),
                 t,
                 (t**2) / 2.0,
                 (t**3) / 3.0,
                 -1.0 / (2*t**2),
                 0 * t,
                 t**0,
                 0 * t**0]).T / 1000.0

hTT = np.vstack([t,
                 (t**2)/2.0,
                 (t**3)/3.0,
                 (t**4)/4.0,
                 -1.0 / t,
                 1 * t**0,
                 0 * t**0,
                 -1 * t**0]).T

Gliq = Hf_liq + np.dot(hTT, shomateL) - T*(np.dot(sTT, shomateL))
Ggas = Hf_gas + np.dot(hTT, shomateG) - T*(np.dot(sTT, shomateG))

from scipy.interpolate import interp1d
from scipy.optimize import fsolve

f = interp1d(T, Gliq - Ggas)
bp, = fsolve(f, 373)
print('The boiling point is {0} K'.format(bp))
import matplotlib.pyplot as plt

plt.figure(); plt.clf()
plt.plot(T-273.15, Gliq, T-273.15, Ggas)
plt.legend(['liquid water', 'steam'])

# raw strings keep the LaTeX backslashes intact
plt.xlabel(r'Temperature $^\circ$C')
plt.ylabel(r'$\Delta G$ (kJ/mol)')
plt.title(r'The boiling point is approximately {0:1.2f} $^\circ$C'.format(bp-273.15))
plt.savefig('images/boiling-water.png')
14.3.1 Summary
The answer we get is 0.05 K too high, which is not bad considering we
estimated it using parameters that were fitted to thermodynamic data and
that had finite precision and extrapolated the steam properties below the
region the parameters were stated to be valid for.
import numpy as np

T = 1000  # K
R = 8.314e-3  # kJ/mol/K

P = 10.0  # atm, this is the total pressure in the reactor
Po = 1.0  # atm, this is the standard state pressure
We are going to store all the data and calculations in vectors, so we need
to assign each position in the vector to a species. Here are the definitions
we use in this work.
1. CO
2. H2O
3. CO2
4. H2
Now, construct the Gibbs free energy function, accounting for the change
in activity due to concentration changes (ideal mixing).
def func(nj):
    nj = np.array(nj)
    Enj = np.sum(nj)
    Gj = Gjo / (R * T) + np.log(nj / Enj * P / Po)
    return np.dot(nj, Gj)
We impose the constraint that all atoms are conserved from the initial
conditions to the equilibrium distribution of species. These constraints are
in the form of $A_{eq} n = b_{eq}$, where $n$ is the vector of mole numbers
for each species.
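A minimal sketch of how such constraints could be set up for the four species in the order listed above (CO, H2O, CO2, H2). The equimolar feed of 1 mol CO and 1 mol H2O is an assumption made here for illustration. The function name ec1 follows the name passed to fmin_slsqp elsewhere in this chapter.

```python
import numpy as np

# rows are atom balances, columns are species: CO, H2O, CO2, H2
Aeq = np.array([[1, 0, 1, 0],   # C balance
                [1, 1, 2, 0],   # O balance
                [0, 2, 0, 2]])  # H balance

# assumed initial moles: 1 mol CO and 1 mol H2O
n_initial = np.array([1.0, 1.0, 0.0, 0.0])
beq = np.dot(Aeq, n_initial)  # total moles of C, O and H atoms

def ec1(nj):
    'equality constraint: returns zeros when the atoms are conserved'
    return np.dot(Aeq, nj) - beq

print(beq)
print(ec1([0.5, 0.5, 0.5, 0.5]))
```

Any composition reachable by the water gas shift reaction from this feed satisfies ec1(nj) = 0.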
yj = N / np.sum(N)
Pj = yj * P

for s, y, p in zip(species, yj, Pj):
    print('{0:10s}: {1:1.2f} {2:1.2f}'.format(s, y, p))
CO : 0.23 2.28
H2O : 0.23 2.28
CO2 : 0.27 2.72
H2 : 0.27 2.72
1.43446295961
import numpy as np

R = 0.00198588  # kcal/mol/K
T = 1000  # K

species = ['CH4', 'C2H4', 'C2H2', 'CO2', 'CO', 'O2', 'H2', 'H2O', 'C2H6']

# $G_j^\circ$ for each species. These are the heats of formation for each
# species.
Gjo = np.array([4.61, 28.249, 40.604, -94.61, -47.942, 0, 0, -46.03, 26.13])  # kcal/mol
import numpy as np

def func(nj):
    nj = np.array(nj)
    Enj = np.sum(nj)
    G = np.sum(nj * (Gjo / R / T + np.log(nj / Enj)))
    return G
# initial guess suggested in the example
n0 = [1e-3, 1e-3, 1e-3, 0.993, 1.0, 1e-4, 5.992, 1.0, 1e-3]

#n0 = [0.066, 8.7e-08, 2.1e-14, 0.545, 1.39, 5.7e-14, 5.346, 1.521, 1.58e-7]

from scipy.optimize import fmin_slsqp
print(func(n0))

X = fmin_slsqp(func, n0, f_eqcons=ec1, f_ieqcons=ic1, iter=900, acc=1e-12)

for s, x in zip(species, X):
    print('{0:10s} {1:1.4g}'.format(s, x))

# check that constraints were met
print(np.dot(Aeq, X) - beq)
print(np.all(np.abs(np.dot(Aeq, X) - beq) < 1e-12))
-104.429439817
Iteration limit exceeded (Exit mode 9)
Current function value: nan
Iterations: 101
Function evaluations: 2101
Gradient evaluations: 101
CH4 nan
C2H4 nan
C2H2 nan
CO2 nan
CO nan
O2 nan
H2 nan
H2O nan
C2H6 nan
[ nan nan nan]
False
nearly all the ethane is consumed, we do not get the full yield of hydrogen.
It appears that another equilibrium, one between CO, CO2, H2O and H2,
may be limiting that, since the rest of the hydrogen is largely in the water. It
is also of great importance that we have not said anything about reactions,
i.e. how these products were formed.
The water gas shift reaction is: $CO + H_2O \rightleftharpoons CO_2 + H_2$.
We can compute the Gibbs free energy of the reaction from the heats of
formation of each species. Assuming these are the formation energies at
1000K, this is the reaction free energy at 1000K.
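A minimal sketch of that calculation, using the formation energies listed in Gjo earlier in this example. The dictionary layout and the names G, nu, Grxn and K are illustrative.

```python
import math

R = 0.00198588  # kcal/mol/K
T = 1000        # K

# formation energies in kcal/mol, taken from the Gjo array above
G = {'CO2': -94.61, 'CO': -47.942, 'H2': 0.0, 'H2O': -46.03}

# stoichiometric coefficients for CO + H2O <=> CO2 + H2
nu = {'CO': -1, 'H2O': -1, 'CO2': 1, 'H2': 1}

Grxn = sum(nu[s] * G[s] for s in nu)  # reaction Gibbs energy, kcal/mol
K = math.exp(-Grxn / (R * T))         # equilibrium constant

print(round(Grxn, 3))
print(K)
```

The reaction energy evaluates to -0.638 kcal/mol and the equilibrium constant to about 1.379, in line with the values printed in this section.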
-0.638
1.37887528109
1.37887525547
14.5.4 Summary
This is an appealing way to minimize the Gibbs energy of a mixture. No
assumptions about reactions are necessary, and the constraints are easy to
identify. The Gibbs energy function is especially easy to code.
14.6 The Gibbs free energy of a reacting mixture and the
equilibrium composition
Matlab post
In this post we derive the equations needed to find the equilibrium com-
position of a reacting mixture. We use the method of direct minimization
of the Gibbs free energy of the reacting mixture.
The Gibbs free energy of a mixture is defined as $G = \sum_j \mu_j n_j$,
where $\mu_j$ is the chemical potential of species $j$, and it is
temperature and pressure dependent, and $n_j$ is the number of moles of
species $j$.
We define the chemical potential as $\mu_j = G_j^\circ + RT \ln a_j$, where
$G_j^\circ$ is the Gibbs energy in a standard state, and $a_j$ is the
activity of species $j$ if the pressure and temperature are not at standard
state conditions.
If a reaction is occurring, then the number of moles of each species
are related to each other through the reaction extent $\epsilon$ and
stoichiometric coefficients: $n_j = n_{j0} + \nu_j \epsilon$. Note that the
reaction extent has units of moles.
Combining these three equations and expanding the terms leads to:
$$G = \sum_j n_{j0} G_j^\circ + \sum_j \nu_j G_j^\circ \epsilon + RT \sum_j (n_{j0} + \nu_j \epsilon) \ln a_j$$
The first term is simply the initial Gibbs free energy that is present
before any reaction begins, and it is a constant. It is difficult to evaluate,
so we will move it to the left side of the equation in the next step, because
it does not matter what its value is since it is a constant. The second term
is related to the Gibbs free energy of reaction: r G = j Gj . With these
P
j
observations we rewrite the equation as:
nj0 Gj = r G + RT (nj0 + j ) ln aj
X X
G
j j
Now, we have an equation that allows us to compute the change in Gibbs
free energy as a function of the reaction extent, initial number of moles of
each species, and the activities of each species. This difference in Gibbs free
energy has no natural scale, and depends on the size of the system, i.e. on
nj0 . It is desirable to avoid this, so we now rescale the equation by the total
initial moles present, nT 0 and define a new variable 0 = /nT 0 , which is
dimensionless. This leads to:
G nj0 Gj
P
j
= r G0 + RT (yj0 + j 0 ) ln aj
X
nT 0 j
321
where yj0 is the initial mole fraction of species j present. The mole
fractions are intensive properties that do not depend on the system size.
y P
Finally, we need to address aj . For an ideal gas, we know that Aj = Pj ,
where the numerator is the partial pressure of species j computed from the
mole fraction of species j times the total pressure. To get the mole fraction
we note:
nj nj0 + j yj0 + j 0
yj = = =
nT 0 + j 1 + 0 j
P P
nT
j j
G nj0 Gj
P
j e = r G0 + RT yj0 + j 0 P
=G (yj0 + j 0 ) ln
e X
1+
0 j P
P
nT 0
j
j
import numpy as np

R = 8.314
P = 250000  # Pa
P0 = 100000  # Pa, approximately 1 atm
T = 400  # K

Grxn = -15564.0  # J/mol
yi0 = 0.5; yb0 = 0.5; yp0 = 0.0  # initial mole fractions

yj0 = np.array([yi0, yb0, yp0])
nu_j = np.array([-1.0, -1.0, 1.0])  # stoichiometric coefficients

def Gwigglewiggle(extentp):
    diffg = Grxn * extentp
    sum_nu_j = np.sum(nu_j)
    for i, y in enumerate(yj0):
        x1 = yj0[i] + nu_j[i] * extentp
        x2 = x1 / (1.0 + extentp * sum_nu_j)
        diffg += R * T * x1 * np.log(x2 * P / P0)
    return diffg
There are bounds on how large $\epsilon'$ can be. Recall that
$n_j = n_{j0} + \nu_j \epsilon$, and that $n_j \ge 0$. Thus,
$\epsilon_{max} = -n_{j0}/\nu_j$, and the maximum value that $\epsilon'$ can
have is therefore $-y_{j0}/\nu_j$ where $y_{j0} > 0$. When there are multiple
species, you need the smallest $\epsilon'_{max}$ to avoid getting negative
mole numbers.
Now we simply minimize our Gwigglewiggle function. Based on the
figure above, the minimum is near 0.45.
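As a minimal sketch of this step, assuming the parameters and reduced Gibbs function defined earlier in this section, the bound on the extent and the minimization could look like the following. The names epsilonp_max and epsilonp_eq, the vectorized form of the function, and the small offset that keeps the search away from log(0) are choices made for this illustration.

```python
import numpy as np
from scipy.optimize import fminbound

R = 8.314        # J/mol/K
P = 250000       # Pa
P0 = 100000      # Pa
T = 400          # K
Grxn = -15564.0  # J/mol
yj0 = np.array([0.5, 0.5, 0.0])     # initial mole fractions
nu_j = np.array([-1.0, -1.0, 1.0])  # stoichiometric coefficients

def Gwigglewiggle(extentp):
    x1 = yj0 + nu_j * extentp                 # moles per initial total mole
    x2 = x1 / (1.0 + extentp * np.sum(nu_j))  # mole fractions
    return Grxn * extentp + R * T * np.sum(x1 * np.log(x2 * P / P0))

# largest extent allowed by the species that are consumed
consumed = nu_j < 0
epsilonp_max = np.min(-yj0[consumed] / nu_j[consumed])

# stay slightly inside the bounds to avoid log(0) at the end points
tiny = 1e-6
epsilonp_eq = fminbound(Gwigglewiggle, tiny, epsilonp_max - tiny)
print(epsilonp_eq)
```

A bounded scalar minimizer is a natural fit here because the extent is a single variable with known physical limits.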
0.46959618249
To compute equilibrium mole fractions we do this:
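# y_j holds the equilibrium mole fractions; here we assume the minimizing
# extent found above is stored in epsilonp_eq (a name chosen for illustration)
y_j = (yj0 + nu_j * epsilonp_eq) / (1.0 + epsilonp_eq * np.sum(nu_j))
yi, yb, yp = y_j

K = np.exp(-Grxn/R/T)
print('K from delta G ', K)
print('K as ratio of mole fractions ', yp / (yi * yb) * P0 / P)
print('compact notation: ', np.prod((y_j * P / P0)**nu_j))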
K from delta G 107.776294742
K as ratio of mole fractions 107.779200065
compact notation: 107.779200065
These results are very close, and only disagree because of the default
tolerance used in identifying the minimum of our function. You could tighten
the tolerance by passing options (for example xtol) to the fminbound function.
14.6.1 Summary
In this post we derived an equation for the Gibbs free energy of a reacting
mixture and used it to find the equilibrium composition. In future posts we
will examine some alternate forms of the equations that may be more useful
in some circumstances.
import numpy as np

T = np.linspace(500, 1000)  # degrees K
t = T/1000
14.7.1 hydrogen
http://webbook.nist.gov/cgi/cbook.cgi?ID=C1333740&Units=SI&Mask=1#Thermo-Gas
14.7.2 H_{2}O
http://webbook.nist.gov/cgi/cbook.cgi?ID=C7732185&Units=SI&Mask=1#Thermo-Gas
Note these parameters limit the temperature range we can examine, as
these parameters are not valid below 500K. There is another set of param-
eters for lower temperatures, but we do not consider them here.
14.7.3 CO
http://webbook.nist.gov/cgi/cbook.cgi?ID=C630080&Units=SI&Mask=1#Thermo-Gas
14.7.4 CO_{2}
http://webbook.nist.gov/cgi/cbook.cgi?ID=C124389&Units=SI&Mask=1#Thermo-Gas
Grxn_29815 = Hrxn_29815 - 298.15*(Srxn_29815)/1000

print('deltaH = {0:1.2f}'.format(Hrxn_29815))
print('deltaG = {0:1.2f}'.format(Grxn_29815))
deltaH = -41.15
deltaG = -28.62
Where $\nu_i$ are the stoichiometric coefficients of each species, with
appropriate sign for reactants and products, and $(H_i(T) - H_i(T_{ref}))$ is
precisely what is calculated for each species with the equations
The entropy is on an absolute scale, so we directly calculate entropy at
each temperature. Recall that H is in kJ/mol and S is in J/mol/K, so we
divide S by 1000 to make the units match.
Over this temperature range the reaction is exothermic, although near
1000K it is just barely exothermic. At higher temperatures we expect the
reaction to become endothermic.
import matplotlib.pyplot as plt

R = 8.314e-3  # kJ/mol/K
K = np.exp(-Grxn/R/T)

plt.figure()
plt.plot(T, K)
plt.xlim([500, 1000])
plt.xlabel('Temperature (K)')
plt.ylabel('Equilibrium constant')
plt.savefig('images/wgs-nist-2.png')
14.7.9 Equilibrium yield of WGS
Now let us suppose we have a reactor with a feed of H_2O and CO at
10atm at 1000K. What is the equilibrium yield of H_2? Let $\epsilon$ be the
extent of reaction, so that $F_i = F_{i,0} + \nu_i \epsilon$. For reactants,
$\nu_i$ is negative, and for products, $\nu_i$ is positive. We have to solve
for the extent of reaction that satisfies the equilibrium condition.
# If we let X be fractional conversion then we have $C_A = C_{A0}(1-X)$,
# $C_B = C_{B0}-C_{A0}X$, $C_C = C_{C0}+C_{A0}X$, and $C_D =
# C_{D0}+C_{A0}X$. We also have $K(T) = (C_C C_D)/(C_A C_B)$, which finally
# reduces to $0 = K(T) - Xeq^2/(1-Xeq)^2$ under these conditions.

def f(X):
    return K_Temperature - X**2/(1-X)**2

x0 = 0.5
Xeq, = fsolve(f, x0)

print('The equilibrium conversion for these feed conditions is: {0:1.2f}'.format(Xeq))
P_CO = Pa0*(1-Xeq)
P_H2O = Pa0*(1-Xeq)
P_H2 = Pa0*Xeq
P_CO2 = Pa0*Xeq

print(P_CO, P_H2O, P_H2, P_CO2)

print(K_Temperature)
print((P_CO2*P_H2)/(P_CO*P_H2O))
1.4352267476228722
1.43522674762
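The equilibrium condition used here, $0 = K - X^2/(1-X)^2$, also has a closed-form root on $0 < X < 1$, namely $X = \sqrt{K}/(1 + \sqrt{K})$, which gives an independent check of the fsolve result. The sketch below is self-contained and takes the equilibrium constant from the value printed above; X_closed is a name introduced only for this illustration.

```python
import numpy as np
from scipy.optimize import fsolve

K_Temperature = 1.4352267476228722  # equilibrium constant printed above

def f(X):
    return K_Temperature - X**2 / (1 - X)**2

Xeq, = fsolve(f, 0.5)

# closed-form root of K = X^2/(1-X)^2 on 0 < X < 1
X_closed = np.sqrt(K_Temperature) / (1 + np.sqrt(K_Temperature))
print(Xeq, X_closed)
```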
14.7.12 Summary
The NIST Webbook provides a plethora of data for computing thermody-
namic properties. It is a little tedious to enter it all into Matlab, and a little
tricky to use the data to estimate temperature dependent reaction energies.
A limitation of the Webbook is that it does not tell you how the
thermodynamic properties change with pressure. Luckily, those changes tend to
be small.
I noticed a different behavior in interpolation between
scipy.interpolate.interp1d and Matlab's interp1. The scipy function returns
an interpolating function, whereas the Matlab function directly interpolates
new values, and returns the actual interpolated data.
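A small sketch of that difference, using made-up sample data: interp1d hands back a callable that is evaluated afterwards, while numpy.interp interpolates new values directly in one call.

```python
import numpy as np
from scipy.interpolate import interp1d

x = np.array([0.0, 1.0, 2.0, 3.0])
y = x**2  # made-up sample data

f = interp1d(x, y)  # returns a callable linear interpolant
print(f(1.5))       # evaluate it at a new point

# np.interp interpolates the new value directly
print(np.interp(1.5, x, y))
```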
import numpy as np

def gibbs(E):
    'function defining Gibbs free energy as a function of reaction extents'
    e1 = E[0]
    e2 = E[1]
    # known equilibrium constants and initial amounts
    K1 = 108; K2 = 284; P = 2.5
    yI0 = 0.5; yB0 = 0.5; yP10 = 0.0; yP20 = 0.0
    # compute mole fractions
    d = 1 - e1 - e2
    yI = (yI0 - e1 - e2) / d
    yB = (yB0 - e1 - e2) / d
    yP1 = (yP10 + e1) / d
    yP2 = (yP20 + e2) / d
    G = (-(e1 * np.log(K1) + e2 * np.log(K2)) +
         d * np.log(P) + yI * d * np.log(yI) +
         yB * d * np.log(yB) + yP1 * d * np.log(yP1) + yP2 * d * np.log(yP2))
    return G
The equilibrium constants for these reactions are known, and we seek
to find the equilibrium reaction extents so we can determine equilibrium
compositions. The equilibrium reaction extents are those that minimize the
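Gibbs free energy. We have the following constraints, written in standard
less than or equal to form:
$-\epsilon_1 \le 0$
$-\epsilon_2 \le 0$
$\epsilon_1 + \epsilon_2 \le 0.5$
In Matlab we express this in matrix form as $A x \le b$ where
$$A = \begin{bmatrix} -1 & 0 \\ 0 & -1 \\ 1 & 1 \end{bmatrix} \quad (47)$$
and
$$b = \begin{bmatrix} 0 \\ 0 \\ 0.5 \end{bmatrix} \quad (48)$$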
Unlike in Matlab, in python we construct the inequality constraints as
functions that are greater than or equal to zero when the constraint is met.
def constraint1(E):
    e1 = E[0]
    return e1


def constraint2(E):
    e2 = E[1]
    return e2


def constraint3(E):
    e1 = E[0]
    e2 = E[1]
    return 0.5 - (e1 + e2)
Now, we minimize.
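A minimal, self-contained sketch of the constrained minimization, assuming scipy.optimize.fmin_slsqp with the three inequality constraints above. The initial guess, the bounds that keep the extents off zero, and the safe_log guard (which keeps the objective finite if the solver tries an infeasible point) are assumptions made for this illustration.

```python
import numpy as np
from scipy.optimize import fmin_slsqp

def safe_log(y):
    # guard against logs of non-positive numbers at infeasible trial points
    return np.log(np.maximum(y, 1e-12))

def gibbs(E):
    e1, e2 = E
    K1 = 108; K2 = 284; P = 2.5
    yI0 = 0.5; yB0 = 0.5; yP10 = 0.0; yP20 = 0.0
    d = 1 - e1 - e2
    yI = (yI0 - e1 - e2) / d
    yB = (yB0 - e1 - e2) / d
    yP1 = (yP10 + e1) / d
    yP2 = (yP20 + e2) / d
    return (-(e1 * np.log(K1) + e2 * np.log(K2))
            + d * np.log(P) + yI * d * safe_log(yI)
            + yB * d * safe_log(yB) + yP1 * d * safe_log(yP1)
            + yP2 * d * safe_log(yP2))

# each function is >= 0 when its constraint is satisfied
constraints = [lambda E: E[0],                 # e1 >= 0
               lambda E: E[1],                 # e2 >= 0
               lambda E: 0.5 - (E[0] + E[1])]  # e1 + e2 <= 0.5

X0 = [0.2, 0.2]  # initial guess (an assumption)
X = fmin_slsqp(gibbs, X0, ieqcons=constraints,
               bounds=[(0.001, 0.499), (0.001, 0.499)], iprint=0)
print(X)
```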
One way we can verify our solution is to plot the gibbs function and
see where the minimum is, and whether there is more than one minimum.
We start by making grids over the range of 0 to 0.5. Note we actually
start slightly above zero because at zero there are some numerical imaginary
elements of the gibbs function or it is numerically not defined since there are
logs of zero there. We also set all elements where the sum of the two extents
is greater than 0.5 to near zero, since those regions violate the constraints.
import numpy as np
import matplotlib.pyplot as plt

def gibbs(E):
    'function defining Gibbs free energy as a function of reaction extents'
    e1 = E[0]
    e2 = E[1]
    # known equilibrium constants and initial amounts
    K1 = 108; K2 = 284; P = 2.5
    yI0 = 0.5; yB0 = 0.5; yP10 = 0.0; yP20 = 0.0
    # compute mole fractions
    d = 1 - e1 - e2
    yI = (yI0 - e1 - e2)/d
    yB = (yB0 - e1 - e2)/d
    yP1 = (yP10 + e1)/d
    yP2 = (yP20 + e2)/d
    G = (-(e1 * np.log(K1) + e2 * np.log(K2)) +
         d * np.log(P) + yI * d * np.log(yI) +
         yB * d * np.log(yB) + yP1 * d * np.log(yP1) + yP2 * d * np.log(yP2))
    return G


a = np.linspace(0.001, 0.5, 100)
E1, E2 = np.meshgrid(a, a)

# move points that violate e1 + e2 <= 0.5 to near zero
sumE = E1 + E2
E1[sumE >= 0.5] = 0.00001
E2[sumE >= 0.5] = 0.00001

# now evaluate gibbs on the whole grid at once
G = gibbs([E1, E2])

CS = plt.contour(E1, E2, G, levels=np.linspace(G.min(), G.max(), 100))
plt.xlabel(r'$\epsilon_1$')
plt.ylabel(r'$\epsilon_2$')
plt.colorbar()

# mark the minimum found by the optimizer
plt.plot([0.13336503], [0.35066486], 'ro')

plt.savefig('images/gibbs-minimization-1.png')
plt.savefig('images/gibbs-minimization-1.svg')
plt.show()
You can see we found the minimum. We can compute the mole fractions
pretty easily.
e1 = X[0]
e2 = X[1]

yI0 = 0.5; yB0 = 0.5; yP10 = 0; yP20 = 0  # initial mole fractions

d = 1 - e1 - e2
yI = (yI0 - e1 - e2) / d
yB = (yB0 - e1 - e2) / d
yP1 = (yP10 + e1) / d
yP2 = (yP20 + e2) / d

print('y_I = {0:1.3f} y_B = {1:1.3f} y_P1 = {2:1.3f} y_P2 = {3:1.3f}'.format(yI, yB, yP1, yP2))
y_I = 0.031 y_B = 0.031 y_P1 = 0.258 y_P2 = 0
14.8.1 summary
I found setting up the constraints in this example to be more confusing than
the Matlab syntax.
mass of phase i, and Ei is the energy per unit mass of phase i. There are some
constraints to ensure conservation of mass. Let us consider the following
compounds: Al, NiAl, Ni3Al, and Ni, and consider a case where the bulk
composition of our alloy is 93.8% Ni and balance Al. We want to know which
phases are present, and in what proportions. There are some subtleties in
considering the formula and molecular weight of an alloy. We consider the
formula with each species amount normalized so the fractions all add up to
one. For example, Ni_3Al is represented as Ni_{0.75}Al_{0.25}, and the
molecular weight is computed as 0.75*MW_{Ni} + 0.25*MW_{Al}.
We use scipy.optimize.fmin_slsqp to solve this problem, and define two
equality constraint functions, and the bounds on each weight fraction.
Note: the energies in this example were computed by density functional
theory at 0K.
import numpy as np
from scipy.optimize import fmin_slsqp

# these are atomic masses of each species
Ni = 58.693
Al = 26.982

COMPOSITIONS = ['Al', 'NiAl', 'Ni3Al', 'Ni']
MW = np.array([Al, (Ni + Al)/2.0, (3 * Ni + Al)/4.0, Ni])

xNi = np.array([0.0, 0.5, 0.75, 1.0])  # mole fraction of nickel in each compd
WNi = xNi * Ni / MW  # weight fraction of Ni in each cmpd

ENERGIES = np.array([0.0, -0.7, -0.5, 0.0])

BNi = 0.938

def G(w):
    'function to minimize. w is a vector of weight fractions, ENERGIES is defined above.'
    return np.dot(w, ENERGIES)

def ec1(w):
    'conservation of Ni constraint'
    return BNi - np.dot(w, WNi)

def ec2(w):
    'weight fractions sum to one constraint'
    return 1 - np.sum(w)

w0 = np.array([0.0, 0.0, 0.5, 0.5])  # guess weight fractions

y = fmin_slsqp(G,
               w0,
               eqcons=[ec1, ec2],
               bounds=[(0, 1)]*len(w0))

for ci, wi in zip(COMPOSITIONS, y):
    print('{0:8s} {1:+8.2%}'.format(ci, wi))
It may be convenient to formulate this in terms of moles.
import numpy as np
from scipy.optimize import fmin_slsqp

COMPOSITIONS = ['Al', 'NiAl', 'Ni3Al', 'Ni']
xNi = np.array([0.0, 0.5, 0.75, 1.0])  # define this in mole fractions

ENERGIES = np.array([0.0, -0.7, -0.5, 0.0])

xNiB = 0.875  # bulk Ni composition

def G(n):
    'function to minimize'
    return np.dot(n, ENERGIES)

def ec1(n):
    'conservation of Ni'
    Ntot = np.sum(n)
    return (Ntot * xNiB) - np.dot(n, xNi)

def ec2(n):
    'mole fractions sum to one'
    return 1 - np.sum(n)

n0 = np.array([0.0, 0.0, 0.45, 0.55])  # initial guess of mole fractions

y = fmin_slsqp(G,
               n0,
               eqcons=[ec1, ec2],
               bounds=[(0, 1)]*(len(n0)))

for ci, xi in zip(COMPOSITIONS, y):
    print('{0:8s} {1:+8.2%}'.format(ci, xi))
whereas we predict an equimolar mixture of the two phases. Below we com-
pute the mole fraction of Ni in each case.
0.875
0.874192746384855
You can see the overall mole fraction of Ni is practically the same in each
case.
import numpy as np
nu = [-1, -1, 1, 1]
M = [28, 18, 2, 44]
print(np.dot(nu, M))
You can see that the sum of the stoichiometric coefficients times molecular
weights is zero. In other words, one CO and one H_2O have the same mass as
one H_2 and one CO_2.
For any balanced chemical equation, there are the same number of each
kind of atom on each side of the equation. Since the mass of each atom is
unchanged with reaction, that means the mass of all the species that are
reactants must equal the mass of all the species that are products! Here we
look at the number of C, O, and H on each side of the reaction. Now if we
add the mass of atoms in the reactants and products, it should sum to zero
(since we used the negative sign for stoichiometric coefficients of reactants).
import numpy as np
#            C   O   H
reactants = [-1, -2, -2]
products = [1, 2, 2]

atomic_masses = [12.011, 15.999, 1.0079]  # atomic masses

print(np.dot(reactants, atomic_masses) + np.dot(products, atomic_masses))
0.0
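The same bookkeeping can be sketched with an element-by-species matrix, which makes the atom balance explicit for each element. The matrix layout and the variable names atoms and element_balance are illustrative choices; the species order (CO, H_2O, H_2, CO_2) follows the stoichiometric vector used above.

```python
import numpy as np

# rows: C, O, H; columns: CO, H2O, H2, CO2
atoms = np.array([[1, 0, 0, 1],
                  [1, 1, 0, 2],
                  [0, 2, 2, 0]])
nu = np.array([-1, -1, 1, 1])  # CO + H2O -> H2 + CO2

element_balance = atoms @ nu  # net change in the count of each element
print(element_balance)

atomic_masses = np.array([12.011, 15.999, 1.0079])
M = atomic_masses @ atoms  # molecular weight of each species
print(np.dot(nu, M))       # net change in mass
```

Because every row of the element balance is zero, the mass balance vanishes for any set of atomic masses.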
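A mole balance on the particle volume in spherical coordinates with
a first order reaction leads to:
$\frac{d^2 C_A}{dr^2} + \frac{2}{r} \frac{dC_A}{dr} - \frac{k}{D_e} C_A = 0$
with boundary conditions $C_A(R) = C_{As}$ and $\frac{dC_A}{dr} = 0$ at
$r = 0$. We convert this equation to a system of first order ODEs by letting
$W_A = \frac{dC_A}{dr}$. Then, our two equations become:
$\frac{dC_A}{dr} = W_A$
and
$\frac{dW_A}{dr} = -\frac{2}{r} W_A + \frac{k}{D_e} C_A$
At $r = 0$ the term $W_A / r$ is a 0/0 form. Applying L'Hopital's rule to it
there, we have
$\frac{dW_A}{dr} = -2 \frac{dW_A}{dr} + \frac{k}{D_e} C_A$
or $\frac{dW_A}{dr} = \frac{k}{3 D_e} C_A$ at $r = 0$.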
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

De = 0.1   # diffusivity cm^2/s
R = 0.5    # particle radius, cm
k = 6.4    # rate constant (1/s)
CAs = 0.2  # concentration of A at outer radius of particle (mol/L)


def ode(Y, r):
    Wa = Y[0]  # Wa = dCa/dr, the concentration gradient at r
    Ca = Y[1]  # concentration of A in the particle at r
    # this solves the singularity at r = 0
    if r == 0:
        dWadr = k / (3.0 * De) * Ca
    else:
        dWadr = -2 * Wa / r + k / De * Ca
    dCadr = Wa
    return [dWadr, dCadr]

# Initial conditions
Ca0 = 0.029315  # Ca(0) (mol/L) guessed to satisfy Ca(R) = CAs
Wa0 = 0         # no flux at r=0 (mol/m^2/s)

rspan = np.linspace(0, R, 500)

Y = odeint(ode, [Wa0, Ca0], rspan)

Ca = Y[:, 1]

# here we check that Ca(R) = Cas
print('At r={0} Ca={1}'.format(rspan[-1], Ca[-1]))

plt.plot(rspan, Ca)
plt.xlabel('Particle radius')
plt.ylabel('$C_A$')
plt.savefig('images/effectiveness-factor.png')

# np.trapz is named np.trapezoid in NumPy >= 2.0
r = rspan
eta_numerical = (np.trapz(k * Ca * 4 * np.pi * (r**2), r)
                 / np.trapz(k * CAs * 4 * np.pi * (r**2), r))

print(eta_numerical)

phi = R * np.sqrt(k / De)
eta_analytical = (3 / phi**2) * (phi * (1.0 / np.tanh(phi)) - 1)
print(eta_analytical)
At r=0.5 Ca=0.20000148865173356
0.563011348314
0.563003362801
than it can diffuse into the particle. Hence, the overall reaction rate in the
particle is lower than it would be without the diffusion limit.
The effectiveness factor is the ratio of the actual reaction rate in the
particle with diffusion limitation to the ideal rate in the particle if there was
no concentration gradient:
$$\eta = \frac{\int_0^R k'' a C_A(r) 4 \pi r^2 dr}{\int_0^R k'' a C_{As} 4 \pi r^2 dr}$$
We will evaluate this numerically from our solution and compare it to
the analytical solution. The results are in good agreement, and you can
make the numerical estimate better by increasing the number of points in
the solution so that the numerical integration is more accurate.
Why go through the numerical solution when an analytical solution ex-
ists? The analytical solution here is only good for 1st order kinetics in a
sphere. What would you do for a complicated rate law? You might be able
to find some limiting conditions where the analytical equation above is rel-
evant, and if you are lucky, they are appropriate for your problem. If not,
it is a good thing you can figure this out numerically!
Thanks to Radovan Omorjan for helping me figure out the ODE at r=0!
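The center concentration Ca(0) was guessed in the solution above. For this first order rate law the ODE is linear in $C_A$, so one way to avoid guessing is to integrate once from a unit center concentration and rescale the result so that $C_A(R) = C_{As}$. The sketch below assumes the same parameters as above; Ca_unit is a name introduced for the illustration, and the rescaling trick would not apply to a nonlinear rate law. The last line compares against the analytical center value $C_{As} \phi / \sinh \phi$ for first order kinetics in a sphere.

```python
import numpy as np
from scipy.integrate import odeint

De = 0.1   # diffusivity cm^2/s
R = 0.5    # particle radius, cm
k = 6.4    # rate constant (1/s)
CAs = 0.2  # concentration of A at outer radius of particle (mol/L)

def ode(Y, r):
    Wa, Ca = Y
    if r == 0:
        dWadr = k / (3.0 * De) * Ca  # limit of the ODE at r = 0
    else:
        dWadr = -2 * Wa / r + k / De * Ca
    return [dWadr, Wa]

rspan = np.linspace(0, R, 500)

# integrate once with a unit center concentration ...
Ca_unit = odeint(ode, [0.0, 1.0], rspan)[:, 1]
# ... and rescale so that Ca(R) = CAs (valid because the ODE is linear)
Ca0 = CAs / Ca_unit[-1]

phi = R * np.sqrt(k / De)
print(Ca0, CAs * phi / np.sinh(phi))
```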
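It is known that
$\rho(T) = 46.048 + 9.418 T - 0.0329 T^2 + 4.882 \times 10^{-5} T^3 - 2.895 \times 10^{-8} T^4$
and $\mu(T) = \exp\left(-10.547 + \frac{541.69}{T - 144.53}\right)$, where
$\rho$ is in kg/m^3 and $\mu$ is in kg/(m*s).
The aim is to find $D$ that solves:
$\Delta p = \rho \frac{2 f_F \Delta L v^2}{D}$. This is a nonlinear
equation in $D$, since $D$ affects the fluid velocity, the Re, and the Fanning
friction factor. Here is the solution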
import numpy as np
from scipy.optimize import fsolve
import matplotlib.pyplot as plt

T = 25 + 273.15
Q = 2.5e-3       # m^3/s
deltaP = 103000  # Pa
deltaL = 100     # m

# Note these correlations expect dimensionless T, where the magnitude
# of T is in K

def rho(T):
    return 46.048 + 9.418 * T - 0.0329 * T**2 + 4.882e-5 * T**3 - 2.895e-8 * T**4

def mu(T):
    return np.exp(-10.547 + 541.69 / (T - 144.53))

def fanning_friction_factor_(Re):
    if Re < 2100:
        raise Exception('Flow is probably not turbulent, so this correlation is not appropriate.')
    # solve the Nikuradse correlation to get the friction factor
    def fz(f): return 1.0/np.sqrt(f) - (4.0*np.log10(Re*np.sqrt(f))-0.4)
    sol, = fsolve(fz, 0.01)
    return sol

fanning_friction_factor = np.vectorize(fanning_friction_factor_)

Re = np.linspace(2200, 9000)
f = fanning_friction_factor(Re)

plt.plot(Re, f)
plt.xlabel('Re')
plt.ylabel('fanning friction factor')
# You can see why we use 0.01 as an initial guess for solving for the
# Fanning friction factor; it falls in the middle of ranges possible
# for these Re numbers.
plt.savefig('images/pipe-diameter-1.png')

def objective(D):
    v = Q / (np.pi * D**2 / 4)
    Re = D * v * rho(T) / mu(T)

    fF = fanning_friction_factor(Re)

    return deltaP - 2 * fF * rho(T) * deltaL * v**2 / D

D, = fsolve(objective, 0.04)

print('The minimum pipe diameter is {0} m\n'.format(D))
Any pipe diameter smaller than that value will result in a larger pressure
drop at the same volumetric flow rate, or a smaller volumetric flowrate at
the same pressure drop. Either way, it will not meet the design specification.
Antoine Coefficients
log(P) = A-B/(T+C) where P is in mmHg and T is in Celsius
Source of data: Yaws and Yang (Yaws, C. L. and Yang, H. C.,
"To estimate vapor pressure easily. antoine coefficients relate vapor pressure to tem
To use this data, you find the line that has the compound you want, and
read off the data. You could do that manually for each component you want
but that is tedious, and error prone. Today we will see how to retrieve the
file, then read the data into python to create a database we can use to store
and retrieve the data.
We will use the data to find the temperature at which the vapor pressure
of acetone is 400 mmHg.
We use numpy.loadtxt to read the file, and tell the function the format
of each column. This creates a special kind of record array whose data we
can access by field name.
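As a small, self-contained illustration of reading columns into a record array and selecting a row by field value, the sketch below parses a tiny in-memory stand-in for the data file. The two rows, their names, and their coefficients are made up for the example and are not taken from the Yaws and Yang table.

```python
import io
import numpy as np
from scipy.optimize import fsolve

# a tiny stand-in for the data file; the numbers are made up
text = io.StringIO("""1 compound-a 8.0 1700.0 230.0
2 compound-b 7.0 1200.0 230.0
""")

data = np.loadtxt(text, dtype=[('id', int), ('name', 'U12'),
                               ('A', float), ('B', float), ('C', float)])

# select a record by the value of its name field
row = data[data['name'] == 'compound-b'][0]
A, B, C = row['A'], row['B'], row['C']

def objective(T):
    # zero when the Antoine vapor pressure equals 400 mmHg
    return 400 - 10**(A - B / (T + C))

Tsol, = fsolve(objective, 40)
print(Tsol)
```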
import numpy as np
import matplotlib.pyplot as plt

# np.int and np.float were removed from NumPy; the builtins work the same way
data = np.loadtxt('data/antoine_data.dat',
                  dtype=[('id', int),
                         ('formula', 'S8'),
                         ('name', 'S28'),
                         ('A', float),
                         ('B', float),
                         ('C', float),
                         ('Tmin', float),
                         ('Tmax', float),
                         ('??', 'S4'),
                         ('?', 'S4')],
                  skiprows=7)

names = data['name']

# the name field holds bytes, so compare with a bytes literal and take
# the single matching record
acetone = data[names == b'acetone'][0]

# for readability we unpack the array into variables
id, formula, name, A, B, C, Tmin, Tmax, u1, u2 = acetone

T = np.linspace(Tmin, Tmax)
P = 10**(A - B / (T + C))
plt.plot(T, P)
plt.xlabel(r'T ($^\circ$C)')
plt.ylabel('P$_{vap}$ (mmHg)')

# Find T at which Pvap = 400 mmHg
# from our graph we might guess T ~ 40 ^{\circ}C

def objective(T):
    return 400 - 10**(A - B / (T + C))

from scipy.optimize import fsolve
Tsol, = fsolve(objective, 40)
print(Tsol)
print('The vapor pressure is 400 mmHg at T = {0:1.1f} degC'.format(Tsol))

# Plot CRC data http://en.wikipedia.org/wiki/Acetone_%28data_page%29#Vapor_pressure_of_liquid
# We only include the data for the range where the Antoine fit is valid.

Tcrc = [-59.4, -31.1, -9.4, 7.7, 39.5, 56.5]
Pcrc = [1, 10, 40, 100, 400, 760]

plt.plot(Tcrc, Pcrc, 'bo')
plt.legend(['Antoine', 'CRC Handbook'], loc='best')
plt.savefig('images/antoine-2.png')
56.9792617813
The vapor pressure is 400 mmHg at T = 57.0 degC
This result is close to the value reported here (39.5 degC), from the CRC
Handbook. The difference is probably that the value reported in the CRC
is an actual experimental number.
14.14 Calculating a bubble point pressure of a mixture
Matlab post
Adapted from http://terpconnect.umd.edu/~nsw/ench250/bubpnt.htm (dead link)
We previously learned to read a datafile containing lots of Antoine
coefficients into a database, and use the coefficients to estimate vapor
pressure of a single compound. Here we use those coefficients to compute a
bubble point pressure of a mixture.
The bubble point is the temperature at which the sum of the component
partial pressures, $x_i P_i$, is equal to the total pressure. This is where a
bubble of vapor will first start forming, and the mixture starts to boil.
Consider an equimolar mixture of benzene, toluene, chloroform, acetone
and methanol. Compute the bubble point at 760 mmHg, and the gas phase
composition. The gas phase composition is given by: $y_i = x_i P_i / P_T$.
import numpy as np
from scipy.optimize import fsolve

# load our thermodynamic data
# (np.int and np.float were removed from NumPy; the builtins work the same way)
data = np.loadtxt('data/antoine_data.dat',
                  dtype=[('id', int),
                         ('formula', 'S8'),
                         ('name', 'S28'),
                         ('A', float),
                         ('B', float),
                         ('C', float),
                         ('Tmin', float),
                         ('Tmax', float),
                         ('??', 'S4'),
                         ('?', 'S4')],
                  skiprows=7)

compounds = ['benzene', 'toluene', 'chloroform', 'acetone', 'methanol']

# extract the data we want
A = np.array([data[data['name'] == x.encode(encoding='UTF-8')]['A'][0]
              for x in compounds])
B = np.array([data[data['name'] == x.encode(encoding='UTF-8')]['B'][0]
              for x in compounds])
C = np.array([data[data['name'] == x.encode(encoding='UTF-8')]['C'][0]
              for x in compounds])
Tmin = np.array([data[data['name'] == x.encode(encoding='UTF-8')]['Tmin'][0]
                 for x in compounds])
Tmax = np.array([data[data['name'] == x.encode(encoding='UTF-8')]['Tmax'][0]
                 for x in compounds])

# we have an equimolar mixture
x = np.array([0.2, 0.2, 0.2, 0.2, 0.2])

# Given a T, we can compute the pressure of each species like this:

T = 67  # degC
P = 10**(A - B / (T + C))
print(P)
print(np.dot(x, P))  # total mole-fraction weighted pressure

Tguess = 67
Ptotal = 760

def func(T):
    P = 10**(A - B / (T + C))
    return Ptotal - np.dot(x, P)

Tbubble, = fsolve(func, Tguess)

print('The bubble point is {0:1.2f} degC'.format(Tbubble))

# double check answer is in a valid T range
if np.any(Tbubble < Tmin) or np.any(Tbubble > Tmax):
    print('T_bubble is out of range!')

# print gas phase composition
y = x * 10**(A - B / (Tbubble + C))/Ptotal

for cmpd, yi in zip(compounds, y):
    print('y_{0:<10s} = {1:1.3f}'.format(cmpd, yi))
14.15 The equal area method for the van der Waals equation
Matlab post
When a gas is below its Tc the van der Waals equation oscillates. In the
portion of the isotherm where $\partial P_R / \partial V_R > 0$, the isotherm
fails to describe real materials, which phase separate into a liquid and gas
in this region.
Maxwell proposed to replace this region by a flat line, where the area
above and below the curves are equal. Today, we examine how to identify
where that line should be.
import numpy as np
import matplotlib.pyplot as plt

Tr = 0.9  # A Tr below Tc: Tr = T/Tc
# analytical equation for Pr. This is the reduced form of the van der Waals
# equation.
def Prfh(Vr):
    return 8.0 / 3.0 * Tr / (Vr - 1.0 / 3.0) - 3.0 / (Vr**2)

Vr = np.linspace(0.5, 4, 100)  # vector of reduced volume
Pr = Prfh(Vr)                  # vector of reduced pressure

plt.clf()
plt.plot(Vr, Pr)
plt.ylim([0, 2])
plt.xlabel('$V_R$')
plt.ylabel('$P_R$')
plt.savefig('images/maxwell-eq-area-1.png')
The idea is to pick a Pr and draw a line through the EOS. We want
the areas between the line and EOS to be equal on each side of the middle
intersection. Let us draw a line on the figure at y = 0.65.
y = 0.65

plt.plot([0.5, 4.0], [y, y], 'k--')
plt.savefig('images/maxwell-eq-area-2.png')
To find the areas, we need to know where the vdW eqn intersects the
horizontal line. This is the same as asking what are the roots
of the vdW equation at that Pr. We need all three intersections so we can
integrate from the first root to the middle root, and then the middle root
to the third root. We take advantage of the polynomial nature of the vdW
equation, which allows us to use the roots command to get all the roots at
once. The polynomial is
$V_R^3 - \frac{1}{3}(1 + 8 T_R / P_R) V_R^2 + \frac{3}{P_R} V_R - \frac{1}{P_R} = 0$.
We use the coefficients to get the roots like this.
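A minimal sketch of this step, assuming numpy.roots and the trial pressure Pr = 0.65 drawn on the figure. The coefficient list follows the polynomial above, from the highest power of the reduced volume to the constant term.

```python
import numpy as np

Tr = 0.9
Pr = 0.65  # the trial pressure drawn on the figure

# coefficients of Vr^3 - (1/3)(1 + 8 Tr/Pr) Vr^2 + (3/Pr) Vr - 1/Pr = 0
vdWp = [1.0, -1.0 / 3.0 * (1.0 + 8.0 * Tr / Pr), 3.0 / Pr, -1.0 / Pr]

# all three roots are real at this Tr and Pr, so keep the real parts
v = np.sort(np.roots(vdWp).real)
print(v)
```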
[ 0.60286812 1.09743234 2.32534056]
0.063225945606 0.0580212098122
0.321466743765 -0.798140339268
from scipy.integrate import quad
from scipy.optimize import fsolve

def equal_area(y):
    y = np.atleast_1d(y)[0]  # fsolve passes a length-1 array
    Tr = 0.9
    vdWp = [1, -1.0 / 3 * (1.0 + 8.0 * Tr / y), 3.0 / y, -1.0 / y]
    v = np.roots(vdWp)
    v.sort()
    # quad returns (integral, error estimate); keep only the integral
    A1 = (v[1] - v[0]) * y - quad(Prfh, v[0], v[1])[0]
    A2 = quad(Prfh, v[1], v[2])[0] - (v[2] - v[1]) * y
    return A1 - A2

y_eq, = fsolve(equal_area, 0.65)
print(y_eq)

Tr = 0.9
vdWp = [1, -1.0 / 3 * (1.0 + 8.0 * Tr / y_eq), 3.0 / y_eq, -1.0 / y_eq]
v = np.roots(vdWp)
v.sort()

A1 = (v[1] - v[0]) * y_eq - quad(Prfh, v[0], v[1])[0]
A2 = quad(Prfh, v[1], v[2])[0] - (v[2] - v[1]) * y_eq

print(A1, A2)
0.646998351872
0.0617526473994 0.0617526473994
Now let us plot the equal areas and indicate them by shading.
fig = plt.gcf()
ax = fig.add_subplot(111)

ax.plot(Vr, Pr)

hline = np.ones(Vr.size) * y_eq

ax.plot(Vr, hline)
ax.fill_between(Vr, hline, Pr, where=(Vr >= v[0]) & (Vr <= v[1]), facecolor='gray')
ax.fill_between(Vr, hline, Pr, where=(Vr >= v[1]) & (Vr <= v[2]), facecolor='gray')

plt.text(v[0], 1, 'A1 = {0}'.format(A1))
plt.text(v[2], 1, 'A2 = {0}'.format(A2))
plt.xlabel('$V_R$')
plt.ylabel('$P_R$')
plt.title('$T_R$ = 0.9')

plt.savefig('images/maxwell-eq-area-4.png')
plt.savefig('images/maxwell-eq-area-4.svg')
from scipy.integrate import odeint
import numpy as np

def myode(C, t):
    # net rate for production of B: -ra + rb

    k1 = 1     # 1/min
    k_1 = 0.5  # 1/min

    Ca = C[0]
    Cb = C[1]

    ra = -k1 * Ca
    rb = -k_1 * Cb

    dCadt = ra - rb
    dCbdt = -ra + rb

    dCdt = [dCadt, dCbdt]
    return dCdt

tspan = np.linspace(0, 5)

init = [1, 0]  # mol/L
C = odeint(myode, init, tspan)

Ca = C[:, 0]
Cb = C[:, 1]

import matplotlib.pyplot as plt
plt.plot(tspan, Ca, tspan, Cb)
plt.xlabel('Time (min)')
plt.ylabel('C (mol/L)')
plt.legend(['$C_A$', '$C_B$'])
plt.savefig('images/reversible-batch.png')
That is it. The main differences from Matlab are that the arguments to
odeint are in a different order, and that the ode function takes its
arguments in a different order (the state first, then time).
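To make the argument-order point concrete, here is a hedged sketch that integrates the same reversible reaction with both conventions: odeint calls the function as f(C, t), while scipy.integrate.solve_ivp uses the t-first order f(t, C). The function names rhs_yt and rhs_ty and the tolerances are illustrative choices, and solve_ivp is used only as a point of comparison.

```python
import numpy as np
from scipy.integrate import odeint, solve_ivp

k1 = 1.0   # 1/min
k_1 = 0.5  # 1/min

def rhs_yt(C, t):
    # odeint convention: state first, then the independent variable
    Ca, Cb = C
    r_net = k1 * Ca - k_1 * Cb  # net rate of A -> B
    return [-r_net, r_net]

def rhs_ty(t, C):
    # t-first convention, as used by solve_ivp
    return rhs_yt(C, t)

tspan = np.linspace(0, 5)
C_odeint = odeint(rhs_yt, [1.0, 0.0], tspan)
sol = solve_ivp(rhs_ty, (0, 5), [1.0, 0.0], t_eval=tspan,
                rtol=1e-8, atol=1e-10)

# both approaches should agree at the final time
print(C_odeint[-1], sol.y[:, -1])
```

odeint also accepts tfirst=True if you prefer to write a single t-first function for both solvers.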
from scipy.optimize import fsolve

def func(Xe):
    z = 1.44 - (Xe**2)/(1-Xe)**2
    return z

X0 = 0.5
Xe, = fsolve(func, X0)
print('The equilibrium conversion is X = {0:1.2f}'.format(Xe))
we derive the following design equation for the length of time required to
achieve a particular level of conversion:
$$t(X) = \frac{1}{k C_{A0}} \int_{X=0}^{X} \frac{dX}{(1 - X)^2}$$
If $k = 10^{-3}$ L/mol/s and $C_{A0} = 1$ mol/L, estimate the time to achieve
90% conversion.
We could analytically solve the integral and evaluate it, but instead we
will numerically evaluate it using scipy.integrate.quad. This function returns
two values: the evaluated integral, and an estimate of the absolute error in
the answer.
You can see the estimate error is very small compared to the solution.
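A minimal sketch of the integration described above, using the stated values of k and C_A0. The names integrand, t90 and t_exact are illustrative; the last line evaluates the same integral analytically for comparison.

```python
from scipy.integrate import quad

k = 1.0e-3  # L/mol/s
Ca0 = 1.0   # mol/L

def integrand(X):
    return 1.0 / (k * Ca0) * (1.0 / (1 - X)**2)

# quad returns the integral and an estimate of the absolute error
t90, err = quad(integrand, 0, 0.9)
print('t = {0} seconds ({1} hours)'.format(t90, t90 / 3600.0))
print('Estimated error: {0}'.format(err))

# analytical value of the same integral: (1/(k*Ca0)) * (1/(1 - X) - 1)
t_exact = 1.0 / (k * Ca0) * (1.0 / (1 - 0.9) - 1.0)
print(t_exact)
```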
To do this we have to write a function that takes arguments with uncer-
tainty, and wrap the function with the uncertainties.wrap decorator. The
function must return a single float number (current limitation of the uncer-
tainties package). Then, we simply call the function, and the uncertainties
from the inputs will be automatically propagated to the outputs. Let us say
there is about 10% uncertainty in the rate constant, and 1% uncertainty in
the initial concentration.
The result shows about a 10% uncertainty in the time, which is similar
to the largest uncertainty in the inputs. This information should certainly
be used in making decisions about how long to actually run the reactor to
be sure of reaching the goal. For example, in this case, running the reactor
for 3 hours (that is roughly $\mu + 2\sigma$) would ensure at a high level of confidence
(approximately 95% confidence) that you reach at least 90% conversion.
from scipy.integrate import odeint
import numpy as np
import matplotlib.pyplot as plt

k = 1.0e-3
Ca0 = 1.0  # mol/L

def func(X, t):
    ra = -k * (Ca0 * (1 - X))**2
    return -ra / Ca0

X0 = 0
tspan = np.linspace(0, 10000)

sol = odeint(func, X0, tspan)
plt.plot(tspan, sol)
plt.xlabel('Time (sec)')
plt.ylabel('Conversion')
plt.savefig('images/2013-01-06-batch-conversion.png')
You can read off of this figure to find the time required to achieve a
particular conversion.
14.21 Plug flow reactor with a pressure drop
If there is a pressure drop in a plug flow reactor, there are two equations
needed to determine the exit conversion: one for the conversion, and one
for the pressure drop.
$$\frac{dX}{dW} = \frac{k'}{F_{A0}} \left(\frac{1 - X}{1 + \epsilon X}\right) y \quad (49)$$
$$\frac{dy}{dW} = -\frac{\alpha (1 + \epsilon X)}{2 y} \quad (50)$$
Here is how to integrate these equations numerically in python.
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

kprime = 0.0266
Fa0 = 1.08
alpha = 0.0166
epsilon = -0.15

def dFdW(F, W):
    'set of ODEs to integrate'
    X = F[0]
    y = F[1]
    dXdW = kprime / Fa0 * (1-X) / (1 + epsilon*X) * y
    dydW = -alpha * (1 + epsilon * X) / (2 * y)
    return [dXdW, dydW]

Wspan = np.linspace(0, 60)
X0 = 0.0
y0 = 1.0
F0 = [X0, y0]
sol = odeint(dFdW, F0, Wspan)

# now plot the results
plt.plot(Wspan, sol[:, 0], label='Conversion')
plt.plot(Wspan, sol[:, 1], 'g--', label='y=$P/P_0$')
plt.legend(loc='best')
plt.xlabel('Catalyst weight (lb_m)')
plt.savefig('images/2013-01-08-pdrop.png')
14.22 Solving CSTR design equations
Given a continuously stirred tank reactor with a volume of 66,000 dm^3
where the reaction $A \rightarrow B$ occurs, at a rate of
$-r_A = k C_A^2$ ($k = 3$ L/mol/h),
1. $F_A = v_0 C_A$
def func(Ca):
    Fa = v0 * Ca      # exit molar flow of A
    ra = -k * Ca**2   # rate of reaction of A, mol/L/h
    return Fa0 - Fa + V * ra

# CA guess: assume that 90 % is reacted away
CA_guess = 0.1 * Fa0 / v0
CA_sol, = fsolve(func, CA_guess)

print('The exit concentration is {0} mol/L'.format(CA_sol))
Problem statement: A Rankine cycle operates using steam with the con-
denser at 100 degC, a pressure of 3.0 MPa and temperature of 600 degC in
the boiler. Assuming the compressor and turbine operate reversibly, esti-
mate the efficiency of the cycle.
Starting point in the Rankine cycle in condenser.
We have saturated liquid here, and we get the thermodynamic properties
for the given temperature. In this Python module, these properties are all
stored as attributes of an IAPWS97 object created at a set of conditions.
# import iapws
# print(iapws.__version__)
from iapws import IAPWS97

T1 = 100 + 273.15  # in K

sat_liquid1 = IAPWS97(T=T1, x=0)  # x is the steam quality. 0 = liquid

P1 = sat_liquid1.P
s1 = sat_liquid1.s
h1 = sat_liquid1.h
v1 = sat_liquid1.v
P2 = 3.0  # MPa
s2 = s1   # this is what isentropic means

sat_liquid2 = IAPWS97(P=P2, s=s1)
T2, = sat_liquid2.T
h2 = sat_liquid2.h

# work done to compress liquid. This is an approximation, since the
# volume does change a little with pressure, but the overall work here
# is pretty small so we neglect the volume change.
# v1 is in m^3/kg and P is in MPa, so v1 * (P2 - P1) is in MJ/kg;
# the factor of 1000 converts it to kJ/kg.
WdotP = v1 * (P2 - P1) * 1000

print('The compressor work is: {0:1.4f} kJ/kg'.format(WdotP))
T3 = 600 + 273.15  # K
P3 = P2  # definition of isobaric
steam = IAPWS97(P=P3, T=T3)

h3 = steam.h
s3 = steam.s

Qb, = h3 - h2  # heat required to make the steam

print('The boiler heat duty is: {0:1.2f} kJ/kg'.format(Qb))
14.23.4 Isentropic expansion through turbine to point 4
14.23.6 Efficiency
The efficiency is the ratio of the net work obtained from the cycle (the
turbine work less the compressor work) to the heat put in to make the
steam. The answer here agrees with the efficiency calculated in Sandler on
page 135.
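As a minimal sketch of that ratio, the function below computes the efficiency from the magnitudes of the turbine work, the compressor work, and the boiler heat duty. The function name rankine_efficiency and the numbers passed to it are hypothetical placeholders for illustration only; they are not the values computed from the steam tables in this section.

```python
def rankine_efficiency(w_turbine, w_compressor, q_boiler):
    """Thermal efficiency = net work out / heat in.

    All three arguments are positive magnitudes in the same units
    (for example kJ/kg).
    """
    if q_boiler <= 0:
        raise ValueError('boiler heat duty must be positive')
    return (w_turbine - w_compressor) / q_boiler

# Hypothetical magnitudes in kJ/kg, chosen only to exercise the function.
eta = rankine_efficiency(w_turbine=1000.0, w_compressor=3.0, q_boiler=3200.0)
print('The overall efficiency is {0:1.1%}.'.format(eta))
```

Passing magnitudes keeps the sign convention out of the formula; if the work terms are carried as signed quantities, the signs have to be handled before calling it.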
import numpy as np
import matplotlib.pyplot as plt
from iapws import IAPWS97

plt.figure()
plt.clf()
T = np.linspace(300, 372 + 273, 200)  # range of temperatures
for P in [0.1, 1, 2, 5, 10, 20]:  # MPa
    steam = [IAPWS97(T=t, P=P) for t in T]
    S = [s.s for s in steam]
    plt.plot(S, T, 'k-')

# saturated vapor and liquid entropy lines
svap = [s.s for s in [IAPWS97(T=t, x=1) for t in T]]
sliq = [s.s for s in [IAPWS97(T=t, x=0) for t in T]]

plt.plot(svap, T, 'r-')
plt.plot(sliq, T, 'b-')

plt.xlabel('Entropy (kJ/(kg K))')
plt.ylabel('Temperature (K)')
plt.savefig('images/iawps-steam.png')
We can plot our Rankine cycle path like this. We compute the entropies
along the non-isentropic paths.
14.23.8 Summary
This was an interesting exercise. On one hand, the tedium of interpolating
the steam tables is gone. On the other hand, you still have to know exactly
what to ask for to get an answer that is correct. The iapws interface is a
little clunky, and takes some getting used to. It does not seem as robust as
the Xsteam module I used in Matlab.
            [0.35, 0.34],
            [0.4, 0.43],
            [0.51, 0.47],
            [0.48, 0.55],
            [0.44, 0.62],
            [0.5, 0.66],
            [0.55, 0.57],
            [0.556, 0.48],
            [0.63, 0.43],
            [0.70, 0.44],
            [0.8, 0.51],
            [0.91, 0.57],
            [1.0, 0.6]]

import matplotlib.pyplot as plt
plt.clf()
plt.plot([p[0] for p in boundary],
         [p[1] for p in boundary])
plt.ylim([0, 1])
plt.savefig('images/boundary-1.png')
there is an intersection in the boundary, we count that as a crossing. We
choose the origin (0, 0) in this case for the reference point. For an arbitrary
point (x1, y1), the equation of the line is therefore (provided x1 != 0):
$$y = \frac{y_1}{x_1} x.$$
Let the points defining a boundary segment be (bx1, by1) and (bx2,
by2). The equation for the line connecting these points (provided bx1 !=
bx2) is:
$$y = b_{y1} + \frac{b_{y2} - b_{y1}}{b_{x2} - b_{x1}} (x - b_{x1})$$
Setting these two equations equal to each other, we can solve for the value
of x, and if bx1 <= x <= bx2 then we would say there is an intersection
with that segment. With m = (by2 - by1) / (bx2 - bx1) denoting the slope of
the segment, the solution for x is:
$$x = \frac{m b_{x1} - b_{y1}}{m - y_1 / x_1}$$
This can only fail if m = y1/x1, which means the two lines are parallel
and either do not intersect or lie on top of each other. One issue we have to
resolve is what to do when the intersection is at the boundary. In that case,
we would see an intersection with two segments since bx1 of one segment is
also bx2 of another segment. We resolve the issue by only counting
intersections with bx1. Finally, there may be intersections at values of x greater
than the point, and we are not interested in those because the intersections
are not between the point and reference point.
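To make the formula concrete, here is a small worked sketch for a single segment. The point p is an arbitrary illustrative choice, the segment endpoints are two consecutive points from the boundary data, and the helper name intersection_x is hypothetical. Plain comparison operators are used for brevity, so none of the float-tolerance issues are addressed here.

```python
def intersection_x(p, seg_start, seg_end):
    """x where the line from the origin through p meets the line
    through seg_start and seg_end. Assumes x1 != 0, bx1 != bx2,
    and that the two lines are not parallel."""
    x1, y1 = p
    bx1, by1 = seg_start
    bx2, by2 = seg_end
    m = (by2 - by1) / (bx2 - bx1)   # segment slope
    m1 = y1 / x1                    # reference line slope
    return (m * bx1 - by1) / (m - m1)

# Illustrative point, with one segment taken from the boundary data.
p = (0.5, 0.4)
seg_start, seg_end = (0.3, 0.2), (0.35, 0.34)

x = intersection_x(p, seg_start, seg_end)
on_segment = seg_start[0] <= x < seg_end[0]   # count only the bx1 end
before_point = x <= p[0]                      # between origin and p
print(x, on_segment and before_point)
```

For this point the crossing lies near x = 0.32, which is inside the segment and closer to the origin than the point, so it counts as one crossing.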
Here are all of the special cases that we have to handle:
We will have to do float comparisons, so we will define tolerance functions
for all of these. I tried this previously with regular comparison operators,
and there were many cases that did not work because of float comparisons.
In the code that follows, we define the tolerance functions, the function that
handles almost all the special cases, and show that it almost always correctly
identifies the region a point is in.
import numpy as np

TOLERANCE = 2 * np.spacing(1)

def feq(x, y, epsilon=TOLERANCE):
    'x == y'
    return not((x < (y - epsilon)) or (y < (x - epsilon)))

def flt(x, y, epsilon=TOLERANCE):
    'x < y'
    return x < (y - epsilon)

def fgt(x, y, epsilon=TOLERANCE):
    'x > y'
    return y < (x - epsilon)

def fle(x, y, epsilon=TOLERANCE):
    'x <= y'
    return not(y < (x - epsilon))


def fge(x, y, epsilon=TOLERANCE):
    'x >= y'
    return not(x < (y - epsilon))

boundary = [[0.1, 0],
            [0.25, 0.1],
            [0.3, 0.2],
            [0.35, 0.34],
            [0.4, 0.43],
            [0.51, 0.47],
            [0.48, 0.55],
            [0.44, 0.62],
            [0.5, 0.66],
            [0.55, 0.57],
            [0.556, 0.48],
            [0.63, 0.43],
            [0.70, 0.44],
            [0.8, 0.51],
            [0.91, 0.57],
            [1.0, 0.6]]

def intersects(p, isegment):
    'p is a point (x1, y1), isegment is an integer indicating which segment starting with 0'
    x1, y1 = p
    bx1, by1 = boundary[isegment]
    bx2, by2 = boundary[isegment + 1]
    if feq(bx1, bx2) and feq(x1, 0.0):  # both segments are vertical
        if feq(bx1, x1):
            return True
        else:
            return False
    elif feq(bx1, bx2):  # segment is vertical
        m1 = y1 / x1  # slope of reference line
        y = m1 * bx1  # value of reference line at bx1
        if ((fge(y, by1) and flt(y, by2))
            or (fle(y, by1) and fgt(y, by2))):
            # reference line intersects the segment
            return True
        else:
            return False
    else:  # neither reference line nor segment is vertical
        m = (by2 - by1) / (bx2 - bx1)  # segment slope
        m1 = y1 / x1
        if feq(m, m1):  # line and segment are parallel
            if feq(y1, m * bx1):
                return True
            else:
                return False
        else:  # lines are not parallel
            x = (m * bx1 - by1) / (m - m1)  # x at intersection
            if ((fge(x, bx1) and flt(x, bx2))
                or (fle(x, bx1) and fgt(x, bx2))) and fle(x, x1):
                return True
            else:
                return False
    raise Exception('you should not get here')

import matplotlib.pyplot as plt

plt.plot([p[0] for p in boundary],
         [p[1] for p in boundary], 'go-')
plt.ylim([0, 1])

N = 100

X = np.linspace(0, 1, N)

for x in X:
    for y in X:
        p = (x, y)
        nintersections = sum([intersects(p, i) for i in range(len(boundary) - 1)])
        if nintersections % 2 == 0:
            plt.plot(x, y, 'r.')
        else:
            plt.plot(x, y, 'b.')

plt.savefig('images/boundary-2.png')
If you look carefully, there are two blue points in the red region, which
means there is some edge case we do not capture in our function. Kudos to
the person who figures it out. Update: It was pointed out that the points
intersect a point on the line.
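The tolerance helpers can also be checked in isolation, which shows why plain comparison operators were not reliable here. The sketch below repeats two of the helpers using only the standard library; it assumes that sys.float_info.epsilon is the same number as np.spacing(1) (the machine epsilon for doubles). Note that the tolerance is absolute, so it is only meaningful for values of order one, like the coordinates in this example.

```python
import sys

# Same tolerance as above: np.spacing(1) equals the machine epsilon.
TOLERANCE = 2 * sys.float_info.epsilon

def feq(x, y, epsilon=TOLERANCE):
    'x == y, within epsilon'
    return not ((x < (y - epsilon)) or (y < (x - epsilon)))

def flt(x, y, epsilon=TOLERANCE):
    'x < y, by more than epsilon'
    return x < (y - epsilon)

a = 0.1 + 0.2   # not exactly representable, differs from 0.3 by roundoff
b = 0.3

print(a == b)      # exact comparison
print(feq(a, b))   # tolerant comparison
print(b < a)       # exact comparison
print(flt(b, a))   # tolerant comparison
```

The exact operators report that the two numbers differ and that b is smaller than a, while the tolerant versions treat them as equal.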
15 Units
15.1 Using units in python
Units in Matlab
I think an essential feature in an engineering computational environment
is properly handling units and unit conversions. Mathcad supports that
pretty well. I wrote a package for doing it in Matlab. Today I am going to
explore units in python. Here are some of the packages that I have found
which support units to some extent:
1. http://pypi.python.org/pypi/units/
2. http://packages.python.org/quantities/user/tutorial.html
3. http://dirac.cnrs-orleans.fr/ScientificPython/ScientificPythonManual/Scientific.Physics.PhysicalQuantities-module.html
4. http://home.scarlet.be/be052320/Unum.html
5. https://simtk.org/home/python_units
6. http://docs.enthought.com/scimath/units/intro.html
15.1.1 scimath
scimath may only work in Python 2.
import numpy as np
from scimath.units.volume import liter
from scimath.units.substance import mol

q = np.array([1, 2, 3]) * mol
print(q)

P = q / liter
print(P)
That doesn't look too bad. It is a little clunky to have to import every
unit, and it is clear the package is saving everything in SI units by default.
Let us try to solve an equation.
Find the time that solves this equation.
$$0.01 = C_{A0} e^{-kt}$$
First we solve without units. That way we know the answer.
import numpy as np
from scipy.optimize import fsolve

CA0 = 1.0  # mol/L
CA = 0.01  # mol/L
k = 1.0    # 1/s

def func(t):
    z = CA - CA0 * np.exp(-k * t)
    return z

t0 = 2.3

t, = fsolve(func, t0)
print('t = {0:1.2f} seconds'.format(t))
t = 4.61 seconds
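This particular equation can also be inverted by hand, $t = \ln(C_{A0} / C_A) / k$, which gives an independent check on the numerical answer. A minimal sketch with the same values and only the standard library:

```python
import math

CA0 = 1.0  # mol/L
CA = 0.01  # mol/L
k = 1.0    # 1/s

# CA = CA0 * exp(-k t)  ->  t = ln(CA0 / CA) / k
t = math.log(CA0 / CA) / k
print('t = {0:1.2f} seconds'.format(t))
```

The closed form reproduces the 4.61 seconds found by fsolve.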
Now, with units. I note here that I tried the obvious thing of just
importing the units, and adding them on, but the package is unable to work
with floats that have units. For some functions, there must be an ndarray
with units which is practically what the UnitScalar code below does.
import numpy as np
from scipy.optimize import fsolve
from scimath.units.volume import liter
from scimath.units.substance import mol
from scimath.units.time import second
from scimath.units.api import has_units, UnitScalar

CA0 = UnitScalar(1.0, units=mol / liter)
CA = UnitScalar(0.01, units=mol / liter)
k = UnitScalar(1.0, units=1 / second)

@has_units(inputs="t::units=s",
           outputs="result::units=mol/liter")
def func(t):
    z = CA - CA0 * float(np.exp(-k * t))
    return z

t0 = UnitScalar(2.3, units=second)

t, = fsolve(func, t0)
print('t = {0:1.2f} seconds'.format(t))
print(type(t))
t = 4.61 seconds
<type 'numpy.float64'>
This is some heavy syntax that in the end does not preserve the units.
In my Matlab package, we had to "wrap" many functions like fsolve so they
would preserve units. Clearly this package will need that as well. Overall,
in its current implementation this package does not do what I would expect
all the time. [3]
[3] Then again no package does yet!
import quantities as u
import numpy as np

from scipy.optimize import fsolve
CA0 = 1 * u.mol / u.L
CA = 0.01 * u.mol / u.L
k = 1.0 / u.s

def func(t):
    return CA - CA0 * np.exp(-k * t)

tguess = 4 * u.s

print(func(tguess))

print(fsolve(func, tguess))
-0.008315638888734178 mol/L
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/jkitchin/anaconda3/lib/python3.5/site-packages/scipy/optimize/minpack.
res = _root_hybr(func, x0, args, jac=fprime, **options)
File "/Users/jkitchin/anaconda3/lib/python3.5/site-packages/scipy/optimize/minpack.
shape, dtype = _check_func(fsolve, func, func, x0, args, n, (n,))
File "/Users/jkitchin/anaconda3/lib/python3.5/site-packages/scipy/optimize/minpack.
res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
File "<stdin>", line 2, in func
File "/Users/jkitchin/anaconda3/lib/python3.5/site-packages/quantities/quantity.py"
res._dimensionality = p_dict[uf](*objs)
File "/Users/jkitchin/anaconda3/lib/python3.5/site-packages/quantities/dimensionali
raise ValueError("quantity must be dimensionless")
ValueError: quantity must be dimensionless
Our function works fine with units, but fsolve does not pass numbers with
units back to the function, so this function fails because the exponential
function gets an argument with dimensions in it. We can create a new
function that solves this problem. We need to "wrap" the function we want
to solve to make sure that it uses units, but returns a float number. Then,
we put the units back onto the final solved value. Here is how we do that.
import quantities as u
import numpy as np

from scipy.optimize import fsolve as _fsolve

CA0 = 1 * u.mol / u.L
CA = 0.01 * u.mol / u.L
k = 1.0 / u.s

def func(t):
    return CA - CA0 * np.exp(-k * t)

def fsolve(func, t0):
    'wrapped fsolve command to work with units'
    tU = t0 / float(t0)  # units on initial guess, normalized
    def wrapped_func(t):
        't will be unitless, so we add unit to it. t * tU has units.'
        return float(func(t * tU))

    sol, = _fsolve(wrapped_func, t0)
    return sol * tU

tguess = 4 * u.s

print(fsolve(func, tguess))
4.605170185988092 s
import quantities as u
import numpy as np

from scipy.optimize import fsolve as _fsolve

def fsolve(func, t0):
    '''wrapped fsolve command to work with units. We get the units on
    the function argument, then wrap the function so we can add units
    to the argument and return floats. Finally we call the original
    fsolve from scipy. Note: this does not support all of the options
    to fsolve.'''

    try:
        tU = [t / float(t) for t in t0]  # units on initial guess, normalized
    except TypeError:
        tU = t0 / float(t0)

    def wrapped_func(t):
        't will be unitless, so we add unit to it. t * tU has units.'
        try:
            T = [x1 * x2 for x1, x2 in zip(t, tU)]
        except TypeError:
            T = t * tU

        try:
            return [float(x) for x in func(T)]
        except TypeError:
            return float(func(T))

    sol = _fsolve(wrapped_func, t0)
    try:
        return [x1 * x2 for x1, x2 in zip(sol, tU)]
    except TypeError:
        return sol * tU

### Problem 1
CA0 = 1 * u.mol / u.L
CA = 0.01 * u.mol / u.L
k = 1.0 / u.s

def func(t):
    return CA - CA0 * np.exp(-k * t)


tguess = 4 * u.s
sol1, = fsolve(func, tguess)
print('sol1 = ', sol1)

### Problem 2
def func2(X):
    a, b = X
    return [a**2 - 4*u.kg**2,
            b**2 - 25*u.J**2]

Xguess = [2.2*u.kg, 5.2*u.J]
s2a, s2b = fsolve(func2, Xguess)
print('s2a = {0}\ns2b = {1}'.format(s2a, s2b))
sol1 = 4.605170185988092 s
s2a = 1.9999999999999867 kg
s2b = 5.000000000000002 J
That is pretty good. There is still room for improvement in the wrapped
function, as it does not support all of the options that scipy.optimize.fsolve
supports. Here is a draft of a function that does that. We have to return
different numbers of arguments depending on the value of full_output. This
function works, but I have not fully tested all the options. Here are three
examples that work, including one with an argument.
import quantities as u
import numpy as np

from scipy.optimize import fsolve as _fsolve

def fsolve(func, t0, args=(),
           fprime=None, full_output=0, col_deriv=0,
           xtol=1.49012e-08, maxfev=0, band=None,
           epsfcn=0.0, factor=100, diag=None):
    '''wrapped fsolve command to work with units. We get the units on
    the function argument, then wrap the function so we can add units
    to the argument and return floats. Finally we call the original
    fsolve from scipy.'''

    try:
        tU = [t / float(t) for t in t0]  # units on initial guess, normalized
    except TypeError:
        tU = t0 / float(t0)

    def wrapped_func(t, *args):
        't will be unitless, so we add unit to it. t * tU has units.'
        try:
            T = [x1 * x2 for x1, x2 in zip(t, tU)]
        except TypeError:
            T = t * tU

        try:
            return [float(x) for x in func(T, *args)]
        except TypeError:
            # pass the extra arguments in the scalar case too
            return float(func(T, *args))

    sol = _fsolve(wrapped_func, t0, args,
                  fprime, full_output, col_deriv,
                  xtol, maxfev, band,
                  epsfcn, factor, diag)

    if full_output:
        x, infodict, ier, mesg = sol
        try:
            x = [x1 * x2 for x1, x2 in zip(x, tU)]
        except TypeError:
            x = x * tU
        return x, infodict, ier, mesg
    else:
        try:
            x = [x1 * x2 for x1, x2 in zip(sol, tU)]
        except TypeError:
            x = sol * tU
        return x

### Problem 1
CA0 = 1 * u.mol / u.L
CA = 0.01 * u.mol / u.L
k = 1.0 / u.s

def func(t):
    return CA - CA0 * np.exp(-k * t)


tguess = 4 * u.s
sol1, = fsolve(func, tguess)
print('sol1 = ', sol1)

### Problem 2
def func2(X):
    a, b = X
    return [a**2 - 4*u.kg**2,
            b**2 - 25*u.J**2]

Xguess = [2.2*u.kg, 5.2*u.J]
sol, infodict, ier, mesg = fsolve(func2, Xguess, full_output=1)
s2a, s2b = sol
print('s2a = {0}\ns2b = {1}'.format(s2a, s2b))

### Problem 3 - with an arg
def func3(a, arg):
    return a**2 - 4*u.kg**2 + arg**2

Xguess = 1.5 * u.kg
arg = 0.0 * u.kg

sol3, = fsolve(func3, Xguess, args=(arg,))

print('sol3 = ', sol3)
sol1 = 4.605170185988092 s
s2a = 1.9999999999999867 kg
s2b = 5.000000000000002 J
sol3 = 2.0 kg
The only downside I can see in the quantities module is that it only
handles temperature differences, and not absolute temperatures. If you only
use absolute temperatures, this would not be a problem I think. But, if
you have mixed temperature scales, the quantities module does not convert
them on an absolute scale.
import quantities as u

T = 20 * u.degC

print(T.rescale(u.K))
print(T.rescale(u.degF))
20.0 K
36.0 degF
Nevertheless, this module seems pretty promising, and there are a lot
more features than shown here. Some documentation can be found at
http://pythonhosted.org/quantities/.
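One way to work around the temperature limitation noted above is to do the offset conversion by hand on plain floats, and attach units only afterwards. The sketch below shows that idea; the helper names degC_to_K and degC_to_degF are hypothetical and are not part of the quantities module.

```python
def degC_to_K(T_degC):
    'Absolute temperature in K from a temperature in degC.'
    return T_degC + 273.15

def degC_to_degF(T_degC):
    'Temperature in degF from a temperature in degC.'
    return T_degC * 9.0 / 5.0 + 32.0

T = 20.0  # degC
print('{0:1.2f} K'.format(degC_to_K(T)))
print('{0:1.1f} degF'.format(degC_to_degF(T)))
```

For 20 degC this gives 293.15 K and 68.0 degF, whereas the rescale calls above only scale the size of the degree and return 20.0 K and 36.0 degF.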
import quantities as u

k = 0.23 / u.s
Ca0 = 1 * u.mol / u.L

def dCadt(Ca, t):
    return -k * Ca

import numpy as np
from scipy.integrate import odeint

tspan = np.linspace(0, 5) * u.s

sol = odeint(dCadt, Ca0, tspan)

print(sol[-1])
[ 0.31663678]
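The printed result has lost its units, but its magnitude can be checked against the analytical solution of this first-order ODE, $C_A(t) = C_{A0} e^{-kt}$. A minimal sketch with plain floats and only the standard library:

```python
import math

k = 0.23      # 1/s
Ca0 = 1.0     # mol/L
t_end = 5.0   # s, the last point in tspan

Ca_end = Ca0 * math.exp(-k * t_end)
print('Ca(5 s) = {0:1.6f} mol/L'.format(Ca_end))
```

The analytical value agrees with the odeint result to within the integrator tolerance, so only the units were lost, not the numerical answer.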
import quantities as u
import matplotlib.pyplot as plt

import numpy as np
from scipy.integrate import odeint as _odeint

def odeint(func, y0, t, args=(),
           Dfun=None, col_deriv=0, full_output=0,
           ml=None, mu=None, rtol=None, atol=None,
           tcrit=None, h0=0.0, hmax=0.0, hmin=0.0,
           ixpr=0, mxstep=0, mxhnil=0, mxordn=12,
           mxords=5, printmessg=0):

    def wrapped_func(Y0, T, *args):
        # put units on T if they are on the original t
        # check for units so we don't put them on twice
        if not hasattr(T, 'units') and hasattr(t, 'units'):
            T = T * t.units
        # now for the dependent variable units. Y0 may be a scalar or
        # a list or an array. we want to check each element of y0 for
        # units, and add them to the corresponding element of Y0 if we
        # need to.
        try:
            uY0 = [x for x in Y0]  # a list copy of contents of Y0
            # this works if y0 is an iterable, eg. a list or array
            for i, yi in enumerate(y0):
                if not hasattr(uY0[i], 'units') and hasattr(yi, 'units'):
                    uY0[i] = uY0[i] * yi.units

        except TypeError:
            # we have a scalar
            if not hasattr(Y0, 'units') and hasattr(y0, 'units'):
                uY0 = Y0 * y0.units

        # pass the current value T, not the whole span t
        val = func(uY0, T, *args)

        try:
            return np.array([float(x) for x in val])
        except TypeError:
            return float(val)

    if full_output:
        y, infodict = _odeint(wrapped_func, y0, t, args,
                              Dfun, col_deriv, full_output,
                              ml, mu, rtol, atol,
                              tcrit, h0, hmax, hmin,
                              ixpr, mxstep, mxhnil, mxordn,
                              mxords, printmessg)
    else:
        y = _odeint(wrapped_func, y0, t, args,
                    Dfun, col_deriv, full_output,
                    ml, mu, rtol, atol,
                    tcrit, h0, hmax, hmin,
                    ixpr, mxstep, mxhnil, mxordn,
                    mxords, printmessg)

    # now we need to put units onto the solution. units should be the
    # same as y0. We cannot put mixed units in an array, so, we return a list
    m, n = y.shape  # y is an ndarray, so it has a shape
    if n > 1:  # more than one equation, we need a list
        uY = [0 for yi in range(n)]

        for i, yi in enumerate(y0):
            if not hasattr(uY[i], 'units') and hasattr(yi, 'units'):
                uY[i] = y[:, i] * yi.units
            else:
                uY[i] = y[:, i]

    else:
        uY = y * y0.units

    y = uY

    if full_output:
        return y, infodict
    else:
        return y

##################################################################
# test a single ODE
k = 0.23 / u.s
Ca0 = 1 * u.mol / u.L

def dCadt(Ca, t):
    return -k * Ca

tspan = np.linspace(0, 5) * u.s
sol = odeint(dCadt, Ca0, tspan)

print(sol[-1])

plt.plot(tspan, sol)
plt.xlabel('Time ({0})'.format(tspan.dimensionality.latex))
plt.ylabel('$C_A$ ({0})'.format(sol.dimensionality.latex))
plt.savefig('images/ode-units-ca.png')

##################################################################
# test coupled ODEs
lbmol = 453.59237 * u.mol

kprime = 0.0266 * lbmol / u.hr / u.lb
Fa0 = 1.08 * lbmol / u.hr
alpha = 0.0166 / u.lb
epsilon = -0.15

def dFdW(F, W, alpha0):
    X, y = F
    dXdW = kprime / Fa0 * (1.0 - X) / (1.0 + epsilon * X) * y
    dydW = -alpha0 * (1.0 + epsilon * X) / (2.0 * y)
    return [dXdW, dydW]

X0 = 0.0 * u.dimensionless
y0 = 1.0

# initial conditions
F0 = [X0, y0]  # one with units, one without units, both are dimensionless

wspan = np.linspace(0, 60) * u.lb

sol = odeint(dFdW, F0, wspan, args=(alpha,))
X, y = sol

print('Test 2')
print(X[-1])
print(y[-1])

plt.figure()
plt.plot(wspan, X, wspan, y)
plt.legend(['X', '$P/P_0$'])
plt.xlabel('Catalyst weight ({0})'.format(wspan.dimensionality.latex))
plt.savefig('images/ode-coupled-units-pdrpo.png')
[ 0.31663678] mol/L
Test 2
0.6655695781563288 dimensionless
0.263300470681
That is not too bad. This is another example of a function you would
want to save in a module for reuse. There is one bad feature of the wrapped
odeint function, and that is that it changes the solution for coupled ODEs
from an ndarray to a list. That is necessary because you apparently cannot
have mixed units in an ndarray. It is fine, however, to have a list of mixed
units. This is not a huge problem, but it changes the syntax for plotting
results for the wrapped odeint function compared to the unwrapped function
without units.
import quantities as u

a = 5 * u.m
L = 10 * u.m  # characteristic length

print(a/L)
print(type(a/L))
0.5 dimensionless
<class 'quantities.quantity.Quantity'>
As you can see, the dimensionless number is scaled properly, and is listed
as dimensionless. The result is still an instance of a quantities object though.
That is not likely to be a problem.
Now, we consider using fsolve with dimensionless equations. Our goal is
to solve $C_A = C_{A0} \exp(-k t)$ for the time required to reach a desired $C_A$. We
let $X = C_A / C_{A0}$ and $\tau = t k$, which leads to $X = \exp(-\tau)$ in dimensionless
terms.
import quantities as u
import numpy as np
from scipy.optimize import fsolve

CA0 = 1 * u.mol / u.L
CA = 0.01 * u.mol / u.L  # desired exit concentration
k = 1.0 / u.s

# we need new dimensionless variables
# let X = Ca / Ca0
# so, Ca = Ca0 * X

# let tau = t * k
# so t = tau / k

X = CA / CA0  # desired exit dimensionless concentration

def func(tau):
    return X - np.exp(-tau)

tauguess = 2

print(func(tauguess))  # confirm we have a dimensionless function

tau_sol, = fsolve(func, tauguess)
t = tau_sol / k
print(t)
-0.1253352832366127 dimensionless
4.605170185988091 s
Now consider the ODE $\frac{dC_A}{dt} = -k C_A$. We let $X = C_A / C_{A0}$, so
$C_{A0} dX = dC_A$. Let $\tau = t k$, which in this case is dimensionless. That
means $d\tau = k dt$. Substitution of these new variables leads to:
$$C_{A0} k \frac{dX}{d\tau} = -k C_{A0} X$$
or equivalently:
$$\frac{dX}{d\tau} = -X$$
import quantities as u

k = 0.23 / u.s
Ca0 = 1 * u.mol / u.L

# Let X = Ca/Ca0 -> Ca = Ca0 * X, dCa = Ca0 * dX
# let tau = t * k -> dt = 1/k dtau


def dXdtau(X, tau):
    return -X

import numpy as np
from scipy.integrate import odeint

tspan = np.linspace(0, 5) * u.s
tauspan = tspan * k

X0 = 1
X_sol = odeint(dXdtau, X0, tauspan)

print('Ca at t = {0} = {1}'.format(tspan[-1], X_sol.flatten()[-1] * Ca0))
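The same dimensionless problem can also be integrated without SciPy, using a hand-written fixed-step fourth-order Runge-Kutta loop. This is a substitute technique shown only for illustration, not what odeint does internally; the function name rk4 and the step count are assumptions. As in the example above, the integration is done on plain dimensionless numbers and the concentration scale is put back at the end.

```python
import math

def rk4(f, y0, t0, t1, nsteps):
    'Integrate dy/dt = f(y, t) from t0 to t1 with fixed-step RK4.'
    h = (t1 - t0) / nsteps
    y, t = y0, t0
    for _ in range(nsteps):
        k1 = f(y, t)
        k2 = f(y + 0.5 * h * k1, t + 0.5 * h)
        k3 = f(y + 0.5 * h * k2, t + 0.5 * h)
        k4 = f(y + h * k3, t + h)
        y += h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y

def dXdtau(X, tau):
    return -X

k = 0.23      # 1/s
Ca0 = 1.0     # mol/L
t_end = 5.0   # s
tau_end = k * t_end   # dimensionless time

X_end = rk4(dXdtau, 1.0, 0.0, tau_end, nsteps=50)
Ca_end = X_end * Ca0  # put the concentration scale back

print('Ca at t = {0} s = {1:1.6f} mol/L'.format(t_end, Ca_end))
print('analytical:       {0:1.6f} mol/L'.format(Ca0 * math.exp(-tau_end)))
```

Because the integrator only ever sees dimensionless numbers, no wrapping of the solver is needed; the price is that the scaling has to be done by hand before and after.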
Copyright (C) 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
<http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
0. PREAMBLE
We have designed this License in order to use it for manuals for free
software, because free software needs free documentation: a free
program should come with manuals providing the same freedoms that the
software does. But this License is not limited to software manuals;
it can be used for any textual work, regardless of subject matter or
whether it is published as a printed book. We recommend this License
principally for works whose purpose is instruction or reference.
This License applies to any manual or other work, in any medium, that
contains a notice placed by the copyright holder saying it can be
distributed under the terms of this License. Such a notice grants a
world-wide, royalty-free license, unlimited in duration, to use that
work under the conditions stated herein. The "Document", below,
refers to any such manual or work. Any member of the public is a
licensee, and is addressed as "you". You accept the license if you
copy, modify or distribute the work in a way requiring permission
under copyright law.
modifications and/or translated into another language.
The "Cover Texts" are certain short passages of text that are listed,
as Front-Cover Texts or Back-Cover Texts, in the notice that says that
the Document is released under this License. A Front-Cover Text may
be at most 5 words, and a Back-Cover Text may be at most 25 words.
ASCII without markup, Texinfo input format, LaTeX input format, SGML
or XML using a publicly available DTD, and standard-conforming simple
HTML, PostScript or PDF designed for human modification. Examples of
transparent image formats include PNG, XCF and JPG. Opaque formats
include proprietary formats that can be read and edited only by
proprietary word processors, SGML or XML for which the DTD and/or
processing tools are not generally available, and the
machine-generated HTML, PostScript or PDF produced by some word
processors for output purposes only.
The "Title Page" means, for a printed book, the title page itself,
plus such following pages as are needed to hold, legibly, the material
this License requires to appear in the title page. For works in
formats which do not have any title page as such, "Title Page" means
the text near the most prominent appearance of the work's title,
preceding the beginning of the body of the text.
The Document may include Warranty Disclaimers next to the notice which
states that this License applies to the Document. These Warranty
Disclaimers are considered to be included by reference in this
License, but only as regards disclaiming warranties: any other
implication that these Warranty Disclaimers may have is void and has
no effect on the meaning of this License.
2. VERBATIM COPYING
You may copy and distribute the Document in any medium, either
commercially or noncommercially, provided that this License, the
copyright notices, and the license notice saying this License applies
to the Document are reproduced in all copies, and that you add no
other conditions whatsoever to those of this License. You may not use
technical measures to obstruct or control the reading or further
copying of the copies you make or distribute. However, you may accept
compensation in exchange for copies. If you distribute a large enough
number of copies you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and
you may publicly display copies.
3. COPYING IN QUANTITY
If you publish printed copies (or copies in media that commonly have
printed covers) of the Document, numbering more than 100, and the
Document's license notice requires Cover Texts, you must enclose the
copies in covers that carry, clearly and legibly, all these Cover
Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on
the back cover. Both covers must also clearly and legibly identify
you as the publisher of these copies. The front cover must present
the full title with all words of the title equally prominent and
visible. You may add other material on the covers in addition.
Copying with changes limited to the covers, as long as they preserve
the title of the Document and satisfy these conditions, can be treated
as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit
legibly, you should put the first ones listed (as many as fit
reasonably) on the actual cover, and continue the rest onto adjacent
pages.
location until at least one year after the last time you distribute an
Opaque copy (directly or through your agents or retailers) of that
edition to the public.
It is requested, but not required, that you contact the authors of the
Document well before redistributing any large number of copies, to
give them a chance to provide you with an updated version of the
Document.
4. MODIFICATIONS
You may copy and distribute a Modified Version of the Document under
the conditions of sections 2 and 3 above, provided that you release
the Modified Version under precisely this License, with the Modified
Version filling the role of the Document, thus licensing distribution
and modification of the Modified Version to whoever possesses a copy
of it. In addition, you must do these things in the Modified Version:
A. Use in the Title Page (and on the covers, if any) a title distinct
from that of the Document, and from those of previous versions
(which should, if there were any, be listed in the History section
of the Document). You may use the same title as a previous version
if the original publisher of that version gives permission.
B. List on the Title Page, as authors, one or more persons or entities
responsible for authorship of the modifications in the Modified
Version, together with at least five of the principal authors of the
Document (all of its principal authors, if it has fewer than five),
unless they release you from this requirement.
C. State on the Title page the name of the publisher of the
Modified Version, as the publisher.
D. Preserve all the copyright notices of the Document.
E. Add an appropriate copyright notice for your modifications
adjacent to the other copyright notices.
F. Include, immediately after the copyright notices, a license notice
giving the public permission to use the Modified Version under the
terms of this License, in the form shown in the Addendum below.
G. Preserve in that license notice the full lists of Invariant Sections
and required Cover Texts given in the Document's license notice.
H. Include an unaltered copy of this License.
I. Preserve the section Entitled "History", Preserve its Title, and add
to it an item stating at least the title, year, new authors, and
publisher of the Modified Version as given on the Title Page. If
there is no section Entitled "History" in the Document, create one
stating the title, year, authors, and publisher of the Document as
given on its Title Page, then add an item describing the Modified
Version as stated in the previous sentence.
J. Preserve the network location, if any, given in the Document for
public access to a Transparent copy of the Document, and likewise
the network locations given in the Document for previous versions
it was based on. These may be placed in the "History" section.
You may omit a network location for a work that was published at
least four years before the Document itself, or if the original
publisher of the version it refers to gives permission.
K. For any section Entitled "Acknowledgements" or "Dedications",
Preserve the Title of the section, and preserve in the section all
the substance and tone of each of the contributor acknowledgements
and/or dedications given therein.
L. Preserve all the Invariant Sections of the Document,
unaltered in their text and in their titles. Section numbers
or the equivalent are not considered part of the section titles.
M. Delete any section Entitled "Endorsements". Such a section
may not be included in the Modified Version.
N. Do not retitle any existing section to be Entitled "Endorsements"
or to conflict in title with any Invariant Section.
O. Preserve any Warranty Disclaimers.
You may add a passage of up to five words as a Front-Cover Text, and a
passage of up to 25 words as a Back-Cover Text, to the end of the list
of Cover Texts in the Modified Version. Only one passage of
Front-Cover Text and one of Back-Cover Text may be added by (or
through arrangements made by) any one entity. If the Document already
includes a cover text for the same cover, previously added by you or
by arrangement made by the same entity you are acting on behalf of,
you may not add another; but you may replace the old one, on explicit
permission from the previous publisher that added the old one.
5. COMBINING DOCUMENTS
You may combine the Document with other documents released under this
License, under the terms defined in section 4 above for modified
versions, provided that you include in the combination all of the
Invariant Sections of all of the original documents, unmodified, and
list them all as Invariant Sections of your combined work in its
license notice, and that you preserve all their Warranty Disclaimers.
The combined work need only contain one copy of this License, and
multiple identical Invariant Sections may be replaced with a single
copy. If there are multiple Invariant Sections with the same name but
different contents, make the title of each such section unique by
adding at the end of it, in parentheses, the name of the original
author or publisher of that section if known, or else a unique number.
Make the same adjustment to the section titles in the list of
Invariant Sections in the license notice of the combined work.
6. COLLECTIONS OF DOCUMENTS
8. TRANSLATION
Replacing Invariant Sections with translations requires special
permission from their copyright holders, but you may include
translations of some or all Invariant Sections in addition to the
original versions of these Invariant Sections. You may include a
translation of this License, and all the license notices in the
Document, and any Warranty Disclaimers, provided that you also include
the original English version of this License and the original versions
of those notices and disclaimers. In case of a disagreement between
the translation and the original version of this License or a notice
or disclaimer, the original version will prevail.
9. TERMINATION
However, if you cease all violation of this License, then your license
from a particular copyright holder is reinstated (a) provisionally,
unless and until the copyright holder explicitly and finally
terminates your license, and (b) permanently, if the copyright holder
fails to notify you of the violation by some reasonable means prior to
60 days after the cessation.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, receipt of a copy of some or all of the same material does
not give you any rights to use it.
The Free Software Foundation may publish new, revised versions of the
GNU Free Documentation License from time to time. Such new versions
will be similar in spirit to the present version, but may differ in
detail to address new problems or concerns. See
http://www.gnu.org/copyleft/.
11. RELICENSING
"Incorporate" means to publish or republish a Document, in whole or in
part, as part of another Document.
The operator of an MMC Site may republish an MMC contained in the site
under CC-BY-SA on the same site at any time before August 1, 2009,
provided the MMC is eligible for relicensing.
17 References
http://scipy-lectures.github.com/index.html
18 Index