Python 3 Cheat Sheet: Int Float Bool STR List Tuple

©2012-2015 - Laurent Pointal, Mémento v2.0.3
License Creative Commons Attribution 4
Official Python documentation on http://docs.python.org/py3k
integer, float, boolean, string, bytes Base Types ◾ ordered sequences, fast index access, repeatable values Container Types
int 783 0 -192 0b010 0o642 0xF3 list [1,5,9] ["x",11,8.9] ["mot"] []
null binary octal hexa tuple (1,5,9) 11,"y",7.4 ("mot",) ()
float 9.23 0.0 -1.7e-6 (×10⁻⁶)
Non modifiable values (immutables) ☝ expression with just commas → tuple
bool True False
""
str bytes (ordered sequences of chars / bytes)
str "One\nTwo" Multiline string: b""
escaped new line """X\tY\tZ ◾ key containers, no a priori order, fast key access, each key is unique
'I\'m' 1\t2\t3""" dictionary dict {"key":"value"} dict(a=3,b=4,k="v") {}
escaped ' escaped tab (key/value associations) {1:"one",3:"three",2:"two",3.14:"π"}
bytes b"toto\xfe\775" collection set {"key1","key2"} {1,9,3,0} set()
hexadecimal octal ☝ immutables ☝ keys=hashable values (base types, immutables…) frozenset immutable set empty
for variables, functions, Identifiers type(expression) Conversions

modules, classes… names
int("15") → 15
int("3f",16) → 63 can specify integer number base in 2nd parameter
a…zA…Z_ followed by a…zA…Z_0…9
◽ diacritics allowed but should be avoided int(15.56) → 15 truncate decimal part
◽ language keywords forbidden float("-11.24e8") → -1124000000.0
◽ lower/UPPER case discrimination round(15.56,1)→ 15.6 rounding to 1 decimal (0 decimal → integer number)
☺ a toto x7 y_max BigOne bool(x) False for null x, empty container x , None x or False x ; True for other x
☹ 8y and for str(x)→ "…" representation string of x for display (cf. formatting on the back)
chr(64)→'@' ord('@')→64 code ↔ char
= Variables assignment
repr(x)→ "…" literal representation string of x
1) evaluation of right side expression value
2) assignment in order with left side names bytes([72,9,64]) → b'H\t@'
☝ assignment ⇔ binding of a name with a value list("abc") → ['a','b','c']
x=1.2+8+sin(y) dict([(3,"three"),(1,"one")]) → {1:'one',3:'three'}
a=b=c=0 assignment to same value set(["one","two"]) → {'one','two'}
y,z,r=9.2,-7.6,0 multiple assignments separator str and sequence of str → assembled str
a,b=b,a values swap ':'.join(['toto','12','pswd']) → 'toto:12:pswd'
a,*b=seq unpacking of sequence in str split on whitespace → list of str
*a,b=seq item and list "words with spaces".split() → ['words','with','spaces']
and str split on separator str → list of str
x+=3 increment ⇔ x=x+3 *=
x-=2 decrement ⇔ x=x-2 /= "1,4,8,2".split(",") → ['1','4','8','2']
x=None « undefined » constant value %= sequence of one type → list of another type (via comprehension list)
del x remove name x … [int(x) for x in ('1','29','-3')] → [1,29,-3]
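The conversions listed above can be checked with a short sketch. The variable names are illustrative only and do not come from the original sheet.

```python
# Conversions from the cheat sheet, collected in one runnable place.
n_dec = int("15")              # string to integer
n_hex = int("3f", 16)          # base given as 2nd parameter
n_trunc = int(15.56)           # decimal part is truncated, not rounded
r = round(15.56, 1)            # rounding to 1 decimal

# str <-> sequence of str
joined = ":".join(["toto", "12", "pswd"])
words = "words with spaces".split()      # split on whitespace
fields = "1,4,8,2".split(",")            # split on a separator string

# sequence of one type -> list of another type (list comprehension)
numbers = [int(x) for x in ("1", "29", "-3")]

print(n_dec, n_hex, n_trunc, r, joined, words, fields, numbers)
```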
for lists, tuples, strings, bytes… Sequence Containers Indexing
negative index -5 -4 -3 -2 -1 Items count Individual access to items via lst[index]
positive index 0 1 2 3 4 len(lst)→5 lst[0]→10 ⇒ first one lst[1]→20
lst=[10, 20, 30, 40, 50] lst[-1]→50 ⇒ last one lst[-2]→40
positive slice 0 1 2 3 4 5 ☝ index from 0
On mutable sequences (list), remove with
negative slice -5 -4 -3 -2 -1 (here from 0 to 4)
del lst[3] and modify with assignment
lst[4]=25
Access to sub-sequences via lst[start slice:end slice:step]
lst[:-1]→[10,20,30,40] lst[::-1]→[50,40,30,20,10] lst[1:3]→[20,30] lst[:3]→[10,20,30]
lst[1:-1]→[20,30,40] lst[::-2]→[50,30,10] lst[-3:-1]→[30,40] lst[3:]→[40,50]
lst[::2]→[10,30,50] lst[:]→[10,20,30,40,50] shallow copy of sequence
Missing slice indication → from start / up to end.
On mutable sequences (list), remove with del lst[3:5] and modify with assignment lst[1:4]=[15,25]
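A minimal sketch of the indexing and slicing rules above, using the same `lst` as the sheet. The name `work` is an assumption introduced here so that the original list stays unchanged.

```python
lst = [10, 20, 30, 40, 50]

first, last = lst[0], lst[-1]   # index from 0, negative index from the end
middle = lst[1:-1]              # start included, end excluded
every_other = lst[::2]          # step of 2
backwards = lst[::-1]           # negative step walks the list in reverse

work = lst[:]                   # shallow copy, lst itself is not modified
work[1:4] = [15, 25]            # a slice can be replaced by a shorter sequence
del work[0]                     # remove one item on a mutable sequence

print(first, last, middle, every_other, backwards, work)
```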
Boolean Logic Statements Blocks Modules/Names Imports

module truc⇔file truc.py
Comparators: < > <= >= == != from monmod import nom1,nom2 as fct
(boolean results) ≤ ≥ = ≠ parent statement: →direct access to names, renaming with as
a and b logical and both simultaneously
statement block 1… import monmod →access via monmod.nom1 …
indentation !
⁝ ☝ modules and packages searched in python path (cf sys.path)

a or b logical or one or other parent statement: statement block executed only Conditional Statement
or both statement block2…
☝ pitfall : and and or return value of a or if a condition is true
⁝ yes no yes
of b (under shortcut evaluation). if logical condition: ? ?
⇒ ensure that a and b are booleans. no
not a logical not
next statement after block 1 statements block
True Can go with several elif, elif... and only
True and False constants ☝ configure editor to insert 4 spaces in if age<=18:
False one final else. Only the block of first true
place of an indentation tab. state="Kid"
condition is executed. elif age>65:
☝ floating numbers… approximated values Maths
angles in radians ☝ with a boolean var x: state="Retired"
if x==True: ⇔ if x: else:
Operators: + - * / // % ** from math import sin,pi… state="Active"
ab if x==False: ⇔ if not x:
Priority (…) × ÷ sin(pi/4)→0.707…
integer ÷ ÷ remainder cos(2*pi/3)→-0.4999… Exceptions on Errors
Signaling an error:
@ → matrix × python3.5+numpy sqrt(81)→9.0 √ raise Exception(…) Errors processing:
(1+5.3)*2→12.6 log(e**2)→2.0 error try:
abs(-3.2)→3.2 ceil(12.5)→13 normal processing normal processing block
raise error
round(3.57,1)→3.6 floor(12.5)→12 processing processing raise
except Exception as e:
pow(4,3)→64.0 modules math, statistics, random, error processing block
☝ usual priorities decimal, fractions, numpy, etc. (cf. doc) ☝ finally block for final processing in all cases.
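The try / except / finally flow described above can be sketched as follows. `safe_divide` and its `log` list are hypothetical names used only to make the order of execution visible.

```python
def safe_divide(a, b):
    """Hypothetical helper: divide a by b and record which blocks ran."""
    log = []
    try:
        log.append("normal processing")
        if b == 0:
            # signaling an error
            raise ValueError("division by zero requested")
        result = a / b
    except ValueError as e:
        # error processing block
        log.append("error processing: " + str(e))
        result = None
    finally:
        # runs in all cases, with or without an error
        log.append("final processing")
    return result, log


print(safe_divide(6, 3))
print(safe_divide(1, 0))
```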
statements block executed as long as Conditional Loop Statement statements block executed for each Iterative Loop Statement
condition is true item of a container or iterator
☝ beware of infinite loops!
yes next
while logical condition: ? Loop Control for var in sequence: …
no break immediate exit finish
statements block statements block
continue next iteration
s = 0 initializations before the loop ☝ else block for normal loop exit. Go over sequence's values
i = 1 condition with at least one variable value (here i) s = "Some text" initializations before the loop
Algo: cnt = 0
☝ good habit : don't modify loop variable

while i <= 100: loop variable, assignment managed by for statement
s = s + i**2
i = i + 1 ☝ make condition variable change ! (computes s = ∑ i² for i = 1 to 100) for c in s:
if c == "e": Algo: count
print("sum:",s) cnt = cnt + 1 number of e
print("found",cnt,"'e'") in the string.
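The two loop algorithms above (sum of squares with `while`, counting a character with `for`) can be written out as a runnable sketch. The name `text` replaces the sheet's second use of `s`, so the two examples do not overwrite each other.

```python
# Conditional loop: sum of i**2 for i from 1 to 100.
s = 0
i = 1
while i <= 100:
    s = s + i**2
    i = i + 1          # the condition variable must change, else infinite loop
print("sum:", s)

# Iterative loop: count the number of "e" in a string.
text = "Some text"
cnt = 0
for c in text:
    if c == "e":
        cnt = cnt + 1
print("found", cnt, "'e'")
```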
print("v=",3,"cm :",x,",",y+4) Display loop on dict/set ⇔ loop on keys sequences
use slices to loop on a subset of a sequence
items to display : literal values, variables, expressions Go over sequence's index

print options: ◽ modify item at index
◽ sep=" " items separator, default space ◽ access items around index (before / after)
lst = [11,18,9,12,23,4,17]
◽ end="\n" end of print, default new line lost = []
◽ file=f print to file, default standard output for idx in range(len(lst)): Algo: limit values greater
val = lst[idx] than 15, memorizing
s = input("Instructions:") Input if val> 15: of lost values.
☝ input always returns a string, convert it to required type lost.append(val)
(cf. boxed Conversions on the other side). lst[idx] = 15
print("modif:",lst,"-lost:",lost)
len(c)→ items count Generic Operations on Containers Go simultaneously on sequence's index and values:
min(c) max(c) sum(c) Note: For dictionaries and sets, these for idx,val in enumerate(lst):
sorted(c)→ list sorted copy operations use keys.
val in c → boolean, membership operator in (absence not in) range([start,] end [,step]) Integers Sequences
enumerate(c)→ iterator on (index, value) ☝ start default 0, end not included in sequence, step signed, default 1
zip(c1,c2…)→ iterator on tuples containing ci items at same index
range(5)→ 0 1 2 3 4 range(2,12,3)→ 2 5 8 11
all(c)→ True if all c items evaluated to true, else False range(3,8)→ 3 4 5 6 7 range(20,5,-5)→ 20 15 10
any(c)→ True if at least one item of c evaluated true, else False range(len(seq))→ sequence of index of values in seq
Specific to ordered sequences containers (lists, tuples, strings, bytes…) ☝ range provides an immutable sequence of int constructed as needed
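As an illustration of `range`, `enumerate` and `zip`, here is a sketch that redoes the "limit values greater than 15" algorithm with `enumerate` instead of `range(len(lst))`. The `pairs` variable is an extra example that is not on the sheet.

```python
# range produces integers lazily; list() makes them visible.
r1 = list(range(5))            # start defaults to 0, end not included
r2 = list(range(2, 12, 3))
r3 = list(range(20, 5, -5))    # negative step counts down

# enumerate gives index and value together.
lst = [11, 18, 9, 12, 23, 4, 17]
lost = []
for idx, val in enumerate(lst):
    if val > 15:
        lost.append(val)       # remember the value that gets clamped
        lst[idx] = 15          # replacing an item does not change the length
print("modif:", lst, "-lost:", lost)

# zip pairs items found at the same index.
pairs = list(zip("ab", [1, 2]))
```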
reversed(c)→ reversed iterator c*5→ duplicate c+c2→ concatenate
c.index(val)→ position c.count(val)→ occurrences count function name (identifier) Function Definition
import copy named parameters
copy.copy(c)→ shallow copy of container def fct(x,y,z):
copy.deepcopy(c)→ deep copy of container fct
"""documentation"""
☝ modify original list Operations on Lists # statements block, res computation, etc.
lst.append(val) add item at end return res result value of the call, if no computed
result to return: return None
lst.extend(seq) add sequence of items at end ☝ parameters and all
lst.insert(idx,val) insert item at index variables of this block exist only in the block and during the function
lst.remove(val) remove first item with value val call (think of a “black box”)
lst.pop([idx])→value remove & return item at index idx (default last) Advanced: def fct(x,y,z,*args,a=3,b=5,**kwargs):
lst.sort() lst.reverse() sort / reverse list in place *args variable positional arguments (→tuple), default values,
**kwargs variable named arguments (→dict)
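A small sketch of the advanced signature above, showing where each argument ends up. The dictionary returned by `fct`, and the names `seq` and `opts`, are illustrative choices that are not part of the sheet.

```python
def fct(x, y, z, *args, a=3, b=5, **kwargs):
    """Return a summary of how the call arguments were bound."""
    return {
        "positional": (x, y, z),
        "args": args,        # extra positional arguments, as a tuple
        "a": a,
        "b": b,
        "kwargs": kwargs,    # extra named arguments, as a dict
    }


res = fct(1, 2, 3, 4, 5, b=7, color="red")

# A call can also unpack a sequence and a dict.
seq = [1, 2, 3]
opts = {"a": 10}
res2 = fct(*seq, **opts)

print(res)
print(res2)
```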
Operations on Dictionaries Operations on Sets
d[key]=value d.clear() Operators: r = fct(3,i+2,2*i) Function Call
d[key]→ value del d[key] | → union (vertical bar char) storage/use of one argument per
& → intersection returned value parameter
d.update(d2) update/add
associations - ^ → difference / symmetric diff. ☝ this is the use of function Advanced: fct() fct
d.keys()
< <= > >= → inclusion relations
d.values() →iterable views on name with parenthesis *sequence
Operators also exist as methods. which does the call **dict
d.items() keys/values/associations
d.pop(key[,default])→ value s.update(s2) s.copy()
d.popitem()→ (key,value) s.add(key) s.remove(key) s.startswith(prefix[,start[,end]]) Operations on Strings
d.get(key[,default])→ value s.discard(key) s.clear() s.endswith(suffix[,start[,end]]) s.strip([chars])
d.setdefault(key[,default])→value s.pop() s.count(sub[,start[,end]]) s.partition(sep)→ (before,sep,after)
s.index(sub[,start[,end]]) s.find(sub[,start[,end]])
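The dictionary and set operations above, in a short sketch. All keys and values are made-up sample data.

```python
# Dictionaries
d = {"one": 1}
d["two"] = 2                      # add or replace an association
missing = d.get("three", 0)       # default returned, d is not modified
d.setdefault("three", 3)          # inserts the default because the key is absent
popped = d.pop("one")             # remove the key and return its value
d.update({"four": 4})

# Sets
s1 = {1, 2, 3}
s2 = {3, 4}
union = s1 | s2
intersection = s1 & s2
difference = s1 - s2
symmetric = s1 ^ s2
is_subset = {1, 2} <= s1          # inclusion relation

print(d, union, intersection, difference, symmetric, is_subset)
```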
storing data on disk, and reading it back Files s.is…() tests on chars categories (ex. s.isalpha())
f = open("fil.txt","w",encoding="utf8") s.upper() s.lower() s.title() s.swapcase()
s.casefold() s.capitalize() s.center([width,fill])
file variable name of file
opening mode encoding of s.ljust([width,fill]) s.rjust([width,fill]) s.zfill([width])
for operations ◽ 'r' read
on disk chars for text s.encode(encoding) s.split([sep]) s.join(seq)
◽ 'w' write
(+path…) files:
◽ 'a' append utf8 ascii formatting directives values to format Formatting
cf. modules os, os.path and pathlib ◽ …'+' 'x' 'b' 't' latin1 …
☝ text mode t by default (read/write str), possible binary mode b (read/write bytes) "model{} {} {}".format(x,y,r) str
"{selection:formatting!conversion}"
writing empty string if end of file reading
f.write("coucou") s = f.read(4) if char count not ◽ Selection : "{:+2.3f}".format(45.72793)
2 →'+45.728'
Examples
specified, read
☝ if text file → read / write only read next line nom "{1:>10s}".format(8,"toto")
whole file 0.nom
strings, convert from/to required →' toto'
type s = f.readline() 4[key] "{x!r}".format(x="I'm")
0[2]
f.close() ☝ don't forget to close the file after use ! →'"I\'m"'
◽ Formatting :
f.flush() write cache f.truncate([size]) resize fill char alignment sign mini width.precision~maxwidth type
reading/writing progress sequentially in the file, modifiable with:
<>^= + - space 0 at start for filling with 0
f.tell()→position f.seek(position[,origin]) integer: b binary, c char, d decimal (default), o octal, x or X hexa…
Very common: opening with a guarded block with open(…) as f: float: e or E exponential, f or F fixed point, g or G appropriate (default),
(automatic closing) and reading loop on lines for line in f : string: s … % percent
of a text file: # processing of line ◽ Conversion : s (readable text) or r (literal representation)
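A sketch that combines the guarded `with open(…)` block and the three `format` examples from the sheet. The temporary directory and the second line written to the file are assumptions made so the example can run anywhere.

```python
import os
import tempfile

# Write then read back a small text file; the with block closes it automatically.
path = os.path.join(tempfile.mkdtemp(), "fil.txt")
with open(path, "w", encoding="utf8") as f:
    f.write("coucou\n")
    f.write("second line\n")

with open(path, encoding="utf8") as f:
    lines = [line.rstrip("\n") for line in f]   # reading loop on lines

# Formatting directives: sign, width.precision, alignment, conversion.
signed = "{:+2.3f}".format(45.72793)
padded = "{1:>10s}".format(8, "toto")    # selects argument 1, right-aligned in 10 chars
quoted = "{x!r}".format(x="I'm")         # !r uses the literal representation

print(lines, signed, padded, quoted)
```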
Python For Data Science Cheat Sheet Inspecting Your Array Subsetting, Slicing, Indexing Also see Lists
>>> a.shape Array dimensions Subsetting
NumPy Basics
>>> len(a) Length of array
>>> b.ndim Number of array dimensions
>>> a[2]
3
1 2 3 Select the element at the 2nd index
Learn Python for Data Science Interactively at www.DataCamp.com >>> e.size Number of array elements >>> b[1,2] 1.5 2 3 Select the element at row 1 column 2
>>> b.dtype Data type of array elements 6.0 4 5 6 (equivalent to b[1][2])
>>> b.dtype.name Name of data type
>>> b.astype(int) Convert an array to a different type Slicing
NumPy >>> a[0:2]
array([1, 2])
1 2 3 Select items at index 0 and 1
2
The NumPy library is the core library for scientific computing in Asking For Help >>> b[0:2,1] 1.5 2 3 Select items at rows 0 and 1 in column 1
>>> np.info(np.ndarray.dtype) array([ 2., 5.]) 4 5 6
Python. It provides a high-performance multidimensional array
Array Mathematics
1.5 2 3
>>> b[:1] Select all items at row 0
object, and tools for working with these arrays. array([[1.5, 2., 3.]]) 4 5 6 (equivalent to b[0:1, :])
Arithmetic Operations >>> c[1,...] Same as [1,:,:]
Use the following import convention: array([[[ 3., 2., 1.],
>>> import numpy as np [ 4., 5., 6.]]])
>>> g = a - b Subtraction
array([[-0.5, 0. , 0. ], >>> a[ : :-1] Reversed array a
NumPy Arrays [-3. , -3. , -3. ]])
array([3, 2, 1])
>>> np.subtract(a,b) Boolean Indexing

1D array 2D array 3D array Subtraction
>>> a[a<2] Select elements from a less than 2
>>> b + a Addition 1 2 3
array([[ 2.5, 4. , 6. ], array([1])
axis 1 axis 2
1 2 3 axis 1 [ 5. , 7. , 9. ]]) Fancy Indexing
1.5 2 3 >>> np.add(b,a) Addition >>> b[[1, 0, 1, 0],[0, 1, 2, 0]] Select elements (1,0),(0,1),(1,2) and (0,0)
axis 0 axis 0 array([ 4. , 2. , 6. , 1.5])
4 5 6 >>> a / b Division
array([[ 0.66666667, 1. , 1. ], >>> b[[1, 0, 1, 0]][:,[0,1,2,0]] Select a subset of the matrix’s rows
[ 0.25 , 0.4 , 0.5 ]]) array([[ 4. ,5. , 6. , 4. ], and columns
>>> np.divide(a,b) Division [ 1.5, 2. , 3. , 1.5],
[ 4. , 5. , 6. , 4. ],
[ 1.5, 2. , 3. , 1.5]])
>>> a * b Multiplication
array([[ 1.5, 4. , 9. ],
[ 4. , 10. , 18. ]])
Creating Arrays
>>> a = np.array([1,2,3])

>>> b = np.array([(1.5,2,3), (4,5,6)], dtype = float) >>> np.multiply(a,b) Multiplication Array Manipulation
>>> c = np.array([[(1.5,2,3), (4,5,6)], [(3,2,1), (4,5,6)]], >>> np.exp(b) Exponentiation
dtype = float) >>> np.sqrt(b) Square root Transposing Array
>>> np.sin(a) Print sines of an array >>> i = np.transpose(b) Permute array dimensions
Initial Placeholders >>> np.cos(b) Element-wise cosine >>> i.T Permute array dimensions
>>> np.log(a) Element-wise natural logarithm
>>> np.zeros((3,4)) Create an array of zeros >>> e.dot(f) Dot product
Changing Array Shape
>>> np.ones((2,3,4),dtype=np.int16) Create an array of ones array([[ 7., 7.], >>> b.ravel() Flatten the array
>>> d = np.arange(10,25,5) Create an array of evenly [ 7., 7.]]) >>> g.reshape(3,-1) Reshape, but don’t change data
spaced values (step value)
>>> np.linspace(0,2,9) Create an array of evenly Comparison Adding/Removing Elements
spaced values (number of samples) >>> h.resize((2,6)) Resize the array in place to shape (2,6)
>>> e = np.full((2,2),7) Create a constant array >>> a == b Element-wise comparison >>> np.append(h,g) Append items to an array
>>> f = np.eye(2) Create a 2X2 identity matrix array([[False, True, True], >>> np.insert(a, 1, 5) Insert items in an array
>>> np.random.random((2,2)) Create an array with random values [False, False, False]], dtype=bool) >>> np.delete(a,[1]) Delete items from an array
>>> np.empty((3,2)) Create an empty array >>> a < 2 Element-wise comparison
array([True, False, False], dtype=bool) Combining Arrays
>>> np.array_equal(a, b) Array-wise comparison >>> np.concatenate((a,d),axis=0) Concatenate arrays
I/O array([ 1, 2, 3, 10, 15, 20])
>>> np.vstack((a,b)) Stack arrays vertically (row-wise)
Aggregate Functions array([[ 1. , 2. , 3. ],
Saving & Loading On Disk [ 1.5, 2. , 3. ],
>>> a.sum() Array-wise sum [ 4. , 5. , 6. ]])
>>> np.save('my_array', a) >>> a.min() Array-wise minimum value >>> np.r_[e,f] Stack arrays vertically (row-wise)
>>> np.savez('array.npz', a, b) >>> b.max(axis=0) Maximum value along axis 0 (one per column) >>> np.hstack((e,f)) Stack arrays horizontally (column-wise)
>>> np.load('my_array.npy') >>> b.cumsum(axis=1) Cumulative sum of the elements array([[ 7., 7., 1., 0.],
>>> a.mean() Mean [ 7., 7., 0., 1.]])
Saving & Loading Text Files >>> np.median(b) Median >>> np.column_stack((a,d)) Create stacked column-wise arrays
>>> np.loadtxt("myfile.txt") >>> np.corrcoef(a) Correlation coefficient array([[ 1, 10],
>>> np.std(b) Standard deviation [ 2, 15],
>>> np.genfromtxt("my_file.csv", delimiter=',') [ 3, 20]])
>>> np.savetxt("myarray.txt", a, delimiter=" ") >>> np.c_[a,d] Create stacked column-wise arrays
Copying Arrays Splitting Arrays
Data Types >>> h = a.view() Create a view of the array with the same data >>> np.hsplit(a,3) Split the array horizontally at the 3rd
>>> np.copy(a) Create a copy of the array [array([1]),array([2]),array([3])] index
>>> np.int64 Signed 64-bit integer types >>> np.vsplit(c,2) Split the array vertically at the 2nd index
>>> np.float32 Standard single-precision floating point >>> h = a.copy() Create a deep copy of the array [array([[[ 1.5, 2. , 3. ],
>>> np.complex128 Complex numbers represented by two 64-bit floats [ 4. , 5. , 6. ]]]),
array([[[ 3., 2., 1.],
[ 4., 5., 6.]]])]
>>> np.bool_ Boolean type storing TRUE and FALSE values
>>> np.object_ Python object type
Sorting Arrays
>>> np.string_ Fixed-length string type >>> a.sort() Sort an array

>>> np.unicode_ Fixed-length unicode type >>> c.sort(axis=0) Sort the elements of an array's axis DataCamp
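A brief sketch of a few operations from this sheet, assuming NumPy is installed. It reuses the sheet's `a` and `b`; the other names are illustrative. Note that the median is a module-level function (`np.median`) rather than an array method.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

diff = a - b                     # a is broadcast over each row of b
col_max = b.max(axis=0)          # one maximum per column
row_cumsum = b.cumsum(axis=1)    # cumulative sum along each row
med = np.median(b)               # median of all elements
reshaped = b.reshape(3, -1)      # -1 lets NumPy infer the remaining dimension
mask = a < 2                     # element-wise comparison gives a boolean array
selected = a[mask]               # boolean indexing

print(diff, col_max, row_cumsum, med, reshaped.shape, selected)
```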
Python For Data Science Cheat Sheet Linear Algebra Also see NumPy
You’ll use the linalg and sparse modules. Note that scipy.linalg contains and expands on numpy.linalg.
SciPy - Linear Algebra >>> from scipy import linalg, sparse Matrix Functions
Creating Matrices Addition
>>> np.add(A,D) Addition
>>> A = np.matrix(np.random.random((2,2)))
>>> B = np.asmatrix(b) Subtraction
SciPy >>> C = np.mat(np.random.random((10,5))) >>> np.subtract(A,D) Subtraction
The SciPy library is one of the core packages for >>> D = np.mat([[3,4], [5,6]]) Division
>>> np.divide(A,D) Division
scientific computing that provides mathematical Basic Matrix Routines Multiplication
algorithms and convenience functions built on the >>> A @ D Multiplication operator
Inverse
NumPy extension of Python. >>> A.I Inverse (Python 3)
>>> np.multiply(D,A) Multiplication
>>> linalg.inv(A) Inverse
>>> np.dot(A,D) Dot product
Interacting With NumPy Also see NumPy Transposition >>> np.vdot(A,D) Vector dot product
>>> import numpy as np >>> A.T Transpose matrix >>> np.inner(A,D) Inner product
>>> A.H Conjugate transposition >>> np.outer(A,D) Outer product
>>> a = np.array([1,2,3])
>>> b = np.array([(1+5j,2j,3j), (4j,5j,6j)]) Trace >>> np.tensordot(A,D) Tensor dot product
>>> c = np.array([[(1.5,2,3), (4,5,6)], [(3,2,1), (4,5,6)]]) >>> np.trace(A) Trace >>> np.kron(A,D) Kronecker product
Norm Exponential Functions
Index Tricks >>> linalg.expm(A) Matrix exponential
>>> linalg.norm(A) Frobenius norm
>>> np.mgrid[0:5,0:5] Create a dense meshgrid >>> linalg.expm2(A) Matrix exponential (Taylor Series)
>>> linalg.norm(A,1) L1 norm (max column sum)
>>> np.ogrid[0:2,0:2] Create an open meshgrid >>> linalg.expm3(D) Matrix exponential (eigenvalue
>>> linalg.norm(A,np.inf) L inf norm (max row sum) decomposition)
>>> np.r_[3,[0]*5,-1:1:10j] Stack arrays vertically (row-wise)
>>> np.c_[b,c] Create stacked column-wise arrays Rank Logarithm Function
>>> np.linalg.matrix_rank(C) Matrix rank >>> linalg.logm(A) Matrix logarithm
Shape Manipulation Determinant Trigonometric Functions
>>> linalg.det(A) Determinant >>> linalg.sinm(D) Matrix sine
>>> np.transpose(b) Permute array dimensions
Solving linear problems >>> linalg.cosm(D) Matrix cosine
>>> b.flatten() Flatten the array >>> linalg.tanm(A) Matrix tangent
>>> np.hstack((b,c)) Stack arrays horizontally (column-wise) >>> linalg.solve(A,b) Solver for dense matrices
>>> np.vstack((a,b)) Stack arrays vertically (row-wise) >>> E = np.mat(a).T Solver for dense matrices Hyperbolic Trigonometric Functions
>>> np.hsplit(c,2) Split the array horizontally at the 2nd index >>> linalg.lstsq(F,E) Least-squares solution to linear matrix >>> linalg.sinhm(D) Hyperbolic matrix sine
>>> np.vsplit(d,2) Split the array vertically at the 2nd index equation >>> linalg.coshm(D) Hyperbolic matrix cosine
Generalized inverse >>> linalg.tanhm(A) Hyperbolic matrix tangent
Polynomials >>> linalg.pinv(C) Compute the pseudo-inverse of a matrix Matrix Sign Function
(least-squares solver) >>> linalg.signm(A) Matrix sign function
>>> from numpy import poly1d
>>> p = poly1d([3,4,5]) Create a polynomial object >>> linalg.pinv2(C) Compute the pseudo-inverse of a matrix Matrix Square Root
(SVD) >>> linalg.sqrtm(A) Matrix square root
Vectorizing Functions Creating Sparse Matrices Arbitrary Functions
>>> def myfunc(a):
...     if a < 0:
...         return a*2
...     else:
...         return a/2
>>> np.vectorize(myfunc) Vectorize functions
>>> linalg.funm(A, lambda x: x*x) Evaluate matrix function
>>> F = np.eye(3, k=1) Create a 3x3 matrix with ones on the first superdiagonal
>>> G = np.mat(np.identity(2)) Create a 2x2 identity matrix
Decompositions
>>> C[C > 0.5] = 0
>>> H = sparse.csr_matrix(C) Compressed Sparse Row matrix Eigenvalues and Eigenvectors
>>> I = sparse.csc_matrix(D) Compressed Sparse Column matrix >>> la, v = linalg.eig(A) Solve ordinary or generalized
>>> J = sparse.dok_matrix(A) Dictionary Of Keys matrix eigenvalue problem for square matrix
Type Handling >>> E.todense() Sparse matrix to full matrix >>> l1, l2 = la Unpack eigenvalues
>>> sparse.isspmatrix_csc(A) Identify sparse matrix >>> v[:,0] First eigenvector
>>> np.real(b) Return the real part of the array elements >>> v[:,1] Second eigenvector
>>> np.imag(b) Return the imaginary part of the array elements
>>> np.real_if_close(c,tol=1000) Return a real array if complex parts close to 0 Sparse Matrix Routines >>> linalg.eigvals(A) Unpack eigenvalues
>>> np.cast['f'](np.pi) Cast object to a data type Singular Value Decomposition
Inverse >>> U,s,Vh = linalg.svd(B) Singular Value Decomposition (SVD)
>>> sparse.linalg.inv(I) Inverse >>> M,N = B.shape
Other Useful Functions
Norm >>> Sig = linalg.diagsvd(s,M,N) Construct sigma matrix in SVD
>>> np.angle(b,deg=True) Return the angle of the complex argument >>> sparse.linalg.norm(I) Norm LU Decomposition
>>> g = np.linspace(0,np.pi,num=5) Create an array of evenly spaced values Solving linear problems >>> P,L,U = linalg.lu(C) LU Decomposition
(number of samples)
>>> g [3:] += np.pi >>> sparse.linalg.spsolve(H,I) Solver for sparse matrices
>>> np.unwrap(g) Unwrap Sparse Matrix Decompositions
>>> np.logspace(0,10,3) Create an array of evenly spaced values (log scale)
>>> np.select([c<4],[c*2]) Return values from a list of arrays depending on conditions
Sparse Matrix Functions >>> la, v = sparse.linalg.eigs(F,1) Eigenvalues and eigenvectors
>>> sparse.linalg.expm(I) Sparse matrix exponential >>> sparse.linalg.svds(H, 2) SVD
>>> misc.factorial(a) Factorial
>>> misc.comb(10,3,exact=True) Combine N things taken at k time
>>> misc.central_diff_weights(3) Weights for Np-point central derivative Asking For Help DataCamp
>>> misc.derivative(myfunc,1.0) Find the n-th derivative of a function at a point >>> help(scipy.linalg.diagsvd)
>>> np.info(np.matrix) Learn Python for Data Science Interactively
Python For Data Science Cheat Sheet Plot Anatomy & Workflow
Plot Anatomy Workflow
Matplotlib Axes/Subplot The basic steps to creating plots with matplotlib are:
Learn Python Interactively at www.DataCamp.com 1 Prepare data 2 Create plot 3 Plot 4 Customize plot 5 Save plot 6 Show plot
>>> import matplotlib.pyplot as plt
>>> x = [1,2,3,4] Step 1
>>> y = [10,20,25,30]
>>> fig = plt.figure() Step 2
Matplotlib Y-axis Figure >>> ax = fig.add_subplot(111) Step 3
>>> ax.plot(x, y, color='lightblue', linewidth=3) Step 3, 4
Matplotlib is a Python 2D plotting library which produces >>> ax.scatter([2,4,6],
publication-quality figures in a variety of hardcopy formats [5,15,25],
color='darkgreen',
and interactive environments across marker='^')
platforms. >>> ax.set_xlim(1, 6.5)
X-axis
>>> plt.savefig('foo.png')
1 Prepare The Data Also see Lists & NumPy

>>> plt.show() Step 6
1D Data 4 Customize Plot

>>> import numpy as np Colors, Color Bars & Color Maps Mathtext
>>> x = np.linspace(0, 10, 100)
>>> y = np.cos(x) >>> plt.plot(x, x, x, x**2, x, x**3) >>> plt.title(r'$\sigma_i=15$', fontsize=20)
>>> z = np.sin(x) >>> ax.plot(x, y, alpha = 0.4)
>>> ax.plot(x, y, c='k') Limits, Legends & Layouts
2D Data or Images >>> fig.colorbar(im, orientation='horizontal')
>>> im = ax.imshow(img, Limits & Autoscaling
>>> data = 2 * np.random.random((10, 10)) cmap='seismic')
>>> data2 = 3 * np.random.random((10, 10)) >>> ax.margins(x=0.0,y=0.1) Add padding to a plot
>>> Y, X = np.mgrid[-3:3:100j, -3:3:100j] >>> ax.axis('equal') Set the aspect ratio of the plot to 1
Markers >>> ax.set(xlim=[0,10.5],ylim=[-1.5,1.5]) Set limits for x-and y-axis
>>> U = -1 - X**2 + Y
>>> V = 1 + X - Y**2 >>> fig, ax = plt.subplots() >>> ax.set_xlim(0,10.5) Set limits for x-axis
>>> from matplotlib.cbook import get_sample_data >>> ax.scatter(x,y,marker=".") Legends
>>> img = np.load(get_sample_data('axes_grid/bivariate_normal.npy')) >>> ax.plot(x,y,marker="o") >>> ax.set(title='An Example Axes', Set a title and x-and y-axis labels
ylabel='Y-Axis',
Linestyles xlabel='X-Axis')
2 Create Plot >>> plt.plot(x,y,linewidth=4.0)
>>> plt.plot(x,y,ls='solid')
>>> ax.legend(loc='best') No overlapping plot elements
Ticks
>>> import matplotlib.pyplot as plt >>> ax.xaxis.set(ticks=range(1,5), Manually set x-ticks

>>> plt.plot(x,y,ls='--') ticklabels=[3,100,-12,"foo"])
Figure >>> plt.plot(x,y,'--',x**2,y**2,'-.') >>> ax.tick_params(axis='y', Make y-ticks longer and go in and out
>>> plt.setp(lines,color='r',linewidth=4.0) direction='inout',
>>> fig = plt.figure() length=10)
>>> fig2 = plt.figure(figsize=plt.figaspect(2.0)) Text & Annotations
Subplot Spacing
Axes >>> ax.text(1, >>> fig3.subplots_adjust(wspace=0.5, Adjust the spacing between subplots
-2.1, hspace=0.3,
All plotting is done with respect to an Axes. In most cases, a 'Example Graph', left=0.125,
style='italic') right=0.9,
subplot will fit your needs. A subplot is an axes on a grid system. >>> ax.annotate("Sine", top=0.9,
>>> fig.add_axes() xy=(8, 0), bottom=0.1)
>>> ax1 = fig.add_subplot(221) # row-col-num xycoords='data', >>> fig.tight_layout() Fit subplot(s) in to the figure area
xytext=(10.5, 0),
>>> ax3 = fig.add_subplot(212) textcoords='data', Axis Spines
>>> fig3, axes = plt.subplots(nrows=2,ncols=2) arrowprops=dict(arrowstyle="->", >>> ax1.spines['top'].set_visible(False) Make the top axis line for a plot invisible
>>> fig4, axes2 = plt.subplots(ncols=3) connectionstyle="arc3"),) >>> ax1.spines['bottom'].set_position(('outward',10)) Move the bottom axis line outward
3 Plotting Routines 5 Save Plot

Save figures
1D Data Vector Fields >>> plt.savefig('foo.png')
>>> lines = ax.plot(x,y) Draw points with lines or markers connecting them >>> axes[0,1].arrow(0,0,0.5,0.5) Add an arrow to the axes Save transparent figures
>>> ax.scatter(x,y) Draw unconnected points, scaled or colored >>> axes[1,1].quiver(y,z) Plot a 2D field of arrows >>> plt.savefig('foo.png', transparent=True)
>>> axes[0,0].bar([1,2,3],[3,4,5]) Plot vertical rectangles (constant width) >>> axes[0,1].streamplot(X,Y,U,V) Plot 2D vector fields
>>> axes[1,0].barh([0.5,1,2.5],[0,1,2]) Plot horizontal rectangles (constant height)
>>> axes[1,1].axhline(0.45) Draw a horizontal line across axes Data Distributions 6 Show Plot
>>> axes[0,1].axvline(0.65) Draw a vertical line across axes >>> ax1.hist(y) Plot a histogram
>>> ax.fill(x,y,color='blue') Draw filled polygons >>> ax3.boxplot(y) Make a box and whisker plot >>> plt.show()
>>> ax.fill_between(x,y,color='yellow') Fill between y-values and 0 >>> ax3.violinplot(z) Make a violin plot
2D Data or Images Close & Clear
>>> fig, ax = plt.subplots() >>> plt.cla() Clear an axis
>>> axes2[0].pcolor(data2) Pseudocolor plot of 2D array >>> plt.clf() Clear the entire figure
>>> im = ax.imshow(img, Colormapped or RGB arrays >>> axes2[0].pcolormesh(data) Pseudocolor plot of 2D array
cmap='gist_earth', >>> plt.close() Close a window
interpolation='nearest', >>> CS = plt.contour(Y,X,U) Plot contours
vmin=-2, >>> axes2[2].contourf(data) Plot filled contours
vmax=2) >>> axes2[2]= ax.clabel(CS) Label a contour plot DataCamp
Python For Data Science Cheat Sheet Asking For Help Dropping
>>> help(pd.Series.loc)
>>> s.drop(['a', 'c']) Drop values from rows (axis=0)
Pandas Basics Selection Also see NumPy Arrays >>> df.drop('Country', axis=1) Drop values from columns(axis=1)
>>> s['b'] Get one element Sort & Rank

-5
Pandas >>> df.sort_index() Sort by labels along an axis
>>> df.sort_values(by='Country') Sort by the values along an axis
>>> df[1:] Get subset of a DataFrame
The Pandas library is built on NumPy and provides easy-to-use Country Capital Population >>> df.rank() Assign ranks to entries
data structures and data analysis tools for the Python 1 India New Delhi 1303171035
2 Brazil Brasília 207847528
programming language. Retrieving Series/DataFrame Information
Basic Information
Use the following import convention: By Position >>> df.shape (rows,columns)
>>> import pandas as pd >>> df.iloc[[0],[0]] Select single value by row & >>> df.index Describe index
'Belgium' column >>> df.columns Describe DataFrame columns
Pandas Data Structures >>> df.iat[0, 0]
'Belgium'
>>> df.info() Info on DataFrame
>>> df.count() Number of non-NA values
Series
Summary
A one-dimensional labeled array a 3 By Label
>>> df.loc[[0], ['Country']] Select single value by row & >>> df.sum() Sum of values
capable of holding any data type b -5 'Belgium' column labels >>> df.cumsum() Cumulative sum of values
>>> df.min()/df.max() Minimum/maximum values
Index c 7 >>> df.at[0, 'Country'] >>> df.idxmin()/df.idxmax() Minimum/Maximum index value
d 4 'Belgium' >>> df.describe() Summary statistics
>>> df.mean() Mean of values
>>> s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])
By Label/Position >>> df.median() Median of values
>>> df.ix[2] Select single row of subset of rows
Country Brazil
Capital Brasília
Population 207847528
DataFrame
Applying Functions
>>> f = lambda x: x*2
Columns
Country Capital Population A two-dimensional labeled >>> df.ix[:,'Capital'] Select a single column of >>> df.apply(f) Apply function
>>> df.applymap(f) Apply function element-wise
data structure with columns 0 Brussels subset of columns
0 Belgium Brussels 11190846 1 New Delhi
of potentially different types 2 Brasília Data Alignment
1 India New Delhi 1303171035
Index >>> df.ix[1,'Capital'] Select rows and columns
2 Brazil Brasília 207847528 Internal Data Alignment
'New Delhi'
NA values are introduced in the indices that don’t overlap:
Boolean Indexing
>>> data = {'Country': ['Belgium', 'India', 'Brazil'], >>> s3 = pd.Series([7, -2, 3], index=['a', 'c', 'd'])
>>> s[~(s > 1)] Series s where value is not >1
'Capital': ['Brussels', 'New Delhi', 'Brasília'], >>> s[(s < -1) | (s > 2)] s where value is <-1 or >2 >>> s + s3
'Population': [11190846, 1303171035, 207847528]} >>> df[df['Population']>1200000000] Use filter to adjust DataFrame a 10.0
b NaN
>>> df = pd.DataFrame(data,
c 5.0
columns=['Country', 'Capital', 'Population']) >>> s['a'] = 6 Set index a of Series s to 6
d 7.0
Arithmetic Operations with Fill Methods
You can also do the internal data alignment yourself with the help of the fill methods:
>>> s.add(s3, fill_value=0)
a    10.0
b    -5.0
c     5.0
d     7.0

I/O

Read and Write to CSV
>>> pd.read_csv( , header=None, nrows=5)
>>> df.to_csv('myDataFrame.csv')

Read and Write to Excel
>>> pd.read_excel( )
>>> df.to_excel('dir/myDataFrame.xlsx', sheet_name='Sheet1')
Read multiple sheets from the same file
>>> xlsx = pd.ExcelFile( )
>>> df = pd.read_excel(xlsx, 'Sheet1')

Read and Write to SQL Query or Database Table
>>> from sqlalchemy import create_engine
>>> engine = create_engine('sqlite:///:memory:')
>>> pd.read_sql("SELECT * FROM my_table;", engine)
>>> pd.read_sql_table('my_table', engine)
>>> pd.read_sql_query("SELECT * FROM my_table;", engine)
read_sql() is a convenience wrapper around read_sql_table() and read_sql_query()
>>> df.to_sql('myDf', engine)
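The CSV calls can be tried without touching the file system. The sketch below assumes an in-memory `io.StringIO` buffer in place of a real file path; `buffer`, `restored` and `first_two` are illustrative names.

```python
import io
import pandas as pd

df = pd.DataFrame({'Country': ['Belgium', 'India', 'Brazil'],
                   'Capital': ['Brussels', 'New Delhi', 'Brasília'],
                   'Population': [11190846, 1303171035, 207847528]})

buffer = io.StringIO()
df.to_csv(buffer, index=False)    # index=False keeps the row labels out of the file

# Read everything back, then read only the first two data rows with nrows.
restored = pd.read_csv(io.StringIO(buffer.getvalue()))
first_two = pd.read_csv(io.StringIO(buffer.getvalue()), nrows=2)

print(restored)
print(first_two)
```

Writing without `index=False` would add an unnamed first column on the way back in, which is a common source of surprise when round-tripping.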
Data Wrangling with pandas Cheat Sheet
http://pandas.pydata.org

Tidy Data – A foundation for wrangling in pandas
In a tidy data set:
    Each variable is saved in its own column
    Each observation is saved in its own row
Tidy data complements pandas’s vectorized operations. pandas will automatically preserve observations as you manipulate variables. No other format works as intuitively with pandas.
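As a small illustration of the tidy-data rules, the sketch below reshapes a wide table into a tidy one with `pd.melt`. The table contents and column names are made up for the example.

```python
import pandas as pd

# Hypothetical wide table: the "year" variable is hidden in the column headers.
wide = pd.DataFrame({'country': ['BE', 'IN'],
                     '2015': [11.2, 1310.0],
                     '2016': [11.3, 1325.0]})

# After melting, each variable has its own column and each observation its own row.
tidy = pd.melt(wide, id_vars='country', var_name='year', value_name='population')

print(tidy)
```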
Syntax – Creating DataFrames

     a   b   c
1    4   7  10
2    5   8  11
3    6   9  12

df = pd.DataFrame(
        {"a" : [4, 5, 6],
         "b" : [7, 8, 9],
         "c" : [10, 11, 12]},
        index = [1, 2, 3])
    Specify values for each column.

df = pd.DataFrame(
        [[4, 7, 10],
         [5, 8, 11],
         [6, 9, 12]],
        index=[1, 2, 3],
        columns=['a', 'b', 'c'])
    Specify values for each row.

n  v    a   b   c
d  1    4   7  10
   2    5   8  11
e  2    6   9  12

df = pd.DataFrame(
        {"a" : [4, 5, 6],
         "b" : [7, 8, 9],
         "c" : [10, 11, 12]},
        index = pd.MultiIndex.from_tuples(
            [('d',1),('d',2),('e',2)],
            names=['n','v']))
    Create DataFrame with a MultiIndex

Method Chaining
Most pandas methods return a DataFrame so that another pandas method can be applied to the result. This improves readability of code.
df = (pd.melt(df)
        .rename(columns={
            'variable' : 'var',
            'value' : 'val'})
        .query('val >= 200')
     )

Reshaping Data – Change the layout of a data set
pd.melt(df)
    Gather columns into rows.
df.pivot(columns='var', values='val')
    Spread rows into columns.
pd.concat([df1,df2])
    Append rows of DataFrames
pd.concat([df1,df2], axis=1)
    Append columns of DataFrames
df.sort_values('mpg')
    Order rows by values of a column (low to high).
df.sort_values('mpg',ascending=False)
    Order rows by values of a column (high to low).
df.rename(columns = {'y':'year'})
    Rename the columns of a DataFrame
df.sort_index()
    Sort the index of a DataFrame
df.reset_index()
    Reset index of DataFrame to row numbers, moving index to columns.
df.drop(columns=['Length','Height'])
    Drop columns from DataFrame

Subset Observations (Rows)
df[df.Length > 7]
    Extract rows that meet logical criteria.
df.drop_duplicates()
    Remove duplicate rows (only considers columns).
df.head(n)
    Select first n rows.
df.tail(n)
    Select last n rows.
df.sample(frac=0.5)
    Randomly select fraction of rows.
df.sample(n=10)
    Randomly select n rows.
df.iloc[10:20]
    Select rows by position.
df.nlargest(n, 'value')
    Select and order top n entries.
df.nsmallest(n, 'value')
    Select and order bottom n entries.

Logic in Python (and pandas)
<     Less than                  !=                          Not equal to
>     Greater than               df.column.isin(values)      Group membership
==    Equals                     pd.isnull(obj)              Is NaN
<=    Less than or equals        pd.notnull(obj)             Is not NaN
>=    Greater than or equals     &,|,~,^,df.any(),df.all()   Logical and, or, not, xor, any, all

Subset Variables (Columns)
df[['width','length','species']]
    Select multiple columns with specific names.
df['width'] or df.width
    Select single column with specific name.
df.filter(regex='regex')
    Select columns whose name matches regular expression regex.

regex (Regular Expressions) Examples
'\.'                 Matches strings containing a period '.'
'Length$'            Matches strings ending with word 'Length'
'^Sepal'             Matches strings beginning with the word 'Sepal'
'^x[1-5]$'           Matches strings beginning with 'x' and ending with 1,2,3,4,5
'^(?!Species$).*'    Matches strings except the string 'Species'

df.loc[:,'x2':'x4']
    Select all columns between x2 and x4 (inclusive).
df.iloc[:,[1,2,5]]
    Select columns in positions 1, 2 and 5 (first column is 0).
df.loc[df['a'] > 10, ['a','c']]
    Select rows meeting logical condition, and only the specific columns.
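The method-chaining idea combines naturally with regex column selection. The following is a sketch on a made-up three-column frame; `long` is an illustrative name, and `DataFrame.melt` is used as the method form of `pd.melt`.

```python
import pandas as pd

df = pd.DataFrame({'x1': [150, 250], 'x2': [300, 50], 'Species': ['a', 'b']})

# Keep only columns matching ^x[1-5]$, then chain melt -> rename -> query.
long = (df.filter(regex=r'^x[1-5]$')
          .melt()
          .rename(columns={'variable': 'var', 'value': 'val'})
          .query('val >= 200'))

print(long)
```

Wrapping the chain in parentheses lets each step sit on its own line without backslashes.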
http://pandas.pydata.org/ This cheat sheet inspired by Rstudio Data Wrangling Cheatsheet (https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf) Written by Irv Lustig, Princeton Consultants
Summarize Data
df['w'].value_counts()
    Count number of rows with each unique value of variable
len(df)
    # of rows in DataFrame.
df['w'].nunique()
    # of distinct values in a column.
df.describe()
    Basic descriptive statistics for each column (or GroupBy)

pandas provides a large set of summary functions that operate on different kinds of pandas objects (DataFrame columns, Series, GroupBy, Expanding and Rolling (see below)) and produce single values for each of the groups. When applied to a DataFrame, the result is returned as a pandas Series for each column. Examples:
sum()                     Sum values of each object.
count()                   Count non-NA/null values of each object.
median()                  Median value of each object.
quantile([0.25,0.75])     Quantiles of each object.
apply(function)           Apply function to each object.
min()                     Minimum value in each object.
max()                     Maximum value in each object.
mean()                    Mean value of each object.
var()                     Variance of each object.
std()                     Standard deviation of each object.

Handling Missing Data
df.dropna()
    Drop rows with any column having NA/null data.
df.fillna(value)
    Replace all NA/null data with value.

Make New Columns
df.assign(Area=lambda df: df.Length*df.Height)
    Compute and append one or more new columns.
df['Volume'] = df.Length*df.Height*df.Depth
    Add single column.
pd.qcut(df.col, n, labels=False)
    Bin column into n buckets.

pandas provides a large set of vector functions that operate on all columns of a DataFrame or a single selected column (a pandas Series). These functions produce vectors of values for each of the columns, or a single Series for the individual Series. Examples:
max(axis=1)                   Element-wise max.
min(axis=1)                   Element-wise min.
clip(lower=-10,upper=10)      Trim values at input thresholds
abs()                         Absolute value.

Group Data
df.groupby(by="col")
    Return a GroupBy object, grouped by values in column named "col".
df.groupby(level="ind")
    Return a GroupBy object, grouped by values in index level named "ind".

All of the summary functions listed above can be applied to a group. Additional GroupBy functions:
size()                Size of each group.
agg(function)         Aggregate group using function.

The examples below can also be applied to groups. In this case, the function is applied on a per-group basis, and the returned vectors are of the length of the original DataFrame.
shift(1)                  Copy with values shifted down by 1 (lag).
shift(-1)                 Copy with values shifted up by 1 (lead).
rank(method='dense')      Ranks with no gaps.
rank(method='min')        Ranks. Ties get min rank.
rank(pct=True)            Ranks rescaled to interval [0, 1].
rank(method='first')      Ranks. Ties go to first value.
cumsum()                  Cumulative sum.
cummax()                  Cumulative max.
cummin()                  Cumulative min.
cumprod()                 Cumulative product.

Windows
df.expanding()
    Return an Expanding object allowing summary functions to be applied cumulatively.
df.rolling(n)
    Return a Rolling object allowing summary functions to be applied to windows of length n.

Plotting
df.plot.hist()
    Histogram for each column
df.plot.scatter(x='w',y='h')
    Scatter chart using pairs of points

Combine Data Sets

adf          bdf
x1  x2       x1  x3
A   1        A   T
B   2        B   F
C   3        D   T

Standard Joins
pd.merge(adf, bdf, how='left', on='x1')
    Join matching rows from bdf to adf.
    x1  x2   x3
    A   1    T
    B   2    F
    C   3    NaN
pd.merge(adf, bdf, how='right', on='x1')
    Join matching rows from adf to bdf.
    x1  x2   x3
    A   1.0  T
    B   2.0  F
    D   NaN  T
pd.merge(adf, bdf, how='inner', on='x1')
    Join data. Retain only rows in both sets.
    x1  x2   x3
    A   1    T
    B   2    F
pd.merge(adf, bdf, how='outer', on='x1')
    Join data. Retain all values, all rows.
    x1  x2   x3
    A   1    T
    B   2    F
    C   3    NaN
    D   NaN  T

Filtering Joins
adf[adf.x1.isin(bdf.x1)]
    All rows in adf that have a match in bdf.
    x1  x2
    A   1
    B   2
adf[~adf.x1.isin(bdf.x1)]
    All rows in adf that do not have a match in bdf.
    x1  x2
    C   3

ydf          zdf
x1  x2       x1  x2
A   1        B   2
B   2        C   3
C   3        D   4

Set-like Operations
pd.merge(ydf, zdf)
    Rows that appear in both ydf and zdf (Intersection).
    x1  x2
    B   2
    C   3
pd.merge(ydf, zdf, how='outer')
    Rows that appear in either or both ydf and zdf (Union).
    x1  x2
    A   1
    B   2
    C   3
    D   4
(pd.merge(ydf, zdf, how='outer', indicator=True)
   .query('_merge == "left_only"')
   .drop(columns=['_merge']))
    Rows that appear in ydf but not zdf (Setdiff).
    x1  x2
    A   1
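The set-like operations can be reproduced with the two small frames from the sheet. This is a sketch; the result names `intersection`, `union` and `only_in_ydf` are illustrative.

```python
import pandas as pd

ydf = pd.DataFrame({'x1': ['A', 'B', 'C'], 'x2': [1, 2, 3]})
zdf = pd.DataFrame({'x1': ['B', 'C', 'D'], 'x2': [2, 3, 4]})

# With no `on` argument, merge joins on all shared columns (x1 and x2 here).
intersection = pd.merge(ydf, zdf)
union = pd.merge(ydf, zdf, how='outer')

# indicator=True adds a _merge column telling which side each row came from.
only_in_ydf = (pd.merge(ydf, zdf, how='outer', indicator=True)
                 .query('_merge == "left_only"')
                 .drop(columns=['_merge']))

print(intersection)
print(union)
print(only_in_ydf)
```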
Python For Data Science Cheat Sheet
Keras
Learn Python for data science Interactively at www.DataCamp.com

Keras
Keras is a powerful and easy-to-use deep learning library for Theano and TensorFlow that provides a high-level neural networks API to develop and evaluate deep learning models.

A Basic Example
>>> import numpy as np
>>> from keras.models import Sequential
>>> from keras.layers import Dense
>>> data = np.random.random((1000,100))
>>> labels = np.random.randint(2,size=(1000,1))
>>> model = Sequential()
>>> model.add(Dense(32,
                    activation='relu',
                    input_dim=100))
>>> model.add(Dense(1, activation='sigmoid'))
>>> model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
>>> model.fit(data,labels,epochs=10,batch_size=32)
>>> predictions = model.predict(data)

Data (also see NumPy, Pandas & Scikit-Learn)
Your data needs to be stored as NumPy arrays or as a list of NumPy arrays. Ideally, you split the data into training and test sets, for which you can also use the train_test_split function of sklearn.model_selection.

Keras Data Sets
>>> from keras.datasets import boston_housing, mnist, cifar10, imdb
>>> (x_train,y_train),(x_test,y_test) = mnist.load_data()
>>> (x_train2,y_train2),(x_test2,y_test2) = boston_housing.load_data()
>>> (x_train3,y_train3),(x_test3,y_test3) = cifar10.load_data()
>>> (x_train4,y_train4),(x_test4,y_test4) = imdb.load_data(num_words=20000)
>>> num_classes = 10

Other
>>> from urllib.request import urlopen
>>> data = np.loadtxt(urlopen("http://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"),delimiter=",")
>>> X = data[:,0:8]
>>> y = data[:,8]

Preprocessing (also see NumPy & Scikit-Learn)

Sequence Padding
>>> from keras.preprocessing import sequence
>>> x_train4 = sequence.pad_sequences(x_train4,maxlen=80)
>>> x_test4 = sequence.pad_sequences(x_test4,maxlen=80)

One-Hot Encoding
>>> from keras.utils import to_categorical
>>> Y_train = to_categorical(y_train, num_classes)
>>> Y_test = to_categorical(y_test, num_classes)
>>> Y_train3 = to_categorical(y_train3, num_classes)
>>> Y_test3 = to_categorical(y_test3, num_classes)

Train and Test Sets
>>> from sklearn.model_selection import train_test_split
>>> X_train5,X_test5,y_train5,y_test5 = train_test_split(X,
                                                         y,
                                                         test_size=0.33,
                                                         random_state=42)

Standardization/Normalization
>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler().fit(x_train2)
>>> standardized_X = scaler.transform(x_train2)
>>> standardized_X_test = scaler.transform(x_test2)

Model Architecture

Sequential Model
>>> from keras.models import Sequential
>>> model = Sequential()
>>> model2 = Sequential()
>>> model3 = Sequential()

Multilayer Perceptron (MLP)
Binary Classification
>>> from keras.layers import Dense
>>> model.add(Dense(12,
                    input_dim=8,
                    kernel_initializer='uniform',
                    activation='relu'))
>>> model.add(Dense(8,kernel_initializer='uniform',activation='relu'))
>>> model.add(Dense(1,kernel_initializer='uniform',activation='sigmoid'))
Multi-Class Classification
>>> from keras.layers import Dropout
>>> model.add(Dense(512,activation='relu',input_shape=(784,)))
>>> model.add(Dropout(0.2))
>>> model.add(Dense(512,activation='relu'))
>>> model.add(Dropout(0.2))
>>> model.add(Dense(10,activation='softmax'))
Regression
>>> model.add(Dense(64,activation='relu',input_dim=train_data.shape[1]))
>>> model.add(Dense(1))

Convolutional Neural Network (CNN)
>>> from keras.layers import Activation,Conv2D,MaxPooling2D,Flatten
>>> model2.add(Conv2D(32,(3,3),padding='same',input_shape=x_train.shape[1:]))
>>> model2.add(Activation('relu'))
>>> model2.add(Conv2D(32,(3,3)))
>>> model2.add(Activation('relu'))
>>> model2.add(MaxPooling2D(pool_size=(2,2)))
>>> model2.add(Dropout(0.25))
>>> model2.add(Conv2D(64,(3,3), padding='same'))
>>> model2.add(Activation('relu'))
>>> model2.add(Conv2D(64,(3, 3)))
>>> model2.add(Activation('relu'))
>>> model2.add(MaxPooling2D(pool_size=(2,2)))
>>> model2.add(Dropout(0.25))
>>> model2.add(Flatten())
>>> model2.add(Dense(512))
>>> model2.add(Activation('relu'))
>>> model2.add(Dropout(0.5))
>>> model2.add(Dense(num_classes))
>>> model2.add(Activation('softmax'))

Recurrent Neural Network (RNN)
>>> from keras.layers import Embedding,LSTM
>>> model3.add(Embedding(20000,128))
>>> model3.add(LSTM(128,dropout=0.2,recurrent_dropout=0.2))
>>> model3.add(Dense(1,activation='sigmoid'))

Inspect Model
>>> model.output_shape       # Model output shape
>>> model.summary()          # Model summary representation
>>> model.get_config()       # Model configuration
>>> model.get_weights()      # List all weight tensors in the model

Compile Model
MLP: Binary Classification
>>> model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
MLP: Multi-Class Classification
>>> model.compile(optimizer='rmsprop',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
MLP: Regression
>>> model.compile(optimizer='rmsprop',
                  loss='mse',
                  metrics=['mae'])
Recurrent Neural Network
>>> model3.compile(loss='binary_crossentropy',
                   optimizer='adam',
                   metrics=['accuracy'])

Model Training
>>> model3.fit(x_train4,
               y_train4,
               batch_size=32,
               epochs=15,
               verbose=1,
               validation_data=(x_test4,y_test4))

Evaluate Your Model's Performance
>>> score = model3.evaluate(x_test4,
                            y_test4,
                            batch_size=32)

Prediction
>>> model3.predict(x_test4, batch_size=32)
>>> model3.predict_classes(x_test4,batch_size=32)

Save/Reload Models
>>> from keras.models import load_model
>>> model3.save('model_file.h5')
>>> my_model = load_model('my_model.h5')

Model Fine-tuning
Optimization Parameters
>>> from keras.optimizers import RMSprop
>>> opt = RMSprop(lr=0.0001, decay=1e-6)
>>> model2.compile(loss='categorical_crossentropy',
                   optimizer=opt,
                   metrics=['accuracy'])
Early Stopping
>>> from keras.callbacks import EarlyStopping
>>> early_stopping_monitor = EarlyStopping(patience=2)
>>> model3.fit(x_train4,
               y_train4,
               batch_size=32,
               epochs=15,
               validation_data=(x_test4,y_test4),
               callbacks=[early_stopping_monitor])
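To make the One-Hot Encoding and Sequence Padding steps concrete, here is a NumPy-only sketch of the arrays those steps are expected to produce. `one_hot` and `pad_left` are hypothetical helpers written for this illustration, not Keras functions, and the padding behaviour assumes the Keras defaults (pad and truncate at the front of each sequence).

```python
import numpy as np

def one_hot(labels, num_classes):
    """Illustrative one-hot encoding: one row per label, a single 1 per row."""
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

def pad_left(sequences, maxlen, value=0):
    """Illustrative padding: left-pad with `value`, keep the last `maxlen` items."""
    padded = np.full((len(sequences), maxlen), value, dtype="int32")
    for row, seq in enumerate(sequences):
        trimmed = seq[-maxlen:]
        if trimmed:  # skip empty sequences; the row stays all `value`
            padded[row, -len(trimmed):] = trimmed
    return padded

y = np.array([0, 2, 1])
Y = one_hot(y, num_classes=3)
X = pad_left([[5, 6], [1, 2, 3, 4, 5]], maxlen=4)

print(Y)
print(X)
```

Both results are fixed-shape arrays, which is what lets them be batched and fed to a model.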
Python For Data Science Cheat Sheet
Bokeh
Learn Bokeh Interactively at www.DataCamp.com, taught by Bryan Van de Ven, core contributor

Plotting With Bokeh
The Python interactive visualization library Bokeh enables high-performance visual presentation of large datasets in modern web browsers.
Bokeh’s mid-level general purpose bokeh.plotting interface is centered around two main components: data and glyphs.

    data + glyphs = plot

The basic steps to creating plots with the bokeh.plotting interface are:
1. Prepare some data: Python lists, NumPy arrays, Pandas DataFrames and other sequences of values
2. Create a new plot
3. Add renderers for your data, with visual customizations
4. Specify where to generate the output
5. Show or save the results

>>> from bokeh.plotting import figure
>>> from bokeh.io import output_file, show
>>> x = [1, 2, 3, 4, 5]                             # Step 1
>>> y = [6, 7, 2, 4, 5]
>>> p = figure(title="simple line example",         # Step 2
               x_axis_label='x',
               y_axis_label='y')
>>> p.line(x, y, legend="Temp.", line_width=2)      # Step 3
>>> output_file("lines.html")                       # Step 4
>>> show(p)                                         # Step 5

1 Data (also see Lists, NumPy & Pandas)
Under the hood, your data is converted to Column Data Sources. You can also do this manually:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.array([[33.9,4,65, 'US'],
                                [32.4,4,66, 'Asia'],
                                [21.4,4,109, 'Europe']]),
                      columns=['mpg','cyl', 'hp', 'origin'],
                      index=['Toyota', 'Fiat', 'Volvo'])
>>> from bokeh.models import ColumnDataSource
>>> cds_df = ColumnDataSource(df)

2 Plotting
>>> from bokeh.plotting import figure
>>> p1 = figure(plot_width=300, tools='pan,box_zoom')
>>> p2 = figure(plot_width=300, plot_height=300,
                x_range=(0, 8), y_range=(0, 8))
>>> p3 = figure()

3 Renderers & Visual Customizations

Glyphs
Scatter Markers
>>> p1.circle(np.array([1,2,3]), np.array([3,2,1]),
              fill_color='white')
>>> p2.square(np.array([1.5,3.5,5.5]), [1,4,3],
              color='blue', size=1)
Line Glyphs
>>> p1.line([1,2,3,4], [3,4,5,6], line_width=2)
>>> p2.multi_line(pd.DataFrame([[1,2,3],[5,6,7]]),
                  pd.DataFrame([[3,4,5],[3,2,1]]),
                  color="blue")

Customized Glyphs (also see Data)
Selection and Non-Selection Glyphs
>>> p = figure(tools='box_select')
>>> p.circle('mpg', 'cyl', source=cds_df,
             selection_color='red',
             nonselection_alpha=0.1)
Hover Glyphs
>>> from bokeh.models import HoverTool
>>> hover = HoverTool(tooltips=None, mode='vline')
>>> p3.add_tools(hover)
Colormapping
>>> from bokeh.models import CategoricalColorMapper
>>> color_mapper = CategoricalColorMapper(
                       factors=['US', 'Asia', 'Europe'],
                       palette=['blue', 'red', 'green'])
>>> p3.circle('mpg', 'cyl', source=cds_df,
              color=dict(field='origin',
                         transform=color_mapper),
              legend='Origin')

Rows & Columns Layout
Rows
>>> from bokeh.layouts import row
>>> layout = row(p1,p2,p3)
Columns
>>> from bokeh.layouts import column
>>> layout = column(p1,p2,p3)
Nesting Rows & Columns
>>> layout = row(column(p1,p2), p3)

Grid Layout
>>> from bokeh.layouts import gridplot
>>> row1 = [p1,p2]
>>> row2 = [p3]
>>> layout = gridplot([[p1,p2],[p3]])

Tabbed Layout
>>> from bokeh.models.widgets import Panel, Tabs
>>> tab1 = Panel(child=p1, title="tab1")
>>> tab2 = Panel(child=p2, title="tab2")
>>> layout = Tabs(tabs=[tab1, tab2])

Linked Plots (also see Data)
Linked Axes
>>> p2.x_range = p1.x_range
>>> p2.y_range = p1.y_range
Linked Brushing
>>> p4 = figure(plot_width = 100, tools='box_select,lasso_select')
>>> p4.circle('mpg', 'cyl', source=cds_df)
>>> p5 = figure(plot_width = 200, tools='box_select,lasso_select')
>>> p5.circle('mpg', 'hp', source=cds_df)
>>> layout = row(p4,p5)

Legends
Legend Location
Inside Plot Area
>>> p.legend.location = 'bottom_left'
Outside Plot Area
>>> from bokeh.models import Legend
>>> r1 = p2.asterisk(np.array([1,2,3]), np.array([3,2,1]))
>>> r2 = p2.line([1,2,3,4], [3,4,5,6])
>>> legend = Legend(items=[("One" , [p1, r1]),("Two" , [r2])], location=(0, -30))
>>> p.add_layout(legend, 'right')
Legend Orientation
>>> p.legend.orientation = "horizontal"
>>> p.legend.orientation = "vertical"
Legend Background & Border
>>> p.legend.border_line_color = "navy"
>>> p.legend.background_fill_color = "white"

4 Output
Output to HTML File
>>> from bokeh.io import output_file, show
>>> output_file('my_bar_chart.html', mode='cdn')
Notebook Output
>>> from bokeh.io import output_notebook, show
>>> output_notebook()

Embedding
Standalone HTML
>>> from bokeh.embed import file_html
>>> from bokeh.resources import CDN
>>> html = file_html(p, CDN, "my_plot")
Components
>>> from bokeh.embed import components
>>> script, div = components(p)

5 Show or Save Your Plots
>>> from bokeh.io import save
>>> show(p1)
>>> save(p1)
>>> show(layout)
>>> save(layout)

Statistical Charts With Bokeh (also see Data)
Bokeh’s high-level bokeh.charts interface is ideal for quickly creating statistical charts
Bar Chart
>>> from bokeh.charts import Bar
>>> p = Bar(df, stacked=True, palette=['red','blue'])
Box Plot
>>> from bokeh.charts import BoxPlot
>>> p = BoxPlot(df, values='vals', label='cyl',
                legend='bottom_right')
Histogram
>>> from bokeh.charts import Histogram
>>> p = Histogram(df, title='Histogram')
Scatter Plot
>>> from bokeh.charts import Scatter
>>> p = Scatter(df, x='mpg', y ='hp', marker='square',
                xlabel='Miles Per Gallon',
                ylabel='Horsepower')
Python For Data Science Cheat Sheet
PySpark - SQL Basics
Learn Python for data science Interactively at www.DataCamp.com

PySpark & Spark SQL
Spark SQL is Apache Spark's module for working with structured data.

Initializing SparkSession
A SparkSession can be used to create DataFrames, register DataFrames as tables, execute SQL over tables, cache tables, and read parquet files.
>>> from pyspark.sql import SparkSession
>>> spark = SparkSession \
        .builder \
        .appName("Python Spark SQL basic example") \
        .config("spark.some.config.option", "some-value") \
        .getOrCreate()

Creating DataFrames

From RDDs
>>> from pyspark.sql import Row
>>> from pyspark.sql.types import *
Infer Schema
>>> sc = spark.sparkContext
>>> lines = sc.textFile("people.txt")
>>> parts = lines.map(lambda l: l.split(","))
>>> people = parts.map(lambda p: Row(name=p[0],age=int(p[1])))
>>> peopledf = spark.createDataFrame(people)
Specify Schema
>>> people = parts.map(lambda p: Row(name=p[0],
                                     age=int(p[1].strip())))
>>> schemaString = "name age"
>>> fields = [StructField(field_name, StringType(), True) for
              field_name in schemaString.split()]
>>> schema = StructType(fields)
>>> spark.createDataFrame(people, schema).show()
+--------+---+
|    name|age|
+--------+---+
|    Mine| 28|
|   Filip| 29|
|Jonathan| 30|
+--------+---+

From Spark Data Sources
JSON
>>> df = spark.read.json("customer.json")
>>> df.show()
+--------------------+---+---------+--------+--------------------+
|             address|age|firstName|lastName|         phoneNumber|
+--------------------+---+---------+--------+--------------------+
|[New York,10021,N...| 25|     John|   Smith|[[212 555-1234,ho...|
|[New York,10021,N...| 21|     Jane|     Doe|[[322 888-1234,ho...|
+--------------------+---+---------+--------+--------------------+
>>> df2 = spark.read.load("people.json", format="json")
Parquet files
>>> df3 = spark.read.load("users.parquet")
TXT files
>>> df4 = spark.read.text("people.txt")

Inspect Data
>>> df.dtypes                   # Return df column names and data types
>>> df.show()                   # Display the content of df
>>> df.head()                   # Return the first row (head(n) returns the first n rows)
>>> df.first()                  # Return first row
>>> df.take(2)                  # Return the first n rows
>>> df.schema                   # Return the schema of df
>>> df.describe().show()        # Compute summary statistics
>>> df.columns                  # Return the columns of df
>>> df.count()                  # Count the number of rows in df
>>> df.distinct().count()       # Count the number of distinct rows in df
>>> df.printSchema()            # Print the schema of df
>>> df.explain()                # Print the (logical and physical) plans

Duplicate Values
>>> df = df.dropDuplicates()

Queries
>>> from pyspark.sql import functions as F

Select
>>> df.select("firstName").show()           # Show all entries in firstName column
>>> df.select("firstName","lastName") \
      .show()
Show all entries in firstName, age and type
>>> df.select("firstName",
              "age",
              F.explode("phoneNumber") \
               .alias("contactInfo")) \
      .select("contactInfo.type",
              "firstName",
              "age") \
      .show()
Show all entries in firstName and age, add 1 to the entries of age
>>> df.select(df["firstName"],df["age"]+ 1) \
      .show()
>>> df.select(df['age'] > 24).show()        # Show all entries where age >24

When
Show firstName and 0 or 1 depending on age >30
>>> df.select("firstName",
              F.when(df.age > 30, 1) \
               .otherwise(0)) \
      .show()
Show firstName if in the given options
>>> df[df.firstName.isin("Jane","Boris")] \
      .collect()

Like
Show firstName, and lastName is TRUE if lastName is like Smith
>>> df.select("firstName",
              df.lastName.like("Smith")) \
      .show()

Startswith - Endswith
Show firstName, and TRUE if lastName starts with Sm
>>> df.select("firstName",
              df.lastName \
                .startswith("Sm")) \
      .show()
Show last names ending in th
>>> df.select(df.lastName.endswith("th")) \
      .show()

Substring
Return substrings of firstName
>>> df.select(df.firstName.substr(1, 3) \
                .alias("name")) \
      .collect()

Between
Show age: values are TRUE if between 22 and 24
>>> df.select(df.age.between(22, 24)) \
      .show()

Add, Update & Remove Columns

Adding Columns
>>> df = df.withColumn('city',df.address.city) \
           .withColumn('postalCode',df.address.postalCode) \
           .withColumn('state',df.address.state) \
           .withColumn('streetAddress',df.address.streetAddress) \
           .withColumn('telePhoneNumber',
                       F.explode(df.phoneNumber.number)) \
           .withColumn('telePhoneType',
                       F.explode(df.phoneNumber.type))

Updating Columns
>>> df = df.withColumnRenamed('telePhoneNumber', 'phoneNumber')

Removing Columns
>>> df = df.drop("address", "phoneNumber")
>>> df = df.drop(df.address).drop(df.phoneNumber)

GroupBy
Group by age, count the members in the groups
>>> df.groupBy("age")\
      .count() \
      .show()

Filter
Filter entries of age, only keep those records of which the values are >24
>>> df.filter(df["age"]>24).show()

Sort
>>> peopledf.sort(peopledf.age.desc()).collect()
>>> df.sort("age", ascending=False).collect()
>>> df.orderBy(["age","city"],ascending=[0,1])\
      .collect()

Missing & Replacing Values
>>> df.na.fill(50).show()       # Replace null values
>>> df.na.drop().show()         # Return new df omitting rows with null values
Return new df replacing one value with another
>>> df.na \
      .replace(10, 20) \
      .show()

Repartitioning
df with 10 partitions
>>> df.repartition(10)\
      .rdd \
      .getNumPartitions()
>>> df.coalesce(1).rdd.getNumPartitions()   # df with 1 partition

Running SQL Queries Programmatically

Registering DataFrames as Views
>>> peopledf.createGlobalTempView("people")
>>> df.createTempView("customer")
>>> df.createOrReplaceTempView("customer")

Query Views
>>> df5 = spark.sql("SELECT * FROM customer")
>>> df5.show()
>>> peopledf2 = spark.sql("SELECT * FROM global_temp.people")
>>> peopledf2.show()

Output

Data Structures
>>> rdd1 = df.rdd               # Convert df into an RDD
>>> df.toJSON().first()         # Convert df into an RDD of strings
>>> df.toPandas()               # Return the contents of df as Pandas DataFrame

Write & Save to Files
>>> df.select("firstName", "city")\
      .write \
      .save("nameAndCity.parquet")
>>> df.select("firstName", "age") \
      .write \
      .save("namesAndAges.json",format="json")

Stopping SparkSession
>>> spark.stop()
Python For Data Science Cheat Sheet
Scikit-Learn
Learn Python for data science Interactively at www.DataCamp.com

Scikit-learn
Scikit-learn is an open source Python library that implements a range of machine learning, preprocessing, cross-validation and visualization algorithms using a unified interface.

A Basic Example
>>> from sklearn import neighbors, datasets, preprocessing
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.metrics import accuracy_score
>>> iris = datasets.load_iris()
>>> X, y = iris.data[:, :2], iris.target
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=33)
>>> scaler = preprocessing.StandardScaler().fit(X_train)
>>> X_train = scaler.transform(X_train)
>>> X_test = scaler.transform(X_test)
>>> knn = neighbors.KNeighborsClassifier(n_neighbors=5)
>>> knn.fit(X_train, y_train)
>>> y_pred = knn.predict(X_test)
>>> accuracy_score(y_test, y_pred)

Loading The Data (also see NumPy & Pandas)
Your data needs to be numeric and stored as NumPy arrays or SciPy sparse matrices. Other types that are convertible to numeric arrays, such as Pandas DataFrame, are also acceptable.
>>> import numpy as np
>>> X = np.random.random((10,5))
>>> y = np.array(['M','M','F','F','M','F','M','M','F','F'])
>>> X[X < 0.7] = 0

Training And Test Data
>>> from sklearn.model_selection import train_test_split
>>> X_train, X_test, y_train, y_test = train_test_split(X,
                                                        y,
                                                        random_state=0)

Preprocessing The Data

Standardization
>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler().fit(X_train)
>>> standardized_X = scaler.transform(X_train)
>>> standardized_X_test = scaler.transform(X_test)

Normalization
>>> from sklearn.preprocessing import Normalizer
>>> scaler = Normalizer().fit(X_train)
>>> normalized_X = scaler.transform(X_train)
>>> normalized_X_test = scaler.transform(X_test)

Binarization
>>> from sklearn.preprocessing import Binarizer
>>> binarizer = Binarizer(threshold=0.0).fit(X)
>>> binary_X = binarizer.transform(X)

Encoding Categorical Features
>>> from sklearn.preprocessing import LabelEncoder
>>> enc = LabelEncoder()
>>> y = enc.fit_transform(y)

Imputing Missing Values
>>> from sklearn.impute import SimpleImputer
>>> imp = SimpleImputer(missing_values=0, strategy='mean')
>>> imp.fit_transform(X_train)

Generating Polynomial Features
>>> from sklearn.preprocessing import PolynomialFeatures
>>> poly = PolynomialFeatures(5)
>>> poly.fit_transform(X)

Create Your Model

Supervised Learning Estimators
Linear Regression
>>> from sklearn.linear_model import LinearRegression
>>> lr = LinearRegression(normalize=True)
Support Vector Machines (SVM)
>>> from sklearn.svm import SVC
>>> svc = SVC(kernel='linear')
Naive Bayes
>>> from sklearn.naive_bayes import GaussianNB
>>> gnb = GaussianNB()
KNN
>>> from sklearn import neighbors
>>> knn = neighbors.KNeighborsClassifier(n_neighbors=5)

Unsupervised Learning Estimators
Principal Component Analysis (PCA)
>>> from sklearn.decomposition import PCA
>>> pca = PCA(n_components=0.95)
K Means
>>> from sklearn.cluster import KMeans
>>> k_means = KMeans(n_clusters=3, random_state=0)

Model Fitting
Supervised learning
>>> lr.fit(X, y)                                   # Fit the model to the data
>>> knn.fit(X_train, y_train)
>>> svc.fit(X_train, y_train)
Unsupervised Learning
>>> k_means.fit(X_train)                           # Fit the model to the data
>>> pca_model = pca.fit_transform(X_train)         # Fit to data, then transform it

Prediction
Supervised Estimators
>>> y_pred = svc.predict(np.random.random((2,5)))  # Predict labels
>>> y_pred = lr.predict(X_test)                    # Predict labels
>>> y_pred = knn.predict_proba(X_test)             # Estimate probability of a label
Unsupervised Estimators
>>> y_pred = k_means.predict(X_test)               # Predict labels in clustering algos

Evaluate Your Model’s Performance

Classification Metrics
Accuracy Score
>>> knn.score(X_test, y_test)                      # Estimator score method
>>> from sklearn.metrics import accuracy_score     # Metric scoring functions
>>> accuracy_score(y_test, y_pred)
Classification Report
>>> from sklearn.metrics import classification_report
>>> print(classification_report(y_test, y_pred))   # Precision, recall, f1-score and support
Confusion Matrix
>>> from sklearn.metrics import confusion_matrix
>>> print(confusion_matrix(y_test, y_pred))

Regression Metrics
Mean Absolute Error
>>> from sklearn.metrics import mean_absolute_error
>>> y_true = [3, -0.5, 2]
>>> mean_absolute_error(y_true, y_pred)
Mean Squared Error
>>> from sklearn.metrics import mean_squared_error
>>> mean_squared_error(y_test, y_pred)
R² Score
>>> from sklearn.metrics import r2_score
>>> r2_score(y_true, y_pred)

Clustering Metrics
Adjusted Rand Index
>>> from sklearn.metrics import adjusted_rand_score
>>> adjusted_rand_score(y_true, y_pred)
Homogeneity
>>> from sklearn.metrics import homogeneity_score
>>> homogeneity_score(y_true, y_pred)
V-measure
>>> from sklearn.metrics import v_measure_score
>>> v_measure_score(y_true, y_pred)

Cross-Validation
>>> from sklearn.model_selection import cross_val_score
>>> print(cross_val_score(knn, X_train, y_train, cv=4))
>>> print(cross_val_score(lr, X, y, cv=2))

Tune Your Model

Grid Search
>>> from sklearn.model_selection import GridSearchCV
>>> params = {"n_neighbors": np.arange(1,3),
              "metric": ["euclidean", "cityblock"]}
>>> grid = GridSearchCV(estimator=knn,
                        param_grid=params)
>>> grid.fit(X_train, y_train)
>>> print(grid.best_score_)
>>> print(grid.best_estimator_.n_neighbors)

Randomized Parameter Optimization
>>> from sklearn.model_selection import RandomizedSearchCV
>>> params = {"n_neighbors": range(1,5),
              "weights": ["uniform", "distance"]}
>>> rsearch = RandomizedSearchCV(estimator=knn,
                                 param_distributions=params,
                                 cv=4,
                                 n_iter=8,
                                 random_state=5)
>>> rsearch.fit(X_train, y_train)
>>> print(rsearch.best_score_)
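The cross-validation and grid-search steps can be run end to end on the iris data used in the basic example. This is a sketch assuming a recent scikit-learn release, where both utilities are imported from `sklearn.model_selection`; the fold count passed to the grid search is an arbitrary choice for the illustration.

```python
import numpy as np
from sklearn import datasets, neighbors
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=33)

knn = neighbors.KNeighborsClassifier(n_neighbors=5)

# One accuracy value per fold.
scores = cross_val_score(knn, X_train, y_train, cv=4)

# Try every combination in the grid and keep the best by cross-validated score.
grid = GridSearchCV(estimator=knn,
                    param_grid={"n_neighbors": np.arange(1, 3),
                                "metric": ["euclidean", "cityblock"]},
                    cv=4)
grid.fit(X_train, y_train)
best_k = grid.best_params_["n_neighbors"]

print(scores)
print(grid.best_score_, best_k)
```

The test split is left untouched here, so it remains available for a final check of the tuned estimator.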
Turtle
Moving Turtle
import turtle Import the turtle library
t = turtle.Pen() Create a new turtle called t
t.forward ( yyy ) Move the turtle forward yyy Headings
t.backward ( yyy ) Move the turtle backwards yyy 0 = East
t.left ( yyy ) Turn the turtle yyy degrees to the left 90 = North
t.right ( yyy ) Turn the turtle yyy degrees to the right 180 = West
t.setheading ( yyy ) Make the turtle point in the specified direction 270 = South
Changing Turtle
t.pencolor("red")      Set the line colour to be "red"
t.fillcolor("green")   Set fill colour to be "green"
t.pensize(yy)          Set the width of the lines
t.begin_fill()         Start filling a shape
t.end_fill()           Stop filling a shape
t.showturtle()         Show the turtle
t.hideturtle()         Hide the turtle
t.shape("turtle")      Change the turtle's costume (shape)
t.shapesize(yy)        Change the size of the turtle costume

Turtle Shapes
arrow
turtle
circle
square
triangle
classic
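The colour and fill commands above are usually combined in a fixed order: set the colours, call begin_fill(), draw a closed shape, then call end_fill(). A minimal sketch of that order follows. The function name draw_filled_square and its default arguments are hypothetical, and the function accepts any object that offers the listed turtle methods.

```python
def draw_filled_square(t, side, line_colour="red", fill_colour="green"):
    """Draw a filled square using a turtle-like object `t`."""
    t.pencolor(line_colour)   # colour of the outline
    t.fillcolor(fill_colour)  # colour used inside the shape
    t.pensize(3)              # outline width
    t.begin_fill()            # shapes drawn from here on are filled ...
    for _ in range(4):
        t.forward(side)
        t.left(90)
    t.end_fill()              # ... once end_fill() is called
```

To try it on screen, create a turtle as shown under Moving Turtle (import turtle, then t = turtle.Pen()) and call draw_filled_square(t, 100).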
Turtle Functions
t.circle(yy)                 Draw a circle of radius yy, to the left of the turtle
t.dot(yy, "orange")          Draw a dot of diameter yy at the current position
t.stamp()                    'Stamp' a copy of the turtle at the current position
t.write("words")             Write the words at the current position

Positioning Turtle
t.penup()                    Stop the turtle from drawing
t.pendown()                  Start the turtle drawing again
t.speed(yy)                  Set the speed of the turtle (1 slowest to 10 fast; 0 = no animation)
t.goto(x, y)                 Send the turtle to the coordinates x/y
t.home()                     Send the turtle home (the centre)
t.setx(x)                    Change the turtle's x coordinate
t.sety(y)                    Change the turtle's y coordinate

Turtle's World
screen = t.getscreen()       Get the screen
screen.bgcolor("red")        Change the colour of the screen
screen.exitonclick()         Set the screen to close when clicked
screen.title("Title Here")   Give the window a title

Colours
Red
Pink
Orange
Green
Blue
Cyan
Yellow
Gold
Purple
Navy
Olive
Salmon
PeachPuff
Lavender
Magenta
Black
White
Gray

