
Programming With Python: by Mustapha

This document discusses Python programming fundamentals including variables, data types, strings, tuples, ranges, and lists. It explains that variables store values that can be changed, strings are sequences of characters that can be indexed and sliced, tuples are immutable sequences, ranges generate sequences of numbers, and lists are mutable sequences that can hold elements of any type.

Uploaded by

Mustapha

PROGRAMMING WITH PYTHON
by Mustapha
VARIABLES AND SIMPLE DATA TYPES
• Every variable is connected to a value, which is the information
associated with that variable.
• Examples: message = "Live and Love!"
• Acc1 = 200000
• To display the value associated with a variable, the print function is used.
• Examples: print(message)
• print(Acc1)
• The value of the variable can be changed at any time. Python will keep
track of the current value.
NAMING AND USING VARIABLES
• Variable names can contain only letters, numbers, and underscores.
• They can start with a letter or an underscore, but not with a number
• Examples: Valid: Classifier_1, _1classifier
• Invalid: 1classifier, #1classifier
• Spaces are not allowed in variable names, but underscores can be
used to separate words in variable names.
• Valid: knn_classifier
• Invalid: knn classifier
• Python keywords cannot be used as variable names, and built-in function
names should be avoided because reusing them hides the built-in function.
NAMING AND USING VARIABLES
• Example: keywords: if, elif, while, etc.
Function names: print, sorted, etc.

• Variable names should be short but descriptive


• Example: student_name is better than s_n
split75_25 is better than s_75_25
General points to note
• Indentation, not braces: whitespace (tabs or spaces) is used to
structure code instead of braces as in many other languages.
• Colon: a colon denotes the start of an indented block
• Semicolons: Semicolons can be used to separate multiple statements
on a single line.
• Everything is an object: Every number, string, data structure, function,
class, module exists as a Python object. Each object has an associated
type and internal data.
• Comments: Any text preceded by the hash mark # is ignored by the
Python interpreter.
Binary operators
Some basic binary operators are:
a + b            add a and b
a - b            subtract b from a
a * b            multiply a and b
a / b            divide a by b
a // b           floor-divide a by b, dropping any fractional remainder
a ** b           raise a to the power b
a & b            True if both a and b are True; for integers, take the bitwise AND
a | b            True if either a or b is True; for integers, take the bitwise OR
a == b           True if a equals b
a != b           True if a is not equal to b
a < b, a <= b    True if a is less than (less than or equal to) b
a is b           True if a and b reference the same Python object
a is not b       True if a and b reference different Python objects
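As an illustration of the operators listed above, the following minimal sketch evaluates a few of them. The variable names and values are chosen for illustration only and are not part of the original slides.

```python
a, b = 7, 2

true_div = a / b       # true division always yields a float
floor_div = a // b     # floor division drops the fractional remainder
power = a ** b         # a raised to the power b

bit_and = 6 & 3        # bitwise AND on integers: 0b110 & 0b011
bit_or = 6 | 3         # bitwise OR on integers:  0b110 | 0b011

# == compares values, while "is" compares object identity
first = [1, 2, 3]
second = [1, 2, 3]
alias = first
same_value = first == second      # equal contents
same_object = first is second     # two distinct list objects
alias_is_first = alias is first   # both names reference one object

print(true_div, floor_div, power, bit_and, bit_or)
print(same_value, same_object, alias_is_first)
```

The last three results show why == is normally the right choice for comparing values: two lists with equal contents are still different objects.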
DATA STRUCTURES
STRINGS
STRINGS
• A string is a series of characters
• Anything inside quotes is considered a string in Python. The quotes
can be single or double.
• Examples:
• "Monte Carlo is not a simulation."
• 'Monte Carlo is not a simulation.'
• This flexibility allows programmers to use quotes and apostrophes
within strings:
• 'Natural Languages are the hardest says an "experienced" programmer'
• "Memoization is not 'memorization' wrongly spelt!"
STRINGS
• Strings are one of the several sequence types in Python
• Strings share three operations with the other sequence types:
1. Length
2. Indexing
3. Slicing
• The length of a string can be found using the len function. For
example, len('mathematica') outputs 11.
• Indexing can be used to extract individual characters from a string
• In Python, all indexing is zero-based
• Square brackets are used to index into all sequence types
STRINGS
• Python uses 0 to represent the first element of a string
• "mathematica"[0] produces 'm'
• Negative numbers are used to index from the end of a string. For
example, "mathematica"[-1] produces 'a'
• Slicing is used to extract substrings of arbitrary length. If s is a string,
the expression s[start:end] denotes the substring of s that starts at
index start and ends at index end - 1
• Excluding the end index means that expressions like s[0:len(s)]
produce the value one might expect, i.e. the full string.
STRINGS
• In the expression s[0:len(s)], where s is a string, if:
• the value before the colon is omitted, it defaults to 0,
as in s[:len(s)]
• the value after the colon is omitted, it defaults to the length of the string,
as in s[0:]
• both values are omitted, as in s[:], the expression is semantically
equivalent to the more verbose s[0:len(s)]
• s[::] (double colon):
• With no start, end, or step given, it is semantically equivalent to s[:]
• E.g. "mathematica"[::] produces "mathematica"
STRINGS
• s[::] (double colon):
• s[start::] begins the slice at the index corresponding to start
• "mathematica"[1::] produces "athematica"
• s[start::skip]
   • begins the slice at the index corresponding to start
   • takes every skip-th character, passing over skip - 1 characters between selections
   • "mathematica"[2::2] produces "teaia"
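The indexing and slicing rules above can be checked with a short sketch. The variable names are illustrative, and the string is the one used on the slides.

```python
s = "mathematica"

first_char = s[0]        # zero-based: the first character
last_char = s[-1]        # negative indices count from the end
prefix = s[:4]           # start omitted, so it defaults to 0
suffix = s[4:]           # end omitted, so it defaults to len(s)
whole = s[:]             # both omitted: the full string
every_second = s[2::2]   # start at index 2, then take every 2nd character

print(len(s), first_char, last_char)
print(prefix, suffix, whole, every_second)
```

Note that prefix + suffix always rebuilds the original string, which is one practical benefit of excluding the end index from a slice.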
STRINGS
• String formatting
a = 2; b = 4
print('a squared is {0}, b cubed is {1}'.format(a**2, b**3))
Output: a squared is 4, b cubed is 64
• The numbers in the curly braces specify the positions of the values to be printed
• Template = '{0:.2f} {1:s} are the equivalent of {2:d}'
   • {0:.2f} means to format the first argument as a floating point number with two decimal places
   • {1:s} means to format the second argument as a string
   • {2:d} means to format the third argument as an integer
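A minimal sketch of str.format in use follows. The values passed to the template are hypothetical, and the template has the word "inch" appended so that the result reads as a sentence. The sketch also shows that a precision of two decimal places is written .2f, whereas 2f only sets a minimum field width.

```python
a = 2
b = 4
line = 'a squared is {0}, b cubed is {1}'.format(a**2, b**3)

# '.2f' requests two decimal places; '2f' alone only sets a minimum width of 2
two_places = '{0:.2f}'.format(2.54)
width_only = '{0:2f}'.format(2.54)

# Hypothetical values for a template mixing float, string, and integer fields
template = '{0:.2f} {1:s} are the equivalent of {2:d} inch'
converted = template.format(2.54, 'centimetres', 1)

print(line)
print(two_places, width_only)
print(converted)
```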
Some methods on strings
• s.count(s1) counts how many times the string s1 occurs in s.
• s.find(s1) returns the index of the first occurrence of the substring s1 in s, and -1 if s1 is
not in s.
• s.rfind(s1) same as find, but starts from the end of s (the “r” in rfind stands for reverse).
• s.index(s1) same as find, but raises a ValueError exception if s1 is not in s.
• s.rindex(s1) same as index, but starts from the end of s.
• s.lower() returns a copy of s with all uppercase letters converted to lowercase.
• s.replace(old, new) returns a copy of s with all occurrences of the string old replaced by the string new.
• s.rstrip() returns a copy of s with trailing white space removed.
• s.split(d) Splits s using d as a delimiter. Returns a list of substrings of s. For example, the
value of 'David Guttag plays basketball'.split(' ') is ['David', 'Guttag', 'plays', 'basketball'].
If d is omitted, the substrings are separated by arbitrary strings of whitespace characters.
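The methods listed above can be tried on the example string from the split entry. This is a small sketch with illustrative variable names. Because strings are immutable, each method returns a new value and leaves s unchanged.

```python
s = 'David Guttag plays basketball'

count_a = s.count('a')    # number of occurrences of 'a'
first_a = s.find('a')     # index of the first 'a'
last_a = s.rfind('a')     # index of the last 'a'
missing = s.find('z')     # -1 because 'z' does not occur

# index behaves like find but raises ValueError when nothing is found
try:
    s.index('z')
    index_raised = False
except ValueError:
    index_raised = True

lowered = s.lower()
replaced = s.replace('basketball', 'chess')
stripped = 'trailing space   \n'.rstrip()
words = s.split(' ')
loose_words = 'a  b \t c'.split()   # no delimiter: split on runs of whitespace

print(count_a, first_a, last_a, missing, index_raised)
print(lowered, replaced, words, loose_words)
```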
TUPLES
• Like strings, tuples are immutable sequences of elements
• The elements of a tuple need not be characters
• The individual elements can be of any type, and need not be the same
type as each other
• Literals of type tuple are written by enclosing a comma separated list
of elements within parentheses. For example:
• tu1 = ()
• tu2 = (1, 'two', 1.9)
• A singleton tuple must have its element trailed by a comma
TUPLES
• Singleton tuple:
• Invalid (parentheses alone do not make a tuple): tu1 = (1) or tu2 = ('propagate') or tu3 = (1.0)
>>> type(tu1)
<class 'int'>
>>> type(tu2)
<class 'str'>
>>> type(tu3)
<class 'float'>
• Valid: tu1 = (1,) or tu2 = ('propagate',) or tu3 = (1.0,)
>>> type(tu1) == type(tu2) == type(tu3)
True
>>> type(tu1)
<class 'tuple'>
TUPLES
• Concatenation: like strings, tuples can be concatenated
• t1 = ('love', 'and', 'lead', '56')
• t2 = ('live', 'and', 'flourish')
>>> t1 + t2
('love', 'and', 'lead', '56', 'live', 'and', 'flourish')
• Indexing: tuples can also be indexed
>>> t1[1]
'and'
• Slicing: the elements of a tuple can be sliced to form a new tuple
>>> t1[0:2]
('love', 'and')
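Immutability means that an element of a tuple cannot be reassigned in place. The sketch below, with illustrative names, shows the TypeError that Python raises and how a modified tuple can be built from slices and concatenation instead.

```python
t1 = ('love', 'and', 'lead')

# Item assignment is not supported on tuples
try:
    t1[0] = 'live'
    mutated = True
except TypeError:
    mutated = False

# Build a new tuple instead; note the comma in the singleton ('live',)
t3 = ('live',) + t1[1:]

print(mutated, t1, t3)
```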
RANGES
• Like strings and tuples, ranges are immutable
• The range function returns an object of type range
• It takes three (3) integer arguments: start, stop, and step, and produces
the progression of integers start, start + step, start + 2*step, etc.
• range(start, stop, step)
• If step is positive, the last element is the largest integer start + i*step
less than stop
• If step is negative, the last element is the smallest integer start + i*step
greater than stop.
RANGE
• If only two arguments are supplied, a step of 1 is used.
• If only one argument is supplied, that argument is stop, start defaults
to 0, and step defaults to 1.
• All of the operations on tuples are also available for ranges, except for
concatenation and repetition.
>>>range(10)[2:6][2]
4
WHY? Explain!
RANGE
• The order of the integers in a range matters.
>>> range(0, 7, 2) == range(6, -1, -2)
False
• The test evaluates to False because, although the two ranges contain
the same integers, they occur in a different order.
>>> elem_r1 = [elem for elem in range(0, 7, 2)]
>>> elem_r1
[0, 2, 4, 6]
>>> elem_r2 = [elem for elem in range(6, -1, -2)]
>>> elem_r2
[6, 4, 2, 0]
RANGE
• The technique used above to build elem_r1 and elem_r2 is called a list comprehension.
• The amount of space occupied by a range is not proportional to its
length.
• A range is fully defined by its start, stop, and step values, and can therefore be
stored in a small amount of space.
• Objects of type range can be used anywhere a sequence of integers
can be used.
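The points about ranges can be verified with a short sketch. The names are illustrative, and the size comparison uses sys.getsizeof, whose result is specific to the interpreter (the equality shown is an assumption that holds for CPython).

```python
import sys

r = range(10)
sub = r[2:6]      # slicing a range yields another range: range(2, 6)
third = sub[2]    # the element at index 2 of 2, 3, 4, 5

r1 = range(0, 7, 2)
r2 = range(6, -1, -2)
same_order = r1 == r2                     # same integers, different order
same_members = sorted(r1) == sorted(r2)   # equal once order is ignored

# A range stores only start, stop, and step, so its size does not grow
small = sys.getsizeof(range(10))
large = sys.getsizeof(range(10**9))

print(sub, third, same_order, same_members, small == large)
```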
LISTS
• A list is an ordered sequence of values, where each value is identified
by an index.
• The elements of a list are enclosed in square brackets.
• The empty list is written as []
• Singleton lists are written without the comma before the closing
bracket: [1] is a singleton list.
>>> singletn_L = [1]
>>> type(singletn_L)
<class 'list'>
Branching Programs: control flow
• A conditional statement has three parts:
1. A test, i.e., an expression that evaluates to either True or False
2. A block of code that is executed if the test evaluates to True
3. An optional block of code that is executed if the test evaluates to False.
Form: if Boolean expression:
          block of code
      else:
          block of code
Comparison of sequence types
Type     Type of elements    Examples of literals          Mutable
str      Characters          '', 'a', 'abc', '123'         No
tuple    Any type            (), (3,), ('abc', 4)          No
range    Integers            range(10), range(1, 10, 2)    No
list     Any type            [], [3], ['abc', 4]           Yes

Branching Programs: control flow
• When either the true block or the false block of a conditional contains another
conditional, the conditionals are nested
if x%2 == 0:
    if x%3 == 0:
        print('Divisible by 2 and 3')
    else:
        print('Divisible by 2 and not by 3')
elif x%3 == 0:  # else if
    print('Divisible by 3 and not by 2')
• elif stands for "else if"
Branching Programs: control flow
Exercise:
Write a program that examines three distinct variables – x, y, and z –
and prints the largest odd number among them. If none of them is odd,
it should print a message to that effect.
Branching Programs: iteration or looping
• Looping or iteration enables programs to repeat an action many times
• Similar to a conditional statement, it begins with a test.
• If the test evaluates to True, the program executes the loop body
once, and then goes back to reevaluate the test.
• The process is repeated until the test evaluates to False, after which
control passes to the code following the iteration statement.
While statement
• The while statement can be used to create a loop
# to square an integer, the hard way
x = int(input('Enter an integer: '))
ans = 0
iterLeft = abs(x)  # abs() keeps the loop finite for negative x
while iterLeft != 0:
    ans = ans + abs(x)
    iterLeft = iterLeft - 1
print(str(x) + '*' + str(x) + ' = ' + str(ans))
While statement
# example 2: a program that computes the cube root of a perfect cube
x = int(input('Enter an integer: '))
ans = 0
while ans**3 < abs(x):
    ans = ans + 1
if ans**3 != abs(x):
    print(x, 'is not a perfect cube')
else:
    if x < 0:
        ans = -ans
    print('Cube root of', x, 'is', ans)
For Loops
• The general form of a for statement is
for variable in sequence:
    code block
• The variable following for is bound to the first value in the sequence,
and the code block is executed.
• The variable is then assigned to the second value, and the code block
is executed again.
• The process continues until the sequence is exhausted or the break
statement is executed within the code block.
For Loops
• The for statement can be used to iterate over the characters of a
string.
Example: a program to print the number of non-space characters in a string
total = 0
string = input('Enter a string: ')
for c in string:
    if c != ' ':
        total += 1
print(total)
For Loops
Exercises:
1. Write a program that computes the sum of the digits in the string
"54356786540"
2. Let s be a string that contains a sequence of decimal numbers
separated by commas, e.g. s = '1.23,2.4,3.123'. Write a program that
prints the sum of the numbers in s
3. Let s be a string that contains a sequence of decimals separated by
commas and white spaces, e.g. s = "1.23,2.4 3.123, 6.7". Write a
program that prints the sum of the numbers in s
*Your program should request input from the user
Functions
• Python provides a large number of built-in functions
• However, the focus here is on user-defined functions
• The ability for programmers to define and use their own functions is
a qualitative leap forward in convenience
• It enables programmers to write code with general utility
• Each function definition is of the form
def name of function (list of formal parameters):
    body of function
Functions
• We can define a function double as follows:
def double(x):
    return 2*x
• def is a keyword that tells Python a function is about to be defined
• x is a formal parameter of the function
• A function call or invocation binds the formal parameter(s) to the
actual parameters specified in the call, e.g. double(3) binds x to 3
• The code in the body of the function is executed until either a return
statement is encountered or there are no more statements to execute.
Functions
• return can only be used within the body of a function
• A function call is an expression whose value is the value returned by
the invoked function
• When the return statement is executed, the value of the expression
following the return becomes the value of the function invocation.
• If there are no more statements to execute and no return has been
executed, the function returns None.
Functions
Example: a function that returns the maximum of two numbers
def maxVal(x, y):
    if x > y:
        return x
    else:
        return y

• In practice, use the built-in function max() rather than defining your own.
Exercise: Write a function isIn that accepts two strings as arguments and
returns True if either string occurs anywhere in the other, and False
otherwise.
Hint: you might want to use the built-in str operation in.
Functions: scoping
def f(x):  # x is a formal parameter
    y = 1
    x = x + y
    print('x =', x)
    return x

x = 3
y = 2
z = f(x)
print('z =', z)
print('x =', x)
print('y =', y)
What are the outputs? Explain.
Functions: Scoping
• Each function defines a new name space, also called a scope.
• The formal parameter x and the local variable y that are used in f exist
only within the scope of the definition of f.
• The assignment statement x = x+y within the function body binds the
local name x to the object 4
• The assignments in f have no effect at all on the bindings of the names x
and y that exist outside the scope of f.
Output:
x = 4
z = 4
x = 3
y = 2
Functions: scoping
def f(x):
    def g():
        x = 'abc'
        print('x =', x)
    def h():
        z = x
        print('z =', z)
    x = x + 1
    print('x =', x)
    h()
    g()
    print('x =', x)
    h()
    g()
    print('x =', x)
    return g

x = 3; z = f(x); print('x =', x); print('z =', z); z()  # Predict the outputs
Functions: Specification
• A specification of a function defines a contract between the
implementer of the function and those who will be writing programs that use
the function (the clients)
• It often contains two parts: assumptions and guarantees
• Assumptions describe conditions that must be met by clients of the
function.
• Typically, assumptions describe constraints on the actual parameters
• Almost always, assumptions specify the acceptable set of types for each
parameter, and often some constraints on the value of one
or more of the parameters
Functions: Specification
• Guarantees describe conditions that must be met by the function,
provided it has been called in a way that satisfies the assumptions
• Assumptions and guarantees are written in a docstring, which is enclosed
in a pair of triple quotation marks.
• The first two lines of the docstring of findRoot describe conditions
(assumptions) that must be met by clients of the function.
• The last two lines of the docstring of findRoot describe the guarantees
that the implementation of the function must meet.
Functions: Specification
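A minimal sketch of a function whose docstring follows this pattern is given here. The name findRoot comes from the text above; the parameter names, the wording of the docstring, and the bisection-search body are assumptions made for illustration and are not necessarily the author's code.

```python
def findRoot(x, power, epsilon):
    """Assumes x and epsilon are int or float, power is an int,
           epsilon > 0 and power >= 1
       Returns a float y such that y**power is within epsilon of x.
           If no such float exists, returns None"""
    # A negative number has no real even root
    if x < 0 and power % 2 == 0:
        return None
    # Bisection search on an interval that must contain the root
    low = min(-1.0, x)
    high = max(1.0, x)
    ans = (high + low) / 2.0
    while abs(ans**power - x) >= epsilon:
        if ans**power < x:
            low = ans
        else:
            high = ans
        ans = (high + low) / 2.0
    return ans


cube_root = findRoot(27, 3, 0.001)
no_root = findRoot(-16, 2, 0.001)
print(cube_root, no_root)
```

The first two lines of the docstring are the assumptions, and the last two are the guarantees. A client who calls findRoot with epsilon <= 0 has broken the contract, so the function promises nothing in that case.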
Dictionaries (associative array)
• Objects of type dict are similar to lists except that indexing is done
using keys.
• A dictionary can be thought of as a set of key/ value pairs
• Literals of type dict are enclosed in curly braces. {}
• Each element is written as a key followed by a colon followed by a
value, e.g. Ty = {1: "jan", 2: "feb"}
• Dictionaries are mutable.
Dictionaries
• The general form of a dictionary is:
variable = {key_1: value_1, key_2: value_2, key_3: value_3, ..., key_n: value_n}
• Keys must be hashable: objects of type str, int, or tuple can be keys. Lists and dicts are
unhashable and cannot be used as keys
calender = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 1: 'Jan', 2: 'Feb', 3: 'Mar',
            ('26 Dec', '1 Jan'): ['Boxing Day', "New Year's Day"]}
• In calender, 'Jan' is mapped to 1, 'Feb' to 2, etc.
• The entries in a dictionary cannot be accessed with a positional index
(since Python 3.7 they are kept in insertion order)
• Entries are accessed with the keys associated with them
Dictionaries: Example program
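A short example program using a dictionary might look like the following sketch, which counts how often each word occurs in a string. The text and the variable names are hypothetical and chosen only for illustration.

```python
text = 'live and love and lead'

# Map each word to the number of times it occurs
counts = {}
for word in text.split():
    if word in counts:
        counts[word] = counts[word] + 1
    else:
        counts[word] = 1

# Entries are accessed by key, not by position
and_count = counts['and']

# Iterating over a dictionary visits its keys
for word in counts:
    print(word, counts[word])
```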
Dictionaries
• It is often convenient to use tuples as keys.
• E.g. a tuple of the form (EquipNumber, MTTR) to represent
equipment reliability.
• It would be easy to use this tuple as a key in a dictionary implementing a
mapping from equipment to reliabilities.
• A for statement can be used to iterate over the entries in a dictionary.
• As with lists, there are many useful methods associated with
dictionaries.
Dictionaries: common methods
• len(d) returns the number of items in d.
• d.keys() returns a view of the keys in d.
• d.values() returns a view of the values in d.
• k in d returns True if key k is in d.
• d[k] returns the value in d associated with key k, and raises a KeyError if k is not in d.
• d.get(k, v) returns d[k] if k is in d, and v otherwise.
• d[k] = v associates the value v with the key k in d. If there is already a value
associated with k, that value is replaced.
• del d[k] removes the key k from d.
• for k in d iterates over the keys in d.
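The operations listed above can be exercised on a small dictionary. The dictionary contents (days per month) and the variable names in this sketch are hypothetical.

```python
d = {'Jan': 31, 'Feb': 28}

size_before = len(d)         # number of items
has_jan = 'Jan' in d         # membership tests look at keys
fallback = d.get('Mar', 0)   # 'Mar' is absent, so the default is returned

d['Mar'] = 31                # add a new key/value pair
d['Feb'] = 29                # replace the value already associated with 'Feb'
del d['Jan']                 # remove a key

keys = list(d.keys())
values = list(d.values())
visited = [k for k in d]     # iterating over d visits its keys

# d[k] raises KeyError for a missing key, unlike d.get(k, v)
try:
    d['Jan']
    key_error = False
except KeyError:
    key_error = True

print(size_before, has_jan, fallback, keys, values, key_error)
```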
Functions as Objects
• In Python, functions are first-class objects: they can be treated like
objects of any other type, e.g. int or list.
• They have types, e.g. type(abs) has the value <class 'builtin_function_or_method'>
• They can appear in expressions, e.g., as the right-hand side of an assignment
statement or as an argument to a function
• They can be elements of lists, etc.
• Using functions as arguments allows a style of coding called higher-
order programming.
Functions
def applyToEach(L, f):
    """Assumes L is a list, f a function
       Mutates L by replacing each element, e, of L by f(e)"""
    for i in range(len(L)):
        L[i] = f(L[i])

L = [1, -2, 0, -4, 5, 9]
print('L =', L)
print('Apply int to each element of L.')
applyToEach(L, int)
• The function applyToEach is called higher-order because it has an
argument that itself is a function.
Exceptions and Assertions
• An "exception" is usually defined as "something that does not conform to the norm" and is therefore
somewhat rare.
• There is nothing rare about exceptions in Python.
• Virtually every module in the standard Python library uses them, and Python will raise them in
many different circumstances.
• For example,
test = [1, 2, 3]
test[3]
• The interpreter will respond with
IndexError: list index out of range
• IndexError is the type of exception that Python raises when a program tries to access an element
that is outside the bounds of an indexable type.
• The string following IndexError provides additional information about what caused the exception to
occur.
Handling Exceptions
• When an exception is raised that causes the program to terminate, we
say that an unhandled exception has been raised.
• Many times, an exception is something a programmer can and should
anticipate.
• If you know a line of code might raise an exception when executed,
you should handle the exception.
• In a well written program, unhandled exceptions should be the
exceptions.
• try-except blocks are used to handle exceptions
Exceptions
• A try-except block has the form:
try:
    statement1
except errorName:
    statement2
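As a sketch of the form above, the two helper functions below handle an IndexError and a ValueError respectively. The function names safe_get and to_int are hypothetical and introduced only for this illustration.

```python
def safe_get(seq, i, default=None):
    """Returns seq[i], or default if i is outside the bounds of seq"""
    try:
        return seq[i]
    except IndexError:
        return default


def to_int(text):
    """Returns int(text), or None if text is not a valid integer"""
    try:
        return int(text)
    except ValueError:
        return None


test = [1, 2, 3]
in_bounds = safe_get(test, 1)
out_of_bounds = safe_get(test, 3, 'missing')
good_number = to_int('42')
bad_number = to_int('4two')

print(in_bounds, out_of_bounds, good_number, bad_number)
```

Naming the specific exception after except keeps unrelated errors visible instead of silently swallowing them.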
File handling
• Python is very popular for text and file munging because it offers a simple
way to work with files.
• Each operating system comes with its own file system for creating and
accessing files.
• Python achieves operating-system independence by accessing files through
a file handle
• The code below instructs the operating system to create a file with the
name results, and return a file handle for that file.
nameHandle = open('results', 'w')
• The argument 'w' to open indicates that the file is to be opened for writing.
File handling
nameHandle = open('results', 'w')
for i in range(2):
    name = input('Enter name: ')
    nameHandle.write(name + '\n')
nameHandle.close()
• The above code opens a file, uses the write method to write two
lines, and then closes the file.
• To avoid the risk of losing data that has been written, it is important to close the file
once the program is finished using it.
File handling
• We can open the file for reading and print its content.
• Python treats the file as a sequence of lines; a for statement can be used
to iterate over the file’s contents.
nameHandle = open('results', 'r')
for line in nameHandle:
    print(line)
nameHandle.close()
• The extra blank line between entries appears because each line read from the
file already ends with '\n', and print adds a newline of its own.
• To avoid printing an extra line, pass line[:-1] to the print function
File handling
nameHandle = open('results', 'w')
nameHandle.write('reliability\n')
nameHandle.write('Mean Time To Failure\n')
nameHandle.write('Mean Time To Repair\n')
nameHandle.write('Mean Time To Detection\n')
nameHandle.write('Mean Time Between Failure\n')
nameHandle.write('Average lifetime\n')
nameHandle.write('Probability of failure\n')
nameHandle.close()
nameHandle = open('results', 'r')
for line in nameHandle:
    print(line)
nameHandle.close()
• Running the above code will overwrite the previous contents of the file results.
• To avoid this, open the file for appending (instead of writing) by using the argument 'a'
File handling
• The first argument of open() is the path of the file, which gives the
file's name and location.
path = 'examples/results.txt'
f = open(path)

• The open() function can be written inside the with statement. When
the with block is exited, the file is automatically closed.
with open(path, 'r') as f:
    lines = [x.rstrip() for x in f]
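The sketch below combines writing, appending, and reading with the with statement. It writes to a temporary directory created with the standard tempfile module so that no existing file is touched; the file name results.txt and the lines written are illustrative.

```python
import os
import tempfile

# Work in a fresh temporary directory (an assumption made for this sketch)
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'results.txt')

# 'w' creates the file, or overwrites it if it already exists
with open(path, 'w') as f:
    f.write('reliability\n')
    f.write('Mean Time To Failure\n')

# 'a' keeps the existing contents and adds to the end
with open(path, 'a') as f:
    f.write('Mean Time To Repair\n')

# rstrip() removes the trailing newline from each line read
with open(path, 'r') as f:
    lines = [x.rstrip() for x in f]

closed_after_block = f.closed   # the with block has already closed the file
print(lines, closed_after_block)
```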
File handling
Exercise:
Write a program to open results for appending, write three new lines to
it, and print the number of lines in the updated file.
Common functions for accessing files
open(fn, 'w') creates a file fn for writing (overwriting any existing file) and returns a file handle
open(fn, 'r') opens an existing file for reading and returns a file handle
open(fn, 'a') opens a file for appending (creating it if it does not exist) and returns a file handle
fh.read() returns a string containing the contents of the file associated with the file handle fh
fh.readline() returns the next line in the file associated with the file handle fh
fh.readlines() returns a list each element of which is one line of the file associated with the
file handle fh.
fh.write(s) writes the string s to the end of the file associated with the file handle fh
fh.writelines(S) S is a sequence of strings. Writes each element of S to the
file associated with the file handle fh. No line separators are added.
fh.close() closes the file associated with the file handle fh.
Exercises
1. Write a program that asks the user to input a string, then searches the string for numbers, and prints a dictionary in
which each number is mapped to a key that depicts its position among the numbers found.
For example,
given the string string_ = "I know 7one is the first word 4 in the library", your program should print the dictionary {'1st': 7, '2nd': 4}
Hint: it might help to consider including a try-except block
2. Implement a function that meets the specification below. Use a try-except block
def sumDigits(s):
    """Assumes s is a string
       Returns the sum of the decimal digits in s
       For example, if s is 'a2b3c' it returns 5"""
3. Implement a function that meets the specification below.
def numWords(p):
    """Assumes p is a string for the file path
       Returns the total number of words in the file"""

4. Write a program that examines three distinct variables – x, y, and z – and prints the largest odd number among
them. If none of them is odd, it should print a message to that effect.

5. Write a program to open results for appending, write three new lines to it, and print the number of lines in the
updated file.
6. Let s be a string that contains a sequence of decimals separated by commas and white spaces, e.g.
s = "1.23,2.4 3.123, 6.7". Write a program that prints the sum of the numbers in s
*Your program should request input from the user
NumPy Basics: Arrays and Vectorized Computation
• NumPy is short for Numerical Python.
• It is one of the most important foundational packages for numerical
computation in Python.
• NumPy itself does not provide modeling or scientific functionality.
• However, an understanding of NumPy arrays and array-oriented
computing will help you use tools with array-oriented semantics, like
pandas
• NumPy is important for numerical computations in Python because it
is designed for efficiency on large arrays of data
NumPy Basics: why is NumPy efficient?
• NumPy internally stores data in a contiguous block of memory,
independent of other built-in Python objects
• NumPy operations perform complex computations on entire arrays
without the need for Python for loops.
• NumPy-based algorithms are generally 10 to 100 times faster (or
more) than their pure Python counterparts and use significantly less
memory.
• Example code that compares the compute times of a NumPy array and a
Python list is shown below:
NumPy Basics
import numpy as np
my_arr = np.arange(1000000)
my_list = list(range(1000000))

# let's multiply each sequence by 2
# (%time is an IPython/Jupyter magic command, not plain Python)
%time for i in range(10): my_arr2 = my_arr * 2
%time for i in range(10): my_list2 = [x * 2 for x in my_list]


NumPy ndarray
• ndarray is short for N-dimensional array object.
• It is a fast, flexible container for large datasets in Python
• Arrays enable the user to perform operations on whole blocks of data (batch
operations) using similar syntax to the equivalent operations between scalar
elements
import numpy as np
#generate some random data
data = np.random.randn(2, 3)
data
data * 10
data+data
NumPy ndarrays
• ndarray is a multidimensional container for homogeneous data; that is,
all of the elements must be the same type
• Every array has a shape, a tuple indicating the size of each dimension.
• Every array also has a dtype, an object describing the data type of the array:
data.shape

data.dtype
NumPy Basics: creating ndarrays
• The easiest way to create an array is to use the array function
• The array function accepts any sequence-like object (including other
arrays) and produces a NumPy array containing the passed data.
data1 = [9,6,4,3,3,6.3]
arr1 = np.array(data1)

arr1

• Nested sequences, like a list of equal-length lists, will be converted into a
multidimensional array
data2 = [[1,2,3,4], [5,6,7,8]]
arr2 = np.array(data2)
arr2
Creating ndarray
• Since data2 was a list of lists, arr2 has two dimensions with shape
inferred from the data
arr2.ndim
arr2.shape

• Unless explicitly specified, np.array tries to infer a good data type for
the array that it creates.
arr1.dtype
arr2.dtype
• There are a number of other functions for creating new arrays.
Examples are zeros, ones, and empty.
Creating ndarrays
np.zeros(10)

np.zeros((3,6))

np.empty((2, 3, 2))  # the shape must be passed as a single tuple

• arange is an array-valued version of the built-in Python range function


np.arange(15)
Array creation functions
Data Types for ndarrays
• The data type or dtype is a special object containing the information
about the data
arr1 = np.array([1,2,3], dtype = np.float64)
arr2 = np.array([1,2,3], dtype = np.int32)

arr1.dtype
arr2.dtype
Arithmetic with NumPy Arrays
• Arrays of the same shape can be added, subtracted, and multiplied using the +, -, and
* operators, respectively.
• When arrays of the same shape are compared, the operation produces a Boolean array.
arr = np.array(([1, 2, 3], [4, 5, 6]))
arr2 = np.array(([0, 4, 1], [7, 2, 12]))

arr2 > arr
• An operation between arrays of different shapes is called broadcasting.
• The simplest example of broadcasting occurs when combining a scalar value with an
array.
arr * 4
• The scalar value 4 is broadcast to all the other elements in the multiplication
operation.
Broadcasting cont’d…..
• Broadcasting is useful in standardization or scaling of data.
• Scaling can help to reduce multicollinearity. It is especially useful when
the model contains highly correlated predictors, higher order terms and
interaction terms
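• For example, in standard scaling, $Z = \frac{X - \mu}{\sigma}$, where $\mu$ is the mean
and $\sigma$ is the standard deviation of the feature $X$
• Broadcasting can be used to implement the computation of the Z-score
above.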
• A scalar value (the mean) is subtracted from each feature (or
predictor), then the differences are each divided by another scalar (the
standard deviation).
Broadcasting cont’d
• The columns of arr can be standardized with broadcasting as shown in the code below:
mean = arr.mean(0)  # computes the mean of each column
demeaned = arr - mean
scaled_arr = demeaned / arr.std(0)
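A self-contained version of this computation is sketched here on a small hypothetical array with two columns. After scaling, each column should have a mean of 0 and a standard deviation of 1, which the last lines check.

```python
import numpy as np

# Hypothetical data: 3 observations (rows) of 2 features (columns)
arr = np.array([[1.0, 10.0],
                [2.0, 20.0],
                [3.0, 30.0]])

mean = arr.mean(0)          # shape (2,): one mean per column
std = arr.std(0)            # shape (2,): one standard deviation per column

# The (2,) arrays are broadcast across all 3 rows of the (3, 2) array
demeaned = arr - mean
scaled_arr = demeaned / std

print(scaled_arr)
print(np.allclose(scaled_arr.mean(0), 0), np.allclose(scaled_arr.std(0), 1))
```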
Quiz:
Create a 7x4 array. Scale all the columns of the array using the min-max
scaling (use broadcasting)
$X_{mm} = \frac{X - X_{min}}{X_{max} - X_{min}}$
Broadcasting Rule
• Two arrays are compatible for broadcasting if, for each trailing
dimension (i.e., starting from the end), the axis lengths match or
either of the lengths is 1.
• Broadcasting is performed over the missing or length-1 dimensions
 (4, 3)       (3,)       (4, 3)
 0 0 0       1 2 3       1 2 3
 1 1 1   +   1 2 3   =   2 3 4
 2 2 2       1 2 3       3 4 5
 3 3 3       1 2 3       4 5 6
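The diagram above can be reproduced with the sketch below, which also shows one compatible and one incompatible pairing of shapes. The array names are illustrative.

```python
import numpy as np

grid = np.array([[0, 0, 0],
                 [1, 1, 1],
                 [2, 2, 2],
                 [3, 3, 3]])           # shape (4, 3)
row = np.array([1, 2, 3])              # shape (3,)

# Trailing axis lengths match (3 and 3), so row is repeated down the rows
total = grid + row

# A (4, 1) column is also compatible: its length-1 axis is stretched to 3
column = np.arange(4).reshape(4, 1)
total_col = grid + column

# A (4,) array is not compatible with (4, 3): trailing lengths are 4 and 3
try:
    grid + np.arange(4)
    compatible = True
except ValueError:
    compatible = False

print(total)
print(total_col)
print(compatible)
```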
Basic Indexing and Slicing
• NumPy indexing can be used to select a subset of your data or
individual elements.
• On the surface, indexing 1D arrays works like indexing Python lists.
arr = np.arange(10)

arr[5]

arr[5:8] #slicing

arr[5:8] = 12 #mutation through assignment, what do you observe?


Basic Indexing and Slicing
• A distinction from Python’s built-in list is that array slices are views on the
original array. i.e. the data is not copied, therefore, any modification to
the view will be reflected in the source array.
• For example,
arr = np.arange(10)
arr_slice = arr[5:8]

arr_slice[1] = 1234

arr

• Consequently, arr_slice[:] = 64 will produce the same effect on arr as
arr[5:8] = 64
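The difference between an array slice and a list slice can be seen side by side in the following sketch. The names py_list and list_slice are introduced here for illustration.

```python
import numpy as np

# Slicing an ndarray gives a view that shares memory with the source
arr = np.arange(10)
arr_slice = arr[5:8]
arr_slice[1] = 1234          # also changes arr[6]
shares = np.shares_memory(arr, arr_slice)

# Slicing a Python list gives an independent copy
py_list = list(range(10))
list_slice = py_list[5:8]
list_slice[1] = 1234         # py_list is left unchanged

# Assigning through the view updates the whole slice of the source
arr_slice[:] = 64

print(arr)
print(py_list)
print(shares)
```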
Basic Indexing and Slicing
• One justification for this behavior is performance.
NumPy has been designed to work with very large arrays, and
performance and memory problems could become significant if NumPy
insisted on always copying data.
• To get a copy of a slice of an ndarray instead of a view, you need to
explicitly copy the array, for example arr[5:8].copy()
• In a two-dimensional array, the elements at each index are no longer
scalars but rather one-dimensional arrays
arr1 = np.array([[1,2,3], [4,5,6], [7,8,9]])
arr1[2]
Indexing elements in a NumPy array
• The individual elements of an array can be accessed recursively, but the preferred
approach is to pass a comma-separated list of indices.
For example, arr1[0, 2] is equivalent to the lengthier arr1[0][2]
• In a multidimensional array, if later indices are omitted, the returned object will be a
lower dimensional ndarray consisting of all the elements along the higher dimensions.
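As an illustration, the sketch below uses a hypothetical 3D array named cube (deliberately different from the quiz array) to show that both index styles agree and that omitting later indices lowers the number of dimensions.

```python
import numpy as np

# Hypothetical array for illustration: values 0..23 arranged as 2 x 3 x 4.
cube = np.arange(24).reshape(2, 3, 4)

print(cube[1, 2, 3] == cube[1][2][3])  # True: both styles pick the same element
print(cube[1, 2, 3])                   # 23
print(cube[1].shape)                   # (3, 4): two indices omitted
print(cube[1, 2].shape)                # (4,): one index omitted
```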
Quiz:
arr_3d = np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])
predict the outputs:
arr_3d[0]
arr_3d[1,0]
arr_3d[0,1]
arr_3d[0,1,2]
Indexing with slices
Boolean Indexing
Basic Indexing and Slicing
Quiz:

Q_arr = np.array([[1,2,3],[4,5,6],[7,8,9]])
Predict the shapes of:
1) Q_arr[:2, 1:] (2,2)
2) Q_arr[2] (3,)
3) Q_arr[2, :] (3,)
4) Q_arr[2:, :] (1,3)
5) Q_arr[:, :2] (3,2)
6) Q_arr[1, :2] (2,)
7) Q_arr[1:2, :2] (1,2)
Boolean Indexing
• The ~ operator can be useful when you need to invert a general condition.
• For example,
cond = names == 'Ada'
data[~cond]
• Multiple conditions can be combined using the Boolean arithmetic
operators & (and) and | (or)
For example,
Tail = (names == 'ada') | (names == 'ola')
• Selecting data from an array by Boolean indexing always creates a copy of
the data, even if the returned array is unchanged.
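A small worked example of the points above. The contents of names and data here are assumptions made up for illustration (one row of data per name); they are not the arrays used on the slides.

```python
import numpy as np

# Hypothetical data: one row of `data` for each entry in `names`.
names = np.array(['Ada', 'Ola', 'Tim', 'Ada', 'Tim'])
data = np.arange(10).reshape(5, 2)

cond = names == 'Ada'        # Boolean array: True at positions 0 and 3
print(data[cond])            # rows 0 and 3
print(data[~cond])           # rows 1, 2 and 4

mask = (names == 'Ada') | (names == 'Ola')
selected = data[mask]        # rows 0, 1 and 3, copied into a new array
selected[:] = -1             # modifying the copy ...
print(data[0])               # [0 1] ... leaves data unchanged
```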
Boolean Indexing
• The Python keywords and and or do not work with Boolean arrays
• Other comparison operators such as less than (<), greater than (>), not equal (!=), and the
combined forms <= and >= can be used in the same way as equality (==).
• For example,
data[data<0]
data[names != 'ada']
data[data<=0]
• These operations with two-dimensional data are more convenient to do with pandas
arr = np.empty((5,5))
for i in range(4):
    arr[i+1] = 0

arr[arr<0]
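The sketch below contrasts the element-wise operators with the Python keyword or on a small made-up array. Each comparison is wrapped in parentheses because | and & bind more tightly than < and >.

```python
import numpy as np

data = np.array([-2, -1, 0, 1, 2])   # hypothetical values

outer = data[(data < 0) | (data > 1)]   # element-wise "or": -2, -1 and 2
nonzero = data[data != 0]               # everything except 0
nonpositive = data[data <= 0]           # -2, -1 and 0
print(outer, nonzero, nonpositive)

# The keyword `or` asks for the truth value of a whole array, which NumPy
# refuses to guess for arrays with more than one element.
try:
    data[(data < 0) or (data > 1)]
    keyword_works = True
except ValueError:
    keyword_works = False
print(keyword_works)                    # False
```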
Fancy Indexing
• Fancy indexing is a term used to describe indexing with integer arrays
• A list or array of integers can be passed to the array being indexed
arr = np.empty((10,5)) # creates an empty 10x5 array

for i in range(10):
    arr[i] = i

arr
• To select a subset of the rows in a particular order, you can pass a list or ndarray of integers
specifying the desired order:
arr[[4, 3, 0, 6]]
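A runnable version of the row selection above. Because every row of arr is filled with its own row number, the first column of the result shows the order in which the rows were picked.

```python
import numpy as np

arr = np.empty((10, 5))
for i in range(10):
    arr[i] = i               # row i is filled with the value i

picked = arr[[4, 3, 0, 6]]   # rows 4, 3, 0, 6 in that order
print(picked.shape)          # (4, 5)
print(picked[:, 0])          # [4. 3. 0. 6.]

# An ndarray of integers works the same way as a list.
same = arr[np.array([4, 3, 0, 6])]
print(np.array_equal(picked, same))  # True
```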
Quiz:
Predict the output
arr[[-3, -5, -7]]
Fancy Indexing
• Passing multiple index arrays selects a one-dimensional array of
elements corresponding to each tuple of indices:
arr = np.arange(24).reshape((6,4))
arr[[1,4,3,5,0], [3,0,1,2,0]]
• Fancy indexing, unlike slicing, copies the data into a new array
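The sketch below works through the example above: the two index lists are paired up element by element, so the result is one-dimensional, and writing to the result does not touch arr.

```python
import numpy as np

arr = np.arange(24).reshape((6, 4))

# Picks the elements at (1,3), (4,0), (3,1), (5,2) and (0,0).
pairs = arr[[1, 4, 3, 5, 0], [3, 0, 1, 2, 0]]
print(pairs.shape)     # (5,)
print(pairs.tolist())  # [7, 16, 13, 22, 0]

pairs[0] = -99         # the result is a copy ...
print(arr[1, 3])       # 7 ... so arr is unchanged
```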
Quiz:
Predict the outputs
arr[[-1,-3,-4,0],[-3,0,1,2]]
arr[[1,2,3,4],[0,1,2,3]]
Transposing Arrays and Swapping Axes
• Transposing is a form of reshaping. Similar to reshaping, it returns a view on the underlying data
without copying anything.
• Arrays have a transpose method and a special T attribute
• To transpose an array using the array attribute, use:
arr.T

• To transpose an array using the transpose method, use:


arr.transpose()

• For higher dimensional array, you can pass a tuple indicating the axes transpose order.
arr = np.arange(24).reshape(2,4,3)

arr

arr.transpose(1,2,0)

• Here, the axes will be reordered with the second axis first, the third axis second, and the first axis
last. The transposed array will have a shape of 4x3x2
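A quick check of the axis reordering: with transpose(1, 2, 0), the element at position [i, j, k] of the result is the element at [k, i, j] of the original, and no data is copied. The name t is illustrative.

```python
import numpy as np

arr = np.arange(24).reshape(2, 4, 3)
t = arr.transpose(1, 2, 0)   # new axis order: old axis 1, old axis 2, old axis 0

print(t.shape)                         # (4, 3, 2)
print(t[3, 2, 1] == arr[1, 3, 2])      # True
print(np.shares_memory(arr, t))        # True: a view, nothing was copied

# With no arguments (or with .T) the axes are simply reversed.
print(np.array_equal(arr.T, arr.transpose(2, 1, 0)))  # True
```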
Universal Functions
• A universal function, or ufunc, performs element-wise operations on data in
ndarrays
• Examples are np.sqrt(array), np.exp(array). These are referred to as unary
ufuncs.
• Binary ufuncs take two arrays and return a single array as the result
Examples are: np.add(array1, array2) and np.maximum(array1, array2)
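A brief illustration of a unary and a binary ufunc on small made-up arrays (a and b are hypothetical). np.maximum is left out on purpose so the quiz stays open.

```python
import numpy as np

a = np.array([1.0, 4.0, 9.0])
b = np.array([10.0, 20.0, 30.0])

roots = np.sqrt(a)      # unary: one input array, applied element by element
total = np.add(a, b)    # binary: two input arrays, same result as a + b

print(roots)            # [1. 2. 3.]
print(total)            # [11. 24. 39.]
```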

Quiz:
Create two arrays of the same shape and compare each element of the arrays
and save the maximum in a new array.
Unary ufuncs
Binary ufuncs
Other binary ufuncs
Simple Mathematical and Statistical Methods
• A set of mathematical functions that can compute statistics about an entire array
are accessible as methods of the array class.
• Aggregations such as sum, mean, std can be used either by calling the array
instance method or using the top-level NumPy function.
arr = np.random.randn(5,4)

arr.mean() # computes the mean of all elements in arr


np.mean(arr) #computes the mean of the elements in arr
arr.sum() #computes the sum of the elements in arr
• Functions such as mean and sum take an optional axis argument that computes
the statistic over the given axis, resulting in an array with one fewer dimension.
arr.mean(axis = 1) #or arr.mean(1) means “compute mean across the columns”
arr.sum(axis = 0) # arr.sum(0) means “compute sum down the rows.”
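Random data makes the axis argument hard to verify by eye, so the sketch below uses a small fixed array (hypothetical values) where the results can be checked by hand.

```python
import numpy as np

arr = np.array([[1.0, 2.0, 3.0, 4.0],
                [5.0, 6.0, 7.0, 8.0]])   # shape (2, 4)

print(arr.mean())                # 4.5, the mean of all eight elements
row_means = arr.mean(axis=1)     # one value per row: 2.5 and 6.5
col_sums = arr.sum(axis=0)       # one value per column: 6, 8, 10, 12

print(row_means.shape)           # (2,)
print(col_sums.shape)            # (4,)
```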
Accumulation functions
• Accumulation functions like cumsum and cumprod return an array of
the same size.
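A one-dimensional sketch (the array v is made up for illustration) shows what accumulation means: each output element combines all the input elements up to that position.

```python
import numpy as np

v = np.array([1, 2, 3, 4])

running_sum = v.cumsum()     # 1, 1+2, 1+2+3, 1+2+3+4
running_prod = v.cumprod()   # 1, 1*2, 1*2*3, 1*2*3*4

print(running_sum.tolist())            # [1, 3, 6, 10]
print(running_prod.tolist())           # [1, 2, 6, 24]
print(running_sum.shape == v.shape)    # True
```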
arr = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])

QUIZ:
Predict the outputs of:
arr.cumsum(axis = 0)

arr.cumprod(axis = 1)
Other statistical methods
Methods for Boolean Arrays
• Certain methods can work on Boolean arrays to provide insights
• For example sum can be used to count the True values in a Boolean
array.
(arr>0).sum() #counts the number of positive values in arr

(arr>0).any() #returns True if one or more values in arr is positive

(arr>0).all() #returns True if all the values in arr are positive

• any and all also work with non-boolean arrays, where non-zero
elements evaluate to True.
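The same methods on a small fixed array (hypothetical values), followed by any and all applied to integer arrays.

```python
import numpy as np

arr = np.array([-1.5, 0.0, 2.0, 3.5])

print((arr > 0).sum())   # 2: two values are positive
print((arr > 0).any())   # True
print((arr > 0).all())   # False

# On non-Boolean arrays, non-zero elements count as True.
print(np.array([0, 0, 3]).any())  # True
print(np.array([1, 0, 3]).all())  # False
```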
Sorting and Unique set operations
• NumPy arrays can be sorted in-place with the sort method
arr = np.random.randn(8)
arr.sort()

arr1 = np.random.randn(5,3)

arr1

arr1.sort(1) #sorting a multidimensional array
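A sketch with fixed values, assuming the "unique" in the slide title refers to np.unique. It also contrasts the in-place sort method with the top-level np.sort function, which returns a sorted copy. The names grid and values are illustrative.

```python
import numpy as np

grid = np.array([[3, 1, 2],
                 [9, 7, 8]])

sorted_copy = np.sort(grid, axis=1)  # returns a new array, grid is untouched
print(grid[0].tolist())              # [3, 1, 2]

grid.sort(1)                         # sorts each row in place
print(grid.tolist())                 # [[1, 2, 3], [7, 8, 9]]

values = np.array([3, 1, 2, 3, 1])
print(np.unique(values).tolist())    # [1, 2, 3]: sorted unique values
```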


• Quiz
Predict the output:
arr1.sort(0)
arr1
Other operations
Linear Algebra
• Linear algebra operations, such as matrix multiplication, decompositions, determinants, and
other square matrix math, are an important part of any array library.
• Unlike in some languages, such as MATLAB, multiplying two-dimensional arrays with *
gives an element-wise product.
• For matrix multiplication, the function dot is used.
x = np.array([[1, 2, 3], [4, 5, 6]])
y = np.array([[6, 23], [-1, 7], [8, 9]])

x.dot(y) #is equivalent to np.dot(x, y)
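Running the example shows the difference between the two kinds of product. The @ operator used in the last check is a standard Python and NumPy feature that the slides do not mention; it is included here only as an equivalent spelling of dot for 2D arrays.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])       # shape (2, 3)
y = np.array([[6, 23], [-1, 7], [8, 9]])   # shape (3, 2)

product = x.dot(y)            # matrix product: (2, 3) times (3, 2) gives (2, 2)
print(product.tolist())       # [[28, 64], [67, 181]]

elementwise = x * x           # * multiplies matching elements, shape stays (2, 3)
print(elementwise.tolist())   # [[1, 4, 9], [16, 25, 36]]

print(np.array_equal(product, x @ y))  # True
```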


Quiz:
Predict the size of:
np.dot(x, np.ones(3))
Linear Algebra
• To perform other functions such as matrix decompositions, inverse
and determinant, numpy.linalg can be used.
from numpy.linalg import inv
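A minimal sketch of inv and det from numpy.linalg on a made-up 2x2 matrix m. Multiplying a matrix by its inverse should give the identity matrix, up to floating-point rounding, which is why np.allclose is used for the check.

```python
import numpy as np
from numpy.linalg import inv, det

# Hypothetical invertible matrix; its determinant is 4*6 - 7*2 = 10.
m = np.array([[4.0, 7.0],
              [2.0, 6.0]])

m_inv = inv(m)
print(np.allclose(m.dot(m_inv), np.eye(2)))  # True
print(np.isclose(det(m), 10.0))              # True
```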
