Machine Learning With Python
By
Dr. Manish Sharma
Topics to be covered
• Motivation
• Python
• Introduction to Machine Learning
• Supervised Learning
• Unsupervised Learning
• Python libraries for Machine Learning
Motivation
• Data is the new fuel.
• Every application uses data to extract
actionable insight.
• Driving technologies include big data, cloud computing,
IoT, smart infrastructure and 5G communication.
• Existing statistical analytics alone is not capable of
exploring data at this level of detail.
Brief History of Python
• Invented in the Netherlands, early 90s by Guido
van Rossum
• Named after Monty Python
• Open sourced from the beginning
• Considered a scripting language, but is much more
• Scalable, object oriented and functional from the
beginning
• Used by Google from the beginning
• Increasingly popular
Python’s Benevolent Dictator For Life
“Python is an experiment in how much freedom programmers
need. Too much freedom and nobody can read another's code;
too little and expressiveness is endangered.”
- Guido van Rossum
http://docs.python.org/
The Python tutorial is good!
Python IDEs
• Jupyter Notebook
• Jupyter Lab
• Spyder
• PyCharm
• Pydev
• Thonny
• Anaconda (distribution)
Running Python
Installing
• Python is pre-installed on most Unix systems,
including Linux and Mac OS X
• The pre-installed version may not be the most
recent one
• Download from http://python.org/download/
• Python comes with a large library of standard
modules
• There are several options for an IDE
– IDLE – works well with Windows
– Emacs with python-mode or your favorite text editor
– Eclipse with Pydev (http://pydev.sourceforge.net/)
IDLE Development Environment
• IDLE is an Integrated DeveLopment Environment for
Python, typically used on Windows
• Multi-window text editor with syntax highlighting,
auto-completion, smart indent and other features
• Python shell with syntax highlighting
• Integrated debugger with stepping, persistent
breakpoints, and call stack visibility
Python Scripts
• When you call a Python program from the command
line, the interpreter executes each statement in the
file in order
• Familiar mechanisms are used to provide command
line arguments and/or redirect input and output
• Python also has mechanisms to allow a Python
program to act both as a script and as a module to
be imported and used by another Python program
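The script/module mechanism mentioned above can be sketched as follows. This is a minimal illustration, assuming Python 3 and a hypothetical file name greet.py; the function names greet and main are made up for the example.

```python
# greet.py (hypothetical file name, used only for illustration)
import sys


def greet(name):
    """Return a greeting for name."""
    return "Hello, " + name


def main(argv):
    # argv[0] is the script name; any further items are command line arguments.
    name = argv[1] if len(argv) > 1 else "World"
    print(greet(name))


if __name__ == "__main__":
    # This branch runs only when the file is executed as a script,
    # not when another program imports it as a module.
    main(sys.argv)
```

Run as a script, the file prints a greeting; imported with "import greet", it only defines greet() and main() for the importing program to call.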
The Basics
A Code Sample (in IDLE)
x = 34 - 23            # A comment.
y = "Hello"            # Another one.
z = 3.45
if z == 3.45 or y == "Hello":
    x = x + 1
    y = y + " World"   # String concat.
print(x)
print(y)
Enough to Understand the Code
Indentation matters to code meaning
• Block structure indicated by indentation
First assignment to a variable creates it
• Variable types don’t need to be declared.
• Python figures out the variable types on its own.
Assignment is = and comparison is ==
For numbers + - * / % are as expected
• Special use of + for string concatenation and % for
string formatting (as in C’s printf)
Logical operators are words (and, or,
not) not symbols
The basic printing command is print (a function in Python 3: print(x))
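A short sketch of two points from this slide, % string formatting and word-style logical operators. The variable names and values are invented for illustration, and Python 3 is assumed.

```python
count = 3
name = "widgets"

# % substitutes values into a format string, much like C's printf.
message = "%d %s cost %.2f" % (count, name, 4.5)
print(message)

# Logical operators are the words and, or, not.
in_stock = count > 0 and not name == ""
print(in_stock)
```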
Basic Datatypes
Integers (default for numbers)
z = 5 // 2 # Answer 2, integer division (in Python 3, 5 / 2 gives 2.5)
Floats
x = 3.456
Strings
• Can use "" or '' to specify, with "abc" == 'abc'
• Unmatched quotes can occur within the string:
"matt's"
• Use triple double-quotes for multi-line strings or
strings that contain both ' and " inside of them:
"""a'b"c"""
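The datatypes above can be tried out with a few assignments. This sketch assumes Python 3, where / and // behave differently; the variable names are illustrative only.

```python
int_quotient = 5 // 2     # floor division: the result stays an integer
true_quotient = 5 / 2     # in Python 3, / always produces a float
x = 3.456                 # a float literal

same = "abc" == 'abc'     # both quote styles create the same string
nested = "matt's"         # an unmatched ' is fine inside "..."
mixed = """a'b"c"""       # triple quotes allow both ' and " inside

print(int_quotient, true_quotient, same, mixed)
```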
Whitespace
Whitespace is meaningful in Python, especially
indentation and placement of newlines
Use a newline to end a line of code
Use \ when a statement must continue on the next line
No braces {} to mark blocks of code; use
consistent indentation instead
• First line with less indentation is outside of the block
• First line with more indentation starts a nested block
A colon starts a new block in many constructs,
e.g. function definitions and if/else clauses
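The whitespace rules above can be seen in a small sketch. The function name describe and the values are hypothetical, chosen only to show line continuation, colons and indentation.

```python
# A backslash continues a statement on the next line.
total = 1 + 2 + \
        3 + 4

# Inside parentheses, brackets or braces no backslash is needed.
values = (10 +
          20)


def describe(n):
    # The colon opens a block; indentation defines where it ends.
    if n > 5:
        size = "large"    # more indentation: nested block
    else:
        size = "small"
    return size           # less indentation: back in the function body


print(total, values, describe(total))
```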
Comments
Start comments with #, rest of line is ignored
Can include a “documentation string” as the
first line of a new function or class you define
Development environments, debugger, and
other tools use it: it’s good style to include one
def fact(n):
    """fact(n) assumes n is a positive
    integer and returns factorial of n."""
    assert n > 0
    return 1 if n == 1 else n * fact(n - 1)
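As a small illustration of how tools can use a documentation string, the sketch below reads it back through the __doc__ attribute. The function area is a made-up example, not part of the original slides.

```python
def area(width, height):
    """Return the area of a width-by-height rectangle."""
    return width * height


# The docstring is stored on the function object itself.
doc_text = area.__doc__
print(doc_text)
```

The built-in help(area) displays the same text in an interactive session.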
Assignment
Binding a variable in Python means setting a name to
hold a reference to some object
• Assignment creates references, not copies
Names in Python do not have an intrinsic type,
objects have types
• Python determines the type of the reference automatically
based on what data is assigned to it
You create a name the first time it appears on the left
side of an assignment statement:
x = 3
An object is deleted via garbage collection after
all names bound to it have passed out of scope
Python uses reference semantics (more later)
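A minimal sketch of what "references, not copies" means in practice, using made-up names a, b, c and x:

```python
a = [1, 2, 3]
b = a              # b is bound to the same list object as a; nothing is copied
b.append(4)        # the change is visible through both names
print(a)
print(a is b)

c = list(a)        # list() builds a new, separate list with the same items
c.append(5)        # a is unaffected
print(a, c)

x = 3              # names have no fixed type:
x = "three"        # the same name can later refer to a string
```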
Naming Rules
Names are case sensitive and cannot start
with a number. They can contain letters,
numbers, and underscores.
bob Bob _bob _2_bob_ bob_2 BoB
There are some reserved words (exec and print are reserved only in Python 2):
and, assert, break, class, continue,
def, del, elif, else, except, exec,
finally, for, from, global, if,
import, in, is, lambda, not, or,
pass, print, raise, return, try,
while
Naming conventions
The Python community has these recommended
naming conventions
joined_lower for functions, methods and
attributes
joined_lower or ALL_CAPS for constants
StudlyCaps for classes
camelCase only to conform to pre-existing
conventions
Attributes: interface, _internal, __private
Assignment
You can assign to multiple names at the
same time
>>> x, y = 2, 3
>>> x
2
>>> y
3
This makes it easy to swap values
>>> x, y = y, x
Assignments can be chained
>>> a = b = x = 2
Accessing Non-Existent Name
Accessing a name before it has been created
(by placing it on the left side of an
assignment) raises a NameError
>>> y
NameError: name 'y' is not defined
The * operator produces a new list or string that repeats the original content
>>> [1, 2, 3] * 3
[1, 2, 3, 1, 2, 3, 1, 2, 3]
>>> "Hello" * 3
'HelloHelloHello'
Mutability:
Tuples vs. Lists
Lists are mutable
>>> li.sort(key=some_function)
# sort in place using a user-defined key function
# (Python 2 also accepted a comparison function here)
Tuple details
The comma is the tuple creation operator, not parens
>>> 1,
(1,)
Python shows parens for clarity (best practice)
>>> (1,)
(1,)
Don't forget the comma!
>>> (1)
1
Trailing comma is only required for singletons; it is optional for other tuples
Empty tuples have a special syntactic form
>>> ()
()
>>> tuple()
()
Summary: Tuples vs. Lists
Lists are slower but more powerful than tuples
• Lists can be modified, and they have lots of
handy operations and methods
• Tuples are immutable and have fewer
features
To convert between tuples and lists use the
list() and tuple() functions:
li = list(tu)
tu = tuple(li)
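The difference in mutability can be checked directly. This is a small sketch with invented values; the names li and tu follow the slide, while error_message and converted are illustrative.

```python
li = [3, 1, 2]
li[0] = 10            # lists can be changed in place
li.append(4)
li.sort()

tu = (3, 1, 2)
try:
    tu[0] = 10        # tuples cannot: this raises TypeError
except TypeError as err:
    error_message = str(err)

converted = tuple(li)  # convert the list to a tuple
print(li, tu, converted)
```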
Dictionaries
Hash tables, "associative arrays"
• d = {"duck": "eend", "water": "water"}
Lookup:
• d["duck"] # returns "eend"
• d["back"] # raises KeyError exception
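When a key might be missing, the KeyError can be avoided. The sketch below uses the standard dict methods get() and items() and the in operator on the same example dictionary; the names missing, has_duck and pairs are illustrative.

```python
d = {"duck": "eend", "water": "water"}

# get() returns a default value instead of raising KeyError.
missing = d.get("back", "?")

# The in operator tests whether a key is present.
has_duck = "duck" in d

# items() yields (key, value) pairs; sorted() gives them a stable order.
pairs = sorted(d.items())
print(missing, has_duck, pairs)
```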
Delete, insert, overwrite:
• del d["water"] # {"duck": "eend"}
• d["back"] = "rug" # {"duck": "eend", "back": "rug"}
• d["duck"] = "duik" # {"duck": "duik", "back": "rug"}
Conditional Branching
if and else
if variable == condition:
    pass  # do something based on variable == condition
else:
    pass  # do something based on variable != condition
elif allows for additional branching
if condition:
    ...
elif another_condition:
    ...
else:
    ...  # none of the above
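A concrete, runnable version of the branching pattern above. The function grade and its thresholds are invented for illustration.

```python
def grade(score):
    """Map a numeric score to a letter using if / elif / else."""
    if score >= 90:
        return "A"
    elif score >= 75:
        return "B"
    else:
        return "C"    # none of the above


print(grade(95), grade(80), grade(40))
```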
Looping with For
For allows you to loop over a block of
code a set number of times
For is great for manipulating lists:
a = ['cat', 'window', 'defenestrate']
for x in a:
    print(x, len(x))
Results:
cat 3
window 6
defenestrate 12
Looping with For
We could use a for loop to perform
geoprocessing tasks on each layer in a list
We could get a list of features in a feature class
and loop over each, checking attributes
Anything in a sequence or list can be used in a
For loop
Just be sure not to modify the list while looping
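If the list does need to change during the loop, one common approach is to loop over a copy. The sketch below assumes Python 3 and also shows range() for running a block a set number of times; the length threshold and the name squares are illustrative.

```python
a = ['cat', 'window', 'defenestrate']

# Loop over a copy (a[:]) so that removing items from a is safe.
for x in a[:]:
    if len(x) < 4:
        a.remove(x)

# range(3) makes the loop body run exactly three times.
squares = []
for i in range(3):
    squares.append(i * i)

print(a, squares)
```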
URLs
http://www.python.org
• official site
http://starship.python.net
• Community
http://www.python.org/psa/bookstore/
• (alias for http://www.amk.ca/bookstore/)
• Python Bookstore
Further Reading
• Powerful Processing
• Better Decision Making & Prediction
• Quicker Processing
• Accurate
• Affordable Data Management
• Inexpensive
• Analyzing Complex Big Data
Steps Involved in Machine Learning
• Seeds = Algorithms
• Nutrients = Data
• Gardener = You
• Plants = Programs
So what is machine learning?
• Automating automation
• Getting computers to program themselves
• Writing software is the bottleneck
• Let the data do the work instead!
Machine Learning Techniques
Given below are some techniques in this Machine
Learning tutorial.
• Classification
• Categorization
• Clustering
• Trend analysis
• Anomaly detection
• Visualization
• Decision making
ML in a Nutshell
• Machine Learning is a subset of Artificial
Intelligence in which computer algorithms
learn autonomously from data. Machine learning
systems can adjust and improve their models
as they see more data.
• Tens of thousands of machine learning algorithms
• Every machine learning algorithm has three
components:
– Representation
– Evaluation
– Optimization
Representation
• Decision trees
• Sets of rules / Logic programs
• Instances
• Graphical models
• Neural networks
• Support vector machines (SVM)
• Model ensembles
• Etc.
Evaluation
• Accuracy
• Precision and recall
• Squared error
• Likelihood
• Posterior probability
• Cost / Utility
• Margin
• Entropy
• K-L divergence
• Etc.
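Several of these evaluation measures can be computed by hand for a binary classifier. The sketch below is a plain-Python illustration, not a library API; the function evaluate and the label lists are made up, and it assumes at least one positive label and one positive prediction so that no division by zero occurs.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, precision, recall and mean squared error for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # false negatives
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp),   # assumes at least one positive prediction
        "recall": tp / (tp + fn),      # assumes at least one positive label
        "squared_error": sum((t - p) ** 2 for t, p in pairs) / len(pairs),
    }


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]
scores = evaluate(y_true, y_pred)
print(scores)
```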
Optimization
• Combinatorial optimization
– E.g.: Greedy search
• Convex optimization
– E.g.: Gradient descent
• Constrained optimization
– E.g.: Linear programming
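The three components can be seen together in a very small example: a one-parameter linear model (representation), mean squared error (evaluation) and gradient descent (optimization). This is a sketch under stated assumptions: the data, learning rate and step count are invented, and the model y = w * x is chosen only because its loss is convex.

```python
# Toy data generated by y = 2 * x (illustrative values).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def predict(w, x):
    # Representation: a linear model with a single weight w.
    return w * x


def mean_squared_error(w):
    # Evaluation: average squared difference between prediction and target.
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(w):
    # Derivative of the mean squared error with respect to w.
    return sum(2 * (predict(w, x) - y) * x for x, y in zip(xs, ys)) / len(xs)


# Optimization: gradient descent, stepping against the gradient.
w = 0.0
learning_rate = 0.01
for _ in range(200):
    w -= learning_rate * gradient(w)

print(w, mean_squared_error(w))
```

With these values the weight moves toward 2, the slope used to generate the data.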
Features of Machine Learning
Let us look at some of the features of Machine
Learning.
• Machine Learning is compute-intensive and
generally requires a large amount of training data.
• It involves repeated training to improve the
learning and decision making of algorithms.
• As more data gets added, Machine Learning
training can be automated to learn new data
patterns and adapt the model.
Machine Learning Algorithms