An Introduction To Python: Phil Spector
Phil Spector
Statistical Computing Facility
Department of Statistics
University of California, Berkeley
Core Python Concepts
• small basic language
• modules (with private namespaces) for added functionality
the import statement provides access to modules
• true object oriented language
• strong typing
• exception handling
try/except clause can trap any error
• indentation is important
Invoking python
• Type python at the command line – starts an interactive
session – type control-D to end
• Type python programname
• If the first line of your program is
#!/usr/bin/env python
and the program is made executable with the chmod +x
command, you can invoke your program through its name
Getting Help in Python
• Online documentation may be installed along with Python.
• All documentation can be found at http://python.org .
• In an interactive Python session, the help function will provide
documentation for most Python functions.
• To view all available methods for an object, or all of the
functions defined within a module, use the dir function.
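A minimal sketch of dir and help, assuming nothing beyond the builtins. The names string_methods and doc are illustrative only; print is written with parentheses so the lines run under both Python 2 and Python 3.

```python
# dir() returns a list of attribute names; here, the methods of a string.
string_methods = dir("")
print("upper" in string_methods)    # True: strings have an upper method

# help(len) prints documentation in an interactive session; the same
# text is stored in the function's __doc__ attribute.
doc = len.__doc__
print(doc is not None)              # True
```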
Strings in Python
Strings are a simple form of a Python sequence – they are stored as
a collection of individual characters.
There are several ways to initialize strings:
• single (') or double (") quotes
• triple quotes (''' or """) – allows embedded newlines
• raw strings (r'...' or r"...") – ignores special characters
• unicode strings (u'...' or u"...") – supports Unicode
(multiple byte) characters
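The quoting styles above can be compared side by side. This is a small sketch; the variable names and sample strings are made up for illustration.

```python
single = 'dog'
double = "dog"                  # same value as the single-quoted form
multi = """first line
second line"""                  # triple quotes keep the embedded newline
raw = r"C:\new\table"           # raw: backslashes are kept literally
cooked = "a\tb"                 # not raw: \t becomes one tab character

same = (single == double)       # True
nlines = len(multi.splitlines())    # 2
raw_length = len(raw)           # 12
cooked_length = len(cooked)     # 3
```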
Special Characters
Sequence   Meaning              Sequence   Meaning
\          continuation         \\         literal backslash
\'         single quote         \"         double quote
\a         bell                 \b         backspace
\v         vertical tab         \0         null character
\n         newline              \t         horizontal tab
\f         form feed            \r         carriage return
\0XX       octal character XX   \xXX       hexadecimal value XX
Use raw strings to treat these characters literally.
String Operators and Functions
• + – concatenation
• * – repetition
• [i] – subscripting (zero-based; negative subscripts count from
the end)
• [i:j] – slice from i-th character to one before the j-th
(length of the slice is j - i)
• [i:] – slice from i-th character to the end
• [:i] – slice from the first character to one before the i-th
• len(string) – returns number of characters in string
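The operators above, applied to one sample string (the string itself is an arbitrary choice for illustration):

```python
s = "statistics"

joined = s + "!"        # 'statistics!'
repeated = "ab" * 3     # 'ababab'
first = s[0]            # 's'  (subscripts start at zero)
last = s[-1]            # 's'  (negative subscripts count from the end)
middle = s[2:5]         # 'ati' (length is 5 - 2 = 3)
tail = s[4:]            # 'istics'
head = s[:4]            # 'stat'
n = len(s)              # 10
```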
String Methods
Name         Purpose
join         Insert a string between each element of a sequence
split        Create a list from “words” in a string
splitlines   Create a list from lines in a string
count        Count the number of occurrences of substring
find         Return the lowest index where substring is found (-1 if absent)
index        Like find, but raises ValueError if not found
rfind        Return the highest index where substring is found (-1 if absent)
rindex       Like rfind, but raises ValueError if not found
center       Centers a string in a given width
String Methods (continued)
Name         Purpose
ljust        Left justifies a string
lstrip       Removes leading whitespace
rjust        Right justifies a string
rstrip       Removes trailing whitespace
strip        Removes leading and trailing whitespace
capitalize   Capitalize the first letter of the string
lower        Make all characters lower case
swapcase     Change upper to lower and lower to upper
title        Capitalize the first letter of each word in the string
upper        Make all characters upper case
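A short sketch exercising several of the methods from the two tables. The sample sentence is invented for the example.

```python
line = "  the quick brown fox  "

words = line.split()                 # ['the', 'quick', 'brown', 'fox']
joined = "-".join(words)             # 'the-quick-brown-fox'
titled = line.strip().title()        # 'The Quick Brown Fox'
where = joined.find("quick")         # 4
missing = joined.find("cat")         # -1: find does not raise an error
how_many = joined.count("o")         # 2
boxed = "[" + "fox".center(7) + "]"  # '[  fox  ]'

try:
    joined.index("cat")              # index raises where find returns -1
    index_failed = False
except ValueError:
    index_failed = True
```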
Numbers in Python
• integers - ordinary integers with the default range of the
computer
initialize without decimal point
• longs - “infinite” precision integers – immune to overflow
initialize without a decimal point and with a trailing “L”
• float - double precision floating point numbers
initialize using decimal point
• complex - double precision complex numbers
initialize as a + bj
Hexadecimal constants can be entered by using a leading 0X or 0x;
octal constants use a leading 0.
Operators and Functions for Numbers
• Usual math operators: + - * / %
• Exponentiation: **
• Bit shift: << >>
• Core functions: abs, round, divmod
many more in math module
If both operands of the division operator are integers, Python uses
integer arithmetic. To ensure floating point arithmetic, use a
decimal point or the float function on at least one of the operands.
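A sketch of the ways to control division. Note the assumption: the integer behaviour of / described above is that of Python 2; under Python 3, / always gives a floating point result. The // operator and the float function behave the same way in both, so this example sticks to them.

```python
quotient = 7 // 2           # 3: floor division, an integer result
exact = float(7) / 2        # 3.5: float() forces floating point
also_exact = 7 / 2.         # 3.5: a decimal point does the same
q, r = divmod(7, 2)         # (3, 1): quotient and remainder together
power = 2 ** 10             # 1024
shifted = 1 << 4            # 16
```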
Type Conversion
When Python encounters an object which is not of the appropriate
type for an operation, it raises a TypeError exception. The usual
solution is to convert the object in question with one of the
following conversion functions:
• int - integer
• long - long
• float - float
• complex - complex
• str - converts anything to a string
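For instance, adding a number to a string raises TypeError until one side is converted. A minimal sketch with made-up values:

```python
count = 3
try:
    label = "count: " + count         # str + int raises TypeError
except TypeError:
    label = "count: " + str(count)    # convert explicitly instead

total = int("12") + float("0.5")      # 12.5: strings converted to numbers
```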
Sequence Types
We’ve already seen the simplest sequence type, strings. The other
builtin container types covered here are lists, tuples and dictionaries
(strictly speaking, a dictionary is a mapping rather than a sequence).
Python makes a distinction between mutable objects (which can
be modified in place) and immutable objects (which can only be
modified by replacement): lists and dictionaries are mutable, while
strings and tuples are immutable.
Lists are an all-purpose “container” object which can contain any
other object (including other sequence objects). Tuples are like
lists, but immutable. Dictionaries are like lists, but are indexed by
immutable objects such as strings, numbers or tuples, instead of
consecutive integers.
The subscripting and slicing operations presented for strings also
work for lists and tuples. Dictionaries support subscripting by key,
but not slicing. The len function works for all of them.
Sequence Elements
• Lists - use square brackets ([ ])
    Empty list: x = []
    List with elements: x = [1,2,"dog","cat",abs]
    Access using square brackets: print x[2]
• Tuples - use parentheses (( ))
    Empty tuple: x = ()
    Tuple with elements: x = (1,2,"dog","cat",abs)
    Tuple with a single element: x = (7,)
    Access using square brackets: print x[2]
• Dictionary - use curly braces ({ })
    Empty dictionary: x = {}
    Dictionary with elements:
    x = {"dog":"Fido","cat":"Mittens"}
    Access using square brackets: print x["cat"]
Nesting of Sequence Types
Sequence types can be as deeply nested as necessary. This makes it
very easy to store complex data structures in basic Python objects:
nestlist = [1,2,"dog","cat",(20,30,40),
            {"one":("uno",1),"two":("dos",2),"three":("tres",3)}]
print nestlist[5]["one"][0]      #prints uno
nestlist[1] = 14                 #ok - lists are mutable
nestlist[4][2] = "something"     #fails - tuples are immutable
nestlist[4] = "something"        #ok to replace whole tuple
List Operators
Lists support concatenation and repetition like strings, but to
concatenate an element to the end of a list, that element must be
made into a list.
[1,2,3] + 4 results in a TypeError, but
[1,2,3] + [4] yields a list with four elements.
Similarly for repetition
0 * 10 results in the integer 0, but
[0] * 10 results in a list containing ten zeroes.
List Methods
Name      Purpose
append    Adds a single element to a list
count     Counts how many times an element appears
extend    Adds multiple elements to a list
index     Returns lowest index of an element in a list
insert    Inserts an object into a list
pop       Returns and removes the last element of a list
          (or the element at a given index)
remove    Removes first occurrence of an element from a list
reverse   Reverses a list in place
sort      Sorts a list in place
Notice that joining together the elements of a list into a string is
done with the join method for strings.
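A sketch that walks one list through the methods in the table; the comments show the list after each step. The starting values are arbitrary.

```python
x = [3, 1, 2]
x.append(4)            # [3, 1, 2, 4]
x.extend([1, 5])       # [3, 1, 2, 4, 1, 5]
x.insert(0, 9)         # [9, 3, 1, 2, 4, 1, 5]
last = x.pop()         # returns 5; x is [9, 3, 1, 2, 4, 1]
ones = x.count(1)      # 2
x.remove(1)            # removes only the first 1: [9, 3, 2, 4, 1]
x.sort()               # [1, 2, 3, 4, 9]
x.reverse()            # [9, 4, 3, 2, 1]

# Joining is a string method, and the elements must be strings first.
text = ", ".join([str(i) for i in x])    # '9, 4, 3, 2, 1'
```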
Sorting Lists in Python
The sort method for lists accepts an optional function argument
which defines how you want the elements of the list sorted. This
function should accept two arguments and return -1 if the first is
less than the second, 0 if they are equal, and 1 if the first is greater
than the second.
Suppose we wish to sort words disregarding case. We could define
the following function, and pass it to sort:
>>> def cmpcase(a,b):
...     return cmp(a.lower(),b.lower())
...
>>> names = ['Bill','fred','Tom','susie']
>>> names.sort()
>>> names
['Bill', 'Tom', 'fred', 'susie']
>>> names.sort(cmpcase)
>>> names
['Bill', 'fred', 'susie', 'Tom']
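The two-argument comparison function shown above is the Python 2 interface; Python 3 removed cmp and the comparison argument to sort. As an alternative sketch (not the approach on the slide), the key argument, available since Python 2.4, names a one-argument function whose result is used for ordering:

```python
names = ['Bill', 'fred', 'Tom', 'susie']

names.sort()                    # default: capital letters sort first
default_order = names[:]        # ['Bill', 'Tom', 'fred', 'susie']

names.sort(key=str.lower)       # order by the lower-cased form of each name
caseless_order = names[:]       # ['Bill', 'fred', 'susie', 'Tom']
```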
Dictionaries
Dictionaries are very convenient because it’s often easier to
associate a string with a piece of data than remember its position
in an array. In addition, the keys of a Python dictionary can be
any immutable Python object (numbers, tuples, ...), not just strings.
The following methods are provided for dictionaries:
Name      Purpose
clear     remove all keys and values
get       access values through key with default
has_key   tests for presence of key
keys      returns all keys
values    returns all values
Using Dictionaries for Counting
Since it is an error to refer to a non-existent key, care must be
taken when creating a dictionary. Suppose we wish to count the
number of times different words appear in a document.
1. Use exceptions
    try:
        counts[word] = counts[word] + 1
    except KeyError:
        counts[word] = 1
2. Check with has_key
    if counts.has_key(word):
        counts[word] = counts[word] + 1
    else:
        counts[word] = 1
3. Use get
    counts[word] = counts.get(word,0) + 1
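Putting the third approach into a complete loop might look like the sketch below. The sample text is invented, and has_key is avoided because it no longer exists in Python 3.

```python
text = "the cat and the dog and the bird"

counts = {}
for word in text.split():
    # get returns 0 for a word not seen before, so no KeyError occurs
    counts[word] = counts.get(word, 0) + 1

# Build a report in alphabetical order of the words.
report = []
for word in sorted(counts):
    report.append("%-5s %d" % (word, counts[word]))
```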
Printing
While the print statement accepts any Python object, more
control over printed output can be achieved by using formatting
strings combined with the “%” operator.
A formatting string contains one or more %-codes, indicating how
corresponding elements (in a tuple on the right hand side of the %
operator) will be printed. This table shows the possible codes:
Code     Meaning                 Code     Meaning
d or i   Decimal Integer         e or E   Exponential Notation
u        Unsigned Integer        g or G   “Optimal” Notation
o        Octal Integer           s        Display as string
x or X   Hexadecimal Integer     c        Single character
f        Floating Point Number   %        Literal percent sign
Examples of Formatting
Field widths can be specified after the % sign of a code:
>>> animal = 'chicken'
>>> print '%20s' % animal
             chicken
With floating point arguments, the number of decimal places can
be specified:
>>> x = 7. / 3.
>>> print x
2.33333333333
>>> print '%5.2f' % x
 2.33
When formatting more than one item, use a tuple, not a list.
>>> print 'Animal name: %s Number: %5.2f' % (animal,x)
Animal name: chicken Number:  2.33
The result of these operations is a string
>>> msg = 'Animal name: %s Number: %5.2f' % (animal,x)
>>> msg
'Animal name: chicken Number:  2.33'
File Objects
The open function returns a file object, which can later be
manipulated by a variety of methods. This function takes two
arguments: the name of the file to be opened, and a string
representing the mode. The possible modes are:
String   Meaning
r        Open file for reading; file must exist
w        Open file for writing; created if it doesn’t exist, emptied if it does
a        Open file for appending; will be created if it doesn’t exist
r+       Open file for reading and writing; contents are not destroyed
w+       Open file for reading and writing; contents are destroyed
a+       Open file for reading and appending; contents are not destroyed
By default, files are opened with mode "r". A 'b' can be
appended to the mode to indicate a binary file.
Using File Objects: Reading
Suppose we wish to read the contents of a file called "mydata".
First, create the appropriate file object.
import sys                 # needed for sys.exit
try:
    f = open('mydata','r')
except IOError:
    print "Couldn't open mydata"
    sys.exit(1)
total = 0                          # initialize
while 1:
    line = f.readline()
    if not line:
        break
    line = line[:-1]               # removes newline
    total = total + int(line)      # type conversion!
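The reading loop can be tried end to end with a throwaway file. This sketch creates its own sample data with the tempfile module, which is an addition for the sake of a self-contained example; the file contents are made up.

```python
import os
import tempfile

# Create a small sample file so the example is self-contained.
fd, path = tempfile.mkstemp()
os.close(fd)
out = open(path, 'w')
out.write("10\n20\n30\n")          # write does not add newlines itself
out.close()

total = 0
f = open(path, 'r')
while 1:
    line = f.readline()
    if not line:                   # empty string signals end of file
        break
    total = total + int(line.strip())   # strip removes the newline
f.close()
os.remove(path)                    # clean up the sample file
# total is now 60
```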
Using File Objects: Writing
If a file is opened with a mode of 'w' or 'a' the following methods
can be used to write to the file:
• write - writes its argument to the specified file
• writelines - writes each element of a list to the specified file
These methods do not automatically add a newline to the file. The
print statement automatically adds a newline, and can be used
with file objects using the print >>fileobject, ... syntax.
File Objects and Object Oriented Programming
Although they are referred to as file objects, any object which
provides the appropriate methods can be treated as a file, making
it very easy to modify programs to use different sources. Some of
the functions in Python which can provide file-like objects include
• os.popen – pipes (shell command input and output)
• urllib.urlopen – remote files specified as URLs
• StringIO.StringIO – treats a string like a file
• gzip.GzipFile – reads compressed files directly
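A sketch of the idea: a function written against readline works unchanged on a string wrapped as a file. One assumption to note: the slides name the Python 2 module StringIO; this example uses io.StringIO, which is where the class lives in Python 3. The function name sum_lines is hypothetical.

```python
import io

def sum_lines(f):
    """Add up one integer per line from any object with a readline method."""
    total = 0
    while 1:
        line = f.readline()
        if not line:
            break
        total = total + int(line)
    return total

# A string treated like a file; no file on disk is involved.
fake_file = io.StringIO(u"1\n2\n3\n")
result = sum_lines(fake_file)      # 6
```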
Assignment Statements
To assign a value to a variable, put the name of the variable on the
left hand side of an equals sign (=), and the value to be assigned on
the right hand side:
x = 7
names = ['joe','fred','sam']
y = x
Python allows multiple objects to be set to the same value with a
chained assignment statement:
i = j = k = 0
Furthermore, multiple objects can be assigned in one statement
using unpacking (also called unrolling):
name = ['john','smith']
first, last = name
x, y, z = 10, 20, 30
A Caution About List Assignments
When you perform an assignment, Python doesn’t copy values – it
just makes the new variable a reference to the same object. A copy
is made only if you ask for one, for example with a slice (x[:]).
For immutable objects, this creates no surprises. But notice what
happens when we change part of a mutable object that’s been
assigned to another variable:
>>> breakfast = ['spam','spam','sausage','spam']
>>> meal = breakfast
>>> breakfast[1] = 'beans'
>>> breakfast
['spam', 'beans', 'sausage', 'spam']
>>> meal
['spam', 'beans', 'sausage', 'spam']
You can use the is operator to test if two things are actually
references to the same object.
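A small sketch contrasting a second reference with a real copy made by slicing. The name snapshot is chosen for illustration.

```python
breakfast = ['spam', 'spam', 'sausage', 'spam']
meal = breakfast             # a second name for the same list
snapshot = breakfast[:]      # a slice builds a new list with equal contents

same_object = meal is breakfast         # True
copied_object = snapshot is breakfast   # False: a different object
equal_contents = snapshot == breakfast  # True: but the same values

breakfast[1] = 'beans'
# meal[1] is now 'beans', while snapshot[1] is still 'spam'
```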
Comparison Operators
Python provides the following comparison operators for
constructing logical tests:
Operator   Tests for                Operator   Tests for
==         Equality                 !=         Non-equality
>          Greater than             <          Less than
>=         Greater than or equal    <=         Less than or equal
in         Membership in sequence   is         Identity (same object)
not in     Lack of membership       is not     Non-identity
Logical expressions can be combined using and or or, and negated with not.
You can treat logical expressions as integers (True = 1, False = 0).
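A brief sketch of these operators, including the use of logical values as integers. All values are made up for the example.

```python
x = 5
pets = ['cat', 'dog']

in_range = x > 0 and x < 10          # True
has_dog = 'dog' in pets              # True
no_fish = 'fish' not in pets         # True
neither = not (x < 0 or x > 10)      # True

# True counts as 1 and False as 0 in arithmetic.
passed = (x > 0) + (x > 3) + (x > 7)   # 2
```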
Indentation
People who start programming in Python are often surprised that
indentation, which is mostly cosmetic in most languages, actually
determines the structure of your Python program. Indentation is very
useful for several reasons:
1. Just looking at your program gives you an excellent idea of what its
structure is, because the indentation is the structure; inconsistent
indentation generates a syntax error.
2. Since everyone has to indent, other people’s programs are generally
easier to read and understand.
3. Most experienced programmers agree that good indentation is
useful, but requires too much discipline. In Python, you’re
guaranteed to develop good indentation practices.
Many editors provide facilities for automatic and consistent indentation
(emacs, vim, bbedit, etc.). The majority of indentation problems arise
from using more than one editor to edit the same program.
if statement
The if/elif/else statement is the basic tool for conditional
execution in Python. The form is:
if expression:
    statement(s)
elif expression:
    statement(s)
elif expression:
    statement(s)
. . .
else:
    statement(s)
The elif and else clauses are optional.
The colon (:) after the expression is required, and the statement(s)
following the if statement must be consistently indented.
You must have at least one statement after an if statement; you
can use the keyword pass to indicate “do nothing”.
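A concrete instance of the form above, wrapped in a hypothetical function named describe so that the result is easy to inspect:

```python
def describe(n):
    if n < 0:
        result = "negative"
    elif n == 0:
        result = "zero"
    else:
        result = "positive"
    return result

value = 7
if value > 100:
    pass                    # placeholder: nothing to do in this case
else:
    label = describe(value)     # 'positive'
```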
The for loop
The for loop is the basic construct in Python for iterating over the
elements of a sequence. The syntax is:
for var in sequence:
    statements
else:
    statements
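The else clause of a for loop runs only when the loop finishes without hitting break. A sketch, using a hypothetical helper named first_negative:

```python
def first_negative(values):
    for v in values:
        if v < 0:
            found = v
            break             # leaves the loop and skips the else clause
    else:
        found = None          # reached only if the loop was never broken
    return found

hit = first_negative([4, -2, 7])     # -2
miss = first_negative([4, 2, 7])     # None
```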
Examples of the range function
Suppose we have two lists, prices and taxes, and we wish to
create a list called total which contains the sums of the
corresponding elements of the two lists.
total = len(prices) * [0]
for i in range(len(prices)):
    total[i] = prices[i] + taxes[i]
The range function can be used when you need to modify elements
of a mutable sequence:
for i in range(len(x)):
    if x[i] < 0:
        x[i] = 0
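Both loops can be run with sample data; the amounts below are invented and kept as whole numbers of cents so the sums are exact.

```python
prices = [1000, 2000, 500]
taxes = [80, 160, 40]

total = len(prices) * [0]            # [0, 0, 0]
for i in range(len(prices)):
    total[i] = prices[i] + taxes[i]  # total ends as [1080, 2160, 540]

x = [3, -1, 4, -5]
for i in range(len(x)):
    if x[i] < 0:
        x[i] = 0                     # x ends as [3, 0, 4, 0]

# list() makes the result a list in Python 3, where range is lazy.
indices = list(range(len(prices)))   # [0, 1, 2]
stepped = list(range(2, 10, 3))      # [2, 5, 8]
```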
41
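In a while loop, the statements following the else clause are executed
when the expression becomes false, including when it is initially false;
they are skipped if the loop is exited with break.
As always, remember the colon (:) and proper indentation.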
Control inside Loops
If you wish to stop the execution of a loop before it’s completed,
you can use the break statement. Control will be transferred to the
next statement after the loop, skipping the loop’s else clause if it
has one. The break statement is especially useful in Python,
because you cannot use an assignment statement as the expression
to be tested in a while loop.
If you wish to skip the remaining statements for a single iteration
of a loop, and immediately begin the next iteration, you can use
the continue statement.
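A sketch showing break and continue in one loop; the list of values is arbitrary.

```python
values = [4, -2, 7, 0, 9]

kept = []
for v in values:
    if v == 0:
        break             # stop the whole loop at the first zero
    if v < 0:
        continue          # skip negatives, go on to the next value
    kept.append(v)
# kept is [4, 7]: -2 was skipped, and 9 was never reached
```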
got = 0
nums = []
Example: Counting Fields in a File
import sys
filename = sys.argv[1]
try:
    f = open(filename,'r')
except IOError:
    print >>sys.stderr, "Couldn't open %s" % filename
    sys.exit(1)
counts = {}
while 1:
    line = f.readline()
    if not line:
        break
    line = line[:-1]
    fields = line.split(',')
    l = len(fields)
    counts[l] = counts.get(l,0) + 1
keys = counts.keys()
keys.sort()
for k in keys:
    print '%d %d' % (k,counts[k])
Writing Functions
The def statement creates functions in Python. Follow the
statement with the name of the function and a parenthesized list of
arguments. The arguments you use in your function are local to
that function. While you can access objects outside of the function
which are not in the argument list, you cannot assign new values to
those names unless you declare them with the global statement.
The function definition should end with a colon (:) and the
function body should be properly indented.
If you want your function to return a value, you must use a return
statement.
You can embed a short description of the function (accessed
through the interactive help command) by including a quoted
string immediately after the function definition statement.
Example: Writing a Function
def merge(list1, list2):
    """merge(list1,list2) returns a list consisting of the
    original list1, along with any elements in list2 which
    were not included in list1"""
    newlist = list1[:]
    for i in list2:
        if i not in newlist:
            newlist.append(i)
    return newlist
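Calling the function might look like the following sketch. The definition is repeated so the block runs on its own; the input lists are made up.

```python
def merge(list1, list2):
    """merge(list1,list2) returns a list consisting of the
    original list1, along with any elements in list2 which
    were not included in list1"""
    newlist = list1[:]            # copy, so list1 itself is not modified
    for i in list2:
        if i not in newlist:
            newlist.append(i)
    return newlist

a = [1, 2, 3]
b = [3, 4, 1, 5]
merged = merge(a, b)              # [1, 2, 3, 4, 5]
# a is still [1, 2, 3]; the description is available as merge.__doc__
```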
Functional Programming: map and filter
Python provides two functions which accept a function as one of
their arguments.
map takes a function and a list, and applies the function to each
member of the list, returning the results in a second list. As an
example of the map function, suppose we have a list of strings which
need to be converted to floating point numbers. We could use:
values = map(float,values)
filter takes a function which returns True or False as its first
argument and a list as its second. It returns a list containing only
those elements for which the provided function returned True. The
path.isdir function of the os module returns a value of 1 (True)
if its argument is a directory. To extract a list of directories from a
list of files, we could use:
dirs = filter(os.path.isdir,files)
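A runnable sketch of both functions. One assumption to flag: under Python 3, map and filter return iterators rather than lists, so the results are wrapped in list() here; under Python 2 the wrapping is harmless. The helper is_positive is hypothetical.

```python
def is_positive(x):
    return x > 0

values = ['1.5', '2', '3.25']
numbers = list(map(float, values))                      # [1.5, 2.0, 3.25]

positives = list(filter(is_positive, [3, -1, 0, 8]))    # [3, 8]

# map keeps one result per element; filter keeps a subset of the elements.
flags = list(map(is_positive, [3, -1, 0, 8]))   # [True, False, False, True]
```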
Using Modules
There are three basic ways to use the import statement to make
functions and other objects from modules available in your
program.
1. import module
    Objects from module need to be referred to as
    module.objectname. Only the module name is actually
    imported into the namespace.
2. from module import function
    The name function will represent the object of that name
    from module. No other symbols are imported from the module
    (including the module’s name).
3. from module import *
    The names of all of the objects in module are imported into the
    namespace. This form of the import statement should only be
    used if the module author explicitly recommends it.
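The first two forms, sketched with the standard math module (chosen only as a familiar example):

```python
import math                  # form 1: only the name math is imported
root1 = math.sqrt(16.0)      # 4.0, referred to as module.objectname

from math import sqrt        # form 2: only the name sqrt is imported
root2 = sqrt(16.0)           # 4.0, no module prefix needed

# Form 3 would be written: from math import *
# It is left as a comment because it adds every public name in the
# module to the namespace, which can silently replace existing names.
```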