Idiomatic Python
Enrico Franchi
efranchi@ce.unipr.it


Could you please lend me the thing that you put in
the wall when you want to turn on the hairdryer and
the hairdryer comes from a different country?

Could you please lend me a power adapter?




If you are out to describe the truth,
leave elegance to the tailor.
                          Albert Einstein
Debugging is twice as hard as writing the
code in the first place.

Therefore, if you write the code as
cleverly as possible, you are, by definition,
not smart enough to debug it.
                              Brian Kernighan



READABILITY
  COUNTS
        Zen of Python
TOC

Iteration
Naming
Functions are objects
Choice
Attributes and methods
Duck Typing
Exceptions [unless TimeoutError is thrown]
FOR vs. WHILE vs. ...

 Iteration vs. Recursion
    sys.setrecursionlimit(n)

 for vs. while
    Traditionally bounded iteration vs. unbounded iteration
    In C, for and while are completely equivalent
    Some languages have for/foreach to iterate on
    collections
  for file in *.py; do
      pygmentize -o "${file%.py}.rtf" "$file"
  done
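The trade-off between iteration and recursion can be sketched as follows. This is an illustration written for Python 3 (the slides use Python 2), and the function names are made up for the example. CPython does not eliminate tail calls, so each recursive step consumes a stack frame and a deep recursion runs into the limit that sys.setrecursionlimit(n) controls, while the equivalent loop does not.

```python
import sys

def sum_to_recursive(n):
    """Sum 0..n recursively; needs one stack frame per step."""
    if n == 0:
        return 0
    return n + sum_to_recursive(n - 1)

def sum_to_iterative(n):
    """Sum 0..n with a loop; stack usage does not grow with n."""
    total = 0
    for i in range(n + 1):
        total += i
    return total

def recursion_fails(n):
    """Return True if the recursive version exceeds the recursion limit."""
    try:
        sum_to_recursive(n)
    except RecursionError:
        return True
    return False
```

Raising the limit with sys.setrecursionlimit only moves the ceiling; for unbounded input sizes the loop is the safer choice.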
Numerical Iteration

  C:

  int i = 0;
  while(i < MAX) {
      printf("%d\n", i);
      ++i;
  }

  int i = 0;
  for(i=0; i < MAX; ++i) {
      printf("%d\n", i);
  }

  Python:

  i = 0
  while i < MAX:
      print i
      i += 1

  # O(n) space
  for i in range(MAX):
      print i

  # O(1) space
  for i in xrange(MAX):
      print i
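The range/xrange distinction above is specific to Python 2. In Python 3, xrange no longer exists and range itself is a lazy sequence, so the O(1)-space loop is simply the default. A small sketch, assuming Python 3; the helper first_items is a made-up name for the example:

```python
MAX = 10**9

lazy = range(MAX)   # no list of a billion integers is built here

def first_items(iterable, count):
    """Collect the first `count` items without exhausting the iterable."""
    result = []
    for item in iterable:
        if len(result) == count:
            break
        result.append(item)
    return result
```

Because range is still a sequence, len(lazy) and indexing such as lazy[-1] work without materializing the values.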
Iteration on elements

 It is also common to iterate on elements of some
 collection
 C uses indices to iterate on array elements
 Python uses for
 What if we want to iterate both on elements and
 indices?

  i = 0
  while i < len(lst):
      process(lst[i])    BAD
      i += 1

  for el in lst:
      process(el)        GOOD
j = 0
while j < len(lst):
    process(index=j, element=lst[j])   BAD
    j += 1




for j in range(len(lst)):
    process(index=j, element=lst[j])   BAD




for j, el in enumerate(lst):
    process(index=j, element=el)       GOOD
What about Turing?

 for is usually considered the more Pythonic
 alternative
 Ideally every iteration should be done using for
 However, we have shown only iteration on finite
 collections, that is to say, for alone would not
 provide Turing-completeness
 But everybody knows about generators: Python
 has infinite (lazy) sequences and they cover
 many other patterns as well
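As a sketch of the point about infinite lazy sequences (Python 3 syntax; the names fibonacci and first_above are invented for this example), a generator can produce values forever, and a plain for loop can express an unbounded search over it:

```python
import itertools

def fibonacci():
    """Yield the Fibonacci numbers forever, one at a time."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

def first_above(threshold):
    """Unbounded search written with for: stop at the first match."""
    for value in fibonacci():
        if value > threshold:
            return value

# islice takes a finite prefix of the infinite sequence.
first_ten = list(itertools.islice(fibonacci(), 10))
```

The while True loop lives inside the generator; the client code only ever uses for.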
Design Implications

 Python's for statement uses external iterators, which
 are extremely easy to implement through
 generators
 itertools provides lots of functions to
 manipulate iterators
 The iteration logic is pushed inside the iterator;
 the client code becomes totally agnostic about how
 values are generated
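A minimal sketch of how itertools composes iterators, assuming Python 3; squares and odd_squares_below are hypothetical names chosen for the illustration. Each stage is itself an iterator, so the consumer never learns how the values are produced:

```python
import itertools

def squares():
    """An unbounded producer: 0, 1, 4, 9, ..."""
    for n in itertools.count():
        yield n * n

def odd_squares_below(limit):
    """Compose iterators; no stage builds an intermediate list."""
    bounded = itertools.takewhile(lambda s: s < limit, squares())
    return (s for s in bounded if s % 2 == 1)
```

Calling list(odd_squares_below(100)) drives the whole pipeline one value at a time.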
import socket

def server_socket(host, port):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, port))
    sock.listen(5)
    csock, info = sock.accept()
    return csock.makefile('rw')

def server(host, port):
    fh = server_socket(host, port)
    for i, line in enumerate(fh):
        if line == "EOF\r\n":
            break
        fh.write("%4.d:\t%s" % (i, line))
    fh.close()

 ... (Forking)TCPServer and higher level modules and
 frameworks are better!
import collections

def depth_first_visit(node):
    stack = [node, ]
    while stack:
        current_node = stack.pop()
        stack.extend(reversed(current_node.children))
        yield current_node.value

def breadth_first_visit(node):
    queue = collections.deque((node, ))
    while queue:
        current_node = queue.popleft()
        queue.extend(current_node.children)
        yield current_node.value



for v in depth_first_visit(tree):
    print v,
print

for v in breadth_first_visit(tree):
    print v,
print
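The slides do not show the tree type used by the two visits. The following sketch (Python 3) assumes a hypothetical Node class with a value and a list of children, and a made-up example tree, so that the generators can actually be run:

```python
import collections

class Node:
    """Hypothetical tree node: a value plus a list of child nodes."""
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

def depth_first_visit(node):
    stack = [node]
    while stack:
        current_node = stack.pop()
        # reversed() so that the leftmost child is popped first
        stack.extend(reversed(current_node.children))
        yield current_node.value

def breadth_first_visit(node):
    queue = collections.deque([node])
    while queue:
        current_node = queue.popleft()
        queue.extend(current_node.children)
        yield current_node.value

# Example tree (made up):  A -> (B, C),  B -> (D, E)
tree = Node('A', [Node('B', [Node('D'), Node('E')]), Node('C')])
```

The only difference between the two visits is the container: a stack (list) gives depth-first order, a queue (deque) gives breadth-first order.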
PEP-8

         http://www.python.it/doc/articoli/pep-8.html
         http://www.python.org/dev/peps/pep-0008/

 ‘‘‘One of Guido’s key insights is that code is read much
 more often than it is written. The guidelines provided here
 are intended to improve the readability of code and make
 it consistent across the wide spectrum of Python code. As
 PEP 20 [6] says, “Readability counts”.’’’
PEP-8 (II)

 Standard for source code style
   names
   whitespace
   indentation
 Consistency with this style guide is important.
 Consistency within a project is more important.
 Consistency within one module or function is
 most important.
Indentation

 4 spaces, don’t mix tabs and spaces
 79 characters per line max
 Wrap lines using implied line continuation inside (), [] and {}
 Add parentheses to wrap lines
    (not filename.startswith('.') and
     filename.endswith(('.pyc', '.pyo')))

 Sometimes backslash is more appropriate
 Newline after operators
 One blank line between methods, two between top-level
 functions and classes
Space Invaders

 Put a space after “,” [parameters, lists, tuples, etc.]
 Put a space after “:” in dicts, not before
 Put spaces around assignments and comparisons
   Unless it is a keyword argument or a default value
   in an argument list
 No spaces just inside parentheses or just before
 argument lists
Naming conventions (I)

 Always use descriptive names; the wider the
 scope, the longer the name
 Trailing underscore: avoids conflict with
 keywords or builtins (class_)
 Leading underscore: “internal use”/non-public
 Double leading underscore: name mangling
 Double leading and trailing: “magic”
 Avoid l, O, I and similar names that are easily
 confused with digits
Naming conventions (II)

              simple   lower_case   CamelCase   ALL_CAPS
 Classes                                X
 Variables      X          X
 Methods        X          X
 Functions      X          X
 Constants                                          X
 Packages       X
 Modules        X         (x)
              ... and self/cls first argument name for methods
Default values

 The default values are evaluated once, when
 the function is defined, and are ‘shared’ among
 all call points
 If the default value is a mutable object, that
 leads to bugs

 >>> def f(x=[]):
 ...     x.append(1)
 ...     return x
 ...
 >>> f()
 [1]
 >>> f()
 [1, 1]
 >>> f()
 [1, 1, 1]

 >>> def g(x=None):
 ...     x = [] if x is None else x
 ...     return x
 ...
 >>> g()
 []
 >>> g([1, 2])
 [1, 2]
Functions are Objects

 In Python everything is an object
   Thus, functions are objects
   Functions can be passed as arguments (easy)
   Functions can be returned as return values
 Some APIs explicitly expect functions as arguments (sort(key=))
       import sys, urllib
       def reporthook(*a): print a
       for url in sys.argv[1:]:
           i = url.rfind('/')
           file = url[i+1:]
           print url, "->", file
           urllib.urlretrieve(url, file, reporthook)
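Two of the bullets above, functions returned as values and APIs that take functions such as sort(key=), can be sketched together. This is a Python 3 illustration with made-up data; by_field is a hypothetical helper, not a library function:

```python
def by_field(index):
    """Return a new function that extracts item[index] (a closure)."""
    def key(item):
        return item[index]
    return key

pantry = [("spam", 3), ("eggs", 1), ("ham", 2)]
names = ["pear", "fig", "banana"]

by_quantity = sorted(pantry, key=by_field(1))   # function built at run time
by_length = sorted(names, key=len)              # a builtin passed as an argument
```

The standard library's operator.itemgetter offers the same behaviour as by_field; the hand-written version is shown only to make the returned function visible.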
Internal Iterators

import collections
import sys

def dfs(node, action):
    stack = [node, ]
    while stack:
        current_node = stack.pop()
        stack.extend(reversed(current_node.children))
        action(current_node.value)

def bfs(node, action):
    queue = collections.deque((node, ))
    while queue:
        current_node = queue.popleft()
        queue.extend(current_node.children)
        action(current_node.value)


dfs(tree, lambda x: sys.stdout.write("%s, " % x))
def dfs(node, pre_action=None, post_action=None):
    def nop(node): pass
    pre_action = pre_action or nop # bad, use if
    post_action = post_action or nop # bad
    stack = []
    def process_node(n):
        def do_pre(): pre_action(n.value)
        def do_post(): post_action(n.value)
        def do_process():
            stack.append(do_post)
            for child in reversed(n.children):
                stack.append(process_node(child))
            stack.append(do_pre)
        return do_process
    stack.append(process_node(node))
    while stack:
        action = stack.pop()
        action()

dfs(tree, pre_action=lambda x: sys.stdout.write("%s, " % x))
print
dfs(tree, post_action=lambda x: sys.stdout.write("%s, " % x))
print
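Stack contents during dfs on the example tree

 Tree:
        A
       / \
      B   C
         / \
        D   E

 Legend (kinds of stack entries): Pre, Proc, Post

 Step   Stack (bottom to top)
   1    A
   2    A C B A
   3    A C B
   4    A C B B
   5    A C B
   6    A C
   7    A C E D C
   8    A C E D
   9    A C E D D
  10    A C E D
  11    A C E
  12    A C E E
  13    A C E
  14    A C
  15    A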
def dfs(node, pre_action=None, post_action=None):
    def nop(node): pass
    pre_action = pre_action or nop
    post_action = post_action or nop
    stack = []
    def process_node(n):
        def do_pre(): pre_action(n.value)
        def do_post(): post_action(n.value)
        def do_process():
            stack.append(do_post)
            for child in reversed(n.children):
                stack.append(process_node(child))
            stack.append(do_pre)
        return do_process

    stack.append(process_node(node))

    while stack:
        action = stack.pop()
        action()
                        Command Pattern is obsolete...
class TreePrinter(object):
    def __init__(self, fh, step='   '):
        self.out = fh
        self.step = step
        self.level = 0

    def pre_print(self, value):
        self.out.write(self.step * self.level)
        self.out.write(str(value))
        self.out.write('\n')
        self.level += 1

    def post_print(self, _):
        self.level -= 1

tp = TreePrinter(sys.stdout)
dfs(tree, tp.pre_print, tp.post_print)

[Figure: sample output, the values 0 to 11 printed one per line with indentation]
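To see the pre/post actions and TreePrinter working together, here is a self-contained sketch in Python 3. The Node class and the example tree are assumptions (the slides never define them), the output goes to an in-memory buffer instead of sys.stdout, and the `or nop` defaults are replaced by explicit `is None` checks as the slide's own comment suggests.

```python
import io

class Node:
    """Hypothetical tree node: a value plus a list of child nodes."""
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

def dfs(node, pre_action=None, post_action=None):
    def nop(value):
        pass
    if pre_action is None:
        pre_action = nop
    if post_action is None:
        post_action = nop
    stack = []

    def process_node(n):
        def do_pre():
            pre_action(n.value)
        def do_post():
            post_action(n.value)
        def do_process():
            # Pushed in reverse order of execution: the stack is LIFO.
            stack.append(do_post)
            for child in reversed(n.children):
                stack.append(process_node(child))
            stack.append(do_pre)
        return do_process

    stack.append(process_node(node))
    while stack:
        stack.pop()()

class TreePrinter:
    def __init__(self, fh, step='   '):
        self.out = fh
        self.step = step
        self.level = 0

    def pre_print(self, value):
        self.out.write(self.step * self.level + str(value) + '\n')
        self.level += 1

    def post_print(self, _):
        self.level -= 1

# Example tree (assumed):  A -> (B, C),  C -> (D, E)
tree = Node('A', [Node('B'), Node('C', [Node('D'), Node('E')])])
buffer = io.StringIO()
tp = TreePrinter(buffer)
dfs(tree, tp.pre_print, tp.post_print)
```

The bound methods tp.pre_print and tp.post_print carry their own state (the current level), which is what makes a separate command object unnecessary here.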
The case of the missing switch

 Some people think Python should have a switch/case-like
 statement, something that executes a block of code
 determined by the value of a variable
 Possible solutions
   Python if/elif/else statement
   Seems the job for a dictionary + functions
   A cleverly designed class can solve the problem as
   well
What if we use the if?

 An if statement is easy to read and write, if there are few
 branches. Confusing if there are many branches
 Theoretically correct (provided that the conditions are disjoint)

 \[
 f(x_1, \dots, x_n) =
 \begin{cases}
 \varphi_1(x_1, \dots, x_n) & \text{if } \rho_1(x_1, \dots, x_n) \\
 \quad \vdots \\
 \varphi_m(x_1, \dots, x_n) & \text{if } \rho_m(x_1, \dots, x_n) \\
 \varphi_{m+1}(x_1, \dots, x_n) & \text{otherwise}
 \end{cases}
 \]

 Maybe slower as conditions are evaluated in order
 Some suggest that if statements should be banned ;)
Dictionary

 If the body of the switch essentially sets some
 (set of) variable(s), a dictionary is perfect

 def some_function(n, *more_args):
     # ...
     masks = {
         0: '0000', 1: '0001', 2: '0010', 3: '0011',
         4: '0100', 5: '0101', 6: '0110', 7: '0111',
         8: '1000', 9: '1001', 10: '1010', 11: '1011',
         12: '1100', 13: '1101', 14: '1110', 15: '1111'
     }
     # ...
     str_bits = masks[n]
Dictionary [+ Functions]

 If the “actions” in the branches are naturally
 abstracted as functions, a dictionary is perfect
 import operator
 # ...
 class BinOp(Node):
     # ...
     def compute(self):
         operations = {
             '+': operator.add,
             '-': operator.sub,
             '*': operator.mul,
             '/': operator.div
         }
         return operations[self.op](self.left.compute(),
                                    self.right.compute())
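A standalone sketch of the dictionary-plus-functions dispatch, written for Python 3. Note that operator.div exists only in Python 2; Python 3 provides operator.truediv and operator.floordiv instead. The names OPERATIONS and apply_op are made up for the example.

```python
import operator

OPERATIONS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.truediv,  # Python 3 replacement for operator.div
}

def apply_op(op, left, right):
    """Look the operator up and call it; unknown symbols raise ValueError."""
    try:
        func = OPERATIONS[op]
    except KeyError:
        raise ValueError("unsupported operator: %r" % op)
    return func(left, right)
```

The except clause plays the role of the default branch of a switch statement.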
import cmd

class Example(cmd.Cmd):
    def do_greet(self, rest):
        print 'Hello %s!' % rest

    def do_quit(self, rest):
        return True

# The same loop written with a switch statement
# (hypothetical syntax, not valid Python):
while 1:
    words = raw_input('(cmd) ').split(' ', 1)
    command = words[0]
    try: rest = words[1]
    except IndexError: rest=''

    switch command:
        case 'greet':
            print 'Hello %s!' % rest
        case 'quit':
            break
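cmd.Cmd finds its handlers by name: an input line starting with greet is routed to the method do_greet. The idea of letting a class act as the switch can be sketched in a few lines. This is a toy illustration in Python 3, not the implementation of cmd.Cmd; Dispatcher and its methods are hypothetical names.

```python
class Dispatcher:
    """Toy command dispatcher: 'greet Bob' calls self.do_greet('Bob')."""

    def handle(self, line):
        command, _, rest = line.partition(' ')
        # getattr with a fallback plays the role of the default branch.
        method = getattr(self, 'do_' + command, self.default)
        return method(rest)

    def default(self, rest):
        return 'unknown command'

    def do_greet(self, rest):
        return 'Hello %s!' % rest

    def do_quit(self, rest):
        return True
```

Adding a new command means adding a new method; the dispatching code never changes.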



Properties are a neat way to implement attributes
whose usage resembles attribute access, but
whose implementation uses method calls.

These are sometimes known as “managed
attributes”.
                                      GvR
Example (Track)

class Track(object):
    def __init__(self, artist, title, duration):
        self.artist = artist
        self.title = title
        self.duration = duration

    def __str__(self):
        return '%s - %s - %s' % (self.artist,
                                 self.title,
                                 self.duration)
Properties (I)

 Track has public attributes
 “Java” bad-practice
   Dependency on “implementation details”
   What if we need validation in setters and such?
 property: old attribute access syntax, function calls under
 the hood

 class A(object):
     def __init__(self, foo):
         self._foo = foo

     def get_foo(self):
         print 'got foo'
         return self._foo

     def set_foo(self, val):
         print 'set foo'
         self._foo = val

     foo = property(get_foo, set_foo)

 a = A('hello')
 print a.foo
 # => 'got foo'
 # => 'hello'
 a.foo = 'bar'
 # => 'set foo'
Properties (II)

 Sometimes we don’t need the setter...
 class A(object):
     def __init__(self, foo):
         self._foo = foo

     def get_foo(self):
         print 'got foo'
         return self._foo

     foo = property(get_foo)

 a = A('ciao')
 print a.foo
 # => 'got foo'
 # => 'ciao'
 a.foo = 'bar'
 # Traceback (most recent call last):
 # File "prop_example2.py", line 15, in <module>
 #    a.foo = 'bar'
 # AttributeError: can't set attribute
Properties (III)

 Nicer syntax: decorators are handy
 class A(object):
     def __init__(self, foo):
         self._foo = foo

     @property
     def foo(self):
         print 'got foo'
         return self._foo

 a = A('hello')
 print a.foo
 # => 'got foo'
 # => 'hello'
 a.foo = 'bar'
 # Traceback (most recent call last):
 # File "prop_example2.py", line 15, in <module>
 #    a.foo = 'bar'
 # AttributeError: can't set attribute
Properties (IV)

 From Python 2.6, decorator for the setter:
 class A(object):
     def __init__(self, foo):
         self._foo = foo

     @property
     def foo(self):
         print 'got foo'
         return self._foo

     @foo.setter
     def foo(self, value):
         print 'set foo'
         self._foo = value

 a = A('hello')
 a.foo = 'bar'
 # => 'set foo'
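The earlier question about validation in setters can be answered with the same decorator syntax. The sketch below is written for Python 3; the validation rule (non-negative duration) is an assumption made up for the example. The attribute keeps its plain a.duration syntax, so client code does not change when the check is added.

```python
class Track:
    def __init__(self, title, duration):
        self.title = title
        self.duration = duration  # goes through the setter below

    @property
    def duration(self):
        return self._duration

    @duration.setter
    def duration(self, seconds):
        # Hypothetical rule for the example: reject negative values.
        if seconds < 0:
            raise ValueError("duration must be non-negative")
        self._duration = seconds
```

A rejected assignment leaves the previous value untouched, because the exception is raised before self._duration is rebound.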
class Track(object):
    def __init__(self, artist, title, duration):
        self._artist = artist
        self._title = title
        self._duration = duration

    @property
    def artist(self):
        return self._artist

    @property
    def title(self):
        return self._title

    @property
    def duration(self):
        return self._duration

    def __str__(self):
        return '%s - %s - %s' % (self.artist,
                                 self.title,
                                 self.duration)
How Pythonic?

We can decouple interface from implementation
(getters/setters)
We have “read-only” attributes,
  therefore, “immutable” objects
Trivial getters/setters are repetitive
Properties are helpful in order to evolve code,
but are verbose for defining “immutable objects”
Named Tuples

Named Tuples solve the problem nicely
  Immutable objects (easier to use, too much C++
  and FP lately ☺)
  Can be used both as objects and tuples
  __str__ and other methods have good default
  implementations
    Subclassing can be used to change defaults
  Very quick to write!
  http://code.activestate.com/recipes/500261-named-tuples/
import collections

Track = collections.namedtuple('Track',
    ['title', 'artist', 'duration'])
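A short sketch of what the one-line definition provides, using Python 3 and made-up field values; try_to_mutate is a hypothetical helper that only exists to demonstrate immutability:

```python
import collections

Track = collections.namedtuple('Track', ['title', 'artist', 'duration'])

t = Track(title='Song', artist='Band', duration=180)

title, artist, duration = t          # unpacks like a tuple
longer = t._replace(duration=240)    # "modifying" returns a new object

def try_to_mutate(track):
    """Return True if attribute assignment is rejected."""
    try:
        track.duration = 0
    except AttributeError:
        return True
    return False
```

Fields are reachable both by name (t.title) and by position (t[0]), and the generated __repr__ shows the field names.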
About Java/C++ types...

 In statically typed languages like C++ we
 constrain parameters to be of a given type or any
 of its subtypes
 However, a good programming practice is to
 program to an interface
   Java interfaces (true dynamic polymorphism)
   C++ templates (static polymorphism)
   Both solutions have problems
   (however, I do love ML static typing...)
Books, search by title

 If the list contains a non-book, an exception is
 raised
 Does not even work with subclasses
 Worst strategy
   Never type-check like that
   Solving a non-problem

class Book(object):
    def __init__(self, title, author):
        self.title = title
        self.author = author

def find_by_title(seq, title):
    for item in seq:
        if type(item) == Book: # horrible
            if item.title == title:
                return item
        else:
            raise TypeError

def find_by_author(seq, author):
    for item in seq:
        if type(item) == Book: # horrible
            if item.author == author:
                return item
        else:
            raise TypeError
Books, search by title

 Subclasses are ok
 However, code does not depend on elements being
 books
   They have a title
   They have an author
 What about songs?
 Bad strategy, after all

class Book(object):
    def __init__(self, title, author):
        self.title = title
        self.author = author

def find_by_title(seq, title):
    for item in seq:
        if isinstance(item, Book): # bad
            if item.title == title:
                return item
        else:
            raise TypeError

def find_by_author(seq, author):
    for item in seq:
        if isinstance(item, Book): # bad
            if item.author == author:
                return item
        else:
            raise TypeError
Books, search by title

 What about songs?

class Song(object):
    def __init__(self, title, author):
        self.title = title
        self.author = author

 find_by_title and find_by_author are unchanged: they still test
 isinstance(item, Book), so a Song raises TypeError even though
 it has a title and an author
 Bad strategy, after all
What about movies?

Movies have a title. However, they have a director
and no author
find_by_title should work, find_by_author
shouldn’t
An interface for Book and Song. And what about Movie?
Design pattern or code duplication
Square wheels, roads designed for square wheels

Duck typing simply avoids the problem
Books and Songs

 The simplest solution is the best
 Programmers do not code by chance (hopefully)
 AttributeErrors are raised in case of problems
 Unit tests discover these kinds of errors
   You have unit tests, don’t you?

class Book(object):
    def __init__(self, t, a):
        self.title = t
        self.author = a

def find_by_title(seq, title):
    for item in seq:
        if item.title == title:
            return item

def find_by_author(seq, author):
    for item in seq:
        if item.author == author:
            return item
class NotFound(Exception):
    pass

def find_by(seq, **kwargs):
    for obj in seq:
        for key, val in kwargs.iteritems():
            try:
                if getattr(obj, key) != val:
                    break
            except AttributeError:
                break
        else:
            return obj
    raise NotFound

print find_by(books, title='Python in a Nutshell')
print find_by(books, author='M. Beri')
print find_by(books, title='Python in a Nutshell',
                     author='A. Martelli')

try:
    print find_by(books, title='Python in a Nutshell',
                         author='M. Beri')
    print find_by(books, title='Python in a Nutshell',
                         pages=123)
except NotFound: pass
def find_by(seq, **kwargs):
    for obj in seq:
        for key, val in kwargs.iteritems():
            try:
                 attr = getattr(obj, key)
            except AttributeError:
                 break
            else:
                 if val != attr and val not in attr:
                     break
        else:
            yield obj
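A runnable sketch of the generator version, assuming Python 3 (items() instead of iteritems()). The Book and Movie classes and the sample data are made up, and the match is simplified to plain equality rather than the slide's equality-or-containment test.

```python
class Book:
    def __init__(self, title, author):
        self.title = title
        self.author = author

class Movie:
    def __init__(self, title, director):
        self.title = title
        self.director = director

def find_by(seq, **kwargs):
    """Yield every object whose attributes equal all the given values."""
    for obj in seq:
        for key, val in kwargs.items():
            try:
                attr = getattr(obj, key)
            except AttributeError:
                break  # obj lacks the attribute: not a match
            else:
                if attr != val:
                    break
        else:
            # No break happened: every criterion matched.
            yield obj

library = [Book('Alpha', 'Ann'), Movie('Alpha', 'Bob'), Book('Beta', 'Ann')]
```

Searching by title returns books and movies alike, while searching by author silently skips the movie: no type check and no shared base class is involved.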
Life expectations

 Function parameters and every variable bound in
 a function body constitute the function's local
 scope
 The scope of these variables is the whole function body
 However, using them before binding is an error

     a = None
     if s.startswith(t):
         a = s[:4]
     else:
         a = t
     print a
                    WRONG
     if s.startswith(t):
         a = s[:4]
     else:
         a = t
     print a
                    GOOD
LBYL vs. EAFP

LBYL: Look before you leap
EAFP: Easier to ask forgiveness than permission
Usually EAFP is the best strategy
  Exceptions are rather fast
  Atomicity, ...

# LBYL -- bad
if id_ in employees:
    emp = employees[id_]
else:
    report_error(...)

# EAFP -- good
try:
    emp = employees[id_]
except KeyError:
    report_error(...)
if os.access(filename, os.F_OK):
    fh = file(filename)
else:                                   BAD
    print "Something went bad."

if os.access(filename, os.F_OK):
    try:
         fh = file(filename)
    except IOError:                    VERBOSE
         print "Something went bad."
else:
    print "Something went bad."



try:
    fh = file(filename)
except IOError:                        GOOD
    print "Something went bad."
More on Exceptions

Exceptions should subclass Exception directly or
indirectly
Catch exceptions using the most specific
specifier
Don’t use the bare except: unless
  You plan to re-raise the exception (but you
  probably should use finally)
  You want to log any error or something like that
A bare except: also catches KeyboardInterrupt
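The difference between except Exception and a bare except: can be sketched as follows (Python 3). KeyboardInterrupt derives from BaseException, not from Exception, so only a bare except: (or except BaseException:) intercepts it. NotFound and caught_by_except_exception are names invented for the example.

```python
class NotFound(Exception):
    """Application-specific error that subclasses Exception."""

def caught_by_except_exception(exc):
    """Report whether `except Exception` would intercept `exc`."""
    try:
        raise exc
    except Exception:
        return True
    except BaseException:
        # Reached for KeyboardInterrupt, SystemExit and similar.
        return False
```

This is why a handler meant for ordinary errors should name Exception or something more specific.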
Limit the try scope

 try:
     # Too broad!                                     BAD
     return handle_value(collection[key])
 except KeyError:
     # Will also catch KeyError raised by handle_value()
     return key_not_found(key)



 try:                                              GOOD
     value = collection[key]
 except KeyError:
     return key_not_found(key)
 else:
     return handle_value(value)
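The effect of the two try scopes can be demonstrated with a deliberately buggy handler. The sketch is Python 3; lookup_broad, lookup_narrow and buggy_handler are hypothetical names, and the string 'not found' stands in for key_not_found(key).

```python
def lookup_broad(collection, key, handle_value):
    try:
        return handle_value(collection[key])
    except KeyError:
        # Also swallows a KeyError raised inside handle_value.
        return 'not found'

def lookup_narrow(collection, key, handle_value):
    try:
        value = collection[key]
    except KeyError:
        return 'not found'
    else:
        return handle_value(value)

def buggy_handler(value):
    """Handler with a bug: it raises KeyError on its own."""
    return {}['oops']
```

With the broad scope, the handler's bug is misreported as a missing key; with the narrow scope, the KeyError from the handler propagates and the bug stays visible.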
References

 Python in a Nutshell, 2ed, Alex Martelli, O’Reilly
 Python Cookbook, Alex Martelli, Anna Martelli
 Ravenscroft and David Ascher, O’Reilly
 Agile Software Development: Principles, Patterns and
 Practices, Robert C. Martin, Prentice Hall
 Clean Code, Robert C. Martin, Prentice Hall
 Structure and Interpretation of Computer Programs,
 H. Abelson, G. Sussman, J. Sussman,
 http://mitpress.mit.edu/sicp/full-text/book/book.html

 http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html
 http://dirtsimple.org/2004/12/python-is-not-java.html
 http://docs.python.org/dev/howto/doanddont.html
 http://www.slideshare.net/sykora/idiomatic-python
 http://bayes.colorado.edu/PythonIdioms.html




Q&A

More Related Content

Pydiomatic

  • 2. Could you please lend me the thing that you put in 2 the wall when you want to turn on the hairdryer and the hairdryer comes from a different country? Could you please lend me a power adapter?
  • 3. 3 If you are out to describe the truth, leave elegance to the tailor. Albert Einstein
  • 4. 4 Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. Brian Kernighan
  • 5. 5 READABILITY COUNTS Zen of Python
  • 6. TOC 6 Iteration Naming Functions are objects Choice Attributes and methods Duck Typing Exceptions [unless TimeoutError is thrown]
  • 7. FOR vs. WHILE vs. ... 7 Iteration vs. Recursion sys.setrecursionlimit(n) for vs. while Traditionally bounded iteration vs. unbounded iteration In C for and while are completely equivalent Some languages have for/foreach to iterate on collections for file in *.py; do pygmentize -o ${file%.py}.rtf $file done
  • 8. Numerical Iteration 8 int i = 0; i = 0 while(i < MAX) { while i < MAX: printf("%dn", i); print i ++i; i += 1 } # O(n) space for i in range(MAX): int i = 0; print i for(i=0; i < MAX; ++i) { printf("%dn", i); # O(1) space } for i in xrange(MAX): print i
  • 9. Iteration on elements 9 It is also common to iterate on elements of some i = 0 collection while i < len(lst): process(lst[i]) BAD C uses indices to iterate on i += 1 array elements Python uses for What if we want to iterate for el in lst: both on elements and process(el) GOOD indices?
  • 10. j = 0 while j < len(lst): process(index=j, element=lst[j]) BAD j += 1 10 for j in range(len(lst)): process(index=j, element=lst[j]) BAD for j, el in enumerate(lst): process(index=j, element=el) GOOD
  • 11. What about Turing? 11 for is usually considered the more pythonic alternative Ideally every iteration should be done using for However, we have shown only iteration on finite collections, that is to say, for would not provide turing-completeness But everybody knows about generators: Python has infinite (lazy) sequences and they cover many other patterns as well
  • 12. Design Implications 12 Python for statement uses external iterators, that are extremely easy to implement through generators itertools provides lots of functions to manipulate iterators The iteration logic is pushed inside the iterator; the client code becomes totally agnostic on how values are generated
  • 13. def server_socket(host, port): sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) sock.bind((host, port)) sock.listen(5) csock, info = sock.accept() return csock.makefile('rw') 13 def server(host, port): fh = server_socket(host, port) for i, line in enumerate(fh): if line == "EOFrn": break fh.write("%4.d:t%s" % (i, line)) fh.close() ... (Forking)TCPServer and higher level modules and frameworks are better!
  • 14. def depth_first_visit(node): stack = [node, ] while stack: current_node = stack.pop() stack.extend(reversed(current_node.children)) 14 yield current_node.value def breadth_first_visit(node): queue = collections.deque((node, )) while queue: current_node = queue.popleft() queue.extend(current_node.children) yield current_node.value for v in depth_first_visit(tree): print v, print for v in breadth_first_visit(tree): print v, print
  • 15. PEP-8 15 http://www.python.it/doc/articoli/pep-8.html http://www.python.org/dev/peps/pep-0008/ ‘‘‘One of Guido’s key insights is that code is read much more often than it is written. The guidelines provided here are intended to improve the readability of code and make consistent across the wide spectrum of Python code. As PEP 20 [6] says, “Readability counts”.’’’
  • 16. PEP-8 (II)
      Standard for source code style
          names
          whitespace
          indentation
      Consistency with this style guide is important.
      Consistency within a project is more important.
      Consistency within one module or function is most important.
  • 17. Indentation
      4 spaces, don’t mix tabs and spaces
      79 characters per line max
      Wrap lines using implied line continuation inside (), [] and {}
      Add parentheses to wrap lines
          (not filename.startswith('.') and
           filename.endswith(('.pyc', '.pyo')))
      Sometimes a backslash is more appropriate
      Newline after operators
      One blank line between functions, two between classes
  • 18. Space Invaders
      Put a space after “,” [parameters, lists, tuples, etc]
      Put a space after “:” in dicts, not before
      Put spaces around assignments and comparisons
          Unless it is an argument list
      No spaces just inside parentheses or just before argument lists
  • 19. Naming conventions (I)
      Always use descriptive names; the longer the scope, the longer the name
      Trailing underscore: avoids conflict with keywords or builtins (class_)
      Leading underscore: “internal use”/non-public
      Double leading underscore: name mangling
      Double leading and trailing: “magic”
      Avoid l, 1 and similar confusing names
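The underscore conventions above can be seen in a short Python 3 sketch. The class `Account`, the function `make_tag` and their attributes are hypothetical names chosen for the example.

```python
class Account(object):
    def __init__(self, owner):
        self.owner = owner   # public attribute
        self._notes = []     # single leading underscore: non-public by convention
        self.__token = 42    # double leading underscore: mangled to _Account__token


def make_tag(name, class_=None):
    """The trailing underscore avoids a clash with the keyword `class`."""
    return (name, class_)


acct = Account("alice")
# Name mangling is visible in the instance dictionary.
mangled_names = [n for n in vars(acct) if "token" in n]
```

Mangling only renames the attribute; it is a guard against accidental clashes in subclasses, not access control.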
  • 20. Naming conventions (II)
                   simple   lower_case   CamelCase   ALL_CAPS
      Classes                                X
      Variables      X          X
      Methods        X          X
      Functions      X          X
      Constants                                          X
      Packages       X
      Modules        X         (x)
      ... and self/cls first argument name for methods
  • 21. Default values
      The default values are evaluated once, when the function is defined, and are ‘shared’ among all call points
      If the default value is a mutable object, that leads to bugs

      >>> def f(x=[]):
      ...     x.append(1)
      ...     return x
      ...
      >>> f()
      [1]
      >>> f()
      [1, 1]
      >>> f()
      [1, 1, 1]

      >>> def g(x=None):
      ...     x = [] if x is None else x
      ...     return x
      ...
      >>> g()
      []
      >>> g([1, 2])
      [1, 2]
  • 22. Functions are Objects
      In Python everything is an object
      Thus, functions are objects
      Functions can be passed as arguments (easy)
      Functions can be returned as return values
      Some APIs explicitly expect functions as arguments (sort(key=))

      import sys, urllib

      def reporthook(*a):
          print a

      for url in sys.argv[1:]:
          i = url.rfind('/')
          file = url[i+1:]
          print url, "->", file
          urllib.urlretrieve(url, file, reporthook)
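A brief Python 3 sketch of the two directions mentioned above: a function passed as an argument (the `key=` parameter of `sorted`) and a function returned as a value. `by_length` and `make_multiplier` are names invented for the example.

```python
def by_length(word):
    """Key function: sorted() calls it once per element."""
    return len(word)


def make_multiplier(factor):
    """Return a new function that remembers `factor` (a closure)."""
    def multiply(x):
        return x * factor
    return multiply


words = ["pear", "fig", "banana"]
sorted_words = sorted(words, key=by_length)  # function passed as argument
triple = make_multiplier(3)                  # function returned as value
```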
  • 23. Internal Iterators
      import collections, sys

      def dfs(node, action):
          stack = [node, ]
          while stack:
              current_node = stack.pop()
              stack.extend(reversed(current_node.children))
              action(current_node.value)

      def bfs(node, action):
          queue = collections.deque((node, ))
          while queue:
              current_node = queue.popleft()
              queue.extend(current_node.children)
              action(current_node.value)

      dfs(tree, lambda x: sys.stdout.write("%s, " % x))
  • 24.
      def dfs(node, pre_action=None, post_action=None):
          def nop(node):
              pass
          pre_action = pre_action or nop    # bad, use if
          post_action = post_action or nop  # bad
          stack = []

          def process_node(n):
              def do_pre():
                  pre_action(n.value)
              def do_post():
                  post_action(n.value)
              def do_process():
                  # Pushed in reverse order: pre runs first, post runs last
                  stack.append(do_post)
                  for child in reversed(n.children):
                      stack.append(process_node(child))
                  stack.append(do_pre)
              return do_process

          stack.append(process_node(node))
          while stack:
              action = stack.pop()
              action()

      dfs(tree, pre_action=lambda x: sys.stdout.write("%s, " % x))
      print
      dfs(tree, post_action=lambda x: sys.stdout.write("%s, " % x))
      print
  • 25. [Figure: contents of the stack at each of the 15 steps of the dfs above, run on a sample tree with nodes A, B, C, D and E. Legend: Pre, Proc, Post.]
  • 26. Command Pattern is obsolete...
  • 27.
      class TreePrinter(object):
          def __init__(self, fh, step=' '):
              self.out = fh
              self.step = step
              self.level = 0

          def pre_print(self, value):
              self.out.write(self.step * self.level)
              self.out.write(str(value))
              self.out.write('\n')
              self.level += 1

          def post_print(self, _):
              self.level -= 1

      tp = TreePrinter(sys.stdout)
      dfs(tree, tp.pre_print, tp.post_print)

      [Slide shows the printed output: the values 0 through 11, one per line.]
  • 28. The case of the missing switch
      Some people think Python should have a switch/case-like statement, something that executes a block of code determined by the value of a variable
      Possible solutions
          Python if/elif/else statement
          Seems a job for a dictionary + functions
          A cleverly designed class can solve the problem as well
  • 29. What if we use the if?
      An if statement is easy to read and write if there are few branches
      Confusing if there are many branches
      Theoretically correct (provided that the conditions are disjoint)

      \[
      f(x_1, \dots, x_n) =
      \begin{cases}
        \varphi_1(x_1, \dots, x_n) & \text{if } \rho_1(x_1, \dots, x_n) \\
        \quad\vdots \\
        \varphi_m(x_1, \dots, x_n) & \text{if } \rho_m(x_1, \dots, x_n) \\
        \varphi_{m+1}(x_1, \dots, x_n) & \text{otherwise}
      \end{cases}
      \]

      Maybe slower, as conditions are evaluated in order
      Some suggest that if statements should be banned ;)
  • 30. Dictionary
      If the body of the switch essentially sets some (set of) variable(s), a dictionary is perfect

      def some_function(n, *more_args):
          # ...
          masks = {
              0: '0000', 1: '0001', 2: '0010', 3: '0011',
              4: '0100', 5: '0101', 6: '0110', 7: '0111',
              8: '1000', 9: '1001', 10: '1010', 11: '1011',
              12: '1100', 13: '1101', 14: '1110', 15: '1111'
          }
          # ...
          str_bits = masks[n]
  • 31. Dictionary [+ Functions]
      If the “actions” in the branches are naturally abstracted as functions, a dictionary is perfect

      import operator
      # ...
      class BinOp(Node):
          # ...
          def compute(self):
              operations = {
                  '+': operator.add, '-': operator.sub,
                  '*': operator.mul, '/': operator.div
              }
              return operations[self.op](self.left.compute(),
                                         self.right.compute())
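The dictionary-plus-functions idea can be tried in isolation with the following Python 3 sketch. It is not the `BinOp` class from the slide: `OPERATIONS` and `apply_op` are names assumed for the example, and `operator.truediv` is used because `operator.div` exists only in Python 2.

```python
import operator

# Dispatch table: the "case labels" are keys, the "bodies" are functions.
OPERATIONS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.truediv,
}


def apply_op(symbol, left, right):
    """Look up the function for `symbol` and call it."""
    try:
        func = OPERATIONS[symbol]
    except KeyError:
        # Plays the role of the `default:` branch of a switch.
        raise ValueError("unsupported operator: %r" % symbol)
    return func(left, right)
```

Adding a new branch means adding one dictionary entry; the lookup cost does not grow with the number of branches the way a chain of elif tests does.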
  • 32.
      # With the cmd module: one method per command
      import cmd

      class Example(cmd.Cmd):
          def do_greet(self, rest):
              print 'Hello %s!' % rest

          def do_quit(self, rest):
              return True

      # With a hypothetical switch statement (not valid Python)
      while 1:
          words = raw_input('(cmd) ').split(' ', 1)
          command = words[0]
          try:
              rest = words[1]
          except IndexError:
              rest = ''
          switch command:
              case 'greet':
                  print 'Hello %s!' % rest
              case 'quit':
                  break
  • 33. “Properties are a neat way to implement attributes whose usage resembles attribute access, but whose implementation uses method calls. These are sometimes known as ‘managed attributes’.” GvR
  • 34. Example (Track)
      class Track(object):
          def __init__(self, artist, title, duration):
              self.artist = artist
              self.title = title
              self.duration = duration

          def __str__(self):
              return '%s - %s - %s' % (self.artist, self.title, self.duration)
  • 35. Properties (I)
      Track has public attributes
          “Java” bad-practice
          Dependency on “implementation details”
          What if we need validation in setters and such?
      property: old attribute access syntax, function calls under the hood

      class A(object):
          def __init__(self, foo):
              self._foo = foo

          def get_foo(self):
              print 'got foo'
              return self._foo

          def set_foo(self, val):
              print 'set foo'
              self._foo = val

          foo = property(get_foo, set_foo)

      a = A('hello')
      print a.foo    # => 'got foo'
                     # => 'hello'
      a.foo = 'bar'  # => 'set foo'
  • 36. Properties (II)
      Sometimes we don’t need the setter...

      class A(object):
          def __init__(self, foo):
              self._foo = foo

          def get_foo(self):
              print 'got foo'
              return self._foo

          foo = property(get_foo)

      a = A('ciao')
      print a.foo    # => 'got foo'
                     # => 'ciao'
      a.foo = 'bar'
      # Traceback (most recent call last):
      #   File "prop_example2.py", line 15, in <module>
      #     a.foo = 'bar'
      # AttributeError: can't set attribute
  • 37. Properties (III)
      Nicer syntax: decorators are handy

      class A(object):
          def __init__(self, foo):
              self._foo = foo

          @property
          def foo(self):
              print 'got foo'
              return self._foo

      a = A('hello')
      print a.foo    # => 'got foo'
                     # => 'hello'
      a.foo = 'bar'
      # Traceback (most recent call last):
      #   File "prop_example2.py", line 15, in <module>
      #     a.foo = 'bar'
      # AttributeError: can't set attribute
  • 38. Properties (IV)
      From Python 2.6, decorator for the setter:

      class A(object):
          def __init__(self, foo):
              self._foo = foo

          @property
          def foo(self):
              print 'got foo'
              return self._foo

          @foo.setter
          def foo(self, value):
              print 'set foo'
              self._foo = value

      a = A('hello')
      a.foo = 'bar'  # => 'set foo'
  • 39.
      class Track(object):
          def __init__(self, artist, title, duration):
              self._artist = artist
              self._title = title
              self._duration = duration

          @property
          def artist(self):
              return self._artist

          @property
          def title(self):
              return self._title

          @property
          def duration(self):
              return self._duration

          def __str__(self):
              return '%s - %s - %s' % (self.artist, self.title, self.duration)
  • 40. How Pythonic?
      We can decouple interface from implementation (getters/setters)
      We have “read-only” attributes and, therefore, “immutable” objects
      Trivial getters/setters are repetitive
      Properties are helpful in order to evolve code, but are verbose for defining “immutable objects”
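To make "properties help evolve code" concrete, here is a Python 3 sketch in which a plain attribute is later turned into a validated property without changing client code. The reduced `Track` class and the non-negative rule are assumptions made for the example, not the slides' code.

```python
class Track(object):
    def __init__(self, title, duration):
        self.title = title
        self.duration = duration  # goes through the setter below

    @property
    def duration(self):
        return self._duration

    @duration.setter
    def duration(self, seconds):
        # Validation added after the fact; callers still write t.duration = x
        if seconds < 0:
            raise ValueError("duration must be non-negative")
        self._duration = seconds


track = Track("some title", 215)
track.duration = 230  # same syntax as a public attribute
```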
  • 41. Named Tuples
      Named Tuples solve the problem nicely
      Immutable objects (easier to use, too much C++ and FP lately ☺)
      Can be used both as objects and as tuples
      __str__ and other methods have good default implementations
      Subclassing can be used to change defaults
      Very quick to write!
      http://code.activestate.com/recipes/500261-named-tuples/
  • 42.
      import collections

      Track = collections.namedtuple('Track', ['title', 'artist', 'duration'])
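A short usage sketch for the named tuple (Python 3 syntax). The field values are placeholders invented for the example; `_replace` is the standard named-tuple method that returns a modified copy.

```python
import collections

Track = collections.namedtuple('Track', ['title', 'artist', 'duration'])

track = Track('some title', 'some artist', 215)  # placeholder values

# Usable as an object (attribute access) and as a tuple (unpacking, indexing).
title, artist, duration = track

# Immutable: "changing" a field produces a new tuple.
longer = track._replace(duration=230)
```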
  • 43. About Java/C++ types...
      In statically typed languages like C++ we constrain parameters to be of a given type or any of its subtypes
      However, a good programming practice is to program to an interface
          Java interfaces (true dynamic polymorphism)
          C++ Templates (static polymorphism)
      Both solutions have problems (however, I do love ML static typing...)
  • 44. Books, search by title
      If the list contains a non-book, an exception is raised
      Does not even work with subclasses
      Worst strategy
      Never type-check like that
      Solving a non-problem

      class Book(object):
          def __init__(self, title, author):
              self.title = title
              self.author = author

      def find_by_title(seq, title):
          for item in seq:
              if type(item) == Book:  # horrible
                  if item.title == title:
                      return item
              else:
                  raise TypeError

      def find_by_author(seq, author):
          for item in seq:
              if type(item) == Book:  # horrible
                  if item.author == author:
                      return item
              else:
                  raise TypeError
  • 46. Books, search by title
      Subclasses are ok
      However, code does not depend on elements being books
          They have a title
          They have an author
      What about songs?
      Bad strategy, after all

      class Book(object):
          def __init__(self, title, author):
              self.title = title
              self.author = author

      def find_by_title(seq, title):
          for item in seq:
              if isinstance(item, Book):  # bad
                  if item.title == title:
                      return item
              else:
                  raise TypeError

      def find_by_author(seq, author):
          for item in seq:
              if isinstance(item, Book):  # bad
                  if item.author == author:
                      return item
              else:
                  raise TypeError
  • 47. Books, search by title
      What about songs?

      class Song(object):
          def __init__(self, title, author):
              self.title = title
              self.author = author

      find_by_title and find_by_author are unchanged: a Song has a title and an author, yet the isinstance(item, Book) check makes both functions raise TypeError for it
  • 48. What about movies?
      Movies have a title. However, they have a director and no author
      find_by_title should work; find_by_author shouldn’t
      An interface for Book and Song. And what about Movie?
      Design pattern or code duplication
      Square Wheel
          Roads designed for square wheels
      Duck typing simply avoids the problem
  • 49. Books and Songs
      The simplest solution is the best
      Programmers do not code by chance (hopefully)
      AttributeErrors are raised in case of problems
      Unit tests discover this kind of error
      You have unit tests, don’t you?

      class Book(object):
          def __init__(self, t, a):
              self.title = t
              self.author = a

      def find_by_title(seq, title):
          for item in seq:
              if item.title == title:
                  return item

      def find_by_author(seq, author):
          for item in seq:
              if item.author == author:
                  return item
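The behaviour described for movies can be checked with a small Python 3 sketch. The `Movie` class and the sample data are assumptions added for the example; the two search functions follow the duck-typed versions on the slide.

```python
class Book(object):
    def __init__(self, title, author):
        self.title = title
        self.author = author


class Movie(object):
    def __init__(self, title, director):
        self.title = title
        self.director = director  # no `author` attribute on purpose


def find_by_title(seq, title):
    for item in seq:
        if item.title == title:
            return item


def find_by_author(seq, author):
    for item in seq:
        if item.author == author:
            return item


library = [Book('a book', 'an author'), Movie('a movie', 'a director')]
found = find_by_title(library, 'a movie')  # works: Movie has a title
```

Searching the same list by an author that no book matches reaches the `Movie` and raises `AttributeError`, which is exactly the error a unit test would surface.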
  • 50.
      class NotFound(Exception):  # not shown on the slide; needed to run the code
          pass

      def find_by(seq, **kwargs):
          for obj in seq:
              for key, val in kwargs.iteritems():
                  try:
                      if getattr(obj, key) != val:
                          break
                  except AttributeError:
                      break
              else:
                  # Inner loop ended without break: every criterion matched
                  return obj
          raise NotFound

      print find_by(books, title='Python in a Nutshell')
      print find_by(books, author='M. Beri')
      print find_by(books, title='Python in a Nutshell', author='A. Martelli')
      try:
          print find_by(books, title='Python in a Nutshell', author='M. Beri')
          print find_by(books, title='Python in a Nutshell', pages=123)
      except NotFound:
          pass
  • 51.
      def find_by(seq, **kwargs):
          for obj in seq:
              for key, val in kwargs.iteritems():
                  try:
                      attr = getattr(obj, key)
                  except AttributeError:
                      break
                  else:
                      if val != attr and val not in attr:
                          break
              else:
                  yield obj
  • 52. Life expectations
      Function parameters and every variable bound in a function body constitute the function’s local scope
      The scope of these variables is the whole function body
      However, using them before binding is an error
  • 53. Life expectations
      # WRONG
      a = None
      if s.startswith(t):
          a = s[:4]
      else:
          a = t
      print a
  • 54. Life expectations
      # GOOD
      if s.startswith(t):
          a = s[:4]
      else:
          a = t
      print a
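A Python 3 sketch of the scoping rule stated on these slides. The function and variable names are made up for the example. A name bound anywhere in the body is local to the whole body, so reading it before the binding raises `UnboundLocalError`; a name bound on every branch needs no earlier dummy assignment.

```python
def use_before_binding():
    try:
        result = total  # `total` is local (bound below) but not bound yet
    except UnboundLocalError:
        result = "unbound"
    total = 10
    return result, total


def pick(s, t):
    # `a` is bound on both branches, so no `a = None` is needed beforehand.
    if s.startswith(t):
        a = s[:4]
    else:
        a = t
    return a
```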
  • 55. LBYL vs. EAFP
      LBYL: Look before you leap
      EAFP: Easier to ask forgiveness than permission
      Usually EAFP is the best strategy
          Exceptions are rather fast
          Atomicity, ...

      # LBYL -- bad
      if id_ in employees:
          emp = employees[id_]
      else:
          report_error(...)

      # EAFP -- good
      try:
          emp = employees[id_]
      except KeyError:
          report_error(...)
  • 56.
      # BAD
      if os.access(filename, os.F_OK):
          fh = file(filename)
      else:
          print "Something went bad."

      # VERBOSE
      if os.access(filename, os.F_OK):
          try:
              fh = file(filename)
          except IOError:
              print "Something went bad."
      else:
          print "Something went bad."

      # GOOD
      try:
          fh = file(filename)
      except IOError:
          print "Something went bad."
  • 57. More on Exceptions
      Exceptions should subclass Exception directly or indirectly
      Catch exceptions using the most specific specifier
      Don’t use the bare except: unless
          You plan to re-raise the exception (but you probably should use finally)
          You want to log any error or something like that
      It also catches KeyboardInterrupt
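A Python 3 sketch of these guidelines. The `NotFound` exception and the two lookup helpers are hypothetical names chosen for the example: an application exception subclasses `Exception`, and callers catch the most specific type they can handle.

```python
class NotFound(Exception):
    """Application-level error; subclasses Exception, not BaseException."""


def lookup(mapping, key):
    try:
        return mapping[key]
    except KeyError:  # most specific specifier for a failed dict lookup
        raise NotFound(key)


def safe_lookup(mapping, key, default=None):
    try:
        return lookup(mapping, key)
    except NotFound:  # only the error this caller knows how to handle
        return default
```

`KeyboardInterrupt` derives from `BaseException` rather than `Exception`, so `except Exception:` lets it propagate while a bare `except:` swallows it.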
  • 58. Limit the try scope
      # BAD
      try:
          # Too broad!
          return handle_value(collection[key])
      except KeyError:
          # Will also catch KeyError raised by handle_value()
          return key_not_found(key)

      # GOOD
      try:
          value = collection[key]
      except KeyError:
          return key_not_found(key)
      else:
          return handle_value(value)