Iterables
Infinite Iterables
Lazy Evaluation
Iterator Delegation
Iterating Sequences
Sets are unordered collections of items
s = {'x', 'y', 'b', 'c', 'a'}
For general iteration, all we really need is the concept of "get the next item" in the collection
for _ in range(10):
    item = coll.get_next_item()
    print(item)
But how do we know when to stop asking for the next item?
i.e. when all the elements of the collection have been returned
by calling get_next_item()?
→ StopIteration built-in Exception
Attempting to build an Iterable ourselves
Let's try building our own class, which will be a collection of squares of integers
We could make this a sequence, but we want to avoid the concept of indexing
In order to implement a next method, we need to know what we've already "handed out"
so we can hand out the "next" item without repeating ourselves
class Squares:
    def __init__(self):
        self.i = 0

    def next_(self):
        result = self.i ** 2
        self.i += 1
        return result
Iterating over Squares
sq = Squares()
for _ in range(5):
    item = sq.next_()
    print(item)
Output: 0, 1, 4, 9, 16
Python's next() function
class Squares:
    def __init__(self, length):
        self.i = 0
        self.length = length

    def __next__(self):
        if self.i >= self.length:
            raise StopIteration
        else:
            result = self.i ** 2
            self.i += 1
            return result
Iterating over Squares instances
sq = Squares(5)
while True:
    try:
        item = next(sq)
        print(item)
    except StopIteration:
        break
Output: 0, 1, 4, 9, 16
Somehow, we need to tell Python that our class has that __next__
method and that it will behave in a way consistent with using a
while loop to iterate
A protocol is simply a fancy way of saying that our class is going to implement certain
functionality that Python can count on
To let Python know our class can be iterated over using __next__ we implement the iterator protocol
The iterator protocol is quite simple – the class needs to implement two methods:
__iter__ → this method should just return the object (class instance) itself
    (sounds weird, but we'll understand why later)
__next__ → returns the next item from the container, or raises StopIteration
class Squares:
    def __init__(self, length):
        self.i = 0
        self.length = length

    def __iter__(self):
        # required by the iterator protocol so that a for loop accepts the object
        return self

    def __next__(self):
        if self.i >= self.length:
            raise StopIteration
        else:
            result = self.i ** 2
            self.i += 1
            return result

sq = Squares(5)
for item in sq:
    print(item)
Output: 0, 1, 4, 9, 16

Still one issue though!
The iterator cannot be "restarted"
Once we have looped through all the items, the iterator has been exhausted
The drawback is that iterators get exhausted → become useless for iterating again
→ become throw away objects
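A short sketch of what exhaustion looks like in practice. It reuses the Squares iterator shape from the slides; the variable names first_pass, second_pass and fresh are only illustrative.

```python
class Squares:
    """Iterator over the first `length` perfect squares."""

    def __init__(self, length):
        self.i = 0
        self.length = length

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.length:
            raise StopIteration
        result = self.i ** 2
        self.i += 1
        return result


sq = Squares(5)
first_pass = list(sq)    # consumes the iterator
second_pass = list(sq)   # nothing left to hand out
print(first_pass)        # [0, 1, 4, 9, 16]
print(second_pass)       # []

# The only way to iterate again is to build a brand-new instance
fresh = list(Squares(5))
print(fresh)             # [0, 1, 4, 9, 16]
```

The second pass is empty because self.i is never reset: the object is a throw-away once it has raised StopIteration.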
maintaining the collection of items (the container) (e.g. creating, mutating (if mutable), etc)
def __iter__(self):
    return self

def __next__(self):
    if self._index >= len(self._cities):
        raise StopIteration
    else:
        item = self._cities[self._index]
        self._index += 1
        return item
class Cities:
    def __init__(self):
        self._cities = ['New York', 'New Delhi', 'Newcastle']

    def __len__(self):
        return len(self._cities)

class CityIterator:
    def __init__(self, cities):
        self._cities = cities
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._cities):
            raise StopIteration
        else:
            etc…
Example
To use the Cities and CityIterator together, here's how we would proceed:
cities = Cities()
city_iterator = CityIterator(cities)
for city in city_iterator:
    print(city)
But this time, we did not have to re-create the collection – we just
passed in the existing one!
So far…
It would be nicer if we could just iterate over the Cities object instead of CityIterator
The iterable protocol requires that the object implement a single method: __iter__, which returns a new iterator instance each time it is called
class Cities:
    def __init__(self):
        self._cities = ['New York', 'New Delhi', 'Newcastle']

    def __len__(self):
        return len(self._cities)

    def __iter__(self):
        return CityIterator(self)
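A minimal end-to-end sketch of the iterable/iterator split. The slides elide the body of CityIterator.__next__, so the version here is an assumption: the iterator keeps a reference to the Cities object (named city_obj here, a hypothetical name) and reads its private _cities list directly.

```python
class Cities:
    """The iterable: owns the data and hands out fresh iterators."""

    def __init__(self):
        self._cities = ['New York', 'New Delhi', 'Newcastle']

    def __len__(self):
        return len(self._cities)

    def __iter__(self):
        # A brand-new iterator on every call, so iteration can always restart
        return CityIterator(self)


class CityIterator:
    """The iterator: only tracks how far we have got."""

    def __init__(self, city_obj):
        self._city_obj = city_obj
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._city_obj):
            raise StopIteration
        # Assumption: the iterator reaches into the container's private list
        item = self._city_obj._cities[self._index]
        self._index += 1
        return item


cities = Cities()
first = [city for city in cities]
second = [city for city in cities]   # works again: a new iterator was created
print(first == second)               # True
```

Each for loop calls iter(cities), which calls Cities.__iter__, so the exhausted iterator from the previous loop is simply discarded.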
Iterable vs Iterator
The first thing Python does when we try to iterate over an object:
it calls the __iter__ method (we'll actually come back to this for sequences!)
properties of classes may not always be populated when the object is created
value of a property only becomes known when the property is requested - deferred
Example
class Actor:
    def __init__(self, actor_id):
        self.actor_id = actor_id
        self.bio = lookup_actor_in_db(actor_id)
        # backing attribute; it must not share the property's name
        self._movies = None

    @property
    def movies(self):
        if self._movies is None:
            self._movies = lookup_movies_in_db(self.actor_id)
        return self._movies
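A runnable sketch of the deferred lookup. The database functions in the slides are not defined anywhere, so lookup_movies_in_db below is a hypothetical stand-in that just counts how often it is called.

```python
def lookup_movies_in_db(actor_id):
    # Hypothetical stand-in for a slow database query
    lookup_movies_in_db.calls += 1
    return [f'movie-{actor_id}-{n}' for n in range(3)]


lookup_movies_in_db.calls = 0


class Actor:
    def __init__(self, actor_id):
        self.actor_id = actor_id
        self._movies = None   # not loaded yet

    @property
    def movies(self):
        # The lookup only happens the first time the property is requested
        if self._movies is None:
            self._movies = lookup_movies_in_db(self.actor_id)
        return self._movies


actor = Actor(7)
calls_before = lookup_movies_in_db.calls   # 0: creating the object costs nothing
first = actor.movies
second = actor.movies
calls_after = lookup_movies_in_db.calls    # 1: the second access reused the result
print(calls_before, calls_after)
```

The backing attribute is named _movies because assigning to self.movies inside the class would collide with the read-only property of the same name.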
Application to Iterables
Example
iterable → Factorial(n)
each call to next returns a list of 5 posts (or some page size)
→ every time next is called, go back to database and get next 5 posts
Application to Iterables → Infinite Iterables
Using that lazy evaluation technique means that we can actually have infinite iterables
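A small sketch of an infinite iterable, assuming we reuse the squares idea. The class names SquaresForever and SquaresForeverIterator are made up for this example, and itertools.islice is used to take a finite slice of it.

```python
from itertools import islice


class SquaresForever:
    """Iterable with no end: nothing is stored, values are computed on demand."""

    def __iter__(self):
        return SquaresForeverIterator()


class SquaresForeverIterator:
    def __init__(self):
        self.i = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Never raises StopIteration, so the caller must decide when to stop
        result = self.i ** 2
        self.i += 1
        return result


first_five = list(islice(SquaresForever(), 5))
print(first_five)   # [0, 1, 4, 9, 16]
```

Calling list() directly on SquaresForever() would never finish, which is why the slice (or an explicit break in a loop) is needed.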
You should always be aware of whether you are dealing with an iterable or an iterator
why? if an object is an iterable (but not an iterator) you can iterate over it many times
enumerate(l1) → iterator
open('cars.csv') → iterator
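One hedged way to tell the two apart: an iterator returns itself from iter(), an iterable that is not an iterator returns some other object. The helper name is_iterator is hypothetical.

```python
def is_iterator(obj):
    """True when iter(obj) hands back obj itself, i.e. obj is its own iterator."""
    try:
        return iter(obj) is obj
    except TypeError:
        # not iterable at all
        return False


l1 = [1, 2, 3]
list_is_iterator = is_iterator(l1)              # a list is an iterable only
enum_is_iterator = is_iterator(enumerate(l1))   # enumerate returns an iterator

e = enumerate(l1)
first = list(e)
second = list(e)   # the iterator is already exhausted
print(list_is_iterator, enum_is_iterator)   # False True
print(first, second)                        # [(0, 1), (1, 2), (2, 3)] []
```

The list can be looped over as many times as we like; the enumerate object gives its items exactly once.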
The very first thing Python does is call the iter() function on the object we want to iterate
What happens if the object does not implement the __iter__ method?
So how does iterating over a sequence type, one that may have implemented only __getitem__, work?
You'll notice I did not say Python always calls the __iter__ method
Not really!
Let's think about sequence types and how we can iterate over them
Suppose seq is some sequence type that implements __getitem__ (but not __iter__)
Remember what happens when we request an index that is out of bounds from the
__getitem__ method? → IndexError
index = 0
while True:
    try:
        print(seq[index])
        index += 1
    except IndexError:
        break
Making an Iterator to iterate over any Sequence
class SeqIterator:
    def __init__(self, seq):
        self.seq = seq
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            item = self.seq[self.index]
            self.index += 1
            return item
        except IndexError:
            raise StopIteration()
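A usage sketch for SeqIterator. SimpleSeq is a hypothetical sequence-like class that implements only __getitem__, so we can compare the hand-written iterator with what the built-in iter() produces for the same kind of object.

```python
class SeqIterator:
    def __init__(self, seq):
        self.seq = seq
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            item = self.seq[self.index]
        except IndexError:
            # out of bounds means there is nothing left to hand out
            raise StopIteration
        self.index += 1
        return item


class SimpleSeq:
    """Hypothetical sequence: __getitem__ only, no __iter__ and no __len__."""

    def __init__(self, n):
        self._n = n

    def __getitem__(self, index):
        if index < 0 or index >= self._n:
            raise IndexError(index)
        return index * 10


manual = list(SeqIterator(SimpleSeq(4)))
builtin = list(iter(SimpleSeq(4)))   # Python builds an equivalent iterator itself
print(manual)    # [0, 10, 20, 30]
print(builtin)   # [0, 10, 20, 30]
```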
Calling iter()
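Python first looks for an __iter__ method and uses it if it's there
→ if it's not, Python looks for __getitem__ and builds an iterator that indexes from 0 until IndexError is raised
→ if neither is there, a TypeError is raised (the object is not iterable)
To test whether an object is iterable, we would need to check for
__getitem__ or __iter__
and that __iter__ returns an iterator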
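Rather than inspecting attributes, a simpler sketch is to call iter() and catch the TypeError. The helper is_iterable and the three small classes are hypothetical, written only to exercise each case.

```python
def is_iterable(obj):
    """Ask Python directly: if iter() succeeds, the object can be iterated."""
    try:
        iter(obj)
        return True
    except TypeError:
        return False


class OnlyGetItem:
    # sequence-style access only
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index


class Neither:
    pass


class BadIter:
    # has __iter__, but it does not return an iterator
    def __iter__(self):
        return 42


results = {
    'list': is_iterable([1, 2]),
    'OnlyGetItem': is_iterable(OnlyGetItem()),
    'Neither': is_iterable(Neither()),
    'BadIter': is_iterable(BadIter()),
}
print(results)
```

BadIter shows why a plain hasattr(obj, '__iter__') check can be misleading: the method exists, yet iter() still rejects the object.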
We now want to run a loop that will call countdown() until 0 is reached
countdown() → 5
countdown() → 4
countdown() → 3
countdown() → 2
countdown() → 1
countdown() → 0
countdown() → -1
...
We could certainly do that using a loop and testing the value to break out of the loop once 0 has been reached
while True:
    val = countdown()
    if val == 0:
        break
    else:
        print(val)
An iterator approach
We could take a different approach, using iterators, and we can also make it quite generic
Notice that the iter() function was able to generate an iterator for us automatically
when the object did not implement __iter__, but implemented the sequence protocol
The second form of the iter() function
iter(callable, sentinel)
This returns an iterator that calls the callable each time next() is called
and either raises StopIteration if the result is equal to the sentinel value
or returns the result otherwise
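A sketch of the two-argument form applied to the countdown example. The slides do not show how countdown() is built, so make_countdown below is a hypothetical closure that returns 5, 4, 3, ... on successive calls.

```python
def make_countdown(start):
    """Hypothetical factory: each call to the returned function gives the next value."""
    current = start

    def countdown():
        nonlocal current
        value = current
        current -= 1
        return value

    return countdown


countdown = make_countdown(5)

# iter(callable, sentinel): keep calling countdown() until it returns 0
values = list(iter(countdown, 0))
print(values)   # [5, 4, 3, 2, 1]
```

The sentinel itself is never yielded: the call that returned 0 ends the iteration, so the next direct call to countdown() returns -1.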
Coding
Exercises
Iterating a sequence in reverse order
If we have a sequence type, then iterating over the sequence in reverse order is quite simple:
for item in seq[::-1]:
    print(item)
This works, but is wasteful because it makes a copy of the sequence
for i in range(len(seq)):
    print(seq[len(seq) - i - 1])
This is more efficient, but the syntax is messy
for i in range(len(seq)-1, -1, -1):
    print(seq[i])
Unfortunately, reversed() will not work with custom iterables without a little bit of extra work
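When we call reversed() on a custom iterable, Python will look for and call
the __reversed__ method
That method should return an iterator that will be used to perform the reversed iteration
If __reversed__ is not implemented, Python falls back to the sequence protocol (__len__ and __getitem__), and raises a TypeError exception otherwise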
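A minimal sketch of supporting reversed() on a custom iterable. The reverse keyword on SquaresIterator is an assumption made for this example, not something taken from the slides.

```python
class Squares:
    """Iterable of the first `length` squares, iterable in both directions."""

    def __init__(self, length):
        self.length = length

    def __iter__(self):
        return SquaresIterator(self.length)

    def __reversed__(self):
        # reversed(obj) calls this and uses the iterator it returns
        return SquaresIterator(self.length, reverse=True)


class SquaresIterator:
    def __init__(self, length, *, reverse=False):
        self.length = length
        self.reverse = reverse
        self.i = 0   # how many items have been handed out so far

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.length:
            raise StopIteration
        n = self.length - 1 - self.i if self.reverse else self.i
        self.i += 1
        return n ** 2


forward = list(Squares(5))
backward = list(reversed(Squares(5)))
print(forward)    # [0, 1, 4, 9, 16]
print(backward)   # [16, 9, 4, 1, 0]
```

No copy of the data is made in either direction: the iterator only tracks a counter and computes each value when asked.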
Card Deck Example
In the code exercises I am going to build an iterable containing a deck of 52 sorted cards
2 Spades … Ace Spades, 2 Hearts … Ace Hearts, 2 Diamonds … Ace Diamonds, 2 Clubs … Ace Clubs
But I don't want to create a list containing all the pre-created cards → Lazy evaluation
So I want my iterator to figure out the suit and card name for a given index in the sorted deck
2S … AS 2H … AH 2D … AD 2C … AC
Each card in this deck has a positional index: a number from 0 to len(deck) - 1 (0 to 51)
To find the suit index of a card at index i: i // len(RANKS)
To find the rank index of a card at index i: i % len(RANKS)
Examples
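A sketch of the lazy deck under some assumptions: the SUITS and RANKS tuples, the Card namedtuple and the class names are chosen for this example, with the suit order taken from the deck layout above. No list of 52 cards is ever stored; each card is computed from its index.

```python
from collections import namedtuple

# Assumed constants, ordered as in the sorted deck described above
SUITS = ('Spades', 'Hearts', 'Diamonds', 'Clubs')
RANKS = tuple(str(n) for n in range(2, 11)) + ('J', 'Q', 'K', 'A')
Card = namedtuple('Card', 'rank suit')


class CardDeck:
    def __init__(self):
        self.length = len(SUITS) * len(RANKS)

    def __len__(self):
        return self.length

    def __iter__(self):
        return CardDeckIterator(self.length)


class CardDeckIterator:
    def __init__(self, length):
        self.length = length
        self.i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.length:
            raise StopIteration
        # divmod gives (i // len(RANKS), i % len(RANKS)) in one step
        suit_index, rank_index = divmod(self.i, len(RANKS))
        self.i += 1
        return Card(RANKS[rank_index], SUITS[suit_index])


deck = list(CardDeck())
print(deck[0])    # index 0:  0 // 13 = 0 (Spades),   0 % 13 = 0 (rank '2')
print(deck[27])   # index 27: 27 // 13 = 2 (Diamonds), 27 % 13 = 1 (rank '3')
print(deck[51])   # index 51: 51 // 13 = 3 (Clubs),    51 % 13 = 12 (rank 'A')
```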