Data Structures - Python 3.7.0
5. Data Structures
This chapter describes some things you’ve learned about already in more detail, and
adds some new things as well.
You might have noticed that methods like insert, remove or sort that only modify
the list have no return value printed – they return the default None. [1] This is a
design principle for all mutable data structures in Python.
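A short sketch of this convention, using illustrative lists named numbers and original (neither appears in the text above): the mutating method returns None, while the built-in sorted() returns a new list.

```python
numbers = [3, 1, 2]

# list.sort() reorders the list in place and returns None.
result = numbers.sort()
print(result)    # None
print(numbers)   # [1, 2, 3]

# sorted() leaves its argument alone and returns a new list instead.
original = [3, 1, 2]
ordered = sorted(original)
print(original)  # [3, 1, 2]
print(ordered)   # [1, 2, 3]
```

Because sort() returns None, an assignment such as numbers = numbers.sort() would discard the list.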
>>> stack.pop()
7
>>> stack
[3, 4, 5, 6]
>>> stack.pop()
6
>>> stack.pop()
5
>>> stack
[3, 4]
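The session above starts from a stack that is already populated. A self-contained sketch, assuming the stack began as [3, 4, 5] and grew with append() (the starting values are inferred from the output shown, not stated here):

```python
stack = [3, 4, 5]

# append() pushes onto the top of the stack (the end of the list).
stack.append(6)
stack.append(7)

# pop() with no argument removes and returns the most recently added item.
top = stack.pop()
print(top)    # 7
print(stack)  # [3, 4, 5, 6]
```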
Note that this creates (or overwrites) a variable named x that still exists after the loop
completes. We can calculate the list of squares without any side effects using:
file:///Users/boss/Downloads/IT%20DOCs/DOC%20Python/5.%20Data%20Structures%20%E2%80%94%20Python%203.7.0%20documentation.html 3/13
25/12/2018 5. Data Structures — Python 3.7.0 documentation
or, equivalently:
squares = [x**2 for x in range(10)]
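For comparison, a sketch that puts the loop form next to two side-effect-free forms. The names squares_loop, squares_map and squares_comp are illustrative, and the map() variant is one possible equivalent rather than a quotation from the text.

```python
# Loop form: x remains defined after the loop finishes.
squares_loop = []
for x in range(10):
    squares_loop.append(x**2)
print(x)  # 9

# map() with a lambda builds the same list without a leftover loop variable.
squares_map = list(map(lambda v: v**2, range(10)))

# Comprehension form: the variable n is local to the comprehension.
squares_comp = [n**2 for n in range(10)]

print(squares_loop == squares_map == squares_comp)  # True
```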
Note how the order of the for and if statements is the same in both these snippets.
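A sketch of that correspondence, using illustrative input lists: the comprehension and the nested statements list their for and if parts in the same order.

```python
# Comprehension with two for clauses followed by one if clause.
pairs = [(x, y) for x in [1, 2, 3] for y in [3, 1, 4] if x != y]

# The same logic as nested statements, with for and if in the same order.
pairs_loop = []
for x in [1, 2, 3]:
    for y in [3, 1, 4]:
        if x != y:
            pairs_loop.append((x, y))

print(pairs)
# [(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
print(pairs == pairs_loop)  # True
```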
If the expression is a tuple (e.g. the (x, y) in the previous example), it must be
parenthesized.
>>> vec = [-4, -2, 0, 2, 4]
>>> # create a new list with the values doubled
>>> [x*2 for x in vec]
[-8, -4, 0, 4, 8]
>>> # filter the list to exclude negative numbers
>>> [x for x in vec if x >= 0]
[0, 2, 4]
>>> # apply a function to all the elements
>>> [abs(x) for x in vec]
[4, 2, 0, 2, 4]
>>> # call a method on each element
>>> freshfruit = [' banana', ' loganberry ', 'passion fruit ']
>>> [weapon.strip() for weapon in freshfruit]
['banana', 'loganberry', 'passion fruit']
>>> # create a list of 2-tuples like (number, square)
>>> [(x, x**2) for x in range(6)]
[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25)]
>>> # the tuple must be parenthesized, otherwise an error is raised
>>> [x, x**2 for x in range(6)]
  File "<stdin>", line 1, in <module>
    [x, x**2 for x in range(6)]
               ^
SyntaxError: invalid syntax
As we saw in the previous section, the nested listcomp is evaluated in the context of
the for that follows it, so this example is equivalent to:
>>> transposed = []
>>> for i in range(4):
...     transposed.append([row[i] for row in matrix])
...
>>> transposed
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
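A self-contained version of the transposition, assuming matrix is the list of three rows of four numbers implied by the output above:

```python
matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

# Nested comprehension: the inner one collects column i into a new row.
transposed = [[row[i] for row in matrix] for i in range(4)]

# Equivalent explicit loop around the inner comprehension.
transposed_loop = []
for i in range(4):
    transposed_loop.append([row[i] for row in matrix])

print(transposed)
# [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
print(transposed == transposed_loop)  # True
```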
In the real world, you should prefer built‑in functions to complex flow statements.
The zip() function would do a great job for this use case:
>>> list(zip(*matrix))
[(1, 5, 9), (2, 6, 10), (3, 7, 11), (4, 8, 12)]
See Unpacking Argument Lists for details on the asterisk in this line.
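A brief sketch of what the asterisk does in that call, again assuming the three-row matrix implied by the output: each row is passed to zip() as a separate positional argument.

```python
matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

# The asterisk unpacks the rows into separate arguments,
# so these two calls are equivalent.
unpacked = list(zip(*matrix))
explicit = list(zip(matrix[0], matrix[1], matrix[2]))

print(unpacked == explicit)  # True
print(unpacked[0])           # (1, 5, 9)
```

Note that zip() produces tuples, so the result is a list of tuples rather than a list of lists.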
Referencing the name a hereafter is an error (at least until another value is assigned
to it). We’ll find other uses for del later.
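A minimal sketch of del at work, assuming a is a list with illustrative values; the name remaining is introduced here only to keep a copy for inspection.

```python
a = [-1, 1, 66.25, 333, 333, 1234.5]

# del removes items by index or by slice rather than by value.
del a[0]
print(a)  # [1, 66.25, 333, 333, 1234.5]
del a[2:4]
print(a)  # [1, 66.25, 1234.5]
remaining = list(a)

# del can also remove the name itself; using it afterwards raises NameError.
del a
try:
    a
except NameError as exc:
    name_error = exc
    print("NameError:", exc)
```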
As you see, on output tuples are always enclosed in parentheses, so that nested
tuples are interpreted correctly; they may be input with or without surrounding
parentheses, although often parentheses are necessary anyway (if the tuple is part of a
larger expression). It is not possible to assign to the individual items of a tuple;
however, it is possible to create tuples which contain mutable objects, such as lists.
Though tuples may seem similar to lists, they are often used in different situations
and for different purposes. Tuples are immutable, and usually contain a heterogeneous
sequence of elements that are accessed via unpacking (see later in this section)
or indexing (or even by attribute in the case of namedtuples). Lists are mutable, and
their elements are usually homogeneous and are accessed by iterating over the list.
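The points above can be sketched with a tuple t holding illustrative values: item assignment fails, a list stored inside the tuple can still change, and unpacking pulls the elements apart.

```python
t = (12345, 54321, [1, 2])

# Assigning to an item of a tuple raises TypeError.
try:
    t[0] = 88888
except TypeError as exc:
    print("TypeError:", exc)

# A mutable object stored inside the tuple can still be changed in place.
t[2].append(3)
print(t)  # (12345, 54321, [1, 2, 3])

# Unpacking is a common way to access the elements of a heterogeneous tuple.
first, second, items = t
print(first, second, items)  # 12345 54321 [1, 2, 3]
```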
A special problem is the construction of tuples containing 0 or 1 items: the syntax
has some extra quirks to accommodate these. Empty tuples are constructed by an
empty pair of parentheses; a tuple with one item is constructed by following a value
with a comma (it is not sufficient to enclose a single value in parentheses). Ugly, but
effective. For example:
>>> empty = ()
>>> singleton = 'hello', # <-- note trailing comma
>>> len(empty)
0
>>> len(singleton)
1
>>> singleton
('hello',)
This is called, appropriately enough, sequence unpacking and works for any
sequence on the right-hand side. Sequence unpacking requires that there are as many
variables on the left side of the equals sign as there are elements in the sequence.
Note that multiple assignment is really just a combination of tuple packing and
sequence unpacking.
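A sketch of packing and unpacking together, with illustrative values; the names p, q and mismatch exist only to show the failure case.

```python
# Tuple packing: the three values are packed together into one tuple.
t = 12345, 54321, 'hello!'

# Sequence unpacking: one name per element.
x, y, z = t

# Multiple assignment packs the right-hand side and then unpacks it,
# which is why a swap needs no temporary variable.
a, b = 1, 2
a, b = b, a
print(a, b)  # 2 1

# A mismatch between the number of names and elements raises ValueError.
try:
    p, q = t
except ValueError as exc:
    mismatch = str(exc)
    print("ValueError:", exc)
```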
5.4. Sets
Python also includes a data type for sets. A set is an unordered collection with no
duplicate elements. Basic uses include membership testing and eliminating duplicate
entries. Set objects also support mathematical operations like union, intersection,
difference, and symmetric difference.
Curly braces or the set() function can be used to create sets. Note: to create an
empty set you have to use set(), not {}; the latter creates an empty dictionary, a
data structure that we discuss in the next section.
Here is a brief demonstration:
>>> basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
>>> print(basket)                      # show that duplicates have been removed
{'orange', 'banana', 'pear', 'apple'}
>>> 'orange' in basket # fast membership testing
True
>>> 'crabgrass' in basket
False
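The mathematical operations mentioned earlier can be sketched with two sets of letters. The sets a and b are illustrative, and sorted() is used only to make the printed order predictable, since sets are unordered.

```python
a = set('abracadabra')   # the unique letters a, b, r, c, d
b = set('alacazam')      # the unique letters a, l, c, z, m

print(sorted(a - b))  # difference: ['b', 'd', 'r']
print(sorted(a | b))  # union: ['a', 'b', 'c', 'd', 'l', 'm', 'r', 'z']
print(sorted(a & b))  # intersection: ['a', 'c']
print(sorted(a ^ b))  # symmetric difference: ['b', 'd', 'l', 'm', 'r', 'z']

# An empty set needs set(); {} is an empty dictionary.
print(type(set()).__name__, type({}).__name__)  # set dict
```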
5.5. Dictionaries
Another useful data type built into Python is the dictionary (see Mapping Types —
dict). Dictionaries are sometimes found in other languages as “associative memories”
or “associative arrays”. Unlike sequences, which are indexed by a range of numbers,
dictionaries are indexed by keys, which can be any immutable type; strings and
numbers can always be keys. Tuples can be used as keys if they contain only strings,
numbers, or tuples; if a tuple contains any mutable object either directly or indirectly,
it cannot be used as a key. You can’t use lists as keys, since lists can be modified in
place using index assignments, slice assignments, or methods like append() and
extend().
It is best to think of a dictionary as a set of key: value pairs, with the requirement that
the keys are unique (within one dictionary). A pair of braces creates an empty
dictionary: {}. Placing a comma-separated list of key:value pairs within the braces adds
initial key:value pairs to the dictionary; this is also the way dictionaries are written on
output.
The main operations on a dictionary are storing a value with some key and extracting
the value given the key. It is also possible to delete a key:value pair with del. If you
store using a key that is already in use, the old value associated with that key is
forgotten. It is an error to extract a value using a non-existent key.
Performing list(d) on a dictionary returns a list of all the keys used in the
dictionary, in insertion order (if you want it sorted, just use sorted(d) instead). To check
whether a single key is in the dictionary, use the in keyword.
Here is a small example using a dictionary:
>>> tel = {'jack': 4098, 'sape': 4139}
>>> tel['guido'] = 4127
>>> tel
{'jack': 4098, 'sape': 4139, 'guido': 4127}
>>> tel['jack']
4098
>>> del tel['sape']
>>> tel['irv'] = 4127
>>> tel
{'jack': 4098, 'guido': 4127, 'irv': 4127}
>>> list(tel)
['jack', 'guido', 'irv']
>>> sorted(tel)
['guido', 'irv', 'jack']
>>> 'guido' in tel
True
>>> 'jack' not in tel
False
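Two behaviours described earlier but not exercised in the session above are overwriting an existing key and looking up a missing one. A sketch with an illustrative phone book follows; dict.get() is shown as one way to avoid the error.

```python
tel = {'jack': 4098, 'guido': 4127}

# Storing under an existing key replaces the old value.
tel['jack'] = 4100
print(tel['jack'])  # 4100

# Indexing with a missing key raises KeyError.
try:
    tel['sape']
except KeyError as exc:
    missing_key = exc.args[0]
    print("KeyError:", exc)

# dict.get() returns a default (None unless one is given) instead of raising.
print(tel.get('sape'))     # None
print(tel.get('sape', 0))  # 0
```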
In addition, dict comprehensions can be used to create dictionaries from arbitrary key
and value expressions:
>>> {x: x**2 for x in (2, 4, 6)}
{2: 4, 4: 16, 6: 36}
When the keys are simple strings, it is sometimes easier to specify pairs using
keyword arguments:
>>> dict(sape=4139, guido=4127, jack=4098)
{'sape': 4139, 'guido': 4127, 'jack': 4098}
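The dict() constructor also accepts a sequence of key-value pairs, which helps when the keys are not valid identifiers. A sketch using the same illustrative entries; the loop with items() is one way to read keys and values together.

```python
pairs = [('sape', 4139), ('guido', 4127), ('jack', 4098)]

# Each 2-tuple becomes one key: value entry, in the order given.
tel = dict(pairs)
print(tel)  # {'sape': 4139, 'guido': 4127, 'jack': 4098}

# items() yields the key and its value together when looping.
for name, number in tel.items():
    print(name, number)
```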
When looping through a sequence, the position index and corresponding value can
be retrieved at the same time using the enumerate() function.
>>> for i, v in enumerate(['tic', 'tac', 'toe']):
...     print(i, v)
...
0 tic
1 tac
2 toe
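When counting should begin somewhere other than zero, enumerate() takes an optional start argument. A small sketch, with words and numbered as illustrative names:

```python
words = ['tic', 'tac', 'toe']

# enumerate() yields (index, value) pairs; start shifts the first index.
numbered = list(enumerate(words, start=1))
print(numbered)  # [(1, 'tic'), (2, 'tac'), (3, 'toe')]
```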
To loop over two or more sequences at the same time, the entries can be paired with
the zip() function.
>>> questions = ['name', 'quest', 'favorite color']
>>> answers = ['lancelot', 'the holy grail', 'blue']
>>> for q, a in zip(questions, answers):
...     print('What is your {0}?  It is {1}.'.format(q, a))
...
What is your name? It is lancelot.
What is your quest? It is the holy grail.
What is your favorite color? It is blue.
To loop over a sequence in reverse, first specify the sequence in a forward direction
and then call the reversed() function.
>>> for i in reversed(range(1, 10, 2)):
...     print(i)
...
9
7
5
3
1
To loop over a sequence in sorted order, use the sorted() function which returns a
new sorted list while leaving the source unaltered.
>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
>>> for f in sorted(set(basket)):
...     print(f)
...
apple
banana
orange
pear
It is sometimes tempting to change a list while you are looping over it; however, it is
often simpler and safer to create a new list instead.
>>> import math
>>> raw_data = [56.2, float('NaN'), 51.7, 55.3, 52.5, float('NaN'), 47.8]
>>> filtered_data = []
>>> for value in raw_data:
...     if not math.isnan(value):
...         filtered_data.append(value)
...
>>> filtered_data
[56.2, 51.7, 55.3, 52.5, 47.8]
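The same filtering can be written as a list comprehension, which also builds a new list and leaves raw_data untouched. This is an alternative sketch, not a replacement for the loop shown above.

```python
import math

raw_data = [56.2, float('NaN'), 51.7, 55.3, 52.5, float('NaN'), 47.8]

# Build a new list instead of deleting from raw_data while iterating over it.
filtered_data = [value for value in raw_data if not math.isnan(value)]
print(filtered_data)  # [56.2, 51.7, 55.3, 52.5, 47.8]
print(len(raw_data))  # 7
```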
Comparisons may be combined using the Boolean operators and and or, and the
outcome of a comparison (or of any other Boolean expression) may be negated with
not. These have lower priorities than comparison operators; between them, not has
the highest priority and or the lowest, so that A and not B or C is equivalent to
(A and (not B)) or C. As always, parentheses can be used to express the desired
composition.
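The stated equivalence can be checked exhaustively over all truth values. A sketch, with A, B and C as plain Boolean variables:

```python
from itertools import product

# Check every combination of truth values for A, B and C.
for A, B, C in product([False, True], repeat=3):
    assert (A and not B or C) == ((A and (not B)) or C)

# Parentheses change the grouping and, for these values, the result.
A, B, C = False, True, True
default_grouping = A and not B or C
explicit_grouping = A and not (B or C)
print(default_grouping)   # True
print(explicit_grouping)  # False
```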
The Boolean operators and and or are so-called short-circuit operators: their
arguments are evaluated from left to right, and evaluation stops as soon as the outcome
is determined. For example, if A and C are true but B is false, A and B and C does
not evaluate the expression C. When used as a general value and not as a Boolean,
the return value of a short-circuit operator is the last evaluated argument.
It is possible to assign the result of a comparison or other Boolean expression to a
variable. For example,
>>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
>>> non_null = string1 or string2 or string3
>>> non_null
'Trondheim'
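Short-circuit evaluation can be observed directly by recording which operands run. In this sketch, check() is a hypothetical helper written for the illustration, not a built-in.

```python
calls = []

def check(label, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(label)
    return value

# B is false, so evaluation stops there and C is never evaluated.
result = check('A', True) and check('B', False) and check('C', True)
print(calls)   # ['A', 'B']
print(result)  # False

# The value returned is the last operand evaluated, not necessarily a bool.
print('' or 'Trondheim')  # Trondheim
print('Trondheim' and 0)  # 0
```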
Note that in Python, unlike C, assignment cannot occur inside expressions. C
programmers may grumble about this, but it avoids a common class of problems
encountered in C programs: typing = in an expression when == was intended.
Note that comparing objects of different types with < or > is legal provided that the
objects have appropriate comparison methods. For example, mixed numeric types
are compared according to their numeric value, so 0 equals 0.0, etc. Otherwise,
rather than providing an arbitrary ordering, the interpreter will raise a TypeError
exception.
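A brief sketch of both cases, with illustrative operands: numeric types compare by value, while a string and an integer have no defined ordering.

```python
# Mixed numeric types compare by numeric value.
print(0 == 0.0)  # True
print(1 < 2.5)   # True

# Types without a defined ordering raise TypeError instead of guessing.
try:
    'abc' < 1
except TypeError as exc:
    message = str(exc)
    print("TypeError:", exc)
```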
Footnotes
[1] Other languages may return the mutated object, which allows method
chaining, such as d->insert("a")->remove("b")->sort();.