Python's functools module has a function called reduce that I usually recommend avoiding.
What is the functools.reduce function?
The functools.reduce function looks a little bit like this:
not_seen = object()

def reduce(function, iterable, default=not_seen):
    """An approximation of the code for functools.reduce."""
    value = default
    for item in iterable:
        if value is not_seen:
            value = item
            continue
        value = function(value, item)
    return value
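The optional third argument is a starting value. A quick sketch of what it changes, using the real functools.reduce: with a starting value an empty iterable is fine, while without one the real function raises a TypeError (the approximation above would return its sentinel object instead). The variable names here are only for illustration.

```python
from functools import reduce
from operator import add

# With a starting value, reduce begins from it, so an empty iterable is fine.
total_with_start = reduce(add, [], 0)

# Without a starting value, the real functools.reduce raises TypeError
# when the iterable is empty.
try:
    reduce(add, [])
except TypeError:
    raised = True
else:
    raised = False

print(total_with_start, raised)  # 0 True
```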
The reduce function is a bit complex.
It's best understood with an example.
>>> from functools import reduce
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> reduce(lambda x, y: x + y, numbers)
In the above example, we're calling reduce with two arguments: a function that adds two numbers together and a list of numbers.
When we call reduce with those arguments, it doesn't just add the first two numbers together.
Instead, it adds all the numbers together:
>>> reduce(lambda x, y: x + y, numbers)
46
That first function is called repeatedly to add up all of the numbers in this list.
The reduce function first calls the given function on the first two items in numbers. It then takes the result it got back and passes it, along with the third number, as the next pair of arguments, and so on.
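One way to see those intermediate results is itertools.accumulate, which yields every running value instead of only the final one. This is just an illustration of the steps, not how reduce is implemented, and running_totals is a name made up for this sketch.

```python
from itertools import accumulate
from operator import add

numbers = [2, 1, 3, 4, 7, 11, 18]

# Each entry is the result of adding the next number to the previous result.
running_totals = list(accumulate(numbers, add))
print(running_totals)  # [2, 3, 6, 10, 17, 28, 46]
```

The last running total, 46, is the value that the reduce call returns.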
Summing with reduce is a bit of a silly example, because Python has a built-in function that can do this for us.
The built-in sum function is both easier to understand and faster than using reduce:
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> sum(numbers)
46
Even multiplying numbers isn't a great example of reduce:
>>> from functools import reduce
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> reduce(lambda x, y: x * y, numbers)
33264
When multiplying, it's better to use the prod function in Python's math module (added in Python 3.8) because it's again faster and more readable than reduce:
>>> from math import prod
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> prod(numbers)
33264
Those two examples are silly uses of reduce, but not every reduce call can be replaced with just a single line of code.
This deep_get function allows us to deeply query a nested dictionary of dictionaries:
from functools import reduce

def deep_get(mapping, key_tuple):
    """Deeply query dict-of-dicts from given key tuple."""
    return reduce(lambda acc, val: acc[val], key_tuple, mapping)
For example, here's a dictionary of dictionaries:
>>> webhook_data = {
...     "event_type": "subscription_created",
...     "content": {
...         "customer": {
...             "created_at": 1575397900,
...             "card_status": "card",
...             "subscription": {
...                 "status": "active",
...                 "created_at": 1575397900,
...                 "next_billing_at": 1577817100
...             }
...         }
...     }
... }
We might want to look up a key in this dictionary, then a key in the dictionary we get back, and so on through each nested dictionary until we finally reach the value we're looking for:
>>> webhook_data["content"]["customer"]["subscription"]["status"]
'active'
Instead of doing this querying manually, we could make a tuple of strings representing these keys, and pass that tuple to our deep_get function so it can do the querying for us:
>>> status_key = ("content", "customer", "subscription", "status")
>>> deep_get(webhook_data, status_key)
'active'
This deep_get function works, and it is powerful.
But it's also pretty complex.
from functools import reduce

def deep_get(mapping, key_tuple):
    """Deeply query dict-of-dicts from given key tuple."""
    return reduce(lambda acc, val: acc[val], key_tuple, mapping)
Personally, I find this deep_get function hard to understand.
We've condensed quite a bit of logic into just one line of code.
I would much rather see this deep_get function implemented using a for loop:
def deep_get(mapping, key_tuple):
    """Deeply query dict-of-dicts from given key tuple."""
    value = mapping
    for key in key_tuple:
        value = value[key]
    return value
I find that for loop easier to understand than the equivalent reduce call.
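A side benefit of the loop form is that it tends to be easier to extend. As a rough sketch, here is a hypothetical variant (deep_get_or_default is not from any library, and the sample data is invented) that returns a fallback when a key along the path is missing:

```python
def deep_get_or_default(mapping, key_tuple, default=None):
    """Like deep_get, but return default if any lookup along the path fails."""
    value = mapping
    for key in key_tuple:
        try:
            value = value[key]
        except (KeyError, TypeError):
            # KeyError: the key is absent from a dictionary.
            # TypeError: the current value doesn't support this kind of lookup.
            return default
    return value


# Made-up sample data for illustration.
data = {"content": {"customer": {"status": "active"}}}
found = deep_get_or_default(data, ("content", "customer", "status"))
fallback = deep_get_or_default(data, ("content", "missing", "status"), "unknown")
print(found, fallback)  # active unknown
```

Adding the same behavior to the one-line reduce version would likely mean moving the lambda into a named function anyway.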
Even if you're familiar with functional programming techniques and you really like reduce, you might want to ask yourself:
Is the reduce call I'm about to use more efficient or less efficient than either a for loop or another tool included in Python?
For example, years ago, I saw this use of reduce in an answer to a programming question online:
>>> from functools import reduce
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> reduce(lambda accum, n: accum and n > 0, numbers, True)
True
This code checks whether all the numbers in a given list are greater than zero.
This code works, but there's a better way to accomplish this task in Python.
The built-in all function in Python can accept a generator expression that performs the same task for us:
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> all(n > 0 for n in numbers)
True
I find that all call easier to read, and it's also more efficient than the reduce call.
If we had many numbers and one of them was less than or equal to zero, the all function would return early (as soon as it found a number that doesn't match our condition), whereas reduce will always loop all the way to the end.
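To make that difference visible, here is a small sketch that records how many items each approach pulls from the list. The tracked helper and the sample list (with a negative number in second position) are made up for this illustration.

```python
from functools import reduce


def tracked(iterable, log):
    """Yield items from iterable, recording each consumed item in log."""
    for item in iterable:
        log.append(item)
        yield item


numbers = [2, -1, 3, 4, 7, 11, 18]

seen_by_all = []
all_result = all(n > 0 for n in tracked(numbers, seen_by_all))

seen_by_reduce = []
reduce_result = reduce(
    lambda accum, n: accum and n > 0,
    tracked(numbers, seen_by_reduce),
    True,
)

print(all_result, len(seen_by_all))        # False 2
print(reduce_result, len(seen_by_reduce))  # False 7
```

Both calls agree on the answer, but all stops after the second item while reduce consumes all seven.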
Try to avoid reinventing the wheel with reduce.
Your code will often be more readable (and sometimes even more efficient) without functools.reduce.
Common reduce operations in Python
Here are some common reduction operations in Python as well as some tools included in Python that are often more efficient and more readable than an equivalent reduce call:
Operation | With functools.reduce | Without reduce |
---|---|---|
Sum all | reduce(lambda x, y: x+y, nums) | sum(nums) |
Multiply all | reduce(lambda x, y: x*y, nums) | math.prod(nums) |
Join strings | reduce(lambda s, t: s+t, strs) | "".join(strs) |
Merge dictionaries | reduce(lambda g, h: g|h, cfgs) | ChainMap(*reversed(cfgs)) |
Set union | reduce(lambda s, t: s|t, sets) | set.union(*sets) |
Set intersection | reduce(lambda s, t: s&t, sets) | set.intersection(*sets) |
Some of these are built-in functions, some are methods on built-in objects, and some are in the standard library.
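As a quick sanity check, the sketch below runs each pair from the table on small made-up inputs and compares the results. It assumes Python 3.9 or later, since merging dictionaries with the | operator needs it.

```python
from collections import ChainMap
from functools import reduce
from math import prod

# Small made-up inputs for illustration.
nums = [2, 1, 3, 4]
strs = ["a", "b", "c"]
cfgs = [{"debug": False, "port": 80}, {"debug": True}]
sets = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]

total = sum(nums)
product = prod(nums)
joined = "".join(strs)
# Later dictionaries win, so they need to come first in the ChainMap.
merged = dict(ChainMap(*reversed(cfgs)))
union = set.union(*sets)
# Note that the set method is named intersection.
common = set.intersection(*sets)

assert reduce(lambda x, y: x + y, nums) == total
assert reduce(lambda x, y: x * y, nums) == product
assert reduce(lambda s, t: s + t, strs) == joined
assert reduce(lambda g, h: g | h, cfgs) == merged
assert reduce(lambda s, t: s | t, sets) == union
assert reduce(lambda s, t: s & t, sets) == common

print(total, product, joined, merged, union, common)
```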
functools.reduce
Python's reduce function (in the functools module) can implement a complex reduction operation with just a single line of code.
But that single line of code is sometimes more confusing and less efficient than an equivalent for loop or another specialized reduction tool that's included with Python.
So I usually recommend avoiding functools.reduce.