Creating Python dictionaries with “reduce”

In the last few installments (first, second, third, and fourth) of this series, we saw how “reduce” can be used to build up a scalar value (such as a number or string), or even a simple collection, such as a list (in Python) or an array (in Ruby).  The jewel in the data-structure crown for these high-level languages is known by many names: Dictionary, hash, hash table, hash map, and mapping.  I tend to use these quite a bit, and I know that I’m not alone; hashes are easy to use and work with, have O(1) lookup characteristics, guarantee the uniqueness of their keys, and make the code self-documenting.  What’s not to like?

There are times when you might want to build a dictionary step by step, and “reduce” can help you to do that.  I should note that recent versions of Python offer “dictionary comprehensions,” which are one of my favorite features of the language, and can be used similarly to “reduce” — and probably with less confusion among new programmers.  Nevertheless, I find it interesting and instructive to see how we can use “reduce” to create a data structure that we wouldn’t normally associate with this function.

I should note that some of the things you’re going to see here are not really recommended coding practices, particularly in Python.  But they will be fun, which is important, no?

Let’s start with Python: If I want to create a dictionary using “reduce”, I have a few different options.  Perhaps the first and easiest one is not exactly what you might have had in mind, namely passing the “dict” function (which creates a new instance of “dict” — i.e., a new dictionary) a list of two-element tuples.  For example, I can say:

dict([('a', 1), ('b', 2)])

and I get:

{'a': 1, 'b': 2}

So if I can tell “reduce” to emit a list of tuples, I can then pass that list of tuples to “dict”, and get a dictionary back.  Let’s try that:

reduce(lambda output, current: output + [(current, ord(current))], 'abc', [])

In the above case (and in all of the examples we’ll use here), I’m trying to build an oh-so-useful dictionary in which the keys are the letters a, b, and c, and the values are the ASCII codes for those letters.  In the above Python code, I iterate over the string ‘abc’, which is a three-element sequence of those three letters.  For each letter, I return the current value of “output” (which is guaranteed to be a list), plus a new, single-element list.  That single-element list is a tuple consisting of the current letter and its ASCII code.  So the output of the above “reduce” call will be:

[('a', 97), ('b', 98), ('c', 99)]

which, when we feed it into dict():

>>> dict([('a', 97), ('b', 98), ('c', 99)])
{'a': 97, 'b': 98, 'c': 99}

Voila!  We’ve created a dictionary.  (By the way, the “dict.items” method, which is often used to iterate over the keys and values of a dictionary, returns a list of tuples in precisely this format.)

So this is nice, but perhaps there’s a way to have “reduce” build up a dictionary all by itself?  The answer is “yes, but” — because Python tries hard to stop us from modifying data structures within a lambda.  But if we’re willing to be a bit weird, we can get around that.  How? By taking the dictionary that we received from the previous iteration (i.e., “output”), turning it into a list of tuples with dict.items, adding that list to our current pair, and then turning the whole thing back into a dictionary.

reduce(lambda output, current: dict(output.items() + 
                     [(current, ord(current))]), 'abc', {})

Note that while this is unwieldy, it doesn’t violate one of the key tenets of functional programming, namely treating data as immutable.  However, putting this sort of code in a production system will lead to hatred from your colleagues, job security, or (if you’re really lucky) both.  Oh, and it’s probably rather inefficient as well, although I’m not crazy enough to actually benchmark it.

That said, consider what we’ve done here: We create a new dictionary with each iteration, passing it to “current”.  The new dictionary is created by taking the old one, breaking it into a list of tuples, adding to it a new tuple, and then turning it all into a dictionary again.

There is at least one other way to do this, but it’s going to make the above code seem like beautiful, classic Python in comparison: Remember that we cannot assign to a dictionary within a lambda.  However, we can invoke methods, including the dict.update method, which lets us merge one dictionary into another.  The thing is. dict.update, like most methods in Python that modify data structures, returns None.  If we’re willing to take some risks (and why stop now), we can use the following code to modify our existing dictionary and then return it to the next iteration:

reduce(lambda d, current: d.update({current : ord(current)}) or d, 'abc', {})

The above code takes advantage of the fact that “and” and “or” are short-circuit operators.  Thus, we first invoke d.update on our dictionary.  That returns None.  Our “or” operator then says, “Well, I’d better go to the second argument, because the first one returned a false-y value.  Sure enough, our dictionary — because it isn’t empty — is a true value, which is what the statement returns.  And sure enough, we get our dictionary.

A final way to do this would be to hide the assignment of the dictionary behind an external function.  That is, we have a function do the dirty work for us.  That’ll certainly work, although I see it as somewhat less fun than lambdas and trying to work around their limitations.

def update_and_return(d, new_key, new_value):
    d.update({new_key: new_value})
    return d

>>> reduce(lambda output, current: update_and_return(output, current, ord(current)), 'abc', {})

{'a': 97, 'b': 98, 'c': 99}

Fun stuff!  And, perhaps, the worst method ever for getting an ASCII table into a dictionary.

Next time, we’ll see how to do this with Ruby.

One thought on “Creating Python dictionaries with “reduce””

Leave a Reply

Your email address will not be published. Required fields are marked *

thirty six ÷ = 4