Sorting lists of dicts — an exercise from “Practice Makes Python”

Cover of “Practice Makes Python”

My ebook, Practice Makes Python, will go on pre-sale one week from today. The book is a collection of 50 exercises that I have used and refined when training people in Python in the United States, Europe, Israel, and China. I have found these exercises to be useful in helping people to go beyond Python’s syntax, and see how the parts of the language fit together.

(By the way, I’ll be giving my functional programming class on October 27th, and my object-oriented programming class on October 29th. Both are full-day classes, taught live and online. Signups for both classes will be announced here more formally in the coming days. Contact me at reuven@lerner.co.il if you already want details.)

Today, I’m posting an exercise that involves a common complex data structure, a list of dictionaries (aka “dicts”). I welcome feedback on the exercise, my proposed solution, and my discussion of that solution. If you’re trying to improve your Python skills, I strongly encourage you to try to solve the exercise yourself before looking at the answer. You will get much more out of struggling through the solutions to these exercises than from simply reading the answers.

Alphabetizing names

Let’s assume that you have phone book data in a list of dictionaries, as follows:

people = [
    {'first':'Reuven', 'last':'Lerner', 'email':'reuven@lerner.co.il'},
    {'first':'Barack', 'last':'Obama', 'email':'president@whitehouse.gov'},
    {'first':'Vladimir', 'last':'Putin', 'email':'president@kremvax.kremlin.ru'},
]

First of all, if these are the only people in your phone book, then you should rethink whether Python programming is truly the best use of your time and connections. Regardless, let’s assume that you want to print information about all of these people, but in phone-book order — that is, sorted by last name and then by first name. Each line of the output should just look like this:

LastName, FirstName: email@example.com

Solution

for person in sorted(people, key=lambda person: [person['last'], person['first']]):
    print("{last}, {first}: {email}".format(**person))

Discussion

While Python’s data structures are useful by themselves, they become even more powerful and useful when combined. Lists of lists, lists of tuples, lists of dictionaries, and dictionaries of dictionaries are all quite common in Python. Learning to work with these is an important part of being a fluent Python programmer.

There are two parts to the above solution. The first is how we sort the names of the people in our list, and the second is how we print each of the people.

Let’s take the second problem first: We have a list of dictionaries. This means that when we iterate over our list, “person” is assigned a dictionary in each iteration. The dictionary has three keys: “first”, “last”, and “email”. We will want to use each of these keys to display each phone-book entry.

It’s true that the “str.format” method allows us to pass individual values, and then to grab those values in numerical order. Thus, we could say:

for person in people:
    print("{0}, {1}: {2}".format(person['last'], person['first'], person['email']))

Starting in Python 2.7, we can even eliminate the numbers, if we are planning to use them in order:

for person in people:
    print("{}, {}: {}".format(person['last'], person['first'], person['email']))

The thing is, we can also pass name-value pairs to “str.format”. For example, we could say:

for person in people:
    print("{last}, {first}: {email}".format(last=person['last'],
                                            first=person['first'],
                                            email=person['email']))

Even if our format string, with the “{first}” and “{last}”, is more readable, the name-value pairs we are passing are annoying to write. All we’re basically doing is taking our “person” dictionary, expanding it, and passing its name-value pairs as arguments to “str.format”.

However, there is a better way: We can take a dictionary and turn it into a set of keyword arguments by applying the “double splat” operator, “**”, on a dictionary. In other words, we can say:

for person in people:
    print("{last}, {first}: {email}".format(**person))
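
To see what the double splat is doing, here is a minimal sketch that builds the same line both ways and compares the results. The names `template`, `explicit`, and `unpacked` are just illustrative; they are not part of the exercise.

```python
person = {'first': 'Reuven', 'last': 'Lerner', 'email': 'reuven@lerner.co.il'}
template = "{last}, {first}: {email}"

# Passing each name-value pair by hand...
explicit = template.format(last=person['last'],
                           first=person['first'],
                           email=person['email'])

# ...gives the same result as unpacking the dict into keyword arguments.
unpacked = template.format(**person)

print(unpacked)              # Lerner, Reuven: reuven@lerner.co.il
print(explicit == unpacked)  # True
```

Keys in the dictionary that the format string never mentions are simply ignored, while a name in the format string that is missing from the dictionary raises a KeyError.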

So far, so good. But we still haven’t covered the first problem, namely sorting the list of dictionaries by last name and then first name. Basically, we want to tell Python’s sort facility that before it compares two dictionaries from our “people” list, it should turn the dictionary into a list, consisting of the person’s last and first names. In other words, we want:

{'first':'Vladimir', 
 'last':'Putin', 
 'email':'president@kremvax.kremlin.ru'} 

to become

['Putin', 'Vladimir']

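Note that we’re not joining the last and first names into a single string and sorting on that. That would work in our particular case, but if one last name is the beginning of another (e.g., “Lerner” and “Lerner-Friedman”), then the characters of the first name end up being compared against the rest of the longer last name, and the entries can come out in the wrong order. Sorting them by lists will work, because Python sorts lists by comparing each element in sequence, moving on to the next element only when the earlier ones are equal. One element cannot “spill over” into the next element when making the comparison.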
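
The difference is easier to see in a small sketch. The two entries below are hypothetical, chosen only so that one last name is the beginning of the other, and the sketch assumes that “sorting as strings” means gluing the last and first names together.

```python
# Hypothetical entries: "Lerner" is the beginning of "Lerner-Friedman".
a = {'first': 'Reuven', 'last': 'Lerner'}
b = {'first': 'Amy', 'last': 'Lerner-Friedman'}

# Key 1: glue last and first name into one string.
# "LernerReuven" is compared with "Lerner-FriedmanAmy", and "-" sorts before "R".
as_strings = sorted([a, b], key=lambda p: p['last'] + p['first'])

# Key 2: keep the names as separate list elements.
# "Lerner" is compared with "Lerner-Friedman" on its own, and the shorter one wins.
as_lists = sorted([a, b], key=lambda p: [p['last'], p['first']])

print([p['last'] for p in as_strings])  # ['Lerner-Friedman', 'Lerner']
print([p['last'] for p in as_lists])    # ['Lerner', 'Lerner-Friedman']
```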

If we want to apply a function to each list element before the sorting comparison takes place, we can pass that function to the “key” parameter. Thus, we can sort the elements of a list by length by saying:

mylist = ['abcd', 'efg', 'hi', 'j']
mylist.sort(key=len)

After executing the above, “mylist” will now be sorted in increasing order of length, because the built-in “len” function will be applied to each element before it is compared with others. In the case of our alphabetizing exercise, we could write a function that takes a dict and returns the sort of list that’s necessary:

def person_dict_to_list(d):
    return [d['last'], d['first']]

We could then apply this function when sorting our list:

people.sort(key=person_dict_to_list)

Following that, we could then iterate over the now-sorted list, and display our people.
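
Put together, the in-place approach might look like the sketch below. The entries are the same three people, but deliberately listed out of order here (an assumption made for the demonstration) so that the effect of the sort is visible.

```python
people = [
    {'first': 'Vladimir', 'last': 'Putin', 'email': 'president@kremvax.kremlin.ru'},
    {'first': 'Reuven', 'last': 'Lerner', 'email': 'reuven@lerner.co.il'},
    {'first': 'Barack', 'last': 'Obama', 'email': 'president@whitehouse.gov'},
]

def person_dict_to_list(d):
    # The sort key: last name first, then first name.
    return [d['last'], d['first']]

# list.sort reorders "people" itself and returns None.
people.sort(key=person_dict_to_list)

for person in people:
    print("{last}, {first}: {email}".format(**person))

# Output:
# Lerner, Reuven: reuven@lerner.co.il
# Obama, Barack: president@whitehouse.gov
# Putin, Vladimir: president@kremvax.kremlin.ru
```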

However, it feels wrong to me to sort “people” permanently, if it’s just for the purposes of displaying its elements. Furthermore, I don’t see the point in writing a special-purpose named function if I’m only going to use it once.

We can thus use two pieces of Python which come from the functional programming world: the built-in “sorted” function, which returns a new, sorted list based on its input, and the “lambda” keyword, which creates a new, anonymous function. Combining these, we get to the solution suggested above, namely:

for person in sorted(people, key=lambda person: [person['last'], person['first']]):
    print("{last}, {first}: {email}".format(**person))

This solution does not change the “people” list, but it does sort its elements for the purposes of printing them. And it prints them in the phone-book order that we wanted, combining the “sorted” function, “lambda” for an inline anonymous function, and the double-splat (“**”) operator on an argument to “str.format”.
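
As a quick check of both claims, the sketch below runs the same “sorted” call on a shuffled copy of the data. The entry for Michelle Obama, including its example.com address, is hypothetical and was added only to show the first name breaking a tie on the last name.

```python
people = [
    {'first': 'Vladimir', 'last': 'Putin', 'email': 'president@kremvax.kremlin.ru'},
    {'first': 'Michelle', 'last': 'Obama', 'email': 'michelle@example.com'},  # hypothetical
    {'first': 'Reuven', 'last': 'Lerner', 'email': 'reuven@lerner.co.il'},
    {'first': 'Barack', 'last': 'Obama', 'email': 'president@whitehouse.gov'},
]
original_order = [p['first'] for p in people]

# sorted builds a new list; the lambda supplies the [last, first] sort key.
in_order = sorted(people, key=lambda person: [person['last'], person['first']])

print([p['first'] for p in in_order])
# ['Reuven', 'Barack', 'Michelle', 'Vladimir']

# The original list still has its original order.
print([p['first'] for p in people] == original_order)  # True
```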
