Teaching an old dog new tricks — or, how I learned to love Python’s str.format, and gave up on %

I have been programming in Python for many years. One of the things that I wondered, soon after starting to work in Python, was how you can get Perl-style variable interpolation. After all, Perl (like the Unix shell) supports two types of quotes — single quotes (in which everything is taken literally) and double quotes (in which variables’ values are inserted). Thus, in Perl, you can do something like:

$name = 'Reuven';
print "Hello, $name\n";

And sure enough, it’ll print “Hello, Reuven”.

Because single and double quotes are equivalent in Python (so long as they come as a matched set), there is no variable interpolation. The technique that I learned years ago, when I started with Python, was that you could use the % operator on a string. In this context, % looks to the string on its left, determines how many values within the string need to be replaced, and then looks right to find those values. It then returns a new string, effectively interpolating the values. For example:

>>> name = 'Reuven'
>>> "Hello, %s" % name

'Hello, Reuven'

The above Python code works just fine, returning a string with a nice, personalized greeting. And indeed, for the length of my time working with Python, I have enjoyed using this % syntax. Yes, it’s a bit weird. And no, I cannot ever remember more than the absolute basics of printf’s various % codes, meaning that I either make everything a string (with %s), or I guess, or I look up the printf codes and what they do. But to be honest, I normally just use %s, and thus benefit additionally from the fact that Python will silently invoke “str” on the parameter.

The thing is, % is supposedly going away, or is at least deprecated. (A note on the python-dev list indicates that % will go away no sooner than 2022, which is a heckuva long time from now.) As of Python 2.6, not to mention Python 3.x, we have been told that it will eventually disappear, and that we shouldn’t use % any more. Instead, we should use the str.format method.

I have always mentioned str.format to my Python classes, but to be honest, I’ve usually relied upon % when giving live demonstrations and answering questions. And I would even encourage my students to use the % syntax, in part because I found it to be so much easier.

And yet.  I knew that I was doing something wrong, and I knew that I was probably misleading my students to some degree. Thus, in the last three classes I taught, I started to push harder in the direction of str.format. And that’s when I realized two things: (1) It’s just as easy as %, and actually easier in some ways, and (2) I hadn’t learned enough about str.format to use it, beyond the simplest ways. I thus spent a great deal of time researching it — and found out that str.format, while it takes some getting used to, is more than worth the effort.

Let’s start with the simplest case. I’d like to be able to say “Good morning” to someone, using both their first and last names. Assuming that I have variables named “first” and “last”, I can do this with the old syntax as follows:

>>> first = 'Reuven'
>>> last = 'Lerner'
>>> "Good morning, %s %s" % (first, last)

'Good morning, Reuven Lerner'

In this example, we already see one of the problems with the % syntax — namely, that if we have more than one value, we need to put it into a tuple. Perhaps this is logical and reasonable from Python’s perspective, but I can assure you that it surprises many of my students.
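
Here is a small sketch of the tuple surprise in its sharpest form, when the single value you want to insert is itself a tuple. The variable names are mine, chosen only for illustration:

```python
# "point" is an illustrative value, not something from the examples above.
point = (3, 4)

# With %, a tuple on the right is treated as the list of values to insert,
# so a single %s is handed two arguments and Python raises TypeError.
try:
    naive_result = "The point is %s" % point
except TypeError as error:
    naive_result = "TypeError: %s" % error

# The usual workaround is to wrap the value in a one-element tuple.
wrapped_result = "The point is %s" % (point,)

print(naive_result)
print(wrapped_result)
```

The trailing comma in (point,) is exactly the kind of detail that is easy to forget.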

So, how would we do it using str.format? Pretty similarly, in many ways:

>>> "Good morning, {} {}".format(first, last)

'Good morning, Reuven Lerner'

Notice that we’ve changed things a bit here. No longer are we invoking a binary operator (%) on the string. Rather, we’re invoking a string method that takes a set of parameters. This is more logical and consistent. I can’t tell you how many of my students think that % is somehow connected to “print”, when in fact it’s connected (in the case of string formatting) to strings. Having to put the “.format” at the end of the string makes the method call more obvious.

As you might already know, the “{} {}” in the string tells str.format to take its two parameters, and to insert them, in order, into the string. Because there are two arguments, we can only have two {} inside of the string. This is a bit harder to understand, both because having {} in Python reminds many people of a dictionary, and because the empty curly braces look a bit weird. But fine, I can live with that, and got used to it very quickly.

Where str.format quickly shows its advantages over %, however, is if I want to display the input parameters in reverse order. When I use %, there is no real way to do that. Plus, if I want to reuse a value passed to %, I cannot do so. With str.format, I can swap the order in which the inputs are displayed:

>>> "Good morning, {1} {0}".format(first, last)

'Good morning, Lerner Reuven'

Notice what happened in the above: If I just use “{} {}”, then str.format uses the two parameters in order. However, I’m also able to treat the parameters as a sequence, with indexes starting at 0. I can then insert them in reverse order, as I did above, or in the regular order:

>>> "Good morning, {0} {1}".format(first, last)

'Good morning, Reuven Lerner'

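Note that you cannot mix the two styles within a single string: once you explicitly state field numbers, you can no longer rely on the automatic numbering for the other fields, and Python raises a ValueError if you try.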
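
As a rough sketch of both points, here is a value being reused via its index, followed by an attempt to mix automatic and explicit numbering. The wording of the greeting is my own:

```python
first = 'Reuven'
last = 'Lerner'

# Explicit indexes let the same argument appear more than once.
reused = "{0} {1}? Good morning, {0}!".format(first, last)

# Combining {} (automatic) with {1} (explicit) in one string is rejected.
try:
    mixed = "Good morning, {} {1}".format(first, last)
except ValueError as error:
    mixed = "ValueError: {}".format(error)

print(reused)
print(mixed)
```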

Of course, this lets me also pass a sequence of values to be inserted, so long as we then use the splat (*) operator on it, to turn it into a parameter list:

>>> names = ('Reuven', 'Lerner')
>>> "Good morning, {} {}".format(*names)

'Good morning, Reuven Lerner'

You can also call str.format with keyword arguments. When you do this, you can then put a keyword name within the {}:

>>> "Good morning, {first} {last}".format(first='Reuven', last='Lerner')

'Good morning, Reuven Lerner'

The above really appeals to me. The named parameters are explicit (if a bit long), and the use of {first} and {last} is quite readable — certainly more so than %(first)s ever was with the % operator!

I can, of course, also pass a dictionary of names, using the ** operator to turn it into a set of keyword arguments:

>>> person = {'first':'Reuven', 'last':'Lerner'}
>>> "Good morning, {first} {last}".format(**person)

'Good morning, Reuven Lerner'

I described all of these to my students in the last month, and I was pleasantly surprised to see how comfortable they were with the syntax. I’m sure that this reflects, to some degree, my comfort with the syntax, as well.

I should note that you can combine numeric and keyword arguments when working with str.format. I really suggest that you not do so. The results would look like this:

>>> person = {'first':'Reuven', 'last':'Lerner'}
>>> "Good {0}, {first} {last}".format('morning', **person)

'Good morning, Reuven Lerner'

Yukko.

Now, the one thing that would appear to be missing from str.format is… well, formatting! The bad news is that str.format has a completely different way of indicating how you want to format output. The good news is that it’s not too hard to learn and understand.

Let’s start with the easiest part: If you want to display a string within a fixed-width field, then you can do so by adding a colon (:) and then a number.  So to put my name in a fixed-width field of 10 spaces, we would say:

>>> "Your name is {name:10}".format(name="Reuven")

'Your name is Reuven    '

(Notice the trailing spaces after my name.)

In the above example, my name is left-justified. If I want it to be right-justified, I could use a > sign between the : and the number:

>>> "Your name is {name:>10}".format(name="Reuven")

'Your name is     Reuven'

And yes, I could have used an optional < symbol to say that my name should be left-justified within the field of 10 spaces in the first example.  Or I could center the text in a field of 10 spaces with the ^ specifier instead of < or >.
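
To see the three alignment specifiers side by side, here is a short sketch. The square brackets are my addition, there only to make the padding visible:

```python
name = "Reuven"

# Each field is 10 characters wide; the brackets mark where it starts and ends.
left = "[{name:<10}]".format(name=name)      # explicit left-justification
right = "[{name:>10}]".format(name=name)     # right-justification
centered = "[{name:^10}]".format(name=name)  # centered

print(left)      # [Reuven    ]
print(right)     # [    Reuven]
print(centered)  # [  Reuven  ]
```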

To pad the string with something other than a space, we specify it before the <, >, or ^ character. For example, if I’m moving to Hollywood, then perhaps I should do something like this:

>>> "Your name is {name:*^10}".format(name="Reuven")

'Your name is **Reuven**'

Note that once I specify a fill character, I must also specify the alignment explicitly. So if I want the string in the (default) left-most position of the field, with the fill characters on the right, then I must use the < specifier.
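
A minimal sketch of fill characters combined with each alignment, plus what happens if the alignment specifier is left out:

```python
name = "Reuven"

# Fill character first, then the alignment specifier, then the width.
stars_on_right = "{name:*<10}".format(name=name)
stars_on_left = "{name:*>10}".format(name=name)

# Without an alignment specifier, "*" is not understood as a fill character.
try:
    no_alignment = "{name:*10}".format(name=name)
except ValueError as error:
    no_alignment = "ValueError: {}".format(error)

print(stars_on_right)  # Reuven****
print(stars_on_left)   # ****Reuven
print(no_alignment)
```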

So it’s pretty clear that str.format is pretty snazzy when it comes to text. How about numbers? I wasn’t really sure how things would work here, but it turns out that they’re also quite straightforward. If you’re displaying integers, then you can go ahead and say:

>>> "The price is ${number}.".format(number=123)

'The price is $123.'

So far, we don’t see any difference between passing an integer and a string. And indeed, they share many characteristics. However, we might want to display an integer in a different way. We can do that using one of the (many) modifiers that str.format provides: letters placed just before the closing } character. For example, we can get the price in binary (with a trailing “b”), or in hexadecimal (with a trailing “x”), as in the following example:

>>> "The price is ${number:x}.".format(number=123)

'The price is $7b.'
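
Because the hexadecimal form of 123 happens to end in the letter b, it is worth seeing the different modifiers next to each other. This is just a sketch, using the same number as above:

```python
number = 123

binary = "{number:b}".format(number=number)     # base 2
octal = "{number:o}".format(number=number)      # base 8
lower_hex = "{number:x}".format(number=number)  # base 16, lowercase digits
upper_hex = "{number:X}".format(number=number)  # base 16, uppercase digits
prefixed = "{number:#x}".format(number=number)  # "#" adds the 0x prefix

print(binary)     # 1111011
print(octal)      # 173
print(lower_hex)  # 7b
print(upper_hex)  # 7B
print(prefixed)   # 0x7b
```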

Of course, we can also zero-pad the number, such that it will always take up a fixed width. Just place a 0 between the colon and the width:

>>> "Your call is important to us. You are call #{number:05}.".format(number=123)

'Your call is important to us. You are call #00123.'

Notice that inside of the {}, we cannot put executable Python code. Instead, there is a mini-language that is separate and different from Python. However, there are two small exceptions to this rule: (1) We can retrieve any attribute with the standard . notation, and (2) we can retrieve a single item with the [] notation.

For example:

>>> class Foo(object):
...     def __init__(self):
...         self.x = 100
...
>>> f = Foo()
>>> 'Your number is {o.x}'.format(o=f)

'Your number is 100'

Notice how we were able to retrieve the “x” attribute from the “f” object, which we mapped to “o” within the string. However, while you can retrieve an attribute, you cannot execute it. Thus, the following will not work:

>>> "Your name is {name.upper()}".format(name="Reuven")

AttributeError: 'str' object has no attribute 'upper()'

See what happened? I said “name.upper()”, in order to execute the method “str.upper” on “name”.  However, Python doesn’t want me to execute code there. So it takes the name of the attribute literally — and thus complained that there is no attribute “upper()”, with the parentheses. Of course, if you try it without the parentheses, it’ll work, for some value of “work”:

>>> "Your name is {name.upper}".format(name="Reuven")

'Your name is <built-in method upper of str object at 0x1028bf2a0>'
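
One simple way around this, sketched here, is to call the method yourself and pass the result to str.format:

```python
name = "Reuven"

# Do the method call in ordinary Python, outside of the format string.
shouted = "Your name is {name}".format(name=name.upper())

print(shouted)  # Your name is REUVEN
```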

Similarly, we can retrieve an individual element of a sequence or mapping with []. However, we cannot use the slice notation for more than one element. For example:

>>> numbers = [0, 1, 2, 3, 4, 5]  # any sequence with 3 at index 3 will do
>>> "Your favorite number is {n[3]}.".format(n=numbers)

'Your favorite number is 3.'

However:

>>> "Your favorite numbers are {n[2:4]}.".format(n=numbers)

ValueError: Missing ']' in format string

The “:” character, which we use for slices, isn’t available in format strings, because it’s used to control the formatting of the output.
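
If you do want several elements, one option is to take the slice in regular Python code and pass the result in. This is only a sketch, and it assumes a list called “numbers” that I define here for illustration:

```python
# Assumed example data; any list would do.
numbers = [0, 1, 2, 3, 4, 5]

# Slice first, then hand the resulting list to str.format.
as_list = "Your favorite numbers are {n}.".format(n=numbers[2:4])

# For nicer output, join the sliced items into a string beforehand.
joined = ", ".join(str(n) for n in numbers[2:4])
as_text = "Your favorite numbers are {n}.".format(n=joined)

print(as_list)  # Your favorite numbers are [2, 3].
print(as_text)  # Your favorite numbers are 2, 3.
```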

You can, of course, use [] on a dictionary, as well. However — and this is a bit weird for Python — we omit the quote marks, even when our key is a string. For example:

>>> person = {'first':'Reuven', 'last':'Lerner'}
>>> "Your name is {p[first]}.".format(p=person)

'Your name is Reuven.'

If we were to include the quotes…

>>> "Your name is {p['first']}.".format(p=person)

KeyError: "'first'"

There is actually a lot more to str.format than what I have shared here. In particular, each type has its own format specifications, which means that you can do certain things with floats (e.g., setting the precision) that you cannot do with strings.
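
As a brief taste of the float-specific options, here is a sketch with values I made up for the purpose. It assumes Python 2.7 or 3.1 and later, where the “,” thousands separator is available:

```python
price = 1234.5678  # illustrative value
ratio = 0.256      # illustrative value

two_places = "{:.2f}".format(price)    # fixed-point, two digits after the dot
with_commas = "{:,.2f}".format(price)  # the same, with a thousands separator
as_percent = "{:.1%}".format(ratio)    # multiplied by 100, with a % sign

print(two_places)   # 1234.57
print(with_commas)  # 1,234.57
print(as_percent)   # 25.6%
```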

You can even add formatting functionality to your own Python classes, such that they’ll be displayed in the way that you want, along with format specifiers that you define.
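
The hook for this is the __format__ method, which receives whatever text follows the colon. Below is a minimal sketch; the Temperature class and its “F” code are hypothetical, invented purely to show the mechanism:

```python
class Temperature(object):
    """Hypothetical class, used only to illustrate __format__."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __format__(self, spec):
        # "F" is a made-up code meaning Fahrenheit; anything else means Celsius.
        if spec == 'F':
            return '{:.1f} F'.format(self.celsius * 9.0 / 5 + 32)
        return '{:.1f} C'.format(self.celsius)


t = Temperature(20)
default_text = "It is {t} outside".format(t=t)
fahrenheit_text = "It is {t:F} outside".format(t=t)

print(default_text)     # It is 20.0 C outside
print(fahrenheit_text)  # It is 68.0 F outside
```

A real class would probably want to pass unrecognized specifiers along to the built-in format function rather than ignore them, but that is beyond what this sketch tries to show.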

If you want to learn more about this, I’d definitely suggest reading PEP 3101, which describes str.format. I’d also suggest a slide show by Eric Smith, which summarizes things nicely. Finally, the Python documentation has some excellent examples, including a guide for moving from % to str.format.

I hope that this was helpful and useful! If you enjoyed this blog post, check out my many other resources, including my free e-mail course on Python scoping, and my free Webinar on functional programming in Python.

14 thoughts on “Teaching an old dog new tricks — or, how I learned to love Python’s str.format, and gave up on %”

  1. It should be noted that str.format is not supported for versions of Python before 2.6, and that the % formatting is fully supported in the latest version (3.4.1) of Python 3. For those who need maximum portability for legacy Python installations (Python 2.4 on RHEL or CentOS version 5) up to the latest Python 3, str.format must unfortunately be avoided.

  2. “Where str.format quickly shows its advantages over %, however, is if I want to display the input parameters in reverse order. When I use %, there is no real way to do that. Plus, if I want to reuse a value passed to %, I cannot do so.”

    The % syntax also supports a dictionary on the right hand side, as well as a tuple. This allows both arbitrary ordering and reuse (as well as making the string more readable compared to just having a bunch of %s):


    >>> "%(author)s wrote a good article about %(subject)s. Thanks %(author)s!" % {'subject': 'str.format', 'author': 'Reuven'}
    'Reuven wrote a good article about str.format. Thanks Reuven!'

    1. Yes, you’re right. And for experienced Python folks, that syntax is totally useful and convenient. But when I show it to my students, they freak out when they see “%(something)s”, which is quite non-standard. Logical, yes. Convenient, definitely. But they find themselves dealing with a few different new syntaxes, on top of the rest of the language. Which means that while % is useful and convenient, str.format is (I think) a bit easier for people to grasp.

      I just taught a Python course using str.format, ignoring % completely, and got many fewer questions about how to print things out. That’s not overwhelming evidence, but it’s something, and encourages me to continue along this path.

  3. Good article. Thanks for writing it.

    If you’re just using **kwargs, there’s also str.format_map. For example:

    >>> name = "Reuven"
    >>> "Your name is {name}".format_map(locals())
    'Your name is Reuven'

    The advantages of this are 2-fold. First is performance, and the second is to preserve the type of the mapping.

    From the docs:

    >>> class Default(dict):
    ...     def __missing__(self, key):
    ...         return key
    ...
    >>> '{name} was born in {country}'.format_map(Default(name='Guido'))
    'Guido was born in country'

    str.format_map didn’t show up until 3.2, and is not present in 2.7, unfortunately.

  4. Great post. Only advice would be to change the line

    >>> "The price is ${number:x}.".format(number=123)

    'The price is $7b.'

    To a hex number that doesn’t end in b, as I was looking for a “trailing b” per the text right above it.

  5. Here’s something that without the str.format style you cannot easily do at all (and it actually mixes the two uses together!):


    just_char = {'left':''}[justification]
    width = 20
    str = "A silly actor: {name:%s%d}" % (just_char, width)
    print(str.format(name="Michael Palin"))

    Better yet, you can columnize everything based on the longest string:

    names = ('Michael Palin', 'John Cleese', 'Eric Idle')
    width = max([len(it) for it in names])
    for n in names:
        print(str.format(n))

    Try doing that without str.format!

  6. Your example doesn’t make sense to me, Brian. First I’ll assume you meant to start with justification = "left" and that you didn’t want to shadow the built-in “str”. Even then, I still don’t need str.format. I don’t know how the alignment is expressed with modulo, but a similar example formatting the number presentation would be:

    justification = "hexadecimal"
    base = {'hexadecimal': "X"}[justification]
    string = "Number %%(number)s with base 16: %%(number)%(base)s" % {"base": base}
    print(string % {"number": 169})

    As for your second example, the same holds. Also, please make sure the code you post is actually executable as is, so I don’t have to figure out what assignments you left out and where to insert which lines from your second part into the first to get a running example.

    Anyway, I started using format() because it lets me traverse dictionaries in the local environment, e.g. "{dictionary[key][nestedKey]}".format(**locals()); using modulo would force me to pass the values individually, which is just a pain.

    PS: You might want to increase the captcha time or make sure that what has been entered isn’t lost in case someone takes too long to post a comment (probably writing a longer one, which makes this especially infuriating; luckily I learnt to *always* copy everything to the clipboard before I hit send).
