Implementing “min” and “max” with “reduce”

This is the third installment of my “reduce” series of blog posts.  For the first, see here, and for the second, see here.

If you have been reading this series, then you know that “reduce” can be used to sum numbers, or to calculate scores.  In that sense, “reduce” justifies its name; we’re invoking a function repeatedly on each element of a collection, boiling the collection down to its essence, as defined by our function.

But “reduce” can also be used to do other, more interesting things.  One of them, mentioned in the Ruby documentation, isn’t an obvious candidate for beginners with “reduce”, namely implementing “min” and “max”.

In a world without functional programming or reduce, we could implement a “min” function as follows in Ruby:

def mymin(items)
  output = items[0]           # Assume the first item is the smallest
  items[1..-1].each do |item| # Iterate through all remaining items
    if item < output
      output = item           # This item is smaller, so keep it
    end
  end
  output                      # Return the smallest item we've seen
end

The definition, if you’re somewhat new to Ruby, works as follows: The “mymin” method takes an array of items. We assume that the first item is the smallest one, simply to make life easier for ourselves.  We then iterate over the remaining items, checking to see if the current one is smaller than the current winner.  Each iteration thus leaves “output” with its existing value, or replaces it with something smaller.

Now, the above code works just fine, but it’s a bit long and inelegant just to find the minimum value in an enumerable.  After all, it should be possible to implement the same algorithm we used above in the implementation of “mymin”, but somehow more elegantly and concisely.

That’s where “reduce” comes in.  We saw in an earlier blog post that when we use “reduce” to sum the numbers in a collection, it is sort of like saying ((((1+2)+3)+4)+5).  Well, let’s say that we had a function “smaller” that would return the smaller of two numbers.  Then, we could use “reduce” in the following way:

smaller(smaller(smaller(smaller(1,2),3),4),5)

Oh, is that hard to read?  Yeah, I thought so.  So instead, we can do the following:

items.reduce {|smallest, current| smaller(smallest, current) }

Each time “reduce” invokes our block, it keeps the smaller of its two parameters.  The first parameter, which I’ve here called “smallest”, contains the smallest value that we’ve seen so far. The second parameter, “current”, contains the current value from the collection.

That one line certainly looks nicer than our original implementation of “mymin”, I think. There’s just one problem: We haven’t defined any “smaller” method!  Fortunately, we can use Ruby’s ternary operator to have a tiny, one-line if-then-else statement that does the same thing.  In other words:

items.reduce {|smallest, current| smallest < current ? smallest : current }

The moment you realize that “reduce” can be used to hold onto only one of the previous values, you begin to see the opportunities and possibilities that it offers. For example, we can find the shortest or longest word in an array of words:

words = 'This is a sample sentence'.split
words.reduce {|shortest, current| (shortest.size < current.size) ? shortest : current }
=> "a"

words.reduce {|longest, current| (longest.size > current.size) ? longest : current }
=> "sentence"
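
For readers following along in Python, here is a rough equivalent of the two Ruby one-liners above. It is only a sketch: the parameter name “best” is my own choice, and the import assumes Python 3, where “reduce” lives in the “functools” module.

```python
from functools import reduce  # reduce lives in functools in Python 3

words = 'This is a sample sentence'.split()

# Keep whichever of the two words is shorter. On a tie, the later word
# wins, which matches the behavior of the Ruby block above.
shortest = reduce(
    lambda best, current: best if len(best) < len(current) else current,
    words)

# Keep whichever of the two words is longer.
longest = reduce(
    lambda best, current: best if len(best) > len(current) else current,
    words)

print(shortest)  # a
print(longest)   # sentence
```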

Now, can we do the same thing in Python?  The builtin “reduce” function works similarly to Ruby’s Enumerable#reduce, as we’ve seen in previous posts I wrote on the subject.  However, the function that we pass to Python’s “reduce” is more limited than a Ruby block, especially if we use a lambda (i.e., anonymous function).

However, we do have a trick up our sleeve: Python provides an analogous version of Ruby’s ternary operator, which is meant for precisely these occasions.  I generally tell people to avoid this operator completely, because it causes more problems and confusion than necessary, and probably means that you’re doing something that you shouldn’t be in Python.  It works as follows:

>>> x = 'a'
>>> is_an_a = "yes" if x == 'a' else "no"

>>> is_an_a
'yes'

The oddest thing for me about this form of “if” in Python is that the condition is in the middle of the line.  It’s not the postfix “if” of Perl and Ruby, and it’s not the prefix “if” that everyone is used to, but something else entirely.  And as such, it’s usually a bad idea to have in your code.  But right now, it fits perfectly.

Here, then, is a version of “min” in Python, using “reduce”:

>>> from functools import reduce  # required in Python 3
>>> items = [5,-5,10,1,100,-20,30]
>>> reduce(lambda smallest, current: smallest if (smallest < current) else current, items)
-20
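
Swapping the comparison gives us “max” as well. The following is a minimal sketch that wraps both in functions; “mymin” borrows its name from the Ruby method earlier in this post, while “mymax” is a name I made up for the illustration. Comparing against Python’s builtin “min” and “max” is a quick sanity check.

```python
from functools import reduce  # required in Python 3


def mymin(items):
    """Return the smallest item by keeping the smaller of each pair."""
    return reduce(
        lambda smallest, current: smallest if smallest < current else current,
        items)


def mymax(items):
    """Return the largest item by keeping the larger of each pair."""
    return reduce(
        lambda largest, current: largest if largest > current else current,
        items)


items = [5, -5, 10, 1, 100, -20, 30]
print(mymin(items))                # -20
print(mymax(items))                # 100
print(mymin(items) == min(items))  # True
print(mymax(items) == max(items))  # True
```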

And sure enough, we get the lowest value!  Next time, we’ll produce even more complex outputs, with more complex data structures.

Calculating Scrabble scores with “reduce”

This is the second installment of my series of blog posts on the “reduce” function/method.  For an introduction, see here.

I love to play Scrabble — or more commonly nowadays, I play Words with Friends on my phone. (I often say that the game should instead be called, “Words with people who used to be your friends,” given the competition that can result.)  The game, if you’re not familiar with it, is simple enough to understand: You choose seven English letters at a time from a bag or pool of letters.  Each letter has a different point value.  Your aim is to create, with your seven letters, the highest-scoring word that you can in each turn.

Fine, it’s a bit more complex than that: Your word needs to hook onto an existing word or letter, and you can get bonus points by completing more than one word, and there are squares which multiply your score, by doubling or tripling the value of an individual letter or the entire word.  I’m going to ignore all of these rules for now.

Let’s say that I want to write a Ruby program that will calculate the score of a given word in Scrabble. How can I do that?

Before I can begin, I’m going to need a list of the points that you get for each letter.  Because we’re dealing with a mapping between two values (a letter and its points), the easiest solution would be a hash:

points = {'a'=> 1, 'b'=> 3, 'c'=> 3, 'd'=> 2, 'e'=> 1, 'f'=> 4, 'g'=> 2, 'h'=>
 4, 'i'=> 1, 'j'=> 8, 'k'=> 5, 'l'=> 1, 'm'=> 3, 'n'=> 1, 'o'=> 1, 'p'=> 3,
 'q'=> 10, 'r'=> 1, 's'=> 1, 't'=> 1, 'u'=> 1, 'v'=> 4, 'w'=> 4, 'x'=> 8, 'y'=>
 4, 'z'=> 10}

Once we’ve defined our “points” hash, we can start calculating the number of points that we’ll get for a given word.  If you aren’t familiar with “reduce”, but have been programming in Ruby for at least a short while, you’ll probably realize that this is easily accomplished with a loop.  That is, we can iterate over each letter in the string, getting the point value for that letter, and adding it to the total.  For example, we could do something like this:

word = 'encyclopedia'
total = 0 
word.each_char do |letter|
  total += points[letter]
end

After the above loop, we’ll find that the total score is 22 points.  That’s fine, and it’ll work… but by using “reduce”, we can do the same thing in a single line:

word.split('').reduce(0) {|total, current| total + points[current]}

Now, how does the above work? First, it takes the string “word”, and turns it into an array, in which each letter is an element.  Then, we use “reduce” to calculate a score: Since it’s a numeric score, we’ll initialize the total with 0.  Then, for each letter in our word, we add the number of points associated with that letter to our total.

To do the same thing in Python will require surprisingly similar code. First, we create a dictionary:

points = {'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 1, 'f': 4, 'g': 2, 'h':
 4, 'i': 1, 'j': 8, 'k': 5, 'l': 1, 'm': 3, 'n': 1, 'o': 1, 'p': 3,
 'q': 10, 'r': 1, 's': 1, 't': 1, 'u': 1, 'v': 4, 'w': 4, 'x': 8, 'y':
 4, 'z': 10}

Then we run reduce in a similar way. Because strings in Python are sequences, we don’t need to turn our word into an array before iterating over it; we can invoke “reduce” directly on the string:

>>> from functools import reduce  # required in Python 3
>>> word = 'encyclopedia'
>>> reduce(lambda total, current: total + points[current], word, 0)
22
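
If you want to reuse this, one option is to wrap the call in a function. The sketch below makes two assumptions of my own that the code above does not: it lowercases the word first, and it scores any character missing from “points” as 0 rather than raising an error. The function name “scrabble_score” is likewise just an illustration.

```python
from functools import reduce  # required in Python 3

# The same letter values as the "points" dictionary above.
points = {'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 1, 'f': 4, 'g': 2, 'h': 4,
          'i': 1, 'j': 8, 'k': 5, 'l': 1, 'm': 3, 'n': 1, 'o': 1, 'p': 3,
          'q': 10, 'r': 1, 's': 1, 't': 1, 'u': 1, 'v': 4, 'w': 4, 'x': 8,
          'y': 4, 'z': 10}


def scrabble_score(word):
    """Sum the letter values of word, ignoring case.

    Assumption: characters that are not in points score 0.
    """
    return reduce(lambda total, letter: total + points.get(letter, 0),
                  word.lower(), 0)


print(scrabble_score('encyclopedia'))  # 22
print(scrabble_score('Jazz'))          # 29
print(scrabble_score(''))              # 0
```

Because the initial value is 0, an empty word simply scores 0. For a plain sum like this one, Python’s builtin “sum” with a generator expression would do the same job; “reduce” earns its keep once the accumulated value is something other than a running total.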

This is a simple demonstration of where “reduce” can be useful. I’m looking to apply the same operation on the elements of a sequence of data, adding the latest value to the total.  When I get to the end, I’d like to see what has accumulated so far.

In the next installment, we’ll see how we can use “reduce” to implement some other functions that we wouldn’t normally associate with functional programming, let alone with something called “reduce”.

 

Understanding “reduce” (first in a series)

One of the notable things about MIT’s computer science curriculum, at least back when I was studying there, was that you didn’t learn any “practical” programming languages.  Our work was all done in either Scheme (a dialect of Lisp) or in CLU (an early object-oriented language).  I can’t say that I have too many memories, let alone fond ones, of CLU.  But I definitely drank the Kool-Aid about Lisp, and have long believed that it has always represented the pinnacle of programming languages.  Things that I learned long ago in Lisp are only now becoming standard in popular languages.

For example, functional programming has become increasingly popular in the last few years, for a variety of reasons including the shrinking effects of Moore’s Law — and the resulting need to have multiple, immutable copies of your data in the computer, distributed across multiple processors.  Lisp has long had many functional capabilities; it was Lisp that first introduced me to such functions as “map”, which I use multiple times each day.  However, my regular uses of “map” are rarely in Lisp; rather, they’re in Python and Ruby, the languages that I use most in my day-to-day work.

When I teach classes in Ruby and Python, I spend a fair amount of time talking about functional programming techniques.  It might sound funny to discuss functional programming in two languages that are so clearly object-oriented, but I actually find it quite natural.  Sure, I create classes and throw objects around.  But if I have an array of values, then it’s very fast and easy for me to process them with functional techniques.  Python’s list comprehensions have basically taken the place of the “map” and “filter” functions, so while those exist, they’re not as necessary any more.  But once you understand such functions as “map” and “filter”, you’re poised to do all sorts of amazing things.

Perhaps the most intriguing of the functions from this school is “reduce”.  I have found, consistently over time, that “reduce” is the function most likely to surprise and confuse newcomers to this type of programming.  That’s because it’s easy to confuse what “reduce” is doing, and to forget how flexible its output can be.

I’ve thus decided to write a series of blog posts about “reduce”, in both Python and Ruby.  I’ll start with the simple stuff, and then move ahead with increasingly complex tasks.  You won’t necessarily start to use “reduce” all of the time, but if you’re like me, you will find all sorts of interesting uses for it.  I tell people that I tend to use “reduce” about once every six months — but when I do use it, it really saves the day.

In Ruby, we can use the “reduce” method (also available as “inject”, to satisfy people from the Smalltalk world) on any enumerable object.  The invocation looks like this:

[1,2,3,4,5].reduce(0) {|a, b| a+b}

I’ll explain what this all means in a moment.  But the most important change that I can already make is to use better names for the block parameters:

[1,2,3,4,5].reduce(0) {|total, current| total+current}

Ruby goes through each element of the enumerable, invoking the block on each element. Described in this way, we might confuse “reduce” with “map”.  However, in “map”, the output is an array of the same length as the input.  By contrast, with “reduce”, the output of each iteration is remembered for the next time around.  That is, the value of “total” in each iteration is the block’s result from the last iteration.  

The initial value of “total”, in the first iteration, is the parameter value that we pass, which is 0 in this case.  If you don’t pass a parameter, then “total” is initialized with the first element of the enumerable, and the first iteration (i.e., the first application of the block) takes place on the second element.  This is fine in the above example, but depending on the output you want, failing to pass a parameter, or passing one of the wrong type, can make a big difference.

You can also think of “reduce” as a sort of “join”, but one that evaluates its inputs and operations.  So instead of getting the string “1+2+3+4+5”, you get the result of actually invoking 1+2, and then (1+2)+3, and then ((1+2)+3)+4, and then finally (((1+2)+3)+4)+5. This is why some Lisp versions call this “fold”, rather than “reduce”.  I still don’t quite get why Smalltalk people call it “inject”, and thus never use that term in Ruby (except when introducing the five rhyming functional methods — select, detect, collect, inject, and reject — because it’s so much fun to say). But the effect, once you internalize it, can be used in many interesting ways.
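
To make the “join that evaluates” idea concrete, here is a small Python sketch that uses “reduce” twice on the same list: once to build the parenthesized expression as a string, and once to actually perform the additions. The variable names are my own.

```python
from functools import reduce  # required in Python 3

numbers = [1, 2, 3, 4, 5]

# Build the expression as text, wrapping each step in parentheses.
expression = reduce(
    lambda total, current: "({}+{})".format(total, current),
    numbers)
print(expression)  # ((((1+2)+3)+4)+5)

# Perform the additions for real, in the same left-to-right order.
result = reduce(lambda total, current: total + current, numbers)
print(result)      # 15
```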

Python doesn’t have a “reduce” method on sequences, but does have a builtin “reduce” function that can be invoked on sequences.  (In Python 3, the “reduce” function was moved to the “functools” module — which is better than the fate Guido had originally planned for it.)  Python’s “reduce” is similar to the one in Ruby.  To sum numbers, we say:

>>> from functools import reduce  # required in Python 3
>>> numbers = range(10)
>>> reduce(lambda total, current: total + current, numbers)
45

As you can see (I hope), the use of “lambda” to create an anonymous function is analogous to the use of a block in Ruby.  In Python’s “reduce” function, you first pass the function you wish to invoke on the sequence, and then the sequence itself.  If you wish to pass an initial value for “total”, you can do so with an optional third parameter:

>>> reduce(lambda total, current: total + current, numbers, 10)
55
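
The initial value matters most at the edges. The sketch below (with a helper function “add” that I defined for the example) shows two cases: an empty sequence, where the initial value is all that “reduce” has to work with, and a list of strings, where the initial value should be of the same type as the result you want.

```python
from functools import reduce  # required in Python 3


def add(total, current):
    """Combine the running total with the current element."""
    return total + current


# With an initial value, an empty sequence just returns that value.
print(reduce(add, [], 0))  # 0

# Without one, there is nothing to start from, and Python raises TypeError.
try:
    reduce(add, [])
except TypeError as error:
    print("TypeError:", error)

# The initial value should match the type of result you want.
print(reduce(add, ["a", "b", "c"], ""))  # abc
```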

The classic use of “reduce” is to sum integers, as we saw above.  But we can, of course, perform additional types of operations, and produce additional types of output.  And to be honest, that’s where “reduce” starts to get more intriguing.  Any operation that you want to perform on an enumerable, such as an array, set, or range, and apply cumulatively to its elements, makes for a good choice for “reduce”. By experimenting with the input enumerable, the initial value of “total” that we pass as a parameter, and the block, we can do many interesting things.

In coming posts, I’ll explore some more of these ideas, and give you a tour of “reduce” in both Ruby and Python that will hopefully open your eyes to some of them, and give you a sense of where “reduce” can help to improve your thinking, and your code.

If you build it, they will come — but they might hate you

Several months ago, I was teaching an introductory Python course, and I happened to mention the fact that I use Git for all of my version-control needs.  I think that I would have gotten a more positive response if I had told them that my hobby is kicking puppies.

The reactions were roughly — and I’m not exaggerating here — something like, “What?  You use Git?!?  That so-called version control system whose main feature is eating our files?!?”   And I got this not just from one person, but from all 20-something people who were taking my Python course.  The more experience they had with Git, the more violently negative their reactions were.

I managed to calm them down a bit, and tried to tell them that Git is a wonderful system, except for one little problem, namely the fact that its interface is very hard to understand.  But, I promised them, once you understand how Git works, and once you start to work with it within the context of understanding what it’s doing, things start to make sense, and you can really enjoy and appreciate the system.

I should note that since that Python class, I’ve returned to the same company to give two day-long Git classes.  Based on the feedback I received, the Git class was very helpful, and I’m guessing that this is because I concentrated on what Git is really doing, and how the commands map to those actions.  I’m pretty sure that people from that class are starting to appreciate the power and flexibility of Git, rather than focusing only on their frustrations with it.

However, my experience working with and teaching Git has taught me a great deal about designing both software and UIs.  We love to say and think that excellent products with terrible marketing never get anywhere.  And in the commercial world, that might well be true. Everyone loves to quote the movie “Field of Dreams” (which I never really liked anyway), and how the main character builds a baseball field after repeatedly hearing, “If you build it, they will come.” As numerous other people have said, this is not the case for businesses: If you build it, they probably won’t come, unless you’ve invested time and money in marketing your product.

However, in the open-source world,  we expect to invest time in learning a technology, and are generally more technical folks in any event.  Thus, we tend to be more forgiving of bad UIs, focusing on features rather than design. It’s thus possible for something brilliant, efficient, flexible, and profoundly frustrating for new users to become popular. Git is a perfect example of this.

Now, I happen to think that Git is one of the most brilliant pieces of software I’ve ever seen. Really, it’s impressively designed.  However, the commands are counter-intuitive for many people who used other version-control systems, and it’s possible to get yourself into a situation from which an expert can extract himself or herself, but in which a novice is completely befuddled.  Once you understand how Git works (brilliantly described in this video), things start to make sense.  But getting to that point can take a great deal of time, and not everyone has that time.

In open source, then, “If you build it, they will come” might sometimes work.  However, even if they do come, and even if they use the software that you have written, you might end up in a particularly unenviable situation: People will use the software, but will hate you for the way in which you designed it.

The upshot, then, is that it’s worth taking a bit of time to think about your users, and how they will use your system.  It’s worth taking the time to create an interface (including commands) that will make sense for people.  Look at WordPress, for example: It packs in a great deal of functionality, but also pays attention to the UI… and as a result, has become a hugely dominant part of the Web ecosystem.

Sure, Git is famous and popular, and I’m one of its biggest fans, at least in terms of functionality. But if Linus had spent just a bit more time thinking about command names, or behaviors, I think that we would have had an equally powerful tool, but with fewer people in need of courses to understand why their files are getting trampled.

Good intentions, unexpected results: Mailing lists and DMARC

If there’s anything that software people know, it’s that changing one part of a program can result in a change in a seemingly unrelated part of the program.  That’s why automated testing is so powerful; it can show you when you have made a mistake that you not only didn’t intend, but that you didn’t expect.

If unexpected results can happen in a system that you control and supposedly understand, it’s not hard to imagine what happens when the results of your changes involve many pieces of software other than yours, running on computers other than yours, being used by customers who aren’t yours.

This would appear to be the situation with one of the latest anti-spam and security features for e-mail, known as DMARC.

I’m not intimately familiar with this standard, but I’ve seen enough other standards relating to e-mail in the past to know that anything having to do with e-mail will be frustrating for some of the people involved.  E-mail is in use by so many people, on so many computers, and by so many different programs, that you can’t possibly make changes without someone getting upset.  Nevertheless, the DMARC implementation and rollout by a number of large e-mail providers over the last few weeks has been causing trouble.

Let me explain: DMARC promises, to some degree, to reduce the amount of spam that we get by verifying that the sender’s e-mail address (in the “From” field) matches the server from which the e-mail was sent.  So if you get e-mail from me, with a “From” address of “reuven@lerner.co.il”, DMARC will verify that the e-mail was really sent from the lerner.co.il server.  To anyone who has received spam, or fake messages, or illegal “phishing” messages, this sounds like a great thing: No longer will you get messages from your friend with a hotmail.com address, asking for money now that they’re stranded in London.  It really, admirably aims to reduce the number of such messages.

How? Very simply, by checking that the “From” address in the message matches the server from which the message was sent.  If your DMARC-compliant server receives e-mail from “reuven@lerner.co.il”, but the server was some anonymous IP address in Mongolia, your server will refuse to receive the e-mail message.

So far, so good.  But of course, for every rule, there are exceptions.  Consider, for example, e-mail lists: When someone posts to a list, the “From” address is preserved, so that the message appears to be coming from the sender.  But in fact, the message isn’t coming from the sender.  Rather, it’s coming from the e-mail program running on a server.

For example, if I (reuven@lerner.co.il) send e-mail to a mailing list (list@example.com), the e-mail will really be coming from the example.com server.  But it’ll have a “From” address of reuven@lerner.co.il.  So now, if a receiver is using DMARC, they’ll see the discrepancy, and refuse to receive the e-mail message.
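
The mailing-list scenario can be modeled in a few lines of Python. To be clear, this is a toy version of the simplified check as I’ve described it in this post, not real DMARC, which evaluates SPF and DKIM results against a policy that the sending domain publishes in DNS. The function names are hypothetical.

```python
def domain_of(address):
    """Return the part of an e-mail address after the last '@', lowercased."""
    return address.rsplit("@", 1)[1].lower()


def passes_simplified_check(from_address, sending_domain):
    """Toy check: does the "From" domain match the sending server's domain?

    This is a deliberate simplification for illustration, not DMARC itself.
    """
    return domain_of(from_address) == sending_domain.lower()


# Direct mail: sent from the lerner.co.il server.
print(passes_simplified_check("reuven@lerner.co.il", "lerner.co.il"))  # True

# Mail relayed by a list: "From" is preserved, but example.com sends it.
print(passes_simplified_check("reuven@lerner.co.il", "example.com"))   # False
```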

If lerner.co.il is using DMARC in the strictest way possible, then reuven@lerner.co.il sending to list@example.com will have especially unpleasant consequences: lerner.co.il will refuse to receive its own subscriber’s message to the list, because DMARC will show it to be a fake.  These refusals will count as a “bounce” on the mailing list, meaning a message that failed to get to the recipient’s inbox.  Enough such bounces, and everyone at lerner.co.il will be unsubscribed.

Yes, this means that if your e-mail provider uses DMARC, and if you subscribe to an e-mail list, then posting to such a list may result (eventually) in every other user of your provider being unsubscribed from the list!

I’ve witnessed this myself over the last few weeks, as members of a large e-mail list I maintain for residents of my city have slowly but surely been unsubscribed.  Simply put, any time that a Hotmail, Yahoo, or AOL user posts to the list for Modi’in residents, all of these companies (and perhaps more) refuse the message.  This refusal increases the number of bounces attributed to the users, and eventually results in mass automatic unsubscriptions.

As if that weren’t bad enough (and yes, it’s pretty bad), people who have been passively reading (i.e., not participating) in the e-mail list for years are now getting cryptic messages from the list-management software, saying that they have been unsubscribed because of excessive bounces.  Most people have no idea what this means, which in turn leads to the list managers (such as me) having to explain intricate e-mail policy issues.

There are some solutions to this problem, of course.  But they’re all bad, so far as I can tell, and came without any serious warning or notification.  And when it comes to e-mail, you really don’t want to start rejecting messages en masse without warning.  The potential solutions are:

  1. Subscribers can receive the digest mode of the list, which is always “From” an address on the server.  If you get the digest, this problem won’t happen to you.  If you are a mailing-list subscriber, rather than a list administrator, this is really the only recourse that you have.
  2. The list managers can change the list such that instead of each message being “From” the individual, it’ll come from the list’s address.  I know that there are some people who say that this is the right behavior for e-mail lists, but I have long subscribed (so to speak) to the school of thought that you don’t want to change the “From” address.  (For more on this subject, you can read “reply-to considered harmful” and its associated messages.)
  3. Supposedly, Mailman (the list-management software that I use) now has some support for DMARC that might solve the problem.  But the more I learn about DMARC, the less I’m convinced that Mailman can do anything.

And by the way, it’s not just little guys like me who are suffering.  The IETF, which writes the standards that make the Internet work, recently discovered that their e-mail lists are failing, too.

E-mail lists are incredibly useful tools, used by many millions (and perhaps billions) of people around the world.  You really don’t want to mess with how they work unless there’s a very good reason to do so.  Yes, spam and fraud are big problems, and I welcome the chance to reduce them.

But really, would it have been so hard to contact all of the list-management software makers (how many can there be?) and work out some sort of deal?  Or at least get the message out to those of us running lists that this is going to happen?  I have personally spent many hours now researching this problem, and trying to find a solution for my list subscribers, with little or no success.

This all brings me back to my original point: The intentions here were good, and DMARC sounds like a good idea overall.  But it is affecting, in a very negative way, a very large number of people who are now suddenly, and to their surprise, cut off from their friends, colleagues, workplaces, and organizations.  The fact that AOL and other e-mail providers are saying, “Well, you’ll just need to reconfigure your list software,” without considering whether we want to do this, or whether e-mail lists really need to change after more than two decades (!) of working in a certain way, is rather surprising to me.  I’m not sure if there’s any way back, but I certainly hope that this is the last time such a drastic, negative solution is foisted on the public in this way.

rvm do

I’ve been using rvm for many years, and love it.  Yes, I know that it rewrites simple commands, such as “ruby” and “gem”, so that I can use lots of different Ruby versions.  Yes, I know that it can be overkill for certain situations.  And yes, I know that rbenv is preferred by many.

But I’ve been using rvm for a long time, and I find it works very well for my needs.  I can (and do) have many different versions of Ruby running on my computer, and having access to all of them at once is terrific.

I tend to be obsessive about updating Ruby gems on my systems, and I’m sure I’m not the only Rubyist who runs “gem update -V” (and yes, I love the “verbose” option) at least once per day.  Updating gems never removes the old versions, and if you’re using Bundler in your Rails or Sinatra application, then it doesn’t really matter how many versions you have on your system.  (And yes, I know that willy-nilly updating all gems on my system is probably not wise.  If only that were the most foolish thing I do…)

The thing is, when I update gems, I do so in a particular version of Ruby.  So even though I’m always running “gem update -V”, I never quite remember which versions of Ruby have the latest gems, and which haven’t been updated in a while.  There is, of course, a clear correlation between the frequency with which I use a Ruby version and the freshness of the gems for that version on my system.  But I sometimes find myself having to update gems in a version of Ruby that I haven’t used in a while.

So you can imagine my delight when I discovered “rvm do”.  This is an rvm command that lets you execute a command in any or all of the Ruby versions installed on your system.  It basically switches to the Ruby version and then executes the requested shell command — so you’re not executing a Ruby program in each separate version, but rather you’re executing a command-line program once for each version of Ruby installed.  You can think of it as executing the same shell command once for each installed version, prefaced by “rvm VERSION_NUMBER”.

So, how can I ensure that all of the gems, for all versions of Ruby, are up to date?  Very simply, I write:

rvm all do gem update -V

And if I want to check out some Ruby code, and see how it runs in all of the versions on my system, I can say

rvm all do ruby test.rb

If I just want to see the difference between doing something in 1.8, 1.9, 2.0, and 2.1 (without all of the patchlevels for 1.9.3), then I can just say:

rvm 1.8.7,1.9.3,2.0,2.1 do ruby test.rb

I’m already loving this feature, and can easily imagine cases (such as when teaching Ruby programming, and trying to show students the differences between versions) when this will be quite handy.

Announcing: Teaching to Code, a community of programming instructors

In the wake of my last blog post, I’ve been thinking a great deal about the practice of teaching, and specifically the practice of teaching programming.  I’ve realized that while instruction in programming is increasingly popular and important, the people engaged in such instruction aren’t comparing notes, learning from one another, or generally working to improve the trade.

I’ve decided to try to change that.  I’ve created a new site, Teaching to Code, a discussion forum aimed at anyone who teaches programming to others.  Whether you teach in person, produce screencasts, or lecture at the university level, I’m sure that there are techniques, ideas, and suggestions that you can share with other people, and which can help to improve the craft of teaching programming.

It’s true that many of us in this community are commercial instructors.  As a result, there will undoubtedly be some overlap and competition among the people who participate.  I’m optimistic that we can balance these competitive instincts and realities with the goal that we all (presumably) have, namely to improve our students’ knowledge and understanding of programming in general, and of the technologies we teach in particular.

In addition to general discussion on a variety of topics, I’m also aiming to have a monthly book/journal club.  Each month, we’ll discuss a book, journal article, or blog post (or even a video, I guess) that can inform and improve our teaching.  Some of the initial suggestions will come from readings I’ve had in graduate school; there were a number of papers that have really influenced my thinking, and that I believe will be interesting and useful for others, too.  But I know that I’ve only read a minority of things written on this subject, and would be delighted to read and then discuss items that others suggest, as well.

If you’re a programming instructor of any sort, please join us!  Contribute to the fledgling discussion, and suggest how we can make it better.  If there is something that you feel could help you, or improve your teaching, then you can either ask on the forum or e-mail me at reuven@lerner.co.il.  Either way, I hope that Teaching to Code will become a community of practice for programming instructors worldwide, helping teachers and students alike.

Teaching and acting (or, why I don’t plan to sell recorded classes in the near future)

Several weeks ago, my wife and I saw a wonderful play at our local theater in Modi’in (“Mother Courage and Her Children”).  At the end, the actors came out to receive their richly deserved applause.  Three times, the actors came out, took their bows, and were warmly applauded by the audience.  We loved their performance, but just as importantly, they loved performing, and they loved to see and hear the reactions from the audience, both during and after the play.

I’m sure that some or all of these actors have worked in television and the movies; Israel is a small country, and it’s hard for me to believe that actors can decide only to work in a single medium.  But I’ve often heard that actors prefer to work on stage, because they can have a connection with the audience.  When they say something funny, sad, or upsetting, they can feel (and even hear) the audience’s reaction.

But while we often hear about TV and movie stars making many millions of dollars off of their work, it’s less common for stage actors to make that kind of money.  That’s because when you act on stage, you’re by definition limiting your audience to the number of people who can fit in a theater.  Even the largest theaters aren’t going to hold more than a few hundred seats; by contrast, even a semi-successful TV show or movie will get tens or hundreds of thousands of viewers on a given night.  (And yes, TV and film have many more expenses than plays do — but the fact remains that you can scale up the number of TV and film viewers much more easily than you can a play.  Plus, movies and TV can both be shown in reruns.)

Another difference is the effort that you need to put into a stage production, as opposed to a TV program or a movie: In the former case, you need to perform each and every night.  In the latter, you record your performance once — and yes, it’ll probably require multiple takes — and then it can be shown any number of times in the future.  You can even be acting on stage while your TV show is broadcast.  Or more than one of your movies can be shown simultaneously, in thousands of cities around the world.

What does this have to do with me?  And why have I been thinking about this so much over the last few weeks, since seeing that play?

While I’m a software developer and consultant, I also spend a not-insignificant amount of time teaching people: In any given week, I will give 2-4 full days of classes in Python, Ruby, Ruby on Rails, PostgreSQL, and Git, with other classes likely to come in the next few months.

I’m starting to dip my toes into the waters of teaching online, and hope to do it increasingly frequently over the coming months and years.  But unlike most online programming courses currently being offered, I intend to make most or all of my courses real-time, live, and in person.

This has some obvious disadvantages: It means that people will need to be available during the precise hours that I’m teaching. It means that the course will have to be higher in price than a pre-recorded video course, because I cannot amortize my time investment over many different purchases and viewings.  And it means that the course is limited in size; I cannot imagine teaching more than 10 people online, just as I won’t teach an in-person class with more than 20 people.

Given all of these disadvantages, why would I prefer to do things this way, live and in person?

The answer, in a word, is: Interactions.

I’m finishing my PhD in Learning Sciences, and if there’s anything that I have gained from my studies and research, it’s that personal interactions are the key to deep learning. That’s why my research is all about online collaboration; I deeply believe that it’s easiest and best to learn when you speak with, ask questions of, challenge, and collaborate with others, ideally when you’re trying to solve a problem.

I’m not saying that it’s impossible to learn on your own; I certainly spend enough hours each week watching screencasts and lectures, and reading blog posts, to demonstrate that it’s possible, pleasurable, and beneficial to learn in these ways. But if you want to understand a subject deeply, then you should communicate somehow with other people.

That’s one of the reasons why pair programming is so helpful, improving both the resulting software and the programmers who engage in the pairing. That’s why open source is so successful — because in a high-quality open-source project, you’ll have people constantly interacting, discussing, arguing, and finally agreeing on the best way to do things. And that’s why I constantly encourage participants in my classes to work together when they’re working on the exercises that I ask them to solve: Talking to someone else will help you to learn better, more quickly, and more deeply.

I thus believe that attending an in-person class offers many advantages over seeing a recorded screencast or lecture, not because the content is necessarily better, but because you have the opportunity to ask questions, to interact with the teacher, to clarify points that weren’t obvious the first time around, and to ask how you might be able to integrate the lectures into your existing work environment.

So for the students, an in-person class is a huge win.  What do I get out of it?  Why do I prefer to teach in person?

To answer that, I return to the topic with which I started this post, namely actors who prefer to work on stage, rather than on TV and in movies. When I give a course, it’s almost like I’m putting on a one-man show. Just as actors can give the same performance night after night without getting bored, I can give the same “introduction to Python” course dozens of times a year without tiring of it.  (And yes, I do constantly update my course materials — but even so, the class has stayed largely the same for some time.)  I’m putting on a show, albeit an interactive and educational one, and while I put on the same show time after time, I don’t get tired of it.

And the reason that I don’t get tired of it? Those same interactions, which are so beneficial to the students’ learning and progress, are good for me, as the instructor.  They keep me on my toes, allow me to know what is working (and what isn’t), provide me with an opportunity to dive more deeply into a subject that is of particular interest to the participants, and assure me that the topics I’m covering are useful and important for the people taking my class.

I live and work in Israel, and one of the things that I love about teaching Israelis is that I’m almost guaranteed to be challenged and questioned at nearly every turn. Israelis are, by nature, antagonistic toward authority.  As a result, my lectures are constantly interrupted by questions, challenges, and requests for proof.

I have grown so accustomed to this way of things, that it once backfired on me: Years ago, I gave a one-day course in the US that ended at lunchtime — it turns out that the Americans were very polite and quiet, and didn’t ask any questions, allowing me to get through an entire day’s worth of material in just half of the time.  I have since learned to make cultural adjustments to the number of slides I prepare for a given day, depending on where I will be teaching!

When I look at stage actors, and see them giving the same performance that they have given an untold number of times in the past, I now understand where they’re coming from. For them, each night gives them a chance to expose a new audience to the ideas that they’re trying to get across through their characters and dialogue.  And yes, they could do that in a movie — but then they would be missing the interactions that they have with the audience, which provide a sense of excitement that’s hard to match.

Does this mean that I won’t ever record screencasts or lectures?  No, I’m sure that I will do that at some point, and I already have some ideas for doing so. But they’ll be fundamentally different from the courses that I teach, complementing the full-length courses, rather than replacing them. At the end of the day, I get a great deal of satisfaction from lecturing and teaching, both because I see that people are learning (and thus gaining a useful skill), and because my interactions with them are so precious to me, as an instructor.

Benchmarking old-style and new-style Python classes

It has been many years since Python developers were really supposed to worry about new-style vs. old-style classes.  There is only one style (new) in Python 3.x, and even in Python 2.x, old-style classes have not been recommended for many years.  Nevertheless, I mention old-style classes in my Python courses, mostly so that participants will understand the potentially serious implications of creating classes without inheriting from object.  For example:

>>> class Foo(object): pass
>>> type(Foo)
type

>>> f = Foo()
>>> type(f)
__main__.Foo

The above is the way that modern Python programmers define classes.  This is the preferred way, for sure; if you’re writing old-style classes, then you’re almost certainly doing something wrong.  But it’s so easy to create an old-style class in Python — all you have to do is forget to inherit from “object”:

>>> class Foo(): pass
>>> type(Foo)
classobj

>>> f = Foo()
>>> type(f)
instance

As you can see, the fact that I created an old-style class directly affects the types of objects that I have created, and thus their capabilities.  For many years, it has been seen as a mistake to create old-style classes; not only are you missing out on new functionality, but you are creating objects that behave differently from the rest of the objects in Python.

I was just teaching a Python class at a company that has a fair amount of legacy Python code.  It turns out that this legacy code includes a large number of old-style classes. The company asked me whether it was worth upgrading all of their old-style classes to use new-style classes; my answer was that (1) if it ain’t broke, don’t fix it, (2) it’s hard to know whether the upgrade would be trivially easy or impossibly hard, and (3) you’ll likely want to upgrade these classes over time, doing so incrementally.
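
For an incremental upgrade, it helps to know which classes are still old-style.  What follows is only a rough sketch, not something from the company’s code: the helper names is_old_style and old_style_classes are made up for illustration, and the check relies on types.ClassType, which exists only in Python 2.  Under Python 3 there are no old-style classes, so the helpers always report none.

```python
import inspect
import types

# types.ClassType is the type of old-style classes, and exists only on Python 2.
# On Python 3 we fall back to an empty tuple, so isinstance() is always False.
OLD_STYLE = getattr(types, "ClassType", ())


def is_old_style(cls):
    """Return True if cls is an old-style (classic) class."""
    return isinstance(cls, OLD_STYLE)


def old_style_classes(module):
    """Return the sorted names of old-style classes visible in a module."""
    return sorted(name
                  for name, obj in inspect.getmembers(module, inspect.isclass)
                  if is_old_style(obj))


class Legacy:            # old-style on Python 2, new-style on Python 3
    pass


class Modern(object):    # new-style everywhere
    pass


print(is_old_style(Legacy))   # True on Python 2 only
print(is_old_style(Modern))   # False
```

Running old_style_classes on each legacy module, one at a time, would give a to-do list that can be worked through gradually.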

Someone then asked me whether there is a performance difference between old-style and new-style classes, in order to evaluate the importance of doing such an upgrade project.  I had to admit that I wasn’t sure, and couldn’t find anything online (after doing a quick search) on the subject.  I thus decided to do a small benchmark to see what might be faster (or slower).  I’m not an expert in benchmarking, but I did want to check the basic speed of (1) object creation, (2) inheritance, and (3) implementation of __repr__.

The results surprised me: New-style classes are substantially faster.  Here is the benchmark that I ran on the new-style class:

class Person(object):
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name
    def fullname(self):
        return self.first_name + " " + self.last_name
    def __repr__(self):
        return self.fullname()

class Employee(Person):
    def __init__(self, first_name, last_name, employee_id):
        Person.__init__(self, first_name, last_name)
        self.employee_id = employee_id

def test_employee():
    e = Employee('first', 'last', 1)
    return str(e)

My test of old-style classes was precisely the same, except that I omitted “object” between the parentheses in the class definition of Person.

I used %timeit from within IPython to run the function 100,000 times for each of the two versions (old-style and new-style).  Old-style classes took 3.09 µs per iteration, while new-style classes took 2.44 µs per iteration, a difference of more than 20 percent!
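
If you want to try something similar outside of IPython, the standard library’s timeit module can do roughly the same job.  The following is a minimal sketch rather than the exact benchmark described above; the helper name microseconds_per_call is invented for this example.  Keep in mind that the comparison only makes sense under Python 2, since in Python 3 a class is new-style whether or not it inherits from “object”.

```python
import timeit


class Person(object):    # remove "object" (on Python 2) for the old-style run
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    def fullname(self):
        return self.first_name + " " + self.last_name

    def __repr__(self):
        return self.fullname()


class Employee(Person):
    def __init__(self, first_name, last_name, employee_id):
        Person.__init__(self, first_name, last_name)
        self.employee_id = employee_id


def test_employee():
    e = Employee('first', 'last', 1)
    return str(e)


def microseconds_per_call(func, number=100000):
    """Average time of a single call to func, in microseconds."""
    total_seconds = timeit.timeit(func, number=number)
    return total_seconds / number * 1e6


print("%.2f us per call" % microseconds_per_call(test_employee))
```

The absolute numbers will depend on your machine and Python version, so run both variants on the same system before comparing them.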

The bottom line would seem to be that if you’re running large systems in Python and are still using old-style classes, it’s not just worth upgrading to new-style classes for reasons of aesthetics, features, and compatibility.  It’s also going to speed up your code, particularly if you have a large, long-running system that invokes lots of methods.

= and = aren’t equal

When I teach a Ruby or Python class, I always begin by going through the various data types.  My students are typically experienced programmers in Java, C++, or C#, and so it no longer surprises me when I begin to describe numbers, and someone asks, “How many bits is an integer?”

My answer used to be, “Who cares?”  I would then follow this with a demonstration of the fact that in these languages, numbers can be pretty darned big before you have to worry about such things.
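
A demonstration along those lines might look like this in Python (a small sketch; the variable name big is arbitrary):

```python
big = 2 ** 200            # far more than fits in a 64-bit integer
print(big)                # printed in full, with no overflow
print(big.bit_length())   # 201 bits are needed to represent it
print(big + 1 > big)      # True: arithmetic still works as expected
```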

But over the last few months, I’ve begun to understand the reason for this question, and others.  Indeed, I have begun to understand one of the reasons why dynamic languages can be so difficult for people to learn after they have worked with a static language.

Let’s take a simple example.  In a typical, C-style statically typed language, you don’t just assign a variable.  You must first declare it with a type.  You can thus say something like this:

int x;
x = 5;

In both Ruby and Python, you can do something similar: 

x = 5    # no type declaration needed

On the face of it, these seem to be doing similar things.  But they aren’t.

In a static language, a variable is an alias to a place in memory.  Thus, when I say “int x”, I’m telling the compiler to set aside an integer-sized piece of memory, and to give it an alias of “x”.  When I say “x = 5”, the compiler will stick the number 5 inside of that int-sized memory location. This is why static languages force you to declare types — so that they can allocate the right amount of space for the data you want to store, and so that they can double-check that the type you’re trying to store won’t overflow that allocated area.
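
You can get a rough feel for this from within Python by using the ctypes module, which exposes C-style, fixed-size storage.  This is only an analogy, and the name slot is made up for the example.  Note also that where a C compiler would check types at compile time, ctypes performs no overflow checking and silently wraps a value that doesn’t fit.

```python
import ctypes

slot = ctypes.c_int32(5)      # a fixed, 4-byte location holding the value 5
print(ctypes.sizeof(slot))    # 4
print(slot.value)             # 5

slot.value = 2 ** 31          # one more than the largest signed 32-bit value
print(slot.value)             # -2147483648: the value wrapped around
```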

Dynamic languages don’t do this at all.  Whereas assignment in a static language means, “Put the value on the right in the address on the left,” assignment in a dynamic language means, “As of now, the name on the left points to the object on the right.”

In other words, assignment in a dynamic language isn’t really assignment in the traditional sense.  There’s no fixed memory location associated with a variable.  Rather, a variable is just a name in the current scope, pointing to an object.  Given that everything in both Python and Ruby is an object, you never have to worry about assignment not “fitting” into memory.
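
Here is a short Python sketch of what “a name pointing to an object” means in practice.  The names a and b are arbitrary:

```python
a = [1, 2, 3]
b = a              # b now names the same list object; nothing is copied
b.append(4)
print(a)           # [1, 2, 3, 4]: the change is visible through both names
print(a is b)      # True

b = [1, 2, 3, 4]   # rebinding b makes it point to a brand-new object
print(a == b)      # True: the contents are equal
print(a is b)      # False: they are now two different objects
```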

This is also why you can say “x = 5” and then “x = [1,2,3]” in a dynamic language: Types sit on the data, not on the variable.  As long as a variable is pointing to an object, you’re just fine, because all object pointers are the same size.
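
In Python, for example, you can watch the type travel with the object rather than with the name:

```python
x = 5
print(type(x).__name__)   # int

x = [1, 2, 3]             # same name, now pointing to a different object
print(type(x).__name__)   # list
```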

The bottom line, then, is that = in static languages and = in dynamic languages would seem, on the surface, to be doing similar things.  But they’re definitely not.  Once you understand what they are doing (putting data in memory, or telling a name to point to a value), many other mysteries of the language suddenly make more sense.