= and = aren’t equal

When I teach a Ruby or Python class, I always begin by going through the various data types.  My students are typically experienced programmers in Java, C++, or C#, and so it no longer surprises me when I begin to describe numbers, and someone asks, “How many bits is an integer?”

My answer used to be, “Who cares?”  I would then follow this with a demonstration of the fact that in these languages, numbers can be pretty darned big before you have to worry about such things.
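A demonstration along these lines might look like the following minimal Python sketch. The variable names are just for illustration, and it assumes Python 3, where integers have arbitrary precision:

```python
# Python integers grow as needed; there is no fixed bit width to overflow.
big = 2 ** 100          # far beyond what a 64-bit integer can hold
bigger = big * big      # still an exact integer, not a float or an overflow

print(big.bit_length())     # number of bits needed to represent the value
print(bigger == 2 ** 200)   # exact arithmetic, so the comparison holds
```

Ruby behaves similarly, promoting integers to a larger representation as needed.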

But over the last few months, I’ve begun to understand the reason for this question, and others.  Indeed, I have begun to understand one of the reasons why dynamic languages can be so difficult for people to learn after they have worked with a static language.

Let’s take a simple example.  In a typical, C-style statically typed language, you don’t just assign a variable.  You must first declare it with a type.  You can thus say something like this:

int x;
x = 5;

In both Ruby and Python, you can do something similar: 

x = 5    # no type declaration needed

On the face of it, these seem to be doing similar things.  But they aren’t.

In a static language, a variable is an alias to a place in memory.  Thus, when I say “int x”, I’m telling the compiler to set aside an integer-sized piece of memory, and to give it an alias of “x”.  When I say “x = 5”, the compiler will stick the number 5 inside of that int-sized memory location. This is why static languages force you to declare types — so that they can allocate the right amount of space for the data you want to store, and so that they can double-check that the type you’re trying to store won’t overflow that allocated area.

Dynamic languages don’t do this at all.  Whereas assignment in a static language means, “Put the value on the right in the address on the left,” assignment in a dynamic language means, “As of now, the name on the left points to the object on the right.”

In other words, assignment in a dynamic language isn’t really assignment in the traditional sense.  There’s no fixed memory location associated with a variable.  Rather, a variable is just a name in the current scope, pointing to an object.  Given that everything in both Python and Ruby is an object, you never have to worry about assignment not “fitting” into memory.
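One way to see that names merely point to objects is to bind two names to the same list. This is a small Python sketch with illustrative names (a, b, and same are not from any particular codebase):

```python
a = [1, 2, 3]   # the name a now points to a list object
b = a           # b points to the same object; nothing is copied

b.append(4)     # mutate the object through one name...
print(a)        # ...and the change is visible through the other name

same = a is b   # identity check: both names refer to one object
print(same)

b = "hello"     # rebinding b changes only what b points to
print(a)        # the list that a points to is untouched
```

Appending through b changes what a sees, because there is only one list. Rebinding b afterward leaves that list alone, since assignment only moves the name.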

This is also why you can say “x = 5” and then “x = [1,2,3]” in a dynamic language: Types sit on the data, not on the variable.  As long as a variable is pointing to an object, you’re just fine, because all object pointers are the same size.
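A quick way to check that the type travels with the object, not with the name, is to ask for the type before and after rebinding. This is a sketch in Python; first_type and second_type are illustrative names:

```python
x = 5
first_type = type(x)      # the type belongs to the object 5, not to the name x

x = [1, 2, 3]             # same name, now pointing to a different object
second_type = type(x)     # the new object carries its own type

print(first_type.__name__, second_type.__name__)
```

The name x never had a type of its own; it simply reports the type of whatever object it currently points to.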

The bottom line, then, is that = in static languages and = in dynamic languages would seem, on the surface, to be doing similar things.  But they’re definitely not.  Once you understand what they are doing (putting data in memory, or telling a name to point to an object), many other mysteries of the language suddenly make more sense.

2 thoughts on “= and = aren’t equal”

  1. This is not quite accurate. In a static language like C#, I can declare object x = "a string"; and then later assign any other object to x, e.g. x = new Uri(...), because both string and Uri are instances of object.

    A variable like x will always be the same size (the size of a pointer in the underlying virtual machine).

    Discussing primitives such as int vs object references muddies the issue since each language and runtime has subtle differences in how these are allocated and stored.

    For example, in .NET, a local variable that is a primitive may be directly allocated on the stack, but when a primitive is stored on the heap (as in class fields, captured closures, etc.), it will typically be “boxed” into an object that uses the same pointer size as any other object.

    Static languages prevent you from assigning x to another object type not because the size of the object is different, but because static languages are explicitly designed to prevent this type of mixture.

    1. Chris, that’s a very good point. It’s true that once you start working with objects (rather than primitive types), things are more similar than I described here. Maybe the biggest difference is that (as everyone loves to say) everything in Ruby and Python is an object, and thus every variable assignment is just a pointer assignment.

      That said, this explanation (or a variation thereof) still seems to satisfy many people in my classes who come from static languages. I haven’t checked, but perhaps this explanation helps those coming from C, which lacks objects, more than other languages. Once people understand that we’re just talking about generic object pointers, rather than the actual data types, they seem to get, much more easily, how assignment, argument passing, and even garbage collection work.

      So I completely accept your comments… and am curious to figure out if there’s a better, more accurate way to describe things that will still allow people to think about dynamic-language variable assignment in a similar way.
