Why Python Rocks III: Parameter expansion

Friday, July 25th, 2008

Okay. So what’s cool about Python? I can’t count the number of times I’ve had to show skeptics why Python is cool, what Python can do that their favorite language can’t do. So I’m writing a bunch of articles showing off Python’s Awesomeness.

All articles in this series:

Why Python Rocks I: Inline documentation
Why Python Rocks II: Data structures
Why Python Rocks III: Parameter expansion

I wasn’t actually planning on making this the third article in this series, but I was just working on a little calendaring application and came up with a simple solution to a certain problem that demonstrates perfectly why Python is such an awesome language.

Python: Dynamic

If you’ve read previous installments in this series of articles, you may have noticed that Python is fairly dynamic. Introspection tools such as the dir(), getattr() and setattr() built-ins (more on those in a later article) and other dynamic features make it very easy to accomplish things in Python that can be a big pain in a lot of other, less dynamic, languages. As an example, I’d like to show you what Python lets you do with function parameters.

The premise

Suppose we have the following function (that we’re not allowed to change), which takes a bunch of named parameters and does something with them:

def person(firstname='John', lastname='Doe', age=20, email='unknown@example.com'):
    print '%s, %s(%s): %s' % (lastname, firstname, age, email)

Every parameter in the above function has a default value. When calling it, you can leave out any of the parameters, and Python will fill in the defaults for you:

person(firstname='Pete', age=25)
person(lastname='Johnson', email='j.johnson@example.com')

The output of which is:

Doe, Pete(25): unknown@example.com
Johnson, John(20): j.johnson@example.com

Now suppose we have input from a file or something, which we need to pass to the function. However, the records in the input don’t always contain every field. Sometimes ‘age’ might be missing, sometimes another field; you just don’t know in advance. The input might look like this:

firstname:Jane,lastname:Foo,email:jane@gmail.com
age:15,firstname:Jake

These are two records, for two persons: Jane and Jake. Parsing this data is easy, but when we want to pass the data to the function, we run into a problem: not all the fields are present in the data. Jake’s last name is unknown, for instance. We don’t know which fields are missing, so we can’t just call the function like we normally would. Many statically typed object-oriented languages provide method overloading, which allows the developer to specify multiple versions of the same method with a different number of parameters. But we don’t know the order of the parameters either! Besides, it takes a lot of time and code to write all those methods. What if you’ve got twenty parameters?

One way of doing it…

In less dynamic languages one solution would be to redefine the default values for the function call and then overwrite them with the parsed data’s values. Something along the lines of this:

s = 'firstname:Jane,lastname:Foo,email:jane@gmail.com\nage:15,firstname:Jake'

for record in s.split('\n'):
    firstname = 'John'
    lastname = 'Doe'
    age = 20
    email = 'unknown@example.com'

    for field in record.split(','):
        key, value = field.split(':')
        if key == 'firstname':           # Big, ugly IF structure
            firstname = value
        elif key == 'lastname':
            lastname = value
        elif key == 'age':
            age = value
        elif key == 'email':
            email = value
    
    person(firstname, lastname, age, email)

Not a very clean solution, especially if you’ve got a lot of parameters.

A better way: parameter expansion

Python, in its dynamic nature, offers a very nice solution for this which I will call parameter expansion (the Python documentation calls it argument unpacking). Here’s how it works:

def foo(a, b, c):
    print a, b, c

data = (1, 2, 3) # Tuple containing the parameter values
foo(*data)  # output: 1 2 3

When *data is encountered by Python, it expands the tuple into the parameters for the function, making the line equivalent to: foo(1, 2, 3). Of course that’s no good for our situation, because we don’t know which parameters will be present in the input. No worries, because we can also use a dictionary to fill in the function’s parameters:

def foo(a=1, b=2, c=3):
    print a, b, c

data = {
    'a': 5,
    'c': 9
}
foo(**data)  # output: 5 2 9

There we go! Python neatly expands the dictionary’s keys and values to the parameters of the function, making the call equivalent to: foo(a=5, c=9). We can use this to map our data to the function we’ve got:

def person(firstname='John', lastname='Doe', age=20, email='unknown@example.com'):
    print '%s, %s(%s): %s' % (lastname, firstname, age, email)

s = 'firstname:Jane,lastname:Foo,email:jane@gmail.com\nage:15,firstname:Jake'
for record in s.split('\n'):        # ['firstname:Jane,lastname:Foo,email:jane@gmail.com', ..]
    personinfo = {}
    for field in record.split(','): # ['firstname:Jane', 'lastname:Foo', ..]
        personinfo.update(          # Add the key, value to the personinfo dictionary
            dict(                   # {'firstname': 'Jane'} 
                (field.split(':'),) # (['firstname', 'Jane'],)
            )
        )
    person(**personinfo)            # Call person() with the dict expanded to params

Output of the above:

Foo, Jane(20): jane@gmail.com
Doe, Jake(15): unknown@example.com

Now it doesn’t matter which fields are present in the input, nor which order they come in. For each record in the input (each line), we create a ‘personinfo’ dictionary and then add each key/value pair we encounter in the input to that dictionary. The code dict( (field.split(':'), ) ) may look a little cryptic. What it does is split the ‘field’ variable on the ‘:’, which gives us a two-item list holding the key and the value: ['firstname', 'Jane']. When you wrap that list in another sequence (here a one-element tuple), you get a sequence of key/value pairs, which dict() can convert to a dictionary. For example:

x = ( ('a', 2), ('b', 3) )
print dict(x)                # output: {'a': 2, 'b': 3}
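
Because dict() accepts any sequence of key/value pairs, the whole record can also be converted in one go. The following is only a sketch, assuming every field contains exactly one ':'. The helper name parse_record is made up for this illustration, and the code is written so that it runs on both Python 2 and Python 3:

```python
def parse_record(record):
    # Hypothetical helper: turn 'key:value,key:value' into a dictionary.
    # Assumes every field contains exactly one ':'.
    return dict(field.split(':') for field in record.split(','))

s = 'firstname:Jane,lastname:Foo,email:jane@gmail.com\nage:15,firstname:Jake'
records = [parse_record(line) for line in s.split('\n')]

print(records[1]['firstname'])  # Jake
```

Each resulting dictionary can then be passed on with person(**record), just like ‘personinfo’ above.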

Variable arguments

Naturally we can also do it the other way around. Just as the caller doesn’t have to know the parameters in advance, the function itself doesn’t have to know them either:

def person(*args, **kwargs):
    for k, v in kwargs.items():
        print '%s = %s' % (k, v)

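This defines a function with a variable number of normal (positional) parameters (*args) and a variable number of keyword parameters (**kwargs). When we call this version from the parsing loop above, it outputs the following (dictionaries are unordered here, so the keys may appear in a different order):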

lastname = Foo
email = jane@gmail.com
firstname = Jane
age = 15
firstname = Jake
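
To see how Python sorts the arguments into those two collections, here is a small sketch. The function name describe is hypothetical, and it returns its arguments instead of printing them so that it behaves the same on Python 2 and Python 3:

```python
def describe(*args, **kwargs):
    # args is a tuple holding the positional arguments,
    # kwargs is a dictionary holding the keyword arguments.
    return args, kwargs

args, kwargs = describe('x', 42, firstname='Jake', age=15)
print(args)            # ('x', 42)
print(sorted(kwargs))  # ['age', 'firstname']

# Expansion and collection combine: the tuple and the dictionary are
# expanded at the call site and collected again inside the function.
args, kwargs = describe(*(1, 2), **{'firstname': 'Jane'})
print(args)            # (1, 2)
print(kwargs)          # {'firstname': 'Jane'}
```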

We can do all kinds of neat things with the ‘person’ function now. For instance, to create SQL queries:

def person(*args, **kwargs):
    keys = kwargs.keys()
    values = []
    for v in kwargs.values():
        values.append("'%s'" % (v))

    sql = 'INSERT INTO foo (%s) VALUES (%s);' % (
        ', '.join(keys),
        ', '.join(values)
    )
    print sql

Output:

INSERT INTO foo (lastname, email, firstname) VALUES ('Foo', 'jane@gmail.com', 'Jane');
INSERT INTO foo (age, firstname) VALUES ('15', 'Jake');

(This is a silly example of course; it assumes all columns are text, that all columns can have NULL values, and it does no escaping of the values. Also, the quoting of the values could be done better using list comprehensions, but that’s something for a later article in this series. It’s just an example okay ;-) ).
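
For real database work, the values are better left to the database driver than quoted by hand. What follows is only a sketch, assuming the standard sqlite3 module, an in-memory database and the table name foo from the example above. The function name insert_person and the ALLOWED_COLUMNS whitelist are made up for illustration. Column names cannot be passed as query parameters, so they are checked against the whitelist instead:

```python
import sqlite3

# Hypothetical whitelist of column names we are willing to insert into.
ALLOWED_COLUMNS = ('firstname', 'lastname', 'age', 'email')

def insert_person(conn, **kwargs):
    unknown = set(kwargs) - set(ALLOWED_COLUMNS)
    if unknown or not kwargs:
        raise ValueError('need one or more known columns, got: %s'
                         % ', '.join(sorted(kwargs)))
    keys = sorted(kwargs)
    # One '?' placeholder per value; the driver does the quoting.
    placeholders = ', '.join('?' for _ in keys)
    sql = 'INSERT INTO foo (%s) VALUES (%s);' % (', '.join(keys), placeholders)
    conn.execute(sql, [kwargs[k] for k in keys])
    return sql

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE foo (firstname TEXT, lastname TEXT, age TEXT, email TEXT)')

sql = insert_person(conn, firstname='Jake', age='15')
print(sql)  # INSERT INTO foo (age, firstname) VALUES (?, ?);

rows = conn.execute('SELECT firstname, lastname, age FROM foo').fetchall()
```

Columns that are not supplied simply end up as NULL, which mirrors the assumption made in the example above.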

Conclusion

As you’ve hopefully seen, simple dynamic concepts in Python such as parameter expansion allow us to do things more easily and cleanly than in other, less dynamic languages. In other languages you may have to do weird tricks to accomplish these things, or adjust your data structures and code to make such things easier, which might not always be possible if you’re using a third-party closed library.

We have to be careful when doing such dynamic things though. Many problems lurk just around the corner. If the data you’re using is untrusted (for instance, from a remote host or a web browser), this kind of dynamic programming might make it easier for your code to be exploited. Unreliable data may also cause errors in your code when it doesn’t contain what you expected it to contain. Remember to always sanitize your input! Another problem with dynamic coding is that it can make it harder to understand what is happening when looking at the code. Remember that a clear, easy to understand solution is better than a clever but hard to understand one.
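
One cheap safeguard is to drop unknown keys before expanding the dictionary, because a key that does not match a parameter name makes the call fail with a TypeError. A sketch, assuming the person() signature used throughout this article. Here person_line returns the line rather than printing it, and the names KNOWN_FIELDS and safe_kwargs are hypothetical:

```python
def person_line(firstname='John', lastname='Doe', age=20,
                email='unknown@example.com'):
    # Same signature as person() in the article, but returns the line.
    return '%s, %s(%s): %s' % (lastname, firstname, age, email)

KNOWN_FIELDS = ('firstname', 'lastname', 'age', 'email')

def safe_kwargs(data):
    # Keep only the keys that person_line() accepts.
    return dict((k, v) for k, v in data.items() if k in KNOWN_FIELDS)

untrusted = {'firstname': 'Jake', 'age': '15', 'is_admin': 'yes'}

try:
    person_line(**untrusted)
    failed = False
except TypeError:
    # 'is_admin' is not a parameter of person_line()
    failed = True

print(failed)                                  # True
print(person_line(**safe_kwargs(untrusted)))   # Doe, Jake(15): unknown@example.com
```

Filtering the keys only protects the call itself; the values still need to be checked before they are used for anything sensitive.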

I’ll see you in the next installment of this series of articles!
