Why Python Rocks I: Inline documentation

Sunday, June 22nd, 2008

Okay. So what’s cool about Python? I can’t count the number of times I’ve had to show skeptics why Python is cool, what Python can do that their favorite language can’t do. So I’m writing a bunch of articles showing off Python’s Awesome.

All articles in this series:

Why Python Rocks I: Inline documentation
Why Python Rocks II: Data structures
Why Python Rocks III: Parameter expansion

First up: documentation. I’m talking about inline documentation here: annotating modules, classes, methods, etc. Most languages have third-party tools that parse the source code and extract documentation from comments. This is nice, of course, but the comments get out of date and you have to regenerate the documentation each time they change. Different people use different documentation generators (Doxygen vs. PHPDoc, JSDoc vs. ScriptDoc, etc.) which, in turn, use different documentation standards, causing untold chaos in documentation even within the same language. You may have heard some code monkeys say “The code IS the documentation”. In Python, that’s actually not far from the truth. Let’s look at some of the things you can do with inline Python documentation.

Defining DocStrings

Python allows you to place strings in your module, class and method definitions. These strings then become the documentation for that piece of code. They’re not comments, they’re strings, so they actually stand out from the rest of the implementation, since strings aren’t found in those locations in most other languages. Most Python developers use triple quoting for their DocStrings (which is what they’re called). An example:

#!/usr/bin/env python

"""My Module"""

class Foo:
    """My Class"""

    def __init__(self):
        """My Constructor"""
        pass

    def add(self, a, b):
        """Return ``a`` + ``b``"""
        return a + b

There we go. As you can see, the DocString comes right after the declaration of the class or method (or, for a module, at the top of the file). If your editor has syntax highlighting, the DocStrings will probably stand out quite nicely against the rest of the code, so they’re easy to spot.

Accessing DocStrings with the interactive interpreter

Now there are tools which can extract this documentation, but we’ll get to that later on, because there is a much more interesting thing you can do with this documentation. If you’re not a Python programmer, this might be the right time to tell you about the interactive interpreter. Since Python is an interpreted language, it doesn’t need to see your whole program up front before it can run any of it. This makes it possible to have an interactive interpreter for Python which will just give you a prompt and let you type lines of code on-the-fly, which it will execute:

[user@jib]~$ python
Python 2.5.2 (r252:60911, Apr 21 2008, 11:12:42) 
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
>>> s = "Hello, World!"
>>> print s
Hello, World!
>>>

This lets you try out small snippets of Python code so you can check that the code actually does what you expect it to do. But that’s not all it’s useful for! You see, Python lets you extract DocStrings from the code programmatically! And here’s what that looks like:

>>> import module
>>> help(module)
Help on module module:

NAME
    module - My Module

FILE
    /home/user/module.py

CLASSES
    Foo
    
    class Foo
     |  My Class
     |  
     |  Methods defined here:
     |  
     |  __init__(self)
     |      My Constructor
     |  
     |  add(self, a, b)
     |      Return ``a`` + ``b``


>>> help(module.Foo)
Help on class Foo in module module:

class Foo
 |  My Class
 |  
 |  Methods defined here:
 |  
 |  __init__(self)
 |      My Constructor
 |  
 |  add(self, a, b)
 |      Return ``a`` + ``b``

>>> help(module.Foo.add)
Help on method add in module module:

add(self, a, b) unbound module.Foo method
    Return ``a`` + ``b``

But we can also do:

>>> module.Foo.add.__doc__
'Return ``a`` + ``b``'

This is incredibly useful if you quickly need to know how a certain class or method works. No need to keep switching to your browser to view generated documentation. No need to find the context of your code. If you’ve got a class X, it doesn’t matter whether that came from module ABC or module DEF or module org.omg.CosNaming.NamingContextExtPackage or module org.omg.CosNaming.NamingContextPackage. The class is the class, and the documentation you’re going to get when running help() on it will always be the right one.
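
The same lookup works from ordinary code, not just at the prompt. Here is a minimal sketch, written for Python 3 rather than the 2.5 sessions shown above; the summarize helper and this cut-down Foo are made up for illustration. It leans on inspect.getdoc from the standard library, which returns the DocString with its indentation cleaned up:

```python
import inspect


class Foo:
    """My Class"""

    def add(self, a, b):
        """
        Return ``a`` + ``b``.

        Works for anything that supports the + operator.
        """
        return a + b


def summarize(obj):
    """Return the first line of obj's DocString, or '' if it has none."""
    doc = inspect.getdoc(obj)  # None when there is no DocString
    return doc.splitlines()[0] if doc else ""


print(summarize(Foo))
print(summarize(Foo.add))
```

A helper like this is handy for building a quick one-line-per-method overview of a class without leaving the code.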

Generating documentation

So, we can view documentation straight from the interpreter. That’s neat, but you might also want to generate a static version of that documentation. An API reference. Unlike other languages, where you have to use some third-party tool (and you know everybody uses a different one of course) Python itself comes with a tool to generate documentation: pydoc. Let’s see it in action:

[user@jib]~$ pydoc -w ./module.py
wrote module.html

Rather easy, huh? Granted, the output doesn’t look too stunning (for an example, look here), but it’s quite usable.

Besides being able to generate static documentation from the code, pydoc can do something else too:

[user@jib]~$ pydoc --help
pydoc - the Python documentation tool
...
pydoc -p <port>
    Start an HTTP server on the given port on the local machine.

When we run pydoc -p 50000, it’ll give us an HTTP server on port 50000 which allows us to browse through the API documentation of every Python module installed on the system, as well as the API documentation for any Python code in the directory in which we started the server. Very convenient for finding out which modules are available on the system.
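
pydoc is also an ordinary module, so the text that help() prints can be captured as a string instead. A small sketch, assuming Python 3; pydoc.render_doc and pydoc.plain are part of the standard library, while the add function here is just a stand-alone version of the method from the first example:

```python
import pydoc


def add(a, b):
    """Return ``a`` + ``b``."""
    return a + b


# render_doc builds the same text help() would show; plain() strips the
# overstrike sequences pydoc uses for bold text on terminals.
text = pydoc.plain(pydoc.render_doc(add))
print(text)
```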

Test cases in your documentation

When writing documentation, it’s common to include an example to let developers know how they should call the code. If written in the right format (an interactive interpreter session, prompts and all), Python lets you reuse those examples as test cases for the object you’re documenting. Here’s an example.

Let’s say we define a function add, almost like the one in the example above, with the following DocString:

def add(a, b):
    """
    Return ``a`` + ``b``.

    Example:

    >>> add(5, 10)
    15
    """
    return a + b

Now we want to manually run the tests in the documentation. We’ll need the doctest module, which is the tool that will run our test cases.

>>> import doctest
>>> doctest.run_docstring_examples(add, None)
>>>

No output! That means our test cases ran successfully, and the tests passed. For a little more information, we can set the verbose flag to True:

>>> doctest.run_docstring_examples(add, None, verbose=True)
Finding tests in NoName
Trying:
    add(5, 10)
Expecting:
    15
ok

Nice huh? But that’s not all. When writing a module, we can make the module basically test itself using doctest:

#!/usr/bin/env python

"""My Module"""

class Foo:
    """My Class"""

    def add(self, a, b):
        """
        Return `a` + `b`

        Example:
        >>> f = Foo()
        >>> f.add(5, 15)
        21
        """
        return a + b

if __name__ == '__main__':
    import doctest
    doctest.testmod()

Okay, let’s review that example. We’ve defined a module with a class Foo which has a method add. In the DocString for the add method, we create an instance of the class and then tell it to add 5 and 15, expecting 21 (which, I believe, is incorrect). At the bottom of the file, we check the __name__ variable, which Python sets for every module. If a module is run directly (i.e. with python module.py), __name__ will be '__main__', so the tests will be run. Otherwise (if the module is being imported), __name__ will be the name of the module. This way, we can make our modules test themselves! Watch:

[user@jib]~$ python module.py
******************************************************************
File "module.py", line 14, in __main__.Foo.add
Failed example:
    f.add(5, 15)
Expected:
    21
Got:
    20
******************************************************************
1 items had failures:
   1 of   2 in __main__.Foo.add
***Test Failed*** 1 failures.

If we import the module, everything is fine, because the test cases are not being evaluated:

[user@jib]~$ python
>>> import module
>>>

Pretty awesome huh?
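
If you would rather have the numbers than the printed report, doctest’s building blocks can be driven by hand. The following is a sketch under a few assumptions: it targets Python 3, count_failures is a hypothetical helper name, and the failing example is planted on purpose, just like the one in the module. DocTestParser and DocTestRunner are the standard-library classes doing the work:

```python
import doctest


def add(a, b):
    """
    Return ``a`` + ``b``.

    >>> add(5, 10)
    15
    >>> add(5, 15)
    21
    """
    return a + b


def count_failures(func):
    """Run the examples in func's DocString; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # The examples see only the names we put in this globals dict.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, "<docstring>", 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    report = []  # failure reports are collected here instead of printed
    results = runner.run(test, out=report.append)
    return results.failed, results.attempted


print(count_failures(add))  # (1, 2)
```

The collected report list holds the same “Failed example” text doctest would normally print, so nothing is lost by running quietly.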

Well, that was a quick introduction to Python DocStrings and a couple of the things you can do with them. Of course this is only the beginning. Much more good can come from the extractable Python DocStrings. Remote XML-RPC method help, for example, is another thing which could be easily extracted from the Python DocStrings.

Drawbacks.. or not?

Now, you might think to yourself… “Okay, that’s neat and everything, but my API documentation generator allows me to specify, say, the types of parameters and return values. This doesn’t do that!”, and you’d be right; it doesn’t. While it would be trivial to implement (and some documentation generators for Python have, I believe), it would also be wrong. Why? Take a look at this scenario:

def add(a, b):
    """
    Return a + b

    @return int: a + b
    """
    return a + b

print type(add(5, 15))
print type(add(5.0, 15.0))
print type(add('a', 'b'))

This DocString says it will be returning a value of type int, right? The code below it prints out the actual types it’s returning. Its output:

<type 'int'>
<type 'float'>
<type 'str'>

That’s right. Python is dynamically typed, so there is basically no way to determine up-front what the resulting type of a + b will be. There will almost certainly be a case where your DocString will be wrong. And there’s nothing worse than wrong documentation.
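
For readers on Python 3, where print is a function and type() reports <class 'int'> rather than <type 'int'>, the same experiment can be sketched like this (result_types is just an illustrative name):

```python
def add(a, b):
    """Return a + b; the result type follows the argument types."""
    return a + b


# Same function, three different result types.
result_types = [
    type(add(x, y)).__name__
    for x, y in [(5, 15), (5.0, 15.0), ("a", "b")]
]
print(result_types)  # ['int', 'float', 'str']
```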

“Yes”, you say, “but what about the exceptions the code will throw? My API generator supports that, but DocStrings don’t”. That, again, is true. But, you have to ask yourself (again), is your API correct? The following example shows a deficiency in ‘Throws’ API remarks:

def foo():
    raise ValueError('Wrong value')

def bar():
    """
    @throws TypeError
    """
    foo()
    raise TypeError('Wrong type')

bar()

Which results in:

Traceback (most recent call last):
  File "/home/user/blaat.py", line 11, in <module>
    bar()
  File "/home/user/blaat.py", line 8, in bar
    foo()
  File "/home/user/blaat.py", line 2, in foo
    raise ValueError('Wrong value')
ValueError: Wrong value

Again, the DocString was incorrect. bar() can also throw any exception thrown by code called by bar() itself. In essence, the @throws is never complete. Of course, this has much more to do with encapsulation and bad usage of the whole Exception system. The code in bar() should theoretically catch all exceptions which are to be expected from code it calls itself, making @throws hints only useful for expected errors. But that gets us into the whole “Should exceptions be used for expected errors or not” debate, which I won’t discuss here. I guess my personal experience and opinion is that I’ve never found @throws documentation hints in scripting languages very useful.
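
To make the “catch what you expect” point concrete, here is one way bar() could keep its @throws promise. This is only a sketch, in Python 3 syntax (raise ... from is not available in the Python 2.5 used for the sessions in this post), and whether translating exceptions like this is a good idea is exactly the debate I’m sidestepping:

```python
def foo():
    raise ValueError("Wrong value")


def bar():
    """
    @throws TypeError
    """
    try:
        foo()
    except ValueError as exc:
        # Translate the expected error so the @throws hint stays true;
        # the original exception is kept as __cause__ for debugging.
        raise TypeError("Wrong type") from exc


try:
    bar()
except TypeError as exc:
    print(type(exc.__cause__).__name__)  # ValueError
```

Even then, the hint only covers the errors bar() knows to expect; anything else foo() raises still passes straight through.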

So, there we have it. One of the many, many things which makes Python cool. There will be more posts showing off Python’s cool, so stay tuned.
