Static Code Analizers for Python

Old-school developers remember lint, the static code analysis
tool for C programs. There are several similar programs available
for Python, and they can all help you clean up your act.

This month we continue examining Python development tools you have
told me you can’t live without. A fair number of you have mentioned
that you use a static analysis tool such as PyChecker, pylint, or PyFlakes. I have to admit, I
was a bit skeptical of how useful any of them would be with Python.
In a past life, when I used to write a lot of C, I used lint
occasionally. Unfortunately, it offered so many false positives,
especially when the X11 or Motif headers were included, that the
output frequently was useless. Eventually the Gnu compiler became
sophisticated (and prevalent) enough that I stopped using lint
altogether. But after looking into the code analysis tools available
for Python, I’m reconsidering that position.

The Test Program

A static analysis tool reads your source code without executing it
and looks for common mistakes. In C programs the types of things lint
found were usually bad pointer casts or array references. Since
Python is a dynamic language, there are different sorts of problems to
watch for. Common examples are redefining functions or methods,
overriding builtin names, and importing modules without using them.
Some of the tools even test your code against style guides (such as
those defined in the official Python style guide, PEP 8). These
are the sorts of common problems that are difficult to find unless you
have a very comprehensive test suite.

In order to compare the three tools I’ll be discussing this month, I
needed to write some sample code with known mistakes in it. I’m sure
I could have used some of my existing code, but I wanted to see how
the tools responded to pre-arranged situations. Listing 1 shows the
carefully crafted bad code I’ll be using for all of the tests. Take a
minute now to see how many of the problems you can spot yourself, then
compare your results with what the tools found.

Listing 1

#!/usr/bin/env python
# encoding: utf-8

import string

module_variable = 0

def functionName(self, int):
    local = 5 + 5
    module_variable = 5*5
    return module_variable

class my_class(object):
    def __init__(self, arg1, string):
        self.value = True

    def method1(self, str):
        self.s = str
        return self.value

    def method2(self):
        print 'How did we get here?'
    def method1(self):
        return self.value + 1
    method2 = method1
class my_subclass(my_class):
    def __init__(self, arg1, string):
        self.value = arg1


One of the oldest lint tools for Python is PyChecker, by Eric Newton,
John Shue, and Neal Norwitz. The last official release of PyChecker
was in 2006, but it looks like more recent work has been done in the
CVS repository (I tested version 0.8.17 for this article). Although
the project is only “lightly maintained”, many readers reported using
it, and the authors intend to provide a release with better Python 2.5
support soon.

I downloaded the tar.gz file from manually, and I was
able to install the program by unpacking the tarball and running
python install (in a fresh virtualenv environment, of
course). Once I had it installed, I ran it with the default settings
to produce this output:

$ pychecker
Processing Listing1...

Warnings... Imported module (string) not used Parameter (int) not used self is argument in function Local variable (local) not used Local variable (module_variable)
shadows global defined on line 8 Local variable (module_variable)
shadows global defined on line 8 Parameter (arg1) not used Parameter (string) not used Redefining attribute (method1)
original line (21) Parameter (string) not used

As you see, it found quite a few of the creative problems I inserted
into the test code. It did not warn me, however, that I was
overriding the builtin int() with the argument to my function on
line 10, or the imported module string with the argument to
__init__() on line 17. It caught one example of redefining a
method but not the second.

The help text for the program (accessed with the usual -h option)
indicates that there are quite a few checks not enabled by default.
Some of these include tests for unused class member variables,
unreachable code, and missing docstrings. Adding the –var
option, for example, exposes the unused module-level variable on line

Specifying options on the command line can be a bit cumbersome,
however, so PyChecker supports three other ways to specify
preferences. First, you can include a __pychecker__ string in
your code to enable or disable the options you want to use. The
second way to pass options to PyChecker is by using the PYCHECKER
environment variable using the same syntax as __pychecker__.

The third means of controlling the tests performed uses a
configuration file for site or project-wide parameters. By default
the file is $HOME/.pycheckerrc, and there is a command line option
to specify a separate file (if, for example, you want to include the
file in your version control system with your source code). The
.pycheckerrc config file uses Python syntax to set the options,
but the names may be different from the names used on the command line
(allVariablesUsed instead of var, for this example). The
–rcfile option prints out a complete set of the options given in
a format easy to capture and save as your configuration file.


The second program I looked at for this column is pylint, from a team
of developers organized through Logilab. The documentation for pylint
refers directly to PyChecker as a predecessor, but it claims to also
test code against a style guide or coding standard. pylint also
supports a plugin system for adding your own custom checks.

Version 0.14.0 of pylint depends on a few other libraries from
Logilab. All the links you need are available on the README page for
pylint. I tried installing the packages with easy_install, but
the results didn’t work, so I resorted to downloading the tarballs and
installing them manually. That did the trick, and I was able to
produce a nice report about my test code, the beginning of which
appears in Listing 2.

Listing 2

No config file found, using default configuration
************* Module Listing1
C:  1: Empty docstring
W:  6: Uses of a deprecated module 'string'
C:  8: Invalid name "module_variable" (should match (([A-Z_][A-Z1-9_]*)|(__.*__))$)
C: 10:functionName: Invalid name "functionName" (should match [a-z_][a-z0-9_]{2,30}$)
C: 10:functionName: Missing docstring
W: 10:functionName: Redefining built-in 'int'
W: 12:functionName: Redefining name 'module_variable' from outer scope (line 8)
W: 10:functionName: Unused argument 'int'
W: 10:functionName: Unused argument 'self'
W: 11:functionName: Unused variable 'local'
C: 15:my_class: Invalid name "my_class" (should match [A-Z_][a-zA-Z0-9]+$)
C: 15:my_class: Missing docstring
C: 22:my_class.method1: Invalid name "s" (should match [a-z_][a-z0-9_]{2,30}$)
W: 17:my_class.__init__: Redefining name 'string' from outer scope (line 6)
W: 17:my_class.__init__: Unused argument 'arg1'
W: 17:my_class.__init__: Unused argument 'string'
C: 21:my_class.method1: Missing docstring
W: 21:my_class.method1: Redefining built-in 'str'
C: 25:my_class.method2: Missing docstring
W: 27:my_class.method2: Unreachable code
R: 25:my_class.method2: Method could be a function
C: 29:my_class.method1: Missing docstring
E: 29:my_class.method1: method already defined line 21
W: 22:my_class.method1: Attribute 's' defined outside __init__
C: 33:my_subclass: Invalid name "my_subclass" (should match [A-Z_][a-zA-Z0-9]+$)
C: 33:my_subclass: Missing docstring
W: 35:my_subclass.__init__: Redefining name 'string' from outer scope (line 6)
W: 35:my_subclass.__init__: __init__ method from base class 'my_class' is not called
W: 35:my_subclass.__init__: Unused argument 'string'
W:  6: Unused import string

The first thing I noticed was the size and scope of the output
report produced. The full report was over 150 lines and included
several ASCII tables with statistics about the results (see Listing 3
for one example). Data for the previous and current runs are
available along with the difference, making it easy to track your
progress as you clean up your code. pylint identified almost all of
the same problems PyChecker did, and many it did not. The one warning
I see that PyChecker gave me that pylint did not is that my
module-level function uses an argument self but is not a method.

The first thing I noticed was the size and scope of the output
report produced.

Listing 3

Messages by category

|type       |number |previous |difference |
|convention |12     |NC       |NC         |
|refactor   |1      |NC       |NC         |
|warning    |16     |NC       |NC         |
|error      |1      |NC       |NC         |

An especially nice feature of pylint is that each check has an
assigned identifier, so it is easy to enable or disable that
particular warning in a consistent manner – no guessing about the
name to use in the config file. Simply specify –enable-msg or
–disable-msg and the message id. Given the sheer number of tests
performed by pylint, I can see that this consistency is going to be
key in setting it up in a meaningful way on any real project.

Since I approached this review with a jump-right-in attitude, all of
these impressions were formed before I had spent any time reading the
documentation delivered with the program, so that was my next step.
The docs provided cover a complete list of the different types of
warnings produced, how to enable/disable them and set other options,
interpret the report output, and all the other sorts of information
you need to really integrate the tool into your daily routine. There
are some holes, as you would expect from a work-in-progress, but the
basic information is there in more detail than for either of the other

In addition to enabling or disabling individual messages, there are a
wide range of command line options available for fine-grained control
over expectations for the tests performed. These range from regular
expressions to enforce naming conventions to various settings to watch
for “complexity” issues within classes and functions. It will be
interesting to see how those settings work out against some of my
older code.

As with PyChecker, pylint also supports setting options within your
source code and with a config file. Perhaps assuming a shared
development environment, it looks first in /etc/pylintrc and then
in $HOME/.pylintrc for settings. This lets a team install a
global configuration file on a development server, so everyone sees
the same options. To print the settings being used, use the
–generate-rcfile option. The output includes comments for each
option, so saving it to a file makes it easy for you start customizing
it to create your own specialized config file.


The last program I examined was PyFlakes from The
installation process for PyFlakes was the easiest of the three. After
a quick “easy_install PyFlakes”, I was up and running (yay!). The
experience after that was a bit of a letdown, though:

$ pyflakes 'string' imported but unused

It found almost none of the errors I was hoping it would identify.
The PyFlakes web site says there are two categories of errors

  • Names which are used but not defined or used before they are defined
  • Names which are redefined without having been used

In the case of this sample file, there are several unused names that
weren’t reported.

PyFlakes is much simpler than either pylint or PyChecker. There don’t
seem to be any command line options for controlling the tests that are
run. Running it with arguments that don’t refer to valid filenames
results in a traceback and error message.

The one feature of PyFlakes mentioned by the users who recommended it
is its speed. My test code is obviously too small to make any real
performance tests, but I have heard from several readers who use it in
conjunction with an IDE like PyDev to look for errors in the
background while they edit.


Both pylint and PyFlakes analyze your source but do not actually
import it. Everything they need they derive from the parse tree.
This gives them an advantage in situations where importing the code
might have undesirable side effects. I used the same approach in
HappyDoc for extracting documentation from Zope-related source code.

All of the tools I tested found some of the errors in the sample code,
but pylint was by far the most comprehensive. The PyChecker output
was more terse, and it doesn’t include the style checks that pylint
has, but that omission may itself constitute a feature for some users.

All of the tools I tested found some of the errors in the sample code.

Of the three tools, only PyFlakes installed correctly with
easy_install. That is annoying, but not a show-stopper for using the
other tools, especially given how much more comprehensive their output
is. All of the tools worked correctly when installed via,
which is certainly better than having to install them entirely by

For my own projects, I intend to continue looking into pylint for now.
Its consistent configuration and exhaustive reporting are appealing
for larger code bases such as I encounter at my day job.

Configuring these tools in your code is useful for suppressing false
positives or warnings you know it is safe to ignore. Use a
configuration file to enable checks you want applied to all of your
code. It is probably best to use a separate configuration file for
each project, since different projects will have different coding
standards and styles.

Next month I will continue this series by introducing you to more
tools to enhance your programming productivity. I haven’t decided on
the topic yet, so if you have a tip to share, feedback on something
I’ve written, or if there is anything you would like for me to cover
in this column, send a note with the details to doug dot hellmann at
pythonmagazine dot com and let me know. You can also add the link to
your account with the tag pymagdifferent, and I’ll see
it there.

Originally published in Python Magazine Volume 2 Issue 3 , March, 2008

Automated Testing with unittest and Proctor

Originally published in Python Magazine Volume 2 Issue 3 , March, 2008

Automated testing is an important part of Agile development
methodologies, and the practice is seeing increasing adoption even in
environments where other Agile tools are not used. This article
discusses testing techniques for you to use with the open source tool
Proctor. By using Proctor, you will not only manage your automated
test suite more effectively, but you will also obtain better results
in the process.

What is Proctor?

Proctor is a tool for running automated tests in Python source
code. It scans input source files looking for classes based on the
TestCase class from the unittest module in the Python standard
library. You can use arbitrary organization schemes for tests defined
in separate source modules or directories by applying user defined
categories to test classes. Proctor constructs test suites dynamically
at run time based on your categories, making it easy to run a subset
of the tests even if they are not in the same location on the
filesystem. Proctor has been specifically designed to operate on a
large number of tests (more than 3500 at one site). Although it
depends on the unittest module, Proctor is also ideally suited for use
with integration or higher level tests, because it is easy to
configure it to run unattended.


Proctor uses the standard distutils module tools for installation
support. If you have previously installed easy_install, using it
is the simplest way to install packages such as Proctor that are
listed in the Python Package Index.

$ sudo easy_install Proctor

Running easy_install will download and install the most recent
version by default. If you do not have easy_install, download the
latest version of the Proctor source code from the home page (see the
references list for this article), then install it as you would any
other Python package:

$ tar zxvf Proctor-1.2.tar.gz
$ cd Proctor-1.2
$ sudo python install

Once Proctor is installed, you will find a command line program,
proctorbatch, in your shell execution path. Listing 1 shows the
command syntax for proctorbatch. I will examine the command line
options in more detail throughout the rest of this article using a few
simple tests.

Listing 1


    Proctor is a tool for running unit tests.  It enhances the
    existing unittest module to provide the ability to find all tests
    in a set of code, categorize them, and run some or all of them.
    Test output may be generated in a variety of formats to support
    parsing by another tool or simple, nicely formatted, reports for
    human review.


    proctorbatch [<options>] [<directory name> ...]



    -h             Displays abbreviated help message.

    --help         Displays complete usage information.

                   Run only the tests in the specified category.

                   Warning: If there are no tests in a category,
                   an error will not be produced.  The test suite
                   will appear to be empty.

                   Add a line exclude pattern
                   (can be a regular expression).

                   Write coverage statistics to the specified file.

    --debug        Turn on debug mode to see tracebacks.

    --interleaved  Interleave error and failure messages
                   with the test list.

    --list         List tests.

                   List test categories.

    --no-coverage  Disable coverage analysis.

    --no-gc        Disable garbage collection and leak reporting.

    --no-run       Do not run the tests

    --parsable     Format output to make it easier to parse.

    -q             Turn on quiet mode.

    -v             Increment the verbose level.
                   Higher levels are more verbose.
                   The default is 1.

Sample Tests and Standard unittest Features

The simplest sample set of test cases needs to include at least three
tests: one to pass, one to fail, and one to raise an exception
indicating an error. For this example, I have separated the tests into
three classes and provided two test methods on each class. Listing 2
shows the code to define the tests, including the standard
unittest boilerplate code for running them directly.

Listing 2

#!/usr/bin/env python
# Sample tests for exercising Proctor.

import unittest

class PassingTests(unittest.TestCase):

    def test1(self):

    def test2(self):

class FailingTests(unittest.TestCase):

    def test1(self):'Always fails 1')

    def test2(self):'Always fails 2')

class ErrorTests(unittest.TestCase):

    def test1(self):
        raise RuntimeError('test1 error')

    def test2(self):
        raise RuntimeError('test2 error')

if __name__ == '__main__': # pragma: no cover

When python is run, it invokes the unittest module’s
main() function. As main() runs, the standard test loader is
used to find tests in the current module, and all of the discovered
tests are executed one after the other. It is also possible to name
individual tests or test classes to be run using arguments on the
command line. For example, python PassingTests runs
both of the tests in the PassingTests class. This standard
behavior is provided by the unittest module and is useful if you know
where the tests you want to run are located in your code base.

It is also possible to organize tests from different classes into
“suites”. You can create the suites using any criteria you like –
themes, feature areas, level of abstraction, specific bugs, etc. For
example, this code sets up a suite containing two test cases:

import unittest
from Listing2  import *

suite1 = unittest.TestSuite([PassingTests('test1'), FailingTests('test1')])

When run, the above code would execute the tests
PassingTests.test1 and FailingTests.test1, since those are
explicitly included in the suite. The trouble with creating test
suites in this manner is that you have to maintain them by hand. Any
time new tests are created or obsolete tests are removed, the suite
definitions must be updated as well. This may not be a lot of work for
small projects, but as project size and test coverage increases, the
extra work can become unmanageable very quickly.

Proctor was developed to make working with tests across classes,
modules, and directories easier by eliminating the manual effort
involved in building suites of related tests. Over the course of
several years, the set of automated tests we have written for my
company’s product has grown to contain over 3500 individual tests.

Our code is organized around functional areas, with user interface and
back-end code separated into different modules and packages. In order
to run the automated tests for all aspects of a specific feature, a
developer may need to run tests in several modules from different
directories in their sandbox. By building on the standard library
features of unittest, Proctor makes it easy to manage all of the tests
no matter where they are located in the source tree.

Running Tests with Proctor

The first improvement Proctor makes over the standard unittest test
loader is that Proctor can scan multiple files to find all of the
tests, then run them in a single batch. Each Python module specified
on the command line is imported, one at a time. After a module is
loaded, it is scanned for classes derived from unittest.TestCase,
just as with the standard test loader. All of the tests are added to a
test suite, and when the scanner is finished loading the test modules,
all of the tests in the suite are run.

For example, to scan all Python files in the installed version of
proctorlib for tests you would run:

$ cd /usr/lib/python2.5/site-packages
$ proctorbatch proctorlib/*.py

Proctor also accepts directory names as arguments, so the command can
be written:

$ cd /usr/lib/python2.5/site-packages
$ proctorbatch proctorlib

Proctor will search recursively down through any directories given to
find all of the tests in any modules in subdirectories. The file or
directory names are converted to importable package names, so that
directory/ is imported as directory.file. If your code
is organized under a single Python package, and you wish to run all of
the tests in the package, you only need to specify the root directory
for that package.

Expanding Automated Testing Beyond Unit Tests

Tests are often categorized by developers based on the scope of the
functionality being tested as either “unit” or “integration” tests. A
unit test is usually a very low level test that requires few or no
external resources and verifies the functionality of an isolated
feature (such as a single method of a single class). An integration
test, by contrast, depends on the interaction of several classes or
instances, and is designed to ensure that the API between the objects
works as expected. For example, an integration test might verify that
an ORM library stores data in the expected tables, or that temporary
files are managed correctly when some filesystem actions are

At the company where I work, we use the unittest framework found in
the Python standard library for all of our automated unit and
integration tests. It is convenient for us to use a single framework,
because it means the developers only have to manage one set of test
tools. Another benefit is the nightly batch job that runs the
integration tests also includes all of the unit tests at the same
time. By running the unit and integration tests automatically every
night, we can identify regression errors we might not have otherwise
detected until later in the testing cycle. The integration tests use
database and filesystem resources created as fixtures by the test
setUp and tearDown hooks. Our developers can run unit-level
tests directly with unittest.main(), or test entire source
packages with Proctor. The code for integration tests may be mingled
in the same modules with the unit tests or in separate modules,
depending on how the developer responsible for the area of code in
question has it organized.

Some of the tests we have written need to use hardware that may not
always be available in the test environment. We write automated tests
for all of the “device driver” modules that are used to integrate
infrastructure devices such as network switches, load balancers, and
storage arrays with our product. These tests typically require an
actual device for the test to run successfully, since the tests
reconfigure it and then verify the results match what is
expected. This situation poses a problem, since the test equipment is
not always present in every test environment. Sometimes we only have
one compatible device. At other times, a device was on loan and so may
have been returned to the original vendor after the driver was
finished. In both of these cases, the test setup code will not be able
to find the equipment required for testing. Under these circumstances,
the tests will produce an error every time they run, and it is useful
to be able to skip over them and thus avoid false alarms and wasted
test time.

Proctor solves this problem with a flag that causes it to ignore the
tests in a module. To tell Proctor to ignore a specific module, add
the module level variable __proctor_ignore_module__ in the
source. Listing 3 shows an example with this flag set. Proctor still
imports the module, but when it sees the flag set to True, it does
not scan the contents of the module for tests. When the resource
needed by the tests becomes available in our lab and it is time to
test a device driver, we simply run the test file directly instead of
using Proctor.

Listing 3

#!/usr/bin/env python
# The tests in this module are ignored by Proctor

import unittest

# Tell Proctor to ignore the tests in this module.
__proctor_ignore_module__ = True

class IgnoredTest(unittest.TestCase):

    def testShouldNotBeRun(self):'This test will not be run by Proctor')

if __name__ == '__main__':
    # If this file is run directly, the tests are still executed.

Some of our other tests use resources available when the tests are run
on a developer workstation, but not when they are run as part of a
nightly batch job. For example, some portions of the graphical user
interface for our product have automated tests, but since it is an X
Windows application, those tests cannot be run without an X Windows
server, which is not present on the automated test server. Since all
of the GUI code is in one directory, it is easier to instruct Proctor
to ignore all of the modules in that directory instead of setting the
ignore flag separately for each file.

Proctor supports ignoring entire directories through a configuration
file named .proctor. The configuration file in each directory can
be used to specify modules or subdirectories to be ignored by Proctor
when scanning for tests. The files or directories specified in the
ignore variable are not imported at all, so if importing some
modules would fail without resources like an X Windows server
available, you can use a .proctor file as a more effective method
of ignoring them rather than setting the ignore flag inside the
source. All of the file or directory names in the ignore list are
relative to the directory containing the configuration file. For
example, to ignore the productname.gui package, create a file in
the productname directory containing ignore = [“gui”], like

# Proctor instructions file ".proctor"

# Importing the gui module requires an X server,
# which is not available for the nightly test batch job.
ignore = [ 'gui' ]

The .proctor file uses Python syntax and can contain any legal
Python code. This means you can use modules such as os and
glob to build up the list of files to be ignored, following any
rules you want to establish. Here is a more sophisticated example
which only disables the GUI tests if it cannot find the X server they

import os

ignore = []
if os.environ.get('DISPLAY') is None:

Organizing Tests Beyond Classes and Modules

The usual way to organize related test functions is by placing them
together in the same class, and then by placing related classes
together in the same module. Such a neat organizational scheme is not
always possible, however, and related tests might be in different
modules or even in different directories. Sometimes, test modules grow
too large and need to be broken up so they are easier to maintain. In
other cases, when a feature is implemented, different aspects of the
code may be spread among files in multiple source directories,
reflecting the different layers of the application. Proctor can use
test categories to dynamically construct a test suite of related tests
without requiring the test authors to know about all of the tests in
advance or to update a test suite manually.

Proctor uses simple string identifiers as test categories, much like
the tags commonly found in a Web 2.0 application. It is easy to add
categories to your existing tests by setting the class attribute
PROCTOR_TEST_CATEGORIES to a sequence of strings; no special base
class is needed. Then tell proctorbatch to limit the test suite to
tests in specific category using the –category option.

Using proctorbatch

Listing 4 shows some new test classes with categories that are useful
as examples to demonstrate how the command line options to
proctorbatch work. The first class, FeatureOneTests, is
categorized as being related to “feature1”. The tests in the second
class, FeatureOneAndTwoTests, are categorized as being related to
both “feature1” and “feature2”, representing a set of integration
level tests verifying the interface between the two features. The
UncategorizedTests class is not included in any category. Now that
the test classes are defined, I will show how to use proctorbatch
to work with them in a variety of ways.

Listing 4

#!/usr/bin/env python
# Categorized tests.

import unittest

class FeatureOneTests(unittest.TestCase):
    "Unit tests for feature1"

    PROCTOR_TEST_CATEGORIES = ( 'feature1',)

    def test(self):

class FeatureOneAndTwoTests(unittest.TestCase):
    "Integration tests for feature1 and feature2"

    PROCTOR_TEST_CATEGORIES = ( 'feature1', 'feature2', )

    def test1(self):

    def test2(self):

class UncategorizedTests(unittest.TestCase):
    "Not in any category"

    def test(self):

if __name__ == '__main__':

Proctor provides several command line options that are useful for
examining a test suite without actually running the tests.

Proctor provides several command line options that are useful for
examining a test suite without actually running the tests. To print a
list of the categories for all tests in the module, use the
–list-categories option:

$ proctorbatch -q --list-categories

The output is an alphabetical listing of all of the test category
names for all of the tests found in the input files. Proctor creates
two categories automatically every time it is run. The category named
“All” contains every test discovered. The “Unspecified” category
includes any test that does not have a specific category, making it
easy to find uncategorized tests when a test set starts to become
unwieldy or more complex. When a test class does not have any
categories defined, its tests are run when no –category option is
specified on the command line to proctorbatch, or when the “All”
category is used (the default).

To examine the test set to see which tests are present, use the
–list option instead:

$ proctorbatch -q --list
test: Listing4.FeatureOneAndTwoTests.test1
test: Listing4.FeatureOneAndTwoTests.test2
test: Listing4.FeatureOneTests.test
test: Listing4.UncategorizedTests.test

And to see the tests only for a specific category, use the
–category and –list options together:

$ proctorbatch -q --list --category feature2
test: Listing4.FeatureOneAndTwoTests.test1
test: Listing4.FeatureOneAndTwoTests.test2

To see the list of uncategorized tests, use the category

$ proctorbatch -q --list --category Unspecified
test: Listing4.UncategorizedTests.test

After verifying that a category includes the right tests, to run the
tests in the category, use the –category option without the
–list option:

$ proctorbatch --category feature2
Writing coverage output to .coverage
Scanning: .
test1 (test: Listing4.FeatureOneAndTwoTests) ... ok
test2 (test: Listing4.FeatureOneAndTwoTests) ... ok

Ran 2 tests in 0.002s


Identifying Test Categories

While test category names can hold any meaning you want to give them,
over time I have found that using broad categories is more desirable
than using narrowly defined categories. When a category is too
narrowly focused, the tests are more likely to be in the same module
or directory anyway. In that case, there is not as much purpose to be
served by defining the category, since it is easy enough to just run
the tests in that file or directory.

When using a broad category, it is more likely that the tests involved
will span multiple directories. At that point, having a single
category to encompass them becomes a useful way to consolidate the
tests. Suppose, for example, there is an application that
authenticates a user before allowing an action. It has a User
class to manage users and verify their credentials. It also has a
command line interface that depends on the User class to perform
authentication. There are unit tests for methods of the User
class, and integration tests to ensure that authentication works
properly in the command line program. Since the command line program
is unlikely to be in the same section of the source tree as the
low-level module containing the User class, it would be beneficial
to define a test category for “authentication” tests so all of the
related tests can be run together.

These sorts of broad categories are also useful when a feature
involves many aspects of a system at the same level. For example, when
the user edits data through a web application, the user
authentication, session management, cookie handling, and database
aspects might all be involved at different points. A “login” category
could be applied to unit tests from each aspect, so the tests can be
run individually or as a group. Adding categories makes it immensely
easier to run the right tests to identify regression errors when
changes could affect multiple areas of a large application.

Monitoring Test Progress By Hand

Proctor accepts several command line options to control the format of
the output of your test run, depending on your preference or need.
The default output format uses the same style as the unittest test
runner. The verbosity level is set to 1 by default, so the full
names of all tests are printed along with the test outcome. To see
only the pass/fail status for the tests, reduce the verbosity level by
using the -q option. See Listing 5 for an example of the default

Listing 5

$ proctorbatch
Writing coverage output to .coverage
Scanning: .
test1 (test: Listing2.FailingTests) ... FAIL
test2 (test: Listing2.FailingTests) ... FAIL
test1 (test: Listing2.PassingTests) ... ok
test2 (test: Listing2.PassingTests) ... ok

FAIL: test1 (test: Listing2.FailingTests)
Traceback (most recent call last):
  File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/", line 17, in test1'Always fails 1')
AssertionError: Always fails 1

FAIL: test2 (test: Listing2.FailingTests)
Traceback (most recent call last):
  File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/", line 21, in test2'Always fails 2')
AssertionError: Always fails 2

Ran 4 tests in 0.006s

FAILED (failures=2)

$ proctorbatch  -q
FAIL: test1 (test: Listing2.FailingTests)
Traceback (most recent call last):
  File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/", line 17, in test1'Always fails 1')
AssertionError: Always fails 1

FAIL: test2 (test: Listing2.FailingTests)
Traceback (most recent call last):
  File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/", line 21, in test2'Always fails 2')
AssertionError: Always fails 2

Ran 4 tests in 0.007s

FAILED (failures=2)

When using the default format, Proctor does not print any failure or
error messages until all of the tests have run. If your test suite is
very large, or the integration tests require fixtures that take a lot
of time to configure, you may not want to wait for the tests to finish
before discovering which tests have not passed. When that is the case,
you can use the –interleaved option to show the tests results
along with the name of the test as the test runs, as illustrated in
Listing 6.

Listing 6

$ proctorbatch  --interleaved --no-gc
Writing coverage output to .coverage
Scanning: .
  1/  6 test: Listing2.ErrorTests.test1 ...ERROR in test: Listing2.ErrorTests.test1
Traceback (most recent call last):
  File "", line 27, in test1
    raise RuntimeError('test1 error')
RuntimeError: test1 error

  2/  6 test: Listing2.ErrorTests.test2 ...ERROR in test: Listing2.ErrorTests.test2
Traceback (most recent call last):
  File "", line 30, in test2
    raise RuntimeError('test2 error')
RuntimeError: test2 error

  3/  6 test: Listing2.FailingTests.test1 ...FAIL in test: Listing2.FailingTests.test1
Traceback (most recent call last):
  File "", line 17, in test1'Always fails 1')
AssertionError: Always fails 1

  4/  6 test: Listing2.FailingTests.test2 ...FAIL in test: Listing2.FailingTests.test2
Traceback (most recent call last):
  File "", line 21, in test2'Always fails 2')
AssertionError: Always fails 2

  5/  6 test: Listing2.PassingTests.test1 ...ok
  6/  6 test: Listing2.PassingTests.test2 ...ok

Ran 6 tests in 0.013s

FAILED (failures=2, errors=2)

Automatic Test Output Processing

For especially large test runs, or if you are committed to more
complete test automation, you may not want to examine the test results
by hand at all. Proctor can also produce a simple parsable output
format suitable for automatic processing. The output format can be
processed by another program to summarize the results or even open
tickets in your defect tracking system. To have Proctor report the
test results in this format, pass the –parsable option to
proctorbatch on the command line. Listing 7 includes a sample of
the parsable output format.

Listing 7

$ proctorbatch  --parsable --no-gc
Writing coverage output to .coverage
Scanning: .
__PROCTOR__ Start run
__PROCTOR__ Start test
test: Listing2.ErrorTests.test1
Traceback (most recent call last):
  File "", line 27, in test1
    raise RuntimeError('test1 error')
RuntimeError: test1 error

__PROCTOR__ Start results
ERROR in test: Listing2.ErrorTests.test1
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  1/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.ErrorTests.test2
Traceback (most recent call last):
  File "", line 30, in test2
    raise RuntimeError('test2 error')
RuntimeError: test2 error

__PROCTOR__ Start results
ERROR in test: Listing2.ErrorTests.test2
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  2/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.FailingTests.test1
Traceback (most recent call last):
  File "", line 17, in test1'Always fails 1')
AssertionError: Always fails 1
__PROCTOR__ Start results
FAIL in test: Listing2.FailingTests.test1
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  3/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.FailingTests.test2
Traceback (most recent call last):
  File "", line 21, in test2'Always fails 2')
AssertionError: Always fails 2
__PROCTOR__ Start results
FAIL in test: Listing2.FailingTests.test2
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  4/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.PassingTests.test1
__PROCTOR__ Start results
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  5/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.PassingTests.test2
__PROCTOR__ Start results
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  6/  6
__PROCTOR__ End progress
__PROCTOR__ End run
__PROCTOR__ Start summary
Failures: 2
Errors: 2
Successes: 2
Tests: 6
Elapsed time (sec): 0.014
Status: FAILED
__PROCTOR__ End summary

Since the test results may be part of a larger log file that includes
other information such as build output and installation messages,
Proctor uses easily identifiable delimiters to separate the sections
in its output. Each delimiter appears on a line by itself, and begins
with __PROCTOR__ to make it less likely that the output of any
other program will be misinterpreted as test output.

Proctor assumes there is no need to automatically process the output
of the scanning phase, so the first delimiter (__PROCTOR__ Start
) is printed at the beginning of the test execution phase. The
string __PROCTOR__ Start test appears at the beginning of each
test, followed on the next line by the name of the test. Any output
produced by the test appears beginning on the line immediately
following the name. The test output is followed by a traceback, if the
test does not pass.

The text between the __PROCTOR__ Start results and __PROCTOR__
End results
delimiters always begins with one of ok, ERROR,
or FAIL, depending on the outcome of the test. If the test did not
pass, the rest of the text in the results section consists of the full
name of the test. The string __PROCTOR__ End test follows each
test result. Between the results for each test, a progress section
shows the current test number and the total number of tests being run.

Proctor comes with proctorfilter, a simple command line program to
process a log file and print the names of tests with certain status
codes. It accepts three command line options, –ok, –error,
and –fail, to control which tests are listed in the output. For
example, to find the tests which failed in the sample output, run:

$ proctorfilter --fail Listing7.txt
test: Listing2.FailingTests.test1: FAIL
test: Listing2.FailingTests.test2: FAIL

The default behavior for proctorfilter, when no command line
options are given, is to print a list of tests that either had an
error or failed.

Building Your Own Results Parser

Using proctorfilter to summarize a set of test results is only one
way to automate results processing for your tests. Another way to
handle the test results is to open a new ticket in a bug tracking
system for each test that does not pass during the nightly test
run. When the ticket is opened, it should include all of the
information available, including output from the test and the
traceback from the failure or error. Although proctorfilter does
not include all of that information, the Proctor library also includes
a result module, with classes useful for building your own test
result processing program.

Listing 8 shows a sample program that recreates the default
functionality of proctorfilter using the proctorlib.result
module. The ResultFactory class parses input text passed to
feed() and creates TestResult instances. Each time a complete
test result has been fed in, a new TestResult is constructed and
passed as an argument to the callback given to the ResultFactory
constructor. In the sample program, the callback function
show_test_result() looks at the status code for the test before
deciding whether to print out the summary.

Listing 8

#!/usr/bin/env python
# Print a list of tests which did not pass.

import fileinput
from proctorlib.result import ResultFactory, TestResult

def show_test_result(test_result):
    "Called for each test result parsed from the input data."
    if not test_result.passed():
        print test_result

# Set up the parser
parser = ResultFactory(show_test_result)

# Process data from stdin or files named via sys.argv
for line in fileinput.input():

A TestResult instance has several attributes of interest. The
name attribute uniquely identifies the test. The name includes the
full import path for the module, as well as the class and method name
of the test. The output attribute includes all of the text
appearing between the __PROCTOR__ Start test and __PROCTOR__
Start results
delimiters, including the traceback, if any. The
result attribute includes the full text from between __PROCTOR__
Start results
and __PROCTOR__ End results, while status
contains only the status code. The status will be the same as one of
TestResult.OK, TestResult.ERROR, or TestResult.FAIL. The
passed() method returns True if the test status is
TestResult.OK and False otherwise.

Code Coverage

At the same time it is running the automated tests, Proctor uses Ned
Batchelder’s coverage module to collect information about which
statements in the source files are actually executed. The code
coverage statistics gathered by coverage can be used to identify areas
of the code that need to have more automated tests written.

By default, proctorbatch writes the code coverage statistics to
the file ./.coverage. Use the –coverage-file option to change
the filename used. To disable coverage statistics entirely, use the
–no-coverage option.

Statistics are normally collected for every line of the source being
run. Some lines should not be included in the statistics, though, if
the code includes debugging sections that are disabled while the tests
are running. In that case, use the –coverage-exclude option to
specify regular expressions to be compared against the source code.
If the source matches the pattern, the line is not included in the
statistics counts. To disable checking for lines that match the
pattern if DEBUG:, for example, add –coverage-exclude=”if
to the command line. The –coverage-exclude option can
be repeated for each pattern to be ignored.

Once the test run is complete, use to produce a report
with information about the portions of the code that were not executed
and the percentage that was. For example, in the following listing the
return statements in the test methods of the FailingTests
class from Listing 2 are never executed. They were skipped because
both of the tests fail before reaching the end of the function.

$ -r -m
Name       Stmts   Exec  Cover   Missing
Listing2      18     16    88%   18, 22

Refer to the documentation provided by –help for more
information on how to print code coverage reports.

Garbage Collection

Proctor can also be used to help identify the source of memory
leaks. When using the interleaved or parsable output formats, Proctor
uses the gc module functions for garbage collection to report on
objects that have not been cleaned up.

Listing 9 defines a test that introduces a circular reference between
two lists, a and b, by appending each to the other. Normally,
when processing leaves a function’s scope, the local variables are
marked so they can be deleted and their memory reclaimed. In this
case, however, since both lists are still referenced from an object
that has not been deleted, the lists are not automatically cleaned up
when the test function returns. The gc standard library module
includes an interface to discover uncollected garbage objects like
these lists, and Proctor includes a garbage collection report in the
output for each test, as in Listing 10. The garbage collection
information can be used to determine which test was being run when the
memory leaked, and then to narrow down the source of the leak.

Listing 9

#!/usr/bin/env python
# Test code with circular reference to illustrate garbage collection

import unittest

class CircularReferenceTest(unittest.TestCase):

    def test(self):
        a = []
        b = []

Listing 10

$  proctorbatch --interleaved
Writing coverage output to .coverage
Scanning: .
  0/  1 test: Listing9.CircularReferenceTest.test ...ok
GC: Collecting...
GC: Garbage objects:
<type 'list'>
<type 'list'>

Ran 1 tests in 0.180s



Automated testing is perhaps one of the biggest productivity
enhancements to come out of the Agile development movement. Even if
you are not doing Test Driven Development, using automated testing to
identify regression errors can provide great peace of mind. The basic
tools provided in the Python standard library do support automated
testing, but they tend to be targeted at library or module developers
rather than large scale projects. I hope this introduction to Proctor
has suggested a few new ideas for expanding your own use of automated
tests, and for managing those tests as your project size and scope

I would like to offer a special thanks to Ned Batchelder for his help
with integrating and Proctor, and Mrs. PyMOTW for her help
editing this article.

IPython and virtualenv

Originally published in Python Magazine Volume 2 Issue 2 , February,

IPython is a feature-rich interactive shell for Python
developers. Virtualenv creates isolated development environments so
you can test or install packages without introducing conflicts. This
month, Doug examines how both tools can make your life a little

Last month, around the time I started working on my January column, I
posted to my blog asking readers to tell me about their favorite
Python development tools. I received several good tips from a variety
of developers. While some of the responses were scattered all over the
map, there were two tools that stood out from the rest with their
popularity. The overwhelming favorite of all commenters was
IPython, an alternate interactive shell. I had heard about IPython
previously from co-workers, but never really looked at it very
closely. After the responses to my post, however, I decided I needed
to give it a more serious review. The other tool mentioned several
times was virtualenv, a “Virtual Python Environment builder”. I
was already a virtualenv user, so I was pleased to see it mentioned.
Let’s look at virtualenv first, and then use it to test IPython.


Ian Bicking’s virtualenv creates a “private” Python environment, or
sandbox, complete with your own interpreter and site-packages
directory. Having a private sandbox like this is useful in situations
where you don’t have root access to install packages into the global
site-packages area, or where you want to test newer versions of
modules without corrupting your development system.

When you run virtualenv, it sets up a fresh sandbox version of Python
in a directory you specify by copying or linking files from your
default installation to create new bin and lib
directories. The sys.path for the new environment is configured
based on options you provide when you create it. By default, the
original Python environment is included at the end of the search path.
This configuration allows you to share common modules in a global
site-packages directory, so they can be used in several projects.
Listing 1 shows some basic details from a default virtualenv
environment. As you can see starting on line 31, the sys.path has
the user library directories before the global versions of those same
directories and the local lib/python2.5 comes before the global
lib/python2.5, etc.

Listing 1

$ pwd

$ virtualenv default
New python executable in default/bin/python
Installing setuptools....................done.

$ find default -type d

$ ls -l default/bin
total 96
-rw-r--r--  1 dhellmann  501   1213 Jan 31 09:30 activate
-rwxr-xr-x  1 dhellmann  501    329 Jan 31 09:30 easy_install
-rwxr-xr-x  1 dhellmann  501    337 Jan 31 09:30 easy_install-2.5
-rwxr-xr-x  1 dhellmann  501  30028 Jan 31 09:30 python
lrwxr-xr-x  1 dhellmann  501      6 Jan 31 09:30 python2.5 -> python

$ default/bin/python
Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.prefix
>>> pprint.pprint(sys.path)

Alternately, if you specify –no-site-packages when creating the
new environment, you end up with a completely stand-alone environment
with no reference back to the global site-packages. Listing 2
shows the search path for a private environment without the global
site-packages library directories. Notice that although the global
standard library directory is still included, third-party libraries
are supposed to go into site-packages, so that should not cause
any confusion or conflict when importing modules.

Listing 2

$ virtualenv --no-site-packages private
New python executable in private/bin/python
Installing setuptools............done.

$ private/bin/python
Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.prefix
>>> import pprint
>>> pprint.pprint(sys.path)

In both examples so far, I have run the local version of the
interpreter directly so I could examine the module search path. The
sandbox created by virtualenv also includes a simple script called
activate to set up your current shell to use the files in that
sandbox by default. When you source activate, it sets up several
environment variables in your current shell. For example,
VIRTUAL_ENV is set to point to the root of the sandbox, the
directory you specified when you created it. Your shell PATH
variable is also updated to start with the sandbox’s bin
directory. This makes the sandbox version of the interpreter the
default, so when you run scripts they use the libraries from the
sandbox. Any commands you install with easy_install also end up in
$VIRTUAL_ENV/bin, making it easier to run those programs since you
don’t have to remember to type the full path to the program. As a
reminder that you are using a custom environment, your command prompt
also changes to show the name of the virtual environment sandbox.

$ source default/bin/activate
(default)$ echo $VIRTUAL_ENV
(default)$ which python

When I started as Technical Editor here at Python Magazine, I had
some concerns about installing all of the dependencies I would need to
have in order to test code associated with articles submitted for
review by our authors. I knew some articles would have conflicting
requirements, and managing those packages was going to be a bit of a
hassle. Originally, I planned to set up a virtual machine so I could
at least isolate code for the magazine from my own development
environment, and wiping it clean between issues (or even articles)
would be straightforward. Using virtualenv has turned out to be so
much easier that I haven’t bothered with the VM. I typically create a
separate environment for each article in a sandbox, verify the
dependencies listed by the author along with their code, then delete
the entire sandbox. No muss, no fuss.


IPython is an enhanced interactive shell, intended to be used as a
replacement for the standard interactive Python interpreter. Started
by Fernando Pérez as a set of enhancements to the basic interpreter
prompt, it has grown to include a host of other powerful features and
contributions from several developers. Many of the features of IPython
derive from its background in scientific computing, but it can be
useful outside of scientific fields for anyone who works with Python
as well.

The installation instructions have the usual steps (note the use of
sudo to perform the final step with admin privileges):

$ tar -xvzf ipython-0.7.3.tar.gz
$ cd ipython-0.7.3
$ python build
$ sudo python install

To illustrate how I usually review contributions for this column,
though, I decided to use easy_install, even though it wasn’t
listed as an option, and virtualenv. Listing 3 shows how I created the
new sandbox and then installed IPython into it.

Listing 3

$ virtualenv ipython
New python executable in ipython/bin/python
Installing setuptools............done.
$ source ipython/bin/activate

(ipython)$ cd $VIRTUAL_ENV
(ipython)$ ls
bin     lib

(ipython)$ easy_install ipython
Searching for ipython
Best match: ipython 0.8.2
Processing ipython-0.8.2-py2.5.egg
creating /Users/dhellmann/PythonMagazine/PythonEnv/ipython/lib/python2.5/site-packages/ipython-0.8.2-py2.5.egg
Extracting ipython-0.8.2-py2.5.egg to /Users/dhellmann/PythonMagazine/PythonEnv/ipython/lib/python2.5/site-packages
Adding ipython 0.8.2 to easy-install.pth file
Installing ipython script to /Users/dhellmann/PythonMagazine/PythonEnv/ipython/bin
Installing pycolor script to /Users/dhellmann/PythonMagazine/PythonEnv/ipython/bin

Installed /Users/dhellmann/PythonMagazine/PythonEnv/ipython/lib/python2.5/site-packages/ipython-0.8.2-py2.5.egg
Processing dependencies for ipython
Finished processing dependencies for ipython
(ipython)$ which ipython

Prompt History

Now that IPython is installed, let’s start it up and take it for a
spin. Listing 4 shows the “startup screen”. The first thing to notice,
after the boiler-plate help text, is the different prompt, In
. If your terminal supports it, the prompt will be in color
(mine is green). The different prompt is the first indication of one
of the powerful features of IPython. As you issue instructions,
IPython tracks every input and output expression used in your
interactive session. You can refer back to them to build new
expressions in a couple of different ways. Each input has a history
“number”, similar to the way many Unix shells refer to your command
line history. In the case of IPython, though, this is extended to
allow you to refer back to the outputs as well, without recomputing
them. This feature is especially useful if you are experimenting with
code and creating lots of objects with small variations in their

Listing 4

(ipython)$ ipython
Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04)
Type "copyright", "credits" or "license" for more information.

IPython 0.8.2 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object'. ?object also works, ?? prints more.

In [1]:

There are two syntaxes for using back references. You can use built-in
In and Out variables as lists, and index into them. Or if you
only want the output value, you can use the shorthand underscore
notation (_). The values from In are the strings you input,
while the values from Out are the results of evaluating those
expressions. For example:

In [1]: 5*5

Out[1]: 25

In [2]: In[1]

Out[2]: u'5*5n'

In [3]: _1

Out[3]: 25

In [4]: Out[3]

Out[4]: 25

In [5]: _4 * 5

Out[5]: 125

In [6]:

If you are comfortable working at the interactive prompt in this way,
but want to record what you do for future reference after you close
your session, you can use IPython’s logging feature to write the
session to a file. To activate the log, use the control command
%logstart, as illustrated in Listing 5. The output file is a
Python source file, so it is easy to clean it up and turn it into a
“real” module when you are done experimenting.

Listing 5

In [6]: %logstart
Activating auto-logging. Current session state plus future input saved.
Filename       :
Mode           : rotate
Output logging : False
Raw input log  : False
Timestamping   : False
State          : active

In [7]: a = 5

In [8]: b = 6

In [9]: c = a * b

In [10]: c

Out[10]: 30

In [11]: d = [ a, b, c]

In [12]: d

Out[12]: [5, 6, 30]

In [13]: %logstop

If we examine the file (Listing 6), we can see that
everything I typed was logged to the file. The command %logstart
is just one of many magic control commands for giving IPython
instructions about how you want to interact with it.

Listing 6

#log# Automatic Logger file. *** THIS MUST BE THE FIRST LINE ***
#log# opts = Struct({'__allownew': True, 'logfile': ''})
#log# args = []
#log# It is safe to make manual edits below here.
_4 * 5
_ip.magic("logstart ")

a = 5
b = 6
c = a * b
d = [a, b, c]

Shell Commands

In addition to standard Python expressions and magic commands, IPython
supports “shelling out” to run commands in your normal command line
shell. This feature, and the fact that some shell commands are built
into IPython, enables some users to use IPython as their primary
command line shell, instead of a more traditional shell such as bash.
Notice the difference in this example between !pwd and pwd
the latter results in an output value being saved to Out.

In [1]: cd

In [2]: !pwd

In [3]: pwd

Out[3]: '/Users/dhellmann'

IPython is smart enough that many system commands can be run directly
from the command prompt without the ! prefix, and any can be run
with the prefix. The text output of the commands can be captured, just
as with more traditional shells, so the values can be used by your
Python code.

Help Is Never Far Away

With all of the additional features, a new user may feel that the
complexity of IPython can be a bit much to take in at once. As with
any good interactive system, help is available right there in the
shell. The %quickref command prints a quick reference card that
lists many of the magic commands, special operators, and other
features with examples and basic instructions for using them. More
specific details are available when you use the ? operator. By
itself, ? displays a help document for IPython itself. It can be
combined with dynamically created objects to show help text for just
about anything in the system (classes, functions, methods, objects,
etc.). Providing built-in help like this makes it easy to find method
signatures and docstrings when you are in the middle of your work,
without having to shift gears to search through online documentation.


IPython makes debugging your scripts easier by letting you run them in
the context of the interactive session, then retaining all of the
script state once it is completed. This makes it easy to go back and
look at objects created by the script as though you had typed all of
the commands into the interactive interpreter directly. For example,
take this small bit of sample code in

#!/usr/bin/env python
my_list = []
print 'End of sample'

When run through IPython, the output looks like this:

In [1]: %run
End of sample

In [2]: my_list

Out[2]: ['first', 'second']

All of the globals from the script are available to the interactive
session, so I can inspect them to see what their settings are. If the
script raises an exception, the source code where the error originates
is displayed along with information about the exception. Working this
way shortens your development cycle because each time you %run a
script, it is reloaded – you don’t have to restart the interpreter to
reset its state.

Advanced Uses

While IPython does have features that make it well suited as an
interactive shell for any Python programmer, it is also being designed
with more advanced purposes in mind. For example, it is possible to
add new syntax profiles so that it understands your own
domain-specific language, making it a natural choice for an embedded
interpreter. For example, IPython comes with a profile to let it
understand units of measure frequently used in physics computations.
This lets you type mass = 3 kg and have it translated to an object
that knows about the 3 as well as “kilograms” to represent a mass.

Another advanced feature is the ability to control and interact with
GUI programs. Usually, GUIs run an infinite event loop to collect
input events, process them, and then update the screen. There are
bindings available to allow IPython to control most of the popular GUI
toolkits so the interactive prompt is not blocked when the event loop
runs. This sort of support is integral for combining IPython with
something like matplotlib to generate graphs interactively, a key
feature for its scientific audience.

In the highly advanced feature category, IPython includes support for
controlling parallel processing systems directly from the shell
prompt. It supports running Python code distributed over multiple CPUs
and hosts, as well as passing Python objects between the
processes. Message passing, task farming, and shared memory
parallelism are supported. You can even connect and disconnect remote
processes dynamically. This lets you start a job and check in later to
see how it is running, or share the running state with someone else on
your team.


As I mentioned earlier, before I started this series of columns I had
already added virtualenv to my standard toolbox, and I use it on a
regular basis. Combined with easy_install, I have found it to be
irreplaceable for testing new configurations or working on projects
with different dependencies. On the other hand, I’m still learning the
many features of IPython. It is certainly nicer in many respects than
the standard Python shell, but I’m not used to working from the
interactive prompt as much as some other developers are, so it may
take some time for me to understand its potential fully. Based on the
strength of the recommendations I have received and what I have seen
so far, I will continue experimenting with it. In the mean time, check
it out yourself and let me know what you think.

Next month I will continue this series by introducing you to more
tools to enhance your programming productivity. If you have a tip to
share, feedback on something I’ve written, or if there is a topic you
would like for me to cover in this column, send a note with the
details to doug dot hellmann at pythonmagazine dot com and let me
know, or add the link to your account with the tag
pymagdifferent. And if you’re going to PyCon in March, look me up
at the conference.

Multi-processing techniques in Python

Originally published in Python Magazine Volume 1 Number 10 , October,

Has your multi-threaded application grown GILs? Take a look at these
packages for easy-to-use process management and inter-process
communication tools.

There is no predefined theme for this column, so I plan to cover a
different, likely unrelated, subject every month. The topics will
range anywhere from open source packages in the Python Package Index
(formerly The Cheese Shop, now PyPI) to new developments from around
the Python community, and anything that looks interesting in
between. If there is something you would like for me to cover, send a
note with the details to doug dot hellmann at
pythonmagazine dot com and let me know, or add the link to
your account with the tag “pymagdifferent”.

I will make one stipulation for my own sake: any open source libraries
must be registered with PyPI and configured so that I can install them
with distutils. Creating a login at PyPI and registering your
project is easy, and only takes a few minutes. Go on, you know you
want to.

Scaling Python: Threads vs. Processes

In the ongoing discussion of performance and scaling issues with
Python, one persistent theme is the Global Interpreter Lock
(GIL). While the GIL has the advantage of simplifying the
implementation of CPython internals and extension modules, it prevents
users from achieving true multi-threaded parallelism by limiting the
interpreter to executing byte-codes in one thread at a time on a
single processor. Threads which block on I/O or use extension modules
written in another language can release the GIL to allow other threads
to take over control, of course. But if my application is written
entirely in Python, only a limited number of statements will be
executed before one thread is suspended and another is started.

Eliminating the GIL has been on the wish lists of many Python
developers for a long time – I have been working with Python since
1998 and it was a hotly debated topic even then. Around that time,
Greg Stein produced a set of patches for Python 1.5 that eliminated
the GIL entirely, replacing it with a whole set of individual locks
for the mutable data structures (dictionaries, lists, etc.) that had
been protected by the GIL. The result was an interpreter that ran at
roughly half the normal speed, a side-effect of acquiring and
releasing the individual locks used to replace the GIL.

The GIL issue is unique to the C implementation of the
interpreter. The Java implementation of Python, Jython, supports true
threading by taking advantage of the underlying JVM. The IronPython
port, running on Microsoft’s CLR, also has better threading. On the
other hand, those platforms are always playing catch-up with new
language or library features, so if you’re hot to use the latest and
greatest, like I am, the C reference-implementation is still your best

Dropping the GIL from the C implementation remains a low priority for
a variety of reasons. The scope of the changes involved is beyond the
level of anything the current developers are interested in
tackling. Recently, Guido has said he would entertain patches
contributed by the Python community to remove the GIL, as long as
performance of single-threaded applications was not adversely
affected. As far as I know, no one has announced any plans to do so.

Even though there is a FAQ entry on the subject as part of the
standard documentation set for Python, from time to time a request
pops up on comp.lang.python or one of the Python-related mailing lists
to rewrite the interpreter so the lock can be removed. Each time it
happens, the answer is clear: use processes instead of threads.

That response does have some merit. Extension modules become more
complicated without the safety of the GIL. Processes typically have
fewer inherent deadlocking issues than threads. They can be
distributed between the CPUs on a host, and even more importantly, an
application that uses multiple processes is not limited by the size of
a single server, as a multi-threaded application would be.

Since the GIL is still present in Python 3.0, it seems unlikely that
it will be removed from a future version any time soon. This may
disappoint some people, but it is not the end of the world. There are,
after all, strategies for working with multiple processes to scale
large applications. I’m not talking about the well worn, established
techniques from the last millennium that use a different collection of
tools on every platform, nor the time-consuming and error-prone
practices that lead to solving the same problem time and
again. Techniques using low-level, operating system-specific,
libraries for process management are as passé as using compiled
languages for CGI programming. I don’t have time for this low-level
stuff any more, and neither do you. Let’s look at some modern

The subprocess module

Version 2.4 of Python introduced the subprocess module and finally
unified the disparate process management interfaces available in other
standard library packages to provide cross-platform support for
creating new processes. While subprocess solved some of my process
creation problems, it still primarily relies on pipes for inter-process
communication. Pipes are workable, but fairly low-level as far as
communication channels go, and using them for two-way message passing
while avoiding I/O deadlocks can be tricky (don’t forget to flush()).
Passing data through pipes is definitely not as transparent to the
application developer as sharing objects natively between threads.
And pipes don’t help when the processes need to scale beyond a single

Parallel Python

Vitalii Vanovschi’s Parallel Python package (pp) is a more complete
distributed processing package that takes a centralized approach.
Jobs are managed from a “job server”, and pushed out to individual
processing “nodes”.

Those worker nodes are separate processes, and can be running on the
same server or other servers accessible over the network. And when I
say that pp pushes jobs out to the processing nodes, I mean just that
– the code and data are both distributed from the central server to
the remote worker node when the job starts. I don’t even have to
install my application code on each machine that will run the jobs.

Here’s an example, taken right from the Parallel Python Quick Start

import pp
job_server = pp.Server()
# Start tasks
f1 = job_server.submit(func1, args1, depfuncs1,
f2 = job_server.submit(func1, args2, depfuncs1,
f3 = job_server.submit(func2, args3, depfuncs2,
# Retrieve the results
r1 = f1()
r2 = f2()
r3 = f3()

When the pp worker starts, it detects the number of CPUs in the system
and starts one process per CPU automatically, allowing me to take full
advantage of the computing resources available. Jobs are started
asynchronously, and run in parallel on an available node. The callable
object returned when the job is submitted blocks until the response is
ready, so response sets can be computed asynchronously, then merged
synchronously. Load distribution is transparent, making pp excellent
for clustered environments.

One drawback to using pp is that I have to do a little more work up
front to identify the functions and modules on which each job depends,
so all of the code can be sent to the processing node. That’s easy (or
at least straightforward) when all of the jobs are identical, or use a
consistent set of libraries. If I don’t know everything about the job
in advance, though, I’m stuck. It would be nice if pp could
automatically detect dependencies at runtime. Maybe it will, in a
future version.

The processing Package

Parallel Python is impressive, but it is not the only option for
managing parallel jobs. The processing package from Richard Oudkerk
aims to solve the issues of creating and communicating with multiple
processes in a portable, Pythonic way. Whereas Parallel Python is
designed around a “push” style distribution model, the processing
package is set up to make it easy to create producer/consumer style
systems where worker processes pull jobs from a queue.

The package hides most of the details of selecting an appropriate
communication technique for the platform by choosing reasonable
default behaviors at runtime. The API does include a way to explicitly
select the communication mechanism, in case I need that level of
control to meet specific performance or compatibility requirements.
As a result, I end up with the best of both worlds: usable default
settings that I can tweak later to improve performance.

To make life even easier, the processing.Process class was purposely
designed to match the threading.Thread class API. Since the processing
package is almost a drop-in replacement for the standard library’s
threading module, many of my existing multi-threaded applications can
be converted to use processes simply by changing a few import
statements. That’s the sort of upgrade path I like.

Listing 1 contains a simple example, based on the examples found in
the processing documentation, which passes a string value between
processes as an argument to the Process instance and shows the
similarity between processing and threading. How much easier could it

Listing 1

#!/usr/bin/env python
# Simple processing example

import os
from processing import Process, currentProcess

def f(name):
    print 'Hello,', name, currentProcess()

if __name__ == '__main__':
    print 'Parent process:', currentProcess()
    p = Process(target=f, args=[os.environ.get('USER', 'Unknown user')])

In a few cases, I’ll have more work to do to convert existing code
that was sharing objects which cannot easily be passed from one
process to another (file or database handles, etc.). Occasionally, a
performance-sensitive application needs more control over the
communication channel. In these situations, I might still have to get
my hands dirty with the lower-level APIs in the processing.connection
module. When that time comes, they are all exposed and ready to be
used directly.

Sharing State and Passing Data

For basic state handling, the processing package lets me share data
between processes by using shared objects, similar to the way I might
with threads. There are two types of “managers” for passing objects
between processes. The LocalManager uses shared memory, but the types
of objects that can be shared are limited by a low-level interface
which constrains the data types and sizes. LocalManager is
interesting, but it’s not what has me excited. The SyncManager is the
real story.

SyncManager implements tools for synchronizing inter-process
communication in the style of threaded programming. Locks, semaphores,
condition variables, and events are all there. Special implementations
of Queue, dict, and list that can be used between processes safely are
included as well (Listing 2). Since I’m already comfortable with these
APIs, there is almost no learning curve for converting to the versions
provided by the processing module.

Listing 2

#!/usr/bin/env python
# Pass an object through a queue to another process.

from processing import Process, Queue, currentProcess

class Example:
    def __init__(self, name): = name
    def __str__(self):
        return '%s (%s)' % (, currentProcess())

def f(q):
    print 'In child:', q.get()

if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=[q])
    o = Example('tester')
    print 'In parent:', o

For basic state sharing with SyncManager, using a Namespace is about
as simple as I could hope. A namespace can hold arbitrary attributes,
and any attribute attached to a namespace instance is available in all
client processes which have a proxy for that namespace. That’s
extremely useful for sharing status information, especially since I
don’t have to decide up front what information to share or how big the
values can be. Any process can change existing values or add new
values to the namespace, as illustrated in Listing 3. Changes to the
contents of the namespace are reflected in the other processes the
next time the values are accessed.

#!/usr/bin/env python
# Using a shared namespace.

import processing

def f(ns):
    print ns
    ns.old_coords = (ns.x, ns.y)
    ns.x += 10
    ns.y += 10

if __name__ == '__main__':
    # Initialize the namespace
    manager = processing.Manager()
    ns = manager.Namespace()
    ns.x = 10
    ns.y = 20

    # Use the namespace in another process
    p = processing.Process(target=f, args=(ns,))

    # Show the resulting changes in this process
    print ns

Remote Servers

Configuring a SyncManager to listen on a network socket gives me even
more interesting options. I can start processes on separate hosts, and
they can share data using all of the same high-level mechanisms
described above. Once they are connected, there is no difference in
the way the client programs use the shared resources remotely or

The objects are passed between client and server using pickles, which
introduces a security hole: because unpacking a pickle may cause code
to be executed, it is risky to trust pickles from an unknown
source. To mitigate this risk, all communication in the processing
package can be secured with digest authentication using the hmac
module from the standard library. Callers can pass authentication keys
to the manager explicitly, but default values are generated if no key
is given. Once the connection is established, the authentication and
digest calculation are handled transparently for me.


The GIL is a fact of life for Python programmers, and we need to
consider it along with all of the other factors that go into planning
large scale programs. Both the processing package and Parallel Python
tackle the issues of multi-processing in Python head on, from
different directions. Where the processing package tries to fit itself
into existing threading designs, pp uses a more explicit distributed
job model. Each approach has benefits and drawbacks, and neither is
suitable for every situation. Both, however, save you a lot of time
over the alternative of writing everything yourself with low-level
libraries. What an age to be alive!

Working with IMAP and iCalendar

How can you access group calendar information if your
Exchange-like mail and calendaring server does not provide
iCalendar feeds, and you do not, or cannot, use Outlook? Use
Python to extract the calendar data and generate your own feed, of
course! This article discusses a surprisingly simple program to
perform what seems like a complex series of operations: scanning
IMAP folders, extracting iCalendar attachments, and merging the
contained events together into a single calendar.


I recently needed to access shared schedule information stored on an
Exchange-like mail and calendaring server. Luckily, I was able to
combine an existing third party open source library with the tools in
the Python standard library to create a command line program to
convert the calendar data into a format I could use with my desktop
client directly. The final product is called mailbox2ics. It ended up
being far shorter than I had anticipated when I started thinking about
how to accomplish my goal. The entire program is just under 140 lines
long, including command line switch handling, some error processing,
and debug statements. The output file produced can be consumed by any
scheduling client that supports the iCalendar standard.

Using Exchange, or a compatible replacement, for email and scheduling
makes sense for many businesses and organizations. The client program,
Microsoft Outlook, is usually familiar to non-technical staff members,
and therefore new hires can hit the ground running instead of being
stymied trying to figure out how to accomplish their basic, everyday
communication tasks. However, my laptop runs Mac OS X and I do not
have Outlook. Purchasing a copy of Outlook at my own expense, in
addition to inflicting further software bloat on my already crowded
computer, seemed like an unnecessarily burdensome hassle just to be
able to access schedule information.

Changing the server software was not an option. A majority of the
users already had Outlook and were accustomed to using it for their
scheduling, and I did not want to have to support a different server
platform. That left me with one option: invent a way to pull the data
out of the existing server, so I could convert it to a format that I
could use with my usual tools: Apple’s iCal and Mail.

With iCal (and many other standards-compliant calendar tools) it is
possible to subscribe to calendar data feeds. Unfortunately, the
server we were using did not have the ability to export the schedule
data in a standard format using a single file or URL. However, the
server did provide access to the calendar data via IMAP using shared
public folders. I decided to use Python to write a program to extract
the data from the server and convert it into a usable feed. The feed
would be passed to iCal, which would merge the group schedule with the
rest of my calendar information so I could see the group events
alongside my other meetings, deadlines, and reminders about when the
recycling is picked up on our street.

IMAP Basics

The calendar data was only accessible to me as attachments on email
messages accessed via an IMAP server. The messages were grouped into
several folders, with each folder representing a separate public
calendar used for a different purpose (meeting room schedules, event
planning, holiday and vacation schedules, etc.). I had read-only
access to all of the email messages in the public calendar
folders. Each email message typically had one attachment describing a
single event. To produce the merged calendar, I needed to scan several
folders, read each message in the folder, find and parse the calendar
data in the attachments, and identify the calendar events. Once I
identified the events to include in the output, I needed to add them
to an output file in a format iCal understands.

Python’s standard library includes the imaplib module for working
with IMAP servers. The IMAP4 and IMAP4_SSL classes provide a high
level interface to all of the features I needed: connecting to the
server securely, accessing mailboxes, finding messages, and
downloading them. To experiment with retrieving data from the IMAP
server, I started by establishing a secure connection to the server on
the standard port for IMAP-over-SSL, and logging in using my regular
account. This would not be a desirable way to run the final program on
a regular basis, but it works fine for development and testing.

mail_server = imaplib.IMAP4_SSL(hostname)
mail_server.login(username, password)

It is also possible to use IMAP over a non-standard port. In that
case, the caller can pass port as an additional option to
imaplib.IMAP4_SSL(). To work with an IMAP server without SSL
encryption, you can use the IMAP4 class, but using SSL is
definitely preferred.

mail_server = imaplib.IMAP4_SSL(hostname, port)
mail_server.login(username, password)

The connection to the IMAP server is “stateful”. The client remembers
which methods have been called on it, and changes its internal state
to reflect those calls. The internal state is used to detect logical
errors in the sequence of method calls without the round-trip to the

On an IMAP server, messages are organized into “mailboxes”. Each
mailbox has a name and, since mailboxes might be nested, the full name
of the mailbox is the path to that mailbox. Mailbox paths work just
like paths to directories or folders in a filesystem. The paths are
single strings, with levels usually separated by a forward slash
(/) or period (.). The actual separator value used depends on
the configuration of your IMAP server; one of my servers uses a slash,
while the other uses a period. If you do not already know how your
server is set up, you will need to experiment to determine the correct
values for folder names.

Once I had my client connected to the server, the next step was to
call select() to set the mailbox context to be used when searching
for and downloading messages.'Public Folders/EventCalendar')
# or'Public Folders.EventCalendar')

After a mailbox is selected, it is possible to retrieve messages from
the mailbox using search(). The IMAP method search() supports
filtering to identify only the messages you need. You can search for
messages based on the content of the message headers, with the rules
evaluated in the server instead of your client, thus reducing the
amount of information the server has to transmit to the client. Refer
to RFC 3501 (“Internet Message Access Protocol”) for details about the
types of queries which can be performed and the syntax for passing the
query arguments.

In order to implement mailbox2ics, I needed to look at all of the
messages in every mailbox the user named on the command line, so I
simply used the filter “ALL” with each mailbox. The return value
from search() includes a response code and a string with the
message numbers separated by spaces. A separate call is required to
retrieve more details about an individual message, such as the headers
or body.

(typ, [message_ids]) =, 'ALL')
message_ids = message_ids.split()

Individual messages are retrieved via fetch(). If only part of the
message is desired (size, envelope, body), that part can be fetched to
limit bandwidth. I could not predict which subset of the message body
might include the attachments I wanted, so it was simplest for me to
download the entire message. Calling fetch(“(RFC822)”) returns a
string containing the MIME-encoded version of the message with all
headers intact.

typ, message_parts = mail_server.fetch(
    message_ids[0], '(RFC822)')
message_body = message_parts[0][1]

Once the message body had been downloaded, the next step was to parse
it to find the attachments with calendar data. Beginning with version
2.2.3, the Python standard library has included the email package
for working with standards-compliant email messages. There is a
straightforward factory for converting message text to Message
objects. To parse the text representation of an email and create a
Message instance from it, use email.message_from_string().

msg = email.message_from_string(message_body)

Message objects are almost always made up of multiple parts. The parts
of the message are organized in a tree structure, with message
attachments supporting nested attachments. Subparts or attachments can
even include entire email messages, such as when you forward a message
which already contains an attachment to someone else. To iterate over
all of the parts of the Message tree recursively, use the walk()

for part in msg.walk():
    print part.get_content_type()

Having access to the email package saved an enormous amount of time on
this project. Parsing multi-part email messages reliably is tricky,
even with (or perhaps because of) the many standards involved. With
the email package, in just a few lines of Python, you can parse and
traverse all of the parts of even the most complex standard-compliant
multi-part email message, giving you access to the type and content of
each part.

Accessing Calendar Data

The “Internet Calendaring and Scheduling Core Object Specification”,
or iCalendar, is defined in RFC 2445. iCalendar is a data format
for sharing scheduling and other date-oriented information. One
typical way to receive an iCalendar event notification, such as an
invitation to a meeting, is via an email attachment. Most standard
calendaring tools, such as iCal and Outlook, generate these email
messages when you initially “invite” another participant to a meeting,
or update an existing meeting description. The iCalendar standard says
the file should have filename extension ICS and mime-type
text/calendar. The input data for mailbox2ics came from email
attachments of this type.

The iCalendar format is text-based. A simple example of an ICS file
with a single event is provided in Listing 1. Calendar events have
properties to indicate who was invited to an event, who originated it,
where and when it will be held, and all of the other expected bits of
information important for a scheduled event. Each property of the
event is encoded on its own line, with long values wrapped onto
multiple lines in a well-defined way to allow the original content to
be reconstructed by a client receiving the iCalendar representation of
the data. Some properties also can be repeated, to handle cases such
as meetings with multiple invitees.

Listing 1

PRODID:-//Big Calendar Corp//Server Version X.Y.Z//EN

In addition to having a variety of single or multi-value properties,
calendar elements can be nested, much like email messages with
attachments. An ICS file is made up of a VCALENDAR component,
which usually includes one or more VEVENT components. A
VCALENDAR might also include VTODO components (for tasks on a
to-do list). A VEVENT may contain a VALARM, which specifies
the time and means by which the user should be reminded of the event.
The complete description of the iCalendar format, including valid
component types and property names, and the types of values which are
legal for each property, is available in the RFC.

This sounds complex, but luckily, I did not have to worry about
parsing the ICS data at all. Instead of doing the work myself, I took
advantage of an open source Python library for working with iCalendar
data released by Max M. ( His iCalendar library
(available from makes parsing ICS data sources very
simple. The API for the library was designed based on the email
package discussed previously, so working with Calendar instances and
email.Message instances is similar. Use the class method
Calendar.from_string() to parse the text representation of the
calendar data to create a Calendar instance populated with all of the
properties and subcomponents described in the input data.

from icalendar import Calendar, Event
cal_data = Calendar.from_string(open('sample.ics', 'rb').read())

Once you have instantiated the Calendar object, there are two
different ways to iterate through its components: via the walk()
method or subcomponents attribute. Using walk() will traverse
the entire tree and let you process each component in the tree
individually. Accessing the subcomponents list directly lets you
work with a larger portion of the calendar data tree at one time.
Properties of an individual component, such as the summary or start
date, are accessed via the __getitem__() API, just as with a
standard Python dictionary. The property names are not case sensitive.

For example, to print the “SUMMARY” field values from all top level
events in a calendar, you would first iterate over the subcomponents,
then check the name attribute to determine the component type. If
the type is VEVENT, then the summary can be accessed and printed.

for event in cal_data.subcomponents:
    if == 'VEVENT':
        print 'EVENT:', event['SUMMARY']

While most of the ICS attachments in my input data would be made up of
one VCALENDAR component with one VEVENT subcomponent, I did
not want to require this limitation. The calendars are writable by
anyone in the organization, so while it was unlikely that anyone would
have added a VTODO or VJOURNAL to public data, I could not
count on it. Checking for VEVENT as I scanned each component let
me ignore components with types that I did not want to include in the

Writing ICS data to a file is as simple as reading it, and only takes
a few lines of code. The Calendar class handles the difficult tasks of
encoding and formatting the data as needed to produce a fully
formatted ICS representation, so I only needed to write the formatted
text to a file.

ics_output = open('output.ics', 'wb')

Finding Max M’s iCalendar library saved me a lot of time and effort,
and demonstrates clearly the value of Python and open source in
general. The API is concise and, since it is patterned off of another
library I was already using, the idioms were familiar. I had not
embarked on this project eager to write parsers for the input data, so
I was glad to have libraries available to do that part of the work for

Putting It All Together

At this point, I had enough pieces to build a program to do what I
needed. I could read the email messages from the server via IMAP,
parse each message, and then search through its attachments to find
the ICS attachments. Once I had the attachments, I could parse them
and produce another ICS file to be imported into my calendar client.
All that remained was to tie the pieces together and give it a user
interface. The source for the resulting program,,
is provided in Listing 2.

Listing 2

#!/usr/bin/env python

"""Convert the contents of an imap mailbox to an ICS file.

This program scans an IMAP mailbox, reads in any messages with ICS
files attached, and merges them into a single ICS file as output.

# Import system modules
import imaplib
import email
import getpass
import optparse
import sys

# Import Local modules
from icalendar import Calendar, Event

# Module

def main():
    # Set up our options
    option_parser = optparse.OptionParser(
        usage='usage: %prog [options] hostname username mailbox [mailbox...]'
    option_parser.add_option('-p', '--password', dest='password',
                             help='Password for username',
    option_parser.add_option('--port', dest='port',
                             help='Port for IMAP server',
    option_parser.add_option('-v', '--verbose', 
                             help='Show progress',
    option_parser.add_option('-q', '--quiet', 
                             help='Do not show progress',
    option_parser.add_option('-o', '--output', dest="output",
                             help="Output file",

    (options, args) = option_parser.parse_args()
    if len(args) < 3:
        print >>sys.stderr, 'nERROR: Please specify a username, hostname, and mailbox.'
        return 1
    hostname = args[0]
    username = args[1]
    mailboxes = args[2:]

    # Make sure we have the credentials to login to the IMAP server.
    password = options.password or getpass.getpass(stream=sys.stderr)

    # Initialize a calendar to hold the merged data
    merged_calendar = Calendar()
    merged_calendar.add('prodid', '-//mailbox2ics//')
    merged_calendar.add('calscale', 'GREGORIAN')

    if options.verbose:
        print >>sys.stderr, 'Logging in to "%s" as %s' % (hostname, username)

    # Connect to the mail server
    if options.port is not None:
        mail_server = imaplib.IMAP4_SSL(hostname, options.port)
        mail_server = imaplib.IMAP4_SSL(hostname)
    (typ, [login_response]) = mail_server.login(username, password)
        # Process the mailboxes
        for mailbox in mailboxes:
            if options.verbose: print >>sys.stderr, 'Scanning %s ...' % mailbox
            (typ, [num_messages]) =
            if typ == 'NO':
                raise RuntimeError('Could not find mailbox %s: %s' % 
                                   (mailbox, num_messages))
            num_messages = int(num_messages)
            if not num_messages:
                if options.verbose: print >>sys.stderr, '  empty'

            # Find all messages
            (typ, [message_ids]) =, 'ALL')
            for num in message_ids.split():

                # Get a Message object
                typ, message_parts = mail_server.fetch(num, '(RFC822)')
                msg = email.message_from_string(message_parts[0][1])

                # Look for calendar attachments
                for part in msg.walk():
                    if part.get_content_type() == 'text/calendar':
                        # Parse the calendar attachment
                        ics_text = part.get_payload(decode=1)
                        importing = Calendar.from_string(ics_text)

                        # Add events from the calendar to our merge calendar
                        for event in importing.subcomponents:
                            if != 'VEVENT':
                            if options.verbose: 
                                print >>sys.stderr, 'Found: %s' % event['SUMMARY']
        # Disconnect from the IMAP server
        if mail_server.state != 'AUTH':

    # Dump the merged calendar to our output destination
    if options.output:
        output = open(options.output, 'wt')
        print str(merged_calendar)
    return 0

if __name__ == '__main__':
        exit_code = main()
    except Exception, err:
        print >>sys.stderr, 'ERROR: %s' % str(err)
        exit_code = 1

Since I wanted to set up the export job to run on a regular basis via
cron, I chose a command line interface. The main() function for starts out at line 24 with the usual sort of
configuration for command line option processing via the optparse
module. Listing 3 shows the help output produced when the program is
run with the -h option.

Listing 3

Usage: [options] hostname username mailbox [mailbox...]

  -h, --help            show this help message and exit
  -p PASSWORD, --password=PASSWORD
                        Password for username
  --port=PORT           Port for IMAP server
  -v, --verbose         Show progress
  -q, --quiet           Do not show progress
  -o OUTPUT, --output=OUTPUT
                        Output file

The –password option can be used to specify the IMAP account
password on the command line, but if you choose to use it consider the
security implications of embedding a password in the command line for
a cron task or shell script. No matter how you specify the password, I
recommend creating a separate mailbox2ics account on the IMAP server
and limiting the rights it has so no data can be created or deleted
and only public folders can be accessed. If –password is not
specified on the command line, the user is prompted for a password
when they run the program. While less useful with cron, providing the
password interactively can be a solution if you are unable, or not
allowed, to create a separate restricted account on the IMAP server.
The account name used to connect to the server is required on the
command line.

There is also a separate option for writing the ICS output data to a
file. The default is to print the sequence of events to standard
output in ICS format. Though it is easy enough to redirect standard
output to a file, the -o option can be useful if you are using the
-v option to enable verbose progress tracking and debugging.

The program uses a separate Calendar instance, merged_data, to
hold all of the ICS information to be included in the output. All of
the VEVENT components from the input are copied to merged_data
in memory, and the entire calendar is written to the output location
at the end of the program. After initialization (line 64),
merged_data is configured with some basic properties. PRODID
is required and specifies the name of the product which produced the
ICS file. CALSCALE defines the date system, or scale, used for the

After setting up merged_calendar, mailbox2ics connects to the IMAP
server. It tests whether the user has specified a network port using
–port and only passes a port number to imaplib if the user
includes the option. The optparse library converts the option value to
an integer based on the option configuration, so options.port is
either an integer or None.

The names of all mailboxes to be scanned are passed as arguments to
mailbox2ics on the command line after the rest of the option
switches. Each mailbox name is processed one at a time, in the for
loop starting on line 79. After calling select() to change the
IMAP context, the message ids of all of the messages in the mailbox
are retrieved via a call to search(). The full content of each
message in the mailbox is fetched in turn, and parsed with
email.message_from_string(). Once the message has been parsed, the
msg variable refers to an instance of email.Message.

Each message may have multiple parts containing different MIME
encodings of the same data, as well as any additional message
information or attachments included in the email which generated the
event. For event notification messages, there is typically at least
one human-readable representation of the event and frequently both
HTML and plain text are included. Of course, the message also includes
the actual ICS file, as well. For my purposes, only the ICS
attachments were important, but there is no way to predict where they
will appear in the sequence of attachments on the email message. To
find the ICS attachments, mailbox2ics walks through all of the parts
of the message recursively looking for attachments with mime-type
text/calendar (as specified in the iCalendar standard) and
ignoring everything else. Attachment names are ignored, since
mime-type is a more reliable way to identify the calendar data

for part in msg.walk():
    if part.get_content_type() == 'text/calendar':
        # Parse the calendar attachment
        ics_text = part.get_payload(decode=1)
        importing = Calendar.from_string(ics_text)

When it finds an ICS attachment, mailbox2ics parses the text of the
attachment to create a new Calendar instance, then copies the
VEVENT components from the parsed Calendar to merged_calendar.
The events do not need to be sorted into any particular order when
they are added to merged_calendar, since the client reading the
ICS file will filter and reorder them as necessary to displaying them
on screen. It was important to take the entire event, including any
subcomponents, to ensure that all alarms are included. Instead of
traversing the entire calendar and accessing each component
individually, I simply iterated over the subcomponents of the
top-level VCALENDAR node. Most of the ICS files only included one
VEVENT anyway, but I did not want to miss anything important if
that ever turned out not to be the case.

for event in importing.subcomponents:
    if != 'VEVENT':

Once all of the mailboxes, messages, and calendars are processed, the
merged_calendar refers to a Calendar instance containing all of
the events discovered. The last step in the process, starting at line
119, is for mailbox2ics to create the output. The event data is
formatted using str(merged_calendar), just as in the example
above, and written to the output destination selected by the user
(standard output or file).


Listing 4 includes sample output from running mailbox2ics to merge two
calendars for a couple of telecommuting workers, Alice and Bob. Both
Alice and Bob have placed their calendars online at
In the output of mailbox2ics, you can see that Alice has 2 events in
her calendar indicating the days when she will be in the office. Bob
has one event for the day he has a meeting scheduled with Alice.

Listing 4

$ -o group_schedule.ics mailbox2ics  "Calendars.Alice" "Calendars.Bob"
Logging in to "" as mailbox2ics
Scanning Calendars.Alice ...
Found: In the office to work with Bob on project proposal
Found: In the office
Scanning Calendars.Bob ...
Found: In the office to work with Alice on project proposal

The output file created by mailbox2ics containing the merged calendar
data from Alice and Bob’s calendars is shown in Listing 5. You can see
that it includes all 3 events as VEVENT components nested inside a
single VCALENDAR. There were no alarms or other types of
components in the input data.

Listing 5

SUMMARY:In the office to work with Bob on project proposal
SUMMARY:In the office
SUMMARY:In the office to work with Alice on project proposal

Mailbox2ics In Production

To solve my original problem of merging the events into a sharable
calendar to which I could subscribe in iCal, I scheduled mailbox2ics
to run regularly via cron. With some experimentation, I found that
running it every 10 minutes caught most of the updates quickly enough
for my needs. The program runs locally on a web server which has
access to the IMAP server. For better security, it connects to the
IMAP server as a user with restricted permissions. The ICS output
file produced is written to a directory accessible to the web server
software. This lets me serve the ICS file as static content on the web
server to multiple subscribers. Access to the file through the web is
protected by a password, to prevent unauthorized access.

Thoughts About Future Enhancements

Mailbox2ics does everything I need it to do, for now. There are a few
obvious areas where it could be enhanced to make it more generally
useful to other users with different needs, though. Input and output
filtering for events could be added. Incremental update support would
help it scale to manage larger calendars. Handling non-event data in
the calendar could also prove useful. And using a configuration file
to hold the IMAP password would be more secure than passing it on the
command line.

At the time of this writing, mailbox2ics does not offer any way to
filter the input or output data other than by controlling which
mailboxes are scanned. Adding finer-grained filtering support could
be useful. The input data could be filtered at two different points,
based on IMAP rules or the content of the calendar entries themselves.

IMAP filter rules (based on sender, recipient, subject line, message
contents, or other headers) would use the capabilities of and the IMAP server without much effort on my part.
All that would be needed are a few command line options to pass the
filtering rules, or code to read a configuration file. The only
difference in the processing by mailbox2ics would be to convert the
input rules to the syntax understood by the IMAP server and pass them
to search().

Filtering based on VEVENT properties would require a little more
work. The event data must be downloaded and checked locally, since the
IMAP server will not look inside the attachments to check the
contents. Filtering using date ranges for the event start or stop date
could be very useful, and not hard to implement. The Calendar class
already converts dates to datetime instances. The datetime
package makes it easy to test dates against rules such as “events in
the next 7 days” or “events since Jan 1, 2007”.

Another simple addition would be pattern matching against other
property values such as the event summary, organizer, location, or
attendees. The patterns could be regular expressions, or a simpler
syntax such as globbing. The event properties, when present in the
input, are readily available through the __getitem__() API of the
Calendar instance and it would be simple to compare them against the

If a large amount of data is involved, either spread across several
calendars or because there are a lot of events, it might also be
useful to be able to update an existing cached file, rather than
building the whole ICS file from scratch each time. Looking only at
unread messages in the folder, for example, would let mailbox2ics skip
downloading old events that are no longer relevant or already appear
in the local ICS file. It could then initialize merged_calendar by
reading from the local file before updating it with new events and
re-writing the file. Caching some of the results in this way would
place less load on the IMAP server, so the export could easily be run
more frequently than once every 10 minutes.

In addition to filtering to reduce the information included in the
output, it might also prove useful to add extra information by
including component types other than VEVENT. For example,
including VTODO would allow users to include a group action list
in the group calendar. Most scheduling clients support filtering the
to-do items and alarms out of calendars to which you subscribe, so if
the values are included in a feed, individual users can always ignore
the ones they choose.

As mentioned earlier, using the –password option to provide the
password to the IMAP server is convenient, but not secure. For
example, on some systems it is possible to see the arguments to
programs using ps. This allows any user on the system to watch for
mailbox2ics to run and observe the password used. A more secure way to
provide the password is through a configuration file. The file can
have filesystem permissions set so that only the owner can access
it. It could also, potentially, be encrypted, though that might be
overkill for this type of program. It should not be necessary to run
mailbox2ics on a server where there is a high risk that the password
file might be exposed.


Mailbox2ics was a fun project that took a me just a few hours over a
weekend to implement and test. This project illustrates two reasons
why I enjoy developing with Python. First, difficult tasks are made
easier through the power of the “batteries included” nature of
Python’s standard distribution. And second, coupling Python with the
wide array of other open source libraries available lets you get the
job done, even when the Python standard library lacks the exact tool
you need. Using the ICS file produced by mailbox2ics, I am now able to
access the calendar data I need using my familiar tools, even though
iCalendar is not supported directly by the group’s calendar server.

Originally published in Python Magazine Volume 1 Issue 10 , October, 2007