Automated Testing with unittest and Proctor

Originally published in Python Magazine, Volume 2, Issue 3, March 2008

Automated testing is an important part of Agile development
methodologies, and the practice is seeing increasing adoption even in
environments where other Agile tools are not used. This article
discusses testing techniques for you to use with the open source tool
Proctor. By using Proctor, you will not only manage your automated
test suite more effectively, but you will also obtain better results
in the process.

What is Proctor?

Proctor is a tool for running automated tests in Python source
code. It scans input source files looking for classes based on the
TestCase class from the unittest module in the Python standard
library. You can use arbitrary organization schemes for tests defined
in separate source modules or directories by applying user defined
categories to test classes. Proctor constructs test suites dynamically
at run time based on your categories, making it easy to run a subset
of the tests even if they are not in the same location on the
filesystem. Proctor has been specifically designed to operate on a
large number of tests (more than 3500 at one site). Although it
depends on the unittest module, Proctor is also ideally suited for use
with integration or higher level tests, because it is easy to
configure it to run unattended.

Installation

Proctor uses the standard distutils module tools for installation
support. If you have previously installed easy_install, using it
is the simplest way to install packages such as Proctor that are
listed in the Python Package Index.

$ sudo easy_install Proctor

Running easy_install will download and install the most recent
version by default. If you do not have easy_install, download the
latest version of the Proctor source code from the home page (see the
references list for this article), then install it as you would any
other Python package:

$ tar zxvf Proctor-1.2.tar.gz
$ cd Proctor-1.2
$ sudo python setup.py install

Once Proctor is installed, you will find a command line program,
proctorbatch, in your shell execution path. Listing 1 shows the
command syntax for proctorbatch. I will examine the command line
options in more detail throughout the rest of this article using a few
simple tests.

Listing 1

proctorbatch



    Proctor is a tool for running unit tests.  It enhances the
    existing unittest module to provide the ability to find all tests
    in a set of code, categorize them, and run some or all of them.
    Test output may be generated in a variety of formats to support
    parsing by another tool or simple, nicely formatted, reports for
    human review.



SYNTAX:

    proctorbatch [<options>] [<directory name> ...]

        --category=categoryName
        --coverage-exclude=pattern
        --coverage-file=filename
        --debug
        --interleaved
        --list
        --list-categories
        --no-coverage
        --no-gc
        --no-run
        --parsable
        -q
        -v


OPTIONS:

    -h             Displays abbreviated help message.

    --help         Displays complete usage information.

    --category=categoryName
                   Run only the tests in the specified category.

                   Warning: If there are no tests in a category,
                   an error will not be produced.  The test suite
                   will appear to be empty.


    --coverage-exclude=pattern
                   Add a line exclude pattern
                   (can be a regular expression).


    --coverage-file=filename
                   Write coverage statistics to the specified file.


    --debug        Turn on debug mode to see tracebacks.


    --interleaved  Interleave error and failure messages
                   with the test list.


    --list         List tests.


    --list-categories
                   List test categories.


    --no-coverage  Disable coverage analysis.


    --no-gc        Disable garbage collection and leak reporting.


    --no-run       Do not run the tests


    --parsable     Format output to make it easier to parse.


    -q             Turn on quiet mode.


    -v             Increment the verbose level.
                   Higher levels are more verbose.
                   The default is 1.

Sample Tests and Standard unittest Features

The simplest sample set of test cases needs to include at least three
tests: one to pass, one to fail, and one to raise an exception
indicating an error. For this example, I have separated the tests into
three classes and provided two test methods on each class. Listing 2
shows the code to define the tests, including the standard
unittest boilerplate code for running them directly.

Listing 2

#!/usr/bin/env python
# Sample tests for exercising Proctor.

import unittest

class PassingTests(unittest.TestCase):

    def test1(self):
        return

    def test2(self):
        return

class FailingTests(unittest.TestCase):

    def test1(self):
        self.fail('Always fails 1')
        return

    def test2(self):
        self.fail('Always fails 2')
        return

class ErrorTests(unittest.TestCase):

    def test1(self):
        raise RuntimeError('test1 error')

    def test2(self):
        raise RuntimeError('test2 error')

if __name__ == '__main__': # pragma: no cover
    unittest.main()

When python Listing2.py is run, it invokes the unittest module’s
main() function. As main() runs, the standard test loader is
used to find tests in the current module, and all of the discovered
tests are executed one after the other. It is also possible to name
individual tests or test classes to be run using arguments on the
command line. For example, python Listing2.py PassingTests runs
both of the tests in the PassingTests class. This standard
behavior is provided by the unittest module and is useful if you know
where the tests you want to run are located in your code base.

It is also possible to organize tests from different classes into
“suites”. You can create the suites using any criteria you like –
themes, feature areas, level of abstraction, specific bugs, etc. For
example, this code sets up a suite containing two test cases:

import unittest
from Listing2 import *

suite1 = unittest.TestSuite([PassingTests('test1'), FailingTests('test1')])
unittest.main(defaultTest='suite1')

When run, the above code would execute the tests
PassingTests.test1 and FailingTests.test1, since those are
explicitly included in the suite. The trouble with creating test
suites in this manner is that you have to maintain them by hand. Any
time new tests are created or obsolete tests are removed, the suite
definitions must be updated as well. This may not be a lot of work for
small projects, but as project size and test coverage increases, the
extra work can become unmanageable very quickly.
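Much of that manual bookkeeping can be avoided by letting the standard TestLoader collect the test methods automatically. This short sketch uses a stand-in class modeled on PassingTests from Listing 2:

```python
import unittest

class PassingTests(unittest.TestCase):
    # Stand-in for the class defined in Listing 2.
    def test1(self):
        pass
    def test2(self):
        pass

# loadTestsFromTestCase collects every "test*" method automatically,
# so newly added tests join the suite without editing it by hand.
loader = unittest.TestLoader()
suite = loader.loadTestsFromTestCase(PassingTests)  # two tests collected
```

Because the loader finds every method whose name starts with `test`, adding a new test method to the class requires no change to the suite definition, though cross-module suites still have to be assembled by hand.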

Proctor was developed to make working with tests across classes,
modules, and directories easier by eliminating the manual effort
involved in building suites of related tests. Over the course of
several years, the set of automated tests we have written for my
company’s product has grown to contain over 3500 individual tests.

Our code is organized around functional areas, with user interface and
back-end code separated into different modules and packages. In order
to run the automated tests for all aspects of a specific feature, a
developer may need to run tests in several modules from different
directories in their sandbox. By building on the standard library
features of unittest, Proctor makes it easy to manage all of the tests
no matter where they are located in the source tree.

Running Tests with Proctor

The first improvement Proctor makes over the standard unittest test
loader is that Proctor can scan multiple files to find all of the
tests, then run them in a single batch. Each Python module specified
on the command line is imported, one at a time. After a module is
loaded, it is scanned for classes derived from unittest.TestCase,
just as with the standard test loader. All of the tests are added to a
test suite, and when the scanner is finished loading the test modules,
all of the tests in the suite are run.

For example, to scan all Python files in the installed version of
proctorlib for tests you would run:

$ cd /usr/lib/python2.5/site-packages
$ proctorbatch proctorlib/*.py

Proctor also accepts directory names as arguments, so the command can
be written:

$ cd /usr/lib/python2.5/site-packages
$ proctorbatch proctorlib

Proctor will search recursively down through any directories given to
find all of the tests in any modules in subdirectories. The file or
directory names are converted to importable package names, so that
directory/file.py is imported as directory.file. If your code
is organized under a single Python package, and you wish to run all of
the tests in the package, you only need to specify the root directory
for that package.
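The scanning and naming behavior just described can be approximated in a few lines of Python. The helper names below are illustrative, not part of Proctor's actual API, and the path conversion assumes POSIX-style separators:

```python
import unittest

def path_to_module_name(path):
    # Convert "directory/file.py" to "directory.file", as described
    # above.  Assumes POSIX-style "/" separators.
    if path.endswith('.py'):
        path = path[:-3]
    return path.replace('/', '.')

def find_test_classes(module):
    # Mimic the scan: collect TestCase subclasses defined in a module,
    # just as the standard test loader does.
    return [obj for obj in vars(module).values()
            if isinstance(obj, type) and issubclass(obj, unittest.TestCase)]
```

After each module named on the command line is imported, a pass like find_test_classes() is all that is needed to add its tests to the growing suite.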

Expanding Automated Testing Beyond Unit Tests

Tests are often categorized by developers based on the scope of the
functionality being tested as either “unit” or “integration” tests. A
unit test is usually a very low level test that requires few or no
external resources and verifies the functionality of an isolated
feature (such as a single method of a single class). An integration
test, by contrast, depends on the interaction of several classes or
instances, and is designed to ensure that the API between the objects
works as expected. For example, an integration test might verify that
an ORM library stores data in the expected tables, or that temporary
files are managed correctly when some filesystem actions are
performed.

At the company where I work, we use the unittest framework found in
the Python standard library for all of our automated unit and
integration tests. It is convenient for us to use a single framework,
because it means the developers only have to manage one set of test
tools. Another benefit is the nightly batch job that runs the
integration tests also includes all of the unit tests at the same
time. By running the unit and integration tests automatically every
night, we can identify regression errors we might not have otherwise
detected until later in the testing cycle. The integration tests use
database and filesystem resources created as fixtures by the test
setUp and tearDown hooks. Our developers can run unit-level
tests directly with unittest.main(), or test entire source
packages with Proctor. The code for integration tests may be mingled
in the same modules with the unit tests or in separate modules,
depending on how the developer responsible for the area of code in
question has it organized.

Some of the tests we have written need to use hardware that may not
always be available in the test environment. We write automated tests
for all of the “device driver” modules that are used to integrate
infrastructure devices such as network switches, load balancers, and
storage arrays with our product. These tests typically require an
actual device for the test to run successfully, since the tests
reconfigure it and then verify the results match what is
expected. This situation poses a problem, since the test equipment is
not always present in every test environment. Sometimes we only have
one compatible device. At other times, a device was on loan and so may
have been returned to the original vendor after the driver was
finished. In both of these cases, the test setup code will not be able
to find the equipment required for testing. Under these circumstances,
the tests will produce an error every time they run, and it is useful
to be able to skip over them and thus avoid false alarms and wasted
test time.

Proctor solves this problem with a flag that causes it to ignore the
tests in a module. To tell Proctor to ignore a specific module, add
the module level variable __proctor_ignore_module__ in the
source. Listing 3 shows an example with this flag set. Proctor still
imports the module, but when it sees the flag set to True, it does
not scan the contents of the module for tests. When the resource
needed by the tests becomes available in our lab and it is time to
test a device driver, we simply run the test file directly instead of
using Proctor.

Listing 3

#!/usr/bin/env python
# The tests in this module are ignored by Proctor

import unittest

# Tell Proctor to ignore the tests in this module.
__proctor_ignore_module__ = True

class IgnoredTest(unittest.TestCase):

    def testShouldNotBeRun(self):
        self.fail('This test will not be run by Proctor')
        return

if __name__ == '__main__':
    # If this file is run directly, the tests are still executed.
    unittest.main()
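A scanner honoring this flag only needs to check one module attribute after the import succeeds. The helper below is a hypothetical sketch of that check, not Proctor's own code:

```python
def module_is_ignored(module):
    # The module has already been imported; the scan simply skips it
    # when the flag is set to a true value.
    return bool(getattr(module, '__proctor_ignore_module__', False))
```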

Some of our other tests use resources available when the tests are run
on a developer workstation, but not when they are run as part of a
nightly batch job. For example, some portions of the graphical user
interface for our product have automated tests, but since it is an X
Windows application, those tests cannot be run without an X Windows
server, which is not present on the automated test server. Since all
of the GUI code is in one directory, it is easier to instruct Proctor
to ignore all of the modules in that directory instead of setting the
ignore flag separately for each file.

Proctor supports ignoring entire directories through a configuration
file named .proctor. The configuration file in each directory can
be used to specify modules or subdirectories to be ignored by Proctor
when scanning for tests. The files or directories specified in the
ignore variable are not imported at all, so if importing some
modules would fail without resources like an X Windows server
available, you can use a .proctor file as a more effective method
of ignoring them rather than setting the ignore flag inside the
source. All of the file or directory names in the ignore list are
relative to the directory containing the configuration file. For
example, to ignore the productname.gui package, create a file in
the productname directory containing ignore = [“gui”], like
this:

# Proctor instructions file ".proctor"

# Importing the gui module requires an X server,
# which is not available for the nightly test batch job.
ignore = [ 'gui' ]

The .proctor file uses Python syntax and can contain any legal
Python code. This means you can use modules such as os and
glob to build up the list of files to be ignored, following any
rules you want to establish. Here is a more sophisticated example
which only disables the GUI tests if it cannot find the X server they
require:

import os

ignore = []
if os.environ.get('DISPLAY') is None:
    ignore.append('gui')

Organizing Tests Beyond Classes and Modules

The usual way to organize related test functions is by placing them
together in the same class, and then by placing related classes
together in the same module. Such a neat organizational scheme is not
always possible, however, and related tests might be in different
modules or even in different directories. Sometimes, test modules grow
too large and need to be broken up so they are easier to maintain. In
other cases, when a feature is implemented, different aspects of the
code may be spread among files in multiple source directories,
reflecting the different layers of the application. Proctor can use
test categories to dynamically construct a test suite of related tests
without requiring the test authors to know about all of the tests in
advance or to update a test suite manually.

Proctor uses simple string identifiers as test categories, much like
the tags commonly found in a Web 2.0 application. It is easy to add
categories to your existing tests by setting the class attribute
PROCTOR_TEST_CATEGORIES to a sequence of strings; no special base
class is needed. Then tell proctorbatch to limit the test suite to
tests in a specific category using the --category option.

Using proctorbatch

Listing 4 shows some new test classes with categories that are useful
as examples to demonstrate how the command line options to
proctorbatch work. The first class, FeatureOneTests, is
categorized as being related to “feature1”. The tests in the second
class, FeatureOneAndTwoTests, are categorized as being related to
both “feature1” and “feature2”, representing a set of integration
level tests verifying the interface between the two features. The
UncategorizedTests class is not included in any category. Now that
the test classes are defined, I will show how to use proctorbatch
to work with them in a variety of ways.

Listing 4

#!/usr/bin/env python
# Categorized tests.

import unittest

class FeatureOneTests(unittest.TestCase):
    "Unit tests for feature1"

    PROCTOR_TEST_CATEGORIES = ( 'feature1',)

    def test(self):
        return

class FeatureOneAndTwoTests(unittest.TestCase):
    "Integration tests for feature1 and feature2"

    PROCTOR_TEST_CATEGORIES = ( 'feature1', 'feature2', )

    def test1(self):
        return

    def test2(self):
        return

class UncategorizedTests(unittest.TestCase):
    "Not in any category"

    def test(self):
        return

if __name__ == '__main__':
    unittest.main()

Proctor provides several command line options that are useful for
examining a test suite without actually running the tests. To print a
list of the categories for all tests in the module, use the
--list-categories option:

$ proctorbatch -q --list-categories Listing4.py
All
Unspecified
feature1
feature2

The output is an alphabetical listing of all of the test category
names for all of the tests found in the input files. Proctor creates
two categories automatically every time it is run. The category named
“All” contains every test discovered. The “Unspecified” category
includes any test that does not have a specific category, making it
easy to find uncategorized tests when a test set starts to become
unwieldy or more complex. When a test class does not have any
categories defined, its tests are run when no --category option is
specified on the command line to proctorbatch, or when the “All”
category is used (the default).
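The automatic category logic can be restated as a small function. categories_for() is a hypothetical name used here to summarize the rules, not part of Proctor:

```python
def categories_for(test_class):
    # Every test belongs to "All"; tests with no explicit categories
    # also land in "Unspecified", as described above.
    declared = getattr(test_class, 'PROCTOR_TEST_CATEGORIES', ())
    cats = set(declared)
    cats.add('All')
    if not declared:
        cats.add('Unspecified')
    return cats
```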

To examine the test set to see which tests are present, use the
--list option instead:

$ proctorbatch -q --list Listing4.py
test: Listing4.FeatureOneAndTwoTests.test1
test: Listing4.FeatureOneAndTwoTests.test2
test: Listing4.FeatureOneTests.test
test: Listing4.UncategorizedTests.test

And to see the tests only for a specific category, use the
--category and --list options together:

$ proctorbatch -q --list --category feature2 Listing4.py
test: Listing4.FeatureOneAndTwoTests.test1
test: Listing4.FeatureOneAndTwoTests.test2

To see the list of uncategorized tests, use the category
“Unspecified”:

$ proctorbatch -q --list --category Unspecified Listing4.py
test: Listing4.UncategorizedTests.test

After verifying that a category includes the right tests, to run the
tests in the category, use the --category option without the
--list option:

$ proctorbatch --category feature2 Listing4.py
Writing coverage output to .coverage
Scanning: .
test1 (test: Listing4.FeatureOneAndTwoTests) ... ok
test2 (test: Listing4.FeatureOneAndTwoTests) ... ok

---------------------------------------------------
Ran 2 tests in 0.002s

OK

Identifying Test Categories

While test category names can hold any meaning you want to give them,
over time I have found that using broad categories is more desirable
than using narrowly defined categories. When a category is too
narrowly focused, the tests are more likely to be in the same module
or directory anyway. In that case, there is not as much purpose to be
served by defining the category, since it is easy enough to just run
the tests in that file or directory.

When using a broad category, it is more likely that the tests involved
will span multiple directories. At that point, having a single
category to encompass them becomes a useful way to consolidate the
tests. Suppose, for example, there is an application that
authenticates a user before allowing an action. It has a User
class to manage users and verify their credentials. It also has a
command line interface that depends on the User class to perform
authentication. There are unit tests for methods of the User
class, and integration tests to ensure that authentication works
properly in the command line program. Since the command line program
is unlikely to be in the same section of the source tree as the
low-level module containing the User class, it would be beneficial
to define a test category for “authentication” tests so all of the
related tests can be run together.

These sorts of broad categories are also useful when a feature
involves many aspects of a system at the same level. For example, when
the user edits data through a web application, the user
authentication, session management, cookie handling, and database
aspects might all be involved at different points. A “login” category
could be applied to unit tests from each aspect, so the tests can be
run individually or as a group. Adding categories makes it immensely
easier to run the right tests to identify regression errors when
changes could affect multiple areas of a large application.

Monitoring Test Progress By Hand

Proctor accepts several command line options to control the format of
the output of your test run, depending on your preference or need.
The default output format uses the same style as the unittest test
runner. The verbosity level is set to 1 by default, so the full
names of all tests are printed along with the test outcome. To see
only the pass/fail status for the tests, reduce the verbosity level by
using the -q option. See Listing 5 for an example of the default
output.

Listing 5

$ proctorbatch  Listing2.py
Writing coverage output to .coverage
Scanning: .
test1 (test: Listing2.FailingTests) ... FAIL
test2 (test: Listing2.FailingTests) ... FAIL
test1 (test: Listing2.PassingTests) ... ok
test2 (test: Listing2.PassingTests) ... ok

======================================================================
FAIL: test1 (test: Listing2.FailingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/Listing2.py", line 17, in test1
    self.fail('Always fails 1')
AssertionError: Always fails 1

======================================================================
FAIL: test2 (test: Listing2.FailingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/Listing2.py", line 21, in test2
    self.fail('Always fails 2')
AssertionError: Always fails 2

----------------------------------------------------------------------
Ran 4 tests in 0.006s

FAILED (failures=2)

$ proctorbatch  -q Listing2.py
FF..
======================================================================
FAIL: test1 (test: Listing2.FailingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/Listing2.py", line 17, in test1
    self.fail('Always fails 1')
AssertionError: Always fails 1

======================================================================
FAIL: test2 (test: Listing2.FailingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/Listing2.py", line 21, in test2
    self.fail('Always fails 2')
AssertionError: Always fails 2

----------------------------------------------------------------------
Ran 4 tests in 0.007s

FAILED (failures=2)

When using the default format, Proctor does not print any failure or
error messages until all of the tests have run. If your test suite is
very large, or the integration tests require fixtures that take a lot
of time to configure, you may not want to wait for the tests to finish
before discovering which tests have not passed. When that is the case,
you can use the --interleaved option to show the test results
along with the name of the test as the test runs, as illustrated in
Listing 6.

Listing 6

$ proctorbatch  --interleaved --no-gc Listing2.py
Writing coverage output to .coverage
Scanning: .
  1/  6 test: Listing2.ErrorTests.test1 ...ERROR in test: Listing2.ErrorTests.test1
Traceback (most recent call last):
  File "Listing2.py", line 27, in test1
    raise RuntimeError('test1 error')
RuntimeError: test1 error

  2/  6 test: Listing2.ErrorTests.test2 ...ERROR in test: Listing2.ErrorTests.test2
Traceback (most recent call last):
  File "Listing2.py", line 30, in test2
    raise RuntimeError('test2 error')
RuntimeError: test2 error

  3/  6 test: Listing2.FailingTests.test1 ...FAIL in test: Listing2.FailingTests.test1
Traceback (most recent call last):
  File "Listing2.py", line 17, in test1
    self.fail('Always fails 1')
AssertionError: Always fails 1

  4/  6 test: Listing2.FailingTests.test2 ...FAIL in test: Listing2.FailingTests.test2
Traceback (most recent call last):
  File "Listing2.py", line 21, in test2
    self.fail('Always fails 2')
AssertionError: Always fails 2

  5/  6 test: Listing2.PassingTests.test1 ...ok
  6/  6 test: Listing2.PassingTests.test2 ...ok

Ran 6 tests in 0.013s

FAILED (failures=2, errors=2)

Automatic Test Output Processing

For especially large test runs, or if you are committed to more
complete test automation, you may not want to examine the test results
by hand at all. Proctor can also produce a simple parsable output
format suitable for automatic processing. The output format can be
processed by another program to summarize the results or even open
tickets in your defect tracking system. To have Proctor report the
test results in this format, pass the --parsable option to
proctorbatch on the command line. Listing 7 includes a sample of
the parsable output format.

Listing 7

$ proctorbatch  --parsable --no-gc Listing2.py
Writing coverage output to .coverage
Scanning: .
__PROCTOR__ Start run
__PROCTOR__ Start test
test: Listing2.ErrorTests.test1
Traceback (most recent call last):
  File "Listing2.py", line 27, in test1
    raise RuntimeError('test1 error')
RuntimeError: test1 error

__PROCTOR__ Start results
ERROR in test: Listing2.ErrorTests.test1
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  1/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.ErrorTests.test2
Traceback (most recent call last):
  File "Listing2.py", line 30, in test2
    raise RuntimeError('test2 error')
RuntimeError: test2 error

__PROCTOR__ Start results
ERROR in test: Listing2.ErrorTests.test2
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  2/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.FailingTests.test1
Traceback (most recent call last):
  File "Listing2.py", line 17, in test1
    self.fail('Always fails 1')
AssertionError: Always fails 1
__PROCTOR__ Start results
FAIL in test: Listing2.FailingTests.test1
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  3/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.FailingTests.test2
Traceback (most recent call last):
  File "Listing2.py", line 21, in test2
    self.fail('Always fails 2')
AssertionError: Always fails 2
__PROCTOR__ Start results
FAIL in test: Listing2.FailingTests.test2
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  4/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.PassingTests.test1
__PROCTOR__ Start results
ok
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  5/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.PassingTests.test2
__PROCTOR__ Start results
ok
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  6/  6
__PROCTOR__ End progress
__PROCTOR__ End run
__PROCTOR__ Start summary
Failures: 2
Errors: 2
Successes: 2
Tests: 6
Elapsed time (sec): 0.014
Status: FAILED
__PROCTOR__ End summary

Since the test results may be part of a larger log file that includes
other information such as build output and installation messages,
Proctor uses easily identifiable delimiters to separate the sections
in its output. Each delimiter appears on a line by itself, and begins
with __PROCTOR__ to make it less likely that the output of any
other program will be misinterpreted as test output.

Proctor assumes there is no need to automatically process the output
of the scanning phase, so the first delimiter (__PROCTOR__ Start run)
is printed at the beginning of the test execution phase. The string
__PROCTOR__ Start test appears at the beginning of each test, followed
on the next line by the name of the test. Any output produced by the
test appears beginning on the line immediately following the name. The
test output is followed by a traceback, if the test does not pass.

The text between the __PROCTOR__ Start results and __PROCTOR__ End
results delimiters always begins with one of ok, ERROR, or FAIL,
depending on the outcome of the test. If the test did not pass, the
rest of the text in the results section consists of the full name of
the test. The string __PROCTOR__ End test follows each test
result. Between the results for each test, a progress section shows
the current test number and the total number of tests being run.
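Those delimiters make the format straightforward to process. The following sketch (not Proctor code) maps each test name to its status code from a parsable log:

```python
def statuses_from_log(lines):
    # Minimal reader for the __PROCTOR__-delimited format: record the
    # status code (ok, ERROR, or FAIL) for each named test.
    statuses = {}
    name = None
    in_results = False
    for raw in lines:
        line = raw.rstrip('\n')
        if line == '__PROCTOR__ Start test':
            name = None          # the next "test:" line names the test
        elif line == '__PROCTOR__ Start results':
            in_results = True
        elif line == '__PROCTOR__ End results':
            in_results = False
        elif in_results:
            # The first word of the results text is the status code.
            if line and name not in statuses:
                statuses[name] = line.split()[0]
        elif name is None and line.startswith('test:'):
            name = line
    return statuses
```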

Proctor comes with proctorfilter, a simple command line program to
process a log file and print the names of tests with certain status
codes. It accepts three command line options, --ok, --error,
and --fail, to control which tests are listed in the output. For
example, to find the tests which failed in the sample output, run:

$ proctorfilter --fail Listing7.txt
test: Listing2.FailingTests.test1: FAIL
test: Listing2.FailingTests.test2: FAIL

The default behavior for proctorfilter, when no command line
options are given, is to print a list of tests that either had an
error or failed.

Building Your Own Results Parser

Using proctorfilter to summarize a set of test results is only one
way to automate results processing for your tests. Another way to
handle the test results is to open a new ticket in a bug tracking
system for each test that does not pass during the nightly test
run. When the ticket is opened, it should include all of the
information available, including output from the test and the
traceback from the failure or error. Although proctorfilter does
not include all of that information, the Proctor library also includes
a result module, with classes useful for building your own test
result processing program.

Listing 8 shows a sample program that recreates the default
functionality of proctorfilter using the proctorlib.result
module. The ResultFactory class parses input text passed to
feed() and creates TestResult instances. Each time a complete
test result has been fed in, a new TestResult is constructed and
passed as an argument to the callback given to the ResultFactory
constructor. In the sample program, the callback function
show_test_result() looks at the status code for the test before
deciding whether to print out the summary.

Listing 8

#!/usr/bin/env python
# Print a list of tests which did not pass.

import fileinput
from proctorlib.result import ResultFactory, TestResult

def show_test_result(test_result):
    "Called for each test result parsed from the input data."
    if not test_result.passed():
        print test_result
    return

# Set up the parser
parser = ResultFactory(show_test_result)

# Process data from stdin or files named via sys.argv
for line in fileinput.input():
    parser.feed(line)

A TestResult instance has several attributes of interest. The
name attribute uniquely identifies the test: it includes the full
import path for the module, as well as the class and method name of
the test. The output attribute holds all of the text appearing
between the __PROCTOR__ Start test and __PROCTOR__ Start results
delimiters, including the traceback, if any. The result attribute
holds the full text between __PROCTOR__ Start results and
__PROCTOR__ End results, while status contains only the status
code. The status will be one of TestResult.OK,
TestResult.ERROR, or TestResult.FAIL. The passed() method
returns True if the test status is TestResult.OK and False
otherwise.
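
To make the delimiter format concrete, here is a self-contained sketch
of a parser for the sections just described. The exact layout of the
Start test line (a colon followed by the test name) is an assumption
on my part; proctorlib.result handles the real format, so treat this
only as an illustration.

```python
class SimpleResult:
    """A stripped-down stand-in for proctorlib.result.TestResult."""
    OK = 'ok'
    ERROR = 'ERROR'
    FAIL = 'FAIL'

    def __init__(self, name, output, result):
        self.name = name
        self.output = output  # text between Start test and Start results
        self.result = result  # full text of the results section
        # Assume the status code is the first word of the results section.
        self.status = result.split()[0] if result.split() else ''

    def passed(self):
        return self.status == self.OK


def parse(text):
    """Return SimpleResult objects parsed from delimited output."""
    results = []
    name, section = None, None
    output_lines, result_lines = [], []
    for line in text.splitlines():
        if line.startswith('__PROCTOR__ Start test'):
            # Assumed layout: the test name follows the delimiter.
            name = line.split(':', 1)[1].strip() if ':' in line else ''
            output_lines, section = [], 'output'
        elif line.startswith('__PROCTOR__ Start results'):
            result_lines, section = [], 'result'
        elif line.startswith('__PROCTOR__ End results'):
            results.append(SimpleResult(name, '\n'.join(output_lines),
                                        '\n'.join(result_lines)))
            section = None
        elif section == 'output':
            output_lines.append(line)
        elif section == 'result':
            result_lines.append(line)
    return results


sample = """__PROCTOR__ Start test: Listing2.FailingTests.test1
Traceback (most recent call last):
  ...
AssertionError
__PROCTOR__ Start results
FAIL
__PROCTOR__ End results
"""

for r in parse(sample):
    print(r.name, r.status, r.passed())
```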

Code Coverage

While it runs the automated tests, Proctor uses Ned Batchelder's
coverage module to collect information about which statements in the
source files are actually executed. The code coverage statistics
gathered by coverage can be used to identify areas of the code that
need more automated tests.

By default, proctorbatch writes the code coverage statistics to
the file ./.coverage. Use the --coverage-file option to change
the filename used. To disable coverage statistics entirely, use the
--no-coverage option.

Statistics are normally collected for every line of the source being
run. Some lines should not be included in the statistics, though, if
the code includes debugging sections that are disabled while the tests
are running. In that case, use the --coverage-exclude option to
specify regular expressions to be compared against the source code.
If a source line matches the pattern, it is not included in the
statistics counts. To exclude lines that match the pattern if
DEBUG:, for example, add --coverage-exclude="if DEBUG:" to the
command line. The --coverage-exclude option can be repeated for
each pattern to be ignored.
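
The exclusion patterns are ordinary regular expressions matched
against each source line. The short sketch below shows that matching
step in isolation; the sample lines are made up, and coverage itself
applies additional rules about which lines following an excluded
branch are also skipped, which this simplification does not attempt.

```python
import re

# The same expression that would be passed via --coverage-exclude.
pattern = re.compile(r'if DEBUG:')

# Made-up source lines, for illustration only.
source_lines = [
    'x = compute()',
    'if DEBUG:',
    '    dump_state(x)',
]

# Lines matching the pattern are left out of the statistics counts.
excluded = [line for line in source_lines if pattern.search(line)]
counted = [line for line in source_lines if not pattern.search(line)]

print(excluded)
```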

Once the test run is complete, use coverage.py to produce a report
with information about the portions of the code that were not executed
and the percentage that was. For example, in the following listing the
return statements in the test methods of the FailingTests
class from Listing 2 are never executed. They were skipped because
both of the tests fail before reaching the end of the function.

$ coverage.py -r -m Listing2.py
Name       Stmts   Exec  Cover   Missing
----------------------------------------
Listing2      18     16    88%   18, 22

Refer to the documentation provided by coverage.py --help for more
information on how to print code coverage reports.

Garbage Collection

Proctor can also be used to help identify the source of memory
leaks. When using the interleaved or parsable output formats, Proctor
uses the garbage collection functions in the gc module to report on
objects that have not been cleaned up.

Listing 9 defines a test that introduces a circular reference between
two lists, a and b, by appending each to the other. Normally,
when control leaves a function's scope, its local variables are
released and their memory is reclaimed. In this case, however,
because each list is still referenced by the other, neither list is
automatically cleaned up when the test function returns. The gc
standard library module
includes an interface to discover uncollected garbage objects like
these lists, and Proctor includes a garbage collection report in the
output for each test, as in Listing 10. The garbage collection
information can be used to determine which test was being run when the
memory leaked, and then to narrow down the source of the leak.

Listing 9

#!/usr/bin/env python
# Test code with circular reference to illustrate garbage collection

import unittest

class CircularReferenceTest(unittest.TestCase):

    def test(self):
        a = []
        b = []
        b.append(a)
        a.append(b)
        return

Listing 10

$  proctorbatch --interleaved Listing9.py
Writing coverage output to .coverage
Scanning: .
  0/  1 test: Listing9.CircularReferenceTest.test ...ok
GC: Collecting...
GC: Garbage objects:
<type 'list'>
  [[[...]]]
<type 'list'>
  [[[...]]]

Ran 1 tests in 0.180s

OK

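The effect Proctor reports can be reproduced directly with the gc
module. In the sketch below, the gc.DEBUG_SAVEALL flag tells the
collector to keep the unreachable objects it finds in gc.garbage
instead of freeing them, which is one way a test runner could inspect
them; whether Proctor uses this exact flag is an assumption here.

```python
import gc

# Keep collected-but-unreachable objects in gc.garbage for inspection.
gc.set_debug(gc.DEBUG_SAVEALL)

def make_cycle():
    a = []
    b = []
    b.append(a)
    a.append(b)  # a and b now form a reference cycle

make_cycle()
unreachable = gc.collect()  # force a collection pass

# The two lists from the cycle are among the saved garbage objects.
cyclic_lists = [obj for obj in gc.garbage if isinstance(obj, list)]

gc.set_debug(0)  # restore normal collector behavior
print(unreachable, len(cyclic_lists))
```
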
Conclusion

Automated testing is perhaps one of the biggest productivity
enhancements to come out of the Agile development movement. Even if
you are not doing Test Driven Development, using automated testing to
identify regression errors can provide great peace of mind. The basic
tools provided in the Python standard library do support automated
testing, but they tend to be targeted at library or module developers
rather than large scale projects. I hope this introduction to Proctor
has suggested a few new ideas for expanding your own use of automated
tests, and for managing those tests as your project size and scope
grows.

I would like to offer a special thanks to Ned Batchelder for his help
with integrating coverage.py and Proctor, and Mrs. PyMOTW for her help
editing this article.