Automated Testing with unittest and Proctor

Automated testing is an important part of Agile development methodologies, and the practice is seeing increasing adoption even in environments where other Agile tools are not used. This article discusses testing techniques for you to use with the open source tool Proctor. By using Proctor, you will not only manage your automated test suite more effectively, but you will also obtain better results in the process.

What is Proctor?

Proctor is a tool for running automated tests in Python source code. It scans input source files looking for classes derived from the TestCase class in the unittest module of the Python standard library. You can use arbitrary organization schemes for tests defined in separate source modules or directories by applying user-defined categories to test classes. Proctor constructs test suites dynamically at run time based on your categories, making it easy to run a subset of the tests even if they are not in the same location on the filesystem. Proctor has been specifically designed to operate on a large number of tests (more than 3500 at one site). Although it depends on the unittest module, Proctor is also ideally suited for use with integration or other higher-level tests, because it is easy to configure it to run unattended.

Installation

Proctor uses the standard distutils module tools for installation support. If you have previously installed easy_install, using it is the simplest way to install packages such as Proctor that are listed in the Python Package Index.

$ sudo easy_install Proctor

Running easy_install will download and install the most recent version by default. If you do not have easy_install, download the latest version of the Proctor source code from the home page (see the references list for this article), then install it as you would any other Python package:

$ tar zxvf Proctor-1.2.tar.gz
$ cd Proctor-1.2
$ sudo python setup.py install

Once Proctor is installed, you will find a command line program, proctorbatch, in your shell execution path. Listing 1 shows the command syntax for proctorbatch. I will examine the command line options in more detail throughout the rest of this article using a few simple tests.

Listing 1

proctorbatch

    Proctor is a tool for running unit tests.  It enhances the
    existing unittest module to provide the ability to find all tests
    in a set of code, categorize them, and run some or all of them.
    Test output may be generated in a variety of formats to support
    parsing by another tool or simple, nicely formatted, reports for
    human review.

SYNTAX:

    proctorbatch [<options>] [<module name or directory name> ...]

        --category=categoryName
        --coverage-exclude=pattern
        --coverage-file=filename
        --debug
        --interleaved
        --list
        --list-categories
        --no-coverage
        --no-gc
        --no-run
        --parsable
        -q
        -v


OPTIONS:

    -h             Displays abbreviated help message.

    --help         Displays complete usage information.

    --category=categoryName
                   Run only the tests in the specified category.

                   Warning: If there are no tests in a category,
                   an error will not be produced.  The test suite
                   will appear to be empty.


    --coverage-exclude=pattern
                   Add a line exclude pattern
                   (can be a regular expression).


    --coverage-file=filename
                   Write coverage statistics to the specified file.


    --debug        Turn on debug mode to see tracebacks.


    --interleaved  Interleave error and failure messages
                   with the test list.


    --list         List tests.


    --list-categories
                   List test categories.


    --no-coverage  Disable coverage analysis.


    --no-gc        Disable garbage collection and leak reporting.


    --no-run       Do not run the tests


    --parsable     Format output to make it easier to parse.


    -q             Turn on quiet mode.


    -v             Increment the verbose level.
                   Higher levels are more verbose.
                   The default is 1.

Sample Tests and Standard unittest Features

The simplest sample set of test cases needs to include at least three tests: one to pass, one to fail, and one to raise an exception indicating an error. For this example, I have separated the tests into three classes and provided two test methods on each class. Listing 2 shows the code to define the tests, including the standard unittest boilerplate code for running them directly.

Listing 2

#!/usr/bin/env python
# Sample tests for exercising Proctor.

import unittest

class PassingTests(unittest.TestCase):

    def test1(self):
        return

    def test2(self):
        return

class FailingTests(unittest.TestCase):

    def test1(self):
        self.fail('Always fails 1')
        return

    def test2(self):
        self.fail('Always fails 2')
        return

class ErrorTests(unittest.TestCase):

    def test1(self):
        raise RuntimeError('test1 error')

    def test2(self):
        raise RuntimeError('test2 error')

if __name__ == '__main__': # pragma: no cover
    unittest.main()

When Listing2.py is run, it invokes the unittest module’s main() function. As main() runs, the standard test loader is used to find tests in the current module, and all of the discovered tests are executed one after the other. It is also possible to name individual tests or test classes to be run using arguments on the command line. For example, Listing2.py PassingTests runs both of the tests in the PassingTests class. This standard behavior is provided by the unittest module and is useful if you know where the tests you want to run are located in your code base.
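
Running that command produces output along these lines (the timing shown is illustrative):

$ python Listing2.py PassingTests
..
----------------------------------------------------------------------
Ran 2 tests in 0.000s

OK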

It is also possible to organize tests from different classes into “suites”. You can create the suites using any criteria you like – themes, feature areas, level of abstraction, specific bugs, etc. For example, this code sets up a suite containing two test cases:

import unittest
from Listing2 import *

suite1 = unittest.TestSuite([PassingTests('test1'), FailingTests('test1')])
unittest.main(defaultTest='suite1')

When run, the above code would execute the tests PassingTests.test1 and FailingTests.test1, since those are explicitly included in the suite. The trouble with creating test suites in this manner is that you have to maintain them by hand. Any time new tests are created or obsolete tests are removed, the suite definitions must be updated as well. This may not be a lot of work for small projects, but as project size and test coverage increases, the extra work can become unmanageable very quickly.

Proctor was developed to make working with tests across classes, modules, and directories easier by eliminating the manual effort involved in building suites of related tests. Over the course of several years, the set of automated tests we have written for my company’s product has grown to contain over 3500 individual tests.

Our code is organized around functional areas, with user interface and back-end code separated into different modules and packages. In order to run the automated tests for all aspects of a specific feature, a developer may need to run tests in several modules from different directories in their sandbox. By building on the standard library features of unittest, Proctor makes it easy to manage all of the tests no matter where they are located in the source tree.

Running Tests with Proctor

The first improvement Proctor makes over the standard unittest test loader is that Proctor can scan multiple files to find all of the tests, then run them in a single batch. Each Python module specified on the command line is imported, one at a time. After a module is loaded, it is scanned for classes derived from unittest.TestCase, just as with the standard test loader. All of the tests are added to a test suite, and when the scanner is finished loading the test modules, all of the tests in the suite are run.

For example, to scan all Python files in the installed version of proctorlib for tests, you would run:

$ cd /usr/lib/python2.5/site-packages
$ proctorbatch proctorlib/*.py

Proctor also accepts directory names as arguments, so the command can be written:

$ cd /usr/lib/python2.5/site-packages
$ proctorbatch proctorlib

Proctor will search recursively down through any directories given to find all of the tests in any modules in subdirectories. The file or directory names are converted to importable package names, so that directory/file.py is imported as directory.file. If your code is organized under a single Python package, and you wish to run all of the tests in the package, you only need to specify the root directory for that package.
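
The name conversion is conceptually simple. The following sketch shows the idea (it is an illustration, not Proctor's actual implementation):

import os

def path_to_module_name(path):
    # Strip the .py extension and turn path separators into dots,
    # so "directory/file.py" becomes "directory.file".
    base, _ = os.path.splitext(path)
    return base.replace(os.sep, '.')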

Expanding Automated Testing Beyond Unit Tests

Tests are often categorized by developers based on the scope of the functionality being tested as either “unit” or “integration” tests. A unit test is usually a very low level test that requires few or no external resources and verifies the functionality of an isolated feature (such as a single method of a single class). An integration test, by contrast, depends on the interaction of several classes or instances, and is designed to ensure that the API between the objects works as expected. For example, an integration test might verify that an ORM library stores data in the expected tables, or that temporary files are managed correctly when some filesystem actions are performed.

At the company where I work, we use the unittest framework from the Python standard library for all of our automated unit and integration tests. It is convenient for us to use a single framework, because it means the developers only have to manage one set of test tools. Another benefit is that the nightly batch job that runs the integration tests also includes all of the unit tests. By running the unit and integration tests automatically every night, we can identify regression errors we might not have otherwise detected until later in the testing cycle. The integration tests use database and filesystem resources created as fixtures by the test setUp and tearDown hooks. Our developers can run unit-level tests directly with unittest.main(), or test entire source packages with Proctor. The code for integration tests may be mingled in the same modules as the unit tests or kept in separate modules, depending on how the developer responsible for that area of the code has it organized.
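
As a sketch of what one of those integration tests might look like (the class and the behavior being verified here are hypothetical), setUp() builds a filesystem fixture and tearDown() removes it:

import os
import shutil
import tempfile
import unittest

class FileStoreIntegrationTests(unittest.TestCase):
    "Hypothetical integration test using a filesystem fixture."

    def setUp(self):
        # Create a scratch directory for the code under test to use.
        self.scratch_dir = tempfile.mkdtemp()
        return

    def tearDown(self):
        # Remove the scratch directory and anything written to it.
        shutil.rmtree(self.scratch_dir)
        return

    def testWritesExpectedFile(self):
        # A real test would exercise application code here; this
        # stand-in only shows the shape of the fixture handling.
        filename = os.path.join(self.scratch_dir, 'example.txt')
        f = open(filename, 'w')
        f.write('data')
        f.close()
        self.failUnless(os.path.exists(filename))
        return

if __name__ == '__main__':
    unittest.main()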

Some of the tests we have written need hardware that may not always be available in the test environment. We write automated tests for all of the “device driver” modules that integrate infrastructure devices such as network switches, load balancers, and storage arrays with our product. These tests typically require an actual device to run successfully, since they reconfigure the device and then verify that the results match what is expected. This situation poses a problem, since the test equipment is not always present in every test environment. Sometimes we only have one compatible device. At other times, a device that was on loan may have been returned to the original vendor after the driver was finished. In both of these cases, the test setup code will not be able to find the equipment it needs. Under these circumstances, the tests will produce an error every time they run, and it is useful to be able to skip over them to avoid false alarms and wasted test time.
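
To make the problem concrete, a device driver test class might open a connection to the equipment in setUp(), so every test in the class is reported as an error when the device is absent. The host name and port here are hypothetical:

import socket
import unittest

# Hypothetical address of the lab switch exercised by these tests.
SWITCH_ADDRESS = ('switch-lab-1.example.com', 22)

class SwitchDriverTests(unittest.TestCase):
    "Integration tests that reconfigure a real network switch."

    def setUp(self):
        # If the device is unreachable, connect() raises an exception
        # and the test run records an error for each test in the class.
        self.connection = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connection.settimeout(5)
        self.connection.connect(SWITCH_ADDRESS)
        return

    def tearDown(self):
        self.connection.close()
        return

    def testEnablePort(self):
        # The real test would drive the device API here.
        return

if __name__ == '__main__':
    unittest.main()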

Proctor solves this problem with a flag that causes it to ignore the tests in a module. To tell Proctor to ignore a specific module, set the module-level variable __proctor_ignore_module__ to True in the source. Listing 3 shows an example with this flag set. Proctor still imports the module, but when it sees the flag, it does not scan the contents of the module for tests. When the resource needed by the tests becomes available in our lab and it is time to test a device driver, we simply run the test file directly instead of using Proctor.

Listing 3

#!/usr/bin/env python
# The tests in this module are ignored by Proctor

import unittest

# Tell Proctor to ignore the tests in this module.
__proctor_ignore_module__ = True

class IgnoredTest(unittest.TestCase):

    def testShouldNotBeRun(self):
        self.fail('This test will not be run by Proctor')
        return

if __name__ == '__main__':
    # If this file is run directly, the tests are still executed.
    unittest.main()

Some of our other tests use resources available when the tests are run on a developer workstation, but not when they are run as part of a nightly batch job. For example, some portions of the graphical user interface for our product have automated tests, but since it is an X Windows application, those tests cannot be run without an X Windows server, which is not present on the automated test server. Since all of the GUI code is in one directory, it is easier to instruct Proctor to ignore all of the modules in that directory instead of setting the ignore flag separately for each file.

Proctor supports ignoring entire directories through a configuration file named .proctor. The configuration file in each directory can specify modules or subdirectories for Proctor to skip when scanning for tests. The files and directories listed in the ignore variable are not imported at all, so if importing some modules would fail without resources such as an X Windows server, a .proctor file is a more effective way to ignore them than setting the ignore flag inside the source. All of the file and directory names in the ignore list are relative to the directory containing the configuration file. For example, to ignore the productname.gui package, create a file in the productname directory containing ignore = ['gui'], like this:

# Proctor instructions file ".proctor"
# Importing the gui module requires an X server,
# which is not available for the nightly test batch job.
ignore = ['gui']

The .proctor file uses Python syntax and can contain any legal Python code. This means you can use modules such as os and glob to build up the list of files to be ignored, following any rules you want to establish. Here is a more sophisticated example which only disables the GUI tests if it cannot find the X server they require:

import os

ignore = []
if os.environ.get('DISPLAY') is None:
    ignore.append('gui')

Organizing Tests Beyond Classes and Modules

The usual way to organize related test functions is by placing them together in the same class, and then by placing related classes together in the same module. Such a neat organizational scheme is not always possible, however, and related tests might be in different modules or even in different directories. Sometimes, test modules grow too large and need to be broken up so they are easier to maintain. In other cases, when a feature is implemented, different aspects of the code may be spread among files in multiple source directories, reflecting the different layers of the application. Proctor can use test categories to dynamically construct a test suite of related tests without requiring the test authors to know about all of the tests in advance or to update a test suite manually.

Proctor uses simple string identifiers as test categories, much like the tags commonly found in a Web 2.0 application. It is easy to add categories to your existing tests by setting the class attribute PROCTOR_TEST_CATEGORIES to a sequence of strings; no special base class is needed. Then tell proctorbatch to limit the test suite to the tests in a specific category using the --category option.

Using proctorbatch

Listing 4 defines some new test classes with categories assigned, which I will use to demonstrate how the command line options to proctorbatch work. The first class, FeatureOneTests, is categorized as being related to “feature1”. The tests in the second class, FeatureOneAndTwoTests, are categorized as being related to both “feature1” and “feature2”, representing a set of integration-level tests verifying the interface between the two features. The UncategorizedTests class is not included in any category. Now that the test classes are defined, I will show how to use proctorbatch to work with them in a variety of ways.

Listing 4

#!/usr/bin/env python
# Categorized tests.

import unittest

class FeatureOneTests(unittest.TestCase):
    "Unit tests for feature1"

    PROCTOR_TEST_CATEGORIES = ( 'feature1',)

    def test(self):
        return

class FeatureOneAndTwoTests(unittest.TestCase):
    "Integration tests for feature1 and feature2"

    PROCTOR_TEST_CATEGORIES = ( 'feature1', 'feature2', )

    def test1(self):
        return

    def test2(self):
        return

class UncategorizedTests(unittest.TestCase):
    "Not in any category"

    def test(self):
        return

if __name__ == '__main__':
    unittest.main()

Proctor provides several command line options that are useful for examining a test suite without actually running the tests. To print a list of the categories for all tests in the module, use the --list-categories option:

$ proctorbatch -q --list-categories Listing4.py
All
Unspecified
feature1
feature2

The output is an alphabetical listing of all of the test category names for all of the tests found in the input files. Proctor creates two categories automatically every time it is run. The category named “All” contains every test discovered. The “Unspecified” category includes any test that does not have a specific category, making it easy to find uncategorized tests when a test set starts to become unwieldy or more complex. When a test class does not have any categories defined, its tests are run when no --category option is specified on the command line to proctorbatch, or when the “All” category is used (the default).

To examine the test set to see which tests are present, use the --list option instead:

$ proctorbatch -q --list Listing4.py
test: Listing4.FeatureOneAndTwoTests.test1
test: Listing4.FeatureOneAndTwoTests.test2
test: Listing4.FeatureOneTests.test
test: Listing4.UncategorizedTests.test

And to see the tests only for a specific category, use the --category and --list options together:

$ proctorbatch -q --list --category feature2 Listing4.py
test: Listing4.FeatureOneAndTwoTests.test1
test: Listing4.FeatureOneAndTwoTests.test2

To see the list of uncategorized tests, use the category “Unspecified”:

$ proctorbatch -q --list --category Unspecified Listing4.py
test: Listing4.UncategorizedTests.test

After verifying that a category includes the right tests, to run the tests in the category, use the --category option without the --list option:

$ proctorbatch --category feature2 Listing4.py
Writing coverage output to .coverage
Scanning: .
test1 (test: Listing4.FeatureOneAndTwoTests) ... ok
test2 (test: Listing4.FeatureOneAndTwoTests) ... ok

---------------------------------------------------
Ran 2 tests in 0.002s

OK

Identifying Test Categories

While test category names can hold any meaning you want to give them, over time I have found that broad categories work better than narrowly defined ones. When a category is too narrowly focused, the tests are likely to be in the same module or directory anyway. In that case, defining the category serves little purpose, since it is easy enough to just run the tests in that file or directory.

When using a broad category, it is more likely that the tests involved will span multiple directories. At that point, having a single category to encompass them becomes a useful way to consolidate the tests. Suppose, for example, there is an application that authenticates a user before allowing an action. It has a User class to manage users and verify their credentials. It also has a command line interface that depends on the User class to perform authentication. There are unit tests for methods of the User class, and integration tests to ensure that authentication works properly in the command line program. Since the command line program is unlikely to be in the same section of the source tree as the low-level module containing the User class, it would be beneficial to define a test category for “authentication” tests so all of the related tests can be run together.
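
A sketch of the categorization might look like the code below. The module layout and class names are hypothetical; the important point is that both classes declare the same category even though they live in different parts of the source tree:

import unittest

# In the low-level package, e.g. productname/model/tests.py (hypothetical):
class UserCredentialTests(unittest.TestCase):
    "Unit tests for the User class."

    PROCTOR_TEST_CATEGORIES = ('authentication',)

    def testRejectsBadPassword(self):
        return

# In the command line front end, e.g. productname/cli/tests.py (hypothetical):
class LoginCommandTests(unittest.TestCase):
    "Integration tests for command line authentication."

    PROCTOR_TEST_CATEGORIES = ('authentication',)

    def testPromptsForPassword(self):
        return

Running proctorbatch --category authentication productname would then collect the tests from both modules into a single suite.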

These sorts of broad categories are also useful when a feature involves many aspects of a system at the same level. For example, when the user edits data through a web application, the user authentication, session management, cookie handling, and database aspects might all be involved at different points. A “login” category could be applied to unit tests from each aspect, so the tests can be run individually or as a group. Adding categories makes it immensely easier to run the right tests to identify regression errors when changes could affect multiple areas of a large application.

Monitoring Test Progress By Hand

Proctor accepts several command line options to control the format of the output of your test run, depending on your preference or need. The default output format uses the same style as the unittest test runner. The verbosity level is set to 1 by default, so the full names of all tests are printed along with the test outcome. To see only the pass/fail status for the tests, reduce the verbosity level by using the -q option. See Listing 5 for an example of the default output.

Listing 5

$ proctorbatch  Listing2.py
Writing coverage output to .coverage
Scanning: .
test1 (test: Listing2.FailingTests) ... FAIL
test2 (test: Listing2.FailingTests) ... FAIL
test1 (test: Listing2.PassingTests) ... ok
test2 (test: Listing2.PassingTests) ... ok

======================================================================
FAIL: test1 (test: Listing2.FailingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/Listing2.py", line 17, in test1
    self.fail('Always fails 1')
AssertionError: Always fails 1

======================================================================
FAIL: test2 (test: Listing2.FailingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/Listing2.py", line 21, in test2
    self.fail('Always fails 2')
AssertionError: Always fails 2

----------------------------------------------------------------------
Ran 4 tests in 0.006s

FAILED (failures=2)

$ proctorbatch  -q Listing2.py
FF..
======================================================================
FAIL: test1 (test: Listing2.FailingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/Listing2.py", line 17, in test1
    self.fail('Always fails 1')
AssertionError: Always fails 1

======================================================================
FAIL: test2 (test: Listing2.FailingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/Listing2.py", line 21, in test2
    self.fail('Always fails 2')
AssertionError: Always fails 2

----------------------------------------------------------------------
Ran 4 tests in 0.007s

FAILED (failures=2)

When using the default format, Proctor does not print any failure or error messages until all of the tests have run. If your test suite is very large, or the integration tests require fixtures that take a lot of time to configure, you may not want to wait for the tests to finish before discovering which tests have not passed. In that case, you can use the --interleaved option to show the test results along with the name of each test as it runs, as illustrated in Listing 6.

Listing 6

$ proctorbatch  --interleaved --no-gc Listing2.py
Writing coverage output to .coverage
Scanning: .
  1/  6 test: Listing2.ErrorTests.test1 ...ERROR in test: Listing2.ErrorTests.test1
Traceback (most recent call last):
  File "Listing2.py", line 27, in test1
    raise RuntimeError('test1 error')
RuntimeError: test1 error

  2/  6 test: Listing2.ErrorTests.test2 ...ERROR in test: Listing2.ErrorTests.test2
Traceback (most recent call last):
  File "Listing2.py", line 30, in test2
    raise RuntimeError('test2 error')
RuntimeError: test2 error

  3/  6 test: Listing2.FailingTests.test1 ...FAIL in test: Listing2.FailingTests.test1
Traceback (most recent call last):
  File "Listing2.py", line 17, in test1
    self.fail('Always fails 1')
AssertionError: Always fails 1

  4/  6 test: Listing2.FailingTests.test2 ...FAIL in test: Listing2.FailingTests.test2
Traceback (most recent call last):
  File "Listing2.py", line 21, in test2
    self.fail('Always fails 2')
AssertionError: Always fails 2

  5/  6 test: Listing2.PassingTests.test1 ...ok
  6/  6 test: Listing2.PassingTests.test2 ...ok

Ran 6 tests in 0.013s

FAILED (failures=2, errors=2)

Automatic Test Output Processing

For especially large test runs, or if you are committed to more complete test automation, you may not want to examine the test results by hand at all. Proctor can also produce a simple parsable output format suitable for automatic processing. The output format can be processed by another program to summarize the results or even open tickets in your defect tracking system. To have Proctor report the test results in this format, pass the --parsable option to proctorbatch on the command line. Listing 7 includes a sample of the parsable output format.

Listing 7

$ proctorbatch  --parsable --no-gc Listing2.py
Writing coverage output to .coverage
Scanning: .
__PROCTOR__ Start run
__PROCTOR__ Start test
test: Listing2.ErrorTests.test1
Traceback (most recent call last):
  File "Listing2.py", line 27, in test1
    raise RuntimeError('test1 error')
RuntimeError: test1 error

__PROCTOR__ Start results
ERROR in test: Listing2.ErrorTests.test1
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  1/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.ErrorTests.test2
Traceback (most recent call last):
  File "Listing2.py", line 30, in test2
    raise RuntimeError('test2 error')
RuntimeError: test2 error

__PROCTOR__ Start results
ERROR in test: Listing2.ErrorTests.test2
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  2/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.FailingTests.test1
Traceback (most recent call last):
  File "Listing2.py", line 17, in test1
    self.fail('Always fails 1')
AssertionError: Always fails 1
__PROCTOR__ Start results
FAIL in test: Listing2.FailingTests.test1
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  3/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.FailingTests.test2
Traceback (most recent call last):
  File "Listing2.py", line 21, in test2
    self.fail('Always fails 2')
AssertionError: Always fails 2
__PROCTOR__ Start results
FAIL in test: Listing2.FailingTests.test2
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  4/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.PassingTests.test1
__PROCTOR__ Start results
ok
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  5/  6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.PassingTests.test2
__PROCTOR__ Start results
ok
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
  6/  6
__PROCTOR__ End progress
__PROCTOR__ End run
__PROCTOR__ Start summary
Failures: 2
Errors: 2
Successes: 2
Tests: 6
Elapsed time (sec): 0.014
Status: FAILED
__PROCTOR__ End summary

Since the test results may be part of a larger log file that includes other information such as build output and installation messages, Proctor uses easily identifiable delimiters to separate the sections in its output. Each delimiter appears on a line by itself, and begins with __PROCTOR__ to make it less likely that the output of any other program will be misinterpreted as test output.

Proctor assumes there is no need to automatically process the output of the scanning phase, so the first delimiter (__PROCTOR__ Start run) is printed at the beginning of the test execution phase. The string __PROCTOR__ Start test appears at the beginning of each test, followed on the next line by the name of the test. Any output produced by the test appears beginning on the line immediately following the name. The test output is followed by a traceback, if the test does not pass.

The text between the __PROCTOR__ Start results and __PROCTOR__ End results delimiters always begins with one of ok, ERROR, or FAIL, depending on the outcome of the test. If the test did not pass, the rest of the text in the results section consists of the full name of the test. The string __PROCTOR__ End test follows each test result. Between the results for each test, a progress section shows the current test number and the total number of tests being run.

Proctor comes with proctorfilter, a simple command line program to process a log file and print the names of tests with certain status codes. It accepts three command line options, --ok, --error, and --fail, to control which tests are listed in the output. For example, to find the tests which failed in the sample output, run:

$ proctorfilter --fail Listing7.txt
test: Listing2.FailingTests.test1: FAIL
test: Listing2.FailingTests.test2: FAIL

The default behavior for proctorfilter, when no command line options are given, is to print a list of tests that either had an error or failed.

Building Your Own Results Parser

Using proctorfilter to summarize a set of test results is only one way to automate results processing for your tests. Another way to handle the test results is to open a new ticket in a bug tracking system for each test that does not pass during the nightly test run. When the ticket is opened, it should include all of the information available, including output from the test and the traceback from the failure or error. Although proctorfilter does not include all of that information, the Proctor library also includes a result module, with classes useful for building your own test result processing program.

Listing 8 shows a sample program that recreates the default functionality of proctorfilter using the proctorlib.result module. The ResultFactory class parses input text passed to feed() and creates TestResult instances. Each time a complete test result has been fed in, a new TestResult is constructed and passed as an argument to the callback given to the ResultFactory constructor. In the sample program, the callback function show_test_result() looks at the status code for the test before deciding whether to print out the summary.

Listing 8

#!/usr/bin/env python
# Print a list of tests which did not pass.

import fileinput
from proctorlib.result import ResultFactory, TestResult

def show_test_result(test_result):
    "Called for each test result parsed from the input data."
    if not test_result.passed():
        print test_result
    return

# Set up the parser
parser = ResultFactory(show_test_result)

# Process data from stdin or files named via sys.argv
for line in fileinput.input():
    parser.feed(line)

A TestResult instance has several attributes of interest. The name attribute uniquely identifies the test. The name includes the full import path for the module, as well as the class and method name of the test. The output attribute includes all of the text appearing between the __PROCTOR__ Start test and __PROCTOR__ Start results delimiters, including the traceback, if any. The result attribute includes the full text from between __PROCTOR__ Start results and __PROCTOR__ End results, while status contains only the status code. The status will be the same as one of TestResult.OK, TestResult.ERROR, or TestResult.FAIL. The passed() method returns True if the test status is TestResult.OK and False otherwise.
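
For example, a small variation on Listing 8 could use those attributes to print the captured output, including the traceback, for each test that did not pass (the report format here is my own invention):

#!/usr/bin/env python
# Show the captured output for each test that did not pass.

import fileinput
from proctorlib.result import ResultFactory, TestResult

def show_failure_details(test_result):
    "Called for each test result parsed from the input data."
    if test_result.passed():
        return
    print test_result.name      # module, class, and method of the test
    print test_result.status    # TestResult.ERROR or TestResult.FAIL
    print test_result.output    # test output, including any traceback
    return

# Set up the parser
parser = ResultFactory(show_failure_details)

# Process data from stdin or files named via sys.argv
for line in fileinput.input():
    parser.feed(line)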

Code Coverage

At the same time it is running the automated tests, Proctor uses Ned Batchelder’s coverage module to collect information about which statements in the source files are actually executed. The code coverage statistics gathered by coverage can be used to identify areas of the code that need to have more automated tests written.

By default, proctorbatch writes the code coverage statistics to the file ./.coverage. Use the --coverage-file option to change the filename used. To disable coverage statistics entirely, use the --no-coverage option.

Statistics are normally collected for every line of the source being run. Some lines should not be included in the statistics, though, such as debugging sections that are disabled while the tests are running. In that case, use the --coverage-exclude option to specify regular expressions to be compared against the source code. If a source line matches one of the patterns, it is not included in the statistics counts. For example, to exclude lines matching the pattern if DEBUG:, add --coverage-exclude="if DEBUG:" to the command line. The --coverage-exclude option can be repeated for each pattern to be ignored.
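
Putting those options together, a nightly run that writes its statistics to a shared location and skips debug-only lines might be invoked like this (the file name and the second pattern are only illustrative):

$ proctorbatch --coverage-file=/tmp/nightly.coverage \
    --coverage-exclude="if DEBUG:" \
    --coverage-exclude="def __repr__" \
    proctorlib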

Once the test run is complete, use coverage.py to produce a report showing which portions of the code were not executed and what percentage of each module was covered. For example, in the following listing the return statements in the test methods of the FailingTests class from Listing 2 are never executed, because both tests fail before reaching the end of the function.

$ coverage.py -r -m Listing2.py
Name       Stmts   Exec  Cover   Missing
----------------------------------------
Listing2      18     16    88%   18, 22

Refer to the documentation provided by coverage.py --help for more information on how to print code coverage reports.

Garbage Collection

Proctor can also be used to help identify the source of memory leaks. When using the interleaved or parsable output formats, Proctor uses the garbage collection functions of the gc module to report on objects that have not been cleaned up.

Listing 9 defines a test that introduces a circular reference between two lists, a and b, by appending each to the other. Normally, when execution leaves a function’s scope, the reference counts of the local variables drop to zero and their memory is reclaimed. In this case, however, since each list is still referenced by the other, the counts never reach zero and the lists are not automatically cleaned up when the test function returns. The gc standard library module includes an interface for discovering uncollected garbage objects like these lists, and Proctor includes a garbage collection report in the output for each test, as in Listing 10. The garbage collection information can be used to determine which test was being run when the memory leaked, and then to narrow down the source of the leak.

Listing 9

#!/usr/bin/env python
# Test code with circular reference to illustrate garbage collection

import unittest

class CircularReferenceTest(unittest.TestCase):

    def test(self):
        a = []
        b = []
        b.append(a)
        a.append(b)
        return

Listing 10

$  proctorbatch --interleaved Listing9.py
Writing coverage output to .coverage
Scanning: .
  0/  1 test: Listing9.CircularReferenceTest.test ...ok
GC: Collecting...
GC: Garbage objects:
<type 'list'>
  [[[...]]]
<type 'list'>
  [[[...]]]

Ran 1 tests in 0.180s

OK

Conclusion

Automated testing is perhaps one of the biggest productivity enhancements to come out of the Agile development movement. Even if you are not doing Test Driven Development, using automated testing to identify regression errors can provide great peace of mind. The basic tools provided in the Python standard library do support automated testing, but they tend to be targeted at library or module developers rather than large scale projects. I hope this introduction to Proctor has suggested a few new ideas for expanding your own use of automated tests, and for managing those tests as your project size and scope grows.

I would like to offer a special thanks to Ned Batchelder for his help with integrating coverage.py and Proctor, and Mrs. PyMOTW for her help editing this article.