Automated Testing with unittest and Proctor
Automated testing is an important part of Agile development methodologies, and the practice is seeing increasing adoption even in environments where other Agile tools are not used. This article discusses testing techniques for you to use with the open source tool Proctor. By using Proctor, you will not only manage your automated test suite more effectively, but you will also obtain better results in the process.
What is Proctor?
Proctor is a tool for running automated tests in Python source
code. It scans input source files looking for classes based on the
TestCase
class from the unittest module in the
Python standard library. You can use arbitrary organization schemes
for tests defined in separate source modules or directories by
applying user-defined categories to test classes. Proctor constructs
test suites dynamically at run time based on your categories, making
it easy to run a subset of the tests even if they are not in the same
location on the filesystem. Proctor has been specifically designed to
operate on a large number of tests (more than 3500 at one
site). Although it depends on the unittest module, Proctor is also
ideally suited for integration and other higher-level tests, because
it is easy to configure to run unattended.
Installation
Proctor uses the standard distutils module tools for
installation support. If you have previously installed easy_install
,
using it is the simplest way to install packages such as Proctor that
are listed in the Python Package Index.
$ sudo easy_install Proctor
Running easy_install
will download and install the most recent
version by default. If you do not have easy_install
, download the
latest version of the Proctor source code from the home page (see the
references list for this article), then install it as you would any
other Python package:
$ tar zxvf Proctor-1.2.tar.gz
$ cd Proctor-1.2
$ sudo python setup.py install
Once Proctor is installed, you will find a command line program,
proctorbatch
, in your shell execution path. Listing 1 shows the
command syntax for proctorbatch
. I will examine the command line
options in more detail throughout the rest of this article using a few
simple tests.
Listing 1
proctorbatch

Proctor is a tool for running unit tests. It enhances the existing
unittest module to provide the ability to find all tests in a set of
code, categorize them, and run some or all of them. Test output may
be generated in a variety of formats to support parsing by another
tool or simple, nicely formatted, reports for human review.

SYNTAX:

  proctorbatch [<options>] [<module or directory> ...]

    --category=categoryName
    --coverage-exclude=pattern
    --coverage-file=filename
    --debug
    --interleaved
    --list
    --list-categories
    --no-coverage
    --no-gc
    --no-run
    --parsable
    -q
    -v

OPTIONS:

    -h             Displays abbreviated help message.

    --help         Displays complete usage information.

    --category=categoryName
                   Run only the tests in the specified category.
                   Warning: If there are no tests in a category,
                   an error will not be produced. The test suite
                   will appear to be empty.

    --coverage-exclude=pattern
                   Add a line exclude pattern (can be a regular
                   expression).

    --coverage-file=filename
                   Write coverage statistics to the specified file.

    --debug        Turn on debug mode to see tracebacks.

    --interleaved  Interleave error and failure messages with the
                   test list.

    --list         List tests.

    --list-categories
                   List test categories.

    --no-coverage  Disable coverage analysis.

    --no-gc        Disable garbage collection and leak reporting.

    --no-run       Do not run the tests

    --parsable     Format output to make it easier to parse.

    -q             Turn on quiet mode.

    -v             Increment the verbose level. Higher levels are
                   more verbose. The default is 1.
Sample Tests and Standard unittest Features
The simplest sample set of test cases needs to include at least three
tests: one to pass, one to fail, and one to raise an exception
indicating an error. For this example, I have separated the tests into
three classes and provided two test methods on each class. Listing 2
shows the code to define the tests, including the standard unittest
boilerplate code for running them directly.
Listing 2
#!/usr/bin/env python
# Sample tests for exercising Proctor.

import unittest

class PassingTests(unittest.TestCase):

    def test1(self):
        return

    def test2(self):
        return

class FailingTests(unittest.TestCase):

    def test1(self):
        self.fail('Always fails 1')
        return

    def test2(self):
        self.fail('Always fails 2')
        return

class ErrorTests(unittest.TestCase):

    def test1(self):
        raise RuntimeError('test1 error')

    def test2(self):
        raise RuntimeError('test2 error')

if __name__ == '__main__': # pragma: no cover
    unittest.main()
When Listing2.py
is run, it invokes the unittest module’s main()
function. As main()
runs, the standard test loader is used to find
tests in the current module, and all of the discovered tests are
executed one after the other. It is also possible to name individual
tests or test classes to be run using arguments on the command
line. For example, Listing2.py PassingTests
runs both of the tests
in the PassingTests
class. This standard behavior is provided by the
unittest module and is useful if you know where the tests you want to
run are located in your code base.
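For instance, naming a single test method limits the run to just that
test. The session below is a typical example of what unittest prints
in that case (the timing will vary):

$ python Listing2.py PassingTests.test1
.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK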
It is also possible to organize tests from different classes into “suites”. You can create the suites using any criteria you like – themes, feature areas, level of abstraction, specific bugs, etc. For example, this code sets up a suite containing two test cases:
import unittest
from Listing2 import *
suite1 = unittest.TestSuite([PassingTests('test1'), FailingTests('test1')])
unittest.main(defaultTest='suite1')
When run, the above code would execute the tests PassingTests.test1
and FailingTests.test1
, since those are explicitly included in the
suite. The trouble with creating test suites in this manner is that
you have to maintain them by hand. Any time new tests are created or
obsolete tests are removed, the suite definitions must be updated as
well. This may not be a lot of work for small projects, but as project
size and test coverage increases, the extra work can become
unmanageable very quickly.
Proctor was developed to make working with tests across classes, modules, and directories easier by eliminating the manual effort involved in building suites of related tests. Over the course of several years, the set of automated tests we have written for my company’s product has grown to contain over 3500 individual tests.
Our code is organized around functional areas, with user interface and back-end code separated into different modules and packages. In order to run the automated tests for all aspects of a specific feature, a developer may need to run tests in several modules from different directories in their sandbox. By building on the standard library features of unittest, Proctor makes it easy to manage all of the tests no matter where they are located in the source tree.
Running Tests with Proctor
The first improvement Proctor makes over the standard unittest test
loader is that Proctor can scan multiple files to find all of the
tests, then run them in a single batch. Each Python module specified
on the command line is imported, one at a time. After a module is
loaded, it is scanned for classes derived from unittest.TestCase
,
just as with the standard test loader. All of the tests are added to a
test suite, and when the scanner is finished loading the test modules,
all of the tests in the suite are run.
For example, to scan all Python files in the installed version of
proctorlib
for tests you would run:
$ cd /usr/lib/python2.5/site-packages
$ proctorbatch proctorlib/*.py
Proctor also accepts directory names as arguments, so the command can be written:
$ cd /usr/lib/python2.5/site-packages
$ proctorbatch proctorlib
Proctor will search recursively down through any directories given to
find all of the tests in any modules in subdirectories. The file or
directory names are converted to importable package names, so that
directory/file.py
is imported as directory.file
. If your code is
organized under a single Python package, and you wish to run all of
the tests in the package, you only need to specify the root directory
for that package.
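The name conversion itself is simple to picture. The function below is
only a rough sketch of the rule just described, not Proctor's actual
code:

import os

def path_to_module_name(path):
    # Sketch only: "proctorlib/result.py" becomes "proctorlib.result".
    if path.endswith('.py'):
        path = path[:-len('.py')]
    return path.replace(os.sep, '.')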
Expanding Automated Testing Beyond Unit Tests
Tests are often categorized by developers based on the scope of the functionality being tested as either “unit” or “integration” tests. A unit test is usually a very low level test that requires few or no external resources and verifies the functionality of an isolated feature (such as a single method of a single class). An integration test, by contrast, depends on the interaction of several classes or instances, and is designed to ensure that the API between the objects works as expected. For example, an integration test might verify that an ORM library stores data in the expected tables, or that temporary files are managed correctly when some filesystem actions are performed.
At the company where I work, we use the unittest framework found in
the Python standard library for all of our automated unit and
integration tests. It is convenient for us to use a single framework,
because it means the developers only have to manage one set of test
tools. Another benefit is the nightly batch job that runs the
integration tests also includes all of the unit tests at the same
time. By running the unit and integration tests automatically every
night, we can identify regression errors we might not have otherwise
detected until later in the testing cycle. The integration tests use
database and filesystem resources created as fixtures by the test
setUp
and tearDown
hooks. Our developers can run unit-level tests
directly with unittest.main()
, or test entire source packages with
Proctor. The code for integration tests may be mingled in the same
modules with the unit tests or in separate modules, depending on how
the developer responsible for the area of code in question has it
organized.
Some of the tests we have written need to use hardware that may not always be available in the test environment. We write automated tests for all of the “device driver” modules that are used to integrate infrastructure devices such as network switches, load balancers, and storage arrays with our product. These tests typically require an actual device for the test to run successfully, since the tests reconfigure it and then verify the results match what is expected. This situation poses a problem, since the test equipment is not always present in every test environment. Sometimes we only have one compatible device. At other times, a device was on loan and so may have been returned to the original vendor after the driver was finished. In both of these cases, the test setup code will not be able to find the equipment required for testing. Under these circumstances, the tests will produce an error every time they run, and it is useful to be able to skip over them and thus avoid false alarms and wasted test time.
Proctor solves this problem with a flag that causes it to ignore the
tests in a module. To tell Proctor to ignore a specific module, add
the module level variable __proctor_ignore_module__
in the
source. Listing 3 shows an example with this flag set. Proctor still
imports the module, but when it sees the flag set to True
, it does
not scan the contents of the module for tests. When the resource
needed by the tests becomes available in our lab and it is time to
test a device driver, we simply run the test file directly instead of
using Proctor.
Listing 3
#!/usr/bin/env python
# The tests in this module are ignored by Proctor

import unittest

# Tell Proctor to ignore the tests in this module.
__proctor_ignore_module__ = True

class IgnoredTest(unittest.TestCase):

    def testShouldNotBeRun(self):
        self.fail('This test will not be run by Proctor')
        return

if __name__ == '__main__':
    # If this file is run directly, the tests are still executed.
    unittest.main()
Some of our other tests use resources available when the tests are run on a developer workstation, but not when they are run as part of a nightly batch job. For example, some portions of the graphical user interface for our product have automated tests, but since it is an X Windows application, those tests cannot be run without an X Windows server, which is not present on the automated test server. Since all of the GUI code is in one directory, it is easier to instruct Proctor to ignore all of the modules in that directory instead of setting the ignore flag separately for each file.
Proctor supports ignoring entire directories through a configuration
file named .proctor
. The configuration file in each directory can be
used to specify modules or subdirectories to be ignored by Proctor
when scanning for tests. The files or directories specified in the
ignore
variable are not imported at all, so if importing some
modules would fail without resources like an X Windows server
available, you can use a .proctor
file as a more effective method of
ignoring them rather than setting the ignore flag inside the
source. All of the file or directory names in the ignore list are
relative to the directory containing the configuration file. For
example, to ignore the productname.gui
package, create a file in the
productname
directory containing ignore = ['gui']
, like
this:
# Proctor instructions file ".proctor"
# Importing the gui module requires an X server,
# which is not available for the nightly test batch job.
ignore = ['gui']
The .proctor
file uses Python syntax and can contain any legal
Python code. This means you can use modules such as
os and glob to build up the list of
files to be ignored, following any rules you want to establish. Here
is a more sophisticated example which only disables the GUI tests if
it cannot find the X server they require:
import os

ignore = []
if os.environ.get('DISPLAY') is None:
    ignore.append('gui')
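The ignore list can also be built from a file naming convention. The
sketch below assumes a hypothetical convention in which
hardware-dependent test modules end in _hardware.py, and it assumes
that bare module names are accepted in the list the same way the
directory name 'gui' is above; check that against your installed
version of Proctor.

# Hypothetical convention: modules named *_hardware.py need lab equipment.
import glob
import os

ignore = []
for filename in glob.glob('*_hardware.py'):
    ignore.append(os.path.splitext(filename)[0])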
Organizing Tests Beyond Classes and Modules
The usual way to organize related test functions is by placing them together in the same class, and then by placing related classes together in the same module. Such a neat organizational scheme is not always possible, however, and related tests might be in different modules or even in different directories. Sometimes, test modules grow too large and need to be broken up so they are easier to maintain. In other cases, when a feature is implemented, different aspects of the code may be spread among files in multiple source directories, reflecting the different layers of the application. Proctor can use test categories to dynamically construct a test suite of related tests without requiring the test authors to know about all of the tests in advance or to update a test suite manually.
Proctor uses simple string identifiers as test categories, much like
the tags commonly found in a Web 2.0 application. It is easy to add
categories to your existing tests by setting the class attribute
PROCTOR_TEST_CATEGORIES
to a sequence of strings; no special base
class is needed. Then tell proctorbatch
to limit the test suite to
tests in a specific category using the --category
option.
Using proctorbatch
Listing 4 shows some new test classes with categories that are useful
as examples to demonstrate how the command line options to
proctorbatch
work. The first class, FeatureOneTests
, is
categorized as being related to “feature1”. The tests in the second
class, FeatureOneAndTwoTests
, are categorized as being related to
both “feature1” and “feature2”, representing a set of integration
level tests verifying the interface between the two features. The
UncategorizedTests
class is not included in any category. Now that
the test classes are defined, I will show how to use proctorbatch
to
work with them in a variety of ways.
Listing 4
#!/usr/bin/env python
# Categorized tests.

import unittest

class FeatureOneTests(unittest.TestCase):
    "Unit tests for feature1"

    PROCTOR_TEST_CATEGORIES = ('feature1',)

    def test(self):
        return

class FeatureOneAndTwoTests(unittest.TestCase):
    "Integration tests for feature1 and feature2"

    PROCTOR_TEST_CATEGORIES = ('feature1', 'feature2',)

    def test1(self):
        return

    def test2(self):
        return

class UncategorizedTests(unittest.TestCase):
    "Not in any category"

    def test(self):
        return

if __name__ == '__main__':
    unittest.main()
Proctor provides several command line options that are useful for
examining a test suite without actually running the tests. To print a
list of the categories for all tests in the module, use the
--list-categories
option:
$ proctorbatch -q --list-categories Listing4.py
All
Unspecified
feature1
feature2
The output is an alphabetical listing of all of the test category
names for all of the tests found in the input files. Proctor creates
two categories automatically every time it is run. The category named
“All” contains every test discovered. The “Unspecified” category
includes any test that does not have a specific category, making it
easy to find uncategorized tests when a test set starts to become
unwieldy or more complex. When a test class does not have any
categories defined, its tests are run when no --category
option
is specified on the command line to proctorbatch
, or when the “All”
category is used (the default).
To examine the test set to see which tests are present, use the
--list
option instead:
$ proctorbatch -q --list Listing4.py
test: Listing4.FeatureOneAndTwoTests.test1
test: Listing4.FeatureOneAndTwoTests.test2
test: Listing4.FeatureOneTests.test
test: Listing4.UncategorizedTests.test
And to see the tests only for a specific category, use the
--category
and --list
options together:
$ proctorbatch -q --list --category feature2 Listing4.py
test: Listing4.FeatureOneAndTwoTests.test1
test: Listing4.FeatureOneAndTwoTests.test2
To see the list of uncategorized tests, use the category “Unspecified”:
$ proctorbatch -q --list --category Unspecified Listing4.py
test: Listing4.UncategorizedTests.test
After verifying that a category includes the right tests, to run the
tests in the category, use the --category
option without the
--list
option:
$ proctorbatch --category feature2 Listing4.py
Writing coverage output to .coverage
Scanning: .
test1 (test: Listing4.FeatureOneAndTwoTests) ... ok
test2 (test: Listing4.FeatureOneAndTwoTests) ... ok
---------------------------------------------------
Ran 2 tests in 0.002s
OK
Identifying Test Categories
While test category names can hold any meaning you want to give them, over time I have found that using broad categories is more desirable than using narrowly defined categories. When a category is too narrowly focused, the tests are more likely to be in the same module or directory anyway. In that case, there is not as much purpose to be served by defining the category, since it is easy enough to just run the tests in that file or directory.
When using a broad category, it is more likely that the tests involved
will span multiple directories. At that point, having a single
category to encompass them becomes a useful way to consolidate the
tests. Suppose, for example, there is an application that
authenticates a user before allowing an action. It has a User
class
to manage users and verify their credentials. It also has a command
line interface that depends on the User class to perform
authentication. There are unit tests for methods of the User
class,
and integration tests to ensure that authentication works properly in
the command line program. Since the command line program is unlikely
to be in the same section of the source tree as the low-level module
containing the User
class, it would be beneficial to define a test
category for “authentication” tests so all of the related tests can be
run together.
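A sketch of how those tests might be categorized follows. The class
and method names are hypothetical; the only Proctor-specific piece is
the shared category string.

# In the module containing the low-level User tests:
import unittest

class UserAuthenticationTests(unittest.TestCase):
    "Unit tests for User credential checking (hypothetical)"

    PROCTOR_TEST_CATEGORIES = ('authentication',)

    def testRejectsBadPassword(self):
        return

# In the command line program's test module, possibly in another
# directory of the source tree:
class CommandLineLoginTests(unittest.TestCase):
    "Integration tests for logging in through the CLI (hypothetical)"

    PROCTOR_TEST_CATEGORIES = ('authentication',)

    def testPromptsForCredentials(self):
        return

Running proctorbatch --category authentication over the source tree
would then pick up both classes, no matter where they live.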
These sorts of broad categories are also useful when a feature involves many aspects of a system at the same level. For example, when the user edits data through a web application, the user authentication, session management, cookie handling, and database aspects might all be involved at different points. A “login” category could be applied to unit tests from each aspect, so the tests can be run individually or as a group. Adding categories makes it immensely easier to run the right tests to identify regression errors when changes could affect multiple areas of a large application.
Monitoring Test Progress By Hand
Proctor accepts several command line options to control the format of
the output of your test run, depending on your preference or need.
The default output format uses the same style as the unittest test
runner. The verbosity level is set to 1
by default, so the full
names of all tests are printed along with the test outcome. To see
only the pass/fail status for the tests, reduce the verbosity level by
using the -q
option. See Listing 5 for an example of the default
output.
Listing 5
$ proctorbatch Listing2.py
Writing coverage output to .coverage
Scanning: .
test1 (test: Listing2.FailingTests) ... FAIL
test2 (test: Listing2.FailingTests) ... FAIL
test1 (test: Listing2.PassingTests) ... ok
test2 (test: Listing2.PassingTests) ... ok
======================================================================
FAIL: test1 (test: Listing2.FailingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/Listing2.py", line 17, in test1
self.fail('Always fails 1')
AssertionError: Always fails 1
======================================================================
FAIL: test2 (test: Listing2.FailingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/Listing2.py", line 21, in test2
self.fail('Always fails 2')
AssertionError: Always fails 2
----------------------------------------------------------------------
Ran 4 tests in 0.006s
FAILED (failures=2)
$ proctorbatch -q Listing2.py
FF..
======================================================================
FAIL: test1 (test: Listing2.FailingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/Listing2.py", line 17, in test1
self.fail('Always fails 1')
AssertionError: Always fails 1
======================================================================
FAIL: test2 (test: Listing2.FailingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/dhellmann/Documents/PythonMagazine/Articles/Proctor/trunk/Listing2.py", line 21, in test2
self.fail('Always fails 2')
AssertionError: Always fails 2
----------------------------------------------------------------------
Ran 4 tests in 0.007s
FAILED (failures=2)
When using the default format, Proctor does not print any failure or
error messages until all of the tests have run. If your test suite is
very large, or the integration tests require fixtures that take a lot
of time to configure, you may not want to wait for the tests to finish
before discovering which tests have not passed. When that is the case,
you can use the --interleaved
option to show the tests results along
with the name of the test as the test runs, as illustrated in Listing
6.
Listing 6
$ proctorbatch --interleaved --no-gc Listing2.py
Writing coverage output to .coverage
Scanning: .
1/ 6 test: Listing2.ErrorTests.test1 ...ERROR in test: Listing2.ErrorTests.test1
Traceback (most recent call last):
File "Listing2.py", line 27, in test1
raise RuntimeError('test1 error')
RuntimeError: test1 error
2/ 6 test: Listing2.ErrorTests.test2 ...ERROR in test: Listing2.ErrorTests.test2
Traceback (most recent call last):
File "Listing2.py", line 30, in test2
raise RuntimeError('test2 error')
RuntimeError: test2 error
3/ 6 test: Listing2.FailingTests.test1 ...FAIL in test: Listing2.FailingTests.test1
Traceback (most recent call last):
File "Listing2.py", line 17, in test1
self.fail('Always fails 1')
AssertionError: Always fails 1
4/ 6 test: Listing2.FailingTests.test2 ...FAIL in test: Listing2.FailingTests.test2
Traceback (most recent call last):
File "Listing2.py", line 21, in test2
self.fail('Always fails 2')
AssertionError: Always fails 2
5/ 6 test: Listing2.PassingTests.test1 ...ok
6/ 6 test: Listing2.PassingTests.test2 ...ok
Ran 6 tests in 0.013s
FAILED (failures=2, errors=2)
Automatic Test Output Processing
For especially large test runs, or if you are committed to more
complete test automation, you may not want to examine the test results
by hand at all. Proctor can also produce a simple parsable output
format suitable for automatic processing. The output format can be
processed by another program to summarize the results or even open
tickets in your defect tracking system. To have Proctor report the
test results in this format, pass the --parsable
option to
proctorbatch
on the command line. Listing 7 includes a sample of the
parsable output format.
Listing 7
$ proctorbatch --parsable --no-gc Listing2.py
Writing coverage output to .coverage
Scanning: .
__PROCTOR__ Start run
__PROCTOR__ Start test
test: Listing2.ErrorTests.test1
Traceback (most recent call last):
File "Listing2.py", line 27, in test1
raise RuntimeError('test1 error')
RuntimeError: test1 error
__PROCTOR__ Start results
ERROR in test: Listing2.ErrorTests.test1
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
1/ 6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.ErrorTests.test2
Traceback (most recent call last):
File "Listing2.py", line 30, in test2
raise RuntimeError('test2 error')
RuntimeError: test2 error
__PROCTOR__ Start results
ERROR in test: Listing2.ErrorTests.test2
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
2/ 6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.FailingTests.test1
Traceback (most recent call last):
File "Listing2.py", line 17, in test1
self.fail('Always fails 1')
AssertionError: Always fails 1
__PROCTOR__ Start results
FAIL in test: Listing2.FailingTests.test1
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
3/ 6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.FailingTests.test2
Traceback (most recent call last):
File "Listing2.py", line 21, in test2
self.fail('Always fails 2')
AssertionError: Always fails 2
__PROCTOR__ Start results
FAIL in test: Listing2.FailingTests.test2
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
4/ 6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.PassingTests.test1
__PROCTOR__ Start results
ok
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
5/ 6
__PROCTOR__ End progress
__PROCTOR__ Start test
test: Listing2.PassingTests.test2
__PROCTOR__ Start results
ok
__PROCTOR__ End results
__PROCTOR__ End test
__PROCTOR__ Start progress
6/ 6
__PROCTOR__ End progress
__PROCTOR__ End run
__PROCTOR__ Start summary
Failures: 2
Errors: 2
Successes: 2
Tests: 6
Elapsed time (sec): 0.014
Status: FAILED
__PROCTOR__ End summary
Since the test results may be part of a larger log file that includes
other information such as build output and installation messages,
Proctor uses easily identifiable delimiters to separate the sections
in its output. Each delimiter appears on a line by itself, and begins
with __PROCTOR__
to make it less likely that the output of any other
program will be misinterpreted as test output.
Proctor assumes there is no need to automatically process the output
of the scanning phase, so the first delimiter (__PROCTOR__ Start run
) is printed at the beginning of the test execution phase. The
string __PROCTOR__ Start test
appears at the beginning of each test,
followed on the next line by the name of the test. Any output produced
by the test appears beginning on the line immediately following the
name. The test output is followed by a traceback, if the test does not
pass.
The text between the __PROCTOR__ Start results
and __PROCTOR__ End results
delimiters always begins with one of ok
, ERROR
, or
FAIL
, depending on the outcome of the test. If the test did not
pass, the rest of the text in the results section consists of the full
name of the test. The string __PROCTOR__ End test
follows each test
result. Between the results for each test, a progress section shows
the current test number and the total number of tests being run.
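Even without a full parser, a few lines of Python are enough to pull
the summary section out of a larger log file. This is only a sketch
based on the delimiters shown in Listing 7:

#!/usr/bin/env python
# Sketch: print only the summary section from a saved Proctor log.
import fileinput
import sys

in_summary = False
for line in fileinput.input():
    if line.startswith('__PROCTOR__ Start summary'):
        in_summary = True
    elif line.startswith('__PROCTOR__ End summary'):
        break
    elif in_summary:
        sys.stdout.write(line)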
Proctor comes with proctorfilter
, a simple command line program to
process a log file and print the names of tests with certain status
codes. It accepts three command line options, --ok
, --error
, and
--fail
, to control which tests are listed in the output. For
example, to find the tests which failed in the sample output, run:
$ proctorfilter --fail Listing7.txt
test: Listing2.FailingTests.test1: FAIL
test: Listing2.FailingTests.test2: FAIL
The default behavior for proctorfilter
, when no command line options
are given, is to print a list of tests that either had an error or
failed.
Building Your Own Results Parser
Using proctorfilter
to summarize a set of test results is only one
way to automate results processing for your tests. Another way to
handle the test results is to open a new ticket in a bug tracking
system for each test that does not pass during the nightly test
run. When the ticket is opened, it should include all of the
information available, including output from the test and the
traceback from the failure or error. Although proctorfilter
does not
include all of that information, the Proctor library also includes a
result
module, with classes useful for building your own test result
processing program.
Listing 8 shows a sample program that recreates the default
functionality of proctorfilter
using the
proctorlib.result module. The ResultFactory
class
parses input text passed to feed()
and creates TestResult
instances. Each time a complete test result has been fed in, a new
TestResult
is constructed and passed as an argument to the callback
given to the ResultFactory
constructor. In the sample program, the
callback function show_test_result()
looks at the status code for
the test before deciding whether to print out the summary.
Listing 8
#!/usr/bin/env python
# Print a list of tests which did not pass.

import fileinput

from proctorlib.result import ResultFactory, TestResult

def show_test_result(test_result):
    "Called for each test result parsed from the input data."
    if not test_result.passed():
        print test_result
    return

# Set up the parser
parser = ResultFactory(show_test_result)

# Process data from stdin or files named via sys.argv
for line in fileinput.input():
    parser.feed(line)
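To try the script, save the parsable output to a file and pass the
file name as an argument, or pipe a test run straight into it. The
file name here is the same sample log used with proctorfilter above:

$ python Listing8.py Listing7.txt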
A TestResult
instance has several attributes of interest. The name
attribute uniquely identifies the test. The name includes the full
import path for the module, as well as the class and method name of
the test. The output
attribute includes all of the text appearing
between the __PROCTOR__ Start test
and __PROCTOR__ Start results
delimiters,
including the traceback, if any. The result
attribute includes the
full text from between __PROCTOR__ Start results
and
__PROCTOR__ End results
, while status
contains only the status
code. The status will be the same as one of TestResult.OK
,
TestResult.ERROR
, or TestResult.FAIL
. The passed()
method
returns True
if the test status is TestResult.OK
and False
otherwise.
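Those attributes are enough to build something more ambitious than a
simple filter. The sketch below collects the details for each test
that did not pass; the file_a_ticket() function is a hypothetical
stand-in for whatever API your defect tracking system provides.

#!/usr/bin/env python
# Sketch: report details for each test that did not pass.
import fileinput

from proctorlib.result import ResultFactory, TestResult

def file_a_ticket(test_result):
    "Hypothetical hook; replace with your defect tracker's API."
    print '%s: %s' % (test_result.status, test_result.name)
    print test_result.output   # includes the traceback, if any

def handle_result(test_result):
    if not test_result.passed():
        file_a_ticket(test_result)
    return

parser = ResultFactory(handle_result)
for line in fileinput.input():
    parser.feed(line)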
Code Coverage
At the same time it is running the automated tests, Proctor uses Ned Batchelder’s coverage module to collect information about which statements in the source files are actually executed. The code coverage statistics gathered by coverage can be used to identify areas of the code that need to have more automated tests written.
By default, proctorbatch
writes the code coverage statistics to the
file ./.coverage
. Use the --coverage-file
option to change the
filename used. To disable coverage statistics entirely, use the
--no-coverage
option.
Statistics are normally collected for every line of the source being
run. Some lines should not be included in the statistics, though, if
the code includes debugging sections that are disabled while the tests
are running. In that case, use the --coverage-exclude
option to
specify regular expressions to be compared against the source code.
If the source matches the pattern, the line is not included in the
statistics counts. For example, to exclude lines that match the
pattern if DEBUG:, add --coverage-exclude="if DEBUG:"
to the command line. The --coverage-exclude
option can be repeated
for each pattern to be ignored.
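For instance, a run that filters out both debug-only lines and
not-implemented stubs might look like this (the second pattern is
purely illustrative):

$ proctorbatch --coverage-exclude="if DEBUG:" \
               --coverage-exclude="raise NotImplementedError" \
               Listing2.py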
Once the test run is complete, use coverage.py
to produce a report
showing which portions of the code were not executed and what
percentage of statements was covered. For example, in the following listing the
return
statements in the test methods of the FailingTests
class
from Listing 2 are never executed. They were skipped because both of
the tests fail before reaching the end of the function.
$ coverage.py -r -m Listing2.py
Name       Stmts   Exec  Cover   Missing
----------------------------------------
Listing2      18     16    88%   18, 22
Refer to the documentation provided by coverage.py --help
for more
information on how to print code coverage reports.
Garbage Collection
Proctor can also be used to help identify the source of memory leaks. When using the interleaved or parsable output formats, Proctor uses the gc module functions for garbage collection to report on objects that have not been cleaned up.
Listing 9 defines a test that introduces a circular reference between
two lists, a
and b
, by appending each to the other. Normally, when
processing leaves a function’s scope, the local variables are marked
so they can be deleted and their memory reclaimed. In this case,
however, since both lists are still referenced from an object that has
not been deleted, the lists are not automatically cleaned up when the
test function returns. The gc standard library module includes an
interface to discover uncollected garbage objects like these lists,
and Proctor includes a garbage collection report in the output for
each test, as in Listing 10. The garbage collection information can be
used to determine which test was being run when the memory leaked, and
then to narrow down the source of the leak.
Listing 9
#!/usr/bin/env python
# Test code with circular reference to illustrate garbage collection

import unittest

class CircularReferenceTest(unittest.TestCase):

    def test(self):
        a = []
        b = []
        b.append(a)
        a.append(b)
        return
Listing 10
$ proctorbatch --interleaved Listing9.py
Writing coverage output to .coverage
Scanning: .
0/ 1 test: Listing9.CircularReferenceTest.test ...ok
GC: Collecting...
GC: Garbage objects:
<type 'list'>
[[[...]]]
<type 'list'>
[[[...]]]
Ran 1 tests in 0.180s
OK
Conclusion
Automated testing is perhaps one of the biggest productivity enhancements to come out of the Agile development movement. Even if you are not doing Test Driven Development, using automated testing to identify regression errors can provide great peace of mind. The basic tools provided in the Python standard library do support automated testing, but they tend to be targeted at library or module developers rather than large scale projects. I hope this introduction to Proctor has suggested a few new ideas for expanding your own use of automated tests, and for managing those tests as your project size and scope grows.
I would like to offer a special thanks to Ned Batchelder for his help with integrating coverage.py and Proctor, and Mrs. PyMOTW for her help editing this article.