Converting from Make to Paver

As I briefly mentioned in an earlier post, I recently moved the
PyMOTW build from make to Kevin Dangoor’s Python-based build tool
Paver. I had been wanting to try Paver out for a while, especially
since seeing Kevin’s presentation at PyWorks in November. As a longtime
Unix/Linux user, I didn’t have any particular problems with make, but Paver
looked intriguing. PyMOTW is one of the few projects I have with a
significant build process (beyond simply creating the source tarball),
so it seemed like a good candidate for experimentation.

Concepts

The basic concepts with Paver and make are the same. Make “targets”
correspond roughly to Paver “tasks”. Paver places less emphasis on file
modification time-stamps, though, so tasks are all essentially phony
targets (what a Makefile declares with .PHONY). As with make, Paver keeps
track of which prerequisites have already run, so none is repeated while
building any one target.
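
To make the run-once behavior concrete, here is a minimal sketch in plain Python. It is not Paver's implementation, only a toy with the same shape; the names TASKS, executed, and run are made up for the illustration, and the toy task and needs decorators are simplified stand-ins for the real ones.

```python
# Toy task runner: NOT Paver's code, only an illustration of the idea.
TASKS = {}      # task name -> function
executed = []   # names of tasks that have already run, in order


def task(func):
    """Register a function as a named task."""
    TASKS[func.__name__] = func
    return func


def needs(names):
    """Record the names of the tasks that must run first."""
    def decorator(func):
        func.needs = list(names)
        return func
    return decorator


def run(name):
    """Run a task after its prerequisites, skipping anything already run."""
    if name in executed:
        return
    func = TASKS[name]
    for dep in getattr(func, 'needs', []):
        run(dep)
    func()
    executed.append(name)


@task
def clean():
    pass


@task
@needs(['clean'])
def html():
    pass


@task
@needs(['clean', 'html'])
def package():
    pass


run('package')
print(executed)  # ['clean', 'html', 'package']
```

Even though both html and package list clean as a prerequisite, it runs only once.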

Tasks are implemented as simple Python functions. Paver starts by
loading pavement.py, and tasks can be defined inline there or you can
import code from elsewhere if needed. According to Kevin, once the main
engine settles down enough to reach a 1.0 release, he doesn’t anticipate
a lot of active development on the core. Recipes for extending Paver can
be added easily through external modules which would be distributed
separately.

Building a Source Distribution

The most important target from the old PyMOTW Makefile was “package”.
It ran sphinx-build to create the HTML version of the documentation then
produced a versioned source distribution with distutils. The whole thing
was bundled up and dropped on my computer desktop, ready to be uploaded
to my web site.

package: clean html
    rm -f setup.py
    $(MAKE) setup.py
    rm -f MANIFEST MANIFEST.in
    $(MAKE) MANIFEST.in
    python setup.py sdist --force-manifest
    mv dist/*.gz ~/Desktop/

Paver sits on top of distutils, so one of its built-in tasks is
“sdist” (similar to python setup.py sdist, for producing
source distributions of Python apps or libraries). In my case, I
extended the task definition to perform some pre-requisite tasks and
move the tarball out of the build directory onto the desktop of my
computer, to make it easier to upload to my web site.

Let’s look at the task definition:

@task
@needs(['generate_setup', 'minilib', 'html_clean', 'setuptools.command.sdist'])
def sdist():
    """Create a source distribution.
    """
    # Copy the output file to the desktop
    dist_files = path('dist').glob('*.tar.gz')
    dest_dir = path(options.sdist.outdir).expanduser()
    for f in dist_files:
        f.move(dest_dir)
    return

The @task decorator registers the function as a task, tying it in to
the list of options available from the command line. The docstring for
the function is included in the help output (paver help or
paver --help-commands).

$ paver help
---> help
Paver 0.8.1

Usage: paver [global options] [option.name=value] task [task options] [task...]

Run 'paver help [section]' to see the following sections of info:

options    global command line options
setup      available distutils/setuptools tasks
tasks      all tasks that have been imported by your pavement

'paver help taskname' will display details for a task.

Tasks defined in your pavement:
  blog            - Generate the blog post version of the HTML for the current module
  html            - Build HTML documentation using Sphinx
  html_clean      - Remove sphinx output directories before building the HTML
  installwebsite  - Rebuild and copy website files to the remote server
  pdf             - Generate the PDF book
  sdist           - Create a source distribution
  webhtml         - Generate HTML files for website
  website         - Create local copy of website files
  webtemplatebase - Import the latest version of the web page template from the source

To run the task, pass its name as an argument to paver on the command
line:

$ paver sdist

Prerequisites

The @needs decorator specifies the prerequisites for a task, listed in
order and identified by name. Paver prerequisites correspond to make
dependencies, and are all run before the task, as you would expect.

In the Makefile, before building a source distribution I always ran
the clean and html targets, too. That meant I had a fresh copy of
the HTML version of PyMOTW, generated by Sphinx. The next step was to
build setup.py using a simple input template file processed by sed (so I
didn’t have to remember to edit the version and download URL every
time).

Paver provides a task to generate a setup.py (generate_setup),
so I no longer need to mess around with templates on my own. The
minilib task writes a ZIP archive with enough of Paver to support
installation through the usual python setup.py install or
easy_install PyMOTW incantations.

Notice that setuptools.command.sdist is the fully qualified name of
the task being redefined locally. That means in this case the standard
work for sdist() (producing the source distribution) is run prior to
invoking my override function.

I’ve defined an html_clean() task in pavement.py to take the
place of the old make targets clean and html:

@task
def html_clean():
    """Remove sphinx output directories before building the HTML.
    """
    remake_directories(options.sphinx.doctrees, options.html.outdir)
    call_task('html')
    return

remake_directories() is a simple Python function I’ve written to
remove the directories passed to it and then recreate them, empty. It is
the equivalent of rm -r followed by mkdir. It’s not strictly
necessary, but I’m paranoid about old versions of renamed files ending
up in my generated output, so I always start with an empty directory.
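
I haven't shown remake_directories() here, but a minimal sketch of a function with that behavior, using only the standard library, might look like the following. The body is an assumption rather than the version in my pavement.py, and the scratch directory and stale.html file exist only for the demonstration.

```python
import os
import shutil
import tempfile


def remake_directories(*dirnames):
    """Remove each directory (if present) and recreate it, empty."""
    for dirname in dirnames:
        if os.path.exists(dirname):
            shutil.rmtree(dirname)
        # Like mkdir -p: also creates missing parent directories.
        os.makedirs(dirname)


# Demonstration: a scratch directory holding a stale file ends up empty.
scratch = os.path.join(tempfile.mkdtemp(), 'html')
os.makedirs(scratch)
open(os.path.join(scratch, 'stale.html'), 'w').close()

remake_directories(scratch)
print(os.listdir(scratch))  # []
```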

Working with Files

Paver’s standard library includes Jason Orendorff’s path library
for working with directories and files. Using the library, paths are
objects with methods (instead of just strings). Methods include
finding the directory name for a file, getting the contents of a
directory, removing a file, etc. – the sorts of things you want to do
with files and directories. One especially nice feature is the /
operator, which works like os.path.join(). It is simple to
construct a path using components in separate variables, joining them
with /.
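
If you don't have Paver installed, the standard library's pathlib module offers a similar / operator, so it serves as a rough stand-in for illustrating the idea. This sketch uses pathlib, not the path class bundled with Paver, and the file name PyMOTW.tex is only a placeholder.

```python
import posixpath
from pathlib import PurePosixPath

builddir = PurePosixPath('build')
subdir = 'latex'

# Join components held in separate variables with the / operator.
joined = builddir / subdir / 'PyMOTW.tex'

print(joined)         # build/latex/PyMOTW.tex
print(joined.parent)  # build/latex
print(joined.name)    # PyMOTW.tex

# The same result spelled with the string-based join function.
print(posixpath.join('build', subdir, 'PyMOTW.tex') == str(joined))  # True
```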

The sdist() function is responsible for copying the packaged source
distribution to my desktop. It starts by using path’s globbing support
to build a list of the .tar.gz files created by
setuptools.command.sdist(). (There should only be one file, but predicting
the name is more difficult than just using globbing.) The destination
directory is configured through the options Bunch (a dictionary-like
object with attribute look-up for the keys). Since the value of the
option might include ~, I expand it before using it as the
destination directory for the file move operation.
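
The same glob, expand, and move sequence can be tried outside of Paver. The sketch below swaps in pathlib and shutil from the standard library for Paver's path class, and it works in a temporary directory with a made-up tarball name (Example-1.0.tar.gz) so nothing real is touched.

```python
import shutil
import tempfile
from pathlib import Path

# Build a throwaway layout: dist/ holds the tarball, Desktop/ receives it.
work = Path(tempfile.mkdtemp())
dist = work / 'dist'
outdir = work / 'Desktop'
dist.mkdir()
outdir.mkdir()
(dist / 'Example-1.0.tar.gz').write_text('placeholder')

# A no-op for this absolute path; expands a leading ~ when one is present.
dest_dir = outdir.expanduser()

# Glob instead of predicting the exact file name, then move each match.
moved = []
for f in list(dist.glob('*.tar.gz')):
    shutil.move(str(f), str(dest_dir))
    moved.append(f.name)

print(moved)  # ['Example-1.0.tar.gz']
print(sorted(p.name for p in dest_dir.iterdir()))  # ['Example-1.0.tar.gz']
```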

Options

Make is usually configured via the shell environment and variables
within the Makefile itself. Paver uses a collection of Bunch objects
organized into a hierarchical namespace. The options can be set to
static literal values, computed at runtime using normal Python
expressions, or overridden from the command line.
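
The help output shown earlier lists option.name=value as the command line form for overrides. As a rough illustration of how a dotted name can map onto a hierarchical namespace, here is a sketch using plain dictionaries. The apply_override() function is hypothetical and says nothing about how Paver actually parses its arguments.

```python
def apply_override(options, setting):
    """Apply a 'dotted.name=value' string to nested dictionaries."""
    name, value = setting.split('=', 1)
    *sections, key = name.split('.')
    target = options
    for section in sections:
        # Walk down the hierarchy, creating sections that do not exist yet.
        target = target.setdefault(section, {})
    target[key] = value
    return options


options = {'sphinx': {'builder': 'html', 'confdir': 'sphinx'}}
apply_override(options, 'sphinx.builder=latex')
print(options['sphinx'])  # {'builder': 'latex', 'confdir': 'sphinx'}
```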

Each task can define its own section of the namespace with options.
Some underlying recipes (especially the distutils and Sphinx
integration) depend on a specific structure, documented with the
relevant task documentation. For example, these are the settings I use
when running Sphinx:

options(
    # ... other sections omitted ...
    sphinx = Bunch(
        sourcedir='PyMOTW',
        docroot='.',
        builder='html',
        doctrees='sphinx/doctrees',
        confdir='sphinx',
    ),
)

Tasks access the options using dot-notation starting with the root of
the namespace. For example, options.sphinx.doctrees.
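
A dictionary with attribute look-up takes only a few lines of Python. The class below, SimpleBunch, is a stand-in written for this post to show the idea; it is not Paver's class and leaves out whatever else the real one does.

```python
class SimpleBunch(dict):
    """Dictionary whose keys can also be read and set as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute look-up fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


options = SimpleBunch(
    sphinx=SimpleBunch(
        builder='html',
        doctrees='sphinx/doctrees',
    ),
)

print(options.sphinx.doctrees)       # sphinx/doctrees
options.sphinx.builder = 'latex'
print(options['sphinx']['builder'])  # latex
```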

Running Shell Commands

Even with the power of Python as a programming language, sometimes it
is necessary to shell out to run an external program. Paver makes that
very easy. sh() wraps Python’s standard library module subprocess to
make it easier to work with for the sorts of use cases commonly found in
a build system. Simply pass a string containing the shell command you
want run, and optionally include the capture argument to have it
return the output text (useful for commands like svnversion).
sh() takes care of running the command, or in dry-run mode printing
the command it would have run.

For example, the last step of building the PyMOTW PDF requires running
a target included in the Makefile generated by Sphinx.

latex_dir = path(options.pdf.builddir) / 'latex'
sh('cd %s; make' % latex_dir)
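
To show roughly what a wrapper like this involves, here is a minimal sketch built directly on subprocess. It is not Paver's sh(). In particular, the dry_run parameter is an assumption made to keep the example self-contained, standing in for the dry-run mode described above.

```python
import subprocess


def sh(command, capture=False, dry_run=False):
    """Run a shell command; return its output when capture is true."""
    if dry_run:
        # Show what would have been run, without running it.
        print(command)
        return None
    if capture:
        return subprocess.check_output(
            command, shell=True, universal_newlines=True)
    subprocess.check_call(command, shell=True)
    return None


print(sh('echo hello', capture=True).strip())  # hello
sh('echo this is not executed', dry_run=True)
```

Both subprocess calls raise CalledProcessError when the command exits with a non-zero status, so a failed command stops the build instead of passing silently.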

Sphinx and Cog Integration

Paver also includes built-in support for Sphinx. The standard
integration with Sphinx supports producing HTML output. You can
configure many of the Sphinx options you would normally put in a conf.py
file directly through Paver’s pavement.py. I had to override the way
Sphinx is run by default, because I want to produce 3 different versions
of HTML output (using different templates) and the PDF, but simpler
projects won’t have to do much more than set up the location of the
input files.

In addition to Sphinx, Paver integrates Ned Batchelder’s Cog, a
templating/macro definition tool that lets you generate part of your
documentation on the fly from arbitrary Python code. I’ve done some work
to have cog run the PyMOTW examples and insert the output into the .rst
file before passing it to Sphinx to be converted to HTML or PDF. The
process is complicated enough to warrant its own post, though, so that
will have to wait for another day.

Conclusions

Paver is a useful alternative to make, especially for Python-based
packages. The default integration with distutils makes it very easy to
get started. Projects whose builds need a lot of external shell calls
may find Makefiles easier to deal with. In my case, I was able to fold
a couple of small Python scripts into the pavement.py file, so I
eliminated a few separate tools.

It’s hard to say whether a pavement file is “simpler” than a Makefile.
Task definitions do not tend to be shorter than make targets, but the
verbosity is an artifact of Python (function definitions and decorators,
etc.) rather than anything inherent in the way Paver is designed.

A typical Paver configuration file is likely to be more portable than
a Makefile, so that may be something to take into account. With file
operations easily accessible in a portable library, it should be easy to
set up your pavement.py to work on any OS.

For the complete pavement.py file used by PyMOTW, grab the latest
release from the web site.