Converting from Make to Paver

As I briefly mentioned in an earlier post, I recently moved the PyMOTW build from make to Kevin Dangoor’s Python-based build tool Paver. I had been wanting to try Paver out for a while, especially since seeing Kevin’s presentation at PyWorks in November. As a longtime Unix/Linux user, I didn’t have any particular problems with make, but Paver looked intriguing. PyMOTW is one of the few projects I have with a significant build process (beyond simply creating the source tarball), so it seemed like a good candidate for experimentation.

Concepts

The basic concepts with Paver and make are the same. Make “targets” correspond roughly to Paver “tasks”. Paver places less emphasis on file modification time-stamps, though, so tasks are all essentially “.PHONY” targets. As with make, Paver keeps track of which dependencies have been executed so they are not repeated while building any one target.

Tasks are implemented as simple Python functions. Paver starts by loading pavement.py, and tasks can be defined inline there or you can import code from elsewhere if needed. According to Kevin, once the main engine settles down enough to reach a 1.0 release, he doesn’t anticipate a lot of active development on the core. Recipes for extending Paver can be added easily through external modules which would be distributed separately.

Building a Source Distribution

The most important target from the old PyMOTW Makefile was “package”. It ran sphinx-build to create the HTML version of the documentation, then produced a versioned source distribution with distutils. The whole thing was bundled up and dropped on my computer desktop, ready to be uploaded to my web site.

package: clean html
    rm -f setup.py
    $(MAKE) setup.py
    rm -f MANIFEST MANIFEST.in
    $(MAKE) MANIFEST.in
    python setup.py sdist --force-manifest
    mv dist/*.gz ~/Desktop/

Paver sits on top of distutils, so one of the pre-defined tasks it has built-in is “sdist” (similar to python setup.py sdist, for producing source distributions of Python apps or libraries). In my case, I extended the task definition to perform some pre-requisite tasks and move the tarball out of the build directory onto the desktop of my computer, to make it easier to upload to my web site.

Let’s look at the task definition:

@task
@needs(['generate_setup', 'minilib', 'html_clean', 'setuptools.command.sdist'])
def sdist():
    """Create a source distribution.
    """
    # Copy the output file to the desktop
    dist_files = path('dist').glob('*.tar.gz')
    dest_dir = path(options.sdist.outdir).expanduser()
    for f in dist_files:
        f.move(dest_dir)
    return

The @task decorator registers the function as a task, tying it in to the list of options available from the command line. The docstring for the function is included in the help output (paver help or paver --help-commands).

$ paver help
---> help
Paver 0.8.1

Usage: paver [global options] [option.name=value] task [task options] [task...]

Run 'paver help [section]' to see the following sections of info:

options    global command line options
setup      available distutils/setuptools tasks
tasks      all tasks that have been imported by your pavement

'paver help taskname' will display details for a task.

Tasks defined in your pavement:
  blog            - Generate the blog post version of the HTML for the current module
  html            - Build HTML documentation using Sphinx
  html_clean      - Remove sphinx output directories before building the HTML
  installwebsite  - Rebuild and copy website files to the remote server
  pdf             - Generate the PDF book
  sdist           - Create a source distribution
  webhtml         - Generate HTML files for website
  website         - Create local copy of website files
  webtemplatebase - Import the latest version of the web page template from the source

To run the task, pass the name as argument to paver on the command line:

$ paver sdist

Prerequisites

The @needs decorator specifies the prerequisites for a task, listed in order and identified by name. Paver prerequisites correspond to make dependencies, and are all run before the task, as you would expect.
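To make the ordering and run-once behavior concrete, here is a minimal sketch in plain Python. It is not Paver’s implementation: the task and needs decorators below are simplified stand-ins, and TASKS, EXECUTED, and run_task are names invented for this illustration.

```python
# Simplified stand-ins for the @task/@needs idea; not Paver's own code.
TASKS = {}       # task name -> function
EXECUTED = []    # names of tasks that have already run, in order


def task(func):
    """Register the function as a task under its own name."""
    TASKS[func.__name__] = func
    return func


def needs(names):
    """Record the prerequisite task names on the function."""
    def decorator(func):
        func.needs = list(names)
        return func
    return decorator


def run_task(name):
    """Run a task's prerequisites in order, then the task, each at most once."""
    if name in EXECUTED:
        return
    func = TASKS[name]
    for prerequisite in getattr(func, 'needs', []):
        run_task(prerequisite)
    func()
    EXECUTED.append(name)


@task
def clean():
    print('clean')


@task
@needs(['clean'])
def html():
    print('html')


@task
@needs(['clean', 'html'])
def package():
    print('package')


run_task('package')
```

Even though both html and package list clean as a prerequisite, it runs only once, which is the same guarantee make gives for a shared dependency.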

In the Makefile, before building a source distribution I always ran the clean and html targets, too. That meant I had a fresh copy of the HTML version of PyMOTW, generated by sphinx. The next step was to build setup.py using a simple input template file processed by sed (so I didn’t have to remember to edit the version and download URL every time).

Paver provides a task to generate a setup.py (generate_setup), so I no longer need to mess around with templates on my own. The minilib task writes a ZIP archive with enough of Paver to support installation through the usual python setup.py install or easy_install PyMOTW incantations.

Notice that setuptools.command.sdist is the fully qualified name of the task being redefined locally. That means in this case the standard work for sdist() (producing the source distribution) is run prior to invoking my override function.

I’ve defined an html_clean() task in pavement.py to take the place of the old make targets clean and html:

@task
def html_clean():
    """Remove sphinx output directories before building the HTML.
    """
    remake_directories(options.sphinx.doctrees, options.html.outdir)
    call_task('html')
    return

remake_directories() is a simple Python function I’ve written to remove the directories passed to it and then recreate them, empty. It is the equivalent of rm -r followed by mkdir. It’s not strictly necessary, but I’m paranoid about old versions of renamed files ending up in my generated output, so I always start with an empty directory.
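The helper itself is not shown here, so the following is only a sketch of what such a function might look like, written with the standard library’s os and shutil modules rather than Paver’s path objects. The scratch directory and the stale.html file exist purely for the demonstration.

```python
import os
import shutil
import tempfile


def remake_directories(*dirnames):
    """Remove each directory if it exists, then recreate it empty."""
    for dirname in dirnames:
        if os.path.isdir(dirname):
            shutil.rmtree(dirname)
        os.makedirs(dirname)


# Demonstrate on a scratch directory holding one stale file.
scratch = tempfile.mkdtemp()
outdir = os.path.join(scratch, 'html')
os.makedirs(outdir)
with open(os.path.join(outdir, 'stale.html'), 'w') as f:
    f.write('left over from an earlier build')

remake_directories(outdir)
remaining = os.listdir(outdir)   # the stale file is gone
shutil.rmtree(scratch)
```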

Working with Files

Paver’s standard library includes Jason Orendorff’s path library for working with directories and files. Using the library, paths are objects with methods (instead of just strings). Methods include finding the directory name for a file, getting the contents of a directory, removing a file, etc. – the sorts of things you want to do with files and directories. One especially nice feature is the / operator, which works like os.path.join(). It is simple to construct a path using components in separate variables, joining them with /.
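The same joining style can be tried without installing anything, because Python’s standard pathlib module also overloads the / operator. The sketch below uses pathlib as a stand-in for the path library bundled with Paver, so the attribute names are pathlib’s and may differ from the ones Paver provides. The directory and file names are made up.

```python
from pathlib import PurePosixPath

build_dir = PurePosixPath('build')
doc_name = 'index.html'

# The / operator joins components, much like os.path.join().
html_file = build_dir / 'html' / doc_name

parent = html_file.parent    # directory containing the file
name = html_file.name        # final component of the path
```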

The sdist() function is responsible for moving the packaged source distribution to my desktop. It starts by using path’s globbing support to build a list of the .tar.gz files created by setuptools.command.sdist(). (There should only be one file, but predicting the name is more difficult than just using globbing.) The destination directory is configured through the options Bunch (a dictionary-like object with attribute look-up for the keys). Since the value of the option might include ~, I expand it before using it as the destination directory for the file move operation.
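The glob-then-move pattern can be sketched with the standard library alone. This version uses pathlib and shutil instead of Paver’s path objects, works in a scratch directory rather than a real build tree, and invents the tarball name for the demonstration.

```python
import shutil
import tempfile
from pathlib import Path

# Scratch directories stand in for the build tree and the desktop.
root = Path(tempfile.mkdtemp())
dist_dir = root / 'dist'
dest_dir = root / 'Desktop'
dist_dir.mkdir()
dest_dir.mkdir()
(dist_dir / 'PyMOTW-1.0.tar.gz').write_text('placeholder')  # made-up file name

# A real setting such as '~/Desktop' needs expanduser() first; the
# scratch path contains no ~, so the call leaves it unchanged.
dest_dir = dest_dir.expanduser()

# Glob rather than predict the exact tarball name.
for tarball in dist_dir.glob('*.tar.gz'):
    shutil.move(str(tarball), str(dest_dir / tarball.name))

moved = sorted(p.name for p in dest_dir.iterdir())
left_behind = list(dist_dir.iterdir())
shutil.rmtree(str(root))
```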

Options

Make is usually configured via the shell environment and variables within the Makefile itself. Paver uses a collection of Bunch objects organized into a hierarchical namespace. The options can be set to static literal values, computed at runtime using normal Python expressions, or overridden from the command line.

Each task can define its own section of the namespace with options. Some underlying recipes (especially the distutils and Sphinx integration) depend on a specific structure, described in the documentation for the relevant task. For example, these are the settings I use when running Sphinx:

sphinx = Bunch(
    sourcedir='PyMOTW',
    docroot='.',
    builder='html',
    doctrees='sphinx/doctrees',
    confdir='sphinx',
),

Tasks access the options using dot-notation starting with the root of the namespace. For example, options.sphinx.doctrees.
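A dictionary with attribute look-up takes only a few lines of Python. The class below is a sketch of the idea, not Paver’s Bunch; the name MiniBunch is invented, and the real class may behave differently in the details.

```python
class MiniBunch(dict):
    """Dictionary whose keys can also be read and set as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


options = MiniBunch(
    sphinx=MiniBunch(
        docroot='.',
        builder='html',
        doctrees='sphinx/doctrees',
    ),
)

doctrees = options.sphinx.doctrees            # attribute look-up
same_value = options['sphinx']['doctrees']    # plain dictionary look-up
options.sphinx.builder = 'latex'              # values can change at runtime
```

Nesting one of these objects inside another is what produces the hierarchical namespace, with each level reachable through another dot.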

Running Shell Commands

Even with the power of Python as a programming language, sometimes it is necessary to shell out to run an external program. Paver makes that very easy. sh() wraps Python’s standard library module subprocess to make it easier to work with for the sorts of use cases commonly found in a build system. Simply pass a string containing the shell command you want run, and optionally include the capture argument to have it return the output text (useful for commands like svnversion). sh() takes care of running the command, or in dry-run mode printing the command it would have run.

For example, the last step of building the PyMOTW PDF requires running a target included in the Makefile generated by Sphinx.

latex_dir = path(options.pdf.builddir) / 'latex'
sh('cd %s; make' % latex_dir)
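For a sense of what a wrapper like sh() has to do, here is a rough sketch built directly on subprocess. It is not Paver’s code: the function name run and its dry_run parameter are assumptions made for the illustration, and Paver takes its dry-run setting from its own options rather than from an argument.

```python
import subprocess


def run(command, capture=False, dry_run=False):
    """Run a shell command; optionally return its output or only print it."""
    if dry_run:
        # Show what would have been executed without running it.
        print(command)
        return None
    if capture:
        return subprocess.check_output(command, shell=True).decode()
    subprocess.check_call(command, shell=True)
    return None


output = run('echo hello', capture=True)     # text written by the command
skipped = run('echo hello', dry_run=True)    # prints the command only
```

Both subprocess calls raise CalledProcessError when the command exits with a non-zero status, so a failed command stops the build instead of passing silently.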

Sphinx and Cog Integration

Paver also includes built-in support for Sphinx. The standard integration with Sphinx supports producing HTML output. You can configure many of the Sphinx options you would normally put in a conf.py file directly through Paver’s pavement.py. I had to override the way Sphinx is run by default, because I want to produce 3 different versions of HTML output (using different templates) and the PDF, but simpler projects won’t have to do much more than set up the location of the input files.

In addition to Sphinx, Paver integrates Ned Batchelder’s Cog, a templating/macro definition tool that lets you generate part of your documentation on the fly from arbitrary Python code. I’ve done some work to have cog run the PyMOTW examples and insert the output into the .rst file before passing it to Sphinx to be converted to HTML or PDF. The process is complicated enough to warrant its own post, though, so that will have to wait for another day.

Conclusions

Paver is a useful alternative to make, especially for Python-based packages. The default integration with distutils makes it very easy to get started. Build environments requiring a lot of external shell calls may find Makefiles easier to deal with. In my case, I was able to fold a couple of small Python scripts into the pavement.py file, so I eliminated a few separate tools.

It’s hard to say whether a pavement file is “simpler” than a Makefile. Task definitions do not tend to be shorter than make targets, but the verbosity is an artifact of Python (function definitions and decorators, etc.) rather than anything inherent in the way Paver is designed.

A typical Paver configuration file is likely to be more portable than a Makefile, so that may be something to take into account. With file operations easily accessible in a portable library, it should be easy to set up your pavement.py to work on any OS.

For the complete pavement.py file used by PyMOTW, grab the latest release from the web site.