Defining Custom Roles in Sphinx

Creating custom processing instructions for Sphinx is easy and will
make documenting your project less trouble.

Apparently 42 is a magic number.

While working on issue 42 for virtualenvwrapper, I needed to create
a link from the history file to the newly resolved issue. I finally
decided that pasting the links in manually was getting old, and I
should do something to make it easier. Sphinx and docutils have
built-in markup for linking to RFCs, and the Python developers use a
custom role for linking to their bug tracker issues. I decided to
create an extension so I could link to the issue trackers for my
BitBucket projects just as easily.

Extension Options

Sphinx is built on docutils, a set of tools for parsing and working
with reStructuredText markup. The rst parser in docutils is designed
to be extended in two main ways:

  1. Directives let you work with large blocks of text and intercept
     the parsing as well as formatting steps.

  2. Roles are intended for inline markup, within a paragraph.

Directives are used for handling things like embedded code samples,
including source from external locations, or other large-scale
processing. Since each directive defines its own paragraphs,
directives operate at the wrong scale for handling inline markup. I
needed to define a new role.

Defining a Role Processor

The docutils parser works by converting the input text to an internal
tree representation made up of different types of nodes. The tree is
traversed by a writer to create output in the desired format. To add
a directive or role, you need to provide the hooks to be called to
handle the markup when it is encountered in the input file. A role
processor is defined with a function that takes arguments describing
the marked-up text and returns the nodes to be included in the parse
tree.

Roles all have a common syntax, based on the interpreted text
feature of reStructuredText. For example, the rfc role for
linking to an RFC document looks like:

:rfc:`1822`

and produces links like **RFC 1822**, complete with the upper
case RFC.

In my case, I wanted to define new roles for linking to tickets in the
issue tracker for a project (bbissue) and Mercurial changesets
(bbchangeset). The first step was to define the role processing
function.

def bbissue_role(name, rawtext, text, lineno, inliner, options={}, content=[]):
    """Link to a BitBucket issue.

    Returns 2 part tuple containing list of nodes to insert into the
    document and a list of system messages.  Both are allowed to be
    empty.

    :param name: The role name used in the document.
    :param rawtext: The entire markup snippet, with role.
    :param text: The text marked with the role.
    :param lineno: The line number where rawtext appears in the input.
    :param inliner: The inliner instance that called us.
    :param options: Directive options for customization.
    :param content: The directive content for customization.
    """
    try:
        issue_num = int(text)
        if issue_num <= 0:
            raise ValueError
    except ValueError:
        msg = inliner.reporter.error(
            'BitBucket issue number must be a number greater than or equal to 1; '
            '"%s" is invalid.' % text, line=lineno)
        prb = inliner.problematic(rawtext, rawtext, msg)
        return [prb], [msg]
    app = inliner.document.settings.env.app
    node = make_link_node(rawtext, app, 'issue', str(issue_num), options)
    return [node], []

The parser invokes the role processor when it sees interpreted text
using the role in the input. It passes both the raw, unparsed text
and the contents of the interpreted text (the part between the
backquotes). It also passes an "inliner", the part of the parser that
saw the markup and invoked the processor. The inliner gives us a
handle back to docutils and Sphinx so we can access the runtime
environment to get configuration settings or save data for use later.

The return value from the processor is a tuple containing two lists.
The first list contains any new nodes to be added to the parse tree,
and the second list contains error or warning messages to show the
user. Processors are defined to return errors instead of raising
exceptions because the error messages can be inserted into the output
instead of halting all processing.
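
The calling convention and the two-list return value can be tried
with docutils alone, outside of Sphinx. The following is a minimal
sketch, assuming docutils is installed. The shout role and the
shout_role function are hypothetical names invented for this
illustration; they are not part of docutils, Sphinx, or the extension
described here.

```python
from docutils import nodes
from docutils.core import publish_doctree
from docutils.parsers.rst import roles


def shout_role(name, rawtext, text, lineno, inliner, options=None, content=None):
    """Hypothetical role: emphasize the marked-up text in upper case."""
    # rawtext is the whole snippet (":shout:`hello`"); text is just "hello".
    node = nodes.emphasis(rawtext, text.upper())
    # First list: nodes for the parse tree. Second list: system messages.
    return [node], []


# Make the role available to the reStructuredText parser.
roles.register_local_role('shout', shout_role)

# Parse a one-line document that uses the new role.
doctree = publish_doctree('Say :shout:`hello` to the parser.')
plain_text = doctree.astext()
tree_dump = doctree.pformat()
```

Printing tree_dump shows the emphasis node sitting inside the
paragraph, which is the node the role function returned in its first
list.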

The bbissue role processor validates the input text by converting
it to an integer issue id. If that isn’t possible, it builds an error
message and returns a problematic node to be added to the output
file. It also returns the message text so the message is printed on
the console. If validation passes, a new node is constructed with
make_link_node(), and only that success node is included in the
return value.

To create the inline node with the hyperlink to a ticket,
make_link_node() looks in Sphinx’s configuration for a
bitbucket_project_url string. Then it builds a reference node
using the URL and other values derived from the values given by the
parser.

from docutils import nodes, utils
from docutils.parsers.rst.roles import set_classes


def make_link_node(rawtext, app, type, slug, options):
    """Create a link to a BitBucket resource.

    :param rawtext: Text being replaced with link node.
    :param app: Sphinx application context
    :param type: Link type (issue, changeset, etc.)
    :param slug: ID of the thing to link to
    :param options: Options dictionary passed to role func.
    """
    # Look up the project URL in the Sphinx configuration.
    try:
        base = app.config.bitbucket_project_url
        if not base:
            raise AttributeError
    except AttributeError as err:
        raise ValueError('bitbucket_project_url configuration value is not set (%s)' % str(err))
    # Build the target URL, adding a slash only if the base lacks one.
    slash = '/' if base[-1] != '/' else ''
    ref = base + slash + type + '/' + slug + '/'
    set_classes(options)
    node = nodes.reference(rawtext, type + ' ' + utils.unescape(slug), refuri=ref,
                           **options)
    return node
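
The URL construction is easy to check in isolation. This is a sketch
of just that step using only the standard library; the helper name
build_bitbucket_url is hypothetical and does not appear in the
extension.

```python
def build_bitbucket_url(base, link_type, slug):
    """Join a project base URL, a link type, and an ID into one URL."""
    if not base:
        raise ValueError('base URL must not be empty')
    # Add a separator only when the base does not already end with one.
    slash = '' if base.endswith('/') else '/'
    return base + slash + link_type + '/' + slug + '/'


# The same target results whether or not the base has a trailing slash.
with_slash = build_bitbucket_url(
    'http://bitbucket.org/dhellmann/virtualenvwrapper/', 'issue', '42')
without_slash = build_bitbucket_url(
    'http://bitbucket.org/dhellmann/virtualenvwrapper', 'issue', '42')
```

Both calls produce the same address, so the value in conf.py can be
written either way.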

Registering the Role Processor

With the role processor function defined, the next step is to tell
Sphinx to load the extension and to register the new role. Instead of
using setuptools entry points for defining plugins, Sphinx asks you to
list them explicitly in the configuration file. This makes it easy to
install several extensions to be used by several projects, and only
enable the ones you want for any given documentation set.

Extensions are listed in the conf.py configuration file for your
Sphinx project, in the extensions variable. I added my module to
the sphinxcontrib project namespace package, so the module has the
name sphinxcontrib.bitbucket.

# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = ['sphinx.ext.ifconfig',
              'sphinx.ext.autodoc',
              'sphinxcontrib.bitbucket',
              ]

Sphinx uses the name given to import the module or package containing
the extension, and then calls a function named setup() to
initialize the extension. During the initialization phase you can
register new roles and directives, as well as configuration values.

def setup(app):
    """Install the plugin.

    :param app: Sphinx application context.
    """
    app.add_role('bbissue', bbissue_role)
    app.add_role('bbchangeset', bbchangeset_role)
    app.add_config_value('bitbucket_project_url', None, 'env')
    return

For this extension I did not want to make any assumptions about the
BitBucket user or project name, so a bitbucket_project_url value
must be added to conf.py.

bitbucket_project_url = 'http://bitbucket.org/dhellmann/virtualenvwrapper/'

Accessing Sphinx Configuration from Your Role

Sphinx handles configuration a little differently from docutils, so I
had to dig for a while to find an explanation of how to access the
configuration value from within the role processor. The inliner
argument includes a reference to the current document being processed,
including the docutils settings. Those settings contain an
environment context object, which can be modified by the processors
(to track things like items to include in a table of contents or
index, for example). Sphinx adds its separate application context to
the environment, and the application context includes the
configuration settings. If your role function’s argument is
inliner, then the full path to access a config value called
my_setting is:

inliner.document.settings.env.app.config.my_setting
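
The attribute chain can be wrapped in a small helper so a role
function does not repeat it. This is a sketch under stated
assumptions: get_config_value is a hypothetical helper, and the
SimpleNamespace objects are stand-ins that only mimic the shape of the
real inliner so the example runs without Sphinx.

```python
from types import SimpleNamespace


def get_config_value(inliner, name, default=None):
    """Return a Sphinx config value reachable from the inliner."""
    config = inliner.document.settings.env.app.config
    # Fall back to the default when the setting was never defined.
    return getattr(config, name, default)


# Stand-in objects mirroring inliner.document.settings.env.app.config.
config = SimpleNamespace(
    bitbucket_project_url='http://bitbucket.org/dhellmann/virtualenvwrapper/')
app = SimpleNamespace(config=config)
env = SimpleNamespace(app=app)
settings = SimpleNamespace(env=env)
document = SimpleNamespace(settings=settings)
fake_inliner = SimpleNamespace(document=document)

url = get_config_value(fake_inliner, 'bitbucket_project_url')
missing = get_config_value(fake_inliner, 'my_setting', default='unset')
```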

Results

The new bbissue role looks the same as the rfc role, with the
ticket id as the body of the interpreted text.

For example:

:bbissue:`42`

becomes: issue 42

See also

sphinxcontrib.bitbucket home
Home page for sphinxcontrib.bitbucket, with links to the issue
tracker and announcements of new releases.
sphinxcontrib.bitbucket source
The complete source code for the Sphinx extension described above,
including both bbissue and bbchangeset roles.
Tutorial: Writing a simple extension
Part of the Sphinx documentation set, this tutorial explains how
to create a basic directive processor.
Creating reStructuredText Interpreted Text Roles
David Goodger’s original documentation for creating new roles for
docutils.
Docutils Hacker’s Guide
An introduction to Docutils’ internals by Lea Wiemann.