beagle 0.3.0

beagle is a command line tool for querying a hound code search service such as http://codesearch.openstack.org

What’s new in 0.3.0?

  • Add repo-pattern usages examples in the doc (contributed by Hervé Beraud)
  • Add an option to filter repositories in search results
  • Refresh python’s versions and their usages (contributed by Hervé Beraud)

sphinxcontrib-spelling 5.1.2

sphinxcontrib-spelling is a spelling checker for Sphinx-based documentation. It uses PyEnchant to produce a report showing misspelled words.

What’s new in 5.1.2?

  • Mark as unsafe for parallel builds (contributed by Jared Dillard)
  • Add -W arg to sphinx-build in docs so warnings cause error (contributed by Elsa Gonsiorowski, PhD)

sphinxcontrib-spelling 5.1.0

sphinxcontrib-spelling is a spelling checker for Sphinx-based documentation. It uses PyEnchant to produce a report showing misspelled words.

What’s new in 5.1.0?

  • Add an option to show the line containing a misspelling for context (contributed by Huon Wilson)

beagle 0.2.2

beagle is a command line tool for querying a hound code search service such as http://codesearch.openstack.org

What’s new in 0.2.2?

  • fix the reference to undefined function in link formatter
  • Fix issues (contributed by Hervé Beraud)
  • Refactor pipelines (contributed by Hervé Beraud)
  • [doc] refresh oslo examples (contributed by Hervé Beraud)

imapautofiler 1.8.1

imapautofiler applies user-defined rules to automatically organize messages on an IMAP server.

What’s new in 1.8.1?

  • Fix comparison with TZ aware datetime in TimeLimit rule (contributed by Nicolas Le Manchet)

sphinxcontrib.datatemplates 0.5.0

sphinxcontrib.datatemplates is an extension for Sphinx to render parts of reStructuredText pages from data files in formats like JSON, YAML, XML, and CSV.

What’s new in 0.5.0?

  • Add domain for Python Modules (contributed by Jan Brohl)
  • Use default template manager when the builder does not have one (contributed by Toni Ruza)
  • Support parallel builds (contributed by Toni Ruza)
  • Add option to load multiple documents from yaml (contributed by Jan Brohl)
  • Restore Python3.6 compat (contributed by Jan Brohl)
  • Add support for DBM formats (contributed by Jan Brohl)
  • Improve documentation

New PyMOTW site logo

Last week Nils-Hero Lindemann contacted me with some icons to spruce up the PyMOTW site. I love the look and the thought that went into the design.

PyMOTW icon

“The four squares symbolize the modules and the white square is the current module, the ‘module of the week’.”

Thanks, Nils-Hero! I am touched that you took the time not only to create the image files, but also to work out (and tell me) exactly what I need to do to add them to the HTML properly.

Dependencies between Python Standard Library modules

Glyph’s post from the 13th about a “kernel python”, based on Amber’s presentation at PyCon, made me start thinking about how minimal the standard library could really be. Christian had previously started by nibbling around the edges, considering which modules are not frequently used and could be removed. I started thinking about a more extreme change: leaving in only enough code to successfully download and install other packages. The ensurepip module seemed like a necessary component for that, so I looked at its dependencies, with an eye to cutting everything else.

Methodology

I experimented with a few different ways to find the dependencies between modules. First, I looked at the contents of sys.modules after importing something. That showed that there were a lot of dependencies, but not which modules depended directly on which others.
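A minimal sketch of that first experiment, assuming a fresh interpreter: snapshot sys.modules, import the target, and look at what was added. The helper name modules_loaded_by is made up for illustration. The result is a flat list, which is why it cannot show which module pulled in which.

```python
import importlib
import sys


def modules_loaded_by(name):
    """Return the sorted names of modules added to sys.modules by importing `name`.

    Hypothetical helper for illustration. Modules that were already loaded
    before the call do not appear in the result.
    """
    before = set(sys.modules)
    importlib.import_module(name)
    return sorted(set(sys.modules) - before)
```

Calling modules_loaded_by("ensurepip") right after starting the interpreter lists everything the import dragged in, but the exact contents depend on the Python version and on what the interpreter had already loaded at startup.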

The output of python -v -c 'import ensurepip' was similarly general, showing the imports as they happened but not making the relationships between modules obvious.

Next, I looked at a few packages that already exist for doing this sort of analysis. modulegraph seems to assume that you are starting with packages released to PyPI, since it looks for setup.py and other setuptools-related packaging data. pydeps looked promising, but I couldn’t make it produce any output for the standard library. snakefood uses the abstract syntax tree, but does not work under Python 3.

Eventually I wrote a little code that uses the abstract syntax tree of each module to find import and from import statements. With that information, I was able to build a digraph with the module relationships.
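The general technique can be sketched with the ast module. This is not the actual scanner, only an illustration under a few assumptions: the function names find_imports and build_graph are invented here, relative imports are skipped, and every node is walked, so imports nested inside functions or conditional blocks are counted too.

```python
import ast


def find_imports(source, module_name="<unknown>"):
    """Return the set of module names imported by the given source text."""
    tree = ast.parse(source, filename=module_name)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import a.b, c" yields "a.b" and "c"
            for alias in node.names:
                found.add(alias.name)
        elif isinstance(node, ast.ImportFrom):
            # "from a.b import c" yields "a.b"; relative imports
            # (level > 0) are ignored in this sketch
            if node.level == 0 and node.module:
                found.add(node.module)
    return found


def build_graph(sources):
    """Build a digraph as a dict mapping each module to the modules it imports.

    `sources` maps module names to their source text.
    """
    return {name: find_imports(text, name) for name, text in sources.items()}
```

A real scanner would also need to locate each module's source file (for example by walking the standard library directory) and decide how to treat C modules, which have no Python source to parse.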

Results

I found one special case: the heapq module imports doctest as part of its __main__ block, but does not use it at runtime. I decided that it would be safe to remove that import from heapq, or at least to protect it with a useful error message if the import fails, so I filtered out the use of doctest, which also dropped a significant number of its dependencies.

I also treated __main__ as a special case, and ignored it completely.
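To illustrate why filtering out one module can drop many others, here is a small sketch that computes the modules reachable from a root while refusing to follow excluded names. The function name reachable and the dict-of-sets graph representation are assumptions made for this example.

```python
def reachable(graph, root, exclude=frozenset()):
    """Return every module reachable from `root`, never following excluded names.

    `graph` maps a module name to the set of modules it imports.
    """
    found = set()
    stack = [root]
    while stack:
        current = stack.pop()
        for dep in graph.get(current, ()):
            if dep in exclude or dep in found or dep == root:
                continue
            found.add(dep)
            stack.append(dep)
    return found
```

Passing exclude={"doctest", "__main__"} removes those names and, as a side effect, anything that was only reachable through them.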

As you might imagine, the standard library modules rely heavily on one another, so a full dependency graph is quite messy.

The full dependency graph for ensurepip.

To make the output more readable, I added a command line switch to produce a “simplified” graph, where a module is only referenced the first time it is imported by any other module. By traversing the graph breadth-first, I was able to build a readable tree that emphasizes the dependencies of modules “higher up the stack” (closer to ensurepip).
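The simplification can be sketched as a breadth-first traversal that records each module only under the first importer that reaches it. The function name simplify and the dict-based graph are assumptions for illustration, and dependencies are sorted only to make the result deterministic.

```python
from collections import deque


def simplify(graph, root):
    """Return a tree (parent -> list of children) in which each module appears
    only under the first module found importing it, breadth-first from `root`.

    `graph` maps a module name to the set of modules it imports.
    """
    tree = {root: []}
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for dep in sorted(graph.get(current, ())):
            if dep in seen:
                # Already attached closer to the root; skip the extra edge.
                continue
            seen.add(dep)
            tree[current].append(dep)
            tree[dep] = []
            queue.append(dep)
    return tree
```

Because the queue is processed level by level, a module imported by both the root and a deeper module ends up attached to the root, which is what keeps the tree weighted toward the top of the stack.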

This produced a list of 72 dependencies for ensurepip:

_bootlocale, _compression, _dummy_thread, _threading_local,
_weakrefset, abc, argparse, ast, bisect, bz2, codecs,
collections, collections.abc, contextlib, copy, dis,
dummy_threading, encodings, enum, fnmatch, functools,
genericpath, gettext, gzip, hashlib, heapq, importlib,
importlib._bootstrap, importlib._bootstrap_external,
importlib.machinery, importlib.util, inspect, io, keyword,
linecache, locale, logging, lzma, ntpath, opcode, operator, os,
pickle, pkgutil, posixpath, pprint, py_compile, random, re,
reprlib, selectors, shutil, signal, sre_compile, sre_constants,
sre_parse, stat, string, struct, subprocess, tarfile, tempfile,
textwrap, threading, token, tokenize, traceback, tracemalloc,
types, warnings, weakref, zipfile

Some are C modules, and others are only used in some cases and may not need to be part of a “kernel” distribution.

Next Steps

This is really just a start, to satisfy my own curiosity about the idea of a minimal standard library. Anyone truly setting out to reduce the size of the standard library would need to take this analysis further, to ensure that all of the imports found with the scanner are relevant (e.g., doctest was not) and to understand whether some of the code being imported could be moved from its current home to break the dependency on a larger module if only a constant or small function is being used. And of course, we may not want to go to the extreme of removing everything other than the modules used by ensurepip. There may be modules needed by the interactive interpreter, or other use cases, that should similarly be included in any “kernel” distribution.