Command line programs are classes, too!
Most OOP discussions focus on GUI or domain-specific development areas, completely ignoring the workhorse of computing: command line programs. This article examines CommandLineApp, a base class for creating command line programs as objects, with option and argument validation, help text generation, and more.
Although many of the hot new development topics are centered on web technologies like AJAX, regular command line programs are still an important part of most systems. Many system administration tasks still depend on command line programs, for example. Often, a problem is simple enough that there is no reason to build a graphical or web user interface when a straightforward command line interface will do the job. Command line programs are less glamorous than programs with fancy graphics, but they are still the workhorses of modern computing.
The Python standard library includes two modules for working with
command line options. The getopt module presents an API that has
been in use for decades on some platforms and is commonly available in
many programming languages, from C to bash. The optparse module is
more modern than getopt, and offers features such as type
validation, callbacks, and automatic help generation. Both modules
elect to use a procedural-style interface, though, and as a result
neither has direct support for treating your command line application
as a first class object. There is no facility for sharing common
options between related programs using getopt. And, while it is
possible to reuse optparse.OptionParser
instances in different
programs, it is not as natural as inheritance.
CommandLineApp is a base class for command line programs. It handles the repetitive aspects of interacting with the user on the command line such as parsing options and arguments, generating help messages, error handling, and printing status messages. To create your application, just make a subclass of CommandLineApp and concentrate on your own code. All of the information about switches, arguments, and help text necessary for your program to run is derived through introspection. Common options and behavior can be shared by applications through inheritance.
To create your application, just make a subclass of CommandLineApp and concentrate on your own code.
csvcat Requirements
Recently, I needed to combine data from a few different sources, including a database and a spreadsheet, to summarize the results. I wanted to import the merged data into a spreadsheet where I could perform the analysis. All of the sources were able to save data to comma-separated-value (CSV) files; the challenge was merging the files together. Using the csv module in the Python standard library, and CommandLineApp, I wrote a small program to read multiple CSV files and concatenate them into a single output file. The program,
csvcat, is a good illustration of how to create applications with CommandLineApp.
The requirements for csvcat were fairly simple. It needed to read one or more CSV files and combine them, without repeating the column headers that appeared in each input source. In some cases, the input data included columns I did not want, so I needed to be able to select the columns to include in the output. No sort feature was needed, since I was going to import it into a spreadsheet when I was done and I could sort the data after importing it. To make the program more generally useful, I also included the ability to select the output format using a csv module feature called “dialects”.
Analyzing the Help
Listing 1 shows the help output for the final version of csvcat,
produced by running csvcat --help
. Listing 2 shows the source for
the program. All of the information in the help output is derived from
the csvcat class through introspection. The help text follows a
fairly standard layout. It begins with a description of the
application, followed by increasingly more detailed descriptions of
the syntax, arguments, and options. Application-specific help such as
examples and argument ranges appears at the end.
Listing 1
Concatenate comma separated value files.
SYNTAX:
csvcat [<options>] filename [filename...]
-c col[,col...], --columns=col[,col...]
-d name, --dialect=name
--debug
-h
--help
--quiet
--skip-headers
-v
--verbose=level
ARGUMENTS:
The names of comma separated value files, such as might be
exported from a spreadsheet or database program.
OPTIONS:
-c col[,col...], --columns=col[,col...]
Limit the output to the specified columns. Columns are
identified by number, starting with 0.
-d name, --dialect=name
Specify the output dialect name. Defaults to "excel".
--debug
Set debug mode to see tracebacks.
-h
Displays abbreviated help message.
--help
Displays verbose help message.
--quiet
Turn on quiet mode.
--skip-headers
Treat the first line of each file as a header, and only
include one copy in the output.
-v
Increment the verbose level. Higher levels are more verbose.
The default is 1.
--verbose=level
Set the verbose level.
EXAMPLES:
To concatenate 2 files, including all columns and headers:
$ csvcat file1.csv file2.csv
To concatenate 2 files, skipping the headers in the second file:
$ csvcat --skip-headers file1.csv file2.csv
To concatenate 2 files, including only the first and third columns:
$ csvcat --col 0,2 file1.csv file2.csv
OUTPUT DIALECTS:
excel-tab
excel
Listing 2
#!/usr/bin/env python
"""Concatenate csv files.
"""
import csv
import sys
import CommandLineApp
class csvcat(CommandLineApp.CommandLineApp):
"""Concatenate comma separated value files.
"""
EXAMPLES_DESCRIPTION = '''
To concatenate 2 files, including all columns and headers:
$ csvcat file1.csv file2.csv
To concatenate 2 files, skipping the headers in the second file:
$ csvcat --skip-headers file1.csv file2.csv
To concatenate 2 files, including only the first and third columns:
$ csvcat --col 0,2 file1.csv file2.csv
'''
def showVerboseHelp(self):
CommandLineApp.CommandLineApp.showVerboseHelp(self)
print
print 'OUTPUT DIALECTS:'
print
for name in csv.list_dialects():
print 't%s' % name
print
return
skip_headers = False
def optionHandler_skip_headers(self):
"""Treat the first line of each file as a header,
and only include one copy in the output.
"""
self.skip_headers = True
return
dialect = "excel"
def optionHandler_dialect(self, name):
"""Specify the output dialect name.
Defaults to "excel".
"""
self.dialect = name
return
optionHandler_d = optionHandler_dialect
columns = []
def optionHandler_columns(self, *col):
"""Limit the output to the specified columns.
Columns are identified by number, starting with 0.
"""
self.columns.extend([int(c) for c in col])
return
optionHandler_c = optionHandler_columns
def getPrintableColumns(self, row):
"""Return only the part of the row which should be printed.
"""
if not self.columns:
return row
# Extract the column values, in the order specified.
response = ()
for c in self.columns:
response += (row[c],)
return response
def getWriter(self):
return csv.writer(sys.stdout, dialect=self.dialect)
def main(self, *filename):
"""
The names of comma separated value files, such as might be
exported from a spreadsheet or database program.
"""
headers_written = False
writer = self.getWriter()
# process the files in order
for name in filename:
f = open(name, 'rt')
try:
reader = csv.reader(f)
if self.skip_headers:
if not headers_written:
# This row must include the headers for the output
headers = reader.next()
writer.writerow(self.getPrintableColumns(headers))
headers_written = True
else:
# We have seen headers before, and are skipping,
# so do not write the first row of this file.
ignore = reader.next()
# Process the rest of the file
for row in reader:
writer.writerow(self.getPrintableColumns(row))
finally:
f.close()
return
if __name__ == '__main__':
csvcat().run()
The program description is taken from the docstring of the csvcat class. Before it is printed, the text is split into paragraphs and reformatted using textwrap, to ensure that it is no wider than 80 columns of text.
The program description is followed by a syntax summary for the
program. The options listed in the syntax section correspond to
methods with names that begin with optionHandler_
. For example,
optionHandler_skip_headers()
indicates that csvcat should accept
a --skip-headers
option on the command line.
The names of any non-optional arguments to the program appear in the
syntax summary. In this case, csvcat needs the names of the files
containing the input data. At least one file name is necessary, and
multiple names can be given, as indicated by the fact that the
filename
argument to main()
(line 78) uses the variable argument
notation: *filename
. A longer description of the arguments, taken
from the docstring of the main()
method (lines 79-82), follows the
syntax summary. As with the general program summary, the description
of the arguments is reformatted with textwrap to fit the screen.
Options and Their Arguments
Following the argument description is a detailed explanation of all of the options to the program. CommandLineApp examines each option handler method to build the option description, including the name of the option, alternative names for the same option, and the name and description of any arguments the option accepts. There are three variations of option handlers, based on the arguments used by the option.
The simplest kind of option does not take an argument at all, and is
used as a “switch” to turn a feature on or off. The method
optionHandler_skip_headers
(lines 38-43) is an example of such a
switch. The method takes no argument, so CommandLineApp recognizes
that the option being defined does not take an argument either. To
create the option name, the prefix is stripped from the method name,
and the underscore is converted to a dash (--
);
optionHandler_skip_headers
becomes --skip-headers
.
Other options accept a single argument. For example, the --dialect
option requires the name of the CSV output dialect. The method
optionHandler_dialect
(lines 46-51) takes one argument, called
name
. The suggested syntax for the option, as seen in Listing 1, is
--dialect=name
. The name of the method’s argument is used as the
name of the argument to the option in the help text.
The -d
option has the same meaning as --dialect
, because
optionHandler_d
is an alias for optionHandler_dialect
(line
52). CommandLineApp recognizes aliases, and combines the forms in
the documentation so the alternative forms -d name
and
--dialect=name
are described together.
It is often useful for an option to take multiple arguments, as with
--columns
. The user could repeat the option on the command line, but
it is more compact to allow them to list multiple values in one
argument list. When CommandLineApp sees an option handler method
that takes a variable argument list, it treats the corresponding
option as accepting a list of arguments. When the option appears on
the command line, the string argument is split on any commas and the
resulting list of strings is passed to the option handler method.
For example, optionHandler_columns
(lines 55-60) takes a variable
length argument named col
. The option --columns
can be followed by
several column numbers, separated by commas. The option handler is
called with the list of values pre-parsed. In the syntax description,
the argument is shown repeating: --columns=col[,col…]
.
For all cases, the docstring from the option handler method serves as the help text for the option. The text of the docstring is reformatted using textwrap so both the code and help output are easy to read without extra effort on the part of the developer.
Application-specific Detailed Help
The general syntax and option description information is produced in the same way for all CommandLineApp programs. There are times when an application needs to include additional information in the help output, though, and there are two ways to add such information.
The first way is by providing examples of how to use the program on
the command line. Although it is optional, including examples of how
to apply different combinations of arguments to your program to
achieve various results enhances the usefulness of the help as a
reference manual. When the EXAMPLES_DESCRIPTION
class attribute is
set, it is used as the source for the examples. Unlike the other
documentation strings, the EXAMPLES_DESCRIPTION
is printed directly
without being reformatted. This preserves the indentation and other
formatting of the examples, so the user sees an accurate
representation of the program’s inputs and outputs.
Occasionally, a program may need to include information in its help
output which cannot be statically defined in a docstring or derived by
CommandLineApp. At the very end of its help, csvcat includes a
list of available CSV dialects which can be used with the --dialect
option. Since the list of dialects must be constructed at runtime
based on what dialects have been registered with the csv module,
csvcat overrides showVerboseHelp()
to print the list itself
(lines 27-35).
Using csvcat
The inputs to csvcat are any number of CSV files, and the output is CSV data printed to standard output. To test csvcat during development, I created two small files with test data. Each file contains three columns of data: a number, a string, and a date.
$ cat testdata1.csv
"Title 1","Title 2","Title 3"
1,"a",08/18/07
2,"b",08/19/07
3,"c",08/20/07
The second file does not include quotes around any of the string fields. I chose to include this variation because csvcat does not quote its output, so using unquoted test data simulates re-processing the output of csvcat.
$ cat testdata2.csv
Title 1,Title 2,Title 3
40,D,08/21/07
50,E,08/22/07
60,F,08/23/07
The simplest use of csvcat is to print the contents of an input file to standard output. Notice that the output does not include quotes around the string fields.
$ csvcat testdata1.csv
Title 1,Title 2,Title 3
1,a,08/18/07
2,b,08/19/07
3,c,08/20/07
It is also possible to select which columns should be included in the
output using the --columns
option. Columns are identified by their
number, beginning with ``. Column numbers can be listed in any order,
so it is possible to reorder the columns of the input data, if needed.
$ csvcat --columns 2,0 testdata1.csv
Title 3,Title 1
08/18/07,1
08/19/07,2
08/20/07,3
Switching to tab-separated columns instead of comma-separated is
easily accomplished by using the --dialect
option. There are only
two dialects available by default, but the the csv module API
supports registering additional dialects.
$ csvcat --dialect excel-tab testdata1.csv
Title 1 Title 2 Title 3
1 a 08/18/07
2 b 08/19/07
3 c 08/20/07
For my project, there were input files with several columns, but only two of them needed to be included in the output. Each file had a single row of column headers. I only wanted one set of headers in the output, so the headers from subsequent files needed to be skipped. And the output had to be in a format I could import into a spreadsheet, for which the default “excel” dialect worked fine. The data was merged with a command like this:
$ csvcat --skip-headers --columns 2,0 testdata1.csv testdata2.csv
Title 3,Title 1
08/18/07,1
08/19/07,2
08/20/07,3
08/21/07,40
08/22/07,50
08/23/07,60
Running a CommandLineApp Program
Most of the work for csvcat is being done in the main()
method. To invoke the application, however, the caller does not invoke
main()
directly. The program should be started by calling run()
,
so the options are validated and exceptions from main()
are
handled. The run()
method is one of several methods that are not
intended to be overridden by derived classes, since they implement the
core features of a command line program. The source for
CommandLineApp appears in Listing 3.
Listing 3
#!/usr/bin/env python
# CommandLineApp.py
"""Base class for building command line applications.
"""
import getopt
import inspect
import os
try:
from cStringIO import StringIO
except:
from StringIO import StringIO
import sys
import textwrap
class CommandLineApp(object):
"""Base class for building command line applications.
Define a docstring for the class to explain what the program does.
Include descriptions of the command arguments in the docstring for
main().
When the EXAMPLES_DESCRIPTION class attribute is not empty, it
will be printed last in the help message when the user asks for
help.
"""
EXAMPLES_DESCRIPTION = ''
# If true, always ends run() with sys.exit()
force_exit = True
# The name of this application
_app_name = os.path.basename(sys.argv[0])
_app_version = None
def __init__(self, commandLineOptions=sys.argv[1:]):
"Initialize CommandLineApp."
self.command_line_options = commandLineOptions
self.supported_options = self.scanForOptions()
return
def main(self, *args):
"""Main body of your application.
This is the main portion of the app, and is run after all of
the arguments are processed. Override this method to implment
the primary processing section of your application.
"""
pass
def handleInterrupt(self):
"""Called when the program is interrupted via Control-C
or SIGINT. Returns exit code.
"""
sys.stderr.write('Canceled by user.n')
return 1
def handleMainException(self, err):
"""Invoked when there is an error in the main() method.
"""
if self.debugging:
import traceback
traceback.print_exc()
else:
self.errorMessage(str(err))
return 1
## HELP
def showHelp(self, errorMessage=None):
"Display help message when error occurs."
print
if self._app_version:
print '%s version %s' % (self._app_name, self._app_version)
else:
print self._app_name
print
# If they made a syntax mistake, just
# show them how to use the program. Otherwise,
# show the full help message.
if errorMessage:
print ''
print 'ERROR: ', errorMessage
print ''
print ''
print '%sn' % self._app_name
print ''
txt = self.getSimpleSyntaxHelpString()
print txt
print 'For more details, use --help.'
print
return
def showVerboseHelp(self):
"Display the full help text for the command."
txt = self.getVerboseSyntaxHelpString()
print txt
return
## STATUS MESSAGES
def statusMessage(self, msg='', verbose_level=1, error=False, newline=True):
"""Print a status message to output.
Arguments
msg='' -- The status message string to be printed.
verbose_level=1 -- The verbose level to use. The message
will only be printed if the current verbose
level is >= this number.
error=False -- If true, the message is considered an error and
printed as such.
newline=True -- If true, print a newline after the message.
"""
if self.verbose_level >= verbose_level:
if error:
output = sys.stderr
else:
output = sys.stdout
output.write(str(msg))
if newline:
output.write('n')
# some log mechanisms don't have a flush method
if hasattr(output, 'flush'):
output.flush()
return
def errorMessage(self, msg=''):
'Print a message as an error.'
self.statusMessage('ERROR: %sn' % msg, verbose_level=0, error=True)
return
## DEFAULT OPTIONS
debugging = False
def optionHandler_debug(self):
"Set debug mode to see tracebacks."
self.debugging = True
return
_run_main = True
def optionHandler_h(self):
"Displays abbreviated help message."
self.showHelp()
self._run_main = False
return
def optionHandler_help(self):
"Displays verbose help message."
self.showVerboseHelp()
self._run_main = False
return
def optionHandler_quiet(self):
'Turn on quiet mode.'
self.verbose_level = 0
return
verbose_level = 1
def optionHandler_v(self):
"""Increment the verbose level.
Higher levels are more verbose.
The default is 1.
"""
self.verbose_level = self.verbose_level + 1
self.statusMessage('New verbose level is %d' % self.verbose_level,
3)
return
def optionHandler_verbose(self, level=1):
"""Set the verbose level.
"""
self.verbose_level = int(level)
self.statusMessage('New verbose level is %d' % self.verbose_level,
3)
return
## INTERNALS (Subclasses should not need to override these methods)
def run(self):
"""Entry point.
Process options and execute callback functions as needed.
This method should not need to be overridden, if the main()
method is defined.
"""
# Process the options supported and given
options = {}
for info in self.supported_options:
options[ info.switch ] = info
parsed_options, remaining_args = self.callGetopt(self.command_line_options,
self.supported_options)
exit_code = 0
try:
for switch, option_value in parsed_options:
opt_def = options[switch]
opt_def.invoke(self, option_value)
# Perform the primary action for this application,
# unless one of the options has disabled it.
if self._run_main:
main_args = tuple(remaining_args)
# We could just call main() and catch a TypeError,
# but that would not let us differentiate between
# application errors and a case where the user
# has not passed us enough arguments. So, we check
# the argument count ourself.
num_args_ok = False
argspec = inspect.getargspec(self.main)
expected_arg_count = len(argspec[0]) - 1
if argspec[1] is not None:
num_args_ok = True
if len(argspec[0]) > 1:
num_args_ok = (len(main_args) >= expected_arg_count)
elif len(main_args) == expected_arg_count:
num_args_ok = True
if num_args_ok:
exit_code = self.main(*main_args)
else:
self.showHelp('Incorrect arguments.')
exit_code = 1
except KeyboardInterrupt:
exit_code = self.handleInterrupt()
except SystemExit, msg:
exit_code = msg.args[0]
except Exception, err:
exit_code = self.handleMainException(err)
if self.debugging:
raise
if self.force_exit:
sys.exit(exit_code)
return exit_code
def scanForOptions(self):
"Scan through the inheritence hierarchy to find option handlers."
options = []
methods = inspect.getmembers(self.__class__, inspect.ismethod)
for method_name, method in methods:
if method_name.startswith(OptionDef.OPTION_HANDLER_PREFIX):
options.append(OptionDef(method_name, method))
return options
def callGetopt(self, commandLineOptions, supportedOptions):
"Parse the command line options."
short_options = []
long_options = []
for o in supportedOptions:
if len(o.option_name) == 1:
short_options.append(o.option_name)
if o.arg_name:
short_options.append(':')
elif o.arg_name:
long_options.append('%s=' % o.switch_base)
else:
long_options.append(o.switch_base)
short_option_string = ''.join(short_options)
try:
parsed_options, remaining_args = getopt.getopt(
commandLineOptions,
short_option_string,
long_options)
except getopt.error, message:
self.showHelp(message)
if self.force_exit:
sys.exit(1)
raise
return (parsed_options, remaining_args)
def _groupOptionAliases(self):
"""Return a sequence of tuples containing
(option_names, option_defs)
"""
# Figure out which options are aliases
option_aliases = {}
for option in self.supported_options:
method = getattr(self, option.method_name)
existing_aliases = option_aliases.setdefault(method, [])
existing_aliases.append(option)
# Sort the groups in order
grouped_options = []
for options in option_aliases.values():
names = [ o.option_name for o in options ]
grouped_options.append( (names, options) )
grouped_options.sort()
return grouped_options
def _getOptionIdentifierText(self, options):
"""Return the option identifier text.
For example:
-h
-v, --verbose
-f bar, --foo bar
"""
option_texts = []
for option in options:
option_texts.append(option.getSwitchText())
return ', '.join(option_texts)
def getArgumentsSyntaxString(self):
"""Look at the arguments to main to see what the program accepts,
and build a syntax string explaining how to pass those arguments.
"""
syntax_parts = []
argspec = inspect.getargspec(self.main)
args = argspec[0]
if len(args) > 1:
for arg in args[1:]:
syntax_parts.append(arg)
if argspec[1]:
syntax_parts.append(argspec[1])
syntax_parts.append('[' + argspec[1] + '...]')
syntax = ' '.join(syntax_parts)
return syntax
def getSimpleSyntaxHelpString(self):
"""Return syntax statement.
Return a simplified form of help including only the
syntax of the command.
"""
buffer = StringIO()
# Show the name of the command and basic syntax.
buffer.write('%s [<options>] %snn' %
(self._app_name, self.getArgumentsSyntaxString())
)
grouped_options = self._groupOptionAliases()
# Assemble the text for the options
for names, options in grouped_options:
buffer.write(' %sn' % self._getOptionIdentifierText(options))
return buffer.getvalue()
def _formatHelpText(self, text, prefix):
if not text:
return ''
buffer = StringIO()
text = textwrap.dedent(text)
for para in text.split('nn'):
formatted_para = textwrap.fill(para,
initial_indent=prefix,
subsequent_indent=prefix,
)
buffer.write(formatted_para)
buffer.write('nn')
return buffer.getvalue()
def getVerboseSyntaxHelpString(self):
"""Return the full description of the options and arguments.
Show a full description of the options and arguments to the
command in something like UNIX man page format. This includes
- a description of each option and argument, taken from the
__doc__ string for the optionHandler method for
the option
- a description of what additional arguments will be processed,
taken from the arguments to main()
"""
buffer = StringIO()
class_help_text = self._formatHelpText(inspect.getdoc(self.__class__),
'')
buffer.write(class_help_text)
buffer.write('nSYNTAX:nn ')
buffer.write(self.getSimpleSyntaxHelpString())
main_help_text = self._formatHelpText(inspect.getdoc(self.main), ' ')
if main_help_text:
buffer.write('nnARGUMENTS:nn')
buffer.write(main_help_text)
buffer.write('nOPTIONS:nn')
grouped_options = self._groupOptionAliases()
# Describe all options, grouping aliases together
for names, options in grouped_options:
buffer.write(' %sn' % self._getOptionIdentifierText(options))
help = self._formatHelpText(options[0].help, ' ')
buffer.write(help)
if self.EXAMPLES_DESCRIPTION:
buffer.write('EXAMPLES:nn')
buffer.write(self.EXAMPLES_DESCRIPTION)
return buffer.getvalue()
class OptionDef(object):
"""Definition for a command line option.
Attributes:
method_name - The name of the option handler method.
option_name - The name of the option.
switch - Switch to be used on the command line.
arg_name - The name of the argument to the option handler.
is_variable - Is the argument expected to be a sequence?
default - The default value of the option handler argument.
help - Help text for the option.
is_long - Is the option a long value (--) or short (-)?
"""
# Option handler method names start with this value
OPTION_HANDLER_PREFIX = 'optionHandler_'
# For *args arguments to option handlers, how to split the argument values
SPLIT_PARAM_CHAR = ','
def __init__(self, methodName, method):
self.method_name = methodName
self.option_name = methodName[len(self.OPTION_HANDLER_PREFIX):]
self.is_long = len(self.option_name) > 1
self.switch_base = self.option_name.replace('_', '-')
if len(self.switch_base) == 1:
self.switch = '-' + self.switch_base
else:
self.switch = '--' + self.switch_base
argspec = inspect.getargspec(method)
self.is_variable = False
args = argspec[0]
if len(args) > 1:
self.arg_name = args[-1]
elif argspec[1]:
self.arg_name = argspec[1]
self.is_variable = True
else:
self.arg_name = None
if argspec[3]:
self.default = argspec[3][0]
else:
self.default = None
self.help = inspect.getdoc(method)
return
def getSwitchText(self):
"""Return the description of the option switch.
For example: --switch=arg or -s arg or --switch=arg[,arg]
"""
parts = [ self.switch ]
if self.arg_name:
if self.is_long:
parts.append('=')
else:
parts.append(' ')
parts.append(self.arg_name)
if self.is_variable:
parts.append('[%s%s...]' % (self.SPLIT_PARAM_CHAR, self.arg_name))
return ''.join(parts)
def invoke(self, app, arg):
"""Invoke the option handler.
"""
method = getattr(app, self.method_name)
if self.arg_name:
if self.is_variable:
opt_args = arg.split(self.SPLIT_PARAM_CHAR)
method(*opt_args)
else:
method(arg)
else:
method()
return
if __name__ == '__main__':
CommandLineApp().run()
The available and supported options are examined when the instance is
initialized (lines 40-44). By default, the contents of sys.argv
are
used as the options and arguments passed in from the command line to
the program. It is easy to pass a different list of options when
writing automated tests for your program, by passing a list of strings
to __init__()
as commandLineOptions
. The options supported by the
program are determined by scanning the class for option handler
methods. No options are actually evaluated until run()
is called.
When the program is run, the first thing it does is use getopt to
validate the options it has been given (line 201). In callGetopt()
,
the arguments needed by getopt are constructed based on the option
handlers discovered for the class (lines 262-288). Options are
processed in the order they are passed on the command line (lines
205-207), and the option handler method for each option encountered is
called. When an option handler requires an argument that is not
provided on the command line, getopt detects the error. When an
argument is provided, the option handler is responsible for
determining whether the value is the correct type or otherwise
valid. When the argument is not valid, the option handler can raise an
exception with an error message to be printed for the user.
After all of the options are handled, the remaining arguments to the
program are checked to be sure there are enough to satisfy the
requirements, based on the argspec of the main()
function. The
number of arguments is checked explicitly to avoid having to handle a
TypeError
if the user does not pass the right number of arguments on
the command line. If CommandLineApp depended on catching a
TypeError
when it passed too few arguments to main()
, it could not
tell the difference between a coding error and a user error. If a
mistake inside main()
caused a TypeError
to occur, it might look
like the user had passed an incorrect number of arguments to the
program.
Error Handling
When an exception is raised during option processing or inside
main()
, the exception is caught by one of the except
clauses on
lines 236-245 and given to an error handling method. Subclasses can
change the error handling behavior by overriding these methods.
KeyboardInterrupt
exceptions are handled by calling
handleInterrupt()
. The default behavior is to print a message that
the program has been interrupted and cause the program to exit with an
error code. A subclass could override the method to clean up an
in-progress task, background thread, or other operation which
otherwise might not be automatically stopped when the
KeyboardInterrupt
is received.
When a lower level library tries to exit the program, SystemExit
may
be raised. CommandLineApp traps the SystemExit
exception and
exits normally, using the exit status taken from the exception. If the
force_exit
attribute of the application is false, run()
returns
instead of exiting (lines 247-249). Trapping attempts to exit makes it
easier to integrate CommandLineApp programs with unittest
or
other testing frameworks. The test can instantiate the application,
set force_exit
to a false value, then run it. If any errors occur, a
status code is returned but the test process does not exit.
Trapping attempts to exit makes it easier to integrate CommandLineApp programs with unittest or other testing frameworks.
All other types of exceptions are handled by calling
handleMainException()
and passing the exception as an argument. The
default implementation of handleMainException()
(lines 62-70) prints
a simple error message based on the exception, unless debugging mode
is turned on. Debugging mode prints the entire traceback for the
exception.
$ csvcat file_does_not_exist.csv
ERROR: [Errno 2] No such file or directory:
'file_does_not_exist.csv'
Option Definitions
The standard library module inspect provides functions for performing introspection operations on classes and objects at runtime. The API supports basic querying and type checking so it is possible, for example, to get a list of the methods of a class, including all inherited methods.
CommandLineApp.scanForOptions()
uses inspect to scan an
application class for option handler methods (lines 251-260). All of
the methods of the class are retrieved with inspect.getmembers()
,
and those whose name starts with optionHandler_
are added to the
list of supported options. Since most command line options use dashes
instead of underscores, but method names cannot contain dashes, the
underscores in the option handler method names are converted to dashes
when creating the option name.
The __init__()
method of the OptionDef class (lines 440-469)
does all of the work of determining the command line switch name and
what type of arguments the switch takes. The option handler method is
examined with inspect.getargspec()
, and the result is used to
initialize the OptionDef.
An “argspec” for a function is a tuple made up of four values: a list
of the names of all regular arguments to the function, including
self
if the function is a method; the name of the argument to
receive the variable argument values, if any; the name of the argument
to receive the keyword arguments, if any; and a list of the default
values for the arguments, in they order they appear in the list of
option names.
The argspecs for the option handlers in csvcat illustrate the
variations of interest to OptionDef. First,
optionHandler_skip_headers
:
>>> import Listing2
>>> import inspect
>>> print inspect.getargspec(
... Listing2.csvcat.optionHandler_skip_headers)
(['self'], None, None, None)
Since the only positional argument to the method is self
, and there
is no variable argument name given, the option handler is treated as a
simple command line switch without any arguments.
The optionHandler_dialect
, on the other hand, does include an
additional argument:
>>> print inspect.getargspec(
... Listing2.csvcat.optionHandler_dialect)
(['self', 'name'], None, None, None)
The name
argument is listed in the argspec as a single regular
argument. The result, when a program is run, is that while the options
are being processed by CommandLineApp and OptionDef, the value
for name
is passed directly to the option handler method (line 497).
The optionHandler_columns
method illustrates variable argument handling:
>>> print inspect.getargspec(
... Listing2.csvcat.optionHandler_columns)
(['self'], 'col', None, None)
The col
argument from optionHandler_columns
is named in the
argspec as the variable argument identifier. Since
optionHandler_columns
accepts variable arguments, the OptionDef
splits the argument value into a list of strings, and the list is
passed to the option handler method (lines 494-495) using the variable
argument syntax.
The other variable argument configuration, using unidentified keyword arguments, does not make sense for an option handler. The user of the command line program has no standard way to specify named arguments to options, so they are not supported by OptionDef.
Status Messages
In addition to command line option and argument parsing, and error
handling, CommandLineApp provides a “status message” interface for
giving varying levels of feedback to the user. Status messages are
printed by calling self.statusMessage()
(line 108). Each message
must indicate the verbose level setting at which the message should be
printed. If the current verbose level is at or higher than the desired
level, the message is printed. Otherwise, it is ignored. The -v
,
--verbose
, and --quiet
flags let the user control the
verbose_level
setting for the application, and are defined in the
CommandLineApp so that all subclasses inherit them.
Listing 4
#!/usr/bin/env python
# Illustrate verbose level controls.
import CommandLineApp
class verbose_app(CommandLineApp.CommandLineApp):
"Demonstrate verbose level controls."
def main(self):
for i in range(1, 10):
self.statusMessage('Level %d' % i, i)
return
if __name__ == '__main__':
verbose_app().run()
Listing 4 contains another sample application which uses
statusMessage()
to illustrate how the verbose level setting is
applied. The default verbose level is 1, so when the program is run
without any additional arguments only a single message is printed:
$ python Listing4.py
Level 1
$
The --quiet
option silences all status messages by setting the
verbose level to ``:
$ python Listing4.py --quiet
$
Using the -v
option increases the verbose setting, one level at a
time. The option can be repeated on the command line:
$ python Listing4.py -v
Level 1
Level 2
$ python Listing4.py -vv
New verbose level is 3
Level 1
Level 2
Level 3
$
And the --verbose
option sets the verbose level directly to the desired value:
$ python Listing4.py --verbose 4
New verbose level is 4
Level 1
Level 2
Level 3
Level 4
$
Error messages can be printed to the standard error stream using the
errorMessage()
method (lines 138-141). The message is prefixed with
the word “ERROR”, and error messages are always printed, no matter
what verbose level is set. Most programs will not need to use
errorMessage()
directly, because raising an exception is sufficient
to have an error message displayed for the user.
CommandLineApp and Inheritance
When creating a suite of related programs, it is usually desirable for
all of the programs to use the same options and, in many cases, share
other common behavior. For example, when working with a database the
connection and transaction must be managed reliably. Rather than
re-implementing the same database handling code in each program, by
using CommandLineApp, you can create an intermediate base class
for your programs and share a single implementation. Listing 5
includes a skeleton base class called SQLiteAppBase for working
with an sqlite3
database in this way.
Listing 5
#!/usr/bin/env
# Base class for sqlite programs.
import sqlite3
import CommandLineApp
class SQLiteAppBase(CommandLineApp.CommandLineApp):
"""Base class for accessing sqlite databases.
"""
dbname = 'sqlite.db'
def optionHandler_db(self, name):
"""Specify the database filename.
Defaults to 'sqlite.db'.
"""
self.dbname = name
return
def main(self):
# Subclasses can override this to control the arguments
# used by the program.
self.db_connection = sqlite3.connect(self.dbname)
try:
self.cursor = self.db_connection.cursor()
exit_code = self.takeAction()
except:
# throw away changes
self.db_connection.rollback()
raise
else:
# save changes
self.db_connection.commit()
return exit_code
def takeAction(self):
"""Override this in the actual application.
Return the exit code for the application
if no exception is raised.
"""
raise NotImplementedError('Not implemented!')
if __name__ == '__main__':
SQLiteAppBase().run()
SQLiteAppBase defines a single option handler for the --db
option to let the user choose the database file (line 12). The default
database is a file in the current directory called “sqlite.db”. The
main()
method establishes a connection to the database (line 22),
opens a cursor for working with the connection (line 24), then calls
takeAction()
to do the work (line 25). When takeAction()
raises an
exception, all database changes it may have made are discarded and the
transaction is rolled back (line 28). When there is no error, the
transaction is committed and the changes are saved (line 32).
Listing 6
#!/usr/bin/env python
# Initialize the database
import time
from Listing5 import SQLiteAppBase
class initdb(SQLiteAppBase):
"""Initialize a database.
"""
def takeAction(self):
self.statusMessage('Initializing database %s' % self.dbname)
# Create the table
self.cursor.execute("CREATE TABLE log (date text, message text)")
# Log the actions taken
self.cursor.execute(
"INSERT INTO log (date, message) VALUES (?, ?)",
(time.ctime(), 'Created database'))
self.cursor.execute(
"INSERT INTO log (date, message) VALUES (?, ?)",
(time.ctime(), 'Created log table'))
return
if __name__ == '__main__':
initdb().run()
A subclass of SQLiteAppBase can override takeAction()
to do some
actual work using the database connection and cursor created in
main()
. Listing 6 contains one such program, called initdb
. In
initdb
, the takeAction()
method creates a “log” table (line 14)
using the database cursor established in the base class. It then
inserts two rows into the new table, using the same cursor. There is
no need for initdb
to commit the transaction, since the base class
will do that after takeAction()
returns without raising an
exception.
$ python Listing6.py
Initializing database sqlite.db
Listing 7
#!/usr/bin/env python
# Initialize the database
from Listing5 import SQLiteAppBase
class showlog(SQLiteAppBase):
"""Show the contents of the log.
"""
substring = None
def optionHandler_message(self, substring):
"""Look for messages with the substring.
"""
self.substring = substring
return
def takeAction(self):
if self.substring:
pattern = '%' + self.substring + '%'
c = self.cursor.execute(
"SELECT * FROM log WHERE message LIKE ?;",
(pattern,))
else:
c = self.cursor.execute("SELECT * FROM log;")
for row in c:
print '%-30s %s' % row
return 0
if __name__ == '__main__':
showlog().run()
The showlog
program in Listing 7 also uses SQLiteAppBase. It
reads records from the log table and prints them out to the
screen. When no options are given, it uses the cursor opened by the
base class to find all of the records in the “log” table (line 24),
and print them:
$ python Listing7.py
Sat Aug 25 19:09:41 2007 Created database
Sat Aug 25 19:09:41 2007 Created log table
The --message
option to showlog
can be used to filter the output
to include only records whose message column matches the pattern
given. When a message substring is specified, the select statement is
altered to include only messages containing the substring (lines
19-20). In this example, only log messages with the word “table” in
the message are printed:
$ python Listing7.py --message table
Sat Aug 25 19:09:41 2007 Created log table
The updatelog
program in Listing 8 inserts new records into the
database. Each time updatelog
is called, the message passed on the
command line is saved as an instance attribute by main()
(line 15)
so it can be used later when a new row is inserted into the log
table (line 20) by takeAction()
.
Listing 8
#!/usr/bin/env python
# Initialize the database
import time
from Listing5 import SQLiteAppBase
class updatelog(SQLiteAppBase):
"""Add to the contents of the log.
"""
def main(self, message):
"""Provide the new message to add to the log.
"""
# Save the message for use in takeAction()
self.message = message
return SQLiteAppBase.main(self)
def takeAction(self):
self.cursor.execute(
"INSERT INTO log (date, message) VALUES (?, ?)",
(time.ctime(), self.message))
return
if __name__ == '__main__':
updatelog().run()
$ python Listing8.py "another new message"
$ python Listing7.py
Sat Aug 25 19:09:41 2007 Created database
Sat Aug 25 19:09:41 2007 Created log table
Sat Aug 25 19:10:29 2007 another new message
As with initdb
, because the base class commits changes to the
database after takeAction()
returns, updatelog
does not need to
manage the database connection in any way. Since all of the example
programs use the database connection and cursor created by their base
class, they could be updated to use a Postgresql or MySQL database by
modifying the base class, without having to make those changes to each
program separately.
Future Work
I have been using CommandLineApp in my own work for several years now, and continue to find ways to enhance it. The two primary features I would still like to add are the ability to print the help for a command in formats other than plain text, and automatic type conversion for arguments.
It is difficult to prepare attractive printed documentation from plain text help output like what is produced by the current version of CommandLineApp. Parsing the text output directly is not necessarily straightforward, since the embedded help may contain characters or patterns that would confuse a simple parser. A better solution is to use the option data gathered by introspection to generate output in a format such as DocBook, which could then be converted to PDF or HTML using other tool sets specifically designed for that purpose. There is a prototype of a program to create DocBook output from an application class, but it is not robust enough to be released – yet.
CommandLineApp is based on the older option parsing module,
getopt, rather than the new optparse. This means it does not
support some of the newer features available in optparse, such as
type conversion for arguments. Type conversion could be added to
CommandLineApp by inferring the types from default values for
arguments. The OptionDef already discovers default values, but
they are not used. The OptionDef.invoke()
method needs to be updated
to look at the default for an option before calling the option
handler. If the default is a type object, it can be used to convert
the incoming argument. If the default is a regular object, the type of
the object can be determined using type()
. Then, once the type is
known, the argument can be converted.
Conclusion
I hope this article encourages you to think about your command line programs in a different light, and to treat them as first class objects. Using inheritance to share code is so common in other areas of development that it is hardly given a second thought in most cases. As has been shown with the SQLiteAppBase programs, the same technique can be just as powerful when applied to building command line programs, saving development time and testing effort as a result. CommandLineApp has been used as the foundation for dozens of types of programs, and could be just what you need the next time you have to write a new command line program.