PyMOTW: Queue

The Queue module provides a FIFO implementation suitable for
multi-threaded programming. It can be used to pass messages or other
data between producer and consumer threads safely. Locking is handled
for the caller, so it is simple to have as many threads as you want
working with the same Queue instance. A Queue’s size (number of
elements) may be restricted to throttle memory usage or processing.
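
A minimal sketch of the producer/consumer pattern described above. It assumes Python 3, where the module is named queue rather than Queue; the function names, the None sentinel, and the doubling step are illustrative choices, not part of the module.

```python
import queue
import threading


def producer(q, items):
    """Put each item on the queue, then a None sentinel to signal the end."""
    for item in items:
        q.put(item)  # blocks while the queue is full
    q.put(None)


def consumer(q, results):
    """Take items off the queue until the sentinel arrives."""
    while True:
        item = q.get()  # blocks while the queue is empty
        if item is None:
            break
        results.append(item * 2)


def run(items, maxsize=2):
    """Run one producer and one consumer sharing a size-limited queue."""
    q = queue.Queue(maxsize=maxsize)  # maxsize throttles the producer
    results = []
    threads = [
        threading.Thread(target=producer, args=(q, items)),
        threading.Thread(target=consumer, args=(q, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Neither function takes a lock explicitly; put() and get() handle the synchronization, and the small maxsize keeps the producer from running far ahead of the consumer.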

Read more at pymotw.com: Queue

PyMOTW: ConfigParser

The ConfigParser module is very useful for creating user-editable
configuration files for your applications. The configuration files are
broken up into sections, and each section can contain name-value pairs
for configuration data. Value interpolation using Python formatting
strings is also supported, to build values which depend on one another
(this is especially handy for paths or URLs).
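
A short sketch of value interpolation, assuming Python 3 (where the module is named configparser) and a made-up [server] section; the option names and the example.com host are hypothetical.

```python
import configparser

# Hypothetical configuration: url is built from host and port.
SAMPLE = """
[server]
host = example.com
port = 8080
url = http://%(host)s:%(port)s/
"""


def load_url(text=SAMPLE):
    """Parse the configuration text and return the interpolated url value."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # get() expands %(host)s and %(port)s from the same section.
    return parser.get("server", "url")
```

Changing host in the file changes url as well, which is what makes interpolation handy for paths and URLs.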

Read more at pymotw.com: ConfigParser

Testing pygments

This is a test post to experiment with the code highlighting
output from pygments.org (as recommended by a couple of commenters on
my previous post). Pygments produces HTML with CSS-based styling, so I
have added a bunch of new styles to my blogger template. And I am
including as a sample the same Python code posted earlier with the
alternative syntax highlighting tool.

def main(self, *m3ufilenames):
    self.startRSS()
    self.generateChannelInfo()
    for line in fileinput.input(m3ufilenames):
        mp3filename = line.strip()
        if not mp3filename or mp3filename.startswith('#'):
            continue
        self.generateItem(mp3filename)
    self.endRSS()
    return 0

So, let me know what you think of the 2 methods, and which looks
better.

PyMOTW: Call for input

Tomorrow’s post will cover the ConfigParser module. Beyond that, I
have a few more weeks planned out, and am looking for suggestions for
which modules to cover next.

If you were stranded on a desert island, which standard library module
would you want, and why?

Converting Python source to HTML

For my PyMOTW series, I have found that I want to convert a lot of
Python source code to HTML. In a perfect world it would be easy for me
to produce pretty XML/HTML and use CSS, but it is not obvious how to use
CSS from Blogger. Instead, I am using a CLI app based on this ASPN
recipe, which produces HTML snippets that I can paste directly into a
new blog post. The output HTML is more verbose than I wanted, but I like
the fact that it has no external dependencies.

If you have any alternatives, I would appreciate hearing about them.

PyMOTW: fileinput

To start this series, let’s take a look at the fileinput module,
a very useful module for creating command line programs for processing
text files in a filter-ish manner. For example, the m3utorss app I
recently wrote for my friend Patrick to convert some of his demo
recordings into a podcastable format.
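
As a rough sketch of the filter style, assuming Python 3: the hypothetical helper below reads one or more playlist files with fileinput and keeps only the lines that name a file. It is not the m3utorss code, only an illustration of the same loop.

```python
import fileinput


def playlist_entries(filenames):
    """Return the non-blank, non-comment lines from the named files."""
    entries = []
    # fileinput chains the files together into one stream of lines.
    with fileinput.input(files=filenames) as lines:
        for line in lines:
            entry = line.strip()
            if not entry or entry.startswith("#"):
                continue  # skip blank lines and m3u comment lines
            entries.append(entry)
    return entries
```

Note that fileinput falls back to standard input when the list of filenames is empty, which is what makes it convenient for command line filters.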

Read more at pymotw.com: fileinput

PyMOTW: Python Module of the Week

I am starting a new series of posts today that I am calling “Python
Module of the Week” (PyMOTW). I have two goals for this:

  1. to work my way through the standard library and learn something about
    each module
  2. to get into the habit of posting to the blog more regularly

I will cheat a little, and start out with some modules that I already
know something about. Hopefully that will give me enough of a head-start
that I can keep up a fairly regular flow.


Distributing django applications

I had a report that version 1.2 of my codehosting package did not
include all of the required files. It turns out I messed up the setup.py
file and left out the templates, images, and CSS files. Oops.

In the process of trying to fix the setup file, I discovered that
distutils does not include package data in sdist. Not a big deal,
since I just created a MANIFEST.in file to get around it.

My next challenge (for this project) is how to write the templates in
a way that would let anyone actually reuse them. For example, the
project details page shows info about the most current release and a
complete release history. It uses a 2 column layout for that, but the
way I have it implemented the layout is defined in the base template for
my site. I want to move that layout from the site base template down
into the application base template, but I do not want to repeat myself
if I can avoid it. Maybe I need to get over that and just repeat the
layout instructions. Or refactor the site base template somehow.
Obviously that needs more thought. I did find some useful advice in
DosAndDontsForApplicationWriters, but have not implemented all of
those suggestions.

In the meantime, release 1.4 of codehosting is more flexible than the
previous releases and is probably closer to something useful for people
other than me.

[Updated 28 Sept 2007 to correct typo in title]

How NOT to Backup a Blogger Blog

Over at the Google Operating System blog, they offer a way to
“backup” your blog. It is mostly a manual hack to load the entire blog
into one page in a web browser, then save the resulting HTML, though a
similar technique is offered for saving the contents of your XML feed.

There are a few problems with this technique:

  1. It depends on knowing how many posts are in the blog, up front.
  2. The steps and tools given are manual.
  3. Comments are handled separately.

A backup needs to be automated. If I have to remember to do something
by hand, it isn’t going to be done on a regular basis. I want to add to
my blog without worrying about how many posts there are and tweaking
some backup procedure that depends on knowing all about the content of
the blog up front. I want comments saved automatically along with each
post, not in one big lump. And if I need to import the data into a
database, I want the backup format to support parsing the data easily.

What to do?

Enter BlogBackup, the unimaginatively named, fully automatic,
backup software for your blog. Just point the command line tool at your
blog feed and a directory where the backup output should go. It will
automatically perform a full backup, including:

  1. Every blog post is saved to a separate file in an easily parsable
    format, including all of the meta-data provided by the feed
    (categories, tags, publish dates, author, etc.).
  2. Comments are saved in separate directories, organized around the post
    with which they are associated. Comments also include all of their
    meta-data.
  3. The content of blog posts and comments is copied to a separate text
    file for easy indexing by desktop search tools such as Spotlight.

Since the tool is a command line program, it is easy to automate with
cron or a similar scheduling tool. Since it is fully automatic and reads
the feed itself, you do not need to reconfigure it as your blog grows.
And the data is stored in a format which makes it easy to parse to load
into another database of some sort.

So, go forth and automate.

Better blogger backups

I have enhanced the blog backup script I wrote a while back to
automatically find and include comments feeds, so comments are now
archived along with the original feed data. The means for recognizing
“comments” feeds may make the script work only with blogger.com, though,
since it depends on having “comments” in the URL. That is enough for
what I need right now.
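
The URL check might look something like the sketch below. The function name and the example.com URLs are hypothetical, and this is not the script's actual code, only an illustration of recognizing a feed by the word “comments” in its URL.

```python
def find_comment_feeds(urls):
    """Return the feed URLs that contain the word 'comments'."""
    return [url for url in urls if "comments" in url.lower()]


# Hypothetical feed links found while reading a blog's main feed.
feeds = [
    "http://example.com/feeds/posts/default",
    "http://example.com/feeds/123/comments/default",
]
comment_feeds = find_comment_feeds(feeds)
```

A check like this is simple, but it ties the script to hosts that follow the same URL naming convention.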