Using raw SQL in django

For a django project I’m working on, I need to run a simple query with COUNT and GROUP BY statements to collect some summary data. I tried working with the model API, and eventually fell back to using raw SQL instead.

Here’s the query I wanted to run:

SELECT
    (year/10)*10 as decade,
    count(*) as case_count
FROM
    docket_case
GROUP BY
    decade

One problem I encountered was there was no way (that I could find) to do a GROUP BY. I found one reference to a group_by() method, but I don’t think the patch has been submitted or approved.

I spent some time trying to work out the right way to do it using the model API, before realizing I was banging my head on the wrong brick wall. I was using values() and [STRIKEOUT:select()] extra(select='…') together, which should have clued me in right away. Well, that and the fact that the data I was retrieving wasn’t one of my model classes.

Using a tip from page 325 of the new django book (see my earlier post) for a review), I decided to just go ahead and run the query directly and work with the result set. The results need to go into a template, so I wanted to have dictionaries instead of the tuples I get from cursor.fetchall(). This is what I came up with:

from itertools import *
from django.db import connection

def query_to_dicts(query_string, *query_args):
    """Run a simple query and produce a generator
    that returns the results as a bunch of dictionaries
    with keys for the column values selected.
    """
    cursor = connection.cursor()
    cursor.execute(query_string, query_args)
    col_names = [desc[] for desc in cursor.description]
    while True:
        row = cursor.fetchone()
        if row is None:
            break
        row_dict = dict(izip(col_names, row))
        yield row_dict
    return

You call it like:

results = query_to_dicts("""
    SELECT
        (year/10)*10 as decade,
        count(*) as case_count
    FROM
        docket_case
    GROUP BY
        decade
    """)

or, with arguments (this query counts the cases in each year of a particular decade):

results = query_to_dicts("""
    SELECT
        year,
        count(*) as case_count
    FROM
        docket_case
    WHERE
        year/10 = %s
    GROUP BY
        year
    """, decade/10)

I don’t expect a large result set, but I figured this was an excuse to experiment with generators and iterators. I’m pretty happy with the results, but surprised I had to write something like this myself. I didn’t see a cursor method to fetch the rows as dictionaries instead of tuples, and the search I ran didn’t return anything that looked useful.

Am I missing something?

[Updated 31 Dec: One correction: The method is extra(), the argument is select. I was adding a count field using extra(), and then asking for the value with values(). There does not seem to be a way to structure the query I wanted using the models API methods.]