PyMOTW: sqlite3 – Embedded Relational Database

The sqlite3 module provides a DB-API 2.0 compliant interface to
the SQLite relational database. SQLite is an in-process database,
designed to be embedded in applications, instead of using a separate
database server program such as MySQL, PostgreSQL, or Oracle. SQLite
is fast, rigorously tested, and flexible, making it suitable for
prototyping and production deployment for some applications.

Read more at sqlite3

Needed: SQL/Database design book recommendation

I need a book to teach someone about basic database design. They don’t
need relational algebra or calculus, and they don’t have to be an expert
about highly optimized storage, indexing, or anything like that. They
just need some basic normalization, column type selection, and query
help for what should be a pretty simple database.

They took a college class on RDBMSes, but the class and accompanying
book were both terrible. The book from my class is great, but is more
complex than what they want and need. I’m aware of the Dummies and Idiot
series books, but I would prefer to avoid those, if possible.

I’d rather not give them something tied to a specific tool (since they
haven’t selected the tool they are going to use), but as long as the
tool is not Access it’s OK if the book is vendor-specific. We’ll
probably end up using Postgresql or SQLite for the actual database, but
won’t be doing anything that should require special features provided by
either of those databases.

Does anyone have any recommendations?

ORM envy

Jonathan LaCour presented an overview of SQLAlchemy at last night’s
PyATL meeting, and now I have ORM envy. It’s too bad I can’t afford the
effort that would be involved in replacing the in-house ORM we use at
work, but I’ll definitely consider using it for my own projects.

Object-Relational Mappers

My friend Steve and I have spent some time discussing
object-relational mapping recently, partially initiated by his
comments on the ORM features in django

For some reason I’ve never quite understood, there seems to be an
inherent fear of SQL in the web development community, and over the
years there have been many efforts to hide the SQL completely (or in
the case of Zope, encourage the use of a custom object database
instead of a relational database). Personally I’m wary of any form
of object relational mapping which works automatically. What I do
want is a nice abstraction layer (sometimes called the data access
object pattern), so that the code working with objects doesn’t know
that the objects are actually stored in a relational database.

I tend to agree. I’m confused by the intense need to create a new way
to express a relational database schema in Python, Ruby, or any other
language. The DDL is a perfectly legitimate way to express the schema
of the database. Why not just use it?

We use an ORM like that at work. The whole thing was written several
years ago before the SQLObject and SQLAlchemy ORMs were available, of
course, or we would be using one of them. The database connection layer
scans the database table definitions when it connects for the first
time. The base class for persistent objects uses that information to
discover the names and types of attributes for classes. We do it all at
runtime now, though we have discussed caching the information somehow
for production servers (maybe using a pickle or even writing out a
python module during the build by parsing the DDL itself). Scanning the
tables doesn’t take as long as you would think, though, so it hasn’t
become a hot-spot for performance tuning. Yet.

Steve suggested a slightly different design. Use DDL to define the
schema, then convert the schema to base classes (one per table) with a
code generator. Then subclass from the auto-generated tables to add
“business logic”. I’m not sure how well that would work, but it sounds
like an interesting idea. If the generated code is going to support
querying for and returning related objects, how does it know to use the
subclass to create instances instead of the generated class?

I do like the automatic handling of queries for related objects, and
the system used by django is particularly elegant. Two features I
especially like are:

  1. The QuerySet isn’t resolved and executed until you start indexing
    into it.
  2. Modifying a QuerySet by applying a filter actually creates a new

This means passing QuerySet instances around is inexpensive, and callers
do not have to worry about call-by-reference objects being modified
unexpectedly. I need to study SQLAlchemy again, to see how it handles
query result sets.