sqlstring -- a library to build a SELECT statement

fumanchu fumanchu at amor.org
Thu Oct 20 09:10:49 CEST 2005

grunar at gmail.com wrote:
> These objects (such as sqlstring.Select), represent
> complex SQL Statements, but as Python objects. The benefit is that you
> can, at run-time, "build" the statement pythonically, without
> getting bogged down in String Manipulation. The theory is that once in
> use, things that were complex (string magic) become simpler, and allow
> the program to worry about higher-level issues.
> ...
> Some of this stuff has been around for a while (using "magic" objects
> to build where clauses, etc.).  But I'm trying to take it all the
> way--to a legit Select statement.
> While still in the early stages, it does work with a great many sql
> statements, as seen in the test suite.  Currently supported are CASE
> statements, Nested conditional clauses, nested queries and most join
> types. At this point, I'm interested in getting feedback from the
> community on several fronts:
>    1. The Operator Overload model. I've chosen to overload Python's
> operators to give a short-hand syntax to most of the things you'd
> want to do with a select statement. The rest are accessable via
> methods. Currently ** is the "where" operator, // is the "in"
> operator, % the "like" operator and ^ aliases columns. Other
> overloads are as you'd expect- + / - * == all result in Expression
> Objects that dish out the right SQL string. The question is, is the
> "leap" in syntax to confusing? Is there a cleaner way to do this?
> (Functions for example)

The big operator question will be: how will "and" and "or" be
implemented? This is always a sticking point because of Python's
short-circuiting behaviors regarding them (the resultant bytecode will
include a JUMP).

An alternative is to stuff the representation into a string, which can
then be parsed however one likes.

For Dejavu (http://projects.amor.org/dejavu), I didn't do either
one--instead I used lambdas to express the where clause, so that:

    f = logic.Expression(lambda x: ('Rick' in x.Name) or
                         (x.Birthdate == datetime.date(1970, 1, 1)))
    units = sandbox.recall(Person, f)

might produce, in the bowels of the ORM:

    "SELECT * FROM [Person] WHERE [Person].[Name] Like '%Rick%' or
[Person].[Birthdate] = #1/1/1970#"

Note that the tablename is provided in a separate step. The translation
is based on the codewalk.py and logic.py modules, which are in the
public domain if you want to use any part of them. See

>    2. How to best add further sql function support? Adding magic
> callable objects to columns came to mind, but this has it's own set
> of issues. I'm leaning towards a magic object in the sqlstring
> module. For example:
>       sqlstring.F.substring(0, 4, person.first_name)
>       would result in: substring(0, 4, person.first_name). the F object
> could be put in the local scope for short-hand.

This is a hard problem, since your sqlstring module doesn't control the
result sets, and so can't provide fallback mechanisms if a given
database does not support a given function (or operator, or minute
detail of how a function or operator works; for example, LIKE is
case-insensitive in MS SQL Server but case-sensitive in PostgreSQL). If
you're going to use subclasses to handle "database-specific overwrites"
(below), then you'll probably want to stick such functions in that base
class (and override them in subclasses), as well.

>    3. I'm undecided on how best to handle database specific
> overwrites. I want this to be as easy as possible. I'm thinking about
> subclassing Expressions with a naming scheme on the Sub-Class (such as
> CaseExpression_oracle). Then the __init__ factory could dish out the
> right version of the object based on the requestor. This brings up
> lots of questions, such as how to support multiple types of databases
> at the same time.

See the Adapter and SQLDecompiler classes in
http://projects.amor.org/dejavu/svn/trunk/storage/db.py (and the
store*.py modules) for some examples of using subclassing to produce
database-specific syntax. There, it's one Adapter class per supported
DB-type; you might consider keeping the Expression objects themselves
free from SQL, and transform the Expressions to SQL in a separate
class, which you could then subclass.

Just a couple of thoughts from someone who's done the
string-manipulation dance once before. ;) I must admit I've always
punted when it came time to produce complex joins or CASE
statements--Dejavu simply doesn't provide that level of expressivity,
preferring instead to hide it behind the object layer.

Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

More information about the Python-list mailing list