[DB-SIG] Improved support for prepared SQL statements

Daniele Varrazzo daniele.varrazzo at gmail.com
Tue Dec 9 13:24:01 CET 2014


On Tue, Dec 9, 2014 at 9:40 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> On 07.12.2014 22:06, SF Markus Elfring wrote:
>> Hello,
>>
>> An interface for parameterised SQL statements (working with
>> placeholders) is provided by the execute() method from the Cursor class
>> at the moment.
>> https://docs.python.org/3/library/sqlite3.html#sqlite3.Cursor.execute
>>
>> I assume that the "SQL Statement Object" from the SQLite C interface is
>> reused there already.
>> http://sqlite.org/c3ref/stmt.html
>>
>> I imagine that it will be more efficient occasionally to offer also a
>> base class like "prepared_statement" so that the parameter specification
>> does not need to be parsed for every passed command.
>> I suggest to improve corresponding preparation and compilation
>> possibilities.
>> https://bugs.python.org/issue22956
>
> Please see our previous discussion on this subject:
>
> https://mail.python.org/pipermail/db-sig/2014-March/006090.html
>
> This was the first draft result of that discussion:
>
> https://mail.python.org/pipermail/db-sig/2014-March/006096.html
>
> It's not yet been added to the DB-API as standard extension. Perhaps
> we ought to continue that discussion.

Sorry for not keeping up with the discussion I'd started myself. I
have a few observations about your draft:

    The prepared operation string is retained by the cursor to allow
    executing the operation without having to prepare the operation again.

should the prepared statement be accessible by an attribute? Many the
drivers would make so, so wouldn't be better define what should it be?
``.prepared``? (a string or None if none is prepared). It could be one
of the "optional" ones in case other drivers wouldn't be able to
retrieve it back.

    In order to benefit from this caching, the .execute*() methods must
    be run with the same operation string that was passed to the .prepare()
    method.

Not much of a problem with this interface. However sometimes people
sticks a long string literal as execute method, e.g.::

    cur.execute("""LONG SQL""", args)
    for record in cur: ...

with this interface the sql should be stored in a variable and passed
to both prepare and execute:

    SQL = """LONG SQL"""
    cur.prepare(SQL)
    cur.execute(SQL, args)

in case of prepared statements, wouldn't it be handy to be able to
pass None as the first execute parameter, which would result in the
execution of the prepared statement or a ProgrammingError if no
statement has been prepared?

    cur.prepare("""LONG SQL""")
    cur.execute(None, args)
    for record in cur: ...

not a performance saving of course: just for sake of ergonomic. Of
course having cur.prepared one could always refer to that with
``cur.execute(cur.prepared, args)``

    The prepared operation is only available until the next call to
    one of the .execute*() methods or another call to the .prepare()
    method.

This is a bit unclear to me: I believe you mean "until the next call
to one of the .execute() methods with a different query", right?
(otherwise it seems you can execute a prepared statement only
once...). But does it mean that the well defined behaviour of the
driver is to actively clear the prepared statement upon an .execute()
with a different argument, or just that it is not defined whether the
previous query is still prepared?

----

One issue I would like to have clarified (or clarified that it cannot
be clarified...) is: who owns the prepared statement? My data points,
which I don't want to overgeneralise, are:

- in Postgres, the prepared statements belong to the connection, hence
any of its cursors may use a statement prepared only once.
- I work on a large software system implemented in Erlang and based on
Postgres. The Erlang driver we use doesn't have a connection/cursor
distinction and manages a connection pool. We automatically prepare
almost every query we send to the backend so further calls (which in a
Python transposition would be executed by a different cursor but
likely by the same connection) benefit from preparation.

So, one of the options I would like to provide to psycopg users is to
leverage the possibility to share prepared statement across cursors:
this may or may not imply automatic prepare upon execute, it would
likely imply that preparing the same statements multiple time in the
same connection would be no-op etc.

What I am wondering is: are Postgres features too much specific to
drive the API design? E.g. are there around db/drivers pairs for which
the prepared statement would belong to the cursor? Or is the ownership
of the prepared statements to the connection so common that the API
design may benefit of it, e.g. making prepared statements shared
across cursors, allowing more than one prepared statement per
connection etc.?

----

If the prepare() extension to the API is released as proposed above by
M-A, my psycopg implementation would be something along this line:

- implement the cursor.prepare() method, a cursor.prepared attribute;
- implement the behaviour as defined by the DBAPI;
- the prepared statements would actually be registered inside the
connection, so a cursor.execute*() call would automatically use a
statement if found prepared by any of the connection's cursors (but
preparation must be still explicit);
- call to execute with unprepared queries wouldn't affect the prepared ones;
- as an option (e.g. by using a certain Connection subclass declared a
DBAPI extension) every executed statement would be automatically
prepared (there may be a cursor.execute(..., prepared=False) maybe to
blacklist specific queries).

Would this implementation be compliant enough or is it a too stretched
interpretation of the API?

Please understand that these points are not raised to drive the API
where my own project wants: as my knowledge of other DBMSs is limited
I'm trying to understand if some of the features IMO valuable I've
found in other software systems can be implemented generally by other
DBMS/driver pairs, or if I'd still be free to implement such features
for the Postgres user benefit while maintaining the DBAPI spirit and
guidelines.

Thank you very much for your insights.

-- Daniele


More information about the DB-SIG mailing list