[PYTHON DB-SIG] database API

Thomas Breuel tmb@best.com
Fri, 11 Oct 1996 21:03:35 -0700


|We (urm, the defunct eShop :-) implemented PyDB for ODBC quite snappily.
|Works great. The interface also worked quite fine for Oracle and
|Informix. Further, the interface had more than enough functionality for
|actually building a serious server application.

I'm not disputing that the current PyDB interface can be made
to work, even efficiently.  I still think that an interface that
is closer to existing practice would be preferable and help
Python to achieve greater acceptance as a database scripting
language.

|my_statement = "select * from some_table where col1 = :1 and col2 = :2"
|cursor = connection.cursor()
|cursor.execute(my_statement, params1)
|rows1 = cursor.fetchall()
|cursor.execute(my_statement, params2)
|rows2 = cursor.fetchall()
|
|Specifically, if you pass in the *same* string objet for the statement,
|then the implementations should reuse the statement that it prepared on
|the prior execute().
|
|The above example fetched all the rows at once; you can then step
|through them as needed (most cases, you simply fetch them all). If your
|results are too large to pull into memory and/or you need to be able to
|terminate, you can do:
|
|cursor1 = connection.cursor()
|cursor1.execute(statement1)
|cursor2 = connection.cursor()
|cursor2.execute(statement2)
|while 1:
|  row1 = cursor1.fetchone()
|  row2 = cursor2.fetchone()

This approach differs from established OO patterns
(collection/iterator) as well as from the commonly used interfaces in
other languages (JDBC, ODBC), which use separate data types to
represent a prepared statement and an iterator over the result.

In fact, a "prepared statement" can be holding a large amount
of state, so this is not an academic difference.  An alternative
interface would be:

    prepared_statement = connection.prepare("... sql ...")
    result_set1 = prepared_statement.query(query_parameters1)
    result_set2 = prepared_statement.query(query_parameters2)
    for a in result_set1:
	for b in result_set2:
	    do_something(a,b)

Of course, in your example, cursor1 and cursor2 could somehow
share state transparently behind the scenes.  But that would discourage
implementors from doing the right thing, and it would leave users
confused about whether a particular implementation actually is
doing the right thing.

|The cursor object is a prepared statement. A result set is a list of
|tuples.

If it is a "prepared statement", why is it called a "cursor"?  The
term "cursor" has a specific meaning in databases, distinct from
"prepared statement".

|I believe that the interface can be even simpler (and still retain its
|functionality).  Jim Fulton has mentioned on several occasions over the
|past 5 months ways to make the API simpler to work with, but has never
|written anything down.

If you want the simplest possible API, you can go a lot further.  All
you really need is a single function:

    dbquery(database_name,
	    query_template,query_parameters,
	    result_row,result_column)

All the database and query state could be hidden by the implementation
that way.  That wouldn't be significantly less efficient than the
PyDB interface or ODBC or JDBC.  But it would make it even less obvious
what state the database is and isn't keeping around behind the
user's back and would differ even more from existing interfaces.
"Simpler" isn't always "better".

If you say that you are thinking about making this even simpler, that
suggests to me that the database interface isn't set in stone yet.
I hope very much that the PyDB interface could be modeled more on ODBC
and JDBC, since that's what most users of databases are familiar with.
The extra types and abstractions in those interfaces do serve a purpose.
Furthermore, there would be a lot more prior art and experience for
users to draw on.

I think that JDBC in particular would be an excellent blueprint for
a Python database interface.  It seems to have about the right set of
abstractions for convenient and efficient database access, and it is
reasonably close to long-established practice (ODBC) while at the
same time being quite convenient.

Thomas.

=================
DB-SIG  - SIG on Tabular Databases in Python

send messages to: db-sig@python.org
administrivia to: db-sig-request@python.org
=================