[DB-SIG] SQL PEP for discussion
Sun, 17 Jun 2001 23:47:42 +1000 (EST)
I've put togther a PEP for a standard wrapper for DB compliant
drivers. I'd like to get peoples thoughts on it before I submit
it anywhere. In particular, if the SIG thinks this is the
correct approach to take before I do further work on it.
I'd also like to find out if there are similar projects that
have had work started on them, as they may make this work
Title: Standard Wrapper For SQL Databases
Version: $Revision: 1.4 $
Author: email@example.com (Stuart Bishop)
Status: pre submission Draft
Type: Standards Track
Python has had a solid Database API for SQL databases since 1996 .
This API has been intentially kept lean to make it simpler to
develop and maintain drivers, as a richer feature set could be
implemented by a higher level wrapper and maintained in
a single place rather than in every API compliant driver.
The goal of this PEP is to define and implement such a wrapper,
and merge it into the standard Python distribution.
This document has been placed in the public domain.
Use of this wrapper requires a DB API v2.0 compliant driver to
be installed in sys.path. The wrapper may support DB API v1.0
drivers (to be determined).
Module Interface (sql.py)
driver -- The name of the DB API compliant driver module.
dsn -- Datasource Name
user -- username (optional)
password - password (optional)
host -- hostname or network address (optional)
database -- database name (optional)
Returns a Connection object
Does the connect function need to accept arbitrary keyword
Exceptions (unchanged from DB API 2.0)
Exceptions thrown need to be subclasses of those
defined in the sql module, to allow calling code to
catch them without knowning which driver module
they were thrown from. This is particularly of use
for code that is connecting to multiple databases
using different drivers.
As per DB API 2.0. The rollback method will raise a
NotSupportedError exception if the driver doesn't support
Returns a ANSI SQL quoted version of the given value as a
string. For example:
>>> print con.quote(42)
>>> print con.quote("Don't do that!")
'Don''t do that!'
Note that this cannot currently be done for dates, which
generally have a RDBMS dependant syntax. This would need
to be resolved in the next version of the DB API spec, and
is the reason why this is the method of the connection object
as opposed to a function. A quote method is invaluable for
generating logs of SQL commands or for dynamically generating
A Cursor is created and its execute method is called.
Returns the newly created cursor.
for row in con.execute('select actor,year from sketches'):
[ ... ]
Insert note abour cursor creation overheads and how to
Should cursor handling be hidden? It would be quite possible
to have the connection object maintain a pool of cursors which
are used to create iterators. The iterator would return the
cursor to the pool when iteration is completed, or in its
What should the default format be for bind variables?
Return the unwrapped connection object as produced by the
driver. This may be required for accessing RDBMS specific
A dictionary describing the capabilities of the driver.
Currently defined values are:
String constant stating the supported DB API level
of the driver. Currently only the strings '1.0' and
'2.0' are allowed.
As per DB API 2.0. Irrelevant if we can trasparently
enforce good threading.
1 if the driver supports the rollback() method
0 if the driver does not support the rollback() method
1 if the driver's cursors supports the nextset() method
0 if the nextset() method is not supported.
The cursor object becomes an iterator after its execute
method has been called. Rows are retrieved using the drivers
As per the executemany method in the DB API 2.0 spec.
Is there any need for both execute and executemany in this
Returns self, so the following code is valid
for row in mycursor.execute('select actor,year from sketches'):
[ ... ]
What should the default format be for bind variables?
As per DB API 2.0. Perhaps descriptions should be
trimmed to the basics of column name & datatype?
Return the unwrapped cursor object as produced by the driver.
This may be required to access driver specific features.
Return the Row object for the next row from the currently
executing SQL statement. As per DB API 2.0 spec, except
a StopIteration exception is raised when the result set
I guess as per DB API 2.0 spec.
When a Cursor is iterated over, it returns Row objects.
Retrieve a field from the Row. If index_or_key is an integer,
the column of the field is referenced by number (with the first
column index 0). If index_or_key is a string, the column is
referenced by name.
Note that referencing columns by name may cause problems if you
are trying to write platform independant code and should be
avoided, as different vendors capitalize their column names
Type Objects and Constructors
As per DB API 2.0 spec.
It would be nice if there was a more intelligent standard Date
class in the Python core that we could leverage. It is probably
worth putting this in another PEP that we would depend on.
The module is called sql.py, to avoid any ambiguity with non-realational
or non-SQL compliant database interfaces. This also nicely limits
the scope of the project.
RDBMS abstraction classes, such as Python dictionary -> RDBMS table
mappings, object persistance, connection pools etc. are better suited
to other modules (or submodules) and are _currently_ beyond the scope of
The core of the API is identical to the Python Database Interface v2.0
(PEP-249). This API is already familiar to Python programers and
is a proven solid foundation. To this core I have added some helper
functions and iterator support (PEP-234).
Python previously defined a common relational database API that
was implemented by all drivers, and application programmers accessed
the drivers directly. This caused the following issues:
It was difficult to write code that connected to multiple
database vendor's databases. Each seperate driver used defined
its own heirarchy of exceptions that needed to be handled, and
similar problems occurred with identifying column datatypes etc.
Platform independant code could not be written simply,
due to the differing paramater styles used by bind variables.
This also caused problems with publishing example code and tutorials.
The API remained minimal, as any new features would need to
be implemented by all driver maintainers. The DB-SIG felt
most feature suggestions would be better implemented by higher
level wrappers, such as that defined by this PEP.
More than a few people didn't realise that Python _had_ a database
API, as it required people to find the DB-SIG on python.org.
The wrapper will ease writing RDBMS independant code:
The wrapper will enforce thread safety. It is unknown at this stage
if this will be done by passing calls to a single worker thread for
non thread safe drivers, or by raising an exception if an attempt
is made to use a driver in a unsafe manner.
(Can we enforce thread safety? Thread ID's get reused, so it is
possible for the following to happen: Thread ID 10 is created
Thread ID 10 creates a cursor 'c'
Thread ID stops running
A new thread created, and happens to have thread ID 10 again.
The new thread ID 10 attempts to use cursor 'c'
Fixing this might require another PEP, and probably is impossible if
we allow for threads being spawned from C code.)
The driver name is passed to the connect method as a string,
rather than importing the driver module, to allow the driver being
used by an application to be defined in a configuration file or
other resource file.
Bound parameters are specified in a common format, and translated
to the native format used by the driver.
Some basic guidelines on writing RDBMS independant code will
be provided in the documentation (this is not always obvious
to developers who only work with one vendor's database system).
Only one heirarchy of exceptions needs to be caught,
as opposed to one heirarchy per driver being used.
The design proposed in this PEP is the same as that used by Java,
where a relational database API is shipped as part of the core language,
but requires the installation of 3rd party drivers to be used.
Perl has a similar arrangement of API and driver separation. Perl
does not ship with PerlDBI or drivers as part of the core language,
but it is well documented and trivial to add using the CPAN tools.
PHP has a seperate API for each database vendor, although work is
underway (completed?) to define a common abstraction layer
similar to Java, Perl and Python. All of the drivers ship as
part of the core language.
Gadfly is a RDBMS written in Python by (insert name of author when I'm
online). If Gadfly was also included in the Python distribution, it would
be extremly useful as a learning tool. It would also allow the
creation of products that require a RDBMS back end, yet work with
any standard Python installation. Gadfly has already been shipped
as part of Digital Creation's Zope for similar reasons. Shipping
a fully functional RDBMS with Python would also be useful for the
nyer-nyer factor in language advocacy.
Should Gadfly be included in the core Python distribution as a
Hmm... this is beyond the scope of the abstract and should be a
seperate PEP (esp as it isn't dependant on this PEP). I'll leave
it here for now as a discussion point.
 PEP-248 and PEP-249
Stuart Bishop <firstname.lastname@example.org>