[DB-SIG] DB API 3.0 strawman
Eric Bianchi
eric@nekhem.com
Tue, 7 Aug 2001 12:18:42 +0300
> Stuart Bishop wrote:
> >
> > Here is an update to the DB API update I posted to this
> > list a few weeks ago. It has moved its focus away from
> > being a wrapper for 1.0 and 2.0 compliant drivers,
> > and more into a new DB API standard (although this wrapper
> > functionality is still there).
So basically, what is in the DB API 3.0 and what is in the sql.py ?
Do the developers keep the old DB API 2.0 Specifications or must we develop this API in our drivers ?
Who is gonna write sql.py ? How can we help to write this wrapper ?
> > I'm after feedback, particularly if this is the direction to
> > head and if it should proceed to an official PEP.
> >
> > PEP: TBA
> > Title: DB API 3.0
> > Version: $Revision: 1.9 $
> > Author: zen@shangri-la.dropbear.id.au (Stuart Bishop)
> > Discussions-To: db-sig@python.org
> > Status: strawman
> > Type: Standards Track
> > Requires: 234,248,249, Date PEP
> > Created: 18-May-2001
> > Post-History: Never
> >
> > Abstract
> >
> > Python has had a solid Database API for SQL databases since 1996 [1].
> > This revision introduces iterator support and attempts to fix
> > known issues with the interface. This version provides a module
> > to be integrated into the core Python distribution, which provides
> > the interface to 3rd party RDBMS drivers. This module can use
> > DB API 1.0 and 2.0 compliant modules as the drivers, although it
> > it expected that these drivers will be updated to support this
> > API more fully.
> >
> > Copyright
> >
> > This document has been placed in the public domain, except for
> > those sections attributed to other sources.
> >
> > Specification
> >
> > Use of this wrapper requires a DB API v2.0 compliant driver to
> > be installed in sys.path. The wrapper may support DB API v1.0
> > drivers (to be determined).
> >
> > Module Interface (sql.py)
> >
> > connect(driver,dsn,user,password,host,database)
> >
> > driver -- The name of the DB API compliant driver module.
> > dsn -- Datasource Name
> > user -- username (optional)
> > password - password (optional)
> > host -- hostname or network address (optional)
> > database -- database name (optional)
> >
> > Returns a Connection object. This will be a wrapper object
> > for DB API 1.0 or DB API 2.0 compliant drivers. DB API 3.0
> > compliant drivers will have a Connection instance returned
> > unmodified.
> >
> > Does the connect function need to accept arbitrary keyword
> > arguments?
Well, I thought as a wrapper, we should use a connection string like the one specified in JDBC which is "standard".
Something like c = connect(driver,url) where url =
"dbapi://name:surname@host:port/dbname"
this connect function would be part of sql.py
> > Exceptions (unchanged from DB API 2.0)
> >
> > Exceptions thrown need to be subclasses of
> > those defined in the sql module, to allow calling code
> > to catch them without knowning which driver module
> > they were thrown from. This is particularly of use
> > for code that is connecting to multiple databases
> > using different drivers.
> > Connection Object
> >
> > close()
> > commit()
> > cursor()
> >
> > As per DB API 2.0.
> >
> > rollback(savepoint=None)
> >
> > Rollback the current transaction entirely, or to the given
> > savepoint if a savepoint is passed. The savepoint object
> > must have been returned by the savepoint method during the
> > current tranaction.
> >
> > The rollback method will raise a NotSupportedError exception
> > if the driver doesn't support transactions.
>
> I don't think that the concept of a savepoint is widely supported.
True ! Althought it is a nice concept.
>
> > quote(object)
> >
> > Returns a ANSI SQL quoted version of the given value as a
> > string. For example:
> >
> > >>> print con.quote(42)
> > 42
> > >>> print con.quote("Don't do that!")
> > 'Don''t do that!'
> >
> > Note that quoted dates often have a RDBMS dependant syntax
> > (eg. "TO_DATE('01-Jan-2001 12:01','DD-MMM-YYYY H24:MI')" for
> > Oracle). I need to track down the official ANSI SQL 1992
> > or 1999 compliant syntax for specifying date/time/and datetime
> > datatype as strings (if it exists).
> >
> > The quote method is invaluable for generating logs of SQL
> > commands or for dynamically generating SQL queries.
> >
> > This method may be made a module level function as opposed
> > to a Connection method if it can be show that string quoting
> > will always be RDBMS independant.
Well, I have the same opinion than Frederico di Gregorio. For security reasons, the quote function should be part of the driver itself.
> > execute(operation,[seq_of_parameters])
> >
> > A Cursor is created and its execute method is called.
> > Returns the newly created cursor.
> >
> > for row in con.execute('select actor,year from sketches'):
> > [ ... ]
> >
> > Insert note abour cursor creation overheads and how to
> > avoid here.
> >
> > What should the default format be for bind variables?
>
> Please don't reintroduce this DB API 1.0 feature. Cursors can
> easily be created explicitly and the .execute method called on
> these objects.
I stronly agree with Lemburg.
>
> > driver_connection()
> >
> > Returns the unwrapped connection object if the driver
> > is a DB API 1.0 or DB API 2.0 compliant driver. This iis
> > required for accessing RDBMS specific features.
> > Returns self if the connection object is unwrapped.
What is the prupose of this function exactly ?
> >
> > capabilities
> >
> > A dictionary describing the capabilities of the driver.
> > Currently defined values are:
> >
> > apilevel
> >
> > String constant stating the supported DB API level
> > of the driver. Currently only the strings '1.0',
> > '2.0' and '3.0' are allowed.
> >
> > threadsafety
> >
> > As per DB API 2.0.
> >
> > rollback
> >
> > 1 if the driver supports the rollback() method
> > 0 if the driver does not support the rollback() method
> >
> > nextset
> >
> > 1 if the driver's cursors supports the nextset() method
> > 0 if the nextset() method is not supported.
> >
> > default_transaction_level
> >
> > The default transaction isolation level for this
> > driver.
Maybe we should add a formats variable which is the list of types supported by the driver.
> > set_transaction_level(level)
> >
> > Set the transaction isolation level to the given level,
> > or a more restrictive isolation level. A NotSupported
> > exception is raised if the given or more restrictive
> > isolation level cannot be set.
> >
> > Returns the transaction isolation level actually set.
>
> > autocommit(flag)
> >
> > Do we want an autocommit method? It would need to default
> > to false for backwards compatibility, and remembering
> > to turn autocommit on is as easy as explicitly calling
> > commit().
>
I don't think we need that either.
> > savepoint()
> >
> > Sets a savepoint in the current transaction, or throws a
> > NotSupportedError exception. Returns an object which may
> > be passed to the rollback method to rollback the transaction
> > to this point.
> > Cursor Object
> >
> > The cursor object becomes an iterator after its execute
> > method has been called. Rows are retrieved using the drivers
> > fetchmany(arraysize) method.
> >
> > execute(operation,sequence_of_parameters)
> >
> > As per the executemany method in the DB API 2.0 spec.
> > Is there any need for both execute and executemany in this
> > API?
>
> Yes, because sequence_of_strings (.execute()) is a subset of
> sequence_of_sequences (.executemany()); we need two APIs to tell
> the difference.
>
> > Setting arraysize to 1 will call the drivers fetch() method
> > rather than fetchmany().
> >
> > Returns self, so the following code is valid
> >
> > for row in mycursor.execute('select actor,year from sketches'):
> > [ ... ]
> >
> > What should the default format be for bind variables?
> > callproc(procedure,[parameters])
> > arraysize
> > description
> > rowcount
> > close()
> > setinputsizes(size)
> > setoutputsizes(size[,column]
> >
> > As per DB API 2.0
> >
> > driver_cursor()
> >
> > Return the unwrapped cursor object as produced by the driver.
> > This may be required to access driver specific features.
> >
> > next()
> >
> > Return the Row object for the next row from the currently
> > executing SQL statement. As per DB API 2.0 spec, except
> > a StopIteration exception is raised when the result set
> > is exhausted.
> >
> > __iter__()
> >
> > Returns self.
What is that ?
> >
> > nextset()
> >
> > I guess as per DB API 2.0 spec.
> >
> > warnings
> >
> > List of Warning exceptions generated so far by the currently
> > executing statement. This list is cleared automatically by the
> > execute and callproc methods.
> >
> > clear_warnings()
> >
> > Erase the list of warnings. Same as 'del cursor.warnings[:]'
> > Is this method required?
>
> Not needed.
OK, not needed.
>
> > raise_warnings
> >
> > If set to 0, Warning exceptions are not raised and instead
> > only appended to the warnings list.
> >
> > If set to 1, Warning excepions are raised as they occur.
> >
> > Defaults to 1
>
> > Row Object
> >
> > When a Cursor is iterated over, it returns Row objects.
> > dtuple.py (http://www.lyra.org/greg/python/) provides such an
> > implementation already.
> >
> > [index_or_key]
> >
> > Retrieve a field from the Row. If index_or_key is an integer,
> > the column of the field is referenced by number (with the first
> > column index 0). If index_or_key is a string, the column is
> > referenced by its lowercased name (lowercased to avoid problems
> > with the differing methods vendors use to capitalize their column
> > names).
> > Type Objects and Constructors
> >
> > As per DB API 2.0 spec.
> >
> > Do we need Date, Time and DateTime classes, or just DateTime?
>
>
> > Need to author the 'Date PEP'. It would be nice if there was a
> > more intelligent standard Date class in the Python core that
> > we could leverage. If we don't, people will start using the
> > sql module for the more intelligent DateTime object it would
> > need to provide even if they arn't using databases.
>
> > ConnectionPool Object
> >
> > A connection pool, to be documented.
>
This object could be included in sql.py.
>
> > Transaction Isolation Levels
> >
> > Insert description of dirty reads, non repeatable reads and
> > phantom reads, probably stolen from JDBC 3.0 documentation.
> >
> > The following symbolic constants are defined to describe the
> > four transaction isolation levels defined by SQL99. Note that
> > they are in numeric order so comparison operators can be
> > safely used to test for restrictivness. Note that these definitions
> > are taken from the JDBC API 3.0 documentation by Sun.
> >
> > TRANSACTION_NONE
> >
> > No transaction isolation is guarenteed.
> >
> > TRANSACTION_READ_UNCOMMITTED
> >
> > Allows transactions to see uncommitted changes to the data.
> > This means that dirty reads, nonrepeatable reads, and
> > phantom reads are possible.
> >
> > TRANSACTION_READ_COMMITTED
> >
> > Means that any changes made inside a transaction are not visible
> > outside the transaction until the transaction is committed. This
> > prevents dirty reads, but nonrepeatable reads and phantom
> > reads are still possible.
> >
> > TRANSACTION_REPEATABLE_READ
> >
> > Disallows dirty reads and nonrepeatable reads. Phantom read are
> > still possible.
> >
> > TRANSACTION_SERIALIZABLE
> >
> > Specifies that dirty reads, nonrepeatable reads, and phantom
> > reads are prevented.
it is really usefull to have so many transaction levels ?
>
> > Test Suite
> >
> > A common test suite will be part of the implementation, to
> > allow driver authors and driver evaluators to test and excercise
> > the systems. Two possbilities to start with are the test suites
> > in psycopg by Federico Di Gregorio and mxODBC by eGenix.com Software.
> > Example DDL for various systems will need to be provided.
It could be in sql.test. is it possible to post a test suite file on this mailing list so every driver's developers can have a look.
>
> > Rationale
> >
> > The module is called sql.py, to avoid any ambiguity with non-realational
> > or non-SQL compliant database interfaces. This also nicely limits
> > the scope of the project. Other suggestions are 'sqlutil' and 'rdbms',
> > since 'sql' may refer to the language itself.
>
> > Previous versions of the DB API have been intentially kept lean to
> > make it simpler to develop and maintain drivers, as a richer feature
> > set could be implemented by a higher level wrapper and maintained in
> > a single place rather than in every API compliant driver. As this
> > revision provides a single place to maintain code, these features
> > can now be provided without placing a burden to the driver authors.
> >
> > Existing DB API 1.0 and 2.0 drivers can be used to power the backend
> > of this module. This means that there will be a full compliment of
> > drivers available from day 1 of this modules release without placing
> > a burden on driver developers and maintainers.
Do we really need to make a huge wrapper ? it is gonna spoil each driver's specific functions. I think sql.py should have a connect function and that's all.
> >
> > The core of the API is identical to the Python Database API v2.0
> > (PEP-249). This API is already familiar to Python programers and
> > is a proven solid foundation.
> >
> > Python previously defined a common relational database API that
> > was implemented by all drivers, and application programmers accessed
> > the drivers directly. This caused the following issues:
> >
> > It was difficult to write code that connected to multiple
> > database vendor's databases. Each seperate driver used defined
> > its own heirarchy of exceptions that needed to be handled,
>
> Note that a simple way to solve this problem is to require the
> database module to expose the used exception structure as attributes
> of the connection objects.
>
> > their own datetime class and their own set of datatype constants.
> >
> > Platform independant code could not be written simply,
> > due to the differing paramater styles used by bind variables.
> > This also caused problems with publishing example code and tutorials.
> >
> > The API remained minimal, as any new features would need to
> > be implemented by all driver maintainers. The DB-SIG felt
> > most feature suggestions would be better implemented by higher
> > level wrappers, such as that defined by this PEP.
> >
> > Language Comparison
> >
> > The design proposed in this PEP is the same as that used by Java,
> > where a relational database API is shipped as part of the core language,
> > but requires the installation of 3rd party drivers to be used.
> >
> > Perl has a similar arrangement of API and driver separation. Perl
> > does not ship with PerlDBI or drivers as part of the core language,
> > but it is well documented and trivial to add using the CPAN tools.
> >
> > PHP has a seperate API for each database vendor, although work is
> > underway (completed?) to define a common abstraction layer
> > similar to Java, Perl and Python. All of the drivers ship as
> > part of the core language.
> >
> > References
> >
> > [1] PEP-248 and PEP-249
> >
> > --
> > Stuart Bishop <zen@shangri-la.dropbear.id.au>
> >
> > _______________________________________________
> > DB-SIG maillist - DB-SIG@python.org
> > http://mail.python.org/mailman/listinfo/db-sig
--
Thierry Michel
Eric Bianchi