[DB-SIG] DB API 3.0 strawman

Tue, 7 Aug 2001 12:18:42 +0300

> Stuart Bishop wrote:
> > 
> > Here is an update to the DB API update I posted to this
> > list a few weeks ago. It has moved its focus away from
> > being a wrapper for 1.0 and 2.0 compliant drivers,
> > and more into a new DB API standard (although this wrapper
> > functionality is still there).

So basically, what is in the DB API 3.0 and what is in the sql.py ?

Do the developers keep the old DB API 2.0 Specifications or must we develop this API in our drivers ?

Who is gonna write sql.py ? How can we help to write this wrapper ?

> > I'm after feedback, particularly if this is the direction to
> > head and if it should proceed to an official PEP.
> > 
> > PEP: TBA
> > Title: DB API 3.0
> > Version: $Revision: 1.9 $
> > Author: zen@shangri-la.dropbear.id.au (Stuart Bishop)
> > Discussions-To: db-sig@python.org
> > Status: strawman
> > Type: Standards Track
> > Requires: 234,248,249, Date PEP
> > Created: 18-May-2001
> > Post-History: Never
> > 
> > Abstract
> > 
> >     Python has had a solid Database API for SQL databases since 1996 [1].
> >     This revision introduces iterator support and attempts to fix
> >     known issues with the interface. This version provides a module
> >     to be integrated into the core Python distribution, which provides
> >     the interface to 3rd party RDBMS drivers. This module can use
> >     DB API 1.0 and 2.0 compliant modules as the drivers, although it
> >     it expected that these drivers will be updated to support this
> >     API more fully.
> >
> > Copyright
> > 
> >     This document has been placed in the public domain, except for
> >     those sections attributed to other sources.
> > 
> > Specification
> > 
> >     Use of this wrapper requires a DB API v2.0 compliant driver to
> >     be installed in sys.path. The wrapper may support DB API v1.0
> >     drivers (to be determined).
> > 
> >     Module Interface (sql.py)
> > 
> >         connect(driver,dsn,user,password,host,database)
> > 
> >             driver -- The name of the DB API compliant driver module.
> >             dsn -- Datasource Name
> >             user -- username (optional)
> >             password - password (optional)
> >             host -- hostname or network address (optional)
> >             database -- database name (optional)
> > 
> >             Returns a Connection object. This will be a wrapper object
> >             for DB API 1.0 or DB API 2.0 compliant drivers. DB API 3.0
> >             compliant drivers will have a Connection instance returned
> >             unmodified.
> > 
> >             Does the connect function need to accept arbitrary keyword
> >             arguments?

Well, I thought as a wrapper, we should use a connection string like the one specified in JDBC which is "standard".

Something like c = connect(driver,url) where url =

"dbapi://name:surname@host:port/dbname"

this connect function would be part of sql.py

> >         Exceptions (unchanged from DB API 2.0)
> > 
> >             Exceptions thrown need to be subclasses of
> >             those defined in the sql module, to allow calling code
> >             to catch them without knowning which driver module
> >             they were thrown from. This is particularly of use
> >             for code that is connecting to multiple databases
> >             using different drivers.
> >     Connection Object
> > 
> >         close()
> >         commit()
> >         cursor()
> > 
> >             As per DB API 2.0.
> > 
> >         rollback(savepoint=None)
> > 
> >             Rollback the current transaction entirely, or to the given
> >             savepoint if a savepoint is passed. The savepoint object
> >             must have been returned by the savepoint method during the
> >             current tranaction.
> > 
> >             The rollback method will raise a NotSupportedError exception
> >             if the driver doesn't support transactions.
> 
> I don't think that the concept of a savepoint is widely supported.

True ! Althought it is a nice concept.

>  
> >         quote(object)
> > 
> >             Returns a ANSI SQL quoted version of the given value as a
> >             string.  For example:
> > 
> >                 >>> print con.quote(42)
> >                 42
> >                 >>> print con.quote("Don't do that!")
> >                 'Don''t do that!'
> > 
> >             Note that quoted dates often have a RDBMS dependant syntax
> >             (eg. "TO_DATE('01-Jan-2001 12:01','DD-MMM-YYYY H24:MI')" for
> >             Oracle). I need to track down the official ANSI SQL 1992
> >             or 1999 compliant syntax for specifying date/time/and datetime
> >             datatype as strings (if it exists).
> > 
> >             The quote method is invaluable for generating logs of SQL
> >             commands or for dynamically generating SQL queries.
> > 
> >             This method may be made a module level function as opposed
> >             to a Connection method if it can be show that string quoting
> >             will always be RDBMS independant.

Well, I have the same opinion than Frederico di Gregorio. For security reasons, the quote function should be part of the driver itself.

> >         execute(operation,[seq_of_parameters])
> > 
> >             A Cursor is created and its execute method is called.
> >             Returns the newly created cursor.
> > 
> >             for row in con.execute('select actor,year from sketches'):
> >                 [ ... ]
> > 
> >             Insert note abour cursor creation overheads and how to
> >             avoid here.
> > 
> >             What should the default format be for bind variables?
> 
> Please don't reintroduce this DB API 1.0 feature. Cursors can
> easily be created explicitly and the .execute method called on
> these objects.

I stronly agree with Lemburg.

>  
> >         driver_connection()
> > 
> >             Returns the unwrapped connection object if the driver
> >             is a DB API 1.0 or DB API 2.0 compliant driver. This iis
> >             required for accessing RDBMS specific features.
> >             Returns self if the connection object is unwrapped.

What is the prupose of this function exactly ?

> > 
> >         capabilities
> > 
> >             A dictionary describing the capabilities of the driver.
> >             Currently defined values are:
> > 
> >                 apilevel
> > 
> >                     String constant stating the supported DB API level
> >                     of the driver. Currently only the strings '1.0',
> >                     '2.0' and '3.0' are allowed.
> > 
> >                 threadsafety
> > 
> >                     As per DB API 2.0.
> > 
> >                 rollback
> > 
> >                     1 if the driver supports the rollback() method
> >                     0 if the driver does not support the rollback() method
> > 
> >                 nextset
> > 
> >                     1 if the driver's cursors supports the nextset() method
> >                     0 if the nextset() method is not supported.
> > 
> >                 default_transaction_level
> > 
> >                     The default transaction isolation level for this
> >                     driver.

Maybe we should add a formats variable which is the list of types supported by the driver. 

> >         set_transaction_level(level)
> > 
> >             Set the transaction isolation level to the given level,
> >             or a more restrictive isolation level. A NotSupported
> >             exception is raised if the given or more restrictive
> >             isolation level cannot be set.
> > 
> >             Returns the transaction isolation level actually set.
> 
> >         autocommit(flag)
> > 
> >             Do we want an autocommit method? It would need to default
> >             to false for backwards compatibility, and remembering
> >             to turn autocommit on is as easy as explicitly calling
> >             commit().
>  

I don't think we need that either.

> >         savepoint()
> > 
> >             Sets a savepoint in the current transaction, or throws a
> >             NotSupportedError exception. Returns an object which may
> >             be passed to the rollback method to rollback the transaction
> >             to this point.

> >     Cursor Object
> > 
> >         The cursor object becomes an iterator after its execute
> >         method has been called. Rows are retrieved using the drivers
> >         fetchmany(arraysize) method.
> > 
> >         execute(operation,sequence_of_parameters)
> > 
> >             As per the executemany method in the DB API 2.0 spec.
> >             Is there any need for both execute and executemany in this
> >             API? 
> 
> Yes, because sequence_of_strings (.execute()) is a subset of 
> sequence_of_sequences (.executemany()); we need two APIs to tell 
> the difference.
> 
> >             Setting arraysize to 1 will call the drivers fetch() method
> >             rather than fetchmany().
> > 
> >             Returns self, so the following code is valid
> > 
> >             for row in mycursor.execute('select actor,year from sketches'):
> >                 [ ... ]
> > 
> >             What should the default format be for bind variables?

> >         callproc(procedure,[parameters])
> >         arraysize
> >         description
> >         rowcount
> >         close()
> >         setinputsizes(size)
> >         setoutputsizes(size[,column]
> > 
> >             As per DB API 2.0
> > 
> >         driver_cursor()
> > 
> >             Return the unwrapped cursor object as produced by the driver.
> >             This may be required to access driver specific features.
> > 
> >         next()
> > 
> >             Return the Row object for the next row from the currently
> >             executing SQL statement. As per DB API 2.0 spec, except
> >             a StopIteration exception is raised when the result set
> >             is exhausted.
> > 
> >         __iter__()
> > 
> >             Returns self.

What is that ?

> > 
> >         nextset()
> > 
> >             I guess as per DB API 2.0 spec.
> > 
> >         warnings
> > 
> >             List of Warning exceptions generated so far by the currently
> >             executing statement. This list is cleared automatically by the
> >             execute and callproc methods.
> > 
> >         clear_warnings()
> > 
> >             Erase the list of warnings. Same as 'del cursor.warnings[:]'
> >             Is this method required?
> 
> Not needed.

OK, not needed.

>  
> >         raise_warnings
> > 
> >             If set to 0, Warning exceptions are not raised and instead
> >             only appended to the warnings list.
> > 
> >             If set to 1, Warning excepions are raised as they occur.
> > 
> >             Defaults to 1
>  
> >     Row Object
> > 
> >         When a Cursor is iterated over, it returns Row objects.
> >         dtuple.py (http://www.lyra.org/greg/python/) provides such an
> >         implementation already.
> > 
> >         [index_or_key]
> > 
> >             Retrieve a field from the Row. If index_or_key is an integer,
> >             the column of the field is referenced by number (with the first
> >             column index 0). If index_or_key is a string, the column is
> >             referenced by its lowercased name (lowercased to avoid problems
> >             with the differing methods vendors use to capitalize their column
> >             names).

> >     Type Objects and Constructors
> > 
> >         As per DB API 2.0 spec.
> > 
> >         Do we need Date, Time and DateTime classes, or just DateTime?
> 
>  
> >         Need to author the 'Date PEP'. It would be nice if there was a
> >         more intelligent standard Date class in the Python core that
> >         we could leverage. If we don't, people will start using the
> >         sql module for the more intelligent DateTime object it would
> >         need to provide even if they arn't using databases.
> 
> >     ConnectionPool Object
> > 
> >         A connection pool, to be documented.
> 

This object could be included in sql.py.

>  
> >     Transaction Isolation Levels
> > 
> >         Insert description of dirty reads, non repeatable reads and
> >         phantom reads, probably stolen from JDBC 3.0 documentation.
> > 
> >         The following symbolic constants are defined to describe the
> >         four transaction isolation levels defined by SQL99. Note that
> >         they are in numeric order so comparison operators can be
> >         safely used to test for restrictivness. Note that these definitions
> >         are taken from the JDBC API 3.0 documentation by Sun.
> > 
> >         TRANSACTION_NONE
> > 
> >             No transaction isolation is guarenteed.
> > 
> >         TRANSACTION_READ_UNCOMMITTED
> > 
> >             Allows transactions to see uncommitted changes to the data.
> >             This means that dirty reads, nonrepeatable reads, and
> >             phantom reads are possible.
> > 
> >         TRANSACTION_READ_COMMITTED
> > 
> >             Means that any changes made inside a transaction are not visible
> >             outside the transaction until the transaction is committed. This
> >             prevents dirty reads, but nonrepeatable reads and phantom
> >             reads are still possible.
> > 
> >         TRANSACTION_REPEATABLE_READ
> > 
> >             Disallows dirty reads and nonrepeatable reads. Phantom read are
> >             still possible.
> > 
> >         TRANSACTION_SERIALIZABLE
> > 
> >             Specifies that dirty reads, nonrepeatable reads, and phantom
> >             reads are prevented.

it is really usefull to have so many transaction levels ?

> 
> >     Test Suite
> > 
> >         A common test suite will be part of the implementation, to
> >         allow driver authors and driver evaluators to test and excercise
> >         the systems. Two possbilities to start with are the test suites
> >         in psycopg by Federico Di Gregorio and mxODBC by eGenix.com Software.
> >         Example DDL for various systems will need to be provided.

It could be in sql.test. is it possible to post a test suite file on this mailing list so every driver's developers can have a look.

>  
> > Rationale
> > 
> >     The module is called sql.py, to avoid any ambiguity with non-realational
> >     or non-SQL compliant database interfaces. This also nicely limits
> >     the scope of the project. Other suggestions are 'sqlutil' and 'rdbms',
> >     since 'sql' may refer to the language itself.
> 
> >     Previous versions of the DB API have been intentially kept lean to
> >     make it simpler to develop and maintain drivers, as a richer feature
> >     set could be implemented by a higher level wrapper and maintained in
> >     a single place rather than in every API compliant driver. As this
> >     revision provides a single place to maintain code, these features
> >     can now be provided without placing a burden to the driver authors.
> > 
> >     Existing DB API 1.0 and 2.0 drivers can be used to power the backend
> >     of this module. This means that there will be a full compliment of
> >     drivers available from day 1 of this modules release without placing
> >     a burden on driver developers and maintainers.

Do we really need to make a huge wrapper ? it is gonna spoil each driver's specific functions. I think sql.py should have a connect function and that's all.

> > 
> >     The core of the API is identical to the Python Database API v2.0
> >     (PEP-249). This API is already familiar to Python programers and
> >     is a proven solid foundation.
> > 
> >     Python previously defined a common relational database API that
> >     was implemented by all drivers, and application programmers accessed
> >     the drivers directly. This caused the following issues:
> > 
> >         It was difficult to write code that connected to multiple
> >         database vendor's databases. Each seperate driver used defined
> >         its own heirarchy of exceptions that needed to be handled,
> 
> Note that a simple way to solve this problem is to require the
> database module to expose the used exception structure as attributes
> of the connection objects.
> 
> >         their own datetime class and their own set of datatype constants.
> > 
> >         Platform independant code could not be written simply,
> >         due to the differing paramater styles used by bind variables.
> >         This also caused problems with publishing example code and tutorials.
> > 
> >         The API remained minimal, as any new features would need to
> >         be implemented by all driver maintainers. The DB-SIG felt
> >         most feature suggestions would be better implemented by higher
> >         level wrappers, such as that defined by this PEP.
> >
> > Language Comparison
> > 
> >     The design proposed in this PEP is the same as that used by Java,
> >     where a relational database API is shipped as part of the core language,
> >     but requires the installation of 3rd party drivers to be used.
> > 
> >     Perl has a similar arrangement of API and driver separation. Perl
> >     does not ship with PerlDBI or drivers as part of the core language,
> >     but it is well documented and trivial to add using the CPAN tools.
> > 
> >     PHP has a seperate API for each database vendor, although work is
> >     underway (completed?) to define a common abstraction layer
> >     similar to Java, Perl and Python. All of the drivers ship as
> >     part of the core language.
> > 
> > References
> > 
> >     [1] PEP-248 and PEP-249
> > 
> > --
> > Stuart Bishop <zen@shangri-la.dropbear.id.au>
> > 
> > _______________________________________________
> > DB-SIG maillist  -  DB-SIG@python.org
> > http://mail.python.org/mailman/listinfo/db-sig

-- 
Thierry Michel
Eric Bianchi