From harri.pasanen@trema.com Wed Jun 3 16:36:57 1998
From: harri.pasanen@trema.com (Harri PASANEN)
Date: Wed, 03 Jun 1998 17:36:57 +0200
Subject: [DB-SIG] Sybase module
References: <3562EAC5.D1E7A1F3@trema.com> <199805201558.KAA10458@stone.lib.uchicago.edu>
Message-ID: <35756D99.F6ABAFF5@trema.com>

Tod Olson wrote:
>
> >>>>> "HP" == Harri PASANEN writes:
>
> HP> Our sybase server runs on a SPARC-Solaris box, and running the
> HP> identical python client on NT and another SPARC-Solaris box gives
> HP> the result that the NT client seems roughly 3 times slower than doing
> HP> the same from another SPARC Solaris box. The NT cpu is
> HP> practically idle during the process, so it must be something with
> HP> the sybase client libs / or network.
>
> Some time ago, there was a problem with NT making a TCP/IP connection
> to Solaris. (I wish I could remember the versions.) MS was doing
> something non-standard in their TCP/IP stack, and Sun hadn't
> specifically tested MS compatibility, or somesuch.
>
> If you're running an old or unpatched Solaris, you might check for
> relevant patches, or you might check that NT has the most recent
> service pack installed. Or this could be a red herring.

I'm not sure if I already sent this info, but I modified the ctsybasemodule connection init to accept a packetsize as well. Get this: by setting the packetsize to 2048 instead of the default 512, my queries sped up by about 500% (or were 5 times faster, in other words...).

Harri

From harri.pasanen@trema.com Wed Jun 3 17:14:34 1998
From: harri.pasanen@trema.com (Harri PASANEN)
Date: Wed, 03 Jun 1998 18:14:34 +0200
Subject: [DB-SIG] Sybase module
References:
Message-ID: <3575766A.E46BA013@trema.com>

Peter Godman wrote:
>
> On Fri, 29 May 1998, Harri PASANEN wrote:
>
> > Hi,
> >
> > I'm having slight problems with Peter Godman's sybase module
> > on Windows NT, with the Sybase server being on Sparc-Solaris 2.6.
> >
> > The same Python program works every time on Solaris, but
> > on NT it occasionally hangs, always in the same place on
> > cursor.execute(). Looks like there is something
> > locking the cursor, as sp_lock always then gives:
> >
> >   fid  spid  locktype   table_id  page  dbname  class              context
> >   ---  ----  ---------  --------  ----  ------  -----------------  -------
> >     0    20  Sh_intent  16003088     0  fk4_0   Cursor Id 1310912
> >     0    20  Sh_page    16003088  6944  fk4_0   Cursor Id 1310912
>
> Hmmm. It's been a long time now, but doesn't the page 0 Sh_intent mean
> that it's trying to escalate a lock to a table lock? If this were the
> case, this would mean that Sybase is not realizing that the escalation is
> unimpeded. This is quite strange.
>
> If you only use one cursor, have you tried the program using connection
> methods rather than cursor methods, i.e. using the execute method of the
> connection rather than creating a cursor? I'd be interested to see
> whether it happens there. Can you provide a sequence of DBAPI calls, or
> some python code?
>
> If all else fails, we'll have to turn on state debugging, I guess.
>
> Cheers
> Pete

This one resolved itself by switching the SYBASE environment variable to point to the sql11 directory, where it previously pointed to the sql10 directory. The path still includes sql10/dll, which is likely where the Sybase dll's come from. All this is somewhat odd, and browsing the Sybase documentation I could not really find out what the SYBASE environment variable is used for besides locating configuration files.
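[In code, the packet-size change from the first message above might look roughly like the following sketch. The patched connect signature of ctsybasemodule is not shown anywhere in the thread, so the module name, argument order, and keyword are all assumptions:]

  # A minimal sketch, assuming the patch exposes the TDS packet size
  # as an extra keyword argument on connect() (name is hypothetical).
  import ctsybase

  # Default packet size is 512 bytes; raising it to 2048 cut the
  # reported query times to roughly a fifth by reducing round trips.
  conn = ctsybase.connect('SYBASE_SERVER', 'user', 'password',
                          packetsize=2048)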
All our other software uses DB-library, so this may be something related to CT-library and versions. All this is very bizarre, but luckily I got around the problem. Thanks for the help offer.

Harri

From jim.fulton@Digicool.com Wed Jun 3 23:36:51 1998
From: jim.fulton@Digicool.com (Jim Fulton)
Date: Wed, 03 Jun 1998 18:36:51 -0400
Subject: [DB-SIG] DB-API 1.1
References: <3563FACF.7663D00E@lemburg.com> <356B3269.EDBD1D8@lemburg.com>
Message-ID: <3575D003.588F@digicool.com>

M.-A. Lemburg wrote:
>
> M.-A. Lemburg wrote:
> >
> > I'd like to start discussing updating/extending the DB-API 1.0 to
> > a new version. IMHO, this needs to be done to
> > a. clarify a few minor tidbits
> > b. enable a more informative exception mechanism
> >
> > You can have a look at an alpha version of 1.1 at:
> > http://starship.skyport.net/~lemburg/DatabaseAPI-1.1.html
>
> Looks like everybody is too busy ... oh well. I'll wait another
> week or so then and repost the RFC.

I applaud you for taking this on....

Here are some comments. I don't know what has changed since 1.0, so some things I gripe about may be inherited from 1.0. I didn't comment enough on the 1.0 spec, so I'll try to make up for it now.

> Module Interface
>
> The database interface modules should typically be named with
> something terminated by db.

Why?

> Existing examples are: oracledb,
> informixdb, and pg95db. These modules should export several names:
>
> modulename(connection_string_or_tuple)

Why use the module name? Why not some descriptive name like: 'Connect'?

> Constructor for creating a connection to the database. Returns a
> Connection Object. In case a connection tuple is used, it should
> follow this convention: (data_source_name, user, password).

Why allow a string or a tuple? Doesn't this add non-portability?

> error
>
> Exception raised for errors in the database module's internal
> processing. Other errors may also be raised. Database errors
> should in general be made accessible through the exceptions
> defined for the dbi abstraction module described below.

Maybe this should be InternalError. Is this a DBI-defined error, or is it local to the module? Does it subclass from the DBI.Error defined below?

> Connection Objects
>
> Connection Objects should respond to the following methods:
>
> close()
>
> Close the connection now (rather than whenever __del__ is
> called). The connection will be unusable from this point forward;
> an exception will be raised if any operation is attempted with
> the connection.
>
> commit()
>
> Commit any pending transaction to the database. Note that if the
> database supports an auto-commit feature, this must be initially
> off. An interface method may be provided to turn it back on.
>
> rollback()
>
> Roll the database back to the start of any pending
> transaction. Note that closing a connection without committing
> the changes first will cause an implicit rollback to be
> performed.

Why not have a begin() method?

> cursor()
>
> Return a new Cursor Object. An exception may be thrown if the
> database does not support a cursor concept.
>
> callproc([params])
>
> Note: this method is not well-defined yet. Call a stored
> database procedure with the given (optional) parameters. Returns
> the result of the stored procedure.

How are IN OUT and OUT parameters handled? How common are stored procedures across database products anyway?
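[To make the constructor debate above concrete, here is a sketch of the two calling conventions under discussion; the module name somedb and its Connect() function are hypothetical stand-ins, not anything the draft spec defines:]

  import somedb

  # Tuple convention from the draft spec:
  # (data_source_name, user, password)
  conn = somedb.Connect(('my_dsn', 'scott', 'tiger'))

  # Single-string convention, as ODBC-style interfaces use; the module
  # has to parse this itself, which is the portability concern raised.
  conn = somedb.Connect('DSN=my_dsn;UID=scott;PWD=tiger')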
> all Cursor Object attributes and methods > > For databases that do not have cursors and for simple > applications that do not require the complexity of a cursor, a > Connection Object should respond to each of the attributes and > methods of the Cursor Object. Databases that have cursor can > implement this by using an implicit, internal cursor. > > Cursor Objects > > These objects represent a database cursor, which is used to > manage the context of a fetch operation. > > Cursor Objects should respond to the following methods and attributes: > > arraysize > > This read/write attribute specifies the number of rows to fetch > at a time with fetchmany(). This value is also used when > inserting multiple rows at a time (passing a tuple/list of > tuples/lists as the params value to execute()). This attribute > will default to a single row. > > Note that the arraysize is optional and is merely provided for > higher performance database interactions. Implementations should > observe it with respect to the fetchmany() method, but are free > to interact with the database a single row at a time. Doesn't fetchmany accept a count? Why is this attribute needed? > description > > This read-only attribute is a tuple of 7-tuples. Each 7-tuple > contains information describing each result column: (name, > type_code, display_size, internal_size, precision, scale, > null_ok). This attribute will be None for operations that do not > return rows or if the cursor has not had an operation invoked via > the execute() method yet. > > The type_code is equal to one of the dbi type objects specified > in the section below. > > Note: this is a bit in flux. Generally, the first two items of > the 7-tuple will always be present; the others may be database > specific. This is bad. I suppose we are stuck with this for backwards compatibility. If I were designing this interface I would have description be a collection object that acted as both a sequence of column definitions and a mapping from column name to column definitions. I would have the column definitions be objects that have methods for standard attributes like name, and type (and maybe nullability, scale and precision) as well as optional attributes for things like display size and internal size. I suppose that with some trickery, this could be handled in a mostly backward compatible way, by making column-definitions sequence objects too. > close() > > Close the cursor now (rather than whenever __del__ is > called). The cursor will be unusable from this point forward; an > exception will be raised if any operation is attempted with the > cursor. > > execute(operation [,params]) > > Execute (prepare) a database operation (query or > command). Parameters may be provided (as a sequence > (e.g. tuple/list)) and will be bound to variables in the > operation. Variables are specified in a database-specific > notation (some DBs use ?,?,? to indicate parameters, others > :1,:2,:3) that is based on the index in the parameter tuple > (position-based rather than name-based). The format of parameter references should be standardized. Maybe with something more Pythonic, like: %s, %d, %f This might also allow type information to be captured to aid in binding variables. > The parameters may also be specified as a sequence of sequences > (e.g. a list of tuples) to insert multiple rows in a single > operation. Does this run the insert multiple times, or does it bind some sorts of arrays to input parameters? Is this useful enough to include in this standard? 
It feels like a lot of extra burden for DBI interface developers.

> A reference to the operation will be retained by the cursor. If
> the same operation object is passed in again, then the cursor can
> optimize its behavior. This is most effective for algorithms
> where the same operation is used, but different parameters are
> bound to it (many times).

This sounds a bit too magical to me. Does it apply when no arguments are presented? I'd rather see an explicit prepare method, preferably on a connection object that returns a callable object, as in:

  f=aConnection.prepare(
      "select * from mydata where id=%d and name=%s")
  ...
  x=f(1, 'foo')
  ...
  y=f(2, 'bar')

> For maximum efficiency when reusing an operation, it is best to
> use the setinputsizes() method to specify the parameter types and
> sizes ahead of time. It is legal for a parameter to not match the
> predefined information; the implementation should compensate,
> possibly with a loss of efficiency.

I think that this could better be handled using more pythonic place holders. I don't like having to specify sizes for strings, since I may want to use a type like 'long var binary' that effectively doesn't have an upper limit.

> Using SQL terminology, these are the possible result values from
> the execute() method:
>
> If the statement is DDL (e.g. CREATE TABLE), then 1 is
> returned.

This seems a bit arbitrary to me.

> If the statement is DML (e.g. UPDATE or INSERT), then the
> number of rows affected is returned (0 or a positive
> integer).
>
> If the statement is DQL (e.g. SELECT), None is returned,
> indicating that the statement is not really complete until
> you use one of the fetch methods.
>
> fetchone()
>
> Fetch the next row of a query result, returning a single tuple,
> or None when no more data is available.
>
> fetchmany([size])
>
> Fetch the next set of rows of a query result, returning them as a
> list of tuples. An empty list is returned when no more rows are
> available. The number of rows to fetch is specified by the
> parameter. If it is None, then the cursor's arraysize determines
> the number of rows to be fetched.
>
> Note there are performance considerations involved with the size
> parameter. For optimal performance, it is usually best to use the
> arraysize attribute. If the size parameter is used, then it is
> best for it to retain the same value from one fetchmany() call to
> the next.
>
> fetchall()
>
> Fetch all (remaining) rows of a query result, returning them as a
> list of tuples. Note that the cursor's arraysize attribute can
> affect the performance of this operation.

For the record, I've never liked this approach. When I've done this sort of thing before (for Ingres and Info; sorry, I can't share the code, it was done while at USGS), I had selects return "result" objects. Result objects encapsulated cursors and gave them sequence behavior. As in:

  rows=aConnection.execute('select * from blech')
  # (Note, no explicit cursor objects)
  for row in rows:
      ... do something with the rows

Note that the "rows" object in this example is not a tuple or list, but an object that lazily gets rows from the result as needed.

Also note that the individual rows are not tuples, but objects that act as a sequence of values *and* as a mapping from column name to value. This lets you do something like:

  rows=aConnection.execute('select name, id from blech')
  for row in rows:
      print "%(name)s, %(id)s" % row

In my Ingres and Info interfaces, I also had the rows have attributes (e.g. aRow.name), but then it's hard for rows to have generic methods, like 'keys' and 'items'. I also provided access to meta data for rows, something like:

  row.__schema__

> nextset()
>
> If the database supports returning multiple result sets, this
> method will make the cursor skip to the next available set. If
> there are no more sets, the method returns None. Otherwise, it
> returns 1 and subsequent calls to the fetch methods will return
> rows from the next result set. Database interface modules that
> don't support this feature should always return None.

This feels a bit cumbersome to me. What happens if you need to iterate over multiple results simultaneously? I'd rather see an object for each result set and return a tuple of result sets if there are more than one.

> setinputsizes(sizes)
>
> Note: this method is not well-defined yet. This can be used
> before a call to execute() to predefine memory areas for the
> operation's parameters. sizes is specified as a tuple -- one item
> for each input parameter. The item should be a Type object that
> corresponds to the input that will be used, or it should be an
> integer specifying the maximum length of a string parameter. If
> the item is None, then no predefined memory area will be reserved
> for that column (this is useful to avoid predefined areas for
> large inputs).
>
> This method would be used before the execute() method is invoked.
>
> Note that this method is optional and is merely provided for
> higher performance database interaction. Implementations are free
> to do nothing and users are free to not use it.

See above.

> setoutputsize(size [,col])
>
> Note: this method is not well-defined yet. Set a column buffer
> size for fetches of large columns (e.g. LONG). The column is
> specified as an index into the result tuple. Using a column of
> None will set the default size for all large columns in the
> cursor.
>
> This method would be used before the execute() method is invoked.
>
> Note that this method is optional and is merely provided for
> higher performance database interaction. Implementations are free
> to do nothing and users are free to not use it.

In the case of LONG columns, how is someone supposed to know the maximum size ahead of time? Does anyone really want this?

> DBI Helper Objects and Exceptions
>
> Many databases need to have the input in a particular format for
> binding to an operation's input parameters. For example, if an input
> is destined for a DATE column, then it must be bound to the database
> in a particular string format. Similar problems exist for "Row ID"
> columns or large binary items (e.g. blobs or RAW columns). This
> presents problems for Python since the parameters to the execute()
> method are untyped.

They don't have to be. See above.

> When the database module sees a Python string
> object, it doesn't know if it should be bound as a simple CHAR column,
> as a raw binary item, or as a DATE.
>
> To overcome this problem, the dbi interface module was created. This
> module, which every database module must provide, specifies some basic
> database interface types for working with databases. There are two
> classes: dbiDate and dbiRaw. These are simple container classes that
> wrap up a value. When passed to the database modules, the module can
> then detect that the input parameter is intended as a DATE or a
> RAW.

I suggest doing away with these through use of parameters like %r for raw and %t for date time, or whatever.
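[The result-row objects sketched in prose above might look like this in code; the class and method names are invented for illustration, not part of the draft spec:]

  class Row:
      # One result row: indexable like a tuple, subscriptable by
      # column name, so both row[0] and row['name'] work.
      def __init__(self, names, values):
          self._names = names
          self._values = values
      def __getitem__(self, key):
          if type(key) == type(0):              # sequence access: row[0]
              return self._values[key]
          return self._values[self._names.index(key)]   # row['name']
      def keys(self):
          return self._names

  row = Row(['name', 'id'], ['foo', 42])
  print "%(name)s, %(id)s" % row                # prints: foo, 42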
> For symmetry, the database modules will return DATE and RAW > columns as instances of these classes. I'd rather see strings come back for RAW and "Date" objects come back for dates. I'd prefer to see the Date type be pluggable. > A Cursor Object's description attribute returns information about each > of the result columns of a query. The type_code is defined to be equal > to one of five types exported by this module: STRING, RAW, NUMBER, > DATE, or ROWID. There needs to be a distinction between ints and floats. > Note: The values returned in the description tuple must not > necessarily be the same as the defined types, i.e. while coltype == > STRING will always work, coltype is STRING may fail. Why? > The module exports the following functions and names: > > dbiDate(value) > > This function constructs a dbiDate instance that holds a date > value. The value should be specified as an integer number of > seconds since the "epoch" (e.g. time.time()). > > dbiRaw(value) > > This function constructs a dbiRaw instance that holds a raw > (binary) value. The value should be specified as a Python string. > > STRING > > This object is used to describe columns in a database that are > string-based (e.g. CHAR). > > RAW > > This object is used to describe (large) binary columns in a > database (e.g. LONG RAW, blobs). > > NUMBER > > This object is used to describe numeric columns in a database. > > DATE > > This object is used to describe date columns in a database. > > ROWID > > This object is used to describe the "Row ID" column in a > database. > > The module also exports these exceptions that the DB module should > raise: > > Warning > > Exception raised for important warnings like data truncations > while inserting, etc. > > Error > > Exception that is the base class of all other error > exceptions. You can use this to catch all errors with one single > 'except' statement. Warnings are not considered errors and thus > should not use this class as base. > > DataError > > Exception raised for errors that are due to problems with the > processed data like division by zero, numeric out of range, etc. > > OperationalError > > Exception raised when the an unexpected disconnect occurs, the > data source name is not found, etc. > > IntegrityError > > Exception raised when the relational integrity of the database is > affected, e.g. a foreign key check fails. > > InternalError > > Exception raised when the database encounters an internal error, > e.g. the cursor is not valid anymore, the transaction is out of > sync, etc. > > ProgrammingError > > Exception raised for programming erros, e.g. table not found or > already exists, etc. > > Note: The values of these exceptions are not defined. They should give > the user a fairly good idea of what went wrong though. If dbi exports a C API, it should do so through a Python CObject. This should avoid weird linking problems (that the oracledb and ctsybase modules have. Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From mal@lemburg.com Thu Jun 4 00:36:37 1998 From: mal@lemburg.com (M.-A. 
Lemburg)
Date: Thu, 04 Jun 1998 01:36:37 +0200
Subject: [DB-SIG] DB-API 1.1
References: <3563FACF.7663D00E@lemburg.com> <356B3269.EDBD1D8@lemburg.com> <3575D003.588F@digicool.com>
Message-ID: <3575DE05.710874E1@lemburg.com>

Jim Fulton wrote:
>
> M.-A. Lemburg wrote:
> >
> > M.-A. Lemburg wrote:
> > >
> > > I'd like to start discussing updating/extending the DB-API 1.0 to
> > > a new version. IMHO, this needs to be done to
> > > a. clarify a few minor tidbits
> > > b. enable a more informative exception mechanism
> > >
> > > You can have a look at an alpha version of 1.1 at:
> > > http://starship.skyport.net/~lemburg/DatabaseAPI-1.1.html
> >
> > Looks like everybody is too busy ... oh well. I'll wait another
> > week or so then and repost the RFC.
>
> I applaud you for taking this on....
>
> Here are some comments. I don't know what has changed since

Not much.

> 1.0, so some things I gripe about may be inherited from 1.0.

That's ok ;-)

> I didn't comment enough on the 1.0 spec, so I'll try to make up
> for it now.
>
> > Module Interface
> >
> > The database interface modules should typically be named with
> > something terminated by db.
>
> Why?

Good question :-) See my previous post for an alternative:

  1. Instead of defining the connection constructor to be named after
     the module, I think Connect(...) is a better choice (helps porting
     applications from one DB to another).

> > Existing examples are: oracledb,
> > informixdb, and pg95db. These modules should export several names:
> >
> > modulename(connection_string_or_tuple)
>
> Why use the module name? Why not some descriptive name
> like: 'Connect'?

Right.

> > Constructor for creating a connection to the database. Returns a
> > Connection Object. In case a connection tuple is used, it should
> > follow this convention: (data_source_name, user, password).
>
> Why allow a string or a tuple? Doesn't this add non-portability?

Some databases want to be passed three (or more) different parameters for the connect. Putting them all into a connect string would add an extra parse step.

> > error
> >
> > Exception raised for errors in the database module's internal
> > processing. Other errors may also be raised. Database errors
> > should in general be made accessible through the exceptions
> > defined for the dbi abstraction module described below.
>
> Maybe this should be InternalError. Is this a DBI-defined error,
> or is it local to the module? Does it subclass from the DBI.Error
> defined below?

Hmm, InternalError refers to the database, not the interface. Maybe we should add an InterfaceError that subclasses from Error.

> > Connection Objects
> >
> > Connection Objects should respond to the following methods:
> >
> > close()
> >
> > Close the connection now (rather than whenever __del__ is
> > called). The connection will be unusable from this point forward;
> > an exception will be raised if any operation is attempted with
> > the connection.
> >
> > commit()
> >
> > Commit any pending transaction to the database. Note that if the
> > database supports an auto-commit feature, this must be initially
> > off. An interface method may be provided to turn it back on.
> >
> > rollback()
> >
> > Roll the database back to the start of any pending
> > transaction. Note that closing a connection without committing
> > the changes first will cause an implicit rollback to be
> > performed.
>
> Why not have a begin() method?

What for?

> > cursor()
> >
> > Return a new Cursor Object. An exception may be thrown if the
> > database does not support a cursor concept.
> >
> > callproc([params])
> >
> > Note: this method is not well-defined yet. Call a stored
> > database procedure with the given (optional) parameters. Returns
> > the result of the stored procedure.
>
> How are IN OUT and OUT parameters handled?
> How common are stored procedures across database products
> anyway?

I don't think that there is a lot of portability regarding stored procedures. For one, the storing process itself is *very* DB-dependent (with each database having its own syntax for defining procedures), and it is not really clear how to handle the parameters (esp. the IN OUT ones you mention).

> > all Cursor Object attributes and methods
> >
> > For databases that do not have cursors and for simple
> > applications that do not require the complexity of a cursor, a
> > Connection Object should respond to each of the attributes and
> > methods of the Cursor Object. Databases that have cursors can
> > implement this by using an implicit, internal cursor.
> >
> > Cursor Objects
> >
> > These objects represent a database cursor, which is used to
> > manage the context of a fetch operation.
> >
> > Cursor Objects should respond to the following methods and attributes:
> >
> > arraysize
> >
> > This read/write attribute specifies the number of rows to fetch
> > at a time with fetchmany(). This value is also used when
> > inserting multiple rows at a time (passing a tuple/list of
> > tuples/lists as the params value to execute()). This attribute
> > will default to a single row.
> >
> > Note that the arraysize is optional and is merely provided for
> > higher performance database interactions. Implementations should
> > observe it with respect to the fetchmany() method, but are free
> > to interact with the database a single row at a time.
>
> Doesn't fetchmany accept a count? Why is this attribute
> needed?

I guess this allows the interface to prefetch arraysize many rows in one go. Don't have any experience with it, though.

> > description
> >
> > This read-only attribute is a tuple of 7-tuples. Each 7-tuple
> > contains information describing each result column: (name,
> > type_code, display_size, internal_size, precision, scale,
> > null_ok). This attribute will be None for operations that do not
> > return rows or if the cursor has not had an operation invoked via
> > the execute() method yet.
> >
> > The type_code is equal to one of the dbi type objects specified
> > in the section below.
> >
> > Note: this is a bit in flux. Generally, the first two items of
> > the 7-tuple will always be present; the others may be database
> > specific.
>
> This is bad. I suppose we are stuck with this for backwards
> compatibility.

I guess so, too :(

> If I were designing this interface I would have description
> be a collection object that acted as both a sequence of
> column definitions and a mapping from column name to column
> definitions. I would have the column definitions be objects
> that have methods for standard attributes like name, and type
> (and maybe nullability, scale and precision)
> as well as optional attributes for things like display size and
> internal size.

Especially those last two don't make too much sense nowadays (they probably did back when all output had to be formatted for 80-column displays/printers).

> I suppose that with some trickery, this could be handled in a mostly
> backward compatible way, by making column-definitions sequence objects
> too.
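[The "trickery" in the quoted paragraph could look roughly like the sketch below: a column definition that still answers to positional indexing like the old 7-tuple while adding named attributes. All names here are invented for illustration:]

  class ColumnDef:
      # Acts like the old 7-tuple for indexing and len(), but also
      # carries the standard attributes by name.
      def __init__(self, name, type_code, display_size=None,
                   internal_size=None, precision=None, scale=None,
                   null_ok=None):
          self.name = name
          self.type_code = type_code
          self._data = (name, type_code, display_size, internal_size,
                        precision, scale, null_ok)
      def __getitem__(self, i):
          return self._data[i]
      def __len__(self):
          return len(self._data)

  col = ColumnDef('id', 'NUMBER')
  print col[0], col[1]            # old positional access still works
  print col.name, col.type_code   # new attribute access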
We could change the specification to: ... is a read-only sequence of sequences, each describing one output column ... I don't think that any code really depends on real tuples for description - as long as a,b,c = cursor.description[:3] works everything should be fine. > > close() > > > > Close the cursor now (rather than whenever __del__ is > > called). The cursor will be unusable from this point forward; an > > exception will be raised if any operation is attempted with the > > cursor. > > > > execute(operation [,params]) > > > > Execute (prepare) a database operation (query or > > command). Parameters may be provided (as a sequence > > (e.g. tuple/list)) and will be bound to variables in the > > operation. Variables are specified in a database-specific > > notation (some DBs use ?,?,? to indicate parameters, others > > :1,:2,:3) that is based on the index in the parameter tuple > > (position-based rather than name-based). > > The format of parameter references should be standardized. > Maybe with something more Pythonic, like: > > %s, %d, %f > > This might also allow type information to be captured to > aid in binding variables. That won't work, since the arguments passed in the input sequence are handled by the database not the interface. E.g. for ODBC the database may define the type it wants to receive for each parameter. As example: the ODBC driver for MySQL always wants to be passed strings regardless of the "real" type. > > The parameters may also be specified as a sequence of sequences > > (e.g. a list of tuples) to insert multiple rows in a single > > operation. > > Does this run the insert multiple times, or does it bind > some sorts of arrays to input parameters? Is this > useful enough to include in this standard? It feels like > alot of extra burden for DBI interface developers. The interface can implement this as a loop inserting one row at a time or an array of rows which it passes to the database in one go. The latter is usually faster for database interfaces that rely heavily on network communication. > > A reference to the operation will be retained by the cursor. If > > the same operation object is passed in again, then the cursor can > > optimize its behavior. This is most effective for algorithms > > where the same operation is used, but different parameters are > > bound to it (many times). > > This sounds a bit too magical to me. Does it apply when no arguments > are presented? For ODBC this saves doing a prepare for every execute (well, sometimes at least). It does also apply when no arguments are present. > I'd rather see an explicit prepare method, preferably > on a connection object that returns a callable object, as in: > > f=aConnection.prepare( > "select * from mydata where id=%d and name=%s") > ... > x=f(1, 'foo') > ... > y=f(2, 'bar') > Don't know if all DBs support preparing SQL statements. You can already do this with cursors and a little magic (using the "keep a reference" feature). > > > For maximum efficiency when reusing an operation, it is best to > > use the setinputsizes() method to specify the parameter types and > > sizes ahead of time. It is legal for a parameter to not match the > > predefined information; the implementation should compensate, > > possibly with a loss of efficiency. > > I think that this could better be handled using more pythonic > place holders. I don't like having to specify sizes for strings, > since I may want to use a type like 'long var binary' that effectively > doesn't have an upper limit. 
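[For readers unfamiliar with the call being debated, a typical setinputsizes() use under the draft spec might look like the sketch below; the connection object conn, the parameter values, and the :1/:2/:3 marker style are assumptions:]

  # One entry per input parameter: a dbi type object, a maximum string
  # length, or None to skip preallocation for that parameter.
  cur = conn.cursor()
  cur.setinputsizes((dbi.NUMBER, 40, None))
  cur.execute("insert into blech values (:1, :2, :3)",
              (42, 'some text', dbi.dbiRaw(open('img.gif', 'rb').read())))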
Well, I think this was meant to have the interface pre-allocate memory for the data transfer. Don't think it's needed anymore, though, since most memory allocators only reserve memory rather than actually make it available (this is done on request indicated by page faults). > > > Using SQL terminology, these are the possible result values from > > the execute() method: > > > > If the statement is DDL (e.g. CREATE TABLE), then 1 is > > returned. > > This seems a bit arbitrary to me. Sure is and it's hard to implement too. Some interfaces will have to parse the SQL statement just to be compliant to these result values... > > If the statement is DML (e.g. UPDATE or INSERT), then the > > number of rows affected is returned (0 or a positive > > integer). > > > > If the statement is DQL (e.g. SELECT), None is returned, > > indicating that the statement is not really complete until > > you use one of the fetch methods. > > > > fetchone() > > > > Fetch the next row of a query result, returning a single tuple, > > or None when no more data is available. > > > > fetchmany([size]) > > > > Fetch the next set of rows of a query result, returning as a list > > of tuples. An empty list is returned when no more rows are > > available. The number of rows to fetch is specified by the > > parameter. If it is None, then the cursor's arraysize determines > > the number of rows to be fetched. > > > > Note there are performance considerations involved with the size > > parameter. For optimal performance, it is usually best to use the > > arraysize attribute. If the size parameter is used, then it is > > best for it to retain the same value from one fetchmany() call to > > the next. > > > > fetchall() > > > > Fetch all (remaining) rows of a query result, returning them as a > > list of tuples. Note that the cursor's arraysize attribute can > > affect the performance of this operation. > > For the record, I've never liked this approach. When I've done this > sort of thing before (For Ingres and Info, sorry, I can't share the > code, it was done while at USGS), I had selects return "result" > objects. Result objects encapsulated cursors and have them sequence > behavior. As in: > > rows=aConnection.execute('select * from blech') > # (Note, no explicit cursor objects) > for row in rows: > ... do something with the rows > > Note that the "rows" object in this example is not a tuple > or list, but an object that lazily gets rows from the > result as needed. > > Also note that the individual rows are not tuples, but > objects that act as a sequence of values *and* as a mapping > from column name to value. This lets you do something like: > > rows=aConnection.execute('select name, id from blech') > for row in rows: > print "%(name), %(id)" % row > > In my Ingres and Info interfaces, I also had the > rows have attributes (e.g. aRow.name), but then it's > hard for rows to have generic methods, like 'keys' and > 'items'. I also provided access to meta data for rows, > something like: > > row.__schema__ We could change that to ... return the result set as sequence of sequences ... as stated above, I think the only compatibility issue is having a,b,c = sequence work (and all that's needed is a working __getitem__ method). > > > nextset() > > > > If the database supports returning multiple result sets, this > > method will make the cursor skip to the next available set. If > > there are no more sets, the method returns None. 
Otherwise, it > > returns 1 and subsequent calls to the fetch methods will return > > rows from the next result set. Database interface modules that > > don't support this feature should always return None. > > This feels a bit cumbersome to me. What happens if you need > to iterate over multiple results simulataneously. I'd rather > see an object for each result set and return a tuple of result > sets if there are more than one. Sorry, can't really comment on that one, since I have no experience with it. Only one thing: introducing too many different new types will put a real strain on interface implementors. We should avoid that. > > > setinputsizes(sizes) > > > > Note: this method is not well-defined yet. This can be used > > before a call to execute() to predefine memory areas for the > > operation's parameters. sizes is specified as a tuple -- one item > > for each input parameter. The item should be a Type object that > > corresponds to the input that will be used, or it should be an > > integer specifying the maximum length of a string parameter. If > > the item is None, then no predefined memory area will be reserved > > for that column (this is useful to avoid predefined areas for > > large inputs). > > > > This method would be used before the execute() method is invoked. > > > > Note that this method is optional and is merely provided for > > higher performance database interaction. Implementations are free > > to do nothing and users are free to not use it. > > See above. > > > setoutputsize(size [,col]) > > > > Note: this method is not well-defined yet. Set a column buffer > > size for fetches of large columns (e.g. LONG). The column is > > specified as an index into the result tuple. Using a column of > > None will set the default size for all large columns in the > > cursor. > > > > This method would be used before the execute() method is invoked. > > > > Note that this method is optional and is merely provided for > > higher performance database interaction. Implementations are free > > to do nothing and users are free to not use it. > > In the case of LONG columns, how is someone suppose to know the maximum > size ahead of time? Does anyone really want this? Not me... (see above) > > > DBI Helper Objects and Exceptions > > > > Many databases need to have the input in a particular format for > > binding to an operation's input parameters. For example, if an input > > is destined for a DATE column, then it must be bound to the database > > in a particular string format. Similar problems exist for "Row ID" > > columns or large binary items (e.g. blobs or RAW columns). This > > presents problems for Python since the parameters to the execute() > > method are untyped. > > They don't have to be. See above. > > > When the database module sees a Python string > > object, it doesn't know if it should be bound as a simple CHAR column, > > as a raw binary item, or as a DATE. > > > > To overcome this problem, the dbi interface module was created. This > > module, which every database module must provide, specifies some basic > > database interface types for working with databases. There are two > > classes: dbiDate and dbiRaw. These are simple container classes that > > wrap up a value. When passed to the database modules, the module can > > then detect that the input parameter is intended as a DATE or a > > RAW. > > I suggest doing away with these through use of parameters like > %r for raw and %t for date time, or whatever. 
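[Concretely, the quoted suggestion amounts to markers that the interface itself would have to find and translate before handing the statement to the database. The marker set and helper below are illustrative assumptions:]

  import re

  def translate(operation):
      # Rewrite %s/%d/%r/%t markers into ODBC-style '?' placeholders
      # and return the type codes, so the binding code knows that the
      # second parameter here is raw and the third is a date.
      codes = re.findall(r'%([sdrt])', operation)
      sql = re.sub(r'%[sdrt]', '?', operation)
      return sql, codes

  sql, codes = translate(
      "insert into t (name, img, stamp) values (%s, %r, %t)")
  # sql   == "insert into t (name, img, stamp) values (?, ?, ?)"
  # codes == ['s', 'r', 't']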
Not sure how you'd implement this without parsing the SQL string.

> > For symmetry, the database modules will return DATE and RAW
> > columns as instances of these classes.
>
> I'd rather see strings come back for RAW and "Date" objects
> come back for dates. I'd prefer to see the Date type be pluggable.

Of course, I'd like to see everybody use DateTime[Delta] types ;-) After all, they were created for exactly this reason... but agreeing on a common interface should be enough.

> > A Cursor Object's description attribute returns information about each
> > of the result columns of a query. The type_code is defined to be equal
> > to one of five types exported by this module: STRING, RAW, NUMBER,
> > DATE, or ROWID.
>
> There needs to be a distinction between ints and floats.

IMHO, we should simply use the standard Python types the way they are meant: numbers with precision become floats, ones without precision become integers. Very long integers (BIGINT) are output as longs.

> > Note: The values returned in the description tuple need not
> > necessarily be the same as the defined types, i.e. while coltype ==
> > STRING will always work, coltype is STRING may fail.
>
> Why?

Simple: to allow the interface implementor to use some extra magic. E.g. the mxODBC module will give you a much more fine-grained coltype description than the one defined by the dbi module, so multiple coltype values will have to compare equal to, say, NUMBER (floats, integers, decimals, ...).

> > The module exports the following functions and names:
> >
> > ...
>
> If dbi exports a C API, it should do so through a Python
> CObject. This should avoid weird linking problems (that
> the oracledb and ctsybase modules have).

Good idea.

--
Marc-Andre Lemburg
----------------------------------------------------------------------
| Python Pages: http://starship.skyport.net/~lemburg/ |
-------------------------------------------------------

From ted_horst@swissbank.com Thu Jun 4 00:37:20 1998
From: ted_horst@swissbank.com (Ted Horst)
Date: Wed, 3 Jun 98 18:37:20 -0500
Subject: [DB-SIG] DB-API 1.1
In-Reply-To: <3575D003.588F@digicool.com>
References: <3563FACF.7663D00E@lemburg.com> <356B3269.EDBD1D8@lemburg.com> <3575D003.588F@digicool.com>
Message-ID: <9806032337.AA02347@ch1d162nwk>

I'll add my $/50 for what it's worth.

On Wed, 03 Jun 1998, Jim Fulton wrote:
> M.-A. Lemburg wrote:
> >
> > M.-A. Lemburg wrote:
> > >
> > > I'd like to start discussing updating/extending the DB-API 1.0 to
> > > a new version. IMHO, this needs to be done to
> > > a. clarify a few minor tidbits
> > > b. enable a more informative exception mechanism
> > >
> > > You can have a look at an alpha version of 1.1 at:
> > > http://starship.skyport.net/~lemburg/DatabaseAPI-1.1.html
> >
> > Looks like everybody is too busy ... oh well. I'll wait another
> > week or so then and repost the RFC.
>
> I applaud you for taking this on....
>
> Here are some comments. I don't know what has changed since
> 1.0, so some things I gripe about may be inherited from 1.0.
>
> I didn't comment enough on the 1.0 spec, so I'll try to make up
> for it now.
>
> > Module Interface
> >
> > The database interface modules should typically be named with
> > something terminated by db.
>
> Why?
>
> > Existing examples are: oracledb,
> > informixdb, and pg95db. These modules should export several names:
> >
> > modulename(connection_string_or_tuple)
>
> Why use the module name? Why not some descriptive name
> like: 'Connect'?

I'll agree with Jim here. I don't care so much what the name is, but it should be the same across modules.
> > Constructor for creating a connection to the database. Returns a
> > Connection Object. In case a connection tuple is used, it should
> > follow this convention: (data_source_name, user, password).
>
> Why allow a string or a tuple? Doesn't this add non-portability?

Actually, I would like to use a dictionary here. That way you can include lots of optional parameters, and if the implementation can't use them, it won't ask. I'm not crazy about trying to parse a string to get the parameters that I need, and if you use a tuple, everybody is going to have to agree on the position of all possible parameters.

> > error
> >
> > Exception raised for errors in the database module's internal
> > processing. Other errors may also be raised. Database errors
> > should in general be made accessible through the exceptions
> > defined for the dbi abstraction module described below.
>
> Maybe this should be InternalError. Is this a DBI-defined error,
> or is it local to the module? Does it subclass from the DBI.Error
> defined below?
>
> > Connection Objects
> >
> > Connection Objects should respond to the following methods:
> >
> > close()
> >
> > Close the connection now (rather than whenever __del__ is
> > called). The connection will be unusable from this point forward;
> > an exception will be raised if any operation is attempted with
> > the connection.

Why can't you reopen the connection?

> > commit()
> >
> > Commit any pending transaction to the database. Note that if the
> > database supports an auto-commit feature, this must be initially
> > off. An interface method may be provided to turn it back on.
> >
> > rollback()
> >
> > Roll the database back to the start of any pending
> > transaction. Note that closing a connection without committing
> > the changes first will cause an implicit rollback to be
> > performed.
>
> Why not have a begin() method?
>
> > cursor()
> >
> > Return a new Cursor Object. An exception may be thrown if the
> > database does not support a cursor concept.
> >
> > callproc([params])
> >
> > Note: this method is not well-defined yet. Call a stored
> > database procedure with the given (optional) parameters. Returns
> > the result of the stored procedure.
>
> How are IN OUT and OUT parameters handled?
> How common are stored procedures across database products
> anyway?

Stored procedures are important, but it is my impression that you can usually get to them through execute. The only thing you might not get is the return value of the procedure (as opposed to the results returned), and this could just be returned.

> > all Cursor Object attributes and methods
> >
> > For databases that do not have cursors and for simple
> > applications that do not require the complexity of a cursor, a
> > Connection Object should respond to each of the attributes and
> > methods of the Cursor Object. Databases that have cursors can
> > implement this by using an implicit, internal cursor.
> >
> > Cursor Objects
> >
> > These objects represent a database cursor, which is used to
> > manage the context of a fetch operation.
> >
> > Cursor Objects should respond to the following methods and attributes:
> >
> > arraysize
> >
> > This read/write attribute specifies the number of rows to fetch
> > at a time with fetchmany(). This value is also used when
> > inserting multiple rows at a time (passing a tuple/list of
> > tuples/lists as the params value to execute()). This attribute
> > will default to a single row.
> > > > Note that the arraysize is optional and is merely provided for > > higher performance database interactions. Implementations should > > observe it with respect to the fetchmany() method, but are free > > to interact with the database a single row at a time. > > Doesn't fetchmany accept a count? Why is this attribute > needed? > > > description > > > > This read-only attribute is a tuple of 7-tuples. Each 7-tuple > > contains information describing each result column: (name, > > type_code, display_size, internal_size, precision, scale, > > null_ok). This attribute will be None for operations that do not > > return rows or if the cursor has not had an operation invoked via > > the execute() method yet. > > > > The type_code is equal to one of the dbi type objects specified > > in the section below. > > > > Note: this is a bit in flux. Generally, the first two items of > > the 7-tuple will always be present; the others may be database > > specific. > > This is bad. I suppose we are stuck with this for backwards > compatibility. Again, I would much prefer that this be a dictionary with maybe some required entries, but as much extra info as the implementer cares to include. > > If I were designing this interface I would have description > be a collection object that acted as both a sequence of > column definitions and a mapping from column name to column > definitions. I would have the column definitions be objects > that have methods for standard attributes like name, and type > (and maybe nullability, scale and precision) > as well as optional attributes for things like display size and > internal size. > > I suppose that with some trickery, this could be handled in a mostly > backward compatible way, by making column-definitions sequence objects > too. > > > close() > > > > Close the cursor now (rather than whenever __del__ is > > called). The cursor will be unusable from this point forward; an > > exception will be raised if any operation is attempted with the > > cursor. > > > > execute(operation [,params]) > > > > Execute (prepare) a database operation (query or > > command). Parameters may be provided (as a sequence > > (e.g. tuple/list)) and will be bound to variables in the > > operation. Variables are specified in a database-specific > > notation (some DBs use ?,?,? to indicate parameters, others > > :1,:2,:3) that is based on the index in the parameter tuple > > (position-based rather than name-based). > > The format of parameter references should be standardized. > Maybe with something more Pythonic, like: > > %s, %d, %f I agree that we should use python style formats and let the individual modules do the proper translation. > > This might also allow type information to be captured to > aid in binding variables. > > > The parameters may also be specified as a sequence of sequences > > (e.g. a list of tuples) to insert multiple rows in a single > > operation. > > Does this run the insert multiple times, or does it bind > some sorts of arrays to input parameters? Is this > useful enough to include in this standard? It feels like > alot of extra burden for DBI interface developers. > > > A reference to the operation will be retained by the cursor. If > > the same operation object is passed in again, then the cursor can > > optimize its behavior. This is most effective for algorithms > > where the same operation is used, but different parameters are > > bound to it (many times). > > This sounds a bit too magical to me. Does it apply when no arguments > are presented? 
I'd rather see an explicit prepare method, preferably > on a connection object that returns a callable object, as in: > > > f=aConnection.prepare( > "select * from mydata where id=%d and name=%s") > ... > x=f(1, 'foo') > ... > y=f(2, 'bar') Nice idea! > > > > For maximum efficiency when reusing an operation, it is best to > > use the setinputsizes() method to specify the parameter types and > > sizes ahead of time. It is legal for a parameter to not match the > > predefined information; the implementation should compensate, > > possibly with a loss of efficiency. > > I think that this could better be handled using more pythonic > place holders. I don't like having to specify sizes for strings, > since I may want to use a type like 'long var binary' that effectively > doesn't have an upper limit. > > > Using SQL terminology, these are the possible result values from > > the execute() method: > > > > If the statement is DDL (e.g. CREATE TABLE), then 1 is > > returned. > > This seems a bit arbitrary to me. > > > If the statement is DML (e.g. UPDATE or INSERT), then the > > number of rows affected is returned (0 or a positive > > integer). It might be nice to have this available in some other way, so that you can get it for select queries as well. I don't know how widely supported this is. > > > > If the statement is DQL (e.g. SELECT), None is returned, > > indicating that the statement is not really complete until > > you use one of the fetch methods. I would like to be able to get ther return value of a stored procedure here. > > > > fetchone() > > > > Fetch the next row of a query result, returning a single tuple, > > or None when no more data is available. > > > > fetchmany([size]) > > > > Fetch the next set of rows of a query result, returning as a list > > of tuples. An empty list is returned when no more rows are > > available. The number of rows to fetch is specified by the > > parameter. If it is None, then the cursor's arraysize determines > > the number of rows to be fetched. > > > > Note there are performance considerations involved with the size > > parameter. For optimal performance, it is usually best to use the > > arraysize attribute. If the size parameter is used, then it is > > best for it to retain the same value from one fetchmany() call to > > the next. > > > > fetchall() > > > > Fetch all (remaining) rows of a query result, returning them as a > > list of tuples. Note that the cursor's arraysize attribute can > > affect the performance of this operation. > > For the record, I've never liked this approach. When I've done this > sort of thing before (For Ingres and Info, sorry, I can't share the > code, it was done while at USGS), I had selects return "result" > objects. Result objects encapsulated cursors and have them sequence > behavior. As in: > > rows=aConnection.execute('select * from blech') > # (Note, no explicit cursor objects) > for row in rows: > ... do something with the rows > > Note that the "rows" object in this example is not a tuple > or list, but an object that lazily gets rows from the > result as needed. > > Also note that the individual rows are not tuples, but > objects that act as a sequence of values *and* as a mapping > from column name to value. This lets you do something like: > > rows=aConnection.execute('select name, id from blech') > for row in rows: > print "%(name), %(id)" % row > > In my Ingres and Info interfaces, I also had the > rows have attributes (e.g. 
aRow.name), but then it's > hard for rows to have generic methods, like 'keys' and > 'items'. I also provided access to meta data for rows, > something like: > > row.__schema__ > Result objects are great, but I think that they should be built on top of the standard layer. Different uses might require (or at least suggest) very different kinds of result objects. > > nextset() > > > > If the database supports returning multiple result sets, this > > method will make the cursor skip to the next available set. If > > there are no more sets, the method returns None. Otherwise, it > > returns 1 and subsequent calls to the fetch methods will return > > rows from the next result set. Database interface modules that > > don't support this feature should always return None. > > This feels a bit cumbersome to me. What happens if you need > to iterate over multiple results simulataneously. I'd rather > see an object for each result set and return a tuple of result > sets if there are more than one. Yes. > > > setinputsizes(sizes) > > > > Note: this method is not well-defined yet. This can be used > > before a call to execute() to predefine memory areas for the > > operation's parameters. sizes is specified as a tuple -- one item > > for each input parameter. The item should be a Type object that > > corresponds to the input that will be used, or it should be an > > integer specifying the maximum length of a string parameter. If > > the item is None, then no predefined memory area will be reserved > > for that column (this is useful to avoid predefined areas for > > large inputs). > > > > This method would be used before the execute() method is invoked. > > > > Note that this method is optional and is merely provided for > > higher performance database interaction. Implementations are free > > to do nothing and users are free to not use it. > > See above. > > > setoutputsize(size [,col]) > > > > Note: this method is not well-defined yet. Set a column buffer > > size for fetches of large columns (e.g. LONG). The column is > > specified as an index into the result tuple. Using a column of > > None will set the default size for all large columns in the > > cursor. > > > > This method would be used before the execute() method is invoked. > > > > Note that this method is optional and is merely provided for > > higher performance database interaction. Implementations are free > > to do nothing and users are free to not use it. > > In the case of LONG columns, how is someone suppose to know the maximum > size ahead of time? Does anyone really want this? > > > DBI Helper Objects and Exceptions > > > > Many databases need to have the input in a particular format for > > binding to an operation's input parameters. For example, if an input > > is destined for a DATE column, then it must be bound to the database > > in a particular string format. Similar problems exist for "Row ID" > > columns or large binary items (e.g. blobs or RAW columns). This > > presents problems for Python since the parameters to the execute() > > method are untyped. > > They don't have to be. See above. > > > When the database module sees a Python string > > object, it doesn't know if it should be bound as a simple CHAR column, > > as a raw binary item, or as a DATE. > > > > To overcome this problem, the dbi interface module was created. This > > module, which every database module must provide, specifies some basic > > database interface types for working with databases. There are two > > classes: dbiDate and dbiRaw. 
These are simple container classes that > > wrap up a value. When passed to the database modules, the module can > > then detect that the input parameter is intended as a DATE or a > > RAW. > > I suggest doing away with these through use of parameters like > %r for raw and %t for date time, or whatever. > > > For symmetry, the database modules will return DATE and RAW > > columns as instances of these classes. > > I'd rather see strings come back for RAW and "Date" objects > come back for dates. I'd prefer to see the Date type be pluggable. I'll agree to strings for RAW values, dates I am still not sure. Pluggable date types sounds good, but what will the interface be ? I think I would at least like the option just getting back a string, but then do we need to agree on a format ? > > > A Cursor Object's description attribute returns information about each > > of the result columns of a query. The type_code is defined to be equal > > to one of five types exported by this module: STRING, RAW, NUMBER, > > DATE, or ROWID. > > There needs to be a distinction between ints and floats. > > > Note: The values returned in the description tuple must not > > necessarily be the same as the defined types, i.e. while coltype == > > STRING will always work, coltype is STRING may fail. > > Why? > > > The module exports the following functions and names: > > > > dbiDate(value) > > > > This function constructs a dbiDate instance that holds a date > > value. The value should be specified as an integer number of > > seconds since the "epoch" (e.g. time.time()). > > > > dbiRaw(value) > > > > This function constructs a dbiRaw instance that holds a raw > > (binary) value. The value should be specified as a Python string. > > > > STRING > > > > This object is used to describe columns in a database that are > > string-based (e.g. CHAR). > > > > RAW > > > > This object is used to describe (large) binary columns in a > > database (e.g. LONG RAW, blobs). > > > > NUMBER > > > > This object is used to describe numeric columns in a database. > > > > DATE > > > > This object is used to describe date columns in a database. > > > > ROWID > > > > This object is used to describe the "Row ID" column in a > > database. > > > > The module also exports these exceptions that the DB module should > > raise: > > > > Warning > > > > Exception raised for important warnings like data truncations > > while inserting, etc. > > > > Error > > > > Exception that is the base class of all other error > > exceptions. You can use this to catch all errors with one single > > 'except' statement. Warnings are not considered errors and thus > > should not use this class as base. > > > > DataError > > > > Exception raised for errors that are due to problems with the > > processed data like division by zero, numeric out of range, etc. > > > > OperationalError > > > > Exception raised when the an unexpected disconnect occurs, the > > data source name is not found, etc. > > > > IntegrityError > > > > Exception raised when the relational integrity of the database is > > affected, e.g. a foreign key check fails. > > > > InternalError > > > > Exception raised when the database encounters an internal error, > > e.g. the cursor is not valid anymore, the transaction is out of > > sync, etc. > > > > ProgrammingError > > > > Exception raised for programming erros, e.g. table not found or > > already exists, etc. > > > > Note: The values of these exceptions are not defined. They should give > > the user a fairly good idea of what went wrong though. 
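[The layering described in the quoted exception list is easiest to see as a class hierarchy; this sketch spells out only the exceptions named above:]

  # Warning deliberately does not inherit from Error, so 'except Error'
  # catches every error without also swallowing warnings.
  class Warning(Exception):
      pass
  class Error(Exception):
      pass
  class DataError(Error):
      pass
  class OperationalError(Error):
      pass
  class IntegrityError(Error):
      pass
  class InternalError(Error):
      pass
  class ProgrammingError(Error):
      pass

  try:
      raise ProgrammingError("table not found")
  except Error:
      pass        # one handler for all module errors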
> > If dbi exports a C API, it should do so through a Python > CObject. This should avoid weird linking problems (as > the oracledb and ctsybase modules have). > > Jim > Ted Horst (not speaking for any Swiss banks) From billtut@microsoft.com Thu Jun 4 01:42:17 1998 From: billtut@microsoft.com (Bill Tutt) Date: Wed, 3 Jun 1998 17:42:17 -0700 Subject: [DB-SIG] DB-API 1.1 Message-ID: <4D0A23B3F74DD111ACCD00805F31D81005B04C36@red-msg-50.dns.microsoft.com> > -----Original Message----- > From: Jim Fulton [mailto:jim.fulton@Digicool.com] > M.-A. Lemburg wrote: > > > > M.-A. Lemburg wrote: > > > > > Module Interface > > > > The database interface modules should typically be named with > > something terminated by db. > > Why? > Agreed, this really isn't necessary. > > Existing examples are: oracledb, > > informixdb, and pg95db. These modules should export several names: > > > > modulename(connection_string_or_tuple) > > Why use the module name? Why not some descriptive name > like: 'Connect'? > Historical reasons, i.e. no good reason. > > Constructor for creating a connection to the database. > Returns a > > Connection Object. In case a connection tuple is used, > it should > > follow this convention: (data_source_name, user, password). > > Why allow a string or a tuple? Doesn't this add non-portability? Actually I originally thought of doing something like an ODBC connection string, but after reading Ted and M-A's responses on this subject I'm liking Ted's idea more and more. Let's use a dictionary here. An ODBC connection string can just be formed by the C equivalent of:

s = ""
for k in dict.keys():
    s = s + k + "=" + dict[k] + ";"

> > error > > > > Exception raised for errors in the database > module's internal > > processing. Other errors may also be raised. Database errors > > should in general be made accessible through the exceptions > > defined for the dbi abstraction module described below. > > Maybe this should be InternalError. Is this a DBI defined error, > or is it local to the module? Does it subclass from the DBI.Error > defined below? > It's local to the module. It would be useful to subclass from DBI.Error, although generally any exceptions raised through "odbc.error" should be setup/logic errors as opposed to underlying database problems which should raise DBI exceptions. M-A suggested a good alternative name to use. > > Connection Objects > > > > Connection Objects should respond to the following methods: > > > > close() > > > > Close the connection now (rather than whenever __del__ is > > called). The connection will be unusable from this > point forward; > > an exception will be raised if any operation is attempted with > > the connection. > > > > commit() > > > > Commit any pending transaction to the database. Note > that if the > > database supports an auto-commit feature, this must be > initially > > off. An interface method may be provided to turn it back on. > > > > rollback() > > > > Roll the database back to the start of any pending > > transaction. Note that closing a connection without committing > > the changes first will cause an implicit rollback to be > > performed. > > Why not have a begin() method? > You don't need a begin() in the database C module. Having a begin() would require more logic than really is necessary for a Python -> C DB API module. Having a wrapper class in Python to do this is trivial of course. (This may suggest that the database module is Python, and that the C code is a supplemental module) > > cursor() > > > > Return a new Cursor Object.
An exception may be thrown if the > > database does not support a cursor concept. > > > > callproc([params]) > > > > Note: this method is not well-defined yet. Call a stored > > database procedure with the given (optional) > parameters. Returns > > the result of the stored procedure. > > How are IN OUT and OUT parameters handled? > How common are stored procedures across database products > anyway? > Stored procedures are very common; unfortunately, the syntax and the results from them aren't. On SQL Server for example stored procedures can have the following parameters/results:

1) IN params
2) OUT params
3) return value
4) returned result set

Unfortunately I know nothing about stored procedures in other databases besides SQL Server. > > all Cursor Object attributes and methods > > > > For databases that do not have cursors and for simple > > applications that do not require the complexity of a cursor, a > > Connection Object should respond to each of the attributes and > > methods of the Cursor Object. Databases that have cursors can > > implement this by using an implicit, internal cursor. > > > > Cursor Objects > > > > These objects represent a database cursor, which is used to > > manage the context of a fetch operation. > > > > Cursor Objects should respond to the following methods and > attributes: > > > > arraysize > > > > This read/write attribute specifies the number of rows to fetch > > at a time with fetchmany(). This value is also used when > > inserting multiple rows at a time (passing a tuple/list of > > tuples/lists as the params value to execute()). This attribute > > will default to a single row. > > > > Note that the arraysize is optional and is merely provided for > > higher performance database interactions. > Implementations should > > observe it with respect to the fetchmany() method, but are free > > to interact with the database a single row at a time. > > Doesn't fetchmany accept a count? Why is this attribute > needed? This is a 1.0 API anachronism. It isn't needed at all. Punt it. > > description > > > > This read-only attribute is a tuple of 7-tuples. Each 7-tuple > > contains information describing each result column: (name, > > type_code, display_size, internal_size, precision, scale, > > null_ok). This attribute will be None for operations > that do not > > return rows or if the cursor has not had an operation > invoked via > > the execute() method yet. > > > > The type_code is equal to one of the dbi type objects specified > > in the section below. > > > > Note: this is a bit in flux. Generally, the first two items of > > the 7-tuple will always be present; the others may be database > > specific. > > This is bad. I suppose we are stuck with this for backwards > compatibility. > > If I were designing this interface I would have description > be a collection object that acted as both a sequence of > column definitions and a mapping from column name to column > definitions. I would have the column definitions be objects > that have methods for standard attributes like name, and type > (and maybe nullability, scale and precision) > as well as optional attributes for things like display size and > internal size. > > I suppose that with some trickery, this could be handled in a mostly > backward compatible way, by making column-definitions sequence objects > too. > Having the simple interface for the C code is easier for the C extension writer. Wrapping this tuple into your proposed solution seems trivial.
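Jim's collection idea can indeed be layered over the 7-tuples on the Python side, much as Bill suggests. A rough sketch -- every name here is invented for the example; only the draft's 7-tuple layout is assumed:

class ColumnDescription:
    # one result column; simply wraps the draft's 7-tuple
    def __init__(self, desc7):
        (self.name, self.type_code, self.display_size,
         self.internal_size, self.precision, self.scale,
         self.null_ok) = desc7

class Description:
    # acts as a sequence of column definitions *and* as a
    # mapping from column name to column definition
    def __init__(self, seven_tuples):
        self._cols = []
        self._by_name = {}
        for t in seven_tuples:
            col = ColumnDescription(t)
            self._cols.append(col)
            self._by_name[col.name] = col

    def __len__(self):
        return len(self._cols)

    def __getitem__(self, key):
        try:
            return self._cols[key]      # integer index
        except TypeError:
            return self._by_name[key]   # column name

With something like this, description[0].name and description['id'].type_code both work, and iterating over the object still yields one column definition per result column.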
> > close() > > > > Close the cursor now (rather than whenever __del__ is > > called). The cursor will be unusable from this point > forward; an > > exception will be raised if any operation is attempted with the > > cursor. > > > > execute(operation [,params]) > > > > Execute (prepare) a database operation (query or > > command). Parameters may be provided (as a sequence > > (e.g. tuple/list)) and will be bound to variables in the > > operation. Variables are specified in a database-specific > > notation (some DBs use ?,?,? to indicate parameters, others > > :1,:2,:3) that is based on the index in the parameter tuple > > (position-based rather than name-based). > > The format of parameter references should be standardized. > Maybe with something more Pythonic, like: > > %s, %d, %f > > This might also allow type information to be captured to > aid in binding variables. > Oracle uses the :1, :2, :3 variation, while ODBC uses the ? variation. Personally I've always favored the Oracle way of indicating binding parameters. Typing is discussed below. > > The parameters may also be specified as a sequence of sequences > > (e.g. a list of tuples) to insert multiple rows in a single > > operation. > > Does this run the insert multiple times, or does it bind > some sort of arrays to input parameters? Is this > useful enough to include in this standard? It feels like > a lot of extra burden for DBI interface developers. It essentially allows the C code to sit in a tight loop executing a prepared SQL statement with different data very quickly as opposed to having to go back to a Python loop to hit the next call to execute(). This is VERY VERY VERY useful and necessary when dealing with moving large amounts of data in and out of databases. E.g.: It's easier on SQL Server to import 33 million rows of data, massaging the data as needed in Python, sorting the data, and then bulk loading it into SQL Server without indexes, and then create clustered indexes on the already sorted data, than to run the update/deletes directly in SQL on those 33 million rows of data, including the related tables off of the main data that have 1 to many relationships. > > A reference to the operation will be retained by the cursor. If > > the same operation object is passed in again, then the > cursor can > > optimize its behavior. This is most effective for algorithms > > where the same operation is used, but different parameters are > > bound to it (many times). > > This sounds a bit too magical to me. Does it apply when no arguments > are presented? I'd rather see an explicit prepare method, preferably > on a connection object that returns a callable object, as in: >
> f=aConnection.prepare(
>     "select * from mydata where id=%d and name=%s")
> ...
> x=f(1, 'foo')
> ...
> y=f(2, 'bar')
>
I think he's just promoting a simple cache of prepared statements. This is useful, but a rather obnoxious request for the C database module writer. If we organized a generic DB API C extension that people could steal from this would be a good idea. Having a specific prepare() method might make sense though. > > For maximum efficiency when reusing an operation, it is best to > > use the setinputsizes() method to specify the > parameter types and > > sizes ahead of time. It is legal for a parameter to > not match the > > predefined information; the implementation should compensate, > > possibly with a loss of efficiency. > > I think that this could better be handled using more pythonic > place holders.
I don't like having to specify sizes for strings, > since I may want to use a type like 'long var binary' that effectively > doesn't have an upper limit. setoutputsize() is AFAIK completely puntable, except for being useful in certain circumstances for specifying the expected size of BLOBs. setinputsizes() should be punted. > > Using SQL terminology, these are the possible result > values from > > the execute() method: > > > > If the statement is DDL (e.g. CREATE TABLE), then 1 is > > returned. > > This seems a bit arbitrary to me. > Punt it, people know when they're doing a DDL, it'll either succeed or fail and throw an exception, just return None. > > If the statement is DML (e.g. UPDATE or INSERT), then the > > number of rows affected is returned (0 or a positive > > integer). > > > > If the statement is DQL (e.g. SELECT), None is returned, > > indicating that the statement is not really complete until > > you use one of the fetch methods. > > > > fetchone() > > > > Fetch the next row of a query result, returning a single tuple, > > or None when no more data is available. > > > > fetchmany([size]) > > > > Fetch the next set of rows of a query result, > returning them as a list > > of tuples. An empty list is returned when no more rows are > > available. The number of rows to fetch is specified by the > > parameter. If it is None, then the cursor's arraysize > determines > > the number of rows to be fetched. > > > > Note there are performance considerations involved > with the size > > parameter. For optimal performance, it is usually best > to use the > > arraysize attribute. If the size parameter is used, then it is > > best for it to retain the same value from one > fetchmany() call to > > the next. > > > > fetchall() > > > > Fetch all (remaining) rows of a query result, > returning them as a > > list of tuples. Note that the cursor's arraysize attribute can > > affect the performance of this operation. > > For the record, I've never liked this approach. When I've done this > sort of thing before (For Ingres and Info, sorry, I can't share the > code, it was done while at USGS), I had selects return "result" > objects. Result objects encapsulated cursors and gave them sequence > behavior. As in: >
> rows=aConnection.execute('select * from blech')
> # (Note, no explicit cursor objects)
> for row in rows:
>     ... do something with the rows
>
> Note that the "rows" object in this example is not a tuple > or list, but an object that lazily gets rows from the > result as needed. > The distinction between the 3 is useful for determining how often & when your code actually hits the database. Wrapping fetchone() with a lazy "rows" object is trivial (a sketch of such a wrapper appears below). As is writing a threaded wrapping for fetchall() on the "rows" object. > Also note that the individual rows are not tuples, but > objects that act as a sequence of values *and* as a mapping > from column name to value. This lets you do something like: >
> rows=aConnection.execute('select name, id from blech')
> for row in rows:
>     print "%(name)s, %(id)s" % row
>
> In my Ingres and Info interfaces, I also had the > rows have attributes (e.g. aRow.name), but then it's > hard for rows to have generic methods, like 'keys' and > 'items'. I also provided access to meta data for rows, > something like: > > row.__schema__ > This part is fairly good, I'm generally in favor of a row being a C'ified version of Greg Stein's dtuple.py module.
(Attached) > > nextset() > > > > If the database supports returning multiple result sets, this > > method will make the cursor skip to the next available set. If > > there are no more sets, the method returns None. Otherwise, it > > returns 1 and subsequent calls to the fetch methods will return > > rows from the next result set. Database interface modules that > > don't support this feature should always return None. > > This feels a bit cumbersome to me. What happens if you need > to iterate over multiple results simultaneously? I'd rather > see an object for each result set and return a tuple of result > sets if there are more than one. > If you need to do so, wrap it in Python. Databases aren't usually nice enough to allow you to iterate over multiple result sets simultaneously. > > setoutputsize(size [,col]) > > > > Note: this method is not well-defined yet. Set a column buffer > > size for fetches of large columns (e.g. LONG). The column is > > specified as an index into the result tuple. Using a column of > > None will set the default size for all large columns in the > > cursor. > > > > This method would be used before the execute() method > is invoked. > > > > Note that this method is optional and is merely provided for > > higher performance database interaction. > Implementations are free > > to do nothing and users are free to not use it. > > In the case of LONG columns, how is someone supposed to know > the maximum > size ahead of time? Does anyone really want this? > The usefulness of this is really limited to the efficiency of the C module's memory allocation policy, and can probably be safely punted. > > DBI Helper Objects and Exceptions > > > > Many databases need to have the input in a particular format for > > binding to an operation's input parameters. For example, if an input > > is destined for a DATE column, then it must be bound to the database > > in a particular string format. Similar problems exist for "Row ID" > > columns or large binary items (e.g. blobs or RAW columns). This > > presents problems for Python since the parameters to the execute() > > method are untyped. > > They don't have to be. See above. > No they don't, but it's easier and saner to do it this way. > > When the database module sees a Python string > > object, it doesn't know if it should be bound as a simple > CHAR column, > > as a raw binary item, or as a DATE. > > Database modules should never think a Python string is a date. Never, ever, ever.... bad, bad, bad. If you want a date, pass in a dbiDate object. > > To overcome this problem, the dbi interface module was created. This > > module, which every database module must provide, specifies > some basic > > database interface types for working with databases. There are two > > classes: dbiDate and dbiRaw. These are simple container classes that > > wrap up a value. When passed to the database modules, the module can > > then detect that the input parameter is intended as a DATE or a > > RAW. > [ % typing idea comments by Jim] > > For symmetry, the database modules will return DATE and RAW > > columns as instances of these classes. > > I'd rather see strings come back for RAW and "Date" objects > come back for dates. I'd prefer to see the Date type be pluggable. > I'm in agreement with Jim here that RAWs should be returned as strings. Even though this is likely to be fairly obnoxious wrt memory allocation hits, depending on the allocation policy of BLOBs.
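Picking up the result-object thread from above: the lazy "rows" wrapper Bill calls trivial could look roughly like this on top of a draft-compliant cursor (the class name is invented; only the draft's fetchone() is assumed):

class LazyRows:
    # lazily pulls rows from a cursor, one fetchone() at a
    # time, but only as they are actually asked for
    def __init__(self, cursor):
        self._cursor = cursor
        self._rows = []        # rows fetched so far
        self._exhausted = 0

    def __getitem__(self, i):
        # sequence protocol: a for-loop calls this with
        # i = 0, 1, 2, ... until IndexError is raised
        while not self._exhausted and i >= len(self._rows):
            row = self._cursor.fetchone()
            if row is None:
                self._exhausted = 1
            else:
                self._rows.append(row)
        return self._rows[i]

So after cursor.execute('select * from blech'), a plain "for row in LazyRows(cursor):" hits the database only as the loop advances.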
> > A Cursor Object's description attribute returns information > about each > > of the result columns of a query. The type_code is defined > to be equal > > to one of five types exported by this module: STRING, RAW, NUMBER, > > DATE, or ROWID. > > There needs to be a distinction between ints and floats. Well, the information is contained within the precision and scale values, but yes it would be nice to distinguish between ints, floats, and decimals (SQL column types along the lines of: Latitude DECIMAL(9,6).) > > Note: The values returned in the description tuple need not > > necessarily be the same as the defined types, i.e. while coltype == > > STRING will always work, coltype is STRING may fail. > > Why? > M-A mentioned something about this, although I disagree with him. If M-A wants the real column types from mxODBC I suggest he adds an interface to ODBC DB metadata APIs. i.e. SQLCatalog(), etc. > > The module exports the following functions and names: > > > > dbiDate(value) > > > > This function constructs a dbiDate instance that holds a date > > value. The value should be specified as an integer number of > > seconds since the "epoch" (e.g. time.time()). > > This clearly can't be a 1970-based epoch. I hereby suggest using M-A's DateTime class for use as the implementation of dbiDate. Bill

[uuencoded attachment dtuple.py omitted -- the encoding is corrupted in the archive]

M.-A. Lemburg wrote: >Jim Fulton wrote: >>M.-A. Lemburg wrote: >>>M.-A. Lemburg wrote: >>> callproc([params]) >>> >>> Note: this method is not well-defined yet. Call a stored >>> database procedure with the given (optional) parameters. Returns >>> the result of the stored procedure. >> >> How are IN OUT and OUT parameters handled? >> How common are stored procedures across database products >> anyway? >I don't think that there is a lot of portability regarding stored >procedures. For one, the storing process itself is *very* DB >dependent (with each database having its own syntax for defining >procedures) and it is not really clear how to handle the parameters >(esp. the IN OUT ones you mention). you don't have to "handle" the parameters. you only have to design an interface for doing so. how they are handled is left to the compliant module implementor. and just because you can't, in general, port stored procedures between databases (sybase and sql anywhere being the only example i know), that doesn't mean that the syntax for calling stored procedures needs to change across compliant modules as well. if someone has to port the stored procedures, why do they also have to port the (python) calls to them?
surely each compliant module can produce the appropriate db-specific sql syntax. just how different are stored procedures in different databases? i haven't seen many databases but i can't imagine being surprised by any of them. they didn't cancel the odbc project because of this :) >>> nextset() >>> >>> If the database supports returning multiple result sets, this >>> method will make the cursor skip to the next available set. If >>> there are no more sets, the method returns None. Otherwise, it >>> returns 1 and subsequent calls to the fetch methods will return >>> rows from the next result set. Database interface modules that >>> don't support this feature should always return None. >> >> This feels a bit cumbersome to me. What happens if you need >> to iterate over multiple results simultaneously? I'd rather >> see an object for each result set and return a tuple of result >> sets if there are more than one. >Sorry, can't really comment on that one, since I have no experience >with it. Only one thing: introducing too many different new >types will put a real strain on interface implementors. We should >avoid that. what "new types" are you referring to here? i don't understand. the difference as i see it is getting result sets one at a time, or in a tuple. surely a tuple isn't going to cause too much strain :) i agree that using nextset() is too cumbersome. i hope that both approaches are supported. >>> A Cursor Object's description attribute returns information about each >>> of the result columns of a query. The type_code is defined to be equal >>> to one of five types exported by this module: STRING, RAW, NUMBER, >>> DATE, or ROWID. >> >> There needs to be a distinction between ints and floats. >IMHO, we should simply use the standard Python types the way >they are meant: numbers with precision become floats, ones without >precision become integers. Very long integers (BIGINT) are output >as longs. be careful. the ctsybase module retrieves "money" values as float, not long. since "money" is an 8 byte integer, 4 digits of precision (typically) are lost this way. i'm sure people have been killed for less :) raf From mdt@mdt.in-berlin.de Thu Jun 4 08:49:24 1998 From: mdt@mdt.in-berlin.de (Michael Dietrich) Date: Thu, 4 Jun 1998 09:49:24 +0200 Subject: [DB-SIG] DB-API 1.1 In-Reply-To: <3575D003.588F@digicool.com>; from Jim Fulton on Wed, Jun 03, 1998 at 06:36:51PM -0400 References: <3563FACF.7663D00E@lemburg.com> <356B3269.EDBD1D8@lemburg.com> <3575D003.588F@digicool.com> Message-ID: <19980604094924.B329@mdt.in-berlin.de> > > commit() > > > > Commit any pending transaction to the database. Note that if the > > database supports an auto-commit feature, this must be initially > > off. An interface method may be provided to turn it back on. > > > > rollback() > > > > Roll the database back to the start of any pending > > transaction. Note that closing a connection without committing > > the changes first will cause an implicit rollback to be > > performed. > > Why not have a begin() method? i like another approach: i would like to see a transaction object, where the transaction is the lifetime of the object. this object has the method commit() to actually make the changes permanent. if you get an exception or forget to call commit() the default is to roll back every db-change. for any change in the database you need a transaction object, so the programmer is forced to use it.
an example:

t = db.Transaction()
db.Execute(t, 'INSERT ....')
db.Execute(t, 'DELETE ....')
db.Execute(t, 'DELETE ....')
t.Commit()
t = None

i used this approach successfully with c++. most programmers in this project began to like that. it's easy and you can't forget anything except the commit. but you immediately see that. another thing is the nested transaction, which is forbidden with most databases. our transaction class in that project only committed the outermost transaction, while all inner transactions were ignored, that means you could neither commit nor roll them back. so only the whole transaction could be committed and that's what a transaction is for. -- see header From mal@lemburg.com Thu Jun 4 10:23:40 1998 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 04 Jun 1998 11:23:40 +0200 Subject: [DB-SIG] DB-API 1.1 References: <3563FACF.7663D00E@lemburg.com> <356B3269.EDBD1D8@lemburg.com> <3575D003.588F@digicool.com> <19980604094924.B329@mdt.in-berlin.de> Message-ID: <3576679C.42C087E@lemburg.com> Ok, I'll try to summarize what's been proposed so far [and my comments on them]: Ted Horst would like to see dictionaries used for connection parameters and column descriptions. This is a good idea, but not backward compatible. Then again, some DB interfaces use tuples, others strings, some both... so we might as well make the change for the connection arguments. It's different for the column description: unpacking dictionaries doesn't work. I suggested defining description to return a sequence of sequences. This lets you implement hybrid models as well, e.g. sequence of sequence/mappings. Would be very useful indeed to have a type that behaves as both: sequence and mapping. Maybe a project for some rainy Sunday... Jim wanted to know what kind of exception to raise for interface-related errors. I suggested InterfaceError with Error as base class. Seems to be accepted unless I hear anything different. Stored Procedures. As expected this causes troubles. AFAIK, it is possible to call stored procedures via the standard execute method. Input parameters get passed in via the parameter tuple, output goes into a result set. Since databases tend to use different calling syntaxes the callproc() could take care of mapping the procedure to the database specific syntax and then pass the parameters on to the execute method to have the procedure call itself executed. Results would then be available via the result set. IN OUT parameters are not possible using this scheme, but then: which Python type would you use for them anyway...

Things to be considered obsolete:
arraysize
setinputsize()
setoutputsize()

Completely agree on punting these... fetchmany() would then have a mandatory argument instead of an optional one. begin() method. Jim mentioned this without too many details attached to it. Michael responded with a transaction model. Michael's model is easily implemented in Python on top of database cursors. As are some other techniques of getting the OO look&feel into database handling. This could well define a new project for the DB-SIG. Return values for DDL, DML and DQL statements. The only really useful return value defined is the number of rows affected for DMLs. mxODBC provides two new attributes for cursors:

rowcount - number of rows affected or in the result set
colcount - number of columns in the result set

We drop all the special return values and add these two (or maybe just rowcount) attributes instead. Multiple result sets. Same problem as with stored procedures.
There doesn't seem to be much of a standard and most DBs probably don't even support these. Mapping of database types to Python types. Since different databases support different types (including some really exotic ones like PostgreSQL), I think the general idea should be: preserve as much accuracy as you can, e.g. if a money database type doesn't fit Python's integers, use longs. The only type debatable, IMHO, is what to use for long varchar columns (LONGS or BLOBS). The most common usage for these is probably storing raw data, just like you would in a disk file. Two Python types come into play here: arrays (mutable) and strings (immutable). Strings provide a much better access interface and are widely supported, so I'd vote for strings. Having coltype compare '==' instead of 'is'. Bill doesn't like it. I don't think it's much of a change and it allows me to return the raw database data in coltype. The types defined in dbi are only a very small subset of the possible database types (again, see PostgreSQL for an example of many wild types ;). This small change makes it possible to add magic in Python to have multiple values compare equal to one, which would otherwise not be possible. I don't like the idea of having to access the coltype through a separate (non-portable) API. dbiDate. I suggested using DateTime types, Bill agrees, Jim probably has his own set of types ;-) which he'd like to use. I'm biased, of course, but the DateTime types are readily available and easy to use on the user side as on the interface programmer side. They provide a rich set of API functions which they export via a CObject (meaning: no linking problems). mxODBC uses them already. Though it also allows you to choose two other ways of passing date/time values: as strings and as tuples. In general having more of the API return dedicated types with a nice OO interface would be a really nice thing, but I fear that most interface programmers are not willing to hack up two or three other types in addition to connections and cursors. -- Marc-Andre Lemburg ---------------------------------------------------------------------- | Python Pages: http://starship.skyport.net/~lemburg/ | ------------------------------------------------------- From Paul Boddie Thu Jun 4 11:34:11 1998 From: Paul Boddie (Paul Boddie) Date: Thu, 4 Jun 1998 12:34:11 +0200 (MET DST) Subject: [DB-SIG] DB-API 1.1 Message-ID: <199806041034.MAA23806@aisws7.cern.ch> Sorry to go straight for this one, but it is something that I have dealt with tentatively... > Stored Procedures. As expected this causes troubles. > > AFAIK, it is possible to call stored procedures via the > standard execute method. Input parameters get passed in > via the parameter tuple, output goes into a result set. I don't think it is with oracledb, or at least not universally. In any case, I think that a specific method to make it more usable is worthwhile, otherwise you have to do things like: begin proc(...); end; That is how you might do the work in, say, sqlplus for Oracle, but it isn't very nice. I always mess the syntax up somehow. ;-) > Since databases tend to use different calling syntaxes > the callproc() could take care of mapping the procedure > to the database specific syntax and then pass the parameters > on to the execute method to have the procedure call itself > executed. Results would then be available via the > result set. IN OUT parameters are not possible using this > scheme, but then: which Python type would you use for them > anyway...
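For comparison with the Solid-flavoured sketch in the next message, a callproc() aimed at Oracle might generate exactly the anonymous block Paul shows. Purely illustrative -- it assumes Oracle's :1,:2,... placeholder style and that execute() will accept such a block, which is precisely the point Paul questions below:

def callproc(self, procname, parameters=[]):
    # build 'begin proc(:1, :2, ...); end;' -- an anonymous
    # PL/SQL block using Oracle-style positional placeholders
    placeholders = ''
    for i in range(len(parameters)):
        if placeholders:
            placeholders = placeholders + ', '
        placeholders = placeholders + ':%d' % (i + 1)
    s = 'begin %s(%s); end;' % (procname, placeholders)
    return self.execute(s, parameters)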
The way I implemented this [1] used a method on the cursor object, because the Oracle Call Interface function requires a cursor with which the work is done. The DB-API doesn't define what happens with the parameters, though, and although I considered a dictionary, having little experience of either OCI or Python internals I settled for passing in a list of all IN or IN/OUT parameters in the order that they are expected in the procedure call, and receiving a tuple of all OUT or IN/OUT parameters in the order they are listed in the procedure definition. That seems a bit strange, and obviously the programmer has to check the parameters in use carefully, but it isn't completely illogical. [1] http://assuwww.cern.ch/~pboddie/Personal/Interests/Python/Patch_for_oracledb.html Paul Boddie Paul.Boddie@cern.ch | http://assuwww.cern.ch/~pboddie | Any views expressed above are personal and not necessarily | shared by my employer or my associates. From mal@lemburg.com Thu Jun 4 13:31:08 1998 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 04 Jun 1998 14:31:08 +0200 Subject: [DB-SIG] DB-API 1.1 References: <199806041034.MAA23806@aisws7.cern.ch> Message-ID: <3576938C.40B8DF63@lemburg.com> Paul Boddie wrote: > > > Stored Procedures. As expected this causes troubles. > > > > AFAIK, it is possible to call stored procedures via the > > standard execute method. Input parameters get passed in > > via the parameter tuple, output goes into a result set. > > I don't think it is with oracledb, or at least not universally. In > any case, I think that a specific method to make it more usable is > worthwhile, otherwise you have to do things like: > > begin proc(...); end; > > That is how you might do the work in, say, sqlplus for Oracle, but it > isn't very nice. I always mess the syntax up somehow. ;-) That's why I propose to have the callproc() method do this for you. In Python, e.g.:

def callproc(self,procname,parameters=[]):
    # DB specific syntax here (Solid syntax in this case):
    placeholders = ('?,' * (len(parameters)-1))[:-1]
    s = '{?=call %s(%s)}' % (procname,placeholders)
    # Now back to the normal execute method; this will
    # possibly manipulate the parameter list in place.
    return self.execute(s,parameters)

As opposed to my earlier comment I'd like to suggest a different approach, though: Have the execute-method manipulate the parameter list in place. Input columns will be used as such, output columns may contain e.g. None as placeholder and in/output columns get their value replaced by the procedure's output. Some DBs seem to also make a result available sometimes... this could then be accessed via .fetchxxx(). This goes along the lines of what Paul's patch does, but without the hassles of always having to figure out the right order for input and output parameters. -- Marc-Andre Lemburg ---------------------------------------------------------------------- | Python Pages: http://starship.skyport.net/~lemburg/ | ------------------------------------------------------- From Paul Boddie Thu Jun 4 14:09:04 1998 From: Paul Boddie (Paul Boddie) Date: Thu, 4 Jun 1998 15:09:04 +0200 (MET DST) Subject: [DB-SIG] DB-API 1.1 Message-ID: <199806041309.PAA23882@aisws7.cern.ch> > Paul Boddie wrote: > > > > > Stored Procedures. As expected this causes troubles. > > > > > > AFAIK, it is possible to call stored procedures via the > > > standard execute method. Input parameters get passed in > > > via the parameter tuple, output goes into a result set. > > > > I don't think it is with oracledb, or at least not universally.
In > > any case, I think that a specific method to make it more usable is > > worthwhile, otherwise you have to do things like: > > > > begin proc(...); end; > > > > That is how you might do the work in, say, sqlplus for Oracle, but it > > isn't very nice. I always mess the syntax up somehow. ;-) > > That's why propose to have the callproc() method do this for you. > In Python, e.g.: > > def callproc(self,procname,parameters=[]): > # DB specific syntax here (Solid syntax in this case): > placeholders = ('?,' * (len(parameters)-1)) [:-1] > s = '{?=call %s(%s)}' % (procname,placeholders) > # Now back to the normal execute method; this will > # possibly manipulate the parameter list in place. > return self.execute(s,parameters) One problem I should have mentioned: for oracledb at least I think that there would need to be a bit of behind-the-scenes activity to ensure that the execute method "knows" what sort of object it is referring to, and that the fetch method "remembers" this information and uses the appropriate calls. I think that in OCI, the means of retrieval may be different. Actually, I did get a contribution which enabled execute methods to work with some kinds of stored procedures, and the person in question pointed out that the mechanisms dealing with procedures are a bit different. As I noted, I avoided combining the two and just found out how to call procedures from scratch, reusing as much of the binding code as possible. > As opposed to my earlier comment I'd like to suggest a different > approach, though: > > Have the execute-method manipulate the parameter list in place. > Input columns will be used as such, output columns may contain > e.g. None as placeholder and in/output columns get their value > replaced by the procedures output. > Some DBs seem to also make a result available sometimes... this > could then be accessed via .fetchxxx(). It's possible, but wouldn't that mean that one would have to set the list up first, so that one could access the list's members after the call? Obviously the only price to pay here is notational convenience, and if wrapped up in a callproc method, then it wouldn't affect anyone at all. > This goes along the lines of what Paul's patch does, but without > the hazzles of always having to figure out the right order for > input and output parameters. Yes, that's not all that nice, although the description of the procedure is enough to know the ordering and one just needs to make sure one hasn't missed an IN/OUT parameter or something when one feeds the data in as parameters, for example. A dictionary mapping names to values might be nice, but then we would need to obtain the names of parameters in order to do the matching, and I am not sure that this is possible using OCI. It would certainly add some complexity, anyway. I have given a point of view from the oracledb side (and my very basic OCI experience). Perhaps we could try and assess what the differences are between database systems. Paul Boddie Paul.Boddie@cern.ch | http://assuwww.cern.ch/~pboddie | Any views expressed above are personal and not necessarily | shared by my employer or my associates. From mlorton@slip.net Thu Jun 4 21:47:33 1998 From: mlorton@slip.net (Michael Lorton) Date: Thu, 4 Jun 1998 13:47:33 -0700 (PDT) Subject: [DB-SIG] DB-API 1.1 In-Reply-To: <4D0A23B3F74DD111ACCD00805F31D81005B04C36@red-msg-50.dns.microsoft.com> Message-ID: On Wed, 3 Jun 1998, Bill Tutt wrote: > > From: Jim Fulton [mailto:jim.fulton@Digicool.com] > > M.-A. Lemburg wrote: > > > > > > M.-A. 
Lemburg wrote: > > > > > > > Module Interface > > > > > > The database interface modules should typically be named with > > > something terminated by db. > > > > Why? > > > > Agreed, this really isn't necessary. I used "db" to indicate that it was DBI compliant, rather than a proprietary interface. > > > Existing examples are: oracledb, > > > informixdb, and pg95db. These modules should export several names: > > > modulename(connection_string_or_tuple) > > > > Why use the module name? Why not some descriptive name > > like: 'Connect'? > > > > Historical reasons, i.e. no good reason. Well, "good" is subjective. The original reason, which hasn't changed, is that the constructor must be manipulated by code that understands what RDBMSs are available, not by generic code. The rest of the API is intended to be used by generic code. M. From Tod Olson Thu Jun 4 22:14:38 1998 From: Tod Olson (Tod Olson) Date: Thu, 04 Jun 1998 16:14:38 -0500 Subject: [DB-SIG] DB-API 1.1 In-Reply-To: Your message of "Thu, 04 Jun 1998 11:23:40 +0200" References: <3576679C.42C087E@lemburg.com> Message-ID: <199806042114.QAA03194@stone.lib.uchicago.edu> >>>>> "M" == M -A Lemburg writes: M> Things to be considered obsolete: M> arraysize M> setinputsize() M> setoutputsize() M> Completely agree on punting these... fetchmany() would then M> have a mandatory argument instead of an optional one. I actually found arraysize to be useful when I implemented array binding in a module I was working on. I could allocate the space for the array binding when the user set the variable and avoid doing a malloc and free on every call to fetch*. That is, it was a nice hint for handling memory in a place where Python couldn't help me. M> Multiple result sets. M> Same problem as with stored procedures. There doesn't M> seem to be much of a standard and most DBs probably don't M> even support these. Sybase certainly does. It would be good to leave in nextset() for those who can use it. A result set object might be nice. If the database (e.g. Sybase) will only let you get the result sets in a specific order (i.e. no simultaneous iteration over mult. sets) it doesn't seem to be that big of an improvement. Does anyone use a database that can return multiple result sets to an SQL batch where the DB lets the programmer access the sets simultaneously? Tod A. Olson "How do you know I'm mad?" said Alice. ta-olson@uchicago.edu "If you weren't mad, you wouldn't have The University of Chicago Library come here," said the Cat. From mal@lemburg.com Fri Jun 5 09:39:56 1998 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 05 Jun 1998 10:39:56 +0200 Subject: [DB-SIG] DB-API 1.1 References: <199806041309.PAA23882@aisws7.cern.ch> Message-ID: <3577AEDC.7CD01E41@lemburg.com> Paul Boddie wrote: > > > Paul Boddie wrote: > > > > > > > Stored Procedures. As expected this causes troubles. > > > > > > > > AFAIK, it is possible to call stored procedures via the > > > > standard execute method. Input parameters get passed in > > > > via the parameter tuple, output goes into a result set. > > > > > > I don't think it is with oracledb, or at least not universally. In > > > any case, I think that a specific method to make it more usable is > > > worthwhile, otherwise you have to do things like: > > > > > > begin proc(...); end; > > > > > > That is how you might do the work in, say, sqlplus for Oracle, but it > > > isn't very nice. I always mess the syntax up somehow. ;-) > > > > That's why I propose to have the callproc() method do this for you.
> > In Python, e.g.:
> >
> > def callproc(self,procname,parameters=[]):
> >     # DB specific syntax here (Solid syntax in this case):
> >     placeholders = ('?,' * (len(parameters)-1))[:-1]
> >     s = '{?=call %s(%s)}' % (procname,placeholders)
> >     # Now back to the normal execute method; this will
> >     # possibly manipulate the parameter list in place.
> >     return self.execute(s,parameters)
>
> One problem I should have mentioned: for oracledb at least I think > that there would need to be a bit of behind-the-scenes activity to > ensure that the execute method "knows" what sort of object it is > referring to, and that the fetch method "remembers" this information > and uses the appropriate calls. I think that in OCI, the means of > retrieval may be different. The above code is just an example of how it could work, the intended behaviour being:

* call the procedure/function with the parameters list
* let the procedure manipulate the parameters, i.e. replacing OUT and IN/OUT entries with output values
* make any result set or multiple result set available through subsequent .fetchxxx()/.nextset() calls.

Is this manageable? It certainly is for ODBC and Solid. Don't have any experience with Oracle though. > Actually, I did get a contribution which enabled execute methods to > work with some kinds of stored procedures, and the person in question > pointed out that the mechanisms dealing with procedures are a bit > different. As I noted, I avoided combining the two and just found out > how to call procedures from scratch, reusing as much of the binding > code as possible. > > As opposed to my earlier comment I'd like to suggest a different > > approach, though: > > > > Have the execute-method manipulate the parameter list in place. > > Input columns will be used as such, output columns may contain > > e.g. None as placeholder and in/output columns get their value > > replaced by the procedure's output. > > Some DBs seem to also make a result available sometimes... this > > could then be accessed via .fetchxxx(). > > It's possible, but wouldn't that mean that one would have to set the > list up first, so that one could access the list's members after the > call? Right. You pass in the parameter list, let the procedure do whatever it needs to do with it and then have a look at the changed list to extract the IN/OUT and OUT variables. It's a very simple but effective way of doing it, IMHO. > Obviously the only price to pay here is notational convenience, > and if wrapped up in a callproc method, then it wouldn't affect anyone > at all. > > > This goes along the lines of what Paul's patch does, but without > > the hassles of always having to figure out the right order for > > input and output parameters. > > Yes, that's not all that nice, although the description of the > procedure is enough to know the ordering and one just needs to make > sure one hasn't missed an IN/OUT parameter or something when one feeds > the data in as parameters, for example. A dictionary mapping names to > values might be nice, but then we would need to obtain the names of > parameters in order to do the matching, and I am not sure that this is > possible using OCI. It would certainly add some complexity, anyway. IMHO, dictionaries wouldn't make life easier, but that may be a personal opinion: I am very much addicted to tuple and list unpacking -- one of the coolest things Python has to offer (among the hundreds of other nice features). > I have given a point of view from the oracledb side (and my very basic > OCI experience).
Perhaps we could try and assess what the differences > are between database systems. ODBC 2.0: Calling procedures is done through the normal execute method. You bind parameters prior to calling the execute API either as IN, IN/OUT or OUT and the API overwrites IN/OUT and OUT parameters with the procedure's output. Additionally, you can retrieve information about procedure columns (types, names, data flow direction, etc.). Result sets are extracted with the normal APIs that you also use for all other queries, e.g. SELECT. Solid uses ODBC 2.0 as API, so the same should work here. There's one quirk though: at least for Solid, procedures can have return values. This is notated as '{?=call MyProc(?,?,?)}' rather than '{call MyProc(?,?,?)}'. Maybe we should have a second cursor method called 'callfunc()' to clarify the difference, because otherwise the first value in the parameter list would always be an OUT parameter to be used for the procedure's return value (yet, in my normal understanding, procedures are not supposed to return anything -- only functions do). Question: should we allow multiple procedure calls to be done with one .callproc() invocation? Passing a list of lists could signal this way of handling it (just like passing a list of lists to cursor.execute()). -- Marc-Andre Lemburg ---------------------------------------------------------------------- | Python Pages: http://starship.skyport.net/~lemburg/ | ------------------------------------------------------- From mal@lemburg.com Fri Jun 5 09:42:56 1998 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 05 Jun 1998 10:42:56 +0200 Subject: [DB-SIG] DB-API 1.1 References: <3576679C.42C087E@lemburg.com> <199806042114.QAA03194@stone.lib.uchicago.edu> Message-ID: <3577AF90.25C2965C@lemburg.com> > > > > Module Interface > > > > > > > > The database interface modules should typically be named with > > > > something terminated by db. > > > > > > Why? > > > > > > > Agreed, this really isn't necessary. > > I used "db" to indicate that it was DBI compliant, rather than a > proprietary interface. We could have a nice icon for that... DB-API 1.x compatible or something... ;-) > > > > Existing examples are: oracledb, > > > > informixdb, and pg95db. These modules should export several names: > > > > modulename(connection_string_or_tuple) > > > > > > Why use the module name? Why not some descriptive name > > > like: 'Connect'? > > > > > > > Historical reasons, i.e. no good reason. > > Well, "good" is subjective. The original reason, which hasn't changed, is > that the constructor must be manipulated by code that understands what > RDBMSs are available, not be generic code. The rest of the API is > intended to be used by generic code. Having it named 'Connect' makes porting applications much easier: all you have to do is change the import (in the ideal case). -- Marc-Andre Lemburg ---------------------------------------------------------------------- | Python Pages: http://starship.skyport.net/~lemburg/ | ------------------------------------------------------- From mal@lemburg.com Fri Jun 5 09:53:32 1998 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 05 Jun 1998 10:53:32 +0200 Subject: [DB-SIG] DB-API 1.1 References: <3576679C.42C087E@lemburg.com> <199806042114.QAA03194@stone.lib.uchicago.edu> Message-ID: <3577B20C.6C776D89@lemburg.com> Tod Olson wrote: > > >>>>> "M" == M -A Lemburg writes: > > M> Things to be considered obsolete: > M> arraysize > M> setinputsize() > M> setoutputsize() > > M> Completely agree on punting these...
fetchmany() would then > M> have a mandatory argument instead of an optional one. > > I actually found arraysize to be useful when I implemented array > binding in a module I was working on. I could allocate the space for > the array binding when the user set the variable and avoid doing a > malloc and free on every call to fetch*. That is, it was a nice hint > for handling memory in a place where Python couldn't help me. But arraysize is only intended as default argument for fetchmany() -- what if the user decides s/he only wants arraysize/2 or arraysize*2 rows by passing an argument to fetchmany()? BTW: You can still optimize malloc/free's by reusing the arrays for every fetchmany()-call. Calling realloc normally isn't that costly. > M> Multiple result sets. > > M> Same problem as with stored procedures. There doesn't > M> seem to be much of a standard and most DBs probably don't > M> even support these. > > Sybase certainly does. It would be good to leave in nextset() for > those who can use it. Agreed. It can always return None for databases that don't provide the functionality. > A result set object might be nice. If the database (eg. Sybase) will > only let you get the result sets in a specific order (ie. no > simultaneous iteration over mult. sets) it doesn't seem to be that big > of an improvement. Does anyone use a database that can return > multiple result sets to an SQL batch where the DB lets the programmer > access the sets simultaneously? ODBC allows multiple result sets too. SQLMoreResults is the 1-1 interface for .nextset(). It only allows skipping to the next set in the list -- you don't have random access to all of them (can't even go back to a previous one). -- Marc-Andre Lemburg ---------------------------------------------------------------------- | Python Pages: http://starship.skyport.net/~lemburg/ | ------------------------------------------------------- From harri.pasanen@trema.com Fri Jun 5 10:57:07 1998 From: harri.pasanen@trema.com (Harri PASANEN) Date: Fri, 05 Jun 1998 11:57:07 +0200 Subject: [DB-SIG] DB-API 1.1 References: <3576679C.42C087E@lemburg.com> <199806042114.QAA03194@stone.lib.uchicago.edu> <3577AF90.25C2965C@lemburg.com> Message-ID: <3577C0F3.691339@trema.com> M.-A. Lemburg wrote: > > > > > > > > > Why use the module name? Why not some descriptive name > > > > like: 'Connect'? > > > > > > > > > > Historical reasons, i.e. no good reason. > > > > Well, "good" is subjective. The original reason, which hasn't changed, is > > that the constructor must be manipulated by code that understands what > > RDBMSs are available, not be generic code. The rest of the API is > > intended to be used by generic code. > > Having it named 'Connect' makes porting applications much > easier: all you have to do is change the import (in the ideal > case). > I'd just like to add my vote to suggestions for making the Connect parameter a dictionary. This way additional parameters could be easily supported, such as 'ApplicationName', 'PacketSize', etc. Regards, Harri From harri.pasanen@trema.com Fri Jun 5 11:15:55 1998 From: harri.pasanen@trema.com (Harri PASANEN) Date: Fri, 05 Jun 1998 12:15:55 +0200 Subject: [DB-SIG] DB-API 1.1 References: <3563FACF.7663D00E@lemburg.com> <356B3269.EDBD1D8@lemburg.com> <3575D003.588F@digicool.com> <3575DE05.710874E1@lemburg.com> Message-ID: <3577C55B.3B4EB954@trema.com> M.-A. Lemburg wrote: > > Jim Fulton wrote: > > > > M.-A. Lemburg wrote: > > > > > > M.-A.
Lemburg wrote: > > > description > > > > > > This read-only attribute is a tuple of 7-tuples. Each 7-tuple > > > contains information describing each result column: (name, > > > type_code, display_size, internal_size, precision, scale, > > > null_ok). This attribute will be None for operations that do not > > > return rows or if the cursor has not had an operation invoked via > > > the execute() method yet. > > > > > > The type_code is equal to one of the dbi type objects specified > > > in the section below. > > > > > > Note: this is a bit in flux. Generally, the first two items of > > > the 7-tuple will always be present; the others may be database > > > specific. > > > > This is bad. I suppose we are stuck with this for backwards > > compatibility. > > I guess so, too :( > Just for the record, I don't care if description is changed to something more descriptive in a non-backwards-compatible fashion. I would however like to get the exact metadata out of the DB; ctsybasemodule actually lets me do it by abusing description, as the old spec is so loose here. I would like to see a mapping from Python DB-API types to database native types, and the guarantee that dbi types across databases are equal, so that no data is lost if Python is used to bridge two databases. Also conversion from database type to Python and vice versa should never be lossy. I guess the problematic types are dates, and different decimal (money) types. I'm not sure if dbi-types should be a union of all supported database types, with clear separation of the set of types that represent the intersection of different db type sets. Truly portable programs would then just rely on this intersection, but db-specific munging would also be possible. Hmm... someone can maybe find a better wording for this. > > > > If I were designing this interface I would have description > > be a collection object that acted as both a sequence of > > column definitions and a mapping from column name to column > > definitions. I would have the column definitions be objects > > that have methods for standard attributes like name, and type > > (and maybe nullability, scale and precision) > > as well as optional attributes for things like display size and > > internal size. > > Especially those last two don't make too much sense nowadays > (probably did back when all output had to be formatted for 80- > column display/printers). > Actually the internal size is useful when estimating how much storage space is needed. Just another 2 centimes, Harri From c96pty@cs.umu.se Wed Jun 10 11:46:17 1998 From: c96pty@cs.umu.se (Peter Toneby) Date: Wed, 10 Jun 1998 12:46:17 +0200 (MET DST) Subject: [DB-SIG] mysqlmodule and *shared* Message-ID: As I didn't find anything about this in the readme or in the archives (didn't go too far back) I mention it here. To compile the mySQLmodule-0.1.4 using shared libs one has to add -lgcc in Modules/Setup for the mySQL module; otherwise, when importing mysqldb, you will get some unresolved symbol errors (__divdi3 and __moddi3).
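In Modules/Setup terms, Peter's fix amounts to appending -lgcc to the module's link line. A hypothetical entry -- the module and source file names follow the message above, and the include/library paths are invented:

*shared*
mysqldb mySQLmodule.c -I/usr/local/include/mysql -L/usr/local/lib/mysql -lmysqlclient -lgcc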
/Peter === Greetings from Peter Toneby c96pty@cs.umu.se === From gstein@exchange.microsoft.com Mon Jun 15 04:07:49 1998 From: gstein@exchange.microsoft.com (Greg Stein (Exchange)) Date: Sun, 14 Jun 1998 20:07:49 -0700 Subject: [DB-SIG] DB-API 1.1 Message-ID: <69D8143E230DD111B1D40000F848584004F3FAE8@ED> > From: M.-A. Lemburg [mailto:mal@lemburg.com] > Sent: Friday, June 05, 1998 1:54 AM > > Tod Olson wrote: > > > > >>>>> "M" == M -A Lemburg writes: > > > > M> Things to be considered obsolete: > > M> arraysize > > M> setinputsize() > > M> setoutputsize() > > > > M> Completely agree on punting these... fetchmany() would then > > M> have a mandatory argument instead of an optional one. > > > > I actually found arraysize to be useful when I implemented array > > binding in a module I was working on. I could allocate the > space for > > the array binding when the user set the variable and avoid doing a > > malloc and free on every call to fetch*. That is, it was a > nice hint > > for handling memory in a place where Python couldn't help me. This is exactly the reason for the introduction of these attributes / methods. > But arraysize is only intended as default argument for fetchmany() > -- what if the user decides s/he only wants arraysize/2 or arraysize*2 > rows by passing an argument to fetchmany() ? > > BTW: You can still optimize malloc/free's by reusing the arrays > for every fetchmany()-call. Calling realloc normally isn't that > costly. Nope. No can do. You want the size *before* calling execute(). For the most efficient array operation, you want to bind your input and output arrays, then prepare the statement (the prepare is done inside execute()). Finally, you start fetching results. This high-performance capability is well within the realm of capability of a Python module. The DB-API was designed with this high-perf possibility in mind for future growth. The fact that people haven't yet built it does not negate that the API is an appropriate design for DBAPI implementors to grow into. The fetchmany() has an argument only for the purpose of batching up a number of results to return to a Python program. It usually cannot interact with any array-binding possibilities. One of the pieces of input that I received from the Dec 95 conference was that people wanted the fetchmany/fetchall methods (fetchone was insufficient). The point was "usually you want them all, anyhow."
Practically speaking, though, fetchall() can get you in trouble if the particular query happened to return all 200 megs of your database. Therefore, fetchmany() was introduced to effectively say "give me all of them, but limit to 100 just in case." -g

From gstein@exchange.microsoft.com Mon Jun 15 04:26:38 1998 From: gstein@exchange.microsoft.com (Greg Stein (Exchange)) Date: Sun, 14 Jun 1998 20:26:38 -0700 Subject: [DB-SIG] DB-API 1.1 Message-ID: <69D8143E230DD111B1D40000F848584004F3FAE9@ED> Some historical rationale: > From: M.-A. Lemburg [mailto:mal@lemburg.com] > Sent: Thursday, June 04, 1998 2:24 AM > > Ok, I'll try to summarize what's been proposed so far [and my > comments on them ]: > > Ted Horst would like to see dictionaries used for connection > parameters and column descriptions. > ... > Jim wanted to know what kind of exception to raise for interface > related errors. I suggested InterfaceError with Error as base class. > ... > Stored Procedures. As expected this causes troubles. > > AFAIK, it is possible to call stored procedures via the > standard execute method. Input parameters get passed in > via the parameter tuple, output goes into a result set. > > Since databases tend to use different calling syntaxes, > the callproc() could take care of mapping the procedure > to the database specific syntax and then pass the parameters > on to the execute method to have the procedure call itself > executed. Results would then be available via the > result set. IN OUT parameters are not possible using this > scheme, but then: which Python type would you use for them > anyway... The differences between calling stored procs and the "normal" use of execute is why the two were separated. If there is a way to weasel execute() into doing it for you... great. callproc() was always a hand-wave. > Things to be considered obsolete: > arraysize > setinputsize() > setoutputsize() > > Completely agree on punting these... fetchmany() would then > have a mandatory argument instead of an optional one. See my other note. These should be considered optional, but should not be punted. Python can be used for high-performance DB mgmt operations (I proved this back in '96, against a guy who took two days to implement in C++ something that I did in 15 minutes with comparable execution speed). > begin() method. Jim mentioned this without too many details attached > to it. Michael responded with a transaction model. > > Michael's model is easily implemented in Python on top > of database cursors, as are some other techniques of getting > the OO look&feel into database handling. This would make a good > new project for the DB-SIG. Many of these higher-level operations were omitted from the design. The intent was to allow module implementors to expose "enough" so that Python modules could do the rest. [ the dtuple module is a good example of punting useful functionality up a level ] > Return values for DDL, DML and DQL statements. > > The only really useful return value defined is the > number of rows affected for DMLs. mxODBC provides two > new attributes for cursors: > rowcount - number of rows affected or in the result set > colcount - number of columns in the result set > We drop all the special return values and add these two > (or maybe just rowcount) attributes instead. Having the execute() function return an integer allows the execute() to be an atomic operation (for DMLs). Placing return values into attributes breaks this. I don't recall the origin of the 1 for a DDL statement.
You'd have to ask Lorton for that one... > Multiple result sets. > > Same problem as with stored procedures. There doesn't > seem to be much of a standard and most DBs probably don't > even support these. I think these snuck in with 1.1; I don't recall the nextset() method at all. > Mapping of database types to Python types. > > Since different databases support different types (including > some really exotic ones like PostgreSQL), I think the > general idea should be: preserve as much accuracy as you > can, e.g. if a money database type doesn't fit Python's integers, > use longs. The only type debatable, IMHO, is what to use > for long varchar columns (LONGS or BLOBS). The most common > usage for these is probably storing raw data, just like you > would in a disk file. Two Python types come into play here: > arrays (mutable) and strings (immutable). Strings provide > a much better access interface and are widely supported, so > I'd vote for strings. Raw columns were returned as dbiRaw instances so that they could be fed right back into an INSERT statement. In dbiRaw.value, you would find the native string object, however. The thing that we did want to avoid, however, is an exhaustive mapping within the API definition (currencies and the like). These would probably follow the dbi model, but each database handles these speciality types a bit differently. It becomes very cumbersome on the module implementor if you try to unify these types within the API. We felt it was a bit better to punt unification up to the Python level if it was required. > Having coltype compare '==' instead of 'is'. Bill doesn't like it. > ... > dbiDate. I suggested using DateTime types, Bill agrees, Jim probably > has his own set of types ;-) which he'd like to use. > > I'm biased, of course, but the DateTime types are readily > available and as easy to use on the user side as on the interface > programmer side. They provide a rich set of API functions > which they export via a CObject (meaning: no linking problems). > mxODBC uses them already. Though it also allows you to > choose two other ways of passing date/time values: as strings > and as tuples. This wasn't around when the DBAPI first came up. 'nuf said :-) > In general having more of the API return dedicated types with a > nice OO interface would be a really nice thing, but I fear that > most interface programmers are not willing to hack up two or > three other types in addition to connections and cursors. Major, big-time point. We had three modules that we wanted to build ourselves, not to mention modules that others wanted to build. Keeping the module slim and "close to the metal" was easier on us, and more in tune with Python's viewpoint on exposing APIs through modules (e.g. expose the natural interface, then place Python layers on that as necessary). -g

From gstein@exchange.microsoft.com Mon Jun 15 04:42:06 1998 From: gstein@exchange.microsoft.com (Greg Stein (Exchange)) Date: Sun, 14 Jun 1998 20:42:06 -0700 Subject: [DB-SIG] DB-API 1.1 Message-ID: <69D8143E230DD111B1D40000F848584004F3FAEA@ED> #2: the intent of the dbi module was that only one version existed on the planet, i.e. the db-sig itself owns it and the module implementors reference it, but don't redefine it. -g p.s. some later emails discussed dictionaries being passed to Connect(). I'd like to point out that this becomes difficult for the user to specify in a one-liner. Keyword args are pretty nice, but a bitch and a half to deal with from the C side of things.
Allowing a database-defined set of args or a string is easiest on the module implementor. This variance between Connect() calls is also the reason that Michael Lorton specified that the import and connection to a database are specific, while the rest of the API can be generic; therefore, the use of the module name as the connection function. [ it is assumed as a fact that each database has different parameter sets and definitions for a connection. ] > -----Original Message----- > From: M.-A. Lemburg [mailto:mal@lemburg.com] > Sent: Thursday, May 21, 1998 2:59 AM > To: DB-SIG @ Python.org > Subject: [DB-SIG] DB-API 1.1 > > > I'd like to start discussing updating/extending the DB-API 1.0 to > a new version. IMHO, this needs to be done to > a. clearify a few minor tidbits > b. enable a more informative exception mechanism > > You can have a look at an alpha version of 1.1 at: > http://starship.skyport.net/~lemburg/DatabaseAPI-1.1.html > > Major additions are the new dbi-exception classes (which use the > 1.5 class exception mechanism, thus allowing to catch all DB-related > exceptions with the Error-base class, or just a specific subclass > of it), .nextset() and a few statements about the return values > of the .fetch*() methods in case there's nothing left to be fetched. > > Some other suggestions: > > 1. Instead of defining the connection constructor to be named > <modulename>(...), I think Connect(...) is a better choice (helps porting > applications from one DB to another). > > 2. We should come up with a reasonable package structure for database > access, e.g. to give the discussion a starting point: >
>     [Database]
>         [<database>]
>             [dbi]
>         [<database>]
>             [dbi]
>
> You'd then write: >
>     from Database.Oracle import *
>     db = Connect('...')
>     c = db.cursor()
>     c.execute('...', dbi.dbiDate(...))
>
> When porting to another database, only 'Oracle' would have to be changed > to the other DB's name (in the ideal case ;-). > > 3. cursor.description should be well defined, always return 7-tuples, > but allow None to be passed as synonym for 'data not available'. > > 4. Fix some standard for date/time values. I won't comment, > since I am a little biased on this one ;-) > > Awaiting your comments. > > -- > Marc-Andre Lemburg > ---------------------------------------------------------------------- > | Python Pages: http://starship.skyport.net/~lemburg/ | > ------------------------------------------------------- > > > > _______________________________________________ > DB-SIG maillist - DB-SIG@python.org > http://www.python.org/mailman/listinfo/db-sig >

From mal@lemburg.com Mon Jun 15 10:03:13 1998 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 15 Jun 1998 11:03:13 +0200 Subject: [DB-SIG] DB-API 1.1 References: <69D8143E230DD111B1D40000F848584004F3FAE8@ED> Message-ID: <3584E351.5EDC8D29@lemburg.com> Greg Stein (Exchange) wrote: > > > From: M.-A. Lemburg [mailto:mal@lemburg.com] > > Sent: Friday, June 05, 1998 1:54 AM > > > > Tod Olson wrote: > > > > > > >>>>> "M" == M -A Lemburg writes: > > > > > > M> Things to be considered obsolete: > > > M> arraysize > > > M> setinputsize() > > > M> setoutputsize() > > > > > > M> Completely agree on punting these... fetchmany() would then > > > M> have a mandatory argument instead of an optional one. > > > > > > I actually found arraysize to be useful when I implemented array > > > binding in a module I was working on. I could allocate the > > space for > > > the array binding when the user set the variable and avoid doing a > > > malloc and free on every call to fetch*.
That is, it was a > > nice hint > > > for handling memory in a place where Python couldn't help me. > > This is exactly the reason for the introduction of these attributes / > methods. > > > But arraysize is only intended as default argument for fetchmany() > > -- what if the user decides s/he only wants arraysize/2 or arraysize*2 > > rows by passing an argument to fetchmany() ? > > > > BTW: You can still optimize malloc/free's by reusing the arrays > > for every fetchmany()-call. Calling realloc normally isn't that > > costly. > > Nope. No can do. You want the size *before* calling execute(). For the most > efficient array operation, you want to bind your input and output arrays, > then prepare the statement (the prepare is done inside execute()). Finally, > you start fetching results. I'm not sure where that gets you any added performance: you are aiming at 'execute one big SQL statement and make that as fast as possible' (right ?), so the scenario could be handled like this:

· the execute method looks at the length of the argument list and builds an array from that *before* binding and executing the statement itself
· next, fetchmany(size) is called; this allocates size packets of output columns and then loads the data from the database; the next call to fetchmany(size) will reuse the already allocated output array; freeing of the array is done whenever a new execute is done or the cursor is closed

Of course, leaving the 3 APIs in the spec won't hurt anyone, so we might as well keep them (we would have to provide dummies for them anyway -- just to be backward compatible). > > Return values for DDL, DML and DQL statements. > > > > The only really useful return value defined is the > > number of rows affected for DMLs. mxODBC provides two > > new attributes for cursors: > > rowcount - number of rows affected or in the result set > > colcount - number of columns in the result set > > We drop all the special return values and add these two > > (or maybe just rowcount) attributes instead. > > Having the execute() function return an integer allows the execute() to be > an atomic operation (for DMLs). Placing return values into attributes breaks > this. Hmm, I wanted to make the interface design a little easier for the programmer. The problem is figuring out what type of statement the execute method was passed. For mxODBC, I came up with this hack:

    /* Figure out the return value and return */
    if (dbcs->colcount == 0) {
        if (dbcs->rowcount == 0)
            /* must be a DDL like CREATE */
            return PyInt_FromLong(1L);
        else
            /* must be a DML like UPDATE */
            return PyInt_FromLong(dbcs->rowcount);
    }
    else
        /* must be a DQL like SELECT */
        Py_ReturnNone();

How about this: we leave the return codes as they are, but instead of defining them via DML, DDL, DQL we use the above scheme (it should be equivalent, but produces fewer headaches on the programmer's side). Plus, we add rowcount as attribute. Do all supported databases provide an API for rowcount after an execute ? > Multiple result sets. > > > > Same problem as with stored procedures. There doesn't > > seem to be much of a standard and most DBs probably don't > > even support these. > > I think these snuck in with 1.1; I don't recall the nextset() method at all. They did ;-) > > Mapping of database types to Python types. > > > > Since different databases support different types (including > > some really exotic ones like PostgreSQL), I think the > > general idea should be: preserve as much accuracy as you > > can, e.g.
if a money database type doesn't fit Python's integers, > > use longs. The only type debatable, IMHO, is what to use > > for long varchar columns (LONGS or BLOBS). The most common > > usage for these is probably storing raw data, just like you > > would in a disk file. Two Python types come into play here: > > arrays (mutable) and strings (immutable). Strings provide > > a much better access interface and are widely supported, so > > I'd vote for strings. > > Raw columns were returned as dbiRaw instances so that they could be fed > right back into an INSERT statement. In dbiRaw.value, you would find > the native string object, however. That will have to be specified though. > The thing that we did want to avoid, however, is an exhaustive mapping > within the API definition (currencies and the like). These would probably > follow the dbi model, but each database handles these speciality types a bit > differently. It becomes very cumbersome on the module implementor if you try > to unify these types within the API. We felt it was a bit better to punt > unification up to the Python level if it was required. > > > In general having more of the API return dedicated types with a > > nice OO interface would be a really nice thing, but I fear that > > most interface programmers are not willing to hack up two or > > three other types in addition to connections and cursors. > > Major, big-time point. We had three modules that we wanted to build > ourselves, not to mention modules that others wanted to build. Keeping the > module slim and "close to the metal" was easier on us, and more in tune with > Python's viewpoint on exposing APIs through modules (e.g. expose the natural > interface, then place Python layers on that as necessary). Right. The OO layer can easily be built on top of cursors and from what I've heard there are some of these layers just waiting to be announced ;-) > > 2. We should come up with a reasonable package structure for database > > access, e.g. to give the discussion a starting point: > >
> >     [Database]
> >         [<database>]
> >             [dbi]
> >         [<database>]
> >             [dbi]
> >
> > You'd then write: > >
> >     from Database.Oracle import *
> >     db = Connect('...')
> >     c = db.cursor()
> >     c.execute('...', dbi.dbiDate(...))
> >
> > When porting to another database, only 'Oracle' would have to be changed > > to the other DB's name (in the ideal case ;-). > > The intent of the dbi module was that only one version existed on the > planet, i.e. the db-sig itself owns it and the module implementors reference > it, but don't redefine it. I think reality got it the other way around ;) which isn't that bad, since the interface programmer has more freedom that way. There could still be a generic dbi module though... the above scheme doesn't necessarily imply that dbi has to be included by the database package (it could simply be an imported generic submodule). > p.s. some later emails discussed dictionaries being passed to Connect(). I'd > like to point out that this becomes difficult for the user to specify in a > one-liner. Keyword args are pretty nice, but a bitch and a half to deal with > from the C side of things. Allowing a database-defined set of args or a > string is easiest on the module implementor. This variance between Connect() > calls is also the reason that Michael Lorton specified that the import and > connection to a database are specific, while the rest of the API can be > generic; therefore, the use of the module name as the connection function.
> [ it is assumed as a fact that each database has different parameter sets > and definitions for a connection. ] Great idea. It maintains backward compatibility with multi-argument Connect() implementations while still providing a dictionary-like interface. BTW: Handling keywords isn't really necessary for the C module. Simply wrap the C module in a Python module and define the Connect() function there. It can then preprocess the arguments and pass the results to a low-level connect function defined in the C module. [I use this technique a lot and with good results, e.g. you can prototype interfaces in Python and then move them down one level when things have settled.] I'll have an edited version of the 1.1 spec up online in a few days and then ring in the second round... Thanks for all your comments. -- Marc-Andre Lemburg Y2000: 564 days left --------------------------------------------------------------------- : Python Pages >>> http://starship.skyport.net/~lemburg/ : ---------------------------------------------------------

From Paul Boddie Fri Jun 19 12:46:57 1998 From: Paul Boddie (Paul Boddie) Date: Fri, 19 Jun 1998 13:46:57 +0200 (MET DST) Subject: [DB-SIG] oracledb commit behaviour Message-ID: <199806191146.NAA28283@aisws7.cern.ch> Hello, I think that I have uncovered some strange behaviour with oracledb-0.1.3 and exception handling. I have a few scripts where the following type of thing is done:

    cursor = connection.cursor()
    for i in something:
        try:
            cursor.execute(...)
        except SomeException:
            ...
    cursor.close()
    print "The end."
    connection.commit()
    connection.close()

The problem is that if an unhandled exception occurs, a commit still seems to get done! For example, I can pass around the loop once, and one execute gets done, but then on the second pass an exception (other than SomeException) occurs, and although the message "The end." never gets printed, the commit still seems to take place. This is most alarming. Has anyone seen anything like this before? I'm using oracledb-0.1.3 (with patches only to add some other, unused features in this case), Oracle 7.3.3 and Python 1.5.1. I have also tried the same thing with an unpatched version of oracledb-0.1.3, and an unpatched version compiled using older Oracle 7.1 libraries (albeit for Python 1.4). The only thing that I can see is that ologof is being called when the connection is destructed. If this is the case, then it may be that a commit is being done implicitly (see 2-27, Oracle Call Interface Release 7.3 Programmer's Guide). Any ideas, suggestions, analogous situations? Paul Boddie Paul.Boddie@cern.ch | http://assuwww.cern.ch/~pboddie | Any views expressed above are personal and not necessarily | shared by my employer or my associates.

From mal@lemburg.com Fri Jun 19 11:55:27 1998 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 19 Jun 1998 12:55:27 +0200 Subject: [DB-SIG] DB API 1.1a3 Message-ID: <358A439F.562FB631@lemburg.com> I've uploaded a new edited version of the API 1.1 specification. It includes most of the things and modifications we have discussed in the past weeks (I probably forgot some): http://starship.skyport.net/~lemburg/DatabaseAPI-1.1.html The 1.0 version can be found at: http://www.python.org/sigs/db-sig/DatabaseAPI.html Comments ?
-- Marc-Andre Lemburg Y2000: 560 days left --------------------------------------------------------------------- : Python Pages >>> http://starship.skyport.net/~lemburg/ : ---------------------------------------------------------

From jeremy@cnri.reston.va.us Fri Jun 19 16:12:47 1998 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Fri, 19 Jun 1998 11:12:47 -0400 (EDT) Subject: [DB-SIG] authoritative list of Python database modules Message-ID: <13706.31270.860152.989255@bitdiddle.cnri.reston.va.us> I have spent the last 30 minutes trying to find an authoritative list of database modules. I've become convinced that there isn't one, or at least that there are enough impostors that it wouldn't be possible to recognize the authoritative list. I volunteer to assemble an authoritative list, if people will help me gather the information. I expect an authoritative list to include all the known database modules, including versions of Python and the database that are known to work. The current state of affairs is downright confusing! I'm aware of three pages that list database software: http://www.python.org/download/Contributed.html#Database http://www.python.org/ftp/python/contrib/Database/ http://www.python.org/sigs/db-sig/status.html Unfortunately, each page seems to list a different subset of modules and many of the versions pointed to seem to be out of date. I tried a few searches. A search for "database module" turns up the db-sig status page, which is somewhat helpful. Other searches, including "database", "oracle database", and "postgres module" produce less helpful results. (Although searching for "oracle database" does produce the ora_mod readme file as the first hit -- and the ora_mod readme file points to the Digital Creations oracle module.) Here's the list I've come up with, along with a pointer to what looks like the most recent version and its date. Please send me any additions or corrections. If you know what versions of Python or the database these modules work with, let me know; in particular, do they work with 1.5?

Oracle
    ftp://ftp.digicool.com/pub/releases/unsupported/oracle
    June 24, 1997

Informix
    http://www.python.org/ftp/python/contrib/Database/informixdb.tar.gz
    Aug. 27, 1997

Sybase
    http://www.python.org/ftp/python/contrib/Database/sybasemodule.tar.gz
    Feb. 28, 1997

ODBC (???)
    http://www.python.org/windows/win32/odbc.html
    (Is it possible to use ODBC without PythonWin?)
    (What about mxODBC?)

Solid
    ftp://ftp.gams.at/pub/Solid/Python/SolidPython-0.0.7.tar.gz
    Nov. 17, 1997

Gadfly
    http://starship.skyport.net/crew/aaron_watters/kwParsing/gadfly.html

MySQL
    http://www.python.org/ftp/python/contrib/Database/mySQLmodule-0.1.4.tar.gz
    Oct. 9, 1997

mSQL
    http://www.ollie.clive.ia.us/jeff/python/msql/
    Feb. 20, 1997

Postgres (v 6.2)
    http://www.druid.net/pygresql/
    Dec. 23, 1997

Jeremy

From mal@lemburg.com Fri Jun 19 10:30:08 1998 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 19 Jun 1998 11:30:08 +0200 Subject: [DB-SIG] ANN: mxODBC Package - Version 0.8.0 Message-ID: <358A2FA0.1F3ACF73@lemburg.com>

ANNOUNCING: mxODBC Version 0.8.0 -- a Python extension package providing a generic interface to ODBC 2.0 API compliant database drivers or managers

WHAT IT IS: mxODBC is an extension package that provides a Python Database API compliant interface to ODBC 2.0 capable database drivers and managers. In addition to the capabilities provided through the standard API, it also provides a rich set of catalog methods that allow you to scan the database for tables, procedures, etc.
Furthermore, it uses the mxDateTime package for date/time value interfacing, eliminating most of the problems these types normally introduce (other in/output formats are available too).

WHAT'S NEW ? The 0.8.0 version was redesigned to be multi-database enabled. This means that you can access and work with several databases at the same time, e.g. to copy data from one database to another, reformatting it along the way. The new package structure also makes pre-configured setups possible. Included are setups for SuSE ADABAS, MySQL and Solid Server. These should run out of the box (after some minor path adjustments). I'm still looking for setups: most wanted are Oracle, MS ODBC Manager and DB/2.

WHERE CAN I GET IT ? The full documentation, copyright information and instructions for downloading and installing can be found at: http://starship.skyport.net/~lemburg/mxODBC.html The mxDateTime package needed for mxODBC can be found at: http://starship.skyport.net/~lemburg/mxDateTime.html

WHAT DOES IT COST ? If you want to redistribute the package for commercial use, you'll have to get my permission first. Any other usage is free.

-- Marc-Andre Lemburg Y2000: 562 days left --------------------------------------------------------------------- : Python Pages >>> http://starship.skyport.net/~lemburg/ : ---------------------------------------------------------
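[A sketch of the copy-one-database-to-another use case mentioned in the announcement, written against the plain DB-API. The module names, DSN strings, table layout and the '?' parameter marker are all illustrative assumptions; the list-of-tuples form of execute() is the multi-row insert mentioned in the API spec.]

    import srcdb, dstdb    # hypothetical DB-API modules

    src = srcdb.srcdb('source-dsn/user/passwd')    # DB-API 1.0: the constructor
    dst = dstdb.dstdb('dest-dsn/user/passwd')      # carries the module's name

    c_in = src.cursor()
    c_out = dst.cursor()
    c_in.execute('SELECT id, name, amount FROM accounts')
    rows = c_in.fetchall()
    # reformat along the way, e.g. for a database that wants strings
    rows = map(lambda r: (r[0], r[1], str(r[2])), rows)
    # multi-row insert via a list of parameter tuples
    c_out.execute('INSERT INTO accounts VALUES (?, ?, ?)', rows)
    dst.commit()
    src.close()
    dst.close()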
From mal@lemburg.com Mon Jun 22 21:15:12 1998 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 22 Jun 1998 22:15:12 +0200 Subject: [DB-SIG] authoritative list of Python database modules References: <13706.31270.860152.989255@bitdiddle.cnri.reston.va.us> Message-ID: <358EBB50.6A931B37@lemburg.com> Jeremy Hylton wrote: > > I have spent the last 30 minutes trying to find an authoritative list > of database modules. I've become convinced that there isn't one, or > at least that there are enough impostors that it wouldn't be possible > to recognize the authoritative list. > > I volunteer to assemble an authoritative list, if people will help me > gather the information. I expect an authoritative list to include all > the known database modules, including versions of Python and the > database that are known to work. Great idea. (BTW: You should also include information on DB API compliance for each module.) > ODBC (???) > http://www.python.org/windows/win32/odbc.html > (Is it possible to use ODBC without PythonWin?) AFAIK, the win32 odbc module only works with the MS ODBC Manager on WinNT/95. > (What about mxODBC?) mxODBC can be found at: http://starship.skyport.net/~lemburg/mxODBC.html It links directly against any ODBC 2.0 lib on Unix and is known to work (concurrently) with:

    SuSE Adabas D
    Solid
    MySQL
    iODBC (the free Unix ODBC manager)

It probably also compiles against:

    Oracle with Intersolv ODBC drivers
    MS ODBC Manager (freely downloadable)
    Many other databases with ODBC drivers for Unix

Current version: 0.8.0 (June 1998) It is DB API 1.0 compliant and will soon be DB API 1.1 compliant too. -- Marc-Andre Lemburg Y2000: 557 days left --------------------------------------------------------------------- : Python Pages >>> http://starship.skyport.net/~lemburg/ : ---------------------------------------------------------

From MHammond@skippinet.com.au Tue Jun 23 01:37:35 1998 From: MHammond@skippinet.com.au (Mark Hammond) Date: Tue, 23 Jun 1998 10:37:35 +1000 Subject: [DB-SIG] authoritative list of Python database modules Message-ID: <009001bd9e3f$d5adb020$0a01a8c0@skippy.skippinet.com.au> >Great idea. (BTW: You should also include information on DB API >compliance for each module.) > >> ODBC (???) >> http://www.python.org/windows/win32/odbc.html >> (Is it possible to use ODBC without PythonWin?) > >AFAIK, the win32 odbc module only works with the MS ODBC Manager >on WinNT/95. I have no idea how true that is - I didn't write it :-) But the ODBC distributed with Pythonwin is not at all dependent on Pythonwin. It requires only "dbi.pyd" and "odbc.pyd" to run. Pythonwin is just a convenient vehicle for distributing it, and keeping it up to date... Mark.

From panda@peace.com.my Tue Jun 23 05:01:31 1998 From: panda@peace.com.my (chas) Date: Tue, 23 Jun 1998 12:01:31 +0800 (SGT) Subject: [DB-SIG] authoritative list of Python database modules Message-ID: <3.0.32.19980623122319.0092ead0@peace.com.my> >MySQL >http://www.python.org/ftp/python/contrib/Database/mySQLmodule-0.1.4.tar.gz >Oct.
9, 1997 New version: 13 June 1998, Version 1.1 http://snail.earthlight.co.nz/projects.html Good luck with the list, chas

From mal@lemburg.com Tue Jun 23 14:45:21 1998 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 23 Jun 1998 15:45:21 +0200 Subject: [DB-SIG] DB API 1.1a3 References: <358A439F.562FB631@lemburg.com> Message-ID: <358FB171.1F88E69A@lemburg.com> [This is a repost... the original message seems to have gotten lost somewhere in the weekend's DNS mixup at python.org] I've uploaded a new edited version of the API 1.1 specification. It includes most of the things and modifications we have discussed in the past weeks (I probably forgot some): http://starship.skyport.net/~lemburg/DatabaseAPI-1.1.html The 1.0 version can be found at: http://www.python.org/sigs/db-sig/DatabaseAPI.html Comments ? -- Marc-Andre Lemburg Y2000: 556 days left --------------------------------------------------------------------- : Python Pages >>> http://starship.skyport.net/~lemburg/ : ---------------------------------------------------------

From jim.fulton@Digicool.com Tue Jun 23 16:45:15 1998 From: jim.fulton@Digicool.com (Jim Fulton) Date: Tue, 23 Jun 1998 11:45:15 -0400 Subject: [DB-SIG] Oracle no longer wants positional arguments! Message-ID: <358FCD8B.5728@digicool.com> I happen to be working on a new Oracle module that needs to work with Oracle 8 and I noticed something interesting. The Oracle OCI call, obindrn, that lets you use positional placeholders, as in:

    select * from spam
    where foo=:1

is now considered "obsolete"! Oracle wants you to use named variables, like:

    select * from spam
    where foo=:foo

Apparently, you were already (at least in Oracle 7) not allowed to use positional placeholders in PL/SQL blocks. Oops. ;-) Maybe we should think about a less position-oriented interface, perhaps using keyword arguments or a dictionary to pass values into execute. (Note that I'd still prefer a more function-like interface, as in:

    f = connection.prepare(some_sql)
    result = f(arg1=v1, arg2=v2)

) Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.

From Paul Boddie Tue Jun 23 17:03:59 1998 From: Paul Boddie (Paul Boddie) Date: Tue, 23 Jun 1998 18:03:59 +0200 (MET DST) Subject: [DB-SIG] Oracle no longer wants positional arguments! Message-ID: <199806231603.SAA21485@aisws7.cern.ch> Jim Fulton wrote: > I happen to be working on a new Oracle module that needs to > work with Oracle 8 and I noticed something interesting. Right! I thought someone was working on a new module. Now I know I didn't just make it up! > The Oracle OCI call, obindrn, that lets you use positional > placeholders, as in: > > select * from spam > where foo=:1 > > is now considered "obsolete"! I noticed that named placeholders were supported in OCI, and wondered why they weren't utilised by oracledb... > Oracle wants you to use named variables, like: > > select * from spam > where foo=:foo > > Apparently, you were already (at least in Oracle 7) > not allowed to use positional placeholders in PL/SQL > blocks. > > Oops. ;-) In PL/SQL probably not, but it is possible to call PL/SQL procedures using OCI with numbered placeholders.
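[As an aside, the difference at the Python level might look like this rough sketch. The keyword-accepting execute() shown for the named form is the interface being mooted in this thread, not an existing DB-API feature; names and values are invented.]

    # positional placeholders, bound by position:
    c.execute('select * from spam where foo=:1', (v1,))

    # named placeholders, bound by name -- hypothetical interface:
    c.execute('select * from spam where foo=:foo', foo=v1)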
> Maybe we should think about a less position-oriented interface, > perhaps using keyword arguments or a dictionary > to pass values into execute. > > (Note that I'd still prefer a more function-like > interface, as in: > > f=connection.prepare(some_sql) > result=f(arg1=v1, arg2=v2) > ) Hmmm... I think that (naming and notation issues introduced in the example above aside) dictionaries would provide a fairly convenient interface to the parameters of stored procedures, which I was working on for oracledb-0.1.3. I posted a bit of a concern about the disconnection/commit relationship in oracledb. Are you in a position to comment about that? (Given that the oracledb module is unsupported, of course.) Paul Boddie Paul.Boddie@cern.ch | http://assuwww.cern.ch/~pboddie | Any views expressed above are personal and not necessarily | shared by my employer or my associates.

From klm@python.org Tue Jun 23 17:08:06 1998 From: klm@python.org (Ken Manheimer) Date: Tue, 23 Jun 1998 12:08:06 -0400 (EDT) Subject: [DB-SIG] Lost maillist messages thursday eve (6/18) to friday noon (6/19) Message-ID: <199806231608.MAA13071@glyph.cnri.reston.va.us> Some of you may have lost postings sent to one of the following maillists between last Thursday (6/18) evening and Friday (6/19) noon:

    Mailman-Developers (1 msg)
    Matrix-SIG (8 msgs)
    DB-SIG (3 msgs)
    Doc-SIG (4 msgs)
    Pythonmac-SIG (3 msgs)
    XML-SIG (1 msg)
    Trove-Dev (6 msgs)

This happened accompanying an upgrade of our maillist software, mailman, due to a bad interaction between a new mailman feature and an anti-spam (anti-relay) mechanism applied to python.org's sendmail configuration. This problem did not show up during testing because our test recipients were all local, and not subject to the anti-relay constraints. If you sent something to any of these lists during that time frame and never saw it show up, you may want to resend. Archiving was not affected, so you should be able to find the messages in the maillist archives. People receiving the lists in digest format were not affected, since the delivery problem was fixed before the digest delivery time. My apologies for the disruption! Ken Manheimer klm@python.org 703 620-8990 x268 Corporation for National Research Initiatives # If you appreciate Python, consider joining the PSA! #

From jim.fulton@Digicool.com Tue Jun 23 17:40:26 1998 From: jim.fulton@Digicool.com (Jim Fulton) Date: Tue, 23 Jun 1998 12:40:26 -0400 Subject: [DB-SIG] PROPOSAL: Portable Argument Format Message-ID: <358FDA7A.12D@digicool.com> I'd like to lobby for a portable argument format for the DBI interface. While this *does* require parsing SQL, this is not really all that hard and I think the benefits are well worth the effort. I volunteer to provide a utility to assist with this (a rough sketch of such a scanner follows below). Here's what I think the format needs to do:

- Not interfere with SQL. That is, it must be unambiguous to find parameters in SQL.
- Support optional argument names, which may be given a positional interpretation.
- Capture type information, to make types explicit.

Maybe this is all that's needed. I propose the following format:

    :(name)code

where : signals a parameter and code is a type code. Valid type codes are:

    c, b, B, h, H, i, I, l, L, f, d, and s -- as defined in the struct module
    t -- Date/Time
    $ -- Money (???)
    r -- Binary data (raw/blob) gotten from a string
    others...???
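[The promised utility never appears in this thread, but a rough sketch of a scanner for the proposed format might look like the following. The regex, the generated '_n' names and the rewrite to ODBC-style '?' markers follow the proposal; quoted SQL literals are ignored here, which a real implementation would have to handle.]

    import re, string

    # matches :code and :(name)code in the proposed portable format
    _param = re.compile(r':(\((?P<name>[A-Za-z_][A-Za-z0-9_]*)\))?'
                        r'(?P<code>[cbBhHiIlLfdstr$])')

    def scan(sql):
        """Return (odbc_sql, params) where params is a list of
        (name, typecode) pairs in order of appearance."""
        parts, params, unnamed, pos = [], [], 0, 0
        while 1:
            m = _param.search(sql, pos)
            if m is None:
                parts.append(sql[pos:])
                break
            parts.append(sql[pos:m.start()])
            parts.append('?')              # ODBC-style marker
            name = m.group('name')
            if name is None:               # unnamed: generate _0, _1, ...
                name = '_%d' % unnamed
                unnamed = unnamed + 1
            params.append((name, m.group('code')))
            pos = m.end()
        return string.join(parts, ''), params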
Note that, after some consideration, I decided to use ':' rather than '%' to signal a parameter, because:

- % has a meaning in string substitution that is too similar and not similar enough to sql parameters. I don't want to worry about conflicting or confused type codes.
- One might want to use string formatting to generate sql containing parameter references (without wanting to escape %s).
- I was too lazy to try to determine if % could be confused with some SQL syntax element and figured that : was OK because it was used by Oracle. :-)

I'm not wed to ':'. '?' would be OK too. The name is optional and defaults to a generated name, _x, where x is the index of the argument in the unnamed arguments. For example:

    select * from spam where foo=:s and bar=:d

is equivalent to:

    select * from spam where foo=:(_0)s and bar=:(_1)d

Note that positional arguments could be given explicitly and intermixed with non-positional arguments, as in:

    select * from spam where foo=:(_1)s and bar=:(bar)d and baz=:i

is equivalent to:

    select * from spam where foo=:(_1)s and bar=:(bar)d and baz=:(_0)i

Note that arguments may be repeated, as in:

    select * from spam
    where x1 > :(minx)i and x2 > :(minx)i and
          w == :s and a == :(_1)i and b < :(_1)i

When used with ODBC, this would be converted to:

    select * from spam
    where x1 > ? and x2 > ? and w == ? and a == ? and b < ?

and two of the arguments would have to be bound twice. I propose that non-positional arguments be assigned positions according to their order of appearance, with positional arguments ordered before non-positional arguments. So in the example above, the arguments and their positions would be: _0 at position 0, _1 at position 1, and minx at position 2. I propose that arguments be treated in a similar fashion to Python function arguments, allowing either positional or non-positional actual arguments. For example, the signature of the above example would be "_0, _1, minx". Someone could pass the parameters like this:

    sql = ("select * from spam "
           "where x1 > :(minx)i and x2 > :(minx)i and "
           "      w == :s and a == :(_1)i and b < :(_1)i")
    c.execute(sql, 'eggs', 10, 20)
    c.execute(sql, 'eggs', 10, minx=20)
    c.execute(sql, _0='eggs', minx=20, _1=10)

I think that this proposal has a number of advantages over the current scheme:

- It provides greater portability.
- It provides some measure of type safety.
- It supports a more user-friendly calling mechanism (e.g. keyword arguments).
- It is more powerful than some database-specific mechanisms. For example, ODBC's mechanism forces parameters that are *used* more than once to be bound more than once.

The only real downside is that module developers may have a bit more work to do. I'll volunteer to reduce the work required by providing a utility that parses an SQL statement and returns a new SQL statement that uses a database-specific format and that provides information needed to bind parameters. Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.

From jim.fulton@Digicool.com Tue Jun 23 17:51:30 1998 From: jim.fulton@Digicool.com (Jim Fulton) Date: Tue, 23 Jun 1998 12:51:30 -0400 Subject: [DB-SIG] Oracle no longer wants positional arguments!
References: <199806231603.SAA21485@aisws7.cern.ch> Message-ID: <358FDD12.3BC4@digicool.com> Paul Boddie wrote: > > Jim Fulton wrote: > > > I happen to be working on a new Oracle module that needs to > > work with Oracle 8 and I noticed something interesting. > > Right! I thought someone was working on a new module. Now I know I didn't just > make it up! I think that Anthony Baxter was working on a new oracle module too. I suspect that that is who you were thinking of. Alas, it looks like I'm not going to be able to release (at least not for free) the oracle module I'm working on now. (Waaaaaa!) > > The Oracle OCI call, obindrn, that lets you use positional > > placeholders, as in: > > > > select * from spam > > where foo=:1 > > > > is now considered "obsolete"! > > I noticed that named placeholders were supported in OCI, and wondered why they > weren't utilised by oracledb... > > > Oracle wants you to use named variables, like: > > > > select * from spam > > where foo=:foo > > > > Aparently, you were already (at least in Oracle 7) > > not allowed to use positional placeholders in PL/SQL > > blocks. > > > > Oops. ;-) > > In the PL/SQL probably not, but it is possible to call PL/SQL procedures using > OCI with numbered placeholders. But in Oracle 8, the ordered placeholder binding function, obindrn, is flagged as "obsolete". I take this to mean that obindrn is "deprecated" and should be avoided. > > Maybe we should think about a less position-oriented interface, > > perhaps using keyword arguments or a dictionary > > to pass values into execute. > > > > (Note that I'd still prefer a more function-like > > interface, as in: > > > > f=connection.prepare(some_sql) > > result=f(arg1=v1, arg2=v2) > > ) > > Hmmm... I think that (naming and notation issues introduced in the example above > aside) dictionaries would provide a fairly convenient interface to the > parameters of stored procedures, which I was working on for oracledb-0.1.3. > > I posted a bit of a concern about the disconnection/commit relationship in > oracledb. Are you in a position to comment about that? (Given that the oracledb > module is unsupported, of course.) Not really. I'm willing to apply a patch, if you think that the current module is broken and explain the problem to me (in private email.) Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From Anthony Baxter Wed Jun 24 00:55:33 1998 From: Anthony Baxter (Anthony Baxter) Date: Wed, 24 Jun 1998 09:55:33 +1000 Subject: [DB-SIG] Oracle no longer wants positional arguments! In-Reply-To: Your message of "Tue, 23 Jun 1998 12:51:30 -0400." <358FDD12.3BC4@digicool.com> References: <358FDD12.3BC4@digicool.com> <199806231603.SAA21485@aisws7.cern.ch> Message-ID: <199806232355.JAA05526@jambu.off.connect.com.au> >>> Jim Fulton wrote > I think that Anthony Baxter was working on a new oracle module too. > I suspect that that is who you were thinking of. Yah. I have a version which is a fairly cleaned up version of the most recent DC one, with various patches from the net. 
The named parameter stuff is actually what I'm midway through (well ok, I haven't had a chance to touch it for a month or more - must find a bigger cache of round tuits). Anthony

From mal@lemburg.com Wed Jun 24 09:57:23 1998 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 24 Jun 1998 10:57:23 +0200 Subject: [DB-SIG] PROPOSAL: Portable Argument Format References: <358FDA7A.12D@digicool.com> Message-ID: <3590BF73.1A56FA19@lemburg.com> Jim Fulton wrote: > > I'd like to lobby for a portable argument format for the DBI > interface. While this *does* require parsing SQL, this is > not really all that hard and I think the benefits are well > worth the effort. I volunteer to provide a utility to assist > with this. Actually, it only requires scanning SQL. Parsing SQL would be overkill ;) > Here's what I think the format needs to do: > > - Not interfere with SQL. That is, it must be unambiguous > to find parameters in SQL. > > - Support optional argument names, which may be given > a positional interpretation. > > - Capture type information, to make types explicit. There's a problem with this one: the database may want to have different types than the ones you explicitly state in the SQL statement, e.g. when porting from Solid to MySQL you'll find that MySQL will want all parameters to be strings, even numbers, so statements like

    INSERT INTO MyTable VALUES (:t, :s, :i)

would fail on MySQL. But then: the interface could implement the type checks and do the conversions afterwards... > > Maybe this is all that's needed. > > I propose the following format: > > :(name)code > > where : signals a parameter and code is a type code. > > Valid type codes are: > > c, b, B, h, H, i, I, l, L, f, d, and s -- > as defined in the struct module > > t -- Date/Time We ought to define three chars for date/time values:

    D - date only
    T - time of day only
    S - timestamp (date + time of day)

since this is what many DBs can handle. > $ -- Money (???) Hmm, what would that look like ? > r -- Binary data (raw/blob) gotten from a string > others...??? Questions:

· What will databases without some of these types get ? E.g. the money type is not defined in ODBC.
· Which Python types are expected for each type character ? E.g. will passing strings to ':r' be ok ? Will more than one type be allowed per type char (with automatic conversion if necessary) ?

> [Examples] Note that you can provide the whole functionality by coding a Python function (or class) on top of cursor.execute, so no change to the API spec is necessary. Moreover, every existing implementation will be able to use it without modification (which is a Good Thing :). You could even have a prepare constructor that returns a (Python) instance with a function call interface. [We don't have a performance issue here, since the function call overhead is negligible w.r.t. the time it takes for the database to finish.] -- Marc-Andre Lemburg Y2000: 555 days left --------------------------------------------------------------------- : Python Pages >>> http://starship.skyport.net/~lemburg/ : ---------------------------------------------------------

From jim.fulton@Digicool.com Wed Jun 24 15:33:08 1998 From: jim.fulton@Digicool.com (Jim Fulton) Date: Wed, 24 Jun 1998 10:33:08 -0400 Subject: [DB-SIG] PROPOSAL: Portable Argument Format References: <358FDA7A.12D@digicool.com> <3590BF73.1A56FA19@lemburg.com> Message-ID: <35910E24.50B8@digicool.com> M.-A. Lemburg wrote: > > Jim Fulton wrote: > > > > I'd like to lobby for a portable argument format for the DBI > > interface.
While this *does* require parsing SQL, this is > > not really all that hard and I think the benefits are well > > worth the effort. I volunteer to provide a utility to assist > > with this. > > Actually, it only requires scanning SQL. Parsing SQL would > be overkill ;) > > > Here's what I think the format needs to do: > > > > - Not interfere with SQL. That is, it must be unambiguous > > to find parameters in SQL. > > > > - Support optional argument names, which may be given > > a positional interpretation. > > > > - Capture type information, to make types explicit. > > There's a problem with this one: the database may want to > have different types than the ones you explicitly state in the > SQL statement, e.g. when porting from Solid to MySQL > you'll find that MySQL will want all parameters to be strings, > even numbers, so statements like > > INSERT INTO MyTable VALUES (:t, :s, :i) > > would fail on MySQL. No, because the MySQL interface would convert values to strings as part of the conversion process. > But then: the interface could implement > the type checks and do the conversions afterwards... Right. > > > > Maybe this is all that's needed. > > > > I propose the following format: > > > > :(name)code > > > > where : signals a parameter and code is a type code. > > > > Valid type codes are: > > > > c, b, B, h, H, i, I, l, L, f, d, and s -- > > as defined in the struct module > > > > t -- Date/Time > > We ought to define three chars for date/time values: > > D - date only > T - time of day only > S - timestamp (date + time of day) Fine. > since this is what many DBs can handle. > > > $ -- Money (???) > > Hmm, what would that look like ? No idea. That's why I added question marks. OK, punt. > > r -- Binary data (raw/blob) gotten from a string > > others...??? > > Questions: > > · What will databases without some of these types get ? E.g. the money > type is not defined in ODBC. Perform some reasonable conversion. > · Which Python types are expected for each type character ? E.g. will > passing strings to ':r' be ok ? :r is defined to come from a string above. The proposal should spell this out. > Will more than one type be allowed per > type char (with automatic conversion if necessary) ? Yes. This should be spelled out. > > [Examples] > > Note that you can provide the whole functionality by coding > a Python function (or class) on top of cursor.execute, so no change to > the API spec is necessary. Moreover, every existing implementation > will be able to use it without modification (which is a Good Thing :). How? Each implementation has to be involved in implementing this. For example, portable arguments have to be converted to platform-specific arguments, and platform-specific conversions have to be applied. Utilities can be provided to make the implementation simpler, but implementation-specific code is needed. > You could even have a prepare constructor that returns a (Python) > instance with a function call interface. [We don't have a > performance issue here, since the function call overhead > is negligible w.r.t. the time it takes for the database to finish.] Right. With type information in the SQL statement, I can do much of the binding work up front and make the actual calling process simpler and faster. Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered!
Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.

From mal@lemburg.com Wed Jun 24 17:17:46 1998 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 24 Jun 1998 18:17:46 +0200 Subject: [DB-SIG] PROPOSAL: Portable Argument Format References: <358FDA7A.12D@digicool.com> <3590BF73.1A56FA19@lemburg.com> <35910E24.50B8@digicool.com> Message-ID: <359126AA.684F87B@lemburg.com> Jim Fulton wrote: > > M.-A. Lemburg wrote: > > Note that you can provide the whole functionality by coding > > a Python function (or class) on top of cursor.execute, so no change to > > the API spec is necessary. Moreover, every existing implementation > > will be able to use it without modification (which is a Good Thing :). > > How? Each implementation has to be involved in implementing this. > For example, portable arguments have to be converted to platform-specific > arguments, and platform-specific conversions have to be > applied. Utilities can be provided to make the implementation simpler, > but implementation-specific code is needed. I was thinking of a framework which each interface implementor could then subclass. The subclasses would simply override the DB-specific methods in order to provide the right conversions. Most of the other code and the basic structure will remain reusable. -- Marc-Andre Lemburg Y2000: 555 days left --------------------------------------------------------------------- : Python Pages >>> http://starship.skyport.net/~lemburg/ : ---------------------------------------------------------

From jim.fulton@Digicool.com Fri Jun 26 13:11:58 1998 From: jim.fulton@Digicool.com (Jim Fulton) Date: Fri, 26 Jun 1998 08:11:58 -0400 Subject: [DB-SIG] Another funny Oracle feature Message-ID: <3593900E.3DEB@digicool.com> Another funny (ha ha, not really) feature I've found in Oracle is that when binding parameters to SQL (not procedures, just SQL) it is possible to have *output* parameters as well as input parameters. For example, someone can use:

    select foo into :fooparm
    from spam
    where bar=:barparm

Now this select has two parameters. One is an input parameter and one is an output parameter. As a bonus, there doesn't seem to be any API call I can make to discover that 'fooparm' is an output parameter. But then, there isn't an API call I can make to find out what parameters there are either, so why am I complaining? :( This situation is rather hard to handle with the current DBI API. This may be reasonable, since the SQL above is, I assume, non-standard. I'm curious if anyone else has come across a case like this with other databases? BTW, the situation is not so bad for stored procedures, since there is an API for getting procedure metadata. Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
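[One conceivable way for an interface to cope with such output binds, sketched purely as a thought experiment: OutParam is an invented marker class, and no DB-API module of the time defines anything like it.]

    # hypothetical marker for an output bind variable
    class OutParam:
        def __init__(self):
            self.value = None   # to be filled in by the module after execution

    foo = OutParam()
    c.execute('select foo into :fooparm from spam where bar=:barparm',
              (foo, 'eggs'))
    print foo.value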