From jeff@ollie.clive.ia.us Mon Jan 6 20:29:36 1997
From: jeff@ollie.clive.ia.us (Jeffrey C. Ollie)
Date: Mon, 06 Jan 1997 14:29:36 -0600
Subject: [PYTHON DB-SIG] mSQL 2.0 and Python 1.4
Message-ID: <199701062029.OAA15393@worf.netins.net>

-----BEGIN PGP SIGNED MESSAGE-----

I was wondering if anyone was working on getting the mSQL module to
work with mSQL 2.0 (recently released in beta) and Python 1.4. I've
gotten things to compile and some simple tests to run, but I'm sure
that some of the other changes in the mSQL API may cause problems.

I've included my patch below. It seems to work without problems with
mSQL 2.0b2, Python 1.4, IRIX 5.3, and gcc 2.7.2.1, but I haven't done
any extensive testing. So unless someone else pipes up shortly, I'm
going to go to work making sure that there aren't any hidden problems,
since I need mSQL for a project that I'm working on here.

===================================================================
RCS file: RCS/mSQLmodule.c,v
retrieving revision 1.1
diff -c -r1.1 mSQLmodule.c
*** mSQLmodule.c 1997/01/06 19:36:25 1.1
- --- mSQLmodule.c 1997/01/06 19:38:52
***************
*** 31,36 ****
- --- 31,38 ----
  ******************************************************

+ Modified by Jeffrey C. Ollie January 1997 to work with mSQL 2.0
+
  Modified by David Gibson December 1995

  - listdbs and listtables now return a list of strings
***************
*** 327,334 ****
- --- 329,341 ----
      case REAL_TYPE: type="real"; break;
      default: type="????"; break;
    }
+ #ifdef IS_UNIQUE
+   if (IS_UNIQUE(tf->flags))
+     strcpy(flags, "unique");
+ #else
    if (IS_PRI_KEY(tf->flags))
      strcpy(flags,"pri");
+ #endif
    else
      flags[0]=0;
    if (IS_NOT_NULL(tf->flags))

[A copy of the headers and the PGP signature follow.]

Date: Mon, 06 Jan 1997 14:29:36 -0600
From: "Jeffrey C. Ollie"
Subject: mSQL 2.0 and Python 1.4
To: python-list@cwi.nl, db-sig@python.org

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2
Comment: AnySign 1.4 - A Python tool for PGP signing e-mail and news.

iQCVAwUBMtFgtpwkOQz8sbZFAQEafQP+IMiofVZcrYZ+vUDX/l6rigtuj2p+q6gZ
6MkpdXVcEQyTzgwy+/wAR6DQg5SjejLaCUULUhW0jYuKIA4ECDeY0mEpiLmTB07e
qnuOD2zhHKA1UetgBx19pR9KM+7MWd6pcsdkF9w/yfbKQf852+7mk3ZWsyAWavOt
MAw2AzFnSJc=
=l0SU
-----END PGP SIGNATURE-----
--
Jeffrey C. Ollie         | Should Work Now (TM)
Python Hacker, Mac Lover |

=================
DB-SIG - SIG on Tabular Databases in Python

send messages to: db-sig@python.org
administrivia to: db-sig-request@python.org
=================

From ted_horst@il.us.swissbank.com Wed Jan 8 00:52:01 1997
From: ted_horst@il.us.swissbank.com (Ted Horst)
Date: Tue, 7 Jan 97 18:52:01 -0600
Subject: [PYTHON DB-SIG] API conformant sybase module
Message-ID: <9701080052.AA18808@ch1d162nwk>

I am looking at converting my sybase module to the Python Database API,
and I could use a little clarification.

First off, I am looking at the 1.0 version of the API dated April 9,
1996 that I got from http://www.python.org/sigs/database. Is this the
most recent version?

The next issue is that I don't think that sybase supports cursors.
They are optional in the spec, so I guess that it is OK just to skip
them, but I am not sure how big of an issue this will be.

I have several questions on the specifics of the API.

1) Is there a reason that closing a connection has to render it
useless? In my current module, I can connect and disconnect at will.
This might even be useful if you wanted to simulate cursors with
multiple connections but keep your actual connection count down.

2) I don't understand how the variable binding is supposed to work.
Are the items in the tuple just replaced by the values or is there
actually some sort of binding to a name in some namespace?

3) How, when, and where are the DBI objects used?

4) I assume that it's ok to add methods, but what would people think
about having settable attributes (eg. for the database or the
connection info)?

5) What would people think about replacing the connection function
with something more generic.
Rather than modulename(connection_string), have connect(**kw). Then
some_module.connect(user = "ted", server = "local", ...) could work on
different implementations and unused parameters could just be ignored.

Also, have people written generic software on top of this API? Is
there a test suite?

This should be fairly straightforward once I understand the binding.

Thanks,

Ted Horst

PS I have not been able to connect to ftp.digicool.com, so if someone
can send me the oracle module, let me know.

From Bertil_Reinhammar@ivab.se Wed Jan 8 10:02:20 1997
From: Bertil_Reinhammar@ivab.se (Bertil Reinhammar)
Date: Wed, 8 Jan 1997 11:02:20 +0100
Subject: [PYTHON DB-SIG] API conformant sybase module
In-Reply-To: <9701080052.AA18808@ch1d162nwk>
    (message from Ted Horst on Tue, 7 Jan 97 18:52:01 -0600)
Message-ID: <199701081002.LAA19304@mughi.doceye.ivab.se>

!!!

Hi, nice to see some activity here ;-)

> 1) Is there a reason that closing a connection has to render it useless? In
> my current module, I can connect and disconnect at will. This might even be
> useful if you wanted to simulate cursors with multiple connections but keep
> your actual connection count down.

If you say db.close(), is this not a message that you explicitly wish
to make the kill permanent? In the Informix Module (which I expect to
release real soon now) connections are handled by allowing the user to
open as many connections as required, each giving an object (of course)
and transparently making the connections dormant or current depending
on which object is used.

> 2) I don't understand how the variable binding is supposed to work. Are the
> items in the tuple just replaced by the values or is there actually some sort
> of binding to a name in some namespace?

Hmm, I understand this regards variable binding in execute() method, no?
(Sorry if I restate the obvious.) If so, the tuple in the execute call
provides the values to be used. In the case of SELECT, the values in
each row are returned as a tuple from the fetch() method. In case one
wishes to be generic w.r.t. table, use the description() method, which
is supposed to return column descriptions in the same order as the
result comes from the fetch() method.

> 3) How, when, and where are the DBI objects used?

There are e.g. BLOB objects. If I have a BLOB column in my table, I get
a dbiRaw object element in my return tuple (if select) and I access the
BLOB data via the dbiRaw method value().

> 4) I assume that it's ok to add methods, but what would people think about
> having settable attributes (eg. for the database or the connection info)?

What do you have in mind?

>
> 5) What would people think about replacing the connection function with
> something more generic. Rather than modulename(connection_string), have
> connect(**kw). Then some_module.connect(user = "ted", server = "local", ...)
> could work on different implementations and unused parameters could just be
> ignored.

I like this. Please elaborate.

> Also, have people written generic software on top of this API? Is there a
> test suite?

Test suite? Not that I know of.

Generic code? No, there is a need for a standardized error-reporting
mechanism before this is even worth trying. As of now, when you try to
write something non-trivial, you end up handling database specifics in
the error reports, either through direct error value fetch (which is
not defined as yet) or by parsing the error string from the exception
(and parsing messages meant for humans to guide my software does not
appeal to me).

What can we do to improve?

Suggestion:

1) Analyse all possible errors and classify them into

   error           comment
   -----           -------
   DBERR_LOCK      When some part of query failed due to some lock,
                   timeout, deadlock.
   DBERR_QUERY     Malformed queries. Syntax error, missing table.
   DBERR_DATA      Insert fail due to constraints, duplicate keys etc.
   DBERR_SERVER    Server down, connection lost etc.
   DBERR_ACCESS    Authorization problems.
   DBERR_RESOURCE  Memory full, disk full, blobspace full, licence
                   constraints etc.
   DBERR_BUG       Internal errors to be reported to vendor.
   DBERR_ERROR     The rest that cannot be successfully mapped on the
                   list above.

2) Upon exception, provide a tuple with

   - Error class as of 1).
   - String as reported from database and "this" API. For human
     consumption.
   - Dictionary with error information such as the numeric code and
     (in case of informix) ISAM code. Some of these entries may be
     standardized, some left to the particular database API. Using a
     dictionary to report general information allows descriptive keys
     as to how to interpret the values.

This allows a generic interface as long as the error handling keeps to
a reasonable level of sophistication. When you require your software to
be even smarter, I feel genericity falls as you then probably require
knowledge about the underlying components.

We have used a model similar to this for some time in our proprietary
libraries.

hej/
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Bertil Reinhammar            IV DocEye AB (Combitech)
phn. +46 13 200606           Teknikringen 9
fax. +46 13 214897           S-58330 Linköping
bertil_reinhammar@ivab.se    Sweden

From david@alphagene.com Wed Jan 8 13:26:32 1997
From: david@alphagene.com (David Walton)
Date: Wed, 08 Jan 1997 08:26:32 -0500
Subject: [PYTHON DB-SIG] API conformant sybase module
References: <199701081002.LAA19304@mughi.doceye.ivab.se>
Message-ID: <32D3A088.237C@alphagene.com>

Hi,

I don't mean to change the subject, but while there is some
activity I'd like to ask a question. Why was it decided for the
API to return a list of tuples?
At my previous place of employment
(the Jackson Laboratory on the Mouse Genome Database Project), we
were using one of the early Sybase modules, hacked to our own
specification (read pre-DBAPI). It always seemed natural to us
to have rows from the database returned as dictionaries.

As a result when I started working with the mSQL module, and more
recently the Oracle module (both of which are great btw! Thanks to
those who developed them!) at my current job, it was a surprise
to me to find that there was no option for returning a list of
dictionaries. I altered the mSQL module to add this function
(and notified its maintainer), and added a python function to
my own Oracle library, that acts as an interface to the Oracle
module, to convert rows to dictionaries if the caller so desires.

I don't mind the way I've done it, though it seems less efficient
in the case of Oracle to be converting rows to a different data
structure once in Python, instead of in the Oracle module itself.

I would be interested in hearing why it isn't part of the API, and
people's thoughts on the value of this functionality. I've always
found it very useful.

Thanks!

Dave Walton
Scientific Software Engineer
AlphaGene, Inc.
david@alphagene.com

From gtc@cognos.informatics.jax.org Wed Jan 8 14:31:16 1997
From: gtc@cognos.informatics.jax.org (Glenn T. Colby)
Date: Wed, 8 Jan 1997 09:31:16 -0500 (EST)
Subject: [PYTHON DB-SIG] API conformant sybase module
In-Reply-To: <32D3A088.237C@alphagene.com>
Message-ID:

Dave,

This reminds me, I made a few changes to the sybasemodule that you
were using at The Jackson Lab so that it returns tuples instead of
dictionaries, along with one tuple that contains the column names.
To my surprise, there was no performance gain/loss.
It seems natural to return tuples to me now, since there is so much
space overhead when returning a dictionary for each row in the
database. Since our database has grown so much, we started running out
of swap space, etc., when returning dictionaries.

On top of our sybasemodule now we have a class that processes query
results -- I found no real benefit in using dictionaries over tuples in
that class. Doing a dictionary key lookup is not much different than
doing some kind of enumeration.

For now, our sybasemodule returns dictionaries, but in our next release
it will return tuples because of space concerns.

Cheers.

--Glenn

On Wed, 8 Jan 1997, David Walton wrote:

> Hi,
>
> I don't mean to change the subject, but while there is some
> activity I'd like to ask a question. Why was it decided for the
> API to return a list of tuples? At my previous place of employment
> (the Jackson Laboratory on the Mouse Genome Database Project), we
> were using one of the early Sybase modules, hacked to our own
> specification (read pre-DBAPI). It always seemed natural to us
> to have rows from the database returned as dictionaries.
>
> As a result when I started working with the mSQL module, and more
> recently the Oracle module (both of which are great btw! Thanks to
> those who developed them!) at my current job, it was a surprise
> to me to find that there was no option for returning a list of
> dictionaries. I altered the mSQL module to add this function
> (and notified its maintainer), and added a python function to
> my own Oracle library, that acts as an interface to the Oracle
> module, to convert rows to dictionaries if the caller so desires.
>
> I don't mind the way I've done it, though it seems less efficient
> in the case of Oracle to be converting rows to a different data
> structure once in Python, instead of in the Oracle module itself.
>
> I would be interested in hearing why it isn't part of the API, and
> people's thoughts on the value of this functionality.
> I've always found it very useful.
>
> Thanks!
>
> Dave Walton
> Scientific Software Engineer
> AlphaGene, Inc.
> david@alphagene.com
>

--Glenn

+-------------------------+-------------------------------------------+
| Glenn T. Colby          | Join the Python Software Activity!        |
| The Jackson Laboratory  |                                           |
| gtc@informatics.jax.org | See this URL: http://www.python.org/psa/  |
+-------------------------+-------------------------------------------+

From ted_horst@il.us.swissbank.com Wed Jan 8 20:57:57 1997
From: ted_horst@il.us.swissbank.com (Ted Horst)
Date: Wed, 8 Jan 97 14:57:57 -0600
Subject: [PYTHON DB-SIG] API conformant sybase module
References: <199701081002.LAA19304@mughi.doceye.ivab.se>
Message-ID: <9701082058.AA18962@ch1d162nwk>

Bertil Reinhammar wrote:

    !!!

    Hi, nice to see some activity here ;-)

    > 1) Is there a reason that closing a connection has to render it useless? In
    > my current module, I can connect and disconnect at will. This might even be
    > useful if you wanted to simulate cursors with multiple connections but keep
    > your actual connection count down.

    If you say db.close(), is this not a message that you explicitly
    wish to make the kill permanent? In the Informix Module (which I
    expect to release real soon now) connections are handled by
    allowing the user to open as many connections as required, each
    giving an object (of course) and transparently making the
    connections dormant or current depending on which object is used.

I just don't see the point of having a method that renders an object
completely useless, but keeps it around.
I guess I will implement the close method as stated, but keep my
connect and disconnect methods as well.

    > 2) I don't understand how the variable binding is supposed to work. Are the
    > items in the tuple just replaced by the values or is there actually some sort
    > of binding to a name in some namespace?

    Hmm, I understand this regards variable binding in execute()
    method, no?

yep

    (Sorry if I restate the obvious.) If so, the tuple in the execute
    call provides the values to be used. In the case of SELECT, the
    values in each row are returned as a tuple from the fetch() method.
    In case one wishes to be generic w.r.t. table, use the
    description() method, which is supposed to return column
    descriptions in the same order as the result comes from the fetch()
    method.

Sorry, but I still don't get it. Looking at the spec again, I may be
missing something significant. I was assuming that the 'operation'
argument to execute is a string of arbitrary SQL. If this is the case,
then I don't see any role for the 'params'. I also don't see any
benefit to storing the string. I must be missing something, somebody
please help! A python code example would be very useful here.

    > 3) How, when, and where are the DBI objects used?

    There are e.g. BLOB objects. If I have a BLOB column in my table, I
    get a dbiRaw object element in my return tuple (if select) and I
    access the BLOB data via the dbiRaw method value().

    > 4) I assume that it's ok to add methods, but what would people think about
    > having settable attributes (eg. for the database or the connection info)?

    What do you have in mind?

Currently I can do things like:

>>> syb1 = Sybase(user = 'ted', server = 'local', database = 'emp', ...
              password = 'haha', interface = '/usr/sybase/interfaces')
>>> syb1.connect()
>>> syb1.database
'emp'
>>> syb1.sql('select * from employee')
[]
>>> syb1.database = 'acct'
>>> syb1.sql('select * from accounts')
[]
>>> syb1.user
'ted'
>>> syb1.disconnect()
>>> syb1.user = 'guest'
>>> syb1.database = 'cat'
>>> syb1.connect()
>>> syb1.sql('select * from catalog')
[]

Right now this is done in a python wrapper class, and I guess it should
just stay there.

    >
    > 5) What would people think about replacing the connection function with
    > something more generic. Rather than modulename(connection_string), have
    > connect(**kw). Then some_module.connect(user = "ted", server = "local", ...)
    > could work on different implementations and unused parameters could just be
    > ignored.

    I like this. Please elaborate.

See the example above. We could come up with a standard set of keywords
(user, password, server, etc) that would be common to most databases,
but you could put whatever you wanted in the argument list. The
implementor would just pick out the things that are needed for their
implementation, possibly providing defaults, and ignore the rest.

    > Also, have people written generic software on top of this API? Is there a
    > test suite?

    Test suite? Not that I know of.

    Generic code? No, there is a need for a standardized
    error-reporting mechanism before this is even worth trying. As of
    now, when you try to write something non-trivial, you end up
    handling database specifics in the error reports, either through
    direct error value fetch (which is not defined as yet) or by
    parsing the error string from the exception (and parsing messages
    meant for humans to guide my software does not appeal to me).

    What can we do to improve?

    Suggestion:

    1) Analyse all possible errors and classify them into

       error           comment
       -----           -------
       DBERR_LOCK      When some part of query failed due to some lock,
                       timeout, deadlock.
       DBERR_QUERY     Malformed queries. Syntax error, missing table.
       DBERR_DATA      Insert fail due to constraints, duplicate keys etc.
       DBERR_SERVER    Server down, connection lost etc.
       DBERR_ACCESS    Authorization problems.
       DBERR_RESOURCE  Memory full, disk full, blobspace full, licence
                       constraints etc.
       DBERR_BUG       Internal errors to be reported to vendor.
       DBERR_ERROR     The rest that cannot be successfully mapped on
                       the list above.

    2) Upon exception, provide a tuple with

       - Error class as of 1).
       - String as reported from database and "this" API. For human
         consumption.
       - Dictionary with error information such as the numeric code
         and (in case of informix) ISAM code. Some of these entries may
         be standardized, some left to the particular database API.
         Using a dictionary to report general information allows
         descriptive keys as to how to interpret the values.

    This allows a generic interface as long as the error handling keeps
    to a reasonable level of sophistication. When you require your
    software to be even smarter, I feel genericity falls as you then
    probably require knowledge about the underlying components.

    We have used a model similar to this for some time in our
    proprietary libraries.

OK, sounds fine.

Ted Horst

From jeff@ollie.clive.ia.us Thu Jan 9 05:25:10 1997
From: jeff@ollie.clive.ia.us (Jeffrey C. Ollie)
Date: Wed, 08 Jan 1997 23:25:10 -0600
Subject: [PYTHON DB-SIG] Python mSQL Module v2.1 for mSQL 2.0 (and hopefully mSQL 1.X too!)
Message-ID: <199701090525.XAA08809@worf.netins.net>

-----BEGIN PGP SIGNED MESSAGE-----

Since it seems that I'm the one with the time and the motivation, I've
put the fruits of my hacking on the Python mSQL module up on my web
server. Find the new source file and a short description of what I've
done at:

The new module SHOULD compile with both mSQL 1.X and mSQL 2.0bX. If
you compile with mSQL 1.X you won't get the "listindex" method.
I've only lightly tested with mSQL 2.0b2, Python 1.4, IRIX 5.3, and
gcc 2.7.2.1. Very shortly, I'll be starting to use the module in a
more demanding environment on Linux 2.0.27 and possibly HP-UX 10.X, so
I may find some bugs. I might also work on a new version that
implements the Python DB-SIG API, once I figure out just how that API
works :).

I hope this is useful to others. Please send bug reports and comments
to .

[A copy of the headers and the PGP signature follow.]

Cc: db-sig@python.org, python-list@cwi.nl, msql-list@bunyip.com
Date: Wed, 08 Jan 1997 23:25:10 -0600
From: "Jeffrey C. Ollie"
Subject: Python mSQL Module v2.1 for mSQL 2.0 (and hopefully mSQL 1.X too!)
To: Mark Shuttleworth, Piers Lauder, Christian Tismer, Anthony Baxter

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2
Comment: AnySign 1.4 - A Python tool for PGP signing e-mail and news.

iQCVAwUBMtSBOpwkOQz8sbZFAQE9YgP+JTaof9jPy6q4T7mvaZg6l2o8K1AIjwMi
Y8FPn3EnwqGrqhwaHD9sYs3guxNF/7EfRnBx3uL8yMvJzA/HHQ/0wVbnahKMNOrg
zs4PMpxf9xKtrUDzKoQpNIJj0wCL4lUP8ajomYaVVkQH90fxcEbLtXqzMnLMcyWQ
qy4jOfPI4NM=
=lYxX
-----END PGP SIGNATURE-----
--
Jeffrey C. Ollie         | Should Work Now (TM)
Python Hacker, Mac Lover |

From Bertil_Reinhammar@ivab.se Thu Jan 9 08:02:22 1997
From: Bertil_Reinhammar@ivab.se (Bertil Reinhammar)
Date: Thu, 9 Jan 1997 09:02:22 +0100
Subject: [PYTHON DB-SIG] API conformant sybase module
In-Reply-To: <9701082058.AA18962@ch1d162nwk>
    (message from Ted Horst on Wed, 8 Jan 97 14:57:57 -0600)
Message-ID: <199701090802.JAA02629@mughi.doceye.ivab.se>

!!!

Ted Horst:

    I just don't see the point of having a method that renders an
    object completely useless, but keeps it around.

I see your point.
However, one may wish to make the connection unusable BUT cannot
control the invocation of __del__ just by killing the object reference,
since there may be others lingering. I have encountered exactly this
problem where Python kept a reference after an exception and the
__del__ method did not execute when I hoped for it. There are other
similar situations.

    Sorry, but I still don't get it. Looking at the spec again, I may
    be missing something significant. I was assuming that the
    'operation' argument to execute is a string of arbitrary SQL. If
    this is the case, then I don't see any role for the 'params'. I
    also don't see any benefit to storing the string. I must be missing
    something, somebody please help! A python code example would be
    very useful here.

Of course, you can always construct a complete sql string, but that is
less efficient in case you wish to do the same operation many times
with new data. In this case the implementation can keep a record of the
latest PREPAREd string and reuse the previous PREPARE result until a
new string is supplied, which saves time. Also, I like to be able to
use data without being forced to convert it into string format. In Tcl
this is all the same but not in Python. Further, if I need to insert a
BLOB (Binary Large OBject), I *require* the parameterized style.

As an example of how I typically use the execute method:

Python 1.4 (Nov 4 1996) [GCC 2.7.2.1]
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import os, sys, informixdb
>>> db = informixdb.informixdb("mydatabase@myserver")
>>> c = db.cursor()
>>> some_image_id = 72
>>> c.execute("select * from image where image_id = ?", (some_image_id,))
>>> c.description
[('image_id', 'NUMBER', 4, 4, 0, 0, 1),
 ('locked_by', 'STRING', 16, 16, 0, 0, 1),
 ('locked_at', 'DATE', 3080, 3080, 0, 0, 1),
 ('img_state', 'STRING', 1, 1, 0, 0, 1),
 ('copy_of', 'NUMBER', 4, 4, 0, 0, 1)]
>>> aRow = c.fetchone()
>>> aRow
(72L, '', , 'P', 0)
>>>

Hope this illuminates somewhat.
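
A minimal sketch of the parameterized style described above, for readers
who want something they can run. It assumes Python's standard sqlite3
module as a stand-in for the Informix module (sqlite3 also accepts "?"
markers); the table layout and the sample values are made up for
illustration and are not taken from the session above.

```python
import sqlite3

# Stand-in for the Informix connection (assumption: sqlite3 is used only
# because it ships with Python and accepts "?" parameter markers).
db = sqlite3.connect(":memory:")
c = db.cursor()
c.execute("create table image (image_id integer, img_state text, data blob)")

# The SQL string stays constant; only the parameter tuples change.
# Binary data is passed as a bytes value, never spliced into the SQL text.
insert_sql = "insert into image values (?, ?, ?)"
sample_rows = [(72, "P", b"\x00\x01\x02"), (73, "Q", b"\xff\xfe")]
for params in sample_rows:
    c.execute(insert_sql, params)

# Same statement text, a different bound value on each call.
select_sql = "select image_id, img_state, data from image where image_id = ?"
c.execute(select_sql, (72,))
first = c.fetchone()
c.execute(select_sql, (73,))
second = c.fetchone()

# Column names arrive in the same order as the values in each row.
column_names = [d[0] for d in c.description]
print(column_names, first, second)
```

Whether a given driver actually caches the prepared statement between
calls is up to that driver; the sketch only shows the calling
convention.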
> 4) I assume that it's ok to add methods, but what would people think about
> having settable attributes (eg. for the database or the connection info)?

    What do you have in mind?

    Currently I can do things like:

    >>> syb1 = Sybase(user = 'ted', server = 'local', database = 'emp',
    ...               password = 'haha', interface = '/usr/sybase/interfaces')

Yes, this is nice. Can you provide a list of keywords you have found
useful and describe the semantics?

    >>> syb1.connect()
    >>> syb1.database
    'emp'
    >>> syb1.sql('select * from employee')
    []
    >>> syb1.database = 'acct'

Ehrr, should one not simply create a new database object here?

    >>> syb1.sql('select * from accounts')
    []
    >>> syb1.user
    'ted'
    >>> syb1.disconnect()
    >>> syb1.user = 'guest'
    >>> syb1.database = 'cat'
    >>> syb1.connect()
    >>> syb1.sql('select * from catalog')
    []

Hmm, and here? It seems like a matter of style, but I feel that this is
not really the way I like to use objects.

hej/
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Bertil Reinhammar            IV DocEye AB (Combitech)
phn. +46 13 200606           Teknikringen 9
fax. +46 13 214897           S-58330 Linköping
bertil_reinhammar@ivab.se    Sweden

From david@alphagene.com Thu Jan 9 13:14:22 1997
From: david@alphagene.com (David Walton)
Date: Thu, 09 Jan 1997 08:14:22 -0500
Subject: [PYTHON DB-SIG] DB-Modules returning lists of dicts
References:
Message-ID: <32D4EF2E.167E@alphagene.com>

Hey folks,

May I assume that due to the lack of response (with the exception
of Glenn Colby's, thanks for your comments Glenn), that there is
no interest out there in Python database modules giving the *option*
to return a list of dictionaries as an alternative to the list
of tuples that are currently returned?

Glenn comments that at the Jackson Laboratory, while they are still
having their Sybase module return lists of dictionaries, they are
doing a rewrite that will cause it to return a list of tuples
instead (like the API). Glenn also comments that he is setting up
a SQL class that abstracts away the programmer's knowledge of the
schema, and that he can't see any use for returning a list of
dictionaries due to the cost in the overhead of the data structure.
(Part of what I just said may have come from mail that Glenn and I
exchanged after his initial message. Glenn, please correct me if I
misinterpreted your words.)

This is fine for their needs. In our case however, we have no
desire to abstract away the relational schema from our developers.
We also find it very useful to be able to reference the results of
a query by column name, instead of column number. It makes it
possible for any code that is using a 'SELECT * FROM foo' to continue
to function when a column is inserted into the schema for foo. And let
me say, at this point in time this happens frequently to us, because
our schema is still a moving target.

I realize that it is possible to take the "description" provided by
the API and associate the column names with the columns in my list
of tuples. I do that now. I just thought it would be a little more
efficient to have the module returning the rows to me as dictionaries
in the first place.

I'm very interested in knowing why folks find this option undesirable
(if they actually do). Glenn's reason is that "there is so much space
overhead when returning a dictionary for each row in the database."
However, for us this is not a concern (at least right now), and we
would find the option useful.

Thanks again for any input folks can contribute.

Dave
--
Dave Walton
Scientific Software Engineer
AlphaGene, Inc.
david@alphagene.com

From Bertil_Reinhammar@ivab.se Thu Jan 9 15:14:57 1997
From: Bertil_Reinhammar@ivab.se (Bertil Reinhammar)
Date: Thu, 9 Jan 1997 16:14:57 +0100
Subject: [PYTHON DB-SIG] DB-Modules returning lists of dicts
In-Reply-To: <32D4EF2E.167E@alphagene.com>
    (message from David Walton on Thu, 09 Jan 1997 08:14:22 -0500)
Message-ID: <199701091514.QAA14852@mughi.doceye.ivab.se>

!!!

[Long and well formulated text by David Walton cut]

Sorry about the apparent disinterest. I'm just a slow starter.

We once did a module doing what you want and found it quite agreeable.
I vote in favor of your suggestion as long as we extend the API, since
code is likely to break otherwise. Maybe the idea requires some
refinement as to how to choose the return format etc.

Glenn Colby wrote that they got problems with memory when using
dictionaries, but I don't see that this is implied by the dictionaries
per se. Even if one has very long rows, one or a few of them should not
be a big problem. What am I missing?

hej/
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Bertil Reinhammar            IV DocEye AB (Combitech)
phn. +46 13 200606           Teknikringen 9
fax. +46 13 214897           S-58330 Linköping
bertil_reinhammar@ivab.se    Sweden

From tismer@tismer.com Thu Jan 9 15:23:53 1997
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 09 Jan 1997 16:23:53 +0100
Subject: [PYTHON DB-SIG] DB-Modules returning lists of dicts
References: <32D4EF2E.167E@alphagene.com>
Message-ID: <32D50D89.732@tismer.com>

Hey David,

> Hey folks,
>
> May I assume that due to the lack of response (with the exception
> of Glenn Colby's, thanks for your comments Glenn), that there is
> no interest out there in Python database modules giving the *option*
> to return a list of dictionaries as an alternative to the list
> of tuples that are currently returned?

[...deleted stuff...]

> I'm very interested in knowing why folks find this option undesirable.
> (if they actually do)

You are right that it is of some convenience to have the result
rows of a query in dict form. But, as has been said here, the overhead
is considerable. Two cases:

1) Your database is small.
Then you can afford to do the dict view in Python, using something like

    # names=("name1", "name2", "name3")
    # row=("one","two","three")
    def asDict(names, row) :
        res = {}
        for i in range(len(names)) :
            res[names[i]] = row[i]
        return res

2) Your database is large.
Then you definitely will hunt for speed and optimize your row handling
to use little space, and set up index arrays to select columns of, say,
a large array of returned rows (tuples), and the names are very seldom
necessary and can be found easily. Again, you would go with the above
example, which is standard temporary Python code one writes over and
over.

Moreover, there are a lot of other return structures possible and
necessary, depending on the application, and it would be impossible
to provide for every imaginable need.
I could imagine that someone else wants the result in the form

   [("Name1", "Name2", "Name3"), ("Value1", "Value2", "Value3")]

and the next wants it split into key and non-key names, and so on...
For that reason, and the fact that the dict version loses the natural
column order, I would suggest leaving it as it is.

- chris

----------------------------------------------------------------------
Christian Tismer - tismer@appliedbiometrics.com
Our support pages:
Got a real operating system? No?
Try at least a real language: Python &
----------------------------------------------------------------------

From jim.fulton@digicool.com Thu Jan 9 15:25:00 1997
From: jim.fulton@digicool.com (Jim Fulton)
Date: Thu, 09 Jan 1997 10:25:00 -0500
Subject: [PYTHON DB-SIG] DB-Modules returning lists of dicts
References: <32D4EF2E.167E@alphagene.com>
Message-ID: <32D50DCC.2FC@digicool.com>

I'm afraid that I don't have enough time to participate in this list
as I would like to. I haven't been able to follow this thread very
closely, but I feel that I need to spend a little bandwidth here to
make some points I've made before, but that have probably been
forgotten.

1. At one point, I thought there was agreement that data should be
   returned in Database tuple objects, not Python tuples. Database
   tuple objects provide sequence, mapping, and getattr interfaces.
   For example, assume that you have a table with columns a, b, and c.
   If I have a database tuple, t, then I can:

     t[1]       # Get value of b for t
     t.b        # ditto
     t['b']     # ditto

     t[1]=42    # Set value of b for t to 42
                # (for database interfaces that allow this)
     t.b=42     # ditto
     t['b']=42  # ditto

   Database tuples store data internally as a C array of object
   pointers and a pointer to a shared meta-data object that has the
   data necessary to do name->column mappings.
   So the implementation is essentially as efficient as Python tuples
   and yet provides name-based interfaces.

   I have implemented this mechanism for several database interfaces I
   did at USGS, but these were never released. :-(

   I feel very strongly that this is the "right" way to model result
   data.

2. I don't particularly like the current database API. I favor an API
   that I consider far more natural. Unfortunately, I don't have time
   to develop a detailed proposal. Here is a high-level sketch:

   Replace "cursors" with database table objects.

     import spamdb
     db=spamdb.Database(whatever)            # Create a database connection
     table=db.execute("select * from spam")  # This creates a database
                                             # table object

   Note that, on systems that use cursors, table objects wrap cursors
   and make the cursor look like a sequence of database records. We
   don't load all of the data into memory at once. Rather, as we
   sequentially access the data, the data are fetched, one row at a
   time.

   The most common way to access the data will be via a for loop:

     for row in table:
         # Do stuff with the row, like, maybe printing out the values
         # for column a
         print row.a

   Note that we could combine the table creation with the for:

     for row in db.execute("select * from spam"):
         ...

   We can also do non-select statements, as in:

     db.execute("create table eggs ...")

   If you want to get all of the data at once, you can use a slice:

     datacopy=table[:]

   Or

     data=db.execute("select * from spam")[:]

   And we can compile parameterized statements for use later as
   functions:

     f=db.prepare("select * from spam where ni=%s")

   This creates a compiled statement that has a single string
   parameter. A database-independent string format should be used,
   IMO. Now the compiled statement is used like any other Python
   callable object:

     table=f("foo")

   or

     for t in f("bar"):
         ...

   Database connections may be closed explicitly:

     db.close()

   or implicitly when the database is GCed, as with file objects.

My $0.02. Sorry it isn't more. Silence is not assent.
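
[Editorial sketch, not part of the original message. The two ideas
above (rows that answer to index, name, and attribute, and a table
object that wraps a cursor and fetches lazily) can be tried out in
present-day Python roughly as follows. The class names Row, Table and
Database are hypothetical, the standard sqlite3 module merely stands
in for the imaginary spamdb, and prepare() is left out. This is one
possible reading of the proposal, not Jim's implementation.]

```python
import sqlite3


class Row:
    """Hypothetical database tuple: index, name and attribute access."""

    def __init__(self, names, data):
        self._names = names        # name -> column index, shared by all rows
        self._data = tuple(data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, key):
        if isinstance(key, str):
            return self._data[self._names[key]]
        return self._data[key]     # integer index or slice

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._data[self._names[name]]
        except KeyError:
            raise AttributeError(name) from None


class Table:
    """Hypothetical table object: makes a cursor look like a row sequence."""

    def __init__(self, cursor):
        self._cursor = cursor
        # One mapping per result set; description is None for non-selects.
        self._names = {d[0]: i for i, d in enumerate(cursor.description or ())}

    def __iter__(self):
        # Rows are fetched one at a time as the caller iterates.
        while True:
            data = self._cursor.fetchone()
            if data is None:
                return
            yield Row(self._names, data)

    def __getitem__(self, index):
        # Only slices are supported here; table[:] loads the remaining rows.
        if not isinstance(index, slice):
            raise TypeError("only slices are supported in this sketch")
        return list(self)[index]


class Database:
    """Hypothetical connection wrapper around sqlite3."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)

    def execute(self, sql, params=()):
        return Table(self._conn.execute(sql, params))

    def close(self):
        self._conn.close()


db = Database()
db.execute("create table spam (a, b, c)")
db.execute("insert into spam values (1, 2, 3)")
db.execute("insert into spam values (4, 5, 6)")
for row in db.execute("select * from spam"):
    print(row.a, row["b"], row[2])
data = db.execute("select * from spam")[:]
db.close()
```

[In this sketch the only per-row cost beyond the tuple itself is one
reference to the shared name mapping, which is the point made under 1.
above.]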
-- Jim Fulton Digital Creations jim@digicool.com 540.371.6909 ## Python is my favorite language ## ## http://www.python.org/ ## ================= DB-SIG - SIG on Tabular Databases in Python send messages to: db-sig@python.org administrivia to: db-sig-request@python.org ================= From pa@tekla.fi Thu Jan 9 15:42:05 1997 From: pa@tekla.fi (Harri Pasanen) Date: Thu, 9 Jan 1997 17:42:05 +0200 Subject: [PYTHON DB-SIG] DB-Modules returning lists of dicts In-Reply-To: <32D4EF2E.167E@alphagene.com> References: <32D4EF2E.167E@alphagene.com> Message-ID: <9701091542.AA13462@tahma.tekla.fi> David Walton writes: > Hey folks, > > May I assume that due to the lack of response (with the exception > of Glenn Colby's, thanks for your comments Glenn), that there is > no interest out there in Python database modules giving the *option* > to return a list of dictionaries as an alternative to the list > of tuples that are currently returned? > > Glenn, comments that at the Jackson Laboratory, while they are > still having their Sybase module return lists of dictionaries, they > are doing a rewrite that will cause it to return a list of tuples > instead (like the API). Glenn also comments that he is setting up > a SQL class that abstracts away the programmers knowledge of > the schema, and that he can't see any use for returning a list > of dictionaries due to the cost in the over-head of the datastructure. > (part of what I just said may have come from mail that Glenn and > I exchanged after his initial message. Glenn, please correct me > if I misinterpreted your words.) > > This is fine for their needs. In our case however, we have no desire > to abstract away the relational schema from our developers. We also > find it very useful to be able to reference the results of a query > by column name, instead of column number. It makes it possible for > any code that is using a 'SELECT * FROM foo' to continue to function > when a column is inserted to the schema for foo. 
And let me say, > at this point in time this happens frequently to us, because our > schema is still a moving target. > > I realize that it is possible to take the "description" provided > by the API and associate the column names with the columns in my > list of tuples. I do that now. I just thought it would be > a little more efficient to have the module returning the rows to me > as dictionaries in the first place. > > I'm very interested in knowing why folks find this option undesirable. > (if they actually do) > > Glenn's reason being "there is so much space overhead when returning > a dictionary for each row in the database.". However, for us this > is not a concern (at least right now), and we would find the > option useful. > > Thanks again for any input folks can contribute. > I think the space overhead criteria alone is sufficiently strong to do without a dictionary for each row. The proper approach in my mind is to wrap the tuple access inside a class, which can provide very natural access to each field. Possible interfaces to data, assuming row below is an instance of above mention wrapper class. row.field1 # access field1 using __getattr__, name of field is 'field1' row._tuple[0] # access field1 directly via tuple row.GetAttribute("field1") # access via class internal dict. If I remember, Jim Fulton proposed something like this long time ago, and I've personally implemented this kind of an interface. Works very nicely. Jim's proposal was more detailed than this quick sketch. I suggest you browse past archives for this sig, if those are available. Harri Pasanen ================= DB-SIG - SIG on Tabular Databases in Python send messages to: db-sig@python.org administrivia to: db-sig-request@python.org ================= From ct7@kaizen.net Thu Jan 9 16:51:07 1997 From: ct7@kaizen.net (W. 
Craig Trader) Date: Thu, 09 Jan 1997 11:51:07 -0500 Subject: [PYTHON DB-SIG] DB-Modules returning lists of dicts References: <32D4EF2E.167E@alphagene.com> <32D50DCC.2FC@digicool.com> Message-ID: <32D521FB.41C6@kaizen.net> I haven't said anything earlier for pretty much the same reasons as Jim, and I don't like posting comments that can be summed up as "I agree" or "I disagree" without adding additional information. That said, I agree with Jim on all points, including the "silence is not consent". -- ct7@kaizen.net \Excellence can be attained if you... \ W. Craig Trader \Care more than others think is wise... \ Senior Software Engineer\Risk more than others think is safe... \ Kaizen Works, Inc. \Dream more than others think is practical...\ 703.733.2853 \Expect more than others think is possible. \ ================= DB-SIG - SIG on Tabular Databases in Python send messages to: db-sig@python.org administrivia to: db-sig-request@python.org ================= From klm@CNRI.Reston.Va.US Thu Jan 9 16:39:40 1997 From: klm@CNRI.Reston.Va.US (Ken Manheimer) Date: Thu, 9 Jan 1997 11:39:40 -0500 (EST) Subject: [PYTHON DB-SIG] DB-Modules returning lists of dicts In-Reply-To: <9701091542.AA13462@tahma.tekla.fi> Message-ID: On Thu, 9 Jan 1997, Harri Pasanen wrote: > I think the space overhead criteria alone is sufficiently strong to do > without a dictionary for each row. > [...] > If I remember, Jim Fulton proposed something like this long time ago, > and I've personally implemented this kind of an interface. Works very > nicely. Jim's proposal was more detailed than this quick sketch. > I suggest you browse past archives for this sig, if those are > available. 
Jim just posted a recap of his original proposal, but if you want to
examine the archive you can get it as a plain-text (mailbox-format)
file from the web:

(If you look quickly, you'll find this message at the bottom!-)

Ken Manheimer           klm@cnri.reston.va.us       703 620-8990 x268
        Corporation for National Research Initiatives

        # If you appreciate Python, consider joining the PSA! #
        #                          .                          #

From aaron_watters@msn.com Thu Jan 9 21:52:31 1997
From: aaron_watters@msn.com (aaron watters)
Date: Thu, 9 Jan 97 21:52:31 UT
Subject: [PYTHON DB-SIG] API conformant sybase module
Message-ID:

----------
From: owner-db-sig@python.org on behalf of Ted Horst

I just don't see the point of having a method that renders an object
completely useless, but keeps it around. I guess I will implement the
close method as stated, but keep my connect and disconnect methods as
well.
==

Remember that objects can be shared. In the case of a socket
connection, for example, the other end of the conversation may go away
when the socket is closed, in a way that it cannot come back. Thus if a
socket is closed it stays around in case there are other references to
it in the program, but if the program tries to do anything with the
socket (except close it again) it will get an exception. It makes no
sense for a server to try to "reopen" a socket that was initiated by a
client which doesn't exist anymore, for example... This may also make
sense in certain formulations/applications/generalizations of
databases, but I dunno.
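
[Editorial sketch, not part of the original message. The same
"closed but still referenced" behaviour can be observed with a database
connection in present-day Python. The standard sqlite3 module is used
here only as a convenient stand-in; the Sybase module under discussion
may well behave differently.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
alias = conn          # a second reference to the same connection object

conn.close()
conn.close()          # closing an already closed connection is accepted

try:
    alias.execute("select 1")
    still_usable = True
except sqlite3.ProgrammingError:
    # The object is still there, but every operation on it is refused.
    still_usable = False

print(still_usable)
```

[The holder of the second reference gets a clear exception instead of
silently talking to a connection that somebody else has reopened.]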
-- Aaron Watters ================= DB-SIG - SIG on Tabular Databases in Python send messages to: db-sig@python.org administrivia to: db-sig-request@python.org ================= From aaron_watters@msn.com Thu Jan 9 22:07:31 1997 From: aaron_watters@msn.com (aaron watters) Date: Thu, 9 Jan 97 22:07:31 UT Subject: [PYTHON DB-SIG] DB-Modules returning lists of dicts Message-ID: ---------- From: owner-db-sig@python.org on behalf of David Walton Sent: Thursday, January 09, 1997 8:14 AM To: db-sig@python.org; Glenn T. Colby Cc: Lyndon Hicks; R. Mark Adams; Rick Stacy Subject: Re: [PYTHON DB-SIG] DB-Modules returning lists of dicts Hey folks, May I assume that due to the lack of response (with the exception of Glenn Colby's, thanks for your comments Glenn), that there is no interest out there in Python database modules giving the *option* to return a list of dictionaries as an alternative to the list of tuples that are currently returned? I don't see any reason to not do it this way. Why not implement it this way and then provide a wrapper that provides the standard behaviour (conformant to Greg's spec).? Or visa versa. In fact write an alternative (competing) spec and provide standard wrapper classes that translate to the standard behaviour. Then I believe each alternate namedb module could probably be wrapped in reverse to mirror your functionality. That said, I kinda like the spec - simple, but not too simple, and not too sql-centric, in case something better comes up. * Aaron Watters (and thanks again to greg for coming up with *something* for a long time there was just blather, at least now there's *some* spec to complain about ;).) 
================= DB-SIG - SIG on Tabular Databases in Python send messages to: db-sig@python.org administrivia to: db-sig-request@python.org ================= From aaron_watters@msn.com Thu Jan 9 22:10:55 1997 From: aaron_watters@msn.com (aaron watters) Date: Thu, 9 Jan 97 22:10:55 UT Subject: [PYTHON DB-SIG] DB-Modules returning lists of dicts Message-ID: On Thu, 9 Jan 1997, Harri Pasanen wrote: > I think the space overhead criteria alone is sufficiently strong to do > without a dictionary for each row. Just to be fair, I don't think the space thing is all that big a deal. Remember that the names can be shared, so we're talking about a few extra PyObject pointers (4x, but hey they're teeny) per tuple. -- Aaron Watters ================= DB-SIG - SIG on Tabular Databases in Python send messages to: db-sig@python.org administrivia to: db-sig-request@python.org ================= From morley@aaii.oz.au Thu Jan 9 23:09:46 1997 From: morley@aaii.oz.au (David Morley) Date: Fri, 10 Jan 1997 10:09:46 +1100 Subject: [PYTHON DB-SIG] DB-Modules returning lists of dicts In-Reply-To: Your message of Thu, 09 Jan 1997 17:42:05 +0200 Message-ID: <199701092310.KAA25070@aaii.oz.au> Harri Pasanen wrote: > > I think the space overhead criteria alone is sufficiently strong to do > without a dictionary for each row. > > The proper approach in my mind is to wrap the tuple access inside a > class, which can provide very natural access to each field. > > Possible interfaces to data, assuming row below is an instance of > above mention wrapper class. > > row.field1 # access field1 using __getattr__, name of field is 'field1' > row._tuple[0] # access field1 directly via tuple > row.GetAttribute("field1") # access via class internal dict. > > If I remember, Jim Fulton proposed something like this long time ago, > and I've personally implemented this kind of an interface. Works very > nicely. Jim's proposal was more detailed than this quick sketch. 
> I suggest you browse past archives for this sig, if those are
> available.
>
> Harri Pasanen

When accessing an mSQL database, I use a similar interface, which I
also find extremely useful. One difference is that I use the dict
metaphor with attribute access as a shortcut, so row['field1'] and
row.field1 both access field1 of the row, row.keys() returns the field
names, and row.values() (rather than row._tuple) returns the original
tuple. This means that although you lose the ability to use attribute
access for fields named values, keys, has_key, etc., you gain the
ability to treat a record just like a dictionary if you want to.

Another layer of abstraction (on top of the API) that I find useful
when doing simple selects on individual tables is the following:

server.keys()
    = list of databases handled by server
server.mydb = server['mydb']
    = Database representing database mydb
server.mydb.table1 = server.mydb['table1']
    = Table object representing table1
server.mydb.keys()
    = list of table names in database
server.mydb.table1()
    = select * from table1
server.mydb.table1(id=47)
    = select * from table1 where id=47
server.mydb.table1(category='big', user='Fred', date=INCREASING)
    = select * from table1 where category="big" and user="Fred"
      order by date

and so on, which provides a very compact, yet readable (to me at
least :-), interface to tabular databases for things like CGI scripts.
I can use the same interface to access databases stored in flat files
as well.
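
[Editorial sketch, not part of the original message. The keyword
argument style of select described above could be built in present-day
Python along these lines. The Table class, its build() method and the
INCREASING marker are hypothetical reconstructions, sqlite3 stands in
for mSQL, and the sample rows are made up. Values are passed as bound
parameters; column names are only checked to be plain identifiers.]

```python
import sqlite3

INCREASING = object()    # hypothetical marker meaning "order by this column"


class Table:
    """Hypothetical callable table: keyword arguments become a WHERE clause."""

    def __init__(self, conn, name):
        self._conn = conn
        self._name = name

    def build(self, **criteria):
        """Return (sql, params) for the given column=value criteria."""
        where, order, params = [], [], []
        for column, value in criteria.items():
            if not column.isidentifier():
                raise ValueError("bad column name: %r" % (column,))
            if value is INCREASING:
                order.append(column)
            else:
                where.append(column + " = ?")
                params.append(value)
        sql = "select * from " + self._name
        if where:
            sql += " where " + " and ".join(where)
        if order:
            sql += " order by " + ", ".join(order)
        return sql, params

    def __call__(self, **criteria):
        sql, params = self.build(**criteria)
        return self._conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (id, category, owner)")
conn.executemany(
    "insert into table1 values (?, ?, ?)",
    [(47, "big", "Fred"), (12, "big", "Fred"), (5, "small", "Ann")],
)
table1 = Table(conn, "table1")
print(table1.build(category="big", owner="Fred", id=INCREASING))
print(table1(category="big", id=INCREASING))
```

[Attribute access such as server.mydb.table1 could be layered on top
with __getattr__, at the cost of the name clashes mentioned above.]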
If you want, the idea can even be extended (although it is a bit more
obscure) to fields:

server.mydb.table1.keys()
    = list of fields in table1
server.mydb.table1.id = server.mydb.table1['id']
    = Field object representing field id of table table1
server.mydb.table1.user.keys()
    = list of values for field user in table1
server.mydb.table1.user['Fred'] = server.mydb.table1.user.Fred
    = select * from table1 where user="Fred"

As you can see, I am addicted to this Dictionary + Attribute-access
mixed-metaphor.

David

From gstein@microsoft.com Thu Jan 9 23:47:54 1997
From: gstein@microsoft.com (Greg Stein)
Date: Thu, 9 Jan 1997 15:47:54 -0800
Subject: [PYTHON DB-SIG] DB-Modules returning lists of dicts
Message-ID:

Below is the module that I use to handle the tuple returns, presenting
them per Jim's speclet. The resulting objects can be viewed as a
sequence, dictionary, or attributed object. It's quite slick :-)

Returning a tuple is the most appropriate form for what databases
actually return. Responsibility for applying high-level semantics is
passed on up to the high-level code :-)

The essence to most Python C extensions is "provide a match to the
underlying C library and leave additional semantics to higher levels."
The DBAPI has kind of a hard problem with this since C libraries for
databases vary so much (can't just export ODBC or Oracle's OCI or
Informix's ESQL/C, etc). So it tries to find a small, useful
commonality across known database APIs, yet leave some room for
flexibility. Note that it doesn't try to establish the *same* API for
all databases, but to expose databases' APIs in a *similar* fashion.
Creating an "anydbm" equivalent for the databases implemented
according to the DBAPI is left to Python code.

Regarding Ted's question on binding: the SQL statement is intended to
have syntax of the form "select * from some_table where some_col = :1",
where the :1 binds to the first value of the param tuple. Note that in
Bertil's reply, he used a question mark. The question mark is typical
of ODBC (and, in turn, Informix). However, my experience is that it is
much inferior to numeric bind variables (where the number corresponds
to the positional param in the tuple). If I were to do things over,
I'd probably allow named bind variables, too, and accept a dictionary
for the params. Note that the spec enables efficient processing within
the module, by allowing it to map numeric bind variables to the
underlying database's need for question marks and caching the results
(see the comments in the spec regarding remembering the string object
reference). If a different string object comes in, then you pre-process
it and cache the two values.

Lastly, regarding Jim's comments on new semantics for the API: I'd
recommend that somebody implement that in Python to try it out. Should
be easy enough.

Here is some example code for construction of the high-level objects
using the dtuple.py module:

    rows = cursor.execute(query, params)
    desc = dtuple.TupleDescriptor(map(lambda x: (string.lower(x[0]),) + x[1:],
                                      cursor.description))
    for i in range(len(rows)):
        rows[i] = dtuple.DatabaseTuple(desc, rows[i])
    return rows

and dtuple.py:

----
"""\
This module implements various functions and classes and constants for
a generic database representation, implementation, and access.
"""

class TupleDescriptor:
    """\
    Instances of this class are used to describe database tuples (which
    are typically instances of DatabaseTuple or one of its derivative
    classes). These instances specify the column names, formats,
    lengths, and other relevant information about the items in a
    particular tuple. An instance is typically shared between many
    database tuples (such as those returned by a single query).

    Note: the term database tuple is rather specific; in actuality the
    tuple may have come from non-database sources and/or generated by a
    process wholly unrelated to databases.

    Note again: I'm open for new names for this and the DatabaseTuple
    class and concept :-)
    """

    def __init__(self, desc):
        """\
        An instance is created by passing a "descriptor" to fully
        specify the information about the related database tuple. This
        descriptor takes the form of a tuple or list where each element
        is a tuple. The first element of this tuple is the name of the
        column. The following elements of the tuple are used to describe
        the column (such as length, format, significant digits, etc).
        """
        self.desc = tuple(desc)
        ### validate the names?
        self.names = map(lambda x: x[0], desc)
        self.namemap = { }
        for i in range(len(self.names)):
            self.namemap[self.names[i]] = i

    def __len__(self):
        """\
        A tuple descriptor responds to __len__ to simplify some
        processing by allowing the use of the len() builtin function.
        """
        return len(self.names)

    def __repr__(self):
        return '%s(%s)' % (TupleDescriptor.__name__, repr(self.desc))

    def __str__(self):
        return str(self.desc)


class DatabaseTuple:
    """\
    Instances of this class are used to represent tuples of information,
    typically returned by a database query. A TupleDescriptor is used as
    a means of describing the information for a variety of access
    methods. The tuple's information can be accessed via simple
    indexing, slices, as a mapping where the keys are the column names
    (as defined by the descriptor), or via attribute-based access (where
    the attribute names are equivalent to the column names).

    This object acts as a tuple, a list, a mapping, and an instance. To
    retrieve "pure" tuples, lists, or mappings, the asTuple(), asList(),
    and asMapping() methods may be used, each returning a value equal to
    what this object pretends to be.

    There exists a potential ambiguity between attempting to act as a
    list or mapping and the attribute-based access to the data. In
    particular, if the column names are 'index', 'count', 'keys',
    'items', 'values', or 'has_key', then the attribute-based access
    will have precedence over their related methods for lists and
    mappings. To actually use these methods, simply apply them to the
    result of the asList() or asMapping() methods.

    Note that column names with leading underscores may interfere with
    the implementation of this class, and as a result may not be
    accessible via the attribute-access scheme. Also, column names of
    asTuple, asList, and asMapping will be inaccessible via the
    attribute-access scheme since those will always represent the
    methods. To access these columns, the mapping interface can be used
    with the column name as the mapping key.

    Note that a database tuple acts as a tuple with respect to
    sub-scripted assignment. TypeError exceptions will be raised for
    several situations, and AttributeError may be raised for some
    methods that are intended to mutate the data (list's 'sort' method)
    as these methods have not been implemented.
    """

    def __init__(self, desc, data):
        """\
        A DatabaseTuple is initialized with a TupleDescriptor and a
        tuple or list specifying the data elements.
        """
        if len(desc) != len(data):
            raise ValueError   # descriptor does not seem to describe tuple
        if type(desc) == type(()) or type(desc) == type([]):
            desc = TupleDescriptor(desc)
        self.__dict__['_desc_'] = desc
        self.__dict__['_data_'] = tuple(data)

    def __str__(self):
        return str(self._data_)

    def __repr__(self):
        return '%s(%s,%s)' % (DatabaseTuple.__name__,
                              repr(self._desc_),
                              repr(self._data_))

    def __cmp__(self, other):
        if type(self._data_) == type(other):
            return cmp(self._data_, other)
        if type(other) == type( {} ):
            return cmp(self.asMapping(), other)
        if type(other) == type( () ):
            return cmp(self.asTuple(), other)
        if type(self) == type(other): ### fix this: need to verify equal classes
            return cmp(self._data_, other._data_)
        return cmp(self._data_, other)

    def __getattr__(self, name):
        'Simulate attribute-access via column names'
        return self._getvalue_(name)

    def __setattr__(self, name, value):
        'Simulate attribute-access via column names'
        ### need to redirect into a db update
        raise TypeError, "can't assign to this subscripted object"

    def __getitem__(self, key):
        'Simulate indexed (tuple/list) and mapping-style access'
        if type(key) == type(1):
            return self._data_[key]
        return self._getvalue_(key)

    def __setitem__(self, key, value):
        'Simulate indexed (tuple/list) and mapping-style access'
        if type(key) == type(1):
            ### need to redirect into a db update of elem #key
            raise TypeError, "can't assign to this subscripted object"
        ### need to redirect into a db update of elem named key
        raise TypeError, "can't assign to this subscripted object"

    def __len__(self):
        return len(self._data_)

    def __getslice__(self, i, j):
        'Simulate list/tuple slicing access'
        return self._data_[i:j]

    def __setslice__(self, i, j, list):
        'Simulate list/tuple slicing access'
        ### need to redirect into a db update of elems
        raise TypeError, "can't assign to this subscripted object"

    def _keys_(self):
        "Simulate mapping's methods"
        return self._desc_.names

    def _has_key_(self, key):
        "Simulate mapping's methods"
        return key in self._desc_.names

    def _items_(self):
        "Simulate mapping's methods"
        return self.asMapping().items()

    def _count_(self, item):
        "Simulate list's methods"
        return self.asList().count(item)

    def _index_(self, item):
        "Simulate list's methods"
        return self.asList().index(item)

    def _getvalue_(self,name):
        'Internal method for named-based value retrieval'
        if name not in self._desc_.names:
            if name == 'keys':
                return self._keys_
            if name == 'items':
                return self._items_
            if name == 'values':
                return self.asList
            if name == 'has_key':
                return self._has_key_
            if name == 'count':
                return self._count_
            if name == 'index':
                return self._index_
            raise AttributeError
        return self._data_[self._desc_.namemap[name]]

    def asMapping(self):
        'Return the "tuple" as a real mapping'
        value = { }
        for name, idx in self._desc_.namemap.items():
            value[name] = self._data_[idx]
        return value

    def asTuple(self):
        'Return the "tuple" as a real tuple'
        return self._data_

    def asList(self):
        'Return the "tuple" as a real list'
        return map(None, self._data_)
----

--
Greg Stein, Microsoft Corporation
execfile("disclaimer.py")

From P.S.Craig@durham.ac.uk Fri Jan 10 09:45:37 1997
From: P.S.Craig@durham.ac.uk (P.S.Craig@durham.ac.uk)
Date: Fri, 10 Jan 1997 09:45:37 +0000 (GMT)
Subject: [PYTHON DB-SIG] DB-Modules returning lists
Message-ID: <3522.199701100945@kacmoody>

Hi,

I have been following this discussion with some interest and debating
whether I should step in with my pennyworth. Here it is.

About 4 months ago, I needed a python binding to ingres so that I
could python/Tk to provide a GUI for some student attendance records
which are entered by many people and which need to be accessible to
the whole department.

I decided to brew my own for a number of reasons: 1) I was unable to get hold of the standard stuff which people have been talking about because of connection problems to the ftp site. 2) I do not come from a database background. In fact, I am generally much more interested the Matrix-SIG than the DB-SIG. 3) I wanted to be able to do numerical computations with the results of queries without having to write lots of type conversion code. The solution I settled on was to use two-dimensional Numerical python arrays to hold the results of select queries. I can imagine that those reading this will immediately say that not all data is numerical and of course you are right. The secret is that Numerical python can handle multi-dimensional arrays of arbitrary python objects. There are several benefits: 1) The ability to perform binary operations with results of queries can be nice. 2) Often the 2-d array from a db actually represents a higher-dimensional structure. It's easy with numerical python to change the dimensionality of the returned array. Then we can access the data in its natural representation 3) The problem of naming the columns is easily solved by wrapping the numerical python array in a class which implements __getitem__ and __setitem__ to allow named columns. It's then easy to pick out data which satisfy additional criteria to those specified in the original query without returning to the db. 4) Numerical python is fairly efficient. A couple of slight problems: 1) Wrapping the numerical python arrays in another class means that binary operations don't work transparently. In principle, this can be resolved using the UserArray stuff for numerical python, but I haven't yet had the time. 2) I have dropped the whole cursor idea. All data from a query is lifted immediately into python in one gulp. Potentially disastrous for very large query results. Is this a big hassle in practice? Not for me, at any rate. 
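
[Editorial sketch, not part of the original message. Benefit 3) above,
a wrapper class whose __getitem__ and __setitem__ take column names,
might look roughly like this in present-day Python. To stay
self-contained the sketch stores the result as plain nested lists
rather than a Numerical Python array, so it shows only the naming
layer, not the array arithmetic. The class name ResultArray, its
where() helper and the attendance rows are all invented for
illustration.]

```python
class ResultArray:
    """Hypothetical wrapper: a 2-D query result with named columns."""

    def __init__(self, names, rows):
        self.names = list(names)
        self.rows = [list(row) for row in rows]

    def __getitem__(self, name):
        # Return one whole column, selected by name.
        i = self.names.index(name)
        return [row[i] for row in self.rows]

    def __setitem__(self, name, values):
        # Overwrite one whole column, selected by name.
        i = self.names.index(name)
        values = list(values)
        if len(values) != len(self.rows):
            raise ValueError("column length does not match row count")
        for row, value in zip(self.rows, values):
            row[i] = value

    def where(self, name, predicate):
        # Filter rows in Python, without going back to the database.
        i = self.names.index(name)
        kept = [row for row in self.rows if predicate(row[i])]
        return ResultArray(self.names, kept)


attendance = ResultArray(
    ("student", "week", "present"),
    [("ann", 1, 1), ("bob", 1, 0), ("ann", 2, 1)],
)
ann = attendance.where("student", lambda s: s == "ann")
print(ann["week"])
print(sum(attendance["present"]))
```

[As in the message, everything is held in memory at once, so this
shares the drawback listed under problem 2).]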
The modified numerical python stuff is implemented in a module. I sort of intend this module to end up offering an array interface similar to S (a statistics and data analysis package). I also have a class based SQL interface to ingres. The basic classes are Connection, View, Alias, Table and a bunch of operator classes for handling query building. The module works well, but I am aware that it needs a number of extra classes to make it cleaner (Field, Query, etc). I feel irritated with myself for doing everything completely from scratch. I don't know if any of what I have described is interesting to others. If it is, I'll be pleased to show you my (lousy) code or tell you more about how it works. Peter Craig #--------------------------------------------------------------------# | E-mail: P.S.Craig@durham.ac.uk Telephone: +44-91-3742376 (Work) | | Fax: +44-91-3747388 +44-91-3860448 (Home) | | | | WWW: http://fourier.dur.ac.uk:8000/stats/psc.html | | | | Snail: Peter Craig, Dept. of Math. Sciences, Univ. of Durham, | | South Road, Durham DH1 3LE, England | #--------------------------------------------------------------------# ================= DB-SIG - SIG on Tabular Databases in Python send messages to: db-sig@python.org administrivia to: db-sig-request@python.org ================= From david@alphagene.com Fri Jan 10 13:04:49 1997 From: david@alphagene.com (David Walton) Date: Fri, 10 Jan 1997 08:04:49 -0500 Subject: [PYTHON DB-SIG] DB-Modules returning lists of dicts References: Message-ID: <32D63E71.15FB@alphagene.com> I just want to thank everyone who responded to my question about having database modules return lists of dictionaries in addition to lists of tuples. I've value all points of view. I'm very interested in Jim Fulton's ideas regarding a database tuple type, or a more robust database object. I definitely think we will pursue something like this here. 
The points that came through most clearly though, were Greg Stein's: (for me anyway) > The essence to most Python C extensions is "provide a match to the > underlying C library and leave additional semantics to higher levels." and > Returning a tuple is the most appropriate form for what databases > actually return. Responsibility for applying high-level semantics is > passed on up to the high-level code :-) I also would like to apologize for my query in my follow-up message which asked if silence meant disinterest. Obviously it does not. We're all busy people, and I very much appreciate you all taking the time to respond. Dave -- Dave Walton Scientific Software Engineer AlphaGene, Inc. david@alphagene.com ================= DB-SIG - SIG on Tabular Databases in Python send messages to: db-sig@python.org administrivia to: db-sig-request@python.org ================= From gstein@microsoft.com Fri Jan 10 22:23:48 1997 From: gstein@microsoft.com (Greg Stein) Date: Fri, 10 Jan 1997 14:23:48 -0800 Subject: [PYTHON DB-SIG] DB-Modules returning lists of dicts Message-ID: >---------- >From: David Walton[SMTP:david@alphagene.com] >Sent: Friday, January 10, 1997 5:04 AM >To: db-sig@python.org >Cc: Rick Stacy; R. Mark Adams; Lyndon Hicks; gtc@informatics.jax.org >Subject: Re: [PYTHON DB-SIG] DB-Modules returning lists of dicts > >I'm very interested in Jim Fulton's ideas regarding a database >tuple type, or a more robust database object. I definitely think >we will pursue something like this here. The dtuple.py that I posted was based on Jim's ideas after a talk we had at a Python conference a year ago. He couldn't publish his code, so wrote the new stuff to match his needs/desires (and mine :-). It works well, but won't be as performant as a C-based implementation, which is what Jim had developed. 
-g

From gstein@microsoft.com Fri Jan 10 22:28:46 1997
From: gstein@microsoft.com (Greg Stein)
Date: Fri, 10 Jan 1997 14:28:46 -0800
Subject: [PYTHON DB-SIG] DB-Modules returning lists
Message-ID:

heh... Jim Fulton and I discussed using the matrix module at one
point, along with Jim Hugunin. We punted, though, because of the need
to fetch everything and wanting to reduce dependencies.

A comment: you mentioned how you typically just want to fetch
everything, regardless. The fetchall() method exists because I got
that same feedback from a number of people at the conference a while
back (Benny Shomer was particularly persuasive :-). Most of my usage
is actually fetchmany() called with a maximum number of rows that I
want.

If you could post your classes, even in their current form, then I bet
somebody will find some use for them. As I mentioned a couple of years
ago: post what you have. Never wait "to clean it up", because it will
never be cleaned up to your satisfaction, so it will never get posted,
so nobody will ever be able to benefit from your contribution. :-)

-g

>----------
>From: P.S.Craig@durham.ac.uk[SMTP:P.S.Craig@durham.ac.uk]
>Sent: Friday, January 10, 1997 1:45 AM
>To: db-sig@python.org
>Subject: Re: [PYTHON DB-SIG] DB-Modules returning lists
>
>Hi,
>
>I have been following this discussion with some interest and debating
>whether I should step in with my pennyworth. Here it is.
>
>About 4 months ago, I needed a python binding to ingres so that I
>could use python/Tk to provide a GUI for some student attendance records
>which are entered by many people and which need to be accessible to
>the whole department.
>
>I decided to brew my own for a number of reasons:
>1) I was unable to get hold of the standard stuff which people have
>   been talking about because of connection problems to the ftp site.
>2) I do not come from a database background. In fact, I am generally
>   much more interested in the Matrix-SIG than the DB-SIG.
>3) I wanted to be able to do numerical computations with the results
>   of queries without having to write lots of type conversion code.
>
>The solution I settled on was to use two-dimensional Numerical python
>arrays to hold the results of select queries. I can imagine that those
>reading this will immediately say that not all data is numerical and
>of course you are right. The secret is that Numerical python can
>handle multi-dimensional arrays of arbitrary python objects.
>
>There are several benefits:
>
>1) The ability to perform binary operations with results of queries
>   can be nice.
>2) Often the 2-d array from a db actually represents a
>   higher-dimensional structure. It's easy with numerical python to
>   change the dimensionality of the returned array. Then we can access
>   the data in its natural representation.
>3) The problem of naming the columns is easily solved by wrapping the
>   numerical python array in a class which implements __getitem__ and
>   __setitem__ to allow named columns. It's then easy to pick out data
>   which satisfy additional criteria to those specified in the
>   original query without returning to the db.
>4) Numerical python is fairly efficient.
>
>A couple of slight problems:
>
>1) Wrapping the numerical python arrays in another class means that
>   binary operations don't work transparently. In principle, this can
>   be resolved using the UserArray stuff for numerical python, but I
>   haven't yet had the time.
>2) I have dropped the whole cursor idea. All data from a query is
>   lifted immediately into python in one gulp. Potentially disastrous
>   for very large query results. Is this a big hassle in practice?
>   Not for me, at any rate.
>
>The modified numerical python stuff is implemented in a module. I sort
>of intend this module to end up offering an array interface similar to
>S (a statistics and data analysis package).
>
>I also have a class based SQL interface to ingres. The basic classes
>are Connection, View, Alias, Table and a bunch of operator classes for
>handling query building. The module works well, but I am aware that it
>needs a number of extra classes to make it cleaner (Field, Query,
>etc).
>
>I feel irritated with myself for doing everything completely from
>scratch. I don't know if any of what I have described is interesting
>to others. If it is, I'll be pleased to show you my (lousy) code or
>tell you more about how it works.
>
>Peter Craig
>
>#--------------------------------------------------------------------#
>| E-mail: P.S.Craig@durham.ac.uk    Telephone: +44-91-3742376 (Work) |
>| Fax:    +44-91-3747388                       +44-91-3860448 (Home) |
>|                                                                    |
>| WWW:   http://fourier.dur.ac.uk:8000/stats/psc.html                |
>|                                                                    |
>| Snail: Peter Craig, Dept. of Math. Sciences, Univ. of Durham,      |
>|        South Road, Durham DH1 3LE, England                         |
>#--------------------------------------------------------------------#
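
For readers following the fetchall()/fetchmany() point and the earlier
"lists of dicts" question, here is a rough sketch of how the two fit
together: rows are pulled in bounded batches with fetchmany() and the
column names are layered on top in Python, leaving the driver to return
plain tuples. This is only an illustration under stated assumptions. It
uses the sqlite3 module from the modern Python standard library as a
stand-in driver, not any of the modules discussed in this thread, and
the helper names (iter_rows, as_dicts) and the attendance table are
hypothetical.

```python
import sqlite3


def iter_rows(cursor, batch_size=100):
    """Yield rows from an already-executed cursor, at most batch_size per fetch."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for row in rows:
            yield row


def as_dicts(cursor, batch_size=100):
    """Yield each row as a dict keyed by column name.

    Column names come from cursor.description; the driver itself still
    returns plain tuples, and the naming is added at this higher level.
    """
    names = [col[0] for col in cursor.description]
    for row in iter_rows(cursor, batch_size):
        yield dict(zip(names, row))


# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendance (student TEXT, sessions INTEGER)")
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?)",
    [("ann", 9), ("bob", 7), ("cy", 10)],
)

cur = conn.cursor()
cur.execute("SELECT student, sessions FROM attendance ORDER BY student")
records = list(as_dicts(cur, batch_size=2))
conn.close()
```

With a small batch_size the whole result set is never held by the
fetch call at once, which is the concern raised about lifting a very
large query result into Python in one gulp; calling fetchall() instead
would trade that safety for simplicity.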