From aprotin at research.att.com Wed May 2 14:47:45 2007 From: aprotin at research.att.com (Art Protin) Date: Wed, 02 May 2007 08:47:45 -0400 Subject: [DB-SIG] Fetch_raw Message-ID: <46388871.8070402@research.att.com> Dear folks, I am considering yet another extension to my driver. I especially like the way that the methods cursor.fetch...() return lists of objects of the right type and the user of the API (the application) does not need separate methods, one for each type, to fetch individual columns. However, in the debugging of our JDBC driver I find that there is some merit in getting the data, any of the data, as strings, which just happens to be the format that our database uses to pass them. Thus, there are two aspects where you could now influence my implementation: 1) any of you could argue that it is a corruption of the API that will lead to bad results or bad practices to include a mechanism for accessing the "raw" data; OR 2) any of you could argue about the form that such an extension should take. I am torn between adding an additional optional parameter to the methods .fetchone(), .fetchmany(), and .fetchall(), and adding three new methods: .fetch_one_raw(), .fetch_many_raw() and .fetch_all_raw(). (All of the names I used for extension methods and attributes have an internal underbar so my users will be able to easily tell that they are extensions.) In the first case, the optional new parameter would indicate whether the data should be passed back normally or "raw", i.e. with all the columns forced to be of type "string". In the second case that information would be coded into the choice of method. I feel that even having the three methods, fetchone(), fetchmany(), and fetchall(), is too many (and too like JDBC) and would have written the spec (PEP 249) to have just fetch() with (an) argument(s) for the count (and any other needed parameters) to adjust the behavior. However, now that the spec calls for three, should I follow that pattern and go on to providing six or buck it by adding an optional parameter to each of the three? Thank you all, Arthur Protin From mal at egenix.com Wed May 2 15:21:03 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 02 May 2007 15:21:03 +0200 Subject: [DB-SIG] Fetch_raw In-Reply-To: <46388871.8070402@research.att.com> References: <46388871.8070402@research.att.com> Message-ID: <4638903F.7090203@egenix.com> On 2007-05-02 14:47, Art Protin wrote: > Dear folks, > I am considering yet another extension to my driver. I especially > like the way that the methods cursor.fetch...() return lists of objects > of the right type and the user of the API (the application) does not > need separate methods, one for each type, to fetch individual columns. > However, in the debugging of our JDBC driver I find that there is > some merit in getting the data, any of the data, as strings, which just > happens to be the format that our database uses to pass them. > Thus, there are two aspects where you could now influence my > implementation: > > 1) any of you could argue that it is a corruption of the API that will > lead to bad results or bad practices to include a mechanism for accessing > the "raw" data; OR > 2) any of you could argue about the form that such an extension should > take. > > I am torn between adding an additional optional parameter to the > methods .fetchone(), .fetchmany(), and .fetchall(), and adding three > new methods: .fetch_one_raw(), .fetch_many_raw() and .fetch_all_raw().
> (All of the names I used for extension methods and attributes have an > internal underbar so my users will be able to easily tell that they are > extensions.) In the first case, the optional new parameter would > indicate whether the data should be passed back normally or "raw", > i.e. with all the columns forced to be of type "string". In the > second case that information would be coded into the choice of method. > I feel that even having the three methods, fetchone(), fetchmany(), and > fetchall(), is too many (and too like JDBC) and would have written the > spec (PEP 249) to have just fetch() with (an) argument(s) for the count > (and any other needed parameters) to adjust the behavior. However, > now that the spec calls for three, should I follow that pattern and go on > to providing six or buck it by adding an optional parameter to each of the > three? Why don't you simply add a configuration parameter to the connection/cursor that sets the type mapping for all data types to strings?! That way you avoid having to change your other code completely. BTW, I still have to follow up on the type registry discussion. Sorry, but I simply didn't have time to read everything yet. Cheers, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 02 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From carsten at uniqsys.com Wed May 2 15:25:55 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Wed, 02 May 2007 09:25:55 -0400 Subject: [DB-SIG] Fetch_raw In-Reply-To: <46388871.8070402@research.att.com> References: <46388871.8070402@research.att.com> Message-ID: <1178112355.3394.35.camel@dot.uniqsys.com> On Wed, 2007-05-02 at 08:47 -0400, Art Protin wrote: > [...]I find that there is > some merit in getting the data, any of the data, as strings, which just > happens to be the format that our database uses to pass them. > Thus, there are two aspects where you could now influence my > implementation: > > 1) any of you could argue that it is a corruption of the API that will > lead to bad results or bad practices to include a mechanism for accessing > the "raw" data; OR I'm not going to argue that. Allowing access to "raw" data may be beneficial for performance and meaningful to the developer. For example, my InformixDB module allows retrieving values of User-Defined Types in binary form. I think this is perfectly acceptable as long as the developers are consenting adults who know what to do with the raw data and understand that they're welding their application to a particular database and API implementation. > 2) any of you could argue about the form that such an extension should > take. And we will ;) > I am torn between adding an additional optional parameter to the > methods .fetchone(), .fetchmany(), and .fetchall(), and adding three > new methods: .fetch_one_raw(), .fetch_many_raw() and .fetch_all_raw(). Adding an optional parameter is in my opinion way better than making three new methods.
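[For readers weighing the two designs, a minimal sketch of the optional-parameter variant might look like the following; the driver internals (_rows, _types, _convert) are invented for illustration and are not part of any driver discussed in this thread.]

    # Sketch of an optional "raw" flag on fetchone(); internals are
    # hypothetical stand-ins for a real driver's machinery.
    class Cursor:
        def __init__(self, rows, types, convert):
            self._rows = iter(rows)    # wire-format values, e.g. strings
            self._types = types        # per-column type names
            self._convert = convert    # type-aware conversion function

        def fetchone(self, raw=False):
            row = next(self._rows, None)
            if row is None:
                return None            # no more rows
            if raw:
                return tuple(row)      # bypass all type conversion
            return tuple(self._convert(v, t)
                         for v, t in zip(row, self._types))

    cur = Cursor([("1", "x"), ("2", "y")], ["INTEGER", "CHAR"],
                 lambda v, t: int(v) if t == "INTEGER" else v)
    print(cur.fetchone())          # (1, 'x') -- converted as usual
    print(cur.fetchone(raw=True))  # ('2', 'y') -- every column a string

[The three-extra-methods variant would express the same raw/converted choice in the method name instead of an argument.]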
Alternatively or additionally, you could add an attribute to the cursor object for controlling how fetch results are returned. -Carsten From mike_mp at zzzcomputing.com Thu May 3 02:03:05 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Wed, 2 May 2007 20:03:05 -0400 Subject: [DB-SIG] Fetch_raw In-Reply-To: <46388871.8070402@research.att.com> References: <46388871.8070402@research.att.com> Message-ID: On May 2, 2007, at 8:47 AM, Art Protin wrote: > I am considering yet another extension to my driver. I especially > like the way that the methods cursor.fetch...() return lists of > objects > of the right type and the user of the API (the application) does not > need separate methods, one for each type, to fetch individual columns. > However, in the debugging of our JDBC driver I find that there is > some merit in getting the data, any of the data, as strings, which > just > happens to be the format that our database uses to pass them. > Thus, there are two aspects where you could now influence my > implementation: > since it's essentially an exposure of the implementation details of the underlying network conversation, and the use case you've found in JDBC is one of debugging, why not instead allow a hook for a debugging function that intercepts network traffic in response to normal fetchone()/fetchXXX() calls? I can't see any reason an application would want to receive the network traffic directly under normal use (since the details of network conversations change with new releases and backends). From phd at phd.pp.ru Thu May 3 15:29:43 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Thu, 3 May 2007 17:29:43 +0400 Subject: [DB-SIG] SQLObject 0.7.6 Message-ID: <20070503132943.GC9945@phd.pp.ru> Hello! I'm pleased to announce the 0.7.6 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.7.6 News and changes: http://sqlobject.org/docs/News.html What's New ========== News since 0.7.5 ---------------- Bug Fixes --------- * Fixed a longstanding bug with .select() ignoring 'limit' parameter. * Fixed a bug with absent comma in JOINs. * Fixed sqlbuilder - .startswith(), .endswith() and .contains() assumed their parameter must be a string; now you can pass an SQLExpression: Table.q.name.contains(func.upper('a')), for example. * Fixed a longstanding bug in sqlbuilder.Select() with groupBy being a sequence. * Fixed a bug with Aliases in JOINs. * Yet another patch to properly initialize MySQL connection encoding. * Fixed a minor comparison problem in test_decimal.py. Docs ---- * Added documentation about 'validator' Col constructor option. * More documentation about orderBy. For a more complete list, please see the news: http://sqlobject.org/docs/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN.
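[For readers who have not used SQLObject, a minimal sketch of the class-per-table model the announcement describes; the Person table and its data are invented, and the in-memory SQLite URI keeps the snippet self-contained.]

    # Minimal SQLObject sketch: a table as a class, rows as instances.
    from sqlobject import SQLObject, StringCol, connectionForURI, sqlhub

    sqlhub.processConnection = connectionForURI('sqlite:/:memory:')

    class Person(SQLObject):       # maps to a "person" table
        name = StringCol()

    Person.createTable()
    Person(name='Ada')             # instantiating inserts a row
    # .select() with the 'limit' parameter mentioned in the bug fixes:
    print(list(Person.select(Person.q.name.startswith('A'), limit=1)))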
From phd at phd.pp.ru Thu May 3 15:44:43 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Thu, 3 May 2007 17:44:43 +0400 Subject: [DB-SIG] SQLObject 0.8.3 Message-ID: <20070503134443.GC10781@phd.pp.ru> Hello! I'm pleased to announce the 0.8.3 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Development: http://sqlobject.org/devel/ Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.8.3 News and changes: http://sqlobject.org/News.html What's New ========== News since 0.8.2 ---------------- Bug Fixes --------- * Fixed a longstanding bug with .select() ignoring 'limit' parameter. * Fixed a bug with absent comma in JOINs. * Fixed sqlbuilder - .startswith(), .endswith() and .contains() assumed their parameter must be a string; now you can pass an SQLExpression: Table.q.name.contains(func.upper('a')), for example. * Fixed a longstanding bug in sqlbuilder.Select() with groupBy being a sequence. * Fixed a bug with Aliases in JOINs. * Yet another patch to properly initialize MySQL connection encoding. * Fixed a minor comparison problem in test_decimal.py. Docs ---- * Added documentation about 'validator' Col constructor option. * Added an answer and examples to the FAQ on how to use sqlmeta.createSQL. * More documentation about orderBy. For a more complete list, please see the news: http://sqlobject.org/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From sable at users.sourceforge.net Thu May 3 15:46:38 2007 From: sable at users.sourceforge.net (Sébastien Sablé) Date: Thu, 03 May 2007 15:46:38 +0200 Subject: [DB-SIG] Sybase module 0.38 released Message-ID: <4639E7BE.8050804@users.sourceforge.net> WHAT IS IT: The Sybase module provides a Python interface to the Sybase relational database system. It supports all of the Python Database API, version 2.0 with extensions. The module is available here: http://downloads.sourceforge.net/python-sybase/python-sybase-0.38.tar.gz The module home page is here: http://python-sybase.sourceforge.net/ CHANGES SINCE 0.38pre2: * Corrected bug in databuf_alloc: Sybase reports the wrong maxlength for numeric type - verified with Sybase 12.5 - thanks to patch provided by Phil Porter MAJOR CHANGES SINCE 0.37: * This release works with Python 2.5 * It also works with Sybase 15 * It works with 64-bit clients * It can be configured to return native Python datetime objects * The bug "This routine cannot be called because another command structure has results pending."
which appears in various cases has been corrected * It includes a unit test suite based on the dbapi2.0 compliance test suite From aprotin at research.att.com Thu May 3 21:04:29 2007 From: aprotin at research.att.com (Art Protin) Date: Thu, 03 May 2007 15:04:29 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: References: <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> <1177172281.3191.79.camel@localhost.localdomain> <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> Message-ID: <463A323D.1020301@research.att.com> Dear folks, I just spent the morning reviewing the dialog on this thread without completely abandoning thoughts on my other thread, "Fetch_raw". On that other thread, in spite of this fragment of the conversation (Carsten Haese responding to me): >> 1) any of you could argue that it is a corruption of the API that will >> lead to bad results or bad practices to include a mechanism for accessing >> the "raw" data; OR > > > >I'm not going to argue that. Allowing access to "raw" data may be >beneficial for performance and meaningful to the developer. > I found that supportive feedback made me realize that my approach was just another special case of this more general issue, and should be discarded. I have not yet found a solution to the need expressed in this thread but believe that I have found a dozen key issues that define the problem (although when I enumerate them below, I may combine or subdivide them slightly differently). Further, remember as you read this that this is my view and I tend to overstate my case. First, I believe that nothing returned from a call to cursor.fetchXXX() should be of a type bound to the API/interface. I recognize that this is a difficult constraint but the values need to be usable by some application written in Python and may even need to be written to a different DBMS. I see the role of the interface to make the data available in pure Python form. (So this argues against the proposal by Carsten Haese, I think.) This view says then that the types used by SQL and various DBMS are "foreign types" (foreign to Python). (I think that this generalizes to a claim that the view of the API is from Python looking out - queries go out and results come back.) There is an issue with representing the identities of foreign types in Python. It may not prove to be difficult in practice but I think we need to be aware of the issue to keep from making a big problem out of it. I use strings to represent (the names/identifiers of) both SQL and native database types. There is a further issue with types in that SQL data types do not map fully either to Python types or to the native types of some DBMSs. My driver+DBMS allows SQL queries to reference the native types and all the results are expressed in native types. Default conversions are done based not on the SQL types reported in cursor.description but are based on the native types reported in the metadata of the result. There seems to be a lot of confusion (that I especially see in our JDBC driver and its applications) about the difference between the tables (and columns) in the database and the results from a query.
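[The table-versus-result distinction is easy to demonstrate with a stock DB-API module; a small sqlite3 illustration with an invented table:]

    # The result of a query is an anonymous, computed relation; DB-API's
    # cursor.description describes the *result*, not the base table.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE emp (name TEXT, salary NUMERIC)")
    cur.execute("SELECT count(*), max(salary) FROM emp")
    print([d[0] for d in cur.description])  # ['count(*)', 'max(salary)']
    # sqlite3 leaves the type_code slot of each 7-tuple as None; drivers
    # for typed engines report the result column's type there, which need
    # not match the declared type of any base-table column.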
My DBMS creates in effect an anonymous (& temporary) table with the results of a query and the types of the columns in the result are not necessarily the exact same types as the columns in the db that went into the result. Yet I find some applications that use JDBC use the type gotten from the description of the original table and not the type given with the result. Some of the comments in this thread seem to reflect this confusion although I may be mistaken. While my interface uses strings to represent the types, those strings are formatted to express structure, specifically a major type and a minor or sub type. I doubt that my DBMS is unique with regard to having a hierarchy of type information although it may be that none yet go beyond two levels. It may even be the case that someone has purely orthogonal scales that define their type space. We need to consider how to encode the taxonomy of types with enough flexibility to handle most if not all players. I suspect that all the various type taxonomies can be accommodated with merely two levels by defining an abstract main type, with a method to derive it from the full type, and the full type. A mapping scheme would operate at the abstracted main type and the conversion routines could branch out as needed. The selection of the abstracted main types would be an implementer's trade-off of grouping common functionality versus the bushiness of special handling. The role of my sub types in my collection of conversion routines varies widely, from their specifying a trivial parameter, such as the maximum length of a string, to their specifying formats and reference points, as the subtypes of date do. My DBMS even has types that possibly should be the same major type but aren't. Thus, in general, I find that, rather than code up thousands of similar routines, I have a handful of conversion routines that need to know the full type representation (i.e. both major and sub types) in order to do the proper conversion. I suspect that this should be a common feature, that the conversion routines put into a "mapping" structure always have a "signature" (argument list) that accepts the relevant type information as well as the data to be converted. With a good enough scheme for type groups and full types (what I was calling abstracted main types and full types) the mapping for conversions to and from foreign types ought to become fairly simple. A device not unlike a dictionary would associate a type group with two routines, one to export a Python value to the DBMS (for query parameters or inserts or updates) and one to import a value from the DBMS. The import routines would take a raw value and (the representation of) a full (foreign) type and convert the value into a pure Python value of the appropriate type (it should "know" what type that is). The export routine should take a Python value and (the representation of) a full foreign type and again do the right thing. The routine can be written to ignore the foreign type, relying on the method that "abstracted" the type into a type grouping to sufficiently distinguish the types and fully select the conversion, or at the other extreme could handle every type and subtype and subsubtype in the DBMS such that there is only the one pair of routines in the mapping structure and one type group. There are reasons why this mapping structure should be at the module level and inherited by the connection and cursor in turn. Disturbingly, I even see worth in separately tying it to either the statement or the results.
(I do implement a query object in my driver that corresponds to the SQL statement and could thus separate them.) I even see the worth in tying it to specific columns as exemplified by the jpeg and pickle case. Given the current structure of the API, I think that the setting on the cursor should stay until reset. If you know enough about things to set it up you should know when to reset it. I do not remember who commented on the distastefulness of messing with the SQL queries that go through the interface, but I feel I need to comment on it as well. As ugly as such tampering is, my driver has no recourse but to alter the SQL. It is the only way that I can support SQL parameters. That said, I strive to make the fewest and smallest changes to the SQL. Looking back on this much too long posting, I am struck by the realization that 1) there seems to be very little need to extend the support for mapping SQL types to Python types (there are so few of them and they don't change much); 2) the very hard part of supporting mapping from DBMS native types to Python types in a generic manner is that DBMS native types aren't generic. ( and 3) I do not need to add public access to the raw data - people debugging things will have to poke around under the hood or complain to me about it.) I hope that this was worth whatever time you took reading it. Thank you all, Arthur Protin From carsten at uniqsys.com Sun May 6 05:00:17 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sat, 05 May 2007 23:00:17 -0400 Subject: [DB-SIG] Type mapping proposal, revision 2 Message-ID: <1178420417.3119.7.camel@localhost.localdomain> Hiya everybody, I'd like to pick up the discussion on "Controlling return types" again. I have incorporated feedback I received in response to my first type mapping proposal into http://www.uniqsys.com/~carsten/typemap.html . Comments are appreciated. Best regards, -- Carsten Haese http://informixdb.sourceforge.net From carsten at uniqsys.com Sun May 6 06:23:42 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sun, 06 May 2007 00:23:42 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <463A323D.1020301@research.att.com> References: <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> <1177172281.3191.79.camel@localhost.localdomain> <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> <463A323D.1020301@research.att.com> Message-ID: <1178425422.4680.5.camel@localhost.localdomain> On Thu, 2007-05-03 at 15:04 -0400, Art Protin wrote: > First, I believe that nothing returned from a call to > cursor.fetchXXX() > should be of a type bound to the API/interface. I recognize that this > is > a difficult constraint but the values need to be usable by some > application > written in Python and may even need to be written to a different DBMS. > I see the role of the interface to make the data available in pure > Python > form. (So this argues against the proposal by Carsten Haese, I think.) I agree, and I have changed my proposal significantly to remove the "welding" of output data to a specific interface.
However, I don't agree with the absoluteness of the assertion that "nothing returned from a call to cursor.fetchXXX() should be of a type bound to the API/interface." For example, Smart Blobs from an Informix database wouldn't have any meaning in any other database, so I think it's acceptable that they are returned as objects from the informixdb module. However, as long as the statement is understood to be about standard SQL types, I have no problem with it. > [lengthy discussion of main types and sub types...] In the context of my type mapping proposal, for output mapping it is the API's responsibility to convey enough information in the type key so that the mapping can determine which adapter to call. For your database, this might be a tuple of main type and subtype. The application is free to use a mapping that dispatches to an adapter function solely based on the main type and have the adapter branch on the subtype, or use a mapping that uses both pieces of information for making the choice of adapter function. > [export and import functions...] Associating an import function *and* an export function with the database type information is problematic. I don't have a problem with what you call import functions. They seem to serve the same purpose as the adapter functions in the outputmap in revision 2 of my proposal. (Your naming of mapping directions is from the application's point of view, whereas mine is from the database's point of view following the semantics established by the naming for the setinputsizes and setoutputsizes methods.) On the other hand however, coupling an import function with an export function that's looked up based on the database-side type information is not a good idea, for various reasons. In general, when binding parameters to a query, the parameter need not be destined for a database column. Parameters can appear in the WHERE clause of a select statement, and it's not guaranteed that the underlying database has any way of finding out what type of datum should be bound to such a parameter. Also, even if this information were available, the process for binding an application object to, say, a character column would depend very much on what kind of application object it is. A string can be passed on verbatim, a unicode object needs to be encoded, and a Geometry object might need to be translated into OpenGIS Well-Known-Text format. If the export function were looked up based on database type, those three cases would all have to be handled by the same export function, which seems utterly ridiculous to me. It makes much more sense to map input parameters based on the type of Python object that is provided as the parameter value, and that's the inputmap that I'm proposing. That way you get two separate mappings for the symmetric purposes of: * Given a value and type from the database, map it to an application object, and * Given a value and type from the application, map it to a database object. I don't mind supplying information about the database-side type to the adapter functions as optional parameters if this information is available, but determining the export function based on this is in my opinion not feasible for the reasons I stated above. Best regards, -- Carsten Haese http://informixdb.sourceforge.net From daloonia at gmx.de Mon May 7 09:32:14 2007 From: daloonia at gmx.de (Marion Balthasar) Date: Mon, 07 May 2007 09:32:14 +0200 Subject: [DB-SIG] pyPgSQL for Python2.5 ? 
Message-ID: <20070507073214.181680@gmx.net> Hello list, does anyone know if I can get a precompiled win32 binary package of pyPgSQL for Python 2.5? At the homepage there are only ones for 2.4 and earlier, also a platform-independent source package (which I also tried to build, but I'm using the mingw32 development environment, and it doesn't work with this one...) Does anyone have an idea if and where I can get a suitable pyPgSQL module? I really need to work with Python 2.5! There is no way to avoid this for my work. Thanks in advance, Marion. -- "Feel free" - 10 GB Mailbox, 100 FreeSMS/Monat ... Jetzt GMX TopMail testen: http://www.gmx.net/de/go/topmail From aprotin at research.att.com Mon May 7 15:53:23 2007 From: aprotin at research.att.com (Art Protin) Date: Mon, 07 May 2007 09:53:23 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1178425422.4680.5.camel@localhost.localdomain> References: <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> <1177172281.3191.79.camel@localhost.localdomain> <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> <463A323D.1020301@research.att.com> <1178425422.4680.5.camel@localhost.localdomain> Message-ID: <463F2F53.6070003@research.att.com> Dear Carsten, et al., Carsten Haese wrote: >On Thu, 2007-05-03 at 15:04 -0400, Art Protin wrote: > > >>First, I believe that nothing returned from a call to >>cursor.fetchXXX() >>should be of a type bound to the API/interface. I recognize that this >>is >>a difficult constraint but the values need to be usable by some >>application >>written in Python and may even need to be written to a different DBMS. >>I see the role of the interface to make the data available in pure >>Python >>form. (So this argues against the proposal by Carsten Haese, I think.) >> >> > >I agree, and I have changed my proposal significantly to remove the >"welding" of output data to a specific interface. However, I don't agree >with the absoluteness of the assertion that "nothing returned from a >call to cursor.fetchXXX() should be of a type bound to the >API/interface." For example, Smart Blobs from an Informix database >wouldn't have any meaning in any other database, so I think it's >acceptable that they are returned as objects from the informixdb module. > >However, as long as the statement is understood to be about standard SQL >types, I have no problem with it. > > As I have said (or should have said) "I overstate my case(s)". However, I still think this is a good first order approximation. Database specific information can almost never be handled in a generic way and what can be made generic should not be tied to types in the API. (This opens the question to how to design DBMS specific features in a way that maximizes transference of training for our users. Or is that what we were already working on?) > > >>[lengthy discussion of main types and sub types...] >> >> > >In the context of my type mapping proposal, for output mapping it is the >API's responsibility to convey enough information in the type key so >that the mapping can determine which adapter to call. > How does this contrast with what I said? > For your database, >this might be a tuple of main type and subtype. The application is free >to use a mapping that dispatches to an adapter function solely based on >the main type and have the adapter branch on the subtype, or use a >mapping that uses both pieces of information for making the choice of >adapter function. > > Sorry. Does "have the adapter branch on the subtype" mean that you accept the need for the type information to be passed as a second argument to the adapter function? [Looks like you say that below.] > > >>[export and import functions...] >> >> > >Associating an import function *and* an export function with the >database type information is problematic. I don't have a problem with >what you call import functions. They seem to serve the same purpose as >the adapter functions in the outputmap in revision 2 of my proposal. >(Your naming of mapping directions is from the application's point of >view, whereas mine is from the database's point of view following the >semantics established by the naming for the setinputsizes and >setoutputsizes methods.) > > Sorry. Both the .setinputsizes() and the .setoutputsizes() methods are implemented identically on my system -- they do nothing. Viewing the API from the database is wrong in only the most subtle of ways. The purpose of the API is to unite us all in specifying common functionality for Python users. Thus, when we work on the spec, we really need to view our personal DBMS as the outside. This will help to produce an API that is Pythonic and that our user base (our Python user base) will be comfortable with and thereby will be productive with. >On the other hand however, coupling an import function with an export >function that's looked up based on the database-side type information is >not a good idea, for various reasons. In general, when binding >parameters to a query, the parameter need not be destined for a database >column. Parameters can appear in the WHERE clause of a select statement, >and it's not guaranteed that the underlying database has any way of >finding out what type of datum should be bound to such a parameter. > > Good point! >Also, even if this information were available, the process for binding >an application object to, say, a character column would depend very much >on what kind of application object it is. A string can be passed on >verbatim, a unicode object needs to be encoded, and a Geometry object >might need to be translated into OpenGIS Well-Known-Text format. If the >export function were looked up based on database type, those three cases >would all have to be handled by the same export function, which seems >utterly ridiculous to me. > > I had a problem seeing the forest for the trees. Or is that the design for the code. Anyhow, the export functions can examine a Python object to determine its type, so I figured that the mapping function would need to encode the missing information. Alas, you show that cannot be. And I find that the task of transforming data from the form the application likes to the form the DBMS likes must be split into two portions, each hard coded. The application must get the data into "base Python types" and the driver must accept all "base Python types" and do such conversions as are needed for the DBMS. The mapping of Python types to DBMS types and the writing of all needed adapter functions falls on the API driver implementor alone. (I find it painful to realize that writing the best, cleanest interface means not writing any more code, at least not for this feature.)
>It makes much more sense to map input parameters based on the type of >Python object that is provided as the parameter value, and that's the >inputmap that I'm proposing. That way you get two separate mappings for >the symmetric purposes of: >* Given a value and type from the database, map it to an application >object, and >* Given a value and type from the application, map it to a database >object. > >I don't mind supplying information about the database-side type to the >adapter functions as optional parameters if this information is >available, but determining the export function based on this is in my >opinion not feasible for the reasons I stated above. >Best regards, > > Thank you all, Art Protin From carsten at uniqsys.com Mon May 7 19:09:59 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Mon, 07 May 2007 13:09:59 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <463F2F53.6070003@research.att.com> References: <703ae56b0704200931m3a5e0048xe766eed8db8d70d3@mail.gmail.com> <435DF58A933BA74397B42CDEB8145A86224D86@ex9.hostedexchange.local> <1177119242.3305.52.camel@localhost.localdomain> <1177139387.3208.6.camel@localhost.localdomain> <1177172281.3191.79.camel@localhost.localdomain> <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> <463A323D.1020301@research.att.com> <1178425422.4680.5.camel@localhost.localdomain> <463F2F53.6070003@research.att.com> Message-ID: <1178557799.3360.59.camel@dot.uniqsys.com> On Mon, 2007-05-07 at 09:53 -0400, Art Protin wrote: > Database specific information can almost never be handled in a generic > way > and what can be made generic should not be tied to types in the API. We agree on both counts. > (This opens the question to how to design DBMS specific features in a > way > that maximizes transference of training for our users. Or is that > what we > were already working on?) Yes, I think my type mapping proposal[1] addresses this in an abstract, generic, and extensible way. An application may need to use a particular non-standard feature that some, but not all, databases have, and the application developer still wants to support all databases that have that particular feature. In my scenario, the application developer would provide (or reuse previously provided) database-specific adapter functions to map the database-specific object into a database-agnostic application-side object and back, and plug the appropriate functions into the outputmap/inputmap of the corresponding database connection. The adapters would be database specific, but the API for registering the adapters would be the same across all compliant databases. [1] For those that are new to this thread, see http://www.uniqsys.com/~carsten/typemap.html . > > > > > [lengthy discussion of main types and sub types...] > > > > > > > In the context of my type mapping proposal, for output mapping it is the > > API's responsibility to convey enough information in the type key so > > that the mapping can determine which adapter to call. > How does this contrast with what I said? It probably doesn't, I think I'm just summarizing what you said. > Sorry. Does "have the adapter branch on the subtype" mean that you > accept > the need for the type information to be passed as a second argument to > the > adapter function? [Looks like you say that below.]
Yes, that is what I am saying, but the information should be optional, preferably as a keyword argument with an agreed-upon name. API implementers should be free not to provide the information, and adapter functions that are plugged into the map are free to ignore it if it's provided and should fall back gracefully if the information is not provided. > Sorry. Both the .setinputsizes() and the .setoutputsizes() methods > are implemented > identically on my system -- they do nothing. Same here, but that doesn't change the fact that the semantics of "input" and "output" are well-established and well-defined in the context of DB-API 2. > Viewing the API from the database > is wrong in only the most subtle of ways. The pre-established semantics disagree. I think it's better to stick to those semantics than to confuse matters by adding another naming convention such as import/export or fromdb/todb or fromapp/toapp. Best regards, -- Carsten Haese http://informixdb.sourceforge.net From unixdude at gmail.com Wed May 9 14:27:04 2007 From: unixdude at gmail.com (Jim Patterson) Date: Wed, 9 May 2007 08:27:04 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: <1178557799.3360.59.camel@dot.uniqsys.com> References: <1177172281.3191.79.camel@localhost.localdomain> <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> <463A323D.1020301@research.att.com> <1178425422.4680.5.camel@localhost.localdomain> <463F2F53.6070003@research.att.com> <1178557799.3360.59.camel@dot.uniqsys.com> Message-ID: On 5/7/07, Carsten Haese wrote: > On Mon, 2007-05-07 at 09:53 -0400, Art Protin wrote: In general I like the proposal. I think it will go a long way to make it easier to write adapter specific logic to handle data conversions. However, I wonder how much it helps the issue of writing adapter neutral code. For these conversion functions to work, the adapter needs to call the function passing some data of some sort. That would seem to be database specific. I know that Oracle for example uses a custom format for numbers in its API, and I have never seen any other database use the same format. cx_Oracle hides this from the Python universe and returns floats or strings (and maybe Decimal in the near future), but it has to be told which one to use. Strings are just as much of an issue. It is database specific how the strings are encoded in its internal API, it could be ASCII, or UTF-8, or UCS-2 or ... How did you see that working? In my scenario, the application developer would provide > previously provided) database-specific adapter functions to map the > database-specific object into a database-agnostic application-side > object and back, and plug the appropriate functions into the > outputmap/inputmap of the corresponding database connection. The > adapters would be database specific, but the API for registering the > adapters would be the same across all compliant databases. > > [1] For those that are new to this thread, see > http://www.uniqsys.com/~carsten/typemap.html . If we want to address the problem for the common cases like strings, dates, and numbers then we need to extend your proposal to require the database API to provide the type mappings for the common cases. If we leave those mappings optional and don't specify the name of the supplied mappings then code will still be adapter specific. The pre-established semantics disagree.
I think it's better to stick to > those semantics than to confuse matters by adding another naming > convention such as import/export or fromdb/todb or fromapp/toapp. > I agree. Jim P. From carsten at uniqsys.com Wed May 9 15:34:56 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Wed, 09 May 2007 09:34:56 -0400 Subject: [DB-SIG] Controlling return types for DB APIs In-Reply-To: References: <1177172281.3191.79.camel@localhost.localdomain> <4CE6F84D-474F-4ED0-B83E-6E6F474F8DEA@zzzcomputing.com> <463A323D.1020301@research.att.com> <1178425422.4680.5.camel@localhost.localdomain> <463F2F53.6070003@research.att.com> <1178557799.3360.59.camel@dot.uniqsys.com> Message-ID: <1178717696.3355.56.camel@dot.uniqsys.com> On Wed, 2007-05-09 at 08:27 -0400, Jim Patterson wrote: > [excellent points about database dependence and database independence...] I agree with everything you say, and my ultimate goal, unspoken until now, is that the community will develop and agree on a set of mandatory standard output maps that the application developers can rely on to write database-independent code for the common use cases of decimal formats, character formats, and date formats. As you said, each database is different at the low level. Therefore, any adapter function will usually be implemented in a database-specific way. My proposal attempts to lay the necessary foundation for a generic framework in which to specify adapter functions. If we agree that the foundation is solid, we can then build output maps with standardized names on top of this foundation. Under the hood, each API implementation will use database-specific features, but the database-dependence will be abstracted away from the application developer by using common map names for common use cases. In my opinion, consensus should be formed in two steps. First, we'd agree that the foundation is solid and that we need mandatory maps for common use cases based on that foundation. Then, in a separate discussion, we'd negotiate what those mandatory maps should be and what they're called. This way, we'll solve the two seemingly mutually exclusive problems of allowing type mapping for common cases without restricting usability for database-specific use cases. To name a concrete example, suppose we agree that the developer should be able to choose between having decimals returned as floats, as strings, or as Decimal instances. We could mandate that the DB-API module provide output maps called, for example, DecimalAsFloat, DecimalAsString, and DecimalAsDecimal. Each of those would be a dictionary-like object that maps the database-specific type description for "decimal" to a database-specific adapter function that will accept the canonical database-specific representation of a decimal value and return the corresponding database-agnostic application object. All this database-specific stuff is neatly hidden from the application programmers. All they need to do is something like this:

    import somedb
    conn = somedb.connect(...)
    conn.outputmap = somedb.DecimalAsFloat

and they'll get all decimals returned as floats regardless of what kind of database engine they're connected to. I said the standard maps should be dictionary-like because I think we should allow the application developer to combine standard maps by adding them together, which ordinary dictionaries won't do.
That way, the developer will be able to write something like

    conn.outputmap = somedb.DecimalAsFloat + somedb.CharAsUnicode

to combine two standard maps in a rather natural way. I hope this makes sense, -- Carsten Haese http://informixdb.sourceforge.net From carl at personnelware.com Wed May 9 21:57:59 2007 From: carl at personnelware.com (Carl Karsten) Date: Wed, 09 May 2007 14:57:59 -0500 Subject: [DB-SIG] ctree Message-ID: <464227C7.1020300@personnelware.com> I need to read some data from an old system that is running on a 1999 linux box. Using strings, I find: FairCom(R) Server and c-tree Plus(R) Wondering if anyone here knew how I could read it in py. The plan is to migrate it to MySQL. Carl K From carsten at uniqsys.com Wed May 9 22:15:21 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Wed, 09 May 2007 16:15:21 -0400 Subject: [DB-SIG] ctree In-Reply-To: <464227C7.1020300@personnelware.com> References: <464227C7.1020300@personnelware.com> Message-ID: <1178741721.3355.82.camel@dot.uniqsys.com> On Wed, 2007-05-09 at 14:57 -0500, Carl Karsten wrote: > I need to read some data from an old system that is running on a 1999 linux box. > Using strings, I find: > > FairCom(R) Server and c-tree Plus(R) According to http://www.faircom.com/products/ctree/CTP_APIs.shtml there doesn't seem to be a Python API, as if that's a surprise, but there are a low-level ISAM API and a C API. With any luck, one of those APIs should already be on that server. That might be enough to at least get a data dump. Good luck, -- Carsten Haese http://informixdb.sourceforge.net From carl at personnelware.com Wed May 9 22:45:53 2007 From: carl at personnelware.com (Carl Karsten) Date: Wed, 09 May 2007 15:45:53 -0500 Subject: [DB-SIG] ctree In-Reply-To: <1178741721.3355.82.camel@dot.uniqsys.com> References: <464227C7.1020300@personnelware.com> <1178741721.3355.82.camel@dot.uniqsys.com> Message-ID: <46423301.6040406@personnelware.com> Carsten Haese wrote: > On Wed, 2007-05-09 at 14:57 -0500, Carl Karsten wrote: >> I need to read some data from an old system that is running on a 1999 linux box. >> Using strings, I find: >> >> FairCom(R) Server and c-tree Plus(R) > > According to http://www.faircom.com/products/ctree/CTP_APIs.shtml there > doesn't seem to be a Python API, as if that's a surprise, but there are > a low-level ISAM API and a C API. With any luck, one of those APIs > should already be on that server. That might be enough to at least get a > data dump. Well, I did C back in the 1900's, so my skills are a bit rusty. I did find this: http://oltp-platform.cvs.sourceforge.net/oltp-platform/OLTPP/services/PythonScript/PythonTranslate.h?view=markup http://oltp-platform.cvs.sourceforge.net/oltp-platform/OLTPP/scripts/TestZipCodes.py?view=markup

    a,b,c = ZipCode.Get()
    print "Zip code is ", a
    print "State is ", b
    print "City is ", c

Which might be what I need, but I am having trouble grasping what to grasp. I haven't figured out how it 'connects' or 'opens' or knows where the data is.
Carl K From carsten at uniqsys.com Thu May 10 01:28:30 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Wed, 09 May 2007 19:28:30 -0400 Subject: [DB-SIG] ctree In-Reply-To: <46423301.6040406@personnelware.com> References: <464227C7.1020300@personnelware.com> <1178741721.3355.82.camel@dot.uniqsys.com> <46423301.6040406@personnelware.com> Message-ID: <1178753310.3194.6.camel@localhost.localdomain> On Wed, 2007-05-09 at 15:45 -0500, Carl Karsten wrote: > Carsten Haese wrote: > > On Wed, 2007-05-09 at 14:57 -0500, Carl Karsten wrote: > >> I need to read some data from an old system that is running on a 1999 linux box. > >> Using strings, I find: > >> > >> FairCom(R) Server and c-tree Plus(R) > > > > According to http://www.faircom.com/products/ctree/CTP_APIs.shtml there > > doesn't seem to be a Python API, as if that's a surprise, but there are > > a low-level ISAM API and a C API. With any luck, one of those APIs > > should already be on that server. That might be enough to at least get a > > data dump. > > Well, I did C back in 1900's, so my skills are a bit rusty. I did find this: > If you're better in Perl, god forbid, there's http://cpan.uwinnipeg.ca/htdocs/Db-Ctree/Db/Ctree.html Alternatively, if the C API is present as a shared object (.so), you could try to use it from Python with ctypes. Then again, depending on the hairiness of the API, that may require a strong stomach and a high pain threshold. Good luck, -- Carsten Haese http://informixdb.sourceforge.net From phd at phd.pp.ru Thu May 10 16:55:31 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Thu, 10 May 2007 18:55:31 +0400 Subject: [DB-SIG] SQLObject 0.7.7 Message-ID: <20070510145531.GC18313@phd.pp.ru> Hello! I'm pleased to announce the 0.7.7 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.7.7 News and changes: http://sqlobject.org/docs/News.html What's New ========== News since 0.7.6 ---------------- Bug Fixes --------- * Fixed a bug in SQLRelatedJoin that ignored per-instance connection. * Fixed a bug in MySQL connection in case there is no charset in the DB URI. For a more complete list, please see the news: http://sqlobject.org/docs/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From phd at phd.pp.ru Thu May 10 17:03:09 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Thu, 10 May 2007 19:03:09 +0400 Subject: [DB-SIG] SQLObject 0.8.4 Message-ID: <20070510150309.GG18313@phd.pp.ru> Hello! I'm pleased to announce the 0.8.4 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. 
It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.7.7 News and changes: http://sqlobject.org/docs/News.html What's New ========== News since 0.7.6 ---------------- Bug Fixes --------- * Fixed a bug in SQLRelatedJoin that ignored per-instance connection. * Fixed a bug in MySQL connection in case there is no charset in the DB URI. For a more complete list, please see the news: http://sqlobject.org/docs/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From phd at phd.pp.ru Thu May 10 17:03:09 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Thu, 10 May 2007 19:03:09 +0400 Subject: [DB-SIG] SQLObject 0.8.4 Message-ID: <20070510150309.GG18313@phd.pp.ru> Hello! I'm pleased to announce the 0.8.4 release of SQLObject. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Development: http://sqlobject.org/devel/ Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.8.4 News and changes: http://sqlobject.org/News.html What's New ========== News since 0.8.3 ---------------- Bug Fixes --------- * Fixed a bug in SQLRelatedJoin that ignored per-instance connection. * Fixed a bug in MySQL connection in case there is no charset in the DB URI. For a more complete list, please see the news: http://sqlobject.org/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From phd at phd.pp.ru Thu May 10 17:26:14 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Thu, 10 May 2007 19:26:14 +0400 Subject: [DB-SIG] SQLObject 0.9.0 Message-ID: <20070510152614.GK18313@phd.pp.ru> Hello! I'm pleased to announce the 0.9.0 release of SQLObject, the first stable release of the 0.9 branch. What is SQLObject ================= SQLObject is an object-relational mapper. Your database tables are described as classes, and rows are instances of those classes. SQLObject is meant to be easy to use and quick to get started with. SQLObject supports a number of backends: MySQL, PostgreSQL, SQLite, and Firebird. It also has newly added support for Sybase, MSSQL and MaxDB (also known as SAPDB). Where is SQLObject ================== Site: http://sqlobject.org Development: http://sqlobject.org/devel/ Mailing list: https://lists.sourceforge.net/mailman/listinfo/sqlobject-discuss Archives: http://news.gmane.org/gmane.comp.python.sqlobject Download: http://cheeseshop.python.org/pypi/SQLObject/0.9.0 News and changes: http://sqlobject.org/News.html What's New ========== Features & Interface -------------------- * Support for Python 2.2 has been declared obsolete. * Removed actively deprecated attributes; lowered deprecation level for other attributes to be removed after 0.9. * SQLite connection got columnsFromSchema(). Now all connections fully support fromDatabase. There are two versions of columnsFromSchema() for SQLite - one parses the result of "SELECT sql FROM sqlite_master" and the other uses "PRAGMA table_info"; the user can choose one over the other by using the "use_table_info" parameter in the DB URI; default is False as the pragma is available only in the later versions of SQLite. * Changed connection.delColumn(): the first argument is sqlmeta, not tableName (required for SQLite). * SQLite connection got delColumn(). Now all connections fully support delColumn(). As the SQLite backend doesn't implement "ALTER TABLE DROP COLUMN", delColumn() is implemented by creating a new table without the column, copying all data, dropping the original table and renaming the new table. * Versioning - see http://sqlobject.org/Versioning.html * MySQLConnection got a new keyword "conv" - a list of custom converters. * Use logging if it's available and is configured via DB URI. * New columns: TimestampCol to support MySQL TIMESTAMP type; SetCol to support MySQL SET type; TinyIntCol for TINYINT; SmallIntCol for SMALLINT; MediumIntCol for MEDIUMINT; BigIntCol for BIGINT. Small Features -------------- * Support for MySQL INT type attributes: UNSIGNED, ZEROFILL. * Support for DEFAULT SQL attribute via defaultSQL keyword argument.
* cls.tableExists() as a shortcut for conn.tableExists(cls.sqlmeta.table). * cls.deleteMany(), cls.deleteBy(). Bug Fixes --------- * idName can be inherited from the parent sqlmeta class. * Fixed a longstanding bug with .select() ignoring 'limit' parameter. * Fixed a bug with absent comma in JOINs. * Fixed sqlbuilder - .startswith(), .endswith() and .contains() assumed their parameter must be a string; now you can pass an SQLExpression: Table.q.name.contains(func.upper('a')), for example. * Fixed a longstanding bug in sqlbuilder.Select() with groupBy being a sequence. * Fixed a bug with Aliases in JOINs. * Yet another patch to properly initialize MySQL connection encoding. * Fixed a minor comparison problem in test_decimal.py. * Fixed a bug in SQLRelatedJoin that ignored per-instance connection. Docs ---- * Added documentation about 'validator' Col constructor option. * Added an answer and examples to the FAQ on how to use sqlmeta.createSQL. * Added an example on how to configure logging. * More documentation about orderBy. For a more complete list, please see the news: http://sqlobject.org/News.html Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From info at egenix.com Thu May 10 17:33:14 2007 From: info at egenix.com (eGenix Team: M.-A. Lemburg) Date: Thu, 10 May 2007 17:33:14 +0200 Subject: [DB-SIG] ANN: eGenix mx Base Distribution 3.0.0 (mxDateTime, mxTextTools, etc.) Message-ID: <46433B3A.50802@egenix.com> ________________________________________________________________________ ANNOUNCING eGenix.com mx Base Extension Package Version 3.0.0 Open Source Python extensions providing important and useful services for Python programmers. This announcement is also available on our web-site for online reading: http://www.egenix.com/company/news/eGenix-mx-Base-Distribution-3.0-GA.html ________________________________________________________________________ ABOUT The eGenix.com mx Base Extensions for Python are a collection of professional quality software tools which enhance Python's usability in many important areas such as fast text searching, date/time processing and high speed data types. The tools have a proven record of being portable across many Unix and Windows platforms. You can write applications which use the tools on Windows and then run them on Unix platforms without change due to the consistent platform independent interfaces. All available packages have proven their stability and usefulness in many mission critical applications and various commercial settings all around the world. * About Python: Python is an object-oriented Open Source programming language which runs on all modern platforms (http://www.python.org/). By integrating ease-of-use, clarity in coding, enterprise application connectivity and rapid application design, Python establishes an ideal programming platform for todays IT challenges. * About eGenix: eGenix is a consulting and software product company focused on providing professional quality services and products to Python users and developers (http://www.egenix.com/). ________________________________________________________________________ NEWS The 3.0 release of the eGenix mx Base Distributions comes with a huge number of enhancements, bug fixes and additions. Some highlights: * All mx Extensions have been ported to Python 2.5. * mxDateTime has support for working with Python's datetime module types, so you can use and combine both if necessary. 
The parser was enhanced to support even more formats and make it more reliable than ever before.

* mxTextTools now fully supports Unicode, so you can parse Unicode data just as fast as you can 8-bit string data. The package also includes a tag table compiler and new jump target support to simplify working with tag tables.

* mxURL and mxUID were previously released as part of our mx Experimental distribution. They have now been integrated into the base distribution, providing easy-to-use data types for common tasks in web programming.

* We've switched from the old distutils wininst installer to the new MSI installer for the Windows Python 2.5 build. This gives you a lot more options for automatic installs, including unattended installs. See http://www.python.org/download/releases/2.5/msi/ for details.

For a more detailed description of changes, please see the respective package documentation on our web-site.

As always, we are providing pre-compiled versions of the package for Windows, Linux, Mac OS X, FreeBSD and Solaris, as well as sources which allow you to install the package on all other supported platforms.

________________________________________________________________________

DOWNLOADS

The download archives and instructions for installing the packages can be found on the eGenix mx Base Distribution page: http://www.egenix.com/products/python/mxBase/

________________________________________________________________________

UPGRADING

Please note that the 2.0 series of the eGenix mx Base Distribution does not support Python 2.5 on 64-bit platforms due to the Py_ssize_t changes in the Python C API. You are encouraged to upgrade to the new 3.0 series if you plan to deploy on 64-bit platforms and use Python 2.5 as the basis for your applications.

________________________________________________________________________

LICENSES & COSTS

The eGenix mx Base package is distributed under the eGenix.com Public License, which is a CNRI Python License style Open Source license. You can use the package in both commercial and non-commercial settings without fee or charge. The package comes with full source code.

________________________________________________________________________

SUPPORT

Commercial support for these packages is available from eGenix.com. Please see http://www.egenix.com/services/support/ for details about our support offerings.

Enjoy,
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 10 2007)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

From info at egenix.com Thu May 10 17:35:40 2007
From: info at egenix.com (eGenix Team: M.-A.
Lemburg)
Date: Thu, 10 May 2007 17:35:40 +0200
Subject: [DB-SIG] ANN: eGenix mxODBC Distribution 3.0.0 (mxODBC Database Interface)
Message-ID: <46433BCC.7020508@egenix.com>

________________________________________________________________________

ANNOUNCING

eGenix.com mxODBC Database Interface

Version 3.0.0

Our commercially supported Python extension providing ODBC database connectivity to Python applications on Windows and Unix platforms.

This announcement is also available on our web-site for online reading: http://www.egenix.com/company/news/eGenix-mxODBC-Distribution-3.0-GA.html

________________________________________________________________________

ABOUT

The mxODBC Database Interface allows users to easily connect Python applications to just about any database on the market today - on both Windows and Unix platforms - in a highly portable and convenient way. This makes mxODBC the ideal basis for writing cross-platform database programs and utilities in Python.

mxODBC is included in the eGenix.com mxODBC Distribution for Python, a commercial part of the eGenix.com mx Extension Series, a collection of professional quality software tools which enhance Python's usability in many important areas such as ODBC database connectivity, fast text processing, date/time processing and web site programming.

The package has proven its stability and usefulness in many mission critical applications and various commercial settings all around the world.

* About Python: Python is an object-oriented Open Source programming language which runs on all modern platforms (http://www.python.org/). By integrating ease-of-use, clarity in coding, enterprise application connectivity and rapid application design, Python establishes an ideal programming platform for today's IT challenges.

* About eGenix: eGenix is a consulting and software product company focused on providing professional quality services and products to Python users and developers (http://www.egenix.com/).

________________________________________________________________________

NEWS

mxODBC 3.0 has received a large number of enhancements and supports more ODBC drivers than ever. Some highlights:

* mxODBC has been ported to Python 2.5.

* We've worked a lot on the Unicode support and made it more robust, especially on Unix platforms where the ODBC Unicode support has stabilized over the last few years. You can now issue commands using Unicode and exchange Unicode data with the database in various configurable ways.

* We've also added methods to give you more control over the connections and cursors, as well as the .callproc() method for calling stored procedures, which mxODBC 2.0 was missing.

* Multiple result sets via the .nextset() method are also supported, so working with stored procedures should be a lot easier now.

* Another highlight is the added support for Python's datetime module types and the option to use strings for date/time processing (e.g. to be able to use timezones in timestamps if that's supported by the database).

* Python's decimal module is now supported as well, and it's possible to configure mxODBC to return Decimal types for numeric values.

* mxODBC 3.0 received full 64-bit support, so you can run mxODBC (and all other mx Extensions) on e.g. AMD64 platforms.

* We've switched from the old distutils wininst installer to the new MSI installer for the Windows Python 2.5 build. This gives you a lot more options for automatic installs, including unattended installs. See http://www.python.org/download/releases/2.5/msi/ for details.
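To illustrate the two stored procedure additions above, here is a minimal sketch of how .callproc() and .nextset() are used through the standard DB-API 2.0 interface; the procedure name and parameters are placeholders, and the sketch assumes a driver (such as mxODBC 3.0) that implements the optional .nextset() extension:

    def fetch_all_result_sets(conn, procname, params):
        # Call a stored procedure and collect every result set it returns.
        cursor = conn.cursor()
        # .callproc() returns a copy of the input sequence, with any
        # IN/OUT parameters replaced by their output values.
        out_params = cursor.callproc(procname, params)
        result_sets = [cursor.fetchall()]
        # .nextset() returns a true value while more result sets remain.
        while cursor.nextset():
            result_sets.append(cursor.fetchall())
        cursor.close()
        return out_params, result_sets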
Note that in order to avoid confusion, we've decided to rename the eGenix.com mx Commercial Distribution to eGenix.com mxODBC Distribution with this release. The commercial distribution has always contained only the mxODBC package, so this was an obvious step to clarify things for our users.

As always, we are providing pre-compiled versions of the package for Windows, Linux, Mac OS X, FreeBSD and Solaris, as well as sources which allow you to install the package on all other supported platforms.

________________________________________________________________________

DOWNLOADS

The download archives and instructions for installing the package can be found at: http://www.egenix.com/products/python/mxODBC/

IMPORTANT: In order to use the eGenix mx Commercial package you will first need to install the eGenix mx Base package, which can be downloaded from here: http://www.egenix.com/products/python/mxBase/

________________________________________________________________________

UPGRADING

Please note that mxODBC 2.0 does not support Python 2.5 on 64-bit platforms due to the Py_ssize_t changes in the Python C API. You are encouraged to upgrade to the new mxODBC 3.0 release if you plan to deploy on 64-bit platforms and use Python 2.5 as the basis for your applications.

________________________________________________________________________

LICENSES & COSTS

This release brings you all the new features and enhancements in mxODBC that were previously only available through our mxODBC Zope Database Adapter. Like the Zope product, mxODBC now requires that you install a license in order to use it.

You can request 30-day evaluation licenses by writing to sales at egenix.com, stating your name (or the name of the company) and the number of eval licenses that you need. We will then issue you licenses and send them to you by email. Please make sure that you can receive ZIP file attachments at the email address you specify in the request, since the license files are sent out as ZIP attachments.

_______________________________________________________________________

SUPPORT

Commercial support for these packages is available from eGenix.com. Please see http://www.egenix.com/services/support/ for details about our support offerings.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 10 2007)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

From carsten at uniqsys.com Thu May 10 17:48:22 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Thu, 10 May 2007 11:48:22 -0400
Subject: [DB-SIG] ANN: eGenix mxODBC Distribution 3.0.0 (mxODBC Database Interface)
In-Reply-To: <46433BCC.7020508@egenix.com>
References: <46433BCC.7020508@egenix.com>
Message-ID: <1178812102.3367.70.camel@dot.uniqsys.com>

On Thu, 2007-05-10 at 17:35 +0200, eGenix Team: M.-A. Lemburg wrote:
> [release announcement...]

Ah, that explains why you were conspicuously absent from the type mapping discussion.
I hope you'll have some time to chime in now ;)

Cheers,
--
Carsten Haese
http://informixdb.sourceforge.net

From mal at egenix.com Thu May 10 17:52:07 2007
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 10 May 2007 17:52:07 +0200
Subject: [DB-SIG] ANN: eGenix mxODBC Distribution 3.0.0 (mxODBC Database Interface)
In-Reply-To: <1178812102.3367.70.camel@dot.uniqsys.com>
References: <46433BCC.7020508@egenix.com> <1178812102.3367.70.camel@dot.uniqsys.com>
Message-ID: <46433FA7.4070602@egenix.com>

On 2007-05-10 17:48, Carsten Haese wrote:
> On Thu, 2007-05-10 at 17:35 +0200, eGenix Team: M.-A. Lemburg wrote:
>> [release announcement...]
>
> Ah, that explains why you were conspicuously absent from the type
> mapping discussion. I hope you'll have some time to chime in now ;)

Exactly. I'll read up on it over the weekend :-)

Cheers,
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 10 2007)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

From dieter at handshake.de Thu May 10 20:27:15 2007
From: dieter at handshake.de (Dieter Maurer)
Date: Thu, 10 May 2007 20:27:15 +0200
Subject: [DB-SIG] ctree
In-Reply-To: <46423301.6040406@personnelware.com>
References: <464227C7.1020300@personnelware.com> <1178741721.3355.82.camel@dot.uniqsys.com> <46423301.6040406@personnelware.com>
Message-ID: <17987.25603.256616.988335@gargle.gargle.HOWL>

Carl Karsten wrote at 2007-5-9 15:45 -0500:
> ...
>Well, I did C back in 1900's, so my skills are a bit rusty.

I did find this: "Pyrex" may allow you to create Python bindings to a C API with minimal C knowledge.

--
Dieter

From aprotin at research.att.com Tue May 15 18:06:50 2007
From: aprotin at research.att.com (Art Protin)
Date: Tue, 15 May 2007 12:06:50 -0400
Subject: [DB-SIG] Other extensions
Message-ID: <4649DA9A.9060700@research.att.com>

Dear folks,
I have lots more questions about ways that the API could and possibly should be enriched.

A. Stored Procedures:
1. The API provides a method for calling a stored procedure. Has there been any discussion about how a user/application might discover the names of such stored procedures?
2. Has there been any discussion of how a user/application might create a stored procedure?
* My implementation has made some attempt to address this. All of our queries are "named" and "stored" but they are either stored with the session (connection) or with the user account (as provided in connect()).
Everything stored with the session vanishes when the connection closes and everything stored with user account is visible by all connections using that account. Thus I made visible objects of the class Server (via an attribute of connection objects), keep all the account info there and provided some methods on server objects to create persistent named queries and to control access to them by other accounts. I have no method to destroy a persistent query yet. B. Metadata Not all DBMSs provide SQL access to the system tables. In fact, the DBMS I work with most is one that doesn't. 1. Has there been a discussion yet about how a user/application might do discovery of the table names? 2. and the column names with a table? 3. and the types of the columns? * My implementation has done naught to address this limitation. C. Non-SQL Queries 1. Has there been any discussion of how a user/application should present queries that are in some other query language? 2. Has there been any discussion of the representation of query language names? * My implementation had to address this because our DBMS has its own preferred query language and management requires that I provide access to it (which I accept as perfectly reasonable). To avoid confusion that might arise when trying to recognize the difference between it and SQL, I simply added extension methods like Cursor.exec_alt(prog, parm, keys) where prog is just the (non-SQL) program in a string, parm is just parameters for the query (just like for .execute()) and keys is a list of keys to use when parm is a dictionary (to linearize the parameters for handing off to the DBMS). But this does not address how a third party application might discover that an alternative language is available nor how it would know how to pass such a query from a sophisticated user to this alternative method. I doubt this is a complete list, but my mind has gotten empty while writing this so I will send it as is. Thank you all, Art Protin From mal at egenix.com Tue May 15 18:38:24 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 15 May 2007 18:38:24 +0200 Subject: [DB-SIG] Other extensions In-Reply-To: <4649DA9A.9060700@research.att.com> References: <4649DA9A.9060700@research.att.com> Message-ID: <4649E200.600@egenix.com> On 2007-05-15 18:06, Art Protin wrote: > Dear folks, > I have lots more questions about ways that the API could and possibly > should be enriched. There have been some discussions about this, but since no standard API could be found, no additions to the DB-API were made. ODBC has a very complete set of catalog functions for querying meta-data of a database. It works by creating result sets that you can then fetch using the standard DB-API tools, so it's fairly straight forward to use. Internally, most ODBC drivers map these function calls to standard SQL SELECTs which work against the internal system tables or call special stored procedures in the database to create the result sets. I suppose the same could be done at a Python interface level (which could be the DB-API level or a level above the DB-API). > A. Stored Procedures: > 1. The API provides a method for calling a stored procedure. Has there > been any discussion about how a user/application might discover the > names of such store procedures? > 2. Has there been any discussion of how a user/application might create > a stored procedure? This can normally be done using standard .execute() calls. > * My implementation has made some attempt to address this. 
All of our > queries are "named" and "stored" but they are either stored with the > session (connection) or with the user account (as provided in > connect()). > Everything stored with the session vanishes when the connection closes > and everything stored with user account is visible by all connections > using that account. Thus I made visible objects of the class Server > (via > an attribute of connection objects), keep all the account info there and > provided some methods on server objects to create persistent named > queries and to control access to them by other accounts. I have no > method to destroy a persistent query yet. Like everything that deals with stored procedures, this is highly database specific. > B. Metadata > Not all DBMSs provide SQL access to the system tables. In fact, the > DBMS I work with most is one that doesn't. > 1. Has there been a discussion yet about how a user/application might do > discovery of the table names? > 2. and the column names with a table? > 3. and the types of the columns? > * My implementation has done naught to address this limitation. See our mxODBC interface for how this can be done via catalog methods: http://www.egenix.com/products/python/mxODBC/ > C. Non-SQL Queries > 1. Has there been any discussion of how a user/application should present > queries that are in some other query language? No. The DB-API is about relational databases and SQL as query language. The interfaces may also be suitable for other query languages, but that's out of scope for the DB-API. > 2. Has there been any discussion of the representation of query language > names? > * My implementation had to address this because our DBMS has its own > preferred query language and management requires that I provide access > to it (which I accept as perfectly reasonable). To avoid confusion > that might > arise when trying to recognize the difference between it and SQL, I > simply > added extension methods like Cursor.exec_alt(prog, parm, keys) where > prog is just the (non-SQL) program in a string, parm is just > parameters for > the query (just like for .execute()) and keys is a list of keys to > use when > parm is a dictionary (to linearize the parameters for handing off to > the DBMS). > But this does not address how a third party application might > discover that > an alternative language is available nor how it would know how to pass > such a query from a sophisticated user to this alternative method. Unless the .execute() method signature doesn't provide the necessary detail, I'd generally do this by passing an additional (keyword) parameter to .execute(). I don't think that the DB-API should require a mechanism for querying the query language as this is normally always SQL (in some dialect). > I doubt this is a complete list, but my mind has gotten empty while > writing > this so I will send it as is. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 15 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

From mike_mp at zzzcomputing.com Tue May 15 19:05:53 2007
From: mike_mp at zzzcomputing.com (Michael Bayer)
Date: Tue, 15 May 2007 13:05:53 -0400
Subject: [DB-SIG] Other extensions
In-Reply-To: <4649E200.600@egenix.com>
References: <4649DA9A.9060700@research.att.com> <4649E200.600@egenix.com>
Message-ID:

On May 15, 2007, at 12:38 PM, M.-A. Lemburg wrote:
> On 2007-05-15 18:06, Art Protin wrote:
>> Dear folks,
>> I have lots more questions about ways that the API could and possibly
>> should be enriched.
>
> There have been some discussions about this, but since no standard
> API could be found, no additions to the DB-API were made.
>
> ODBC has a very complete set of catalog functions for querying
> meta-data of a database. It works by creating result sets that you
> can then fetch using the standard DB-API tools, so it's fairly
> straight forward to use.
>
> Internally, most ODBC drivers map these function calls to standard
> SQL SELECTs which work against the internal system tables or call
> special stored procedures in the database to create the result sets.
>
> I suppose the same could be done at a Python interface level (which
> could be the DB-API level or a level above the DB-API).

The "standard" for database metadata is the information_schema tables/views; they are part of ANSI SQL:2003. Currently, there is support for information_schema in postgres, mysql, SQL Server 7, and possibly Oracle. At least for the PG/mysql implementations, they are not compatible with each other, and in the case of MySQL it does not even provide complete information as compared to its built-in commands. Also, information_schema is implemented as views within PG and has some performance issues. I'd also say that the structure of information_schema is way too complicated for typical usage and is not intuitive at all...then again, an API/schema that is intentionally simplified might not provide full flexibility.

>> A. Stored Procedures:
>> 1. The API provides a method for calling a stored procedure. Has there
>> been any discussion about how a user/application might discover the
>> names of such stored procedures?
>> 2. Has there been any discussion of how a user/application might create
>> a stored procedure?
>
> This can normally be done using standard .execute() calls.

.callproc is better since it accounts for in/out parameters.

>> C. Non-SQL Queries
>> 1. Has there been any discussion of how a user/application should present
>> queries that are in some other query language?
>
> No. The DB-API is about relational databases and SQL as query
> language. The interfaces may also be suitable for other query
> languages, but that's out of scope for the DB-API.

Agreed.

From aprotin at research.att.com Tue May 15 19:58:51 2007
From: aprotin at research.att.com (Art Protin)
Date: Tue, 15 May 2007 13:58:51 -0400
Subject: [DB-SIG] Other extensions
In-Reply-To: <4649E200.600@egenix.com>
References: <4649DA9A.9060700@research.att.com> <4649E200.600@egenix.com>
Message-ID: <4649F4DB.2040107@research.att.com>

Dear Marc-Andre, et al,
M.-A. Lemburg wrote:
>On 2007-05-15 18:06, Art Protin wrote:
>>Dear folks,
>> I have lots more questions about ways that the API could and possibly
>>should be enriched.
>
>There have been some discussions about this, but since no standard
>API could be found, no additions to the DB-API were made.
>
>ODBC has a very complete set of catalog functions for querying
>meta-data of a database.
>It works by creating result sets that you
>can then fetch using the standard DB-API tools, so it's fairly
>straight forward to use.
>
>Internally, most ODBC drivers map these function calls to standard
>SQL SELECTs which work against the internal system tables or call
>special stored procedures in the database to create the result sets.
>
>I suppose the same could be done at a Python interface level (which
>could be the DB-API level or a level above the DB-API).
>
No, this can not be done in a standard way above the DB-API level. There is nothing in the DB-API specification that can be used in a DBMS independent manner that would be assured of producing the answers. One would need to presume that all DBMSs have SQL access to the system tables, and ours, for one, does not.

I value a simple and clean interface. I feel that nothing should be added to the API that can be built from the tools the API provides. I do not see how the existing API provides enough functionality to get this data in a generic way.

>>A. Stored Procedures:
>>1. The API provides a method for calling a stored procedure. Has there
>> been any discussion about how a user/application might discover the
>> names of such stored procedures?
>>2. Has there been any discussion of how a user/application might create
>> a stored procedure?
>
>This can normally be done using standard .execute() calls.
>
Please explain how this is done, as I believe that there is no way provided in the API to do it in a standard or DBMS independent manner.

>>* My implementation has made some attempt to address this. All of our
>> queries are "named" and "stored" but they are either stored with the
>> session (connection) or with the user account (as provided in connect()).
>> Everything stored with the session vanishes when the connection closes
>> and everything stored with user account is visible by all connections
>> using that account. Thus I made visible objects of the class Server (via
>> an attribute of connection objects), keep all the account info there and
>> provided some methods on server objects to create persistent named
>> queries and to control access to them by other accounts. I have no
>> method to destroy a persistent query yet.
>
>Like everything that deals with stored procedures, this is highly
>database specific.
>
Yes, but is there any commonality in what the different interfaces could offer that could be the basis for defining general approaches?

>>B. Metadata
>> Not all DBMSs provide SQL access to the system tables. In fact, the
>> DBMS I work with most is one that doesn't.
>>1. Has there been a discussion yet about how a user/application might do
>> discovery of the table names?
>>2. and the column names with a table?
>>3. and the types of the columns?
>>* My implementation has done naught to address this limitation.
>
>See our mxODBC interface for how this can be done via catalog
>methods:
>
> http://www.egenix.com/products/python/mxODBC/
>
>>C. Non-SQL Queries
>>1. Has there been any discussion of how a user/application should present
>> queries that are in some other query language?
>
>No. The DB-API is about relational databases and SQL as query
>language. The interfaces may also be suitable for other query
>languages, but that's out of scope for the DB-API.
>
It seems strange to me to contradict the editor of the specification about what the specification says.
However, I do not find anything in the first hundred lines that mentions either "relational" or "SQL"; rather, it talks about "Database Interfacing" and queries.

I have no problem at all with placing the utmost priority on making sure that the API works with SQL queries on relational DBMSs, but I have no respect for efforts to make it work only with relational DBMSs or only with SQL.

If my DBMS is the only one that has an alternative to SQL, then it makes no sense to try to "standardize" alternative languages, and I accept that. I do not accept that the API must somehow limit itself to SQL even when there is a common need for more.

When no one else comes forward with a similar need, I will assume that there is no one else who has such a need or interest, and that is reason enough to drop this debate.

>>2. Has there been any discussion of the representation of query language
>> names?
>>* My implementation had to address this because our DBMS has its own
>> preferred query language and management requires that I provide access
>> to it (which I accept as perfectly reasonable). To avoid confusion that might
>> arise when trying to recognize the difference between it and SQL, I simply
>> added extension methods like Cursor.exec_alt(prog, parm, keys) where
>> prog is just the (non-SQL) program in a string, parm is just parameters for
>> the query (just like for .execute()) and keys is a list of keys to use when
>> parm is a dictionary (to linearize the parameters for handing off to the DBMS).
>> But this does not address how a third party application might discover that
>> an alternative language is available nor how it would know how to pass
>> such a query from a sophisticated user to this alternative method.
>
>Unless the .execute() method signature doesn't provide the
>necessary detail, I'd generally do this by passing an additional
>(keyword) parameter to .execute().
>
I could easily adopt such an approach. What about the next revision having some mention of such additional parameters to .execute()? Would that be something like lang='SQL' being the default that a user could override?

>I don't think that the DB-API should require a mechanism for
>querying the query language as this is normally always SQL (in
>some dialect).
>
Only if no one else needs something besides SQL.

Moreover, I did not expect such a mechanism would be required. Rather, I had expected it to be the recognized optional mechanism. If an interface does not support anything but SQL, it would not need the mechanism. A user/application could look for the method or attribute and know the default (that only SQL is supported) by its absence.

The issue is that it is much better for implementers to use the same common extensions when they will suffice, and not do yet another unique solution. We will, however, add whatever extensions are needed to make the features of our unique DBMS available to our users. So it is all about balance.

>> I doubt this is a complete list, but my mind has gotten empty while writing
>>this so I will send it as is.

Thank you all,
Art Protin

From mal at egenix.com Tue May 15 20:36:27 2007
From: mal at egenix.com (M.-A.
Lemburg) Date: Tue, 15 May 2007 20:36:27 +0200 Subject: [DB-SIG] Other extensions In-Reply-To: <4649F4DB.2040107@research.att.com> References: <4649DA9A.9060700@research.att.com> <4649E200.600@egenix.com> <4649F4DB.2040107@research.att.com> Message-ID: <4649FDAB.30104@egenix.com> On 2007-05-15 19:58, Art Protin wrote: > Dear Marc-Andre, et al, > M.-A. Lemburg wrote: > >> On 2007-05-15 18:06, Art Protin wrote: >> >> >>> Dear folks, >>> I have lots more questions about ways that the API could and possibly >>> should be enriched. >>> >> >> There have been some discussions about this, but since no standard >> API could be found, no additions to the DB-API were made. >> >> ODBC has a very complete set of catalog functions for querying >> meta-data of a database. It works by creating result sets that you >> can then fetch using the standard DB-API tools, so it's fairly >> straight forward to use. >> >> Internally, most ODBC drivers map these function calls to standard >> SQL SELECTs which work against the internal system tables or call >> special stored procedures in the database to create the result sets. >> >> I suppose the same could be done at a Python interface level (which >> could be the DB-API level or a level above the DB-API). >> >> >> > No, this can not be done in a standard way above the DB-API level. > There is nothing > in the DB-API specification that can be used in a DBMS independent > manner, that > would be assured of producing the answers. One would need to presume > that all > DBMSs have SQL access to the systems tables and ours for one does not. > > I value a simple and clean interface. I feel that nothing should be > added to the API > can be built from the tools that API provides. I do not see how the > existing API provides > enough functionality to get this data in a generic way. This depends on the database backend. Most of them provide system tables with the needed information in one form or another. Others don't and require using special interfaces (at C level). It's difficult to standardize. ODBC has gotten this pretty well sorted out, IMHO, but emulating it in the DB API would be quite difficult for the module authors. >>> A. Stored Procedures: >>> 1. The API provides a method for calling a stored procedure. Has >>> there >>> been any discussion about how a user/application might discover the >>> names of such store procedures? >>> 2. Has there been any discussion of how a user/application might >>> create >>> a stored procedure? >>> >> >> This can normally be done using standard .execute() calls. >> >> > Please explain how this is done, as I believe that there is no way > provided in the > API to do it in a standard or DBMS independent manner. There is no DBMS independent way of dealing with stored procedures. However, most DBMS that support stored procedures (and that I know of :-)) allow creation of these using cursor.execute(). Some require using database specific tools and sometimes the whole process of adding stored procedures lies completely outside the scope of a database interface - instead you write Java, C, Python or some other language code and plug this directly into the database engine. >>> * My implementation has made some attempt to address this. All of our >>> queries are "named" and "stored" but they are either stored with the >>> session (connection) or with the user account (as provided in >>> connect()). 
>>> Everything stored with the session vanishes when the connection >>> closes >>> and everything stored with user account is visible by all connections >>> using that account. Thus I made visible objects of the class >>> Server (via >>> an attribute of connection objects), keep all the account info >>> there and >>> provided some methods on server objects to create persistent named >>> queries and to control access to them by other accounts. I have no >>> method to destroy a persistent query yet. >>> >> >> Like everything that deals with stored procedures, this is highly >> database specific. >> >> >> > Yes, but is there any commonality in what the different interfaces could > offer > that could be the basis for defining general approaches? Apart from what .callproc() offers ? I don't think so. >>> B. Metadata >>> Not all DBMSs provide SQL access to the system tables. In fact, the >>> DBMS I work with most is one that doesn't. >>> 1. Has there been a discussion yet about how a user/application might do >>> discovery of the table names? >>> 2. and the column names with a table? >>> 3. and the types of the columns? >>> * My implementation has done naught to address this limitation. >>> >> >> See our mxODBC interface for how this can be done via catalog >> methods: >> >> http://www.egenix.com/products/python/mxODBC/ >> >> >> >>> C. Non-SQL Queries >>> 1. Has there been any discussion of how a user/application should >>> present >>> queries that are in some other query language? >>> >> >> No. The DB-API is about relational databases and SQL as query >> language. The interfaces may also be suitable for other query >> languages, but that's out of scope for the DB-API. >> >> >> > It seems strange to me to contradict the editor of the specification > about what the > specification says. However, I do not find anything in the first > hundred lines that > mention either "relational" or "SQL", rather it talks about "Database > Interfacing" > and queries. > I have no problem at all with placing the utmost priority in making sure > that the > API works with SQL queries on relational DBMSs, but I have no respect for > efforts to make it only work with relational DBMSs or only work with SQL. > > If my DBMS is the only one that has an alternative to SQL, then it makes no > sense to try to "standardize" alternative languages, and I accept that. > I do not > accept that the API must somehow limit itself to SQL even when there is a > common need for more. > > When no one else comes forward with a similar need, I will assume that > there is > no one else who has such a need or interest and that is reason enough to > drop > this debate. Well, the DB-SIG list is called "Python Tabular Databases SIG" and so far we've all been talking about SQL as query language which by itself already is a rather broad scope due to the many different dialects. Like I said: if the interface is also usable for other query languages that's fine, but the spec itself is being designed against SQL. We already have a problem with the parameter markers. I wouldn't want to open yet another can of worms ;-) >>> 2. Has there been any discussion of the representation of query language >>> names? >>> * My implementation had to address this because our DBMS has its own >>> preferred query language and management requires that I provide >>> access >>> to it (which I accept as perfectly reasonable). 
To avoid >>> confusion that might >>> arise when trying to recognize the difference between it and SQL, >>> I simply >>> added extension methods like Cursor.exec_alt(prog, parm, keys) where >>> prog is just the (non-SQL) program in a string, parm is just >>> parameters for >>> the query (just like for .execute()) and keys is a list of keys to >>> use when >>> parm is a dictionary (to linearize the parameters for handing off >>> to the DBMS). >>> But this does not address how a third party application might >>> discover that >>> an alternative language is available nor how it would know how to >>> pass >>> such a query from a sophisticated user to this alternative method. >>> >> >> Unless the .execute() method signature doesn't provide the >> necessary detail, I'd generally do this by passing an additional >> (keyword) parameter to .execute(). >> >> >> > I could easily adopt such an approach. What about the next revision > having some > mention of such addition parameters to .execute()? Would that be > something like > lang='SQL' > being the default that a user could override? All such extensions would be module specific. This is needed to recognize places in your code that need adjusting in case you want to port to a different module. >> I don't think that the DB-API should require a mechanism for >> querying the query language as this is normally always SQL (in >> some dialect). >> >> > Only if no one else needs something besides SQL. > > Moreover, I did not expect such a mechanism would be required. Rather, > I had > expected it to be the recognized optional mechanism. If an interface > does not > support anything but SQL, it would not need the mechanism. A > user/application > could look for the method or attribute and know the default (that only > SQL is > supported) by the absense. > > The issue is it is much much better that implementers use the same common > extensions when they will suffice and not do yet another unique solution. > We will however add what ever extensions are needed to make the features > of our unique DBMS available to our users. SO it is all about balance. As always :-) I know of a couple of databases that e.g. allow use of different SQL dialects. They can, for example, emulate Oracle or SQL Server SQL syntax and semantics. In order for this to work, you have to pass in connection parameters. The SQL dialect is not changeable after connect. mxODBC supports this, but only through virtue of being able to pass connection strings via a special DriverConnect() connection API. The same approach can be used to configure other parameters of a connection, e.g. read-only connections, code pages, optimizations, etc. >>> I doubt this is a complete list, but my mind has gotten empty >>> while writing >>> this so I will send it as is. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 15 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From fumanchu at amor.org Tue May 15 21:00:16 2007 From: fumanchu at amor.org (Robert Brewer) Date: Tue, 15 May 2007 12:00:16 -0700 Subject: [DB-SIG] Other extensions In-Reply-To: <4649FDAB.30104@egenix.com> Message-ID: <435DF58A933BA74397B42CDEB8145A860BFCF8AE@ex9.hostedexchange.local> M.-A. Lemburg wrote: > On 2007-05-15 19:58, Art Protin wrote: > > No, this can not be done in a standard way above the DB-API > > level. There is nothing in the DB-API specification that can > > be used in a DBMS independent manner, that would be assured > > of producing the answers. One would need to presume that all > > DBMSs have SQL access to the systems tables and ours for > > one does not. > > This depends on the database backend. Most of them provide > system tables with the needed information in one form or > another. > > Others don't and require using special interfaces (at C level). > > It's difficult to standardize. ODBC has gotten this pretty well > sorted out, IMHO, but emulating it in the DB API would be > quite difficult for the module authors. It's not so much that making a standard interface is hard (SQLAlchemy, for example, and my Dejavu/Geniusql do this quite well). It's that such interfaces are then called "Object-Relational Mappers" [1] and are pretty universally shunned as an arena for standardization. Robert Brewer System Architect Amor Ministries fumanchu at amor.org [1] ... of the "Metadata Mapping" variety. See http://www.martinfowler.com/eaaCatalog/metadataMapping.html From mike_mp at zzzcomputing.com Tue May 15 21:29:41 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Tue, 15 May 2007 15:29:41 -0400 Subject: [DB-SIG] Other extensions In-Reply-To: <435DF58A933BA74397B42CDEB8145A860BFCF8AE@ex9.hostedexchange.local> References: <435DF58A933BA74397B42CDEB8145A860BFCF8AE@ex9.hostedexchange.local> Message-ID: On May 15, 2007, at 3:00 PM, Robert Brewer wrote: > > It's not so much that making a standard interface is hard (SQLAlchemy, > for example, and my Dejavu/Geniusql do this quite well). It's that > such > interfaces are then called "Object-Relational Mappers" [1] and are > pretty universally shunned as an arena for standardization. I would hate for the SQL construction facilities of SQLAlchemy, Dejavu, SQLObject, or anything else like that to become "standardized" (if thats what you meant). to me thats the same as picking the one true web framework. both are too high-level to be distilled into a single methodology. There are some various ORM "standards" out there and they are all equally useless/ludicrous. standardization locks out all alternative approaches immediately, it also locks down the selected approach from further development without approval from a committee. the additional levels of bureaucracy inherent in any "standardization" stretches the productivity of the typical OSS working model (read: non-paid volunteers doing this in their free time, as opposed to prominent industry-supported standards bodies like W3, ANSI, etc.) super-thin and should only be used as absolutely necessary (which I believe does include very rudimental "agreements" such as WSGI and DBAPI). DBAPI needs to remain as the most minimal layer of standardization possible (and i think it should remain about SQL. 
to support other query languages would invariably require much richer APIs)...it just would be nice to iron out the API variances in implementations a little better...particularly things like dates, floats/Decimal, more accurate method specifications (like explictly requiring the named argument "size" when the spec says "fetchmany(size=x)"), expected return results of execute()/executemany(), unicode. From fumanchu at amor.org Tue May 15 21:30:11 2007 From: fumanchu at amor.org (Robert Brewer) Date: Tue, 15 May 2007 12:30:11 -0700 Subject: [DB-SIG] Other extensions In-Reply-To: Message-ID: <435DF58A933BA74397B42CDEB8145A860BFCF931@ex9.hostedexchange.local> Michael Bayer wrote: > On May 15, 2007, at 3:00 PM, Robert Brewer wrote: > > It's not so much that making a standard interface is hard > (SQLAlchemy, > > for example, and my Dejavu/Geniusql do this quite well). It's that > > such > > interfaces are then called "Object-Relational Mappers" [1] and are > > pretty universally shunned as an arena for standardization. > > I would hate for the SQL construction facilities of SQLAlchemy, > Dejavu, SQLObject, or anything else like that to become > "standardized" (if thats what you meant). to me thats the same as > picking the one true web framework. both are too high-level to be > distilled into a single methodology. There are some various ORM > "standards" out there and they are all equally useless/ludicrous. > > standardization locks out all alternative approaches immediately, it > also locks down the selected approach from further development > without approval from a committee. the additional levels of > bureaucracy inherent in any "standardization" stretches the > productivity of the typical OSS working model (read: non-paid > volunteers doing this in their free time, as opposed to prominent > industry-supported standards bodies like W3, ANSI, etc.) super-thin > and should only be used as absolutely necessary (which I > believe does > include very rudimental "agreements" such as WSGI and DBAPI). > > DBAPI needs to remain as the most minimal layer of standardization > possible (and i think it should remain about SQL. to support other > query languages would invariably require much richer APIs)...it just > would be nice to iron out the API variances in implementations a > little better...particularly things like dates, floats/Decimal, more > accurate method specifications (like explictly requiring the named > argument "size" when the spec says "fetchmany(size=x)"), expected > return results of execute()/executemany(), unicode. Agreed 100%. Robert Brewer System Architect Amor Ministries fumanchu at amor.org From mal at egenix.com Tue May 15 21:39:49 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 15 May 2007 21:39:49 +0200 Subject: [DB-SIG] Other extensions In-Reply-To: <435DF58A933BA74397B42CDEB8145A860BFCF8AE@ex9.hostedexchange.local> References: <435DF58A933BA74397B42CDEB8145A860BFCF8AE@ex9.hostedexchange.local> Message-ID: <464A0C85.6050808@egenix.com> On 2007-05-15 21:00, Robert Brewer wrote: > M.-A. Lemburg wrote: >> On 2007-05-15 19:58, Art Protin wrote: >>> No, this can not be done in a standard way above the DB-API >>> level. There is nothing in the DB-API specification that can >>> be used in a DBMS independent manner, that would be assured >>> of producing the answers. One would need to presume that all >>> DBMSs have SQL access to the systems tables and ours for >>> one does not. >> This depends on the database backend. 
Most of them provide >> system tables with the needed information in one form or >> another. >> >> Others don't and require using special interfaces (at C level). >> >> It's difficult to standardize. ODBC has gotten this pretty well >> sorted out, IMHO, but emulating it in the DB API would be >> quite difficult for the module authors. > > It's not so much that making a standard interface is hard (SQLAlchemy, > for example, and my Dejavu/Geniusql do this quite well). Well, it's finding the right attributes / selectors for the meta data that's hard. After all, the meta data should be as detailed as possible, but not to a point where the majority of backends wouldn't be able to support the interface or where writing the interface for the meta data would require too much work (for little value). > It's that such > interfaces are then called "Object-Relational Mappers" [1] and are > pretty universally shunned as an arena for standardization. I think we'd just need to standardize this at the DB-API level if there are a significant number of database backends that don't allow querying this meta data via cursor.execute(). Looking at your provider implementations: http://projects.amor.org/geniusql/browser/trunk/geniusql/providers it appears as if most databases do support this kind of introspection. While for some you are using ADO (which essentially uses the same meta data catalog API as ODBC), I think those can also be covered using direct SELECT queries into the system tables. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 15 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From aprotin at research.att.com Tue May 15 21:53:39 2007 From: aprotin at research.att.com (Art Protin) Date: Tue, 15 May 2007 15:53:39 -0400 Subject: [DB-SIG] Other extensions In-Reply-To: <464A0C85.6050808@egenix.com> References: <435DF58A933BA74397B42CDEB8145A860BFCF8AE@ex9.hostedexchange.local> <464A0C85.6050808@egenix.com> Message-ID: <464A0FC3.6030603@research.att.com> Dear folks, M.-A. Lemburg wrote: >On 2007-05-15 21:00, Robert Brewer wrote: > > >>M.-A. Lemburg wrote: >> >> >>>On 2007-05-15 19:58, Art Protin wrote: >>> >>> >>>>No, this can not be done in a standard way above the DB-API >>>>level. There is nothing in the DB-API specification that can >>>>be used in a DBMS independent manner, that would be assured >>>>of producing the answers. One would need to presume that all >>>>DBMSs have SQL access to the systems tables and ours for >>>>one does not. >>>> >>>> >>>This depends on the database backend. Most of them provide >>>system tables with the needed information in one form or >>>another. >>> >>>Others don't and require using special interfaces (at C level). >>> >>>It's difficult to standardize. ODBC has gotten this pretty well >>>sorted out, IMHO, but emulating it in the DB API would be >>>quite difficult for the module authors. >>> >>> >>It's not so much that making a standard interface is hard (SQLAlchemy, >>for example, and my Dejavu/Geniusql do this quite well). 
>>
>>
>
>Well, it's finding the right attributes / selectors for the meta
>data that's hard.
>
>After all, the meta data should be as detailed
>as possible, but not to a point where the majority of backends
>wouldn't be able to support the interface or where writing the
>interface for the meta data would require too much work (for
>little value).
>
>>It's that such
>>interfaces are then called "Object-Relational Mappers" [1] and are
>>pretty universally shunned as an arena for standardization.
>
>I think we'd just need to standardize this at the DB-API level
>if there are a significant number of database backends that
>don't allow querying this meta data via cursor.execute().
>
>Looking at your provider implementations:
>
>http://projects.amor.org/geniusql/browser/trunk/geniusql/providers
>
>it appears as if most databases do support this kind of
>introspection. While for some you are using ADO (which essentially
>uses the same meta data catalog API as ODBC), I think those
>can also be covered using direct SELECT queries into the system
>tables.
>
OK. Given that there is no standard way to do this thing, I'll just code up whatever I think my users will like.

Thank you,
Art Protin

From mike_mp at zzzcomputing.com Tue May 15 21:57:25 2007
From: mike_mp at zzzcomputing.com (Michael Bayer)
Date: Tue, 15 May 2007 15:57:25 -0400
Subject: [DB-SIG] Other extensions
In-Reply-To:
References: <435DF58A933BA74397B42CDEB8145A860BFCF8AE@ex9.hostedexchange.local>
Message-ID: <7C1B5950-163A-4BCF-92EF-CC9356CFE6A0@zzzcomputing.com>

On May 15, 2007, at 3:29 PM, Michael Bayer wrote:
> DBAPI needs to remain as the most minimal layer of standardization
> possible (and i think it should remain about SQL. to support other
> query languages would invariably require much richer APIs)...it just
> would be nice to iron out the API variances in implementations a
> little better...particularly things like dates, floats/Decimal, more
> accurate method specifications (like explictly requiring the named
> argument "size" when the spec says "fetchmany(size=x)"), expected
> return results of execute()/executemany(), unicode.
>
cx_Oracle doesnt provide Binary for example (even though it has plenty of binary support?!) - test that the return result of a BLOB/CLOB/binary column is a python buffer (cx_Oracle returns the surprising LOB object, MySQLDB returns a non-buffer object). of course this would be better suited if words like "preferred" were replaced with "expected" in the PEP. - test that cursor.description works immediately (psycopg2 has special requirements in this regard when using server-side cursors) - test that an OperationalError is raised immediately upon execute (), cursor(), etc. when the database has been disconnected (theyre all over the map on this one). From farcepest at gmail.com Fri May 18 18:19:29 2007 From: farcepest at gmail.com (Andy Dustman) Date: Fri, 18 May 2007 12:19:29 -0400 Subject: [DB-SIG] Other extensions In-Reply-To: References: <4649DA9A.9060700@research.att.com> <4649E200.600@egenix.com> Message-ID: <9826f3800705180919r7d67b670uf311a20ccd50e7c6@mail.gmail.com> On 5/15/07, Michael Bayer wrote: > the "standard" for database metadata are the information_schema > tables/views, they are part of ANSI 2003. Currently, there is > support for information_schema in postgres, mysql, SQL Server 7, and > possibly Oracle. At least for the PG/mysql implementations, they are > not compatible with each other and in the case of MySQL does not even > provide complete information as compared its built-in commands. also > information_schema is implemented as views within PG and have some > performance issues. MySQL-5.0 has information_schema, too. http://dev.mysql.com/doc/refman/5.0/en/information-schema.html """The implementation for the INFORMATION_SCHEMA table structures in MySQL follows the ANSI/ISO SQL:2003 standard Part 11 Schemata. Our intent is approximate compliance with SQL:2003 core feature F021 Basic information schema.""" -- Patriotism means to stand by the country. It does not mean to stand by the president. -- T. Roosevelt This message has been scanned for memes and dangerous content by MindScanner, and is believed to be unclean. From mike_mp at zzzcomputing.com Fri May 18 18:28:48 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Fri, 18 May 2007 12:28:48 -0400 Subject: [DB-SIG] Other extensions In-Reply-To: <9826f3800705180919r7d67b670uf311a20ccd50e7c6@mail.gmail.com> References: <4649DA9A.9060700@research.att.com> <4649E200.600@egenix.com> <9826f3800705180919r7d67b670uf311a20ccd50e7c6@mail.gmail.com> Message-ID: On May 18, 2007, at 12:19 PM, Andy Dustman wrote: > On 5/15/07, Michael Bayer wrote: > >> the "standard" for database metadata are the information_schema >> tables/views, they are part of ANSI 2003. Currently, there is >> support for information_schema in postgres, mysql, SQL Server 7, and >> possibly Oracle. At least for the PG/mysql implementations, they are >> not compatible with each other and in the case of MySQL does not even >> provide complete information as compared its built-in commands. also >> information_schema is implemented as views within PG and have some >> performance issues. > > MySQL-5.0 has information_schema, too. > > http://dev.mysql.com/doc/refman/5.0/en/information-schema.html > > """The implementation for the INFORMATION_SCHEMA table structures in > MySQL follows the ANSI/ISO SQL:2003 standard Part 11 Schemata. Our > intent is approximate compliance with SQL:2003 core feature F021 Basic > information schema.""" the key phrase being "approximate compliance". computers love approximation. 
From carsten at uniqsys.com Sat May 19 05:33:25 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Fri, 18 May 2007 23:33:25 -0400 Subject: [DB-SIG] Controlling return types, again Message-ID: <1179545605.3279.19.camel@localhost.localdomain> Hiya everybody: The important discussion on controlling return types has gone cold again, so I'd like to revive it. Revision 2 of my type mapping proposal was met with deafening silence except for valuable input by Jim Patterson, which I have incorporated into Revision 3. The result is available for your perusal once again at http://www.uniqsys.com/~carsten/typemap.html . I don't know whether the general silence indicates tacit agreement or if people are too busy to respond or even just to read my proposal in the first place. I'd appreciate some feedback to see how close we are to reaching consensus, even if it's just a "show of hands" in the form of +1/0/-1 responses. Thanks in advance, -- Carsten Haese http://informixdb.sourceforge.net From mal at egenix.com Sat May 19 12:13:59 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 19 May 2007 12:13:59 +0200 Subject: [DB-SIG] Controlling return types, again In-Reply-To: <1179545605.3279.19.camel@localhost.localdomain> References: <1179545605.3279.19.camel@localhost.localdomain> Message-ID: <464ECDE7.1010402@egenix.com> Hello Carsten, > The important discussion on controlling return types has gone cold > again, so I'd like to revive it. Revision 2 of my type mapping proposal > was met with deafening silence except for valuable input by Jim > Patterson, which I have incorporated into Revision 3. The result is > available for your perusal once again at > http://www.uniqsys.com/~carsten/typemap.html . > > I don't know whether the general silence indicates tacit agreement or if > people are too busy to respond or even just to read my proposal in the > first place. I'd appreciate some feedback to see how close we are to > reaching consensus, even if it's just a "show of hands" in the form of > +1/0/-1 responses. I did read your proposal and was about to reply, but then got "side" tracked again by other things. In general, I think that we shouldn't expose/impose such a low-level interface for type mapping. These details should be left to the module author and the user shouldn't have to bother with them. Instead, we should try to find an API with simple methods like e.g. .setinputconverter(), .setoutputconverter() to let the user define new mappings. The module can then use an implementation like the one you described internally. What I like about the proposal is that it doesn't try to overgeneralize things (a very common habit in Python design). A few comments on the bullets in your proposal and a sketch of a slightly modified solution: * Connection and cursor objects will have an attribute called outputmap that maps objects from the database to the application and an attribute called inputmap that maps objects from the application to the database. Note that both mappings are between Python objects. The Python objects on the database side are mapped from and to actual database values by the canonical mapping. See above. I think we should not expose these low-level mapping tables to the user. Note that the database will typically have different ways of defining the "column type code". Since we already expose this type code in the cursor.description field, we should probably use that as a basis. In any case, the type codes will be database specific.
It won't be possible to generically say: map integers to Python floats, since "integers" may refer to a whole set of database types for some backends or only to one type for others. * The default mappings are None for efficiency, which means that only the canonical mapping is in effect. The same can be achieved by an empty dictionary, but it's faster to check for None than to check for an empty dictionary. That's an implementation detail and should not be exposed. We could add .getinputconverter() and .getoutputconverter() to query the currently active mappings in some way. * When a cursor is created, it inherits shallow copies of the connection's mappings. This allows changing the type mapping per cursor without affecting the connection-wide default. +1 * When a value is fetched from the database, if the value is not None, its column type (as it would be indicated in cursor.description) is looked up in outputmap, and the resulting callable object is called upon to convert the fetched value, as illustrated by this pseudo-code:

    converter = cursor.outputmap.get(dbtype, None)
    if converter is not None:
        fetched_value = converter(fetched_value)

There's a problem here: since fetching the database value from the database will usually involve some more or less complicated C binding code, you can't just pass the fetched_value to a converter since this would mean that you already have a Python object representing the value. Normally, a database interface will have a set of different techniques by which a value is fetched from the database, e.g. fetch a number value as integer, long, string, float, decimal. To make full use of converters, we'll have to be able to tell the database module: fetch this number type as e.g. decimal using the internal fetch mechanisms (phase 1) and then call this converter on the resulting value (phase 2). Hope I'm clear enough on this. If not, please let me know. * The mappings need not be actual mappings. They may be any object that implements __getitem__ and copy. This allows specifying "batch" mappings that map many different types with the same callable object in a convenient fashion. That again is an implementation detail. We should factor such a type collection feature into the above methods. My favorite would be to not set the converters per type and then have a mapping, but to instead just have one function for implementing phase 1 which then returns a converter function to the database module to implement phase 2, e.g.

    def mydecimalformat(value):
        return '%5.2f' % value

    def outputconverter(cursor, position):
        dbtype = cursor.description[position][1]
        if dbtype == SQL.DECIMAL:
            # Fetch decimals as floats and call mydecimalformat on these
            return (SQL.DECIMAL, mydecimalformat)

mxODBC has a converter function for defining phase 1 output conversions and it works nicely. It doesn't have a phase 2 implementation. Instead, it provides several attributes for tuning the internal fetch mechanisms. * For convenience, the module.connect() and connection.cursor() methods should accept outputmap and inputmap keyword arguments that allow the application to specify non-default mappings at connection/cursor creation time. Not sure about this: the type mapping setup should be explicitly done after a proper connect. It may have to rely on the connection already being established. * In discussions on the db-sig mailing list, some concern was raised that naming the directions of conversion as input and output is ambiguous because input could mean into the database or into the application.
However, PEP 249 already uses input and output in the naming of setinputsizes and setoutputsizes, and this proposal follows the same semantics. Right, let's use names similar to those. "input" is always the direction from Python to the database (database gets input) and "output" from the database to Python (Python receives output). * When input binding is performed and the cursor's inputmap is not None, a converter function is looked up in the inputmap according to the following pseudo-code:

    for tp in type(in_param).__mro__:
        converter = cursor.inputmap.get(tp, None)
        if converter is not None: break
    if converter is not None:
        in_param = converter(in_param)

This will cause a serious performance hit since you have to do this for every single value fetched from the database. Just think of a result set with 20 columns and 10000 rows. You'd have to run through the above for-loop 200000 times, even though it's really only needed once per column (since the types won't change within the result set). The above two-phase approach avoids this. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 19 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From carsten at uniqsys.com Sat May 19 19:55:41 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sat, 19 May 2007 13:55:41 -0400 Subject: [DB-SIG] Controlling return types, again In-Reply-To: <1179545605.3279.19.camel@localhost.localdomain> References: <1179545605.3279.19.camel@localhost.localdomain> Message-ID: <1179597341.3162.63.camel@localhost.localdomain> quote=""" In general, I think that we shouldn't expose/impose such a low-level interface for type mapping. These details should be left to the module author and the user shouldn't have to bother with them. """ For the common cases, the user won't have to bother with the low-level details. The module author will provide standard maps for the common use cases, and they're free to provide a library of nonstandard maps for "uncommon" use cases specific to their particular database, too. In my opinion, making the low-level details available is the only thing that *guarantees* that the application developer can use this mapping facility for *any* use case they can think of. If we try to hide the low-level details, we might take away a crucial feature the application developer needs to get their job done. quote=""" Instead, we should try to find an API with simple methods like e.g. .setinputconverter(), .setoutputconverter() to let the user define new mappings. The module can then use an implementation like the one you described internally. """ And how do you propose those simple methods are actually invoked so that they cover all common use cases in a database-independent way without making them unusable for database-specific features? quote=""" What I like about the proposal is that it doesn't try to overgeneralize things (a very common habit in Python design). """ Interesting observation considering that I have tried to find the most general solution to the problem at hand.
That would mean that my proposal is exactly as general as it needs to be ;) quote=""" A few comments on the bullets in your proposal and a sketch of a slightly modified solution: * Connection and cursor objects will have an attribute called outputmap that maps objects from the database to the application and an attribute called inputmap that maps objects from the application to the database. Note that both mappings are between Python objects. The Python objects on the database side are mapped from and to actual database values by the canonical mapping. See above. I think we should not expose these low-level mapping tables to the user. Note that the database will typically have different ways of defining the "column type code". Since we already expose this type code in the cursor.description field, we should probably use that as a basis. In any case, the type codes will be database specific. It won't be possible to generically say: map integers to Python floats, since "integers" may refer to a whole set of database types for some backends or only to one type for others. """ Right. Hence my proposal to use dictionary-like objects to perform the adapter function lookup. quote=""" * The default mappings are None for efficiency, which means that only the canonical mapping is in effect. The same can be achieved by an empty dictionary, but it's faster to check for None than to check for an empty dictionary. That's an implementation detail and should not be exposed. """ Well, it's an implementation *hint*. quote=""" We could add .getinputconverter() and .getoutputconverter() to query the currently active mappings in some way. """ True, but unfortunately, "in some way" makes this suggestion uselessly vague. quote=""" * When a value is fetched from the database, if the value is not None, its column type (as it would be indicated in cursor.description) is looked up in outputmap, and the resulting callable object is called upon to convert the fetched value, as illustrated by this pseudo-code:

    converter = cursor.outputmap.get(dbtype, None)
    if converter is not None:
        fetched_value = converter(fetched_value)

There's a problem here: since fetching the database value from the database will usually involve some more or less complicated C binding code, you can't just pass the fetched_value to a converter since this would mean that you already have a Python object representing the value. """ Right, and in this step, I do. This step happens after the database-dependent canonical mapping, which is informally defined as "Whatever the respective API module currently does." quote=""" Normally, a database interface will have a set of different techniques by which a value is fetched from the database, e.g. fetch a number value as integer, long, string, float, decimal. To make full use of converters, we'll have to be able to tell the database module: fetch this number type as e.g. decimal using the internal fetch mechanisms (phase 1) and then call this converter on the resulting value (phase 2). Hope I'm clear enough on this. If not, please let me know. """ Yes, you're perfectly clear, and my proposal already addresses this. What you're calling phase 1 is what I call the canonical mapping, and I am completely open to allowing database-dependent mechanisms for "guiding" or "tweaking" the behavior of this phase 1 mapping. I am even suggesting a way involving custom attributes on the adapter function. quote = """ * The mappings need not be actual mappings. They may be any object that implements __getitem__ and copy.
This allows specifying "batch" mappings that map many different types with the same callable object in a convenient fashion. That again is an implementation detail. We should factor such a type collection feature into the above methods. """ I won't stop you from trying. Please feel free to suggest a concrete mechanism. quote=""" My favorite would be to not set the converters per type and then have a mapping, but to instead just have one function for implementing phase 1 which then returns a converter function to the database module to implement phase 2, e.g.

    def mydecimalformat(value):
        return '%5.2f' % value

    def outputconverter(cursor, position):
        dbtype = cursor.description[position][1]
        if dbtype == SQL.DECIMAL:
            # Fetch decimals as floats and call mydecimalformat on these
            return (SQL.DECIMAL, mydecimalformat)

""" Maybe I misunderstand, but doesn't this force every application developer to reinvent the wheel? How would they influence the mappings of two different types such as DECIMAL and CHAR except by writing one output converter for each possible combination they need? Also, I don't see how this helps in getting to a set of database-independent solutions for common use cases. quote=""" * For convenience, the module.connect() and connection.cursor() methods should accept outputmap and inputmap keyword arguments that allow the application to specify non-default mappings at connection/cursor creation time. Not sure about this: the type mapping setup should be explicitly done after a proper connect. It may have to rely on the connection already being established. """ That's a good point. I agree. quote=""" * When input binding is performed and the cursor's inputmap is not None, a converter function is looked up in the inputmap according to the following pseudo-code:

    for tp in type(in_param).__mro__:
        converter = cursor.inputmap.get(tp, None)
        if converter is not None: break
    if converter is not None:
        in_param = converter(in_param)

This will cause a serious performance hit since you have to do this for every single value fetched from the database. Just think of a result set with 20 columns and 10000 rows. You'd have to run through the above for-loop 200000 times, even though it's really only needed once per column (since the types won't change within the result set). The above two-phase approach avoids this. """ I think you misunderstand. The code you're quoting is for input binding, not for output binding. True, it would have to be done for every value passed as a parameter, but most Python objects that a database is likely to see will have a rather short MRO, and the pseudocode is just a suggestion. The cursor could memoize the results of the lookup in case the same query gets executed again with input parameters of the same types. (And of course, memoization could also be done in the lookup for output adapters.) Best regards, -- Carsten Haese http://informixdb.sourceforge.net From carsten at uniqsys.com Sat May 19 20:04:03 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Sat, 19 May 2007 14:04:03 -0400 Subject: [DB-SIG] Controlling return types, again In-Reply-To: <1179597341.3162.63.camel@localhost.localdomain> References: <1179545605.3279.19.camel@localhost.localdomain> <1179597341.3162.63.camel@localhost.localdomain> Message-ID: <1179597843.3162.69.camel@localhost.localdomain> And in case you're wondering who I'm replying to, I'm replying to Marc-Andre's responses that for some reason didn't get delivered to my inbox, so I quoted Marc-Andre from the pipermail archive.
I thought I added an attribution on top, but I guess my mail client has a mind of its own today. -- Carsten Haese http://informixdb.sourceforge.net From mal at egenix.com Sat May 19 20:23:13 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 19 May 2007 20:23:13 +0200 Subject: [DB-SIG] Controlling return types, again In-Reply-To: <1179597843.3162.69.camel@localhost.localdomain> References: <1179545605.3279.19.camel@localhost.localdomain> <1179597341.3162.63.camel@localhost.localdomain> <1179597843.3162.69.camel@localhost.localdomain> Message-ID: <464F4091.7000003@egenix.com> [Sorry for spamming the list, but I don't think that this email will get through to Carsten directly either.] On 2007-05-19 20:04, Carsten Haese wrote: > And in case you're wondering who I'm replying to, I'm replying to > Marc-Andre's responses that for some reason didn't get delivered to my > inbox, so I quoted Marc-Andre from the pipermail archive. I thought I > added an attribution on top, but I guess my mail client has a mind of > its own today. Note that email to you generates bounces on a regular basis. You might want to check this. This is the mail system at host mail.egenix.com.

#################################################################### # THIS IS A WARNING ONLY. YOU DO NOT NEED TO RESEND YOUR MESSAGE. # ####################################################################

Your message could not be delivered for more than 4 hour(s). It will be retried until it is 5 day(s) old. For further assistance, please send mail to If you do so, please include this problem report. You can delete your own text from the attached returned message. The mail system : connect to mxrelay.uniqsys.net[64.246.99.196]: Connection refused

Reporting-MTA: dns; mail.egenix.com
X-Postfix-Queue-ID: D387D808D10
X-Postfix-Sender: rfc822; mal at egenix.com
Arrival-Date: Sat, 19 May 2007 12:14:02 +0200 (CEST)
Final-Recipient: rfc822; carsten at uniqsys.com
Original-Recipient: rfc822;carsten at uniqsys.com
Action: delayed
Status: 4.4.1
Diagnostic-Code: X-Postfix; connect to mxrelay.uniqsys.net[64.246.99.196]: Connection refused
Will-Retry-Until: Thu, 24 May 2007 12:14:02 +0200 (CEST)

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 19 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From carsten.haese at gmail.com Sat May 19 23:47:58 2007 From: carsten.haese at gmail.com (Carsten Haese) Date: Sat, 19 May 2007 17:47:58 -0400 Subject: [DB-SIG] Test message, please ignore Message-ID: <5fbcfcff0705191447h332f4de7p9cb0d0a2014f52ab@mail.gmail.com> Testing new subscription address. -Carsten
From carsten.haese at gmail.com Sat May 19 23:36:40 2007 From: carsten.haese at gmail.com (Carsten Haese) Date: Sat, 19 May 2007 17:36:40 -0400 Subject: [DB-SIG] Test message, please ignore Message-ID: <1179610600.3173.2.camel@localhost.localdomain> Testing new subscription address since my other email account is blind to mal's posts... -Carsten From mal at egenix.com Mon May 21 19:10:44 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 21 May 2007 19:10:44 +0200 Subject: [DB-SIG] Controlling return types, again In-Reply-To: <1179597341.3162.63.camel@localhost.localdomain> References: <1179545605.3279.19.camel@localhost.localdomain> <1179597341.3162.63.camel@localhost.localdomain> Message-ID: <4651D294.5070305@egenix.com> Hmm, a bit hard to read, your reply. I've added some more quote chars to make it easier... On 2007-05-19 19:55, Carsten Haese wrote: > >> quote=""" >> In general, I think that we shouldn't expose/impose such a >> low-level interface for type mapping. These details should be >> left to the module author and the user shouldn't have to bother >> with them. >> """ > > For the common cases, the user won't have to bother with the low-level > details. The module author will provide standard maps for the common use > cases, and they're free to provide a library of nonstandard maps for > "uncommon" use cases specific to their particular database, too. > > In my opinion, making the low-level details available is the only thing > that *guarantees* that the application developer can use this mapping > facility for *any* use case they can think of. If we try to hide the > low-level details, we might take away a crucial feature the application > developer needs to get their job done. The DB-API cannot per se address low-level details. These are way too specific to interface requirements imposed by the database backend. That's the reason why we have type code objects that allow doing many-to-one equal comparisons. Unfortunately, these don't work well with dictionaries since an object can only have one hash value. >> quote=""" >> Instead, we should try to find an API with simple methods like >> e.g. .setinputconverter(), .setoutputconverter() to let the user >> define new mappings. >> >> The module can then use an implementation like the one you >> described internally. >> """ > > And how do you propose those simple methods are actually invoked so that > they cover all common use cases in a database-independent way without > making them unusable for database-specific features? See below: you pass in a single function which takes care of this. >> quote=""" >> What I like about the proposal is that it doesn't try to >> overgeneralize things (a very common habit in Python design). >> """ > > Interesting observation considering that I have tried to find the most > general solution to the problem at hand. That would mean that my > proposal is exactly as general as it needs to be ;) Well, let's put it this way: you could have started defining a new class structure, using abstract classes, a type registry, special object methods, various introspection APIs, etc.
Luckily, you avoided all that :-) >> quote=""" >> A few comments on the bullets in your proposal and a sketch >> of a slightly modified solution: >> >> * Connection and cursor objects will have an attribute called outputmap that maps objects from the database to the >> application and an attribute called inputmap that maps objects from the application to the database. Note that both >> mappings are between Python objects. The Python objects on the database side are mapped from and to actual database >> values by the canonical mapping. >> >> See above. I think we should not expose these low-level mapping >> tables to the user. >> >> Note that the database will typically have different ways of >> defining the "column type code". Since we already expose this >> type code in the cursor.description field, we should probably >> use that as a basis. >> >> In any case, the type codes will be database specific. It won't >> be possible to generically say: map integers to Python floats, >> since "integers" may refer to a whole set of database types for >> some backends or only to one type for others. >> """ > > Right. Hence my proposal to use dictionary-like objects to perform the > adapter function lookup. > >> quote=""" >> * The default mappings are None for efficiency, which means that only the canonical mapping is in effect. The same >> can be achieved by an empty dictionary, but it's faster to check for None than to check for an empty dictionary. >> >> That's an implementation detail and should not be exposed. >> """ > > Well, it's an implementation *hint*. > >> quote=""" >> We could add .getinputconverter() and .getoutputconverter() >> to query the currently active mappings in some way. >> """ > > True, but unfortunately, "in some way" makes this suggestion uselessly > vague. It was only a sketch. I think I'll write up a formal definition of the idea and post it here. >> quote=""" >> * When a value is fetched from the database, if the value is not None, its column type (as it would be indicated in >> cursor.description) is looked up in outputmap, and the resulting callable object is called upon to convert the fetched >> value, as illustrated by this pseudo-code:
>>
>> converter = cursor.outputmap.get(dbtype, None)
>> if converter is not None:
>>     fetched_value = converter(fetched_value)
>>
>> There's a problem here: since fetching the database value from >> the database will usually involve some more or less complicated >> C binding code, you can't just pass the fetched_value to a >> converter since this would mean that you already have a Python >> object representing the value. >> """ > > Right, and in this step, I do. This step happens after the > database-dependent canonical mapping, which is informally defined as > "Whatever the respective API module currently does." Ah, but that's not necessarily what you need to convert the value into a different type or format. E.g. say you have a decimal column and the canonical method of retrieving the value is by using floats. Now say you want to return these as decimals. Floats don't give you enough information to properly do this. Another example: say your database provides a way of fetching BLOBs in chunks. The database module will likely retrieve the data in chunks, but still return the string in one piece as the canonical representation. What you'd really want is an iterator with which you could retrieve the data in chunks as well.
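To make the chunked-BLOB point above concrete: what a driver could hand back instead of one big string is a generator. The following is a minimal sketch; the _read_chunk() call is purely hypothetical and stands in for whatever low-level LOB-reading routine a particular driver actually has.

    def blob_chunks(cursor, column, chunk_size=65536):
        """Yield a BLOB column of the current row piece by piece."""
        offset = 0
        while True:
            # Hypothetical low-level call; a real driver would use its
            # own C-level chunked-read mechanism here.
            chunk = cursor._read_chunk(column, offset, chunk_size)
            if not chunk:
                break
            yield chunk
            offset += len(chunk)

    # An application could then stream a large value to a file without
    # ever holding it in memory in one piece:
    #
    #     for chunk in blob_chunks(cur, column=0):
    #         outfile.write(chunk)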
>> quote=""" >> Normally, a database interface will have a set of different >> techniques by which a value is fetched from the database, e.g. >> fetch a number value as integer, long, string, float, decimal. >> >> To make full use of converters, we'll have to be able to tell >> the database module: fetch this number type as e.g. decimal >> using the internal fetch mechanisms (phase 1) and then call >> this converter on the resulting value (phase 2). >> >> Hope I'm clear enough on this. If not, please let me know. >> """ > > Yes, you're perfectly clear, and my proposal already addresses this. > What you're calling phase 1 is what I call the canonical mapping, and I > am completely open to allowing database-dependent mechanisms for > "guiding" or "tweaking" the behavior of this phase 1 mapping. I am even > suggesting a way involving custom attributes on the adapter function. > >> quote = """ >> * The mappings need not be actual mappings. They may be any object that implements __getitem__ and copy. This allows >> specifying "batch" mappings that map many different types with the same callable object in a convenient fashion. >> >> That again is an implementation detail. We should factor such >> a type collection feature into the above methods. >> """ > > I won't stop you from trying. Please feel free to suggest a concrete > mechanism. Will do :-) >> quote=""" >> My favorite would be to not set the converters per type and then >> have a mapping, but to instead just have one function for >> implementing phase 1 which then returns a converter function >> to the database module to implement phase 2, e.g. >> >> def mydecimalformat(value): >> return '%5.2f' % value >> >> def outputconverter(cursor, position): >> dbtype = cursor.description[position][1] >> if dbtype == SQL.DECIMAL: >> # Fetch decimals as floats and call mydecimalformat on these >> return (SQL.DECIMAL, mydecimalformat) >> """ > > Maybe I misunderstand, but doesn't this force every application > developer to reinvent the wheel? How would they influence the mappings > of two different types such as DECIMAL and CHAR except by writing one > output converter for each possible combination they need? If they need to modify the mappings for DECIMAL and CHAR, then they'd put those two in the converter: def mycharconverter(value): return unicode(value, 'utf-8') def outputconverter(cursor, position): dbtype = cursor.description[position][1] if dbtype == SQL.DECIMAL: # Fetch decimals as floats and call mydecimalformat on these return (SQL.DECIMAL, mydecimalformat) elif dbtype == SQL.CHAR: # Fetch chars as Unicode objects return (SQL.CHAR, mycharconverter) The advantage is that you can also do more complicated mappings, e.g. 
by position of the column in a query:

    def outputconverter(cursor, position):
        dbtype = cursor.description[position][1]
        if dbtype == SQL.DECIMAL and position == 2:
            # Fetch decimals as floats and call mydecimalformat on these,
            # but only on column 3 in the result set
            return (SQL.DECIMAL, mydecimalformat)

or use a dictionary mapping:

    mytypemap = {
        SQL.DECIMAL: (SQL.DECIMAL, mydecimalformat),
    }

    def outputconverter(cursor, position):
        dbtype = cursor.description[position][1]
        entry = mytypemap.get(dbtype, None)
        if entry is None:
            # Use the default mapping
            return None
        return entry

It's also possible to chain converters:

    existing_converter = cursor.getoutputconverter()

    def outputconverter(cursor, position):
        dbtype = cursor.description[position][1]
        entry = mytypemap.get(dbtype, None)
        if entry is None:
            # Revert to existing_converter
            return existing_converter(cursor, position)
        return entry

> Also, I don't see how this helps in getting to a set of > database-independent solutions for common use cases. Depends on the common use cases. Do you have some? >> quote=""" >> * For convenience, the module.connect() and connection.cursor() methods should accept outputmap and inputmap keyword >> arguments that allow the application to specify non-default mappings at connection/cursor creation time. >> >> Not sure about this: the type mapping setup should be explicitly done >> after a proper connect. It may have to rely on the connection already >> being established. >> """ > > That's a good point. I agree. > >> quote=""" >> * When input binding is performed and the cursor's inputmap is not None, a converter function is looked up in the >> inputmap according to the following pseudo-code:
>>
>> for tp in type(in_param).__mro__:
>>     converter = cursor.inputmap.get(tp, None)
>>     if converter is not None: break
>> if converter is not None:
>>     in_param = converter(in_param)
>>
>> This will cause a serious performance hit since you have to do >> this for every single value fetched from the database. Just think >> of a result set with 20 columns and 10000 rows. >> >> You'd have to run through the above for-loop 200000 times, even >> though it's really only needed once per column (since the types >> won't change within the result set). >> >> The above two-phase approach avoids this. >> """ > > I think you misunderstand. The code you're quoting is for input binding, > not for output binding. Sorry, I meant the output binding. To some extent the above also applies to input binding, but the situation is different there since the database module cannot assume that all objects in a parameter list passed to .executemany() are of the same type. > True, it would have to be done for every value > passed as a parameter, but most Python objects that a database is likely > to see will have a rather short MRO, and the pseudocode is just a > suggestion. The cursor could memoize the results of the lookup in case > the same query gets executed again with input parameters of the same > types. (And of course, memoization could also be done in the lookup for > output adapters.) True. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 21 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From carsten at UNIQSYS.COM Mon May 21 20:05:26 2007 From: carsten at UNIQSYS.COM (Carsten Haese) Date: Mon, 21 May 2007 14:05:26 -0400 Subject: [DB-SIG] Controlling return types, again In-Reply-To: <4651D294.5070305@egenix.com> References: <1179545605.3279.19.camel@localhost.localdomain> <1179597341.3162.63.camel@localhost.localdomain> <4651D294.5070305@egenix.com> Message-ID: <1179770726.3387.38.camel@dot.uniqsys.com> Thanks for your input. I won't bother quoting all that, because it seems to me that we actually have a lot of common ground. You seem to be proposing that the user should write a converter dispatcher function that returns a "binding hint" and an adapter function. In my proposal, the dispatcher is a dictionary-like object, and it could even be a dictionary in cases where the type description can be used as a key, and the binding hint would ride along on the adapter function itself as an attribute. However, I suppose the dispatcher could just as easily return a tuple of binding hint and adapter function as in your outline. The advantage of my proposal is that it's easier to combine "canned" converters as long as those canned converters implement __add__, which I would require of the canned converters for common use cases. I listed a few common use cases in my proposal, including but not limited to whether to fetch chars as byte strings or unicode objects, whether to fetch decimals as floats or decimal objects, etc. In my opinion it is vital that the application developer be able to choose such standard mappings with code that's a) minimal and b) database independent. With the code you're proposing, I don't see how the application developer would combine canned standard converters without writing a big boilerplate if/elif/else dispatcher that contains lots of database-dependent type descriptors. [P.S. My email problem has been corrected, thanks for bringing it to my attention. Apparently your mail exchanger is on the same IP range as a known spammer, and my sysadmin blocked the entire range without realizing that he was throwing out the baby with the bathwater.] Best regards, -- Carsten Haese http://informixdb.sourceforge.net From carl at personnelware.com Mon May 21 23:46:46 2007 From: carl at personnelware.com (Carl Karsten) Date: Mon, 21 May 2007 16:46:46 -0500 Subject: [DB-SIG] Controlling return types, again In-Reply-To: <1179545605.3279.19.camel@localhost.localdomain> References: <1179545605.3279.19.camel@localhost.localdomain> Message-ID: <46521346.70208@personnelware.com> I have been following this on and off. I don't have much to add to the specific problem, but I do have some general thoughts that may help. I. It seems to be a similar (not same, and even similar may be a stretch) problem to the parameter formatting problem 'solved' by .paramstyle, which is one of the worst solutions I have seen to any problem :) to me, the parameter problem should be solved "in" the db-api layer, not above it (assuming application is above database.) Again, I agree that the type problem is not the same as parameter, so it may need a different solution that will have similarities to what I dislike about .paramstyle. II.
(going out on a limb here...) A user defined data type is used (therefore defined) by the application, so it makes perfect sense that the app layer would have implementation details. If the person defining the datatype wants to use it across applications, then they will code an appropriate class. I can't see how a user defined anything can be handled generically. Carl K From carsten at uniqsys.com Tue May 22 02:01:23 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Mon, 21 May 2007 20:01:23 -0400 Subject: [DB-SIG] Controlling return types, again In-Reply-To: <46521346.70208@personnelware.com> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> Message-ID: <1179792083.3254.12.camel@localhost.localdomain> On Mon, 2007-05-21 at 16:46 -0500, Carl Karsten wrote: > I have been following this on and off. I don't have much to add to the specific > problem, but I do have some general thoughts that may help. > > I. > It seems to be a similar (not same, and even similar may be a stretch) problem > to the parameter formatting problem 'solved' by .paramstyle, which is one of the > worst solutions I have seen to any problem :) to me, the parameter problem > should be solved "in" the db-api layer, not above it (assuming application is > above database.) Again, I agree that the type problem is not the same as > parameter, so it may need a different solution that will have similarities to > what I dislike about .paramstyle. I'm not sure exactly what you mean here. paramstyle fulfills its function of giving the API implementor enough freedom to get the job done and letting the application developer know which option the implementor chose. Now, IMHO format and pyformat paramstyles are an abomination that should disappear from future versions of DB-API, and qmark should be the mandatory minimum, but I digress. My outputmap/inputmap proposal is built with flexibility in mind. Maybe that's what's bothering you, I don't know. > II. (going out on a limb here...) > A user defined data type is used (therefore defined) by the application, so it > makes perfect sense that the app layer would have implementation details. If > the person defining the datatype wants to use it across applications, then they > will code an appropriate class. I can't see how a user defined anything can be > handled generically. Of course. Informix Dynamic Server allows user-defined types, which is precisely why my outputmap/inputmap mechanism is flexible enough to handle a lot of different use case scenarios. But it also provides a foundation for handling common use cases with standard typemaps that have database-independent names and agreed-upon semantics. Best regards, -- Carsten Haese http://informixdb.sourceforge.net From carl at personnelware.com Tue May 22 06:15:27 2007 From: carl at personnelware.com (Carl Karsten) Date: Mon, 21 May 2007 23:15:27 -0500 Subject: [DB-SIG] Controlling return types, again In-Reply-To: <1179792083.3254.12.camel@localhost.localdomain> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> Message-ID: <46526E5F.60104@personnelware.com> Carsten Haese wrote: > On Mon, 2007-05-21 at 16:46 -0500, Carl Karsten wrote: >> I have been following this on and off. I don't have much to add to the specific >> problem, but I do have some general thoughts that may help. >> >> I.
>> It seems to be a similar (not same, and even similar may be a stretch) problem >> to the parameter formatting problem 'solved' by .paramstyle, which is one of the >> worst solutions I have seen to any problem :) to me, the parameter problem >> should be solved "in" the db-api layer, not above it (assuming application is >> above database.) Again, I agree that the type problem is not the same as >> parameter, so it may need a different solution that will have similarities to >> what I dislike about .paramstyle. > > I'm not sure exactly what you mean here. paramstyle fulfills its > function of giving the API implementor enough freedom to get the job > done and letting the application developer know which option the > implementor chose. Now, IMHO format and pyformat paramstyles are an > abomination that should disappear from future versions of DB-API, and > qmark should be the mandatory minimum, but I digress. to me it gave the API implementor too much freedom for no good reason, other than to make it easier by making it the application developer's problem. I don't see why one format couldn't have been chosen and all API implementors could work with it, just like all application developers now have to work with it. > > My outputmap/inputmap proposal is built with flexibility in mind. Maybe > that's what's bothering you, I don't know. > >> II. (going out on a limb here...) >> A user defined data type is used (therefore defined) by the application, so it >> makes perfect sense that the app layer would have implementation details. If >> the person defining the datatype wants to use it across applications, then they >> will code an appropriate class. I can't see how a user defined anything can be >> handled generically. > > Of course. Informix Dynamic Server allows user-defined types, which is > precisely why my outputmap/inputmap mechanism is flexible enough to > handle a lot of different use case scenarios. But it also provides a > foundation for handling common use cases with standard typemaps that > have database-independent names and agreed-upon semantics. That sounds in line with my thoughts. Carl K From aprotin at research.att.com Tue May 22 15:22:20 2007 From: aprotin at research.att.com (Art Protin) Date: Tue, 22 May 2007 09:22:20 -0400 Subject: [DB-SIG] Controlling return types, again In-Reply-To: <46526E5F.60104@personnelware.com> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> Message-ID: <4652EE8C.4070801@research.att.com> Dear folks, Carl Karsten wrote: >Carsten Haese wrote: > > >>On Mon, 2007-05-21 at 16:46 -0500, Carl Karsten wrote: >> >> >>>I have been following this on and off. I don't have much to add to the specific >>>problem, but I do have some general thoughts that may help. >>> >>>I. >>>It seems to be a similar (not same, and even similar may be a stretch) problem >>>to the parameter formatting problem 'solved' by .paramstyle, which is one of the >>>worst solutions I have seen to any problem :) to me, the parameter problem >>>should be solved "in" the db-api layer, not above it (assuming application is >>>above database.) Again, I agree that the type problem is not the same as >>>parameter, so it may need a different solution that will have similarities to >>>what I dislike about .paramstyle. >>> >>> >>I'm not sure exactly what you mean here.
paramstyle fulfills its >>function of giving the API implementor enough freedom to get the job >>done and letting the application developer know which option the >>implementor chose. Now, IMHO format and pyformat paramstyles are an >>abomination that should disappear from future versions of DB-API, and >>qmark should be the mandatory minimum, but I digress. >> >> > >to me it gave the API implementor too much freedom for no good reason, other >than to make it easier by making it the application developer's problem. I don't >see why one format couldn't have been chosen and all API implementors could work >with it, just like all application developers now have to work with it. > > > Initially I also saw the .paramstyle as providing the implementor too much freedom. However, I chose to take it as a challenge. In my implementation, .paramstyle is not read-only but read-write and all of the defined styles are accepted. In fact, once I figured out how to handle the parsing of SQL comments & literals, parsing of any style of parameters was almost trivial (I had to do that parsing in Python due to the nature of our system). [Actually, I had to add code to block the styles "pyformat" and "format" because my boss agreed with Carsten, although he would have preferred that I removed code.] In my opinion (which is never as humble as it should be), "qmark" is barely adequate; numeric should be the required minimum. But now that so many have gotten used to "qmark", it will probably never go away. I am glad that the spec. [PEP 249] did not require that .paramstyle be read-only and now oppose any attempt to "correct" that oversight. >>My outputmap/inputmap proposal is built with flexibility in mind. Maybe >>that's what's bothering you, I don't know. >> >> >>>II. (going out on a limb here...) >>>A user defined data type is used (therefore defined) by the application, so it >>>makes perfect sense that the app layer would have implementation details. If >>>the person defining the datatype wants to use it across applications, then they >>>will code an appropriate class. I can't see how a user defined anything can be >>>handled generically. >>> >>> >>Of course. Informix Dynamic Server allows user-defined types, which is >>precisely why my outputmap/inputmap mechanism is flexible enough to >>handle a lot of different use case scenarios. But it also provides a >>foundation for handling common use cases with standard typemaps that >>have database-independent names and agreed-upon semantics. >> >> > >That sounds in line with my thoughts. > >Carl K > > Thank you all, Art Protin
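Art's remark above, that parameter parsing becomes almost trivial once SQL comments and literals are handled, can be illustrated with a short sketch. The following converts named style (:name) placeholders to qmark while leaving quoted literals and -- comments untouched. It is a deliberate simplification (no /* */ comments, no dollar quoting) and not the code of any particular driver.

    import re

    # One token at a time: a quoted literal (with '' escapes), a
    # single-line comment, or a named parameter. Literals and comments
    # are matched first, so a ":name" inside them is never rewritten.
    _token = re.compile(r"""
          '(?:[^']|'')*'                 # single-quoted SQL literal
        | --[^\n]*                       # single-line comment
        | :([A-Za-z_][A-Za-z0-9_]*)      # named parameter
    """, re.VERBOSE)

    def named_to_qmark(statement):
        """Return (qmark_statement, parameter_names_in_order)."""
        names = []
        def replace(match):
            name = match.group(1)
            if name is None:
                return match.group(0)  # literal or comment: keep verbatim
            names.append(name)
            return "?"
        return _token.sub(replace, statement), names

    # Example:
    #   named_to_qmark("select * from t where a = :a and b = ':not_a_param'")
    #   -> ("select * from t where a = ? and b = ':not_a_param'", ['a'])

The same tokenizer, with the placeholder branch swapped out, handles numeric (:1) or qmark input just as easily, which is what makes supporting several styles in one driver cheap.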
From carsten at uniqsys.com Tue May 22 16:16:24 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Tue, 22 May 2007 10:16:24 -0400 Subject: [DB-SIG] paramstyles, again (was: Controlling return types, again) In-Reply-To: <4652EE8C.4070801@research.att.com> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> <4652EE8C.4070801@research.att.com> Message-ID: <1179843384.3373.37.camel@dot.uniqsys.com> On Tue, 2007-05-22 at 09:22 -0400, Art Protin wrote: > Initially I also saw the .paramstyle as providing the implementor too > much freedom. > > However, I chose to take it as a challenge. In my > implementation, .paramstyle is not > read-only but read-write and all of the defined styles are accepted. > In fact, once I > figured out how to handle the parsing of SQL comments & literals, > parsing of any > style of parameters was almost trivial (I had to do that parsing in > Python due to the > nature of our system). [Actually, I had to add code to block the > styles "pyformat" > and "format" because my boss agreed with Carsten, although he would > have preferred > that I removed code.] > > In my opinion (which is never as humble as it should be), "qmark" is > barely adequate; > numeric should be the required minimum. But now that so many have > gotten used > to "qmark", it will probably never go away. > > I am glad that the spec. [PEP 249] did not require that .paramstyle be > read-only and > now oppose any attempt to "correct" that oversight. Making the module's paramstyle writable is an odd approach. If you have one function that needs one style and another function that needs another, each function will have to explicitly set the module-level paramstyle attribute before executing the query, for fear that another function might have changed it in the meantime. That faint screaming you hear in the distance is thread-safety being thrown out the window. As an example of an alternative, InformixDB allows qmark, numeric, and named parameters, and execute() recognizes on the fly which one you're using. The paramstyle attribute only "advertises" numeric, because there is no way to advertise all supported styles while remaining compliant with the spec. qmark may not be adequate in your opinion, but it has the advantage that it's the SQL standard, as far as I know. Hence, it's the parameter style that's most likely to be a native parameter style in commercial SQL implementations. If we made qmark mandatory, application developers could rely on the fact that at least qmark style will be supported. Allowing additional parameter styles, either recognized on the fly by execute(), or given as a cursor attribute, or given as an optional argument to execute(), should be encouraged, but not required. This has been discussed before, but I'd like to re-cast a vote on this for DB-API 3.0: * Deprecate/disallow pyformat/format paramstyles. * Make qmark the mandatory minimum paramstyle. Allowing additional parameter styles (numeric and/or named) would be optional and implementation specific.
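The on-the-fly recognition described above is simple to sketch. Given a statement whose literals and comments have already been masked out, for instance by a tokenizer like the one shown earlier, classifying the placeholder style takes only a few tests. This is an illustration of the idea, not InformixDB's actual code.

    import re

    def detect_paramstyle(masked_sql):
        """Classify the placeholders in a statement whose literals and
        comments have already been masked out. Returns 'qmark',
        'numeric', 'named', or None if the statement takes no parameters."""
        if "?" in masked_sql:
            return "qmark"
        if re.search(r":\d+", masked_sql):           # :1, :2, ...
            return "numeric"
        if re.search(r":[A-Za-z_]\w*", masked_sql):  # :name
            return "named"
        return None

    # execute() could then pick a binding strategy per statement, e.g.
    # requiring a sequence for qmark/numeric and a mapping for named style.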
Best regards, -- Carsten Haese http://informixdb.sourceforge.net From aprotin at research.att.com Tue May 22 17:24:18 2007 From: aprotin at research.att.com (Art Protin) Date: Tue, 22 May 2007 11:24:18 -0400 Subject: [DB-SIG] paramstyles, again In-Reply-To: <1179843384.3373.37.camel@dot.uniqsys.com> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> <4652EE8C.4070801@research.att.com> <1179843384.3373.37.camel@dot.uniqsys.com> Message-ID: <46530B22.9090706@research.att.com> Dear folks, Carsten Haese wrote: >On Tue, 2007-05-22 at 09:22 -0400, Art Protin wrote: > > >>Initially I also saw the .paramstyle as providing the implementor too >>much freedom. >> >>However, I chose to take it as a challenge. In my >>implementation, .paramstyle is not >>read-only but read-write and all of the defined styles are accepted. >>In fact, once I >>figured out how to handle the parsing of SQL comments & literals, >>parsing of any >>style of parameters was almost trivial (I had to do that parsing in >>Python due to the >>nature of our system). [Actually, I had to add code to block the >>styles "pyformat" >>and "format" because my boss agreed with Carsten, although he would >>have preferred >>that I removed code.] >> >>In my opinion (which is never as humble as it should be), "qmark" is >>barely adequate; >>numeric should be the required minimum. But now that so many have >>gotten used >>to "qmark", it will probably never go away. >> >>I am glad that the spec. [PEP 249] did not require that .paramstyle be >>read-only and >>now oppose any attempt to "correct" that oversight. >> >> > >Making the module's paramstyle writable is an odd approach. If you have >one function that needs one style and another function that needs >another, each function will have to explicitly set the module-level >paramstyle attribute before executing the query, for fear that another >function might have changed it in the meantime. That faint screaming you >hear in the distance is thread-safety being thrown out the window. > > I have not spent any time yet on supporting thread safety. But I see that making .paramstyle writable hurts there. So,... This seems like the lead-in to suggesting that the API be extended to separate query specification from query execution. I currently have query objects being created and handled transparently by the cursor objects. I suspect that a better interface would allow the query object to be explicitly created, and then paramstyle would be an attribute of the query that could be controlled both explicitly and thread safely. (I use the query string as key for looking up the query object so as to be able to reuse the preparation work required by each query.) What additional benefits would a separate query class provide? What liabilities would it create? >As an example of an alternative, InformixDB allows qmark, numeric, and >named parameters, and execute() recognizes on the fly which one you're >using. The paramstyle attribute only "advertises" numeric, because there >is no way to advertise all supported styles while remaining compliant >with the spec. > >qmark may not be adequate in your opinion, but it has the advantage that >it's the SQL standard, as far as I know. Hence, it's the parameter style >that's most likely to be a native parameter style in commercial SQL >implementations.
If we made qmark mandatory, application developers >could rely on the fact that at least qmark style will be supported. >Allowing additional parameter styles, either recognized on the fly by >execute(), or given as a cursor attribute, or given as an optional >argument to execute(), should be encouraged, but not required. > >This has been discussed before, but I'd like to re-cast a vote on this >for DB-API 3.0: > >* Deprecate/disallow pyformat/format paramstyles. > > > I can not get my boss to adequately describe why he dislikes these parameter styles. Can you offer up a rationale to help me see a reason to eschew pyformat and format? >* Make qmark the mandatory minimum paramstyle. Allowing additional >parameter styles (numeric and/or named) would be optional and >implementation specific. > > How about raising the bar. Make qmark, numeric, and named all required. It does not take much Python code to adjust between them (to be able to implement any one in terms of any other). Then maybe SQL will be motivated to get to numeric. Why let them bring us down to the least common denominator? [Is this extreme enough?] >Best regards, > > > Thank you all, Art Protin From carsten at uniqsys.com Tue May 22 18:12:37 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Tue, 22 May 2007 12:12:37 -0400 Subject: [DB-SIG] paramstyles, again In-Reply-To: <46530B22.9090706@research.att.com> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> <4652EE8C.4070801@research.att.com> <1179843384.3373.37.camel@dot.uniqsys.com> <46530B22.9090706@research.att.com> Message-ID: <1179850357.3373.60.camel@dot> On Tue, 2007-05-22 at 11:24 -0400, Art Protin wrote: > This seems like the lead-in to suggesting that the API be extended to > separate > query specification from query execution. I currently have query > objects > being created and handled transparently by the cursor objects. I > suspect > that a better interface would allow the query object to be explicitly > created, > and then paramstyle would be an attribute of the query that could be > controlled both explicitly and thread safely. (I use the query string > as > key for looking up the query object so as to be able to reuse the > preparation > work required by each query.) It doesn't seem to make sense to allow the user to control the parameter style after your internal query object is created, since the query represents the statement that it executes, and a different parameter style would require a different statement. The most straightforward solution is to add an optional keyword parameter to execute(). You can store that on your query object if you'd like, but if the user wants to execute the query with a different parameter style, they're going to have to call execute() again anyway. > What additional benefits would a separate query class provide? None that I can see. > What liabilities would it create? Clutter. > > * Deprecate/disallow pyformat/format paramstyles. > > I can not get my boss to adequately describe why he dislikes these > parameter styles. > Can you offer up a rationale to help me see a reason to eschew > pyformat and format? 1) They require literal percent signs to be escaped as %%.
2) They imply that parameter passing is a string formatting exercise, which is only true in the dumbest of database implementations. Also, the subtle difference between curs.execute("insert into tab values(%s,%s)" % (1,2) ) #WRONG! and curs.execute("insert into tab values(%s,%s)", (1,2) ) #CORRECT! makes it hard for newbies and pros alike to recognize the difference between string formatting and parameter passing, and a lot of bad code has been written as a result. Using question marks makes it immediately obvious that something special is going on. > Make qmark, numeric, and named all required. It does not take much > Python > code to adjust between them (to be able to implement any one in terms > of any > other) . Then maybe SQL will be modivated to get to numeric. Why let > them > bring us down to the least common denominator? -1. It may not have taken much to implement on your backend, but that may not be universally true. Even if "not much" code is required, the amount is greater than zero, for no obvious benefit. Even requiring qmark may require non-trivial code additions to some existing API modules, but I think the effort would be justified. Requiring numeric and named as well just adds a gratuitous implementation hurdle, and it would seriously hurt the acceptability of this API change. Best regards, -- Carsten Haese http://informixdb.sourceforge.net From dieter at handshake.de Tue May 22 19:21:01 2007 From: dieter at handshake.de (Dieter Maurer) Date: Tue, 22 May 2007 19:21:01 +0200 Subject: [DB-SIG] Controlling return types, again In-Reply-To: <4652EE8C.4070801@research.att.com> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> <4652EE8C.4070801@research.att.com> Message-ID: <18003.9853.748955.993147@gargle.gargle.HOWL> Art Protin wrote at 2007-5-22 09:22 -0400: > ... >In my opinion (which is never as humble as it should be), "qmark" is >barely adequate; >numeric should be the required minimum. But now that so many have >gotten used >to "qmark", it will probably never go away. If we speak about readability and safety, "%(name)s" combined with a dictionary is far better than "numeric" or "qmark". SQL statements can get quite a lot of parameters and readability is therefore valuable... -- Dieter From carsten at uniqsys.com Tue May 22 19:42:31 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Tue, 22 May 2007 13:42:31 -0400 Subject: [DB-SIG] Controlling return types, again In-Reply-To: <18003.9853.748955.993147@gargle.gargle.HOWL> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> <4652EE8C.4070801@research.att.com> <18003.9853.748955.993147@gargle.gargle.HOWL> Message-ID: <1179855751.3373.65.camel@dot.uniqsys.com> On Tue, 2007-05-22 at 19:21 +0200, Dieter Maurer wrote: > Art Protin wrote at 2007-5-22 09:22 -0400: > > ... > >In my opinion (which is never as humble as it should be), "qmark" is > >barely adequate; > >numeric should be the required minimum. But now that so many have > >gotten used > >to "qmark", it will probably never go away. > > If we speak about readability and safety, "%(name)s" combined > with a dictionary is far better than "numeric" or "qmark". > > SQL statements can get quite a lot of parameters and readability is > therefore valuable... I agree, but named style, i.e. 
":name" is even more readable, and it's not as easily confused with string formatting. -- Carsten Haese http://informixdb.sourceforge.net From mal at egenix.com Tue May 22 19:53:36 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 22 May 2007 19:53:36 +0200 Subject: [DB-SIG] Controlling return types, again In-Reply-To: <1179855751.3373.65.camel@dot.uniqsys.com> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> <4652EE8C.4070801@research.att.com> <18003.9853.748955.993147@gargle.gargle.HOWL> <1179855751.3373.65.camel@dot.uniqsys.com> Message-ID: <46532E20.7050205@egenix.com> On 2007-05-22 19:42, Carsten Haese wrote: > On Tue, 2007-05-22 at 19:21 +0200, Dieter Maurer wrote: >> Art Protin wrote at 2007-5-22 09:22 -0400: >>> ... >>> In my opinion (which is never as humble as it should be), "qmark" is >>> barely adequate; >>> numeric should be the required minimum. But now that so many have >>> gotten used >>> to "qmark", it will probably never go away. >> If we speak about readability and safety, "%(name)s" combined >> with a dictionary is far better than "numeric" or "qmark". >> >> SQL statements can get quite a lot of parameters and readability is >> therefore valuable... > > I agree, but named style, i.e. ":name" is even more readable, and it's > not as easily confused with string formatting. FWIW: Last time we discussed this, qmark was the agreed standard. Not because it's the easiest to read or safest, but simply because it's easy to implement and convert into all other styles. The named styles were out-ruled due to the confusion this causes among the users: most think they have to write the SQL statement as command % parameters which completely bypasses the advantages of bound parameters and indeed introduces security risks. I'm biased, of course, since ODBC does qmark, but still, I've never really had problems with it. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 22 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From aprotin at research.att.com Tue May 22 23:08:46 2007 From: aprotin at research.att.com (Art Protin) Date: Tue, 22 May 2007 17:08:46 -0400 Subject: [DB-SIG] paramstyles, again In-Reply-To: <1179850357.3373.60.camel@dot> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> <4652EE8C.4070801@research.att.com> <1179843384.3373.37.camel@dot.uniqsys.com> <46530B22.9090706@research.att.com> <1179850357.3373.60.camel@dot> Message-ID: <46535BDE.8060504@research.att.com> Dear folks, Carsten Haese wrote: >On Tue, 2007-05-22 at 11:24 -0400, Art Protin wrote: > > >>This seems like the lead in to suggesting that the API be extended to >>separate >>query specification from query execution. I currently have query >>objects >>being created and handled transparently by the cursor objects. 
I >>suspect >>that a better interface would allow the query object be explicitly >>created >>and then paramstyle would be an attribute of the query that could be >>controlled both explicitly and thread safely. (I use the query string >>as >>key for looking up the query object so as be able to reuse the >>preparation >>work required by each query.) >> >> > >It doesn't seem to make sense to allow the user to control the parameter >style after your internal query object is created, since the query >represents the statement that it executes, and a different parameter >style would require a different statement. > > > Duh. I am going to have to learn to not respond so quickly. It would have to be an argument to the creation of the query object and there after a read-only attribute. >The most straightforward solution is to add an optional keyword >parameter to execute(). You can store that on your query object if you'd >like, but if the user wants to execute the query with a different >parameter style, they're going to have to call execute() again anyway. > > > I will have to take this under consideration after I finish hacking in support for discovering the DB metadata since our DBMS does not support SQL queries on the system tables (it doesn't really have system tables). It does sound better than what I put in. >>What additional benefits would a separate query class provide? >> >> > >None that I can see. > > Look below in your own response to my question about paramstyle=format. Separating out the query would greatly reduce the confusion between parameters and string formatting as then the two would be in separate commands. > > >>What liabilities would it create? >> >> > >Clutter. > > > >>>* Deprecate/disallow pyformat/format paramstyles. >>> >>> >>> >>> >>I can not get my boss to adequately describe why he dislikes these >>parameter styles. >>Can you offer up a rationale to help me see a reason to eschew >>pyformat and format? >> >> > >1) They require literal percent signs to be escaped as %%. > >2) They imply that parameter passing is a string formatting exercise, >which is only true in the dumbest of database implementations. Also, the >subtle difference between > >curs.execute("insert into tab values(%s,%s)" % (1,2) ) #WRONG! > >and > >curs.execute("insert into tab values(%s,%s)", (1,2) ) #CORRECT! > >makes it hard for newbies and pros alike to recognize the difference >between string formatting and parameter passing, and a lot of bad code >has been written as a result. Using question marks makes it immediately >obvious that something special is going on. > > > >>Make qmark, numeric, and named all required. It does not take much >>Python >>code to adjust between them (to be able to implement any one in terms >>of any >>other) . Then maybe SQL will be modivated to get to numeric. Why let >>them >>bring us down to the least common denominator? >> >> > >-1. It may not have taken much to implement on your backend, but that >may not be universally true. Even if "not much" code is required, the >amount is greater than zero, for no obvious benefit. Even requiring >qmark may require non-trivial code additions to some existing API >modules, but I think the effort would be justified. Requiring numeric >and named as well just adds a gratuitous implementation hurdle, and it >would seriously hurt the acceptability of this API change. > >Best regards, > > > Thank you, Art Protin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/db-sig/attachments/20070522/919a278f/attachment.html From carl at personnelware.com Wed May 23 00:33:27 2007 From: carl at personnelware.com (Carl Karsten) Date: Tue, 22 May 2007 17:33:27 -0500 Subject: [DB-SIG] client side sub queries Message-ID: <46536FB7.3030600@personnelware.com> Or some such abomination of results of one query as a parameter of a 2nd. given my use case, I can understand why this isn't in the spec, and why it may never be. but it seems to come up more often that I would expect, so here we go. My current problem: reconcile transaction details that are off due to rounding errors. the 2 sets of details are stored on different servers, and no chance of getting one server to hit the 2nd, so the python client code is going to have to help by getting a list of keys from one and constructing "WHERE OtherKey IN ( 'key1', 'key2', 'key3', ...)" which isn't 'hard' but I find annoying that I have to convert formats in the application layer. I have no idea how this should be implemented. I can imagine something like this: cSql="select ktbl1_pk from tbl1 where cFid1 = %(id)s" cur1.execute( cSql, { 'id':'a' } ) rows1 = cur.fetchall() cSql = "select ktbl2_fk from tbl3 where ktbl1_fk IN %l" cur2.execute( cSql, rows1 ) Or maybe even pass the whole cursor in: cSql="select ktbl1_pk from tbl1 where cFid1 = %(id)s" cur1.execute( cSql, { 'id':'a' } ) cSql = "select ktbl2_fk from tbl3 where ktbl1_fk IN %c" cur2.execute( cSql, cur ) In case it isn't clear what I am trying to do, below is working code including the CREATES. (which actually have more tables than are used by what I posted - cuz my over all task is even worse.) Carl K # get first set cSql="select ktbl1_pk from tbl1 where cFid1 = %(id)s" cur.execute( cSql, { 'id':'a' } ) rows = cur.fetchall() # get 2nd based on first. list = ["'%s'" % x for x in rows] cList = ','.join( list ) cSqlWhere = "ktbl1_fk IN (%s)" % cList cSql = "select ktbl2_fk from tbl3 where %s" % cSqlWhere print cSql # select ktbl2_fk from tbl3 where ktbl1_fk IN ('1','2','3') cur.execute( cSql ) rows = cur.fetchall() # mkTestdb.sql # tbl1 and tbl2 hold monies that should be equal for one transaction # tbl3 is the join table # tbl1.cFid1 is the transaction ID. drop database testdb1; create database testdb1; grant all on testdb1.* to testUserA IDENTIFIED BY 'pw'; create table testdb1.tbl1 ( ktbl1_pk int auto_increment primary key, cFid1 char(10), nFid2 decimal(10,2) ); create table testdb1.tbl2 ( ktbl2_pk int auto_increment primary key, nFid2 decimal(10,2) ); create table testdb1.tbl3 ( ktbl3_pk int auto_increment primary key, ktbl1_fk int references tbl1, ktbl2_fk int references tbl2); # sample data: # trasaction #a # t1 (1.01, 1.02, 1.03) # t2 (1.01, 1.02, 1.04) # but not stored in the same order # (will work on exactly how to deal with that) # looks like this will work for Oracle: # ORDER BY decode( X, n1, 1, n2, 2, n3, 3...) 
insert into testdb1.tbl1 (ktbl1_pk, cFid1, nfid2) values (1, 'a', 1.01) ; insert into testdb1.tbl1 (ktbl1_pk, cFid1, nfid2) values (2, 'a', 1.02) ; insert into testdb1.tbl1 (ktbl1_pk, cFid1, nfid2) values (3, 'a', 1.03) ; insert into testdb1.tbl2 (ktbl2_pk, nfid2) values (1, 1.01) ; insert into testdb1.tbl2 (ktbl2_pk, nfid2) values (2, 1.02) ; insert into testdb1.tbl2 (ktbl2_pk, nfid2) values (3, 1.04) ; insert into testdb1.tbl3 (ktbl1_fk,ktbl2_fk) values (1, 1) ; insert into testdb1.tbl3 (ktbl1_fk,ktbl2_fk) values (3, 3) ; insert into testdb1.tbl3 (ktbl1_fk,ktbl2_fk) values (2, 2) ; Carl K From carsten at uniqsys.com Wed May 23 07:12:36 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Wed, 23 May 2007 01:12:36 -0400 Subject: [DB-SIG] client side sub queries In-Reply-To: <46536FB7.3030600@personnelware.com> References: <46536FB7.3030600@personnelware.com> Message-ID: <20070523045615.M82360@uniqsys.com> On Tue, 22 May 2007 17:33:27 -0500, Carl Karsten wrote > Or some such abomination of results of one query as a parameter of a > 2nd. > > given my use case, I can understand why this isn't in the spec, and > why it may never be. but it seems to come up more often that I > would expect, so here we go. > > My current problem: reconcile transaction details that are off due > to rounding errors. the 2 sets of details are stored on different > servers, and no chance of getting one server to hit the 2nd, so the > python client code is going to have to help by getting a list of > keys from one and constructing "WHERE OtherKey IN ( 'key1', 'key2', > 'key3', ...)" which isn't 'hard' but I find annoying that I have > to convert formats in the application layer. Option 1: Create a temporary table on one server and load the data from the other server into it. Then, use a server side subquery or join the tables together. If you don't have write permission on either server, there is Option 2: Fetch all relevant data from server 1 into client memory, fetch all relevant data from server 2 into client memory, and do the reconciliation in client memory. There is also Option 3: Use actual parameter passing to build a WHERE ... IN (...) clause: cSql = ("select ktbl2_fk from tbl3 where OtherKey IN (" +",".join("%s" for _ in cList) +")" ) cur.execute(cSql, cList) HTH, -- Carsten Haese http://informixdb.sourceforge.net From carsten at uniqsys.com Wed May 23 07:29:48 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Wed, 23 May 2007 01:29:48 -0400 Subject: [DB-SIG] paramstyles, again In-Reply-To: <46535BDE.8060504@research.att.com> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> <4652EE8C.4070801@research.att.com> <1179843384.3373.37.camel@dot.uniqsys.com> <46530B22.9090706@research.att.com> <1179850357.3373.60.camel@dot> <46535BDE.8060504@research.att.com> Message-ID: <20070523051617.M15665@uniqsys.com> On Tue, 22 May 2007 17:08:46 -0400, Art Protin wrote > Separating out the query would greatly reduce the confusion between parameters > and string formatting as then the two would be in separate commands. Ah, you mean a "prepared statement" object. This topic comes up from time to time, but there are at least two DB-API implementations (InformixDB and either mxODBC or cxOracle, or maybe both) that allow prepared statements without the clutter of a Query object: The cursor object *is* the query object, in essence. 
For example, in InformixDB you can write: conn = informixdb.connect("test") cur = conn.cursor() cur.prepare("select * from customer where cust_num = ?") # later... cur.execute(cur.command, (1,) ) Instead of cur.command, InformixDB also allows passing None as the statement, which makes for shorter code if the cursor has a long name, which it might, since the cursor is being used like a prepared statement, so its name should reflect what it does. Best regards, -- Carsten Haese http://informixdb.sourceforge.net From mal at egenix.com Wed May 23 12:27:18 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 23 May 2007 12:27:18 +0200 Subject: [DB-SIG] paramstyles, again In-Reply-To: <20070523051617.M15665@uniqsys.com> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> <4652EE8C.4070801@research.att.com> <1179843384.3373.37.camel@dot.uniqsys.com> <46530B22.9090706@research.att.com> <1179850357.3373.60.camel@dot> <46535BDE.8060504@research.att.com> <20070523051617.M15665@uniqsys.com> Message-ID: <46541706.9060508@egenix.com> On 2007-05-23 07:29, Carsten Haese wrote: > On Tue, 22 May 2007 17:08:46 -0400, Art Protin wrote >> Separating out the query would greatly reduce the confusion between parameters >> and string formatting as then the two would be in separate commands. > > Ah, you mean a "prepared statement" object. This topic comes up from time to > time, but there are at least two DB-API implementations (InformixDB and either > mxODBC or cxOracle, or maybe both) that allow prepared statements without the > clutter of a Query object: The cursor object *is* the query object, in essence. Just to confirm: mxODBC does support this approach and it works great for caching cursors with readily prepared statements in long running applications. > For example, in InformixDB you can write: > > conn = informixdb.connect("test") > cur = conn.cursor() > cur.prepare("select * from customer where cust_num = ?") > # later... > cur.execute(cur.command, (1,) ) > > Instead of cur.command, InformixDB also allows passing None as the statement, > which makes for shorter code if the cursor has a long name, which it might, > since the cursor is being used like a prepared statement, so its name should > reflect what it does. The idea behind having cur.command stems from an old optimization that we have in DB-API: if the interface detects that the same command is used for execution, it may reuse the already prepared statement for that command. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 23 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From mal at egenix.com Wed May 23 12:36:39 2007 From: mal at egenix.com (M.-A. 
Lemburg) Date: Wed, 23 May 2007 12:36:39 +0200 Subject: [DB-SIG] paramstyles, again In-Reply-To: <1179843384.3373.37.camel@dot.uniqsys.com> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> <4652EE8C.4070801@research.att.com> <1179843384.3373.37.camel@dot.uniqsys.com> Message-ID: <46541937.5030709@egenix.com> On 2007-05-22 16:16, Carsten Haese wrote: > This has been discussed before, but I'd like to re-cast a vote on this > for DB-API 3.0: > > * Deprecate/disallow pyformat/format paramstyles. +0 > * Make qmark the mandatory minimum paramstyle. Allowing additional > parameter styles (numeric and/or named) would be optional and > implementation specific. +1 Just an aside: Note that the named style allows binding the same parameter more than once. This poses a few problems for interfaces which rely on the database telling the interface how to best bind a parameter, since it may well be that case that a parameter needs to be bound as e.g. integer in one place (e.g. as index) and float in another (e.g. if used in a formula). For numbers, it's fairly obvious what to do (create multiple bindings for the object), but it's not for objects that don't easily allow retrieving the same value twice, such as iterators or files. I don't know how interfaces that do support named style deal with this problem. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 23 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From carsten at uniqsys.com Wed May 23 13:37:22 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Wed, 23 May 2007 07:37:22 -0400 Subject: [DB-SIG] paramstyles, again In-Reply-To: <46541937.5030709@egenix.com> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> <4652EE8C.4070801@research.att.com> <1179843384.3373.37.camel@dot.uniqsys.com> <46541937.5030709@egenix.com> Message-ID: <1179920242.3409.40.camel@dot.uniqsys.com> On Wed, 2007-05-23 at 12:36 +0200, M.-A. Lemburg wrote: > Just an aside: > > Note that the named style allows binding the same parameter > more than once. Ditto for numeric style. > This poses a few problems for interfaces which > rely on the database telling the interface how to best bind > a parameter, since it may well be that case that a parameter > needs to be bound as e.g. integer in one place (e.g. as index) > and float in another (e.g. if used in a formula). > > For numbers, it's fairly obvious what to do (create multiple > bindings for the object), but it's not for objects that don't > easily allow retrieving the same value twice, such as > iterators or files. > > I don't know how interfaces that do support named style deal > with this problem. InformixDB supports binding by name, and in the real world this is not a problem at all. 
The parameter values must be supplied in a mapping, and the input binding loop simply calls __getitem__ on that mapping, possibly requesting the same key twice. If the same key is requested twice, you'd have to work hard to make that *not* return the same object twice. You'd have to pass in a dictionary-like object whose __getitem__ method has deliberate side-effects, and if you do that, you deserve to be punished. In all cases I've seen, the parameter mapping is a plain dictionary which is either built "by hand" or uses locals() to emulate host variables: start_date = datetime(2007,1,1) end_date = datetime.date.today() cur.execute(""" select * from orders where order_date between :start_date and :end_date """, locals() ) Even if one of the entries in the dictionary is the result of calling it.next() or f.readline(), the dictionary stores the result, and that result can be retrieved repeatedly without any problems. [Footnote: The locals() idiom for emulating host variables is what convinced me to implement named binding in the first place. For programmers coming to Python from a "4GL" or ESQL/C, having something similar to host variables is a neat feature.] Best regards, -- Carsten Haese http://informixdb.sourceforge.net From mal at egenix.com Wed May 23 13:52:14 2007 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 23 May 2007 13:52:14 +0200 Subject: [DB-SIG] paramstyles, again In-Reply-To: <1179920242.3409.40.camel@dot.uniqsys.com> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> <4652EE8C.4070801@research.att.com> <1179843384.3373.37.camel@dot.uniqsys.com> <46541937.5030709@egenix.com> <1179920242.3409.40.camel@dot.uniqsys.com> Message-ID: <46542AEE.6050203@egenix.com> On 2007-05-23 13:37, Carsten Haese wrote: > On Wed, 2007-05-23 at 12:36 +0200, M.-A. Lemburg wrote: >> Just an aside: >> >> Note that the named style allows binding the same parameter >> more than once. > > Ditto for numeric style. > >> This poses a few problems for interfaces which >> rely on the database telling the interface how to best bind >> a parameter, since it may well be that case that a parameter >> needs to be bound as e.g. integer in one place (e.g. as index) >> and float in another (e.g. if used in a formula). >> >> For numbers, it's fairly obvious what to do (create multiple >> bindings for the object), but it's not for objects that don't >> easily allow retrieving the same value twice, such as >> iterators or files. >> >> I don't know how interfaces that do support named style deal >> with this problem. > > InformixDB supports binding by name, and in the real world this is not a > problem at all. The parameter values must be supplied in a mapping, and > the input binding loop simply calls __getitem__ on that mapping, > possibly requesting the same key twice. If the same key is requested > twice, you'd have to work hard to make that *not* return the same object > twice. You'd have to pass in a dictionary-like object whose __getitem__ > method has deliberate side-effects, and if you do that, you deserve to > be punished. That's not what I was asking. The problem (or maybe it's a non-issue in the real word) is: what happens at binding time to the data fetched from the object you bind to a command parameter, e.g. 
say the interface supports reading data from a file (instead of just using a string): file = open('my.dat', 'rb') cursor.execute('insert into mytable values (:data, :data)', {'data': file}) In theory, the interface would have to read and buffer the data from the file in order to be able to provide two bindings to the database. Regards, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 23 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From carsten at uniqsys.com Wed May 23 14:32:41 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Wed, 23 May 2007 08:32:41 -0400 Subject: [DB-SIG] paramstyles, again In-Reply-To: <46542AEE.6050203@egenix.com> References: <1179545605.3279.19.camel@localhost.localdomain> <46521346.70208@personnelware.com> <1179792083.3254.12.camel@localhost.localdomain> <46526E5F.60104@personnelware.com> <4652EE8C.4070801@research.att.com> <1179843384.3373.37.camel@dot.uniqsys.com> <46541937.5030709@egenix.com> <1179920242.3409.40.camel@dot.uniqsys.com> <46542AEE.6050203@egenix.com> Message-ID: <1179923561.3409.86.camel@dot.uniqsys.com> On Wed, 2007-05-23 at 13:52 +0200, M.-A. Lemburg wrote: > That's not what I was asking. The problem (or maybe it's a non-issue > in the real word) is: what happens at binding time to the data > fetched from the object you bind to a command parameter, e.g. > say the interface supports reading data from a file (instead of > just using a string): > > file = open('my.dat', 'rb') > cursor.execute('insert into mytable values (:data, :data)', > {'data': file}) > > In theory, the interface would have to read and buffer > the data from the file in order to be able to provide > two bindings to the database. I see. That would require a database interface that implicitly reads the file's contents when that file is bound as an input parameter. I don't know of any interfaces that do that, and I would find that behavior rather surprising. Remember, explicit is better than implicit. If I wanted the file to be read, I'd read it explicitly. The only way that I see for how the file contents could be read implicitly upon input binding would be if some kind of inputmap were involved, possibly of the kind I proposed before this thread was hijacked. The problem could then be avoided by memoizing the result of the adapter call, either in the adapter function itself or within the DB-API layer. Best regards, -- Carsten Haese http://informixdb.sourceforge.net From carl at personnelware.com Wed May 23 20:13:18 2007 From: carl at personnelware.com (Carl Karsten) Date: Wed, 23 May 2007 13:13:18 -0500 Subject: [DB-SIG] client side sub queries In-Reply-To: <20070523045615.M82360@uniqsys.com> References: <46536FB7.3030600@personnelware.com> <20070523045615.M82360@uniqsys.com> Message-ID: <4654843E.8090109@personnelware.com> Carsten Haese wrote: > On Tue, 22 May 2007 17:33:27 -0500, Carl Karsten wrote >> Or some such abomination of results of one query as a parameter of a >> 2nd. 
>> >> given my use case, I can understand why this isn't in the spec, and >> why it may never be. but it seems to come up more often that I >> would expect, so here we go. >> >> My current problem: reconcile transaction details that are off due >> to rounding errors. the 2 sets of details are stored on different >> servers, and no chance of getting one server to hit the 2nd, so the >> python client code is going to have to help by getting a list of >> keys from one and constructing "WHERE OtherKey IN ( 'key1', 'key2', >> 'key3', ...)" which isn't 'hard' but I find annoying that I have >> to convert formats in the application layer. > > Option 1: Create a temporary table on one server and load the data from the > other server into it. Then, use a server side subquery or join the tables > together. > > If you don't have write permission on either server, there is Option 2: Fetch > all relevant data from server 1 into client memory, fetch all relevant data > from server 2 into client memory, and do the reconciliation in client memory. The perms issue can be taken care of by having the table created ahead of time. but, the application level code is still 'custom' and falls into a similar pit as embedding parameters into the SQL command string. > > There is also Option 3: Use actual parameter passing to build a WHERE ... IN > (...) clause: > > cSql = ("select ktbl2_fk from tbl3 where OtherKey IN (" > +",".join("%s" for _ in cList) > +")" ) > cur.execute(cSql, cList) your solution #3 demonstrates my point perfectly: TypeError: not enough arguments for format string so a bit of debugging and I come up with this version: list = ['%s' % x for x in rows] cSql = ("select ktbl2_fk from tbl3 where ktbl1_fk IN (" +",".join("%s" for _ in list) +")" ) print cSql cur.execute(cSql, list) But that has 2 list comprehensions - In an attempt to get it in line with your 'simple' example: list = rows cSql = ("select ktbl2_fk from tbl3 where ktbl1_fk IN (" +",".join("%s" for _ in list) +")" ) print cSql cur.execute(cSql, list) _mysql_exceptions.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '),('2',),('3',))' at line 1") This is exactly the kind of stumbling I am trying to avoid. I would think that a list of items, or even a whole cursor should be able to be passed in just as elegantly as they are returned. It might even help the optimizers. this is a stretch: I am assuming these are not 'the same': "where x in (?,?)" and "...(?,?,?)" as where a single ? that represented a list of any size would use the same execution plan. (but I am in way over my head here, so feel free to just say no.) Carl K From carsten at uniqsys.com Wed May 23 20:30:12 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Wed, 23 May 2007 14:30:12 -0400 Subject: [DB-SIG] client side sub queries In-Reply-To: <4654843E.8090109@personnelware.com> References: <46536FB7.3030600@personnelware.com> <20070523045615.M82360@uniqsys.com> <4654843E.8090109@personnelware.com> Message-ID: <1179945012.3409.97.camel@dot.uniqsys.com> On Wed, 2007-05-23 at 13:13 -0500, Carl Karsten wrote: > list = rows > cSql = ("select ktbl2_fk from tbl3 where ktbl1_fk IN (" > +",".join("%s" for _ in list) > +")" ) > print cSql > cur.execute(cSql, list) Assuming that "rows" is the fetchall() result from your first query, try list = [x[0] for x in rows] instead of list=rows. 
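[Sketch: putting Option 3 and this fix together end to end. This assumes MySQLdb (format paramstyle) and Carl's testdb1 schema; the connection details and variable names are illustrative, and an empty key list has to be guarded against, since IN () is not valid SQL.]

import MySQLdb

conn = MySQLdb.connect(db="testdb1", user="testUserA", passwd="pw")
cur = conn.cursor()

cur.execute("select ktbl1_pk from tbl1 where cFid1 = %s", ("a",))
keys = [row[0] for row in cur.fetchall()]   # flatten 1-tuples to scalars

if keys:
    placeholders = ",".join(["%s"] * len(keys))   # one %s per value
    cur.execute("select ktbl2_fk from tbl3 where ktbl1_fk IN ("
                + placeholders + ")", keys)
    rows = cur.fetchall()

The values themselves never touch the SQL string; only the placeholder list is built by string manipulation, which is what keeps this on the right side of the WRONG/CORRECT distinction drawn earlier in the thread.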
> _mysql_exceptions.ProgrammingError: (1064, "You have an error in your SQL > syntax; check the manual that corresponds to your MySQL server version for the > right syntax to use near '),('2',),('3',))' at line 1") > > This is exactly the kind of stumbling I am trying to avoid. > > I would think that a list of items, or even a whole cursor should be able to be > passed in just as elegantly as they are returned. > > It might even help the optimizers. this is a stretch: I am assuming these are > not 'the same': "where x in (?,?)" and "...(?,?,?)" as where a single ? that > represented a list of any size would use the same execution plan. (but I am in > way over my head here, so feel free to just say no.) SQL has no notion of a single parameter representing a list of multiple values. Allowing this would lead to horrible coding practices. In general, IN (...) is only practical for short lists that don't change much. If you have a long list, or one that changes a lot, store the list in a table and join it to your query. Best regards, -- Carsten Haese http://informixdb.sourceforge.net From Chris.Clark at ingres.com Wed May 23 2007 From: Chris.Clark at ingres.com (Chris Clark) Date: Wed, 23 May 2007 Subject: [DB-SIG] client side sub queries In-Reply-To: <4654843E.8090109@personnelware.com> References: <46536FB7.3030600@personnelware.com> <20070523045615.M82360@uniqsys.com> <4654843E.8090109@personnelware.com> Message-ID: <465489FA.3060602@ingres.com> Carl Karsten wrote: > Carsten Haese wrote: > >> On Tue, 22 May 2007 17:33:27 -0500, Carl Karsten wrote >> >>> Or some such abomination of results of one query as a parameter of a >>> 2nd. >>> >>> given my use case, I can understand why this isn't in the spec, and >>> why it may never be. but it seems to come up more often that I >>> would expect, so here we go. >>> >>> My current problem: reconcile transaction details that are off due >>> to rounding errors. the 2 sets of details are stored on different >>> servers, and no chance of getting one server to hit the 2nd, so the >>> python client code is going to have to help by getting a list of >>> keys from one and constructing "WHERE OtherKey IN ( 'key1', 'key2', >>> 'key3', ...)" which isn't 'hard' but I find annoying that I have >>> to convert formats in the application layer. >>> >> Option 1: Create a temporary table on one server and load the data from the >> other server into it. Then, use a server side subquery or join the tables >> together. >> >> If you don't have write permission on either server, there is Option 2: Fetch >> all relevant data from server 1 into client memory, fetch all relevant data >> from server 2 into client memory, and do the reconciliation in client memory. >> > > The perms issue can be taken care of by having the table created ahead of time. > but, the application level code is still 'custom' and falls into a similar pit > as embedding parameters into the SQL command string. > From my ivory tower, this sounds more like a database problem than an application problem. I agree completely that having to implement this logic in the application is awkward. Is replication an option with the database you are using? Either replicating from server 1 into server 2 (possibly vice-versa), or replicating from server 1 into server 3 and from server 2 into server 3 so you can perform all queries in server 3. With replication you tend to have options of either batch extract/load (ETL) or real time replication (usually implemented by the database itself, sometimes by a 3rd party tool). The downsides of replication are increased space requirements and possibly delays in replication if it is not real time (for a reporting situation like this, collisions should not be an issue). Chris From mal at egenix.com Wed May 23 20:55:42 2007 From: mal at egenix.com (M.-A.
Lemburg) Date: Wed, 23 May 2007 20:55:42 +0200 Subject: [DB-SIG] client side sub queries In-Reply-To: <465489FA.3060602@ingres.com> References: <46536FB7.3030600@personnelware.com> <20070523045615.M82360@uniqsys.com> <4654843E.8090109@personnelware.com> <465489FA.3060602@ingres.com> Message-ID: <46548E2E.4020402@egenix.com> On 2007-05-23 20:37, Chris Clark wrote: > Carl Karsten wrote: >> Carsten Haese wrote: >> >>> On Tue, 22 May 2007 17:33:27 -0500, Carl Karsten wrote >>> >>>> Or some such abomination of results of one query as a parameter of a >>>> 2nd. >>>> >>>> given my use case, I can understand why this isn't in the spec, and >>>> why it may never be. but it seems to come up more often that I >>>> would expect, so here we go. >>>> >>>> My current problem: reconcile transaction details that are off due >>>> to rounding errors. the 2 sets of details are stored on different >>>> servers, and no chance of getting one server to hit the 2nd, so the >>>> python client code is going to have to help by getting a list of >>>> keys from one and constructing "WHERE OtherKey IN ( 'key1', 'key2', >>>> 'key3', ...)" which isn't 'hard' but I find annoying that I have >>>> to convert formats in the application layer. >>>> >>> Option 1: Create a temporary table on one server and load the data from the >>> other server into it. Then, use a server side subquery or join the tables >>> together. >>> >>> If you don't have write permission on either server, there is Option 2: Fetch >>> all relevant data from server 1 into client memory, fetch all relevant data >>> from server 2 into client memory, and do the reconciliation in client memory. >>> >> The perms issue can be taken care of by having the table created ahead of time. >> but, the application level code is still 'custom' and falls into a similar pit >> as embedding parameters into the SQL command string. >> > > From my ivory tower, this sounds more like a database problem than an > application problem. I agree completely that having to implement this > logic in the application is awkward. I don't think that this is awkward, though. Indeed, it can be a lot more efficient if done right, e.g. by loading the selects from both databases into memory and using Gadfly for the query work. However, there are also several other options you could use. Here's one that's easy to set up, the EasySoft ODBC Join Engine: http://www.easysoft.com/products/data_access/odbc_odbc_join_engine/index.html Then use mxODBC to access it from Python and you're set :-) Quite a few commercial database engines also allow similar tricks, i.e. you can directly hook up one database to another and query both using the same SQL statement, doing joins, selects, subqueries, etc. Oracle and SQL Server are two such engines. The setup is usually called a "linked server". Of course, all of this is completely unrelated to the DB-API :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 23 2007) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ :::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 From carl at personnelware.com Wed May 23 21:16:46 2007 From: carl at personnelware.com (Carl Karsten) Date: Wed, 23 May 2007 14:16:46 -0500 Subject: [DB-SIG] client side sub queries In-Reply-To: <465489FA.3060602@ingres.com> References: <46536FB7.3030600@personnelware.com> <20070523045615.M82360@uniqsys.com> <4654843E.8090109@personnelware.com> <465489FA.3060602@ingres.com> Message-ID: <4654931E.5030702@personnelware.com> Chris Clark wrote: > Carl Karsten wrote: >> Carsten Haese wrote: >> >>> On Tue, 22 May 2007 17:33:27 -0500, Carl Karsten wrote >>> >>>> Or some such abomination of results of one query as a parameter of a >>>> 2nd. >>>> >>>> given my use case, I can understand why this isn't in the spec, and >>>> why it may never be. but it seems to come up more often that I >>>> would expect, so here we go. >>>> >>>> My current problem: reconcile transaction details that are off due >>>> to rounding errors. the 2 sets of details are stored on different >>>> servers, and no chance of getting one server to hit the 2nd, so the >>>> python client code is going to have to help by getting a list of >>>> keys from one and constructing "WHERE OtherKey IN ( 'key1', 'key2', >>>> 'key3', ...)" which isn't 'hard' but I find annoying that I have >>>> to convert formats in the application layer. >>>> >>> Option 1: Create a temporary table on one server and load the data >>> from the >>> other server into it. Then, use a server side subquery or join the >>> tables >>> together. >>> >>> If you don't have write permission on either server, there is Option >>> 2: Fetch >>> all relevant data from server 1 into client memory, fetch all >>> relevant data >>> from server 2 into client memory, and do the reconciliation in client >>> memory. >>> >> >> The perms issue can be taken care of by having the table created ahead >> of time. >> but, the application level code is still 'custom' and falls into a >> similar pit as embedding parameters into the SQL command string. >> > > From my ivory tower, this sounds more like a database problem than an > application problem. Which is why I started with >>>> I can understand why this isn't in the spec, and >>>> why it may never be. > I agree completely that having to implement this > logic in the application is awkward. Is replication an option with the > database you are using? Either replicating from server 1 into server 2 > (possibly vice-versa) or replicate from server 1 into server 3 and > replicate from server 2 into server 3 so you can perform all queries in > server 3. "it depends" :) sometimes yes, sometimes no. in this case, no. it is 5 or so developers on a team, many teams using the same servers. many DBAs, have to get one of them involved with server side things. but even when yes, it is still extra work that I could see being eliminated. > > With replication you tend to have options of either batch extract/load > (ETL) or real time replication (usually implemented by the database > itself, sometimes by a 3rd party tool). The downside of replication are > increased space requirements and possibly delays in replication if it is > not real time (for a reporting situation like this, collisions should > not be an issue). > Given my case of debugging a problem, the more clutter involved in the investigation, the harder the investigation is. kinda like the whole HeisenBug* thing, only you have to debug and understand the code used to debug and understand some other code. 
so in this case, it would be totally wonderful if the code was clean, simple, robust, consistent, etc. things taken for granted when you use dbapi features instead of code written quickly for this one task, or code written for some other task that is close to this one. In the end, I would not recommend this for production code, but even that I see often enough. I totally understand the ivory tower view. I prefer to keep things in that realm too. and I can be opposed to making it easy to write bad code. but given the environments that create this problem are not going to go away, does it really do 'us' any good to 'ignore' them as opposed to providing a solution? (*) A HeisenBug is a bug whose presence is affected by the act of observing it. http://www.c2.com/cgi/wiki?HeisenBug Carl K From carl at personnelware.com Wed May 23 22:03:44 2007 From: carl at personnelware.com (Carl Karsten) Date: Wed, 23 May 2007 15:03:44 -0500 Subject: [DB-SIG] client side sub queries In-Reply-To: <1179945012.3409.97.camel@dot.uniqsys.com> References: <46536FB7.3030600@personnelware.com> <20070523045615.M82360@uniqsys.com> <4654843E.8090109@personnelware.com> <1179945012.3409.97.camel@dot.uniqsys.com> Message-ID: <46549E20.7030300@personnelware.com> Carsten Haese wrote: > On Wed, 2007-05-23 at 13:13 -0500, Carl Karsten wrote: >> list = rows >> cSql = ("select ktbl2_fk from tbl3 where ktbl1_fk IN (" >> +",".join("%s" for _ in list) >> +")" ) >> print cSql >> cur.execute(cSql, list) > > Assuming that "rows" is the fetchall() result from your first query, try > > list = [x[0] for x in rows] > > instead of list=rows. The goal was to reduce comprehensions. Ideally eliminate them. Again, I know it can be done in the application layer. I did it (OP has the code). I am just hoping a future db-api could deal with it. Everything db-api does could be done custom. for some reason, db-api was defined, code was written, and life is better. > >> _mysql_exceptions.ProgrammingError: (1064, "You have an error in your SQL >> syntax; check the manual that corresponds to your MySQL server version for the >> right syntax to use near '),('2',),('3',))' at line 1") >> >> This is exactly the kind of stumbling I am trying to avoid. >> >> I would think that a list of items, or even a whole cursor should be able to be >> passed in just as elegantly as they are returned. >> >> It might even help the optimizers. this is a stretch: I am assuming these are >> not 'the same': "where x in (?,?)" and "...(?,?,?)" as where a single ? that >> represented a list of any size would use the same execution plan. (but I am in >> way over my head here, so feel free to just say no.) > > SQL has no notion of a single parameter representing a list of multiple > values. Allowing this would lead to horrible coding practices. In > general, IN (...) is only practical for > short lists that don't change much. I agree with the 'in general' - but how many times does the same exception need to be hand coded until it gets a lower level solution provided? > If you have a long list, or one that > changes a lot, store the list in a table and join it to your query. That is outside the scope of this problem. or something. Maybe this belongs in a similar category as ODBC's tables, columns and other metadata functions. it is code I would not expect to see in a normal app, but is used enough to become one of the included batteries. Carl K
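[Sketch: the kind of "included battery" Carl is asking for can be written once on top of any driver whose placeholders are positional. The helper name, its marker token "??", and its error handling are assumptions for illustration; nothing like this is in PEP 249.]

def execute_in(cursor, sql_template, values, placeholder="%s"):
    # Expand the marker token "??" into one placeholder per value and
    # pass the values themselves through normal parameter binding.
    if not values:
        raise ValueError("IN () with an empty list is not valid SQL")
    markers = ",".join([placeholder] * len(values))
    cursor.execute(sql_template.replace("??", markers), list(values))

# Carl's query would then shrink to:
# execute_in(cur, "select ktbl2_fk from tbl3 where ktbl1_fk IN (??)", keys)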
From carl at personnelware.com Thu May 24 22:10:56 2007 From: carl at personnelware.com (Carl Karsten) Date: Thu, 24 May 2007 15:10:56 -0500 Subject: [DB-SIG] client side sub queries In-Reply-To: <20070523045615.M82360@uniqsys.com> References: <46536FB7.3030600@personnelware.com> <20070523045615.M82360@uniqsys.com> Message-ID: <4655F150.4020303@personnelware.com> > There is also Option 3: Use actual parameter passing to build a WHERE ... IN > (...) clause: > > cSql = ("select ktbl2_fk from tbl3 where OtherKey IN (" > +",".join("%s" for _ in cList) > +")" ) > cur.execute(cSql, cList) Don't suppose you know off the top of your head how to code it for .paramstyle=named >>> print cx_Oracle.paramstyle named Carl K From carsten at uniqsys.com Thu May 24 22:25:53 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Thu, 24 May 2007 16:25:53 -0400 Subject: [DB-SIG] client side sub queries In-Reply-To: <4655F150.4020303@personnelware.com> References: <46536FB7.3030600@personnelware.com> <20070523045615.M82360@uniqsys.com> <4655F150.4020303@personnelware.com> Message-ID: <1180038353.13040.19.camel@dot> On Thu, 2007-05-24 at 15:10 -0500, Carl Karsten wrote: > > There is also Option 3: Use actual parameter passing to build a WHERE ... IN > > (...) clause: > > > > cSql = ("select ktbl2_fk from tbl3 where OtherKey IN (" > > +",".join("%s" for _ in cList) > > +")" ) > > cur.execute(cSql, cList) > > Don't suppose you know off the top of your head how to code it for > .paramstyle=named > > >>> print cx_Oracle.paramstyle > named This doesn't necessarily mean that 'named' is the only style it will accept. Informix is the only commercial RDBMS I use--I'm odd that way--so your mileage may vary, but I'd say there's a more than 50% chance that cx_Oracle will silently accept qmark style: cSql = ("select ktbl2_fk from tbl3 where OtherKey IN (" +",".join("?" for _ in cList) +")" ) cur.execute(cSql, cList) If that doesn't work and you must use named style, you'll have to build a dictionary with your values, and the list of placeholders will look more involved. Something along these lines: paramnames = [ "p%d" % i for (i,_) in enumerate(cList) ] # mapping keys carry no leading colon paramdict = dict(zip(paramnames, cList)) cSql = ("select ktbl2_fk from tbl3 where OtherKey IN (" +",".join(":"+n for n in paramnames) +")" ) cur.execute(cSql, paramdict) HTH, -- Carsten Haese http://informixdb.sourceforge.net From kf7xm at yahoo.com Wed May 30 17:17:10 2007 From: kf7xm at yahoo.com (Vern Cole) Date: Wed, 30 May 2007 08:17:10 -0700 (PDT) Subject: [DB-SIG] paramstyles, again Message-ID: <1346.22888.qm@web36213.mail.mud.yahoo.com> My company's MS Exchange server seems to have eaten my first attempt to send this, so here is another try. Sorry that it seems a bit stale in the stream. Here is my suggestion about what to do to paramstyles -- i.e. kill it off completely. >> Make qmark, numeric, and named all required. It does not take much >> Python code to adjust between them (to be able to implement any one >> in terms of any >> other). Then maybe SQL will be motivated to get to numeric. Why >> let them bring us down to the least common denominator? > >-1. It may not have taken much to implement on your backend, but that may >not be universally true. Even if "not much" code is required, the amount is >greater than zero, for no obvious benefit. Even requiring qmark may require >non-trivial code additions to some existing API modules, but I think the >effort would be justified.
Requiring numeric and named as well just adds a >gratuitous implementation hurdle, and it would seriously hurt the >acceptability of this API change. I was watching an interesting video a few days ago. I highly recommend it. The video is Guido van Rossum talking about plans for Python 3000. http://video.google.com/videoplay?docid=-6459339159268485356 There is a concept that if there is more than one way to do something in a computer language, one of them is probably wrong. I would suggest that requiring every implementation to support every kind of parameter passing is a large step in the wrong direction. Python 3 will do away with the special 'print' statement. It will be replaced by a function 'print()' with an unlimited argument list. I suggest that, for parameter passing, we use an execute() method with an unlimited argument list. In other words, we should pass SQL parameters as python parameters. Let me use some SQL which I am fighting with right now as an example. This is from the SQLAlchemy test suite, and I am struggling to make the new version of adodbapi execute it without throwing up. ###vvv(begin qmark example)vvv sql = """SELECT (CASE WHEN infos.pk < ? THEN ? WHEN (infos.pk >= ? AND infos.pk < ?) THEN ? END) AS x, infos.pk, infos.info FROM infos""" parm=[3, 'lessthan3', 3, 7, 'gt3'] c.execute(sql,parm) ###^^^ I suggest the following: ###vvv(begin python3 example)vvv p1, p2, p3, p4, p5 = [3, 'lessthan3', 3, 7, 'gt3'] c.execute("SELECT (CASE WHEN infos.pk <", p1, "THEN", p2, "WHEN (infos.pk >=", p3, " AND infos.pk <", p4, ")THEN", p5, "END) AS x, infos.pk, infos.info FROM infos") ###^^^ I think that most people would find the function parameter notation much easier to read, write, and maintain than the qmark version. There is no question what parameter goes where, no counting, no wondering how python or your reader will interpret it. And it will look like the built-in python 3 print() function. ;-) It will be up to the dbi writer to convert to the correct SQL dialect for his specific flavor of SQL. I for one would do it happly for adodbapi 3.0. Also, DBapi 3.0 should follow the proposed python 3.0 standard that all character strings are returned as Unicode, and all binary data should be returned as data type 'byte'. -- Vernon Cole -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/db-sig/attachments/20070530/f6d0cb30/attachment.htm From mike_mp at zzzcomputing.com Thu May 31 04:53:04 2007 From: mike_mp at zzzcomputing.com (Michael Bayer) Date: Wed, 30 May 2007 22:53:04 -0400 Subject: [DB-SIG] paramstyles, again In-Reply-To: <1346.22888.qm@web36213.mail.mud.yahoo.com> References: <1346.22888.qm@web36213.mail.mud.yahoo.com> Message-ID: On May 30, 2007, at 11:17 AM, Vern Cole wrote: > > c.execute("SELECT (CASE WHEN infos.pk <", p1, > "THEN", p2, > "WHEN (infos.pk >=", p3, > " AND infos.pk <", p4, > ")THEN", p5, > "END) AS x, infos.pk, infos.info FROM infos") > ###^^^ > > I think that most people would find the function parameter notation > much easier to read, write, and maintain than the qmark version. > There is no question what parameter goes where, no counting, no > wondering how python or your reader will interpret it. And it will > look like the built-in python 3 print() function. ;-) > seriously ? how would it differentiate a string value that is part of the generated SQL vs. a string value that is intended to be a bind parameter ? 
how do i execute the same SQL string with 100 different sets of bind parameters, i have to keep building brand new arrays which contain an arbitrary amalgam of SQL and bind values ? how do I pass along bind parameters along with a SQL string that was generated, and i dont know the order of how the parameters fit in ? how do I do an executemany() ? this idea seems to extract only the worst inconvenience of positional parameters (i.e., that the order of params must be known at all times) with none of its advantages (i.e., that you dont have to come up with any names), and kills off any chance of isolating the syntax of a SQL string from its parameterized values. as far as guido's quote, I havent checked but I would be pretty surprised if py3K is doing away with parameterized strings, i.e. "foo %s" % ('hi') and "foo %(name);" % {'name':'hi'}, so right there is some variety in how to put together "literals and values". my vote for paramstyles would be, everyone supports qmark and named, and we're done. the rest of the styles are all redundant. From carsten at uniqsys.com Thu May 31 15:13:03 2007 From: carsten at uniqsys.com (Carsten Haese) Date: Thu, 31 May 2007 09:13:03 -0400 Subject: [DB-SIG] paramstyles, again In-Reply-To: References: <1346.22888.qm@web36213.mail.mud.yahoo.com> Message-ID: <1180617183.3382.37.camel@dot.uniqsys.com> On Wed, 2007-05-30 at 22:53 -0400, Michael Bayer wrote: > On May 30, 2007, at 11:17 AM, Vern Cole wrote: > > > > > c.execute("SELECT (CASE WHEN infos.pk <", p1, > > "THEN", p2, > > "WHEN (infos.pk >=", p3, > > " AND infos.pk <", p4, > > ")THEN", p5, > > "END) AS x, infos.pk, infos.info FROM infos") > > ###^^^ > > > > I think that most people would find the function parameter notation > > much easier to read, write, and maintain than the qmark version. > > There is no question what parameter goes where, no counting, no > > wondering how python or your reader will interpret it. And it will > > look like the built-in python 3 print() function. ;-) > > > > seriously ? how would it differentiate a string value that is part > of the generated SQL vs. a string value that is intended to be a bind > parameter ? Differentiating parameters from query bits is easy, they alternate, but I vote -1 on Vern's proposal for all the other reasons you mentioned. The same readability of injecting variables into an SQL query can already be achieved with named parameters and locals(): c.execute("""SELECT (CASE WHEN infos.pk < :p1 THEN :p2 WHEN (infos.pk >= :p3 AND infos.pk < :p4) THEN :p5 END) AS x, infos.pk, infos.info FROM infos""", locals() ) > my vote for paramstyles would be, everyone supports qmark and named, > and we're done. the rest of the styles are all redundant. That makes +3 votes for making qmark mandatory (you, Marc-Andre, and myself). I'm not sure what your stance is on format, pyformat, and numeric. Are you allowing them optionally or are you proposing that they be deprecated/removed? I would vote -1 on completely removing numeric because I don't think it's redundant. I personally like named style, but I'm +0 on making it required. If it were required, you'd have to specify how the API is expected to differentiate between qmark and named. Do you expect the API to auto-detect the parameter style from the query string, or do you expect some kind of switching mechanism? 
Best regards,

--
Carsten Haese
http://informixdb.sourceforge.net

From aprotin at research.att.com Thu May 31 15:32:14 2007
From: aprotin at research.att.com (Art Protin)
Date: Thu, 31 May 2007 09:32:14 -0400
Subject: [DB-SIG] paramstyles, again
In-Reply-To: <1346.22888.qm@web36213.mail.mud.yahoo.com>
References: <1346.22888.qm@web36213.mail.mud.yahoo.com>
Message-ID: <465ECE5E.30005@research.att.com>

Dear folks,

Vern Cole wrote:
> My company's MS Exchange server seems to have eaten my first attempt
> to send this, so here is another try. Sorry that it seems a bit stale
> in the stream. Here is my suggestion about what to do with
> paramstyles -- i.e., kill it off completely.
>
> >> Make qmark, numeric, and named all required. It does not take
> >> much Python code to adjust between them (to be able to implement
> >> any one in terms of any other). Then maybe SQL will be motivated
> >> to get to numeric. Why let them bring us down to the least common
> >> denominator?
> >
> > -1. It may not have taken much to implement on your backend, but
> > that may not be universally true. Even if "not much" code is
> > required, the amount is greater than zero, for no obvious benefit.
> > Even requiring qmark may require non-trivial code additions to some
> > existing API modules, but I think the effort would be justified.
> > Requiring numeric and named as well just adds a gratuitous
> > implementation hurdle, and it would seriously hurt the
> > acceptability of this API change.
>
> I was watching an interesting video a few days ago. I highly
> recommend it. The video is Guido van Rossum talking about plans for
> Python 3000.
> http://video.google.com/videoplay?docid=-6459339159268485356
>
> There is a concept that if there is more than one way to do something
> in a computer language, one of them is probably wrong.

Agreed. But the real problem is getting people to agree which of them
is right!

> I would suggest that requiring every implementation to support every
> kind of parameter passing is a large step in the wrong direction.

I would not necessarily accept that view. In fact, once everybody
supported every form, few would be tempted to promote their own answer
as the one right answer, AND we could let the users decide which worked
best for them.

> Python 3 will do away with the special 'print' statement. It will be
> replaced by a function 'print()' with an unlimited argument list. I
> suggest that, for parameter passing, we use an execute() method with
> an unlimited argument list. In other words, we should pass SQL
> parameters as Python parameters. Let me use some SQL which I am
> fighting with right now as an example. This is from the SQLAlchemy
> test suite, and I am struggling to make the new version of adodbapi
> execute it without throwing up.
>
> ###vvv(begin qmark example)vvv
> sql = """SELECT (CASE WHEN infos.pk < ? THEN ? WHEN (infos.pk >= ?
> AND infos.pk < ?) THEN ? END) AS x, infos.pk, infos.info FROM infos"""
>
> parm = [3, 'lessthan3', 3, 7, 'gt3']
>
> c.execute(sql, parm)
> ###^^^
>
> I suggest the following:
>
> ###vvv(begin python3 example)vvv
> p1, p2, p3, p4, p5 = [3, 'lessthan3', 3, 7, 'gt3']
>
> c.execute("SELECT (CASE WHEN infos.pk <", p1,
>           "THEN", p2,
>           "WHEN (infos.pk >=", p3,
>           " AND infos.pk <", p4,
>           ") THEN", p5,
>           "END) AS x, infos.pk, infos.info FROM infos")
> ###^^^
>
> I think that most people would find the function parameter notation
> much easier to read, write, and maintain than the qmark version.
> There is no question what parameter goes where, no counting, no
> wondering how Python or your reader will interpret it. And it will
> look like the built-in Python 3 print() function. ;-)

OK. This is no worse than qmark and almost as good as named. If I
needed to implement it, I would code it to transform this new method's
arguments into named or numeric parameter format SQL with a dictionary
or list of values, and pass that to the code I already have that
handles them. In fact, I do not see this as any cleaner or clearer
than the named parameter style.

di = {'p1': 3, 'p2': 'lessthan3', 'p3': 3, 'p4': 7, 'p5': 'gt3'}

c.execute("""select (case when infos.pk < :p1 then :p2
                          when (infos.pk >= :p3 and infos.pk < :p4)
                          then :p5
                     end) as x, infos.pk, infos.info from infos""", di)

Furthermore, I agree with the argument that was put forward earlier
that it is very important that the form we employ look different enough
from string interpolation that users realize that multiple executions
of the same SQL statement with different parameter sets are just that,
and not executions of multiple SQL statements made different by the
insertion of parameters. Thus, we would not want it to look too much
like the built-in print() function.

> It will be up to the DBI writer to convert to the correct SQL dialect
> for his specific flavor of SQL. I for one would do it happily for
> adodbapi 3.0.
>
> Also, DB API 3.0 should follow the proposed Python 3.0 standard that
> all character strings are returned as Unicode, and all binary data is
> returned as data type 'bytes'.
> --
> Vernon Cole

Thank you all,
Art Protin

From aprotin at research.att.com Thu May 31 16:44:44 2007
From: aprotin at research.att.com (Art Protin)
Date: Thu, 31 May 2007 10:44:44 -0400
Subject: [DB-SIG] paramstyles, again
In-Reply-To: <1180617183.3382.37.camel@dot.uniqsys.com>
References: <1346.22888.qm@web36213.mail.mud.yahoo.com>
	<1180617183.3382.37.camel@dot.uniqsys.com>
Message-ID: <465EDF5C.2060104@research.att.com>

Dear folks,

Carsten Haese wrote:
> On Wed, 2007-05-30 at 22:53 -0400, Michael Bayer wrote:
> > Seriously? How would it differentiate a string value that is part
> > of the generated SQL from a string value that is intended to be a
> > bind parameter?
>
> Differentiating parameters from query bits is easy (they alternate),
> but I vote -1 on Vern's proposal for all the other reasons you
> mentioned.
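The transformation Art describes above is mechanical precisely because,
as Carsten notes, fragments and values alternate. A sketch of the
idea, as a hypothetical helper rather than anything from a shipped
driver (it assumes the argument list starts and ends with a SQL
fragment):

###vvv(begin alternating-to-named sketch)vvv
def args_to_named(args):
    # Even positions are SQL fragments, odd positions are bind values;
    # the argument list must start and end with a fragment.
    fragments = list(args[0::2])
    values = list(args[1::2])
    params = {}
    sql_parts = [fragments[0]]
    for i, value in enumerate(values):
        name = 'p%d' % (i + 1)
        params[name] = value
        sql_parts.append(':' + name)
        sql_parts.append(fragments[i + 1])
    return ' '.join(sql_parts), params

sql, params = args_to_named((
    "SELECT (CASE WHEN infos.pk <", 3,
    "THEN", 'lessthan3',
    "END) AS x, infos.pk FROM infos"))
# sql    == "SELECT (CASE WHEN infos.pk < :p1 THEN :p2 END) AS x, infos.pk FROM infos"
# params == {'p1': 3, 'p2': 'lessthan3'}
###^^^

The named-style result can then be handed to exactly the kind of code
Art says he already has.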
> That makes +3 votes for making qmark mandatory (you, Marc-Andre, and
> myself). I'm not sure what your stance is on format, pyformat, and
> numeric. Are you allowing them optionally, or are you proposing that
> they be deprecated/removed? I would vote -1 on completely removing
> numeric because I don't think it's redundant.

I guess I am expected to weigh in and make perfectly clear where I
stand. I disapprove of (-1) making qmark mandatory and exclusive. I
would approve (+1) dropping the two formatted styles (format &
pyformat). I would favor (+1) adding a switching requirement. I do
strongly favor (+1) requiring support for all acceptable formats (any
form not required is forbidden). Our users should not have to support
all the different forms just so our implementations don't have to! I
do favor (+1) making named and/or numeric required. Without a
switching requirement, I favor dropping qmark as too "impoverished" a
form. (The commonness of qmark in SQL only reflects badly on SQL.)

As for a switching requirement, how does this sound (I just thought of
it this morning): make the paramstyle depend on the first character of
the SQL statement. If it is a colon, remove it, and the parameter
style is either numeric or named; if it is not a colon, the parameter
style is qmark. Numeric and named differ only in that with numeric all
the keys are numbers, so a list of parameters could be provided instead
of a dictionary of parameters. It is very easy to programmatically
distinguish between numeric and named, and both can be simultaneously
supported.

I also think that the specification should include fragments of code to
show how to convert named to numeric and how to convert numeric to
qmark (all in Python, of course). Then no one should complain about
having to support named and numeric.

> I personally like named style, but I'm +0 on making it required. If
> it were required, you'd have to specify how the API is expected to
> differentiate between qmark and named. Do you expect the API to
> auto-detect the parameter style from the query string, or do you
> expect some kind of switching mechanism?

Thank you all,
Art Protin

From mike_mp at zzzcomputing.com Thu May 31 17:04:44 2007
From: mike_mp at zzzcomputing.com (Michael Bayer)
Date: Thu, 31 May 2007 11:04:44 -0400
Subject: [DB-SIG] paramstyles, again
In-Reply-To: <1180617183.3382.37.camel@dot.uniqsys.com>
References: <1346.22888.qm@web36213.mail.mud.yahoo.com>
	<1180617183.3382.37.camel@dot.uniqsys.com>
Message-ID: 

On May 31, 2007, at 9:13 AM, Carsten Haese wrote:

> That makes +3 votes for making qmark mandatory (you, Marc-Andre, and
> myself). I'm not sure what your stance is on format, pyformat, and
> numeric. Are you allowing them optionally, or are you proposing that
> they be deprecated/removed? I would vote -1 on completely removing
> numeric because I don't think it's redundant.
Numeric is redundant if you have named: just name your params :1, :2,
:3, etc. OK, possibly you'd say then you have to send a dict instead
of a list; I'd just use a dict/enumerate combination on my list for
that. What's the use case for numeric, exactly?

> I personally like named style, but I'm +0 on making it required. If
> it were required, you'd have to specify how the API is expected to
> differentiate between qmark and named. Do you expect the API to
> auto-detect the parameter style from the query string, or do you
> expect some kind of switching mechanism?

psycopg2, MySQLdb, and pysqlite all support positional and
non-positional paramstyles right now (mysql/postgres do format and
pyformat, pysqlite does qmark and named), and they all know how to
automatically "switch" between the two categories. I'm not sure if
they look at the string itself or the given args (my guess is they look
at the args being sent), but it's totally doable.

From carsten at uniqsys.com Thu May 31 17:33:56 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Thu, 31 May 2007 11:33:56 -0400
Subject: [DB-SIG] paramstyles, again
In-Reply-To: 
References: <1346.22888.qm@web36213.mail.mud.yahoo.com>
	<1180617183.3382.37.camel@dot.uniqsys.com>
Message-ID: <1180625636.3382.49.camel@dot.uniqsys.com>

On Thu, 2007-05-31 at 11:04 -0400, Michael Bayer wrote:
> Numeric is redundant if you have named: just name your params :1, :2,
> :3, etc. OK, possibly you'd say then you have to send a dict instead
> of a list; I'd just use a dict/enumerate combination on my list for
> that. What's the use case for numeric, exactly?

Backwards compatibility and performance. InformixDB has had numeric
style for much longer than it has had named style, and it's cheaper to
build a parameter tuple than to build a parameter dict.

> psycopg2, MySQLdb, and pysqlite all support positional and
> non-positional paramstyles right now (mysql/postgres do format and
> pyformat, pysqlite does qmark and named), and they all know how to
> automatically "switch" between the two categories. I'm not sure if
> they look at the string itself or the given args (my guess is they
> look at the args being sent), but it's totally doable.

I know auto-detect is doable; that's what InformixDB does. My point
was simply that you didn't specify your preference.

--
Carsten Haese
http://informixdb.sourceforge.net

From carsten at uniqsys.com Thu May 31 23:20:15 2007
From: carsten at uniqsys.com (Carsten Haese)
Date: Thu, 31 May 2007 17:20:15 -0400
Subject: [DB-SIG] paramstyles, again
In-Reply-To: <465EDF5C.2060104@research.att.com>
References: <1346.22888.qm@web36213.mail.mud.yahoo.com>
	<1180617183.3382.37.camel@dot.uniqsys.com>
	<465EDF5C.2060104@research.att.com>
Message-ID: <1180646415.3382.155.camel@dot.uniqsys.com>

On Thu, 2007-05-31 at 10:44 -0400, Art Protin wrote:
> I guess I am expected to weigh in and make perfectly clear where I
> stand.

"Expected" may be too strong a word, but yes, input from as many
interested parties and API module maintainers as possible is helpful to
measure consensus.

> [snip...]

If I summarized that correctly, you are in favor of requiring qmark,
named, and numeric, and dropping format and pyformat.
I could live with that. As I said before, InformixDB already supports
all three styles, so whether named/numeric are optional or required
makes no difference to me. I'm concerned that requiring named/numeric
might encounter more resistance than requiring qmark, but so far I
haven't seen such resistance on this thread.

> As for a switching requirement, how does this sound (I just thought
> of it this morning): make the paramstyle depend on the first
> character of the SQL statement. If it is a colon, remove it, and the
> parameter style is either numeric or named; if it is not a colon, the
> parameter style is qmark.

Sorry, I give that a -1. Using "magical characters" is utterly
unpythonic, and somebody reading the code would have no clue what's
going on.

A sensible switching mechanism should look something like this, in my
opinion: add optional 'paramstyle' keyword arguments to
module.connect(), connection.cursor(), cursor.execute() and
cursor.executemany(). In the absence of a 'paramstyle' argument,
individual executions inherit the cursor's paramstyle, cursors inherit
the connection's paramstyle, and connections use the module's read-only
paramstyle attribute as the default. (As discussed before, allowing
the module-wide paramstyle to be changed is IMO a bad idea.)

Then again, I'm not too excited about the whole idea of manual
switching. I'd prefer auto-detection based on whether the query string
contains question marks, colon+number, or colon+identifier markers
outside of string literals, not only because that is what InformixDB
already does, but because it leads to cleaner code.

Best regards,

--
Carsten Haese
http://informixdb.sourceforge.net

From kf7xm at yahoo.com Thu May 31 10:07:10 2007
From: kf7xm at yahoo.com (Vern Cole)
Date: Thu, 31 May 2007 01:07:10 -0700 (PDT)
Subject: [DB-SIG] paramstyles, again
Message-ID: <831120.61375.qm@web36203.mail.mud.yahoo.com>

By George, Michael, you are correct! I've spent about half of the
night reading stuff from all over the Internet, and the two methods you
suggest are the ones that make sense. I am sold.

+1 on requiring support of 'qmark' and 'named' parameter styles.

Implementors, like me, who are running qmark-only databases will have
to parse the dictionary for 'named' parameters and build a qmark string
as part of the cursor.execute() method. No big deal.

Next question: can't I determine the paramstyle by looking at the
second parameter of c.execute()? If there is no second parameter,
there is no substitution. If the second parameter is a mapping, the
programmer is using 'named' parameters. If the second parameter is a
sequence (or a singleton), the programmer is using 'qmark'. Only if I
support other (nonstandard) styles will I ever really need to have
someone specify which to expect.

Paramstyle is documented as being a read-only attribute in DB API 2.0.
I am thinking that it may be better to use a different construct in
3.0, perhaps something like connection.setParamstyle('someString'),
with 'auto' being the default.
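Vern's type test is easy to write down. Here is a sketch of an
execute() front end along those lines; the names are hypothetical, and
the marker regex is deliberately naive (as Carsten points out above, a
real driver must also ignore markers inside string literals):

###vvv(begin type-detection sketch)vvv
import re

NAMED_MARKER = re.compile(r':([A-Za-z_]\w*)')

def execute(c, sql, params=None):
    # No second parameter: no substitution at all.
    if params is None:
        return c.execute(sql)
    # A mapping means 'named' style; rewrite the markers to qmark
    # for a qmark-only engine, collecting values in marker order.
    if hasattr(params, 'keys'):
        ordered = []
        def to_qmark(match):
            ordered.append(params[match.group(1)])
            return '?'
        return c.execute(NAMED_MARKER.sub(to_qmark, sql), ordered)
    # Anything else is a sequence: treat it as qmark style.
    return c.execute(sql, params)

execute(c, "SELECT info FROM infos WHERE pk < :limit", {'limit': 7})
# runs as: c.execute("SELECT info FROM infos WHERE pk < ?", [7])
###^^^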
--
Vernon Cole
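To make the auto-detection approach Carsten prefers concrete, here is a
minimal sketch of inspecting the query string itself. The helper name
is invented, and stripping single-quoted literals with a regex is only
illustrative; a production driver would want a real SQL tokenizer:

###vvv(begin paramstyle auto-detection sketch)vvv
import re

# Strip single-quoted SQL literals first ('' is the escaped quote)
# so that markers inside literals are ignored.
LITERAL = re.compile(r"'(?:[^']|'')*'")

def detect_paramstyle(sql):
    bare = LITERAL.sub('', sql)
    if '?' in bare:
        return 'qmark'
    if re.search(r':\d+', bare):
        return 'numeric'
    if re.search(r':[A-Za-z_]', bare):
        return 'named'
    return None  # no parameter markers found

print(detect_paramstyle("SELECT * FROM t WHERE a = ? AND b = 'x:1'"))
# -> qmark   (the :1 inside the literal is ignored)
###^^^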