From mike_mp at zzzcomputing.com Sun Feb 1 22:47:58 2015
From: mike_mp at zzzcomputing.com (Michael Bayer)
Date: Sun, 1 Feb 2015 16:47:58 -0500
Subject: [DB-SIG] "lists of tuples" vs. "tuples of tuples"
Message-ID: <19F049B2-EAF7-4F22-830E-A0F84125DD30@zzzcomputing.com>

Hi -

May I get clarification from pep-249 regarding the contract for fetchall() and fetchmany(), as to whether it is acceptable and/or recommended that these methods return tuples of tuples, as opposed to lists of tuples?

In my experience, while tuples and lists are just "sequences" that are immutable / mutable, there is a more subtle semantic distinction between them, in that lists are for variable-length collections of homogeneous elements (e.g. rows), whereas tuples are appropriate for fixed-length collections of heterogeneous elements (e.g. column values).

If this is the case, pep-249's use of the language "e.g. a list of tuples" leaves this open to interpretation and inconsistency, like so many other things. I'd like to get some language, at least in email here, that lists are recommended, if this is the case. Otherwise, I guess it'll just be another thing for me to suggest for DBAPI-3 :).

thanks for your attention!

- mike

From mal at egenix.com Mon Feb 2 12:13:54 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 02 Feb 2015 12:13:54 +0100
Subject: [DB-SIG] "lists of tuples" vs. "tuples of tuples"
In-Reply-To: <19F049B2-EAF7-4F22-830E-A0F84125DD30@zzzcomputing.com>
References: <19F049B2-EAF7-4F22-830E-A0F84125DD30@zzzcomputing.com>
Message-ID: <54CF5BF2.1070405@egenix.com>

On 01.02.2015 22:47, Michael Bayer wrote:
> Hi -
>
> May I get clarification from pep-249 regarding the contract for fetchall() and fetchmany(), as to whether it is acceptable and/or recommended that these methods return tuples of tuples, as opposed to lists of tuples?
>
> In my experience, while tuples and lists are just "sequences"
> that are immutable / mutable, there is a more subtle semantic distinction between them, in that lists are for variable-length collections of homogeneous elements (e.g. rows), whereas tuples are appropriate for fixed-length collections of heterogeneous elements (e.g. column values).
>
> If this is the case, pep-249's use of the language "e.g. a list of tuples" leaves this open to interpretation and inconsistency, like so many other things. I'd like to get some language, at least in email here, that lists are recommended, if this is the case. Otherwise, I guess it'll just be another thing for me to suggest for DBAPI-3 :).
>
> thanks for your attention!

We have made this more general in the DB-API to allow authors to create e.g. result set objects which implement the sequence API and row objects which also implement the sequence API.

I know several database modules which provide ways to have the fetch methods return row objects, but I'm not aware of ones which implement a result set object.

You get the best performance using lists of tuples. Tuple creation is fast and lists are ideal for variable length storage of objects.

Using tuples for result sets is not a very useful thing to do, since you typically want to manipulate the result set in the application.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Feb 02 2015)
>>> Python Projects, Coaching and Consulting ... http://www.egenix.com/
>>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From mike_mp at zzzcomputing.com Mon Feb 2 14:41:16 2015
From: mike_mp at zzzcomputing.com (Michael Bayer)
Date: Mon, 2 Feb 2015 08:41:16 -0500
Subject: [DB-SIG] "lists of tuples" vs. "tuples of tuples"
In-Reply-To: <54CF5BF2.1070405@egenix.com>
References: <19F049B2-EAF7-4F22-830E-A0F84125DD30@zzzcomputing.com> <54CF5BF2.1070405@egenix.com>
Message-ID: <9A4ABF1A-4C1B-40B6-B373-FD5DE3B8C524@zzzcomputing.com>

M.-A. Lemburg wrote:

> On 01.02.2015 22:47, Michael Bayer wrote:
>> Hi -
>>
>> May I get clarification from pep-249 regarding the contract for
>> fetchall() and fetchmany(), as to whether it is acceptable and/or recommended
>> that these methods return tuples of tuples, as opposed to lists of
>> tuples?
>>
>> In my experience, while tuples and lists are just "sequences" that are
>> immutable / mutable, there is a more subtle semantic distinction between
>> them, in that lists are for variable-length collections of homogeneous
>> elements (e.g. rows), whereas tuples are appropriate for fixed-length
>> collections of heterogeneous elements (e.g. column values).
>>
>> If this is the case, pep-249's use of the language "e.g. a list of
>> tuples" leaves this open to interpretation and inconsistency, like so
>> many other things. I'd like to get some language, at least in email here,
>> that lists are recommended, if this is the case. Otherwise, I guess it'll
>> just be another thing for me to suggest for DBAPI-3 :).
>>
>> thanks for your attention!
>
> We have made this more general in the DB-API to allow authors to
> create e.g. result set objects which implement the sequence API
> and row objects which also implement the sequence API.
>
> I know several database modules which provide ways to have the
> fetch methods return row objects, but I'm not aware of ones which
> implement a result set object.
>
> You get the best performance using lists of tuples.
> Tuple creation is fast and lists are ideal for variable length storage of objects.
>
> Using tuples for result sets is not a very useful thing to do,
> since you typically want to manipulate the result set in the
> application.

Both MySQL and PyMySQL return the result as a tuple of tuples, specifically because tuples are "immutable" and a result set from the DB should also be "immutable", because you can't change what the DB just returned to you. It's wrong both from a practical standpoint as well as a semantic one. Though in practice, people haven't seemed to call for this result specifically to be mutable. It's usually being shuttled off to some other collection somewhere.

I'd love it if DBAPI could grant some more clarity on this.

From vernondcole at gmail.com Tue Feb 3 19:26:00 2015
From: vernondcole at gmail.com (Vernon D. Cole)
Date: Tue, 3 Feb 2015 11:26:00 -0700
Subject: [DB-SIG] "lists of tuples" vs. "tuples of tuples"
In-Reply-To: <9A4ABF1A-4C1B-40B6-B373-FD5DE3B8C524@zzzcomputing.com>
References: <19F049B2-EAF7-4F22-830E-A0F84125DD30@zzzcomputing.com> <54CF5BF2.1070405@egenix.com> <9A4ABF1A-4C1B-40B6-B373-FD5DE3B8C524@zzzcomputing.com>
Message-ID:

Recent versions of adodbapi do in fact return a result set ("Rows") object.

Since ADO requires a separate operating system call to fetch each row, and since we can assume that the ADO result set is sitting around in memory somewhere, I did not want the space overhead of making an additional copy of all of the result set, so each row is "lazy" fetched, when needed, as a "Row" object. Individual rows can be selected by sequence or index operations.

Each "Row" object will then "lazy" fetch fields, which can be selected by sequence, index, or mapping (by column name) operations, or accessed as attributes (by column name). Each field datum is converted from internal database format into an appropriate Python object at that moment, optionally using a user-defined conversion.
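
The lazy design described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names LazyRows and LazyRow, the converters argument, and the in-memory raw data are all hypothetical and are not adodbapi's actual classes or API.

```python
from collections.abc import Sequence


class LazyRow(Sequence):
    """Read-only, tuple-like view of one raw record (hypothetical name)."""

    __slots__ = ("_raw", "_names", "_converters")

    def __init__(self, raw, names, converters):
        # object.__setattr__ bypasses the read-only guard defined below.
        object.__setattr__(self, "_raw", raw)
        object.__setattr__(self, "_names", names)
        object.__setattr__(self, "_converters", converters)

    def __len__(self):
        return len(self._raw)

    def __getitem__(self, key):
        if isinstance(key, str):
            # Mapping-style access by column name.
            key = self._names.index(key)
        elif isinstance(key, slice):
            return tuple(self[i] for i in range(*key.indices(len(self))))
        # Conversion happens only now, when the field is actually read.
        return self._converters[key](self._raw[key])

    def __getattr__(self, name):
        # Only called when normal lookup fails: treat the name as a column.
        if name.startswith("_") or name not in self._names:
            raise AttributeError(name)
        return self[name]

    def __setattr__(self, name, value):
        raise AttributeError("rows are read-only; use SQL to change data")


class LazyRows(Sequence):
    """Read-only, list-like result set that builds rows on demand."""

    def __init__(self, raw_rows, names, converters):
        self._raw_rows = raw_rows
        self._names = tuple(names)
        self._converters = tuple(converters)

    def __len__(self):
        return len(self._raw_rows)

    def __getitem__(self, index):
        if isinstance(index, slice):
            return LazyRows(self._raw_rows[index], self._names, self._converters)
        # A row object is created only when it is asked for.
        return LazyRow(self._raw_rows[index], self._names, self._converters)


raw = [("1", "alice"), ("2", "bob")]  # stand-in for driver-internal data
rows = LazyRows(raw, names=("id", "name"), converters=(int, str))
first = rows[0]
print(first[0], first["name"], first.name, len(rows))  # 1 alice alice 2
print([tuple(r) for r in rows])  # [(1, 'alice'), (2, 'bob')]
```

Because both classes derive from collections.abc.Sequence, iteration, membership tests, index() and count() come for free, while neither class offers item assignment.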

Since both "Rows" objects and "Row" objects obey the sequence API, this design satisfies the PEP requirement that we return a sequence of sequences. It will also do much more. If the PEP had been written to require a list of tuples, none of these other operations would have been possible. I applaud the PEP authors for avoiding the temptation to over-specify.

[By the way, both Rows and Row objects are made immutable, so that an unwary programmer does not receive an unwelcome surprise by attempting to alter data without using an SQL statement.]

From mike_mp at zzzcomputing.com Tue Feb 3 21:30:18 2015
From: mike_mp at zzzcomputing.com (Michael Bayer)
Date: Tue, 3 Feb 2015 15:30:18 -0500
Subject: [DB-SIG] "lists of tuples" vs. "tuples of tuples"
In-Reply-To:
References: <19F049B2-EAF7-4F22-830E-A0F84125DD30@zzzcomputing.com> <54CF5BF2.1070405@egenix.com> <9A4ABF1A-4C1B-40B6-B373-FD5DE3B8C524@zzzcomputing.com>
Message-ID:

Vernon D. Cole wrote:

> Since both "Rows" objects and "Row" objects obey the sequence API, this design satisfies the PEP requirement that we return a sequence of sequences. It will also do much more. If the PEP had been written to require a list of tuples, none of these other operations would have been possible. I applaud the PEP authors for avoiding the temptation to over-specify.

Please understand, in no way am I calling for the spec to indicate a "list of tuples" as a fixed system. As is customary in Python, I am calling for it to specify a list-*like* object of tuple-*like* objects.
The current problem is that the widely accepted semantics of tuples, that they are "usually" [1] data structures of heterogeneous elements [2], are being disobeyed in the case of the MySQL drivers, in the name of "immutability" above all other concerns. These semantics apply to the Python DBAPI so well that the Python DBAPI is even used as a leading example as to what the semantic difference is between a list and a tuple [3].

I don't believe that adodbapi suffers from this problem.

[1] The various articles and blogs regarding tuples typically will qualify their guidance with words like "typically", "usually", etc. As they should. However, I don't think that the result set returned by cursor.fetchall() qualifies as an "unusual" and/or "atypical" use case for lists vs. tuples. It is the *idiomatic* use case.

[2] https://docs.python.org/2/tutorial/datastructures.html#tuples-and-sequences

[3] http://news.e-scribe.com/397

> [By the way, both Rows and Row objects are made immutable, so that an unwary programmer does not receive an unwelcome surprise by attempting to alter data without using an SQL statement.]

I would actually favor the result list to be an immutable one as well, and the ideal structure would be an immutable list (which is of course entirely straightforward in Python). It's unusual that Marc suggests the collection is explicitly mutable, and that this is a common use case.

Overall though, I don't care much whether the collection is mutable or not; in this area as well, a clearer specification would serve to eliminate the need for this decision / discussion to happen at the endpoints of every DBAPI driver. A spec that provides clear guidance rather than loose suggestions would save a lot of effort. Such a spec is of course more difficult to produce, but IMHO would be worth it.

From mal at egenix.com Tue Feb 3 22:12:17 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 03 Feb 2015 22:12:17 +0100
Subject: [DB-SIG] "lists of tuples" vs. "tuples of tuples"
In-Reply-To:
References: <19F049B2-EAF7-4F22-830E-A0F84125DD30@zzzcomputing.com> <54CF5BF2.1070405@egenix.com> <9A4ABF1A-4C1B-40B6-B373-FD5DE3B8C524@zzzcomputing.com>
Message-ID: <54D139B1.7050808@egenix.com>

On 03.02.2015 21:30, Michael Bayer wrote:
>> [By the way, both Rows and Row objects are made immutable, so that an unwary programmer does not receive an unwelcome surprise by attempting to alter data without using an SQL statement.]
>
> I would actually favor the result list to be an immutable one as well, and the ideal structure would be an immutable list (which is of course entirely straightforward in Python). It's unusual that Marc suggests the collection is explicitly mutable, and that this is a common use case.

I was probably unclear in what I wrote. I wasn't emphasizing the mutability of the list elements, but rather the subsequent use of the result set list in the applications, which often slice and dice the list in various ways needed by the use case.

A typical use case for this is fetching result sets in chunks and then recombining them in a larger list for further processing.

> Overall though, I don't care much whether the collection is mutable or not; in this area as well, a clearer specification would serve to eliminate the need for this decision / discussion to happen at the endpoints of every DBAPI driver.
> A spec that provides clear guidance rather than loose suggestions would save a lot of effort. Such a spec is of course more difficult to produce, but IMHO would be worth it.

The DB-API does provide guidance, but doesn't mandate specific implementations.

What we can do is add a note to recommend that implementations return result sets as lists of tuples, unless special result set and row objects are more appropriate or needed.

BTW: I still don't quite understand the motivation to have this pinned down in the spec. If an application needs a list, the DB-API mandates the sequence protocol, so running:

    rs = list(cursor.fetchall())

will always give you a list as result set.

Note that tuples are faster than lists in Python, so a performance aware module author might want to use this fact to squeeze out a few more cycles.

People who have purely numeric needs can also use numpy arrays as containers (even to represent the complete result set), since they also implement the sequence protocol.

--
Marc-Andre Lemburg
eGenix.com

From mike_mp at zzzcomputing.com Wed Feb 4 00:10:34 2015
From: mike_mp at zzzcomputing.com (Michael Bayer)
Date: Tue, 3 Feb 2015 18:10:34 -0500
Subject: [DB-SIG] "lists of tuples" vs. "tuples of tuples"
In-Reply-To: <54D139B1.7050808@egenix.com>
References: <19F049B2-EAF7-4F22-830E-A0F84125DD30@zzzcomputing.com> <54CF5BF2.1070405@egenix.com> <9A4ABF1A-4C1B-40B6-B373-FD5DE3B8C524@zzzcomputing.com> <54D139B1.7050808@egenix.com>
Message-ID: <2E795136-DDC7-4F9C-8056-ED4454C4F239@zzzcomputing.com>

M.-A. Lemburg wrote:

> The DB-API does provide guidance, but doesn't mandate specific
> implementations.
>
> What we can do is add a note to recommend that implementations
> return result sets as lists of tuples, unless special result set and
> row objects are more appropriate or needed.
>
> BTW: I still don't quite understand the motivation to have this
> pinned down in the spec. If an application needs a list, the DB-API
> mandates the sequence protocol, so running:

Marc -

It's mostly an issue of semantic correctness. A tuple implies a certain kind of role for a data structure. From a "Does it work?" perspective, there's really no problem at all, except if we're concerned about applications that assume the return of fetchall() is mutable. Python is often enthusiastic about this kind of thing in general, from pep8 to naming conventions to everything else (to pointing out in the Python docs what the usual role of tuples is). But there's no pressing issue here other than that.

> rs = list(cursor.fetchall())
>
> will always give you a list as result set.
>
> Note that tuples are faster than lists in Python, so a performance
> aware module author might want to use this fact to squeeze out a
> few more cycles.
>
> People who have purely numeric needs can also use numpy arrays as
> containers (even to represent the complete result set),
> since they also implement the sequence protocol.

From peter_e at gmx.net Tue Feb 10 22:28:44 2015
From: peter_e at gmx.net (Peter Eisentraut)
Date: Tue, 10 Feb 2015 16:28:44 -0500
Subject: [DB-SIG] "lists of tuples" vs. "tuples of tuples"
In-Reply-To: <2E795136-DDC7-4F9C-8056-ED4454C4F239@zzzcomputing.com>
References: <19F049B2-EAF7-4F22-830E-A0F84125DD30@zzzcomputing.com> <54CF5BF2.1070405@egenix.com> <9A4ABF1A-4C1B-40B6-B373-FD5DE3B8C524@zzzcomputing.com> <54D139B1.7050808@egenix.com> <2E795136-DDC7-4F9C-8056-ED4454C4F239@zzzcomputing.com>
Message-ID: <54DA780C.7020907@gmx.net>

On 2/3/15 6:10 PM, Michael Bayer wrote:
> It's mostly an issue of semantic correctness. A tuple implies a certain kind
> of role for a data structure. From a "Does it work?" perspective, there's
> really no problem at all, except if we're concerned about applications that
> assume the return of fetchall() is mutable. Python is often enthusiastic
> about this kind of thing in general, from pep8 to naming conventions to
> everything else (to pointing out in the Python docs what the usual
> role of tuples is). But there's no pressing issue here other than that.

As far as DB-API is concerned, it might be worth specifying that the result values from fetchall etc. may be immutable. I'm not sure if anything would break from that, though.

Other than that, Python is not Haskell, and we're not going to succeed legislating "usual" or "semantic" differences between tuples and lists. The differences are what they are, and users are free to exploit them any way they want to.
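
For application code, the practical consequence of the points above might look like the following minimal sketch: treat whatever the fetch methods return as a read-only sequence, and copy into a list before manipulating it. The standard-library sqlite3 module is used here only as a convenient stand-in for any DB-API driver, and fetch_in_chunks is a hypothetical helper name, not part of the DB-API.

```python
import sqlite3


def fetch_in_chunks(cursor, chunk_size=2):
    """Collect a whole result set via fetchmany(), returning a plain list."""
    rows = []
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if len(chunk) == 0:  # an empty sequence signals no more rows
            break
        rows.extend(chunk)  # works for a list, a tuple, or a custom sequence
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

cur = conn.cursor()
cur.execute("SELECT id, name FROM t ORDER BY id")
rows = fetch_in_chunks(cur)

cur.execute("SELECT id, name FROM t ORDER BY id")
copy = list(cur.fetchall())  # always a list, whatever the driver returned
copy.reverse()               # safe: only our own copy is mutated

print(rows)     # [(1, 'a'), (2, 'b'), (3, 'c')]
print(copy[0])  # (3, 'c')
conn.close()
```

Written this way, the code does not depend on whether a given driver hands back a list, a tuple, or a custom result set object.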

From mike_mp at zzzcomputing.com Wed Feb 11 00:08:48 2015
From: mike_mp at zzzcomputing.com (Michael Bayer)
Date: Tue, 10 Feb 2015 18:08:48 -0500
Subject: [DB-SIG] "lists of tuples" vs. "tuples of tuples"
In-Reply-To: <54DA780C.7020907@gmx.net>
References: <19F049B2-EAF7-4F22-830E-A0F84125DD30@zzzcomputing.com> <54CF5BF2.1070405@egenix.com> <9A4ABF1A-4C1B-40B6-B373-FD5DE3B8C524@zzzcomputing.com> <54D139B1.7050808@egenix.com> <2E795136-DDC7-4F9C-8056-ED4454C4F239@zzzcomputing.com> <54DA780C.7020907@gmx.net>
Message-ID:

Peter Eisentraut wrote:

> On 2/3/15 6:10 PM, Michael Bayer wrote:
>> It's mostly an issue of semantic correctness. A tuple implies a certain kind
>> of role for a data structure. From a "Does it work?" perspective, there's
>> really no problem at all, except if we're concerned about applications that
>> assume the return of fetchall() is mutable. Python is often enthusiastic
>> about this kind of thing in general, from pep8 to naming conventions to
>> everything else (to pointing out in the Python docs what the usual
>> role of tuples is). But there's no pressing issue here other than that.
>
> Other than that, Python is not Haskell,

This issue is obviously closed. But I'd point out, I'd very much like Python to not be PHP either ;).