[Tutor] overview of how data is handled [getting dictionaries from sql cursor / generators] (fwd)

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Wed May 19 13:39:48 EDT 2004


[Forwarding to tutor at python.org: please keep tutor at python.org in CC.  It
gives other folks the opportunity to help answer your questions.]


---------- Forwarded message ----------
Date: Wed, 19 May 2004 00:37:19 -0700
From: John Fabiani <jfabiani at yolo.com>
To: Danny Yoo <dyoo at hkn.eecs.berkeley.edu>
Subject: Re: [Tutor] overview of how data is handled  [getting dictionaries
    from sql cursor / generators]

Danny Yoo wrote:

>On Tue, 18 May 2004, John Fabiani wrote:
>
>
>
>>The handling of data is like nothing I have ever done.  It's nothing
>>like Delphi, Java, .Net or VFP.
>>
>>
>
>Hi John,
>
>
>The approach that the DB API 2.0 takes is similar to Java's JDBC standard.
>A 'cursor' object in Python behaves like a combination of a Java's
>'java.sql.Statement' and 'java.sql.ResultSet'.
>
>When we say something in Java like:
>
>/*** Java ***/
>PreparedStatement stmt = conn.prepareStatement
>    ("select title, abstract from article");
>ResultSet rs = stmt.executeQuery();
>while (rs.next()) {
>     System.out.println(rs.getString(1));
>     System.out.println(rs.getString(2));
>}
>/******/
>
>then an equivalent translation of this, using Python's API, looks
>something like this:
>
>### Python
>cursor = conn.cursor()
>cursor.execute("select title, abstract from article")
>for (title, abstract) in cursor.fetchall():
>    print title
>    print abstract
>###
>
>The translation isn't exact.  I've tried to improve the readability of the
>Python code, but I think that's perfectly fair.  *grin*
>
>
>The examples above use explicit numeric indices to get at a particular
>column, but you've mentioned that you might want to use a dictionary
>instead.
>
>
>
>>OK I was able to convert the data into a dict.  But doing so required
>>many lines of code (two loops).  If the program has to convert from list
>>into dict for ten's of thousands of records it is going to be very slow.
>>
>>
>
>Can you show us what your code looks like?  And are you sure it's going to
>be too slow?
>
>Python is relatively slow, but at the same time, tens of thousands of
>records doesn't sound like too much data at all.  *grin* I would strongly
>recommend not to assume that performance is poor until a real performance
>measurement is taken.  Human intuition on what's fast and what's slow can
>be deceptive.
>
>
>Here's an example that shows how to extract dictionaries from a cursor,
>assuming that our database doesn't have a native dictionary cursor
>interface:
>
>###
>
>
>>>>def getDictsFromCursor(cursor):
>>>>
>>>>
>...     while True:
>...         nextRow = cursor.fetchone()
>...         if not nextRow: break
>...         d = {}
>...         for (i, columnDesc) in enumerate(cursor.description):
>...             d[columnDesc[0]] = nextRow[i]
>...         yield d
>...
>
>
>>>>cursor.execute("select id, name from pub_term limit 3")
>>>>
>>>>
>3L
>
>
>>>>for d in getDictsFromCursor(cursor):
>>>>
>>>>
>...     print d
>...
>{'id': 107152L, 'name': "'de novo' GDP-L-fucose biosynthesis"}
>{'id': 5258L, 'name': "'de novo' IMP biosynthesis"}
>{'id': 5259L, 'name': "'de novo' protein folding"}
>###
>
>
>Hope this helps!
>
>
>
>
my code
mylist=[]
myrowdata=[]
for count in range(0,len(myfields)):
   mylist.append(myfields[count][0])

for count in range(0,len(mydata[0])):
    myrowdata.append(mydata[0][count] )

d = dict(zip(mylist,myrowdata))
The above works with only one record at a time.

I'm using PyPgSQL and the resultset is a list.
mydata=cursor.fetchmany(10)
 Because the of books I'm reading (O'Rielly) I was expecting a tuple but
I got a list.   OK but now the routines in the books suggest that I use
a dict.  So I figured out how to convert the data (mydata in this case)
into a dict.  But then I realized that the two loops I was going to use
(to convert the list into a dict) will take time.  I did not test how
long.  But I also realized that converting has to be wrong.  The
returned data must be used as a list - not converted  and then used.
But  it appears that's what you are doing in the "getDictsFromCursor"  -
converting into a dict ( I know to show me).

In the java code you are looping thru the resultset.  That's all I want
to do - at first.  Then I'll need some way to loop thru the data based
on the field name.  Again it looks like I have to create the dict for
each record.  This just can't be right.  Fetchone 1000's of times is
also got to be wrong.   In the VFP world we have a "SCAN" that loops
thru the resultset.  I can access the data by field name in VFP. How
does one do same thing - loop thru the resultset (could be 1000's) and
access the fields by their field name?

One nice thing about getting a list is I can change the data in the list
(can't in a tuple).  so how about something like:
replace all extPrice with Qty * price (this is a very simple thing to do
in VFP,  Delphi or Java.) in the resultset. Then open a connection and
save the changed data.
What I think has to happen at the moment is each record must be
converted to a dict and then after I can change each record based on the
key of the dict.  This does not sound right - I'm doing something wrong.
John






More information about the Tutor mailing list