[DB-SIG] New take on PostgreSQL bindings for Python

Thu Sep 7 19:55:33 CEST 2006

On Thu, 7 Sep 2006, M.-A. Lemburg wrote:

> Right, but this isn't something for the DB API to define. You
> have to use the SQL of a particular database backend and its
> features (such as server side cursors, ability to limit/offset
> the query result set, etc.).
>
> A module author can make things a little easier for the programmer
> by providing this functionality via the cursor.scroll()
> method, e.g.

In most cases, opening up a server side cursor means rewriting the user's 
SQL query (ie, to insert a "DECLARE CURSOR <foo>" in front of it) or 
various other tricks that make cursor.scroll() useless.

> It would be better to define the fetching strategy on a regular
> cursor object, e.g. cursor.setprefetchsize(1024) to have the
> database module prepare fetches of 1024 rows or
> cursor.setprefetchsize(sys.maxint) to always read the whole
> result set.
>
> You could also use the cursor.arraysize attribute on cursors as
> indicator of how many rows to pre-fetch. cursor.arraysize defines
> the default number of rows to fetch using cursor.fetchmany().

I don't like this because it is not only imprecise and contraining, it is 
confusing as well. If I start a query using arraysize=1, should I infer 
from that that I would like to retrieve the results one by one and 
automatically take the speed penalty of a server side cursor? What does 
it mean when I change the arraysize or call setprefetchsize() in the 
middle of my cursor.fetchone() loop?

Every time we overload the standard cursor() with some more variables and 
methods, we have to define the behavior of what happens when any 
combination of those is set by the user. I tend to like simpler objects, 
with a single and well defined behavior that do not surprize the user with 
"magic".

In the itercursor() case, an iteration cursor attempts to extract data 
from the server in chunks and not load up the entire result set in RAM. 
It's simple, straightforward, you know what you get and at what cost.

> Aside: cursors can optionally implement the iteration
> protocol, so you can write:
>
> for row in cursor:
>    print row

Now you're asking the Cursor class to know at the instantiation or 
execute() time how one is gonna loop over the results?! Remember, by the 
time you hit fetchXXX() stage, it is too late to change your mind in this 
server-side or not business.

Cristian
-- 
Cristian Gafton
rPath, Inc.