Question about 'iterable cursors'

Alain Ketterlin alain at dpt-info.u-strasbg.fr
Sun Nov 6 04:16:22 EST 2011


"Frank Millman" <frank at chagford.com> writes:

> I am using a few DB_API adaptors - ceODBC for Sql Server, psycopg2 for
> PostgreSQL, and sqlite3 for sqlite3.
>
> They all offer the feature that if a cursor executes a SELECT, the
> cursor returns an iterator which can be used to fetch one row at a
> time. I have been using this feature for a while and it seems like a
> good thing'.
>
> Now I am not so sure. I am using a connection pool to maintain
> connections to the database. A principle I am following is that a
> connection must be returned quickly, so that it is available for
> reuse.
>
> I have been happily returning the connection, but keeping the cursor
> open while processing the rows selected. I now realise that this is
> dangerous. Therefore I have changed my system to execute fetchall() on
> the cursor before returning the connection. This obviously loses the
> benefit of the iterator.
>
> I would appreciate confirmation that my thinking is correct on this
> issue. Or is there any way that I can have my cake and eat it?

Your thinking is correct: you need to keep the connection while
processing the cursor. Databases are made to scale, you may well be
processing the first lines of the result before the DBMS has even
finished scanning tables. View this as a pipe, the cursor being one end
of the pipe. The usual setting, fetching one line at a time, lets you
overlap your processing with the network transfers.

Fetching all data, returning the connection, and then start processing
only makes sense if the processing take a lot of time (I mean: a lot
more than fetching results), which is a rare case. Unless you are in
such an extreme situation, I would suggest leaving the optimization to
the connection pool, which is here to solve what you are trying to
solve.

-- Alain.



More information about the Python-list mailing list