Question about optimization
exarkun at divmod.com
Thu Jul 24 23:40:54 CEST 2008
On Thu, 24 Jul 2008 17:19:41 -0400, Wei Hao <weihao89 at gmail.com> wrote:
>I'm pretty new to Python and I have some optimization issues. I'll show you
>the piece of code which is causing it, with pseudo-code before it and
>comments. I'm accessing a gigantic table (like 15 million rows) in SQL.
>d is some dictionary, r is a precompiled regex string
>Big loop, so I search through the table in chunks given by delta
> SQL query ("select * from table where rowID >= n and rowID < (n +
>delta)"), result of query stored in a. Each individual row is a[n1], columns
>of rows are a[n1][n2].
>I am 100% sure it's this code snippet that's the cause of my problems.
>Here's what I can tell you. Each chunk of rows that I grab is essentially
>equal in size (rowID skips over stuff, but rather arbitrarily). The time it
>takes to fetch the SQL query doesn't change. But as the program progresses,
>this snippet gets slower. Here's the output:
>What is it in the code snippet that slows down as n increases? Is there
>something about the way low-level Python functions work that I don't
>understand and that is slowing me down?
Perhaps you need an index on rowID.
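
A minimal sketch of how one might check this, assuming SQLite purely for illustration (the original post does not say which database is in use). The table name big_table, the payload column, and the index name are hypothetical, and the column is called row_id here because rowid is a built-in name in SQLite. The script asks the database for its query plan for one chunked range query, before and after creating the index:

```python
import sqlite3


def query_plan(conn, sql, params):
    """Return SQLite's plan for a query as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Hypothetical stand-in for the real table: sparse ids, as in the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (row_id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO big_table VALUES (?, ?)",
    ((i * 3, "x") for i in range(10000)),
)

# Same shape as the chunked query in the post: n <= id < n + delta.
chunk_query = "SELECT * FROM big_table WHERE row_id >= ? AND row_id < ?"
params = (6000, 6300)

plan_before = query_plan(conn, chunk_query, params)
conn.execute("CREATE INDEX idx_big_table_row_id ON big_table (row_id)")
plan_after = query_plan(conn, chunk_query, params)

print("without index:", plan_before)
print("with index:   ", plan_after)
```

If the first plan reports a scan of the whole table and the second reports a search using the index, the range query was not able to use an index before. Other databases offer a similar EXPLAIN facility, though the syntax and output differ.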