[Tutor] Code optimisation

Mon Apr 7 17:16:35 CEST 2008

Hi Yogi

On Fri, Apr 4, 2008 at 10:05 PM, yogi <byogi at yahoo.com> wrote:
> Hi ,
>        Here is my first usable Python code.
>  The code works.
Woohoo! Congratulations.

>  Here is what I'm trying to do.
>  I have two huge text files. After some processing, one is 12M (file A) and the other 1M (file B).
>  The files have columns which are of interest to me.
...
>  Question1 : Is there a better way ?
I admit that I didn't spend too much time trying to understand your
code.  But at first glance your logic looks like it could be easily
represented in SQL.  I bet a relational database could do your lookup
faster than doing it in pure Python.  I do this kind of thing
frequently: use Python to import delimited data into a relational
database like PostgreSQL, add indexes where they make sense, then query
the database for the results.  It can all be done from inside Python,
but it doesn't have to be.

SELECT a.*
FROM a INNER JOIN b ON a.field0=b.field0
WHERE
  b.field3=0
  AND
  a.field3 >= (b.field1-1000000) AND a.field3 <= (b.field2+1000001)
... etc.
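
A minimal sketch of that workflow, using Python's built-in sqlite3
module in place of PostgreSQL so that it runs without a database
server.  The table names a and b and the columns field0..field3 follow
the query above.  The helper name find_matches, the index names, the
sample rows and the integer column types are assumptions made for
illustration, and the sketch uses one margin on both sides of the range
rather than the exact constants in the query.

```python
import sqlite3


def find_matches(rows_a, rows_b, margin=1000000):
    """Load both row sets into an in-memory database and join them.

    find_matches is a hypothetical helper; rows are 4-tuples of
    (field0, field1, field2, field3).
    """
    conn = sqlite3.connect(":memory:")
    for table in ("a", "b"):
        conn.execute(
            "CREATE TABLE %s (field0 TEXT, field1 INTEGER, "
            "field2 INTEGER, field3 INTEGER)" % table
        )
    conn.executemany("INSERT INTO a VALUES (?, ?, ?, ?)", rows_a)
    conn.executemany("INSERT INTO b VALUES (?, ?, ?, ?)", rows_b)

    # Index the columns used in the join and in the range test.
    conn.execute("CREATE INDEX idx_a ON a (field0, field3)")
    conn.execute("CREATE INDEX idx_b ON b (field0)")

    query = """
        SELECT a.*
        FROM a INNER JOIN b ON a.field0 = b.field0
        WHERE b.field3 = 0
          AND a.field3 >= (b.field1 - :margin)
          AND a.field3 <= (b.field2 + :margin)
        ORDER BY a.field3
    """
    result = conn.execute(query, {"margin": margin}).fetchall()
    conn.close()
    return result


# Made-up sample rows, only to show the call.
rows_a = [("chr1", 0, 0, 500), ("chr1", 0, 0, 5000000), ("chr2", 0, 0, 500)]
rows_b = [("chr1", 1000, 2000, 0), ("chr2", 1000, 2000, 1)]
matches = find_matches(rows_a, rows_b)
print(matches)
```

For files as large as yours you would read the rows from the delimited
files (the csv module can do that) and probably keep the database on
disk instead of in memory.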

>  Question2 : For now I'm using the shell's time call for calculating the time required. Does Python provide a more fine-grained check?
I think so, but I've not used it: the timeit module.  Search this
mailing list's archives for 'timeit', and/or try this at the Python
command line:
import timeit
help(timeit)
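
As a rough sketch of how timeit might be used from a script, going by
its documentation rather than my own experience: timeit.repeat runs a
callable a given number of times and returns one total per repetition.
The lookup function and its data below are hypothetical stand-ins for
whatever code you want to time.

```python
import timeit


def lookup(data, keys):
    """Hypothetical stand-in for the code being timed."""
    return [k for k in keys if k in data]


data = set(range(100000))
keys = list(range(0, 200000, 7))

# Call lookup 10 times per measurement and take 3 measurements.
# The fastest measurement is the one least disturbed by other processes.
calls = 10
timings = timeit.repeat(lambda: lookup(data, keys), number=calls, repeat=3)
best = min(timings) / calls
print("best time per call: %.6f s" % best)
```

Unlike the shell's time, this measures only the call you wrap and
leaves out interpreter start-up and file loading.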

>  Question 2: If I have convert this code into a function.
>  Should I ?
Yes, especially if it helps make your code easier to read and understand.