please help with optimisation of this code - update of given table according to another table

Antoon Pardon apardon at forel.vub.ac.be
Wed Nov 8 08:24:30 EST 2006


On 2006-11-08, Farraige <farraige at go2.pl> wrote:
>
> ...
>
> The main part of my algorithm now looks something like ...
>
> merge(t1, t2, keyColumns, columnsToBeUpdated)
>
> .......
>
>         for row_t1 in t1:
>             for  row_t2 in t2:
>                 if [row_t1[i] for i in keyColumns] == [row_t2[j] for j in keyColumns]:
>                     # the keys are the same
>                     for colName in columnsToBeUpdated:
>                         row_t1[colName] = row_t2[colName]
>
>                     # go outside the inner loop - we found a row with
>                     # the same key in the table
>                     break
>
> In my algorithm I have 2 for loops and I have no idea how to optimise
> it (maybe with map? )
> I call this method for very large data and the performance is a
> critical issue for me :(
>
> I will be grateful for any ideas

One idea would be to precompute the list comprehensions used in the if
test, so that they are not rebuilt on every pass through the inner loop.

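    from itertools import izip

    # Project every row of t2 onto its key columns once, up front.
    p2 = [[row_t2[i] for i in keyColumns] for row_t2 in t2]
    for row_t1 in t1:
        # The key of the current t1 row is computed once per outer pass.
        proj1 = [row_t1[i] for i in keyColumns]
        for row_t2, proj2 in izip(t2, p2):
            if proj1 == proj2:
                ...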
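
A further option, sketched here under assumptions that are not stated in
the original post: if the values in the key columns are hashable (numbers,
strings, tuples), t2 can be indexed by key in a dict, which removes the
inner loop entirely. The signature follows the quoted merge(); the names
index and key are made up for illustration, and rows are assumed to
support row[col] for both reading and assignment.

```python
def merge(t1, t2, keyColumns, columnsToBeUpdated):
    # Build a lookup from key tuple to row of t2.
    # setdefault keeps the FIRST row seen for each key, which mirrors
    # the "break" after the first match in the nested-loop version.
    index = {}
    for row_t2 in t2:
        key = tuple(row_t2[i] for i in keyColumns)
        index.setdefault(key, row_t2)

    # One dict lookup per row of t1 instead of a scan over t2.
    for row_t1 in t1:
        key = tuple(row_t1[i] for i in keyColumns)
        row_t2 = index.get(key)
        if row_t2 is not None:
            for colName in columnsToBeUpdated:
                row_t1[colName] = row_t2[colName]
```

With this layout the work grows roughly with len(t1) + len(t2) rather than
with len(t1) * len(t2), at the cost of holding one dict entry per distinct
key of t2 in memory.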


-- 
Antoon Pardon


