Optimisation problem - in C?

Tue Feb 20 19:07:45 EST 2001

Tim Churches <tchur at optushome.com.au> writes:
<snip>
> def makevector(sourcelist,elementlist):
> 	resultvector = []
> 	for element in elementlist:
> 		resultvector.append(sourcelist[element])
> 	return resultvector
<snip>
> My test data has about 300,000 elements in sourcelist and between about
> 10,000 and 70,000 elements in elementlist, but larger lists are
> envisaged, hence the earnest desire for maximum speed.
> 
> Questions:
> Can this function (which essentially does the fairly generic operation
> of "fetch the elements of sequence a using the element indexes contained
> in sequence b") be optimised without resorting to a C module?
<snip>

I think it can.  Since a list's contents are contiguous in memory (in
order to support random access), appending to it will cause it to
reallocate and move all its contents periodically.  If I remember
correctly, Python's lists increase their memory allocations by
constant amounts rather than by a constant factor, so this loop will
run in quadratic time, i.e. proportional to the square of the length
of elementlist!

Each of the following definitions should be equivalent to the above,
and I think at least one will run faster:

    def makevector(sourcelist,elementlist):
        resultvector = [None] * len(elementlist)
        for i in range(len(elementlist)):
            resultvector[i] = sourcelist[elementlist[i]]
        return resultvector

    def makevector(sourcelist,elementlist):
        return [sourcelist[element] for element in elementlist]

    def makevector(sourcelist,elementlist):
        return map(lambda i,l=sourcelist: l[i], elementlist)

-- 
Any opinions expressed are my own and not necessarily those of Roundpoint.