[Python-Dev] Use for enumerate()

Tim Peters tim.one@comcast.net
Sat, 27 Apr 2002 01:56:56 -0400


>> Challenge 3: do it faster and with less code.

[Raymond Hettinger]
> def getline(filename, lineno):
>     if lineno < 1:
>         return ''
>     f = open(filename)
>     i, line = zip(xrange(lineno), f)[-1]
>     f.close()
>     if i+1 == lineno:
>         return line
>     return ''

Hmm.  On my box it's a little slower than Guido's getline on my standard
<wink> test, here calling that function g3 (g2 and the timing driver were
posted before; the input is Zope's DateTime.py, a 1657-line Python source
file):

getline 4.85231314638
g2 2.8915829967
g3 5.19037613772

That's a curious result, since, as you say:

> The approach is to vectorize, trading away memory
> allocation time and xrange time to save the overhead
> of the pure Python loop and test cycle.

It gets a speed boost to below 5.0 if I use range instead of xrange.

It suggests this alternative, which is a tiny bit shorter and significantly
faster than Guido's:

def g4(filename, lineno):
    if lineno < 1:
        return ''
    f = open(filename)
    get = iter(f).next
    try:
        for i in range(lineno): line = get()
    except StopIteration:
        pass
    f.close()
    return line

That weighs in at 4.04 seconds on my test case.

I think the lesson to take is that building gobs of 2-tuples is more
expensive than taking the same number of quick trips around the eval loop.
Guido's and your function both build gobs of 2-tuples, while the zippier g4
and much zippier g2 avoid that.

> ...
> The test is saved by taking advantage of zip's feature
> which stops when the first iterator is exhausted.

It is clever!  Too bad it's pig slow <wink>.