Challenge 3: do it faster and with less code.
def getline(filename, lineno):
    if lineno < 1:
        return ''
    f = open(filename)
    i, line = zip(xrange(lineno), f)[-1]
    f.close()
    if i+1 == lineno:
        return line
    return ''

To keep to the spirit of the challenge, I'm ignoring that the function is i/o bound, which would lead to using an 'rb' read and doing .finds or .counts on '\n'. The approach is to vectorize, trading away memory allocation time and xrange time to save the overhead of the pure Python loop and test cycle. The test is saved by taking advantage of zip's feature of stopping when the first iterator is exhausted.

Raymond Hettinger
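(The stop-at-shortest zip behavior Raymond relies on survives unchanged in later Pythons, where zip is lazy. A small sketch of the feature, with a list iterator standing in for the open file, not code from the thread:)

```python
# zip stops as soon as its shortest input is exhausted, so pairing
# range(lineno) with a file iterator reads at most lineno lines.
lines = iter(['a\n', 'b\n', 'c\n', 'd\n'])  # stand-in for an open file
pairs = list(zip(range(2), lines))          # -> [(0, 'a\n'), (1, 'b\n')]
leftover = next(lines)                      # 'c\n': only two lines consumed
```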
Challenge 3: do it faster and with less code.
[Raymond Hettinger]
def getline(filename, lineno):
    if lineno < 1:
        return ''
    f = open(filename)
    i, line = zip(xrange(lineno), f)[-1]
    f.close()
    if i+1 == lineno:
        return line
    return ''
Hmm. On my box it's a little slower than Guido's getline on my standard <wink> test, here calling that function g3 (g2 and the timing driver were posted before; the input is Zope's DateTime.py, a 1657-line Python source file):

getline 4.85231314638
g2      2.8915829967
g3      5.19037613772

That's a curious result, since, as you say:
The approach is to vectorize, trading away memory allocation time and xrange time to save the overhead of the pure Python loop and test cycle.
It gets a speed boost to below 5.0 if I use range instead of xrange. It suggests this alternative, which is a tiny bit shorter and significantly faster than Guido's:

def g4(filename, lineno):
    if lineno < 1:
        return ''
    f = open(filename)
    get = iter(f).next
    try:
        for i in range(lineno):
            line = get()
    except StopIteration:
        pass
    f.close()
    return line

That weighs in at 4.04 seconds on my test case. I think the lesson to take is that building gobs of 2-tuples is more expensive than taking the same number of quick trips around the eval loop. Guido's and your function both build gobs of 2-tuples, while the zippier g4 and much zippier g2 avoid that.
... The test is saved by taking advantage of zip's feature which stops when the first iterator is exhausted.
It is clever! Too bad it's pig slow <wink>.
Challenge 3: do it faster and with less code.
def getline(filename, lineno):
    if lineno < 1:
        return ''
    f = open(filename)
    i, line = zip(xrange(lineno), f)[-1]
    f.close()
    if i+1 == lineno:
        return line
    return ''
Cute, but it builds up a list containing all the lines up to lineno. An implicit part of the exercise (sorry for not making this explicit) was to avoid this -- IOW it should work even if the file is too large to fit in memory (as long as each individual line fits in memory). --Guido van Rossum (home page: http://www.python.org/~guido/)
Challenge 3: do it faster and with less code.
it should work even if the file is too large to fit in memory (as long as each individual line fits in memory).
def getline(filename, lineno):    # Second attempt
    if lineno < 1:
        return ''
    f = open(filename)
    reduce(lambda x, y: f.readline(), xrange(lineno-1), None)
    return (f.readline(), f.close())[0]

Arghh, must resist lambda. Must not use reduce. Must avoid tuple tricks. ... GvR's code is too powerful. I can't resist.

Raymond Hettinger
participants (3)
- Guido van Rossum
- Raymond Hettinger
- Tim Peters