FileInput too slow
Steven D'Aprano
steven at REMOVE.THIS.cybersource.com.au
Mon Jan 4 22:27:36 EST 2010
On Mon, 04 Jan 2010 23:35:02 +0100, wiso wrote:
> I'm trying the fileinput module, and I like it, but I don't understand
> why it's so slow...
Because it is written for convenience, not speed. From the source code:
"Performance: this module is unfortunately one of the slower ways of
processing large numbers of input lines."
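
If the convenience is not worth the cost, much of it can be rebuilt on top of plain file iteration. What follows is a minimal sketch, not code from the fileinput source: the generator name iter_lines is made up for illustration, and it only covers reading named files with per-file line numbers, not stdin or in-place editing.

```python
def iter_lines(paths):
    """Yield (path, line_number_in_file, line) for each line of each file.

    Hypothetical helper: it relies on ordinary file iteration, so the
    per-line work is one tuple and one generator resumption.
    """
    for path in paths:
        with open(path) as handle:
            # enumerate(..., 1) numbers lines from 1 within each file.
            for lineno, line in enumerate(handle, 1):
                yield path, lineno, line


def count_lines(paths):
    """Count all lines and collect the files seen, in order."""
    total = 0
    seen = []
    for path, lineno, line in iter_lines(paths):
        if lineno == 1:
            # First line of a new file, the analogue of isfirstline().
            seen.append(path)
        total += 1
    return total, seen
```

A new file is detected by comparing an integer that is already in hand, rather than by making a method call for every line.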
> look:
>
> from time import time
> from fileinput import FileInput
>
> file = ['r1_200907.log', 'r1_200908.log', 'r1_200909.log',
> 'r1_200910.log', 'r1_200911.log']
>
> def f1():
>     n = 0
>     for f in file:
>         print "new file: %s" % f
>         ff = open(f)
>         for line in ff:
>             n += 1
>         ff.close()
>     return n
>
> def f2():
>     f = FileInput(file)
>     for line in f:
>         if f.isfirstline(): print "new file: %s" % f.filename()
>     return f.lineno()
>
> def f3(): # f2 simpler
>     f = FileInput(file)
>     for line in f:
>         pass
>     return f.lineno()
>
>
> t = time(); f1(); print time()-t # 1.0
> t = time(); f2(); print time()-t # 7.0 !!!
> t = time(); f3(); print time()-t # 5.5
>
> I'm using text files, there are 2563150 lines in total.
The extra second and a half in f2() compared with f3() is probably the
cost of calling f.isfirstline() once per line, 2563150 times in all.
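
One way to check that guess is to time the same FileInput loop with and without the per-line call. This is a rough sketch under stated assumptions: the helper names and the sample file sizes are invented for illustration, and the absolute timings will depend on the machine and the Python version.

```python
import fileinput
import os
import shutil
import tempfile
from timeit import default_timer


def make_sample_files(directory, n_files=3, n_lines=20000):
    """Write small throwaway log files and return their paths."""
    paths = []
    for i in range(n_files):
        path = os.path.join(directory, "sample_%d.log" % i)
        with open(path, "w") as handle:
            for j in range(n_lines):
                handle.write("line %d\n" % j)
        paths.append(path)
    return paths


def count_plain(paths):
    """Loop over FileInput doing nothing per line (like f3)."""
    f = fileinput.FileInput(paths)
    for line in f:
        pass
    total = f.lineno()
    f.close()
    return total


def count_with_check(paths):
    """Same loop, plus one isfirstline() call per line (like f2)."""
    f = fileinput.FileInput(paths)
    first_lines = 0
    for line in f:
        if f.isfirstline():
            first_lines += 1
    total = f.lineno()
    f.close()
    return total, first_lines


def timed(func, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = default_timer()
    result = func(*args)
    return result, default_timer() - start


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    try:
        sample = make_sample_files(workdir)
        print("plain loop:         %.3f s" % timed(count_plain, sample)[1])
        print("with isfirstline(): %.3f s" % timed(count_with_check, sample)[1])
    finally:
        shutil.rmtree(workdir)
```

The difference between the two timings, divided by the number of lines, gives a per-call estimate that can be scaled up to the 2563150 lines in the original test.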
--
Steven