Seek the one billionth line in a file containing 3 billion lines.
Peter Otten
__peter__ at web.de
Wed Aug 8 02:52:20 EDT 2007
Sullivan WxPyQtKinter wrote:
> I have a huge log file which contains 3,453,299,000 lines with
> different lengths. It is not possible to calculate the absolute
> position of the beginning of the one billionth line. Is there an
> efficient way to seek to the beginning of that line in Python?
>
> This program:
> for i in range(1000000000):
>     f.readline()
> is absolutely very slow....
>
> Thank you so much for help.
That will be slow regardless of language. However
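import itertools
import sys

n = 10**9 - 1  # zero-based index of the one billionth line
assert n < sys.maxint  # islice() indices must not exceed sys.maxint
f = open(filename)
wanted_line = itertools.islice(f, n, None).next()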
should do slightly better than your implementation.
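
On Python 3, where sys.maxint and the .next() method no longer exist, the same idea might look like the sketch below. The function name nth_line, the 1-based line numbering, and opening the file in binary mode (to skip text decoding) are assumptions of this sketch, not part of the snippet above. It still reads every byte before the wanted line, so it only saves the per-line overhead of the Python loop.

```python
import itertools


def nth_line(path, n):
    """Return line n (1-based) of the file as bytes, or None if the file is shorter."""
    # Binary mode is an assumption of this sketch: it avoids decoding
    # every skipped line. The returned line is a bytes object.
    with open(path, "rb") as f:
        # islice() discards the first n - 1 lines without a Python-level
        # loop body per line, then next() fetches line n. The second
        # argument to next() is returned if the file ends too early.
        return next(itertools.islice(f, n - 1, None), None)
```

Usage would be something like nth_line("huge.log", 10**9), where "huge.log" is a hypothetical file name.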
Peter