Seek the one billionth line in a file containing 3 billion lines.
tjreedy at udel.edu
Wed Aug 8 23:07:30 CEST 2007
"Marc 'BlackJack' Rintsch" <bj_666 at gmx.net> wrote in message
news:5htl5qF3md0abU1 at mid.uni-berlin.de...
| On Wed, 08 Aug 2007 09:54:26 +0200, Méta-MCI (MVP) wrote:
| > Create an "index" (a file with 3,453,299,000 tuples:
| > line_number + start_byte); this file has fixed-length lines.
| > Slow, OK, but only once.
| Why store the line number? The first start offset is for the first
| line, the second start offset for the second line, and so on.
Somewhat ironically, given that the OP's problem stems from variable line
lengths, this requires that the offsets be fixed length. On a true 64-bit
OS (not Win64, apparently) with 64-bit ints that would work great.
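A minimal sketch of the index discussed above, assuming each offset is
stored as a fixed 8-byte unsigned integer so that the record for line n
sits at byte 8*n of the index file and no line number needs to be stored.
The names build_index, read_line and OFFSET are made up for illustration,
and line numbers are taken to be 0-based. The struct format "<Q" is 8
bytes regardless of the platform's native int size; whether seeking past
2 GiB works still depends on the OS and the Python build.

```python
import struct

# One unsigned 64-bit little-endian integer per line: always 8 bytes.
OFFSET = struct.Struct("<Q")


def build_index(data_path, index_path):
    """Write the start byte of every line as a fixed-width record.

    This is the slow pass over the whole data file, done once.
    """
    with open(data_path, "rb") as data, open(index_path, "wb") as index:
        pos = 0
        for line in data:
            index.write(OFFSET.pack(pos))
            pos += len(line)  # binary mode, so len() counts bytes


def read_line(data_path, index_path, n):
    """Return line n (0-based) as bytes, using seeks instead of a scan."""
    if n < 0:
        raise IndexError("line number out of range")
    with open(index_path, "rb") as index:
        # Fixed-width records: the position of record n is just n * 8.
        index.seek(n * OFFSET.size)
        record = index.read(OFFSET.size)
    if len(record) < OFFSET.size:
        raise IndexError("line number out of range")
    (start,) = OFFSET.unpack(record)
    with open(data_path, "rb") as data:
        data.seek(start)
        return data.readline()
```

With 0-based numbering the one billionth line would be
read_line(data_path, index_path, 10**9 - 1); the index for roughly
3.45 billion lines would take about 27.6 GB at 8 bytes per line.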