[Tutor] Reading large bz2 Files

Stefan Behnel stefan_ml at behnel.de
Mon Feb 22 18:25:34 CET 2010


Norman Rieß, 19.02.2010 13:42:
> I am trying to read a large bz2 file with this code:
> 
> source_file = bz2.BZ2File(file, "r")
> for line in source_file:
>     print line.strip()
> 
> But after 4311 lines, it stops without an error message. The bz2 file is
> much bigger though.

Could you send in a copy of the unpacked bytes around the position where it
stops? I.e. a couple of lines before and after that position? Note that
bzip2 is a block compressor, so, depending on your data, you may have to
send enough lines to fill the block size.
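To find that position, something along these lines might help. It is only a
sketch, assuming Python 3; the function name find_stop_position and its tail
parameter are made up for illustration. It iterates over the file the same way
as the code above and reports how many lines and unpacked bytes were read
before iteration ended, together with the last few lines seen.

```python
import bz2
from collections import deque


def find_stop_position(path, tail=5):
    """Iterate over a bzip2 file and report where iteration ends.

    Returns (line_count, byte_count, last_lines): the number of lines read,
    the number of unpacked bytes read, and the final `tail` lines as bytes.
    The name and parameters are hypothetical, chosen for this sketch.
    """
    line_count = 0
    byte_count = 0
    last_lines = deque(maxlen=tail)  # keeps only the most recent lines
    with bz2.BZ2File(path, "r") as source_file:
        for line in source_file:
            line_count += 1
            byte_count += len(line)
            last_lines.append(line)
    return line_count, byte_count, list(last_lines)
```

The byte count is the offset in the unpacked data, which is what you would
compare against the output of a separate decompression tool.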

Does it also stop if you parse only those lines from a bzip2 file, or is it
required that the file has at least the current amount of data before those
lines?
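A rough way to try this, again assuming Python 3 and using made-up helper
names (write_sample, count_lines): pack only the suspicious lines into a new
bzip2 file and check whether iterating over it yields all of them.

```python
import bz2


def write_sample(lines, sample_path):
    """Pack the given byte strings into a new bzip2 file (hypothetical helper)."""
    with bz2.BZ2File(sample_path, "w") as out:
        for line in lines:
            out.write(line)


def count_lines(path):
    """Count how many lines iteration over a bzip2 file yields."""
    with bz2.BZ2File(path, "r") as source_file:
        return sum(1 for _ in source_file)
```

If count_lines() on the sample returns the number of lines you wrote, the data
content alone is probably not what stops the reader.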

Based on this, could you please do a bit of poking around yourself to
figure out if it is a) the byte position, b) the data content or c) the
length of the file that induces this behaviour? I assume it's rather
impractical to share the entire file, so you will have to share hints and
information instead if you want this resolved.
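One thing such poking could look at is whether the compressed data simply
ends where the reader stops, or whether more compressed bytes follow. Nothing
in this thread confirms that this is what happens here, so take the following
as a sketch only. It assumes Python 3.3 or later (for the eof attribute of
bz2.BZ2Decompressor), and the name first_stream_size is made up.

```python
import bz2
import os


def first_stream_size(path, chunk_size=64 * 1024):
    """Decompress only the first bzip2 stream in a file.

    Returns (unpacked_bytes, leftover_bytes): the unpacked size of the first
    stream and the number of compressed bytes found after its end.
    Hypothetical helper, written for this sketch.
    """
    decomp = bz2.BZ2Decompressor()
    unpacked = 0
    with open(path, "rb") as f:
        # Feed compressed chunks until the decompressor sees end-of-stream.
        while not decomp.eof:
            chunk = f.read(chunk_size)
            if not chunk:
                break  # file ended before the stream did (truncated input)
            unpacked += len(decomp.decompress(chunk))
        # Bytes already read but not consumed, plus bytes not yet read.
        leftover = len(decomp.unused_data) + (os.path.getsize(path) - f.tell())
    return unpacked, leftover
```

If unpacked_bytes matches the byte position where iteration stops and
leftover_bytes is not zero, that would point to the structure of the file
rather than to the content of a particular line.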

Stefan


