[issue1625] bz2.BZ2File doesn't support multiple streams

Fri May 27 23:37:03 CEST 2011

Nadeem Vawda <nadeem.vawda at gmail.com> added the comment:

I seem to be unable to log in to Rietveld, so I'll reply here.

>> result += decomp.decompress(data)
> Is this efficient?  I understood that other Python implementations
> had poorly performing str.__iadd__, and therefore that using a list
> was the common idiom (using “return b''.join(result)” at the end).

Good point. I hadn't thought about other implementations.
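
For reference, a minimal sketch of the list-based idiom the reviewer
describes. The function name decompress_chunks and its chunks parameter
are hypothetical and not taken from the patch; only bz2.BZ2Decompressor
and bytes.join are real APIs.

```python
import bz2


def decompress_chunks(chunks):
    """Decompress one bz2 stream delivered as an iterable of byte chunks.

    Hypothetical helper for illustration only; not the code under review.
    """
    decomp = bz2.BZ2Decompressor()
    parts = []  # collect pieces instead of doing result += ...
    for data in chunks:
        parts.append(decomp.decompress(data))
    # A single join copies each piece once, whatever the implementation
    # does for repeated in-place concatenation.
    return b"".join(parts)
```

The join at the end makes the total copying linear in the output size,
without relying on any interpreter-specific optimisation of +=.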

Also, you're right about the superfluous comments in test_bz2; I'll do a
general cleanup of the test code soon.

> Looks good.  I only have one paranoid comment: since the tests use
> self.TEXT * 5 to create multiple streams, the ordering of the files is
> not tested.  IOW, the code could mix-up the order of the files and the
> tests would not catch that.  Is it a concern?

I wouldn't think so. It's not as though there is an index that the code
looks at to find the location of each stream. It just reads the data from
the file, and if it reaches the end of one stream, it assumes that the
following data is a new stream and decompresses it accordingly.
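
The sequential behaviour described above can be sketched roughly as
follows. This is not the patch itself: decompress_multistream is a
hypothetical name, and the sketch assumes Python 3.3 or later, where
BZ2Decompressor exposes the eof and unused_data attributes.

```python
import bz2


def decompress_multistream(data):
    """Decompress zero or more concatenated bz2 streams, in order.

    Hypothetical illustration; assumes the whole input is in memory.
    """
    parts = []
    while data:
        # One decompressor object handles exactly one stream.
        decomp = bz2.BZ2Decompressor()
        parts.append(decomp.decompress(data))
        if not decomp.eof:
            raise ValueError("compressed data ended before end of stream")
        # Whatever follows the end of this stream is treated as the
        # start of the next one; there is no index to consult.
        data = decomp.unused_data
    return b"".join(parts)
```

Because each stream is only discovered by running off the end of the
previous one, the output order necessarily follows the input order.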

That said, I wouldn't be opposed to adding a test for that sort of thing
(just for paranoia's sake :P) if it doesn't involve adding large amounts
of additional binary data in the file. I'll come back to it once I've
tidied up the existing code.
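
One possible shape for such a test, sketched under assumptions: the
helper names make_multistream and read_all are hypothetical, the streams
are generated at test time rather than stored as binary data, and
reading a multi-stream file through BZ2File on a file object assumes
Python 3.3 or later.

```python
import bz2
import io


def make_multistream(payloads):
    """Concatenate one independently compressed bz2 stream per payload."""
    return b"".join(bz2.compress(p) for p in payloads)


def read_all(data):
    """Read every stream from an in-memory file via BZ2File."""
    with bz2.BZ2File(io.BytesIO(data)) as f:
        return f.read()


# Distinct payloads, so any reordering of the streams changes the result.
payloads = [("stream-%d\n" % i).encode("ascii") for i in range(5)]
```

A test would then compare read_all(make_multistream(payloads)) against
b"".join(payloads); unlike self.TEXT * 5, a mix-up in the order would
make that comparison fail.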

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue1625>
_______________________________________