[Python-Dev] Question on bz2 codec. Is this a bug?
mal at egenix.com
Wed Sep 29 23:05:38 CEST 2010
Chris Bergstresser wrote:
> Hi all --
> I looked through the bug tracker, but I didn't see this listed. I
> was trying to use the bz2 codec, but it seems like it's not very
> useful in the current form (and I'm not sure if it's getting added
> back to py3k, so maybe this is a moot point). It looks like the codec
> writes every piece of data fed to it as a separate compressed block.
> This results in compressed files which are significantly larger than
> the uncompressed files, if you're writing a lot of small bursts of
> data. It also leads to interesting oddities like this:
> import codecs
> with codecs.open('text.bz2', 'w', 'bz2') as f:
>     for x in xrange(20):
>         f.write('This is data %i\n' % x)
> with codecs.open('text.bz2', 'r', 'bz2') as f:
>     print f.read()
> This prints "This is data 0" and exits, because the codec won't read
> beyond the first compressed block.
> My question is, is this known, intended behavior? Should I open a bug
> report? Is it going away in py3k, so there's no real point in fixing
The codec is scheduled to be added back to Python 3.
However, its main use is in working on whole chunks of
data rather than the line-by-line approach you're after.
Line-by-line processing is provided by the codec's incremental
encoders/decoders, but these are currently not used by
codecs.open(), and I'm not sure whether the io lib, which
you could use via the regular open(), uses them.
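
To illustrate the difference between per-write and incremental
compression, here is a minimal sketch. It is an assumption-laden
illustration, not the codec's own code: it is written for Python 3
and uses the bz2 module's BZ2Compressor and BZ2Decompressor classes
directly rather than going through codecs.open(). The variable names
are made up for the example.

```python
import bz2

# The same 20 short lines as in the example above, as bytes.
lines = [("This is data %i\n" % x).encode("ascii") for x in range(20)]

# Per-write compression: every piece becomes its own bz2 stream,
# which mimics the behaviour described in the original report.
per_write = b"".join(bz2.compress(line) for line in lines)

# Incremental compression: one compressor object is fed all pieces
# and flushed once, so the output is a single bz2 stream.
compressor = bz2.BZ2Compressor()
incremental = b"".join(compressor.compress(line) for line in lines)
incremental += compressor.flush()

# A single decompressor object stops at the end of the first stream;
# whatever follows is left untouched in .unused_data.
first_only = bz2.BZ2Decompressor()
first_stream = first_only.decompress(per_write)

# The incremental output can also be decoded piece by piece.
decompressor = bz2.BZ2Decompressor()
restored = b"".join(
    decompressor.decompress(incremental[i:i + 16])
    for i in range(0, len(incremental), 16)
)

print(len(b"".join(lines)), len(per_write), len(incremental))
print(first_stream)
```

In this sketch the per-write output is larger than the uncompressed
input, and decoding it with a single decompressor object yields only
the first line, which matches the symptoms reported above. The
incremental output is one stream and round-trips completely.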
Marc-Andre Lemburg
eGenix.com