distinction between unzipping bytes and unzipping a file
arkanes at gmail.com
Fri Jan 9 22:12:42 CET 2009
On Fri, Jan 9, 2009 at 3:08 PM, Chris Mellon <arkanes at gmail.com> wrote:
> On Fri, Jan 9, 2009 at 2:32 PM, webcomm <ryandw at gmail.com> wrote:
>> On Jan 9, 3:15 pm, Steve Holden <st... at holdenweb.com> wrote:
>>> webcomm wrote:
>>> > Hi,
>>> > In python, is there a distinction between unzipping bytes and
>>> > unzipping a binary file to which those bytes have been written?
>>> > The following code is, I think, an example of writing bytes to a file
>>> > and then unzipping...
>>> > decoded = base64.b64decode(datum)
>>> > # datum is a base64-encoded string of data downloaded from a web service
>>> > f = open('data.zip', 'wb')
>>> > f.write(decoded)
>>> > f.close()
>>> > x = zipfile.ZipFile('data.zip', 'r')
>>> > After looking at the preceding code, the provider of the web service
>>> > gave me this advice...
>>> > "Instead of trying to create a file, take the unzipped bytes and get a
>>> > Unicode string of text from it."
>>> Not terribly useful advice, but one presumes he, she, or it was trying to
>>> be helpful.
>>> > If so, I'm not sure how to do what he's suggesting, or if it's really
>>> > different from what I've done.
>>> Well, what you have done appears pretty wrong to me, but let's take a
>>> look. What's datum? You appear to be treating it as base64-encoded data;
>>> is that correct? Have you examined it?
>> It's data that has been compressed then base64 encoded by the web
>> service. I'm supposed to download it, then decode, then unzip. They
>> provide a C# example of how to do this on page 13 of
>> If you have a minute, see also this thread...
> When they say "zip", they're talking about a zlib compressed stream of
> bytes, not a zip archive.
> You want to base64 decode the data, then zlib decompress it, then
> finally interpret it as (I think) UTF-16, as that's what Windows
> usually means when it says "Unicode".
> decoded = base64.b64decode(datum)
> decompressed = zlib.decompress(decoded)
> result = decompressed.decode('utf-16')
And of course as *soon* as I write that, I read the appendix of the
documentation in full and turn out to be wrong. Ignore me *sigh*.
It would really help if you could post a sample file somewhere.
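
On the original question: zipfile.ZipFile accepts any seekable file-like
object, so bytes held in memory can be wrapped in io.BytesIO and opened
without writing data.zip first. Whether the payload is a zip archive or a
bare zlib stream is still unknown without a sample, so the following is only
a rough sketch, assuming Python 3, that checks for an archive and otherwise
tries zlib. The unpack() helper and the sample data are hypothetical and are
not taken from the web service's documentation.

```python
import base64
import io
import zipfile
import zlib


def unpack(datum):
    """Decode base64 data, then unpack it as a zip archive or a zlib stream.

    Hypothetical helper: returns a dict mapping member names to bytes.
    A bare zlib stream has no member name, so it is stored under None.
    """
    decoded = base64.b64decode(datum)
    buffer = io.BytesIO(decoded)  # in-memory stand-in for 'data.zip'
    if zipfile.is_zipfile(buffer):
        # A real zip archive: read every member straight from memory.
        with zipfile.ZipFile(buffer) as archive:
            return {name: archive.read(name) for name in archive.namelist()}
    # Assumption: anything else is a zlib stream; zlib.error is raised if not.
    return {None: zlib.decompress(decoded)}


# Build a small archive in memory to exercise the helper (made-up data).
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("hello.txt", "hello")
datum = base64.b64encode(sample.getvalue())
print(unpack(datum))
```

The returned values are still bytes; turning them into text needs a
.decode() call with whatever encoding the service actually uses, which this
sketch does not guess at.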