[Python-bugs-list] [ python-Bugs-755031 ] zipfile: inconsistent filenames with InfoZip "unzip"

SourceForge.net noreply@sourceforge.net
Sun, 15 Jun 2003 18:23:02 -0700


Bugs item #755031, was opened at 2003-06-15 16:23
Message generated for change (Comment added) made by sjones
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=755031&group_id=5470

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Greg Ward (gward)
Assigned to: Nobody/Anonymous (nobody)
Summary: zipfile: inconsistent filenames with InfoZip "unzip"

Initial Comment:
zipfile.py gives filenames inconsistent with the
InfoZIP "unzip" utility for certain ZIP files.  My
source is an email virus, so the ZIP files are almost
certainly malformed.  Nevertheless, it would be nice if
"unzip -l" and ZipFile.namelist() gave consistent
filenames.

Example: the attached Demo.zip (extracted from an email
virus caught on mail.python.org) looks like this
according to InfoZip:

$ unzip -l /tmp/Demo.zip 
Archive:  /tmp/Demo.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
    44544  01-26-03 20:49   DOCUME~1\CHRISS~1\LOCALS~1\Temp\Demo.exe
 --------                   -------
    44544                   1 file

But according to ZipFile.namelist(), the name of that
file is:
 
DOCUME~1\CHRISS~1\LOCALS~1\Temp\Demo.exescr000000000000000000.txt

Getting the same result with Python 2.2.2 and a
~2-week-old build of 2.3 CVS.
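
For anyone reproducing this, here is a minimal sketch (written
for current Python 3, not the 2.2/2.3 interpreters mentioned
above) that prints each member name through repr(), so that
non-printable characters show up as escapes instead of
vanishing on the terminal. The helper name show_names and the
in-memory demo archive are illustrative assumptions; pass a
path such as /tmp/Demo.zip to inspect a real file.

```python
import io
import zipfile

def show_names(zip_source):
    """Return repr() of every member name in the archive."""
    with zipfile.ZipFile(zip_source) as zf:
        return [repr(name) for name in zf.namelist()]

# Self-contained demo: build a tiny archive in memory, then list it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("Temp/Demo.txt", b"hello")
buf.seek(0)

for line in show_names(buf):
    print(line)
```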


----------------------------------------------------------------------

Comment By: Shannon Jones (sjones)
Date: 2003-06-15 20:23

Message:
Logged In: YES 
user_id=589306

The actual filename from the zipfile is:
filename = 'DOCUME~1\CHRISS~1\LOCALS~1\Temp\Demo.exe\x00\x00scr\x00000000000000000000.txt'

Notice there is a \x00 after Demo.exe. My guess is that
InfoZip stores the filename in a null-terminated string, so
this embedded null character cuts the name off at that
point. Python doesn't care if you have nulls in a string,
so it prints the entire filename.
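
A small sketch of that guess in Python 3: cutting the name at
the first null gives the "unzip -l" name, and dropping the
nulls (which a terminal typically does not display) gives the
name seen from namelist(). The helper c_string_view is a
hypothetical name, and this only models how a null-terminated
string would behave; it is not taken from InfoZip's source.

```python
# The filename quoted above: two nulls after "Demo.exe", one after "scr",
# then 18 zero digits and ".txt".
raw_name = ('DOCUME~1\\CHRISS~1\\LOCALS~1\\Temp\\Demo.exe'
            + '\x00\x00scr\x00' + '0' * 18 + '.txt')

def c_string_view(name):
    """Return the part of name that a null-terminated string would expose."""
    return name.split('\x00', 1)[0]

def without_nulls(name):
    """Return name as it appears when null characters are not displayed."""
    return name.replace('\x00', '')

print(repr(raw_name))           # full name, nulls shown as \x00 escapes
print(c_string_view(raw_name))  # name truncated at the first null
print(without_nulls(raw_name))  # name with the nulls made invisible
```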

You can see the zip file format description at
ftp://ftp.info-zip.org/pub/infozip/doc/appnote-981119-iz.zip

The format does say:
      2)  String fields are not null terminated, since the
          length is given explicitly.

But it doesn't really say if strings are allowed to have
nulls in them.

So does Python or InfoZip get this right?


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-06-15 20:19

Message:
Logged In: YES 
user_id=6380

That almost sounds like an intentional inconsistency. Could
it be that the central directory has one name but the local
header has a different one? Or that there's a null byte in
the filename so that the filename length is inconsistent?
The front of the file looks like this according to od -c:

0000000   P   K 003 004  \n  \0  \0  \0  \0  \0   *   Š   :   .   c   Ì
0000020  \v   g  \0   ®  \0  \0  \0   ®  \0  \0   D  \0  \0  \0   D   O
0000040   C   U   M   E   ~   1   \   C   H   R   I   S   S   ~   1   \
0000060   L   O   C   A   L   S   ~   1   \   T   e   m   p   \   D   e
0000100   m   o   .   e   x   e  \0  \0   s   c   r  \0   0   0   0   0
0000120   0   0   0   0   0   0   0   0   0   0   0   0   0   0   .   t
0000140   x   t   M   Z 220  \0 003  \0  \0  \0 004  \0  \0  \0   ÿ   ÿ
0000160  \0  \0   ž  \0  \0  \0  \0  \0  \0  \0   @  \0  \0  \0  \0  \0
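
A rough sketch (Python 3) of decoding the fixed 30-byte local
file header from the dump above with struct, to check what the
filename length field says. Assumptions: the field layout
follows the local file header in the ZIP appnote; the non-ASCII
characters in the dump are ISO-8859-15 renderings of the raw
bytes (Š = 0xA6, Ì = 0xCC, ® = 0xAE), which is a guess that
happens to agree with the date, time and size that "unzip -l"
reports. All variable names are illustrative.

```python
import struct

# Fixed part of the local file header, transcribed from the dump.
# ASSUMPTION: non-ASCII dump characters decoded as ISO-8859-15.
header = bytes([
    0x50, 0x4B, 0x03, 0x04,   # signature "PK\003\004"
    0x0A, 0x00,               # version needed to extract
    0x00, 0x00,               # general purpose flags
    0x00, 0x00,               # compression method (0 = stored)
    0x2A, 0xA6,               # last-modified time (DOS format)
    0x3A, 0x2E,               # last-modified date (DOS format)
    0x63, 0xCC, 0x0B, 0x67,   # CRC-32
    0x00, 0xAE, 0x00, 0x00,   # compressed size
    0x00, 0xAE, 0x00, 0x00,   # uncompressed size
    0x44, 0x00,               # filename length
    0x00, 0x00,               # extra field length
])

# Bytes that follow the header in the dump, up to the "M Z" of the payload.
name = (b'DOCUME~1\\CHRISS~1\\LOCALS~1\\Temp\\Demo.exe'
        + b'\x00\x00scr\x00' + b'0' * 18 + b'.txt')

(sig, version, flags, method, dos_time, dos_date,
 crc, csize, usize, name_len, extra_len) = struct.unpack('<4s5H3L2H', header)

# DOS date: bits 15-9 year since 1980, bits 8-5 month, bits 4-0 day.
year, month, day = 1980 + (dos_date >> 9), (dos_date >> 5) & 0x0F, dos_date & 0x1F
# DOS time: bits 15-11 hour, bits 10-5 minute, bits 4-0 seconds / 2.
hour, minute = dos_time >> 11, (dos_time >> 5) & 0x3F

print("name length field:", name_len, "bytes after header:", len(name))
print("first null at index:", name.find(b'\x00'))
print("uncompressed size:", usize)
print("modified: %04d-%02d-%02d %02d:%02d" % (year, month, day, hour, minute))
```

If the bytes are read correctly, the length field covers the
whole name, embedded nulls included, and the first null sits
right after Demo.exe. That would point to a null byte inside
the name, not to a length that disagrees with the stored
name.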

