[Patches] [ python-Patches-651082 ] tarfile module implementation

noreply@sourceforge.net noreply@sourceforge.net
Wed, 25 Dec 2002 02:30:50 -0800


Patches item #651082, was opened at 2002-12-09 21:51
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=651082&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Lars Gustäbel (gustaebel)
Assigned to: Nobody/Anonymous (nobody)
Summary: tarfile module implementation

Initial Comment:
The tarfile module provides a comprehensive interface
to tar archive files with transparent gzip and bzip2
compression. The attachment includes tarfile.py itself,
the documentation and the tests.

----------------------------------------------------------------------

>Comment By: Lars Gustäbel (gustaebel)
Date: 2002-12-25 11:30

Message:
Logged In: YES 
user_id=642936

> However, I don't like that testtar.tar has to be in
> the local directory.  It would probably be better to allow
> this file to be in the Lib/test directory.

Ok, I fixed test_tarfile.py.

> Has there
> been any testing on these platforms?  In particular, make a
> tar file with symlinks, block devices, different users and
> try to unpack on win/mac.

tarfile is intended to be platform independent. It is known to
work on Unix (Linux, FreeBSD, HP-UX) and Windows. Special files
(like links, fifos and devices) are extracted depending on which
functions are present in the os module.  On Windows for example
where there are no os.symlink() and os.link() functions, links
are extracted as copies of the files they're pointing at. Device
files are simply not extracted and a warning is shown.
I have no experience with how things work on MacOS, but I suppose
that it is similar to Windows (no links, no special devices
etc.). I've never had complaints from Mac users in the past, so
I guess everything works fine ;-)
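
To illustrate the fallback described above, here is a minimal sketch.
It is not the code from tarfile.py; the function name extract_link()
is made up for this example, and it assumes that both paths are
absolute. It creates a symlink where the platform allows it and
otherwise copies the file the link points at.

```python
import os
import shutil


def extract_link(target, linkpath):
    """Create linkpath as a symlink to target, or fall back to a copy.

    Hypothetical helper for illustration only. Returns "symlink" or
    "copy" so the caller can see which path was taken.
    """
    if hasattr(os, "symlink"):
        try:
            os.symlink(target, linkpath)
            return "symlink"
        except (OSError, NotImplementedError):
            # The function exists but the platform or the current
            # privileges do not allow creating a symlink.
            pass
    # No usable os.symlink(): extract the link as a copy of its target.
    shutil.copyfile(target, linkpath)
    return "copy"
```

Either way, opening linkpath afterwards yields the contents of the
target file.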

> I noticed sometimes you raise an exception when self.closed,
> other times just return.  Do you need the self.closed flag?
> Could you use the underlying file-objects behaviour, ie,
> assume that the file is open?

self.closed is used for write-mode. A valid tar archive must end
with two zero blocks. So, if TarFile.close() is called, these
two blocks are written out and access to the TarFile is
disallowed. I didn't want to use the status of the underlying
file-object for this because users should decide for themselves
when to close it; they may want to append more data to it after
closing the TarFile. However, this applies only to the case where
a user passes their own file-object to the TarFile constructor.
If TarFile itself creates a file-object, it is
closed when TarFile.close() is called.
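
The behaviour described above could be sketched roughly like this.
ArchiveWriter is a hypothetical stand-in, not the TarFile class
itself; the only facts taken from the description are the two
trailing zero blocks, the closed flag, and the rule that a
file-object supplied by the user is left open.

```python
import io

BLOCKSIZE = 512  # tar archives are organised in 512-byte blocks


class ArchiveWriter:
    """Hypothetical stand-in for a write-mode TarFile."""

    def __init__(self, fileobj=None, name=None):
        if fileobj is None:
            # We opened the file ourselves, so we also close it.
            fileobj = open(name, "wb")
            self._owns_fileobj = True
        else:
            self._owns_fileobj = False
        self.fileobj = fileobj
        self.closed = False

    def write_block(self, data):
        if self.closed:
            raise ValueError("archive is closed")
        # Pad a short chunk with zero bytes up to a full block.
        self.fileobj.write(data.ljust(BLOCKSIZE, b"\0"))

    def close(self):
        if self.closed:
            return
        # A valid archive ends with two zero blocks.
        self.fileobj.write(b"\0" * (BLOCKSIZE * 2))
        self.closed = True
        if self._owns_fileobj:
            self.fileobj.close()


buf = io.BytesIO()
writer = ArchiveWriter(fileobj=buf)
writer.write_block(b"payload")
writer.close()
buf.write(b"trailing data")  # the caller's file-object is still open
```

After close(), further writes through the writer are refused, but
the caller can keep using the file-object that was passed in.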

> Does the normpath/lambda code work on the mac/windows?

Pathnames in tar archives should contain forward slashes as
separators.  The module-level normpath() is used on platforms
that have a different path separator to convert pathnames when
they're added.
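
A small sketch of that conversion, under the assumption that it is
enough to swap the separator and normalise the result. The name
to_archive_name() and its sep parameter are invented for this
example; the module's own normpath() may differ in detail.

```python
import os
import posixpath


def to_archive_name(path, sep=os.sep):
    """Convert a platform path into a forward-slash member name.

    Hypothetical helper; sep is a parameter only so that the
    behaviour of other platforms can be tried out anywhere.
    """
    if sep != "/":
        path = path.replace(sep, "/")
    # posixpath.normpath() collapses "a/../b" and duplicate slashes.
    return posixpath.normpath(path)


print(to_archive_name("docs\\api\\..\\index.txt", sep="\\"))
# prints: docs/index.txt
```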

> In ExFileObject._readsparse(), is the loop likely to be
> executed much?  If there is a performance concern here,
> suggest you use a list.

Good point.

> Why does TarFile.list() print, rather than return a list of
> strings or something else?

TarFile.list() is a bit of a leftover from the time when
tarfile.py was compatible with zipfile.py. It prints a verbose
listing similar to 'tar -tv'. I think it is also a good
example of how to deal with information from TarInfo objects. To
get a list of archive members use TarFile.getmembers().
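
For example, with the tarfile module as it ships in current Python
versions (the with-statement and the bytes literals are newer than
this patch, so take this as a sketch rather than code for the
submitted version), the two calls can be compared on a small
in-memory archive:

```python
import io
import tarfile

# Build a tiny archive in memory so the example needs no files.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    data = b"hello"
    info = tarfile.TarInfo(name="greeting.txt")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    # getmembers() returns TarInfo objects to work with ...
    members = [(m.name, m.size) for m in tar.getmembers()]
    # ... while list() only prints a 'tar -tv' style listing.
    tar.list(verbose=True)
```

Here members ends up as a list of (name, size) tuples that the
program can process further, which list() does not provide.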

> Should TarFile._extract_member() use normpath() instead of
> os.path.normpath()?

As stated above, no. normpath is only used for addition of
files.

> os.mknod() & os.makedev() are not guaranteed to exist. 
> Should there be a try/except AttributeError in
> TarFile.makedev()?
> Same comment applies to os.geteuid() in chown.

In tarfile.py, I checked for os.mknod() and assumed that if
os.mknod() is present, os.makedev() is present, too. However, I
changed this now to be a bit more explicit.
Same in chown(), I checked for pwd and, if present, assumed that
os.geteuid() is there too. I made this more explicit, too.
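
One way such explicit checks might look is sketched below. The
helper names can_make_devices() and can_chown() are invented for
this example and are not taken from tarfile.py; the point is only
that every function is tested for by itself instead of being
inferred from another one.

```python
import os


def can_make_devices():
    """True only if both functions needed for device files exist."""
    return hasattr(os, "mknod") and hasattr(os, "makedev")


def can_chown():
    """True only if pwd is importable and the os functions exist."""
    try:
        import pwd  # noqa: F401  (only checking availability)
    except ImportError:
        return False
    return hasattr(os, "geteuid") and hasattr(os, "chown")


if not can_make_devices():
    print("device files will be skipped on this platform")
```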

> Do the pwd and grp modules work on windows/mac?

Not AFAIK.

> Now on to the pychecker warnings :-)
>
> Lib/tarfile.py:950: Parameter (compresslevel) not used

This was indeed a bug :-(

> Lib/tarfile.py:1409: Parameter (tarinfo) not used
> Lib/tarfile.py:1435: Parameter (tarinfo) not used

Both are OK. They could be used when the methods are overridden
in a subclass.

> Lib/tarfile.py:1878: Module (time) re-imported

fixed.

> Lib/tarfile.py:1898: Parameter (compress_type) not used

This is there for compatibility with the zipfile module's API.

> One additional note, Guido asked on python-dev on Dec 12:
>
>   Is there any chance that this can be contributed to the
>   PSF under the standard PSF license?
>
> I don't remember seeing an answer.

Maybe I could have stated this more clearly, but I agreed to
using the PSF license. I sent the letter of agreement to the
PSF a week ago; it should arrive there soon.


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-23 01:24

Message:
Logged In: YES 
user_id=33168

One additional note, Guido asked on python-dev on Dec 12:

  Is there any chance that this can be contributed to the
PSF under the standard PSF license?

I don't remember seeing an answer.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-23 01:14

Message:
Logged In: YES 
user_id=33168

The code looks pretty good.  I ran the test on Linux and it
passed.  However, I don't like that testtar.tar has to be in
the local directory.  It would probably be better to allow
this file to be in the Lib/test directory.

I think this would be a good addition to the stdlib.  My
biggest concern is portability issues to win/mac.  Has there
been any testing on these platforms?  In particular, make a
tar file with symlinks, block devices, different users and
try to unpack on win/mac.

In TarFile.open(), where you do:
      filemode, comptype = mode.split(":")
You should pass 1 as the second argument (maxsplit) to split:
      filemode, comptype = mode.split(":", 1)

Same deal further down when splitting on '|'
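
To show why the extra argument matters, here is a small sketch.
parse_mode() is a made-up helper, not the code in TarFile.open(),
and the mode string "r:gz:extra" is an invented input.

```python
def parse_mode(mode):
    """Split a mode such as "r:gz" into (filemode, comptype).

    Hypothetical helper. With maxsplit=1 the result always has at
    most two parts, so the tuple unpacking cannot fail because of
    extra colons.
    """
    filemode, comptype = mode.split(":", 1)
    return filemode, comptype


print(parse_mode("r:gz"))        # ('r', 'gz')
print(parse_mode("r:gz:extra"))  # ('r', 'gz:extra')

# Without maxsplit, the same input yields three parts and the
# two-name unpacking raises ValueError.
try:
    filemode, comptype = "r:gz:extra".split(":")
except ValueError:
    print("too many values to unpack")
```

A mode without any colon still raises ValueError in both variants,
so that case has to be handled separately.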

I noticed sometimes you raise an exception when self.closed,
other times just return.  Do you need the self.closed flag?
Could you use the underlying file-objects behaviour, ie,
assume that the file is open?

Does the normpath/lambda code work on the mac/windows?

In ExFileObject._readsparse(), is the loop likely to be
executed much?  If there is a performance concern here,
suggest you use a list.  Then outside the loop, do a data =
''.join(list) ; size -= len(data).  (You could move the size
-= len(buf) outside the loop, if you get the len(data).)
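
Roughly what I mean, as a generic sketch and not the actual
_readsparse() code: read_sections() and its (offset, length) pairs
are invented for this example, and it uses bytes as in current
Python versions.

```python
import io


def read_sections(fileobj, sections):
    """Read several (offset, length) sections and join them once.

    Hypothetical helper. The pieces are collected in a list and
    joined after the loop instead of growing a string with +=
    on every iteration.
    """
    parts = []
    for offset, length in sections:
        fileobj.seek(offset)
        parts.append(fileobj.read(length))
    return b"".join(parts)


size = 10
data = read_sections(io.BytesIO(b"abcdefghij"), [(0, 3), (5, 2)])
size -= len(data)  # adjusted once, outside the loop
```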

Why does TarFile.list() print, rather than return a list of
strings or something else?

Should TarFile._extract_member() use normpath() instead of
os.path.normpath()?

os.mknod() & os.makedev() are not guaranteed to exist. 
Should there be a try/except AttributeError in
TarFile.makedev()?
Same comment applies to os.geteuid() in chown.

Do the pwd and grp modules work on windows/mac?

Now on to the pychecker warnings :-)

Lib/tarfile.py:950: Parameter (compresslevel) not used
Lib/tarfile.py:1409: Parameter (tarinfo) not used
Lib/tarfile.py:1435: Parameter (tarinfo) not used
Lib/tarfile.py:1878: Module (time) re-imported
Lib/tarfile.py:1898: Parameter (compress_type) not used

The last one, Parameter (compress_type) not used, may be OK;
I'm not sure about the others.

----------------------------------------------------------------------

Comment By: Lars Gustäbel (gustaebel)
Date: 2002-12-17 15:41

Message:
Logged In: YES 
user_id=642936

I had to fix a small bug in TarFile.bz2open.

----------------------------------------------------------------------

Comment By: Lars Gustäbel (gustaebel)
Date: 2002-12-09 21:54

Message:
Logged In: YES 
user_id=642936

There's no uploaded file!  You have to check the
checkbox labeled "Check to Upload & Attach File"
when you upload a file.

Please try again.

(This is a SourceForge annoyance that we can do
nothing about. :-( )

----------------------------------------------------------------------
