[issue14013] tarfile should expose supported formats
New submission from Éric Araujo <merwok@netwok.org>:

shutil contains high-level functions to create a zipfile or a tarball. When a new format is added to the tarfile module, then shutil needs to be updated manually. If tarfile exposed the names of the compressors it supports, then shutil could just automatically support everything that tarfile supports instead of having to re-do import dances for optional modules (bz2, lzma, zlib) and also duplicate formats in its doc. This may also be useful for other code wanting to do some introspection.

Attached patch implements tarfile.formats, a list of strings (I thought about using a frozenset but then followed the precedent set by the 3.3 crypt module). Tests and docs not updated, I wanted to get Lars’ approval on the principle first.

One could argue that this is not needed: compression modules are not added often; updating shutil after updating tarfile is not hard; it is not that useful to have access to the list of supported formats.

----------
assignee: docs@python
components: Documentation, Library (Lib)
messages: 153350
nosy: docs@python, eric.araujo, lars.gustaebel, nadeem.vawda
priority: normal
severity: normal
stage: patch review
status: open
title: tarfile should expose supported formats
type: enhancement
versions: Python 3.3

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue14013>
_______________________________________
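For readers who want a concrete picture of the proposal, here is a minimal sketch of how such a list could be computed. It is not the attached patch: the function name `detect_compression_formats` and the `_CANDIDATES` table are hypothetical, and the pairing of each tarfile compression name with an optional module is an assumption based on the modules named in the submission (bz2, lzma, zlib).

```python
import importlib

# Hypothetical table: tarfile compression name -> optional module it relies on.
# The pairing is an assumption for illustration, not taken from the patch.
_CANDIDATES = (("gz", "zlib"), ("bz2", "bz2"), ("xz", "lzma"))


def detect_compression_formats(candidates=_CANDIDATES):
    """Return the compression names whose backing module can be imported."""
    found = []
    for name, module in candidates:
        try:
            importlib.import_module(module)
        except ImportError:
            # The optional module is missing from this Python build.
            continue
        found.append(name)
    return found
```

With something like this in one place, a caller such as shutil would only need to read the resulting list instead of repeating the try/import logic itself.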
Changes by Éric Araujo <merwok@netwok.org>:

----------
keywords: +patch
Added file: http://bugs.python.org/file24521/add-tarfile.formats.diff
Lars Gustäbel <lars@gustaebel.de> added the comment:

I think this is a reasonable proposal. I think it is good style to let tarfile figure out which supported compression methods are available instead of shutil or the user. So far I have no objections.

Following 3.3's crypt module, I think the name `methods' is superior to `formats' (maybe `compression_methods' is even better). Also, crypt's concept of a sorted list from stronger to weaker could also make sense here: ["xz", "bz2", "gz"]. Why not?

----------
Éric Araujo <merwok@netwok.org> added the comment: Thanks for the quick reply.
> I think it is good style to let tarfile figure out which supported compression methods are available instead of shutil or the user.

Note that shutil will not be wholly transparent when I’m done with the refactoring, as it will be able to translate 'xztar', 'bztar' and 'gztar' to tarfile mode strings, but will need to have a special case to morph 'bztar' to 'bz2'. It will be a small ugliness.
(There will also be ugliness in packaging: even if I make it transparently support all formats that shutil supports, I’ll need to have a bit of duplication because packaging has a preferred format by platform. Well.)
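The translation from shutil archive names to tarfile mode strings described above can be sketched roughly as follows. This is an illustration only: `tar_mode_for` is a hypothetical helper, not code from shutil or from the patch, and it assumes write modes of the form 'w:gz', 'w:bz2' and 'w:xz'.

```python
# Hypothetical special cases: shutil archive prefix -> tarfile compression name.
_SPECIAL_CASES = {"bz": "bz2"}


def tar_mode_for(archive_format):
    """Map a shutil-style name such as 'gztar' to a tarfile write mode."""
    if not archive_format.endswith("tar"):
        raise ValueError("not a tar-based format: %r" % (archive_format,))
    prefix = archive_format[:-len("tar")]
    if not prefix:
        # Plain 'tar' means an uncompressed archive.
        return "w"
    # 'bztar' is the one name that does not map directly to its compressor.
    compression = _SPECIAL_CASES.get(prefix, prefix)
    return "w:" + compression
```

The single entry in `_SPECIAL_CASES` is the "small ugliness" in question: 'gztar' and 'xztar' map by stripping the suffix, while 'bztar' needs an explicit exception.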
> Following 3.3's crypt module, I think the name `methods' is superior to `formats' (maybe `compression_methods' is even better).

Note that crypt’s methods really are instances of something called Method. hashlib has algorithms_guaranteed and algorithms_available since 3.2 and shutil uses get_archive_formats and get_unpack_formats. I went for tarfile.compression_formats.
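For comparison, the existing introspection hooks named above can be queried like this. The helper `naming_precedents` is only a hypothetical wrapper for illustration; the hashlib attributes and shutil functions it calls are the ones mentioned in the message.

```python
import hashlib
import shutil


def naming_precedents():
    """Collect the names exposed by the introspection hooks mentioned above."""
    return {
        "hashlib.algorithms_guaranteed": sorted(hashlib.algorithms_guaranteed),
        "hashlib.algorithms_available": sorted(hashlib.algorithms_available),
        # Each entry is a (name, description) tuple; keep only the name.
        "shutil.get_archive_formats": [
            name for name, _description in shutil.get_archive_formats()
        ],
        # Each entry is a (name, extensions, description) tuple.
        "shutil.get_unpack_formats": [
            entry[0] for entry in shutil.get_unpack_formats()
        ],
    }
```

Both modules use "algorithms" or "formats" in the name rather than "methods", which is consistent with the choice of compression_formats.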
> Also, crypt's concept of a sorted list from stronger to weaker could also make sense here:

Sure. In my first patch I put gz first as it should be universally available, and then put xz before bz2 as I think bz2 is quickly losing ground to xz (even GNU and Debian are switching to xz for their archives). The attached patch follows your idea.
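As a rough illustration of the ordering idea, the sketch below sorts whatever formats are available into the stronger-to-weaker order Lars suggested. `PREFERRED_ORDER` and `order_formats` are hypothetical names, not taken from the attached patch.

```python
# Stronger-to-weaker order suggested in the thread.
PREFERRED_ORDER = ("xz", "bz2", "gz")


def order_formats(available, preferred=PREFERRED_ORDER):
    """Return the available formats, strongest first, dropping unknown names."""
    return [name for name in preferred if name in available]
```

A caller that just wants the best available compressor could then take the first element, provided the list is not empty.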
BTW I will gladly wait for commits related to the other bugs (misc bugs and misc doc edits) and refresh my patch then.

----------
Added file: http://bugs.python.org/file24573/add-tarfile.compression_formats.diff
Berker Peksag added the comment:

I've updated Éric's patch. Minor changes:

- Updated versionadded directive
- A couple of cosmetic changes (e.g. removed brackets in the list comprehension)

----------
assignee: docs@python ->
components: -Documentation
nosy: +berker.peksag
versions: +Python 3.5 -Python 3.3
Added file: http://bugs.python.org/file35742/issue14013.diff
participants (3)
- Berker Peksag
- Lars Gustäbel
- Éric Araujo