[issue8784] tarfile/Windows: Don't use mbcs as the default encoding

Lars Gustäbel report at bugs.python.org
Thu Jun 10 20:52:01 CEST 2010


Lars Gustäbel <lars at gustaebel.de> added the comment:

Maybe I'm going out on a limb here, but I think we should again consider what tarfile users on Windows(!) actually use it for, and under which circumstances. The following list is probably not exhaustive, but IMHO it covers 90% of the cases:

1. Download tar archives from a webpage (when no zip is supplied) for viewing or extracting.
2. Create backups for personal use.
3. Create source archives from a project for Unix users who hate zip files.

I am convinced that the tarfile module is not very popular on Windows, for a simple reason: tar archives are not. Windows users will always prefer zip archives, and hence the zipfile module, because that is what they are familiar with.

The point I am trying to make is that, first, we should not choose a default encoding based on what works best with WinRAR, 7-Zip and the like, because they all behave very differently, which makes that impossible. Second, we must not overemphasize the encoding issue to the point where portability is in danger. In almost all real-life cases there are no encoding issues. In my whole career maintaining tarfile, I cannot remember a single tar archive from an external source that contained special (non-ASCII) characters. The only tar archives that contain special characters, in my experience, are backups. But these backups are created and later restored on one and the same system. Again, no encoding issues.

Long story short, I still vote for utf-8, because it enables Windows users to create backups without losing special characters, and because it is ASCII-"compatible", so it should be able to read 99% of the archives that you get from the internet.
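
To illustrate the ASCII-compatibility argument, here is a minimal sketch using the tarfile API as it exists in current Python 3. The helper name roundtrip, the sample file names and the choice of cp1252 as a stand-in for a Windows code page are assumptions made for this example, not part of the proposal. GNU format is selected explicitly because pax archives store names as UTF-8 regardless of the encoding argument.

```python
import io
import tarfile


def roundtrip(name, write_encoding, read_encoding):
    """Write one member named `name` into an in-memory GNU tar archive
    using `write_encoding`, then list the member names as decoded with
    `read_encoding`. (Hypothetical helper for illustration only.)"""
    buf = io.BytesIO()
    data = b"hello"
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT,
                      encoding=write_encoding) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r", encoding=read_encoding) as tar:
        return tar.getnames()


# A backup written and restored with utf-8 keeps its special characters.
backup_names = roundtrip("Gustäbel.txt", "utf-8", "utf-8")

# An ASCII-only name written with a legacy code page has the same bytes
# under utf-8, so it decodes unchanged.
ascii_names = roundtrip("README.txt", "cp1252", "utf-8")

# A non-ASCII name written with cp1252 is not valid utf-8; with the
# default error handler it does not come back as the original string.
mismatch_names = roundtrip("Gustäbel.txt", "cp1252", "utf-8")
```

Only the last case, a non-ASCII name crossing between systems with different encodings, shows a mismatch, which is the situation described above as rare in practice.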

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue8784>
_______________________________________

