[issue8784] tarfile/Windows: Don't use mbcs as the default encoding

STINNER Victor report at bugs.python.org
Thu Jun 10 23:14:11 CEST 2010


STINNER Victor <victor.stinner at haypocalc.com> added the comment:

> 2. Create backups for personal use.

What? Really? I'm sure that all Windows users will use ZIP or maybe RAR, but never the geek choice.

> 1. Download tar archives from a webpage (when no zip is supplied) for viewing or extracting.

Tarballs come from the UNIX/BSD world, which has used UTF-8 by default for some years now.

> 3. Create source archives from a project for unix users who hate zipfiles.

In this case, UTF-8 is also better.
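To illustrate why the encoding matters here, the following is a minimal sketch (not part of either patch) that writes a member with a non-ASCII name into an in-memory GNU-format archive and reads it back with a chosen encoding. The helper name roundtrip_name and the sample file name are made up for the example, and cp1252 merely stands in for a typical Windows ANSI code page since mbcs itself only exists on Windows.

```python
import io
import tarfile


def roundtrip_name(name, write_encoding, read_encoding):
    """Write one member called `name`, then return the name seen on reading.

    Hypothetical helper for illustration only. GNU_FORMAT is used so the
    name is stored in the header bytes using `write_encoding` (the pax
    format would store it as UTF-8 in an extended header instead).
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT,
                      encoding=write_encoding) as tar:
        data = b"hello"
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r", encoding=read_encoding) as tar:
        return tar.getnames()[0]


# Archive produced with UTF-8 names, read with the matching encoding.
same = roundtrip_name("caf\u00e9.txt", "utf-8", "utf-8")

# Same archive read with a Windows ANSI code page: the name is garbled.
garbled = roundtrip_name("caf\u00e9.txt", "utf-8", "cp1252")
```

With matching encodings the name survives unchanged; with cp1252 on the reading side the two UTF-8 bytes of "é" are decoded as two separate characters.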

--

Did I mention that 7-zip is only able to create TAR archives? I mean uncompressed archives. Who will use that? (not me ;-))

WinRAR is unable to create tarballs, not even (uncompressed) .tar archives.

--

If the maintainer of the tarfile module agrees that UTF-8 is the best choice, I will commit my initial patch. I would prefer to commit tarfile_windows_utf8.patch because it changes 4 lines, whereas tarfile_mbcs_errors.patch changes... much more code :-)

tarfile_windows_utf8.patch is not complete: the documentation should also be updated:

.. data:: ENCODING

   The default character encoding i.e. the value from either
   :func:`sys.getfilesystemencoding` or :func:`sys.getdefaultencoding`.

=>

.. data:: ENCODING

   The default character encoding: ``'utf-8'`` on Windows,
   :func:`sys.getfilesystemencoding` otherwise.
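
The rule described by the proposed documentation text can be sketched as below. This is only an illustration of the wording above, not the contents of tarfile_windows_utf8.patch; the function name proposed_default_encoding and its os_name parameter are hypothetical, and checking os.name == "nt" is an assumption about how "on Windows" would be tested.

```python
import os
import sys


def proposed_default_encoding(os_name=os.name):
    """Return the default encoding as described in the proposed doc text.

    Hypothetical helper: 'utf-8' on Windows, the filesystem encoding
    everywhere else. `os_name` is a parameter only so both branches can
    be exercised on any platform.
    """
    if os_name == "nt":
        return "utf-8"
    return sys.getfilesystemencoding()


windows_default = proposed_default_encoding("nt")
other_default = proposed_default_encoding("posix")
```

Code that needs a specific encoding regardless of the default can still pass encoding=... to tarfile.open().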

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue8784>
_______________________________________

