[New-bugs-announce] [issue2562] Cannot use non-ascii letters in distutils if setuptools is used.

Tarek Ziadé report at bugs.python.org
Sun Apr 6 11:47:19 CEST 2008


New submission from Tarek Ziadé <ziade.tarek at gmail.com>:

If I try to put my name in the Author field as a byte string,
it will break, because distutils assumes that the fields are
ASCII-encoded strings: it decodes them into unicode, then
encodes them to UTF-8 to send the data.

See distutils.command.register.post_to_server:

value = unicode(value).encode("utf-8")
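
To make the failure concrete, here is a minimal sketch in Python 3 terms
of what that line does to a UTF-8 byte string. The helper py2_unicode is
hypothetical: it only imitates Python 2's unicode(), which decoded byte
strings with the default 'ascii' codec.

```python
# A Python 2 'str' holding UTF-8 bytes, as typed in a setup.py saved as UTF-8.
author = "Tarek Ziad\u00e9".encode("utf-8")


def py2_unicode(value):
    """Hypothetical stand-in for Python 2's unicode(): ASCII-decode byte strings."""
    if isinstance(value, bytes):
        return value.decode("ascii")
    return value


try:
    # Equivalent of: value = unicode(value).encode("utf-8")
    py2_unicode(author).encode("utf-8")
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    print("decode failed:", exc.reason)
```

The decode step fails before the UTF-8 encode is ever reached, which is
why a non-ASCII byte string cannot get through.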


One way to avoid this error is to provide unicode for all fields,
but that fails further on if setuptools is used, because
that package assumes that the fields *are* strings::

self.run_command('egg_info')
...
distutils/dist.py", line 1047, in write_pkg_info
    pkg_info.write('Author: %s\n' % self.get_contact() )
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
position 18: ordinal not in range(128)
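
The traceback above can be approximated with a short Python 3 sketch.
The helper write_like_py2 is hypothetical: it imitates a Python 2 file
object, which implicitly encoded unicode text with the 'ascii' codec on
write. The contact value is assumed to be the author name alone.

```python
import io

contact = "Tarek Ziad\u00e9"          # what get_contact() is assumed to return
line = "Author: %s\n" % contact


def write_like_py2(stream, text):
    """Hypothetical helper: implicit ASCII encode, as a Python 2 file did."""
    stream.write(text.encode("ascii"))


pkg_info = io.BytesIO()
try:
    write_like_py2(pkg_info, line)
    error = None
except UnicodeEncodeError as exc:
    error = exc                        # keep a reference for inspection

# Encoding explicitly before writing avoids the implicit ASCII step.
pkg_info.write(line.encode("utf-8"))
```

With this input the offending character sits at index 18 of the line,
the same position the traceback reports.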

So I guess distutils shouldn't assume that it receives ASCII strings
and make a raw unicode() call; it should instead assume that
it receives unicode fields only.


Since many packages out there use strings, I have left a unicode()
call in my patch, together with a warning. 
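
The attached patch is not reproduced here. As a rough illustration of
the idea only, a sketch in Python 3 terms might look like the following.
The name to_text, the warning text, and the warning class are all
assumptions, not taken from unicode.patch; the default 'ascii' codec
mirrors what a bare unicode() call did.

```python
import warnings


def to_text(value, encoding="ascii"):
    """Hypothetical helper: return text, warning when given a byte string."""
    if isinstance(value, bytes):
        warnings.warn(
            "metadata fields should be unicode; got a byte string",
            DeprecationWarning,
            stacklevel=2,
        )
        # Mirrors the retained unicode() call: still fails on non-ASCII bytes.
        return value.decode(encoding)
    return value


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    plain = to_text(b"Tarek")              # byte string: converted, with a warning
    name = to_text("Tarek Ziad\u00e9")     # already text: passed through untouched
```

Existing packages that pass ASCII byte strings keep working but are
told to switch, while unicode fields go through unchanged.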

A test is provided.

----------
components: Distutils
files: unicode.patch
keywords: patch
messages: 65028
nosy: tarek
severity: normal
status: open
title: Cannot use non-ascii letters in distutils if setuptools is used.
type: crash
versions: Python 2.6
Added file: http://bugs.python.org/file9960/unicode.patch

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2562>
__________________________________

