I would like to summarize here http://bugs.python.org/issue2562.
Currently, distutils does not allow the use of Unicode for some metadata fields that should be able to use it.
These fields are:
- author
- maintainer
- description
- long_description
For instance, if you use u"Barnabé" in the author field, DistributionMetadata.write_pkg_file will fail when it tries to serialize the information into a file.
The problem is that the current implementation assumes that all fields are ASCII strings.
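To make the failure concrete, here is a minimal sketch of what the ASCII assumption amounts to. It does not call distutils at all; encode_as_ascii is a hypothetical helper that only mimics a writer treating every field as ASCII.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for a writer that assumes ASCII-only fields.
# This is NOT the distutils code, only an illustration of the assumption.

def encode_as_ascii(value):
    """Encode a Unicode metadata value the way an ASCII-only writer would."""
    return value.encode("ascii")

author = u"Barnabé"

try:
    encode_as_ascii(author)
    failed = False
except UnicodeEncodeError:
    # The accented character has no ASCII representation.
    failed = True
```

A pure ASCII value such as u"Tarek" goes through unchanged, which is why the problem only shows up for people with non-ASCII characters in their metadata.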
This won't be a problem in Python 3k, of course, but currently concerns the 2.x series.
One possible solution would be to move these fields to Unicode for the 2.6 series.
In the meantime, many people are using the str type for those fields, so as Martin mentioned, it would be better for backward compatibility to support both plain strings and Unicode.
Other fields should imho be left in ASCII, since a URL for instance has to be ASCII. But maybe it would be better, as Martin mentioned, to use Unicode for all fields.
In any case, if we do use Unicode for some fields, we will need to provide the codec to be used to serialize the data in a file.
My proposal here would be to add an 'encoding' field in the metadata that defaults to 'utf8', and that would let people explicitly indicate the encoding.
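As a rough sketch of this idea (not the actual patch), the writer could look up an 'encoding' entry and use it when serializing. The function name write_fields, the dict-based metadata, and the simplified "name: value" line layout are all assumptions made for illustration; the real PKG-INFO format and the distutils API differ.

```python
# -*- coding: utf-8 -*-
import io

# Fields that the proposal would allow to hold Unicode.
UNICODE_FIELDS = ("author", "maintainer", "description", "long_description")

def write_fields(fileobj, metadata):
    """Write the fields to a binary file object using metadata['encoding'].

    Hypothetical helper: accepts both byte strings (assumed ASCII, for
    backward compatibility) and Unicode strings.
    """
    encoding = metadata.get("encoding", "utf8")  # proposed default
    for name in UNICODE_FIELDS:
        value = metadata.get(name, u"UNKNOWN")
        if isinstance(value, bytes):
            # Plain byte string: decode it, assuming it is ASCII.
            value = value.decode("ascii")
        line = u"%s: %s\n" % (name, value)
        fileobj.write(line.encode(encoding))

# Default encoding (utf8), mixing a Unicode and a byte-string value.
buf = io.BytesIO()
write_fields(buf, {"author": u"Barnabé", "maintainer": b"Martin"})
data = buf.getvalue()

# Explicit encoding chosen by the package author.
buf_latin1 = io.BytesIO()
write_fields(buf_latin1, {"encoding": "latin-1", "author": u"Barnabé"})
data_latin1 = buf_latin1.getvalue()
```

Decoding byte strings as ASCII keeps existing setup.py files working, while anything non-ASCII has to be passed as Unicode so that the declared encoding is the only one involved.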
I have written a quick patch here of a possible implementation, so you can see the problem:
Tarek Ziadé wrote:
I would like to summarize here http://bugs.python.org/issue2562. ...
Just a quick note to say that my patch has been integrated into the Python trunk with the help of MAL, so this bug is fixed for the 2.7 series (it doesn't concern 3.x).
If you have accented letters in your name you may use them now in your setup.py metadata :-)