Hello,
I would like to summarize http://bugs.python.org/issue2562 here.
Currently, distutils does not allow the use of Unicode in
some metadata fields that should be able to use it.
These fields are:
- author
- maintainer
- description
- long_description
For instance, if you use u"Barnabé" in the author field,
DistributionMetadata.write_pkg_file will fail when it
tries to serialize the information into a file.
The problem is that the current implementation assumes
that all fields are ASCII strings.
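Here is a minimal sketch of the failure, written so that it also runs
on a current Python. The explicit encode("ascii") stands in for the
implicit ASCII conversion that 2.x performs when a unicode object is
written to a byte-oriented file. write_field and try_write are
hypothetical helpers for illustration, not distutils code.

```python
import io


def write_field(stream, name, value):
    # Hypothetical helper: writes one metadata header while assuming
    # the value contains only ASCII characters.
    line = "%s: %s\n" % (name, value)
    stream.write(line.encode("ascii"))


def try_write(value):
    # Returns the serialized bytes, or None if the value cannot be
    # represented in ASCII.
    buf = io.BytesIO()
    try:
        write_field(buf, "Author", value)
    except UnicodeEncodeError:
        return None
    return buf.getvalue()


ascii_result = try_write("Barnabe")          # plain ASCII is written
unicode_result = try_write(u"Barnab\u00e9")  # "é" cannot be encoded
```

With the ASCII-only name the header is written, while the accented
name cannot be serialized at all.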
This won't be a problem in Python 3k, of course, but it currently affects the 2.x series.
One possible solution would be to move these fields to Unicode for the 2.6 series.
In the meantime, many people are using the str type for those fields,
so, as Martin mentioned, it would be better to stay backward compatible
and support either plain strings or Unicode.
Other fields should, imho, be left in ASCII, since a URL for instance
has to be ASCII. But maybe it would be better, as Martin mentioned,
to use Unicode for all fields.
In any case, if we do use Unicode for some fields, we will need
to provide the codec to be used to serialize the data in a file.
My proposal here would be to add an 'encoding' field to the metadata
that defaults to 'utf8', and that would let people explicitly
indicate the encoding.
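To make the idea concrete, here is a rough sketch of how such a field
could behave. It is not the patch itself: the Metadata class and its
_encode helper are hypothetical names, only two fields are handled, and
byte strings play the role of the 2.x plain str type so that the
example also runs on a current Python.

```python
import io


class Metadata(object):
    # Hypothetical stand-in for the distutils metadata class.

    def __init__(self, author=None, description=None, encoding="utf8"):
        self.author = author
        self.description = description
        # Codec used to serialize Unicode values; assumed default 'utf8'.
        self.encoding = encoding

    def _encode(self, value):
        # Backward compatibility: byte strings are written unchanged,
        # Unicode values are encoded with the declared codec.
        if value is None:
            value = u"UNKNOWN"
        if isinstance(value, bytes):
            return value
        return value.encode(self.encoding)

    def write_pkg_file(self, stream):
        fields = (("Author", self.author), ("Summary", self.description))
        for name, value in fields:
            stream.write(name.encode("ascii") + b": "
                         + self._encode(value) + b"\n")


buf = io.BytesIO()
Metadata(author=u"Barnab\u00e9", description=b"demo").write_pkg_file(buf)
```

A package that needs another codec would pass it explicitly, for
example Metadata(author=u"Barnab\u00e9", encoding="latin-1").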
I have written a quick patch showing a possible implementation, so
you can see the problem:
http://bugs.python.org/file9967/unicode.metadata.patch
Regards,
Tarek
--
Tarek Ziadé | Association AfPy | www.afpy.org
Blog FR | http://programmation-python.org
Blog EN | http://tarekziade.wordpress.com/