[Python-Dev] Unicode maintainer wanted (was: Encodings)

M.-A. Lemburg mal@lemburg.com
Mon, 10 Jul 2000 12:00:36 +0200

"Fred L. Drake, Jr." wrote:
> M.-A. Lemburg writes:
>  > Sorry, but I'm really surprised now: I've put many hours of
>  > work into this, hacked up encoding support for locale.py,
>  > went through endless discussions, proposed the changeable default
>  > as a compromise to make all parties (ASCII, UTF-8 and Latin-1) happy
>   My recollection was that the purpose of the changeable default was so
> that we could experiment with different defaults more easily, not that
> we'd keep it changeable.

That was the original idea -- no one actually tested this,
though, after the hooks were in place...

Then Guido (or was it someone else?) came up with the idea of
making the default encoding depend on the default locale 
active at Python startup time and it turned out to satisfy
proponents of all three encodings.

Guido gave his OK in private mail, Fredrik was satisfied, a few
people supported the decision in private mail and
the rest made their agreement clear via collective silence.

I then tried to hack up ways of extracting the encoding information
from the locale settings -- not exactly an easy task and
one which involved much research, two or three design
cycles (in public on python-dev) and much refinement of the
locale name mappings.
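
To give a rough idea of the kind of mapping involved, here is a minimal
sketch that pulls the codeset out of a POSIX-style locale name such as
"de_DE.ISO8859-1" and normalizes it through the codec registry. The helper
name encoding_from_locale_name is hypothetical and this is not the code in
locale.py; real locale names are far messier (aliases, missing codesets,
platform quirks), which is where the research and the mapping tables come in.

```python
import codecs

def encoding_from_locale_name(name):
    """Return a normalized codec name for a POSIX-style locale name, or None.

    Hypothetical helper for illustration; not the locale.py implementation.
    """
    # Drop an optional "@modifier" suffix, e.g. "de_DE.ISO8859-15@euro".
    base = name.split("@", 1)[0]
    # The codeset, if any, follows the first dot.
    if "." not in base:
        return None  # e.g. "C" or "de_DE": no encoding information present
    codeset = base.split(".", 1)[1]
    try:
        # Let the codec registry map spelling variants to one canonical name.
        return codecs.lookup(codeset).name
    except LookupError:
        return None  # codeset not known to this Python installation
```

A name without a codeset part (plain "de_DE" or "C") yields None here; a
real implementation would need an alias table to fill in a sensible default
for such names.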

All this took quite a lot of my time, which I felt was worth
investing to make all proponents happy. I, for one, still think that
the fixed UTF-8 encoding was the best choice of all possibilities.
Coding the locale stuff was meant as a signal of compromise on
my part and now the whole idea gets marked as a failed experiment...

>   I'm not an expert on these issues and have tried to stay out of the
> way on the discussion, but the idea of a changeable default just seems
> like we're asking for trouble.
>   I'm sorry that you feel your efforts have been wasted; I don't think
> anyone tried to spend your time without careful consideration, but it
> is important that we all understand what we have gained from this
> experiment.
>   Before you implemented the changeable default, we all had the
> opportunity to presume to know how things would work with it, and
> could bitch or plead about it as we felt appropriate.  Now that it's
> been tried, we have a little more real experience with it, and can
> point to the evidence that's been presented here as we each make our
> arguments on the topic.
>   So I'd have to say that your efforts have *not* been wasted; we now
> have a very good idea of what's involved and whether "it works."  I,
> for one, am glad that the experiment was done (and especially that it
> was done by someone who knows more about this than me!).

Thanks, Fred, but your words won't change my decision to leave
Unicode support to someone else.

I don't support the recent decisions that were made and don't
feel strongly enough about Unicode to take all the blame for things
which don't work the way some people expect them to.

Here's a TODO list for my successor(s):

* change hash value calculation to work on the Py_UNICODE data
  instead of creating a default-encoded cached object (what
  is now .utf8str)

* change .utf8str to hold a default-encoded string instead of
  a UTF-8 string (+ rename .utf8str to .defencstr)

* document the new stuff in locale.py and TeX the codecs.py
  APIs (the docstrings are there -- they just need to be
  converted to the Python TeX style)
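
For the first item, the point is that a hash computed on the character data
itself cannot change when the default encoding does, while a hash computed
on an encoded copy can. The sketch below illustrates this in Python. The
function names, the 64-bit mask and the multiplicative hash are assumptions
chosen for illustration, not the C implementation, and ord() values stand
in for the Py_UNICODE data.

```python
MASK = (1 << 64) - 1  # keep values in a fixed-width range, like a C integer

def _hash_units(units):
    """Multiplicative hash over a sequence of small integers (illustrative)."""
    if not units:
        return 0
    x = units[0] << 7
    for u in units:
        x = ((1000003 * x) ^ u) & MASK
    return x ^ len(units)

def hash_code_points(text):
    # Hash the character data directly; no encoding step is involved.
    return _hash_units([ord(ch) for ch in text])

def hash_encoded(text, encoding):
    # Hash an encoded copy; the result depends on the chosen encoding.
    return _hash_units(list(text.encode(encoding)))
```

For pure ASCII text the two approaches agree, since code points and bytes
coincide. For a character such as U+00E9 the UTF-8 and Latin-1 encodings
produce different byte sequences and therefore different hash values, which
is the dependency the TODO item is meant to remove.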

Marc-Andre Lemburg
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/