[Python-Dev] Re: Moving to Unicode 3.2

M.-A. Lemburg mal@lemburg.com
Thu, 24 Oct 2002 10:50:42 +0200


Martin v. Loewis wrote:
 > "M.-A. Lemburg" <mal@lemburg.com> writes:
 >
 >
 >>I haven't seen any messages about this on python-dev. Did I miss
 >>something ?
 >
 > No. For a change like this, I did not think consultation was
 > necessary.

I saw that :-)

 >>The switch from Unicode 3.0 is a big one since 3.2 introduces
 >>non-BMP character points for the first time.
 >
 > I disagree; it's a small change. Just look at the patch itself: apart
 > from the (considerably large) generated data, there were very few
 > actual changes to source code: Changing a few limits was sufficient.

Which underlines Fredrik's good design of the Unicode database.

Still, the changes from Unicode 3.0 to 3.2 are significant (to the
few users who actually make use of the database):

	http://www.unicode.org/unicode/reports/tr28/

Looking at a diff of the 3.0 and the 3.2 data files you can
find quite a few additions, changes in categories and several
removals of numeric interpretations of code points. Most
important is, of course, that 3.2 actually assigns code points
outside the 16-bit Unicode range.
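
A minimal sketch of how one might check this from Python itself. It
assumes a current Python 3 interpreter, whose unicodedata module exposes
a frozen copy of the 3.2 database as ucd_3_2_0; the sample code point
U+1D400 is only an illustrative choice of a character outside the
16-bit range:

```python
import unicodedata

# U+1D400 MATHEMATICAL BOLD CAPITAL A: an illustrative code point
# above U+FFFF, i.e. outside the 16-bit range.
ch = "\U0001D400"

# Version of the Unicode database the running interpreter ships with.
current_version = unicodedata.unidata_version

# Assumption: running on Python 3, which keeps a frozen copy of the
# Unicode 3.2.0 database with the same lookup methods as the module.
old_db = unicodedata.ucd_3_2_0
category_3_2 = old_db.category(ch)

# In UTF-16 a code point above U+FFFF needs a surrogate pair,
# that is, two 16-bit code units instead of one.
utf16_units = len(ch.encode("utf-16-be")) // 2

print(current_version, old_db.unidata_version, category_3_2, utf16_units)
```

A category other than "Cn" (unassigned) for such a code point means the
database really does assign it, and the two UTF-16 code units show why
this matters to code that assumes one character fits in 16 bits.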

What I'm really concerned about is that Python is moving to the
development edge here while most other technologies are still using
Unicode 2.0. It's nice to be able to use Python as a reference
implementation for Unicode, but the interoperability between
Python and Java/Windows suffers from this.

 > Since there are no backwards compatibility issues, and no design
 > choices (apart from the choice of updating the database at all), this
 > is a straight-forward change.
 >
 >
 >>I also don't think that it is a good idea to ship the Unicode
 >>3.2 database while the code behaves as defined in Unicode 3.0.
 >
 > Can you please elaborate? What code behaves as defined in Unicode 3.0
 > that is incompatible with the Unicode 3.2 database?

Well, I should have written: ... while the code does not even
fully implement Unicode 3.0.

Fortunately, you have already started working in that
direction (adding normalization) and I'd like to thank
you for your efforts.

 >>And last not least, I'd like to be asked before you make such
 >>changes.
 >
 >
 > I find this quite a possessive view, and I would prefer if you bring
 > up technical arguments instead of procedural ones, but ok...

I am not trying to possess anything here. This is about managing
the Unicode code base and I'm still under the impression that
I'm the one in charge here.

 > I was under the impression that I can apply my own professional
 > judgement when deciding what patches to apply without consultation, in
 > what cases to ask on python-dev, and when to submit a patch to SF.
 >
 > Apparently, this impression is wrong. Can you please give precise
 > instructions what constitutes "such a change"?

I consider changes in the design or additions that affect
the design such a change.

 > Also, should I back this change out?

No, let's first find out what the consequences of this change
are and then decide whether it's a good idea or not.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/