[Python-Dev] 2.2.1 issues

Michael Hudson mwh@python.net
19 Feb 2002 14:10:40 +0000


Well, we have the first 2.2 bugfix that isn't a no-brainer to port to
2.2.1.  This is to do with the 

[ #495401 ] Build troubles: --with-pymalloc

bug.

As far as I understand it, there were two problems.

1) with wide unicode characters, some function in unicodeobject.c to
   do with interpreting escape codes could write into memory it didn't
   own.

2) something to do with the handling of "unpaired high surrogates" in
   the utf-8 codec.

Were these problems related?  I think they got fixed at the same time,
but I may have gotten confused.

1) shouldn't be too much of an issue to get into 2.2.1 (there was some
contention about which fix performed better, but for 2.2.1 I don't
care too much).

2) is more troublesome, because to fix it properly breaks .pycs, in
turn because marshal uses the utf-8 codec to store unicode string
constants, and this is a no-no according to PEP 6.
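
For readers unfamiliar with the term, here is a small sketch of what an
"unpaired high surrogate" looks like to the utf-8 codec, and why marshal
cares.  It assumes a current Python 3 interpreter, not 2.2: the
"surrogatepass" error handler and the strict rejection shown here
postdate this thread, and the helper names are made up for the
illustration.  It is not the patch under discussion.

```python
import marshal

# A high surrogate (U+D800) with no low surrogate following it.
LONE_HIGH_SURROGATE = "\ud800"


def strict_encode_fails(text):
    """Return True if the strict utf-8 codec refuses to encode text."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


def surrogatepass_bytes(text):
    """Encode text as utf-8, letting lone surrogates through as 3-byte sequences."""
    return text.encode("utf-8", "surrogatepass")


def marshal_round_trip(text):
    """Serialize text with marshal (as is done for .pyc constants) and read it back."""
    return marshal.loads(marshal.dumps(text))


if __name__ == "__main__":
    print(strict_encode_fails(LONE_HIGH_SURROGATE))
    print(surrogatepass_bytes(LONE_HIGH_SURROGATE))
    print(marshal_round_trip(LONE_HIGH_SURROGATE) == LONE_HIGH_SURROGATE)
```

The point of the round trip is that whatever bytes the codec emits for
such a string end up inside .pyc files, so changing how the codec treats
lone surrogates changes what a reader of existing .pyc files has to
accept.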

Is it possible to worm around 2) by reconstructing valid strings from
the bad marshal data, or has information been lost?  How severe is the
bug?  Maybe it would be best to leave it unfixed in 2.2.1.

Basically, I guess I'm saying I'm too much of a unicode dunce to
understand all the issues involved in fixing these problems in 2.2, so
as unofficial bugfix-porter, I'd like someone else (Marc?  Martin?) to
port these particular fixes.  If the mechanics of fiddling with the
branch are too much, sending me patches is fine.

Cheers,
M.

-- 
  This is the fixed point problem again; since all some implementors
  do is implement the compiler and libraries for compiler writing, the
  language becomes good at writing compilers and not much else!
                                 -- Brian Rogoff, comp.lang.functional