[Python-Dev] Regression in unicodestr.encode()?

Martin v. Loewis martin@v.loewis.de
10 Apr 2002 21:03:06 +0200


"M.-A. Lemburg" <mal@lemburg.com> writes:

> Some debugging with gdb indicates that the codec is indeed writing
> the 'nd', but the final _PyString_Resize() (which allocates a new
> buffer and copies the data into that buffer) fails to copy the last
> two characters from the string or overwrites it with NULLs.
>
> Looks like a pymalloc problem to me. Tim ?

It's a UTF-8 codec bug. The codec writes past the end of its allocated
buffer and then calls _PyString_Resize(). The resize copies only the
allocated bytes, hence the uninitialized bytes at the end of the result.
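
To illustrate the failure mode, here is a minimal Python model of it. This
is not the C code of the codec: the function names, the "arena" standing in
for raw memory, and the sample string are all hypothetical, and zero bytes
stand in for uninitialized memory.

```python
def resize_copying_allocated(arena, allocated, new_size):
    """Model a resize that moves the block to a new location.

    Only the `allocated` bytes the allocator knows about are copied;
    anything written beyond them is left behind.
    """
    new_block = bytearray(new_size)  # zero bytes stand in for uninitialized memory
    count = min(allocated, new_size)
    new_block[:count] = arena[:count]
    return bytes(new_block)


def encode_with_undersized_buffer(text, allocated):
    """Model a codec that was given `allocated` bytes but writes them all anyway."""
    encoded = text.encode("utf-8")
    # The arena is made large enough that the overrun can be modelled safely.
    arena = bytearray(max(len(encoded), allocated))
    arena[:len(encoded)] = encoded  # includes the bytes past `allocated`
    # The final resize to the real length loses the overrun bytes.
    return resize_copying_allocated(arena, allocated, len(encoded))


if __name__ == "__main__":
    print(encode_with_undersized_buffer("the end", 5))
    print(encode_with_undersized_buffer("the end", 7))
```

With a buffer two bytes too small, the trailing 'nd' is written but never
copied, so the result ends in two filler bytes. With a large enough buffer
the string comes through intact.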

Regards,
Martin