[Python-Dev] Regression in unicodestr.encode()?

M.-A. Lemburg mal@lemburg.com
Wed, 10 Apr 2002 21:27:04 +0200

"Martin v. Loewis" wrote:
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> > Some debugging with gdb indicates that the codec is indeed writing
> > the 'nd', but the final _PyString_Resize() (which allocates a new
> > buffer and copies the data into that buffer) either fails to copy
> > the last two characters of the string or overwrites them with NULs.
> >
> > Looks like a pymalloc problem to me. Tim?
>
> It's a UTF-8 codec bug. The codec writes over the end of the buffer,
> then invokes resize. Resizing only copies the allocated bytes, hence
> the uninitialized bytes at the end.

Ah, yes, you're right.
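
For readers of the archive, the failure mode Martin describes can be
reproduced in a small toy model. This is only a sketch under stated
assumptions: ToyHeap, buggy_encode and the sample string b"the end" are
hypothetical names invented for illustration, the "heap" is a plain
bytearray, and fresh memory is modelled as NUL bytes. It is not the
CPython codec or pymalloc code.

```python
# Toy model: the "heap" is one bytearray and a block is (offset, size).
# All names here are hypothetical; this is not CPython's allocator or codec.

FILL = 0x00  # assumption: never-written memory reads back as NUL bytes


class ToyHeap:
    def __init__(self, size):
        self.mem = bytearray(bytes([FILL]) * size)
        self.top = 0  # simple bump allocator, nothing is ever freed

    def alloc(self, size):
        offset = self.top
        self.top += size
        return offset, size

    def resize(self, block, new_size):
        """Allocate a new block and copy only the bytes the old block owned."""
        old_offset, old_size = block
        new_offset, _ = self.alloc(new_size)
        n = min(old_size, new_size)
        self.mem[new_offset:new_offset + n] = self.mem[old_offset:old_offset + n]
        return new_offset, new_size

    def read(self, block):
        offset, size = block
        return bytes(self.mem[offset:offset + size])


def buggy_encode(heap, data, underestimate):
    """Write data into a block that is `underestimate` bytes too small,
    then resize the block to the number of bytes actually written."""
    block = heap.alloc(len(data) - underestimate)
    offset, _ = block
    # Bug being modelled: this write runs past the end of the block.
    heap.mem[offset:offset + len(data)] = data
    return heap.resize(block, len(data))


broken = ToyHeap(64)
broken_result = broken.read(buggy_encode(broken, b"the end", underestimate=2))

correct = ToyHeap(64)
correct_result = correct.read(buggy_encode(correct, b"the end", underestimate=0))

print(broken_result)   # last two bytes were never copied
print(correct_result)  # block was large enough, nothing is lost
```

In this model the overrun bytes are written, which matches what gdb
showed, but the resize copies only the bytes the original block owned,
so the tail of the new buffer is never filled in. Sizing the initial
allocation for the worst case and shrinking afterwards avoids the
problem.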

Marc-Andre Lemburg
CEO eGenix.com Software GmbH
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/