[Python-Dev] Regression in unicodestr.encode()?
Martin v. Loewis
10 Apr 2002 21:32:15 +0200
"M.-A. Lemburg" <firstname.lastname@example.org> writes:
> > It's a UTF-8 codec bug. The codec writes over the end of the buffer,
> > then invokes resize. Resizing only copies the allocated bytes, hence
> > the uninitialized bytes at the end.
> Ah, yes, you're right.
Thanks :-) I think the right fix is to avoid any resizing in the UTF-8
codec; that has bitten us way too often now. Instead, it should compute
the size of the encoded string first, then perform the actual encoding.
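
To illustrate the size-first idea, here is a rough sketch in Python
rather than the C of the actual codec. The names utf8_size and
utf8_encode are made up for this example, and it works on whole code
points without any surrogate handling, so take it as a picture of the
approach and not as the proposed patch.

```python
def utf8_size(code_points):
    """Pass 1: count the bytes the UTF-8 encoding will need."""
    size = 0
    for cp in code_points:
        if cp < 0x80:
            size += 1
        elif cp < 0x800:
            size += 2
        elif cp < 0x10000:
            size += 3
        else:
            size += 4
    return size


def utf8_encode(text):
    """Pass 2: encode into a buffer allocated once at its final size."""
    code_points = [ord(ch) for ch in text]
    buf = bytearray(utf8_size(code_points))  # exact size, never resized
    i = 0
    for cp in code_points:
        if cp < 0x80:
            buf[i] = cp
            i += 1
        elif cp < 0x800:
            buf[i] = 0xC0 | (cp >> 6)
            buf[i + 1] = 0x80 | (cp & 0x3F)
            i += 2
        elif cp < 0x10000:
            buf[i] = 0xE0 | (cp >> 12)
            buf[i + 1] = 0x80 | ((cp >> 6) & 0x3F)
            buf[i + 2] = 0x80 | (cp & 0x3F)
            i += 3
        else:
            buf[i] = 0xF0 | (cp >> 18)
            buf[i + 1] = 0x80 | ((cp >> 12) & 0x3F)
            buf[i + 2] = 0x80 | ((cp >> 6) & 0x3F)
            buf[i + 3] = 0x80 | (cp & 0x3F)
            i += 4
    # The write position must land exactly on the end of the buffer.
    assert i == len(buf)
    return bytes(buf)
```

Since the buffer has its final length before the first byte is written,
there is no resize step that could copy too few bytes or expose
uninitialized ones. The price is a second pass over the input.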