[Python-Dev] Bug or feature? Unicode vs t#
Martin v. Loewis
martin@loewis.home.cs.tu-berlin.de
Thu, 11 Oct 2001 10:25:28 +0200
> So change the docs. hexlify() hasn't changed, yet the result has.
> So long as Unicode objects support the buffer interface, the
> question is why they've changed how they respond to that interface.
Unicode objects haven't changed in their response to the buffer
interface. What has changed is how t# uses this interface.
AFAICT, the change occurred in version 2.59 of getargs.c, in
response to patch #426072. In 2.1.1, t# called the getcharbuffer
operation; now it calls the getreadbuffer function.
For Unicode objects, these are different: getcharbuffer converts the
string using the default encoding into a character string, whereas
getreadbuffer returns a pointer to the internal representation
(depending on the platform, this makes 2 or 4 bytes per Unicode
character, and they may come out either little or big endian).
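To illustrate the difference, here is a rough sketch in present-day
Python 3, not the 2.x C API under discussion. It assumes ASCII as the
default encoding and uses str.encode() as a stand-in for the two buffer
operations; the sample string and all variable names are made up for
the illustration.

```python
import binascii

# Hypothetical sample text; any ASCII-only string works for this sketch.
text = "abc"


def hexlify_str(data):
    """Return the hex dump of a bytes object as a plain string."""
    return binascii.hexlify(data).decode("ascii")


# Stand-in for getcharbuffer: the text encoded with the default
# encoding (assumed to be ASCII here), one byte per character.
char_hex = hexlify_str(text.encode("ascii"))

# Stand-ins for getreadbuffer: the raw internal representation, which is
# 2 or 4 bytes per character in either byte order, depending on platform.
internal_hex = {
    "2 bytes, little endian": hexlify_str(text.encode("utf-16-le")),
    "2 bytes, big endian": hexlify_str(text.encode("utf-16-be")),
    "4 bytes, little endian": hexlify_str(text.encode("utf-32-le")),
    "4 bytes, big endian": hexlify_str(text.encode("utf-32-be")),
}

print("char buffer:", char_hex)
for layout, dump in internal_hex.items():
    print(layout + ":", dump)
```

None of the four internal layouts matches the encoded form, which would
explain why hexlify() gives a different result once t# hands it the
read buffer.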
It seems that the problem is in the simplification attempted with said
patch.
Regards,
Martin