[Python-Dev] Bug or feature? Unicode vs t#

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Thu, 11 Oct 2001 10:25:28 +0200


> So change the docs.  hexlify() hasn't changed, yet the result has.
> So long as Unicode objects support the buffer interface, the
> question is why they've changed how they respond to that interface.

Unicode objects haven't changed in their response to the buffer
interface. What has changed is how t# uses this interface.

AFAICT, the change occurred in version 2.59 of getargs.c, in
response to patch #426072. In 2.1.1, t# called the getcharbuffer
operation; now it calls the getreadbuffer operation.

For Unicode objects, these are different: getcharbuffer converts the
string using the default encoding into a character string, whereas
getreadbuffer returns a pointer to the internal representation
(depending on the platform, this makes 2 or 4 bytes per Unicode
character, and they may come out either little or big endian).
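
To make the difference concrete, here is a rough sketch in present-day
Python. It only imitates the two results by encoding a string by hand;
it is not what getargs.c or the Unicode object actually does. The
helper names char_buffer_view and read_buffer_view are made up for
this illustration, "ascii" is assumed as the default encoding, and
UTF-16/UTF-32 stand in for the 2- and 4-byte internal layouts.

```python
import binascii


def char_buffer_view(text, default_encoding="ascii"):
    """Imitate getcharbuffer: the text encoded with the default encoding.

    "ascii" is an assumption here; the real default encoding is a
    per-interpreter setting.
    """
    return text.encode(default_encoding)


def read_buffer_view(text, width=2, byteorder="little"):
    """Imitate getreadbuffer: the raw internal representation.

    width is the number of bytes per character (2 or 4) and byteorder
    is "little" or "big"; both depend on the platform in the real case.
    UTF-16/UTF-32 are used as stand-ins for the internal layout.
    """
    base = {2: "utf-16", 4: "utf-32"}[width]
    suffix = {"little": "-le", "big": "-be"}[byteorder]
    return text.encode(base + suffix)


text = "abc"

# One byte per character, independent of the platform.
print(binascii.hexlify(char_buffer_view(text)))               # b'616263'

# Same text, but the bytes depend on character width and endianness.
print(binascii.hexlify(read_buffer_view(text, 2, "little")))  # b'610062006300'
print(binascii.hexlify(read_buffer_view(text, 4, "big")))     # b'000000610000006200000063'
```

So a caller such as hexlify(), which just hexlifies whatever bytes it
is handed, will produce different output for the same Unicode object
depending on which of the two operations t# ends up using.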

It seems that the problem is in the simplification attempted with said
patch.

Regards,
Martin