[Python-Dev] Bug or feature? Unicode vs t#
M.-A. Lemburg
mal@lemburg.com
Fri, 12 Oct 2001 10:19:18 +0200
Guido van Rossum wrote:
>
> > My real question is whether there is any value in having Unicode objects
> > expose their internal representation to Python programmers through the
> > buffer interface?
>
> I used to think so, but I no longer believe this. UTF-16 should be an
> encoding and that's that.
... and later ...
> > And I think I agree, even though that /could/ break code. Then again,
> > maybe Paul's suggestion that hexlify() should reject Unicode strings
> > is the better approach.
>
> +1
Since hexlify() uses a parser marker which does not involve a
type check, there's no way to have it reject Unicode objects.
BTW, the "s#" parser marker does *not* map to getreadbuffer
for Unicode objects. Long ago we decided that the distinction
between "s#" and "t#" does not make sense for Unicode objects.
To improve compatibility of Unicode objects with existing
code that uses "s#", we made both parser markers map to
getcharbuffer.
As a result, both parser markers return the default-encoded
version of the Unicode object. The getreadbuffer interface
is still in place, though... perhaps we ought to consider
removing it?
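
The mapping described above can be sketched as a toy model in Python.
This is only an illustration, not the interpreter's actual C code:
ToyUnicode, parse() and DEFAULT_ENCODING are hypothetical names, the
default encoding is assumed to be ASCII, and the internal representation
is assumed to look like little-endian UTF-16.

```python
# Toy model only. The class and function names are hypothetical; they
# merely mirror the slot names (getreadbuffer, getcharbuffer) from the text.
DEFAULT_ENCODING = "ascii"  # assumption: the interpreter's default encoding


class ToyUnicode:
    """Stand-in for a Unicode object with two buffer slots."""

    def __init__(self, text):
        self.text = text

    def getreadbuffer(self):
        # Exposes the internal representation (modelled as UTF-16-LE).
        return self.text.encode("utf-16-le")

    def getcharbuffer(self):
        # Exposes the default-encoded version of the object.
        return self.text.encode(DEFAULT_ENCODING)


def parse(marker, obj):
    """Model of the argument parser: both markers use getcharbuffer."""
    if marker in ("s#", "t#"):
        return obj.getcharbuffer()
    raise ValueError("marker not modelled: %r" % (marker,))


u = ToyUnicode("abc")
via_s = parse("s#", u)       # default-encoded bytes
via_t = parse("t#", u)       # same result as "s#"
internal = u.getreadbuffer() # only reachable through the read-buffer slot
```

In this model neither marker ever reaches getreadbuffer, which is why
the slot could plausibly be dropped without affecting "s#" or "t#" users.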
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/