[Python-Dev] bytes / unicode
P.J. Eby
pje at telecommunity.com
Sat Jun 26 20:17:44 CEST 2010
At 12:42 PM 6/26/2010 +0900, Stephen J. Turnbull wrote:
>What I'm saying here is that if bytes are the signal of validity, and
>the stdlib functions preserve validity, then it's better to have the
>stdlib functions object to unicode data as an argument. Compare the
>alternative: it returns a unicode object which might get passed around
>for a while before one of your functions receives it and identifies it
>as unvalidated data.
I still don't follow, since passing in bytes should return
bytes. Returning unicode would be an error, in the case of a
"polymorphic" function (per Guido).
>But you agree that there are better mechanisms for validation
>(although not available in Python yet), so I don't see this as a
>potential obstacle to polymorphism now.
Nope. I'm just saying that, given two bytestrings to URL-join or
path-join or whatever, a polymorph should hand back a
bytestring. This seems pretty uncontroversial.
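To make the point concrete, here is a minimal Python 3 sketch of a join that hands back the same type it was given. The name join_segments and the choice of "/" as the separator are assumptions made for illustration; this is not an actual stdlib function.

```python
def join_segments(base, tail):
    """Join two path-like segments; the result type follows the inputs.

    Hypothetical helper: bytes in, bytes out; str in, str out.
    """
    if isinstance(base, bytes) and isinstance(tail, bytes):
        sep = b"/"   # bytes inputs get a bytes separator
    elif isinstance(base, str) and isinstance(tail, str):
        sep = "/"    # str inputs get a str separator
    else:
        # Mixing the two types is rejected rather than silently coerced.
        raise TypeError("cannot join %s with %s"
                        % (type(base).__name__, type(tail).__name__))
    return base.rstrip(sep) + sep + tail.lstrip(sep)


as_bytes = join_segments(b"static/", b"/img")
as_text = join_segments("static/", "/img")
```

With bytes arguments the result is a bytes object, and with str arguments it is a str; at no point does a bytes input turn into unicode.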
> > What I want is for the stdlib to create stringlike objects of a
> > type determined by the types of the inputs --
>
>In general this is a hard problem, though. Polymorphism, OK, one-way
>tainting OK, but in general combining related types is pretty
>arbitrary, and as in the encoded-bytes case, the result type often
>varies depending on expectations of callers, not the types of the
>data.
But the caller can enforce those expectations by passing in arguments
whose types do what they want in such cases, as long as the string
literals used by the function don't get to override the relevant
parts of the string protocol(s).
The idea that I'm proposing is that the basic string and byte types
should defer to "user-defined" string types for mixed type
operations, so that polymorphism of string-manipulation functions is
the *default* case, rather than a *special* case. This makes
tainting easier to implement, along with optimizations and other special
cases (like my "source string w/file and line info", or a string with
font/formatting attributes).
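A rough sketch of what such deferral looks like, using only behavior that Python 3 already has for the + operator. The names Tainted and add_scheme are hypothetical, invented for this example. Because Tainted is a str subclass that defines __radd__, Python lets it handle "literal" + tainted, so a function written with plain string literals still returns the caller's type. Other operations, such as str.join, do not defer this way today, which appears to be the kind of gap the proposal is aimed at.

```python
class Tainted(str):
    """Hypothetical str subclass that marks data as unvalidated."""

    def __add__(self, other):
        # tainted + anything stays tainted
        return Tainted(str.__add__(self, other))

    def __radd__(self, other):
        # anything + tainted stays tainted; Python calls this because
        # Tainted is a subclass of str that supplies the reflected method
        return Tainted(str.__add__(other, self))


def add_scheme(host):
    # Written with an ordinary str literal, knows nothing about Tainted.
    return "http://" + host


url = add_scheme(Tainted("example.org"))
joined = "/".join([Tainted("a"), Tainted("b")])

print(type(url).__name__)     # Tainted: the literal deferred to the subclass
print(type(joined).__name__)  # str: join does not defer, the taint is lost
```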