[Python-Dev] bytes / unicode
P.J. Eby
pje at telecommunity.com
Thu Jun 24 19:07:01 CEST 2010
At 05:12 PM 6/24/2010 +0900, Stephen J. Turnbull wrote:
>Guido van Rossum writes:
>
> > For example: how we can make the suite of functions used for URL
> > processing more polymorphic, so that each developer can choose for
> > herself how URLs need to be treated in her application.
>
>While you have come down on the side of polymorphism (as opposed to
>separate functions), I'm a little nervous about it. Specifically,
>Philip Eby expressed a desire for earlier type errors, while
>polymorphism seems to ensure that you'll need to Look Before You Leap
>to get early error detection.
This doesn't have to be in the functions; it can be in the
*types*. Mixed-type string operations have to do type checking and
upcasting already, but if that coercion protocol were open to
user-defined types, you could make an encoded-bytes type that would
handle the error checking.
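
To make the idea concrete, here is a minimal sketch of what such a type might look like in today's Python. The name `EncodedBytes`, its `encoding` attribute, and the choice to reject plain `bytes` operands are all assumptions made for illustration; nothing like this exists in the standard library, and it only covers concatenation.

```python
class EncodedBytes(bytes):
    """Hypothetical bytes subclass that remembers which encoding it holds."""

    def __new__(cls, data, encoding):
        if isinstance(data, str):
            # Encoding here means an unencodable character fails immediately.
            data = data.encode(encoding)
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def _coerce(self, other):
        # Upcast str operands to this object's encoding; refuse anything
        # whose encoding is unknown or different.
        if isinstance(other, EncodedBytes):
            if other.encoding != self.encoding:
                raise TypeError("cannot mix %s with %s"
                                % (self.encoding, other.encoding))
            return bytes(other)
        if isinstance(other, str):
            return other.encode(self.encoding)  # may raise UnicodeEncodeError
        raise TypeError("cannot mix EncodedBytes with %s"
                        % type(other).__name__)

    def __add__(self, other):
        return EncodedBytes(bytes(self) + self._coerce(other), self.encoding)

    def __radd__(self, other):
        return EncodedBytes(self._coerce(other) + bytes(self), self.encoding)


base = EncodedBytes("http://example.com/", "ascii")
joined = base + "path"      # str operand is encoded to ASCII on the spot
```

With this sketch, `base + "caf\u00e9"` raises `UnicodeEncodeError` at the point of concatenation rather than later at output time, which is the earlier error detection being asked for, and the URL functions themselves need no type checks.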
(Btw, in some earlier emails, Stephen, you implied that this could be
fixed with codecs -- but it can't, because the problem isn't with the
bytes containing invalid Unicode, it's with the Unicode containing
invalid bytes -- i.e., characters that can't be encoded to the
ultimate codec target.)
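
The direction of the failure can be shown with a short sketch. The helper name `first_unencodable` is hypothetical; it relies only on `str.encode` and the `start` attribute of `UnicodeEncodeError`, both of which are standard.

```python
def first_unencodable(text, encoding):
    """Return the index of the first character of `text` that `encoding`
    cannot represent, or None if the whole string encodes cleanly."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc.start
    return None


# The text is perfectly valid Unicode; whether it fails depends entirely
# on the codec it must eventually be written out in.
ascii_problem = first_unencodable("caf\u00e9", "ascii")
latin1_problem = first_unencodable("caf\u00e9", "latin-1")
```

Here the same string is fine for Latin-1 and unusable for ASCII, so no amount of care when decoding the input can catch the problem; only knowledge of the target encoding can.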