[Python-Dev] email package status in 3.X

P.J. Eby pje at telecommunity.com
Mon Jun 21 21:14:29 CEST 2010


At 03:08 AM 6/22/2010 +0900, Stephen J. Turnbull wrote:
>Barry Warsaw writes:
>
>  > Would it make sense to have "encoding-carrying" bytes and str
>  > types?
>
>
>I think the answer is "no", though, because (1) it would constitute an
>attractive nuisance (the default would be abused, it would work fine
>in Kansas, and all hell would break loose in Kagoshima, simply
>delaying the pain and/or passing it on to third parties),

You have the proposal exactly backwards, actually.

In Kagoshima, you'd pass in an ebytes with your encoding to a 
stdlib API, and *get back an ebytes with the right encoding*, rather 
than an (incorrect and useless) unicode object which has lost data you need.
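
A rough sketch of that round trip follows. No `ebytes` type exists in
Python; the class below, its `encoding` attribute, and the `clean_field`
helper are all hypothetical names used only for illustration, and only
`strip` is overridden to keep the example short.

```python
class ebytes(bytes):
    """Hypothetical bytes subclass that remembers its encoding."""

    def __new__(cls, data, encoding):
        self = super().__new__(cls, data)
        self.encoding = encoding  # assumption: stored as a plain attribute
        return self

    def strip(self, chars=None):
        # Re-wrap the result so the encoding survives the operation.
        return ebytes(super().strip(chars), self.encoding)


def clean_field(value):
    """Stand-in for a library routine; it knows nothing about ebytes."""
    return value.strip()


raw = ebytes("  鹿児島  ".encode("shift_jis"), "shift_jis")
cleaned = clean_field(raw)

# The result is still bytes, and still says which encoding it is in.
print(type(cleaned).__name__, cleaned.encoding)
print(cleaned.decode(cleaned.encoding))
```

The caller gets back bytes tagged with the encoding that went in, so no
guess about the encoding is ever needed inside `clean_field`.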


>Why limit that to bytes and str?  Why not have all objects carry their
>serializer/deserializer around with them?

Because it's not a serialization or deserialization.  Your conceptual 
framework here implies that unicode objects are the real thing, and 
that bytes are "just" a way of transporting unicode around.

But this is not the case at all, for use cases where "no, really, you 
*have to* work with bytes-encoded text streams".  The mere release of 
Python 3.x will not cause all the world's applications, libraries, 
and protocols to suddenly work with unicode, where they did not before.

Being explicit about the encoding of the bytes you're flinging around 
is actually an *increase* in specificity, explicitness, robustness, 
and error-checking ability over the status quo for either 2.x *or* 
3.x...  *and* it improves these qualities for essentially *all* 
string-handling code, without requiring that code to be rewritten to do so.

It's like getting to use the time machine, really.
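
To make the error-checking point concrete, here is a minimal sketch,
assuming a hypothetical `ebytes` whose `__add__` refuses to combine two
different encodings. The class and the choice of `ValueError` are
assumptions for illustration, not part of any existing API.

```python
class ebytes(bytes):
    """Hypothetical bytes subclass that remembers its encoding."""

    def __new__(cls, data, encoding):
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def __add__(self, other):
        # Assumption: mixing two different encodings is treated as an error.
        if isinstance(other, ebytes) and other.encoding != self.encoding:
            raise ValueError(
                "cannot mix %s and %s bytes" % (self.encoding, other.encoding)
            )
        return ebytes(super().__add__(other), self.encoding)


latin = ebytes("café".encode("latin-1"), "latin-1")
utf8 = ebytes("café".encode("utf-8"), "utf-8")

# Plain bytes concatenate silently, yielding a mixed-encoding value.
mixed = bytes(latin) + bytes(utf8)

# The tagged version notices the mismatch at the point of the mistake.
try:
    latin + utf8
except ValueError as exc:
    error = str(exc)
else:
    error = None
print(error)
```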


>and (2) you
>really want this under control of higher level objects that have
>access to some knowledge of the environment, rather than the lowest
>level.

This proposal actually has such a higher-level object: an 
ebytes.  And it passes that information *through* the lowest level, 
in such a way as to permit the stringlike operations to be fully 
polymorphic, without the information being lost inside somebody else's API.
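
One way to picture the polymorphism is sketched below. It assumes (this
message does not spell it out) that a hypothetical `ebytes` encodes any
str operand with its own encoding; `bracket` stands in for somebody
else's string-handling code written with ordinary str literals.

```python
class ebytes(bytes):
    """Hypothetical bytes subclass that remembers its encoding."""

    def __new__(cls, data, encoding):
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def _coerce(self, other):
        # Assumption: str operands are encoded with this object's encoding.
        if isinstance(other, str):
            return other.encode(self.encoding)
        return other

    def __add__(self, other):
        return ebytes(super().__add__(self._coerce(other)), self.encoding)

    def __radd__(self, other):
        # bytes(self) yields a plain bytes copy, so this does not recurse.
        return ebytes(self._coerce(other) + bytes(self), self.encoding)


def bracket(value):
    """Unmodified string code: works on str, and on ebytes as sketched."""
    return "<" + value + ">"


as_text = bracket("鹿児島")
as_bytes = bracket(ebytes("鹿児島".encode("shift_jis"), "shift_jis"))

print(type(as_text).__name__)
print(type(as_bytes).__name__, as_bytes.encoding)
```

The same function returns str for str input and an encoding-tagged
ebytes for ebytes input, without being rewritten.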


