[Email-SIG] [Python-3000] email libraries: use byte or unicode strings?

Barry Warsaw barry at python.org
Thu Nov 6 18:17:31 CET 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Nov 6, 2008, at 7:22 AM, Nick Coghlan wrote:

> Glenn Linderman wrote:
>> Even 8-bit binary can be translated into a sequence of Unicode
>> codepoints with the same numeric value, for example.
>
> No, no, no, no. Using latin-1 to tunnel binary data through Unicode just
> gets us straight back into the "is it text or bytes?" hell that is the
> 8-bit string in 2.x. It defeats the entire point of making the break
> between str and bytes in 3.0 in the first place.

And I'll note that this is essentially how the email package in 3.0  
cheats its way into some modicum of usability.  It is teh suck, but it  
works (defined as "passes the tests" ;).
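For readers who have not seen the trick being discussed, here is a minimal sketch of latin-1 tunnelling in plain Python 3. It is only an illustration of the general technique, not the email package's actual code; the variable names are made up for the example.

```python
# latin-1 maps every byte value 0..255 to the code point with the same number,
# so any byte string can be decoded and re-encoded without loss.
raw = bytes(range(256))
tunnelled = raw.decode("latin-1")
assert [ord(ch) for ch in tunnelled] == list(raw)

round_tripped = tunnelled.encode("latin-1")
assert round_tripped == raw

# The catch: once tunnelled, the str no longer says whether it holds real
# text or disguised binary. UTF-8 bytes pushed through the tunnel come out
# as a perfectly valid str that just happens to be wrong.
utf8_bytes = "caf\u00e9".encode("utf-8")        # b'caf\xc3\xa9'
looks_like_text = utf8_bytes.decode("latin-1")  # 5 characters, not 4
```

The round trip is lossless, which is why it "works", but nothing in the type tells the next piece of code which kind of str it was handed.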

> If something is potentially arbitrary binary data, we need to treat it
> that way and use bytes. People are just going to have to get over their
> aesthetic objections to the leading b on their bytes literals. Heck, be
> happy you don't have to write bytes(map(ord, 'literal')) as was the case
> in the early stages of 3.0 :)
>
> Providing a Unicode based text API over the top for the cases where
> handling malformed data isn't necessary may be convenient and a good
> idea, but it shouldn't be the only API (3.0 is already guilty of that in
> a few places - we shouldn't be adding more).
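As a small illustration of the two spellings mentioned in the quoted text, the sketch below compares a bytes literal with the longhand form and shows that Python 3 refuses to mix bytes and str implicitly. The names are invented for the example.

```python
# The b-prefixed literal and the longhand spelling build the same object.
literal = b"literal"
spelled_out = bytes(map(ord, "literal"))
assert literal == spelled_out

# bytes and str are distinct types: indexing bytes yields an int,
# indexing str yields a one-character str.
assert literal[0] == 108
assert "literal"[0] == "l"

# Mixing the two without an explicit encode/decode is an error,
# which is the break between str and bytes doing its job.
try:
    literal + "text"
except TypeError:
    mixed_ok = False
else:
    mixed_ok = True
```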

Right, and really it's a deeper issue.  We're really only concerned
with bytes vs. unicodes in headers.  When talking about payloads, we
get into a much richer type hierarchy, with images, audio, byte
streams, etc.  Message.get_payload(decode=True) doesn't know
anything about that stuff, but it could.
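A minimal sketch of what that looks like in practice, assuming a recent Python 3 (email.message_from_bytes was added after 3.0, so this is not the 3.0 API under discussion). The message is a hand-built toy carrying only the 8-byte PNG file signature, not a real image.

```python
import base64
from email import message_from_bytes

# The 8-byte PNG file signature stands in for real image data.
png_magic = b"\x89PNG\r\n\x1a\n"

raw = (
    b"Content-Type: image/png\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n" + base64.b64encode(png_magic) + b"\r\n"
)

msg = message_from_bytes(raw)

# decode=True undoes the Content-Transfer-Encoding and hands back raw bytes.
payload = msg.get_payload(decode=True)
assert payload == png_magic
assert isinstance(payload, bytes)

# The message knows the declared content type, but the payload is still
# plain bytes: nothing richer (an image object, say) comes back.
content_type = msg.get_content_type()
```

The content type is available from the headers, yet get_payload(decode=True) returns the same undifferentiated bytes whatever that type says.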

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iQCVAwUBSRMmrHEjvBPtnXfVAQLjuQQAmhi6Fz/K4MN+QBDzRgxZmX5WnSpYs2IR
ZYei/S/0xxbtZbfvC0IzIeeg4BfR1SVGRYypZGWSwSOxHX08VWNKpR0QBa6oNZsm
xjiW02856wK8AHAM2Lt59GHpj4qXbEFvUDjnv7/72WmUJO+yJbRPTCwUGLY5IToZ
xFCftr/WWfQ=
=/faa
-----END PGP SIGNATURE-----

