[Python-Dev] Patch making the current email package (mostly) support bytes

R. David Murray rdmurray at bitdance.com
Fri Oct 8 21:40:00 CEST 2010

On Sat, 09 Oct 2010 02:48:23 +0900, "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> R. David Murray writes:
>  > On Sat, 09 Oct 2010 01:06:29 +0900, "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
>  > > That mess is entirely unnecessary in Python 3.  Text and wire format
>  > > can be easily distinguished with three different representations of
>  > > email: Unicode for the conceptual RFC 822 layer (of course this is an
>  > > extension, because RFC 822 itself is strictly limited to the ASCII
>  > > subset), bytes for wire format, and Message objects for modern
>  > > structured mail (including MIME, etc).
>  > That engineering is pretty much what we are looking at, although in
>  > practice I think you have to hang wire-format and text-format bits off
>  > of appropriate places in the model in order to keep everything properly
>  > coordinated.
> Right.  That's where I was going with my comment to Barry about the
> Received headers.  Even if email isn't going to serve clients working
> with wire format, it needs to deal with those headers.  But where I
> think the headers defined by RFC 822 should be stored as str in
> email6, I am leaning toward storing Received headers verbatim as bytes
> (including any RFC 822 folding whitespace) because of the RFC 5321
> requirement that they be preserved exactly.

Well, the plan for email6 is to *allow* clients to work with wire format,
though it will probably be a bit more awkward than working with the
text interface.  My current strategy is, in general, to preserve the
input bytes and, as long as the header in question hasn't been modified,
emit those same bytes when the message is serialized back to bytes.
Conversion to text is done only at the point where text is requested,
and the result is then cached for later use.  If the header is modified,
the source bytes version is discarded.  Conversely, if the source of the
header was text input (msg['Subject'] = 'Hi'), then the conversion to
bytes is done only when serialization to bytes is requested.
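
To make the strategy above concrete, here is a rough sketch of how a
single header value might behave.  The LazyHeader class and its method
names are hypothetical, invented purely for illustration, and the
ASCII/surrogateescape codec is only a stand-in for real header
decoding and encoding (RFC 2047 and friends); it is not a proposed API.

```python
class LazyHeader:
    """Hypothetical sketch: one header value with lazy bytes/text conversion."""

    def __init__(self, name, *, raw=None, text=None):
        if (raw is None) == (text is None):
            raise ValueError("supply exactly one of raw or text")
        self.name = name
        self._raw = raw    # source bytes, kept verbatim while unmodified
        self._text = text  # text form: either assigned or a cached decode

    @classmethod
    def from_bytes(cls, name, raw):
        # Header came from parsing wire-format input.
        return cls(name, raw=raw)

    @classmethod
    def from_text(cls, name, text):
        # Header came from text input, e.g. msg['Subject'] = 'Hi'.
        return cls(name, text=text)

    @property
    def text(self):
        if self._text is None:
            # Decode only when text is first requested, then cache it.
            # Assumption: surrogateescape stands in for real decoding.
            self._text = self._raw.decode("ascii", "surrogateescape")
        return self._text

    @text.setter
    def text(self, value):
        # A modification invalidates the source bytes.
        self._text = value
        self._raw = None

    def as_bytes(self):
        if self._raw is not None:
            # Unmodified since parsing: emit the input bytes exactly.
            return self._raw
        # Text-sourced or modified: encode only at serialization time.
        return self._text.encode("ascii", "surrogateescape")
```

With something like this, a Received header parsed from bytes would
round-trip untouched (folding whitespace included) even after its text
form has been read, while a header set from text would never be
encoded until as_bytes() is called.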

None of this is implemented yet.

R. David Murray                                      www.bitdance.com
