[Python-Dev] Polymorphic best practices [was: (Not) delaying the 3.2 release]

Fri Sep 17 06:26:52 CEST 2010

On Thu, 16 Sep 2010 21:53:17 -0400, Barry Warsaw <barry at python.org> wrote:
> And of course, what happens if the original subject is in one charset and the
> prefix is in an incompatible one?  Then you end up with a wire format of two
> RFC 2047 encoded words separated by whitespace.  You have to keep those chunks
> separate all the way through to do that properly.  (I know RDM knows this. :)

Heh, my example got messed up because my current mailer didn't MIME
encode it properly.  That is, I emitted a non-RFC-compliant email :(

This is actually a pretty interesting issue in a number of ways, though
I'm not sure it relates to any other part of the stdlib.  A header can
contain encoded words encoded with different charsets.  An MUA that sorts
by subject and takes prefixes ('Re:') into account, for example, might
be decoding the header entirely before doing header matching/sorting, or
it might be matching against the RFC 2047 encoded header.  Hopefully the
former, these days, but don't count on it.  So when emitting a reply, a
careful MUA would want to *only* attach the 'Re:' to the front, and not
otherwise change the header.  If it is going to do that, though, it is
going to have to (a) make sure it preserves the original bytes version
of the header and (b) refold the line if necessary.  This means knowing
lots of stuff about header encoding.  So, really, that job should be
done by the email package, or at least the email package should provide
tools to do this.
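
To make the "keep the chunks separate" idea concrete, here is a minimal
sketch using the existing email.header module (not the email6 API under
discussion).  The helper name prefix_keeping_charsets is hypothetical.
Note that it re-encodes each chunk in its own charset instead of
preserving the original bytes, so it only approximates requirement (a);
refolding is left to Header.encode().

```python
from email.header import Header, decode_header, make_header


def prefix_keeping_charsets(raw_subject, prefix="Re:"):
    """Prepend an ASCII prefix while keeping each chunk in its own charset.

    Hypothetical helper: chunks are re-encoded per charset, so the output
    bytes may differ from the input even though the charsets are kept.
    """
    reply = Header(prefix, "us-ascii")
    for chunk, charset in decode_header(raw_subject):
        # charset is None for unencoded text; Header then uses us-ascii.
        reply.append(chunk, charset)
    # encode() emits the RFC 2047 form and folds long lines.
    return reply.encode()


# Build a subject whose two encoded words use incompatible charsets.
original = Header("café", "iso-8859-1")
original.append("привет", "koi8-r")
raw = original.encode()

reply = prefix_keeping_charsets(raw)
# Charset of each chunk in the reply, in order (None for plain ASCII).
charsets = [cs for _, cs in decode_header(reply)]
decoded = str(make_header(decode_header(reply)))
```

The point of the sketch is only that the prefix and the two original
chunks each stay in their own encoded word; a tool that truly preserved
the original bytes would have to avoid the decode/re-encode step.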

The naive way (decode the header to unicode, attach the prefix, re-encode
using your favorite charset) is going to work most of the time, and
that's what it will be easiest to do with email6.  Tacking the Re: on
the front of the bytes version of the header and having email6 refold
it will probably work about as well as it currently does in the old
email package, which is to say that sometimes the unfolded header is
otherwise unchanged, and sometimes it isn't.
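
For comparison, the naive way might look roughly like the sketch below.
It uses the current email.header module, since the email6 API is not
settled; the function name naive_reply_subject and the choice of utf-8
as the "favorite charset" are assumptions for illustration only.

```python
from email.header import Header, decode_header, make_header


def naive_reply_subject(raw_subject, charset="utf-8"):
    """Decode to unicode, attach 'Re: ', re-encode in a single charset.

    Hypothetical helper: the original charsets and bytes are discarded.
    """
    # Decode every RFC 2047 encoded word into one unicode string.
    text = str(make_header(decode_header(raw_subject)))
    # Attach the prefix unless the subject already carries one.
    if not text.lower().startswith("re:"):
        text = "Re: " + text
    # Re-encode the whole subject in one charset.
    return Header(text, charset).encode()


raw = "=?iso-8859-1?q?caf=E9?="
reply = naive_reply_subject(raw)
decoded = str(make_header(decode_header(reply)))
```

The decoded text round-trips, but the wire form of the reply no longer
matches the original header, so an MUA that matches against the encoded
header would not see the two subjects as related.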

> >But I *am* open to being convinced otherwise.  If everyone hates the
> >BytesMessage/StringMessage API design, then that should certainly not
> >be what we implement in email.
> 
> Just as a point of order, to the extent that we're discussing generic
> approaches to similar problems across multiple modules, it's okay that we're
> having this discussion on python-dev.  But the email-sig has put in a lot of
> work on specific API and implementation designs for the email package, so any
> deviation really needs to be debated, discussed, and decided there.

I am also finding it useful to have the API exposed to a wider audience
for feedback, but I agree, any substantive change would need to be
discussed on the email-sig, not here.

--David