[Email-SIG] fixing the current email module
Stephen J. Turnbull
stephen at xemacs.org
Wed Oct 7 02:30:25 CEST 2009
Glenn Linderman writes:
> Yes, I interpreted, possibly misinterpreted, Barry's comment about
> storing things as bytes, as that he was figuring to store them in wire
> format.
What that means is unclear, though. Does a "header in wire format"
mean before or after MIME encoding? Probably after, but that's pretty
useless for the purpose of editing the header. Does it include the
tag (the part before the colon) or not? Etc.
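
To make the ambiguity concrete, here is a minimal sketch using the standard library's `email.header` helpers. The subject text is a made-up example, and the variable names are illustrative only. It shows the same header value after MIME (RFC 2047) encoding, after decoding, and with the tag attached:

```python
from email.header import Header, decode_header, make_header

# Hypothetical header text, chosen only because it contains non-ASCII characters.
subject_text = "Grüße aus Köln"

# "After MIME encoding": the ASCII-only encoded-word form that goes on the wire.
wire_value = Header(subject_text, charset="utf-8").encode()

# "Before MIME encoding": the decoded text, which is what one would want to edit.
decoded_text = str(make_header(decode_header(wire_value)))

# The same wire value with the tag (the field name before the colon) included.
with_tag = "Subject: " + wire_value

print(wire_value)
print(decoded_text)
print(with_tag)
```

All three are arguably "the header", which is why "stored in wire format" needs to be pinned down before it can be evaluated.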
> I would tend to agree with that, except that if something is
> received/provided in a particular format, it might want to stay in that
> format until such time it is needed in a different format... and then
> the appropriate set of conversions (current format => internal format =>
> needed format) applied as needed, avoiding all conversions when it is
> already in the needed format.
If you mean that the email module will keep track of which form the
object is currently represented in, that will eventually result in
"UnicodeError: octet out of range: 161, ascii".
> two conversions are slower than none, and use 2-4 times the space in
> string format.
Let's get this correct, *then* optimize, please.
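
As an illustration of the UnicodeError failure mode mentioned above, here is a small Python 3 sketch. The header bytes and the helper `naive_as_text` are hypothetical; the helper stands in for any code path that converts lazily and assumes ASCII, and is not meant to describe how the email module behaves:

```python
# Hypothetical header bytes containing octet 161 (0xA1), which is not ASCII.
raw = b"Subject: \xa1Hola!"


def naive_as_text(value):
    """Convert lazily, assuming any bytes are ASCII (the assumption that fails late)."""
    return value.decode("ascii") if isinstance(value, bytes) else value


try:
    naive_as_text(raw)
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    bad_octet = exc.object[exc.start]  # the offending byte value
    print("conversion failed on octet", bad_octet)

# Decoding once, at the boundary, with a known charset avoids the late surprise.
text = raw.decode("iso-8859-1")
print(text)
```

The error surfaces wherever the deferred conversion happens to run, which may be far from the code that received the bytes.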
> One has to write the conversion code anyway; it is just a matter of
> where it is called. Once converted, meta data could be retained in its
> natural format.
Meta data for what? Why would you convert meta data?
> > 2. MUA #1: Composition. Input will be strings and multimedia file
> > names, output will be bytes. Will attributes of message objects
> > be manipulated? Not in a conventional MUA, but an email-based MUA
> > might find uses for that.
>
> I'm not sure what an email-based MUA is.... seems to me even a
> conventional MUA is "email-based"???
Only if it's written using the Python email module.
> > 4. Mailing list processor. Message input will be bytes.
> > Configuration input, including heading and footer texts that may
> > be added are likely to be strings. Header manipulation (adding
> > topics, sequence numbers, RFC 2369 headers) most conveniently done
> > with strings. Output will be bytes.
> >
>
> But the bulk of the message parts, received in wire format, may not need
> to be altered to be sent along in the same wire format.
That depends. For example, multimedia parts may simply be discarded,
in which case it makes sense to not convert them. However, most
Mailman lists do add a footer, and because of crappy Windows MUAs that
don't implement MIME correctly, it's preferred to add that by
concatenating as text. That simply cannot be done correctly in wire
format for any character set except ISO 8859-1.
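
A minimal sketch of the footer problem, under assumed data: the body is taken to arrive as KOI8-R bytes, and the footer is a configured string containing a non-ASCII character. Both values and the address are invented for illustration.

```python
# Hypothetical inputs: a body received as KOI8-R bytes and a configured footer string.
body_wire = "Привет\n".encode("koi8-r")
footer = "-- \nFür Hilfe: help@example.invalid\n"

# Naive approach: append the footer's bytes (UTF-8 here) to the body's wire bytes.
naive_wire = body_wire + footer.encode("utf-8")
# A reader's MUA decodes the whole part with the body's declared charset.
naive_seen = naive_wire.decode("koi8-r")

# Text approach: decode the body, concatenate as strings, then encode once
# in a charset that can represent both pieces.
combined = body_wire.decode("koi8-r") + footer
good_wire = combined.encode("utf-8")

print(repr(naive_seen))  # footer is garbled
print(repr(good_wire.decode("utf-8")))  # footer is intact
```

The byte-level concatenation only looks right when the body and footer happen to share an encoding; working on text and encoding once at output sidesteps that.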
> Heading and footing texts are configured boilerplate, and could be
> cached in a variety of formats to avoid the need to convert them for
> each message,
Premature optimization is the root of all error.
> An archiver could archive wire format,
Are you suggesting that the email module should mandate that? We have
a severe tail-dog inversion problem here.