[Email-SIG] fixing the current email module
Stephen J. Turnbull
stephen at xemacs.org
Wed Oct 7 12:33:42 CEST 2009
Glenn Linderman writes:
> > If you mean that the email module will keep track of what form the
> > object is currently represented by, that will eventually result in
> > "UnicodeError: octet out of range: 161, ascii".
>
> The above sentence does not communicate your meaning to me... or any
> meaning, actually. Can you explain?
Yes, that Unicode error is one that took years for Mailman to work
around. If we are going to be converting different objects at
different times, I'm sure we'll get to see it again in the future. Oh,
joy.
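To make the failure mode concrete, here is a minimal sketch in Python 3
(the exact message text differs from the one quoted above, which came from
an older code path). The names raw_subject and render_subject are
hypothetical, chosen only for illustration. The point is that the bytes
are stored untouched and the conversion happens later, so the error shows
up far from where the bad data entered.

```python
# Hypothetical illustration: raw header bytes are kept as-is and only
# converted later, far from the point where they entered the system.
raw_subject = b"\xa1Oferta especial!"  # 0xA1 == 161, outside the ASCII range


def render_subject(raw: bytes) -> str:
    # A late conversion like this is where the error finally surfaces.
    return raw.decode("ascii")


try:
    render_subject(raw_subject)
    error = None
except UnicodeDecodeError as exc:
    error = exc

# The exception records which octet was rejected and by which codec.
offending_octet = raw_subject[error.start]
codec_name = error.encoding
```

Whether the conversion is done eagerly or lazily, something has to decide
what happens to that octet; deferring the decision does not remove it.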
> If conversions are avoided, then octets are unlikely to be out of
> range?
Haven't looked in your spam bucket recently, I guess. Spammers
regularly put 8-bit characters into headers (and into bodies in
messages without a Content-Type header), for one thing.
> And the email module must be aware of the form of the data in
> order to manipulate it in any format other than wire format, but
> fortunately, wire format declares the format of the data (not to say
> there is not buggy wire format data -- but that is an issue best avoided
> by avoiding as many conversions as possible).
"Best" I can't speak to; you obviously are willing to accept a much
higher error rate than I am. "Robust" handling of buggy wire format
data means that the email module must do something sane with it before
giving it to the application. Maybe it's reasonable to do that
lazily, and/or cache the result, but access to bogus data (that the
email module can determine is bogus or suspicious) must not be allowed
unless the client says "hit me with your best shot" explicitly. Most
clients are simply not going to be prepared for the kind of crap I see
in /var/mail/turnbull every day.
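As a rough sketch of what that could look like (this is not the email
package's API; the class HeaderValue, its attributes, and the keyword
i_know_it_may_be_bogus are all hypothetical names invented here): the
sanitized text is computed lazily and cached, and the raw octets of a
suspicious value are only handed over when the client asks for them
explicitly. Replacing bad octets with U+FFFD is just one possible
definition of "sane", assumed for the example.

```python
class HeaderValue:
    """Hypothetical wrapper around one header value's raw octets."""

    def __init__(self, raw: bytes, declared_charset: str = "ascii"):
        self._raw = raw
        self._charset = declared_charset
        self._text = None  # cache for the sanitized text, filled on first use

    @property
    def suspicious(self) -> bool:
        # The value is suspicious if it does not decode in the declared
        # charset, or if the declared charset is unknown to Python.
        try:
            self._raw.decode(self._charset)
        except (UnicodeDecodeError, LookupError):
            return True
        return False

    @property
    def text(self) -> str:
        # Lazy conversion: decode on first access, then reuse the result.
        if self._text is None:
            try:
                self._text = self._raw.decode(self._charset)
            except (UnicodeDecodeError, LookupError):
                # Assumed policy: replace each undecodable octet with U+FFFD.
                self._text = self._raw.decode("ascii", errors="replace")
        return self._text

    def raw(self, *, i_know_it_may_be_bogus: bool = False) -> bytes:
        # Bogus data is only released when the client opts in explicitly.
        if self.suspicious and not i_know_it_may_be_bogus:
            raise ValueError("raw octets are suspicious; opt in explicitly")
        return self._raw
```

A client that just wants something displayable uses .text and never sees
an exception; a client that really wants the original octets has to say
so, for example header.raw(i_know_it_may_be_bogus=True).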
> I was pushing back from your declaration that an archiver would
> always want string output
Please don't push back; we won't get anywhere. Use cases are
*examples*, not complete specifications of all possible inputs and
outputs. Use cases should be simple and clear cut. If you want a
different use case, state it. In fact in the real world, *all* of the
archivers I know of produce text formats on disk, either deleting
multimedia objects or saving them off and linking to them via URLs in
the text. If you know of a different kind of archiver, add it as a
use case.