[Email-SIG] fixing the current email module

Stephen J. Turnbull stephen at xemacs.org
Fri Oct 9 06:27:56 CEST 2009


Glenn Linderman writes:

 > > Conversions will eventually be done.  "Best it were done quickly."
 > 
 > Disagree.  Deferring the conversions defers failure issues to the point 
 > where the code (hopefully) somewhat understands the type of data being 
 > manipulated, and can then handle it appropriately.  Converting up front 
 > causes errors in things that may never be touched or needed, so the 
 > error detection and handling is wasteful.

That's theory; my position is based on Mailman practice.  Don't believe
me, ask Barry.  I also spend most of my OSS time on the
internationalization of XEmacs, and the experience is similar there.
Best to convert everything as early as possible, or admit that you
don't know how.

 > So for headers, which are supposed to be ASCII, or encoded via RFC rules 
 > to ASCII (no 8-bit chars), then the discovery of an 8-bit char should 
 > produce a defect report, but then simply be converted to Unicode as if it 
 > were Latin-1 (since there is no other knowledge available that could 
 > produce a better conversion).

No, that is already corruption.  Most clients will assume that string
is valid as a header, because it's valid as a string.
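
A minimal sketch of why the Latin-1 fallback counts as corruption. The
header below is hypothetical: it assumes a sender who put raw UTF-8
bytes in a Subject line instead of using RFC 2047 encoding. Latin-1
assigns a character to every byte value, so the decode cannot fail, and
the result is an ordinary string that carries no trace of the problem.

```python
# Hypothetical raw header: UTF-8 bytes sent without RFC 2047 encoding.
raw = "Subject: caf\u00e9".encode("utf-8")

# Latin-1 maps every byte 0-255 to a code point, so this never raises.
as_latin1 = raw.decode("latin-1")

# A strict ASCII decode, by contrast, reports the problem immediately.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# The Latin-1 result is a valid str, but it is not what the sender wrote.
looks_right = as_latin1 == "Subject: caf\u00e9"
```

Here `looks_right` is False and `ascii_ok` is False, yet only the ASCII
decode told anyone. A client handed `as_latin1` has no way to tell from
the string itself that it is damaged.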

 > And if the result of that is not expected by the client (your
 > definition), then the client should either notice the defect report
 > and reject it based on that, or attempt to parse it, and reject it
 > if it encounters unexpected syntax.  After all, this is, for that
 > client, "raw user input" (albeit from a remote source) so fully
 > error checking the input is appropriate.

No way.  That environment would suck to program in.  And it's
un-Pythonic: "Errors should never pass silently."

 > Python way.  Since the email library is trying to avoid raising 
 > exceptions in large blocks of its code, it is non-Pythonic

I disagree with that.  "Unless explicitly silenced."  The strategy
that Barry and I favor is to signal errors lazily.  So we *explicitly*
silence errors (at least of the Exception kind) when parsing.  If we
can't parse, we look for a part terminator, encapsulate the bad stuff
and move on to the rest of the input.  Later, at use time, *if* the
unparsable object is used, *then* the error will be raised, hopefully
with enough metainformation to figure out what to do about it.

I don't see what's un-Pythonic about that.
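
The lazy strategy described above could be sketched roughly as follows.
This is not the email package's code: the class names, the ASCII-only
rule, and the "\n--\n" terminator are all assumptions made up for
illustration. The parser silences the error explicitly, wraps the bad
chunk, and resumes at the next terminator; the error surfaces only if
somebody uses the wrapped part.

```python
class TextPart:
    """A part that parsed cleanly (hypothetical class)."""

    def __init__(self, text):
        self.text = text


class UnparsedPart:
    """Encapsulates input the parser could not handle (hypothetical class)."""

    def __init__(self, raw, error):
        self.raw = raw        # the original bytes, kept for diagnosis
        self.error = error    # the exception that was silenced at parse time

    @property
    def text(self):
        # Use time: raise now, with the original bytes and cause attached.
        raise ValueError(f"unparsable part {self.raw!r}") from self.error


def parse_parts(data, terminator=b"\n--\n"):
    """Split data at the assumed terminator; never raise while parsing."""
    parts = []
    for chunk in data.split(terminator):
        try:
            parts.append(TextPart(chunk.decode("ascii")))
        except UnicodeDecodeError as exc:
            # Explicitly silenced: wrap the bad chunk and move on.
            parts.append(UnparsedPart(chunk, exc))
    return parts


parts = parse_parts(b"hello\n--\ncaf\xe9\n--\nworld")
```

Reading `parts[0].text` and `parts[2].text` works normally. Only
`parts[1].text` raises, and the exception carries both the raw bytes
and the original decoding error as its cause.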

