[Email-SIG] fixing the current email module

R. David Murray rdmurray at bitdance.com
Sat Oct 10 23:20:59 CEST 2009


On Fri, 9 Oct 2009 at 11:59, Glenn Linderman wrote:
> On approximately 10/9/2009 5:05 AM, came the following characters from the 
> keyboard of Barry Warsaw:
>>  On Oct 8, 2009, at 6:39 PM, Glenn Linderman wrote:
>> >  1) wire format.  Either what came in, in the parser case, or what would 
>> >  be generated.
>> >  2) internal headers from the MIME part
>> >  3) decoded BLOB.  This means that quopri and base64 are decoded, no more 
>> >  and no less.  This is bytes.  No headers, only payload.  For 
>> >  Content-Transfer-Encoding: binary, this is mostly a noop.
>> >  4) text/* parts should also be obtainable as str()/unicode(), payload 
>> >  only.  This is where charset decoding is done.
>> > 
>> >  I think your talk in the next paragraph about hooks and other object 
>> >  types being produced is a generalization of 4, not 3, and generally no 
>> >  additional decoding needs to be done, just conversion to the right 
>> >  object type (or file, or file-like object).
>>  I mostly agree with that.  I've always called #4 the "decoded payload" and
>>  #3 I've usually called the "raw payload".  Maybe we can bikeshed on better
>>  terms to help inform us about the API's method/attribute names.
>
> It would be good though to have standardized terms for easier communication. 
> Maybe as they are chosen, they could be added to that Wiki RDM set up?

I didn't set it up, Barry did.  I just started adding stuff ;)

> My only problem with "raw" and "decoded" payload is that there are 3 payload
> formats, not 2, so there needs to be a 3rd term; the three correspond to #1,
> #3, and #4 above.  #2 is somewhat orthogonal to the payload.
>
> To me, "raw" conjures up #1, not #3.

I think I understand why Barry uses it for #3: it's the 'raw data' that
went in to get transfer-encoded in the first place.  But clearly the
term is ambiguous.

I have set up two more documents on the wiki.  One is UseCases[1], and I've
tried to copy into it all of the use cases that have been mentioned in
this discussion, plus a few more.  Edits welcome.

The other is a Glossary[2].  I think most of it accurately reflects the
consensus here, but in it I'm proposing to use the term 'transfer-decoded'
for #3, and 'transfer-encoded' as an alternative to 'wire-format' just
for symmetry.  Comments and suggestions welcome.
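
To make the four numbered forms concrete, here is a minimal sketch.  It uses
the Python 3 standard-library email package as it exists today, which is not
the API being designed in this thread, and the sample message and variable
names are purely illustrative assumptions.

```python
from email import message_from_bytes

# Hypothetical one-part message: Latin-1 text sent as quoted-printable.
wire = (
    b'Content-Type: text/plain; charset="iso-8859-1"\n'
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"caf=E9"
)

msg = message_from_bytes(wire)

# #1 wire format ('transfer-encoded'): the bytes exactly as they arrived.
wire_format = wire

# #2 headers of the MIME part, as (name, value) pairs.
headers = msg.items()

# #3 'transfer-decoded': quoted-printable/base64 undone, still bytes,
#    no charset decoding applied.
transfer_decoded = msg.get_payload(decode=True)

# #4 text payload: charset decoding applied on top of #3, giving a str.
text = transfer_decoded.decode(msg.get_content_charset())

print(transfer_decoded)  # bytes, with the single Latin-1 byte 0xE9 at the end
print(text)              # str, with U+00E9 as the last character
```

Under that reading, #3 and #4 differ only in the final charset step, which is
why a distinct name for #3 seems useful.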

Any other terms of art we should record?

--David

[1] http://wiki.python.org/moin/Email%20SIG/UseCases
[2] http://wiki.python.org/moin/Email%20SIG/Glossary

