[Email-SIG] fixing the current email module
R. David Murray
rdmurray at bitdance.com
Sat Oct 10 23:20:59 CEST 2009
On Fri, 9 Oct 2009 at 11:59, Glenn Linderman wrote:
> On approximately 10/9/2009 5:05 AM, came the following characters from the
> keyboard of Barry Warsaw:
>> On Oct 8, 2009, at 6:39 PM, Glenn Linderman wrote:
>> > 1) wire format. Either what came in, in the parser case, or what would
>> > be generated.
>> > 2) internal headers from the MIME part
>> > 3) decoded BLOB. This means that quopri and base64 are decoded, no more
>> > and no less. This is bytes. No headers, only payload. For
>> > Content-Transfer-Encoding: binary, this is mostly a noop.
>> > 4) text/* parts should also be obtainable as str()/unicode(), payload
>> > only. This is where charset decoding is done.
>> >
>> > I think your talk in the next paragraph about hooks and other object
>> > types being produced is a generalization of 4, not 3, and generally no
>> > additional decoding needs to be done, just conversion to the right
>> > object type (or file, or file-like object).
>> I mostly agree with that. I've always called #4 the "decoded payload" and
>> #3 I've usually called the "raw payload". Maybe we can bikeshed on better
>> terms to help inform us about the API's method/attribute names.
>
> It would be good though to have standardized terms for easier communication.
> Maybe as they are chosen, they could be added to that Wiki RDM set up?
I didn't set it up, Barry did. I just started adding stuff ;)
> My only problem with "raw" and "decoded" payload is that there are 3 payload
> formats, not 2, so there needs to be a 3rd term; the three correspond to #1,
> #3, and #4 above. #2 is somewhat orthogonal to the payload.
>
> To me, "raw" conjures up #1, not #3.
I think I understand why Barry uses it for #3: it's the 'raw data' that
went in to get transfer-encoded in the first place. But clearly the
term is ambiguous.
I have set up two more documents on the wiki. One is UseCases[1], and I've
tried to copy into it all of the use cases that have been mentioned in
this discussion, plus a few more. Edits welcome.
The other is a Glossary[2]. I think most of it accurately reflects the
consensus here, but in it I'm proposing to use the term 'transfer-decoded'
for #3, and 'transfer-encoded' as an alternative to 'wire-format' just
for symmetry. Comments and suggestions welcome.
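As an illustration only, the four forms numbered in the quoted list can be
sketched with the Python 3 standard-library email package as it exists today.
This is an assumption for the sake of example, not the API being designed in
this thread. The sample message and the variable names are hypothetical, and
the names simply mirror the proposed glossary terms.

```python
import base64
from email import message_from_bytes

# Hypothetical sample: a UTF-8 text/plain part, base64 transfer-encoded.
body_text = "café"
wire = (
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    + base64.b64encode(body_text.encode("utf-8"))
    + b"\r\n"
)

msg = message_from_bytes(wire)

# #1 wire format ("transfer-encoded"): the payload as it appeared on the wire.
# In today's stdlib this comes back as a str holding the base64 text.
transfer_encoded = msg.get_payload()

# #2 the part's own headers, separate from any payload form.
part_headers = dict(msg.items())

# #3 "transfer-decoded": base64/quoted-printable undone, nothing else. Bytes.
transfer_decoded = msg.get_payload(decode=True)

# #4 decoded payload: the charset is applied to get text (text/* parts only).
charset = msg.get_content_charset() or "ascii"
decoded_text = transfer_decoded.decode(charset)

print(repr(transfer_encoded))
print(repr(transfer_decoded))
print(repr(decoded_text))
```

Note that in this present-day API a single method, get_payload(), returns
either #1 or #3 depending on a flag, which is the kind of ambiguity that
distinct terms would help avoid.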
Any other terms of art we should record?
--David
[1] http://wiki.python.org/moin/Email%20SIG/UseCases
[2] http://wiki.python.org/moin/Email%20SIG/Glossary