[Email-SIG] fixing the current email module

Fri Oct 9 14:05:44 CEST 2009

On Oct 8, 2009, at 6:39 PM, Glenn Linderman wrote:

> 1) wire format.  Either what came in, in the parser case, or what  
> would be generated.
> 2) internal headers from the MIME part
> 3) decoded BLOB.  This means that quopri and base64 are decoded, no  
> more and no less.  This is bytes.  No headers, only payload.  For  
> Content-Transfer-Encoding: binary, this is mostly a noop.
> 4) text/* parts should also be obtainable as str()/unicode(),  
> payload only.  This is where charset decoding is done.
>
> I think your talk in the next paragraph about hooks and other object  
> types being produced is a generalization of 4, not 3, and generally  
> no additional decoding needs to be done, just conversion to the  
> right object type (or file, or file-like object).

I mostly agree with that.  I've always called #4 the "decoded payload"  
and #3 I've usually called the "raw payload".  Maybe we can bikeshed  
on better terms to help inform us about the API's method/attribute  
names.
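
For concreteness, here is a minimal sketch of the four views Glenn
lists, assuming a recent Python 3 email package. The sample message
and the variable names (wire, headers, blob, text) are made up for
illustration and are not proposed API names.

```python
from email import message_from_bytes

# Hypothetical sample: "café" in ISO-8859-1, base64 transfer-encoded.
RAW = (
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="iso-8859-1"\n'
    b"Content-Transfer-Encoding: base64\n"
    b"\n"
    b"Y2Fm6Q==\n"
)

msg = message_from_bytes(RAW)

# 1) Wire format: what the generator would emit for this part.
wire = msg.as_bytes()

# 2) The MIME part's own headers, as a name -> value mapping.
headers = dict(msg.items())

# 3) Decoded BLOB: base64/quopri undone, nothing else.  Always bytes.
blob = msg.get_payload(decode=True)

# 4) For text/* only: charset decoding turns the BLOB into str.
text = blob.decode(msg.get_content_charset() or "us-ascii")
```

Note that step 4 is the only place the charset parameter is consulted;
step 3 never looks at it.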

Which brings up another point: right now Message objects have a  
single .get_payload() method that takes a flag to indicate whether it  
should be the decoded or raw payload.  That's wrong.  These should be  
different interfaces.
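
One way the split could look, sketched as two free functions layered on
the existing flag-based method. The names raw_payload and
decoded_payload are hypothetical, borrowed from the terms used above;
this is an illustration of the idea, not a proposed API.

```python
from email import message_from_string
from email.message import Message


def raw_payload(part: Message) -> bytes:
    """#3: transfer-decoded bytes (quopri/base64 undone), no headers."""
    if part.is_multipart():
        raise TypeError("multipart containers have no payload of their own")
    return part.get_payload(decode=True)


def decoded_payload(part: Message) -> str:
    """#4: for text/* parts, the raw payload decoded via its charset."""
    if part.get_content_maintype() != "text":
        raise TypeError("only text/* parts decode to str")
    charset = part.get_content_charset() or "us-ascii"
    return raw_payload(part).decode(charset, errors="replace")


# Hypothetical sample message, quoted-printable UTF-8.
sample = message_from_string(
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "caf=C3=A9\n"
)
raw = raw_payload(sample)          # bytes
decoded = decoded_payload(sample)  # str
```

Each function has one return type, which is the point: the caller never
has to remember what a boolean flag means.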

>> The problem is that if the bytes came off the wire, the parser  
>> currently can only attach the most basic MIME base class.  It  
>> doesn't know that an image/png should create a MIMEImagePNG  
>> instance there.  This is different from hacking the model directly  
>> because the application can instantiate the right class.  So the  
>> parser either has to have a hookable way for an application to go  
>> from content-type to class, or the generic MIME base class needs to  
>> be hookable in its .decode() method.
>
> So either the email package can stop at 3, and 4 only for text/*  
> parts, or it could learn more types (registered types, with well- 
> defined corresponding objects could be potentially built-in to the  
> email package), and/or it could become hookable for application  
> types.  Of course, for disposition to files, storing the BLOB in a  
> file of the right name is adequate... to avoid the file, I agree  
> that converting to a useful object type is handy.  But maybe file- 
> like objects would suffice, for most of the types.

My own preference here is that email does support #4 with a  
registration system to handle returning concrete payload objects based  
on the Content-Type.
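
A rough sketch of what such a registration system might look like. The
register decorator, the payload_object function, and the "maintype/*"
wildcard convention are all assumptions made for this example; nothing
here is an existing email package interface.

```python
import json
from email import message_from_string

# Hypothetical registry: content type (or "maintype/*") -> factory.
_PAYLOAD_FACTORIES = {}


def register(content_type):
    """Register a factory taking (part, blob) for the given content type."""
    def decorator(func):
        _PAYLOAD_FACTORIES[content_type.lower()] = func
        return func
    return decorator


def payload_object(part):
    """Return a concrete object for the part, or the BLOB if none is registered."""
    blob = part.get_payload(decode=True)
    factory = (
        _PAYLOAD_FACTORIES.get(part.get_content_type())
        or _PAYLOAD_FACTORIES.get(part.get_content_maintype() + "/*")
    )
    return factory(part, blob) if factory else blob


@register("text/*")
def _text(part, blob):
    # Charset decoding happens here, on top of the transfer-decoded bytes.
    return blob.decode(part.get_content_charset() or "us-ascii", errors="replace")


@register("application/json")
def _json(part, blob):
    # An application-supplied type, registered the same way as text/*.
    return json.loads(blob.decode("utf-8"))


json_part = message_from_string('Content-Type: application/json\n\n{"a": 1}\n')
text_part = message_from_string("Content-Type: text/plain\n\nhello\n")
unknown_part = message_from_string(
    "Content-Type: application/octet-stream\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "AAEC\n"
)
```

Unregistered types fall back to the bytes BLOB, so the package could
stop at #3 for anything it does not know about.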

I also think that the email package probably should not implement  
"store-payloads-on-disk" by default, although it may provide some  
example implementations for simple applications (much the same way  
there's wsgiref for simple applications).  Still, that's different
from, say, storing attachments in a file named by the
Content-Disposition header's filename parameter.  The latter is firmly
in the domain of the application.
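
To illustrate the application-side case, here is a minimal sketch of
saving a part under its Content-Disposition filename. The function name
save_attachment and its sanitizing policy are assumptions for this
example; a real application would pick its own rules for collisions and
unsafe names.

```python
import os
import tempfile
from email import message_from_string


def save_attachment(part, directory):
    """Write the part's decoded BLOB to directory; return the path or None."""
    name = part.get_filename()
    if not name:
        return None
    # Never trust the sender's path: keep only the final component.
    safe = os.path.basename(name.replace("\\", "/"))
    if safe in ("", ".", ".."):
        return None
    path = os.path.join(directory, safe)
    with open(path, "wb") as fh:
        fh.write(part.get_payload(decode=True))
    return path


# Hypothetical sample part with a hostile-looking filename.
attachment = message_from_string(
    "Content-Type: application/octet-stream\n"
    'Content-Disposition: attachment; filename="../evil.bin"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "AAEC\n"
)

with tempfile.TemporaryDirectory() as tmpdir:
    saved = save_attachment(attachment, tmpdir)
    saved_name = os.path.basename(saved)
    saved_inside_tmpdir = os.path.dirname(saved) == tmpdir
    with open(saved, "rb") as fh:
        saved_bytes = fh.read()
```

The filename handling is exactly the sort of policy decision that
belongs to the application rather than to the email package.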

-Barry
