[Email-SIG] fixing the current email module
Barry Warsaw
barry at python.org
Fri Oct 9 14:05:44 CEST 2009
On Oct 8, 2009, at 6:39 PM, Glenn Linderman wrote:
> 1) wire format. Either what came in, in the parser case, or what
> would be generated.
> 2) internal headers from the MIME part
> 3) decoded BLOB. This means that quopri and base64 are decoded, no
> more and no less. This is bytes. No headers, only payload. For
> Content-Transfer-Encoding: binary, this is mostly a noop.
> 4) text/* parts should also be obtainable as str()/unicode(),
> payload only. This is where charset decoding is done.
>
> I think your talk in the next paragraph about hooks and other object
> types being produced is a generalization of 4, not 3, and generally
> no additional decoding needs to be done, just conversion to the
> right object type (or file, or file-like object).
I mostly agree with that. I've always called #4 the "decoded payload"
and #3 I've usually called the "raw payload". Maybe we can bikeshed
on better terms to help inform us about the API's method/attribute
names.
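
For concreteness, here is a rough sketch of the four views using today's stdlib email package on Python 3. The sample message and the variable names are made up for illustration; only the email calls themselves are real.

```python
from email import message_from_bytes

# A hypothetical single-part message, roughly as it might come off the wire.
WIRE = (
    b"From: a@example.com\r\n"
    b'Content-Type: text/plain; charset="iso-8859-1"\r\n'
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"caf=E9\r\n"
)

msg = message_from_bytes(WIRE)

# 1) wire format: what would be generated for this message
wire = msg.as_bytes()

# 2) the headers of the part, as (name, value) pairs
headers = msg.items()

# 3) "raw payload": quopri/base64 undone and nothing else; this is bytes
blob = msg.get_payload(decode=True)

# 4) "decoded payload": charset decoding applied on top of 3; this is str
text = blob.decode(msg.get_content_charset("ascii"))
```

Note that the stdlib does not hand out #4 directly here; the last step is done by hand from the charset parameter.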
Which brings up another point: right now Message objects have a
single .get_payload() method that takes a flag to indicate whether it
should be the decoded or raw payload. That's wrong. These should be
different interfaces.
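
A minimal sketch of what separate interfaces might look like, written as two free functions layered on the existing get_payload(). The names raw_payload() and decoded_payload() are hypothetical, borrowed from the terms above; they are not an existing or proposed API.

```python
from email import message_from_string


def raw_payload(part):
    """#3: bytes with the Content-Transfer-Encoding undone, nothing more."""
    return part.get_payload(decode=True)


def decoded_payload(part, default_charset="ascii"):
    """#4: a text/* payload as str, with charset decoding applied."""
    if part.get_content_maintype() != "text":
        raise TypeError("decoded_payload() only handles text/* parts")
    charset = part.get_content_charset(default_charset)
    return raw_payload(part).decode(charset, errors="replace")


part = message_from_string(
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aGVsbG8=\n"
)

still_encoded = part.get_payload()   # flag off: the base64 text as stored
as_bytes = raw_payload(part)         # same thing the decode=True flag returns
as_text = decoded_payload(part)      # str, after charset decoding
```

With the calls split like this, the return type (bytes or str) follows from which function was called rather than from a boolean argument.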
>> The problem is that if the bytes came off the wire, the parser
>> currently can only attach the most basic MIME base class. It
>> doesn't know that an image/png should create a MIMEImagePNG
>> instance there. This is different from hacking the model directly
>> because the application can instantiate the right class. So the
>> parser either has to have a hookable way for an application to go
>> from content-type to class, or the generic MIME base class needs to
>> be hookable in its .decode() method.
>
> So either the email package can stop at 3, and 4 only for text/*
> parts, or it could learn more types (registered types, with well-
> defined corresponding objects could be potentially built-in to the
> email package), and/or it could become hookable for application
> types. Of course, for disposition to files, storing the BLOB in a
> file of the right name is adequate... to avoid the file, I agree
> that converting to a useful object type is handy. But maybe file-
> like objects would suffice, for most of the types.
My own preference here is that email should support #4 with a
registration system to handle returning concrete payload objects based
on the Content-Type.
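
One way such a registration system could look, as a sketch only. The register() decorator, the payload_object() function and the "text/*" wildcard key are all assumptions made up for this example; nothing like them exists in the email package.

```python
import json
from email import message_from_string

# Hypothetical registry: content type -> converter(part, blob) -> object.
_converters = {}


def register(content_type):
    """Register a converter for an exact type or a 'maintype/*' wildcard."""
    def decorator(func):
        _converters[content_type] = func
        return func
    return decorator


def payload_object(part):
    """Return a concrete object for the part, falling back to the raw bytes."""
    blob = part.get_payload(decode=True)
    # Try the exact type first, then the wildcard for its main type.
    for key in (part.get_content_type(), part.get_content_maintype() + "/*"):
        converter = _converters.get(key)
        if converter is not None:
            return converter(part, blob)
    return blob


@register("text/*")
def _text(part, blob):
    return blob.decode(part.get_content_charset("ascii"), errors="replace")


@register("application/json")
def _json(part, blob):
    return json.loads(blob.decode(part.get_content_charset("utf-8")))


json_part = message_from_string('Content-Type: application/json\n\n{"a": 1}\n')
unknown_part = message_from_string(
    "Content-Type: application/octet-stream\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "AAEC\n"
)
```

Here payload_object(json_part) goes through the registered converter, while payload_object(unknown_part) has no converter and so stops at #3 and returns bytes.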
I also think that the email package probably should not implement
"store-payloads-on-disk" by default, although it may provide some
example implementations for simple applications (much the same way
there's wsgiref for simple applications). Still, that's different
from, say, storing attachments in a file named by the
Content-Disposition header's filename parameter. The latter is firmly
in the domain of the application.
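
To illustrate the application side of that split, here is a small sketch of code an application might write on top of the existing Message.get_filename(). The save_attachment() helper and its sanitizing policy are hypothetical; a real application would pick its own rules for naming and collisions.

```python
import os
from email import message_from_string


def save_attachment(part, directory):
    """Write the part's raw payload to directory; return the path or None."""
    name = part.get_filename()  # filename parameter of Content-Disposition
    if name is None:
        return None
    # Never trust a path from the wire: keep only the final component.
    safe = os.path.basename(name.replace("\\", "/"))
    if safe in ("", ".", ".."):
        return None
    path = os.path.join(directory, safe)
    with open(path, "wb") as fp:
        fp.write(part.get_payload(decode=True))
    return path


attachment = message_from_string(
    "Content-Type: application/octet-stream\n"
    'Content-Disposition: attachment; filename="../../etc/notes.txt"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aGVsbG8=\n"
)
```

Calling save_attachment(attachment, some_directory) writes the five decoded bytes to notes.txt inside that directory, ignoring the leading path components in the header.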
-Barry