[Email-SIG] I miss size() (and some latest frustration)
Barry Warsaw
barry at python.org
Thu Mar 24 22:41:49 CET 2011
On Mar 24, 2011, at 05:10 PM, Steffen Daode Nurpmeso wrote:
>It would be great if the message (file) size would also be
>provided as a public method, so that code-flow decisions can be
>made dependent upon the plain size of a message.
>(The size is known without parsing for many real-life message
>objects anyway or can be detected *cheap*. True, e.g., for
>all Message objects which are created by mailbox.py.)
Certainly the normal FeedParser will see every byte of the message, even if it
does save parts of it on disk. Mailman 3's LMTP server also sees every byte
and tucks the size away on an .original_size attribute of its Message
subclass.
But how would you handle it when you are creating the message yourself? I
think there are too many places you'd have to hook to get an accurate reading,
or you'd have to essentially serialize it via a generator before you'd know,
so it's less than helpful.
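
To make the "serialize it via a generator" cost concrete, here is a rough sketch, assuming a current Python 3 email package (EmailMessage and BytesGenerator). The helper name serialized_size is made up for illustration and is not part of the email package. Note that the whole message has to be flattened on every call, and the answer changes as soon as anything is added:

```python
from email.generator import BytesGenerator
from email.message import EmailMessage
from io import BytesIO


def serialized_size(msg):
    """Return the number of bytes the message occupies once flattened.

    Hypothetical helper: the email package has no size() method.
    """
    buf = BytesIO()
    # mangle_from_=False leaves body lines starting with "From " untouched,
    # so the count reflects the message itself, not an mbox encoding of it.
    BytesGenerator(buf, mangle_from_=False).flatten(msg)
    return buf.tell()


msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "rcpt@example.com"
msg["Subject"] = "size test"
msg.set_content("hello\n")

size_before = serialized_size(msg)
msg["X-Extra"] = "added later"      # any later change invalidates the size
size_after = serialized_size(msg)
```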
It may indeed be possible to ask some external process what the size of the
message is, but it would likely be a hint you couldn't necessarily trust.
(I.e. the server might only have an approximate size.)
So, I'm not sure whether the email package can have a consistent notion of a
message's 'size'. Perhaps though it ought to define an attribute for when the
message is created by a parser, but let it be writable so that e.g. your
application could get it from an IMAP server or whatever, and stick it in the
attribute.
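
Something like the following is what I have in mind, as a sketch only. It assumes Python 3's BytesFeedParser, and the attribute name original_size is just borrowed from the Mailman 3 subclass mentioned above; the email package itself defines no such attribute, and parse_with_size is a hypothetical helper:

```python
from email.parser import BytesFeedParser


def parse_with_size(chunks):
    """Feed byte chunks to a parser and remember how many bytes went in."""
    parser = BytesFeedParser()
    total = 0
    for chunk in chunks:
        total += len(chunk)
        parser.feed(chunk)
    msg = parser.close()
    # Hypothetical attribute: set by the code that saw the bytes, not by
    # the email package.
    msg.original_size = total
    return msg


raw = b"From: a@example.com\r\nSubject: hi\r\n\r\nbody text\r\n"
msg = parse_with_size([raw[:10], raw[10:]])
parsed_size = msg.original_size

# The attribute stays writable, so an application could replace it with a
# size reported by some other source, e.g. an IMAP server. Such a value is
# only a hint and may not match the bytes actually parsed.
msg.original_size = 4096
```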
>It's also so unfortunate that 'headersonly' of Parser is in fact treated as
>"a backwards compatibility hack", effectively consuming the entire input
>nonetheless. And *DesignThoughts* treats lazy parsing/partial loading as an
>"interesting idea" only, though i can think about many cases where it is a
>good thing to parse a Message{Headers[/Part/Part/Part...]} sequentially.
>
>E.g., why should a spam detector load an entire message if it only wants to
>check addresses against some white-/blacklists and simply throw away bad
>hits. Even more, why should a company's dispatcher read all the content if
>it's only about to rewrite addresses and dispatch the mail to some other
>internal server. (Of course - hey, it's you, you know *much* more about this
>stuff than i do.)
Do you have suggestions for how the email package can help with these use
cases? Do you have specific API or implementation proposals?
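
As a starting point for discussion, the white-/blacklist case can be approximated today outside the parser by reading only up to the blank line that ends the header block. This is a sketch under assumptions: it uses Python 3's BytesHeaderParser, read_headers_only is a made-up helper name, and it does not cover lazily walking into individual MIME parts:

```python
from email.parser import BytesHeaderParser
from io import BytesIO


def read_headers_only(fp):
    """Parse just the header block, leaving fp positioned at the body."""
    header_lines = []
    for line in iter(fp.readline, b""):
        if line in (b"\n", b"\r\n"):    # blank line ends the header block
            break
        header_lines.append(line)
    return BytesHeaderParser().parsebytes(b"".join(header_lines))


# A message with a large body; only the header lines are ever read.
raw = BytesIO(
    b"From: spammer@example.net\nTo: me@example.org\n\n" + b"x" * 100000
)
headers = read_headers_only(raw)
sender = headers["From"]
bytes_consumed = raw.tell()     # position just past the blank line
```

The body is never read, so a filter could reject the message or hand the rest of the stream on unchanged.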
Cheers,
-Barry