Reading Huge UnixMailbox Files

Nobody nobody at nowhere.com
Wed Apr 27 08:52:01 EDT 2011


On Tue, 26 Apr 2011 14:02:23 -0700, Dan Stromberg wrote:

> For the archive: This assumes traditional mbox.  A SysV-ish sendmail,
> for example, may not like it.

sendmail itself doesn't deal with mailboxes or spool files; that task is
left to the local delivery agent (e.g. mail.local or procmail).

To clarify: the awk script assumes that any line beginning with
"From " is the start of a message, so any matching lines in the message
body must already be escaped (conventionally as ">From "). sendmail will
do this if the mailer has the "E" flag (F=...E...).
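
For the Python side of this thread, the same assumption can be sketched
as a generator. This is only a minimal sketch, not the awk script itself:
the name split_mbox_simple is made up for illustration, and it assumes
the file is opened in binary mode so that lines arrive as bytes.

```python
def split_mbox_simple(lines):
    """Yield each message as a list of byte lines.

    Assumes every line beginning with b"From " starts a new message,
    i.e. that all such lines in message bodies were escaped on delivery.
    """
    message = []
    for line in lines:
        # A "From " line closes the message collected so far, if any.
        if line.startswith(b"From ") and message:
            yield message
            message = []
        message.append(line)
    # Whatever is left after the last "From " line is the final message.
    if message:
        yield message
```

Iterating over the file object (for example,
"with open(path, 'rb') as f: for msg in split_mbox_simple(f): ...")
reads one line at a time, so only one message is held in memory at once.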

If lines beginning with "From " are only escaped when preceded by a blank
line, you need to maintain a flag which is set when the current line is
the first line in the file or is preceded by a blank line, and cleared
otherwise; a "From " line then starts a new message only while the flag
is set. This is the behaviour of sendmail's mail.local, and of procmail
when invoked with the -Y flag (this is the default when sendmail is
configured with FEATURE(local_procmail)) or when no Content-Length header
is present.
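
A rough Python sketch of that flag, under the same assumptions as any
line-oriented splitter (binary-mode lines; the function name is
hypothetical):

```python
def split_mbox_blank_preceded(lines):
    """Yield each message as a list of byte lines.

    A line beginning with b"From " starts a new message only if it is
    the first line of the input or the previous line was blank.
    """
    message = []
    prev_blank = True  # the first line counts as "preceded by a blank line"
    for line in lines:
        if prev_blank and line.startswith(b"From ") and message:
            yield message
            message = []
        message.append(line)
        # Set the flag for the next line; tolerate both LF and CRLF endings.
        prev_blank = line.rstrip(b"\r\n") == b""
    if message:
        yield message
```

Note that the blank separator line stays at the end of the preceding
message; strip it afterwards if that matters.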

If lines beginning with "From " aren't escaped (relying upon a
Content-Length header), you need to find some other approach (which
probably won't involve traditional line-oriented tools). You also need to
be really careful when processing such files: a wrong Content-Length
value puts the next message boundary in the wrong place.
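
One possible non-line-oriented approach, sketched in Python. This is an
assumption about what could work rather than a tested recipe: it trusts
that Content-Length counts the body bytes exactly, that every message
carries the header, and it raises an error as soon as the next message
does not start where expected. The function name is made up.

```python
import io  # only needed for the demo at the bottom


def read_mbox_content_length(fp):
    """Yield (from_line, header_lines, body) from a binary file object."""
    while True:
        from_line = fp.readline()
        # Skip blank separator lines between messages.
        while from_line and from_line.strip() == b"":
            from_line = fp.readline()
        if not from_line:
            return  # end of file
        if not from_line.startswith(b"From "):
            raise ValueError("expected a 'From ' line, got %r" % from_line)
        headers, length = [], None
        while True:
            line = fp.readline()
            if not line or line.rstrip(b"\r\n") == b"":
                break  # blank line ends the header block
            headers.append(line)
            if line[:1] in (b" ", b"\t"):
                continue  # continuation of the previous header
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.decode("ascii").strip())
        if length is None:
            raise ValueError("no Content-Length header in message")
        body = fp.read(length)  # body may contain unescaped "From " lines
        if len(body) != length:
            raise ValueError("file ends before the body is complete")
        yield from_line, headers, body


if __name__ == "__main__":
    demo = io.BytesIO(b"From a@x Mon\nContent-Length: 3\n\nok\n")
    for from_line, headers, body in read_mbox_content_length(demo):
        print(from_line, len(headers), body)
```

The check on the "From " line is the safety net: if a Content-Length
value is wrong, the reader fails at the following message instead of
silently producing misaligned messages.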



