[issue11728] mbox parser incorrect behaviour
valera
report at bugs.python.org
Thu Mar 31 18:48:39 CEST 2011
valera <vmasutin at apache.org> added the comment:
On Thu, 31 Mar 2011 14:13:50 +0000
"R. David Murray" <report at bugs.python.org> wrote:
>
> R. David Murray <rdmurray at bitdance.com> added the comment:
>
> All the references I could find talk about triggering the match
> without the preceding newline. That is, it is not certain that a
> blank line will precede the 'From ' header, and the typical quoting
> rules for mbox format call for any 'From ' at the start of a line
> (whether preceded by a blank line or not) to be quoted. This might
> have something to do with the fact that otherwise you have to special
> case the first line of the mbox, but I don't really know.
>
> What tool are you using that is producing the unquoted 'From ' lines
> in your mbox? I know there are variants on the mbox format, so if
> one of them has the format you propose, this would become a feature
> request to support that variant mbox format.
>
> ----------
> nosy: +r.david.murray
>
Hello, David!
The message comes from the Netcraft mailing list. The host that accepted
it runs sendmail with some antivirus software on top (mimedefang +
spamassassin, as far as I know).
It could be that something is broken in that chain. I spotted the error
while writing a script for mailbox --> maildir conversion as part of
migrating this server.
So I had to subclass mailbox.mbox and fix it as I needed; I'll investigate
further what led to this behaviour.
Nevertheless, here is a snippet from RFC 4155:
In order to improve interoperability among messaging systems, this
memo defines a "default" mbox database format, which MUST be
supported by all implementations that claim to be compliant with this
specification.
The "default" mbox database format uses a linear sequence of Internet
messages, with each message being immediately prefaced by a separator
line, and being terminated by an empty line.
---
So I think it is fine to assume that an empty line precedes every
"From " separator line (for the second message onwards). That would
help to deal with all kinds of mbox mailboxes, and the fix is rather
trivial.
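
The rule proposed above could be sketched roughly as follows. This is only
an illustration, not the actual change to mailbox.mbox: split_mbox and
_SEPARATOR are hypothetical names, the sample addresses are made up, and
the sketch assumes LF line endings and an mbox small enough to hold in
memory.

```python
import re

# Hypothetical helper, not part of the mailbox module.
# A "From " line is treated as a separator only when it sits at the very
# start of the data or directly after an empty line (assumes LF endings).
_SEPARATOR = re.compile(rb"(?:\A|(?<=\n\n))From ")


def split_mbox(data):
    """Split raw mbox bytes into messages, each starting at its separator line."""
    starts = [m.start() for m in _SEPARATOR.finditer(data)]
    ends = starts[1:] + [len(data)]
    return [data[s:e] for s, e in zip(starts, ends)]


# Made-up sample: the first message contains an unquoted "From " body line
# that is NOT preceded by an empty line, so it must stay inside the body.
sample = (
    b"From alice@example.org Thu Mar 31 18:48:39 2011\n"
    b"Subject: first\n"
    b"\n"
    b"Body line\n"
    b"From here on, this line is body text, not a separator.\n"
    b"\n"
    b"From bob@example.org Thu Mar 31 19:00:00 2011\n"
    b"Subject: second\n"
    b"\n"
    b"Another body\n"
    b"\n"
)

messages = split_mbox(sample)
```

With this rule the sample splits into two messages instead of three. The
trade-off is that an mbox whose messages are not terminated by an empty
line would no longer be split at all, so a real fix would need to decide
how to treat such files.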
Best regards,
Valery Masiutsin
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue11728>
_______________________________________