[issue11728] mbox parser incorrect behaviour

valera report at bugs.python.org
Thu Mar 31 18:48:39 CEST 2011


valera <vmasutin at apache.org> added the comment:

On Thu, 31 Mar 2011 14:13:50 +0000
"R. David Murray" <report at bugs.python.org> wrote:

> 
> R. David Murray <rdmurray at bitdance.com> added the comment:
> 
> All the references I could find talk about triggering the match
> without the preceding newline.  That is, it is not certain that a
> blank line will precede the 'From ' header, and the typical quoting
> rules for mbox format call for any 'From ' at the start of a line
> (whether preceded by a blank line or not) to be quoted.  This might
> have something to do with the fact that otherwise you have to special
> case the first line of the mbox, but I don't really know.
> 
> What tool are you using that is producing the unquoted 'From ' lines
> in your mbox?  I know there are variants on the mbox format, so if
> one of them has the format you propose, this would become a feature
> request to support that variant mbox format.
> 
> ----------
> nosy: +r.david.murray
> 

Hello, David!

This is an email from the netcraft mailing list. The host that accepted
it runs sendmail with some antivirus software on top: mimedefang +
spamassassin, from what I know.
It could be that something is broken in that chain. I spotted the error
while writing a script for mailbox --> maildir conversion as part of
migrating this server.
So I had to subclass mailbox.mbox and fix it as I needed; I'll investigate
further what led to this behaviour.
Nevertheless, here is a snippet from RFC 4155:

 In order to improve interoperability among messaging systems, this
 memo defines a "default" mbox database format, which MUST be
 supported by all implementations that claim to be compliant with this
 specification.

 The "default" mbox database format uses a linear sequence of Internet
 messages, with each message being immediately prefaced by a separator
 line, and being terminated by an empty line.

---
So I think assuming that there should be an empty line before the
"From " separator line is fine (for the second and later messages), and
it would help to deal with all kinds of mbox mailboxes. The fix is
rather trivial.
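
To illustrate the rule proposed above, here is a minimal sketch, not the
mailbox module's actual code. It assumes a binary stream and treats a
"From " line as a separator only when it is the first line of the file or
follows an empty line. The names find_message_offsets and sample are
hypothetical and exist only for this example.

```python
import io


def find_message_offsets(stream):
    """Return byte offsets of 'From ' separator lines in a binary mbox stream.

    A line counts as a separator only if it is the first line of the
    stream or the line before it was empty (assumed rule, see above).
    """
    offsets = []
    prev_blank = True  # the start of the file counts as "after a blank line"
    while True:
        pos = stream.tell()
        line = stream.readline()
        if not line:
            break  # end of stream
        if prev_blank and line.startswith(b"From "):
            offsets.append(pos)
        # Accept both LF and CRLF empty lines.
        prev_blank = line in (b"\n", b"\r\n")
    return offsets


# Made-up sample: the first message body contains an unquoted "From " line
# that does not follow an empty line, so it is not taken as a separator.
sample = (
    b"From alice@example.org Thu Mar 31 14:13:50 2011\n"
    b"Subject: first\n"
    b"\n"
    b"Body text\n"
    b"From here on the body continues unquoted\n"
    b"\n"
    b"From bob@example.org Thu Mar 31 18:48:39 2011\n"
    b"Subject: second\n"
    b"\n"
    b"Another body\n"
    b"\n"
)
offsets = find_message_offsets(io.BytesIO(sample))
print(len(offsets))
```

With this rule the unquoted "From here on ..." line stays inside the first
message. An unquoted "From " line that does follow an empty line inside a
body would still be taken as a separator, so quoting remains necessary for
that case.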

Best regards,
Valery Masiutsin

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue11728>
_______________________________________

