[Patches] [ python-Patches-521478 ] mailbox / fromline matching

noreply@sourceforge.net noreply@sourceforge.net
Fri, 01 Mar 2002 13:42:23 -0800


Patches item #521478, was opened at 2002-02-22 09:54
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=521478&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Closed
Resolution: Rejected
Priority: 5
Submitted By: Camiel Dobbelaar (camield)
Assigned to: Guido van Rossum (gvanrossum)
Summary: mailbox / fromline matching

Initial Comment:
mailbox.py does not parse this 'From' line correctly:
From camield@sentia.nl Mon Apr 23 18:22:28 2001 +0200
                                                ^^^^^
This is because of the trailing timezone information,
which the regex does not account for.

Also, 'From' should match at the beginning of the line.

----------------------------------------------------------------------

>Comment By: Barry Warsaw (bwarsaw)
Date: 2002-03-01 16:42

Message:
Logged In: YES 
user_id=12800

IMO, Jamie Zawinski (author of the original mail/news reader
in Netscape, among other accomplishments) wrote the
definitive answer on From_:

http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html

As far as Python's support for this in the mailbox module,
for backwards compatibility, the UnixMailbox class has a
strict-ish interpretation of the From_ delimiter, which I
think should not change.  It also has a class called
PortableUnixMailbox which recognizes delimiters as specified
in JWZ's document.  Personally, if I were trolling over a
real-world mbox file I'd only use PortableUnixMailbox (as
long as non-delimiter From_ lines were properly escaped -- I
have some code in Mailman which tries to intelligently "fix"
non-escaped mbox files).
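
To make the lenient rule and the escaping requirement concrete, here is a
minimal sketch. It assumes the portable rule is simply "any line that starts
with 'From ' is a delimiter" and that body lines are protected with a leading
'>'. The function names are hypothetical and are not part of the mailbox
module or of Mailman.

```python
def is_portable_delimiter(line):
    # Lenient rule (assumed): any line beginning with "From " starts a message.
    return line.startswith("From ")


def escape_body(lines):
    # Prefix body lines that would otherwise look like delimiters with ">".
    # This simple form does not re-escape lines that already start with ">From ".
    return [">" + line if line.startswith("From ") else line for line in lines]


body = ["Hello,\n", "From now on the build is green.\n", "Bye\n"]
escaped = escape_body(body)
```

Under this rule an unescaped body line such as "From now on ..." would split
the message in two, which is why the escaping step matters.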

I agree with the Rejected resolution.

----------------------------------------------------------------------

Comment By: Camiel Dobbelaar (camield)
Date: 2002-03-01 06:34

Message:
Logged In: YES 
user_id=466784

I have tracked this down to Pine, the mailreader. 

In imap/src/c-client/mail.c, it has this flag:
 static int notimezones = NIL;    /* write timezones in "From " header */

(so timezones are written in the "From" lines by default)

I also found the following comment in imap/docs/FAQ in the
Pine distribution:

"""
So, good mail reading software only considers a line to be a
"From " line if it follows the actual specification for a
"From " line. This means, among other things, that the day
of week is fixed-format: "May 14", but "May  7" (note the
extra space) as opposed to "May 7".  ctime() format for the
date is the most common, although POSIX also allows a
numeric timezone after the year.
"""

While I don't consider Pine to be the ultimate mailreader,
its heritage may warrant that the 'From ' lines it creates
are considered 'standard'.
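
The fixed-width day field that the FAQ describes can be seen from Python
itself. This is only an illustration of the ctime() layout; the dates are
picked for the example, and the last line just appends the numeric zone by
hand to show the variant Pine writes.

```python
from datetime import datetime

# ctime() pads a single-digit day with a space so the field stays two wide.
single_digit_day = datetime(2001, 5, 7, 18, 22, 28).ctime()
double_digit_day = datetime(2001, 5, 14, 18, 22, 28).ctime()

# Same layout with a numeric timezone after the year, appended manually.
with_zone = datetime(2001, 4, 23, 18, 22, 28).ctime() + " +0200"
```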

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-02-28 17:37

Message:
Logged In: YES 
user_id=6380

That From line is simply illegal, or at least nonstandard.

If your system uses this nonstandard format, you can extend
the mailbox parser by overriding the ._isrealfromline
method.
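
A rough sketch of that kind of override follows. StrictFromMatcher is a
hypothetical stand-in so the example runs on its own; it is not the real
UnixMailbox class, and its pattern only approximates a ctime()-style From_
line rather than reproducing the regex in mailbox.py.

```python
import re


class StrictFromMatcher:
    """Hypothetical stand-in for the strict parser, not the real mailbox class."""

    # Approximate ctime()-style From_ line: address, weekday, month, day,
    # time, four-digit year, and nothing after the year.
    _fromlinepattern = (
        r"From \s*\S+\s+\w\w\w\s+\w\w\w\s+\d?\d\s+"
        r"\d?\d:\d\d(:\d\d)?\s+\d\d\d\d\s*$"
    )

    def _isrealfromline(self, line):
        return re.match(self._fromlinepattern, line) is not None


class TimezoneFromMatcher(StrictFromMatcher):
    """Also accepts a numeric zone such as +0200 after the year."""

    _tzsuffix = re.compile(r"\s+[+-]\d{4}\s*$")

    def _isrealfromline(self, line):
        # Drop the trailing zone, then defer to the strict check.
        return super()._isrealfromline(self._tzsuffix.sub("", line))


line = "From camield@sentia.nl Mon Apr 23 18:22:28 2001 +0200\n"
strict_result = StrictFromMatcher()._isrealfromline(line)
lenient_result = TimezoneFromMatcher()._isrealfromline(line)
```

The subclass leaves the strict behaviour untouched for everyone else, which
is the point of extending rather than patching the default pattern.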

The pattern doesn't need ^ because match() is used, which
only matches at the start of the line.
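
A short check of that behaviour, using a deliberately simplified pattern
(an assumption for the example, not the one in mailbox.py). match() anchors
at the start of the string it is given, and since the parser hands it one
line at a time, that is the start of the line.

```python
import re

# Simplified pattern for illustration only; note there is no leading "^".
pattern = re.compile(r"From \S+")

# match() succeeds only when the pattern matches at position 0.
at_start = pattern.match("From camield@sentia.nl Mon Apr 23 18:22:28 2001")
escaped = pattern.match(">From camield@sentia.nl Mon Apr 23 18:22:28 2001")

# search() scans the whole string, so it would find the escaped line too.
found_anywhere = pattern.search(">From camield@sentia.nl Mon Apr 23 18:22:28 2001")
```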

Rejected.


----------------------------------------------------------------------
