[Mailman-Developers] mailman misparses non RFC 822 compliant Email addresses

Georg Mischler schorsch@schorsch.com
Thu, 12 Apr 2001 23:27:55 -0400 (EDT)

Dan Mick wrote:

> > >>>>> "MM" == Marc MERLIN <marc_news@valinux.com> writes:
> >
> >     MM> Correct.  I haven't found the right piece of code in mailman
> >     MM> yet, but it smells like a regular expression that grabs the
> >     MM> wrong pattern and ends up with davide.fox instead of
> >     MM> dfox@m206-157.dsl.tsoft.com
> >
> > I suspect the problem is in Python's rfc822 module:
> >
> > >>> a = rfc822.AddrlistClass('David E.Fox <dfox@m206-157.dsl.tsoft.com>')
> > >>> a.getaddrlist()
> > [('', 'DavidE.Fox'), ('', 'dfox@m206-157.dsl.tsoft.com')]
> >
> > Mailman, interestingly enough, has Utils.ParseAddrs() which appears to
> > try to work around problems in rfc822:
> >
> > >>> ParseAddrs('David E.Fox <dfox@m206-157.dsl.tsoft.com>')
> > 'dfox@m206-157.dsl.tsoft.com'
> >
> > which seems to suck out the right address in this case.  What should
> > probably happen in MM2.1 is for MailList.HasExplicitDest() to fall
> > back on the output of ParseAddrs() if the rfc822 method doesn't match.
> I'm coming to this late, but: we all agree that
> David E.Fox <dfox@m206-157.dsl.tsoft.com>
> is not a legal RFC822 address, right?

I can't find anything in the RFC document that says so.

> Strings containing spaces must-repeat-must be enclosed in
> doublequotes:
> "David E.Fox" <dfox@m206-157.dsl.tsoft.com>

As far as I can tell, the standard only requires quotes
when whitespace characters are present within the address
part (inside the <...>). The rest of the string is
regarded as a comment, and may take pretty much any form.
There are actually numerous examples in the RFC document
that contain spaces in this comment part, where the
comment part is *not* enclosed in any quotes.

> I see the subject, but: why is this considered to be a "problem"
> in rfc822, rather than a problem in the mail format?  I would
> expect other MTAs and MUAs to barf on this too.

This is indeed a problem with the rfc822 module, assuming
that the following text in the standard document (copied
literally) is to be taken seriously:


     A.1.  ADDRESSES

     A.1.1.  Alfred Neuman <Neuman@BBN-TENEXA>

     A.1.2.  Neuman@BBN-TENEXA

             These two "Alfred Neuman" examples have identical  seman-
        tics, as far as the operation of the local host's mail sending
        (distribution) program (also sometimes  called  its  "mailer")
        and  the remote host's mail protocol server are concerned.  In
        the first example, the  "Alfred  Neuman"  is  ignored  by  the
        mailer,  as "Neuman@BBN-TENEXA" completely specifies the reci-
        pient.  The second example contains  no  superfluous  informa-
        tion,  and,  again,  "Neuman@BBN-TENEXA" is the intended reci-

Since the part within the angle brackets is unambiguously
identifiable as the actual address, the parser should just
ignore anything else on the same line. It may be wise to use
quotes when sending messages, but refusing something that
is explicitly used as a valid example in the standard
document seems to be bad style, at the very least...
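
To illustrate what I mean by "just ignore anything else", here is a
minimal sketch in Python. It is not the code of the rfc822 module or of
Mailman's Utils.ParseAddrs(); the helper name extract_angle_addr and the
regular expression are made up for this example. It assumes a single
address per string and makes no attempt to handle group syntax, routes,
or a '<' hidden inside a quoted string.

```python
import re

# Hypothetical helper for illustration only; not part of Mailman or rfc822.
# Matches the first <local@domain> with no whitespace or nested brackets.
_ANGLE_ADDR = re.compile(r'<([^<>\s]+@[^<>\s]+)>')


def extract_angle_addr(field):
    """Return the address inside <...>, or None if no such part exists."""
    match = _ANGLE_ADDR.search(field)
    return match.group(1) if match else None


if __name__ == '__main__':
    for text in ('David E.Fox <dfox@m206-157.dsl.tsoft.com>',
                 'Alfred Neuman <Neuman@BBN-TENEXA>',
                 'Neuman@BBN-TENEXA'):
        print(repr(text), '->', extract_angle_addr(text))
```

A bare address such as the A.1.2 example has no angle brackets, so the
sketch returns None for it and the caller would still need the regular
parser for that case.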

Have fun!


Georg Mischler  --  simulations developer  --  schorsch at schorsch.com
+schorsch.com+  --  lighting design tools  --  http://www.schorsch.com/