Re: [Mailman-Developers] "@" in mail text gets replaced in archives

30 Sep 2003 · *syntax*

      [John A. Martin]
...
"Harald" == Harald Meland
Harald> It is not clear to me that Mailman *is* an MTA.  It is not
Harald> an SMTP server, and is not (necessarily) an SMTP client.
To have been precise perhaps I should have said something like "a mail
agent must not muck with an existing Message-Id except as specified by
the applicable standards".  The Applicable Standards, to quote for
example rfc2822, apply as follows:
This standard specifies a syntax for text messages that are sent
between computer users, within the framework of "electronic mail"
messages.
I agree that it is obvious that Mailman should strive to avoid sending
non-RFC2822-compliant messages.
However, I would think that the issue at hand is not about message
*syntax*, but rather about the *semantic* value of a message's
Message-Id.
Now that that nit is off my chest :-), I'll be quick to agree that RFC
2822 surely does contain a fair number of semantic specifications as
well; more on that below.
...
The applicable standards govern what goes on the wire and therefore
what Mailman causes to be put on the wire through an MTA should be
compliant.
Mailman is sort of between a rock and a hard place here, as it
occupies a double role:

* Mailman should be liberal in what it accepts -- which seems to
  imply that it should accept incoming messages even if they do not
  conform strictly to all aspects of RFC 2822.
  As one example, Mailman shouldn't offhandedly reject an incoming
  message just because there is a slight address syntax error in the
  message's From: header.

* At the same time, Mailman should be conservative in what it sends.
  Naively, this would mean that Mailman ought to ensure that any
  message it puts on the wire conforms with RFC 2822; however, that
  would then have to clash either with the "liberal in what you
  accept" idea, or with the "don't change the message" maxim.

...
Harald> However, even if Mailman isn't an MTA, it would be nice if
Harald> it *mostly* tries to follow the MTA rules.

Harald> (As a side note, I am unable to find *clear* references to
Harald> the effect of your statement in RFCs 2821 or 2822.)
Rfc2822 Section 3.6.4 (the first paragraph below is the same paragraph
you quoted elsewhere)
[[ ... ]]
The "Message-ID:" field provides a unique message identifier that
refers to a particular version of a particular message.  The
uniqueness of the message identifier is guaranteed by the host that
generates it (see below).  This message identifier is intended to be
machine readable and not necessarily meaningful to humans.  A message
identifier pertains to exactly one instantiation of a particular
message; subsequent revisions to the message each receive new message
identifiers.
Note: There are many instances when messages are "changed", but those
changes do not constitute a new instantiation of that message, and
therefore the message would not get a new message identifier.  For
example, when messages are introduced into the transport system, they
are often prepended with additional header fields such as trace
fields (described in section 3.6.7) and resent fields (described in
section 3.6.6).  The addition of such header fields does not change
the identity of the message and therefore the original "Message-ID:"
field is retained.  In all cases, it is the meaning that the sender
of the message wishes to convey (i.e., whether this is the same
message or a different message) that determines whether or not the
"Message-ID:" field changes, not any particular syntactic difference
that appears (or does not appear) in the message.
Rfc822 Section 4.6.1 (in its entirety):
         This field contains a unique identifier  (the  local-part
    address  unit)  which  refers to THIS version of THIS message.
    The uniqueness of the message identifier is guaranteed by  the
    host  which  generates  it.  This identifier is intended to be
    machine readable and not necessarily meaningful to humans.   A
    message  identifier pertains to exactly one instantiation of a
    particular message; subsequent revisions to the message should
    each receive new message identifiers.
Rfc2822 in this case merely codifies long established practice
interpreting rfc822.  Rfc2822 Appendix A.3 may be helpful for the
present discussion.
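To make the quoted note concrete, here is a minimal sketch using Python's
standard email package. The addresses, host names and the helper
add_trace_field are made up for illustration; this is not how Mailman or
any MTA actually does it. It only shows that prepending a trace field
leaves the existing Message-ID untouched.

```python
from email import message_from_string

# A made-up message as it might arrive at the list server.
original = (
    "From: alice@example.org\n"
    "To: list@example.org\n"
    "Message-ID: <1234.abcd@example.org>\n"
    "Subject: hello\n"
    "\n"
    "Body text.\n"
)


def add_trace_field(raw: str, received: str) -> str:
    """Prepend a Received: trace field; every other field is left as-is."""
    return f"Received: {received}\n{raw}"


relayed = add_trace_field(
    original,
    "from mx.example.org by lists.example.org; Tue, 30 Sep 2003 12:00:00 +0000",
)

before = message_from_string(original)
after = message_from_string(relayed)

# The trace field is new, but the identity of the message is unchanged.
print(after.keys()[0])
print(before["Message-ID"] == after["Message-ID"])
```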
The part that (still) isn't clear to me, is whether Mailman's action
of putting the message back on the wire can be said to be either 1)
generation of a new message (personally, I wouldn't think so) or 2) a
new instantiation of the message.
...
To test for compliance with the rfc2822 determination "whether this is
the same message or a different message" one might stipulate that if
the PGP signature verifies, it is the same message; if the PGP
signature does not verify, it is a different message.
Now we're deeply into message semantics. :-)
I'd like to point out two things about your argument:
Firstly, the RFC does not merely distinguish between "the same message
or a different message"; it also allows Message-ID: to be changed
whenever there is a new instantiation of a (single) message.
Secondly, having to resort to (your) *interpretation* of the RFC, by
using verification of PGP signatures for the test, is in my book a
clear indication that the RFC is *not* crystal clear on this issue.
...
(One certainly can see by inspection what would break a signature
without actually verifying the signature, right?)
That is my (rather shallow, I'm afraid) understanding of PGP email
signatures, yes.
...
Harald> Um.  Mailman lists have numerous configuration options for
Harald> changing messages (e.g. adding footers) before they are
Harald> sent to the list members, and it has had such options
Harald> since time immemorial.
Who reads the RFCs to say that footers cannot be added without
changing the message?
The more interesting issue, I think, is where should the line be
drawn; how much is Mailman allowed to change (various parts of) a
message before it should be considered a new message?
And how does the Mailman modus operandi fit in with the RFC's "new
instantiation" wording?
...
Harald> * To my mind it would not be obviously wrong to view
Harald>    Mailman as the *generator* of messages, at the very
Harald>    least in the cases where it is obvious that the
Harald>    previous generator didn't do its job of guaranteeing
Harald>    message-id uniqueness properly.
Why?
Given that there exist two (or more) distinct messages that share the
same message-id, the uniqueness of this identifier (as prescribed by
RFC 2822) is clearly not satisfied.  Hence, if Mailman really wants to
have the messages it puts on the wire conform with RFC 2822, it should
take on the role of message generator, and issue distinct message-ids
for such distinct messages.
The hard problem, of course, is to properly discover whether or not
two messages are indeed distinct; they might differ slightly by
e.g. an automatically added footer, or in some other minor, but
programmatically hard to discover, fashion.
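A rough sketch of what "taking on the role of generator" could look like,
in Python. Everything here is an assumption made for illustration:
ensure_unique_id, body_digest and the domain lists.example.org are
hypothetical names, and comparing a hash of the body is exactly the naive
test described above, so a footer added along the way would make two
copies of one message look distinct.

```python
import hashlib
from email import message_from_string
from email.message import Message
from email.utils import make_msgid


def body_digest(msg: Message) -> str:
    """Naive fingerprint: SHA-256 of the body (whole message if multipart)."""
    payload = msg.get_payload()
    text = payload if isinstance(payload, str) else msg.as_string()
    return hashlib.sha256(text.encode("utf-8", "replace")).hexdigest()


def ensure_unique_id(msg: Message, seen: dict, domain: str = "lists.example.org") -> str:
    """Keep the Message-ID unless it is missing or already used by a
    message with a different body; in that case issue a fresh one."""
    msgid = msg.get("Message-ID")
    digest = body_digest(msg)
    if msgid is None or seen.get(msgid, digest) != digest:
        msgid = make_msgid(domain=domain)
        del msg["Message-ID"]  # harmless if the header was absent
        msg["Message-ID"] = msgid
    seen[msgid] = digest
    return msgid


seen_ids = {}
first = message_from_string("Message-ID: <dup@example.org>\n\nfirst body\n")
second = message_from_string("Message-ID: <dup@example.org>\n\nsecond body\n")
print(ensure_unique_id(first, seen_ids))   # original id is kept
print(ensure_unique_id(second, seen_ids))  # a newly generated id
```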
...
ISTM the problem you are trying to solve is how to identify the
archive image of the message.
Why not construct a URL containing a scrubbed Message-Id (as Brad
Knowles has indicated) and a serial number (as I have indicated)?
Because, as Barry said, that would mean the "archive image identity"
of the messages could change whenever the archive needs to be rebuilt
(e.g. after a disk crash, the archives are gone, and there are no
backups; then some kind list member comes forward with a partial
archive constructed from the messages they've received from the list).
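For illustration only, a small Python sketch of the kind of URL being
proposed. The base URL, the path layout and the meaning of "scrubbed"
(here: angle brackets stripped, the rest percent-encoded) are my
assumptions, not anything Mailman or Pipermail does.

```python
from urllib.parse import quote

# Hypothetical archive location, not a real Mailman setting.
ARCHIVE_BASE = "http://lists.example.org/archive"


def scrub_msgid(msgid: str) -> str:
    """Strip the angle brackets and percent-encode everything unsafe."""
    return quote(msgid.strip().strip("<>"), safe="")


def archive_url(msgid: str, serial: int) -> str:
    """Combine an archive serial number with the scrubbed Message-Id."""
    return f"{ARCHIVE_BASE}/{serial:06d}/{scrub_msgid(msgid)}"


print(archive_url("<1234.abcd@example.org>", 42))
```

The Message-Id part of such a URL survives a rebuild, but the serial
number is only as stable as the order in which the archive happens to be
reconstructed, which is the objection above.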
...
Such a URL should go into the "List-Archive" header field pointing to
the specific message without doing violence to rfc2369 Section 3.6,
right?
I don't think that's too far from the intention of that header, no.
That section seems rather loosely worded, something I hope was done
intentionally:
3.6. List-Archive
 The List-Archive field describes how to access archives for the list.

 Examples:

   List-Archive: <mailto:archive@host.com?subject=index%20list>
   List-Archive: <ftp://ftp.host.com/pub/list/archive/>
   List-Archive: <http://www.host.com/list/archive/> (Web Archive)
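As a sketch of how such a field could be set with Python's standard email
package (set_list_archive is a hypothetical helper, not a Mailman
function; the URL is the one from the RFC example):

```python
from email.message import Message


def set_list_archive(msg: Message, url: str) -> None:
    """Set a single List-Archive field, with the URL in angle brackets."""
    del msg["List-Archive"]  # drop any earlier value; no error if absent
    msg["List-Archive"] = f"<{url}>"


msg = Message()
msg["Subject"] = "example"
set_list_archive(msg, "http://www.host.com/list/archive/")
print(msg["List-Archive"])
```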
--
Harald

Harald Meland