[Mailman-Users] This should not have happened
mark at msapiro.net
Sat May 8 23:38:45 CEST 2010
On 5/8/2010 1:05 PM, Lindsay Haisley wrote:
> The poster used an "Approved" pseudo-header. Mailman found the
> pseudo-header in the text/plain part, removed it, and approved the post
> for distribution. However in the text/html portion, the pseudo-header
> was mucked up with markup and was apparently unrecognizable to Mailman.
> It shows up in the message source as:
> <p style=3D"margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Arial">Approved: =
> For rather obvious reasons, Mailman didn't find this rendition of the
> pseudo-header, but because it found the Approved pseudo-header in the
> text/plain portion, it distributed the message - with the administrator
> password clearly displayed to the subscriber list for everyone with an
> HTML-capable mail reader to see! Now this (very technically challenged)
> customer has to change her list admin password and I have to work with
> her to ensure that this won't happen again.
> HTML-ized email is a real PITA, and we've had problems with the
> pseudo-header before. It seems to me that if a submitted email has both
> a text/plain and a text/html part, Mailman should look _first_ for the
> pseudo-header in the text/html portion, and if it's not found there, the
> post should be rejected at that point even if the pseudo-header is
> clearly present in a text/plain part. These two sections are supposed to
> be identical as far as content goes, or at least we can expect Mailman
> to assume that they are.
> How can this be prevented? As far as I'm concerned, this is a bug.
It is a bug, <https://bugs.launchpad.net/mailman/+bug/266220>.
My comments in the code say
# MAS: Bug 1181161 - Now try all the text parts in case it's
# multipart/alternative with the approved line in HTML or other
# text part. We make a pattern from the Approved line and delete
# it from all text/* parts in which we find it. It would be
# better to just iterate forward, but email compatibility for pre
# Python 2.2 returns a list, not a true iterator.
# This will process all the multipart/alternative parts in the
# message as well as all other text parts. We shouldn't find the
# pattern outside the mp/a parts, but if we do, it is probably
# best to delete it anyway as it does contain the password.
# Make a pattern to delete. We can't just delete a line because
# a line of HTML or other fancy text may include additional message
# text. This pattern works with HTML. It may not work with rtf
# or whatever else is possible.
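As a rough illustration of what those comments describe, here is a minimal sketch that walks every text/* part of a message and deletes the Approved: line wherever it is found. It is written for Python 3's email package rather than the Python 2 code Mailman actually uses, and the helper name delete_approved_from_text_parts is hypothetical, not a Mailman function.

```python
import re
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def delete_approved_from_text_parts(msg, passwd, name="Approved"):
    """Delete 'name: passwd' from every text/* part of msg, in place.

    Hypothetical helper for illustration only. Returns the number of
    parts that were changed.
    """
    pattern = re.escape(name) + r":(\xA0|\s|&nbsp;)*" + re.escape(passwd)
    changed = 0
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue  # skip multipart containers and non-text parts
        charset = part.get_content_charset() or "us-ascii"
        # decode=True undoes any quoted-printable or base64 encoding
        raw = part.get_payload(decode=True)
        text = raw.decode(charset, "surrogateescape")
        cleaned = re.sub(pattern, "", text)
        if cleaned != text:
            # Drop the old encoding header so set_payload re-encodes the part
            del part["Content-Transfer-Encoding"]
            part.set_payload(cleaned, charset)
            changed += 1
    return changed


# Usage: a multipart/alternative post carrying the pseudo-header in both parts
msg = MIMEMultipart("alternative")
msg.attach(MIMEText("Approved: Hon94Bar\nHello list\n", "plain", "us-ascii"))
msg.attach(MIMEText("<p>Approved: &nbsp;Hon94Bar</p><p>Hello list</p>",
                    "html", "us-ascii"))
parts_changed = delete_approved_from_text_parts(msg, "Hon94Bar")
```

Only the matched text is substituted out, not the whole line, because a single line of HTML can also carry message text that has to survive.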
So the question is why this fails in this case. The HTML part is
clearly QP encoded, but we decode that and it decodes to
<p style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Arial">Approved: \xA0Hon94Bar
where \xA0 is the hex representation of the actual character, which
is a no-break space.
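For illustration, the decoding step can be reproduced with Python's standard quopri module. The encoded line below is an assumption: the quoted source above is cut off at the soft line break, so the =A0 continuation is reconstructed from the decoded result, and the style attribute is shortened.

```python
import quopri

# Assumed quoted-printable source: "=3D" encodes "=", the trailing "=" is a
# soft line break, and "=A0" is the raw no-break space byte.
encoded = b'<p style=3D"font: 12.0px Arial">Approved: =\n=A0Hon94Bar</p>'

decoded = quopri.decodestring(encoded)
```

After decoding, the ordinary space is followed by a single 0xA0 byte, not by the six characters of an HTML entity.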
The issue is that the pattern constructed in this case is
'Approved:(\s|&nbsp;)*Hon94Bar'
and the re.sub(pattern, '', lines) (where lines is the message body)
does not consider \xA0 to match \s.
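The failure is easy to reproduce. The sketch below assumes the earlier pattern was the same as the corrected one minus the \xA0 alternative, and it uses Python 3 bytes patterns to mimic Python 2 byte strings, where \s covers ASCII whitespace only. (With Python 3 str patterns, \s does match U+00A0, so the str form would hide the problem.)

```python
import re

# Assumed reconstruction of the earlier pattern, without the \xA0 alternative
old_pattern = rb"Approved:(\s|&nbsp;)*" + re.escape(b"Hon94Bar")

# Decoded HTML part: a space followed by a raw no-break space byte
body = b'<p style="font: 12.0px Arial">Approved: \xa0Hon94Bar</p>'

matched_raw_nbsp = re.search(old_pattern, body) is not None
matched_entity = re.search(
    old_pattern, b"<p>Approved: &nbsp;Hon94Bar</p>") is not None
matched_single_space = re.search(
    old_pattern, b"<p>Approved: Hon94Bar</p>") is not None
```

Only the raw 0xA0 case fails to match. The entity form and the single-space form are both found.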
This is clearly a deficiency in the code, but there are two underlying
causes:
1) the user double spaced between the Approved: and the password, and
2) the user's MUA encoded the two spaces as a space followed by a
no-break space for the HTML part, but it represented the no-break space
as a raw character code instead of the HTML entity &nbsp;.
Had either of the above conditions not been true, the Approved: password
would have been removed.
I will modify the code to add \xA0 to make the pattern
'Approved:(\xA0|\s|&nbsp;)*Hon94Bar' in this case, which will work for
this one and future ones like it, but I won't follow your suggestion to
check the HTML first. I think this is unworkable without implementing an
HTML rendering engine, and would likely be no different, at least in
some cases, from just not checking for the pseudo-header in the message
body at all.
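A minimal sketch of the modified pattern, again using Python 3 bytes to stand in for Python 2 byte strings. The function name strip_approved is hypothetical. Only the pattern shape comes from the description above.

```python
import re


def strip_approved(body, passwd, name=b"Approved"):
    """Remove 'name: passwd' from a decoded text part (bytes).

    The separator may be any mix of ASCII whitespace, raw 0xA0 bytes,
    and literal "&nbsp;" entities.
    """
    pattern = re.escape(name) + rb":(\xA0|\s|&nbsp;)*" + re.escape(passwd)
    return re.sub(pattern, b"", body)


# The case from this thread: a space followed by a raw no-break space
cleaned_raw = strip_approved(b"<p>Approved: \xa0Hon94Bar</p>", b"Hon94Bar")
# The entity form, which already worked before the change
cleaned_entity = strip_approved(b"<p>Approved: &nbsp;Hon94Bar</p>", b"Hon94Bar")
```

This still only covers separators the pattern knows about. Markup inserted between Approved: and the password, for example, would not match.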
Note that we have never guaranteed removal of the pseudo-header from
alternative parts, and if asked, I always recommend a true message
header for this purpose.
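For a poster who can script their mail, a true header might look like the following sketch using Python 3's email package. The function name, addresses, and password are placeholders, and this only builds the message, it does not send it.

```python
from email.message import EmailMessage


def build_preapproved_post(sender, list_address, subject, body, list_password):
    """Build a post that carries the password in a real Approved: header.

    Hypothetical helper. The password lives only in the header block, so
    it never appears in any text/plain or text/html body part.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = list_address
    msg["Subject"] = subject
    msg["Approved"] = list_password
    msg.set_content(body)
    return msg


post = build_preapproved_post(
    "moderator@example.com",   # placeholder sender
    "mylist@example.com",      # placeholder list address
    "Monthly announcement",
    "Hello list\n",
    "example-password",        # placeholder list password
)
```

Since nothing in the body has to be found and deleted, how the MUA renders an HTML alternative no longer matters.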
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan