add to FAQ? - mailman full raw archive mboxes not anti-spammed
hi mailman people, terri,
i looked for information regarding archtocnombox.html relating to the problem of a full raw mbox archive being non-antispammed, but couldn't find any discussion. It's clear somebody has thought about it, since the file exists.
Are there any plans (or any options already present) to have the full raw mbox archive, but corrected for "@" -> " at " (maybe other obvious corrections too), to be made available like the raw spammable version is available if the archtocnombox.html template is not chosen?
It seems to me that this should be made an obvious option - or else that the default should be that the full raw archive .mbox is *not* available.
At least, could info on this be added to the FAQ somewhere, so that people can easily find out what has already been done? E.g.
11.2 What does Mailman do to help protect me from unsolicited bulk email (spam)?
http://www.list.org/mailman-member/node40.html
i've added a comment on barry warsaw's wiki pages: http://zope.org/Members/bwarsaw/MailmanDesignNotes/MailmanProblems
and below is my own analysis.
thanks boud (convert from sympa to mailman ;)
PS: Sorry if this should be to mailman-users rather than mailman-developers, but it seems to me like a developers' issue.
SPAM PROBLEM:
HYPOTHESIS
SOLUTION 2.1.4
HACK SOLUTION 2.0.11 (AND PROBABLY 2.1.1)
(1) remove link to *.mbox in /var/lib/mailman/Mailman/Archiver/HyperArch.py
(2) remove compiled versions HyperArch.py[oc]
(3) test on an archive (e.g. imc-pl-tech ;)
(4) make the *.mbox/*.mbox files inaccessible (since google links remain)
(5) think about new lists (maybe nothing needs doing)
SPAM PROBLEM:
i think this is a known spam problem - by default, the archive files linked under "download the full raw archive", such as the following, are NOT antispammed:
http://lists.mydomain.org/mailman/public/mylist.mbox/mylist.mbox
It seems to be present in 2.0.11 and 2.1.1
HYPOTHESIS
IMHO the reason why this is probably not easy to solve is that this is where mail is automatically saved when it's received. If this is filtered by "@" -> " at " then it means that overall there are typically 4 copies of the entire mailbox (e.g. html version, monthly archives, true mailbox with @ hidden from external access, and " at " version for web access).
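To make the "@" -> " at " idea concrete, here is a minimal sketch of producing a separate web-facing copy from the true mailbox. It is not how Mailman does anything: the function names, the file paths passed in, and the deliberately loose address pattern are all my own assumptions for illustration.

```python
import re

# Loose "local@domain" pattern (an assumption for this sketch, not a full
# RFC 5322 address parser).
ADDRESS_RE = re.compile(
    r"([A-Za-z0-9._%+-]+)@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)"
)


def obfuscate_addresses(text):
    """Return text with every user@host rewritten as 'user at host'."""
    return ADDRESS_RE.sub(r"\1 at \2", text)


def write_obfuscated_copy(raw_mbox_path, public_path):
    """Stream the raw mbox into an obfuscated copy, one line at a time.

    latin-1 maps every byte to one character and back, so messages in
    other encodings pass through unchanged apart from the rewritten
    addresses.
    """
    with open(raw_mbox_path, "r", encoding="latin-1", newline="") as src, \
            open(public_path, "w", encoding="latin-1", newline="") as dst:
        for line in src:
            dst.write(obfuscate_addresses(line))
```

Note that this rewrites anything shaped like an address, including Message-ID headers, so the copy is meant for reading and not for re-importing into a mail client. It also illustrates the cost mentioned above: the obfuscated file is one more full copy of the mailbox.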
i couldn't find if this has been discussed, but it looks like there's a simple solution in 2.1.4.
SOLUTION 2.1.4
It looks like the solution in mailman-2.1.4 is to offer different templates:
mailman-2.1.4/templates/en/archtoc.html
mailman-2.1.4/templates/en/archtocentry.html
mailman-2.1.4/templates/en/archtocnombox.html -> this one has no mbox
e.g. http://mail.python.org/pipermail/mailman-announce/ has no *.mbox/*.mbox In fact, http://mail.python.org/pipermail/mailman-announce.mbox/ exists but nothing in it is accessible.
HACK SOLUTION 2.0.11 (AND PROBABLY 2.1.1)
In 2.0.11, the line pointing to the .mbox is in
/var/lib/mailman/Mailman/Archiver/HyperArch.py (for Debian anyway ;)
You can get <a href="%(listinfo)s">more information about this list</a>
or you can <a href="%(fullarch)s">download the full raw archive</a>
(%(size)s).
</p>
solution:
(1) remove the link to the full .mbox in /var/lib/mailman/Mailman/Archiver/HyperArch.py
To do this,
replace
You can get <a href="%(listinfo)s">more information about this list</a>
or you can <a href="%(fullarch)s">download the full raw archive</a>
(%(size)s).
</p>
by
You can get <a href="%(listinfo)s">more information about this list.</a>
</p>
(2) remove compiled versions (in my case the .pyc gets automatically recompiled)
rm /var/lib/mailman/Mailman/Archiver/HyperArch.py[co]
(3) test this
cd /var/lib/mailman/archives/public/
/usr/lib/mailman/bin/arch mylist
Then check out: http://lists.mydomain.org/pipermail/mylist/
Hopefully there will be no link to the .mbox and even direct access will be impossible.
(4) make the .mbox files inaccessible - since google links will still hang around for some time
chmod go-rw /var/lib/mailman/archives/private/*.mbox/*.mbox
(5) probably nothing needed when running newlist for new lists.
Making new lists will, by default, write *.mbox/*.mbox which are web accessible, but nobody is going to link to them (unless paranoid, deliberately wants to subject indymedia users to spam, ...)
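For anyone who prefers to script step (4), here is a rough Python equivalent of the chmod go-rw command, under the assumption that the glob pattern you pass in matches your own archive layout. The function name is made up for this sketch.

```python
import glob
import os
import stat


def remove_group_other_rw(pattern):
    """Clear group/other read and write bits, like `chmod go-rw`.

    Returns the list of paths that were changed.
    """
    mask = stat.S_IRGRP | stat.S_IWGRP | stat.S_IROTH | stat.S_IWOTH
    changed = []
    for path in glob.glob(pattern):
        mode = stat.S_IMODE(os.stat(path).st_mode)
        os.chmod(path, mode & ~mask)  # keep owner bits, drop go rw
        changed.append(path)
    return changed


# Example call, using the Debian path from step (4):
# remove_group_other_rw("/var/lib/mailman/archives/private/*.mbox/*.mbox")
```

Check which user owns the files before running it: if the archiver only reaches the mbox through group permissions on your system, dropping the group bits could stop new mail from being archived.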
anti-spam solidarity boud
On Thu, 19 Feb 2004, boud wrote:
e.g. http://mail.python.org/pipermail/mailman-announce/ has no *.mbox/*.mbox In fact, http://mail.python.org/pipermail/mailman-announce.mbox/ exists but nothing in it is accessible.
In fact the mbox file *does* exist and is accessible if you look in the obvious place: (add in "mailman-announce.mbox" after ".mbox/")
i won't write the URL directly because i don't want to point google to it.
So 2.1.4 is not as safe from spiders/harvesters as it could be.
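A list admin who wants to check their own installation for this could use something like the sketch below. It only assumes the LISTNAME.mbox/LISTNAME.mbox naming pattern described in this thread; the function names and the example host are hypothetical, and the base URL depends on how your web server is set up.

```python
import urllib.request


def raw_mbox_url(base_url, listname):
    """Build the URL where the raw mbox would sit for a given list."""
    return "%s/%s.mbox/%s.mbox" % (base_url.rstrip("/"), listname, listname)


def is_publicly_readable(url, timeout=10):
    """Return True if a HEAD request for url answers with HTTP 200.

    HTTP errors (403, 404, ...), connection failures and timeouts are
    all subclasses of OSError and are treated as "not readable".
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        return False


# Example with a placeholder host:
# url = raw_mbox_url("http://lists.example.org/pipermail", "mylist")
# print(url, is_publicly_readable(url))
```

Run it only against your own lists. A True result means harvesters can fetch the file too, even if no page links to it.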
boud
I don't think the explanation of how to do this belongs in the user manual, which is the only documentation I've written currently (and I assume why you specifically CC'ed me on this message), but the FAQ is a good place for now since the other manuals for 2.1 haven't been completed.
If you haven't seen it already, the user-addable FAQ is available at http://www.python.org/cgi-bin/faqw-mm.py
Please do put the info there if it isn't already, since I'm sure this'll be handy for other people.
thanks :) - done: http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq04.034.htp
(but now i've got my email spammably posted to the RCS :( )
On Wed, 18 Feb 2004, Terri Oda wrote:
I don't think the explanation of how to do this belongs in the user manual, which is the only documentation I've written currently (and I
This page of the user manual has the title:
http://www.list.org/mailman-member/node40.html
11.2 What does Mailman do to help protect me from unsolicited bulk email (spam)?
My guess in version 2.1.4 is that the file archtocnombox.html has to be chosen sometime during installation if the user wishes to use it. Doesn't this mean it should be in the user manual under this question? Or is it too technical a question?
Maybe a developer should add some comment to the INSTALL file or a README.* ?
grep archtocnombox mailman-2.1.4/*
gives no hint at all.
boud
On Feb 18, 2004, at 7:57 PM, boud wrote:
http://www.list.org/mailman-member/node40.html
11.2 What does Mailman do to help protect me from unsolicited bulk email (spam)?
My guess in version 2.1.4 is that the file archtocnombox.html has to be chosen sometime during installation if the user wishes to use it. Doesn't this mean it should be in the user manual under this question? Or is it too technical a question?
Basically, I'm trying to keep the list member manual to things that subscribers can change themselves. I mention things of interest, but a full explanation doesn't seem appropriate here -- it would belong in the list and/or site admin manuals.
Terri
participants (2): boud, Terri Oda