[Mailman-Users] question about harvesting of non-archived lists by archiveorange.com

Mark Sapiro mark at msapiro.net
Tue Nov 22 21:38:19 CET 2011


Christopher Adams wrote:
>
>A list owner of one of our 500+ mailing lists notified me that
>postings from their non-archived list were being harvested by
>archiveorange.com and could be found via Google. I found that many of
>our lists are archived on this site, and this is not something that we
>really want.  I am just querying the list to see if this has happened
>with others using Mailman. The website explains that archiveorange
>"subscribes" to lists and then harvests messages. It seems that the
>only way this could happen to a non-archived list is for the mailbox
>directories to be readable to the outside. Is there something else that
>is going on here?


Yes. One of the "members" of the affected list is the archiveorange.com
archiver address, so archiveorange.com receives list posts as they are
sent. From their web site, the address may be newman at archiveorange.com.

If in fact archiveorange.com does this without fully informed
affirmative consent of the list owner, that is highly unethical and
deplorable, but they wouldn't be the only ones. answerpot.com is
another.

For answerpot, I have

^.*[@.]apot(mail)?\.com$

in the ban_list of all my lists.
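
To illustrate which addresses that pattern catches, here is a minimal
Python sketch. It only approximates how I understand Mailman 2.1 to
evaluate ban_list entries (entries starting with ^ are treated as
case-insensitive regular expressions, anything else as a
case-insensitive literal address); it is not Mailman's own code. The
function name matching_ban_entry and the sample addresses are
hypothetical, and newman@archiveorange.com is only the address their
web site suggests may be in use.

```python
import re


def matching_ban_entry(address, ban_list):
    """Return the first ban_list entry that matches address, else None.

    Assumption: this mirrors Mailman 2.1 ban_list matching only roughly.
    """
    for entry in ban_list:
        if entry.startswith('^'):
            # Entries beginning with '^' are treated as regular expressions.
            try:
                if re.search(entry, address, re.IGNORECASE):
                    return entry
            except re.error:
                # Skip a malformed pattern rather than fail.
                continue
        elif entry.lower() == address.lower():
            # Plain entries are compared as literal addresses, ignoring case.
            return entry
    return None


# The answerpot pattern from this post, plus a literal archiveorange entry.
ban_list = [r'^.*[@.]apot(mail)?\.com$', 'newman@archiveorange.com']

# Hypothetical sample addresses, for illustration only.
for addr in ('someone@apot.com', 'bot@lists.apotmail.com',
             'Newman@ArchiveOrange.com', 'user@example.com'):
    print(addr, '->', matching_ban_entry(addr, ban_list))
```

The [@.] in the pattern requires "apot" to start either the domain or a
subdomain label, so an unrelated domain that merely ends in "apot.com"
is not banned by it.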


The archiveorange.com web site implies that the list owner must add
newman at archiveorange.com to the list to enable archiving, and that
archiveorange.com does not subscribe the address itself. If the list
doesn't require approval for subscription, it may be possible for
anyone to request subscription of the newman at archiveorange.com
address. I don't know if this address
will respond to a confirmation email, but if it does, I would add it
to the ban_list of all lists (see
<http://www.msapiro.net/scripts/add_banned.py>).

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


