[Python-Dev] Googlebot and the mail.python.org python-dev archive
A.M. Kuchling
amk at amk.ca
Sat Feb 28 18:36:20 CET 2009
On Sat, Feb 28, 2009 at 09:53:10PM +1000, Nick Coghlan wrote:
> Is pydotorg-www still the place for website questions? If so, I should
> probably take this over there...
Just 'pydotorg' is the current list
(http://mail.python.org/mailman/listinfo/pydotorg).
Looking at the access logs, mail.python.org is
being actively crawled:
66.249.71.119 - - [28/Feb/2009:18:32:51 +0100] "GET /pipermail/python-list/2004-June/265194.html HTTP/1.1" 304 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
72.30.79.38 - - [28/Feb/2009:18:32:51 +0100] "GET /pipermail/csv/2003-February/000368.html HTTP/1.0" 200 3929 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; http://help.yahoo.com/help/us/ysearch/slurp)"
65.55.211.30 - - [28/Feb/2009:18:32:51 +0100] "GET /pipermail/python-list/2006-May/382528.html HTTP/1.1" 200 4028 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"
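To see which crawler is hitting which list archive, log lines like the three above can be tallied with a short script. This is only a rough sketch: it assumes the Apache "combined" log format shown above, and the CRAWLERS table and the crawler_hits() helper are names made up for illustration, not anything that exists on the server.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format, matching the lines quoted above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical table: substring of the User-Agent header -> short label.
CRAWLERS = {"Googlebot": "google", "Yahoo! Slurp": "yahoo", "msnbot": "msn"}


def crawler_hits(lines):
    """Count requests per (crawler label, list name) under /pipermail/."""
    hits = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # not in the expected format; skip it
        # Paths look like /pipermail/<list>/<year-month>/<id>.html
        parts = m.group("path").split("/")
        if len(parts) < 3 or parts[1] != "pipermail":
            continue
        for needle, label in CRAWLERS.items():
            if needle in m.group("agent"):
                hits[(label, parts[2])] += 1
                break
    return hits


# The three log lines quoted in this message.
SAMPLE = [
    '66.249.71.119 - - [28/Feb/2009:18:32:51 +0100] "GET /pipermail/python-list/2004-June/265194.html HTTP/1.1" 304 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '72.30.79.38 - - [28/Feb/2009:18:32:51 +0100] "GET /pipermail/csv/2003-February/000368.html HTTP/1.0" 200 3929 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; http://help.yahoo.com/help/us/ysearch/slurp)"',
    '65.55.211.30 - - [28/Feb/2009:18:32:51 +0100] "GET /pipermail/python-list/2006-May/382528.html HTTP/1.1" 200 4028 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"',
]

if __name__ == "__main__":
    for (crawler, list_name), count in sorted(crawler_hits(SAMPLE).items()):
        print(crawler, list_name, count)
```

Run over a full day's log instead of SAMPLE, the same tally would show how much of the crawl budget goes to python-list compared with python-dev.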
Could it be that the site is just too large, so the search engines
index only 10,000 messages from it? One possible solution might be to
block crawling of the python-list archive; it's enormous, and already
available through Google's Usenet search.
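For what it's worth, blocking just that one archive would be a two-line robots.txt rule. The sketch below is an assumption about what such a rule might look like, not the file actually served by mail.python.org; it uses the standard library's urllib.robotparser to check that the rule excludes python-list while leaving the other archives crawlable. ROBOTS_TXT, BASE and build_parser() are illustrative names only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block only the python-list archive.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pipermail/python-list/
"""

# Host taken from the list URL earlier in this message.
BASE = "http://mail.python.org"


def build_parser(text):
    """Parse robots.txt text directly, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


if __name__ == "__main__":
    rp = build_parser(ROBOTS_TXT)
    for path in (
        "/pipermail/python-list/2004-June/265194.html",
        "/pipermail/csv/2003-February/000368.html",
    ):
        print(path, rp.can_fetch("Googlebot", BASE + path))
```

Whether the crawlers would then spend the freed-up budget on the remaining lists is a separate question; the rule only guarantees that well-behaved bots stop fetching python-list.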
--amk