[Moin-user] Some crawler makes MonthCalendar produce endless pages, even though not logged in

Mark Martinec Mark.Martinec+moin at ijs.si
Fri Apr 5 20:45:33 EDT 2013


It happened once, and now I'm seeing it happen again: some web crawler
stumbled across our MoinMoin wiki (1.9.4), and the auto-generated
pages keep sprouting, eventually filling up a directory (32k files):

HelpOnMacros(2f)MonthCalendar(2f)2000(2d)09(2d)03
HelpOnMacros(2f)MonthCalendar(2f)2008(2d)08(2d)15
HelpOnMacros(2f)MonthCalendar(2f)2003(2d)01(2d)15
HelpOnMacros(2f)MonthCalendar(2f)2008(2d)09(2d)19
HelpOnMacros(2f)MonthCalendar(2f)2008(2d)08(2d)22
HelpOnMacros(2f)MonthCalendar(2f)2008(2d)10(2d)14
HelpOnMacros(2f)MonthCalendar(2f)2008(2d)10(2d)09
HelpOnMacros(2f)MonthCalendar(2f)2011(2d)02(2d)22
HelpOnMacros(2f)MonthCalendar(2f)2008(2d)11(2d)13
HelpOnMacros(2f)MonthCalendar(2f)2008(2d)11(2d)05
HelpOnMacros(2f)MonthCalendar(2f)1996(2d)08(2d)18
HelpOnMacros(2f)MonthCalendar(2f)2018(2d)12(2d)16
HelpOnMacros(2f)MonthCalendar(2f)2016(2d)10(2d)13
HelpOnMacros(2f)MonthCalendar(2f)1995(2d)12(2d)08
HelpOnMacros(2f)MonthCalendar(2f)1996(2d)08(2d)15
HelpOnMacros(2f)MonthCalendar(2f)1996(2d)05(2d)28
HelpOnMacros(2f)MonthCalendar(2f)2011(2d)05(2d)10
HelpOnMacros(2f)MonthCalendar(2f)1994(2d)10(2d)05
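
For readability, the on-disk names above can be mapped back to wiki
page names. This is only a minimal sketch, assuming that each
parenthesised group is a run of hex-encoded UTF-8 bytes (so (2f) is
"/" and (2d) is "-"); the helper name decode_fs_name is made up for
illustration and is not part of MoinMoin:

```python
import re

# Hypothetical helper, not a MoinMoin API: turns an on-disk page
# directory name back into a wiki page name, assuming every "(..)"
# group holds hex-encoded UTF-8 bytes.
_HEX_GROUP = re.compile(r"\(([0-9a-fA-F]+)\)")

def decode_fs_name(name: str) -> str:
    def _unhex(match: re.Match) -> str:
        return bytes.fromhex(match.group(1)).decode("utf-8")
    return _HEX_GROUP.sub(_unhex, name)

print(decode_fs_name("HelpOnMacros(2f)MonthCalendar(2f)2000(2d)09(2d)03"))
# HelpOnMacros/MonthCalendar/2000-09-03
```

So every one of these directories is a dated subpage of
HelpOnMacros/MonthCalendar.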

According to the Apache log, the crawler keeps issuing
HTTP GET requests like this one:

GET .../HelpOnMacros/MonthCalendar?calparms=HelpOnMacros/
  MonthCalendar,2001,12,2,-51,,,MonthCalendarTemplate HTTP/1.1
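
To see what the crawler is actually sending, the query string of
such a request can be split into its comma-separated fields. A rough
sketch using only the Python standard library; I am not asserting
what each position means to the macro, although the first field
looks like the calendar's page name and the next two like a year and
a month:

```python
from urllib.parse import parse_qs

# Query string copied from the log entry above (rejoined onto one line).
query = ("calparms=HelpOnMacros/MonthCalendar,"
         "2001,12,2,-51,,,MonthCalendarTemplate")

# parse_qs returns a dict of lists; calparms appears once here.
fields = parse_qs(query)["calparms"][0].split(",")

for position, value in enumerate(fields):
    print(position, repr(value))
```

Each distinct combination of values is a distinct URL, so a crawler
that follows the calendar's navigation links never runs out of pages
to fetch.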

Other page creation is restricted by only allowing
a POST request from internal networks, and page creation
is restricted to logged-in (Known) users.

So I wonder where the problem lies: is it our misconfiguration,
or is it considered 'normal' that any non-authenticated
request can cause some macro to create pages by a simple GET?
What would be a good way of stopping this? I don't want
to just block the crawler, because next time some other
crawler may come back again; the disease should be blocked
at its cause.

Since the MonthCalendar macro is not actually needed,
perhaps it should just be disabled entirely. What is
the proper way of doing that?

  Mark



