robots exclusion file on the buildbot pages?

Hello,
The buildbots are sometimes subject to a flood of "svn exception" errors. It has been conjectured that these errors are caused by Web crawlers pressing "force build" buttons without filling any of the fields (of course, the fact that we get such ugly errors in the buildbot results, rather than a clean error message when pressing the button, is a buildbot bug in itself). Couldn't we simply exclude all crawlers from the buildbot Web pages?
Regards
Antoine.

On 05:48 pm, solipsis@pitrou.net wrote:
Hello,
The buildbots are sometimes subject to a flood of "svn exception" errors. It has been conjectured that these errors are caused by Web crawlers pressing "force build" buttons without filling any of the fields (of course, the fact that we get such ugly errors in the buildbot results, rather than a clean error message when pressing the button, is a buildbot bug in itself). Couldn't we simply exclude all crawlers from the buildbot Web pages?
Most (all?) legitimate crawlers won't submit forms. Do you think there's a non-form link to the force build URL (which _will_ accept a GET request to mean the same thing as a POST)? One thing I have noticed is that spammers find these forms and submit them with garbage. We can probably suppose that such people are going to ignore a robots.txt file. Jean-Paul

On Sat, 15 May 2010 17:57:28 -0000 exarkun@twistedmatrix.com wrote:
One thing I have noticed is that spammers find these forms and submit them with garbage. We can probably suppose that such people are going to ignore a robots.txt file.
So we could "just" fix the buggy buildbot code. Not that I want to do it myself :S

On Sat, May 15, 2010 at 12:07 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Sat, 15 May 2010 17:57:28 -0000 exarkun@twistedmatrix.com wrote:
One thing I have noticed is that spammers find these forms and submit them with garbage. We can probably suppose that such people are going to ignore a robots.txt file.
So we could "just" fix the buggy buildbot code. Not that I want to do it myself :S
I can help modify buildbot if you want, but I suppose I need a specification of what precisely the bug is here. Not accepting forms with garbage? By default, buildbot's "force build" does not require the form fields to be filled in, and that's on purpose.

Hi,
I can help modify buildbot if you want, but I suppose I need a specification of what precisely the bug is here. Not accepting forms with garbage? By default, buildbot's "force build" does not require the form fields to be filled in, and that's on purpose.
Well, the "fix" would be to forbid an empty "Branch to build" since it doesn't point to anything buildable (even worse, the fact that the branch then ends up as None rather than an empty string produces an exception in buildbot code). I'm not sure what the process to get the fix in would be, but it probably involves discussing it with Martin :) Regards Antoine.
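A minimal sketch of the check suggested above, assuming the form value may arrive either as None or as an empty string. The names `normalize_branch` and `ForceBuildError` are hypothetical and are not part of buildbot; the point is only that an unfilled "Branch to build" gets rejected with a clean message before None can reach any later code.

```python
from typing import Optional


class ForceBuildError(ValueError):
    """Raised when a force-build request does not name anything buildable."""


def normalize_branch(raw: Optional[str]) -> str:
    """Return a usable branch name, or raise ForceBuildError.

    An unfilled field may show up as None or as an empty/blank string
    (an assumption about how the form is decoded); both cases are treated
    the same way, so None never propagates further.
    """
    branch = (raw or "").strip()
    if not branch:
        raise ForceBuildError('"Branch to build" must not be empty')
    return branch
```

With a check of this kind, a request without a branch would produce an error message at submission time rather than an exception in the build results.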

Maciej Fijalkowski <fijall@gmail.com> wrote:
On Sat, May 15, 2010 at 12:07 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Sat, 15 May 2010 17:57:28 -0000 exarkun@twistedmatrix.com wrote:
One thing I have noticed is that spammers find these forms and submit them with garbage. We can probably suppose that such people are going to ignore a robots.txt file.
So we could "just" fix the buggy buildbot code. Not that I want to do it myself :S
I can help modify buildbot if you want, but I suppose I need a specification of what precisely the bug is here. Not accepting forms with garbage? By default, buildbot's "force build" does not require the form fields to be filled in, and that's on purpose.
I'd find it useful if the "branch" field was a choice pull-down listing valid branches, rather than a plain text field, and if the "revision" field always defaulted to "HEAD". Seems to me that since the form is coming from the buildmaster, that should be possible. Bill
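To make the pull-down idea concrete, here is a rough sketch of what generating such fields could look like. It does not use buildbot's own templating; `render_force_build_fields` is a hypothetical helper, and the field names "branch" and "revision" as well as the branch list passed in are assumptions for illustration.

```python
from html import escape
from typing import Iterable


def render_force_build_fields(branches: Iterable[str],
                              default_revision: str = "HEAD") -> str:
    """Render a branch pull-down and a revision field pre-filled with HEAD."""
    # One <option> per known branch, so that only valid names can be chosen.
    options = "\n".join(
        f'  <option value="{escape(name, quote=True)}">{escape(name)}</option>'
        for name in branches
    )
    revision = escape(default_revision, quote=True)
    return (
        '<select name="branch">\n'
        f"{options}\n"
        "</select>\n"
        f'<input type="text" name="revision" value="{revision}">'
    )
```

The buildmaster would have to supply the list of valid branches; where that list comes from is left open here.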

On Sat, 15 May 2010 13:03:59 PDT Bill Janssen <janssen@parc.com> wrote:
I'd find it useful if the "branch" field was a choice pull-down listing valid branches, rather than a plain text field, and if the "revision" field always defaulted to "HEAD". Seems to me that since the form is coming from the buildmaster, that should be possible.
Actually, I think that it does already default to HEAD if you leave it empty. Regards Antoine.

I'd find it useful if the "branch" field was a choice pull-down listing valid branches, rather than a plain text field, and if the "revision" field always defaulted to "HEAD". Seems to me that since the form is coming from the buildmaster, that should be possible.
Unfortunately, these forms are deeply hidden in the buildbot code. So I'd rather avoid editing them, or else upgrading to the next buildbot version becomes even more tedious. Regards, Martin

On 08:32 pm, martin@v.loewis.de wrote:
I'd find it useful if the "branch" field was a choice pull-down listing valid branches, rather than a plain text field, and if the "revision" field always defaulted to "HEAD". Seems to me that since the form is coming from the buildmaster, that should be possible.
Unfortunately, these forms are deeply hidden in the buildbot code. So I'd rather avoid editing them, or else upgrading to the next buildbot version becomes even more tedious.
Someone sufficiently interested in this feature could work with buildbot upstream to get the feature added to an upcoming release, though (obviously). Jean-Paul

The buildbots are sometimes subject to a flood of "svn exception" errors. It has been conjectured that these errors are caused by Web crawlers pressing "force build" buttons without filling any of the fields (of course, the fact that we get such ugly errors in the buildbot results, rather than a clean error message when pressing the button, is a buildbot bug in itself). Couldn't we simply exclude all crawlers from the buildbot Web pages?
Hmm. Before doing any modifications, I'd rather have a definite analysis on this. Are you absolutely certain that, when that happened, the individual builds that caused this svn exception were actually triggered over the web, rather than by checkin? When it happens next, please report exact date and time, and the build log URL. Due to log rotation, it would then be necessary to investigate that in a timely manner.
Without any reference to the specific case, I'd guess that a flood of svn exceptions is caused by an svn outage, which in turn might be caused when a build is triggered while the daily Apache restart happens (i.e. around 6:30 UTC+2).
That said: /dev/buildbot has been disallowed for all robots for quite some time now: http://www.python.org/robots.txt There is really no point in robots crawling the build logs, as they don't contain much useful information for a search engine.
Regards, Martin
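For anyone who wants to check what such a rule means for a well-behaved crawler, the standard library's `urllib.robotparser` can evaluate it. The rules below are an assumed reconstruction of the statement above (all robots, /dev/buildbot disallowed), not a copy of the real robots.txt; `may_crawl` and the user-agent name "ExampleBot" are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules, modelled on the description above; not the real file.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /dev/buildbot
"""

parser = RobotFileParser()
parser.parse(ASSUMED_ROBOTS_TXT.splitlines())


def may_crawl(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if a crawler honouring the rules above may fetch url."""
    return parser.can_fetch(user_agent, url)
```

This only models crawlers that honour robots.txt; a client that ignores the file is not affected by it.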

On Sat, 15 May 2010 21:49:07 +0200 "Martin v. Löwis" <martin@v.loewis.de> wrote:
Hmm. Before doing any modifications, I'd rather have a definite analysis on this. Are you absolutely certain that, when that happened, the individual builds that caused this svn exception were actually triggered over the web, rather than by checkin?
How can I be "absolutely certain"? As I said, it's a conjecture, and the suggested fix is just that: a suggestion.
When it happens next, please report exact date and time, and the build log URL. Due to log rotation, it would then be necessary to investigate that in a timely manner.
Please take a look at http://www.python.org/dev/buildbot/trunk/. There are still a bunch of violet buildbots there. For example: http://www.python.org/dev/buildbot/builders/sparc%20Ubuntu%20trunk/builds/17... http://www.python.org/dev/buildbot/builders/sparc%20Ubuntu%20trunk/builds/17... Regards Antoine.

For us at least, specifying no branch builds the default branch (trunk) and does not end up with an exception in buildbot code. How about specifying the default branch in the config file?
On Sat, May 15, 2010 at 1:55 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Sat, 15 May 2010 21:49:07 +0200 "Martin v. Löwis" <martin@v.loewis.de> wrote:
Hmm. Before doing any modifications, I'd rather have a definite analysis on this. Are you absolutely certain that, when that happened, the individual builds that caused this svn exception were actually triggered over the web, rather than by checkin?
How can I be "absolutely certain"? As I said, it's a conjecture, and the suggested fix is just that: a suggestion.
When it happens next, please report exact date and time, and the build log URL. Due to log rotation, it would then be necessary to investigate that in a timely manner.
Please take a look at http://www.python.org/dev/buildbot/trunk/. There are still a bunch of violet buildbots there. For example: http://www.python.org/dev/buildbot/builders/sparc%20Ubuntu%20trunk/builds/17... http://www.python.org/dev/buildbot/builders/sparc%20Ubuntu%20trunk/builds/17...
Regards
Antoine.

Hmm. Before doing any modifications, I'd rather have a definite analysis on this. Are you absolutely certain that, when that happened, the individual builds that caused this svn exception were actually triggered over the web, rather than by checkin?
How can I be "absolutely certain"?
In the example you gave, the build log says "The web-page 'force build' button was pressed by '<unknown>': <no reason specified> " So ISTM that it's indeed certain that the build was triggered over the web, rather than by a checkin.
http://www.python.org/dev/buildbot/builders/sparc%20Ubuntu%20trunk/builds/17...
AFAICT from the twistd logs, the user agent triggering this build was "Mozilla/4.7 [ja] (Win98; I)". It still may have been a bot; it was using a GET request, even though the form asks for a POST. The IP address points to some Japanese dialup network (reverse lookup reports address.dy.bbexcite.jp.) The bot probably has malicious intent: it has been using about 10 different user-agent strings, on various parts of the site. I have now blackholed this IP address (although it stopped contacting python.org around 8 hours ago, anyway). If desired, we could password-protect the "force build" forms. If that is to be done, some help from a buildbot expert on what to change specifically would be appreciated. Regards, Martin
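The two signals mentioned above (a GET where the form asks for a POST, and one address cycling through many user-agent strings) can be sketched as a small check over already-parsed request records. This is only an illustration: the `Request` record, both function names, the threshold, and the assumption that the force-build URL ends in "/force" are all hypothetical, and nothing here parses the actual twistd log format.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set


class Request(NamedTuple):
    """One already-parsed log record (hypothetical structure)."""
    address: str
    method: str
    path: str
    user_agent: str


def clients_with_many_agents(requests: Iterable[Request],
                             max_agents: int = 3) -> Dict[str, Set[str]]:
    """Map each address that used more than max_agents user-agent strings
    to the set of strings it used."""
    agents: Dict[str, Set[str]] = defaultdict(set)
    for req in requests:
        agents[req.address].add(req.user_agent)
    return {addr: seen for addr, seen in agents.items()
            if len(seen) > max_agents}


def is_get_force_build(req: Request) -> bool:
    """True for a GET aimed at a force-build URL (assumed to end in /force)."""
    return (req.method.upper() == "GET"
            and req.path.rstrip("/").endswith("/force"))
```

Neither signal proves malicious intent on its own; they only narrow down which log entries are worth a closer look.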

Antoine Pitrou <solipsis@pitrou.net> wrote:
The buildbots are sometimes subject to a flood of "svn exception" errors. It has been conjectured that these errors are caused by Web crawlers pressing "force build" buttons without filling any of the fields (of course, the fact that we get such ugly errors in the buildbot results, rather than a clean error message when pressing the button, is a buildbot bug in itself). Couldn't we simply exclude all crawlers from the buildbot Web pages?
I caused a few of those myself yesterday updating my PPC buildbots. Apologies! Bill

I caused a few of those myself yesterday updating my PPC buildbots.
Apologies!
No need to apologize! These are not the ones Antoine is talking about. By convention, filling out the "Your name" field in a web build is recommended, so people know that this was an intentional build. I usually also fill out a reason. Regards, Martin
participants (5)
- "Martin v. Löwis"
- Antoine Pitrou
- Bill Janssen
- exarkun@twistedmatrix.com
- Maciej Fijalkowski