[Python-Dev] Automated Python testing (was Re: status of development documentation)

Mon Dec 26 01:54:44 CET 2005

On 12/25/05, Tim Peters <tim.peters at gmail.com> wrote:
> [Tim]
> >> Take a look at:
> >>
> >>     http://buildbot.zope.org/
> >>
> >> That runs code from:
> >>
> >>     http://buildbot.sourceforge.net/
> >>
> >> Someone sets up a "buildbot master" (that's what the Zope URL points
> >> at), and then any number of people can volunteer to set up their boxes
> >> as "buildbot slaves".
>
> [Brett]
> > As in some machine I might personally have left on?
>
> Slaves have to be powered on to work, yes ;-)
>
> > That would require a static IP, and I don't know how common that
> > will be.
>
> Spend a few minutes skimming the buildbot docs -- I'm not an expert,
> but it's a real system in real use, and they have solutions for the
> obvious problems.  In this case, while the master passes out commands
> to run and collects status, a slave _initiates_ communication with the
> master.  A static IP for the master is good for that, but I figure the
> slave can keep participating happily then for just as long as it can
> keep a socket connection open with the master.
>

OK, so it is a pull from the slave side, not a push on the master side.
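
A minimal sketch of that arrangement, using plain sockets on localhost. This is not buildbot's actual protocol or API; the function names and the one-line text exchange are assumptions made purely for illustration. It shows the point Tim describes: the slave opens the connection, so only the master needs a reachable address, and the master then passes a command down that already-open connection and collects the status.

```python
import socket
import threading


def run_master(server_sock, command, results):
    """Accept one slave connection, send it a command, record its reply."""
    conn, _addr = server_sock.accept()
    with conn, conn.makefile("r") as stream:
        conn.sendall(command.encode() + b"\n")
        results.append(stream.readline().strip())


def run_slave(master_addr):
    """Dial out to the master, wait for a command, report a status line."""
    with socket.create_connection(master_addr) as sock:
        with sock.makefile("rw") as stream:
            command = stream.readline().strip()
            # A real slave would run the checkout/build/test steps here.
            stream.write(f"done: {command}\n")
            stream.flush()


def demo():
    """Run a master and one slave in the same process and return the replies."""
    server_sock = socket.socket()
    server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    results = []
    master = threading.Thread(
        target=run_master, args=(server_sock, "run tests", results)
    )
    master.start()
    run_slave(server_sock.getsockname())  # the slave initiates the connection
    master.join()
    server_sock.close()
    return results
```

Calling `demo()` returns the list of status lines the master collected from the slave.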

> If a slave goes away (network problem; powered off; whatever), that's
> fine too.  The master can't talk to it then, and the slave's column in
> the master's display will say the slave is offline.  Or, if it's been
> so long that all info about the slave has "scrolled off" the display,
> the column will just say "none" above it.  You can see a couple
> examples of that in the
>
>     http://buildbot.zope.org
>
> display today, for some Windows slaves that have gone missing in action.
>
> > Then again, I am willing to bet that the Python community is big enough
> > that enough people have idle machines for us to get good coverage.
> > I wonder if we would have to worry about result pollution from someone
> > who thought it was funny to send in false negatives?
>
> I wouldn't worry about it.  For one thing, while anyone can volunteer
> to participate, the buildbot master's admin has to set up config info
> for each specific slave they want to _allow_ to participate.  It's
> more like a moderated mailing list that way ;-)  A slave also needs to
> know a password (which the master's admin emails (for example) to the
> slave's admin if the former wants the latter to participate).
>
> ...
>
> >> One downside is that we seem unable to get an in-house Windows
> >> buildbot slave to work reliably, and so far don't even know whether that's
> >> because of Windows, the buildbot code, or flakiness in our internal
> >> network.  It seems quite reliable on Linux, though.
>
> > Well, it is written in Python so *someone* here should either be able
> > to fix it or properly blame it on Windows.  =)
>
> Yup!
>
> > The idea of the PSF paying to have some machines set up to run
> > consistent tests has come up multiple times.
>
> A brilliant part of the buildbot approach is that a sub-community
> claiming to care about a platform (major or not) can put a bit of
> resource where their mouth is by offering part-time use of a box to
> run the checkout/build/test dance.  That way platform experts who
> presumably care about their platform are in charge of all aspects of
> setting their platform up.  No centralized "compile farm" can work as
> well, unless it has enough money to buy machines -- and expert
> caretakers -- for umpteen off-the-wall OS variations.
>

I guess if someone complains about wanting better support for platform
X we could then say that we don't have a buildbot slave for it and if
they want to help the situation they can get one set up.  =)

> > I know Neal has said he would be willing to host the machines at his
> > house before (but I think this may have been before his relocation to
> > California).
>
> Looks like he's already running automated tests:
>
>     http://docs.python.org/dev/results/
>
> The various steps there could be defined as buildbot actions, and then
> run on any number of boxes "whenever something changes".
>

See, that is what threw me: I thought that when the master learns of a
change, it pushes it out to the slaves.  I guess the master notes
that there is a reason to do a new run and that is what the slaves are
constantly checking with the master about.

> > This whole situation of going two months without knowing that a major
> > platform is broken shows that this is a real problem and ignoring it is
> > probably not a good thing.  =)
>
> It's generally true that the sooner you find out something has broken,
> the easier it is to repair it.  For "minor" platforms, I'd say we have
> no idea whether the trunk has regressed wrt 2.4.2.  There's simply no
> way to know without trying it.
>

Right.  I am sure that is part of the reason AIX keeps breaking.

> > If we ask for volunteer machines we could offer to put up company or
> > personal names on the buildbot page of those who have volunteered CPU
> > cycles.  I am sure that will help motivate companies and people to
> > install the software on a spare machine.
>
> Finding volunteers has been surprisingly (to me) difficult.  Most (but
> not all) of the machines you see on the Zope page are ZC-internal
> boxes, and, e.g., a Mac OS box is still missing.
>

If the install process is really simple and we give people an easy way
to specify how often/when they poll the master, then I think more
people would be willing to do it.  If you can have your box at work do
it after work hours, or have your box at home do it while you are at
work during the week, then I think more people will step up.  Lowering
the barrier and letting people limit the impact on their machines
to the times they choose should help.
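
As a rough sketch of the "only when they want it" idea: a volunteer's machine could check the current time against an allowed window before asking for work. The helper name `in_window` and the window values are hypothetical, not a buildbot setting; the docs may well offer something different.

```python
from datetime import time


def in_window(now, start, end):
    """Return True if `now` is inside [start, end).

    The window may wrap past midnight (e.g. 18:00 to 08:00).
    If start == end the window is treated as empty.
    """
    if start <= end:
        return start <= now < end
    # Wrapping window: late evening or early morning both count.
    return now >= start or now < end


# Hypothetical schedule: a work box that only builds outside office hours.
AFTER_HOURS = (time(18, 0), time(8, 0))
```

A slave would then call `in_window(datetime.now().time(), *AFTER_HOURS)` and skip contacting the master whenever it returns False.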

Maybe this is all in the docs, I don't know (about to leave for Xmas
dinner so don't have the time right now).

> > Heck, I would have no problem giving a specific company sole sponsorship
> > kudos if they gave us boxes that covered enough core operating systems.
>
> Cool too.
>
> > Maybe this is something to bring up at the PSF meeting and to hash out
> > at the sprints?
>
> It primarily takes someone with access to "a python.org machine" to
> volunteer to install and play admin for a buildbot master.  A
> committee wouldn't actually help with that ;-)

Well, maybe Neal will be up for this on top of the auto test he has
set up.  I would offer to do it myself, but I don't have the proper
server access on pydotorg and I don't have much experience administering
Linux; otherwise I would be willing to do it with someone.

The other testing option I have seen tossed around is giving
regrtest.py an option to email the results of a test run
somewhere.  If any tests failed, it would rerun them directly and put
that output in an email along with the system information and an
optional contact email address, in case the person is willing to help
debug the problem.  That would be great for alpha and beta releases,
since it doesn't require a dedicated system, just permission to send
an email with some system info.
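
A minimal sketch of what such a report might look like, using only the standard library. Nothing here is an existing regrtest.py option: the function name `build_report`, its parameters, and the layout of the message body are assumptions for illustration, and actually sending the message (say, with smtplib) is left out.

```python
import platform
import sys
from email.message import EmailMessage


def build_report(failed_tests, output, to_addr, contact=None):
    """Build (but do not send) an email describing failed tests.

    failed_tests: names of the tests that failed
    output:       text captured from rerunning those tests directly
    contact:      optional address of someone willing to help debug
    """
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = (
        f"Test failures on {platform.system()}: {', '.join(failed_tests)}"
    )
    lines = [
        f"Python: {sys.version.split()[0]}",
        f"Platform: {platform.platform()}",
        f"Machine: {platform.machine()}",
    ]
    if contact:
        lines.append(f"Contact: {contact}")
    lines += ["", "Output from rerunning the failed tests:", output]
    msg.set_content("\n".join(lines))
    return msg
```

The contact line is included only when the person running the tests supplies an address, which keeps the report anonymous by default.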

-Brett