[Python-Dev] Community buildbots and Python release quality metrics
glyph at divmod.com
Thu Jun 26 18:21:08 CEST 2008
On 03:33 pm, guido at python.org wrote:
>Too verbose, Glyph. :-)
Mea culpa. "Glyph" might be a less appropriate moniker than "Altogether
too many glyphs".
>It needs to be decided case-by-case.
If certain tests are to be ignored on a case-by-case basis, why not
record that decision by disabling the test in the code? Otherwise, the
decision inevitably gets boiled down to "it's okay to release the code
with a bunch of tests failing, but I don't know which ones". That is
how it went on Twisted when we used to make case-by-case decisions
about failures, and that is how it is going here now.
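To make the point concrete: the decision can live in the code itself. Here is a hedged sketch using the skip decorator that newer versions of unittest provide (the test names and the reason string are illustrative, not from any real suite):

```python
import unittest

class TestPlatformBehavior(unittest.TestCase):
    # Skipping in the code records the case-by-case decision where
    # everyone can see it, instead of leaving the suite red.
    @unittest.skip("known failure; tracked in the issue tracker")
    def test_known_broken_case(self):
        self.fail("this failure is expected and tracked")

    def test_working_case(self):
        self.assertEqual(1 + 1, 2)
```

A release run of such a suite is green, while the skip count keeps the exception visible rather than buried in folklore about which red tests are "okay".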
>The beta is called beta because, well, it may break stuff and we may
>want to fix it.
That's also why the alpha is called an alpha. My informal understanding
is that a beta should have no (or at least very few) known issues, with
a medium level of confidence that no further ones will be found. An RC
would have absolutely no known issues with a fairly high level of
confidence that no further ones will be found. That would include the
community buildbots mostly working; I would not be surprised if a few
tests still had issues.
But clearly the reality does not meet my informal expectations, so it
would be nice to have something written down to check against. Still,
I'm bringing this up now because it _is_ a beta, and I think it will be
too late to talk about dealing with persistent test failures after the
RCs start coming out.
(Of course, I'm just being sneaky. I don't actually care if it's
clearly documented, I just care that I stop having problems with
incompatibility. But I believe having it clearly spelled out would
actually prevent a lot of the problems that I've been having, since I
don't think anyone would *intentionally* select a policy where every
release breaks at least one major dependent project.)
>I'm not particularly impressed by statistics like "all tests are red"
>may all be caused by a single issue.
The issue, for me, is not specifically that tests are red. It's that
there's no clear policy about what to do about that. Can a release go
out with some of the tests being red? If so, what are the extenuating
circumstances? Does this have to be fixed? If not, why not? Why are
we talking about this now? Shouldn't the change which caused Django to
become unimportable have been examined at the time, rather than months
later? (In other words, if it *is* a single issue, great, it's easy to
fix: revert that single issue.) If not, then shouldn't someone in
Django-land have been notified so they could cope with the change?
Sorry that there are so many questions here; if I had fewer, I'd use
fewer words to ask them.
>For example, I'd much rather read an explanation about *why* Django
>cannot even be imported than a blanket complaint that this is a
>disgrace. So why is it?
I don't know. JP is already addressing the issues affecting Twisted in
another thread (incompatible changes in the private-but-necessary-to-
get-any-testing-done API of the warnings module). But I really think
that whoever made the change which broke it should be the one
investigating it, not me.
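For what it's worth, the public route for this kind of testing is the catch_warnings context manager (added to the warnings module in 2.6). A hedged sketch, with a hypothetical deprecated_api function standing in for whatever a real test exercises:

```python
import warnings

def deprecated_api():
    # Stand-in for an API under test that emits a deprecation warning.
    warnings.warn("deprecated_api is deprecated", DeprecationWarning,
                  stacklevel=2)
    return 42

# catch_warnings saves and restores the warnings module's global state,
# so a test can assert that a warning fired without reaching into
# private internals.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = deprecated_api()

assert result == 42
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```

Whether this covers everything Twisted's test machinery needed from the private API is exactly the sort of question the person who made the change is best placed to answer.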