> It is nigh impossible to pick out errors in a PR from flaky tests, flaky machines and pre-existing errors.
I agree with the sentiment, but it is certainly not that dramatic. We (the people watching the build bots) do it on a regular basis, and many contributors do it very efficiently as well. I agree that flaky tests and machines do not help here, but those are generally just a small subset (although a noisy one). Pre-existing errors should be fixed, so one could argue that these errors being annoying is a positive thing from a certain point of view.
I think the best approach for a CI check is to link every source file to a particular set of tests and then use the diff to select which tests to run. Note that this is still not enough, because some tests (like test_asyncio) will take a huge amount of time on their own in refleak mode.
Additionally, some files will touch virtually all tests (like changes to the VM), and some refleaks will only be visible with an unknown subset of them.
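To make the idea concrete, here is a rough sketch of what diff-based test selection could look like. Everything in it is an assumption for illustration: the `SOURCE_TO_TESTS` mapping, the `RUN_EVERYTHING` patterns and the `select_tests` helper are hypothetical names, not something that exists in the repository today. It falls back to the full suite whenever a change touches something like the VM or a file nobody has mapped yet.

```python
from fnmatch import fnmatch

# Hypothetical mapping from source-file patterns to the tests that cover them.
SOURCE_TO_TESTS = {
    "Lib/asyncio/*": {"test_asyncio"},
    "Lib/json/*": {"test_json"},
    "Modules/_json.c": {"test_json"},
}

# Hypothetical patterns for files that can affect virtually every test.
RUN_EVERYTHING = ("Python/*", "Objects/*", "Include/*")


def select_tests(changed_files):
    """Return the set of tests to run, or None meaning "run the full suite"."""
    selected = set()
    for path in changed_files:
        # Core changes (e.g. the VM) can break anything: no selection possible.
        if any(fnmatch(path, pattern) for pattern in RUN_EVERYTHING):
            return None
        matched = False
        for pattern, tests in SOURCE_TO_TESTS.items():
            if fnmatch(path, pattern):
                selected |= tests
                matched = True
        # Unmapped file: be conservative and run everything.
        if not matched:
            return None
    return selected


print(select_tests(["Lib/json/decoder.py", "Modules/_json.c"]))
print(select_tests(["Python/ceval.c"]))
```

The list of changed files could come from something like `git diff --name-only` against the base branch. Even with a mapping like this, the two caveats above still apply: a selected test such as test_asyncio can be very slow in refleak mode, and the conservative fallback means VM changes get no speedup at all.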
> I would rather have to wait a while to merge good code, than merge bad code quickly.
By that logic, it seems that waiting for the buildbots is not so bad ;)