[py-dev] xdist and thread-safe resource counting

Fri Jan 20 23:26:48 CET 2012

On Fri, Jan 20, 2012 at 12:56 -0800, Ateljevich, Eli wrote:
> Thanks, Holger. I appreciate the hint about where to do the testing and waiting in pytest_runtest_setup and I think the atomic file rename idea is an interesting way to set up a signal.
> 
> I may not fully understand about xdist. I certainly agree it is efficient use of mpirun that is the crux of doing the job right ... so probably any load balancing offered by xdist is going to be wasted on me. 
> 
> The main service I was looking for out of xdist was the ability to run tests concurrently. As I think you realize, if I have a pool of 16 processors and the first four tests collected require 8, 4, 8, 4 processors, I would want this behavior:
> 1.  the first test to start immediately
> 2.  the second test to start immediately without the first finishing
> 3.  the third test to either wait or start in a python sense but "sleep" before launching mpi
> 4.  the fourth test to start immediately
> 
> Is vanilla py.test able to do this kind of concurrent testing? Or would I need to tweak it to launch tests in threads according to my criterion for readiness? 

A run with pytest-xdist, notably "py.test -nNUM", should make it possible
to implement this behaviour, I think.

> I think we have settled how I would allocate resources, but your idea implies I might have all the test hints in one place. If I have full control over all the test launches, this might allow me to do some sort of knapsack problem-ish kind of reorganization to keep everything fully utilized rather than taking the tests in the order they were collected. For instance, if I had 16 processors and the first four tests take 12-12-4-4 I could do this in the order (12+4 concurrently) (12+4 concurrently). Do I have this level of control?

I think so, yes.  IIRC pytest-xdist distributes the first four tests
such that they each land on different nodes.  So, given the algorithm
I hinted at, and running with "py.test -n3", the first sub process would
start and run on 12 processors.  The second process would see that
there are 12 in use and wait until 12 become available.  The
third process would only need 4 and immediately continue, utilizing
all 16 processors at that time.  When the first one finishes, the
second sub process would see that there now are enough and proceed
with its testing.  This is all fully compatible with pytest-xdist
semantics and only needs code at pytest_runtest_setup time, I think.
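
A minimal sketch of that waiting scheme, as it might look in a conftest.py.
Several things here are assumptions rather than anything pytest or
pytest-xdist provides: the ProcessorPool class, the "processors" marker
name, the pool size of 16 and the shared directory under the temp dir are
all made up for illustration. It uses an atomic os.mkdir as a cross-process
lock instead of the file rename mentioned earlier, and it reads the marker
with item.get_closest_marker, which is the API in current pytest versions.

```python
import os
import tempfile
import time


class ProcessorPool:
    """Hypothetical cross-process count of processors in use."""

    def __init__(self, path, total):
        self.path = path
        self.total = total
        self.lock = os.path.join(path, "lock")
        self.counter = os.path.join(path, "used")

    def _locked_update(self, delta):
        os.makedirs(self.path, exist_ok=True)
        # os.mkdir is atomic: only one process can create the lock dir.
        while True:
            try:
                os.mkdir(self.lock)
                break
            except FileExistsError:
                time.sleep(0.01)
        try:
            used = 0
            if os.path.exists(self.counter):
                with open(self.counter) as f:
                    used = int(f.read() or 0)
            if used + delta > self.total:
                return False  # not enough free processors right now
            with open(self.counter, "w") as f:
                f.write(str(used + delta))
            return True
        finally:
            os.rmdir(self.lock)

    def try_acquire(self, n):
        return self._locked_update(n)

    def acquire(self, n, poll=0.5):
        # "Sleep" in the Python sense until n processors are free.
        while not self.try_acquire(n):
            time.sleep(poll)

    def release(self, n):
        self._locked_update(-n)


# Assumed location and size; every xdist sub process must see the same path.
POOL = ProcessorPool(os.path.join(tempfile.gettempdir(), "mpi-pool"), total=16)


def _needed(item):
    # Assumes tests are decorated like @pytest.mark.processors(12).
    marker = item.get_closest_marker("processors")
    return marker.args[0] if marker else 0


def pytest_runtest_setup(item):
    n = _needed(item)
    if n:
        POOL.acquire(n)  # blocks this sub process until n are available


def pytest_runtest_teardown(item):
    n = _needed(item)
    if n:
        POOL.release(n)
```

The sketch does not clean up after a crashed run: a stale lock directory
or counter file would have to be removed by hand before the next session.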

best,
holger