[docs] [issue29575] doc 17.2.1: basic Pool example is too basic

Davin Potts report at bugs.python.org
Mon Feb 20 20:42:29 EST 2017


Davin Potts added the comment:

In judging what is "too basic", keep in mind that the initial example should be so basic as to be immediately digestible by as many people as possible.

Some background:
All too many examples mislead newcomers into believing that the number of processes should (a) match the number of processor cores, or (b) match the number of inputs to be processed.  This example currently attempts to dispel both notions.  In practice, depending on the specific code to be run in parallel, it is not uncommon to find that slightly over-scheduling the number of processes relative to the number of available cores achieves superior throughput.  In other cases, slightly under-scheduling may provide a win.  To subtly steer the newcomer away from notion (a), this example uses 5 processes rather than a number that might be mistaken for a common core count on current multi-core processors.  Likewise, the number of distinct inputs deliberately matches neither the number of processes nor a multiple of it.  This hopefully encourages the newcomer not to feel obligated to accept only inputs of a particular size or multiple.  Granted, optimizing for performance motivates tuning such things, but this is the first example, the reader's first glance at what functionality is available.
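
For reference, the kind of example being described looks roughly like the sketch below: 5 worker processes and 3 inputs, so the process count matches neither a typical core count nor the input count.  This is an illustration of the points above, not necessarily the exact text in the documentation.

```python
from multiprocessing import Pool


def f(x):
    # Deliberately trivial work so the reader can focus on the Pool itself.
    return x * x


if __name__ == '__main__':
    # 5 processes, 3 inputs: neither number depends on the other.
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))  # prints [1, 4, 9]
```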

Considering the suggested change:
* range(20) will likely produce more output than can be comfortably accommodated and easily read in the browser window in which most readers will see it
* the addition of execution time measurement is an interesting choice here given how computationally trivial the f(x) function is, which is perhaps what motivated the introduction of a time.sleep(1) inside that function; a ThreadPool would be more appropriate for a sleepy function such as this
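
To illustrate the second point, here is a minimal sketch of a "sleepy" function run under a ThreadPool instead of a process Pool.  The names sleepy_square and run, and the 0.1 second delay, are made up for this illustration and do not come from the proposed change.

```python
import time
from multiprocessing.pool import ThreadPool


def sleepy_square(x):
    # Hypothetical stand-in for work that mostly waits (I/O, sleep)
    # rather than computing.
    time.sleep(0.1)
    return x * x


def run(inputs, workers=5):
    # Threads are enough here: a sleeping thread does not hold the GIL,
    # so the waits overlap without the cost of starting processes.
    with ThreadPool(workers) as pool:
        return pool.map(sleepy_square, inputs)


if __name__ == '__main__':
    start = time.perf_counter()
    results = run(range(7))
    elapsed = time.perf_counter() - start
    print(results, f"{elapsed:.2f}s")
```

With 7 inputs and 5 threads the waits overlap, so the elapsed time should be well under the 0.7 seconds a sequential loop would need, though the exact figure will vary by machine.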

Ultimately these changes complicate the example while potentially undermining its value.  An interesting improvement might be to introduce a computationally taxing function that more clearly demonstrates the benefit of using a process Pool, while still achieving the ideal of being immediately digestible by the largest possible reading audience.  Some of the topics/variations in the proposed change might be better introduced and addressed later in the documentation rather than complicating the first example unnecessarily.
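
One possible shape for such a computationally taxing example is sketched below.  The function count_primes and the chosen limits are hypothetical, picked only because naive trial division is CPU-bound and easy to read; this is not a proposal for the actual documentation text.

```python
from multiprocessing import Pool


def count_primes(limit):
    """Count the primes below limit using deliberately naive trial division."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in 2..sqrt(n) divides it evenly.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == '__main__':
    # Three CPU-bound tasks spread over 5 processes, keeping the
    # "numbers need not match" spirit of the basic example.
    limits = [20000, 30000, 40000]
    with Pool(5) as pool:
        print(pool.map(count_primes, limits))
```

Because each call keeps a core busy rather than sleeping, this is the kind of work where separate processes, not threads, are expected to help.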

----------
resolution:  -> works for me
stage:  -> resolved
status: open -> closed
type:  -> enhancement

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue29575>
_______________________________________

