
On Wed, Jun 24, 2015 at 04:55:31PM -0700, Nathaniel Smith wrote:
> On Wed, Jun 24, 2015 at 3:10 PM, Devin Jeanpierre <jeanpierreda@gmail.com> wrote:
> > So there are two reasons I can think of to use threads for CPU parallelism:
> >
> >  - My thing does a lot of parallel work, and so I want to save on
> >    memory by sharing an address space
> >
> > This only becomes an especially pressing concern if you start running
> > tens of thousands or more of workers. Fork also allows this.
>
> Not necessarily true... e.g., see two threads from yesterday (!) on the
> pandas mailing list, from users who want to perform queries against a
> large data structure shared between threads/processes:
>
>   https://groups.google.com/d/msg/pydata/Emkkk9S9rUk/eh0nfiGR7O0J
>   https://groups.google.com/forum/#!topic/pydata/wOwe21I65-I
>   ("Are we just screwed on windows?")
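Devin's "Fork also allows this" and the Windows question in the second link come down to the same mechanism. A minimal stdlib sketch of it (names here are illustrative, not from either thread): on POSIX, fork gives each worker the parent's address space via copy-on-write, which is exactly what Windows lacks.

```python
# Sketch of fork-based sharing (illustrative names). On POSIX, fork()
# gives each worker the parent's address space via copy-on-write pages,
# so a large read-only structure is never copied up front. Windows has
# no fork: the "spawn" start method must re-import and re-pickle state
# into each worker, which is why this pattern is painful there.
import multiprocessing as mp

BIG_TABLE = {i: i * i for i in range(100_000)}  # built before forking

def query(key):
    # Forked workers read the inherited BIG_TABLE; nothing is serialized.
    return BIG_TABLE[key]

if __name__ == "__main__":
    ctx = mp.get_context("fork")  # POSIX-only start method
    with ctx.Pool(4) as pool:
        print(pool.map(query, [10, 20, 30]))  # [100, 400, 900]
```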
Ironically (not knowing anything about Pandas' implementation details other than... "Cython... and NumPy"), there should be no difference between making a Pandas DataFrame available to PyParallel and doing the same with a NumPy ndarray or a Cythonized C struct (like datrie). The situation Ryan describes is literally the exact situation PyParallel excels at: large reference data structures accessible from parallel contexts.

Trent.
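For the shape of that access pattern in stock CPython (this is not PyParallel; plain threads here only truly overlap for I/O or for GIL-releasing C code such as NumPy's inner loops), a hedged stdlib-only sketch: build one large read-only reference structure once, then query it from many threads with zero per-worker copies.

```python
# Stand-in for a big DataFrame/ndarray/trie: one large read-only
# reference structure, built once and never mutated, queried from
# several threads that all share the same address space.
from concurrent.futures import ThreadPoolExecutor

REFERENCE = {f"key-{i}": i for i in range(100_000)}

def lookup(key):
    # Every thread reads the same object in place; no copies are made.
    return REFERENCE[key]

with ThreadPoolExecutor(max_workers=8) as pool:
    hits = list(pool.map(lookup, ["key-1", "key-42", "key-99999"]))
```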