Threads + Re(gex) Baffled!?!?!?
Gordon McMillan
gmcm at hypernet.com
Wed Sep 15 23:14:17 EDT 1999
Cliff Daniel writes:
> On Wed, 15 Sep 1999 09:10:25 -0400, "Gordon McMillan"
> <gmcm at hypernet.com> wrote:
>
> >The problem could be re, or a platform problem not exposed until
> >you use multiple CPUs, or a combination of these.
>
> I'm not sure if I mentioned this either, but running with 1 thread
> doesn't seem to cause the problem. I can't think of any reason 1
> thread or 10 should bring out this behaviour, since the lwp
> associated with the thread can be scheduled on any given cpu at
> any given time. Something's not adding up.
This is a not-uncommon problem when multithreading with
multiple CPUs. The re module is loaded in memory belonging
to CPU A. If all activity takes place on CPU A, no problem.
But if CPU B wants access to memory owned by A, there's
some system-driven locking going on, particularly if that
access involves any mallocs, in which case the lock may put
extreme limits on what CPU A can do without owning the
lock. Now if the schedulers for the 2 CPUs don't coordinate
pretty closely, you may end up with both CPUs waiting for
locks to be released, while the thread owning the lock is not
even running. The timings in your post seem to point to this
kind of problem.
In this kind of situation, even with really well-done system
support for multithreading on multiple CPUs, you'll still never
get anywhere close to twice the throughput by doubling the
number of CPUs.
So I don't think the regex-in-its-own-thread solution is a hack -
I think it's the proper solution. Your I/O bound threads (the
socket threads) can probably parallelize very nicely, but the
compute intensive resource is owned by a single thread, as it
should be.
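
A minimal sketch of the regex-in-its-own-thread arrangement described above, assuming a queue is used to hand work to the one thread that owns the compiled pattern. The names PATTERN, regex_worker and match_via_worker, and the pattern itself, are made up for illustration and are not taken from the original code.

```python
import queue
import re
import threading

# Hypothetical pattern, for illustration only.
PATTERN = re.compile(r"(\w+)=(\d+)")

# Work items are (text, reply_queue) pairs; None is the shutdown sentinel.
requests = queue.Queue()


def regex_worker():
    """Run every regex match in this one thread, one request at a time."""
    while True:
        item = requests.get()
        if item is None:
            break
        text, reply = item
        reply.put(PATTERN.findall(text))


def match_via_worker(text):
    """Called from any socket thread; blocks until the regex thread answers."""
    reply = queue.Queue(maxsize=1)
    requests.put((text, reply))
    return reply.get()


worker = threading.Thread(target=regex_worker, daemon=True)
worker.start()
```

A socket thread would call match_via_worker(data) instead of touching the pattern directly; putting None on the requests queue stops the worker.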
'Course good old-fashioned multi-processing wouldn't have this
problem.
- Gordon