Threads in Python

Stephen Hansen me+list/python at ixokai.io
Thu Sep 1 18:27:32 EDT 2011


On 9/1/11 2:45 PM, George Kovoor wrote:
> Hi,
> Why doesn't python threads show an associated PID?  On spawning python
> threads using the threading module I can only see the main thread's pid on
> using top or ps unix command, no  subprocesses are displayed. In otherwords
> top or ps in not aware of any subprocesses created using threading module in
> python.
>
> Whereas in Java , creating threads will result in separate pid , these
> subprocesses can be listed using top or ps. Java threads get mapped to the
> cores in the system.

I think you're confused about what threads and subprocesses are. They
are completely different mechanisms for concurrent code. Threads don't
show up as separate processes in the default output of top or ps, in any
language ... or the language isn't offering threads. (On Linux you can
ask to see them, e.g. with top -H or ps -eLf, but they still belong to
the one process.) I don't know Java, so I can't really comment on it
much; it may be misusing the word 'thread', but I somehow doubt it. I
suspect you're just mistaken about what Java is offering.
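
If you want to check this from Python itself, here's a minimal sketch. It
assumes Python 3.8 or later (for threading.get_native_id); the function
name collect_ids is just something I made up for illustration. Every
thread should report the same os.getpid(), while the kernel-level thread
IDs differ.

```python
import os
import threading


def collect_ids(n_threads=3):
    """Start n_threads threads and return a list of (pid, native_thread_id)."""
    results = []
    lock = threading.Lock()
    # The barrier keeps all threads alive at the same time, so the
    # operating system cannot recycle a thread ID between workers.
    barrier = threading.Barrier(n_threads)

    def worker():
        ids = (os.getpid(), threading.get_native_id())
        with lock:
            results.append(ids)
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print("main thread pid:", os.getpid())
    for pid, tid in collect_ids():
        print("worker pid:", pid, "native thread id:", tid)
```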

Threads are separate operating ..er, chains-of-instructions within a
single process... Notably with threads, they share the same address
space, so you can easily share objects amongst threads without copying
them ... Also notably with threads, this can be dangerous, because two
threads can modify the same object at once, so you often end up wrapping
lots of locks around those shared objects and have to take extreme care
to make sure nothing goes haywire.
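
As a rough illustration of the lock-wrapping, here's a sketch where several
threads update one shared dict. The names (count_with_lock, counter) are
hypothetical, and a real program would usually have much more than a
single counter to protect.

```python
import threading


def count_with_lock(n_threads=4, increments=10_000):
    """Have several threads increment one shared counter under a lock."""
    counter = {"value": 0}  # a single object, visible to every thread
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            # The read-modify-write below is not atomic, so it is guarded.
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    print(count_with_lock())
```

The point is only that every thread sees the very same dict; nothing is
copied or sent anywhere.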

Subprocesses are different; each one is a whole, separate process with
its own PID, its own address space and no shared memory (unless you go
out of your way to set that up manually). Heck, each subprocess can have
any number of threads. Anything you want to share between processes you
have to take special care to set up -- multiprocessing exists to make
this easier, and to make subprocesses about as easy to use as threads.
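
To see the difference, here's a sketch using multiprocessing. The names
data, child and run_child are made up for the example. The child gets its
own PID and its own copy of data, so the change it makes never reaches the
parent; the only thing that comes back is what is explicitly sent through
the queue.

```python
import multiprocessing
import os

data = {"value": 0}  # module-level object; each process has its own copy


def child(queue):
    data["value"] = 42  # modifies the child's copy only
    queue.put((os.getpid(), data["value"]))


def run_child():
    """Run child() in a separate process and report what each side sees."""
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=child, args=(queue,))
    proc.start()
    child_pid, child_value = queue.get()  # explicit hand-off between processes
    proc.join()
    return child_pid, child_value, data["value"]


if __name__ == "__main__":
    child_pid, child_value, parent_value = run_child()
    print("parent pid:", os.getpid(), "child pid:", child_pid)
    print("value in child:", child_value, "value in parent:", parent_value)
```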

They're very distinct. Threads are a lot more lightweight and start up a
lot faster, but multithreaded programming with any sort of shared
objects is really, really, really hard to get right. Some say you
can't.

But in Python (CPython, at least), only one thread ever executes Python
code at any given time, because of the Global Interpreter Lock (GIL).
This does not make threading useless, as some people claim; if you're
making a lot of calls into C code that releases the GIL (blocking I/O,
for instance, and many C extensions), other Python code can continue
along while said C code runs. It's just not useful if your program is
CPU-bound and wants to take advantage of multiple cores. But there are
lots of other reasons to go concurrent.
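
Here's a small sketch of that. time.sleep stands in for any blocking call
that releases the lock while it waits; the function name and the timings
are just assumptions for the example. Four threads that each block for
0.2 seconds should finish in roughly 0.2 seconds total, not 0.8.

```python
import threading
import time


def run_blocking_calls(n_threads=4, delay=0.2):
    """Run n_threads blocking calls concurrently; return elapsed seconds."""

    def worker():
        # While a thread sleeps it does not hold the interpreter lock,
        # so the other threads are free to run.
        time.sleep(delay)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("elapsed: %.2f seconds" % run_blocking_calls())
```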

But if you do need lots of CPU power, multiprocessing lets you chew up
multiple cores and does so /fairly/ easily. Communication between the
processes can be expensive depending on the types of objects you need to
pass back and forth, but it depends on how you're designing your app.
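
For the CPU-bound case, a sketch along these lines is one way to do it
with multiprocessing.Pool. The prime-counting function is just a
hypothetical stand-in for real work, and the pool size of 2 is arbitrary.
Note that only small things (an int in, an int out) cross the process
boundary here, which keeps the communication cheap.

```python
import math
import multiprocessing


def count_primes(limit):
    """Naive CPU-bound work: count the primes below limit."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, processes=2):
    """Run count_primes on each limit, spread over worker processes."""
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    print(count_primes_parallel([10_000, 20_000, 30_000]))
```

The __main__ guard matters: on platforms that start workers by launching a
fresh interpreter, the module gets imported again in each worker.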

They're just different ways of achieving concurrency, and the two
primary ways Python provides. (Greenlets are another, available as a
third-party module; Twisted's asynchronous dispatching isn't really
exactly concurrency, but it does a better job than concurrency does for
some operations; someone's always working on coroutines in some fashion
or another, which is another kind of concurrency.)

Lots of different ways to go concurrent, depending on your needs.

-- 

   Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+list/python (AT) ixokai (DOT) io
   ... Blog: http://meh.ixokai.io/
