Why does start_new_thread() create an extra process under Linux?

Heiko Wundram heikowu at ceosg.de
Thu Jul 29 11:31:13 EDT 2004


On Thursday, 29 July 2004 16:00, Jp Calderone wrote:
>    Most likely, the "extra" you are seeing is an implementation detail
> of your platform's underlying thread library.  It probably exists to act
> as a scheduler or perform other administrative tasks for the "real"
> threads of your application.

Well, first of all, what the OP was seeing wasn't actually what he thought he 
was seeing.

In Python there's always the main thread (which is started when Python starts 
up), and other threads may be started. Thus, if you start two threads in your 
program, you'll see three processes in the process list (one for the main 
thread, two for the started threads).
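
To check the "main thread plus started threads" count from inside Python, 
something like the following rough sketch should do. It uses the threading 
module rather than start_new_thread(), and count_threads is just a name made 
up for this example:

```python
import threading

def count_threads(n_workers=2):
    """Return (threads alive before, threads alive while workers run)."""
    before = threading.active_count()   # includes the main thread
    release = threading.Event()
    # Each worker simply blocks until the event is set.
    workers = [threading.Thread(target=release.wait) for _ in range(n_workers)]
    for w in workers:
        w.start()
    during = threading.active_count()   # main thread + running workers
    release.set()                       # let the workers finish
    for w in workers:
        w.join()
    return before, during

if __name__ == "__main__":
    before, during = count_threads(2)
    print(before, during)
```

Run as a plain script, the second number should be the first plus two, 
i.e. the main thread and the two started threads.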

But whether these threads will show up as processes depends on the threading 
library you use...

LinuxThreads creates a process for each thread that is run. All these 
processes share the same memory, although they show up as separate processes. 
As far as the kernel is concerned they really are separate processes: they 
are started by the clone() system call, which creates a new process with its 
own process ID, stack and instruction pointer, but keeps the data and code 
segments of the calling process.

NPTL (Native POSIX Thread Library), the "next-generation" threads library for 
Linux, handles threads "correctly" in the sense that they are just one 
process with separate execution frames but shared memory. NPTL requires 
kernel >= 2.5.40-something and a specially adapted glibc. Most new Linux 
distributions (>= 9.0 something, Debian sid aka unstable) ship with NPTL 
enabled by default, although this creates compatibility problems with apps 
written for LinuxThreads, as LinuxThreads isn't completely POSIX threads 
compatible (which NPTL is). NPTL also uses some form of syscall to create 
threads, but you'd have to see the docs for the details; I don't know them. 
ps from procps was extended to support NPTL threads some time ago; there's a 
specific flag you have to specify to have threads shown.
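
One way to see the "one process, several threads" behaviour from Python 
itself is to compare os.getpid() with the kernel-level thread ID in each 
thread. This is only a sketch: it assumes Python 3.8 or later for 
threading.get_native_id(), and collect_ids is a name invented for the 
example. With NPTL, every thread should report the same PID but its own 
native thread ID; under LinuxThreads, getpid() returned a different value in 
each thread.

```python
import os
import threading

def collect_ids(n_workers=2):
    """Return (pid, native thread id) pairs for the caller and n_workers threads."""
    ids = [(os.getpid(), threading.get_native_id())]
    lock = threading.Lock()
    # The barrier keeps every worker alive until all have recorded their IDs,
    # so the kernel cannot reuse a thread ID within one call.
    barrier = threading.Barrier(n_workers + 1)

    def worker():
        with lock:
            ids.append((os.getpid(), threading.get_native_id()))
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    barrier.wait()          # released once all workers have recorded
    for t in threads:
        t.join()
    return ids

if __name__ == "__main__":
    for pid, tid in collect_ids(2):
        print("pid", pid, "native thread id", tid)
```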

There are also other Linux threads libraries out there, all of them completely 
implemented in user space, using dispatch/longjmp and other black magic. When 
a program uses one of these, you'll also see only one process, although I 
don't know of any production program that uses one of these threading 
libraries.
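
To give a feel for why user-space threads show up as a single process, here 
is a toy cooperative scheduler built on Python generators. It is only an 
analogy, not how those C libraries are implemented (they switch stacks 
rather than use generators), and task and run_round_robin are names made up 
for the example. All "threads" run inside one kernel thread, so each log 
entry carries the same native thread ID.

```python
import threading
from collections import deque

def task(name, steps, log):
    """A cooperative 'thread': record one step, then yield to the scheduler."""
    for i in range(steps):
        log.append((name, i, threading.get_native_id()))
        yield  # hand control back to the scheduler

def run_round_robin(tasks):
    """Run the generator-based tasks in turn until all of them have finished."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)          # run the task up to its next yield
        except StopIteration:
            continue               # task is done, drop it
        queue.append(current)      # otherwise reschedule it at the back

if __name__ == "__main__":
    log = []
    run_round_robin([task("a", 2, log), task("b", 2, log)])
    for entry in log:
        print(entry)
```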

Anyway, hope this clears it up a little...

Heiko.


