Threads + Re(gex) Baffled!?!?!?

Tim Peters tim_one at email.msn.com
Sun Sep 19 18:58:38 EDT 1999


[Cliff Daniel, with some anomalous timings using some flavor of threads]
> Problem found... read below.
> ...
>   I have no idea why i used re.split().  I always use
> string.split().  Well for giggles I changed it to string.split()
> and altered nothing else.  Guess what?  The cpu/kernel never went
> above 2%.  Problem resolved.  What the heck?

I've lost track of what "nothing else" might mean at this point, but all
symptoms are valuable clues <wink>.  The primary difference between re.split
and string.split in this context is probably that the latter is written
entirely in C while much of the former is written in Python (in Lib/re.py).
Once string.split is entered, Python won't even *try* to let another thread
run until string.split returns (because it's all C code).  When re.split is
entered, Python will try to give another thread a shot every 10 (by default)
bytecodes, for as long as re.split is executing Python code.
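
For a concrete picture of the swap being discussed, here is a minimal sketch.
The pattern Cliff actually passed to re.split isn't shown in the thread, so
the whitespace pattern and the function names below are assumptions made for
illustration.  string.split(line) was the spelling at the time; the string
method line.split() is used here so the sketch also runs on later Pythons.

```python
import re

# Assumed pattern: the original call is not shown in the thread.
WHITESPACE = re.compile(r"\s+")


def split_with_re(line):
    # Goes through the re module to split on runs of whitespace.
    return WHITESPACE.split(line)


def split_with_str(line):
    # Plain string split on runs of whitespace.
    return line.split()


if __name__ == "__main__":
    sample = "GET /index.html  200"
    print(split_with_re(sample) == split_with_str(sample))

    # The two are not drop-in replacements when the line is padded.
    padded = "  GET /index.html  "
    print(split_with_re(padded))
    print(split_with_str(padded))
```

The two agree on lines without leading or trailing whitespace.  On padded
lines the regex version also returns empty strings at the ends, so check the
inputs before swapping one for the other.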

So it's possible that when using re.split your system is just thrashing
madly.  If you still care <wink>, you can use

    sys.setcheckinterval(n)

to boost the number of bytecodes executed before Python offers to let
another thread run.  If bumping it up to (say) 100 makes the kernel time go
down, your platform simply sucks at mediating fine-grained thread contention
in the way it's currently configured.
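
A rough way to run that experiment is sketched below, under some stated
assumptions.  The worker, the thread count and the sample data are made up for
illustration and are not Cliff's program.  Later Pythons replaced
sys.setcheckinterval(n) with the time-based sys.setswitchinterval(seconds),
so the helper uses whichever exists; the n-to-seconds scaling is an arbitrary
choice of this sketch, not an equivalence.

```python
import os
import sys
import threading


def set_interval(n):
    """Ask the interpreter to offer thread switches less often."""
    if hasattr(sys, "setswitchinterval"):
        # Newer interpreters: time-based knob.  The scaling is hypothetical,
        # chosen so that n=10 maps to the 0.005 second default.
        sys.setswitchinterval(n * 0.0005)
    else:
        # Interpreters of this thread's era: n counts bytecodes.
        sys.setcheckinterval(n)


def worker(lines, results, slot):
    # A Python-level loop, so there are bytecodes to switch between.
    count = 0
    for line in lines:
        count += len(line.split())
    results[slot] = count


def kernel_time_for(n_threads, lines):
    """Run n_threads workers; return (system CPU seconds used, total fields)."""
    results = [0] * n_threads
    threads = [threading.Thread(target=worker, args=(lines, results, i))
               for i in range(n_threads)]
    before = os.times()[1]  # index 1 is system (kernel) CPU time
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return os.times()[1] - before, sum(results)


if __name__ == "__main__":
    lines = ["a b c d e"] * 20000
    for n in (10, 100):
        set_interval(n)
        kernel, total = kernel_time_for(8, lines)
        print("interval", n, "kernel seconds", kernel, "fields", total)
```

If the kernel figure drops noticeably at the larger setting, that points at
the cost of switching threads rather than at the work itself.  The numbers
will vary by platform and interpreter, so treat them as a hint, not a
benchmark.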

> ...
>   The problem didn't exist with one thread.  As you increased the
> threadpool the problem grew.

Well, very few thread problems manifest when there's only one thread <wink>.

no-contention-implies-no-contention-ly y'rs  - tim





