Threads + Re(gex) Baffled!?!?!?
Gordon McMillan
gmcm at hypernet.com
Wed Sep 15 09:10:25 EDT 1999
Cliff,
The problem could be re, or a platform problem not exposed
until you use multiple CPUs, or a combination of these.
A solution might be to dedicate one thread to running the
regexes, and use Queue to pass information to it from the
socket threads. Assuming that your regex exhibits decent
behavior, the bottleneck is probably in the network, anyway.
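
A rough sketch of that arrangement, not a tested fix for the Solaris
problem. It is written for a current Python, where the module is
spelled "queue" (in 1.5.2 it is "Queue"). The names regex_worker,
socket_worker, run and the canned reply string are made up for
illustration; a real socket thread would put the text it read from
the network on the queue instead.

```python
import queue
import re
import threading

# Pattern from the original message, written as a raw string.
PAT = re.compile(r'\s+(\w+)\s+\(.+?\):\s+(\d+)\s+(\d+)')

_STOP = object()  # sentinel telling the regex thread to exit


def regex_worker(replies, results):
    """Sole consumer: the only thread that ever calls PAT.match()."""
    while True:
        item = replies.get()
        if item is _STOP:
            break
        host, reply = item
        for line in reply.split('\r\n'):
            m = PAT.match(line)
            if m:
                results.put((host, m.group(1),
                             int(m.group(2)), int(m.group(3))))


def socket_worker(host, replies):
    """Hypothetical stand-in for a socket thread.

    The canned string replaces the real Sock.send()/ReadUntil() step.
    """
    fake_reply = '  eth0  (up):  10  20\r\n> '
    replies.put((host, fake_reply))


def run(hosts):
    replies = queue.Queue()
    results = queue.Queue()

    matcher = threading.Thread(target=regex_worker,
                               args=(replies, results))
    matcher.start()

    workers = [threading.Thread(target=socket_worker, args=(h, replies))
               for h in hosts]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    # All socket threads are done, so nothing else will be queued.
    replies.put(_STOP)
    matcher.join()

    parsed = []
    while not results.empty():
        parsed.append(results.get())
    return sorted(parsed)
```

The socket threads only do I/O and a put(); every regex call happens
in one place, so if the trouble really is contention inside re, it
should go away with this layout.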
----original message-------------
> I'm totally baffled right now, wrt threads. I have built a test
> program using sockets to pull down just a few lines of text,
> parse it and junk the result.
>
> For practical purposes I have spawned 15 worker threads who
> each will connect to a different host (480 Hosts total). They
> would then execute code similar to this:
>
> ---------------------------------------
> Sock.send('some-command')
> Reply = ReadUntil('> ', Sock, Timeout)
>
> if Reply == '':
>     return(-1)
> else:
>     tmp = re.split('\r\n', Reply)
>     for line in tmp:
>         m = pat.match(line)
>         .etc.
>
> Upon running the script you see nice and speedy results at first
> with very little cpu usage. However, the cpu usage grows
> continuously until the program ends. I have noticed that when the
> bloating occurs a lot of the cpu time is spent in the kernel,
> which would lead me to believe this is some sort of locking issue
> hidden within the 're' module? When I put a 'continue' in front
> of the pat.match() everything works just fine.
>
> With re:
> CPU states: 12.5% idle, 18.8% user, 46.4% kernel
> 26.54% test.py
>
> Without:
> CPU states: 19.1% idle, 35.8% user, 10.6% kernel,
> 0.76% test.py
>
> The pattern I'm matching is as follows:
> pat = re.compile('\s+(\w+)\s+\(.+?\):\s+(\d+)\s+(\d+)')
>
> Are there any known issues with locking and 're'?
>
> The system is an UE5000 with 8 cpus, Solaris 2.6, Python 1.5.2
>
> If anyone has any ideas about this I desperately need some help.
> I hate to have to dump this entire project for performance
> reasons. I've stripped out EVERYTHING in the program just to
> reproduce this and rule out my code.
>
> Regards,
> Cliff
- Gordon