[Chicago] threading is slow
Daniel Griffin
dgriff1 at gmail.com
Thu Mar 7 00:35:59 CET 2013
What sort of speed are you looking for here? Does the ordering matter? If
not, you can just create a multiprocessing Pool and call map on chunks
of the million int pairs.
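
A minimal sketch of that approach, in Python 3. It assumes one
whitespace-separated pair per line; process_pair is a hypothetical
stand-in for the real per-pair work, and the worker count and chunk size
are arbitrary placeholders, not tuned values.

```python
import multiprocessing
import sys


def parse_pair(line):
    """Turn a line such as '3 4' into the tuple (3, 4)."""
    a, b = line.split()
    return int(a), int(b)


def process_pair(pair):
    """Hypothetical stand-in for the real per-pair work."""
    a, b = pair
    return a * b


def process_file(path, workers=4, chunksize=10000):
    """Process every pair in the file with a pool of worker processes."""
    with open(path) as f:
        pairs = [parse_pair(line) for line in f if line.strip()]
    with multiprocessing.Pool(workers) as pool:
        # map() returns results in input order; chunksize sets how many
        # pairs are handed to a worker in one batch.
        return pool.map(process_pair, pairs, chunksize=chunksize)


if __name__ == "__main__":
    # Usage: python pairs.py <file-with-int-pairs>
    if len(sys.argv) > 1:
        for result in process_file(sys.argv[1]):
            print(result)
```

If the ordering does not matter, pool.imap_unordered(process_pair, pairs,
chunksize) yields each result as soon as a worker finishes it, so printing
can start before the whole file has been processed.
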
On Wed, Mar 6, 2013 at 4:31 PM, Oren Livne <livne at uchicago.edu> wrote:
> Thanks so much for all your answers!
>
> I have a text file with a million int pairs, each of which can be
> processed immediately. I would like to set up a queue to read lines from
> the file and feed a thread pool that will process it in parallel and output
> into (say) another queue, to be processed by another thread that prints the
> results.
>
>
> On 3/6/2013 5:19 PM, Brantley Harris wrote:
>
> Whoa, back up. What are you trying to do with threads?
>
>
> On Wed, Mar 6, 2013 at 5:05 PM, Daniel Griffin <dgriff1 at gmail.com> wrote:
>
>> Python has a GIL, so threads mostly do not help with CPU-bound work. Use
>> multiprocessing, twisted or celery.
>>
>>
>> On Wed, Mar 6, 2013 at 3:29 PM, Oren Livne <livne at uchicago.edu> wrote:
>>
>>> Dear All,
>>>
>>> I am new to Python multithreading. It seems that using threading causes
>>> a slowdown with more threads rather than a speedup. Should I be using the
>>> multiprocessing module instead? Any good examples for threads reading from
>>> a queue with multiprocessing?
>>>
>>> Thanks so much,
>>> Oren
>>>
>>> #!/usr/bin/env python
>>> '''Sum up the first 100000000 numbers. Time the speed-up of using
>>> multithreading.'''
>>> import threading, time, numpy as np
>>>
>>> class SumThread(threading.Thread):
>>>     def __init__(self, a, b):
>>>         threading.Thread.__init__(self)
>>>         self.a = a
>>>         self.b = b
>>>         self.s = 0
>>>
>>>     def run(self):
>>>         self.s = sum(i for i in xrange(self.a, self.b))
>>>
>>> def main(num_threads):
>>>     start = time.time()
>>>     a = map(int, np.linspace(0, 100000000, num_threads + 1, True))
>>>     # Spawn one thread per sub-range [a[i], a[i + 1])
>>>     threads = []
>>>     for i in xrange(num_threads):
>>>         t = SumThread(a[i], a[i + 1])
>>>         t.setDaemon(True)
>>>         t.start()
>>>         threads.append(t)
>>>
>>>     # Wait for all threads to complete
>>>     for t in threads:
>>>         t.join()
>>>
>>>     # Fetch results
>>>     s = sum(t.s for t in threads)
>>>     print '#threads = %d, result = %10d, elapsed Time: %s' % (
>>>         num_threads, s, time.time() - start)
>>>
>>> for n in 2 ** np.arange(4):
>>>     main(n)
>>>
>>> Output:
>>> #threads = 1, result = 4999999950000000, elapsed Time: 12.3320000172
>>> #threads = 2, result = 4999999950000000, elapsed Time: 16.5600001812 ???
>>> #threads = 4, result = 4999999950000000, elapsed Time: 16.7489998341 ???
>>> #threads = 8, result = 4999999950000000, elapsed Time: 16.6720001698 ???
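
For comparison, a sketch of roughly the same benchmark with worker
processes instead of threads. In CPython only one thread can execute Python
bytecode at a time because of the GIL, so splitting a pure-Python sum across
threads cannot run the pieces in parallel. This is Python 3, not the
original Python 2; sum_range and split_range are names made up for this
example, and no timings are claimed for it.

```python
import multiprocessing
import sys
import time


def sum_range(bounds):
    """Sum the integers in the half-open range [a, b)."""
    a, b = bounds
    return sum(range(a, b))


def split_range(n, parts):
    """Split [0, n) into `parts` contiguous half-open sub-ranges."""
    edges = [n * i // parts for i in range(parts + 1)]
    return list(zip(edges[:-1], edges[1:]))


def main(num_procs, n):
    start = time.time()
    with multiprocessing.Pool(num_procs) as pool:
        # Each worker process has its own interpreter and its own GIL.
        total = sum(pool.map(sum_range, split_range(n, num_procs)))
    print('#procs = %d, result = %10d, elapsed time: %s'
          % (num_procs, total, time.time() - start))


if __name__ == "__main__":
    # Usage: python sum_procs.py 100000000
    if len(sys.argv) > 1:
        for num_procs in (1, 2, 4, 8):
            main(num_procs, int(sys.argv[1]))
```

Whether this beats the single-process time depends on how many cores are
available; each run also pays the cost of starting the pool.
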
>>>
>>> _______________________________________________
>>> Chicago mailing list
>>> Chicago at python.org
>>> http://mail.python.org/mailman/listinfo/chicago
>>>
>>
>>
>
>
>
> --
> A person is just about as big as the things that make him angry.
>
>