problem in implementing multiprocessing

Carl Banks pavlovevidence at gmail.com
Mon Jan 19 04:09:30 EST 2009


On Jan 18, 10:00 pm, "James Mills" <prolo... at shortcircuit.net.au>
wrote:
> On Mon, Jan 19, 2009 at 3:50 PM, gopal mishra <gop... at infotechsw.com> wrote:
> > i know this is not an io - bound problem, i am creating heavy objects in the
> > process and add these objects in to queue and get that object in my main
> > program using queue.
> > you can test the this sample code
> > import time
> > from multiprocessing import Process, Queue
>
> > class Data(object):
> >    def __init__(self):
> >        self.y = range(1, 1000000)
>
> > def getdata(queue):
> >    data = Data()
> >    queue.put(data)
>
> > if __name__=='__main__':
> >    t1 = time.time()
> >    d1 = Data()
> >    d2 = Data()
> >    t2 = time.time()
> >    print "without multiProcessing total time:", t2-t1
> >    #multiProcessing
> >    queue = Queue()
> >    Process(target= getdata, args=(queue, )).start()
> >    Process(target= getdata, args=(queue, )).start()
> >    s1 = queue.get()
> >    s2 = queue.get()
> >    t2 = time.time()
> >    print "multiProcessing total time::", t2-t1
>
> The reason your code above doesn't work as you
> expect and the multiprocessing part takes longer
> is because your Data objects are creating a list
> (a rather large list) of ints.

I'm pretty sure gopal is creating a deliberately large object to use
as a test case, so switching to xrange isn't going to help here.

Since multiprocessing serializes and deserializes the data while
passing it from process to process, passing very large objects incurs
very high latency and overhead.  IOW, gopal's diagnosis is correct.
It's just not practical to share very large objects among separate
processes.
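
To get a rough feel for that cost, here is a minimal sketch that times
a pickle round trip on a list about the size of the one in gopal's
Data class.  pickle is what multiprocessing uses to move objects
through a Queue; the helper names make_payload and pickle_roundtrip
are made up for this illustration, and the syntax is Python 3 rather
than the Python 2 of the quoted code.

```python
import pickle
import time


def make_payload(n=1000000):
    # Hypothetical helper: a plain list of ints, similar in size to
    # Data.y in the quoted sample.
    return list(range(1, n))


def pickle_roundtrip(obj):
    """Serialize and deserialize obj; return (size in bytes, seconds, copy)."""
    start = time.time()
    payload = pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)
    copy = pickle.loads(payload)
    elapsed = time.time() - start
    return len(payload), elapsed, copy


if __name__ == '__main__':
    size, elapsed, _ = pickle_roundtrip(make_payload())
    print("pickled size: %d bytes, round trip: %.3f s" % (size, elapsed))
```

This only measures the serialization itself; a real Queue also has to
push the bytes through a pipe, so the actual cost is likely higher.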

For simple data like large arrays of floating point numbers, the data
can be shared with an mmapped file or some other memory-sharing scheme,
but actual Python objects can't be shared this way.  If you have
complex data (networks and hierarchies and such) it's a lot harder to
share this information among processes.
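
As one example of such a memory-sharing scheme for flat numeric data,
here is a minimal sketch using multiprocessing.Array.  This is my
choice for illustration, not the only option (an mmapped file would be
another); the function name fill and the values used are hypothetical,
and the syntax is Python 3.

```python
from multiprocessing import Process, Array


def fill(shared, value):
    # Meant to run in the child process: each write goes directly into
    # the shared buffer, so the element data is not serialized and sent
    # back to the parent.
    for i in range(len(shared)):
        shared[i] = value


if __name__ == '__main__':
    n = 100000
    # 'd' is the typecode for C doubles; the buffer lives in shared memory.
    shared = Array('d', n)
    p = Process(target=fill, args=(shared, 1.5))
    p.start()
    p.join()
    # The parent sees the values the child wrote.
    print(sum(shared[:]))
```

Note that this only works because the data is a fixed-size block of C
doubles.  It does not help with arbitrary Python objects, which still
have to be pickled to cross a process boundary.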


Carl Banks


