out of memory with processing module

alessiogiovanni.baroni at gmail.com
Tue Apr 21 04:09:00 EDT 2009


On 20 Apr, 18:32, Brian <knair... at yahoo.com> wrote:
> On Apr 20, 9:18 am, alessiogiovanni.bar... at gmail.com wrote:
>
> > On 20 Apr, 17:03, Brian <knair... at yahoo.com> wrote:
>
> > > I'm using the third-party "processing" module in Python 2.5 (it
> > > became the "multiprocessing" module in Python 2.6) to speed up
> > > the execution of a computation that takes over a week to run. The
> > > code may not be relevant, but here it is:
>
> > >     q1, q2 = processing.Queue(), processing.Queue()
> > >     p1 = processing.Process(target=_findMaxMatch,
> > >                             args=(reciprocal, user,
> > >                                   clusters[1:(numClusters - 1)/2],
> > >                                   questions, copy.copy(maxMatch), q1))
> > >     p2 = processing.Process(target=_findMaxMatch,
> > >                             args=(reciprocal, user,
> > >                                   clusters[(numClusters - 1)/2:],
> > >                                   questions, copy.copy(maxMatch), q2))
> > >     p1.start()
> > >     p2.start()
> > >     maxMatch1 = q1.get()[0]
> > >     maxMatch2 = q2.get()[0]
> > >     p1.join()
> > >     p2.join()
> > >     if maxMatch1[1] > maxMatch2[1]:
> > >         maxMatch = maxMatch1
> > >     else:
> > >         maxMatch = maxMatch2
>
> > > This code just splits the search for the cluster that best
> > > matches 'user' into two loops, each in its own process, instead
> > > of a single loop. (What a cluster is isn't important here.)
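
A side note on that snippet: the queues are drained with get() before
join(), and that ordering matters. A child that has put() a large
result blocks while flushing it to the parent, so calling join() first
can deadlock. Here is a minimal sketch of the same pattern under the
stdlib naming that 2.6 adopted, with made-up data and a made-up worker
in place of _findMaxMatch:

    import multiprocessing  # the 2.6 name for "processing"

    def worker(chunk, out_q):
        # stand-in for _findMaxMatch: report the best value in a chunk
        out_q.put(max(chunk))

    if __name__ == '__main__':
        data = range(100)  # stand-in data
        q1, q2 = multiprocessing.Queue(), multiprocessing.Queue()
        p1 = multiprocessing.Process(target=worker, args=(data[:50], q1))
        p2 = multiprocessing.Process(target=worker, args=(data[50:], q2))
        p1.start()
        p2.start()
        best1, best2 = q1.get(), q2.get()  # drain before join()
        p1.join()
        p2.join()
        print max(best1, best2)
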
>
> > > The error I get is:
>
> > > [21661.903889] Out of memory: kill process 14888 (python) score 610654
> > > or a child
> > > [21661.903930] Killed process 14888 (python)
> > > Traceback (most recent call last):
> > > ...etc. etc. ...
>
> > > Running the program from tty1, rather than under GNOME, on my
> > > Ubuntu Hardy system let the execution get a little further.
>
> > > The error was surprising because with just 1 GB of memory and a
> > > single for loop I didn't run into it, but with 5 GB and two
> > > processes I do. I believe that in the 1 GB case there was just a
> > > lot of painfully slow swapping going on that allowed execution to
> > > continue; with 'processing', the program appears to throw its
> > > hands up immediately instead.
>
> > > Why does the program fail with 'processing' but not without it? Do you
> > > have any ideas for resolving the problem? Thanks for your help.
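
Part of the answer, I think, is how the workers get their data. On
Linux, each Process is created with fork(), so every child starts as a
copy-on-write image of the parent, and CPython's reference counting
writes to object headers, which gradually forces the shared pages to
be copied while the children run. The slices and copy.copy(maxMatch)
also build extra copies in the parent before the fork. Note too that
the "Out of memory: kill process" lines come from the kernel's OOM
killer, not from Python. To watch each process's peak memory, a small
helper like this sketch (assuming Linux, where ru_maxrss is reported
in kilobytes) can be called from the parent and from inside the
workers:

    import resource

    def report_peak(label):
        # peak resident set size of the calling process so far;
        # ru_maxrss is in kilobytes on Linux
        peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print '%s: peak RSS %.1f MB' % (label, peak_kb / 1024.0)
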
>
> > If your program crashes with more than one process, maybe you're
> > not handling the Queue objects properly? If you can, post the code
> > of _findMaxMatch.
>
> Thanks for your interest. Here's _findMaxMatch:
>
> def _findMaxMatch(reciprocal, user, clusters, sources, maxMatch, queue):
>     for clusternumminusone, cluster in enumerate(clusters):
>         clusterFirstData, clusterSecondData = cluster.getData(sources)
>         aMatch = gum.calculateMatchGivenData(
>             user.data, None, None, None, user2data=clusterSecondData)[2]
>         if reciprocal:
>             maxMatchB = gum.calculateMatchGivenData(
>                 clusterFirstData, None, None, None,
>                 user2data=user.secondUserData)[2]
>             aMatch = float(aMatch + maxMatchB) / 2
>         if aMatch > maxMatch[1]:
>             maxMatch = [clusternumminusone + 1, aMatch]
>     queue.put([maxMatch])
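
One thing to try, if the cluster objects are large: pass each child
only an index range and let it read a module-level list that it
inherits through fork(), instead of building sliced copies (plus
copy.copy(maxMatch)) in the parent. A hypothetical sketch, with
made-up data and a made-up score() in place of the gum calls:

    import multiprocessing

    clusters = []  # filled in by the parent before the workers start

    def score(cluster):
        # stand-in for the gum.calculateMatchGivenData() calls
        return -abs(cluster - 617)

    def find_best(lo, hi, out_q):
        # children see `clusters` via fork(); no slice copies are made
        best = (None, float('-inf'))
        for i in xrange(lo, hi):
            s = score(clusters[i])
            if s > best[1]:
                best = (i, s)
        out_q.put(best)

    if __name__ == '__main__':
        clusters.extend(range(10000))  # stand-in data
        mid = len(clusters) // 2
        q = multiprocessing.Queue()
        workers = [
            multiprocessing.Process(target=find_best, args=(0, mid, q)),
            multiprocessing.Process(target=find_best,
                                    args=(mid, len(clusters), q)),
        ]
        for p in workers:
            p.start()
        results = [q.get(), q.get()]  # drain before join()
        for p in workers:
            p.join()
        print max(results, key=lambda r: r[1])
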

Can you post the entire error message and the full traceback?


