[IPython-dev] Parallel computing segfault behavior
Patrick Fuller
patrickfuller at gmail.com
Tue Jan 28 20:01:30 EST 2014
I guess my question is more along the lines of: should the cluster continue
on to complete the queued jobs (as it would if the segfaults were instead
Python exceptions)?
On Tuesday, January 28, 2014, MinRK <benjaminrk at gmail.com> wrote:
> I get an EngineError when an engine dies running a task:
>
> http://nbviewer.ipython.org/gist/minrk/8679553
>
> I think this is the desired behavior.
>
>
> On Tue, Jan 28, 2014 at 2:18 PM, Patrick Fuller <patrickfuller at gmail.com> wrote:
>
>> Hi,
>>
>> Has there been any discussion around how ipython parallel handles
>> segfaulting?
>>
>> To make this question more specific, the following code will cause some
>> workers to crash. All results will become unreadable (or at least
>> un-iterable), and future runs require a restart of the cluster. Is this
>> behavior intended, or is it just something that hasn't been discussed?
>>
>> from IPython.parallel import Client
>> from random import random
>>
>> def segfaulty_function(random_number, chance=0.25):
>>     if random_number < chance:
>>         import ctypes
>>         i = ctypes.c_char('a')
>>         j = ctypes.pointer(i)
>>         c = 0
>>         while True:
>>             j[c] = 'a'
>>             c += 1
>>         return j
>>     else:
>>         return random_number
>>
>> view = Client(profile="something-parallel-here").load_balanced_view()
>> results = view.map(segfaulty_function, [random() for _ in range(100)])
>> for i, result in enumerate(results):
>>     print i, result
>>
>> Backstory: Recently I've been working with a large monte carlo library
>> that segfaults for, like, no reason at all. It's due to some weird
>> underlying random number issue and happens once every 5-10 thousand runs. I
>> currently have each worker spin out a child process to isolate the
>> occasional segfault, but this seems excessive. (I'm also trying to fix the
>> source of the segfaults, but debugging is a slow process.)
>>
>> Thanks,
>> Pat
>>
>> _______________________________________________
>> IPython-dev mailing list
>> IPython-dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/ipython-dev
>>
>>
>
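
The child-process isolation mentioned in the backstory above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code actually used in the thread: it assumes a POSIX system (it relies on the "fork" start method and on signal-based exit codes), it is written for current Python rather than the Python 2 of the original example, and the names run_isolated, _child, and crash_with_sigsegv are hypothetical. The idea is that the worker runs the risky call in a short-lived child, so a segfault kills only the child and the worker can report it as an ordinary value.

```python
import multiprocessing as mp
import os
import signal


def _child(conn, func, args):
    """Run func in the child process and send the outcome through the pipe."""
    try:
        conn.send(("ok", func(*args)))
    except Exception as exc:
        # Ordinary Python exceptions are reported, not treated as crashes.
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def run_isolated(func, *args):
    """Call func(*args) in a forked child; return a (status, payload) tuple.

    status is "ok" (payload is the result), "error" (payload is the repr of
    the exception), or "crashed" (payload is the child's exit code, which is
    negative when the child was killed by a signal).
    """
    ctx = mp.get_context("fork")  # assumption: POSIX only
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_child, args=(child_conn, func, args))
    proc.start()
    child_conn.close()  # parent keeps only the read end of the pipe
    try:
        status, payload = parent_conn.recv()
    except EOFError:
        # The child died before sending anything, e.g. from a segfault.
        status, payload = "crashed", None
    proc.join()
    if status == "crashed":
        payload = proc.exitcode
    return status, payload


def crash_with_sigsegv():
    """Hypothetical stand-in for a segfaulting library call."""
    os.kill(os.getpid(), signal.SIGSEGV)
```

With this sketch, run_isolated(crash_with_sigsegv) should come back as ("crashed", -signal.SIGSEGV) while the calling process keeps running, so the function submitted to the cluster can return that tuple instead of taking the engine down. The cost is one fork per task, which is the overhead the original message describes as seeming excessive.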