[IPython-dev] [parallel] issue when executing 'import pandas as pd' on engine

Francesco Montesano franz.bergesund at gmail.com
Mon Sep 16 12:03:12 EDT 2013


Dear all,

I'm having an issue with IPython.parallel (v1.1.0, under Python 2.7).

Some time ago I built a number of Python scripts to manipulate
catalogues.
I can have either thousands of small files or a few possibly huge files,
so I've written my scripts such that I can choose from the command line
whether to process the files serially or in parallel.

Typically my codes have the following structure:

    import os
    import numpy as np
    import pandas as pd

    def parse(...):  # argparse
        ....

    def to_do(fname, ...):  # function(s) that do what I need
        ....

    if __name__ == '__main__':
        args = parse(...)
        if not args.parallel:  # execute serially
            for fn in file_name_list:
                to_do(fn, ...)
        else:  # execute in parallel
            parallel_env = Lbv()  # custom class that initialises a load-balanced view
            imports = ['import numpy as np', 'import pandas as pd']
            # execute the above strings on all engines (direct view)
            parallel_env.exec_on_engine(imports)
            # execute in parallel
            runs = [parallel_env.apply(to_do, os.path.abspath(fn), ...)
                    for fn in file_name_list]



I built the whole parallel dispatching against IPython 0.13 and everything
worked fine.
But today I tried to run one of my scripts with the parallel option enabled
and got the following error:


      File "code.py", line XXX, in <module>
        parallel_env.exec_on_engine(imports)
      File "XXX/ipython_parallel.py", line 86, in exec_on_engine
        e.raise_exception()
      File "XXX.local/lib/python2.7/site-packages/IPython/parallel/error.py", line 199, in raise_exception
        raise RemoteError(en, ev, etb, ei)
    RemoteError: NameError(name 'plt' is not defined)


The only thing that uses matplotlib is pandas, and modifying

    imports = ['import numpy as np', 'import matplotlib.pyplot as plt',
               'import pandas as pd']


seems to solve the problem (although at least in one case the first call of
my code crashed with the error and the second went through).

If I run my code without IPython parallel, I don't have any problem
with 'plt'.

I guess that this is a bug. But I still haven't understood how to debug
what happens on the engines, so I can't give more details.
Any clues?
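
One way to narrow this down before touching the engines is to run the import strings one at a time in a fresh namespace and keep the full traceback of whichever one fails. The following is only a minimal sketch under that assumption: `run_statements` is a hypothetical helper, not part of IPython, and it runs locally, so it only mimics what executing the strings in a clean engine namespace would do. The statement `plt.figure()` is a stand-in that reproduces the same kind of NameError.

```python
import traceback


def run_statements(statements, namespace=None):
    """Execute each statement string in one shared namespace.

    Returns a list of (statement, error) pairs, where error is None on
    success or the formatted traceback string on failure.
    """
    if namespace is None:
        namespace = {}  # fresh namespace, like a newly started engine
    report = []
    for stmt in statements:
        try:
            exec(stmt, namespace)
            report.append((stmt, None))
        except Exception:
            # keep the full traceback instead of only the exception name
            report.append((stmt, traceback.format_exc()))
    return report


# 'plt.figure()' fails because nothing has defined 'plt' in the namespace
statements = ['import os', 'plt.figure()', 'import math']
for stmt, error in run_statements(statements):
    status = 'ok' if error is None else 'FAILED'
    print('%-20s %s' % (stmt, status))
    if error is not None:
        print(error)
```

Running the real import strings this way in a plain Python session, and then the same strings one by one on a single engine, should show whether the failure depends on the engine's namespace or on the import itself.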

If needed, I can upload my 'ipython_parallel.py' module to gist/GitHub.

Cheers,

Fra