[IPython-dev] Using Ipython cache as file proxy

Gökhan Sever gokhansever at gmail.com
Thu Oct 21 18:16:26 EDT 2010


On Thu, Oct 21, 2010 at 2:09 AM, Hans Meine
<meine at informatik.uni-hamburg.de> wrote:
> Hi!
>
> I might have a solution for you.  This is based on %run's "-i" parameter, which retains the environment, similar to execfile().
>
> What I do looks something like this:
>
>
> # content of foo.py:
> import numpy
>
> if 'data' not in globals():
>    data = numpy.load("...")
>    # more memory-/IO-intensive setup code
>
>
> # ---- plotting part, always executed ----
> import pylab
>
> pylab.clf()
> pylab.plot(data[0]...)
> pylab.show()
>
>
> Then, you can do %run -i foo.py from within ipython, and the upper part will only get executed once.  (You can "del data" or use %run without -i to run it again.)
>
> HTH,
>  Hans

Hello Hans,

This is indeed a super short-term solution for my analysis/plotting
case. I made a slight test modification to my code as you suggested:

if 'ccn02' not in globals():
    # Read data in as masked arrays
    ccn02 = NasaFile("./airborne/20090402_131020/09_04_02_13_10_20.dmtccnc.combined.raw")
    ccn02.mask_MVC()
    pcasp02 = NasaFile("./airborne/20090402_131020/09_04_02_13_10_20.conc.pcasp.raw")
    pcasp02.mask_MVC()
    # followed by 25 more similar data read-ins.

and it is actually a very clever trick. All I need to do is check
whether a single variable name is in the globals() dictionary, and do
the reading accordingly.
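
With 25 or so similar read-ins, the same guard could be generalized
so that each variable is loaded only on its first use. A minimal
sketch (load_once and read_masked are hypothetical helpers, not
something in my code yet; NasaFile and mask_MVC are from my snippet
above):

def read_masked(path):
    # Read a NASA-format data file and mask its missing-value codes,
    # exactly as in the explicit block above.
    f = NasaFile(path)
    f.mask_MVC()
    return f

def load_once(name, loader, ns=None):
    # Bind ns[name] = loader() only if 'name' is not defined yet, so
    # a repeated "%run -i" skips the read; "del ccn02" in IPython
    # forces a fresh read on the next run.
    ns = globals() if ns is None else ns
    if name not in ns:
        ns[name] = loader()
    return ns[name]

ccn02 = load_once('ccn02', lambda: read_masked(
    "./airborne/20090402_131020/09_04_02_13_10_20.dmtccnc.combined.raw"))
pcasp02 = load_once('pcasp02', lambda: read_masked(
    "./airborne/20090402_131020/09_04_02_13_10_20.conc.pcasp.raw"))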

However, this hasn't given me the speed-up I was initially expecting.
Doing a bit more investigation with %time:

I[151]: %time run airplot.py
CPU times: user 29.02 s, sys: 0.17 s, total: 29.19 s
Wall time: 30.44 s

while my simple manual timer tests show it takes about 35-40 seconds
for the last figure page of my plots to actually show up on the
screen (possibly right after I save the multi-page PDF file). When I
do a profile run --showing only the top results:

9900136 function calls (9652765 primitive calls) in 59.379 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       26    5.131    0.197   11.375    0.437 io.py:468(loadtxt)
      112    2.152    0.019    2.427    0.022 mathtext.py:569(__init__)
148944/148114    2.119    0.000    3.224    0.000 backend_pdf.py:128(pdfRepr)
   242859    1.845    0.000    3.637    0.000 io.py:574(split_line)
   244769    1.563    0.000    1.563    0.000 {zip}
799394/774449    1.381    0.000    1.469    0.000 {len}
   486984    1.371    0.000    1.371    0.000 {method 'split' of 'str' objects}
   638477    1.347    0.000    1.347    0.000 {isinstance}
   242080    1.259    0.000    1.758    0.000 __init__.py:656(__getitem__)
     4758    1.138    0.000    2.694    0.001 backend_pdf.py:1224(pathOperations)
    57101    1.063    0.000    1.623    0.000 path.py:190(iter_segments)

The profiled run shows almost a minute of CPU time, which I am
guessing is probably due to the profiler's overhead. Anyway, the code
still runs within my breath-holding time limit --that's my rough
approximation of an upper bound for execution time. I usually start
looking for ways to decrease execution time if I can't hold my breath
anymore after I hit run script.py in IPython. The funny thing is that
under this scheme I might have died a few times (with wall times
reaching over 20-25 minutes in some modelling work); the "Cython
magic" was there to help revive me back to life :)
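
If a run ever grows beyond breath-holding range again, the profile
points at loadtxt first (11.4 s cumulative over 26 calls). One idea
--just a sketch, assuming the raw files are plain text parsed with
numpy.loadtxt and the resulting arrays are not masked-- would be to
cache each parsed array as a binary .npy file next to its source and
load that on later runs:

import os
import numpy as np

def cached_loadtxt(path, **kwargs):
    # Parse 'path' with numpy.loadtxt on the first call and save the
    # result as '<path>.npy'; later calls np.load() the binary file,
    # which is much faster than re-parsing text. A masked array would
    # need its mask saved separately.
    cache = path + ".npy"
    if (os.path.exists(cache)
            and os.path.getmtime(cache) >= os.path.getmtime(path)):
        return np.load(cache)
    data = np.loadtxt(path, **kwargs)
    np.save(cache, data)
    return data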

-- 
Gökhan


