HI Brennan, On 28 March 2010 22:50, Brennan Williams wrote:

Andrea Gavana wrote:

HI All,

On 28 March 2010 19:22, Robert Kern wrote:

On Sun, Mar 28, 2010 at 03:26, Anne Archibald <peridot.faceted@gmail.com> wrote:

On 27 March 2010 20:24, Andrea Gavana <andrea.gavana@gmail.com> wrote:

Hi All,

I have an interpolation problem and I am having some difficulties in tackling it. I hope I can explain myself clearly enough.

Basically, I have a whole bunch of 3D fluid flow simulations (close to 1000), and they are a result of different combinations of parameters. I was planning to use the Radial Basis Functions in scipy, but for the moment let's assume, to simplify things, that I am dealing only with one parameter (x). In 1000 simulations, this parameter x has 1000 values, obviously. The problem is, the outcome of every single simulation is a vector of oil production over time (let's say 40 values per simulation, one per year), and I would like to be able to interpolate my x parameter (1000 values) against all the simulations (1000x40) and get an approximating function that, given another x parameter (of size 1x1) will give me back an interpolated production profile (of size 1x40).

If I understand your problem correctly, you have a function taking one value as input (or one 3D vector) and returning a vector of length 40. You want to know whether there are tools in scipy to support this.

I'll say first that it's not strictly necessary for there to be: you could always just build 40 different interpolators, one for each component of the output. After all, there's no interaction in the calculations between the output coordinates. This is of course awkward, in that you'd like to just call F(x) and get back a vector of length 40, but that can be remedied by writing a short wrapper function that simply calls all 40 interpolators.

A problem that may be more serious is that the python loop over the 40 interpolators can be slow, while a C implementation would give vector-valued results rapidly.

40 is not a bad number to loop over. The thing I would be worried about is the repeated calculation of the (1000, 1000) radial function evaluation. I think that a small modification of Rbf could be made to handle the vector-valued case. I leave that as an exercise to Andrea, of course. :-)

It seems like this whole interpolation stuff is not working as I thought. In particular, considering scalar-valued interpolation (i.e., looking at the final oil recovery only and not the time-based oil production profile), interpolation with RBFs is giving counter-intuitive and meaningless answers. The issues I am seeing are basically these:

# Interpolate the cumulative oil production rbf = Rbf(x1, x2, x3, x4, x5, x6, final_oil_recovery)

# Try to use existing combination of parameters to get back # the original result (more or less) possible_recovery = rbf(x1[10], x2[10], x3[10], x4[10], x5[10], x6[10])

1) If I attempt to use the resulting interpolation function ("rbf"), inputting a single value for each x1, x2, ..., x6 that is *already present* in the original x1, x2, ..., x6 vectors, I get meaningless results (i.e., negative oil productions, 1000% error, and so on) in all cases with some (rare) exceptions when using the "thin-plate" interpolation option; 2) Using a combination of parameters x1, x2, ..., x6 that is *not* in the original set, I get non-physical (i.e., unrealistic) results: it appears that RBFs think that if I increase the number of production wells I get less oil recovered (!), sometimes by a factor of 10.

I am pretty sure I am missing something in this whole interpolation thing (I am no expert at all), but honestly it is the first time I get such counter-intuitive results in my entire Python numerical life ;-)

Hi Andrea, I'm looking at doing this as well, i.e. using RBF to create what is effectively a "proxy simulator". I'm going to have a read through the scipy rbf documentation to get up to speed. I take it you are using an experimental design algorithm such as Plackett-Burman or similar?

No, it's simply a collection of all the simulations we have run in the past 3-4 years. We can't afford to use experimental design or latin hypercube algorithms in our simulations, as the field we are studying is so immense that simulation times are prohibitive for any kind of proxy model. What I was trying to do, is to collect all the information and data we gathered with so much effort in the past 4 years and then build some kind of approximating function that would give us a possible production profile without actually running the simulation itself. I should mention that, given any combination of these 6 parameters, any one of us reservoir engineers could draw a production profile for this field by hand (being blinded too, if you wish) without any effort, because the simulated response of this field is so linear any time you change any of the 6 parameters i mentioned in my other posts. This is why I am so surprised about the behaviour of RBFs. I am going to give Matlab (sigh...) a go tomorrow morning and see what griddatan thinks about this issue. I really don't want to go back to Matlab, it has been the worst of the nightmares for a GUI-builder person like me 6 years ago. I switched to Python/Numpy/Scipy/wxPython and I have never looked back. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ ==> Never *EVER* use RemovalGroup for your house removal. You'll regret it forever. http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <==