[Numpy-discussion] Interpolation question

Andrea Gavana andrea.gavana at gmail.com
Sun Mar 28 18:02:19 EDT 2010


HI Brennan,

On 28 March 2010 22:50, Brennan Williams wrote:
> Andrea Gavana wrote:
>> HI All,
>>
>> On 28 March 2010 19:22, Robert Kern wrote:
>>
>>> On Sun, Mar 28, 2010 at 03:26, Anne Archibald <peridot.faceted at gmail.com> wrote:
>>>
>>>> On 27 March 2010 20:24, Andrea Gavana <andrea.gavana at gmail.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>>    I have an interpolation problem and I am having some difficulties
>>>>> in tackling it. I hope I can explain myself clearly enough.
>>>>>
>>>>> Basically, I have a whole bunch of 3D fluid flow simulations (close to
>>>>> 1000), and they are a result of different combinations of parameters.
>>>>> I was planning to use the Radial Basis Functions in scipy, but for the
>>>>> moment let's assume, to simplify things, that I am dealing only with
>>>>> one parameter (x). In 1000 simulations, this parameter x has 1000
>>>>> values, obviously. The problem is, the outcome of every single
>>>>> simulation is a vector of oil production over time (let's say 40
>>>>> values per simulation, one per year), and I would like to be able to
>>>>> interpolate my x parameter (1000 values) against all the simulations
>>>>> (1000x40) and get an approximating function that, given another x
>>>>> parameter (of size 1x1) will give me back an interpolated production
>>>>> profile (of size 1x40).
>>>>>
>>>> If I understand your problem correctly, you have a function taking one
>>>> value as input (or one 3D vector) and returning a vector of length 40.
>>>> You want to know whether there are tools in scipy to support this.
>>>>
>>>> I'll say first that it's not strictly necessary for there to be: you
>>>> could always just build 40 different interpolators, one for each
>>>> component of the output. After all, there's no interaction in the
>>>> calculations between the output coordinates. This is of course
>>>> awkward, in that you'd like to just call F(x) and get back a vector of
>>>> length 40, but that can be remedied by writing a short wrapper
>>>> function that simply calls all 40 interpolators.
>>>>
>>>> A problem that may be more serious is that the python loop over the 40
>>>> interpolators can be slow, while a C implementation would give
>>>> vector-valued results rapidly.
>>>>
>>> 40 is not a bad number to loop over. The thing I would be worried
>>> about is the repeated calculation of the (1000, 1000) radial function
>>> evaluation. I think that a small modification of Rbf could be made to
>>> handle the vector-valued case. I leave that as an exercise to Andrea,
>>> of course. :-)
>>>
>>
>> It seems like this whole interpolation stuff is not working as I
>> thought. In particular, considering scalar-valued interpolation (i.e.,
>> looking at the final oil recovery only and not the time-based oil
>> production profile), interpolation with RBFs is giving
>> counter-intuitive and meaningless answers. The issues I am seeing are
>> basically these:
>>
>> # Interpolate the cumulative oil production
>> rbf = Rbf(x1, x2, x3, x4, x5, x6, final_oil_recovery)
>>
>> # Try to use existing combination of parameters to get back
>> # the original result (more or less)
>> possible_recovery = rbf(x1[10], x2[10], x3[10], x4[10], x5[10], x6[10])
>>
>> 1) If I attempt to use the resulting interpolation function ("rbf"),
>> inputting a single value for each x1, x2, ..., x6 that is *already
>> present* in the original x1, x2, ..., x6 vectors, I get meaningless
>> results (i.e., negative oil productions, 1000% error, and so on) in
>> all cases with some (rare) exceptions when using the "thin-plate"
>> interpolation option;
>> 2) Using a combination of parameters x1, x2, ..., x6 that is *not* in
>> the original set, I get non-physical (i.e., unrealistic) results: it
>> appears that RBFs think that if I increase the number of production
>> wells I get less oil recovered (!), sometimes by a factor of 10.
>>
>> I am pretty sure I am missing something in this whole interpolation
>> thing (I am no expert at all), but honestly it is the first time I get
>> such counter-intuitive results in my entire Python numerical life ;-)
>>
>>
> Hi Andrea, I'm looking at doing this as well, i.e. using RBF to create
> what is effectively a "proxy simulator".
> I'm going to have a read through the scipy rbf documentation to get up
> to speed. I take it you are using
> an experimental design algorithm such as Plackett-Burman or similar?

No, it's simply a collection of all the simulations we have run in the
past 3-4 years. We can't afford to use experimental design or latin
hypercube algorithms in our simulations, as the field we are studying
is so immense that simulation times are prohibitive for any kind of
proxy model.

What I was trying to do, is to collect all the information and data we
gathered with so much effort in the past 4 years and then build some
kind of approximating function that would give us a possible
production profile without actually running the simulation itself. I
should mention that, given any combination of these 6 parameters, any
one of us reservoir engineers could draw a production profile for this
field by hand (being blinded too, if you wish) without any effort,
because the simulated response of this field is so linear any time you
change any of the 6 parameters i mentioned in my other posts. This is
why I am so surprised about the behaviour of RBFs.

I am going to give Matlab (sigh...) a go tomorrow morning and see what
griddatan thinks about this issue. I really don't want to go back to
Matlab, it has been the worst of the nightmares for a GUI-builder
person like me 6 years ago. I switched to Python/Numpy/Scipy/wxPython
and I have never looked back.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.alice.it/infinity77/

==> Never *EVER* use RemovalGroup for your house removal. You'll
regret it forever.
http://thedoomedcity.blogspot.com/2010/03/removal-group-nightmare.html <==



More information about the NumPy-Discussion mailing list