Hi Friedrich & All, On 30 March 2010 21:48, Friedrich Romstedt wrote:
2010/3/30 Andrea Gavana <andrea.gavana@gmail.com>:
On 29 March 2010 23:44, Friedrich Romstedt wrote:
When you get nice results using 40 Rbfs, one for each time instant, this procedure means that the values for one time instant are not influenced by adjacent-year data. I.e., you would probably get the same result using a single Rbf with a norm that blows the time coordinate far out of proportion. To make it clear in code: when time is your first coordinate, and you have three other coordinates, the *norm* would be:
import numpy

def norm(x1, x2):
    # Rbf passes the coordinates along axis 0; weight the time
    # coordinate (the first one) by 1e3 so that distances are
    # dominated by the time separation.
    weights = numpy.array([1e3, 1, 1]).reshape(-1, 1, 1)
    return numpy.sqrt((((x1 - x2) * weights) ** 2).sum(axis=0))
In this case, the epsilon should be fixed explicitly, to avoid the influence of the changed distances on the automatic epsilon determination inside of Rbf, which would spoil the whole thing.
Of course, there are two and not three "other variables" here.
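Just to make the usage concrete, here is a minimal sketch of how the custom norm and a fixed epsilon would be handed to Rbf (the arrays t, x, y, v and the epsilon value are hypothetical placeholders, not from Andrea's setup):

from scipy.interpolate import Rbf

# t, x, y: coordinate arrays (time first); v: simulated values (placeholders)
rbf = Rbf(t, x, y, v, norm=norm, epsilon=1.0)
# evaluate at new points:
# interpolated = rbf(t_new, x_new, y_new)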
I have an idea how to tune your model: take, say, half or three thirds of your simulation data as the interpolation database, and try to reproduce the remaining part. I have some ideas about how to use this for tuning in practice.
Here, of course, it's three quarters and not three thirds :-)
This is a very good idea indeed: I am actually running out of test cases (it takes a while to run a simulation, and I need to run one every time I try a new combination of parameters to check whether the interpolation is good enough or rubbish). I'll give it a go tomorrow at work and report back (even if I get very bad results :-D ).
I refined the idea a bit. Select one simulation, and use the complete rest as the interpolation base. Then repeat this for each simulation (leave-one-out, essentially). Calculate some joint value over all the results; the simplest would maybe be to calculate:
import numpy

def joint_ln_density(simulation_results, interpolation_results):
    # Sum of log-Gaussians centred at the simulation values, with
    # stddev equal to the simulation value itself (constant factors
    # are dropped, as they do not change the optimum).
    return -(((interpolation_results - simulation_results) ** 2)
             / simulation_results ** 2).sum()
In fact, this calculates the logarithm of the Gaussians centered at *simulation_results* and evaluated at the "observations" *interpolation_results*; i.e., it is the logarithm of the product of these Gaussians. The standard deviation of each Gaussian is assumed to be the value of the *simulation_results*, which means I assume that low-valued outcomes are much more precise in absolute numbers than high-valued outcomes, but /relative/ to their nominal value they are all equally precise. (NB: A scaling of the stddevs wouldn't make a significant difference /for you/. The same goes for the neglected coefficients of the Gaussians.)
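As a quick sanity check (the numbers here are made up), the score is 0 for a perfect reproduction and gets more negative the worse the interpolation is:

import numpy

simulated = numpy.array([100.0, 50.0, 10.0])     # held-out simulation (made up)
interpolated = numpy.array([102.0, 49.0, 10.5])  # interpolated reproduction
print(joint_ln_density(simulated, interpolated)) # closer to 0 is better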
I don't know which method you like most. Robert's and Kevin's proposals are hard to compete with ...
You could optimise (maximise) the joint_ln_density outcome as a function of *epsilon* and the different scalings. afaic, scipy comes with some optimisation algorithms included. I checked it: http://docs.scipy.org/doc/scipy-0.7.x/reference/optimize.html#general-purpos...
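For instance, here is a rough sketch of what that optimisation could look like, leaving out single data points instead of whole simulations to keep it short (coords, values, the starting guesses, and the function name are hypothetical; fmin is scipy's Nelder-Mead simplex):

import numpy
from scipy.interpolate import Rbf
from scipy.optimize import fmin

def neg_joint_ln_density(params, coords, values):
    # params = (time_scale, epsilon); coords is (n_points, 3) with time
    # in the first column; values is (n_points,).
    time_scale, epsilon = params
    scaled = coords * numpy.array([time_scale, 1.0, 1.0])
    total = 0.0
    for i in range(len(values)):
        # leave point i out and try to reproduce it from the rest
        mask = numpy.arange(len(values)) != i
        args = list(scaled[mask].T) + [values[mask]]
        rbf = Rbf(*args, epsilon=epsilon)
        prediction = rbf(*scaled[i])
        total += -(prediction - values[i]) ** 2 / values[i] ** 2
    return -total  # fmin minimises, so negate the joint log-density

# best = fmin(neg_joint_ln_density, [1e3, 1.0], args=(coords, values))

Note that scaling the time column directly, as above, is equivalent to using the custom norm from before; it just moves the weighting into the data.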
I had planned to show some of the results based on the suggestion you gave me yesterday: I took two thirds ( :-D ) of the simulations database to use as the interpolation base and tried to reproduce the rest using the interpolation. Unfortunately it seems like my computer at work has blown up (maybe a BSOD, I was doing waaaay too many heavy things at once) and I can't access it from home at the moment. I can't show the real field profiles, but at least I can show you how good or bad the interpolation performs (in terms of relative errors), and I was planning to post a matplotlib composite graph to do just that. I am still hopeful my PC will resurrect at some point :-D

However, from the first 100 or so interpolated simulations, I could gather these findings:

1) Interpolations on *cumulative* productions of oil and gas are extremely good, with a maximum range of relative error of -3% / +2%: most of them (95% more or less) show errors < 1%, but for a few of them I get the aforementioned range of errors in the interpolations;

2) For interpolations on oil and gas *rates* (time dependent), I usually get a -5% / +3% error per timestep, which is very good for my purposes. I still need to check these values, but the first set of results was very promising;

3) Interpolations on gas injection (cumulative and rate) are a bit more shaky (15% error more or less), but this is essentially due to a particularly complex behaviour of the reservoir simulator when it needs to decide (based on user input) whether the gas is going to be re-injected, sold, used as excess gas, and a few other options; I am not that worried about this issue for the moment.

I'll be off for a week for Easter (I leave for Greece in a few hours), but I am really looking forward to trying the suggestion Kevin posted and to investigating Robert's idea: I know about kriging, but I initially thought it wouldn't do a good job in this case. I'll reconsider for sure. And I'll post a screenshot of the results as soon as my PC gets out of the Emergency Room.

Thank you again!

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.alice.it/infinity77/