[CentralOH] OT: Statistician needed

Mon May 16 17:35:14 EDT 2016

I'm not making any connections here...

(a) Starting with 50 representative locations instead of all of them.

(b) Wanting to estimate parameters as if you had used them all.

(c) For example, compare the error estimate of one of them by itself
with another one of them by itself.

On Mon, 16 May 2016 14:59:44 -0400
Eric Floehr <eric at intellovations.com> wrote:
> Hey all,
> 
> I'm in need of some help with statistics, and if anyone has any thoughts on
> this, or knows someone who could do this, I would appreciate it greatly.
> 
> I have a set of errors, normally distributed around 0 error (it's
> temperature forecast error). You can assume that the sample of forecasts is
> representative of the entire population (for example, taking 50 strategic
> locations around the U.S. to represent all U.S. locations).
> 
> I then calculate the mean absolute error, and the RMSE. These have some
> positive value.
> 
> What I would like to calculate for the MAE and RMSE is a confidence interval
> that the population error falls within, given the sample MAE or RMSE and its
> related statistics (sample size, mean error, MAE, RMSE, standard deviation,
> etc.).
> 
> For example, let's say that one provider's RMSE is 3.18 (A) and another's
> is 3.5 (B). I'd like to know with some confidence that there is (or isn't)
> a difference between providers (i.e. that provider A confidently has lower
> error than B).
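
One way to approach the A-versus-B question is a paired bootstrap on the
difference in RMSE. The sketch below is only an illustration, not a vetted
method for this data. It assumes both providers forecast the same
location/date pairs, so errors_a[i] and errors_b[i] line up. The function
names and the synthetic data are made up for the example. If the resulting
interval excludes 0, that suggests a real difference at the chosen level.

```python
import math
import random


def rmse(errors):
    """Root mean squared error of a sequence of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def bootstrap_rmse_diff_ci(errors_a, errors_b, n_boot=2000, alpha=0.01, seed=0):
    """Percentile bootstrap interval for rmse(B) - rmse(A).

    Assumption: errors_a[i] and errors_b[i] belong to the same forecast,
    so the same resampled indices are used for both providers.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("paired samples must have equal length")
    rng = random.Random(seed)
    n = len(errors_a)
    diffs = []
    for _ in range(n_boot):
        # Resample forecast indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        a = [errors_a[i] for i in idx]
        b = [errors_b[i] for i in idx]
        diffs.append(rmse(b) - rmse(a))
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Synthetic stand-in data, NOT real forecast errors.
data_rng = random.Random(42)
errors_a = [data_rng.gauss(0, 3.18) for _ in range(200)]
errors_b = [data_rng.gauss(0, 3.5) for _ in range(200)]
lo, hi = bootstrap_rmse_diff_ci(errors_a, errors_b)
print(f"99% interval for RMSE(B) - RMSE(A): [{lo:.2f}, {hi:.2f}]")
```

Resampling whole forecasts rather than each provider separately keeps any
shared day-to-day difficulty in both samples, which usually tightens the
interval on the difference.
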
> 
> Currently, the way I'm doing it is using the inverse normal function
> (NORMINV) in Excel:
> 
> Lower bound: NORMINV(0.005,RMSE,STDDEV_RMSE/SQRT(NUMBER_OF_SAMPLES))
> 
> Upper bound: NORMINV(0.995,RMSE,STDDEV_RMSE/SQRT(NUMBER_OF_SAMPLES))
> 
> as in section 9.18 of:
> http://www.saylor.org/site/wp-content/uploads/2012/10/BUS204-Ling-6.2.pdf
> 
> But I'm not at all convinced that I'm doing that right, or that it applies
> in this situation.
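
For anyone who wants to check the Excel numbers outside a spreadsheet, the
two NORMINV calls above can be reproduced in Python with the standard
library. This is only a translation of the formula as written, not a claim
that it is the right interval for RMSE. The function name and input values
below are made up for illustration. The 0.005 and 0.995 tails correspond to
a 99% interval.

```python
from statistics import NormalDist


def norminv_interval(center, stddev, n, confidence=0.99):
    """Normal-approximation interval: center +/- z * stddev / sqrt(n).

    Mirrors NORMINV(p, center, stddev / SQRT(n)) for the lower and
    upper tail probabilities.
    """
    se = stddev / n ** 0.5
    dist = NormalDist(mu=center, sigma=se)
    tail = (1 - confidence) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


# Hypothetical inputs: RMSE of 3.18, spread of 2.0, 50 samples.
low, high = norminv_interval(center=3.18, stddev=2.0, n=50)
print(f"99% interval: [{low:.3f}, {high:.3f}]")
```

Note that this formula is the usual interval for a sample mean. RMSE is the
square root of a mean, so the approximation applies more directly to the
mean squared error than to the RMSE itself.
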
> 
> Thanks so much!
> Eric