[SciPy-User] Minimum points for descriptive statistics?

Mark Livingstone livingstonemark at gmail.com
Sat Sep 10 23:40:01 EDT 2011


Hi Guys,

I am slowly bringing the SalStat statistics program
(http://sourceforge.net/projects/salstat/) up to date. It uses NumPy to hold
its data and to do some of the statistical calculations.

I have two questions on which I would like to solicit statistical points of
view.

In the GUI, I have a wxPython grid where, as you would expect, you put a
series into each column, and statistics can then be calculated on it.

(a) What is the minimum number of data points you feel should be present
before computing the standard five-number summary (minimum, Q1, median, Q3,
maximum)? Technically, with two points you could interpolate the median and
then Q1 and Q3, but this seems doubtful to me. Would three points be a more
solid minimum? Maybe we need an "Are you sure?" message box! ;-)
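
To make the two-point case in (a) concrete, here is a rough sketch of what
linear interpolation gives, together with one possible guard. It assumes
NumPy's np.percentile with its default linear interpolation; the function
name five_number_summary and the MIN_POINTS threshold of 3 are hypothetical,
made up for illustration, and the threshold is exactly the judgment call I
am asking about.

```python
import numpy as np

# Hypothetical threshold: the "right" value is the open question in (a).
MIN_POINTS = 3


def five_number_summary(values, min_points=MIN_POINTS):
    """Return (min, Q1, median, Q3, max) using linear interpolation.

    Raises ValueError when there are fewer than min_points values, which
    the GUI could turn into an "Are you sure?" message box.
    """
    data = np.asarray(values, dtype=float)
    if data.size < min_points:
        raise ValueError(
            "need at least %d points, got %d" % (min_points, data.size)
        )
    return np.percentile(data, [0, 25, 50, 75, 100])


# With only two points, every quantile is just a point on the straight
# line between them, so the "quartiles" carry no extra information.
two_point_quartiles = np.percentile([1.0, 3.0], [0, 25, 50, 75, 100])
# -> 1.0, 1.5, 2.0, 2.5, 3.0

summary = five_number_summary([1.0, 2.0, 3.0, 4.0, 5.0])
```

Other quantile definitions exist, so a different interpolation rule would
give different Q1 and Q3 values for small samples.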

(b) Is there any standard way to deal with missing values (empty cells) in
the data? Since you can tick boxes to have a number of descriptive and other
tests performed on a column, or between columns of data, it seems to me that
different tests will need different ways of dealing with missing data. It is
not like you can just stick in some default value!
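
As an illustration of why (b) depends on the test, here is a minimal sketch
of two options I can think of. It assumes empty grid cells arrive as empty
strings and are stored as NaN; the helper names column_values, drop_missing
and drop_missing_pairs are hypothetical and not part of SalStat or NumPy.
Single-column statistics drop the missing cells in that column, while
between-column tests drop any row where either column is empty.

```python
import numpy as np


def column_values(cells):
    """Convert grid cell strings to floats, using NaN for empty cells."""
    return np.array([float(c) if c.strip() else np.nan for c in cells])


def drop_missing(x):
    """Keep only the non-missing values of one column."""
    return x[~np.isnan(x)]


def drop_missing_pairs(x, y):
    """Keep only the rows where both columns have a value."""
    keep = ~(np.isnan(x) | np.isnan(y))
    return x[keep], y[keep]


a = column_values(["1.0", "", "3.0", "4.0"])
b = column_values(["2.0", "5.0", "", "8.0"])

# Descriptive statistic on one column: uses the 3 values present in a.
mean_a = drop_missing(a).mean()

# Paired test between columns: only rows 0 and 3 are complete.
paired_a, paired_b = drop_missing_pairs(a, b)
```

The two approaches leave different sample sizes for the same column, which
is the sort of inconsistency I am unsure how to handle.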

Thanks in advance for any help you can suggest :-D

Regards,

MarkL