<div dir="ltr">Here's more regarding student notes = concept outlines that RUN.<br><br>In dealing with sample vs. population variance, in both cases we are talking about an average, but our traditional notation tends to obscure that fact. It's easy to see that population variance is the mean of the squared deviations, but the sample formula tends to blur that concept. At the high school level, the result has been to appreciate the messy formula and then thankfully turn to some kind of magic black box that performs the calculations for you.<br>
<br>Well, I think that's dumb. Instead of using software packages or other kinds of black boxes to magically generate results, let's use our own thoughts to generate definitions. This can be done as a class discussion.<br>
<br>I first needed a term for sum(L)/(n - 1), where n is the number of items in L. That certainly is a kind of mean, but there doesn't seem to be an official term for it. So, I decided to call it an 'adjusted' mean. If there is a better term for this, please let me know. For the time being, I can say that variance is ALWAYS the mean of the squared deviations. It's just that if you're dealing with a sample, you find the 'adjusted' mean of the squared deviations.<br>
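<br>To make the "always a mean of squared deviations" idea concrete, here is a small worked sketch. The data list is made up purely for illustration, and the variable names are mine; it assumes Python 3, where / is true division.<br>

```python
# Hypothetical data, chosen only so the arithmetic is easy to follow by hand.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

m = sum(data) / n  # ordinary mean: 40/8 = 5.0

# Squared deviations from the mean: 9, 1, 1, 1, 0, 0, 4, 16 (sum = 32)
squared_deviations = [(x - m) ** 2 for x in data]

# Population variance: the ordinary mean of the squared deviations.
population_variance = sum(squared_deviations) / n  # 32/8 = 4.0

# Sample variance: the 'adjusted' mean of the same squared deviations.
sample_variance = sum(squared_deviations) / (n - 1)  # 32/7, about 4.571

print(population_variance, sample_variance)
```

<br>The only difference between the two results is the divisor, which is the point of calling the second one an 'adjusted' mean.<br>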
<br>So now we have a nice suite of mostly one-liner functions that handle things we've studied up through the Pearson correlation coefficient. <br><br>In the following, 'sample' is a global boolean variable. The concepts are most easily expressed in population form, but most frequently applied using sample form. This puts the two together.<br>
<br>Caveat: this is not meant to be definitive code. Not at all. It is only meant to be code that illustrates concepts. Feedback welcomed. As I said to my dept chair, certainly there are many software packages that already exist that will find these things for you, but could the code behind them serve as a math student's notes??? No way!<br>
<br>- Michel<br><br>=======================================<br><br>from math import sqrt<br><br>sample = True<br>
<br>def mean(L): return sum(L)/len(L)<br><br><div>def adjusted_mean(L): return sum(L)/(len(L) - 1)<br><br></div><div>def deviations(L): return [x - mean(L) for x in L]<br><br>def squares(L): return [x**2 for x in L]<br>
<br>def variance(L):<br></div>
&nbsp;&nbsp;&nbsp;&nbsp;if sample: return adjusted_mean(squares(deviations(L)))<br><div dir="ltr">&nbsp;&nbsp;&nbsp;&nbsp;else: return mean(squares(deviations(L)))<br><div><br>def stdev(L): return sqrt(variance(L))<br><br>def zscores(L): return [deviation/stdev(L) for deviation in deviations(L)]<br>
<br></div>def X(L): return [x for (x, y) in L]<br>def Y(L): return [y for (x, y) in L]<br><br>def r(L):<br>&nbsp;&nbsp;&nbsp;&nbsp;if sample: return adjusted_mean([zx*zy for (zx, zy) in zip(zscores(X(L)), zscores(Y(L)))])<br>&nbsp;&nbsp;&nbsp;&nbsp;else: return mean([zx*zy for (zx, zy) in zip(zscores(X(L)), zscores(Y(L)))])<br>