** Basic statistics
:PROPERTIES:
:categories: statistics
:date: 2013/02/18 09:00:00
:updated: 2013/02/27 14:35:05
:END:
Given several measurements of a single quantity, determine the average value of the measurements, the standard deviation of the measurements and the 95% confidence interval for the average.
This is a recipe for computing the confidence interval. The strategy is:
1. compute the average
2. Compute the standard deviation of your data
3. Define the confidence interval, e.g. 95% = 0.95
4. compute the student-t multiplier. This is a function of the confidence
interval you specify, and the number of data points you have minus 1. You
subtract 1 because one degree of freedom is lost from calculating the
average. The confidence interval is defined as
ybar +- T_multiplier*std/sqrt(n).
#+BEGIN_SRC python
import numpy as np
from scipy.stats.distributions import t
y = [8.1, 8.0, 8.1]
ybar = np.mean(y)
s = np.std(y)
ci = 0.95
alpha = 1.0 - ci
n = len(y)
T_multiplier = t.ppf(1-alpha/2.0, n-1)
ci95 = T_multiplier * s / np.sqrt(n-1)
print [ybar - ci95, ybar + ci95]
#+END_SRC
#+RESULTS:
: [7.9232449090029595, 8.210088424330376]
We are 95% certain the next measurement will fall in the interval above.