## Basic statistics

| categories: statistics | tags:

Given several measurements of a single quantity, determine the average value of the measurements, the standard deviation of the measurements and the 95% confidence interval for the average.

This is a recipe for computing the confidence interval. The strategy is:

1. compute the average
2. Compute the standard deviation of your data
3. Define the confidence interval, e.g. 95% = 0.95
4. compute the student-t multiplier. This is a function of the confidence

interval you specify, and the number of data points you have minus 1. You subtract 1 because one degree of freedom is lost from calculating the average. The confidence interval is defined as ybar +- T_multiplier*std/sqrt(n).

import numpy as np
from scipy.stats.distributions import  t

y = [8.1, 8.0, 8.1]

ybar = np.mean(y)
s = np.std(y)

ci = 0.95
alpha = 1.0 - ci

n = len(y)
T_multiplier = t.ppf(1-alpha/2.0, n-1)

ci95 = T_multiplier * s / np.sqrt(n-1)

print [ybar - ci95, ybar + ci95]

[7.9232449090029595, 8.210088424330376]


We are 95% certain the next measurement will fall in the interval above.