Probability Theory 


Types of Convergence


Transforms of PDF/PMF


Important distributions:


Stirling's formula: n! \approx \sqrt{2\pi n} (n/e)^n. Dropping the \sqrt{2\pi n} factor gives the cruder form n! \approx (n/e)^n = e^{n \log n - n}; this is accurate in the sense that \log n! / (n \log n - n) \to 1 as n goes to infinity.
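
A quick numerical check (a sketch in Python; only the standard library is needed):

import math

# Compare log(n!) with Stirling's approximation and with the cruder
# n log n - n form above; math.lgamma(n + 1) computes log(n!) stably.
for n in [5, 10, 50]:
    exact = math.lgamma(n + 1)
    stirling = 0.5 * math.log(2 * math.pi * n) + n * math.log(n) - n
    crude = n * math.log(n) - n
    print(n, exact, stirling, crude)

The ratio exact/crude approaches 1 as n grows, which is the sense in which the crude form is asymptotically accurate.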


What is the inverse chi-square distribution function? It is just the inverse of the chi-square CDF, i.e., the quantile function: for each order r in (0, 1) it returns the corresponding percentile.
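
In software the quantile function is available directly; for example, a sketch using scipy.stats.chi2 (the degrees of freedom and order below are illustrative):

from scipy.stats import chi2

# chi2.ppf(r, df) is the quantile of order r: the value v with
# P(X <= v) = r for X chi-square with df degrees of freedom.
v = chi2.ppf(0.95, df=5)
print(v)                  # roughly 11.07
print(chi2.cdf(v, df=5))  # recovers 0.95, so ppf inverts the CDF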


The Goodness of Fit Test


Preliminaries

Suppose that we have a random experiment with a random variable X of interest. Assume additionally that X is discrete with density function f on a finite set S. We repeat the experiment n times to generate a random sample of size n from the distribution of X:

X1, X2, ..., Xn.

Recall that these are independent variables, each with the distribution of X.
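
A minimal simulation sketch of such a sample, with an illustrative support and probability vector (not from the text):

import numpy as np

# Draw an i.i.d. sample X1, ..., Xn from a discrete distribution on a
# finite set S; rng.choice samples independently with probabilities p.
rng = np.random.default_rng(0)
S = np.array([1, 2, 3, 4])
p = np.array([0.1, 0.2, 0.3, 0.4])
n = 100
sample = rng.choice(S, size=n, p=p)
print(sample[:10])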

In this section, we assume that the distribution of X is unknown. For a given density function f0, we will test the hypotheses

H0: f = f0 versus H1: f ≠ f0.

The test that we will construct is known as the goodness of fit test for the conjectured density f0. As usual, our challenge in developing the test is to find a good test statistic--one that gives us information about the hypotheses and whose distribution, under the null hypothesis, is known, at least approximately.

Derivation of the Test

Suppose that S = {x1, x2, ..., xk}. To simplify the notation, let

pj = f0(xj) for j = 1, 2, ..., k.

Now let Nj = #{i in {1, 2, ..., n}: Xi = xj} for j = 1, 2, ..., k.
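
The counts Nj are easy to compute from a sample; a sketch continuing the illustrative setup above (all names are hypothetical):

import numpy as np

# N_j = #{i : X_i = x_j}: count how often each support point appears.
rng = np.random.default_rng(0)
support = np.array([1, 2, 3, 4])
p = np.array([0.1, 0.2, 0.3, 0.4])
sample = rng.choice(support, size=100, p=p)

N = np.array([(sample == x).sum() for x in support])
print(N, N.sum())   # the N_j sum to n = 100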

Mathematical Exercise 1. Show that under the null hypothesis,

  1. N = (N1, N2, ..., Nk) has the multinomial distribution with parameters n and p1, p2, ..., pk.
  2. E(Nj) = npj.
  3. var(Nj) = npj(1 − pj).
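
Exercise 1 can be sanity-checked by simulation; a sketch with illustrative parameters:

import numpy as np

# Under H0, N = (N_1, ..., N_k) is multinomial(n, p), so
# E(N_j) = n p_j and var(N_j) = n p_j (1 - p_j).
rng = np.random.default_rng(1)
n = 100
p = np.array([0.1, 0.2, 0.3, 0.4])
draws = rng.multinomial(n, p, size=50000)   # many replications of N

print(draws.mean(axis=0))   # close to n*p       = [10, 20, 30, 40]
print(draws.var(axis=0))    # close to n*p*(1-p) = [ 9, 16, 21, 24]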

Exercise 1 indicates how we might begin to construct our test: for each j we can compare the observed frequency of xj (namely Nj) with the expected frequency of value xj (namely npj), under the null hypothesis. Specifically, our test statistic will be

V = (N1 − np1)^2 / np1 + (N2 − np2)^2 / np2 + ... + (Nk − npk)^2 / npk.

Note that the test statistic is based on the squared errors (the squares of the differences between the expected frequencies and the observed frequencies). The reason that the squared errors are scaled as they are is the following crucial fact, which we will accept without proof: Under the null hypothesis, as n increases to infinity, the distribution of V converges to the chi-square distribution with k − 1 degrees of freedom.
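
A sketch of computing V and the resulting approximate p-value (counts and probabilities are made up for illustration):

import numpy as np
from scipy.stats import chi2

# V = sum_j (N_j - n p_j)^2 / (n p_j), compared against chi-square(k - 1).
N = np.array([8, 22, 28, 42])          # observed frequencies
p0 = np.array([0.1, 0.2, 0.3, 0.4])    # hypothesized probabilities under f0
n, k = N.sum(), len(p0)

V = ((N - n * p0) ** 2 / (n * p0)).sum()
print(V)
print(chi2.sf(V, df=k - 1))            # approximate p-value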

As usual, for m > 0 and r in (0, 1), we will let vm, r denote the quantile of order r for the chi-square distribution with m degrees of freedom. For selected values of m and r, vm, r can be obtained from the table of the chi-square distribution.

Mathematical Exercise 2. Show that the following test has approximate significance level α:

Reject H0: f = f0 versus H1: f ≠ f0, if and only if V > vk − 1, 1 − α.

Again, the test is an approximate one that works best when n is large. Just how large n needs to be depends on the pj; the rule of thumb is that the test will work well if every expected frequency npj is at least 1 and at least 80% of them are at least 5.
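
Putting the pieces together, a minimal sketch of the whole test at level alpha, including the rule-of-thumb check (the data are illustrative; scipy.stats.chisquare computes the same statistic and p-value directly):

import numpy as np
from scipy.stats import chi2, chisquare

N = np.array([8, 22, 28, 42])          # observed frequencies
p0 = np.array([0.1, 0.2, 0.3, 0.4])    # hypothesized probabilities under f0
n, k = N.sum(), len(p0)
alpha = 0.05

expected = n * p0
# Rule of thumb: every n p_j >= 1 and at least 80% of them >= 5.
print((expected >= 1).all() and (expected >= 5).mean() >= 0.8)

V = ((N - expected) ** 2 / expected).sum()
critical = chi2.ppf(1 - alpha, df=k - 1)   # v_{k-1, 1-alpha}
print(V > critical)                        # True means reject H0

print(chisquare(N, f_exp=expected))        # built-in equivalent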