ANALYSIS OF VARIANCE

Analysis of variance (ANOVA) is the collective name of an array of methods that find wide applications in inferential statistics. In essence, we compare groups of observations in order to check if there are significant differences between them, which may be attributed to the impact of underlying factors. One such case occurs when we compare sample means taken from m populations, in order to test the hypothesis that the respective expected values are all the same. Note that, so far, we only considered two populations; with ANOVA we may check an arbitrary number of populations. The ability to analyze the impact of factors is also useful to assess the significance of statistical models, as well as to design statistical experiments. The approach relies on the comparison of different estimates of variance, which should not be significantly different, if factors are not relevant; if we find a statistically significant difference in estimates, then we may reject the hypothesis that factors have no impact.

In this section, we take a somewhat limited view, which is nevertheless able to convey the essentials of ANOVA. We consider two simple and specific cases:

  1. One-way ANOVA, whereby we assume that there is one factor at work.
  2. Two-way ANOVA, whereby we assume that there are two factors at work.

We will take another view at ANOVA in the context of linear regression in Section 10.3.4.

9.6.1 One-way ANOVA

In Section 9.4.1 we considered a test concerning the hypothesis that the means of two (normal) populations are the same:

$$H_0:\ \mu_1 = \mu_2$$

It is easy to imagine situations in which we want to check a similar claim for more than two populations. To set the stage for the following treatment, let us assume that we have m normal populations, i = 1,…, m, and that we take a sample of n elements from each population. If the number of observations for each population is the same, we have a balanced design; otherwise, we have an unbalanced design. Formally, we are considering the following random variables:

$$X_{ij} \sim \mathrm{N}(\mu_i, \sigma_i^2), \qquad i = 1, \ldots, m, \quad j = 1, \ldots, n,$$

where the subscripts i and j refer to populations and observations, respectively. As usual, all observations are assumed independent. We denote by μi the unknown expected value of population i. Formally, the null hypothesis we want to test is

$$H_0:\ \mu_1 = \mu_2 = \cdots = \mu_m$$

against the alternative Ha that not all expected values are the same. Another key assumption concerns population variances. They are unknown, but it is assumed that all of them have the same value σ². This might seem a bold assumption, but keep in mind that we want to check the equality of the expected values or, more informally, whether there is any significant difference among the populations; hence, under the null hypothesis it is natural to assume a common variance.

Since we have m samples of size n, we have a grand total of nm independent, normally distributed observations. If we standardize, square, and add all of them, we obtain a chi-square random variable with nm degrees of freedom:

$$\chi^2_{nm} = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{(X_{ij} - \mu_i)^2}{\sigma^2} \qquad (9.33)$$

Since expected values are unknown, we should replace them with sample means for each population

$$\bar{X}_{i\cdot} = \frac{1}{n} \sum_{j=1}^{n} X_{ij}, \qquad i = 1, \ldots, m,$$

where the notation $\bar{X}_{i\cdot}$ points out that this is a sample mean obtained by summing over the second subscript j. If we plug these sample means into Eq. (9.33), we get the random variable

$$\frac{\mathrm{SS}_w}{\sigma^2} = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{(X_{ij} - \bar{X}_{i\cdot})^2}{\sigma^2},$$

where we define

$$\mathrm{SS}_w \equiv \sum_{i=1}^{m} \sum_{j=1}^{n} (X_{ij} - \bar{X}_{i\cdot})^2$$

as the sum of squares within samples, since deviations are taken with respect to the sample mean of each population. This is again a chi-square variable, but with nm − m degrees of freedom, since we have estimated m expected values. Given the properties of chi-square variables, we obtain

$$E[\mathrm{SS}_w] = (nm - m)\,\sigma^2,$$

which means that SSw/(nm − m) is an unbiased estimator of σ², regardless of whether the null hypothesis H0 is true.
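As a quick illustrative sketch (with simulated data and invented parameter values), the within-samples estimator can be computed directly with NumPy; in the balanced case it coincides with the average of the per-population sample variances:

```python
import numpy as np

rng = np.random.default_rng(42)
m, n = 3, 6                  # m populations, n observations each (balanced design)
sigma2 = 4.0                 # common true variance, assumed equal across populations
mu = np.array([10.0, 10.0, 10.0])   # hypothetical expected values

# X[i, j] is the j-th observation from population i
X = rng.normal(mu[:, None], np.sqrt(sigma2), size=(m, n))

xbar_i = X.mean(axis=1)                      # per-population sample means
SSw = ((X - xbar_i[:, None]) ** 2).sum()     # sum of squares within samples
s2_w = SSw / (n * m - m)                     # unbiased estimate of sigma^2
```

Since SSw pools (n − 1)·s²ᵢ across the m samples, `s2_w` here is just the mean of the m sample variances.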

Now we build another estimator of σ², which is unbiased only if H0 holds, i.e., if the expected values are all the same: μi = μ, for i = 1,…, m. In such a case, we could estimate μ by taking the overall sample mean

$$\bar{X} = \frac{1}{nm} \sum_{i=1}^{m} \sum_{j=1}^{n} X_{ij} \qquad (9.35)$$

Then, to estimate σ², we could take a different route. Let us define the sum of squares between samples

$$\mathrm{SS}_b \equiv n \sum_{i=1}^{m} (\bar{X}_{i\cdot} - \bar{X})^2$$

To see the rationale behind the definition, let us observe that, under the null hypothesis, the variables

$$Z_i = \frac{\bar{X}_{i\cdot} - \mu}{\sigma / \sqrt{n}}, \qquad i = 1, \ldots, m,$$

are standard normal. If we square and sum these variables, we get the following chi-square variable:

$$\chi^2_m = \sum_{i=1}^{m} \frac{(\bar{X}_{i\cdot} - \mu)^2}{\sigma^2 / n}$$

If we plug Eq. (9.35) into the sum above to replace the unknown expected value μ, under the null hypothesis, we obtain

$$\frac{\mathrm{SS}_b}{\sigma^2} = \sum_{i=1}^{m} \frac{(\bar{X}_{i\cdot} - \bar{X})^2}{\sigma^2 / n} \sim \chi^2_{m-1}$$

This implies that, under H0:

  • E[SSb]/σ² = m − 1;
  • SSb/(m − 1) is an unbiased estimator of σ².

Table 9.5 Sample data for one-way ANOVA.

[table values not reproduced in this extract]
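The between-samples estimator can be sketched in the same style; the data below are simulated under H0 with hypothetical parameter values, just to show the computation:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 6
# H0 true: all three populations share mean 10.0 and standard deviation 2.0
X = rng.normal(10.0, 2.0, size=(m, n))

xbar_i = X.mean(axis=1)                    # per-population sample means
xbar = X.mean()                            # overall (grand) sample mean
SSb = n * ((xbar_i - xbar) ** 2).sum()     # sum of squares between samples
s2_b = SSb / (m - 1)                       # unbiased for sigma^2 only under H0
```

In a balanced design the grand mean is simply the average of the m sample means, which the code above exploits implicitly.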

To summarize, we have two estimators for the unknown variance: SSw/(nm − m) is always unbiased; SSb/(m − 1) is unbiased only if the means are all the same. Then, under the null hypothesis, the ratio of the two estimators should be close to 1. Moreover, it can be shown that SSb/(m − 1) tends to overestimate σ² if H0 is not true. Hence, we consider the following test statistic:

$$TS = \frac{\mathrm{SS}_b / (m - 1)}{\mathrm{SS}_w / (nm - m)},$$

which under H0 is an F variable with m − 1 and nm − m degrees of freedom. We reject the hypothesis when the test statistic is too large. More precisely, if $F_{1-\alpha,\, m-1,\, nm-m}$ is the (1 − α)-quantile of the F distribution with those degrees of freedom, we obtain a test with significance level α if we reject when $TS > F_{1-\alpha,\, m-1,\, nm-m}$.
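Putting the pieces together, the whole test can be sketched in a few lines and cross-checked against SciPy's built-in one-way ANOVA; the data and parameter values below are hypothetical:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
m, n, alpha = 3, 6, 0.05
X = rng.normal(10.0, 2.0, size=(m, n))   # hypothetical balanced data, H0 true

xbar_i = X.mean(axis=1)
SSw = ((X - xbar_i[:, None]) ** 2).sum()         # sum of squares within
SSb = n * ((xbar_i - X.mean()) ** 2).sum()       # sum of squares between
TS = (SSb / (m - 1)) / (SSw / (n * m - m))       # test statistic

crit = stats.f.ppf(1 - alpha, m - 1, n * m - m)  # (1 - alpha)-quantile of F
reject = TS > crit                               # rejection rule

# Cross-check against SciPy's one-way ANOVA
F, p = stats.f_oneway(*X)
```

The manual statistic `TS` must agree with `F` returned by `scipy.stats.f_oneway`, which implements exactly this test.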

Example 9.24 Let us apply one-way ANOVA to the data listed in Table 9.5, where we have three samples of size n = 6, taken from m = 3 populations. The first step is to calculate the sample means

$$\bar{X}_{i\cdot} = \frac{1}{6} \sum_{j=1}^{6} X_{ij}, \qquad i = 1, 2, 3$$

We observe that the three sample means do look rather different. Now we should test the null hypothesis

$$H_0:\ \mu_1 = \mu_2 = \mu_3$$

We proceed by calculating the sums of squares

$$\mathrm{SS}_w = \sum_{i=1}^{3} \sum_{j=1}^{6} (X_{ij} - \bar{X}_{i\cdot})^2, \qquad \mathrm{SS}_b = 6 \sum_{i=1}^{3} (\bar{X}_{i\cdot} - \bar{X})^2$$

Thus, we find the following alternative estimates of the unknown variance σ²:

$$\frac{\mathrm{SS}_w}{nm - m} = \frac{\mathrm{SS}_w}{15}, \qquad \frac{\mathrm{SS}_b}{m - 1} = \frac{\mathrm{SS}_b}{2},$$

which do look different, at first sight. The test statistic is

$$TS = \frac{\mathrm{SS}_b / 2}{\mathrm{SS}_w / 15},$$

and, assuming a significance level α = 5%, it should be compared with the following quantile of the F distribution with 2 and 15 degrees of freedom:

$$F_{0.95,\,2,\,15} \approx 3.68$$

We see that the test statistic does not fall into the rejection region and, therefore, the apparent difference in sample means is not statistically significant. Actually, using the CDF of the F distribution, we obtain the p-value

$$p\text{-value} = P\{F_{2,15} > TS\},$$

which is pretty large. In order to reject the null hypothesis, we should accept a very large probability of a type I error.
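The computations of Example 9.24 can be replicated as follows; since the actual figures of Table 9.5 are not reproduced here, the array below is a hypothetical stand-in with the same shape (m = 3 samples of size n = 6):

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in for Table 9.5 (the book's actual figures differ)
X = np.array([
    [10.2,  9.8, 11.1, 10.5,  9.9, 10.7],
    [11.0, 10.4, 11.6, 10.9, 11.3, 10.6],
    [ 9.5, 10.1,  9.8, 10.3,  9.6, 10.0],
])
m, n = X.shape

xbar_i = X.mean(axis=1)
SSw = ((X - xbar_i[:, None]) ** 2).sum()
SSb = n * ((xbar_i - X.mean()) ** 2).sum()
TS = (SSb / (m - 1)) / (SSw / (n * m - m))

p_value = stats.f.sf(TS, m - 1, n * m - m)   # P(F > TS)
crit = stats.f.ppf(0.95, m - 1, n * m - m)   # about 3.68 for (2, 15) dof
```

Whatever the data, the decision rule is the same: reject H0 at the 5% level only when `TS` exceeds `crit`, or equivalently when `p_value` falls below 0.05.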

This procedure is easily adapted to the unbalanced case, in which the m samples do not have the same size. It is often argued, however, that a balanced design is preferable for nonnormal populations, as the resulting test is somewhat more robust to lack of normality.
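For the unbalanced case, the sums of squares generalize by weighting each squared deviation of a sample mean by that sample's size, with N − m degrees of freedom for SSw, where N is the total number of observations. A sketch with hypothetical unequal-size samples:

```python
import numpy as np
from scipy import stats

# Unbalanced design: sample sizes n_i differ across the m = 3 groups (hypothetical data)
samples = [
    np.array([10.1,  9.7, 10.4, 10.9]),                 # n_1 = 4
    np.array([11.2, 10.8, 11.5, 10.6, 11.0, 10.9]),     # n_2 = 6
    np.array([ 9.9, 10.2,  9.6, 10.0, 10.3]),           # n_3 = 5
]
m = len(samples)
N = sum(len(s) for s in samples)          # total number of observations

xbar = np.concatenate(samples).mean()     # grand mean over all observations
SSw = sum(((s - s.mean()) ** 2).sum() for s in samples)
SSb = sum(len(s) * (s.mean() - xbar) ** 2 for s in samples)
TS = (SSb / (m - 1)) / (SSw / (N - m))    # F with (m - 1, N - m) degrees of freedom
```

Note that `scipy.stats.f_oneway` accepts samples of different lengths, so it handles the unbalanced case directly.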

