Analysis of variance (ANOVA) is the collective name of an array of methods that find wide applications in inferential statistics. In essence, we compare groups of observations in order to check if there are significant differences between them, which may be attributed to the impact of underlying factors. One such case occurs when we compare sample means taken from m populations, in order to test the hypothesis that the respective expected values are all the same. Note that, so far, we only considered two populations; with ANOVA we may check an arbitrary number of populations. The ability to analyze the impact of factors is also useful to assess the significance of statistical models, as well as to design statistical experiments. The approach relies on the comparison of different estimates of variance, which should not be significantly different, if factors are not relevant; if we find a statistically significant difference in estimates, then we may reject the hypothesis that factors have no impact.
In this section, we take a somewhat limited view, which is nevertheless able to convey the essentials of ANOVA. We consider two simple and specific cases:
- One-way ANOVA, whereby we assume that there is one factor at work.
- Two-way ANOVA, whereby we assume that there are two factors at work.
We will take another look at ANOVA in the context of linear regression in Section 10.3.4.
9.6.1 One-way ANOVA
In Section 9.4.1 we considered a test of the hypothesis that the means of two (normal) populations are the same:

H0: μ1 = μ2
It is easy to imagine situations in which we want to check a similar claim for more than two populations. To set the stage for the following treatment, let us assume that we have m normal populations, i = 1,…, m, and that we take a sample of n elements from each population. If the number of observations is the same for each population, we have a balanced design; otherwise, we have an unbalanced design. Formally, we are considering the following random variables:

Xij,  i = 1,…, m,  j = 1,…, n,
where the subscripts i and j refer to populations and observations, respectively. As usual, all observations are assumed independent. We denote by μi the unknown expected value of population i. Formally, the null hypothesis we want to test is

H0: μ1 = μ2 = ⋯ = μm
against the alternative Ha that not all expected values are the same. Another key assumption concerns the population variances. They are unknown, but it is assumed that they all have the same value σ2. This might seem a bold assumption, but keep in mind that we want to check the equality of the expected values or, more informally, whether there is any significant difference among the populations; hence, under the null hypothesis, it is natural to assume a common variance.
Since we have m samples of size n, we have a grand total of nm independent, normally distributed observations. If we standardize, square, and add all of them, we obtain a chi-square random variable with nm degrees of freedom:

Σi Σj (Xij − μi)2/σ2    (9.33)
Since the expected values are unknown, we should replace them with the sample mean of each population:

X̄i· = (1/n) Σj Xij,  i = 1,…, m,
where the notation points out that this is a sample mean obtained by summing over the second subscript j. If we plug these sample means into Eq. (9.33), we get the random variable

SSw = Σi Σj (Xij − X̄i·)2,

known as the sum of squares within samples, since deviations are taken with respect to the sample mean within each population. The ratio SSw/σ2 is again a chi-square variable, but with nm − m degrees of freedom, since we have estimated m expected values. Given the properties of chi-square variables, we obtain

E[SSw/σ2] = nm − m,
which means that SSw/(nm − m) is an unbiased estimator of σ2, regardless of whether the null hypothesis H0 is true.
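This property is easy to check numerically. The following Python sketch (with arbitrary, made-up parameters) simulates m samples whose means are deliberately different, and shows that SSw/(nm − m) still estimates σ2 accurately:

```python
import numpy as np

rng = np.random.default_rng(42)
m, n = 3, 6                          # number of populations, sample size
mu = np.array([10.0, 12.0, 15.0])    # deliberately different means
sigma = 2.0                          # common standard deviation

# Average the within-samples estimator over many replications
estimates = []
for _ in range(5000):
    X = rng.normal(mu[:, None], sigma, size=(m, n))   # one row per population
    SSw = ((X - X.mean(axis=1, keepdims=True)) ** 2).sum()
    estimates.append(SSw / (n * m - m))

print(np.mean(estimates))    # close to sigma**2 = 4.0
```

Note that the estimator stays on target even though the means differ, since each deviation is taken with respect to its own sample mean.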
Now we build another estimator of σ2, which is unbiased only if H0 holds, i.e., if the expected values are all the same: μi = μ, for i = 1,…, m. In such a case, we could estimate μ by taking the overall sample mean

X̄ = (1/(nm)) Σi Σj Xij = (1/m) Σi X̄i·    (9.35)
Then, to estimate σ2, we could take a different route. Let us define the sum of squares between samples:

SSb = n Σi (X̄i· − X̄)2
To see the rationale behind this definition, observe that, under the null hypothesis, each sample mean X̄i· is normal with expected value μ and variance σ2/n, so that the variables

(X̄i· − μ)/(σ/√n),  i = 1,…, m,
are standard normal. If we square and sum these variables, we get a chi-square variable with m degrees of freedom:

Σi n(X̄i· − μ)2/σ2
If we plug Eq. (9.35) into the sum above to replace the unknown expected value μ, we lose one degree of freedom and obtain, under the null hypothesis,

SSb/σ2 = Σi n(X̄i· − X̄)2/σ2 ~ chi-square with m − 1 degrees of freedom
Table 9.5 Sample data for one-way ANOVA.

This implies that, under H0:
- E[SSb]/σ2 = m − 1
- SSb/(m − 1) is an unbiased estimator of σ2
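A similar numerical check works for SSb/(m − 1), but only under H0: in the Python sketch below (again with made-up parameters), all means are equal, and the average of the between-samples estimator is close to σ2; making the means unequal would inflate it.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 6
mu, sigma = 10.0, 2.0        # H0 true: all populations share the same mean

estimates = []
for _ in range(5000):
    X = rng.normal(mu, sigma, size=(m, n))
    row_means = X.mean(axis=1)
    SSb = n * ((row_means - X.mean()) ** 2).sum()
    estimates.append(SSb / (m - 1))

print(np.mean(estimates))    # close to sigma**2 = 4.0
```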
To summarize, we have two estimators of the unknown variance: SSw/(nm − m) is always unbiased, whereas SSb/(m − 1) is unbiased only if the means are all the same. Hence, under the null hypothesis, the ratio of the two estimators should be close to 1. Moreover, it can be shown that SSb/(m − 1) tends to overestimate σ2 if H0 is not true. Thus, we consider the following test statistic:

TS = [SSb/(m − 1)] / [SSw/(nm − m)],
which, under H0, is an F variable with m − 1 and nm − m degrees of freedom. We reject the hypothesis when the test statistic is too large. More precisely, if F1−α,m−1,nm−m is the (1 − α)-quantile of the F distribution, we obtain a test with significance level α by rejecting H0 when TS > F1−α,m−1,nm−m.
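Putting the pieces together, the whole test can be sketched in a few lines of Python; the three samples below are made-up numbers (not the data of Table 9.5), and scipy is used for the F quantile:

```python
import numpy as np
from scipy import stats

# Hypothetical balanced data: m = 3 samples of size n = 6
X = np.array([
    [12.1, 11.8, 13.0, 12.5, 11.9, 12.7],
    [13.2, 12.9, 13.8, 14.1, 13.5, 12.8],
    [11.5, 12.0, 11.2, 12.3, 11.8, 11.6],
])
m, n = X.shape

row_means = X.mean(axis=1)                       # sample means for each population
grand_mean = X.mean()                            # overall sample mean

SSw = ((X - row_means[:, None]) ** 2).sum()      # sum of squares within samples
SSb = n * ((row_means - grand_mean) ** 2).sum()  # sum of squares between samples

TS = (SSb / (m - 1)) / (SSw / (n * m - m))       # test statistic
crit = stats.f.ppf(0.95, m - 1, n * m - m)       # quantile for alpha = 5%
p_value = stats.f.sf(TS, m - 1, n * m - m)

print(TS, crit, p_value)
```

The same statistic and p-value can be obtained directly with `scipy.stats.f_oneway(*X)`, which is a convenient cross-check.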
Example 9.24 Let us apply one-way ANOVA to the data listed in Table 9.5, where we have three samples of size n = 6, taken from m = 3 populations. The first step is to calculate the sums
We observe that the three sample means do look rather different. Now we should test the null hypothesis

H0: μ1 = μ2 = μ3
We proceed by calculating the following sums of squares:
Thus, we find the following alternative estimates of the unknown variance σ2:
which do look different, at first sight. The test statistic is
and, assuming a significance level α = 5%, it should be compared with the following quantile of the F distribution with 2 and 15 degrees of freedom:
We see that the test statistic does not fall into the rejection region and, therefore, the apparent difference in sample means is not statistically significant. Actually, using the CDF of the F distribution, we obtain the p-value
which is pretty large. In order to reject the null hypothesis, we should accept a very large probability of a type I error.
This procedure is easily adapted to the unbalanced case, where the m samples do not all have the same size. It is often argued, however, that a balanced design is preferable, as the resulting test is somewhat more robust to lack of normality in the populations.
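In practice, library routines handle the unbalanced case automatically. For instance, `scipy.stats.f_oneway` accepts samples of different sizes (the numbers below are made up):

```python
from scipy import stats

# Hypothetical unbalanced samples
a = [12.1, 11.8, 13.0, 12.5]
b = [13.2, 12.9, 13.8, 14.1, 13.5]
c = [11.5, 12.0, 11.2]

F, p = stats.f_oneway(a, b, c)
print(F, p)   # F statistic and p-value
```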