In one-way ANOVA we test whether observations from different populations have different means; membership in a population can be regarded as the one factor affecting the observations. In two-way ANOVA we consider the possibility that two factors affect the observations. As a first step, it is useful to reconsider one-way ANOVA in a slightly different light. What we are implicitly assuming is that each random variable can be expressed as the sum of an unknown value plus a random disturbance

$$ X_{ij} = \mu_i + \epsilon_{ij}, \qquad i = 1, \ldots, m, \quad j = 1, \ldots, n $$

where εij ~ N(0, σ²). Then E[Xij] = μi, which is the only factor affecting the expected value of the observations. If we denote the average expected value by μ, where

$$ \mu = \frac{1}{m} \sum_{i=1}^{m} \mu_i $$

we may write

$$ X_{ij} = \mu + \alpha_i + \epsilon_{ij} $$

where αi = μi − μ and, by construction, ∑i αi = 0. Hence, the average value of the αi is zero, but the null hypothesis of one-way ANOVA is much stronger, since it amounts to saying that there is no effect due to the αi at all, which is true only if αi = 0 for all i.
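As a small numerical illustration (the group means below are made-up values), the effects αi computed from any set of μi automatically average to zero:

```python
# Decomposing hypothetical group means mu_i into a grand mean mu
# plus effects alpha_i = mu_i - mu, which sum to zero by construction.
mu_groups = [10.0, 12.0, 14.0]             # made-up values of mu_1, mu_2, mu_3
mu = sum(mu_groups) / len(mu_groups)       # average expected value
alpha = [mu_i - mu for mu_i in mu_groups]  # effects alpha_i = mu_i - mu

print(mu)          # 12.0
print(alpha)       # [-2.0, 0.0, 2.0]
print(sum(alpha))  # 0.0
```

The null hypothesis is not merely that these effects average to zero (they always do), but that each one is exactly zero.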

We may generalize the idea and consider two factors

$$ X_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij} $$

where

$$ \sum_{i=1}^{m} \alpha_i = 0, \qquad \sum_{j=1}^{n} \beta_j = 0 $$

In this case, we are taking into consideration the presence of two factors, which are not interacting. If we want to account for interaction, we should extend the model to

$$ X_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ij} $$

If we organize observations in rows indexed by i and columns indexed by j, we may test the following hypotheses:

  1. There is no row effect, i.e., αi = 0, for all i.
  2. There is no column effect, i.e., βj = 0, for all j.
  3. There is no effect due to interaction, i.e., γij = 0, for all i and j.
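To make the model concrete, here is a minimal sketch that simulates one observation per cell from the full model with interaction; all effect values and the noise level are made-up choices for illustration, with the side conditions (effects summing to zero) satisfied by construction:

```python
import random

# Simulate X_ij = mu + alpha_i + beta_j + gamma_ij + eps_ij for an m x n layout.
random.seed(42)
m, n = 3, 4
mu = 10.0                                # grand mean (made up)
alpha = [-1.0, 0.0, 1.0]                 # row effects, sum to zero
beta = [-1.5, -0.5, 0.5, 1.5]            # column effects, sum to zero
gamma = [[ 0.5, -0.5, 0.0, 0.0],         # interaction terms: every row and
         [-0.5,  0.5, 0.0, 0.0],         # every column of gamma sums to zero
         [ 0.0,  0.0, 0.0, 0.0]]
sigma = 1.0                              # common standard deviation of eps_ij

X = [[mu + alpha[i] + beta[j] + gamma[i][j] + random.gauss(0.0, sigma)
      for j in range(n)] for i in range(m)]
```

Testing hypotheses 1-3 amounts to asking whether the data could plausibly have been generated with all alpha, all beta, or all gamma set to zero, respectively.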

Let us consider the first case, assuming that there is no interaction and that the variance is σ² for all i and j:

$$ H_0: \alpha_i = 0, \qquad i = 1, \ldots, m $$

As in one-way ANOVA, we build different estimators of σ2, one of which is unbiased only if the null hypothesis is true. To obtain an estimator that is always valid, let us consider

$$ \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{\left( X_{ij} - \mu - \alpha_i - \beta_j \right)^2}{\sigma^2} \qquad (9.36) $$

This is a chi-square variable with nm degrees of freedom, if observations are normal and independent. To estimate the unknown parameters, we consider the appropriate sample means

$$ \overline{X}_{i\cdot} = \frac{1}{n} \sum_{j=1}^{n} X_{ij}, \qquad \overline{X}_{\cdot j} = \frac{1}{m} \sum_{i=1}^{m} X_{ij}, \qquad \overline{X}_{\cdot\cdot} = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} X_{ij} $$

and the corresponding estimators

$$ \hat{\mu} = \overline{X}_{\cdot\cdot}, \qquad \hat{\alpha}_i = \overline{X}_{i\cdot} - \overline{X}_{\cdot\cdot}, \qquad \hat{\beta}_j = \overline{X}_{\cdot j} - \overline{X}_{\cdot\cdot} $$

We should recall that, since the sum of the parameters αi is zero, we need to estimate only m − 1 of them; by the same token, we need to estimate only n − 1 parameters βj. So, we need to estimate a grand total of

$$ 1 + (m - 1) + (n - 1) = n + m - 1 $$

parameters. Then, if we plug the above estimators into Eq. (9.36), we find that

$$ \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{\left( X_{ij} - \overline{X}_{i\cdot} - \overline{X}_{\cdot j} + \overline{X}_{\cdot\cdot} \right)^2}{\sigma^2} $$

is chi-square with

$$ nm - (n + m - 1) = (n - 1)(m - 1) $$

degrees of freedom. Then, if we define the sum of squared errors as

$$ \mathrm{SSE} = \sum_{i=1}^{m} \sum_{j=1}^{n} \left( X_{ij} - \overline{X}_{i\cdot} - \overline{X}_{\cdot j} + \overline{X}_{\cdot\cdot} \right)^2 $$

we have

$$ E\left[ \frac{\mathrm{SSE}}{(n-1)(m-1)} \right] = \sigma^2 $$
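A quick sketch of this computation in Python, on a small made-up table; this particular table is exactly additive (each cell is a row constant plus a column constant), so the residuals, and hence SSE, come out to zero:

```python
# SSE for a small m x n table with one observation per cell.
X = [[ 9.0, 11.0, 10.0, 12.0],
     [ 8.0, 10.0,  9.0, 11.0],
     [10.0, 12.0, 11.0, 13.0]]
m, n = len(X), len(X[0])

row_mean = [sum(row) / n for row in X]                             # X-bar_i.
col_mean = [sum(X[i][j] for i in range(m)) / m for j in range(n)]  # X-bar_.j
grand = sum(map(sum, X)) / (m * n)                                 # X-bar_..

SSE = sum((X[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
          for i in range(m) for j in range(n))
var_hat = SSE / ((m - 1) * (n - 1))  # unbiased estimate of sigma^2
print(SSE)  # 0.0: the additive model fits this table exactly
```

SSE measures only the deviation of the data from the additive fit, which is why it estimates σ² whether or not the null hypothesis holds.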

Therefore, we have built an unbiased estimator of variance. Now, we build another estimator, which is unbiased only under the null hypothesis. In fact, under H0, we have:

$$ E\left[ \overline{X}_{i\cdot} \right] = \mu $$

Since Var(X̄i·) = σ²/n, the sum of squared standardized variables

$$ \sum_{i=1}^{m} \frac{\left( \overline{X}_{i\cdot} - \mu \right)^2}{\sigma^2 / n} = \frac{n}{\sigma^2} \sum_{i=1}^{m} \left( \overline{X}_{i\cdot} - \mu \right)^2 $$

is a chi-square variable with m degrees of freedom, if the null hypothesis is true. Replacing μ by its estimator X̄··, we lose one degree of freedom. So, if we define the row sum of squares

$$ \mathrm{SS}_{\mathrm{rows}} = n \sum_{i=1}^{m} \left( \overline{X}_{i\cdot} - \overline{X}_{\cdot\cdot} \right)^2 $$

we have

$$ \frac{\mathrm{SS}_{\mathrm{rows}}}{\sigma^2} \sim \chi^2_{m-1}, \qquad E\left[ \frac{\mathrm{SS}_{\mathrm{rows}}}{m-1} \right] = \sigma^2 \quad \text{under } H_0 $$

Therefore, we have another estimator of variance, but this one is unbiased only under the null hypothesis. When H0 is not true, this estimator tends to overestimate σ². Then, we may run a test based on the test statistic

$$ TS = \frac{\mathrm{SS}_{\mathrm{rows}} / (m-1)}{\mathrm{SSE} / \left[ (m-1)(n-1) \right]} $$

which, under H0, has an F distribution with (m − 1) and (m − 1)(n − 1) degrees of freedom. Given a significance level α, we reject the null hypothesis that there is no row effect if

$$ TS > F_{\alpha,\, m-1,\, (m-1)(n-1)} $$
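Putting the pieces together, here is a minimal sketch of the row-effect test; the 3 × 4 data table is made up for illustration, and the critical value F(0.05, 2, 6) ≈ 5.14 is taken from a standard F table:

```python
# Row-effect F test for a two-way layout with no interaction,
# one observation per cell (m = 3 rows, n = 4 columns).
X = [[12.1, 13.4, 12.8, 13.9],
     [ 9.8, 10.5, 10.1, 11.2],
     [11.0, 12.2, 11.6, 12.5]]
m, n = len(X), len(X[0])

row_mean = [sum(row) / n for row in X]                             # X-bar_i.
col_mean = [sum(X[i][j] for i in range(m)) / m for j in range(n)]  # X-bar_.j
grand = sum(map(sum, X)) / (m * n)                                 # X-bar_..

SS_rows = n * sum((rm - grand) ** 2 for rm in row_mean)
SSE = sum((X[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
          for i in range(m) for j in range(n))

# TS ~ F(m-1, (m-1)(n-1)) under H0: no row effect.
TS = (SS_rows / (m - 1)) / (SSE / ((m - 1) * (n - 1)))
F_crit = 5.14  # F_{0.05, 2, 6} from a standard F table
print(TS > F_crit)  # True: the row means differ markedly, so H0 is rejected
```

With a statistics library available, the critical value could instead be computed (e.g., `scipy.stats.f.ppf`) rather than looked up.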

Clearly, a similar route can be taken to check for column effects, where we define a column sum of squares

$$ \mathrm{SS}_{\mathrm{cols}} = m \sum_{j=1}^{n} \left( \overline{X}_{\cdot j} - \overline{X}_{\cdot\cdot} \right)^2 $$

which is related to a chi-square variable with n − 1 degrees of freedom, and we reject the null hypothesis that there is no column effect if

$$ TS = \frac{\mathrm{SS}_{\mathrm{cols}} / (n-1)}{\mathrm{SSE} / \left[ (m-1)(n-1) \right]} > F_{\alpha,\, n-1,\, (m-1)(n-1)} $$
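The column test is the mirror image of the row test; a sketch on the same kind of made-up 3 × 4 table, with the critical value F(0.05, 3, 6) ≈ 4.76 taken from a standard F table:

```python
# Column-effect F test: SS_cols has n - 1 degrees of freedom here.
X = [[12.1, 13.4, 12.8, 13.9],
     [ 9.8, 10.5, 10.1, 11.2],
     [11.0, 12.2, 11.6, 12.5]]
m, n = len(X), len(X[0])

row_mean = [sum(row) / n for row in X]
col_mean = [sum(X[i][j] for i in range(m)) / m for j in range(n)]
grand = sum(map(sum, X)) / (m * n)

SS_cols = m * sum((cm - grand) ** 2 for cm in col_mean)
SSE = sum((X[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
          for i in range(m) for j in range(n))

# TS ~ F(n-1, (m-1)(n-1)) under H0: no column effect.
TS = (SS_cols / (n - 1)) / (SSE / ((m - 1) * (n - 1)))
F_crit = 4.76  # F_{0.05, 3, 6} from a standard F table
print(TS > F_crit)  # True: the steady left-to-right increase is a clear column effect
```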

The case with interactions is a bit trickier, but it follows the same conceptual path.

