Author: haroonkhan
-
Two-way ANOVA
In one-way ANOVA we test whether observations from different populations have different means; the population an observation belongs to can be regarded as the single factor affecting it. In two-way ANOVA we consider the possibility that two factors affect the observations. As a first step, it is useful to reconsider one-way ANOVA in a slightly different light. What…
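As a concrete sketch of the two-factor setting, the fragment below fits a two-way ANOVA in Python via statsmodels; the response y, the factors A and B, and all of the numbers are hypothetical, so read it as an illustration of the mechanics rather than the text's own method.

# Two-way ANOVA sketch on hypothetical data (factors A and B, response y).
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.DataFrame({
    "y": [23.1, 24.8, 26.3, 22.5, 27.9, 25.4, 21.7, 28.2],
    "A": ["a1", "a1", "a2", "a2", "a1", "a1", "a2", "a2"],
    "B": ["b1", "b1", "b1", "b1", "b2", "b2", "b2", "b2"],
})

# Fit a linear model with both main effects and their interaction,
# then decompose the variability into an ANOVA table.
model = ols("y ~ C(A) + C(B) + C(A):C(B)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))

Each row of the resulting table carries a sum of squares, an F statistic, and a p-value for the corresponding effect.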
-
ANALYSIS OF VARIANCE
Analysis of variance (ANOVA) is the collective name for an array of methods that find wide application in inferential statistics. In essence, we compare groups of observations in order to check whether there are significant differences between them, differences that may be attributed to the impact of underlying factors. One such case occurs when we compare…
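A minimal one-way example, assuming three hypothetical groups of observations; scipy's f_oneway returns the F statistic and the p-value for the null hypothesis that all group means are equal.

import scipy.stats as st

# Hypothetical samples from three groups.
g1 = [20.1, 22.4, 19.8, 21.5, 20.9]
g2 = [23.0, 24.1, 22.7, 25.2, 23.8]
g3 = [20.5, 21.0, 19.9, 22.3, 21.1]

f_stat, p_value = st.f_oneway(g1, g2, g3)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
# A small p-value suggests that at least one group mean differs.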
-
CHECKING THE FIT OF HYPOTHETICAL DISTRIBUTIONS: THE CHI-SQUARE TEST
So far, we have been concerned with parameters of probability distributions. We never questioned the fit of the distribution itself against empirical data. For instance, we might assume that a population is normally distributed, and we may estimate and test its expected value and variance. However, normality should not be taken for granted, just like…
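As an illustrative sketch (the data, the number of bins, and the bin edges are all arbitrary choices made here), one can bin the sample, compute the expected frequencies under a fitted normal distribution, and apply the chi-square statistic Σ (O_k − E_k)²/E_k, reducing the degrees of freedom by one for each estimated parameter.

import numpy as np
import scipy.stats as st

rng = np.random.default_rng(42)
x = rng.normal(loc=10.0, scale=2.0, size=200)     # hypothetical sample

# Fit the normal distribution and bin the data.
mu_hat, sigma_hat = x.mean(), x.std(ddof=1)
edges = np.quantile(x, np.linspace(0.0, 1.0, 9))  # 8 roughly equiprobable bins
observed, _ = np.histogram(x, bins=edges)

# Expected counts per bin under the fitted normal; the tail mass
# outside the extreme edges is folded into the first and last bins.
probs = np.diff(st.norm.cdf(edges, mu_hat, sigma_hat))
probs[0] += st.norm.cdf(edges[0], mu_hat, sigma_hat)
probs[-1] += st.norm.sf(edges[-1], mu_hat, sigma_hat)
expected = len(x) * probs

# ddof=2 accounts for the two estimated parameters (mean and st. dev.).
chi2_stat, p_value = st.chisquare(observed, f_exp=expected, ddof=2)
print(f"chi2 = {chi2_stat:.3f}, p = {p_value:.4f}")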
-
Estimating skewness and kurtosis
We have defined skewness and kurtosis as

skew(X) = E[(X − μ)³]/σ³,    kurt(X) = E[(X − μ)⁴]/σ⁴.

These definitions are related to higher-order moments of random variables. Just like expected value and variance, they are probabilistic definitions, and we should wonder if and how these measures can be estimated on the basis of sampled data. The “if” should not be a surprise. If we know that the sampled population…
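A quick sketch of the corresponding sample estimates on hypothetical data; scipy provides both the plain moment-based estimators and small-sample bias corrections.

import numpy as np
import scipy.stats as st

rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=0.5, size=500)  # hypothetical, right-skewed

# bias=False applies the usual small-sample correction;
# fisher=False returns plain kurtosis (3 for a normal population),
# matching the definition above rather than excess kurtosis.
g1 = st.skew(x, bias=False)
g2 = st.kurtosis(x, fisher=False, bias=False)
print(f"sample skewness = {g1:.3f}, sample kurtosis = {g2:.3f}")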
-
Correlation analysis testing significance and potential dangers
To estimate the correlation coefficient ρ_XY between X and Y, we may just plug the sample covariance S_XY and the sample standard deviations S_X, S_Y into its definition, resulting in the sample coefficient of correlation, or sample correlation for short:

r_XY = S_XY / (S_X S_Y).

The factors n − 1 in S_XY, S_X, and S_Y cancel each other, and it can be proved that −1 ≤ r_XY ≤ +1, just like its probabilistic counterpart ρ_XY. Once again, we stress that the…
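To illustrate the computation and the significance test on hypothetical data: under H₀: ρ_XY = 0 (and joint normality), the statistic T = r_XY √(n − 2)/√(1 − r_XY²) follows a t distribution with n − 2 degrees of freedom, and the manual calculation below should agree with scipy's pearsonr.

import numpy as np
import scipy.stats as st

rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = 0.6 * x + rng.normal(scale=0.8, size=30)  # hypothetical correlated pair
n = len(x)

# The n - 1 factors cancel, so sums of deviations suffice.
sxy = np.sum((x - x.mean()) * (y - y.mean()))
r = sxy / np.sqrt(np.sum((x - x.mean()) ** 2) * np.sum((y - y.mean()) ** 2))

# t statistic for H0: rho = 0, with n - 2 degrees of freedom.
t_stat = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)
p_manual = 2 * st.t.sf(abs(t_stat), df=n - 2)

r_scipy, p_scipy = st.pearsonr(x, y)
print(f"r = {r:.3f}, p = {p_manual:.4f} (scipy: r = {r_scipy:.3f}, p = {p_scipy:.4f})")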
-
Estimating covariance and related issues
Just as we have defined sample variance, we may define the sample covariance S_XY between random variables X and Y:

S_XY = (1/(n − 1)) Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ),

where n is the size of the sample, i.e., the number of observed pairs (X_i, Y_i). Sample covariance can also be rewritten as follows:

S_XY = (1/(n − 1)) (Σ_{i=1}^{n} X_i Y_i − n X̄ Ȳ).

To see this, we note the following:

Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ) = Σ X_i Y_i − X̄ Σ Y_i − Ȳ Σ X_i + n X̄ Ȳ = Σ X_i Y_i − n X̄ Ȳ.

This rewriting mirrors the relationship σ_XY = E[XY] − μ_X μ_Y from probability theory. It is important to realize that our…
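A small numeric check of the two expressions on hypothetical data; note that numpy's cov divides by n − 1 by default, matching the definition above.

import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 2.0 * x + rng.normal(size=50)  # hypothetical paired observations
n = len(x)

# Definition: sum of cross-deviations over n - 1.
s_xy_def = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

# Rewritten form: (sum of products - n * xbar * ybar) / (n - 1).
s_xy_alt = (np.sum(x * y) - n * x.mean() * y.mean()) / (n - 1)

print(s_xy_def, s_xy_alt, np.cov(x, y)[0, 1])  # all three agree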
-
Estimating and testing proportions
So far, we have mostly relied on normality of observations or, at least approximately, on normality of the sample mean for large samples. However, there are cases in which we should be a bit more specific and devise approaches which are in tune with the kind of observations we are taking. Such a case occurs…
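For instance, the large-sample confidence interval for a proportion relies on the normal approximation p̂ ± z_{1−α/2} √(p̂(1 − p̂)/n); a sketch with hypothetical counts follows.

import numpy as np
import scipy.stats as st

successes, n = 64, 200          # hypothetical: 64 "yes" answers out of 200
p_hat = successes / n
alpha = 0.05
z = st.norm.ppf(1 - alpha / 2)  # about 1.96 for a 95% interval

half_width = z * np.sqrt(p_hat * (1 - p_hat) / n)
print(f"p_hat = {p_hat:.3f}, "
      f"95% CI = ({p_hat - half_width:.3f}, {p_hat + half_width:.3f})")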
-
Estimating and testing variance
It is easy to prove that the sample variance S² is an unbiased estimator of the variance σ², but if we want a confidence interval for the variance, we need distributional results on S², which depend on the underlying population. For a normal population we may take advantage of Theorem 9.4. In particular, we recall that the sample variance is related to the…
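Theorem 9.4 presumably refers to the standard result that, for a normal sample, (n − 1)S²/σ² has a chi-square distribution with n − 1 degrees of freedom, which yields the interval ((n − 1)S²/χ²_{1−α/2, n−1}, (n − 1)S²/χ²_{α/2, n−1}); a sketch on hypothetical data:

import numpy as np
import scipy.stats as st

rng = np.random.default_rng(3)
x = rng.normal(loc=5.0, scale=2.0, size=25)  # hypothetical normal sample
n = len(x)
s2 = x.var(ddof=1)  # unbiased sample variance
alpha = 0.05

# The chi-square quantiles flip: the upper quantile gives the lower bound.
lo = (n - 1) * s2 / st.chi2.ppf(1 - alpha / 2, df=n - 1)
hi = (n - 1) * s2 / st.chi2.ppf(alpha / 2, df=n - 1)
print(f"S^2 = {s2:.3f}, 95% CI for sigma^2 = ({lo:.3f}, {hi:.3f})")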
-
Testing hypotheses about the difference in the mean of two populations
Sometimes, we have to run a test concerning two (or more) populations. For instance, we could wonder whether two markets for a given product really differ in terms of expected demand. Alternatively, after the re-engineering of a business process, we could wonder whether the new performance measures are significantly different from the old ones.…
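As a sketch of the two-population case on hypothetical demand data, Welch's t-test (which does not assume equal variances) is one standard choice; the source text may well use a different variant.

import scipy.stats as st

# Hypothetical demand observations from two markets.
market_a = [102, 98, 110, 105, 99, 107, 103]
market_b = [95, 92, 101, 97, 94, 96, 93]

# Welch's t-test: H0 states that the two population means are equal.
t_stat, p_value = st.ttest_ind(market_a, market_b, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")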
-
BEYOND THE MEAN OF ONE POPULATION
In this section we generalize what we have seen so far concerning the expected value of a normal population. We consider the following problems: … As you can see, this is a rather long list of topics. Because of space limitations, we will take a somewhat “cookbook” approach, cutting some corners to keep the treatment to…