Category: Inferential Statistics

  • Correlation analysis: testing significance and potential dangers

    To estimate the correlation coefficient ρXY between X and Y, we may just plug sample covariance SXY and sample standard deviations SX, SY into its definition, resulting in the sample coefficient of correlation, or sample correlation for short: rXY = SXY / (SX SY). The factors n − 1 in SXY, SX, and SY cancel each other, and it can be proved that −1 ≤ rXY ≤ +1, just like its probabilistic counterpart ρXY. Once again, we stress that the…
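
    As a minimal sketch of the plug-in estimate described above, the snippet below computes the sample correlation from sums of deviations. Because the factors n − 1 cancel, they never appear in the code. The function name `sample_correlation` and the data are illustrative assumptions, not taken from the original post.

```python
from math import sqrt
from statistics import mean


def sample_correlation(xs, ys):
    """Sample correlation r_XY = S_XY / (S_X * S_Y).

    The n - 1 factors in S_XY, S_X and S_Y cancel, so only the raw
    sums of deviation products are needed.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal size, at least 2")
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired observations (X_i, Y_i), made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = sample_correlation(xs, ys)
print(f"r_XY = {r:.4f}")
```

    Whatever the data, the result should fall between −1 and +1, and an exactly linear increasing relationship should give a value of +1 up to rounding error.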

  • Estimating covariance and related issues

    Just as we have defined sample variance, we may define sample covariance SXY between random variables X and Y: SXY = [1/(n − 1)] Σ (Xi − X̄)(Yi − Ȳ), where n is the size of the sample, i.e., the number of observed pairs (Xi, Yi). Sample covariance can also be rewritten as follows: SXY = [1/(n − 1)] (Σ XiYi − n X̄ Ȳ). To see this, we note the following: Σ (Xi − X̄)(Yi − Ȳ) = Σ XiYi − X̄ Σ Yi − Ȳ Σ Xi + n X̄ Ȳ = Σ XiYi − n X̄ Ȳ. This rewriting mirrors the relationship σXY = E[XY] − μX μY from probability theory. It is important to realize that our…
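
    The two forms of sample covariance mentioned above can be checked numerically. The sketch below assumes the usual definition with divisor n − 1 and the shortcut form based on the sum of products; both function names and the data are hypothetical.

```python
from statistics import mean


def sample_covariance(xs, ys):
    """Definition: sum of products of deviations, divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal size, at least 2")
    xbar, ybar = mean(xs), mean(ys)
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)


def sample_covariance_shortcut(xs, ys):
    """Rewritten form: (sum of X_i * Y_i - n * Xbar * Ybar) / (n - 1)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal size, at least 2")
    xbar, ybar = mean(xs), mean(ys)
    return (sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar) / (n - 1)


# Hypothetical paired observations, made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(sample_covariance(xs, ys), sample_covariance_shortcut(xs, ys))
```

    The two results agree up to floating-point rounding. With large values and small spread, the shortcut form can lose precision through cancellation, so the deviation form is usually the safer choice in numerical work.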

  • Estimating and testing proportions

    So far, we have mostly relied on normality of observations or, at least approximately, on normality of the sample mean for large samples. However, there are cases in which we should be a bit more specific and devise approaches which are in tune with the kind of observations we are taking. Such a case occurs…

  • Estimating and testing variance

    It is easy to prove that sample variance S² is an unbiased estimator of variance σ², but if we want a confidence interval for variance, we need distributional results on S², which depend on the underlying population. For a normal population we may take advantage of Theorem 9.4. In particular, we recall that the sample variance is related to the…
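
    The unbiasedness claim can be illustrated, though not proved, by simulation. The sketch below draws many small normal samples and averages two estimates of variance: S² with divisor n − 1, and the alternative with divisor n. The function name, sample size, σ, replication count, and seed are all arbitrary assumptions chosen for illustration.

```python
import random
from statistics import mean


def average_variance_estimates(n=5, sigma=2.0, replications=20000, seed=42):
    """Average the n - 1 and n divisor variance estimates over many samples."""
    rng = random.Random(seed)
    unbiased, biased = [], []
    for _ in range(replications):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        xbar = mean(sample)
        ss = sum((x - xbar) ** 2 for x in sample)  # sum of squared deviations
        unbiased.append(ss / (n - 1))  # sample variance S^2
        biased.append(ss / n)          # divisor n underestimates on average
    return mean(unbiased), mean(biased)


avg_unbiased, avg_biased = average_variance_estimates()
print(f"true variance: 4.0, avg S^2: {avg_unbiased:.3f}, avg with divisor n: {avg_biased:.3f}")
```

    With these settings the average of S² should land close to the true variance σ² = 4, while the divisor-n estimate should average roughly (n − 1)/n of that.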

  • Testing hypotheses about the difference in the mean of two populations

    Sometimes, we have to run a test concerning two (or more) populations. For instance, we could wonder if two markets for a given product are really different in terms of expected demand. Alternatively, after the re-engineering of a business process, we could wonder whether the new performance measures are significantly different from the old ones.…

  • BEYOND THE MEAN OF ONE POPULATION

    In this section we generalize what we have seen so far concerning the expected value of a normal population. We consider the following problems: As you can see, this is a rather long list of topics. Because of space limitations, we will take a somewhat “cookbook” approach, cutting some corners to keep the treatment to…

  • Testing with p-values

    In the manufacturing example of the previous section we found such a large value for the test statistic that we are quite confident that the null hypothesis should be rejected, whatever significance level we choose. In other cases, finding a suitable value of α can be tricky. Recall that the larger the value of α, the easier it…

  • One-tail tests

    When the null hypothesis is of the form H0 : μ = μ0, we consider a two-tail rejection region. In many problems, null hypotheses of the form H0 : μ ≥ μ0 or H0 : μ ≤ μ0 are more appropriate. As one could expect, this leads to a rejection region consisting of one tail. Before illustrating the technicalities involved, it is useful to consider a practical example. Example 9.14 A firm…

  • HYPOTHESIS TESTING

    The need for testing a hypothesis about an unknown parameter arises from many problems related to inferential statistics. There are general and powerful ways to build appropriate procedures for testing hypotheses, which we outline in Section 9.10. Since they do require some level of mathematical sophistication, we offer here an elementary treatment that is strongly linked…

  • Setting the sample size

    From a qualitative perspective, the form of the confidence interval (9.10) suggests the following observations: The last statement is quite relevant, and is related to an important issue. So far, we have considered a given sample and we have built a confidence interval. However, sometimes we have to go the other way around: Given a…