Author: haroonkhan
-
THE AXIOMATIC APPROACH
The axiomatic approach aims at building a consistent theory of probability and is based on the following logical steps:
5.2.1 Sample space and events
To get going, we should first formalize a few concepts about running a random experiment and observing outcomes. The set of possible outcomes is called the sample space, denoted by Ω. For…
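As a small illustration of the sample-space idea, the sketch below models a single roll of a six-sided die. The die, the variable names, and the particular events are assumptions chosen for the example, not taken from the excerpt; events are represented as subsets of Ω, which is the usual convention.

```python
# Sample space for one roll of a six-sided die (an assumed example).
omega = frozenset(range(1, 7))

# Events are modeled as subsets of the sample space.
even = frozenset(o for o in omega if o % 2 == 0)
at_least_five = frozenset(o for o in omega if o >= 5)

# Set operations give new events.
even_or_high = even | at_least_five    # union: "even OR at least five"
even_and_high = even & at_least_five   # intersection: "even AND at least five"
not_even = omega - even                # complement of "even" within omega

print(sorted(even_or_high), sorted(even_and_high), sorted(not_even))
```

Representing events as sets makes union, intersection, and complement available directly as set operations.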
-
DIFFERENT CONCEPTS OF PROBABILITY
We have met relative frequencies, a fundamental concept in descriptive statistics. Intuitively, relative frequencies can be interpreted as “probabilities” in some sense, as they should tell us something about the likelihood of events. While this is legitimate and quite sensible in many settings, we should wonder whether this frequentist interpretation is the only meaning that we may…
-
Introduction
The probability theory, and this one is no exception. However, the careful reader should wonder why the title mentions probability theories. In Section 5.1 we show that probability, like uncertainty, is a rather elusive concept. Descriptive statistics suggests the concept of probabilities as relative frequencies, but we may also interpret probability as plausibility related to a state of belief. The…
-
MULTIDIMENSIONAL DATA
So far, we have considered the organization and representation of data in one dimension, but in applications we often observe multidimensional data. Of course, we may list summary measures for each single variable, but this would miss an important point: the relationship between different variables. In issues concerning independence, correlation, etc. Here we want to…
-
Quartiles and boxplots
Among the many percentiles, a particular role is played by the quartiles, denoted by Q1, Q2, and Q3, corresponding to 25%, 50%, and 75%, respectively. Clearly, Q2 is simply the median. A look at these values and the mean tells a lot about the underlying distribution. Indeed, the interquartile range has been proposed as a measure of dispersion, and an alternative measure…
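A minimal sketch of computing the three quartiles and the interquartile range with Python's standard `statistics` module. The data set is made up for illustration, and `method="inclusive"` is only one of several quartile conventions in use; other conventions can give slightly different values.

```python
from statistics import median, quantiles

# Hypothetical observations, chosen only for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into four groups and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# Interquartile range: the spread of the middle half of the data.
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

With this data, Q2 coincides with the median, as the excerpt notes.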
-
CUMULATIVE FREQUENCIES AND PERCENTILES
The median m is a value such that 50% of the observed values are smaller than or equal to it. In this section we generalize the idea to an arbitrary percentage. We could ask which value is such that 80% of the observations are smaller than or equal to it. Or, seeing things the other way around,…
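The question "which value is such that 80% of the observations are smaller than or equal to it" can be sketched as follows. The function name `percentile`, the data, and the rule of returning the smallest observed value that reaches the requested share are assumptions for this example; textbooks and libraries differ on interpolation between observations.

```python
import math

def percentile(values, p):
    """Return the smallest observed value such that at least
    p percent of the observations are <= it (0 < p <= 100)."""
    ordered = sorted(values)
    # 1-based position at which the cumulative share first reaches p percent.
    k = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[k - 1]

# Hypothetical observations, chosen only for illustration.
observations = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]

p80 = percentile(observations, 80)
share = sum(1 for x in observations if x <= p80) / len(observations)
print(p80, share)
```

The final line checks the defining property directly by counting the share of observations that do not exceed the returned value.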
-
Dispersion measures
Location measures do not tell us anything about the dispersion of data. We may have two distributions sharing the same mean, median, and mode, yet they are quite different. Figure 4.8 illustrates the importance of dispersion in discerning the difference between distributions sharing location measures. One possible way to characterize dispersion is by measuring the range X(n) − X(1), i.e.,…
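The sketch below uses two made-up data sets that share the same mean and median but differ in range. The helper `value_range` and both data sets are assumptions for this example.

```python
from statistics import mean, median

def value_range(values):
    """Range: largest observation minus smallest observation."""
    return max(values) - min(values)

# Two hypothetical samples with identical mean and median.
narrow = [4, 5, 5, 5, 6]
wide = [1, 5, 5, 5, 9]

for name, sample in (("narrow", narrow), ("wide", wide)):
    print(name, mean(sample), median(sample), value_range(sample))
```

The location measures agree for both samples, so only the range distinguishes them here.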
-
Location measures: mean, median, and mode
We are all familiar with the idea of taking averages. Indeed, the most natural location measure is the mean. DEFINITION 4.5 (Mean for a sample and a population) The mean for a population of size n is defined as $\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$. The mean for a sample of size n is $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. The two definitions above may seem somewhat puzzling,…
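A short sketch of the three location measures named in the title, using Python's standard `statistics` module. The data are invented for illustration, and the explicit sum-divided-by-n line assumes the usual arithmetic-mean formula.

```python
from statistics import mean, median, mode

# Hypothetical observations, chosen only for illustration.
data = [2, 3, 3, 5, 7]

# Arithmetic mean computed from its definition: sum of values divided by n.
manual_mean = sum(data) / len(data)

location = {
    "mean": mean(data),      # arithmetic average
    "median": median(data),  # middle value of the sorted data
    "mode": mode(data),      # most frequent value
}
print(manual_mean, location)
```

Here the mean is pulled above the median by the larger observations, a small reminder that the three measures need not coincide.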
-
SUMMARY MEASURES
A look at a frequency histogram tells us many things about the distribution of values of a variable of interest within a population or a sample. However, it would be quite useful to have a set of numbers capturing some essential features quantitatively; this is certainly necessary if we have to compare two histograms, since…
-
ORGANIZING AND REPRESENTING RAW DATA
We have introduced the basic concepts of frequencies and histograms in Section 1.2.1. Here we treat the same concepts in a slightly more systematic way, illustrating a few potential difficulties that may occur even with these very simple ideas. Imagine a car insurance agent who has collected the weekly number of accidents that occurred during the last…