Consider a firm that, on the basis of a set of variables measuring customer attributes, wishes to discriminate between purchasers and nonpurchasers of a product or service. Concretely, the firm has collected a sample of consumers and, given their attributes and observed behavior, wants to find a way to classify them. Two-group discriminant analysis aims at finding a function of the variables, called a discriminant function, that best separates the two groups. This sounds much like cluster analysis, but the mechanism is quite different:
- Cluster analysis relies on a measure of distance and tries to find groups such that distances within groups are small and distances between groups are large.
- Discriminant analysis relies on a discriminant function f(x), possibly a linear combination of the variables, and a threshold value γ: if f(x_a) ≤ γ, object a with attributes x_a is assigned to one group; if f(x_a) > γ, it is assigned to the other group.
Another fundamental difference is that, in discriminant analysis, the groups are known a priori: the group membership of each sampled object is observed and is used for learning the discriminant function. Discriminant analysis can be generalized to multiple groups, for both exploratory and confirmatory purposes.
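To make the threshold rule described above concrete, here is a minimal sketch in Python. It is not the only way to build a discriminant function: it assumes a linear f(x) = w·x fitted with Fisher's criterion, places γ halfway between the projected group means, and uses synthetic data with two made-up attributes. The function names `fit_linear_discriminant` and `classify` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def fit_linear_discriminant(X, y):
    """Fit f(x) = w @ x and a threshold gamma from labeled data (y in {0, 1}).

    Assumption: Fisher's linear discriminant, one possible choice of f.
    """
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-group scatter matrix
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    # Direction that best separates the projected group means
    w = np.linalg.solve(Sw, mu1 - mu0)
    # Assumption: threshold at the midpoint of the projected means
    gamma = 0.5 * (w @ mu0 + w @ mu1)
    return w, gamma

def classify(X, w, gamma):
    """Assign group 1 if f(x) > gamma, group 0 if f(x) <= gamma."""
    return (X @ w > gamma).astype(int)

# Synthetic sample: group membership is known a priori and used for learning.
rng = np.random.default_rng(0)
nonpurchasers = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(100, 2))
purchasers = rng.normal(loc=[3.0, 3.0], scale=1.0, size=(100, 2))
X = np.vstack([nonpurchasers, purchasers])
y = np.array([0] * 100 + [1] * 100)

w, gamma = fit_linear_discriminant(X, y)
accuracy = (classify(X, w, gamma) == y).mean()
print("weights:", w, "threshold:", gamma, "training accuracy:", accuracy)
```

Unlike a clustering routine, the fit uses the labels `y` directly; the attributes alone would not determine `w` and `gamma`. The accuracy printed here is measured on the training sample, so it is an optimistic indication of how the rule would perform on new customers.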