Introduction
If we have information about individuals, obtained from a number of variables, it is reasonable to ask whether these variables can be used to define membership of a group. Discriminant Analysis works by combining variables in such a way that the differences between predefined groups are maximised.
Consider a simple two-group example. The aim is to combine (weight) the variable scores in some way so that a single new composite variable, the discriminant score, is produced. One way of thinking about this is in terms of a recipe: changing the proportions (weights) of the ingredients will change the characteristics of the finished food.
At the end of the process it is hoped that each group will have a normal distribution of discriminant scores; the degree of overlap between the discriminant score distributions can then be used as a measure of the success of the technique.
The discriminant scores are calculated from a Discriminant Function which has the form:
D = w1Z1 + w2Z2 + w3Z3 + ... + wiZi

where

- D = discriminant score;
- wi = weighting (coefficient) for variable i;
- Zi = standardised score for variable i (a standardised variable has a mean of 0 and a standard deviation of 1).
Consequently a discriminant score is a weighted linear combination (sum) of the discriminating variables or predictors.
Standardising the predictors ensures that any scale differences between them are eliminated. Once standardised, the absolute weights (i.e. ignoring the sign) can be used to rank predictors in terms of their discriminating power, the largest weight being associated with the most powerful discriminating predictor. Predictors with large weights are those which contribute most to differentiating the groups.
[Continuing the recipe metaphor: the quality of the finished meal is very dependent on getting the proportions correct. However, absolute quantities are not a good guide to their contribution to the overall taste, since the ingredients may have different inherent strengths (e.g. compare garlic and onion). Standardising predictors removes this effect.]
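As a small numerical sketch of the formula above (not part of the original text: the data and the weights are made up for illustration, since in practice the weights are estimated by the analysis itself), the discriminant score can be computed by standardising each predictor and then forming the weighted sum:

```python
import numpy as np

# Hypothetical scores for six individuals on two predictor variables.
X = np.array([[12.0, 3.1],
              [15.0, 2.4],
              [11.0, 3.8],
              [20.0, 1.2],
              [22.0, 0.9],
              [19.0, 1.5]])

# Standardise each variable so it has a mean of 0 and a standard deviation of 1.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Illustrative weights w1, w2 (in a real analysis these are estimated
# so as to maximise the separation between the groups).
w = np.array([0.8, -0.5])

# Discriminant score D: a weighted linear combination of the
# standardised predictors, one score per individual.
D = Z @ w
print(D)
```

Because the predictors are standardised first, the weights 0.8 and -0.5 are directly comparable: the first predictor contributes more to separating the groups than the second, regardless of the original measurement scales.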
Graphical explanation
As with most other multivariate methods it is possible to present a pictorial explanation of the technique. The following example uses a very simple data set: 2 groups and 2 variables. A scatter graph can be plotted of the scores on these two variables.
Clearly the two groups can be separated by these two variables, but there is a large amount of overlap on each single axis (although the y variable is the 'better' discriminator). It is possible to construct a new axis which passes through the two group centroids ('means'), such that the groups do not overlap on the new axis.
The new axis represents a new variable which is a linear combination of x and y, i.e. it is a discriminant score. Obviously, with more than 2 groups or variables this graphical method becomes impossible. However, animation can demonstrate how some projections of data are able to separate data better than others.
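The projection described above can be sketched numerically (an illustration with simulated data, not the original example: the group locations, spread, and sample sizes are assumptions). Each point is projected onto the axis through the two group centroids, producing a single composite score from x and y:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two simulated groups of points in two dimensions (x, y) which
# overlap on each single axis but are separable jointly.
group_a = rng.normal(loc=[0.0, 0.0], scale=0.8, size=(50, 2))
group_b = rng.normal(loc=[2.0, 2.0], scale=0.8, size=(50, 2))

# Unit vector along the axis through the two group centroids.
axis = group_b.mean(axis=0) - group_a.mean(axis=0)
axis /= np.linalg.norm(axis)

# Projecting each point onto this axis gives a single new score,
# a linear combination of x and y.
scores_a = group_a @ axis
scores_b = group_b @ axis

# Separation of the group means relative to the within-group spread,
# measured on the x axis alone and on the new axis.
def separation(a, b):
    return (b.mean() - a.mean()) / np.sqrt((a.var() + b.var()) / 2)

print(separation(group_a[:, 0], group_b[:, 0]))  # x alone
print(separation(scores_a, scores_b))            # new axis
```

The groups are better separated on the new axis than on either original variable, which is exactly what the discriminant function is constructed to achieve.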
It is possible to construct axes through n-dimensional (where n is 2 or more) space. This means that more than two variables and/or groups can be used. The mathematical basis of the method is summarised in the next section.