Clustering and Classification methods for Biologists

Discriminant Analysis

Intended Learning Outcomes

At the end of this section students should be able to complete the following tasks.



A common biological problem is identifying the features responsible for splitting a set of observations into two or more groups. For example, we may wish to distinguish between

If there is information about individuals (cases), obtained from a number of variables, it is reasonable to ask whether these variables can be used to define groups and/or predict the group to which an individual belongs. Discriminant analysis and logistic regression are two methods that achieve these aims.

Discriminant analysis is one of the simplest and most widely used classification methods. It was widely used in biology, but became less popular when a number of papers, particularly in ecological journals, questioned its validity for most analyses and suggested that logistic regression was a better alternative. However, in empirical tests discriminant analysis often emerges as one of the better classifiers.

Discriminant analysis works by creating a new variable which is a combination of the original predictors. This is done in such a way that the differences between the predefined groups, with respect to the new variable, are maximized. The most comprehensive text dealing with all aspects of discriminant analysis is Huberty's (1994) book. Note that group membership must be known before using discriminant analysis.
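The idea of a new variable that maximizes group differences can be illustrated with a minimal sketch of Fisher's linear discriminant for two groups. This is only an illustration under stated assumptions, not the procedure used by any particular statistics package: the data are simulated, the function names (fisher_weights, separation) are hypothetical, and "difference between groups" is taken to mean the squared difference in group means divided by the pooled within-group sum of squares.

```python
import numpy as np

def fisher_weights(X, groups):
    """Weights w such that the score X @ w best separates two known groups."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    if labels.size != 2:
        raise ValueError("this sketch handles exactly two groups")
    a = X[groups == labels[0]]
    b = X[groups == labels[1]]
    # Pooled within-group scatter (sums of squares and cross-products).
    within = np.zeros((X.shape[1], X.shape[1]))
    for g in (a, b):
        centred = g - g.mean(axis=0)
        within += centred.T @ centred
    # Fisher's solution: w is proportional to within^-1 (mean_b - mean_a).
    w = np.linalg.solve(within, b.mean(axis=0) - a.mean(axis=0))
    return w / np.linalg.norm(w)

def separation(scores, groups):
    """Squared difference in group means over within-group sum of squares."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    a = scores[groups == labels[0]]
    b = scores[groups == labels[1]]
    within = ((a - a.mean()) ** 2).sum() + ((b - b.mean()) ** 2).sum()
    return (b.mean() - a.mean()) ** 2 / within

# Simulated data (an assumption for illustration): 30 cases per group,
# two correlated predictors, groups differing only in the first predictor.
rng = np.random.default_rng(0)
cov = [[1.0, 0.8], [0.8, 1.0]]
X = np.vstack([
    rng.multivariate_normal([0.0, 0.0], cov, size=30),
    rng.multivariate_normal([1.0, 0.0], cov, size=30),
])
groups = np.array([0] * 30 + [1] * 30)  # group membership is known in advance

w = fisher_weights(X, groups)
scores = X @ w  # the new variable: one discriminant score per case

print("weights:", w)
print("separation on predictor 1:", separation(X[:, 0], groups))
print("separation on predictor 2:", separation(X[:, 1], groups))
print("separation on discriminant score:", separation(scores, groups))
```

Under this criterion the discriminant score separates the groups at least as well as either original predictor does alone, because each predictor is just one of the possible weightings that the method considers.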

In summary:


Details and examples

A. Some background
B. Mathematical description (brief!)
C. Sample analyses
D. Classification accuracy
E. Self assessment exercises



Huberty, C. J. 1994. Applied discriminant analysis. Wiley.