These nine questions assume that you are using the optimum tree. This tree has
a maximum depth of 6, with pruning enabled and the last 50 cases held back for testing.
Make sure that you have the correct settings before building the tree.
1
The first rule
If a location is 400 m from water the prediction is that a mountain sheep will be present.
In order to answer the next question you need to examine the rules in each
of the nodes.
2
Variables in rules
Which predictor featured in the smallest number of nodes?
The next question asks you to match the rules to the nodes. For example,
node 5 has the rule "slope < 65". Remember, though, that to reach node 5 a
case must first have passed through node 2. Therefore, cases in
node 5 have a value for water > 50 and a slope < 65.
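The way rules accumulate along a path can be sketched in a few lines of Python. This is only an illustration: the `NODES` table and the `path_rules` helper are hypothetical names, only nodes 1, 2 and 5 are included, and the node 5 rule is taken as quoted above.

```python
# Hypothetical fragment of a tree: each node stores its parent and the rule
# a case must satisfy to enter it. Only nodes 1, 2 and 5 are shown.
NODES = {
    1: {"parent": None, "rule": None},       # root node: no entry rule
    2: {"parent": 1, "rule": "water > 50"},  # taken from the example above
    5: {"parent": 2, "rule": "slope < 65"},  # node 5 rule as quoted above
}


def path_rules(node_id, nodes=NODES):
    """Return every rule on the path from the root down to node_id."""
    rules = []
    while node_id is not None:
        node = nodes[node_id]
        if node["rule"] is not None:
            rules.append(node["rule"])
        node_id = node["parent"]
    return list(reversed(rules))  # root-first order


print(" AND ".join(path_rules(5)))
```

The full description of a node is therefore the conjunction of its own rule and the rules of all its ancestors, which is what you need when matching rules to node numbers.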
3
Rules and Nodes
Match the rules to the Node numbers.
4
Nodes 7 to 10
Which feature are nodes 7 to 10 using to segregate cases?
Nodes are either non-leaf or leaf nodes. Leaf nodes are terminal nodes: they
are not split by any further rules.
5
Leaf nodes
Which of the following combinations are composed entirely of leaf nodes?
Leaf nodes can be 'pure' (all cases have the same class) or 'impure' (cases
are a mix of classes). The next question concerns an impure leaf node.
6
Node 16
Why was node 16 not split any further when it contains 10 absence and 8 presence locations?
Some nodes are very impure and contain a lot of cases. If a node is very
impure, one of the classes will not dominate by very much.
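One common way to put a number on impurity is the Gini index. The sketch below assumes Gini is an acceptable measure here; the software used to build the tree may use a different one, and the function name `gini` is just illustrative. It is applied to the class counts given for node 16 (10 absence, 8 presence).

```python
def gini(counts):
    """Gini impurity for a list of class counts (0 = pure node)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Node 16 from question 6: 10 absence and 8 presence locations.
print(round(gini([10, 8]), 3))  # 0.494

# A pure node, where every case belongs to one class.
print(gini([18, 0]))  # 0.0
```

With two classes the largest possible value is 0.5, so a value close to 0.5 means that neither class dominates.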
7
The confusion matrices summarise the performance of the tree, in accuracy
terms. The next question asks you to fill in the gaps.
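As a reminder of how the cells of a confusion matrix relate to each other, here is a minimal sketch. The counts are made up for illustration and are not the values from the sheep tree; `accuracy` and `missing_cell` are hypothetical helper names.

```python
# Made-up counts for 50 test cases (NOT the values from the exercise).
# Rows are the observed class, columns are the predicted class.
matrix = {
    "absent": {"absent": 20, "present": 5},
    "present": {"absent": 3, "present": 22},
}


def accuracy(cm):
    """Proportion of cases on the diagonal (correctly classified)."""
    correct = sum(cm[c][c] for c in cm)
    total = sum(sum(row.values()) for row in cm.values())
    return correct / total


def missing_cell(row_total, known_cell):
    """In a two-class matrix, a gap is the row total minus the known cell."""
    return row_total - known_cell


print(accuracy(matrix))      # 0.84
print(missing_cell(25, 20))  # 5
```

The same arithmetic works in reverse: if you know the accuracy and the total number of cases, you can recover the number of correct predictions.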
8
One of the reasons for building the tree is to make future predictions. For
example, predicting the locations where you might expect to see sheep. You
could test this by going out and collecting data which are then compared with
the predictions. Although you wouldn't normally make the predictions by hand,
it is a good test of your comprehension of the tree.
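Making a prediction by hand amounts to starting at the root and following one branch at each split until a leaf is reached. The sketch below shows that procedure on a small hypothetical tree. The thresholds, the class labels and the three cases are invented for illustration, it is not the sheep tree from this exercise, and it uses only two of the four predictors.

```python
# Hypothetical two-level tree, NOT the tree built in this exercise.
# A case goes "low" if its value is <= threshold, otherwise "high".
TREE = {
    "feature": "water", "threshold": 50,
    "low": {"label": "absent"},
    "high": {
        "feature": "slope", "threshold": 65,
        "low": {"label": "present"},
        "high": {"label": "absent"},
    },
}


def predict(case, node=TREE):
    """Follow the splits from the root until a leaf label is reached."""
    while "label" not in node:
        go_low = case[node["feature"]] <= node["threshold"]
        node = node["low"] if go_low else node["high"]
    return node["label"]


# Invented cases with an observed class to compare against.
cases = [
    {"water": 30, "slope": 40, "observed": "absent"},
    {"water": 120, "slope": 20, "observed": "present"},
    {"water": 200, "slope": 80, "observed": "present"},
]

correct = [predict(c) == c["observed"] for c in cases]
print(correct)  # [True, True, False]
```

A case is correctly classified when the leaf it ends up in predicts the class that was actually observed at that location.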
9
Making predictions
Using the values for water, slope, aspect and vegetation, use the tree to identify which cases would be correctly classified.