Simplifying Decision Trees
Summary and Critique by Michael Pouch
J.R. Quinlan presents techniques for simplifying decision trees while maintaining their accuracy. The purpose of the study is to examine four methods that make induced trees more compact and more comprehensible, so that the knowledge they encode is easier to apply.
The author begins by explaining that, in building knowledge-based systems, the ability to perform some task at an expert level does not necessarily confer a corresponding ability to articulate that know-how. The knowledge a "knowledge engineer" elicits therefore tends to state the expert's conclusions and reasoning only in broad terms. Quinlan accordingly examines four methods for simplifying the trees produced by induction. Three of the methods prune the decision tree by replacing one or more subtrees with leaves; the fourth reformulates the decision tree as a set of production rules.
The first method that Quinlan introduces is cost-complexity pruning. After growing a large decision tree, each internal node is evaluated by comparing the classification errors of its subtree against the errors that would result from collapsing that subtree to a single leaf, relative to the number of leaves removed. The subtree whose removal costs the least accuracy per pruned leaf is replaced by a leaf, and the process is repeated to generate a sequence of progressively smaller trees, from which the most accurate is selected. The basic idea of cost-complexity pruning is thus not to consider all possible pruned subtrees, but only those that are the "best of their kind" in this cost-per-leaf sense.
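The weakest-link step above can be sketched in Python. This is a minimal illustration, not code from the paper: the toy tree structure and the `leaf_errors` counts are invented for the example, and only one pruning step of the sequence is shown.

```python
# A minimal sketch of one cost-complexity pruning step on a toy tree.
# Each node records "leaf_errors": how many training cases it would
# misclassify if its subtree were collapsed to a single leaf.

def subtree_stats(node):
    """Return (total errors, number of leaves) of the subtree at node."""
    if not node.get("children"):
        return node["leaf_errors"], 1
    errors, leaves = 0, 0
    for child in node["children"]:
        e, l = subtree_stats(child)
        errors += e
        leaves += l
    return errors, leaves

def weakest_link(node, best=None):
    """Find the internal node with the smallest cost per pruned leaf:
       alpha = (errors as a leaf - errors of subtree) / (leaves - 1)."""
    if not node.get("children"):
        return best
    sub_err, sub_leaves = subtree_stats(node)
    alpha = (node["leaf_errors"] - sub_err) / (sub_leaves - 1)
    if best is None or alpha < best[0]:
        best = (alpha, node)
    for child in node["children"]:
        best = weakest_link(child, best)
    return best

# Toy tree: collapsing the root would misclassify 10 cases, etc.
tree = {"leaf_errors": 10, "children": [
    {"leaf_errors": 2, "children": [
        {"leaf_errors": 1}, {"leaf_errors": 0}]},
    {"leaf_errors": 3}]}

alpha, node = weakest_link(tree)
node.pop("children")        # prune the weakest-link subtree to a leaf
print(alpha, subtree_stats(tree))   # prints: 1.0 (5, 2)
```

Repeating this step until only the root remains yields the sequence of candidate trees from which the most accurate is chosen.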
The second method introduced is reduced error pruning. Using a separate set of cases, each candidate subtree is tentatively replaced by a single leaf. Nodes are removed by choosing, at each step, the node whose removal most increases the tree's accuracy on this pruning set. The pruning process continues until further pruning would be harmful.
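The greedy loop can be sketched as follows. This is a simplified illustration, not the paper's code: each candidate node here carries precomputed error counts on the pruning set, and the sketch ignores the way pruning one node changes its ancestors' counts.

```python
# A minimal sketch of reduced-error pruning. Each candidate node carries
# two counts measured on a held-out pruning set: the errors its subtree
# makes, and the errors it would make if collapsed to a single leaf.
# We repeatedly prune the node that most improves (or least hurts)
# accuracy, and stop when any further pruning would be harmful.

def reduced_error_prune(candidates):
    """candidates: dicts with 'name', 'subtree_errors', 'leaf_errors'.
       Returns the names of the pruned nodes, in pruning order."""
    pruned = []
    while True:
        # gain from pruning = errors saved by collapsing to a leaf
        best = max(candidates,
                   key=lambda c: c["subtree_errors"] - c["leaf_errors"],
                   default=None)
        if best is None or best["subtree_errors"] - best["leaf_errors"] < 0:
            break   # further pruning would hurt accuracy
        pruned.append(best["name"])
        candidates.remove(best)
    return pruned

nodes = [{"name": "A", "subtree_errors": 4, "leaf_errors": 2},
         {"name": "B", "subtree_errors": 3, "leaf_errors": 3},
         {"name": "C", "subtree_errors": 1, "leaf_errors": 5}]
print(reduced_error_prune(nodes))   # prints: ['A', 'B']
```

Node A is pruned first (it saves two errors), B is pruned because collapsing it costs nothing and yields a simpler tree, and C is kept because pruning it would add errors.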
The third method is called pessimistic pruning. It is a single top-down pass over the tree and, unlike reduced error pruning, needs no separate set of cases. The training-set error counts are adjusted upward by a continuity correction, and a subtree is replaced by a leaf whenever the leaf's corrected error is no worse than the subtree's corrected error would suggest, within one standard error.
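The test applied at a single node can be sketched as below. The correction of one half per leaf and the one-standard-error comparison follow Quinlan's description; the numbers in the example are invented.

```python
# A minimal sketch of the pessimistic test at one node. A subtree with
# L leaves making E training errors over N cases gets a corrected error
# of E + L/2; a leaf gets E + 1/2. The subtree is pruned when the
# leaf's corrected error is within one standard error of the subtree's.

import math

def should_prune(n_cases, subtree_errors, n_leaves, leaf_errors):
    corrected_subtree = subtree_errors + n_leaves / 2.0
    corrected_leaf = leaf_errors + 0.5
    se = math.sqrt(corrected_subtree
                   * (n_cases - corrected_subtree) / n_cases)
    return corrected_leaf <= corrected_subtree + se

# 100 cases: a subtree with 6 leaves makes 2 errors (corrected 5.0);
# collapsing it to a leaf would make 4 errors (corrected 4.5).
print(should_prune(100, 2, 6, 4))   # prints: True
```

Because the leaf's corrected error (4.5) is within one standard error of the subtree's (5.0), the subtree is replaced, even though its raw error count was lower.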
The fourth method reformulates the decision tree as production rules. Since a tree classifies a case by satisfying the conditions along the path from the root to a leaf, each path can be written as a rule whose left-hand side is the conjunction of those conditions; conditions that do not affect the rule's accuracy can then be dropped, generalizing the rule. Once generalized, each rule carries a certainty factor for the cases it covers. To classify a case, you find a rule that applies to it; if more than one applies, you choose the rule with the highest certainty factor. Overall, this method is a condition-elimination strategy.
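The classification step can be sketched as follows. The rule contents, attribute names, and certainty values here are illustrative only, not taken from the paper.

```python
# A minimal sketch of classifying a case with production rules: among
# the rules whose conditions all hold for the case, the rule with the
# highest certainty factor wins; if no rule applies, fall back to a
# default class.

def classify(case, rules, default="unknown"):
    applicable = [r for r in rules
                  if all(case.get(attr) == val
                         for attr, val in r["if"].items())]
    if not applicable:
        return default
    return max(applicable, key=lambda r: r["cf"])["then"]

# Hypothetical rules: the more specific rule has the higher certainty.
rules = [
    {"if": {"outlook": "sunny"},
     "then": "dont_play", "cf": 0.70},
    {"if": {"outlook": "sunny", "humidity": "low"},
     "then": "play", "cf": 0.85},
]

print(classify({"outlook": "sunny", "humidity": "low"}, rules))  # prints: play
```

Both rules apply to the example case, so the one with the higher certainty factor (0.85) decides the class.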
Overall, J.R. Quinlan's intention is to suggest methods for simplifying decision trees without compromising their accuracy.
Overall, I find the first three methods easy to interpret and visualize. The pruning methods are helpful because they narrow the scope of the tree to fit the overall validation goal; when the aim is a compact concept description that is still sufficiently accurate, pruning is highly useful. However, the process involves a great deal of swapping of conditions. Decision trees perform well when a few highly relevant attributes exist, but less so when many complex interactions are present. And since most decision trees divide the instance space into mutually exclusive regions to represent a concept, in some cases the tree must contain several duplications of the same subtree to represent the classifier. To conclude, decision trees provide a framework for considering the probabilities and payoffs of decisions, helping the analyst evaluate the likely course of action; however, it is impossible to plan for all contingencies in their outcomes. And though the process is simplified, the paper offers little statistical evidence that it actually works.
Quinlan, J. R. (n.d.). Simplifying Decision Trees. Massachusetts Institute of Technology Artificial Intelligence Laboratory. Retrieved from https://dspace.mit.edu/bitstream/handle/1721.1/6453/AIM-930.pdf?sequence=2