Tuesday, March 12, 2013

Phases vs. Levels Using Decision Trees for Intrusion Detection Systems

An article in the International Journal of Computer Science and Information Security by Heba Ezzat Ibrahim, Sherif M. Badr, and Mohamed A. Shaheen compares phase decision trees to level decision trees.  The authors state that decision trees are a useful and commonly used tool for detecting intrusions into computer networks.  A decision tree breaks the data down by its attributes, and each resulting node is then assigned a detection percentage.

For their paper, they compared phase and level models.  The phase model is divided into three stages: the first detects whether the incoming data is normal or an attack, the second detects whether it is a DoS, probe, R2L (remote to local), or U2R (user to root) attack, and the third identifies the specific attack type within the category found in the second.  Unlike the level model, this model runs its steps sequentially, so each stage acts on the output of the one before it.

The level model arranges each stage as a separate process that detects attacks on its own and tries to label them regardless of whether the previous step has completed.  Instead of treating detection as one process (a single tree), it treats each phase as a separate section (three trees).  This allows it to catch false negatives that the phase model may miss in its first step.
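
To make the difference between the two arrangements concrete, here is a minimal Python sketch.  It is not the authors' implementation: the three functions are hand-written rules standing in for trained decision trees, and the feature names (`failed_logins`, `ports_scanned`) and thresholds are invented for illustration.  The labels `guess_passwd` and `portsweep` are borrowed from the KDDCUP'99 label set only as examples.

```python
from typing import Dict, Optional

Record = Dict[str, float]

# Hypothetical rules standing in for trained decision trees.
# Feature names and thresholds are assumptions, not taken from the paper.
def detect_attack(r: Record) -> str:
    """Stage 1: normal vs. attack (deliberately blind to port scans)."""
    return "attack" if r["failed_logins"] > 3 else "normal"

def detect_category(r: Record) -> str:
    """Stage 2: attack category, or normal."""
    if r["failed_logins"] > 3:
        return "R2L"
    if r["ports_scanned"] > 20:
        return "Probe"
    return "normal"

def detect_type(r: Record) -> str:
    """Stage 3: specific attack type, or normal."""
    if r["failed_logins"] > 3:
        return "guess_passwd"
    if r["ports_scanned"] > 20:
        return "portsweep"
    return "normal"

def phase_model(r: Record) -> Dict[str, Optional[str]]:
    """Sequential: a later stage runs only if the earlier one flagged the record."""
    out: Dict[str, Optional[str]] = {"stage1": detect_attack(r), "stage2": None, "stage3": None}
    if out["stage1"] == "attack":
        out["stage2"] = detect_category(r)
        if out["stage2"] != "normal":
            out["stage3"] = detect_type(r)
    return out

def level_model(r: Record) -> Dict[str, Optional[str]]:
    """Independent: every stage sees every record, whatever the others said."""
    return {"stage1": detect_attack(r), "stage2": detect_category(r), "stage3": detect_type(r)}

# A port scan that the first stage misses (a false negative).
scan = {"failed_logins": 0.0, "ports_scanned": 50.0}
print(phase_model(scan))
print(level_model(scan))
```

In this toy case the phase model stops after the first stage calls the record normal, while the level model still labels it as a probe at the second and third stages.  That is the false-negative recovery described above, at the cost of running every tree on every record.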

The authors found that the phase model detects threats more frequently than the level model.  Additionally, they found that the phase approach classifies new attacks more frequently.  The level model does show more 100% detection rates than the phase model, but its detection rates are not higher on average.  Not only does the phase model show better consistency, it also reflects how real-world attack prevention software processes an incoming threat through logical steps instead of trying all options simultaneously.
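
It may seem odd that a model can reach 100% detection more often and still fail to come out ahead on average.  The short Python sketch below shows how that can happen.  The numbers are invented purely to illustrate the arithmetic and are not the paper's results; `model_a`, `model_b`, and the class names are hypothetical.

```python
from statistics import mean
from typing import Dict, Tuple

def summarize(rates: Dict[str, float]) -> Tuple[int, float]:
    """Return (number of classes detected at 100%, mean detection rate)."""
    perfect = sum(1 for r in rates.values() if r == 100.0)
    return perfect, mean(rates.values())

# Invented per-class detection rates in percent; NOT the paper's data.
model_a = {"type1": 100.0, "type2": 100.0, "type3": 40.0, "type4": 60.0}
model_b = {"type1": 100.0, "type2": 90.0, "type3": 85.0, "type4": 85.0}

print(summarize(model_a))  # (2, 75.0)
print(summarize(model_b))  # (1, 90.0)
```

Here the first model has more perfect scores but a lower mean, because its weak classes pull the average down.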

The authors successfully explained how a network attack can be detected through the use of decision trees.  I personally do not have any background in computer networks, but I did not find it too difficult to understand the reasoning for running the two separate models to compare consistency.  This topic does not specifically relate to the intelligence field; however, it does relate to cyber security through computer network defense.

Although the authors argue certain parts of this paper well, I found several issues.  The first is that the authors never clearly define the 23 types of attacks that are drawn from the second stage.  Without this information, I find it difficult to trust their results when I cannot tell exactly what they are detecting.  They also do not thoroughly describe how they run the data, other than noting that the records are either new attacks or partitioned data.  Additionally, the authors state that the data set they use (the KDDCUP'99 data set) is the best available but has some inherent problems.  These problems are never explained well (although the authors did eliminate duplicate entries); instead, the authors justify their choice by noting that other people have used the same data set.

Ibrahim, Heba Ezzat, Sherif M. Badr, and Mohamed A. Shaheen. (2012). Phases vs. Levels using Decision Trees for Intrusion Detection Systems. International Journal of Computer Science and Information Security, 10.8.  Retrieved from http://arxiv.org/ftp/arxiv/papers/1208/1208.5997.pdf  


  1. This comment has been removed by the author.

  2. Cori, I agree with a number of things you mentioned in your critique. It also seems that the phase model and the level model each have their own advantages and disadvantages. While the phase model is a good method to organize data, the level model goes a step further by detecting false negatives. The ability to eliminate false negatives can be very helpful to intelligence analysts because it is likely to reduce uncertainty. Because large volumes of data often need to be analyzed, detecting false negatives may give analysts and decision makers more time to formulate accurate decisions based on relevant intelligence.

  3. Cori, that's an interesting paper. Similar to your critique, I found it helpful to have the process diagrammed out so directly -- it was easy to understand without a previous background in computers. While this is not directly embedded in the Intel field, I agree with you that it is definitely applicable to the Intel field -- especially with the increasing concern related to the cyber world and the evolving threats.

  4. A decision tree is a visual model for decision making that represents consequences, including chance event outcomes, resource costs, and utility. It is also one way to display an algorithm that contains only conditional control statements. Making decision trees is easy with a decision tree maker that offers free templates.