Friday, September 15, 2017

Decision tree learning used for the classification of student archetypes in online courses

Alexandru Topîrceanu
Department of Computer and Information Technology, Politehnica University Timisoara, Bd. V. Parvan 2, 300223 Timisoara, Romania
Gabriela Grosseck
Faculty of Psychology and Sociology, West University Timisoara, Bd. V. Parvan 4, 300223 Timisoara, Romania

21st International Conference on Knowledge Based and Intelligent Information and Engineering Systems, KES2017, 6-8 September 2017, Marseilles, France


This paper examines student profiles using decision tree learning, a supervised learning technique. The researchers use responses to an online questionnaire to gather detailed opinions from 632 students in Romania on the advantages and disadvantages of Massive Open Online Courses (MOOCs), as well as their reasons for not joining online courses. Based on the extracted statistics, they present six decision trees for classifying the finalization and participation rates of online courses based on the students’ individual traits.

To support this direction of research in educational science, the authors rely on decision tree learning techniques to go beyond the simple statistical analysis and profiling of students done in eLearning. To improve the efficiency of eLearning, the study aims to define a set of archetypes that can be used to quickly assess any student, so that educators better understand students’ inner drives to participate in and finalize a course.

Decision tree learning links observations about entities (represented in the tree’s branches) to conclusions about the entities’ target value (represented in the tree’s leaf nodes). It is commonly used in data mining as a predictive modelling approach. Depending on the values the target variable can take, there are classification trees, where the target variable takes a finite set of values, and regression trees, where the target variable takes continuous values. This study uses only classification trees.
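To make the idea concrete, here is a minimal sketch of how a classification tree can be learned from categorical survey answers. It is not the authors' implementation: the summary does not say which algorithm they used, so the split criterion (Gini impurity), the feature names (`wants_certificate`, `has_time`), and the six toy respondents are all assumptions made up for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_feature(rows, labels, features):
    """Pick the feature whose split gives the lowest weighted impurity."""
    def weighted_impurity(feature):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        return sum(len(g) / len(labels) * gini(g) for g in groups.values())
    return min(features, key=weighted_impurity)

def build_tree(rows, labels, features):
    """Return a leaf (a class label) or an internal node (a dict)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    feature = best_feature(rows, labels, features)
    remaining = [f for f in features if f != feature]
    children = {}
    for value in {row[feature] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[feature] == value]
        sub_rows, sub_labels = zip(*subset)
        children[value] = build_tree(list(sub_rows), list(sub_labels), remaining)
    return {"feature": feature, "children": children, "default": majority}

def classify(tree, row):
    """Follow branches until a leaf is reached; unseen answers use the majority class."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]], tree["default"])
    return tree

# Hypothetical toy data, NOT taken from the paper's questionnaire.
rows = [
    {"wants_certificate": "yes", "has_time": "yes"},
    {"wants_certificate": "yes", "has_time": "no"},
    {"wants_certificate": "no", "has_time": "yes"},
    {"wants_certificate": "no", "has_time": "no"},
    {"wants_certificate": "no", "has_time": "no"},
    {"wants_certificate": "yes", "has_time": "no"},
]
labels = ["finished", "finished", "finished", "dropped", "dropped", "finished"]

tree = build_tree(rows, labels, ["wants_certificate", "has_time"])
print(classify(tree, {"wants_certificate": "no", "has_time": "no"}))  # dropped
```

Because the target here takes a finite set of values ("finished" or "dropped"), this is a classification tree; a regression tree would instead predict a continuous value at each leaf.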

The paper presents its results as six decision trees:

1. Finalization rate based on perceived advantages of online courses.
2. Finalization rate based on ongoing course evaluation.
3. Participation rate in online courses based on demographics.
4. Course completion based on perceived disadvantages of online courses.
5. Desire for free courses based on reasons for not participating in online courses.
6. Desire for certification based on reasons for not participating in online courses.

The paper describes each tree by defining student profiles for each of the unique branches in the tree. Thus, one or more leaf nodes represent a student profile, and the branches that lead to those nodes represent the characteristics of each profile.
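The mapping from branches to profiles can be sketched as a walk over every root-to-leaf path. The snippet below is only an illustration: the tree is hand-written, and its features and outcomes are hypothetical rather than taken from the paper.

```python
def extract_profiles(tree, path=()):
    """Yield (conditions, outcome) for every root-to-leaf path in the tree."""
    if not isinstance(tree, dict):
        # A leaf: the conditions collected along the way describe one profile.
        yield list(path), tree
        return
    for value, child in tree["children"].items():
        yield from extract_profiles(child, path + ((tree["feature"], value),))

# Hypothetical tree; internal nodes are dicts, leaves are outcome labels.
toy_tree = {
    "feature": "wants_certificate",
    "children": {
        "yes": "finished",
        "no": {
            "feature": "has_time",
            "children": {"yes": "finished", "no": "dropped"},
        },
    },
}

profiles = list(extract_profiles(toy_tree))
for conditions, outcome in profiles:
    description = " and ".join(f"{feature} = {value}" for feature, value in conditions)
    print(f"{outcome}: {description}")
```

Each printed line corresponds to one leaf. Leaves that share an outcome could be merged into a single profile, which matches the remark that one or more leaf nodes may represent a profile.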


This is a well-researched article that combines decision trees with other techniques to study the advantages and disadvantages of Massive Open Online Courses (MOOCs). It presents its final results with detailed explanations of how the decision trees were used. The results are based on an online questionnaire answered by 632 students, which is a fairly small sample, and all participants were from Romania. If the method is properly implemented, the knowledge extracted from this study could be very useful in improving course outcomes. It could also give decision makers a better understanding of students’ enrollment patterns and inform suggestions for structuring course offerings.



  1. The amalgamation of techniques in this study is quite impressive. I think this is a very useful study, as online courses are not one-size-fits-all for students. I agree with you that the results can be an eye-opener for decision makers in the educational system. The techniques appear to include machine learning. It will be interesting to see where this study leads in the near future.

  2. A good way to improve this study would be to run similar studies with a larger sample. As different countries have different education systems, tailoring the decision tree analysis to each education system would be a good way to produce a more accurate analysis.