A REAL TIME DATA MINING MODEL TO PREDICT ACADEMIC ATTRITION
Abstract
The quality of the education system is very important for a country's growth. Today the education sector faces several challenges; the major ones in higher education are a declining student success rate and students leaving a course without completing it. Early prediction of student failure can help avert poor performance and improve outcomes, benefiting not only current students but also future ones. Data mining provides powerful techniques for analysing student performance. In this dissertation, various educational data mining techniques, namely Naive Bayes, Decision Tree, K-Nearest Neighbour, Random Forest, Rpart and C5.0, are used to build a model of academic attrition based on students' social integration, academic integration and various emotional skills. For prediction through data mining techniques, data was collected from Mullana University. Data from the admission process are complemented with the academic information gathered for each academic period; however, the causes of low academic performance arise on a day-to-day basis, and waiting until the academic period ends may be too late. This suggests that new, and possibly non-traditional, ways of collecting information close to real time are needed. In this dissertation, new attributes are identified that represent real-time student academic attrition. The data mining techniques are implemented in R, an open-source language developed at the University of Auckland, New Zealand. R is an interactive language that supports easy input and output and large-scale data manipulation, and it is used for a wide range of statistical analysis and modelling. Many classification and regression algorithms are used; their applicability depends on the attribute type, with some suited to categorical or nominal data and others to numerical data. The experimental results are validated against test data, and interesting correlations are observed.
The accuracies of the models are compared to identify the one that gives the most accurate predictions. Graphs are used alongside numerical values to illustrate the comparison.
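The accuracy comparison described above can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the study; the labels, the model names (`model_a`, `model_b`, `model_c`) and all predictions are hypothetical placeholders, not data or results from the dissertation.

```python
# Hypothetical held-out test labels (illustrative only, not the study's data).
y_true = ["stay", "drop", "stay", "stay", "drop", "stay", "drop", "stay"]

# Hypothetical predictions from three placeholder classifiers on the same test set.
predictions = {
    "model_a": ["stay", "drop", "stay", "drop", "drop", "stay", "stay", "stay"],
    "model_b": ["stay", "stay", "stay", "stay", "drop", "drop", "stay", "stay"],
    "model_c": ["stay", "drop", "stay", "stay", "drop", "stay", "stay", "stay"],
}


def accuracy(actual, predicted):
    """Return the fraction of test cases where the prediction matches the label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Score every model against the same test labels.
scores = {name: accuracy(y_true, preds) for name, preds in predictions.items()}

# The model with the highest test accuracy on this toy data.
best_model = max(scores, key=scores.get)

# Print the models from most to least accurate.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
print("most accurate:", best_model)
```

The ranking printed here reflects only the made-up predictions. Accuracy alone can also be misleading when far fewer students drop out than stay, so a comparison of this kind is usually read together with per-class measures.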