Students' Academic Performance Prediction and Analysis Using Decision Tree Algorithms

A Literature Review from 2011 to 2014 on Students' Academic Performance Prediction and Analysis Using Decision Tree Algorithms

Abstract: The success of any educational institute depends on the success of its students. Predicting and analysing student performance is essential for improving student attributes such as final grades and attendance. Such prediction helps teachers identify weak students and improve their scores. Various data mining techniques, such as classification and clustering, are used to perform the analysis. In this paper, implementations of the decision tree algorithms ID3, J48/C4.5, Random Tree and Random Forest, together with Multilayer Perceptron and rule-based classifiers, are studied for student performance prediction and analysis. The WEKA tool is used to perform the evaluation, and either the percentage split method or the cross validation method is used to measure performance. The main objective of this analysis is to improve student performance. This review paper explores the use of various decision tree algorithms for predicting and analysing students' academic performance.

Keywords: EDM, Decision tree, J48, Random Tree, ID3, Multilayer Perceptron, CART, IB1.

I. INTRODUCTION

A. Data Mining and Educational Data Mining (EDM)

Data mining is the process of extracting useful information and patterns from large amounts of data. It is used to solve problems by analysing data present in databases. [1]

Educational Data Mining (EDM) is concerned with developing techniques for exploring the different types of data that come from educational settings, and with using those techniques to understand students better. The main uses of EDM include student performance prediction and the study of student learning in order to suggest improvements to current educational practice. [2]

B. Student Performance Prediction and Analysis

In student performance prediction, we predict the unknown value of a variable that describes the student. In the education sector, the most commonly predicted values are a student's performance, marks, knowledge or score. Student performance prediction is a very popular application of data mining in the education sector. Different techniques and models are applied to predict and analyse student performance, such as decision trees, neural networks, rule-based systems and Bayesian networks. This analysis helps in predicting student performance, i.e. predicting a student's success in a course and a student's final grade on the basis of features taken from logged data. [2] [3]

This paper is organized as follows. Section II presents work related to student performance prediction and analysis. Section III presents a comparative study of the surveyed work. The conclusion is presented in Section IV. Section V discusses future scope.

II. RELATED WORK

Considering the improvements required in students' grades or scores, a literature survey has been conducted on student performance prediction and analysis using decision tree algorithms.

Brijesh Kumar Baradwaj, Saurabh Pal [5] (2011) discussed how student performance is assessed through internal marks and final results. The study used a data set of 50 students taken from the MCA department of VBS Purvanchal University, Uttar Pradesh. Information such as previous semester marks, attendance, assignment marks and class test marks was collected from the existing student database. They used decision tree algorithms for student performance prediction and analysis. The study is intended to help faculty members improve students' scores in future examinations.

R. R. Kabra, R. S. Bichkar [11] (Dec. 2011) collected data from S.G.R. College of Engineering and Management, Maharashtra. The data covered 346 first-year engineering students. Evaluation was performed using the J48 algorithm with 10-fold cross validation. The accuracy of the J48 algorithm was 60.46 %. The model is successful in identifying the students who are likely to fail, so it can help improve student performance.

Surjeet Kumar Yadav, Saurabh Pal [6] (2012) analysed 90 students of the engineering department (session 2010) of VBS Purvanchal University, Uttar Pradesh. The ID3, C4.5 and CART decision tree algorithms were evaluated using the 10-fold cross validation method. C4.5 was found to have a higher accuracy, 67.7778 %, than the ID3 and CART algorithms. The model's True Positive rate for the class Fail is high, 0.786 for ID3 and C4.5, which means it successfully identifies most of the students who fail. The study is helpful for identifying students who need special attention from teachers.
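The True Positive rate for a class is the share of students who actually belong to that class and whom the model assigns to it. A minimal sketch of the calculation follows; the counts 11 and 3 are hypothetical, chosen only because 11/14 rounds to 0.786, and are not taken from the cited study.

```python
def true_positive_rate(true_positives, false_negatives):
    """Recall for one class: TP / (TP + FN)."""
    actual = true_positives + false_negatives
    if actual == 0:
        raise ValueError("no actual instances of the class")
    return true_positives / actual


# Hypothetical counts for the class Fail (assumption, not from the study):
# 11 failing students predicted as Fail, 3 failing students predicted otherwise.
fail_tp = 11
fail_fn = 3
fail_tpr = true_positive_rate(fail_tp, fail_fn)
print(round(fail_tpr, 3))
```

A rate close to 1 for the class Fail means few at-risk students are missed, which is the property that matters most when the goal is early intervention.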

Manpreet Singh Bhullar, Amritpal Kaur [10] (2012) used a data set of 1892 students from various colleges for student performance prediction and evaluation. The J48 algorithm was evaluated using 10-fold cross validation, with a success rate of 77.74 %. The approach helps identify weak students so that teachers can support them before they fail.

Mrinal Pandey, Vivek Kumar Sharma [4] (Jan. 2013) compared the J48, Simple Cart, REPTree and NB Tree algorithms for predicting the performance of engineering students. They used data from 524 students for 10-fold cross validation and from 178 students for the percentage split method. The J48 decision tree algorithm achieved the highest accuracy in both cases: 80.15 % with 10-fold cross validation and 82.58 % with the percentage split method. The comparison therefore shows that J48 performs better than the other algorithms under both test options. The J48 decision tree algorithm can be useful to teachers in improving the performance of weak students.

Anuja Priyam, Abhijeet, Rahul Gupta, Anju Rathee, and Saurabh Srivastava [12] (June 2013) compared the ID3, C4.5 and CART decision tree algorithms on students' data. Evaluation was performed using the 10-fold cross validation method. The CART algorithm had the highest accuracy, 56.2500 %. The model's True Positive rate for the class Fail is high, 0.786 for ID3 and C4.5, which means it successfully identifies most of the students who fail. The model can therefore help teachers reduce failure rates.

Ramanathan L, Saksham Dhanda, Suresh Kumar D [14] (June-July 2013) analysed data from 50 students. They used Naive Bayes, J48 and a proposed algorithm, Weighted ID3 (WID3), for evaluation. WID3 had a higher accuracy, 93 %, than J48 and Naive Bayes. In future, user-friendly software could be built on WID3, which would be very helpful for teachers.

Kalpesh Adhatrao, Aditya Gaykar, Amiraj Dhawan, Rohit Jha and Vipul Honrao [7] (September 2013) analysed a data set of 182 students using the ID3 and C4.5 decision tree algorithms. In bulk evaluation on a data set of 173 students, both algorithms had the same accuracy of 75.145 %; in singular evaluation on a data set of 9 students, both algorithms had an accuracy of 77.778 %. For all 182 students the accuracy was about 75.257 %.
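The combined figure can be checked by pooling the correct predictions from the two evaluations rather than averaging the two percentages. The sketch below assumes 130 of 173 and 7 of 9 correct predictions; these counts are inferred here from the reported percentages and are not stated in the source. Under that assumption the pooled accuracy comes to roughly 75.27 %, close to the value reported above.

```python
def pooled_accuracy(results):
    """Accuracy in percent over several (correct, total) evaluations combined."""
    correct = sum(c for c, _ in results)
    total = sum(n for _, n in results)
    return 100.0 * correct / total


# (correct, total) pairs; the correct counts are assumptions inferred from
# the reported 75.145 % (bulk) and 77.778 % (singular) accuracies.
evaluations = [(130, 173), (7, 9)]

for correct, total in evaluations:
    print(total, round(100.0 * correct / total, 3))

print(sum(n for _, n in evaluations), round(pooled_accuracy(evaluations), 3))
```

Pooling weights each evaluation by its number of students, which is why the combined value stays close to the bulk result.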

Mrs. M.S. Mythili, Dr. A.R. Mohamed Shanavas [9] (Jan. 2014) compared the J48, Random Forest, Multilayer Perceptron, IB1 and decision tree algorithms using a data set of 260 students from various schools. 10-fold cross validation was chosen for evaluation. Random Forest was found to have the highest accuracy, 89.23 %, and the lowest execution time among the algorithms compared. The study is intended to be helpful for educational institutions.

Jyoti Namdeo, Naveenkumar Jayakumar [13] (Feb. 2014) collected data on 51 students from the MCA 2007 batch. The algorithms evaluated were Naive Bayes, Multilayer Perceptron, J48 and Random Forest. These algorithms were trained on the 2007 batch data and tested on the 2008 batch data. Evaluation was performed using the training set, cross validation, percentage split and a test on the 2008 data. On the 2008 data, Naive Bayes had the highest accuracy among the algorithms, 31.57 %, but this accuracy does not meet the requirement.

Azwa Abdul Aziz, Nor Hafieza Ismail and Fadhilah Ahmad [8] (September 2014) analysed 399 student records using Naive Bayes, rule-based and J48 decision tree algorithms. They used the cross validation and percentage split methods for evaluation. For cross validation, 3-, 5- and 10-fold cross validation was performed; for the percentage split method, training:testing splits of 10:90, 20:80, 30:70, 40:60, 50:50, 40:60, 30:70, 20:80 and 10:90 per cent were used. After comparing the three classification algorithms, the rule-based and J48 decision tree algorithms were found to have the highest accuracy, 68.8 %.
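The two test options used throughout the surveyed work differ only in how records are divided between training and testing. The following is a minimal sketch of both, not WEKA's implementation: the function names, the fixed seed and the 70 per cent training share are illustrative assumptions, and the folds are not stratified by class.

```python
import random


def k_fold_indices(n_records, k, seed=0):
    """Yield (train, test) index lists for k-fold cross validation."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def percentage_split(n_records, train_percent, seed=0):
    """Return (train, test) index lists for a single percentage split."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    cut = round(n_records * train_percent / 100)
    return indices[:cut], indices[cut:]


# 399 records as in the study above; 70 is an illustrative training share.
for train, test in k_fold_indices(399, 10):
    print(len(train), len(test))

train, test = percentage_split(399, 70)
print(len(train), len(test))
```

In cross validation every record is used for testing exactly once, so the reported accuracy is an average over k models; a percentage split trains a single model and its accuracy depends more strongly on which records land in the test portion.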

III. COMPARATIVE STUDY OF SURVEY

Table 1. Comparison of surveyed work based on different parameters

| Paper Name | Year of Publication | Size of Data Set (No. of students) | Algorithms Used | Test Options Used | Algorithm with Higher Accuracy | Accuracy (in %) of Algorithm |
| Performance Prediction of Engineering Students using Decision Trees | Dec. 2011 | 346 | J48 | Cross Validation | J48 | 60.46 % |
| Data Mining: A Prediction for Performance Improvement of Engineering Students using Classification | 2012 | 90 | ID3, C4.5, CART | Cross Validation | C4.5 | 67.7778 % |
| Use of Data Mining in Education Sector | 2012 | 1892 | J48 | Cross Validation | J48 | 77.74 % |
| A Decision Tree Algorithm Pertaining to the Student Performance Analysis and Prediction | Jan. 2013 | 524 | J48, Simple Cart, REPTree, NB Tree | Cross Validation | J48 | 80.15 % |
| (same paper) | Jan. 2013 | 178 | J48, Simple Cart, REPTree, NB Tree | Percentage Split | J48 | 82.58 % |
| Comparative Analysis of Decision Tree Classification Algorithms | June 2013 | ____ | ID3, C4.5, CART | Cross Validation | CART | 56.2500 % |
| Predicting Students' Performance using Modified ID3 Algorithm | June-July 2013 | 50 | Naive Bayes, J48, Weighted ID3 | ____ | Weighted ID3 | 93 % |
| Predicting Students Performance using ID3 and C4.5 Classification Algorithms | September 2013 | 173 | ID3, C4.5 (for bulk evaluation) | Cross Validation | ID3, C4.5 | 75.145 % |
| (same paper) | September 2013 | 9 | ID3, C4.5 (for singular evaluation) | Cross Validation | ID3, C4.5 | 77.778 % |
| An Analysis of students' performance using classification algorithms | Jan. 2014 | 260 | J48, Random Forest, Multilayer Perceptron, IB1 | Cross Validation | Random Forest | 89.23 % |
| Predicting Students Performance Using Data Mining Technique with Rough Set Theory Concepts | Feb. 2014 | 51 | J48, Random Forest, Multilayer Perceptron, Naive Bayes | Training, Cross Validation, Percentage Split, Test | Naive Bayes | 31.57 % |
| First Semester Computer Science Students' Academic Performances Analysis by Using Data Mining Classification Algorithms | September 2014 | 399 | Naive Bayes, J48, Rule Based | Cross Validation, Percentage Split | J48 | 68.8 % |

IV. CONCLUSION

The importance of Educational Data Mining (EDM) is increasing day by day as the demand grows for predicting and analysing student performance in order to improve students' academic results. As described above, various authors have implemented different decision tree and other classification algorithms, namely J48, Random Forest, Multilayer Perceptron, Naive Bayes, rule-based classifiers, IB1, REPTree, NB Tree and CART, on different data sets. Some authors compared algorithms to find the best one on the basis of accuracy. The survey in this paper shows that the J48/C4.5 decision tree algorithm is most often reported as the best algorithm in terms of accuracy across different data sets. The survey therefore suggests that J48 performs well across the range of data set sizes considered, which helps explain why J48 is so widely used among decision tree algorithms.

The survey in Section II will be helpful to researchers working in the field of student performance prediction and analysis using decision tree algorithms.

V. FUTURE WORK

Students' academic performance is the main contributor to the growth of any educational institute. If students perform well academically, the institution's growth rate rises. It is now necessary to focus on students' results, so there is wide scope in this field. Student performance prediction and analysis is used to improve student performance, and decision tree algorithms are the main tool used for this purpose. Researchers have done a great deal of work in this field, either evaluating a single algorithm or comparing three or four algorithms.

In future, researchers can extend this work by comparing a larger number of algorithms on larger data sets, so there remains wide scope for research in this field.

Acknowledgment

First of all, I express my sincerest gratitude to Almighty God, who always supports me in my endeavours.

I would like to thank Prof. Neena Madan for encouragement and support. I would also like to thank my family and my friends. I am grateful to all those who helped me in one way or another at every stage of my work.

References
  1. Nikita Jain, Vishal Srivastava, “Data mining techniques: A survey paper”, IJRET: International Journal of Research in Engineering and Technology, Volume 02, Issue 11, Nov. 2013.
  2. Mrs. M.S. Mythili, Dr. A.R. Mohamed Shanavas, “An Analysis of students' performance using classification algorithms”, IOSR Journal of Computer Engineering, Volume 16, Issue 1, January 2014.
  3. Dr. Mohd Maqsood Ali, “Role of data mining in education sector”, International Journal of Computer Science and Mobile Computing, Vol. 2, Issue 4, April 2013.
  4. Mrinal Pandey, Vivek Kumar Sharma, “A Decision Tree Algorithm Pertaining to the Student Performance Analysis and Prediction”, International Journal of Computer Applications, Volume 61, No. 13, January 2013.
  5. Brijesh Kumar Baradwaj, Saurabh Pal, “Mining Educational Data to Analyze Students Performance”, International Journal of Advanced Computer Science and Applications, Vol. 2, No. 6, 2011.
  6. Surjeet Kumar Yadav, Saurabh Pal, “Data Mining: A Prediction for Performance Improvement of Engineering Students using Classification”, World of Computer Science and Information Technology Journal, Vol. 2, No. 2, 2012.
  7. Kalpesh Adhatrao, Aditya Gaykar, Amiraj Dhawan, Rohit Jha and Vipul Honrao, “Predicting Students Performance using ID3 and C4.5 Classification Algorithms”, International Journal of Data Mining & Knowledge Management Process (IJDKP), Vol. 3, No. 5, September 2013.
  8. Azwa Abdul Aziz, Nor Hafieza Ismail and Fadhilah Ahmad, “First Semester Computer Science Students' Academic Performances Analysis by Using Data Mining Classification Algorithms”, Proceeding of the International Conference on Artificial Intelligence and Computer Science (AICS 2014), September 2014.
  9. Mrs. M.S. Mythili, Dr. A.R. Mohamed Shanavas, “An Analysis of students' performance using classification algorithms”, IOSR Journal of Computer Engineering (IOSR-JCE), Volume 16, Issue 1, Jan. 2014.
  10. Manpreet Singh Bhullar, Amritpal Kaur, “Use of Data Mining in Education Sector”, Proceedings of the World Congress on Engineering and Computer Science (WCECS), San Francisco, USA, October 2012.
  11. R. R. Kabra, R. S. Bichkar, “Performance Prediction of Engineering Students using Decision Trees”, International Journal of Computer Applications, Volume 36, No. 11, December 2011.
  12. Anuja Priyam, Abhijeet, Rahul Gupta, Anju Rathee, and Saurabh Srivastava, “Comparative Analysis of Decision Tree Classification Algorithms”, International Journal of Current Engineering and Technology, Volume 3, No. 2, June 2013.
  13. Jyoti Namdeo, Naveenkumar Jayakumar, “Predicting Students Performance Using Data Mining Technique with Rough Set Theory Concepts”, International Journal of Advance Research in Computer Science and Management Studies, Volume 2, Issue 2, February 2014.
  14. Ramanathan L, Saksham Dhanda, Suresh Kumar D, “Predicting Students' Performance using Modified ID3 Algorithm”, International Journal of Engineering and Technology (IJET), Volume 5, No. 3, Jun-Jul 2013.