@MASTERSTHESIS{pgi2020012,
author = "A. Bruce",
supervisor = "K. Liaskos",
title = "The Prediction of Student Performance Through the Use of Machine Learning",
school = "Department of Computer and Information Sciences, University of Strathclyde",
year = "2019",
abstract = "With the increasing popularity and importance of Higher Education, degree cohorts are becoming larger and it is becoming increasingly difficult for lecturers and advisors of studies to engage with students personally. As a result, students who are struggling with course material are receiving feedback and advice too late or are being missed out altogether, which increases the number of students failing or dropping out of higher education.
The increase in research into machine learning and advances in technology have made complex and computationally expensive machine learning algorithms a viable solution to this problem. The aim of this study was to determine which machine learning algorithms were most suited to solving this problem and to determine whether it is possible to accurately predict a student's performance early in a course.
Using scikit-learn, five machine learning algorithms were trained and tested using the Open University Learning Analytics Dataset. The percentage of correctly classified failing students to the total number of failing students in the testing data set and the percentage of failing students correctly classified to the total number of students classified as failing were used as metrics to determine which of the algorithms were most suited to this problem.
The results show that, with minimal amounts of parameter tuning for optimisation, Random Forest and K-Nearest Neighbours are both suited to predicting student performance even with only a small amount of available student data, which would allow for accurate and early prediction.",
}