Publication | Open Access

Educational data mining: prediction of students' academic performance using machine learning algorithms

Citations: 579

References: 41

Year: 2022

TLDR

Educational data mining is an effective tool for uncovering hidden relationships in educational data and predicting students’ academic achievements, and such data‑driven studies are crucial for establishing learning‑analysis frameworks and informing decision‑making in higher education. The study proposes a machine‑learning model that predicts undergraduate final exam grades using midterm grades as input. The authors evaluated random forests, nearest neighbour, support vector machines, logistic regression, Naïve Bayes, and k‑nearest neighbour algorithms on a dataset of 1,854 Turkish Language‑I students, using only midterm grades, department, and faculty data to predict final exam scores. The model achieved a classification accuracy of 70–75%, enabling early identification of students at high risk of failure and highlighting the most effective machine‑learning methods.

Abstract

Educational data mining has become an effective tool for exploring the hidden relationships in educational data and predicting students' academic achievements. This study proposes a new model based on machine learning algorithms to predict the final exam grades of undergraduate students, taking their midterm exam grades as the source data. The performances of the random forests, nearest neighbour, support vector machines, logistic regression, Naïve Bayes, and k-nearest neighbour algorithms were calculated and compared to predict the final exam grades of the students. The dataset consisted of the academic achievement grades of 1854 students who took the Turkish Language-I course at a state university in Turkey during the fall semester of 2019–2020. The results show that the proposed model achieved a classification accuracy of 70–75%. The predictions were made using only three types of parameters: midterm exam grades, department data, and faculty data. Such data-driven studies are very important for establishing a learning analysis framework in higher education and for contributing to decision-making processes. Finally, this study contributes to the early prediction of students at high risk of failure and determines the most effective machine learning methods.
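To make the setup in the abstract concrete, the following is a minimal sketch of one of the listed algorithms, k-nearest neighbour, applied to the three input types the study names (midterm grade, department, faculty). It is not the authors' implementation. The records, the pass/fail labels, the `MISMATCH_PENALTY` weight, and all function names are assumptions invented for illustration, and the toy data says nothing about the 70–75% accuracy reported for the real dataset.

```python
from collections import Counter

# Hypothetical toy records: (midterm grade, department, faculty) -> final grade band.
# Invented for illustration only; these are not the study's data.
TRAIN = [
    ((35, "Nursing", "Health"), "fail"),
    ((42, "Nursing", "Health"), "fail"),
    ((78, "Nursing", "Health"), "pass"),
    ((88, "Physics", "Science"), "pass"),
    ((91, "Physics", "Science"), "pass"),
    ((40, "Physics", "Science"), "fail"),
    ((65, "History", "Arts"), "pass"),
    ((30, "History", "Arts"), "fail"),
]

TEST = [
    ((30, "Nursing", "Health"), "fail"),
    ((85, "Physics", "Science"), "pass"),
    ((70, "History", "Arts"), "pass"),
]

# Assumed weight for each categorical mismatch (department, faculty).
MISMATCH_PENALTY = 0.1


def distance(a, b):
    """Midterm gap scaled to 0-1, plus a fixed penalty per categorical mismatch."""
    midterm_gap = abs(a[0] - b[0]) / 100
    mismatches = (a[1] != b[1]) + (a[2] != b[2])
    return midterm_gap + MISMATCH_PENALTY * mismatches


def knn_predict(train, x, k=3):
    """Return the majority label among the k training records closest to x."""
    nearest = sorted(train, key=lambda rec: distance(rec[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test records whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


if __name__ == "__main__":
    print(knn_predict(TRAIN, (30, "Nursing", "Health")))
    print(accuracy(TRAIN, TEST))
```

Classification accuracy here is simply the share of held-out students whose predicted grade band matches the actual one, which is the kind of figure the abstract reports. Comparing several algorithms, as the study does, would mean swapping `knn_predict` for other classifiers and computing the same metric on the same split.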
