Comparative Analysis of Machine Learning Models on Student Performance Data: Insights from Test Scores and Survey Data

Authors

  • Sanjana Sundararaman, BSc (Hons) Statistical Data Science, Heriot-Watt University, Dubai, UAE
  • Maheen Hasib, Department of Mathematics and Computer Science, Heriot-Watt University, Dubai, UAE

DOI:

https://doi.org/10.33422/ejte.v7i1.1459

Keywords:

AI in education, educational data mining, machine learning in education, predictive modeling, test scores

Abstract

With the increasing use of digital learning platforms, large volumes of student data have become available for analysis. This paper investigates how machine learning, learning analytics, and educational data mining can be used to gain insights into student performance. Several predictive modeling techniques, including Random Forest (RF), K-Nearest Neighbor (KNN), and Decision Trees (DT), are evaluated for their ability to forecast student test scores, and clustering algorithms such as K-means are employed to identify patterns within the data. The study applies these models to two datasets: survey data collected from undergraduate students at Heriot-Watt University Dubai and a Kaggle test score dataset, with the aim of identifying factors that influence academic outcomes. A comparative analysis across the models shows that linear regression is the most effective model for the Kaggle test score dataset, while K-means clustering provides the best insights from the survey data. The survey model is judged to be more comprehensive because it includes more predictors. Key metrics, including accuracy, precision, recall, F1 score, and mean squared error, were calculated for both datasets to enable a quantitative comparison of model performance and predictor effectiveness. The findings contribute to understanding how data-driven approaches can support educational decisions and interventions while addressing ethical considerations and inclusivity in educational settings.
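
The evaluation metrics named in the abstract have standard definitions. As a minimal sketch, assuming binary pass/fail labels for the classification metrics and numeric test scores for the mean squared error, they could be computed as follows. The function names and the sample values are illustrative assumptions and are not taken from the paper's code or data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted scores."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical pass (1) / fail (0) labels, not the study's data.
actual_labels = [1, 0, 1, 1, 0, 1, 0, 0]
predicted_labels = [1, 0, 0, 1, 0, 1, 1, 0]
print(classification_metrics(actual_labels, predicted_labels))

# Hypothetical test scores, not the study's data.
actual_scores = [70, 85, 60, 90]
predicted_scores = [72, 80, 65, 88]
print(mean_squared_error(actual_scores, predicted_scores))
```

Classification metrics suit models that predict categories (such as pass or fail), while mean squared error suits models that predict a numeric score, which is why a comparison across both kinds of model typically reports both.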

Published

2025-02-16

How to Cite

Sundararaman, S., & Hasib, M. (2025). Comparative Analysis of Machine Learning Models on Student Performance Data: Insights from Test Scores and Survey Data. European Journal of Teaching and Education, 7(1), 61–76. https://doi.org/10.33422/ejte.v7i1.1459

Section

Articles