Ensemble Machine Learning for Higher Education: Designing a Student Performance Prediction Model for Enhanced Academic Outcomes
Abstract
Academic success is a top priority for every stakeholder of an educational institution, including students, teachers, parents,
administrators, management, industry, and the wider environment. Higher education institutions (HEIs) can advance professionally and
academically with regular input from all stakeholders, but they must also adopt cutting-edge technologies that can accelerate
institutional growth. Both management and students can benefit from early prediction of student success and early detection of at-risk
students using popular artificial intelligence techniques such as machine learning. Using ensemble machine learning, we stack four
multi-class classifiers (Decision Tree, k-Nearest Neighbour, Naive Bayes, and One-versus-Rest Support Vector Machine) to develop a new
model for predicting student performance. The proposed model makes an early prediction regarding a student’s academic progress. To test
the findings, a dataset of over a thousand students from five different branches of an engineering institute was collected. The proposed
model predicts the final grade with 93% accuracy, and the four ML approaches employed are compared.
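
The stacking arrangement described in the abstract can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' implementation. The synthetic data from `make_classification` stands in for the student dataset, which is not reproduced here. The logistic-regression meta-learner, the hyperparameters, the feature scaling, and the four-class target are all assumptions made only for illustration, so the accuracy it produces has no relation to the 93% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_base_learners():
    """Return the four base classifiers named in the abstract.

    All hyperparameters here are illustrative assumptions.
    """
    return [
        ("decision_tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
        # k-NN and SVM are distance based, so features are scaled first.
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
        ("naive_bayes", GaussianNB()),
        ("ovr_svm", make_pipeline(
            StandardScaler(),
            OneVsRestClassifier(SVC(probability=True, random_state=0)),
        )),
    ]


# Synthetic stand-in data (assumption): 600 "students", 10 features, 4 grade classes.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=6,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Stacking: out-of-fold predictions of the base learners become the inputs
# of a meta-learner (logistic regression is an assumption, not from the paper).
stack = StackingClassifier(
    estimators=make_base_learners(),
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

# Compare each base learner, trained on its own, with the stacked ensemble.
scores = {}
for name, model in make_base_learners():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
scores["stacked"] = accuracy_score(y_test, stack.predict(X_test))

for name, acc in scores.items():
    print(f"{name:>14}: {acc:.3f}")
```

The `cv=5` setting trains the meta-learner on predictions the base learners made for folds they had not seen, which keeps the meta-learner from simply memorising base-learner overfitting.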