COSTECH Integrated Repository

Data driven approach for predicting student dropout in secondary schools


dc.creator Mduma, Neema
dc.date 2020-09-14T07:46:45Z
dc.date 2020-06
dc.date.accessioned 2022-10-25T09:15:30Z
dc.date.available 2022-10-25T09:15:30Z
dc.identifier https://dspace.nm-aist.ac.tz/handle/20.500.12479/898
dc.identifier.uri http://hdl.handle.net/123456789/94539
dc.description A Thesis Submitted in Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Information and Communication Science and Engineering of the Nelson Mandela African Institution of Science and Technology
dc.description Student dropout is among the challenges facing most schools in developing countries, particularly in Africa. In Tanzania alone, the student dropout rate in secondary schools is reported to be around 36%. Addressing the problem requires a thorough understanding of the fundamental factors that cause student dropout. Several researchers have identified causes and proposed methods and strategies to reduce or stop student dropout; however, most of the proposed solutions did not show promising results, and the dropout trend continues to increase over time. This study focused on developing a data-driven approach to identify and predict students who are at risk of dropping out of school, in order to facilitate an intervention program as an active measure for eliminating the problem of dropout in Tanzania. To that end, (a) 122 research articles were examined, (b) 4 focus group discussions and 2 round table surveys with 38 respondents from 5 districts (Arusha, Mbeya, Kisarawe, Rufiji and Nzega) were conducted, and (c) 3 datasets from Tanzania and India were used in order to identify factors that contribute significantly to the student dropout problem, determine the best classifier among the commonly used classifiers (Logistic Regression, Random Forest, K-Nearest Neighbor and Multilayer Perceptron), and assess data balancing techniques for the predictive performance of the model. Results revealed that most of the respondents mentioned students’ gender, age, parents’ income, number of qualified teachers and remoteness as the main contributing factors to the student dropout problem in secondary schools. Furthermore, results from the examined articles indicated that most studies conducted in developing countries focused on the social aspects of student dropout, and only a few mentioned the use of other approaches such as machine learning.
Nevertheless, results from the development of the data-driven approach show that Logistic Regression and Multilayer Perceptron achieved the highest performance when an over-sampling technique was employed. In addition, hyperparameter tuning improved the algorithms' performance compared to their baseline settings, and stacking the classifiers improved the overall predictive performance of the developed approach. The study therefore recommends that relevant authorities consider the developed approach for identifying and predicting students at risk of dropping out, to support early intervention, planning and informed decision-making in addressing the student dropout problem.
dc.format application/pdf
dc.language en
dc.publisher NM-AIST
dc.rights Attribution-NonCommercial-ShareAlike 4.0 International
dc.rights http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.subject Research Subject Categories::TECHNOLOGY
dc.title Data driven approach for predicting student dropout in secondary schools
dc.type Thesis


Files in this item

File Size Format
PhD_ICSE_Neema_Mduma_2020.pdf 7.043Mb application/pdf
