This research article was published by Taylor & Francis Online, 2022.
Presumptive treatment and self-medication for malaria have been used in limited-resource countries. However, these approaches are considered unreliable because they lead to unnecessary use of malaria medication. This study aims to demonstrate supervised machine learning models for diagnosing malaria using patient symptoms and demographic features. A malaria diagnosis dataset of 2556 patient records with 36 features was extracted from two regions of Tanzania, Morogoro and Kilimanjaro, and was used to develop the machine learning model. Important features were selected to improve model performance and reduce processing time. Machine learning classifiers were trained and validated using the k-fold cross-validation method. The ranking of features differed between the two regions and in the combined dataset. The significant features selected were residence area, fever, age, general body malaise, visit date, and headache. Random Forest was the best classifier, with an accuracy of 95% in Kilimanjaro, 87% in Morogoro, and 82% in the combined dataset. Based on clinical symptoms and demographic features, a region-specific malaria predictive model was developed to demonstrate the relevant machine learning classifiers. The important features identified are useful in predicting the disease.
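The k-fold cross-validation procedure mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the records are synthetic, the three binary symptom columns (fever, headache, malaise) are chosen only for illustration, and a single-feature decision stump stands in for the Random Forest classifier so that the example needs nothing beyond the Python standard library. All function names here are hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle row indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_stump(X, y):
    """Stand-in classifier (assumption, not Random Forest): pick the binary
    feature and polarity whose rule best matches the training labels."""
    best = None
    for j in range(len(X[0])):
        for polarity in (0, 1):
            # Rule: predict positive when feature j equals `polarity`.
            correct = sum(
                (row[j] == polarity) == bool(label) for row, label in zip(X, y)
            )
            if best is None or correct > best[0]:
                best = (correct, j, polarity)
    _, j, polarity = best
    return j, polarity


def predict_stump(model, row):
    """Apply the learned single-feature rule to one record."""
    j, polarity = model
    return int(row[j] == polarity)


def cross_validate(X, y, k=5, seed=0):
    """Return one held-out accuracy per fold."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train_idx = [idx for f, fold in enumerate(folds) if f != i for idx in fold]
        model = train_stump([X[t] for t in train_idx], [y[t] for t in train_idx])
        hits = sum(predict_stump(model, X[t]) == y[t] for t in test_idx)
        scores.append(hits / len(test_idx))
    return scores


def make_synthetic_records(n=200, seed=1):
    """Synthetic data only: columns are [fever, headache, malaise]."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        fever, headache, malaise = (rng.randint(0, 1) for _ in range(3))
        # Assumed toy relationship: the label follows fever 85% of the time.
        label = fever if rng.random() < 0.85 else 1 - fever
        X.append([fever, headache, malaise])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_records()
    scores = cross_validate(X, y, k=5)
    print("fold accuracies:", [round(s, 2) for s in scores])
    print("mean accuracy:", round(sum(scores) / len(scores), 2))
```

Each record is tested exactly once, by a model that never saw it during training, which is why the mean of the fold accuracies is a less optimistic estimate than accuracy measured on the training data. In practice a library implementation of Random Forest and cross-validation would replace the stump and the hand-written fold split.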