AI Commons Health and Wellbeing Solutions
Overview
The AI Commons Project is a proof of concept of a new methodology of developing Artificial Intelligence solutions that allows anyone, anywhere to benefit from the possibilities that AI can provide. The project aims to increase/improve the accessibility, reproducibility, contextualization and enhancement of Artificial Intelligence solutions globally and especially in emerging markets.
The project aims to demonstrate how a global community of AI experts can learn and co-create mutually beneficial solutions, with the opportunity for cross-country incremental enhancement.
Malaria Classification
Statement of Purpose
Introduction
Data Science Nigeria
Rising Odegua (Data Science Nigeria)
Nelson Ogbeide; Ojeabulu O. Gift; Comfort Igboko; Caleb Emelike; Precious Cadeton
Problem Definition
Malaria is a serious and sometimes fatal disease known to kill thousands of people yearly. The World Malaria Report of December 2019 shows that 228 million cases of malaria were recorded in 2018, with an estimated 405,000 deaths.
Those most affected are young children and pregnant women, and according to the World Health Organization (WHO), 93% of the cases recorded in 2018 were in the African region.
A blood sample is taken from the patient, and a trained microscopist examines it under a microscope to detect malaria parasites.
Yes. Machine learning methods for malaria parasite detection and for cancer detection have been developed.
Solution
The solution is a malaria detection model. It helps health practitioners in hospitals, clinics and laboratories to quickly detect whether a patient has malaria, given the patient’s microscopic slide smears. The first and only release of the solution was in 2019.
Paper: Malaria and Pneumonia Classification
The outcome of the solution is a prediction of whether the patient has malaria or not if a microscopic slide smear is provided as input.
Health practitioners
Patients
A data scientist/machine learning engineer is needed to build the prediction model, and a domain expert (such as a doctor) is needed to provide technical details on how the diseases work and are identified.
N/A.
More data should be used for training and evaluation.
No. We hope to keep it updated whenever the solution is retrained.
Usage
A simple use case: when a patient shows symptoms of malaria, the patient's microscopic slide smear is fed into the model, which predicts within seconds the likelihood that the patient has malaria. With this, doctors can recommend drugs to treat the disease at an early stage.
Health practitioners.
A medical practitioner uploads the microscopic slide smear, which is the input, to the model, and the prediction of whether the patient has malaria is displayed on the screen.
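The display step can be sketched as follows. This is a minimal illustration, not the deployed code: it assumes the model returns a single probability for the parasitised class, and the function name `describe_prediction`, the 0.5 threshold and the label strings are all hypothetical.

```python
def describe_prediction(parasitized_probability, threshold=0.5):
    """Turn a model score into the message shown to the practitioner.

    Assumption: the score is the probability of the parasitised class.
    The real label order depends on how the training data was encoded.
    """
    if not 0.0 <= parasitized_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if parasitized_probability >= threshold:
        label = "Parasitized"
    else:
        label = "Uninfected"
    return f"{label} (model score: {parasitized_probability:.2f})"


message = describe_prediction(0.93)
print(message)
```

Showing the score next to the label lets the practitioner see how close a prediction is to the decision threshold.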
The solution can be made to read a user’s incoming text automatically and return an appropriate notification.
Domain and Applications
The solution has not been used in the past; it is newly developed.
Dataset
Composition
Collection Process
Preprocessing/Cleaning/Labelling
Uses
Maintenance
Dataset Publicly Available
Malaria Dataset
The dataset contains 27,558 segmented malaria cells with equal instances of parasitised and uninfected cells. The dataset description can be found HERE.
Model
Model Details
Model date: 2019. The model was built using Keras, a Python neural network framework for deep learning applications. The network is a 9-level convolutional neural network.
Basic image resize was performed on the data.
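To illustrate what a basic resize does, here is a minimal pure-Python sketch using nearest-neighbour sampling on a small grid of pixel values. The source does not state the target size or the resize method, so the function `resize_nearest`, the 2x2 target and the sample pixels are assumptions for illustration only. A real pipeline would more likely use an image library.

```python
def resize_nearest(image, new_height, new_width):
    """Resize a 2-D grid of pixels using nearest-neighbour sampling."""
    old_height = len(image)
    old_width = len(image[0])
    resized = []
    for row in range(new_height):
        # Map each output row back to the closest source row.
        src_row = min(old_height - 1, row * old_height // new_height)
        resized.append([
            image[src_row][min(old_width - 1, col * old_width // new_width)]
            for col in range(new_width)
        ])
    return resized


# Hypothetical 4x4 grayscale patch, shrunk to 2x2.
cell = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [9, 9, 128, 128],
    [9, 9, 128, 128],
]
small = resize_nearest(cell, 2, 2)
print(small)
```

Resizing every cell image to one fixed shape is what allows the images to be batched into a convolutional network with a fixed input size.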
Evaluation
Testing the Solution
For training, 98% of the dataset was used, while 2% was used for testing. Recall, f1_score, precision and accuracy were the evaluation metrics used. A confusion matrix was used to see how the model's predictions were distributed between the correct and incorrect classes.
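The split and the metrics described above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: the helper names, the random seed and the eight toy labels are hypothetical, and the source does not say how the 2% test set was selected.

```python
import random


def split_indices(n_samples, test_fraction=0.02, seed=0):
    """Shuffle sample indices and hold out a fraction for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classifier."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from the four counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1_score": f1}


# 27,558 is the dataset size stated in this document.
train_idx, test_idx = split_indices(27558)
print(len(train_idx), len(test_idx))

# Toy labels (hypothetical): 1 = parasitised, 0 = uninfected.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
scores = classification_metrics(*counts)
print(counts, scores)
```

With a 2% hold-out, the test set holds only a few hundred images, so each metric is estimated from a small sample.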
Testing by Third Party
Result
Result Details
The model was able to predict 75% of the test data correctly and 25% wrongly. The model accuracy, precision, recall, f1_score are 74%, 66%, 98% and 78% respectively.
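As a quick consistency check, the F1 score can be recomputed from the reported precision (P = 0.66) and recall (R = 0.98) using the standard definition. The inputs are the rounded values given above, so the result is only approximate.

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.66 \times 0.98}{0.66 + 0.98}
    = \frac{1.2936}{1.64}
    \approx 0.79
```

This is within rounding of the reported f1_score of 78%. The large gap between recall (98%) and precision (66%) indicates that the model rarely misses the positive class but often flags negatives as positive.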
20 epochs in 421 steps.
The statistics used are those in the classification report available in scikit-learn (a Python library), which shows the precision, recall, accuracy and f1_score of the model.
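If the 421 steps are read as steps per epoch, the figure can be related to the training set size. The sketch below is only an illustration. The batch size is not stated in this document, so 64 is an assumption, chosen because it gives 421 steps when the final partial batch is dropped. The helper `steps_per_epoch` is hypothetical.

```python
def steps_per_epoch(n_train, batch_size, drop_last=True):
    """Number of batches per epoch, optionally dropping a partial last batch."""
    full_batches, remainder = divmod(n_train, batch_size)
    if drop_last or remainder == 0:
        return full_batches
    return full_batches + 1


n_total = 27558                                # dataset size stated in this document
n_train = n_total - round(n_total * 0.02)      # 98% used for training
steps = steps_per_epoch(n_train, batch_size=64)  # batch size is an assumption
print(n_train, steps)
```

Other batch sizes and rounding choices could also be consistent with the reported figure. This is one plausible reading, not a confirmed setting.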
During training, the error dropped from 30% to 7%.
The average runtime is about 1-2 s to predict the result for each sample.
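An average per-sample runtime like this can be measured with a simple timing loop. The sketch below is a hedged illustration: `dummy_predict` is a hypothetical stand-in for the real model call, and the sample data is made up.

```python
import time


def average_runtime(predict, samples):
    """Return the mean wall-clock seconds spent per call to predict."""
    start = time.perf_counter()
    for sample in samples:
        predict(sample)
    elapsed = time.perf_counter() - start
    return elapsed / len(samples)


def dummy_predict(sample):
    """Hypothetical stand-in for the model's prediction function."""
    return sum(sample) / len(sample)


samples = [[0.1, 0.5, 0.9]] * 100
seconds_per_sample = average_runtime(dummy_predict, samples)
print(f"{seconds_per_sample:.6f} s per sample")
```

Timing many samples and averaging smooths out one-off costs such as model loading, which would otherwise inflate the first prediction.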
Environment
The operating system is Linux, the programming language is Python 3.7, and the Keras version is 1.5.
Yes. Django, a Python framework, was used.
Steps to Reproduce the Solution
Result Details
The solution can be reproduced by running all cells in the notebook in the link HERE.
Safety
General
Possible sources of bias/unfairness were not analyzed.
Explainability
The solution's outputs are not easily explainable.
Concept Drift
The solution was not tested on unseen data other than the data used to develop it.
Security
It is not suitable for any solution that is not related to malaria.
The solution does not collect user data.