A Novel Approach for Classifying Diabetes’ Patients Based on Imputation and Machine Learning

Year: 2020
Citations: 23
References: 14

Abstract

Over the last decade, many research studies have been conducted on machine learning-based diabetes prediction using diagnostic measurements. However, the main challenge in machine learning-based diabetes prediction is the preprocessing of data, which in most cases contain missing values and outliers. For data analytics and accurate prediction, data cleansing is highly desirable and recommended. The goal of this study is to predict diabetic patients using real-world datasets. The proposed approach is based on three main steps: cleansing, modelling, and storytelling. In the first step, an imputation process is conducted to replace missing values. Then, the k-nearest neighbors algorithm is applied to classify patients. To evaluate the performance of the proposed approach, two criteria are used, namely the F1 score and the Receiver Operating Characteristic (ROC) curve. The F1 score and the ROC curve show a clear distinction between diabetic and nondiabetic patients.
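The cleansing, modelling, and evaluation steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the imputation method, the value of k, or the dataset, so the median imputation, k = 3, the 0.5 decision threshold, the two feature names (glucose, BMI), and the toy data below are all assumptions chosen only for illustration. The sketch uses the Python standard library alone.

```python
import math
from statistics import median

def impute_median(rows):
    """Replace None entries with the median of the observed values in that column.
    Median imputation is an assumption; the abstract does not name the method."""
    columns = list(zip(*rows))
    medians = [median(v for v in col if v is not None) for col in columns]
    return [[m if v is None else v for v, m in zip(row, medians)] for row in rows]

def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def knn_score(train_x, train_y, query, k=3):
    """Fraction of the k nearest training points that carry label 1."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: euclidean(pair[0], query))[:k]
    return sum(label for _, label in nearest) / k

def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive scores higher than a random negative (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic toy data, NOT from the paper: [glucose, BMI], None = missing value.
raw_train_x = [
    [85, 22.0], [90, None], [95, 24.5], [None, 23.0],
    [150, 33.0], [160, None], [170, 35.5], [165, 34.0],
]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = diabetic, 0 = nondiabetic
test_x = [[88, 23.0], [158, 34.0], [92, 25.0], [172, 36.0]]
test_y = [0, 1, 0, 1]

train_x = impute_median(raw_train_x)                    # step 1: cleansing
scores = [knn_score(train_x, train_y, x, k=3) for x in test_x]  # step 2: modelling
preds = [1 if s >= 0.5 else 0 for s in scores]          # assumed 0.5 threshold

f1 = f1_score(test_y, preds)                            # step 3: evaluation
auc = roc_auc(test_y, scores)
print("F1:", f1, "ROC AUC:", auc)
```

On real data the features would normally be scaled before computing distances, since k-nearest neighbors is sensitive to differences in feature range; that step is omitted here to keep the sketch short.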
