JOMPAC

Journal of Medicine and Palliative Care (JOMPAC) is an open access scientific journal with independent, unbiased, and double-blind review under international guidelines. The purpose of JOMPAC is to contribute to the literature by publishing articles on health sciences and medicine.

Original Article
Optimized machine learning based predictive diagnosis approach for Diabetes mellitus
Aims: Diabetes mellitus is a metabolic disease characterized by elevated blood sugar. If it is not diagnosed in time, it can damage other organs and tissues. As with many other diseases, machine learning algorithms are increasingly preferred for its detection. This study proposes a diabetes prediction approach incorporating optimized machine learning (ML) algorithms.
Methods: The framework begins with several data pre-processing steps. Random forest (RF), support vector machine (SVM), K-nearest neighbor (K-NN), and decision tree (DT) algorithms are used for classification, and grid search is used to optimize their hyperparameters. Several performance measures are used to identify the algorithm that best predicts diabetes. The PIMA Indian dataset (PID) is used for the experiments. In addition, Shapley additive explanations (SHAP) analysis is used to investigate how much each attribute in the dataset affects the result.
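The grid search step described in the Methods can be illustrated with a minimal sketch. It assumes scikit-learn, uses synthetic data as a stand-in for the PID (768 rows, 8 attributes), and tunes only the RF classifier; the scaler, the parameter grid, the 80/20 split, and the 5-fold cross-validation are illustrative assumptions, not the settings used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the PIMA Indian dataset (assumption: 768 rows, 8 attributes).
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Pre-processing and classifier in one pipeline, so scaling is fit on training folds only.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("rf", RandomForestClassifier(random_state=0)),
])

# Hypothetical grid; the study's actual search space is not given in the abstract.
param_grid = {
    "rf__n_estimators": [50, 100],
    "rf__max_depth": [3, 5, None],
}

# Exhaustively evaluate every grid combination with 5-fold cross-validation.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 4))
```

The same pattern would apply to SVM, K-NN, and DT by swapping the final pipeline step and its grid.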
Results: In the experiments, the RF algorithm achieved the best performance, with an accuracy of 89.06%, precision of 84.33%, sensitivity of 84.33%, F1-score of 84.33%, and AUC of 0.88. The SHAP analysis showed that the “Insulin”, “Age”, and “Glucose” attributes contributed the most to the model's identification of patients with diabetes.
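For readers less familiar with the reported measures, the following sketch computes accuracy, precision, sensitivity, and F1-score from binary labels using their standard definitions. The function name and the toy labels are hypothetical and are not the study's data; AUC is omitted because it requires predicted scores, not hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Compute standard binary classification measures (positive class = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Toy labels (hypothetical): 3 TP, 3 TN, 1 FP, 1 FN.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In the toy example the false positives and false negatives are equal in number, so precision and sensitivity coincide and the F1-score, being their harmonic mean, takes the same value.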
Conclusion: The hyperparameter-optimized RF approach proposed in this study performed well in the prediction and diagnosis of diabetes mellitus when compared with similar studies in the literature. The proposed method could therefore be used to design an expert system that detects diabetes early and in real time.


Volume 4, Issue 4, 2023
Pages: 270-276