Comparative Evaluation of Machine Learning Models for Chronic Kidney Disease Diagnosis in Resource-Limited Healthcare Settings Using Clinically Relevant Low-Cost Biomarkers
DOI:
https://doi.org/10.65420/sjphrt.v2i1.64
Keywords:
Chronic Kidney Disease, Machine Learning, Resource-Limited Healthcare, Random Forest, Low-Cost Biomarkers, Predictive Analytics.
Abstract
This research addresses the gap between advanced diagnostic technologies and the operational constraints of healthcare systems in resource-limited settings. Chronic Kidney Disease (CKD) is a growing global health burden, yet early detection remains difficult in underserved regions because specialized diagnostic tools are costly. This study compares five widely used machine learning algorithms, namely Random Forest, Gradient Boosting, Logistic Regression, Support Vector Machines (SVM), and Decision Trees, to develop a high-precision diagnostic framework. Unlike conventional models that rely on expensive parameters, this study prioritizes 12 low-cost, clinically relevant biomarkers, such as serum creatinine, albumin levels, and hemoglobin, which are routinely available in basic clinical laboratories. A key innovation of this research is a "Missing Indicator" preprocessing strategy, which turns gaps in clinical data into explicit diagnostic features so that the model remains usable in real-world environments where incomplete records are common. The experimental results show that the Random Forest model achieved the best predictive performance, with an accuracy exceeding 99%, outperforming both traditional classifiers and more complex architectures in sensitivity and computational efficiency. The study concludes that combining machine learning with routine, low-cost biomarkers can make early CKD diagnosis substantially more accessible, providing a scalable and cost-effective way to improve patient outcomes in developing healthcare infrastructures. The framework offers a practical pathway for implementing explainable AI tools that fit the economic realities of global health.
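The abstract does not spell out how the "Missing Indicator" strategy is implemented, so the following is only a minimal sketch of the general technique under stated assumptions: each missing value is filled with the median of the observed values, and a 0/1 flag column records that the measurement was absent. The column names, the choice of median imputation, the helper functions `fit_medians` and `add_missing_indicators`, and the toy records are all hypothetical and are not taken from the study.

```python
from statistics import median

# Hypothetical column names; the abstract names only three of the 12 biomarkers.
FEATURES = ["serum_creatinine", "albumin", "hemoglobin"]


def fit_medians(rows):
    """Compute the per-feature median over the observed (non-None) values."""
    medians = {}
    for name in FEATURES:
        observed = [row[name] for row in rows if row.get(name) is not None]
        # Assumption: fall back to 0.0 if a feature was never observed.
        medians[name] = median(observed) if observed else 0.0
    return medians


def add_missing_indicators(rows, medians):
    """Return one vector per row: imputed values followed by 0/1 missing flags."""
    encoded = []
    for row in rows:
        values, flags = [], []
        for name in FEATURES:
            value = row.get(name)
            missing = value is None
            values.append(medians[name] if missing else float(value))
            flags.append(1.0 if missing else 0.0)
        encoded.append(values + flags)
    return encoded


# Toy records invented for illustration only (not data from the study).
records = [
    {"serum_creatinine": 1.2, "albumin": 4.0, "hemoglobin": 15.4},
    {"serum_creatinine": None, "albumin": 3.0, "hemoglobin": 11.2},
    {"serum_creatinine": 3.8, "albumin": None, "hemoglobin": 9.6},
]

medians = fit_medians(records)
X = add_missing_indicators(records, medians)
```

Each row of `X` then has twice as many columns as there are biomarkers, and could be passed to any of the classifiers compared in the study. In scikit-learn, for instance, `SimpleImputer(strategy="median", add_indicator=True)` produces a comparable encoding that can feed a `RandomForestClassifier`; whether the authors used that route is not stated.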

