27 August 2025

Machine Learning for Cancer Risk Prediction Using Electronic Health Records: An Automated Screening Framework


Andrey D. Ermak, Denis V. Gavrilov, Roman E. Novitskiy, Alexander V. Gusev, Yuriy I. Komarov, Anna E. Andreychenko.

Introduction

Timely cancer diagnosis significantly improves patient survival rates while reducing healthcare costs through decreased hospitalizations and increased likelihood of remission. There remains an urgent need for practical, clinically interpretable screening tools capable of effectively identifying at-risk patients to enable early intervention.

Aim. To develop and externally validate machine learning models for predicting 18-month cancer risk using real-world clinical data.

Materials and Methods. The study analyzed anonymized electronic health records (EHR) from 1.3 million patients across 36 Russian regions. We examined multiple predictors, including sex, age, monthly weight change rate, erythrocyte sedimentation rate, hemoglobin level, body mass index, and clinically significant comorbidities. The primary outcome was any cancer diagnosis classified under ICD-10 codes C00-C96, which occurred in 177,384 patients. We compared five machine learning approaches: Logistic Regression, LightGBM Classifier, Random Forest, Linear Discriminant Analysis, and Naïve Bayes. External validation was performed on two independent, geographically distinct patient cohorts (n=29,681 and n=25,145) to evaluate model generalizability across diverse populations.
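To make the outcome definition and two of the derived predictors concrete, here is a minimal Python sketch. It is not the authors' pipeline: the function names, the regular expression, and the definition of "monthly weight change rate" (difference between two weight measurements divided by the elapsed time in average months) are assumptions made for illustration only.

```python
import re
from datetime import date

# Hypothetical helper pattern: "C" followed by a two-digit category and an
# optional decimal subcode, e.g. "C50.9" or "C18".
_ICD10_C = re.compile(r"^C(\d{2})(\.\d{1,2})?$")

# Assumption: one "month" is the average Gregorian month length in days.
DAYS_PER_MONTH = 30.4375


def is_cancer_code(code: str) -> bool:
    """Return True if an ICD-10 code falls in the C00-C96 range."""
    match = _ICD10_C.match(code.strip().upper())
    return bool(match) and 0 <= int(match.group(1)) <= 96


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def monthly_weight_change(w_first: float, d_first: date,
                          w_last: float, d_last: date) -> float:
    """Weight change in kg per month between two measurements (assumed definition)."""
    days = (d_last - d_first).days
    if days <= 0:
        raise ValueError("the second measurement must be later than the first")
    return (w_last - w_first) / (days / DAYS_PER_MONTH)


if __name__ == "__main__":
    print(is_cancer_code("C50.9"))            # code inside the outcome range
    print(is_cancer_code("D05.1"))            # code outside the outcome range
    print(round(body_mass_index(70, 1.75), 1))
    print(monthly_weight_change(80, date(2024, 1, 1), 77, date(2024, 4, 1)))
```

A negative value from `monthly_weight_change` indicates weight loss over the interval; how the study itself aggregated repeated weight measurements is not stated in the abstract.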


Andrey D. Ermak, Denis V. Gavrilov, Roman E. Novitskiy, Alexander V. Gusev, Yuriy I. Komarov, Anna E. Andreychenko. Machine learning for cancer risk prediction using electronic health records: An automated screening framework. Voprosy Onkologii = Problems in Oncology. 2025; 71(4): 00-00. DOI: 10.37469/0507-3758-2025-71-4-OF-2258
