Covid-19 has affected people across the globe and caused a massive number of fatalities. With the pandemic looming large, thoughtful resource allocation and early recognition of high-risk patients are the need of the hour. However, effective methods to meet these requirements are lacking. Machine Learning (ML) models can forecast the probability of serious illness or fatality in Covid-19 patients, which can help health care professionals manage and provide better care for infected individuals, according to a study published in the Journal of Medical Internet Research (JMIR).

Model Development and Validation

The study had three objectives: to analyze the Electronic Health Records (EHRs) of patients who tested positive for the virus and were admitted to hospitals in the Mount Sinai Health System in New York City; to develop ML models that forecast each patient's hospital course over clinically meaningful time periods, based on patient characteristics at admission; and to evaluate the performance of these models across hospitals and time points.

Identifying the patient characteristics that drive the disease across large patient populations is essential, as it could enable hospitals and providers to forecast disease trajectories and allocate crucial resources. Initial efforts to create ML models for this purpose have been limited by small sample sizes, poor generalization to diverse populations, differences in feature missingness, and the potential for bias.

Extreme Gradient Boosting (XGBoost) and baseline comparator models were used to predict in-hospital mortality and critical events at time windows of 3, 5, 7, and 10 days from admission. The study population comprised harmonized EHR data from five hospitals in New York City for 4098 COVID-19–positive patients admitted between March 15 and May 22, 2020. The models were first trained on patients from one hospital (n=1514) admitted on or before May 1, validated externally on patients from four other hospitals (n=2201) admitted on or before May 1, and then validated prospectively on all patients admitted after May 1 (n=383). The models performed best at the one-week mark, when they correctly flagged the most critical events while returning the fewest false positives.
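The three-way split described above (one training hospital, four external hospitals, and a prospective cohort after May 1) can be sketched in a few lines of Python. This is only an illustrative sketch, not the study's code: the `Admission` record, the hospital labels, the helper names, and the rule used in `event_within_window` to decide whether an event falls inside a time window are all assumptions made for the example, and the records are toy data.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Cutoff and time windows as reported in the article.
CUTOFF = date(2020, 5, 1)
WINDOWS_DAYS = (3, 5, 7, 10)

@dataclass
class Admission:
    # Hypothetical record layout, not the study's actual schema.
    patient_id: str
    hospital: str
    admitted_on: date
    event_on: Optional[date] = None  # date of death or critical event, if any

def assign_cohort(adm: Admission, training_hospital: str) -> str:
    """Place an admission in the training, external, or prospective cohort."""
    if adm.admitted_on > CUTOFF:
        return "prospective_validation"  # any hospital, after May 1
    if adm.hospital == training_hospital:
        return "train"                   # single hospital, on or before May 1
    return "external_validation"         # other hospitals, on or before May 1

def event_within_window(adm: Admission, window_days: int) -> bool:
    """Assumed labeling rule: event occurs 0..window_days days after admission."""
    if adm.event_on is None:
        return False
    elapsed = (adm.event_on - adm.admitted_on).days
    return 0 <= elapsed <= window_days

def split_cohorts(admissions, training_hospital):
    cohorts = {"train": [], "external_validation": [], "prospective_validation": []}
    for adm in admissions:
        cohorts[assign_cohort(adm, training_hospital)].append(adm)
    return cohorts

# Toy admissions with made-up hospital labels "A", "B", "C".
admissions = [
    Admission("p1", "A", date(2020, 3, 20), event_on=date(2020, 3, 24)),
    Admission("p2", "B", date(2020, 4, 30)),
    Admission("p3", "A", date(2020, 5, 1), event_on=date(2020, 5, 9)),
    Admission("p4", "A", date(2020, 5, 10)),
    Admission("p5", "C", date(2020, 5, 22), event_on=date(2020, 5, 23)),
]

cohorts = split_cohorts(admissions, training_hospital="A")
for name, members in cohorts.items():
    print(name, [m.patient_id for m in members])

# One binary label per time window for the first toy patient.
labels_p1 = {w: event_within_window(admissions[0], w) for w in WINDOWS_DAYS}
print(labels_p1)

# The cohort sizes reported in the article add up to the full population.
assert 1514 + 2201 + 383 == 4098
```

Splitting by hospital and by date, rather than at random, is what lets the external and prospective cohorts test whether a model generalizes to sites and time periods it was not trained on.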

At the one-week mark, fast breathing, acute kidney injury, high blood sugar, and elevated lactate dehydrogenase (LDH), which indicates tissue damage, were the strongest drivers in forecasting critical illness. Blood level imbalance, older age, and C-reactive protein levels, which indicate inflammation, were the strongest drivers in forecasting mortality. These results show not only the capability of ML to predict patient outcomes but also the considerable effort the healthcare sector has put into adapting these technologies to the current pandemic.

As the pandemic crisis continues, provider organizations and researchers will work to further develop analytics tools that can help hospitals triage patients and manage care.

“High-performing predictive models using ML to boost and improve the care of our patients at Mount Sinai have been created,” states Girish Nadkarni, MD, Assistant Professor of Medicine (Nephrology) at the Icahn School of Medicine, Clinical Director of the Hasso Plattner Institute for Digital Health at Mount Sinai, and Co-Chair of MSCIC.


The future lies in a method that identifies the key health markers driving probability estimates for acute care prognosis, one that health institutions across the world can use to improve care decisions at both the physician and hospital level and to manage patients with Covid-19 more effectively.