Predictive Analytics and ML Help in the Fight Against COVID-19


Forecasting exercises and predictive modeling tools have been developed by a range of organizations, including academic institutions, research groups, hospitals, and consulting companies, to support health systems with COVID-19 strategic decision-making, planning, and health policy formulation. Predictive models can help estimate the number of COVID-19 cases and deaths; the resources needed, such as hospital beds and ICU beds; and the demand for supplies, such as Personal Protective Equipment (PPE). Because predictive models for COVID-19 depend on a rapidly changing situation and on underlying data that is frequently updated and revised, their results may change repeatedly. Nevertheless, predictive models are meaningful and can offer crucial insights to policymakers.

Researchers Use ML and Predictive Models to Help Diagnose COVID

A recent JAMA study used ML models and predictive analytics to identify the risk factors associated with the severity of COVID-19 in individuals. By studying COVID-19 severity and risk factors over time, providers can use AI technology to forecast clinical severity and provide better care for patients.

To address gaps in the data, the National COVID Cohort Collaborative (N3C) was formed to accelerate the understanding of COVID-19 and to establish a novel approach for collaborative data sharing and analytics during the pandemic.

The N3C comprises members from the National Institutes of Health Clinical and Translational Science Awards Program and its Center for Data to Health, the National Patient-Centered Clinical Research Network, the IDeA Centers for Translational Research, TriNetX, the Observational Health Data Sciences and Informatics network, and the Accrual to Clinical Trials network.

The N3C has produced a report that provides a detailed clinical description of the largest cohort of US COVID-19 cases and representative controls to date. The cohort was ethnically and racially diverse and geographically distributed. COVID-19 severity and the associated clinical and demographic factors were assessed over time, and ML was used to develop a clinically useful model that can accurately predict severity using data from the first day of hospital admission.

Patients were stratified using the World Health Organization severity scale for COVID-19 and by demographic characteristics. Differences between the groups over time were evaluated using multivariable logistic regression.
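To make the idea of a multivariable logistic regression more concrete, the sketch below shows how such a model, once fitted, turns several patient characteristics into a single probability, and how each coefficient translates into an odds ratio. The intercept, the variable names, and the coefficient values are hypothetical and chosen only for illustration; they are not taken from the N3C study.

```python
import math

def sigmoid(log_odds: float) -> float:
    """Convert a log-odds value into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients on the log-odds scale (NOT from the study).
INTERCEPT = -3.0
COEFFICIENTS = {
    "age_decades": 0.35,     # per additional decade of age
    "diabetes": 0.40,        # 1 if present, 0 otherwise
    "kidney_disease": 0.55,  # 1 if present, 0 otherwise
}

def predicted_probability(patient: dict) -> float:
    """Probability of a severe course for one patient under the toy model.

    Characteristics missing from the patient record are treated as 0.
    """
    log_odds = INTERCEPT + sum(
        beta * patient.get(name, 0.0) for name, beta in COEFFICIENTS.items()
    )
    return sigmoid(log_odds)

def odds_ratios() -> dict:
    """Odds ratio per one-unit increase in each variable: exp(coefficient)."""
    return {name: math.exp(beta) for name, beta in COEFFICIENTS.items()}

if __name__ == "__main__":
    patient = {"age_decades": 6.5, "diabetes": 1, "kidney_disease": 0}
    print(f"Predicted probability: {predicted_probability(patient):.3f}")
    for name, ratio in odds_ratios().items():
        print(f"{name}: odds ratio {ratio:.2f}")
```

Because every variable enters the same equation, each odds ratio is adjusted for the others, which is what allows groups to be compared while holding the remaining characteristics fixed.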

The mortality rate among hospitalized patients was 11.6 percent overall and decreased from 16.4 percent in March to April 2020 to 8.6 percent in September to October 2020.
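The size of that decline can be expressed in two ways, as an absolute drop in percentage points and as a relative reduction. The short sketch below works through the arithmetic using the two rates quoted above; the function name is illustrative, and the calculation compares crude rates only, without adjusting for changes in the patient mix.

```python
def mortality_change(early_pct: float, late_pct: float) -> tuple:
    """Return (absolute drop in percentage points, relative reduction in percent)."""
    absolute = early_pct - late_pct
    relative = absolute / early_pct * 100.0
    return absolute, relative

# Rates reported for March-April 2020 and September-October 2020.
absolute_drop, relative_drop = mortality_change(16.4, 8.6)
print(f"{absolute_drop:.1f} percentage points, {relative_drop:.1f}% relative reduction")
# prints "7.8 percentage points, 47.6% relative reduction"
```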

Using 64 inputs available on the first hospital day, the study predicted a severe clinical course with random forest and XGBoost models (area under the receiver operating characteristic curve = 0.87 for both) that were stable over time. “The factor most strongly associated with clinical severity was pH; this result was consistent across ML methods,” the study explained. The study concluded that COVID-19 mortality decreased over the course of 2020 and that patient demographic characteristics and comorbidities were associated with higher clinical severity.
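The area under the receiver operating characteristic curve can be read as the probability that a model gives a randomly chosen severe case a higher risk score than a randomly chosen non-severe case. The sketch below computes it directly from that definition. The labels and scores are made-up toy values, not study data, and the pairwise loop is chosen for readability rather than speed.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts how often a positive case outscores a negative case;
    tied scores count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy example: 1 = severe course, 0 = non-severe; scores are model outputs.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.2, 0.1, 0.7]
print(auroc(labels, scores))  # prints 0.9375
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported 0.87 indicates that the models ranked severe cases above non-severe ones in most pairs.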


Units of analysis, such as patient or population subgroups, can be highly heterogeneous, which requires a different set of assumptions for each group. Parameter uncertainty (statistical error) describes how precise a statistical estimate is and how far the true value may deviate from it. Finally, a model’s results can be judged by how well they generalize to the wider target population (generalizability). Performing “sensitivity analyses” is key to better understanding uncertainty.
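A one-way sensitivity analysis is one simple way to do this: each input is moved across a plausible range while the others stay at their baseline values, and the resulting swing in the output shows which assumption matters most. The sketch below applies the idea to a deliberately simplified ICU bed estimate. The formula, the baseline values, and the ranges are all hypothetical and are not drawn from any model discussed in this article.

```python
# Hypothetical baseline inputs and plausible low/high ranges.
BASELINE = {
    "active_cases": 10_000,
    "hospitalization_rate": 0.10,  # share of active cases admitted
    "icu_fraction": 0.25,          # share of admissions needing ICU care
}
RANGES = {
    "active_cases": (8_000, 12_000),
    "hospitalization_rate": (0.05, 0.15),
    "icu_fraction": (0.20, 0.30),
}

def icu_beds_needed(params: dict) -> float:
    """Toy demand model: cases x hospitalization rate x ICU fraction."""
    return (
        params["active_cases"]
        * params["hospitalization_rate"]
        * params["icu_fraction"]
    )

def one_way_sensitivity(baseline: dict, ranges: dict) -> dict:
    """Vary one input at a time and record the output at its low and high value."""
    results = {}
    for name, (low, high) in ranges.items():
        low_output = icu_beds_needed({**baseline, name: low})
        high_output = icu_beds_needed({**baseline, name: high})
        results[name] = (low_output, high_output)
    return results

def rank_by_swing(results: dict) -> list:
    """Sort inputs by the size of the output swing, largest first."""
    return sorted(results.items(), key=lambda item: item[1][1] - item[1][0], reverse=True)

if __name__ == "__main__":
    print(f"Baseline estimate: {icu_beds_needed(BASELINE):.0f} ICU beds")
    for name, (low_output, high_output) in rank_by_swing(one_way_sensitivity(BASELINE, RANGES)):
        print(f"{name}: {low_output:.0f} to {high_output:.0f} beds")
```

In this toy setup the hospitalization rate produces the widest swing, so it is the assumption most worth refining before the estimate is used for planning.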