Causal inference for medical AI: what can we learn from observational data of COVID-19 patients?


At the time of writing, it has been more than a year since the onset of the COVID-19 pandemic. Among all the ways in which our lives have been changed, we have all been presented with a very pressing question: how do we find a cure?

Collecting data of COVID-19 patients

To be able to say anything at all, we must first collect data on COVID-19 patients. In the Netherlands, an unprecedented collaboration between Intensive Care Units (ICUs) across the country, led by the Amsterdam University Medical Center, resulted in a vast database containing information on COVID-19 patients from the ICUs of dozens of hospitals. You can find a description of the dataset in this letter, published in the Journal of Intensive Care Medicine.

Estimating treatment effect

Assessing the efficacy of a treatment always comes down to answering a “what if” question, also known as a counterfactual question: what would have happened to this patient had they received a different treatment? Let’s look at an example.

The headache-aspirin example

(Disclaimer: this example is completely made up for explanatory purposes and does not contain any medical advice.)

The subject of this example, headache. Incidentally, also a potential side-effect of this article.

Back to the main story

In summary, to draw reasonable conclusions from observational data we need to

  • identify two groups of patients, one that received the treatment and one that did not;
  • check that the two groups are very similar, meaning that the distributions of their measured characteristics are similar;
  • make sure that all variables influencing both the treatment decision and the outcome (the confounders) have been measured;
  • check that each patient could plausibly have either received the treatment or not received it.
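
The second and fourth checks above can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis performed on the Dutch ICU data: the patient records, the field names (`treated`, `age`, `crp`) and the 0.1 threshold on the standardized mean difference are all assumptions made for the example. The standardized mean difference is one common way to compare the distribution of a characteristic between two groups, and a shared range of values is a very coarse proxy for "each patient could have gone either way".

```python
import math
import statistics


def standardized_mean_difference(treated, control):
    """Difference in group means divided by the pooled standard deviation."""
    mean_t, mean_c = statistics.fmean(treated), statistics.fmean(control)
    var_t, var_c = statistics.variance(treated), statistics.variance(control)
    pooled_sd = math.sqrt((var_t + var_c) / 2)
    if pooled_sd == 0:
        # No spread in either group: balanced only if the means coincide.
        return 0.0 if mean_t == mean_c else math.inf
    return (mean_t - mean_c) / pooled_sd


def common_support(treated, control):
    """Range of values observed in BOTH groups, or None if the ranges do not overlap."""
    low = max(min(treated), min(control))
    high = min(max(treated), max(control))
    return (low, high) if low <= high else None


def balance_report(patients, treatment_key, covariates, threshold=0.1):
    """Per-covariate balance and overlap summary (threshold is an assumed rule of thumb)."""
    treated = [p for p in patients if p[treatment_key]]
    control = [p for p in patients if not p[treatment_key]]
    report = {}
    for cov in covariates:
        values_t = [p[cov] for p in treated]
        values_c = [p[cov] for p in control]
        smd = standardized_mean_difference(values_t, values_c)
        report[cov] = {
            "smd": smd,
            "balanced": abs(smd) < threshold,
            "common_support": common_support(values_t, values_c),
        }
    return report


# Made-up toy records, for illustration only.
patients = [
    {"treated": True, "age": 60, "crp": 10},
    {"treated": True, "age": 62, "crp": 12},
    {"treated": True, "age": 64, "crp": 14},
    {"treated": False, "age": 60, "crp": 20},
    {"treated": False, "age": 62, "crp": 22},
    {"treated": False, "age": 64, "crp": 24},
]

report = balance_report(patients, "treated", ["age", "crp"])
for name, result in report.items():
    print(name, result)
```

In this toy data the two groups have identical ages, but their `crp` values do not overlap at all, so any difference in outcome could not be attributed to the treatment alone. Note that no check of this kind can tell us whether an important confounder was never measured; that third condition has to be argued from medical knowledge.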

What can we do about COVID-19, then?

Now that the background is out of the way, we can see that there are two key questions remaining:

  1. Is there a treatment for COVID-19 for which the above conditions are fulfilled in the Dutch data, so that we can estimate its efficacy?
  2. What are the best methods to carry out this estimation?





Pacmed builds decision support tools for doctors based on machine learning that makes sure patients only receive care that has proven to work for them!