By Giovanni Cinà and Michele Tonutti (Data Scientists at Pacmed)
Machine learning promises to be a technology that could help medical professionals and patients to achieve better health outcomes. This value does not come easily, and the implementation of machine learning in healthcare is a precarious issue. There should be constant consideration of the boundary conditions under which it must be implemented in order to be of sustainable and scalable value for everyone. Machine learning is not good and safe by definition, and responsible deployment is the key to success for the patient. There are risks of empty correlations, biases and limitations in the data, (un)interpretability of algorithms, and unilateral interests. One way of mitigating these risks is a deep understanding of the clinical and personal context, by working closely together with all relevant stakeholders in the medical field.
What is also important is to have an active and curious mind, which enables you to stay on the cutting edge of available knowledge at all times. We therefore find it important to determine objectively which promising developments to keep an eye on. Among other things, we do this by internally sharing and discussing papers that we find interesting. We would like this knowledge to become publicly available, so that we can help each other learn about the right and responsible way of implementing machine learning models.
This blog post is the first edition of a collection of articles that we found particularly interesting and useful. The theme of this edition is the handling (and reuse) of electronic health records (EHR) from a technical perspective. Because working with EHR data is often a bottleneck in the deployment of machine learning, this is a hot topic.
EHRs are digital records of a patient’s health information taken over time, generated by one or more visits to hospitals, clinics, GPs, or other care institutions. They include results of lab tests and measurements, such as heart rate and blood pressure, as well as medications, diagnoses, and the notes written by doctors and nurses. The volume of EHR data recorded for each patient is vast; this amount of information presents valuable opportunities to develop data-driven decision support tools, but also presents great challenges.
EHR data is often heterogeneous, meaning that most measured parameters are widely different from each other; sometimes incoherent, since even a single parameter may be measured differently over time or by different actors; sparse, with many missing values; noisy, i.e. many measurements are recorded incorrectly or not validated; and biased, for instance due to the fact that more data might be present for patients with severe conditions, or because the data is influenced by the beliefs and decisions of the clinical staff. Finally, EHRs can change shape dramatically across institutions, and they are highly sensitive due to privacy issues.
These limitations have hindered the application of AI techniques in healthcare; yet, there have been noteworthy developments. At Pacmed we are keen to implement cutting-edge solutions that allow us to build models that are not only precise and useful, but also scalable, generalizable, reliable and interpretable.
#1: A general overview
Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review
By: Cao Xiao, Edward Choi, Jimeng Sun
With the development of advanced Machine Learning techniques, the promise to unlock the value contained in EHR data seems more concrete than ever. However, as usual with subfields of AI that gather more and more attention, the number of papers on the topic is growing at a fast pace and it becomes hard to separate the wheat from the chaff. In order to navigate this literature, our first suggestion is an overview paper that highlights the main trends. It reviews 98 articles describing applications of deep learning on EHR data in the last 8 years, classifying them by task performed and architecture used.
Why it matters: Besides providing a general overview and pointing to many good papers, we chose this article because it touches on the challenges presented by EHR data and discusses some solutions proposed by the research community. A good starting point!
#2: Producing synthetic data
Generating Multi-label Discrete Patient Records using Generative Adversarial Networks
By: Edward Choi, Siddharth Biswal, Bradley Malin, et al.
As mentioned, one of the main issues when using EHR data is the concern over privacy, since the disclosure of medical information can be highly problematic. As a consequence, it is difficult to obtain and pool together large datasets, which in turn hampers the development of data-hungry Machine Learning algorithms.
A number of researchers are trying to tackle this problem with the generation of synthetic data. The idea is roughly the following: train a model on real data so that it produces synthetic data that is “as good as real”. We can then share the synthetic data safely, since it is hard to obtain information about real patients from it, and we can also use synthetic data to augment our datasets. The techniques to generate synthetic data are still developing and we are far from safely using them in healthcare; nevertheless, we find this an interesting niche.
Why it matters: The paper we highlight applies Generative Adversarial Networks to generate synthetic patient records. Although limited to the generation of EHR data with discrete variables, the article offers some promising results along with the introduction of a new technique called minibatch averaging. Intriguing if data privacy is your cup of tea.
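To give a flavour of minibatch averaging: as we understand it from the paper, the discriminator sees each record together with the average of the minibatch the record came from, which makes it easier to spot a generator that keeps producing near-identical patients. The sketch below shows only that concatenation step on a toy batch of binary codes. The function names and the data are our own illustrative assumptions, not the authors' code, and no GAN is actually trained here.

```python
from typing import List


def minibatch_average(batch: List[List[float]]) -> List[float]:
    """Column-wise mean of a minibatch of patient records."""
    n = len(batch)
    return [sum(column) / n for column in zip(*batch)]


def with_minibatch_average(batch: List[List[float]]) -> List[List[float]]:
    """Append the minibatch mean to every record in the batch.

    The result is what a discriminator would receive as input in this
    simplified picture: the record itself plus batch-level statistics.
    """
    mean = minibatch_average(batch)
    return [list(record) + mean for record in batch]


# Hypothetical minibatch: 3 patients x 4 binary diagnosis codes.
batch = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

augmented = with_minibatch_average(batch)
for row in augmented:
    print(row)
```

Each record doubles in width: the first half is the patient, the second half is identical across the batch and summarises how often each code occurs in it.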
#3: Deep learning for scalable models
Scalable and accurate deep learning with electronic health records
By: Alvin Rajkomar, Eyal Oren, Kai Chen, et al.
The number of different parameters stored in EHRs is often in the hundreds. Choosing which ones to use to train machine learning models is often difficult and requires a certain degree of trial and error. The paper shows that deep learning models can bypass the need to choose predictive features in advance and, in addition, that they can be used to learn a suitable representation of clinical data from the data itself, for instance using autoencoders. From a data science perspective, there are clear advantages in not having to select features manually and in finding a good representation of the input data automatically: models can be built and computed much faster, and often with an increase in performance compared to using the whole set of available parameters. Extensive collaboration with medical professionals will still be needed, but it could mean a gain in efficiency.
Why it matters: The contributions of this paper are substantial, as the authors report improved performance of their models on a number of very different tasks for hospitalized patients, namely prediction of mortality, readmission, length of stay and discharge diagnosis. Being able to use similar or even identical model architectures and data representations for different problems is extremely valuable to speed up development, increase robustness, and quickly scale software across different hospitals. We must however not forget that medical expertise is often crucial to the development of a successful decision support tool. The features selected by the model might be heavily biased towards a certain clinical population; or they might be highly predictive but measured very infrequently, thus possibly making them unsuitable for urgent predictions.
#4: Learning personalized data representations
Patient2Vec: A Personalized Interpretable Deep Representation of the Longitudinal Electronic Health Record
By: Jinghe Zhang, Kamran Kowsari, James H. Harrison, et al.
This paper, from the University of Virginia, also employs representation learning (the set of techniques used to learn a good way to represent the data) together with a method called attention, to both improve performance and increase interpretability of deep neural networks for clinical decision making. Their framework, named Patient2Vec as a reference to Google’s Natural Language Processing algorithm Word2Vec, aims at capturing the complex relationships between clinical events and at learning a personalized representation of each patient’s history. Medical events (prescriptions, interventions, lab results, etc.) are treated like words in a sentence, hence the reference to Natural Language Processing. The data is represented in sequences and subsequences in a way that highlights the clinical and temporal relationship between visits and allows irregularly sampled sequential data to be handled in a meaningful manner. The addition of the attention technique is of particular interest. As its name suggests, it is inspired by human (visual) attention: when performing a task, we tend to focus on one or very few specific things at once, those that are most important for the completion of the task. Similarly, a neural network with an attention layer will focus on the sections of a sequence that are most relevant to predict the outcome. The weights of these attention layers can be visualized to understand what these sections are and how they influence the predicted outcome.
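The core of attention fits in a few lines. The sketch below is a generic dot-product attention over a sequence of visit vectors, not the Patient2Vec architecture itself: the visit vectors, the query vector and the function names are all made-up assumptions for illustration. It scores each visit against a query, turns the scores into weights that sum to one, and returns the weighted summary of the sequence. Those weights are what one would visualize.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Turn arbitrary scores into positive weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(visits: List[List[float]], query: List[float]) -> Tuple[List[float], List[float]]:
    """Dot-product attention over a sequence of visit vectors.

    Returns the attention weight of each visit and the weighted
    average of the visits (the summary passed on to a predictor).
    """
    scores = [sum(v_i * q_i for v_i, q_i in zip(visit, query)) for visit in visits]
    weights = softmax(scores)
    dim = len(visits[0])
    summary = [sum(w * visit[d] for w, visit in zip(weights, visits)) for d in range(dim)]
    return weights, summary


# Hypothetical embeddings of three visits (in practice these are learned).
visits = [
    [0.1, 0.0, 0.2],
    [0.9, 0.8, 0.1],
    [0.2, 0.1, 0.0],
]
# Hypothetical query vector (in practice also learned during training).
query = [2.0, 2.0, 0.0]

weights, summary = attend(visits, query)
print("attention weights per visit:", [round(w, 3) for w in weights])
```

In this toy example the second visit receives by far the largest weight, so a plot of the weights would point a clinician to that visit as the one driving the prediction.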
Why it matters: The paper highlights how data collected in practice about the patient’s phenotype (physical and medical characteristics) and clinical history can lead to models that improve both the outcome and the patient’s experience, as well as lower costs and improve overall efficiency. In addition, the successful application of cutting-edge NLP and Computer Vision techniques to numerical sequential data goes to show how much can be gained from multidisciplinarity within the field of machine learning.
#5: Learning from suboptimal clinical decisions
The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care
By: Matthieu Komorowski, Leo A. Celi, Omar Badawi, et al.
One of the most interesting approaches to solve the problem of sparse data is reinforcement learning (RL). RL applications in the medical domain are particularly appealing because models can learn from suboptimal observations, meaning treatment choices that turned out not to be the best possible solution. The authors of this paper used data from treatment decisions obtained in practice in the Intensive Care Unit to improve treatment decisions for sepsis, the leading cause of death in hospitals worldwide. The clinical events in each patient’s history were modelled using a Markov Decision Process, which, at a statistical level, adds an element of stochasticity (i.e. random chance) to the outcomes of treatment choices, visits, and measurements. While this is not a novel approach per se, the results are extremely promising: mortality was lowest when the treatment chosen by the clinician matched the one chosen by the model. In addition, their solution allowed for intuitive interpretation of the results by adding an extra layer consisting of a random forest classifier, thereby making it possible to explain the decision taken by the model by looking at the importance of each feature.
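For readers new to Markov Decision Processes, here is a deliberately tiny example. Everything in it is an assumption made up for illustration: the two patient states, the two actions, the transition probabilities and the rewards have no clinical meaning, and we solve it with plain value iteration, a textbook algorithm that is not necessarily the one used in the paper. The point is only to show how stochastic outcomes and a reward for survival combine into a recommended action per state.

```python
from typing import Dict, List, Tuple

# transitions[state][action] = list of (probability, next_state, reward).
# "recovered" and "died" are terminal: they do not appear as keys.
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

transitions: Transitions = {
    "stable": {
        "low_dose": [(0.8, "recovered", 100.0), (0.2, "critical", 0.0)],
        "high_dose": [(0.6, "recovered", 100.0), (0.4, "critical", 0.0)],
    },
    "critical": {
        "low_dose": [(0.3, "stable", 0.0), (0.7, "died", -100.0)],
        "high_dose": [(0.6, "stable", 0.0), (0.4, "died", -100.0)],
    },
}


def value_iteration(transitions: Transitions, gamma: float = 0.99, tol: float = 1e-8):
    """Compute state values and a greedy policy for a small tabular MDP."""
    values = {state: 0.0 for state in transitions}

    def q(state: str, action: str) -> float:
        # Expected return of taking `action` in `state`; terminal states are worth 0.
        return sum(
            p * (reward + gamma * values.get(next_state, 0.0))
            for p, next_state, reward in transitions[state][action]
        )

    while True:
        delta = 0.0
        for state in transitions:
            best = max(q(state, action) for action in transitions[state])
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break

    policy = {
        state: max(transitions[state], key=lambda action: q(state, action))
        for state in transitions
    }
    return values, policy


values, policy = value_iteration(transitions)
print(policy)
```

With these made-up numbers the policy is conservative for stable patients and aggressive for critical ones. With real data, the transition probabilities would be estimated from recorded patient trajectories, which is where the suboptimal decisions of the past become useful.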
Why it matters: Aside from the technical achievements, this paper touches upon a crucial point in machine learning development in healthcare: how to evaluate a model’s performance. It is pivotal to measure the effect on both patients and doctors: there is no need for a complex algorithm with extremely good performance if it provides no benefit to a patient’s life. Moreover, the addition of an interpretable classifier is of particular relevance in the medical sector, since it is crucial that the clinical staff fully understands an algorithm’s output. Lack of interpretability is one of the major downsides of deep learning, and alternative ways to visualize and explain results and feature importance are always valuable.
What do you think of this selection? We are keen on hearing your take on this subject: do share your comments here or reach out to us on Twitter. Last but not least, let us know your ideas for the next topic of our newsletter!
 Xiao C, Choi E, Sun J. Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review. Journal of the American Medical Informatics Association. 2018 Jun 8.
 Choi E, Biswal S, Malin B, Duke J, Stewart WF, Sun J. Generating multi-label discrete patient records using generative adversarial networks. arXiv preprint arXiv:1703.06490. 2017 Mar 19.
 Rajkomar A, Oren E, Chen K, Dai AM, Hajaj N, Hardt M, Liu PJ, Liu X, Marcus J, Sun M, Sundberg P. Scalable and accurate deep learning with electronic health records. npj Digital Medicine. 2018 May 8;1(1):18.
 Zhang J, Kowsari K, Harrison JH, Lobo JM, Barnes LE. Patient2Vec: A Personalized Interpretable Deep Representation of the Longitudinal Electronic Health Record. IEEE Access. 2018;6:65333–46.
 Komorowski M, Celi LA, Badawi O, Gordon AC, Faisal AA. The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care. Nature Medicine. 2018 Nov;24(11):1716.