Title
Exploring and understanding neural models for clinical tasks

Author

Abstract
We explore the use of deep learning techniques to understand medical text in tasks related to patient conditions. Given the sensitive nature of the medical domain, an important requirement is to ensure that the models we train are not biased by nuances of the data and training algorithms. Hence, we also develop several interpretability techniques to investigate the information these models learn during training.

In Chapter 1, we describe our efforts towards learning a holistic patient view in the form of semantic representations of patients induced from medical notes. We evaluate these representations on the tasks of primary diagnostic and procedural category prediction, as well as in-hospital, 30-day post-discharge, and 1-year post-discharge mortality prediction. We find that induced neural representations significantly outperform bag-of-features patient representations when there are few cases to learn from. Furthermore, we find that the most frequent terms can be recovered most easily from the representations, although several medical terms are deemed highly important for classification.

In Chapter 2, we develop techniques to find more complex feature-interaction patterns learned by feedforward neural networks. We explore the use of if-then-else rules as explanation patterns, learned by training a rule learning algorithm to estimate the outputs of a given neural network from the set of its most important features. With our method, we find that explanations learned in the form of hierarchical if-then-else rules have high precision, although recall can be lower, giving an overall explanation faithfulness of nearly 80%. However, due to their hierarchical nature, the if-then-else rules can sometimes become complex to parse.

In Chapter 3, we adapt our methods to find non-hierarchical decision lists that capture longer contexts in the form of skipgrams and explain more complex sequential architectures that take word embeddings as inputs. We validate the explanation patterns on a synthetic textual dataset constructed to resemble real clinical corpora, and find that the explanations correspond to the rules used to label the dataset. We then discuss the patterns learned by our models on the real clinical task of sepsis classification at the time of patient discharge. From the explanation rules, we find that lexical mentions of sepsis are often used to predict sepsis when discharge notes are included, whereas more complex patterns involving different patient conditions are assessed when discharge notes are excluded from the analysis. On a sentiment analysis task, we find that our methods are more scalable and that our explanations provide more context and are more faithful than existing state-of-the-art rule-based explanation methods, although at a higher explanation complexity.
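As a rough illustration of the rule-extraction idea in Chapters 2 and 3, the sketch below fits an interpretable surrogate to a black-box network's own predictions and measures faithfulness as their agreement. It is a minimal sketch only: the scikit-learn decision tree, the correlation-based feature ranking, and the names (explain_with_rules, network_predict) are illustrative stand-ins, not the rule learner or importance scores used in the thesis.

```python
# Minimal sketch: extract if-then-else explanation rules from a black-box
# classifier by training a surrogate rule learner on the classifier's own
# predictions, restricted to its most important features.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def explain_with_rules(network_predict, X, feature_names, top_k=10, max_depth=3):
    """network_predict: callable mapping a feature matrix to predicted labels.
    X: (n_samples, n_features) input matrix. Returns (rules, fidelity)."""
    y_net = network_predict(X)  # labels assigned by the network, not gold labels

    # Crude importance proxy: keep the k features whose values correlate most
    # strongly with the network's output (the thesis derives importances from
    # the network itself instead).
    scores = np.array([abs(np.corrcoef(X[:, j], y_net)[0, 1])
                       for j in range(X.shape[1])])
    top = np.argsort(scores)[::-1][:top_k]

    # The surrogate's decision paths are the hierarchical if-then-else rules.
    surrogate = DecisionTreeClassifier(max_depth=max_depth).fit(X[:, top], y_net)

    # Faithfulness: how often the extracted rules reproduce the network's output.
    fidelity = float((surrogate.predict(X[:, top]) == y_net).mean())
    rules = export_text(surrogate, feature_names=[feature_names[j] for j in top])
    return rules, fidelity
```

Keeping the surrogate shallow trades recall for rules that remain readable, mirroring the precision/recall behaviour described above.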
Finally, in Chapters 4 and 5, we analyze how existing state-of-the-art models trained on task-specific data fail on instances that require medical domain knowledge, in the tasks of medical language inference and relation classification between medical entities mentioned in text. We focus on improving these models by supplementing domain information from textual medical corpora. For medical language inference, we explore methods for augmenting in-domain knowledge both implicitly, via further language modeling, and explicitly, by adding relevant background information to the instance. For relation extraction, we add a feature that quantifies the pointwise mutual information (PMI) between medical entities to provide additional background for relation classification. In both setups, we find that integrating background knowledge from textual corpora is extremely complex: we do not see significant improvements with our methods, leaving this question open for further research.
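For reference, the pointwise mutual information between two entities a and b is log(p(a, b) / (p(a) p(b))). The sketch below estimates it from document-level co-occurrence counts in a background corpus; the document-level counting and the function name are assumptions for illustration, not the exact feature used in the thesis.

```python
# Minimal sketch: PMI between two medical entities, estimated from
# document-level co-occurrence in a background corpus.
import math

def pmi(entity_a, entity_b, documents):
    """PMI(a, b) = log( p(a, b) / (p(a) * p(b)) ), with probabilities
    estimated as document frequencies."""
    n = len(documents)
    n_a = sum(1 for d in documents if entity_a in d)
    n_b = sum(1 for d in documents if entity_b in d)
    n_ab = sum(1 for d in documents if entity_a in d and entity_b in d)
    if n_a == 0 or n_b == 0 or n_ab == 0:
        return float("-inf")  # no (co-)occurrence: no measurable association
    return math.log(n_ab * n / (n_a * n_b))

# A high value, e.g. pmi("septic shock", "vasopressors", notes), signals that
# the two entities co-occur far more often than chance would predict.
```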
To sum up, we develop several state-of-the-art techniques for medical text understanding, while also enabling understanding of these techniques themselves: making black-box algorithms more transparent and supporting error analysis and fairness assessment. This line of work will not only improve existing models for natural language understanding, but will also promote wider adoption of successful methods from academic research for critical clinical decision making in hospitals.

Language
English

Publication
Antwerp: University of Antwerp, Faculty of Arts, Department of Linguistics, 2021

Volume/pages
155 p.

Note
Daelemans, Walter [Supervisor]
Suster, Simon [Supervisor]

Full text (open access)