Title: Unsupervised concept extraction from clinical text through semantic composition
Author
Abstract: Concept extraction is an important step in clinical natural language processing. Once extracted, concepts can improve the accuracy and generalization of downstream systems. We present a new unsupervised system for the extraction of concepts from clinical text. The system creates representations of concepts from the Unified Medical Language System (UMLS®) by combining natural language descriptions of concepts with word representations, and composing these into higher-order concept vectors. These concept vectors are then used to assign labels to candidate phrases, which are extracted using a syntactic chunker. Our approach achieves an exact F-score of 0.32 and an inexact F-score of 0.45 on the well-known I2b2-2010 challenge corpus, outperforming the only other unsupervised concept extraction method. Because our approach relies only on word representations and a chunker, it is completely unsupervised and can therefore be applied to languages and corpora for which no prior annotations exist. All our code is open-source and can be found at www.github.com/clips/conch.
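The pipeline described in the abstract (compose word vectors into concept vectors, then label candidate phrases by comparing them to those vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes averaging as the composition function and cosine similarity as the comparison, and the word vectors, concept descriptions, label names, and the helper names `compose`, `cosine`, and `label_phrase` are all hypothetical. Candidate phrases are passed in as plain strings, standing in for the output of a syntactic chunker.

```python
import numpy as np

# Toy 3-dimensional word vectors (hypothetical values, for illustration only).
WORD_VECTORS = {
    "heart": np.array([1.0, 0.0, 0.0]),
    "attack": np.array([0.8, 0.2, 0.0]),
    "aspirin": np.array([0.0, 1.0, 0.0]),
    "tablet": np.array([0.1, 0.9, 0.0]),
    "scan": np.array([0.0, 0.0, 1.0]),
    "imaging": np.array([0.0, 0.1, 0.9]),
}

# Hypothetical concept labels, each with a short bag-of-words description.
CONCEPT_DESCRIPTIONS = {
    "problem": ["heart", "attack"],
    "treatment": ["aspirin", "tablet"],
    "test": ["scan", "imaging"],
}


def compose(words):
    """Average the vectors of all known words (assumed composition function).

    Returns None when no word has a vector.
    """
    vectors = [WORD_VECTORS[w] for w in words if w in WORD_VECTORS]
    if not vectors:
        return None
    return np.mean(vectors, axis=0)


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# One composed vector per concept label.
CONCEPT_VECTORS = {
    label: compose(words) for label, words in CONCEPT_DESCRIPTIONS.items()
}


def label_phrase(phrase):
    """Return the label whose concept vector is closest to the phrase vector."""
    vector = compose(phrase.lower().split())
    if vector is None:
        return None  # no known words, so no label can be assigned
    return max(
        CONCEPT_VECTORS,
        key=lambda label: cosine(vector, CONCEPT_VECTORS[label]),
    )


if __name__ == "__main__":
    for candidate in ["aspirin tablet", "imaging scan", "Heart attack"]:
        print(candidate, "->", label_phrase(candidate))
```

No annotated training data is involved at any step, which is what makes an approach of this shape unsupervised: only pretrained word representations and concept descriptions are needed.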
Language: English
Source (journal): Journal of biomedical informatics. - New York
Publication: New York : 2019
ISSN: 1532-0464
DOI: 10.1016/J.JBI.2019.103120
Volume/pages: 91 (2019), p. 1-11
ISI: 000525688200013