Publication
Title
Unsupervised concept extraction from clinical text through semantic composition
Author
Abstract
Concept extraction is an important step in clinical natural language processing. Once extracted, concepts can improve the accuracy and generalization of downstream systems. We present a new unsupervised system for the extraction of concepts from clinical text. The system creates representations of concepts from the Unified Medical Language System (UMLS®) by combining natural language descriptions of concepts with word representations, and composing these into higher-order concept vectors. These concept vectors are then used to assign labels to candidate phrases, which are extracted using a syntactic chunker. Our approach achieves an exact F-score of .32 and an inexact F-score of .45 on the well-known I2b2-2010 challenge corpus, outperforming the only other unsupervised concept extraction method. Because our approach relies only on word representations and a chunker, it requires no annotated training data and can therefore be applied to languages and corpora for which we do not have prior annotations. All our code is open-source and can be found at www.github.com/clips/conch.
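The pipeline described in the abstract (compose word vectors into concept vectors, then label candidate phrases by similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy two-dimensional vectors, the averaging composition function, the cosine-similarity rule, and every name below (`WORD_VECS`, `CONCEPT_DESCRIPTIONS`, `compose`, `label_phrase`) are assumptions made for illustration, and the candidate phrase is passed in directly rather than produced by a chunker.

```python
import math

# Toy 2-d word vectors (assumption); a real system would load pretrained embeddings.
WORD_VECS = {
    "chest": [1.0, 0.0],
    "pain": [0.9, 0.1],
    "aspirin": [0.0, 1.0],
    "tablet": [0.1, 0.9],
}

# Hypothetical concept descriptions keyed by label, standing in for UMLS descriptions.
CONCEPT_DESCRIPTIONS = {
    "problem": ["chest", "pain"],
    "treatment": ["aspirin", "tablet"],
}


def compose(words):
    """Average the vectors of the known words (an assumed composition function)."""
    vecs = [WORD_VECS[w] for w in words if w in WORD_VECS]
    if not vecs:
        return None  # no known word, so no vector can be composed
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# One composed vector per concept label.
CONCEPT_VECS = {
    label: compose(words) for label, words in CONCEPT_DESCRIPTIONS.items()
}


def label_phrase(phrase):
    """Return the label of the most similar concept vector, or None."""
    vec = compose(phrase.lower().split())
    if vec is None:
        return None
    return max(CONCEPT_VECS, key=lambda label: cosine(vec, CONCEPT_VECS[label]))


print(label_phrase("chest pain"))
print(label_phrase("aspirin"))
```

Because no step uses labeled examples, a scheme of this shape depends only on the quality of the word vectors and the concept descriptions, which is the property the abstract emphasizes.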
Language
English
Source (journal)
Journal of biomedical informatics. - New York
Publication
New York : 2019
ISSN
1532-0464
DOI
10.1016/J.JBI.2019.103120
Volume/pages
91 (2019), p. 1-11
ISI
000525688200013
UAntwerpen
Affiliation
Publications with a UAntwerp address
Record
Creation 25.03.2019
Last edited 08.11.2024