Publication
Title
Discourse lexicon induction for multiple languages and its use for gender profiling
Author
Abstract
We propose a novel way to create categorized discourse lexicons for multiple languages. We combine information from the Penn Discourse Treebank with statistical machine translation techniques on the Europarl corpus. Using gender profiling as an application, we evaluate our approach by comparing it with an approach using features from a knowledge-based lexicon and with a Rhetorical Structure Theory (RST) discourse parser. Our experiments are performed on corpora for three languages (English, Dutch, and German) in two genres (news and blogs). We include a feature analysis in which we look for (in)consistencies of discourse features related to male and female authors between the different experimental settings.
Language
English
Source (journal)
Digital scholarship in the humanities : a journal of the Alliance of Digital Humanities Organizations. - Oxford, 2015, currens
Publication
Oxford : Oxford University Press, 2019
ISSN
2055-7671 [print]
2055-768X [online]
DOI
10.1093/LLC/FQY025
Volume/pages
34:1 (2019), p. 208-220
Article Reference
fqy025
ISI
000481421200015
Medium
E-only publication
UAntwerpen
Faculty/Department
Research group
Publication type
Subject
Art 
Affiliation
Publications with a UAntwerp address
Record
Identifier
Creation 01.02.2019
Last edited 25.02.2025