Title
|
|
|
|
Discourse lexicon induction for multiple languages and its use for gender profiling
|
|
Author
|
|
|
|
|
|
Abstract
|
|
|
|
We propose a novel way to create categorized discourse lexicons for multiple languages. We combine information from the Penn Discourse Treebank with statistical machine translation techniques on the Europarl corpus. Using gender profiling as an application, we evaluate our approach by comparing it with two alternatives: an approach using features from a knowledge-based lexicon, and a Rhetorical Structure Theory (RST) discourse parser. Our experiments are performed on corpora for three languages (English, Dutch, and German) in two genres (news and blogs). We include a feature analysis in which we look for (in)consistencies of discourse features related to male and female authors between the different experimental settings. |
|
|
Language
|
|
|
|
English
|
|
Source (journal)
|
|
|
|
Digital scholarship in the humanities : a journal of the Alliance of Digital Humanities Organizations. - Oxford, 2015, currens
|
|
Publication
|
|
|
|
Oxford: Oxford University Press, 2019
|
|
ISSN
|
|
|
|
2055-7671 [print]
2055-768X [online]
|
|
DOI
|
|
|
|
10.1093/LLC/FQY025
|
|
Volume/pages
|
|
|
|
34:1 (2019), p. 208-220
|
|
Article Reference
|
|
|
|
fqy025
|
|
ISI
|
|
|
|
000481421200015
|
|
Medium
|
|
|
|
E-only publication
|
|