Title
Character-level Transformer-based Neural Machine Translation
Author
Abstract
Neural machine translation (NMT) is nowadays commonly applied at the subword level, using byte-pair encoding. A promising alternative approach focuses on character-level translation, which considerably simplifies NMT processing pipelines. This approach, however, must handle substantially longer sequences, which renders training prohibitively expensive. In this paper, we discuss a Transformer-based approach that we compare, both in speed and in quality, to the Transformer at the subword and character levels, as well as to previously developed character-level models. We evaluate our models on 4 language pairs from WMT'15: DE-EN, CS-EN, FI-EN and RU-EN. The proposed architecture can be trained on a single GPU and is 34% faster than the character-level Transformer; nevertheless, the obtained results are at least on par with it. In addition, our proposed model outperforms the subword-level model on FI-EN and shows comparable results on CS-EN. To stimulate further research in this area and close the gap with subword-level NMT, we make all our code and models publicly available.
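The abstract contrasts subword-level input (byte-pair encoding) with character-level input, whose much longer token sequences are what make character-level training expensive. The short Python sketch below is purely illustrative and not taken from the paper: it compares a character-level split with a plain word-level split (a real BPE subword vocabulary would fall somewhere in between) to make the sequence-length gap concrete; the example sentence is an arbitrary German phrase chosen here for illustration.

    # Illustrative sketch (not from the paper): why character-level NMT
    # must process far longer sequences than subword-level NMT.
    sentence = "Maschinelle Übersetzung auf Zeichenebene"

    # Character-level tokenization: every character becomes one token
    # (spaces are kept, marked explicitly with a separator symbol).
    char_tokens = list(sentence.replace(" ", "▁"))

    # A rough word-level split; a BPE subword vocabulary would yield a
    # sequence length between this and the character level.
    word_tokens = sentence.split()

    print(f"characters: {len(char_tokens)} tokens")  # 40 tokens
    print(f"words:      {len(word_tokens)} tokens")  # 4 tokens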
Language
English
Source (book)
NLPIR 2020: Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval, Seoul, Republic of Korea, December 2020
Publication
New York, NY: Association for Computing Machinery, 2020
ISBN
978-1-4503-7760-7
DOI
10.1145/3443279.3443310
Volume/pages
p. 149-156
UAntwerpen
Faculty/Department
Research group
Publication type
Subject
Affiliation
Publications with a UAntwerp address
External links
VABB-SHW
Record
Identifier
Creation 02.02.2021
Last edited 17.06.2024