Publication
Title
Heimdall: In-line Language Identification Tool
Author
Abstract
Heimdall is an open-source tool, written in Python, that searches a text for passages written in languages other than the main one. Heimdall's roots in historical linguistics are reflected in the language models it ships with: its three default languages are Medieval Latin, Early Modern French and Early Modern English. The tool works with plain-text files as well as XML files. For plain text, it simply outputs a list of all foreign passages. For XML, it can also tag the foreign passages directly in the input documents.
Language
English
DOI
10.5281/ZENODO.3621420
Full text (Publisher's DOI)
UAntwerpen
Publication type
Subject
Affiliation
Publications with a UAntwerp address
Record
Identifier
Creation 19.04.2021
Last edited 22.03.2023