A memory-based approach to Kĩkamba named entity recognition
Faculty of Arts. Linguistics and Literature
S.l., 2011
Proceedings of the Conference on Human Language Technology for Development
University of Antwerp
This paper describes the development of a data-driven part-of-speech tagger and named entity recognizer for Kĩkamba, a resource-scarce Bantu language. A small web-mined corpus for Kĩkamba was manually annotated for both classification tasks and used as training material for a memory-based tagger. The encouraging experimental results show that basic language technology tools can be developed using limited amounts of data and state-of-the-art language-independent machine learning techniques.