Publication
Title
Lexical category acquisition is facilitated by uncertainty in distributional co-occurrences
Author
Abstract
This paper analyzes distributional properties that facilitate the categorization of words into lexical categories. First, word-context co-occurrence counts were collected using corpora of transcribed English child-directed speech. Then, an unsupervised k-nearest neighbor algorithm was used to categorize words into lexical categories. The categorization outcome was regressed over three main distributional predictors computed for each word, including frequency, contextual diversity, and average conditional probability given all the co-occurring contexts. Results show that both contextual diversity and frequency have a positive effect while the average conditional probability has a negative effect. This indicates that words are easier to categorize in the face of uncertainty: categorization works best for words which are frequent, diverse, and hard to predict given the co-occurring contexts. This shows how, in order for the learner to see an opportunity to form a category, there needs to be a certain degree of uncertainty in the co-occurrence pattern.
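As a rough illustration of the three distributional predictors named in the abstract, the sketch below computes frequency, contextual diversity, and average conditional probability from a handful of word-context pairs. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `distributional_predictors`, the toy data, the context labels such as `the_`, and the reading of "average conditional probability" as the mean of P(word | context) over the contexts a word occurs in are all assumptions made here for illustration.

```python
from collections import Counter, defaultdict


def distributional_predictors(pairs):
    """Compute per-word predictors from (word, context) co-occurrence tokens.

    Hypothetical helper: the definitions below are one plausible reading
    of the abstract, not the paper's exact formulas.
    """
    pair_counts = Counter(pairs)      # count of each (word, context) pair
    word_freq = Counter()             # total co-occurrence count per word
    context_freq = Counter()          # total co-occurrence count per context
    contexts_of = defaultdict(set)    # distinct contexts seen with each word

    for (word, context), n in pair_counts.items():
        word_freq[word] += n
        context_freq[context] += n
        contexts_of[word].add(context)

    predictors = {}
    for word, contexts in contexts_of.items():
        # Assumed definition: P(word | context) = count(word, context) / count(context)
        cond_probs = [pair_counts[(word, c)] / context_freq[c] for c in contexts]
        predictors[word] = {
            "frequency": word_freq[word],
            "contextual_diversity": len(contexts),
            "avg_conditional_probability": sum(cond_probs) / len(cond_probs),
        }
    return predictors


# Toy data, invented for illustration only.
toy_pairs = [
    ("dog", "the_"), ("dog", "a_"), ("dog", "the_"),
    ("cat", "the_"), ("run", "to_"),
]
predictors = distributional_predictors(toy_pairs)
```

In this toy data, "run" is fully predictable from its single context, whereas "dog" is more frequent, occurs in more contexts, and is less predictable. On the abstract's account, words of the latter kind are the ones that are easier to categorize.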
Language
English
Source (journal)
PLoS ONE
Publication
2018
ISSN
1932-6203
DOI
10.1371/JOURNAL.PONE.0209449
Volume/pages
13:12 (2018), p. 1-36
Article Reference
e0209449
ISI
000454621900025
Pubmed ID
30592738
Medium
E-only publication
Full text (Publisher's DOI)
Full text (open access)
UAntwerpen
Faculty/Department
Research group
Project info
Bootstrapping operations in language acquisition: a computational psycholinguistic approach.
Publication type
Subject
Affiliation
Publications with a UAntwerp address
External links
Web of Science
Record
Identifier
Creation 31.01.2019
Last edited 02.10.2024