|
Description
|
Manually annotated dataset of 3,000 uses of exterior locative constructions (specifically cases and postpositions) in present-day Estonian. The data is extracted from the Estonian National Corpus (ENC 2017; 1.1 billion words, mainly web-based texts). The data includes 500 uses of each of the following constructions: allative, adessive, ablative, peale, peal, pealt. The data sampling procedure and further details about the dataset are given in Klavan & Schützler (2023, Cognitive Linguistics). The data is annotated for 9 variables: postpos (outcome variable: case, postposition), position (post, pre), complexity (simple, compound), length (length of the landmark phrase in syllables), frequency (raw frequency of the landmark form in association with the respective semantic relation), function (adverbial, modifier), verb_lemma (224 levels for lative, 279 levels for locative, 252 levels for separative), lm_lemma (592 levels for lative, 438 levels for locative, 528 levels for separative), sem_rel (lative, locative, separative). The dataset was collected by the PI of the project PUT1358 "The Making and Breaking of Models: Experimentally Validating Classification Models in Linguistics" (1.01.2017–31.12.2020), funded by the Estonian Research Council. (2022-06-14)
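The sampling design and variable coding described above can be sketched as a small consistency check. This is a minimal illustration, not part of the dataset: the pairing of each construction with a postpos value and a sem_rel value is an assumption based on the usual meaning of the Estonian forms (it is not spelled out in the description), and the helper names `expected_counts` and `invalid_fields` are hypothetical.

```python
from collections import Counter

# Six constructions, 500 sampled uses each (from the description).
# ASSUMPTION: the (postpos, sem_rel) pairing below follows the usual meaning
# of the Estonian forms; the description does not state it explicitly.
CONSTRUCTIONS = {
    "allative": ("case", "lative"),
    "peale": ("postposition", "lative"),
    "adessive": ("case", "locative"),
    "peal": ("postposition", "locative"),
    "ablative": ("case", "separative"),
    "pealt": ("postposition", "separative"),
}
USES_PER_CONSTRUCTION = 500

# Categorical variables with the levels listed in the description.
# length and frequency are numeric; verb_lemma and lm_lemma are open-ended.
LEVELS = {
    "postpos": {"case", "postposition"},
    "position": {"post", "pre"},
    "complexity": {"simple", "compound"},
    "function": {"adverbial", "modifier"},
    "sem_rel": {"lative", "locative", "separative"},
}


def expected_counts():
    """Expected number of rows per (postpos, sem_rel) cell under the design."""
    counts = Counter()
    for postpos, sem_rel in CONSTRUCTIONS.values():
        counts[(postpos, sem_rel)] += USES_PER_CONSTRUCTION
    return counts


def invalid_fields(row):
    """Names of categorical fields whose value is not a documented level."""
    return sorted(k for k, allowed in LEVELS.items() if row.get(k) not in allowed)


total = sum(expected_counts().values())
print(total)
```

A row read from the data file (for example with `csv.DictReader`) could be passed to `invalid_fields` to flag values outside the documented levels; the file name and column layout would need to be checked against the actual deposit.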
|
|
Related Publication
| Klavan, Jane and Schützler, Ole. "The complexity principle and the morphosyntactic alternation between case affixes and postpositions in Estonian." Cognitive Linguistics, vol. 34, no. 2, 2023, pp. 297-331. https://doi.org/10.1515/cog-2021-0114 |