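O dwóch czeskich jednostkach leksykalnych będących wykładnikami negatywnych stanów emocjonalnych i ich polskich ekwiwalentach. Analiza na materiale z korpusu paralelnego InterCorp [On two Czech lexical units expressing negative emotional states and their Polish equivalents. An analysis based on material from the InterCorp parallel corpus]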

The analysis focuses on the Czech verbs žárlit ‘to be jealous’ and závidět ‘to envy’. The goal is to establish their closest equivalents in Polish. We use dictionary definitions to identify the relevant meanings of the analysed verbs and link them with the equivalents proposed by a traditional Czech-Polish dictionary. Equivalents automatically extracted from the corpus help us to find the translations available in InterCorp. Although the results are consistent with those proposed by the bilingual dictionary, the number of equivalents found in InterCorp is larger. Next, we apply a method developed in our pilot studies, which includes automatic excerption of the given words together with their aligned segments from InterCorp. The segments are analysed manually: in each segment we check how the given word was translated and examine its collocations and arguments. A pilot study of the ambiguous Czech verb toužit ‘to miss, to want, to desire’ (Kaczmarska, Rosen, 2013) was meant to reveal whether valence requirements can influence the choice of an equivalent in Polish. It was assumed that for some senses the equivalent can be established on the basis of convergent valence requirements (Levin, 1993). Unfortunately, for the analysed Czech verbs žárlit and závidět the number of occurrences is insufficient, so collocation profiling (using the Word Sketch tool available in the Sketch Engine) cannot be applied to analyse syntactic contexts. We conduct corpus-based research instead. The data from InterCorp confirm our assumptions based on the dictionary definitions. The equivalent-searching algorithm, which is also based on a syntactico-semantic analysis (automatic extraction of pairs of equivalents, valence analysis, Case Grammar, Pattern Grammar, Cognitive Grammar) and is described in the last part of the paper, cannot be applied to the two verbs.
We consider Word Sketch a promising tool for our research and hope that it will be a turning point in building our algorithm (Word Sketch for the Czech part of InterCorp is currently in preparation). We also hope that the algorithm will be able to cooperate with machine translation tools. This is why, in addition to manual analysis, we conduct experimental trials of stochastic modelling of the choice of an equivalent on the basis of context (Kaczmarska et al., 2015).
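The excerption step mentioned in the abstract (collecting the aligned segments that contain a given Czech verb and recording how it was rendered in Polish) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sentence pairs are invented toy data rather than InterCorp material, the stem lists are hypothetical, and simple substring matching stands in for the lemmatisation and manual analysis used in the study.

```python
from collections import Counter

# Invented toy data standing in for aligned Czech-Polish segments.
# These are NOT taken from InterCorp.
ALIGNED = [
    ("Závidím ti tvůj klid.", "Zazdroszczę ci twojego spokoju."),
    ("Žárlí na svou sestru.", "Jest zazdrosny o swoją siostrę."),
    ("Nezáviděl mu nic.", "Niczego mu nie zazdrościł."),
    ("Šel domů.", "Poszedł do domu."),
]

# Hypothetical stem lists used as a crude substitute for lemmatisation.
CZECH_STEMS = {"závidět": ("závid",), "žárlit": ("žárl",)}
POLISH_CANDIDATES = {
    "zazdrościć": ("zazdrosz", "zazdrości"),
    "być zazdrosnym": ("zazdrosn",),
}


def tally_equivalents(pairs, czech_stems, polish_candidates):
    """Count candidate Polish equivalents in segments containing the Czech verb."""
    counts = Counter()
    for czech, polish in pairs:
        czech_lower, polish_lower = czech.lower(), polish.lower()
        # Skip segments in which the Czech verb does not occur.
        if not any(stem in czech_lower for stem in czech_stems):
            continue
        matched = [
            label
            for label, stems in polish_candidates.items()
            if any(stem in polish_lower for stem in stems)
        ]
        if matched:
            counts.update(matched)
        else:
            # No listed candidate found: left for manual inspection.
            counts["(other)"] += 1
    return counts


if __name__ == "__main__":
    for verb, stems in CZECH_STEMS.items():
        print(verb, dict(tally_equivalents(ALIGNED, stems, POLISH_CANDIDATES)))
```

A tally of this kind only gives a first overview of the candidate equivalents. Deciding which equivalent fits which sense still requires the manual inspection of collocations and arguments described above.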
Gruszczyńska, Ewa; Leńko-Szymańska, Agnieszka, eds. (2016). Polskojęzyczne korpusy równoległe. Polish-language Parallel Corpora. Warszawa: Instytut Lingwistyki Stosowanej, pp. 228-248.