Institute for Linguistic Studies, Russian Academy of Sciences

General design and goals

The BivalTyp project investigates encoding frames in constructions with two arguments from a typological perspective. Among bivalent patterns, the one that has received by far the most attention in the literature is the transitive construction. Even though morphosyntactic properties of transitive constructions vary across languages, transitivity has a cross-linguistically stable and well-understood semantic basis: verbs that entail a volitional agent and a fully affected patient, such as ‘kill’ and ‘break’, typically belong to the transitive class (Hopper & Thompson 1980; Tsunoda 1985; Dowty 1991; Kittilä 2002; Malchukov 2005; Næss 2007; Malchukov & Comrie 2015).

However, the lexical range of the transitive class varies greatly across languages. Verbs with meanings such as ‘listen’, ‘touch’, ‘follow’, or ‘envy’ are transitive in some languages but intransitive in others.

Bivalent intransitives typically fall into several minor classes. At the language-specific level, it is usually possible to uncover the semantic motivation behind minor valency classes. For example, in English, the valency frame in The old man looked at the clouds is identical to the valency frame in The old man shot at the bird. This fact arguably reflects some similarity between the two situations. However, typological research on minor valency classes is scarce, and it is generally not known whether different languages are governed by similar principles of valency-class assignment. One of the biggest challenges in this domain is finding a tertium comparationis: encoding devices per se, such as the English preposition at, are language-specific and cannot be directly equated across languages.

This project is based on a questionnaire containing 130 predicates given in context. The resulting database makes it possible to identify how translation equivalents of these 130 predicates are grouped into valency classes in individual languages. Thus, the lexical extent of valency classes is used as a tertium comparationis. For example, the translation equivalent of The old man looked at the clouds belongs to a huge transitive class (which also encompasses the equivalents of The boy broke a stick and Peter killed Michael) in languages such as Albanian, Japanese, Chukchi and Kalmyk. In other languages, this translation equivalent belongs to a large intransitive class that also encompasses translation equivalents of Peter entered the house, Peter touched the wall and A bear attacked a fisherman; this is the case in Ossetic and many Turkic languages. In yet other languages, the translation equivalent of The old man looked at the clouds belongs to a very small class that also includes the translation equivalent of Peter touched the wall (and got dirty); this is the case in English, German and Udmurt. Observations of this kind provide a method for systematically comparing valency classes across languages on an empirical basis.
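The idea of using the lexical extent of a class as the basis for comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data structure, the function names (class_extent, shared_classmates) and the class labels are hypothetical and are not part of the BivalTyp database; the toy annotations encode only the groupings mentioned in the paragraph above.

```python
# Toy annotations: language -> {predicate context -> valency class label}.
# Class labels ("TR", "INTR_A", "MINOR") are hypothetical placeholders, and the
# predicate keys simply follow the wording of the examples in the text.
# This is NOT the project's actual data or schema.
annotations = {
    "Japanese": {"look_at": "TR", "break": "TR", "kill": "TR"},
    "Ossetic": {
        "look_at": "INTR_A",
        "enter": "INTR_A",
        "touch": "INTR_A",
        "attack": "INTR_A",
    },
    "English": {"look_at": "MINOR", "touch_and_get_dirty": "MINOR"},
}


def class_extent(lang_data, predicate):
    """Return all predicates that share a valency class with `predicate`."""
    label = lang_data[predicate]
    return {p for p, c in lang_data.items() if c == label}


def shared_classmates(data, lang_a, lang_b, predicate):
    """Predicates grouped with `predicate` in both languages (excluding itself)."""
    extent_a = class_extent(data[lang_a], predicate)
    extent_b = class_extent(data[lang_b], predicate)
    return (extent_a & extent_b) - {predicate}


if __name__ == "__main__":
    # Compare languages by which predicates pattern with 'look_at',
    # never by the labels themselves, which are language-specific.
    for language, lang_data in annotations.items():
        print(language, sorted(class_extent(lang_data, "look_at")))
```

The comparison never equates class labels across languages; it only asks which predicates end up in the same class, which mirrors the role of lexical extent as a tertium comparationis.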

Ultimately, this approach makes it possible to tackle a number of typological questions:

Why bivalent patterns?

This project covers only those predicates that are typologically expected to be bivalent. This restriction is largely motivated by practical reasons. The inclusion of predicates from other numerical valency types would have resulted either in too large a questionnaire with hundreds of entries or in an undesirable loss in lexical coverage. With 130 predicates, it is usually possible to get an idea of the lexical extent of all major types of bivalent structures that are present in individual languages.

Bivalent constructions were preferred to monovalent constructions because in most languages monovalent constructions are not very variegated structurally. Usually, there is just a handful of monovalent constructions, and in some languages, all monovalent constructions belong to just one class (English is a case in point). Bivalent constructions were preferred to trivalent (and more complex) constructions because trivalent constructions are less frequent, not very variegated semantically and are sometimes split into numerous very small classes.

Why contexts, not verbs?

In most languages, frequent verbs are often polysemous, and their different meanings can be associated with different valency patterns. As a consequence, isolated lemmas, such as English to hit, can hardly be used as comparative concepts: the network of meanings of this verb is too large, and its valency behaviour can differ from context to context even in English, let alone in possible translations of hit in other languages. By contrast, a context such as ‘lightning hit the house’ is a better comparative concept because it is usually possible to identify one basic way to express this specific meaning in a given language. An obvious downside of this approach is that it is not suitable for the study of individual languages: the 130 contexts might look like a random choice among thousands of possible bivalent contexts. However, for typological purposes, a context-based approach is more reliable than a lemma-based approach.


By default, the data for this project were obtained from native speakers who translated the 130 stimulus sentences into the target language. Russian, English and sometimes French were used as intermediary languages (languages of the questionnaire and the elicitation procedure).

A crucial role belongs to language experts, also referred to as contributors (in some cases, language experts are simultaneously native speakers). By default, language experts contacted one or several native speakers (typically during fieldwork), elicited translations, selected the translation to be included in the dataset if multiple translations were available, annotated and glossed translations, identified valency classes and provided additional information on the languages in question. All analytical decisions were made in collaboration with the project coordinator, Sergey Say. See also instructions for contributors.

Full data on investigated varieties, including information on language experts and brief grammar notes, can be found on pages devoted to individual languages. Deviations from the standard procedure (e.g., if the data were gathered using dictionaries and/or corpora) are also described on these pages.

History of the project

This project was launched in 2009 at the Laboratory for the Typological Study of Languages, Institute for Linguistic Studies, Russian Academy of Sciences. In 2009–2013, it was supported by two consecutive grants from the Russian Foundation for Humanities.

2009–2010, “Non-canonical argument marking of two-place predicates”, grant #09-04-00366а.

2011–2013, “Verb argument structure variation and verb classification in languages of various structural types”, grant #11-04-00179a.

After 2013, the dataset continued to grow thanks to language experts who contributed data for new languages. The online version of the project was launched in 2020.

Similar projects


References

Dowty, David. 1991. Thematic proto-roles and argument selection. Language, 67, 547–619.
Hopper, Paul J. & Sandra A. Thompson. 1980. Transitivity in grammar and discourse. Language, 56, 251–299.
Kittilä, Seppo. 2002. Transitivity: Towards a Comprehensive Typology. Turku: Åbo Akademis Tryckeri.
Malchukov, Andrej & Bernard Comrie (eds.). 2015. Valency Classes in the World’s Languages. Berlin: De Gruyter Mouton.
Malchukov, Andrej. 2005. Case pattern splits, verb types, and construction competition. In M. Amberber & H. de Hoop (eds.), Competition and Variation in Natural Languages. London: Elsevier, 73–117.
Næss, Åshild. 2007. Prototypical Transitivity. Amsterdam, Philadelphia: John Benjamins.
Tsunoda, Tasaku. 1985. Remarks on transitivity. Journal of Linguistics, 21, 385–396.