The resource was obtained by manually annotating texts from the OpenCorpora corpus with senses from RuWordNet, the Russian wordnet.
Entity | Count |
---|---|
Documents | 807 |
Sentences | 6,751 |
Tokens | 109,893 |
Annotated tokens | 46,320 |
Lexical entries | 17,126 |
Annotated lexical entries | 10,683 |
RuWordNet synsets | 8,619 |
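
As a quick illustration of what these counts imply, the sketch below derives a few ratios from the table: the share of tokens that carry a sense annotation, the share of lexical entries that are annotated, and the average sentence length. The dictionary keys and the `share` helper are hypothetical names chosen for this example; only the numbers come from the table.

```python
# Counts copied from the statistics table above.
# Key names are illustrative and not part of the resource itself.
stats = {
    "documents": 807,
    "sentences": 6_751,
    "tokens": 109_893,
    "annotated_tokens": 46_320,
    "lexical_entries": 17_126,
    "annotated_lexical_entries": 10_683,
    "ruwordnet_synsets": 8_619,
}


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100 * part / whole


# Percentage of tokens that carry a sense annotation.
token_coverage = share(stats["annotated_tokens"], stats["tokens"])

# Percentage of lexical entries that are annotated.
entry_coverage = share(stats["annotated_lexical_entries"], stats["lexical_entries"])

# Average number of tokens per sentence.
tokens_per_sentence = stats["tokens"] / stats["sentences"]

print(f"Annotated tokens: {token_coverage:.1f}%")
print(f"Annotated lexical entries: {entry_coverage:.1f}%")
print(f"Tokens per sentence: {tokens_per_sentence:.1f}")
```

By these figures, roughly 42% of tokens and 62% of lexical entries are sense-annotated, and a sentence averages about 16 tokens.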
Alexander Kirillovich, Natalia Loukachevitch, Maksim Kulaev, Angelina Bolshina, Dmitry Ilvovsky. Sense-Annotated Corpus for Russian // Proceedings of the 5th International Conference on Computational Linguistics in Bulgaria (CLIB 2022), Sofia, Bulgaria, 8–9 September 2022. Bulgarian Academy of Sciences (forthcoming).
The resource is distributed under the Creative Commons Attribution-ShareAlike 4.0 License (CC BY-SA 4.0).