Text Corpus

The text corpus includes selections of representative texts from several Ob-Ugric dialects: Northern Mansi, Eastern Mansi and Western Mansi (Pelym Mansi, Northern Vagilsk Mansi, Middle Lozva Mansi), Kazym Khanty, Surgut Khanty and its subdialect Yugan Khanty.

Text sources range from already published texts, which exist in different publication versions, in different transcriptions and with translations into different languages (especially those which were previously practically unavailable in Europe), to unpublished texts from archives and from the fieldwork of project participants. On the one hand, this means the re-working of major older publications such as Munkácsi, Kálmán, Kannisto - Liimola for Mansi, and Paasonen, Rédei for Khanty; on the other hand, it means the edition and analysis of the fieldwork archive of Chernetsov (1930s, Mansi) and of our own materials.

All texts will be presented in a unified transcription (IPA), with glossing according to our own Glossing Conventions (FLEx Preliminaries and Glossing Rules, FLEx Template OUDB) and with translations into English, which represents the main work of both projects, OUL and OUDB. Most texts edited during the OUL project also provide Russian and German translations, either taken from other sources or prepared by project participants. If Hungarian translations were available, we included them as well. Information on the source of each translation is provided in the metadata of each text. Some texts feature an additional analysis of functional, semantic and pragmatic roles (see also the section Annotation on this particular work of the OUDB project). Where possible, audio materials are also provided.

You can access the corpus content overview, including a query form, here. Additionally, there are preset search templates for both languages as well as for each (sub)dialect. Alternatively, choose a sub-corpus via the dialect in the menu on the left. Each sub-corpus contains a short description of the sources and editing of the data. Each text opens in a new tab that contains all relevant metadata, including an automatically generated reference that can be used for citation.
The icon bar shows which kinds of analysis (e.g. glossing, annotation or audio files) are available for each text. It can also be used for filtering. Once a text is chosen, there are several options for displaying information (e.g. metadata, translation only, audio data). The export function is accessible via the icon "glossed text". An icon above each sentence generates different export formats for that sentence in a new window. Clicking on export mode provides an export function for the whole text. Checking the box Synpra Layer (for annotation) and/or audio mode displays these features (if available) and/or includes them in the export.
In addition, the glossing is linked to the concordance section. Clicking on a gloss leads you directly to the lexicon section.

