The Digital Pallas, an eighteenth-century multilingual Russian dictionary with over 60,000 words
By Nicoline van der Sijs, Dutch Language Institute and Radboud University Nijmegen
A team of researchers is preparing an annotated digital edition of the eighteenth-century Russian Comparative Dictionary of All Languages and Dialects, compiled by the Prussian scholar Peter Simon Pallas on the initiative of none other than the Russian Empress Catherine the Great. This dictionary contains language data from hundreds of different languages, some of which are now extinct, others endangered. For some languages the data in the dictionary are the oldest, or among the oldest, known sources. How did this dictionary come about?
In 1784 Empress Catherine threw herself into the study of languages. She made a list of some three hundred Russian concepts and had these translated into all the languages and dialects she could find. She even single-handedly began composing a dictionary on the basis of the collected material, starting with ‘Caribbean’ words; see Figure 1. In her opinion this dictionary was ‘perhaps the most useful research ever conducted in the field of all languages and dictionaries, and especially relevant to the Russian language’.
In 1785 Catherine sought the help of the Prussian physician, zoologist, botanist, geographer and explorer Peter Simon Pallas, who had entered Russian service in 1767 and was a member of the Russian Academy of Sciences. Pallas was an internationally renowned scholar with a great knowledge of languages. He expanded Catherine’s word list and composed a Modèle du vocabulaire qui peut servir à la comparaison de toutes les langues, with 443 concepts in Russian, German, Latin and French. This ‘model’ was sent not only to the administrators of the provinces of the vast Russian empire, but also to Russian diplomats all over the world, and it was handed out to foreign diplomats in Russia. All recipients were asked to provide translations of the concepts in as many languages as possible. As a result, a large amount of language data was submitted to Pallas, to which he himself added material from printed dictionaries.
As early as 1787 the first part of the Linguarum Totius Orbis Vocabularia Comparativa Augustissimae Cura Collecta, or in Russian Сравнительные словари всѣхъ языковъ и нарѣчiй собранные десницею всевысочайшей особы, was published, anonymously but with a foreword by Pallas. The second part of this ‘Comparative glossaries of all languages and dialects in the world collected thanks to the care of a royal person’ came out in 1789. The two parts contained 273 concepts with their translations in 200 numbered languages, ending with the names of the numerals 1-10, 100 and 1000 in 222 languages. The first concept was Богъ (God), which can serve as an example of the simple layout of the dictionary (I have added the translation between square brackets):
1 По Славянски – Богъ. [1 Slavic Bog]
2 По Славяно-Венгерски – Бугъ. [2 Slavic-Hungarian Bug]
3 По Иллирїйски – Боогъ. [3 Illyric Boog]
199 На островахъ Маркезанскихъ – Етуо. [199 On the Marquesas Islands Etuo]
200 На островахъ Сандвича – Итуа. [200 On the Sandwich Islands Itua]
After the publication of the dictionary, new language data kept coming in, which prompted Catherine to commission a new edition with a new structure: she ordered the words in the various languages to be arranged alphabetically. The new edition was published in 1790-1791 in four volumes and with a slightly altered title: Сравнительный словарь всѣхъ языковъ и нарѣчiй, по азбучному порядку расположенный, or Comparative dictionary of all languages and dialects, in alphabetical order. The edition has no foreword and does not name its editor, but Friedrich Adelung reveals in his extensive publication on the dictionary, titled Catherinens der Grossen Verdienste um die vergleichende Sprachenkunde (1815), that Theodor Jankiewitsch de Miriewo, director of the College of Education, had served as editor, since Pallas was otherwise engaged. The dictionary is set up in three columns (word, concept and language), starting:
A Кто По Ирландски [a, who, Irish]
А Онъ По Сарамакски въ Суринамѣ [a, he, Saramaccan in Surinam]
А Онъ Суринамски Креольски [a, he, Surinam creole]
А Она Суринамски Креольски [a, she, Surinam creole]
А Да Татарски около Кузнецка [a, yes, Tatar around Kuznetsk]
Scholars outside Russia are unfamiliar with the dictionary, because it is written in Russian and in Cyrillic script. But even within Russia it is largely unknown, since most of the thousand copies of the second edition were immediately stored at the Imperial Cabinet. This neglect is regrettable, since the dictionary contains a wealth of language data. To remedy this, in 2020 a team of researchers in the Netherlands set out to digitize and annotate the second edition of the Pallas dictionary. The Dutch Language Institute in Leiden has provided ‘The Digital Pallas’ with the Interactive Lexicon Tool Lex’it. With this tool all the data of the dictionary have by now been added to a database.
From this database we learn that the dictionary contains 61,960 words for 296 concepts in 328 different languages, which is far more than was previously known. These include a large number of native languages of North and South America and Africa. We found remarkable differences in the number of words per language: for some languages the dictionary gives a large number of synonyms, for instance for Japanese (859 words for the 296 concepts). For other languages, however, the number of words is very small (23 words for Sinhalese, only 2 words for Thai).
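Counts of this kind follow directly from the three-column layout of the second edition (word, concept, language). The following is a minimal sketch in Python, not the data model of Lex’it, which is not described here: the Entry record, the function names and the use of the English glosses of the five opening entries as sample data are all assumptions made for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Entry(NamedTuple):
    """One dictionary row: hypothetical record mirroring the three printed columns."""
    word: str
    concept: str
    language: str


# The five opening entries of the second edition, given here in the
# bracketed English glosses (illustrative sample, not the real database).
SAMPLE = [
    Entry("a", "who", "Irish"),
    Entry("a", "he", "Saramaccan in Surinam"),
    Entry("a", "he", "Surinam creole"),
    Entry("a", "she", "Surinam creole"),
    Entry("a", "yes", "Tatar around Kuznetsk"),
]


def words_per_language(entries):
    """Count how many rows (words) each language contributes."""
    return Counter(entry.language for entry in entries)


def summarize(entries):
    """Return the total number of words and the number of distinct concepts and languages."""
    return {
        "words": len(entries),
        "concepts": len({entry.concept for entry in entries}),
        "languages": len({entry.language for entry in entries}),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
    for language, count in words_per_language(SAMPLE).most_common():
        print(language, count)
```

Run over the full database, a per-language count of this kind is what would bring out contrasts such as those between Japanese, Sinhalese and Thai.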
Our next task is to add transcriptions and the original (non-Cyrillic) spelling to the words in the various languages, and to supplement the language names and language classification used by Pallas with their modern English equivalents. Not surprisingly, we come across all kinds of irregularities during this annotation process. For instance, words are not always assigned to the language they belong to (Wanst ‘belly’ is German, not Dutch as Pallas asserts), and concepts are sometimes wrongly translated (Russian шум ‘noise’ is not the same as Dutch alarm ‘alert, uproar’).
After finishing the annotation we will publish the annotated database on a public website, since we expect comparative linguistics, lexicostatistical and colexification research to profit from the Pallas data. Furthermore, we intend to publish an edited volume with chapters devoted to the historical background of the Pallas dictionary and to the various languages and language families that are included in the dictionary. Anyone who is interested in the project or needs more information is invited to contact Nicoline van der Sijs: firstname.lastname@example.org.
The Digital Pallas project is coordinated by Tjeerd de Graaf, Wim Honselaar, Janine Jager, Bruno Naarden and Nicoline van der Sijs. Collaborators are Melle Groen, Marien Jacobs, Richard Kellermann Deibel, Martijn Knapen, Djoeke Leguijt, Sasha Lubotsky, Michael Nestorowytsch, Tamara Schermer and Vincent Wintermans.