Text+ User Story

BBAW/SAW-Egyptological metadata thesauri

Daniel Werning (Berlin-Brandenburgische Akademie der Wissenschaften) 

DFG subject area: 101 Ancient Cultures

Text+ data domain: Lexical Resources

Motivation

The Egyptian-Coptic language is the human language with the longest documented lifetime, spanning approx. 4,500 years. Its vocabulary and its texts reflect the knowledge of one of the formative cultures of the ancient world. In recent decades, a considerable number of digital projects have dealt with Egyptian textual artefacts and compiled metadata for them. As a Digital Humanities-oriented Egyptological (DFG 101–05) researcher and IT developer, I faced the task of developing a database on Egyptian textual artefacts with a large set of more or less culture-specific metadata on the text and its support, e.g., relative dating based on Egyptian rulers, text type, place of origin/provenance in ancient and modern Egypt, object category, material, and many more (Rubensohn project). At that time, only a few resources existed that had systematically developed appropriate thesauri for some of the metadata. For the relative dating thesaurus, for example, I was allowed to use the thesaurus of the Thesaurus Linguae Aegyptiae; other thesauri came from comparable projects, e.g., “BerlPap”. Back then, barely any of these projects had officially published their thesauri, and they did not provide stable IDs, so a linked data scenario was not feasible.

Meanwhile, the situation has improved somewhat. Projects like “Trismegistos” and “Thot” (University of Liège and BBAW Berlin / SAW Leipzig) provide stable IDs and also web services (APIs). However, confidence in these thesaurus projects is undermined by the fact that they are research projects without long-term financing, or even without current financing. One of them even had to introduce a subscription system, so that certain functions were suddenly no longer freely available. Moreover, none of these projects is highly collaborative in the sense that new Egyptological DH projects with new demands can engage in the development of the thesauri easily and quickly (compare the Wikidata concept).

The “Thesaurus Linguae Aegyptiae” (TLA), edited by the joint Academies’ project “Strukturen und Transformationen des Wortschatzes der ägyptischen Sprache” (BBAW, Berlin / SAW, Leipzig), provides the world’s largest electronic corpus of Egyptian texts, annotated, among other things, with artefact metadata. The leaders of this project are interested in publishing the developing versions of their thesauri in a sustainable way according to FAIR principles. Since the TLA project is one of the two major cooperation partners in the above-mentioned “Thot” project, further data may come from “Thot”, which is currently without financing.

Objectives

Hosting Egyptological metadata thesauri from different projects in Text+ according to FAIR principles, in successive versions of their development (versioning) and possibly interlinked with each other, would make it easier for Egyptological DH projects to link their data on the basis of these thesauri (Linked Open Data), notably also long after the respective projects have ended. Projects would not have to build their own thesauri from scratch: they could use thesauri offered in Text+ unchanged, or develop their own thesauri based on one already offered in Text+.

Solution

I envision Text+ offering metadata thesauri from different disciplines and projects, each with its own permanent IDs/URIs. Where possible, a thesaurus of a given project should be offered in different stages of its development (versioning). Projects that modify thesauri already included may offer their modified versions as ‘branches’ or ‘forks’ of the original thesaurus. As far as possible, thesauri of different projects should be interlinked via various permanent IDs/URIs, both those of Text+ and those of other authorities. A browser for the thesauri is needed, as well as a set of APIs for using them efficiently.
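
To make the scenario more concrete, the following is a minimal sketch of how versioned, forkable and interlinked thesauri might be modelled. It is an illustration only, not a proposed Text+ implementation: the class names (Concept, ThesaurusVersion), the lineage helper, the project names and all example.org / example.net URIs are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Concept:
    """One thesaurus entry with a permanent URI (hypothetical model)."""
    uri: str
    label: str
    # Links to IDs of the same concept held by other authorities.
    same_as: Tuple[str, ...] = ()


@dataclass
class ThesaurusVersion:
    """One published version of a thesaurus, itself addressable by URI."""
    uri: str
    project: str
    version: str
    concepts: Dict[str, Concept] = field(default_factory=dict)
    # URI of the version this one was forked from, if any.
    derived_from: Optional[str] = None

    def add(self, concept: Concept) -> None:
        self.concepts[concept.uri] = concept

    def fork(self, uri: str, project: str, version: str) -> "ThesaurusVersion":
        """Create a new version that starts from a copy of this one."""
        return ThesaurusVersion(uri, project, version,
                                dict(self.concepts), derived_from=self.uri)


def lineage(registry: Dict[str, ThesaurusVersion], uri: str) -> List[str]:
    """Follow derived_from links back to the original thesaurus."""
    chain: List[str] = []
    current: Optional[str] = uri
    while current is not None:
        chain.append(current)
        current = registry[current].derived_from
    return chain


# Placeholder data: a base thesaurus and a fork by a second project.
base = ThesaurusVersion(uri="https://example.org/thesaurus/dating/v1",
                        project="Project A", version="1.0")
base.add(Concept(uri="https://example.org/concept/period-1",
                 label="Example period",
                 same_as=("https://example.net/authority/123",)))

fork = base.fork(uri="https://example.org/thesaurus/dating-b/v1",
                 project="Project B", version="1.0")
fork.add(Concept(uri="https://example.org/concept/period-2",
                 label="Another example period"))

registry = {v.uri: v for v in (base, fork)}
print(lineage(registry, fork.uri))
```

In this sketch, a fork keeps the permanent URIs of the concepts it inherits, so data annotated against the original thesaurus would remain linkable to data annotated against the fork.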

Challenges

The TLA project at BBAW, Berlin and SAW, Leipzig supports such a data scenario. 

Hosting a network of versions of text metadata thesauri will probably require an editor position, preferably filled by someone with a background in both IT and the Humanities (ancient history, papyrology, …, or information sciences).

Review by community

The TLA project at BBAW, Berlin and SAW, Leipzig is happy to review a possible implementation.