Text+ User Story

Dictionary data at the Saxon Academy of Sciences in Leipzig

Uwe Kretschmer (Sächsische Akademie der Wissenschaften zu Leipzig)

DFG subject area: 104 Linguistics

Text+ data domain: Lexical Resources

Motivation

At the Saxon Academy of Sciences in Leipzig we run several dictionary projects, primarily dialect and historical dictionaries. These include the Old High German Dictionary (AWB), the Etymological Dictionary of Old High German (EWA) and the dialect dictionaries (Mecklenburg Dictionary, Brandenburg-Berlin Dictionary, Dictionary of Upper Saxon Dialects, Thuringian Dictionary). 

These dictionaries are at different stages of digitization. Depending on the project, digital provision of the data has been partially achieved (the AWB via the dictionary network), is being implemented, or is planned. 

Objectives 

Our medium-term goal is to make these dictionaries available in digital form so that they are more accessible and as well integrated as possible, following the FAIR principles.

Text+ could play an important role here, as many of these objectives are difficult for a single institution to achieve on its own. First, findability and searchability should be guaranteed, preferably also from a central location, in order to increase the reach of the individual dictionaries. 

Second, it would be helpful to have mechanisms for linking with other data sets, especially at the level of dictionary entries. We would also like to integrate external dictionary entries, parts of dictionary entries and references into our own search portal. Persistent addressing and referencing are needed so that references made by users remain valid permanently. We would also be interested in solutions for aggregated entries. 

Solutions 

Text+ can act in several ways to eliminate possible obstacles: 

PIDs: Recommendations are needed for PIDs in the field of lexical resources. Questions of granularity (PIDs at entry level or at a finer-grained level?) and of a technical nature (should a PID reference the data or the metadata?) need to be clarified. Text+ should possibly offer a PID allocation service, since PID allocation can quickly become a stumbling block for smaller institutions. 
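
The granularity question can be made concrete with a minimal sketch. It assumes a purely illustrative identifier syntax in which an entry-level PID can optionally carry a finer-grained part (for example a single sense) after a "#". The prefix, the syntax and all names (LexicalPid, parse_pid, entry_level) are hypothetical and are not a Text+ or PID-provider specification.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical prefix; a real one would be assigned by a PID provider.
PID_PREFIX = "example-prefix"


@dataclass(frozen=True)
class LexicalPid:
    """Illustrative identifier for a dictionary entry or a part of it."""
    dictionary: str              # short dictionary name, e.g. "AWB"
    entry_id: str                # identifier of the entry within the dictionary
    part: Optional[str] = None   # optional finer-grained part, e.g. a sense

    def __str__(self) -> str:
        base = f"{PID_PREFIX}/{self.dictionary}.{self.entry_id}"
        return f"{base}#{self.part}" if self.part else base


def parse_pid(text: str) -> LexicalPid:
    """Parse the illustrative syntax produced by LexicalPid.__str__."""
    base, _, part = text.partition("#")
    prefix, _, local = base.partition("/")
    if prefix != PID_PREFIX or "." not in local:
        raise ValueError(f"not a recognised PID: {text!r}")
    dictionary, entry_id = local.split(".", 1)
    return LexicalPid(dictionary, entry_id, part or None)


def entry_level(pid: LexicalPid) -> LexicalPid:
    """Drop the finer-grained part and return the entry-level PID."""
    return LexicalPid(pid.dictionary, pid.entry_id)
```

In such a scheme a reference to a part of an entry can always be reduced to a reference to the whole entry, so a service that only resolves entry-level PIDs could still handle finer-grained references.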

Formats: Specifications for recommended formats or for the structuring of dictionary entries are necessary to enable a fine-grained exchange between dictionaries. 

Interfaces: Specifications, recommendations or guides for standardized interfaces are necessary. These interfaces should also support networking, such as the integration of information from other data sources (entire dictionary entries or finer-grained parts) and the reconciliation or monitoring of changes in data sources. 

Search: A distributed search across different data sets should be made possible. A central aggregator would be helpful. 
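
A distributed search with a central aggregator could look roughly like the following minimal sketch. It assumes that each dictionary exposes some search function returning hits. The sources here are in-memory stand-ins with placeholder data, and all names (Hit, make_local_source, federated_search) are hypothetical rather than an existing Text+ interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Hit:
    source: str      # name of the dictionary that returned the hit
    lemma: str       # headword of the matching entry
    entry_ref: str   # reference to the entry, ideally a PID


# A source is any function that maps a query to a list of hits.
SearchFn = Callable[[str], List[Hit]]


def make_local_source(name: str, lemmas: Dict[str, str]) -> SearchFn:
    """Build an in-memory stand-in for a remote dictionary endpoint."""
    def search(query: str) -> List[Hit]:
        q = query.casefold()
        return [Hit(name, lemma, ref)
                for lemma, ref in lemmas.items()
                if lemma.casefold().startswith(q)]
    return search


def federated_search(query: str,
                     sources: Dict[str, SearchFn]) -> Tuple[List[Hit], List[str]]:
    """Query every source, merge the hits, and report sources that failed."""
    hits: List[Hit] = []
    failed: List[str] = []
    for name, search in sources.items():
        try:
            hits.extend(search(query))
        except Exception:
            # One unavailable dictionary should not break the whole search.
            failed.append(name)
    hits.sort(key=lambda h: (h.lemma.casefold(), h.source))
    return hits, failed
```

The aggregator returns the names of failed sources alongside the merged hits, so that a portal could tell users which dictionaries were not reachable instead of silently omitting them.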

Review by Community 

We would like to implement or use solutions to the problems described above within the framework of Text+, and in this way integrate our data resources and make them available as openly as possible. In doing so, we are open to a broad exchange of experience.