Text+ User Story

Digital vocabularies and XPath-Searches on the Web

Thomas Gloning (Justus-Liebig Universität Gießen)

DFG subject area: 104 Linguistics

Text+ data domain: Lexical Resources

Motivation

I supervise dissertation projects that aim to produce three components: an investigation, a digital text corpus, and a digital vocabulary on the topic of the dissertation. Anna Pfundt’s dissertation on early women’s suffrage is one of three examples. There is an article on this project from the CLARIN 2019 conference in Leipzig, which may help to clarify our needs.

The data of the digital vocabulary are encoded using the TEI Lex‑0 standard. We use oXygen to produce the articles.

What we need is a digital environment on a website that allows us (1) to publish the TEI Lex‑0 encoded data and (2) to run XPath queries on the data in order to search for specific lexical descriptors.
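As a rough illustration of requirement (2), the following Python sketch runs XPath-style queries over a small dictionary fragment. The sample entries, the descriptor values, and the helper name `lemmas_matching` are invented for illustration and only assumed to follow TEI Lex‑0 conventions; they are not taken from the project data. The standard-library `xml.etree.ElementTree` module supports only a subset of XPath, so a production website would probably need a full XPath engine behind it.

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping for TEI documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample data, assumed to resemble TEI Lex-0 entries.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <entry xml:id="swing" xml:lang="de">
      <form type="lemma"><orth>Swing</orth></form>
      <gramGrp><gram type="pos">noun</gram></gramGrp>
      <sense xml:id="swing.1"><usg type="domain">Jazz</usg></sense>
    </entry>
    <entry xml:id="jammen" xml:lang="de">
      <form type="lemma"><orth>jammen</orth></form>
      <gramGrp><gram type="pos">verb</gram></gramGrp>
      <sense xml:id="jammen.1"><usg type="domain">Jazz</usg></sense>
    </entry>
    <entry xml:id="wahlrecht" xml:lang="de">
      <form type="lemma"><orth>Wahlrecht</orth></form>
      <gramGrp><gram type="pos">noun</gram></gramGrp>
      <sense xml:id="wahlrecht.1"><usg type="domain">Politik</usg></sense>
    </entry>
  </body></text>
</TEI>"""


def lemmas_matching(root, entry_query):
    """Return the lemmas of all entries in which entry_query finds a match.

    entry_query is an XPath expression evaluated relative to each
    <entry> element (hypothetical helper, limited to the XPath subset
    that ElementTree understands).
    """
    lemmas = []
    for entry in root.iterfind(".//tei:entry", TEI_NS):
        if entry.find(entry_query, TEI_NS) is not None:
            lemma = entry.findtext(
                "tei:form[@type='lemma']/tei:orth", namespaces=TEI_NS
            )
            lemmas.append(lemma)
    return lemmas


root = ET.fromstring(SAMPLE)

# All entries with a sense marked as belonging to the domain "Jazz".
print(lemmas_matching(root, ".//tei:usg[@type='domain'][.='Jazz']"))
# ['Swing', 'jammen']

# All entries whose part of speech is "verb".
print(lemmas_matching(root, "tei:gramGrp/tei:gram[@type='pos'][.='verb']"))
# ['jammen']
```

On a website, the query string would come from a search form or a predefined list of descriptors rather than being written into the code.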

Apart from the article on Anna Pfundt’s dissertation, a set of slides on my own project on the German vocabulary of jazz shows the specific XPath needs.

https://zhistlex.de/folien/Gloning_2019_HistVok-Jazz_Saarbruecken.pptx

Objectives

The solution to our need will allow users to produce specific lexical documentation for specific communicative fields, both historical and contemporary. This kind of digital documentation will make it possible to link from global portals such as DWDS or OWID to specific findings (e.g. from dissertation projects or research projects).

Solution

The API proposals developed within the ZHistLex project could be used for the purpose at hand.

Challenges

Some of the texts used in the projects are still under copyright. While this is a serious problem for the construction of digital corpora, it should not be a problem for the use of short quotations within the articles of the digital vocabulary.