Text+ User Story

Collections of digitized historical periodicals — challenges and interoperability

Nanette Rißler-Pipka (SUB Göttingen) and DHd-AG Zeitungen & Zeitschriften

DFG subject areas: 102 History, 103 Art History, Music, Theatre and Media Studies, 104 Linguistics, 105 Literary Studies, 106 Social and Cultural Anthropology, Non-European Cultures, Jewish Studies and Religious Studies

Text+ data domain: Collections (refers also to Task Area Infrastructure/Operations)

Motivation

The mass digitization in libraries, which began in Germany around 2005, was one of the major prerequisites for building and working with collections of digitized periodicals. Yet it soon became clear that digitization alone does not save the material from vanishing: the fragile paper originals continue to decay, and the expectation that simple digitization would preserve them has often been disappointed. In Germany, the DFG funded several pilot projects (2013–2015) to evaluate the challenges of digitizing historical periodicals. One outcome of this evaluation was a masterplan (https://www.zeitschriftendatenbank.de/fileadmin/user_upload/ZDB/z/Masterplan.pdf) for the digitization workflow (2017). For researchers working with historical periodicals as well as for providers of data collections, this plan ostensibly improved the situation. In particular, the recommendations regarding the quality of data and metadata and the accessibility via standardized interfaces seem to provide ideal conditions for working with the material. Still, looking at the research results produced in the disciplines that work on digitized historical periodicals, it must be stated that currently only a rather small group of researchers makes proper use of the massive quantity of data and metadata. The majority of researchers, at least in cultural, literary and media studies, still use only a few examples of well-known and canonical magazines to support their hypotheses; this might be different in other disciplines and for newspapers.

Objectives 

We ask why this massive amount of data and metadata is not used by the community of historical periodical researchers, or by other disciplines that use the data for information retrieval. There are several problems in how the data is provided to the community, particularly regarding its findability, accessibility, and reusability. First of all, researchers need to find the data they are interested in: collections of digitized newspapers, journals, and magazines are often the product of the historically developed holdings of each library. Projects like the “Zeitschriftendatenbank” or the “Deutsche Digitale Bibliothek” are changing this towards a virtual aggregation of resources. This positive development could still be improved, even leaving legal restrictions aside. Moreover, data and metadata of historical periodicals are not necessarily interoperable, even if they are findable and accessible. While there are metadata standards like METS/MODS/ALTO, the records are likely to contain insufficient information for the needs of researchers. This problem is further exacerbated by the fact that researchers generally do not know how to analyse this kind of data or how to evaluate its quality, as this lies outside the scope of their domain knowledge. Here, an investment in data quality, training, and knowledge transfer is necessary.
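To illustrate what a first, very basic evaluation of metadata quality could look like, the following Python sketch checks a single MODS record for a handful of fields. It is a minimal example under stated assumptions, not a recommended workflow: the list of required fields, the sample record, and the function name `missing_fields` are hypothetical and chosen only for illustration; only the MODS namespace and element names come from the MODS standard.

```python
import xml.etree.ElementTree as ET

# Official MODS v3 namespace.
MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical checklist: fields a researcher might need.
# Which fields actually matter depends on the research question.
REQUIRED_FIELDS = {
    "title": "mods:titleInfo/mods:title",
    "date_issued": "mods:originInfo/mods:dateIssued",
    "language": "mods:language/mods:languageTerm",
    "licence": "mods:accessCondition",
}

# Invented sample record, for illustration only.
SAMPLE_RECORD = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:titleInfo><mods:title>Example Magazine</mods:title></mods:titleInfo>
  <mods:originInfo><mods:dateIssued>1905</mods:dateIssued></mods:originInfo>
</mods:mods>"""


def missing_fields(mods_xml, required=REQUIRED_FIELDS):
    """Return the names of required fields that are absent or empty."""
    root = ET.fromstring(mods_xml)
    missing = []
    for name, path in required.items():
        element = root.find(path, MODS_NS)
        # Treat a missing element and an element without text alike.
        if element is None or not (element.text or "").strip():
            missing.append(name)
    return missing


if __name__ == "__main__":
    print(missing_fields(SAMPLE_RECORD))
```

For the sample record above, the check reports that the language and licence information are missing. Run over a whole collection, a report of this kind could give researchers a rough first impression of whether the metadata meets their needs.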

Solution 

The network of data providers for digitized historical periodicals in Germany is very advanced but could additionally include international networks of libraries and other data providers for these particular resources (such networks already exist in general, like LIBER, IFLA, etc.). The data should be easily accessible via “data shops” (see the example by DNB: https://portal.dnb.de/metadataShop.htm) with a clear indication of licences and re-usability. It would also help to have a single point of access to the data and metadata. This is partly realized via the DDB portal (https://www.dnb.de/DE/Professionell/ProjekteKooperationen/Projekte/DDB-Zeitungsportal/DDB-Zeitungsportal_node.html) at the German level and by Europeana-newspapers (https://www.europeana.eu/de/collections/topic/18-newspapers) at the European level. These two resources only cover newspapers, though, so the inclusion of journals and magazines remains a desideratum. Further, the scope of these resources should be extended to include and provide direct access to the data and metadata produced by individual research projects.