Data Management

All data management measures aim to support researchers in creating, using, analysing, linking and curating research data, and to ensure that the data are preserved and remain reusable, including for machine processing. Certified data repositories of the applicant and participating institutions and, in the future, of other partners play a decisive role in implementing Text+’s data hosting and archiving services. Each institution takes responsibility for the research data most closely related to its mission and expertise. From the outset, researchers are supported in creating data management plans, in selecting suitable repositories and licenses, in selecting and using subject-specific tools and procedures (e.g. annotation procedures, text and data mining), and in applying and further developing standards and authority data for representing research data and their metadata, so that interoperability is established. The FAIRification of existing data holdings is also part of the portfolio. The data portfolio and the range of digital methods and tools are further developed in annual community-based review and innovation cycles.

The diversity and breadth of language and text-based data place special demands on research data management. This diversity covers metadata formats, data formats and degrees of data structure. Because of copyright and data protection requirements, these heterogeneous research data are distributed among many different actors and, to a large extent, have to be offered at different geographical locations. This creates the need for a geographically distributed research data infrastructure that complies with the FAIR principles, meets the legal requirements, and follows common standards and recommendations such as BSI IT-Grundschutz and ISO/IEC 27001. Text+ is dedicated to orchestrating existing and future activities that span the entire research data lifecycle.

The applicant and participating institutions have considerable experience in setting up and operating a geographically distributed network of certified data centres. Certification, which will be mandatory for all data repositories involved in Text+, will be based on the international standards of the CoreTrustSeal. In implementing the FAIR principles, Text+ can draw on the following existing components of research data management: compliance with metadata standards, standards-compliant protocols for metadata harvesting, assignment of persistent identifiers to research data, and standards-compliant protocols for authentication, authorization and identification via an authentication and authorization infrastructure (AAI). The applicant and participating institutions of Text+ have developed sophisticated data storage solutions that have been in operation for several years and are used for the curation and archiving of language and text-based research data.
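
To make the interplay of metadata standards, harvesting and persistent identifiers more concrete, the following minimal sketch checks a single harvested metadata record. It is an illustration under stated assumptions, not a description of the Text+ implementation: the text above does not name a specific metadata format or protocol, so Dublin Core is assumed here as the record format, the sample record and its identifiers are invented, and the function names (extract_metadata, persistent_identifiers, check_record) as well as the list of resolver prefixes are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record; all values are invented for illustration.
SAMPLE_RECORD = """\
<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example corpus</dc:title>
  <dc:identifier>https://hdl.handle.net/00000/example-1</dc:identifier>
  <dc:identifier>local-id-42</dc:identifier>
  <dc:rights>CC BY 4.0</dc:rights>
</record>
"""

# Namespace of the Dublin Core element set, in ElementTree's "{uri}" notation.
DC = "{http://purl.org/dc/elements/1.1/}"

# Resolver prefixes treated as persistent identifiers in this sketch (an assumption).
PID_PREFIXES = ("https://hdl.handle.net/", "https://doi.org/")


def extract_metadata(xml_text):
    """Collect all Dublin Core fields of a record as {field name: [values]}."""
    root = ET.fromstring(xml_text)
    fields = {}
    for element in root:
        if element.tag.startswith(DC):
            name = element.tag[len(DC):]
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


def persistent_identifiers(fields):
    """Return the identifiers that start with a known resolver prefix."""
    return [i for i in fields.get("identifier", []) if i.startswith(PID_PREFIXES)]


def check_record(xml_text):
    """Report whether a record has a title, a rights statement and a PID."""
    fields = extract_metadata(xml_text)
    return {
        "has_title": bool(fields.get("title")),
        "has_rights": bool(fields.get("rights")),
        "pids": persistent_identifiers(fields),
    }


if __name__ == "__main__":
    print(check_record(SAMPLE_RECORD))
```

A check of this kind could run after each harvesting pass to flag records that lack a persistent identifier or a rights statement before they enter a shared catalogue. Which fields are mandatory would depend on the metadata profile agreed within the consortium.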