Working with Full-Texts in the History of Science: Development of XML-Workflow and Content-based Web Access

Cooperation Partners: 

Max Planck Digital Library

Figure: Example of an XML full-text representation in the ECHO viewer environment

This project aims to support the implementation of some of the key concepts of the Epistemic Web. A group has been established in cooperation with the Max Planck Digital Library (MPDL) to complement the MPDL’s generic infrastructure with an application layer and interface that serve the specific needs of scholars working with sources in the humanities. This application layer consists of a content-based access mechanism for texts that incorporates language technology and thus enables semantic access to their content. To prepare historical sources for ingestion into this infrastructure, a workflow is being developed to structure texts in an XML format, a prerequisite for integrating texts with various tools. As a testbed for this development, sources from a variety of the Department’s projects, written in different languages such as Chinese, Greek, Italian and Latin, have been used, opening up new approaches to the computer-assisted analysis of their content. Upon completion of the project, the application layer will be freely accessible. It is conceived as a prototype that can be further generalized within the MPDL and made available to a broad scholarly community.