CrossRef metadata conversion in toolbox
As discussed in the TEI Basisformat meeting today, the following procedure is planned for each harvested article:
1. The harvesting script sends a crossref.json or datacite.json file, containing the article's CrossRef or DataCite metadata respectively, to an API endpoint of the toolbox.
- @dariok: Does the endpoint already exist? If so, is it one of those documented here?
2. The toolbox converts the CrossRef metadata to a "Rumpf-TEI-Header" (skeleton TEI header) file, which also contains a standOff element with the attribute type="metadata". All metadata in this file comes from the crossref.json file. The TEI header contains the CrossRef metadata, modified to follow the conventions of the Basisformat (e.g. author names in the form "Mustermann, M."). The separate standOff element stores the original, unmodified CrossRef metadata.
3. The toolbox also converts the metadata to a format suitable for ingest into Archivematica (and, if possible, interoperable with Rosetta) and provides it via an API endpoint (see #11 (comment 13389)).
- @FreundJ and @NVoelkering will evaluate which format is needed and document it in this issue.
4. The metadata quality of the TEI header files created from the CrossRef metadata will be spot-checked manually (e.g. every 100th article). The checks will verify, for example, that the CrossRef metadata contained all necessary information (authors, title, etc.) and that this information is correct, i.e. matches the article's full-text document.
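To make step 2 more concrete, here is a rough Python sketch of how the conversion could look. It is not the toolbox's actual implementation. The function names `format_author` and `build_rumpf_tei` are hypothetical, the input is assumed to be the CrossRef work record itself (with `title` as a list and `author` entries carrying `given` and `family`), and storing the original metadata as JSON text inside standOff is only one possible choice that still needs to be agreed on. The resulting stub is not meant to be schema-valid TEI.

```python
import json
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def format_author(author: dict) -> str:
    """Render a CrossRef author entry as 'Family, G.' (assumed Basisformat style)."""
    family = author.get("family", "")
    given = author.get("given", "")
    # Reduce each given name to its initial, e.g. "Max Peter" -> "M. P."
    initials = " ".join(f"{part[0]}." for part in given.split())
    return f"{family}, {initials}" if initials else family


def build_rumpf_tei(crossref: dict) -> ET.Element:
    """Build a skeleton TEI document from a CrossRef work record (hypothetical helper)."""
    ET.register_namespace("", TEI_NS)
    tei = ET.Element(f"{{{TEI_NS}}}TEI")
    header = ET.SubElement(tei, f"{{{TEI_NS}}}teiHeader")
    file_desc = ET.SubElement(header, f"{{{TEI_NS}}}fileDesc")
    title_stmt = ET.SubElement(file_desc, f"{{{TEI_NS}}}titleStmt")

    # Modified metadata goes into the header.
    title = ET.SubElement(title_stmt, f"{{{TEI_NS}}}title")
    title.text = (crossref.get("title") or [""])[0]
    for author in crossref.get("author", []):
        author_el = ET.SubElement(title_stmt, f"{{{TEI_NS}}}author")
        author_el.text = format_author(author)

    # Original, unmodified metadata goes into standOff (JSON text is an assumption).
    stand_off = ET.SubElement(tei, f"{{{TEI_NS}}}standOff", {"type": "metadata"})
    stand_off.text = json.dumps(crossref, ensure_ascii=False)
    return tei


if __name__ == "__main__":
    record = {
        "title": ["An Example Article"],
        "author": [{"given": "Max", "family": "Mustermann"}],
    }
    print(ET.tostring(build_rumpf_tei(record), encoding="unicode"))
```

Keeping the normalisation in a small separate function would make it easy to compare the header values against the untouched record in standOff during the manual checks in step 4.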
@dariok, @sika: Is this description correct? If not, please correct or complete it. Thank you in advance!
Edited by Jens Freund