FREYA and the many hands that make data work

Author: Frances Madden, British Library

‘Collective Curation: the many hands that make data work’ was the theme of the 15th International Digital Curation Conference, held in Dublin last month. The venue was a stadium, Croke Park, and many attendees were delighted to see the DCC’s Curation Lifecycle Model on the big sports screen, quite a change from match scores. Across two days of talks on all aspects of digital curation practices and skills, FREYA was well represented, with a demonstration of the PID Graph and a paper from the British Library on extending the PID Graph in its datasets collection.



The Digital Curation Lifecycle Model on the big screen in Croke Park at the 15th IDCC

Robin Dasler from DataCite demonstrated the PID Graph and how it can illustrate relationships in the research process. After a brief explanation of what the PID Graph is (see here) and the types of questions it is designed to answer, she demonstrated the DataCite GraphQL API and showed how it can be used both by querying it directly and through Jupyter notebooks to create visualisations. Robin also reported that the GraphQL API, which is in pre-release at present, should have a full release in time for the RDA plenary in March.
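To give a flavour of what querying the API directly can look like, here is a minimal Python sketch. It is not taken from the demonstration: the endpoint URL, the person query and its field names are assumptions about the pre-release schema and should be checked against the current DataCite documentation, and the ORCID iD in the usage note is a placeholder.

```python
import json
import urllib.request

# Assumed public endpoint for the DataCite GraphQL API.
GRAPHQL_URL = "https://api.datacite.org/graphql"


def person_query(orcid_url):
    """Build an illustrative PID Graph query for one researcher.

    The field names (person, datasets, nodes, titles) are assumptions
    about the schema, not confirmed by the talk.
    """
    # json.dumps gives a safely quoted string literal for the id argument.
    return """
    {
      person(id: %s) {
        id
        name
        datasets(first: 5) {
          totalCount
          nodes { id titles { title } }
        }
      }
    }
    """ % json.dumps(orcid_url)


def build_request(orcid_url):
    """Wrap the query in the JSON body that GraphQL servers expect over POST."""
    body = json.dumps({"query": person_query(orcid_url)}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(orcid_url):
    """Send the request and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(build_request(orcid_url), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling run_query("https://orcid.org/0000-0000-0000-0000") with a real ORCID iD in place of the placeholder would return a nested dictionary that can be loaded into a Jupyter notebook for visualisation.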

The following day Jez Cope and Frances Madden from the British Library gave a paper on work done to improve the provenance information in British Library datasets by augmenting their records on the shared research repository with identifiers. This piece of work highlighted a couple of aspects of dataset provenance, including how hard it can be to identify and classify contributors and their roles. In the paper, they highlighted two sample datasets: Theatrical Playbills of Great Britain and Ireland, and UK Doctoral Thesis Metadata from EThOS, the index of UK theses. The theatre dataset was augmented with the names of the theatres with which the playbills were concerned. These were added as Contributor (Other) with ISNIs (International Standard Name Identifiers). The original plan had been to add these ISNIs as related identifiers, but ISNI is not an allowed Related Identifier Type under the DataCite v4.1 schema, something which highlights the novelty of the use case this work addresses. Related identifiers were added for the Archive Resource Key (ARK) identifiers which link to the digitised playbills themselves. For the EThOS metadata, the universities which contribute theses were added as Contributor (Other) with ISNIs.
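The shape of these additions can be sketched in Python using DataCite-style JSON property names. This is an illustration under assumptions, not the code used for the paper: the function names are hypothetical, the list of related identifier types is deliberately partial, the relation type is left to the caller because the paper summary does not name one, and the identifier values in the usage note are placeholders.

```python
import copy

# Partial list of relatedIdentifierType values, for illustration only.
# ISNI is absent, which is the constraint described above.
RELATED_IDENTIFIER_TYPES = {"ARK", "DOI", "Handle", "ISBN", "ISSN", "PURL", "URL", "URN"}


def add_organisation_contributor(record, name, isni):
    """Return a copy of the record with an organisation added as Contributor (Other)."""
    updated = copy.deepcopy(record)
    updated.setdefault("contributors", []).append({
        "name": name,
        "nameType": "Organizational",
        "contributorType": "Other",
        "nameIdentifiers": [
            {"nameIdentifier": isni, "nameIdentifierScheme": "ISNI"}
        ],
    })
    return updated


def add_related_identifier(record, identifier, identifier_type, relation_type):
    """Return a copy of the record with a related identifier, rejecting unlisted types."""
    if identifier_type not in RELATED_IDENTIFIER_TYPES:
        raise ValueError(f"{identifier_type} is not an allowed related identifier type")
    updated = copy.deepcopy(record)
    updated.setdefault("relatedIdentifiers", []).append({
        "relatedIdentifier": identifier,
        "relatedIdentifierType": identifier_type,
        "relationType": relation_type,
    })
    return updated
```

For example, add_organisation_contributor(record, "Example Theatre", "ISNI-PLACEHOLDER") attaches a theatre to a playbills record, while add_related_identifier(record, "ark:/00000/placeholder", "ARK", "References") links it to a digitised item. Passing "ISNI" as the identifier type raises a ValueError, mirroring the schema restriction that pushed the ISNIs into the contributor field.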

Several participants commented on their own issues working with metadata schemas for their particular collections and use cases. There were also some interesting questions about human readability versus machine readability and about managing a very large number of identifiers. This led to some discussion of machines as users of the resources we produce, users we need to design for in the same way as we design for humans.

The British Library represents the humanities disciplines within FREYA. Also of interest to the humanities community is the report Sustainable and FAIR Data Sharing in the Humanities, which was launched by ALLEA at IDCC.