The PID Graph




One of the key goals of FREYA is to connect new and existing persistent identifier (PID) services to make the most of the information available in different PID systems. PIDs uniquely identify a publication, dataset, or person, and their metadata can also provide unambiguous links between PIDs of the same type (e.g. journal articles citing other journal articles) or of different types (e.g. a researcher and the datasets they produced).

 

Work is needed to connect existing persistent identifiers to each other in standardized ways, e.g. to the outputs associated with a particular researcher, repository, institution or funder, for discovery and impact assessment. Some of the more complex but important use cases cannot be addressed by simply collecting and aggregating links between pairs of persistent identifiers. These include:

  1. Aggregate the citations for all versions of a dataset or software source code

  2. Aggregate the citations for all datasets hosted in a particular repository, funded by a particular funder, or created by a particular researcher

  3. Aggregate all citations for a research object: a publication, the data underlying the findings in the paper, and the software, samples, and reagents used to create those datasets


To address these use cases we need a more complex model to describe the resources that are identified by PIDs, and the connections between them: a graph. In FREYA, we are building a PID Graph, a network of interconnected PID systems, as a basis for a wide range of services. The PID Graph links PIDs together via the relations in their metadata, which enables the discovery of connections that are two or more “hops” away, i.e. connections that pass through at least one intermediate PID.
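The first use case above (aggregating citations across all versions of a dataset) can be illustrated with a small in-memory graph. This is a minimal sketch and not the FREYA implementation: the identifiers are made up, the relation names `HasVersion` and `Cites` are chosen for illustration, and the function names are hypothetical. The point is that the citing articles are found two hops away from the dataset, via its versions.

```python
from collections import defaultdict

# Hypothetical PIDs and relation names, for illustration only.
# Each edge reads: (source PID, relation, target PID).
EDGES = [
    ("example:dataset", "HasVersion", "example:dataset-v1"),
    ("example:dataset", "HasVersion", "example:dataset-v2"),
    ("example:article-a", "Cites", "example:dataset"),
    ("example:article-a", "Cites", "example:dataset-v1"),
    ("example:article-b", "Cites", "example:dataset-v2"),
    ("example:article-c", "Cites", "example:dataset-v2"),
]


def build_index(edges):
    """Index edges by (source, relation) and by (target, relation)."""
    outgoing = defaultdict(set)
    incoming = defaultdict(set)
    for source, relation, target in edges:
        outgoing[(source, relation)].add(target)
        incoming[(target, relation)].add(source)
    return outgoing, incoming


def direct_citations(pid, edges):
    """One hop: PIDs that cite exactly this PID."""
    _, incoming = build_index(edges)
    return set(incoming.get((pid, "Cites"), set()))


def citations_across_versions(pid, edges):
    """Two hops: PIDs citing this PID or any of its versions."""
    outgoing, incoming = build_index(edges)
    citing = set(incoming.get((pid, "Cites"), set()))
    for version in outgoing.get((pid, "HasVersion"), set()):
        citing |= incoming.get((version, "Cites"), set())
    return citing


if __name__ == "__main__":
    print(sorted(direct_citations("example:dataset", EDGES)))
    print(sorted(citations_across_versions("example:dataset", EDGES)))
```

Collecting only the direct links finds a single citing article, while following the version relation first finds all three. Returning a set means an article that cites several versions is counted once.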

 



Using a graph makes it easier to describe these more complex use cases and relationships, and this approach has frequently been applied to similar questions in the past. FREYA builds on the expertise of, and close collaboration with, the Research Graph team, and adopts the outputs of the Research Data Alliance DDRI Working Group to transform PID connections into an improved graph of research objects. The project takes advantage of best practices in graph modelling and distributed network analysis techniques.



A schematic representation of the PID Graph with digital objects connected by PIDs, showing three use cases: (A) different versions of software code, (B) datasets hosted by a particular repository, (C) all digital objects connected to a research object.


More information

Fenner, M., & Aryani, A. (2019). Introducing the PID Graph (Version 1.0). https://doi.org/10.5438/JWVF-8A66

Fenner, M. (2019). Using Jupyter Notebooks with GraphQL and the PID Graph (Version 1.0). https://doi.org/10.5438/HWAW-XE52

Please reach out to us via the PID Forum if you would like to learn more about the PID Graph, want to see your data in the PID Graph, or are working on a related project and want to coordinate.