The PID Graph


One of the key goals of FREYA was to connect new and existing persistent identifier (PID) services to make the most of the information available in different PID systems. PIDs are important not only for uniquely identifying a publication, dataset, or person: the metadata for these persistent identifiers can also provide unambiguous links between persistent identifiers of the same type, e.g. journal articles citing other journal articles, or of different types, e.g. linking a researcher and the datasets they produced.


Work is needed to connect existing persistent identifiers to each other in standardized ways, e.g. to link a particular researcher, repository, institution or funder to the associated outputs, for discovery and impact assessment. Some of the more complex but important use cases can’t be addressed by simply collecting and aggregating links between two persistent identifiers, including:

  1. Aggregate the citations for all versions of a dataset or software source code

  2. Aggregate the citations for all datasets hosted in a particular repository, funded by a particular funder, or created by a particular researcher

  3. Aggregate all citations for a research object: a publication, the data underlying the findings in the paper, and the software, samples, and reagents used to create those datasets

To address these use cases we need a more complex model to describe the resources that are identified by PIDs, and the connections between them: a graph. In FREYA, we have been building a PID Graph, a network of interconnected PID systems, as a basis for a wide range of services. The PID Graph links PIDs together via relations in their metadata, which enables the discovery of connections that are two or more “hops” away.
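As a rough illustration of what a two-hop query looks like, the sketch below models a handful of PIDs as an in-memory edge list and aggregates the citations for all versions of a dataset (the first use case above). It is not the FREYA implementation: the identifiers, the relation names (`cites`, `is_version_of`, `created`) and the helper functions are all made up for this example.

```python
from collections import defaultdict

# Hypothetical edges: (source PID, relation, target PID).
# All identifiers and relation names are invented for illustration.
EDGES = [
    ("doi:10.5555/article-a", "cites", "doi:10.5555/dataset-v1"),
    ("doi:10.5555/article-b", "cites", "doi:10.5555/dataset-v2"),
    ("doi:10.5555/article-c", "cites", "doi:10.5555/dataset-v2"),
    ("doi:10.5555/dataset-v1", "is_version_of", "doi:10.5555/dataset"),
    ("doi:10.5555/dataset-v2", "is_version_of", "doi:10.5555/dataset"),
    ("orcid:example-researcher", "created", "doi:10.5555/dataset-v1"),
]


def build_incoming_index(edges):
    """Map (relation, target) to the set of sources pointing at that target."""
    incoming = defaultdict(set)
    for source, relation, target in edges:
        incoming[(relation, target)].add(source)
    return incoming


def build_outgoing_index(edges):
    """Map (source, relation) to the set of targets that source points at."""
    outgoing = defaultdict(set)
    for source, relation, target in edges:
        outgoing[(source, relation)].add(target)
    return outgoing


def citations_for_all_versions(concept_pid, incoming):
    """Hop 1: concept PID to its versions. Hop 2: versions to citing works."""
    versions = incoming.get(("is_version_of", concept_pid), set())
    citing = set()
    for pid in versions | {concept_pid}:
        citing |= incoming.get(("cites", pid), set())
    return citing


def citations_for_researcher(researcher_pid, incoming, outgoing):
    """Hop 1: researcher to created outputs. Hop 2: outputs to citing works."""
    citing = set()
    for output in outgoing.get((researcher_pid, "created"), set()):
        citing |= incoming.get(("cites", output), set())
    return citing


incoming_index = build_incoming_index(EDGES)
outgoing_index = build_outgoing_index(EDGES)

all_version_citations = citations_for_all_versions(
    "doi:10.5555/dataset", incoming_index
)
researcher_citations = citations_for_researcher(
    "orcid:example-researcher", incoming_index, outgoing_index
)

print(sorted(all_version_citations))
print(sorted(researcher_citations))
```

Neither result can be read off a single link between two PIDs: each answer depends on following one relation (version or authorship) and then a second one (citation), which is why a graph model fits these use cases.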




Using a graph makes it easier to describe these more complex use cases and relationships, and this approach has been frequently applied to similar questions in the past. FREYA built on the expertise of, and close collaboration with, the RDA Open Science Graphs IG. The project took advantage of best practices in graph modelling and distributed network analysis techniques.


More information

Fenner, M., & Aryani, A. (2019). Introducing the PID Graph (Version 1.0).

Fenner, M. (2019). Using Jupyter Notebooks with GraphQL and the PID Graph (Version 1.0).

Please reach out to us via the PID Forum if you are interested in learning more about the PID Graph, want to see your data in the PID Graph, or are working on a related project and want to coordinate.