Researchers and Persistent Identifiers: Preserving Data through Accessibility


Author: Tabish Virani (University College London - MA in Digital Humanities programme)

For three weeks in May 2019, Tabish Virani, a student at University College London’s MA in Digital Humanities programme, had a work placement assisting the FREYA team in compiling training materials for researchers. Here he reflects on what he has learned.

Over the past three weeks, I have had the privilege of taking part in a work placement at the British Library, where I assisted with the FREYA project. As a student on UCL’s MA in Digital Humanities programme, I have a strong interest in data protection and data accessibility. During my time at the British Library I was able to broaden my understanding of why data accessibility matters and of the steps needed to ensure it. In particular, I had the opportunity to learn about persistent identifiers (PIDs) and the role they can play in making online data and resources easier to find and access.

Simply put, the role of a persistent identifier is to label online data in a way that allows it to be found even if its online location changes. This is especially useful when an online resource’s URL changes, because the resource can still be found by someone who does not know its current URL. During my time working with FREYA, I looked specifically at how researchers can benefit from the use of PIDs. One way in particular is through a personal PID that distinguishes a researcher from other researchers. Through services such as ORCID, one can obtain a persistent identifier that links to an ORCID profile, a page that displays one’s work. By linking their ORCID iD to the articles they have written, authors can help ensure that they receive credit for their work, as it will be correctly associated with them. Furthermore, anyone looking for their work need not search by name, which can be inefficient if the researcher has a common name or publishes under different variants of their name; instead, one can simply search by ORCID iD and find the researcher’s works in a single place.
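
To make the idea of an ORCID iD a little more concrete, here is a minimal Python sketch that checks whether an iD is well formed. It assumes the check-character scheme that ORCID documents for its identifiers (ISO 7064 MOD 11-2, where the sixteenth character is computed from the first fifteen digits). The function names are my own and are not part of any ORCID library, and the check only covers the structure of an iD; it cannot tell whether the iD has actually been registered.

```python
import re

# Four hyphen-separated groups of four characters; the last may be "X".
ORCID_PATTERN = re.compile(r"([0-9]{4})-([0-9]{4})-([0-9]{4})-([0-9]{3}[0-9X])")


def check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value):
    """Return True if value is a structurally valid ORCID iD (bare or URL form)."""
    candidate = value.strip()
    prefix = "https://orcid.org/"
    if candidate.startswith(prefix):
        candidate = candidate[len(prefix):]
    match = ORCID_PATTERN.fullmatch(candidate)
    if match is None:
        return False
    characters = "".join(match.groups())
    return check_character(characters[:15]) == characters[15]


print(is_valid_orcid("https://orcid.org/0000-0002-0176-8498"))  # prints True
```

A check like this catches most typing mistakes, such as a mistyped digit, before an iD is attached to an article or dataset.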

Although persistent identifiers can be quite useful for data accessibility and data preservation, they cannot be used reliably unless people make a conscious effort to implement them and attach them to their data. It is for this reason that the organizations associated with the FREYA project are encouraging the use of PIDs across a wide range of disciplines. Through this collaboration, FREYA is also working on creating a PID graph, which shows the links between different data points that have persistent identifiers. For example, if an institution has its own persistent identifier and gives a grant to a researcher who has their own PID, and the researcher in turn writes an article with the help of the grant and assigns the article a PID, one could use the PID graph to see how the article is related to both the researcher and the institution. With such a graph, one could discover all the relationships a data point has by following the links between persistent identifiers, which would ultimately make accessing data much easier and more efficient. However, the entire process of creating the PID graph depends on the consistent use of persistent identifiers. If PIDs are used across many disciplines, a wider range of people can benefit from the advances this could bring in data preservation and accessibility.
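
The institution, grant, researcher and article example can be sketched as a very small graph in Python. This is only an illustration of the idea, not a description of how FREYA builds its PID graph: the identifiers below are made-up placeholders, the relation names are my own, and giving the grant its own PID is an assumption made for the sake of the example.

```python
from collections import deque

# Hypothetical links between placeholder PIDs: (source, relation, target).
LINKS = [
    ("institution:example", "awarded", "grant:example"),
    ("grant:example", "awarded_to", "researcher:example"),
    ("researcher:example", "authored", "article:example"),
]


def build_graph(links):
    """Build an undirected adjacency map so links can be followed both ways."""
    graph = {}
    for source, _relation, target in links:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set()).add(source)
    return graph


def find_path(graph, start, goal):
    """Breadth-first search for a shortest chain of PIDs from start to goal."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(LINKS)
print(find_path(graph, "article:example", "institution:example"))
```

Following the links from the article leads back through the researcher and the grant to the institution, which is the kind of relationship the example describes. If any one of those links is missing because a PID was never assigned, the path cannot be found, which is why consistent use of PIDs matters.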

My time working with FREYA at the British Library, though brief, has been quite eye-opening. It has shown me just how much of the responsibility for data preservation falls upon the academic community, and how important it is for anyone who creates data to take an interest in ensuring that it remains accessible. Although implementing PIDs may seem like an extra burden for researchers, the benefits are well worth it in the long run. By using them, researchers can help ensure that their past works stay relevant, because those works will remain accessible for extended periods of time. Without such measures to ensure its longevity, important data can be lost, which hinders further academic progress. Unfortunately, there does not seem to be enough awareness in the academic community of the need to use persistent identifiers, which is why the institutions working as part of the FREYA project have created a platform to bring information about persistent identifiers to light. For those who are interested in learning more, I would recommend visiting the FREYA project’s PID Forum at https://www.pidforum.org/, where you can find a number of useful resources and take part in discussions about persistent identifiers.

Tabish Virani - https://orcid.org/0000-0002-0176-8498