PIDs for historic literature

Author: Nicole Kearney (BHL)

My passion for persistent identifiers (PIDs) really only began in 2014, when I started working with material that sits outside the PID graph: I manage the Australian branch of the Biodiversity Heritage Library (BHL), the world’s largest online repository of biodiversity heritage literature.

The PID graph consisting of DOIs and ORCIDs has revolutionised modern publishing, enabling researchers and organisations to permanently locate, cite, link, connect, share and track all elements of scholarly research. However, this PID graph falls apart when applied to legacy literature (pretty much everything pre-2000).

The vast majority of historic publications lack DOIs and this means they are excluded from the linked network of modern publications, either appearing in reference lists as unlinked citations or not at all, because they are too hard for authors to locate and/or cite. The upshot of this is that our historic literature is in danger of falling into obscurity.

The same is true for historic authors. One of ORCID’s core principles is that individuals control their ORCID and the information attached to it (see ORCID Trust: Individual Control) and therein lies the problem: ORCIDs must be self-administered. This means they cannot be retrospectively assigned to the dead (or to the living who, for whatever reason, have not registered for one). This makes it impossible to connect and track the academic output of historic authors using ORCIDs.

Since 2014, I’ve been working (with many others) to bring the world’s historic biodiversity literature into the linked network of scholarly research and, wherever possible, to use PIDs to achieve this. I started by retrospectively assigning DOIs to the back issues of the Memoirs of Museum Victoria (the journal of my home institution), all the way back to volume 1, published in 1906 (see the citation below). I then ensured that these new DOIs appeared in the bibliographic data wherever else the Memoirs appeared online, such as on the BHL website.



Woodward, A.S. 1906. On a Carboniferous fish fauna from the Mansfield district, Victoria. Memoirs of Museum Victoria, vol. 1, pp. 1-32.


I joined the FREYA Project as an Ambassador in 2018 and that year I was extremely fortunate to win FREYA’s first Ambassador Competition, which enabled me to travel from Australia to Dublin in January 2019 to attend and present at PIDapalooza, the festival of Persistent Identifiers (thank you FREYA!). This gave me the opportunity to speak about an issue I’d stumbled across in my PID work (an issue that still riles me today): commercial publishers assigning DOIs to out-of-copyright journal articles and then placing their definitive versions of these articles behind paywalls (see What are we DOIng about the out-of-copyright literature? and Historic literature, DOIs & PIDapalooza).


Nicole Kearney speaking at PIDapalooza in Dublin, Jan 2019.


Since becoming a FREYA Ambassador, I’ve actively promoted the use of persistent identifiers, particularly as a way to increase the discoverability of historic literature. 2019 was an especially busy year for me: after PIDapalooza in Dublin, I spoke about PIDs in Sydney, New York, Canberra, Bulgaria, Leiden and my home city of Melbourne. However, my PID highlight of 2019 was making the historic journal articles on BHL discoverable via Unpaywall (thanks to the incredible work of Unpaywall Developer Richard Orr and BHL Superuser Professor Roderic Page).


How to turn the “unknown knowns” on BHL into “known knowns” (Nicole Kearney speaking about PIDs at Biodiversity Next in Leiden, October 2019). Credit: Grace Costantino.


2020 has been a very different year. For the past six months all BHL Australia staff (and volunteers) have been working from home. With our core work of digitising historic literature on hold, I had to find alternative projects for the team to work on. This was a wonderful opportunity to increase our focus on improving the discoverability of our existing online content via PIDs. Since March, we’ve gathered article-level metadata for over 30,000 historic journal articles and added this data to BHL and Wikidata. We’re now in the process of adding Wikidata IDs to BHL’s author profiles and retrospectively assigning DOIs to content published as early as the 1700s, for example:


Shaw, G. & Nodder, F.P., 1799. The Duck-Billed Platypus, Platypus anatinus. The Naturalist’s Miscellany, vol. 10 (CXVIII).


Working with historic content is hard. The data we need to assign DOIs to historic journal articles is invariably messy or missing. But it’s worth it. The historic literature is the foundation upon which our understanding of everything is based; its authors are the foremothers and forefathers of current knowledge. The global BHL community have together made over 58 million pages of the world’s biodiversity literature freely accessible online. We’re now working to make every piece of knowledge within it discoverable and part of the global linked network of scholarly research.

Nicole Kearney