Data Citation: Connecting articles and datasets.

Authors: Helena Cousijn (DataCite) and Rachael Lammey (Crossref)

The FREYA project is all about connecting different identifiers to improve discovery, navigation, retrieval, and access of research resources. One important connection people think and talk about a lot is the connection between articles and datasets. On the side of articles, this connection is usually established through data citation: authors citing datasets that underlie the research in their article. On the side of data repositories, the connection is made when researchers add information about articles to dataset metadata or when repositories do curation work to establish these links.

The concept of data citation is quite well-established and publishers are increasingly implementing policies asking authors to cite the data they used. This is an important first step, but having the infrastructure to process and share these data citations is just as important. There is even a data citation roadmap for scientific publishers to help with this. Once publishers have implemented data citation, they can make the citations available by adding these to the metadata they deposit with Crossref, with data repositories doing the same when they deposit metadata with DataCite.

Both Crossref and DataCite make these links between articles and datasets available through Event Data, a service that was launched by both organizations last year. Through two open APIs, all interested parties can extract connections between articles and datasets. This gives publishers and repositories information that is not available in their own metadata, and gives other organizations access to all available information about data citations.
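To give a feel for what extracting these connections could look like, here is a minimal Python sketch. It is an illustration under assumptions, not a reference client: the endpoint URL, the query parameters (obj-id, source, rows) and the response field names (subj_id, relation_type_id, obj_id) reflect our reading of the Crossref Event Data API and should be checked against the current documentation. The DOIs and the sample response are made up for the example, and no network request is made.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Crossref Event Data query API; verify against the docs.
EVENTS_ENDPOINT = "https://api.eventdata.crossref.org/v1/events"


def build_query_url(obj_doi, source="crossref", rows=100):
    """Build a query URL for events whose object is the given DOI.

    The parameter names are assumptions based on the public API documentation.
    """
    params = {"obj-id": obj_doi, "source": source, "rows": rows}
    return EVENTS_ENDPOINT + "?" + urlencode(params)


def extract_links(response):
    """Turn a decoded JSON response into (subject, relation, object) tuples."""
    events = response.get("message", {}).get("events", [])
    links = []
    for event in events:
        subj = event.get("subj_id")
        rel = event.get("relation_type_id")
        obj = event.get("obj_id")
        # Skip incomplete events rather than guessing at missing fields.
        if subj and rel and obj:
            links.append((subj, rel, obj))
    return links


# Hypothetical response: one article referencing one dataset, plus one
# incomplete event that should be ignored.
SAMPLE_RESPONSE = {
    "status": "ok",
    "message": {
        "events": [
            {
                "subj_id": "https://doi.org/10.1234/article-1",
                "relation_type_id": "references",
                "obj_id": "https://doi.org/10.5678/dataset-1",
            },
            {"subj_id": "https://doi.org/10.1234/article-2"},
        ]
    },
}

if __name__ == "__main__":
    print(build_query_url("10.5678/dataset-1"))
    for subj, rel, obj in extract_links(SAMPLE_RESPONSE):
        print(subj, rel, obj)
```

In practice you would fetch the URL with an HTTP client of your choice and pass the decoded JSON to extract_links. Keeping the parsing separate from the request makes it easy to test against saved responses.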

The article-data links currently available in Event Data will feed into the PID Graph, a graph of relationships across a network of PIDs that’s being developed within the FREYA project. The PID Graph will make it possible to see not only which article is connected with which dataset, but also who the author is and which institute the research outputs are affiliated with, helping both researchers and institutes get recognition and credit for their work.
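The idea of following connections from one identifier to the next can be sketched in a few lines of Python. This is a toy illustration only, not how the PID Graph is implemented: the identifiers, the relation names and the reachable helper are all hypothetical, and the graph is a simple in-memory list of edges.

```python
from collections import deque

# Hypothetical edges: (subject PID, relation, object PID).
EDGES = [
    ("doi:10.1234/article-1", "cites", "doi:10.5678/dataset-1"),
    ("doi:10.5678/dataset-1", "created_by", "orcid:author-1"),
    ("orcid:author-1", "affiliated_with", "ror:institute-1"),
]


def build_adjacency(edges):
    """Index edges in both directions so links can be followed either way."""
    adjacency = {}
    for subj, _relation, obj in edges:
        adjacency.setdefault(subj, set()).add(obj)
        adjacency.setdefault(obj, set()).add(subj)
    return adjacency


def reachable(start, edges, max_hops):
    """Breadth-first search: map each PID within max_hops of start to its distance."""
    adjacency = build_adjacency(edges)
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


if __name__ == "__main__":
    # Starting from an institute, find the research outputs connected to it.
    for pid, distance in reachable("ror:institute-1", EDGES, 3).items():
        print(distance, pid)
```

Starting from the institute, the search reaches the author after one hop, the dataset after two and the citing article after three, which is the kind of chain that lets an institute see which outputs it can claim credit for.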

Do you want to know more about implementing and sharing data citations or about using the available systems to extract information about links between articles and datasets? Then you should join the free joint Crossref & DataCite webinar on January 31st, 4pm UTC. Register at: and join the conversation!