Benefits of citing data
Proper citation of data sources has both immediate and long-term benefits for users and producers of data. “Data citation is the practice of referencing data products used in research. A data citation includes key descriptive information about the data, such as the title, source, and responsible parties.” (USGS)
Benefits for data producers
- provides proper attribution and credit
- creates a bibliographic “trail”, connecting publications and supporting data, and establishing a timeline of publication and usage
- demonstrates the impact of their work and establishes research data as an important contribution to the scholarly record
Benefits for data users
- makes datasets easier to find
- supports persistence of datasets
- encourages the reuse of data for new research questions
Benefits for everyone
- increases transparency and reproducibility
Components of a data citation
Citing data is very similar to citing publications; there are many “correct” formats to use, but we suggest including the following important information:
- creator(s) or contributor(s)
- date of publication
- title of dataset
- publisher
- identifier (e.g. Handle, ARK, DOI) or URL of source
- version, when appropriate
- date accessed, when appropriate
The order of the information is less important than including enough of it for a reader to find the dataset(s) used. Consider the style guidelines of the research domain or lab group, data source, or preferred publisher (see related information).
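As an illustration only, the components listed above can be collected in a small record and joined into a citation string. The sketch below is in Python and assumes an author (year) title, publisher, identifier ordering similar to the DataCite example in the table of styles; the `DataCitation` class and `format_citation` function are hypothetical names invented for this example, not part of any citation library or standard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCitation:
    """Hypothetical record holding the suggested citation components."""
    creators: List[str]             # creator(s) or contributor(s)
    year: int                       # date of publication (year only here)
    title: str                      # title of dataset
    publisher: str                  # publisher or repository
    identifier: str                 # Handle, ARK, DOI, or URL of source
    version: Optional[str] = None   # version, when appropriate
    accessed: Optional[str] = None  # date accessed, when appropriate


def format_citation(c: DataCitation) -> str:
    """Join the components in one assumed order; other orders are equally valid."""
    parts = [f"{'; '.join(c.creators)} ({c.year}) {c.title}"]
    if c.version:
        parts.append(f"Version {c.version}")
    parts.append(c.publisher)
    parts.append(c.identifier)
    text = ". ".join(parts)
    if c.accessed:
        text += f". Accessed {c.accessed}."
    return text


citation = DataCitation(
    creators=["Barclay, Janet Rice"],
    year=2013,
    title="Stream Discharge from Harford, NY",
    publisher="Cornell University Library eCommons Repository",
    identifier="http://hdl.handle.net/1813/34425",
)
print(format_citation(citation))
```

Optional components (version, date accessed) are only emitted when supplied, which mirrors the "when appropriate" qualifier in the list. A real workflow should follow the format required by the relevant publisher or repository rather than this assumed ordering.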
Some publishers specify a suggested citation format that calls for additional information (e.g. resource type, retrieval date, funder/sponsor). They may also request citation of related publication(s) along with the data. Be sure to review citation style guides carefully. When a citation format is not specified, you can follow your discipline’s scholarly citation style. The next section provides examples of common repository styles, as well as APA/MLA/Chicago styles.
Examples of data citation styles
Style | Example(s) | More information
---|---|---
APA (6th edition) | Smith, T.W., Marsden, P.V., & Hout, M. (2011). General social survey, 1972-2010 cumulative file (ICPSR31521-v1) [data file and codebook]. Chicago, IL: National Opinion Research Center [producer]. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor]. doi: 10.3886/ICPSR31521.v1 | IASSIST guidelines |
Chicago | Smith, Tom W., Peter V. Marsden, and Michael Hout. 2011. General Social Survey, 1972-2010 Cumulative File. ICPSR31521-v1. Chicago, IL: National Opinion Research Center. Distributed by Ann Arbor, MI: Inter-university Consortium for Political and Social Research. doi:10.3886/ICPSR31521.v1 | IASSIST guidelines |
DataCite | Barclay, Janet Rice (2013) Stream Discharge from Harford, NY. Cornell University Library eCommons Repository. http://hdl.handle.net/1813/34425 | DataCite guidelines |
DRYAD | Yannic G, Pellissier L, Dubey S, Vega R, Basset P, Mazzotti S, Pecchioli E, Vernesi C, Hauffe HC, Searle JB, Hausser J (2012) Data from: Multiple refugia and barriers explain the phylogeography of the Valais shrew, Sorex antinorii (Mammalia: Soricomorpha). Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.2jj36325 | DRYAD guidelines |
ESIP | Cline, D., R. Armstrong, R. Davis, K. Elder, and G. Liston. 2003. CLPX-Ground: ISA snow depth transects and related measurements ver. 2.0. Edited by M. A. Parsons and M. J. Brodzik. NASA National Snow and Ice Data Center Distributed Active Archive Center. https://doi.org/10.5060/D4MW2F23. Accessed 2008-05-14. | ESIP guidelines |
ICPSR | Jacob, Philip, and Henry Teune. International Studies of Values in Politics, 1966. ICPSR07006-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 1978. doi:10.3886/ICPSR07006.v1 | ICPSR guidelines |
Figshare | Rodriguez, Tommy (2013): 17,170 Base Pair Alignment of Thirteen Time-Extended Lineages [data: (complete) mtDNA; format: ClustalW]. figshare. https://dx.doi.org/10.6084/m9.figshare.815894 Retrieved: 16:26, Jan 04, 2016 (GMT) | Figshare guidelines |
MLA (7th edition) | Smith, Tom W., Peter V. Marsden, and Michael Hout. General Social Survey, 1972-2010 Cumulative File. ICPSR31521-v1. Chicago, IL: National Opinion Research Center [producer]. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2011. Web. 23 Jan 2012. doi:10.3886/ICPSR31521.v1 | IASSIST guidelines |
The Digital Curation Centre (DCC) provides additional guidance on how to cite datasets and link to publications.
Related information
- The Austin Principles (Linguistics Data Citation)
- Dataset and Software References and Citation Examples (American Meteorological Society (AMS))
- Data Citation (Australian National Data Service (ANDS))
- Data Citation (United States Geological Survey (USGS))
- Data Citation Guidelines for Earth Science Data Version 2 (Earth Science Information Partners (ESIP), 2019)
- Data Citation Standards and Practices (CODATA-ICSTI)
- Data Citation Synthesis Group: Joint Declaration of Data Citation Principles (FORCE11)
- Get Recognition: Data Citation (The DataVerse Network)
- Data Citation (DataONE)