Data policy
The output of research is not only journal articles but also data sets, model code, samples, etc. Only the entire network of interconnected information can guarantee integrity, transparency, reuse, and reproducibility of scientific findings. Moreover, all of these resources provide great additional value in their own right. Hence, it is particularly important that data and other information underpinning the research findings are "findable, accessible, interoperable, and reusable" (FAIR) not only for humans but also for machines.
Therefore, Copernicus Publications requests depositing data that correspond to journal articles in reliable (public) data repositories, assigning digital object identifiers, and properly citing data sets as individual contributions. Please find your appropriate data repository in the registry for research data repositories: re3data.org. A data citation in a publication resembles a bibliographic citation and needs to be included in the publication's reference list. To foster the accessibility as well as the proper citation of data, Copernicus Publications requires all authors to provide a statement on the availability of underlying data as the last paragraph of each article (see the section on data availability). In addition, data sets, model code, video supplements, video abstracts, International Geo Sample Numbers, and other digital assets should be linked to the article through DOIs in the assets tab. With Earth System Science Data (ESSD), Copernicus Publications provides a journal dedicated to the publication of data papers, including peer review of data sets. Authors should consider submitting a data paper to ESSD in addition to their research paper in another journal published by Copernicus Publications.
Best practice
following the Joint Declaration of Data Citation Principles initiated by FORCE11:
Preamble
Sound, reproducible scholarship rests upon a foundation of robust, accessible data. For this to be so in practice as well as theory, data must be accorded due importance in the practice of scholarship and in the enduring scholarly record. In other words, data should be considered legitimate, citable products of research. Data citation, like the citation of other evidence and sources, is good research practice and is part of the scholarly ecosystem supporting data reuse.
In support of this assertion, and to encourage good practice, we offer a set of guiding principles for data within scholarly literature, another dataset, or any other research object.
Principles
The Data Citation Principles cover purpose, function and attributes of citations. These principles recognize the dual necessity of creating citation practices that are both human understandable and machine-actionable.
These citation principles are not comprehensive recommendations for data stewardship. And, as practices vary across communities and technologies will evolve over time, we do not include recommendations for specific implementations, but encourage communities to develop practices and tools that embody these principles.
The principles are grouped so as to facilitate understanding, rather than according to any perceived criteria of importance.
1. Importance
Data should be considered legitimate, citable products of research. Data citations should be accorded the same importance in the scholarly record as citations of other research objects, such as publications.
2. Credit and attribution
Data citations should facilitate giving scholarly credit and normative and legal attribution to all contributors to the data, recognizing that a single style or mechanism of attribution may not be applicable to all data.
3. Evidence
In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited.
4. Unique identification
A data citation should include a persistent method for identification that is machine actionable, globally unique, and widely used by a community.
5. Access
Data citations should facilitate access to the data themselves and to such associated metadata, documentation, code, and other materials, as are necessary for both humans and machines to make informed use of the referenced data.
6. Persistence
Unique identifiers, and metadata describing the data, and its disposition, should persist -- even beyond the lifespan of the data they describe.
7. Specificity and verifiability
Data citations should facilitate identification of, access to, and verification of the specific data that support a claim. Citations or citation metadata should include information about provenance and fixity sufficient to facilitate verifying that the specific timeslice, version and/or granular portion of data retrieved subsequently is the same as was originally cited.
8. Interoperability and flexibility
Data citation methods should be sufficiently flexible to accommodate the variant practices among communities, but should not differ so much that they compromise interoperability of data citation practices across communities.
Statement on the availability of underlying data
Authors are required to provide a statement on how their underlying research data can be accessed. This must be placed as the section "Data availability" at the end of the manuscript. Please see the manuscript preparation guidelines for authors for the correct sequence. The best way to provide access to data is by depositing them (as well as related metadata) in FAIR-aligned reliable public data repositories, assigning digital object identifiers, and properly citing data sets as individual contributions. If different data sets are deposited in different repositories, this needs to be indicated in the data availability section. If data from a third party were used, this needs to be explained (including a reference to these data).
DataCite recommends the following elements for a data citation:
creators: title, publisher/repository, identifier, publication year (e.g. Loew, A., Bennartz, R., Fell, F., Lattanzio, A., Doutriaux-Boucher, M., and Schulz, J.: Surface Albedo Validation Sites, EUMETSAT [data set], http://dx.doi.org/10.15770/EUM_SEC_CLM_1001, 2015).
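As an illustration only, the citation pattern above can be assembled from its elements with a few lines of Python. This is a minimal sketch: the names DataCitation, join_creators, and format_citation are hypothetical and are not part of any Copernicus or DataCite tool, and the "[data set]" label and punctuation simply mirror the example given above.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """The elements listed above (hypothetical container for illustration)."""
    creators: list[str]   # each as "Surname, Initials."
    title: str
    publisher: str        # publisher or repository
    identifier: str       # persistent identifier, e.g. a DOI URL
    year: int             # publication year


def join_creators(names: list[str]) -> str:
    """Join names as 'A, B, and C', matching the style of the example above."""
    if len(names) <= 1:
        return "".join(names)
    if len(names) == 2:
        return " and ".join(names)
    return ", ".join(names[:-1]) + ", and " + names[-1]


def format_citation(c: DataCitation) -> str:
    """Build 'creators: title, publisher [data set], identifier, year.'"""
    return (
        f"{join_creators(c.creators)}: {c.title}, "
        f"{c.publisher} [data set], {c.identifier}, {c.year}."
    )


# The data set cited in the example above.
example = DataCitation(
    creators=[
        "Loew, A.", "Bennartz, R.", "Fell, F.",
        "Lattanzio, A.", "Doutriaux-Boucher, M.", "Schulz, J.",
    ],
    title="Surface Albedo Validation Sites",
    publisher="EUMETSAT",
    identifier="http://dx.doi.org/10.15770/EUM_SEC_CLM_1001",
    year=2015,
)

print(format_citation(example))
```

Keeping the elements in a structured record rather than a free-text string makes it easier to check that none of them is missing before the reference list is compiled.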
If the data are not publicly accessible at the time of final publication, the data statement should describe where and when they will appear, and provide information on how readers can obtain the data until then. Nevertheless, authors should make such embargoed data available to reviewers during the review process in order to foster reproducibility. The Copernicus review system allows authors to mark such assets as 'access limited to reviewers'; reviewers must then sign a statement that they will use the data only for the purpose of reviewing, without making copies, sharing, or reusing them.
In rare cases where the data cannot be deposited publicly (e.g., because of commercial constraints), a detailed explanation of why this is the case is required. The data needed to replicate figures in a paper should in any case be publicly available, either in a public database (strongly recommended), or in a supplement to the paper.
Other underlying material
Data are not the only information that matters for reproducibility. Therefore, Copernicus Publications encourages authors to also deposit software, algorithms, model code, video supplements, video abstracts, International Geo Sample Numbers, and other underlying material in suitable FAIR-aligned repositories/archives whenever possible. These materials should be referenced in the article and cited via a persistent identifier such as a DOI.
With regard to software citation, please refer to the FORCE11 Software Citation Principles.