Earth Science and Biodiversity Journals can Improve Support for Data Sharing

This study reviews the research data policies and author instructions of 31 journals from the Earth sciences and from biodiversity that are published by German learned societies or research institutions. 12 journals do not address data sharing at all. The statements on data sharing in the journals' data policies/author guidelines were matched to 14 defined features of journal research data policies. A brief discussion of the quality of data policies is presented to raise awareness among German learned societies/research institutions and to guide them towards improved data policies for their journals.


Introduction
Statements on data sharing in journals' data policies and/or author guidelines recommend or mandate that data related to an article be made available. Alongside technical infrastructure for data sharing and nontechnical aspects such as scientists' positive personal attitudes towards data sharing (Stieglitz et al. 2020), these statements are important drivers for increasing data publication for the purpose of scientific transparency and data reuse. Learned societies and research institutions frequently publish scientific journals, and as the "owners" of the journals, they have control over (or, as co-publishers, at least some influence on) the existence and contents of the journals' statements on data sharing.
Traditionally, journal articles convey information on the underlying data in chapters on data collection procedures and analysis techniques as well as in the description of results. Further, results are often reported in diagrams and data tables embedded in the text document. Most journals allow data tables too large to be included in the text document, as well as other research products such as movies, audio files, or detailed information related to the research described in the article, to be published as "Supplemental electronic material" along with the journal article on the publisher's website.
However, modern data publication standards have evolved far beyond the practices described above: they call for data (including software, models etc.) to be routinely shared in ways that allow easy discovery, recombination, and reuse, and for information about samples, methods, and tools to be standardized, available, and linked across publications. Current best practice for data publication has been described for several research domains, including the Earth, space, and environmental sciences (Enabling FAIR Data Community et al. 2018; Enabling FAIR Data Community, 2018) and biodiversity research (Penev et al. 2017). A common request in these recommendations is that all core research outputs should be directed to FAIR-aligned repositories. This stipulates that supplements on journal publishers' websites will no longer be used as the primary archive for data, a demand that is being implemented slowly but increasingly (Kwon, 2019; Santos et al. 2005).
To adopt best practices in data sharing and guidance for authors, journals should have comprehensive, clear and up-to-date information in place that describes the general approach to data sharing as well as the details of what is expected from authors in terms of the provision of research data. This should be stated in research data policies that are related to (or part of) the author instructions.
This study reviews the data policies of 31 journals from the Earth sciences and from biodiversity that are published by German learned societies or research institutions. The focus is on journals that mainly publish original research articles. Since data publication is also relevant to PhD and habilitation theses, the journal "SDGG Schriftenreihe der Deutschen Gesellschaft für Geowissenschaften" (English: "SDGG Publication Series of the German Society for Geosciences"), which publishes conference proceedings, monographs as well as dissertations and habilitation theses, is included.
This survey identified whether journals had an explicit data policy in place and/or addressed the topic of data sharing in the author instructions. The statements on data sharing in the journals' data policies/author guidelines (status May 2020) were then matched to the 14 features of journal research data policies as defined by Hrynaszkiewicz et al. (2020) (matching results: Hübner, 2020). The policy framework of Hrynaszkiewicz et al. (2020) is a recent and comprehensive effort to support journal data policy development and implementation across all scientific disciplines. Earlier attempts include the CODATA best practice guidelines for research data policy (Hodson & Molloy 2015), the TOP guidelines (Nosek et al. 2015) and the JISC model policy (Sturges et al. 2015). To identify the 14 features, Hrynaszkiewicz et al. (2020) reviewed, combined and harmonized requirements from existing scholarly publishers' research data policies (Springer Nature, Elsevier, Wiley, PLOS). The CODATA best practice guidelines for research data policy (Hodson & Molloy 2015) and the TOP guidelines (Nosek et al. 2015) were also included in the review of existing policy frameworks. The 14 features were then combined into a set of six policy types (tiers), with features and requirements increasing from policy type one through to six. The six tiers allow for a more nuanced, step-wise and robust implementation of policies by different journals. It is important to note that this policy framework is firmly rooted in the international research data community, as it incorporates community feedback from Research Data Alliance (RDA) plenary meetings and community conference calls/web meetings as well as public consultations. The framework is an official output of the RDA Policy standardisation and implementation Interest Group. It intentionally avoids using the FAIR acronym (Wilkinson et al. 2016) because researchers are not very familiar with it. However, the 14 features strongly support the idea of making data FAIR.

Information on data sharing
Information on the journals of this study as well as the availability of information on data sharing is presented in Table A [located in Appendix A].
12 out of 31 surveyed journals do not address data sharing at all: they have no data policy and address data sharing neither in the author instructions nor on the journal website of the society/institution (the journal owner).
19 journals address data sharing, although in variable ways. Some journals provide very little information on data publication: they only offer the possibility of adding "supplemental material" to the journal article, to be published along with the article on the publisher's website. Other journals offer more specific information on the conditions of supplemental material publication. For the 10 journals that have an explicit data policy, the policy is either an individual document/website or integrated into the author instructions. Only two societies, Deutsche Geophysikalische Gesellschaft DGG and Deutsche Mineralogische Gesellschaft DMG, address data sharing on their respective websites (

Content of information on data sharing
The statements on data sharing in the journals' data policies/author guidelines were matched to the 14 features of journal research data policies as defined by Hrynaszkiewicz et al. (2020) (matching results: Hübner 2020). 19 out of 31 journals address data sharing either in their author guidelines or in a data policy. The only feature that is addressed by all 19 journals is feature #4 "Supplementary materials" (Table 1). For five journals, this is the only feature addressed (Table 2). Please note that merely mentioning the possibility of submitting supplemental material, without any further information, already qualified a journal to be included in the group of journals that address data sharing. Two journals address two features: feature #4 plus feature #10 "Data formats and standards", and two journals combine three features: feature #4 plus feature #5 "Data repositories" and feature #9 "Data availability statements". Other journals combine more/other features in various ways. Three features are not addressed in any of the journals' texts about data sharing: #3 "Embargoes", #13 "Peer review of data", and #14 "Data Management Plans".
Table 1: Features of data policies as defined by Hrynaszkiewicz et al. (2020) and the number of journals that address this feature in data policies/author instructions.

| Feature (feature number) | Number of journals |
| --- | --- |
| Supplementary materials (4) | 19 |
| Data repositories (5) | 12 |
| Data citation (6) | 10 |
| Data availability statements (9) | 9 |
| Data formats and standards (10) | 8 |
| Definition of research data (1) | 7 |
| Data licensing (7) | 5 |
| Researcher/author support (8) | 3 |
| Definition of exceptions (2) | 2 |
| Mandatory data sharing (all papers) (12) | 2 |
| Mandatory data sharing (specific papers) (11) | 1 |
| Embargoes (3) | 0 |
| Peer review of data (13) | 0 |
| Data Management Plans (14) | 0 |

Hübner: Earth Science and Biodiversity Journals can Improve Support for Data Sharing Art. 37, page 3 of 10

Addressing data sharing: formats
10 journals have an explicit "data policy". This policy may be a separate text in the form of an individual document or individual website, but in most cases, it is a sub-chapter of the author guidelines. In the latter case, information on data sharing is present in different parts of the author instructions, not exclusively in the chapter "data policy", making it difficult for authors to get the full picture of the data sharing policy of the respective journal.
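The counts in Table 1 can be put in relation to the 19 journals that address data sharing. The following is a minimal sketch of that arithmetic; the counts are copied from Table 1, while the variable and function names are illustrative assumptions and not part of the study.

```python
# Counts copied from Table 1: feature name -> (feature number, journals addressing it).
TABLE_1 = {
    "Supplementary materials": (4, 19),
    "Data repositories": (5, 12),
    "Data citation": (6, 10),
    "Data availability statements": (9, 9),
    "Data formats and standards": (10, 8),
    "Definition of research data": (1, 7),
    "Data licensing": (7, 5),
    "Researcher/author support": (8, 3),
    "Definition of exceptions": (2, 2),
    "Mandatory data sharing (all papers)": (12, 2),
    "Mandatory data sharing (specific papers)": (11, 1),
    "Embargoes": (3, 0),
    "Peer review of data": (13, 0),
    "Data Management Plans": (14, 0),
}

# 19 of the 31 surveyed journals address data sharing in some form.
JOURNALS_ADDRESSING_DATA_SHARING = 19


def feature_shares(table, total):
    """Return {feature name: percentage of `total` journals addressing it}."""
    return {name: 100 * count / total for name, (_, count) in table.items()}


def unaddressed(table):
    """Return the sorted feature numbers that no journal addresses."""
    return sorted(number for number, count in table.values() if count == 0)


shares = feature_shares(TABLE_1, JOURNALS_ADDRESSING_DATA_SHARING)
for name, share in shares.items():
    print(f"{name:<42}{share:5.1f} %")
print("Feature numbers not addressed by any journal:", unaddressed(TABLE_1))
```

Using 19 rather than 31 as the denominator is a choice made for this sketch; it expresses each feature relative to the journals that address data sharing at all.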

Discussion
About half (9 out of 19) of the Earth science journals in this study do not address data sharing at all. This share is higher than in the results of Malicki (2019), where only one third (5 out of 15) of the journals in the sub-group "Earth and Planetary Sciences" do not address data sharing (including the acceptance of data(sets) as supplementary materials) in their instructions to authors. For the whole dataset of Malicki (2019) (835 scientific journals from all disciplines), 60% of journals do not mention data sharing in their instructions to authors. The difference between the Earth science journals in this study and in Malicki's (2019) study indicates that Earth science journals owned by German societies/institutions in particular are lagging behind in addressing data sharing.
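The shares compared above can be reproduced with a few lines of arithmetic. This is only an illustrative sketch: the numbers are those quoted in the text, and the helper name `share` is an assumption introduced here.

```python
def share(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100 * part / whole


# Earth science journals that do not address data sharing.
this_study = share(9, 19)          # this survey
malicki_subgroup = share(5, 15)    # "Earth and Planetary Sciences" in Malicki (2019)
malicki_all = 60.0                 # all 835 journals in Malicki (2019), as quoted

print(f"This study:                 {this_study:.1f} %")
print(f"Malicki (2019), sub-group:  {malicki_subgroup:.1f} %")
print(f"Malicki (2019), all fields: {malicki_all:.1f} %")
```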

Quality of journals' information on data sharing
The number of features addressed in the information on data sharing in data policies/author instructions indicates quality in that it reflects the completeness or comprehensiveness of the information, with journals in the upper ranges of Table 2 as positive examples. The specific effects of individual features in statements on data publication are unknown for many of the features. Clearly, mandating data publication is a strong stance, but it could have the adverse effect that authors refrain from publishing in that journal because they do not want, for whatever reason, to publish the underlying data. The role of data availability statements in citation advantages has been investigated, showing that articles that include data availability statements linking to data in a repository have an up to about 25% higher citation impact on average (Colavizza et al. 2020). However, more in-depth studies of this kind are necessary to evaluate the (positive) effects of individual subjects in journal statements on data sharing. In addition to completeness, clarity and up-to-dateness are further, and most important, parameters that are essential to the overall quality of author information on data sharing. Whatever the policy of a journal is, it should phrase expectations of authors clearly, in an understandable, precise and concrete manner. However, in some of the surveyed journals' texts, ambiguous and inconsistent statements were encountered, making it hard for authors to identify the journal's expectations on data sharing. For example, the statement "Where a widely established research community expectation for data archiving in public repositories exists, submission to a community-endorsed, public repository is mandatory." unfortunately leaves it to the authors to judge whether they are part of a research community where these expectations are "widely established" and thus whether this mandate applies to them. Vague and/or contradictory language has been documented before in journal data policies, see Christian et al. (2020) and Sturges et al. (2015).
Because publication of supplementary information is the option most widely offered by the journals investigated in this study, clarity is needed about the demarcation between supplementary information (published along with the article on publishers' websites) and data that is to be published elsewhere (preferably in a FAIR-aligned repository). Authors need information about what is still deemed supplementary information and what data should be directed to repositories (and why). This also underpins the necessity of defining what is understood by research data. The advantages of repository deposition over supplements on publishers' websites, as well as the current cultural change towards repository publication, are illustrated by Kwon (2019).
Information on data sharing should be up-to-date and reflect modern best practice and current guidelines, especially the FAIR principles and their evolving specifications (e.g. see Enabling FAIR Data Community et al. 2018; Enabling FAIR Data Community, 2018; Hrynaszkiewicz et al. 2020; Davidson et al. 2019). As with clarity, a wide spectrum of up-to-dateness was encountered in the data policies/author instructions in this survey. For example, recommendations of standard MS Office or PDF formats for data tables could be updated in some texts to meet today's requirements for non-proprietary formats and machine readability of data files. Hrynaszkiewicz et al. (2020) defined six policy types in their framework that differ in important ways in their level of implementation, allowing for a more nuanced, step-wise implementation of policies by different journals. Geoscience and biodiversity community-endorsed data sharing standards (Enabling FAIR Data Community et al. 2018; Enabling FAIR Data Community, 2018; Penev et al. 2017) correspond best with tier four or tier five policy types. Compared with the less strict tier-4 policy, a tier-5 policy includes checking and enforcement of important features: i) exceptions to the policy, ii) data citation and iii) mandatory data sharing.

Policy typology
Enforcing mandatory data sharing in journal data policies is in general not yet standard. The reasons may be that it is time-consuming to implement and that journal owners fear that authors may choose another journal (with a less strict data policy) to avoid the effort of data sharing or for other reasons. However, mandatory data sharing policies that are enforced during the peer-review and publishing process and supported with suitable data repositories are the most effective policies (Vines et al. 2013). They potentially have the most benefits in terms of increasing citations and visibility of papers (Colavizza et al. 2020). None of the investigated Geosciences and biodiversity journals mandates data sharing.
A tier-5 policy also requires that the data be accessible to readers at the publication date, and that they have been accessible to editors and peer reviewers before publication. The importance of not allowing access embargoes is emphasized here because only then are reviewers able to do their reviews properly. It is noteworthy that none of the investigated journals addresses embargoes or peer review of data, regardless of whether it is published by a major publisher or self-published.

Policy compliance and registration
The FAIR Data Policy Landscape Analysis (Davidson et al. 2019) identified elements that support or hinder FAIR data practice in various national, funder, publisher, and institutional data policies. The analysis also includes, to a small extent, journal policies. In addition to the features identified by Hrynaszkiewicz et al. (2020) and the resources of the Enabling FAIR Data project, the analysis examined whether data policies contain statements on the monitoring of policy compliance. It also emphasizes that policies should be unambiguously interpretable not only by humans but also by machines. To be machine-readable and actionable, a structured data markup schema may be used. Furthermore, policies should be versioned, indexed and semantically annotated in a policy registry, such as the FAIRsharing registry 1, a curated service of interlinked standards, repositories and data policies. The FAIRsharing registry supports machine-actionability by asking contributors to make clear the status of the resource being described in terms of effective machine actionability.
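To illustrate what a versioned, machine-readable policy record might look like, the following sketch serializes a policy description to JSON and checks it for completeness. The record layout, field names, journal name and dates are hypothetical placeholders chosen for this example; they are not the FAIRsharing schema or any published markup standard, which the text above does not specify.

```python
import json

# Hypothetical record layout: all field names and values are illustrative
# assumptions, not the FAIRsharing schema or a published markup standard.
policy = {
    "journal": "Example Journal",      # placeholder name
    "policy_version": "1.0",
    "last_updated": "2020-05-01",      # placeholder date
    "features": {
        "supplementary_materials": {"addressed": True, "enforced": False},
        "data_repositories": {"addressed": True, "enforced": False},
        "data_availability_statements": {"addressed": True, "enforced": True},
        "embargoes": {"addressed": False, "enforced": False},
    },
}

# Keys this sketch treats as mandatory for a versioned policy record.
REQUIRED_KEYS = ("journal", "policy_version", "last_updated", "features")


def missing_keys(record):
    """Return the required top-level keys that are absent from `record`."""
    return [key for key in REQUIRED_KEYS if key not in record]


def enforced_features(record):
    """Return the sorted names of features marked as enforced."""
    return sorted(
        name for name, status in record["features"].items() if status["enforced"]
    )


# Round-trip through JSON to show the record is machine-readable as plain text.
serialized = json.dumps(policy, indent=2, sort_keys=True)
restored = json.loads(serialized)

print(serialized)
print("Missing keys:", missing_keys(restored))
print("Enforced features:", enforced_features(restored))
```

Separating "addressed" from "enforced" mirrors the distinction the policy framework draws between features that a policy merely mentions and features that are checked during publishing or peer review.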
The failure of the majority of the investigated journals to address or enforce important journal data policy features indicates the need to involve the owners of the journals in discussions on how authors and the research community may best be served in terms of open and FAIR data. However, this survey also found examples of high-quality information on data sharing with respect to completeness, clarity, and up-to-dateness: the data policies/author instructions of journals published by Copernicus Publications and Pensoft. Both refer to recently published community guidelines, and Copernicus Publications is even a signatory of the Coalition on Publishing Data in the Earth and Space Sciences (COPDESS) commitment statement and the Enabling FAIR Data Commitment Statement in the Earth, Space, and Environmental Sciences (Enabling FAIR Data Community 2018).

Recommendations
Funding bodies, infrastructure providers and publishers are key stakeholders in providing a consistent and easy-to-use environment for FAIR data sharing. Learned societies as well as research institutions that own and publish journals (either self-publishing or with the support of a professional publisher) are an important part of that publisher stakeholder group. Learned societies and research institutions are recommended to review their attitude towards data sharing. Owners of journals that have no data policy in place should strongly consider adopting one. All other owners should revisit existing data policies in terms of completeness, clarity and up-to-dateness, considering the features described and issues raised in, for example, Hrynaszkiewicz et al. (2020), the Author Guidelines for scientific data and the Commitment statement (Enabling FAIR Data Community et al. 2018; Enabling FAIR Data Community, 2018) as well as the FAIR Data Policy Landscape Analysis (Davidson et al. 2019). Improved information on data sharing will increase the attractiveness of the journals to authors, who are increasingly willing to, as well as mandated to, publish research data related to journal articles. Hrynaszkiewicz et al. (2020) defined six policy types in their framework that differ in important ways in their level of implementation and are very useful for drafting new or revisiting existing journal data policies. Geosciences and biodiversity disciplines have in general a positive attitude towards Openness and FAIRness, and therefore the recommendation to aim for a tier-five policy is appropriate and justified. Tier five implies that all 14 policy features are not only addressed in the data policy, but that most of them are also checked and enforced in the publishing or peer review process.
It is noted that journals, in addition to having an impactful data policy in place, should train editors to answer author questions on all issues related to the data policy and, in the case of enforced features, to be ready to react to potential author non-compliance with the policy. Also, Geoscience learned societies should consider signing the "Commitment statement in the Earth, space, and environmental sciences" 2 to show public support for this community-driven initiative and to complement their data policy efforts with other actions that foster FAIR and open data sharing. Finally, individual scientists who are not yet members of a learned society may consider joining: as many society journals are led by bodies such as Publication Committees, members may get involved and support the evolution of the publishing practices of their community.
FID GEO 3, the Specialized Information Service for Geosciences, is ready to guide and support German geoscience learned societies and research institutions in developing data policies for their journals.

Progress
This survey was conducted in May 2020 and was sent thereafter to the journal owners for consideration. Several journal owners responded, emphasizing the importance of the topic, announcing future changes to current policies/author guidelines (planned or nearly finalized) and/or asking for support for drafting new or amending existing policies/author guidelines. Regarding finalised changes, to the knowledge of the author of this study at the end of August 2020, the editorial office and the editing committee of the journal "Hydrologie und Wasserbewirtschaftung, HyWa" have formulated and published a data policy, and the editors of the journal "Archiv für Molluskenkunde" finalized an update of the Instructions for Authors.

2 https://copdess.org/enabling-fair-data-project/commitment-statement-in-the-earth-space-and-environmental-sciences/.
3 https://www.fidgeo.de/en/research-data/.

Appendix

Funding Information
This research was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - BE 4498/5-2.