Kind:
Sample
Type:
ENTRY
Registrator:
fabian.plass
Registration Date:
2023-07-21 10:20:36.064569
Modifier:
fabian.plass
Modification Date:
2023-07-26 12:07:54.854062
Name:
Current state of research data management
Document:
2.1. FAIR as part of research data management
One of the cornerstones of research data management (RDM) is the set of FAIR principles (Findable, Accessible, Interoperable, Reusable). Growing awareness of FAIRness is, however, more than an obligation that public funding agencies impose on research; it is key to knowledge discovery, innovation, and transfer, as well as to the subsequent integration and reuse of data by the scientific community (Wilkinson et al, 2016). Events such as the global COVID-19 pandemic demonstrate the need for, and the overall benefits of, making data available online (Besançon et al, 2021; Tse et al, 2020). This leads not only to more efficient research and increased innovation, but also to a fair and transparent use of public funds, as well as to increased visibility, scientific reputation, and reliability, to name just a few benefits of Open Science (Janssen et al, 2012).
The FAIR Data Principles propose that all scholarly output should be Findable, Accessible, Interoperable, and Reusable. While the principles provide guidance on the expected behavior of data resources, their practical implementation is open to interpretation, and as support for the principles has grown, so has the diversity of interpretations surrounding their application (Mons et al, 2017). Importantly, FAIR does not require complete openness: the conditions for accessing and reusing data must be transparent and clearly stated, but access can be restricted for reasons of privacy, security, or competition. FAIR thus promotes a balanced approach that allows diverse participation and partnerships while ensuring that data remain available under specified conditions (Mons et al, 2017).
2.2. Data repositories and data publications
Data repositories are among the tools researchers use to publish not only articles but also scientific data and general information. Well-known examples are the commercial repository service figshare (http://figshare.com), open-access archives such as arXiv.org, and platforms such as Dataverse (Crosas, 2011), EUDAT (Lecarpentier et al, 2013), and Zenodo (http://zenodo.org/), which is maintained by CERN and funded by the EU Commission. Most of the well-known repositories already take the high-level FAIR principles into account. In the case of Zenodo, for example, uploaded data receive a digital object identifier (DOI) and can optionally be published as openly accessible and viewable. This simplifies findability, accessibility, and usability for the scientific community and makes individual datasets citable, without their content having to lead directly to a complete publication. Even data, methods, or code that initially received little attention can thus be found by the general public and are not lost. Moreover, the principles of good research practice still apply.
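As an illustration of how such a deposit can be automated, the following sketch uses Zenodo's public REST API to create a deposition, upload a file, and attach minimal descriptive metadata before publication. The access token, file name, and metadata values are placeholders and would have to be replaced for a real deposit.

import requests

ZENODO_API = "https://zenodo.org/api"
params = {"access_token": "YOUR_ACCESS_TOKEN"}  # placeholder: personal token from the Zenodo account settings

# 1. Create an empty deposition (draft record).
r = requests.post(f"{ZENODO_API}/deposit/depositions", params=params, json={})
r.raise_for_status()
deposition = r.json()

# 2. Upload a data file into the deposition's file bucket.
bucket_url = deposition["links"]["bucket"]
with open("results.csv", "rb") as fp:  # placeholder file name
    requests.put(f"{bucket_url}/results.csv", params=params, data=fp).raise_for_status()

# 3. Attach minimal descriptive metadata (title, creators, license, ...).
metadata = {"metadata": {
    "title": "Example dataset",
    "upload_type": "dataset",
    "description": "Raw measurement data for an example experiment.",
    "creators": [{"name": "Doe, Jane"}],  # placeholder creator
}}
requests.put(f"{ZENODO_API}/deposit/depositions/{deposition['id']}",
             params=params, json=metadata).raise_for_status()

# 4. Publishing mints the DOI and makes the record citable (left commented out here).
# requests.post(f"{ZENODO_API}/deposit/depositions/{deposition['id']}/actions/publish",
#               params=params).raise_for_status()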
However, justified doubts exist regarding Open Science policies. In particular, there are questions and concerns about protecting data against external interference and about compliance and regulatory requirements, especially in healthcare, for example the protection of patient data. These issues, which revolve around the subject of data sovereignty, need to be clarified during planning and before introducing digital technologies such as data management or cloud-based systems (Clayton et al, 2019; Hummel et al, 2021).
Good research data management does not start with the publication and archiving of the work or its (meta-)data, but with their initial collection. This is because, in the sense of RDM, not only the content-related data themselves are important, but also their metadata. Metadata describe a dataset, i.e., they provide additional information about the data they describe, for example the author, the creation date and time, the type of the dataset, or its DOI. In fact, many distinct types of metadata exist, including descriptive, structural, and administrative or process-related information.
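As a simple illustration, such a metadata record can be written as a set of key-value pairs; the field names below are examples loosely modelled on common descriptive schemas (e.g., DataCite, Dublin Core) and are not tied to a specific standard.

# Illustrative metadata record for a single dataset (field names are examples).
dataset_metadata = {
    # descriptive metadata: what the dataset is about
    "title": "UV/Vis absorption spectra of sample series A",
    "creator": "Doe, Jane",
    "subject": ["spectroscopy", "catalysis"],
    # administrative / process metadata: how and when it was produced
    "created": "2023-07-21T10:20:36",
    "instrument": "UV/Vis spectrometer",   # hypothetical device name
    "file_format": "text/csv",
    # structural metadata: which files belong to the dataset
    "files": ["raw/series_A_run1.csv", "raw/series_A_run2.csv"],
    # persistent identifier assigned on publication
    "identifier": "10.5281/zenodo.0000000",  # placeholder DOI
}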
2.3. Electronic Laboratory Notebooks and Laboratory Information Management Systems
However, the scientific question, the choice of experimental procedure, the materials and methods, the data analysis, and the interpretation of the results are traditionally recorded in detail in paper notebooks rather than electronically (Barillari et al, 2016). This is hard to justify, since most data are generated electronically or stored as code on a network anyway, and the practice clearly contradicts the overarching principles of FAIR and Open Science in general: the data are neither easy to find nor accessible or usable by scientists outside the local system in which they are stored. Consequently, paper-based notebooks should be avoided in favor of electronic systems such as a laboratory information management system (LIMS) and/or an electronic laboratory notebook (ELN).
ELNs and LIMSs are software solutions designed to facilitate the documentation and management of laboratory processes and data. An ELN is used for capturing and organizing experimental data, while a LIMS supports the management of laboratory resources, sample tracking, quality assurance, and other laboratory functions. Together, ELN and LIMS provide a comprehensive platform for efficient and secure management of laboratory information, promoting compliance with best practices and regulatory requirements (Barillari et al, 2016; Bespalov et al, 2020; Machina and Wild, 2013). ELN systems can therefore play a major role in successful RDM: a continuous workflow under FAIR conditions can be guaranteed from the very beginning, starting with the collection of data, continuing with their use by the researcher, the research group, and other researchers, followed by publication, and finally archiving, for example in data repositories such as Zenodo or RADAR (Kraft et al, 2016). Further advantages of an ELN system include (i) easy, metadata-based collection and sharing of information, (ii) long-lasting data storage on a secure server, (iii) simplified access via a global or local network, (iv) archiving options in open data repositories, and (v) the possibility to implement custom applications on top of the system (Barillari et al, 2016). Currently available tools range from project management platforms that combine collaborative project and data management, such as OSF (https://osf.io/), to classical ELN systems, from commercial applications such as CERF (https://cerf-notebook.com/), Benchling (https://www.benchling.com/), and labfolder (https://www.labfolder.com/) to open-source solutions such as Chemotion (https://www.chemotion.net/chemotionsaurus/), eLabFTW (https://www.elabftw.net/), and openBIS (https://openbis.ch/).
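To illustrate how such a system can be used programmatically rather than only through its web interface, the following sketch registers a sample with a metadata property in openBIS via its Python client pyBIS. The server URL, credentials, space, experiment path, sample type, and property code are assumptions chosen for illustration and must correspond to objects that actually exist in the target instance.

from pybis import Openbis  # Python client for openBIS (pip install pybis)

# Connect to a hypothetical openBIS instance and authenticate.
o = Openbis("https://openbis.example.org")
o.login("username", "password", save_token=True)  # placeholder credentials

# Register a new sample (experimental step) with a metadata property.
# Space, experiment path, sample type, and property code are assumed to
# exist in the target instance and serve only as an illustration.
sample = o.new_sample(
    type="EXPERIMENTAL_STEP",
    space="RESEARCH_GROUP_A",
    experiment="/RESEARCH_GROUP_A/PROJECT_X/EXP_1",
    props={"$name": "UV/Vis measurement, sample series A"},
)
sample.save()  # persists the entry; openBIS records registrator and timestamps

o.logout()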