The Research Data Management Organiser (RDMO) – a Strong Community Behind an Established Software for DMPs and Much More

Ivonne Anders; Harry Enke; Daniela Adele Hausen; Christin Henzen; Gerald Jagusch; Giacomo Lanza; Olaf Michaelis; Karsten Peters-von Gehlen; Torsten Rathmann; Jürgen Rohrwild; Sabine Schönau; Kerstin Vanessa Wedlich-Zachodin; Jürgen Windeck

The Background and History of RDMO

The Research Data Management Organiser (RDMO) is a web-based software that enables research-performing institutions as well as researchers themselves to plan and carry out their management of research data. RDMO can assemble all relevant planning information and data management tasks across the whole life cycle of the research data. RDMO is ready for application in smaller or larger projects.

One of the results of ‘WissGrid’, a collaborative project in the German D-Grid context, was a small collection of guidelines on how to deal with data, including a set of questions to help organise data publication and data management. The publication took up the discussion on data management and data management plans (DMPs) driven by the California Digital Library (CDL) and the Digital Curation Centre (DCC) and reflected on it in the context of Germany and its landscape of research institutions and organisations.

One takeaway from this work was that writing DMPs merely to meet the requirements of a funding agency does not suffice to guide research projects through the subsequent production and analysis of their research data. Another was that DMPs and related information are better kept within the workgroup, project or institution than on a central website.

With this motivation, the DFG-funded RDMO project was set up in 2015, aiming to develop a modern web application that is easy to install and use, with a questionnaire based on the aforementioned WissGrid guidelines, a storage engine and configurable output options. The web app was made available to interested adopters early and continuously. By giving extensive support, the RDMO project not only improved its web application but also encouraged the formation of new local groups that organise their data management work across institutional and community boundaries.

In 2018, the RDMO project continued to interact intensively with the growing number of institutions and groups that used the RDMO web app in many different ways: not only to produce DMPs but also to organise consulting and coaching in data management, to enforce standardisation of data management within an institution, to feed available information (e.g., from lab instruments) into a project’s RDMO instance or to convert the collected information into the formats required by funding agencies or research institutions. A DMP with these additional functionalities can also be used to initiate processes and tasks across the whole data lifecycle and is called a ‘machine-actionable DMP’ (maDMP). In RDMO, we implemented the recommendations of the RDA WG DMP Common Standards.
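To make the idea of a machine-actionable DMP more concrete, the following minimal sketch serialises a few collected answers as a structured JSON record. The key names (`dmp`, `title`, `contact`, `mbox`, `dataset`, `personal_data`) are modelled on the RDA DMP Common Standard as an assumption of this sketch and should be checked against the standard itself; the record is deliberately incomplete, the function `to_madmp` and all values are hypothetical, and nothing here reflects how RDMO itself implements its export.

```python
import json


def to_madmp(title, contact_name, contact_mbox, datasets):
    """Build a partial maDMP-style record from already collected answers.

    Only a small, assumed subset of fields is produced; a real export
    would need every field the standard marks as required.
    """
    return {
        "dmp": {
            "title": title,
            "contact": {"name": contact_name, "mbox": contact_mbox},
            "dataset": [
                {"title": d["title"], "personal_data": d["personal_data"]}
                for d in datasets
            ],
        }
    }


# Hypothetical project, person and dataset, for illustration only.
record = to_madmp(
    title="Example project DMP",
    contact_name="Jane Doe",
    contact_mbox="jane.doe@example.org",
    datasets=[{"title": "Survey data", "personal_data": "yes"}],
)
print(json.dumps(record, indent=2))
```

Because the record is plain structured data rather than free text, another system could read the `personal_data` entry and, for instance, trigger a data protection workflow.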

RDMO from a Software Development Perspective

RDMO is an open source tool whose code can be freely extended and modified. It is implemented as a web application. It consists of a backend running on a server, written mainly in Python using the Django framework (https://djangoproject.com/), and a frontend based on common web technologies that provides the user interface in a browser, so that RDMO can serve as a collaborative platform. Python and the Django framework were chosen because Python is a high-level programming language that is relatively easy to learn. Its emphasis on code readability and usability has made it a well-established programming language in the science community. As a result, a certain degree of Python knowledge can be expected in the environments where RDMO is installed, maintained and used and where its development is driven forward. From the start, RDMO’s code has been freely available under an Apache 2.0 licence on GitHub, which also serves as a focal point for community feedback (bug reports, feature requests) and for defining and tracking RDMO’s future development (https://github.com/rdmorganiser/rdmo).

The software’s first release dates back to 2016. Since then, over 60 further versions have been released. Regular releases provide continuity and have allowed RDMO to mature over time. The exact number of software downloads is unknown, but the number of production and test instances has been increasing steadily over the last few years and has now reached 56 (source: https://rdmorganiser.github.io/Community/, status: 11/09/2023).

RDMO was designed to keep technical hurdles for administrators as low as possible. It can be installed fairly quickly and does not need much storage space or processing power because it primarily deals with textual data saved in rather small databases. RDMO only requires Python to run, a web server like Apache or Nginx to serve static files and a database like PostgreSQL or MySQL. Docker images are also provided to simplify running RDMO for those who are familiar with this technology.
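As a rough illustration of how small this footprint is, the database part of a Django-based setup comes down to a few lines of configuration. The sketch below uses Django’s documented `DATABASES` setting with the PostgreSQL backend; the database name, user, password and host are placeholders, and the file in which such a fragment belongs in a given RDMO installation should be taken from the RDMO documentation rather than from this sketch.

```python
# Sketch of a Django database setting for a PostgreSQL-backed instance.
# All values below are placeholders (assumptions), not RDMO defaults.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",  # Django's PostgreSQL backend
        "NAME": "rdmo",           # hypothetical database name
        "USER": "rdmo",           # hypothetical database user
        "PASSWORD": "change-me",  # placeholder; never commit real secrets
        "HOST": "localhost",
        "PORT": "5432",           # PostgreSQL's default port
    }
}
```

Switching to MySQL would mainly mean replacing the `ENGINE` value with `django.db.backends.mysql` and adjusting the port.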

Information is stored locally within an RDMO instance and is structured according to RDMO’s data model, presented in Figure 1. A person compiling a DMP for a project is requested to address a series of questions. The answers are stored as values of internal variables called attributes and can then be further used to generate documents (views) or to activate actions (tasks).
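The relationship between questions, attributes, values, views and tasks can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not RDMO’s actual Django models: the class names, the attribute path `dataset/personal_data` and the task rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribute:
    """An internal variable, identified by a path (hypothetical naming)."""
    path: str


@dataclass(frozen=True)
class Question:
    """A question shown to the user; its answer is stored under `attribute`."""
    text: str
    attribute: Attribute


@dataclass(frozen=True)
class Task:
    """An action that becomes active when an attribute has a given value."""
    title: str
    attribute: Attribute
    trigger_value: str


@dataclass
class Project:
    """Holds the answers of one DMP as attribute-path -> value pairs."""
    title: str
    values: dict = field(default_factory=dict)

    def answer(self, question: Question, value: str) -> None:
        self.values[question.attribute.path] = value


def render_view(project: Project, questions: list) -> str:
    """A very small 'view': one line per question with its stored value."""
    lines = [
        f"{q.text} {project.values.get(q.attribute.path, 'n/a')}"
        for q in questions
    ]
    return "\n".join(lines)


def active_tasks(project: Project, tasks: list) -> list:
    """Return the titles of tasks whose trigger value matches the answer."""
    return [
        t.title
        for t in tasks
        if project.values.get(t.attribute.path) == t.trigger_value
    ]


personal_data = Attribute("dataset/personal_data")
question = Question("Does the dataset contain personal data?", personal_data)
task = Task("Consult the data protection officer", personal_data, "yes")

project = Project("Example project")
project.answer(question, "yes")

print(render_view(project, [question]))
print(active_tasks(project, [task]))
```

The point of the design is that answers are stored against attributes rather than against question texts, so documents and tasks can be derived from the same stored values.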

Figure 1

The RDMO data model. RDMO employs a complex data model organised along different Django apps and models (representing database tables), which is well documented (https://rdmo.readthedocs.io/en/latest/management/data-model.html).

The exchange of information among instances is made possible by a common attribute list (the RDMO domain), which ensures compatibility between question catalogs while still allowing use-case-tailored question catalogs, option sets and views. All this content can be exchanged via the RDMO content repository on GitHub.
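Why a shared attribute list makes catalogs interchangeable can be shown with a small sketch. The attribute identifiers, catalog contents and function names below are invented for illustration and do not correspond to the actual RDMO domain or catalogs.

```python
# Shared domain: hypothetical attribute identifiers.
DOMAIN = {"dataset/format", "dataset/size", "dataset/licence"}

# Two catalogs phrase their questions differently but point to the
# same attributes of the shared domain.
GENERIC_CATALOG = {
    "Which file formats will be used?": "dataset/format",
    "How large will the data be?": "dataset/size",
}
FUNDER_CATALOG = {
    "Describe the data formats.": "dataset/format",
    "Under which licence will the data be published?": "dataset/licence",
}


def unknown_attributes(catalog, domain):
    """Return attributes used by a catalog that are missing from the domain."""
    return sorted(set(catalog.values()) - domain)


def carry_over(values, target_catalog):
    """Map stored attribute values onto the questions of another catalog."""
    return {
        question: values.get(attribute)
        for question, attribute in target_catalog.items()
    }


# Answers given while working with the generic catalog.
values = {"dataset/format": "CSV", "dataset/size": "10 GB"}

print(unknown_attributes(FUNDER_CATALOG, DOMAIN))
print(carry_over(values, FUNDER_CATALOG))
```

After switching to the funder catalog, the format question is already answered because both catalogs refer to `dataset/format`; only the licence question remains open.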

The RDMO domain currently includes 291 hierarchically ordered attributes, which cover all RDM aspects identified so far and thus play the role of a ‘controlled vocabulary’ for DMPs. The fundamental RDMO catalog contains 125 questions covering all aspects of research data management. In addition, several other catalogs (https://www.forschungsdaten.org/index.php/RDMO) have been tailored to specific disciplines (engineering, chemistry, etc.), institutions (UARuhr, HeFDI) or funding programmes (SNF, Volkswagen Foundation). These catalogs reuse as many questions and attributes from the main catalog and domain as possible, so that questions in different catalogs refer to the very same attributes. This ensures interoperability between existing projects and allows users to switch catalogs when necessary. For example, attribute supplements were successfully supported for the DFG questionnaires from FoDaKo, a cooperation of the Universities of Wuppertal, Düsseldorf, and Siegen concerning research data management (https://fodako.nrw/datenmanagementplan, see Figure 2), and for the questionnaire of the University of Erlangen-Nuremberg for the Volkswagen Foundation, Germany’s largest private research sponsor. An implementation of the Horizon Europe Data Management Plan Template (for the European funding framework programme of the same name) has also been added recently, comprising a questionnaire, new attributes and options, and a view (see Figure 3). Soon, the sub-working group will deal with other funding programmes from Germany and abroad, such as that of the Austrian funding organisation Fonds zur Förderung der wissenschaftlichen Forschung (FWF) (https://www.fwf.ac.at/).

Figure 2

Overview of the FoDaKo questionnaires for projects funded by DFG. All questionnaires fulfil the DFG checklist and have different subject-specific coverage, from the ‘minimum’/‘intersection’ catalog with 85 questions to the ‘maximum’/‘union’ catalog (an extension of the core RDMO catalog) with 139 questions. The subject-specific questionnaires include further recommendations from the DFG Review Board on that subject. ‘All questions’ is an extension of the catalog RDMO. Below the title, the number of questions is given.

Figure 3

Preview of the completed Horizon Europe Data Management Plan in the RDMO interface. Compared to the funders’ DMP templates, the questions in the RDMO catalogs are more precise and ‘fine-grained’. Filling out a DMP is further eased by help texts and controlled answer choices (options). Finally, export templates, i.e., views, convert the data management plan into the deliverable form required by the funder and insert references between thematically overlapping questions.

The RDMO Consortium

The RDMO consortium was founded in 2020 by signing a Memorandum of Understanding (MoU) (https://rdmorganiser.github.io/docs/Memorandum-of-Understanding-RDMO.pdf) between several supporting German institutions and individuals. The organisational structure with its various groups was approved by an RDMO user meeting. This structure supports future development and is detailed in the MoU. There are three permanent groups besides the general meeting of all members of the consortium, i.e., the signatories of the MoU. Members and other interested parties can participate in the general meeting, which convenes at least once a year and additionally as required. All institutions that are interested in the preservation and further development of RDMO are invited to sign the MoU.

Some of the members are active in various RDM working groups, such as RDA and DINI/nestor, and thus ensure a user-oriented focus of the RDMO content through their external cooperation.

The RDMO Steering Group (StG)

The RDMO consortium is led by a steering group (StG). The members of the StG are elected by the consortium members at the general meeting every three years or as needed. The StG steers the direction of further development and coordinates the processes for developing the software and its content. It is composed of at least five persons.

The RDMO Development Group (DG)

The technical coordination and further development of RDMO are organised by a development group. In addition to a core of long-term committed developers who continuously drive the development forward, low-threshold participation by a larger number of developers is required and already in place. These developers can, for example, contribute on a project-specific basis.

The RDMO Content Group (CG)

The work of the CG members focuses on maintaining existing and newly generated content, such as attributes or questions for catalog templates. They provide moderation and support for individual processes, as well as domain adjustments. The CG collects user feedback from RDM coordinators and researchers from research institutions in Germany and checks the general usability of RDMO against the background of user feedback.

The work of the CG is currently organised into four sub-working groups. The CG can also spawn ad-hoc sub-working groups for special purposes.

Sub-Working Group Guidance Concepts and Texts

The ‘Practical Guide to the International Alignment of Data Management’ published by Science Europe provides specific guidance for different stakeholders, such as researchers and reviewers of DMPs, on how to manage research data, describe data management and review a DMP. The guide therefore comprises an overview of core aspects that should be included in a DMP. However, such guidance documents often lack discipline-specific recommendations. The sub-working group therefore first collected discipline-specific best practices in data management. Based on this collection, the DMP sections most in need of recommendations were identified. To structure the corresponding DMP guidance, the group adopted the design pattern concept, which is used in software engineering for the systematic description of problem-solution pairs. The pattern concept provides a template for recording information such as problems, solutions, concrete examples and related patterns. A specific DMP guidance template was developed by extending the initial pattern template. Using such a pattern structure for DMP guidance ensures that recommendations can be easily compared and linked. Moreover, by presenting concrete solutions, the pattern structure can help raise awareness of the potential consequences of not implementing proper data management. As a proof of concept and a first collection of guidance patterns, the group selected examples from its members’ own RDM support experience with research projects of different disciplinary foci and iteratively improved the template. The DMP guidance pattern structure can be applied to other DMP guidance texts and extended accordingly.
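The pattern template described above can be pictured as a small data structure. The following sketch is only an approximation: the fields `problem`, `solution`, `examples` and `related` follow the elements named in the text, whereas the field `dmp_section`, the helper `linked_patterns` and the two sample patterns are hypothetical and do not reproduce the group’s actual template.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GuidancePattern:
    """One problem-solution pair in a DMP guidance collection (sketch)."""
    name: str
    problem: str
    solution: str
    examples: List[str] = field(default_factory=list)
    related: List[str] = field(default_factory=list)  # names of other patterns
    dmp_section: str = ""  # hypothetical DMP-specific extension field


def linked_patterns(pattern: GuidancePattern,
                    collection: Dict[str, GuidancePattern]) -> List[GuidancePattern]:
    """Resolve the names in `related` to patterns present in the collection."""
    return [collection[name] for name in pattern.related if name in collection]


# Invented sample patterns, for illustration only.
naming = GuidancePattern(
    name="File naming",
    problem="Files cannot be identified without opening them.",
    solution="Agree on a naming convention at project start.",
    examples=["<date>_<instrument>_<run>.csv"],
    related=["Folder structure", "Not yet written"],
    dmp_section="Data organisation",
)
folders = GuidancePattern(
    name="Folder structure",
    problem="Project members store data in inconsistent locations.",
    solution="Define a shared folder layout.",
)
collection = {p.name: p for p in (naming, folders)}

print([p.name for p in linked_patterns(naming, collection)])
```

Keeping every pattern in the same fixed structure is what makes entries comparable and linkable, as the text notes; related patterns that have not been written yet are simply skipped when links are resolved.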

In the future, the sub-working group will further elaborate on how to align its DMP pattern concept with RDM community activities, such as the Stamp project (Standardised data management plan for education sciences) or the activities of the RDA working group ‘Discipline-specific Guidance for Data Management Plans’ (https://rd-alliance.org/groups/discipline-specific-guidance-data-management-plans-wg). Moreover, it is going to implement the envisioned community-driven process for collecting guidance patterns, e.g., by guiding RDM support teams and researchers in collecting further patterns and by providing guidance on how to use the pattern collection. On a practical level, it aims to provide a basic set of patterns for the RDMO community to be used in upcoming and existing DMP templates. However, the group envisions the patterns being applicable across disciplines and tools, not only in RDMO.

Sub-Working Group Editorial Processes

The sub-working group called Editorial Processes is responsible for the development, curation and harmonisation of the content that is necessary for the local usage of an RDMO instance: attributes, catalogs, conditions, option sets and views.

External authors have the option to make their questionnaires available to the general public in the ‘shared’ area of the RDMO repository for content (https://github.com/rdmorganiser/rdmo-catalog). Editorial Processes also accompanies the content development by external authors, takes care of its harmonisation and adds newly created attributes and questions to the common content whenever they are of general relevance. Besides that, this sub-working group has coordinated the localisation of the RDMO software and of the RDMO content into French, Spanish and Italian, yielding a total of five languages.

Sub-Working Group Website

The transition of RDMO towards a community-based project required the website (https://rdmorganiser.github.io/) to reflect the change from a project to a community as well. This sub-working group is engaged in improving the online representation of RDMO, tailoring the information to the different audiences, including end users (researchers), RDM managers/coordinators and system administrators, and presenting various aids. The focus is on providing informational material that is relevant to the needs of each audience.

The website intends to be the first point of contact for RDMO users or interested parties and to bring together all the available information about RDMO.

Sub-Working Group DFG Checklists

This sub-working group is working on the implementation of the Deutsche Forschungsgemeinschaft (DFG) guidelines for research data management in RDMO. These guidelines must be considered when drafting project proposals and are available as a checklist (http://www.dfg.de/research_data/checklist). Since spring 2022, many German universities have developed guidelines, annotated versions of the DFG checklist or specific RDMO questionnaires to support their local researchers. The sub-working group was established in October 2022 to harmonise and map these local solutions, creating one community questionnaire and export template.

Conclusion and Outlook

The overall goals of the work of the RDMO consortium are to simplify RDM and DMP planning further for users, improve their experience and build a sustainable open source community. With the user perspective in mind, the focus is particularly on motivating researchers to use RDMO for their own purposes. One way in which the consortium intends to achieve this is by expanding RDMO catalogs so that DMPs serve additional purposes, e.g., project management functions and exchange between the researchers in a project. Researchers can be motivated not only by familiarising them with RDMO but also by involving them in developing questionnaires that are tailored to their discipline and/or to the needs of their community.

The development of several RDM initiatives, including the German National Research Data Infrastructure (NFDI, https://www.nfdi.de/consortia/), gives great momentum to the discussion around DMPs and facilitates the harmonisation and establishment of common infrastructures. In the coming years, the importance of research data and the corresponding data management is expected to continue to increase enormously. This will also give rise to further environments and tools that facilitate RDM. Thanks to its strong community, RDMO is well placed to make a significant contribution to innovative and demand-oriented research data management.

Data Science Journal

Practice Papers

Abstract