Data from a variety of research programmes are increasingly used by policy makers, researchers, and private sectors to make data-driven decisions related to climate change and variability. Climate services are emerging as the link to narrow the gap between climate science and downstream users. The Global Framework for Climate Services (GFCS) of the World Meteorological Organization (WMO) offers an umbrella for the development of climate services and has identified quality assessment, along with its use in user guidance, as a key aspect of the service provision. This offers an extra stimulus for discussing what type of quality information to focus on and how to present it to downstream users. Quality has become an important keyword for those working on data in both the private and public sectors, and significant resources are now devoted to the quality management of processes and products. Quality management guarantees the reliability and usability of the product served; it is a key element in building trust between consumers and suppliers. Untrustworthy data could lead to a negative economic impact at best and a safety hazard at worst. In a progressive commitment to establish this relation of trust, as well as to provide sufficient guidance for users, the Copernicus Climate Change Service (C3S) has made significant investments in the development of an Evaluation and Quality Control (EQC) function. This function offers a homogeneous, user-driven service for assessing the quality of the C3S Climate Data Store (CDS). Here we focus on the EQC component targeting the assessment of the CDS datasets, which include satellite and in-situ observations, reanalyses, climate projections, and seasonal forecasts. The EQC function is characterised by a two-tier review system designed to guarantee the quality of the dataset information. While the need for assessing the quality of climate data is well recognised, the methodologies, the metrics, the evaluation framework, and the ways to present all this information to users had never before been developed in an operational service encompassing all the main climate dataset categories. Building the underlying technical solutions poses unprecedented challenges and makes the C3S EQC approach unique. This paper describes the development and implementation of the operational EQC function, which provides an overarching quality management service for all the CDS data.
Climate change and variability pose an unprecedented challenge to society as a whole, requiring mitigation and adaptation responses to reduce the threats and maximise the opportunities presented to organisations of all kinds. The impacts of climate variability and change can take various forms, such as physical, social, financial, or political, and as such climate change adaptation has a very broad scope. Both businesses and public administrations are vulnerable to potentially disruptive risks and are key actors in the creation of a climate-resilient future (ISO 14090:2019; ISO 14091:2021).
Both monitoring and modelling of the Earth system can provide the information and guidance necessary for policy and decision makers to deal with climate-related challenges. This has led to the establishment of various initiatives designed to better understand the Earth system through improvements in both observational capabilities and modelling tools. As a result, an increasing amount of environmental data about past, present, and future climate is becoming available. Unfortunately, these data often come with inconsistent or missing metadata, inhomogeneous documentation, and sometimes sparse evidence concerning their uncertainty and validation. A variety of data streams are generated independently and from multiple sources, adhering to different definitions and assumptions, often not standardised across communities, and, at times, with overlapping but disconnected objectives. As a consequence, users can feel disoriented when it comes to identifying the most appropriate dataset for an intended application (Nightingale et al. 2019).
Given the ever more prominent role that climate products are assuming in decision making, it is unavoidable that the quality of these data will come under increasing scrutiny in the future. Climate services are emerging as the link to narrow the gap between upstream climate science and downstream users. Climate services form the backbone of the process that translates climate knowledge and data into bespoke products for decision making in diverse sectors of society, ranging from public administrations to private businesses (Hewitt et al. 2020; Medri et al. 2012). The Global Framework for Climate Services (GFCS) of the World Meteorological Organization (WMO) stresses the increasing need for robust climate information, based on observations and on simulations covering future periods ranging from several months up to centuries, for economic, industrial, and political planning. Moreover, climate services play a crucial role in disseminating relevant standards (GFCS WMO), fostering the adoption of common data models and formats with sufficient metadata uniformly stored. The GFCS offers an umbrella for the development of climate services and has identified quality assessment, along with its use in user guidance, as a key aspect of the service. The services, and the quality assessments in particular, need to be provided to users in a seamless manner and need to respond to user requirements. The ultimate goal is building trust between data providers and users, as well as maximising usage uptake (Callahan et al. 2017; RfII 2020). Thus, the questions of what type of quality information to provide and how to present it to users are receiving sustained attention.
A relatively young operational climate service is the Copernicus Climate Change Service (C3S, Thépaut et al. 2018), one of the six operational thematic services established by the European Commission within the Copernicus Earth Observation Programme (EC 2020). The C3S, implemented by the European Centre for Medium-Range Weather Forecasts (ECMWF), aims to be an authoritative source of climate information for a wide variety of downstream users, ranging from policy makers to industrial sectors. The backbone of C3S is a cloud-based Climate Data Store (CDS), designed to be a single point of access to a catalogue of climate datasets of different categories, including in-situ and satellite observations, seasonal forecasts, reanalysis, and climate projections. In a progressive commitment to establish relations of trust between data providers and downstream users, as well as providing sufficient guidance for users to address their specific needs, C3S has made significant investments in the development of an Evaluation and Quality Control (EQC) function. By being transparent and characterising data quality attributes in a traceable and reproducible way, C3S is setting the basis for the inclusion of reliable climate data into policies and actions.
The establishment of an EQC function has three main advantages: (i) it guides users through the documentation so that they properly understand the dataset, simplifies the comparison across datasets, and builds trust in the products available; (ii) it helps the data providers understand which information they need to deliver to be compliant with standards, increases data uptake by users, and clarifies how to provide standardised dataset quality information; and (iii) it triggers actions for service improvement, leading to the provision of the most relevant dataset information and ensuring that published datasets are mature enough to support the authoritative character of C3S. The relation of trust between data providers and downstream users generated by these advantages is a necessary condition to foster a flourishing market for climate services (Zeng et al. 2019).
While the need for a quality assessment of the data available on the CDS is well recognised, the methodologies, the metrics, the assessment framework, and the way to present all this information have never been developed before in an operational service that disseminates all the main climate dataset categories. The proof-of-concept framework described in Nightingale et al. (2019) focused on observations only. Instead, here we implement the operational delivery of homogeneous EQC information across all the CDS climate dataset categories mentioned before. This objective, along with the task of building the underlying technical solutions, poses an unprecedented design challenge and makes the C3S EQC unique. The framework for the operational EQC function of the CDS builds on past and present international research initiatives. A variety of concepts were developed in previous projects (mainly the EU FP7-funded QA4ECV and the C3S_51 Lots 2–4); most of these concepts were integrated while moving the EQC function towards its operational phase.
This paper describes the challenges, development, and implementation of the operational EQC function, which provides an overarching quality assurance service for the whole CDS. The results of the dataset assessments are publicly available in the CDS Catalogue. The manuscript focuses on the framework implemented for the CDS datasets only, leaving out other aspects of the EQC function, such as the assessment of the software made available by the CDS to explore the datasets (i.e., the Toolbox), the assessment of the CDS infrastructure, and the assessment of users' satisfaction and requirements.
Commitment to quality is fundamental to build trust among stakeholders, and the EQC function is the key element of C3S devoted to reaching this goal. The main purpose of the EQC framework is to provide C3S with a consistent, structured, and pragmatic approach to enhance dataset reliability and usability. The design of this framework adopts an iterative approach integrating continual learning and improvement.
The EQC function regularly informs C3S about, and makes recommendations on, drawbacks, shortcomings, and limitations related to the CDS datasets. These analyses are complemented by a continuous user-engagement process to identify the user expectations that need to be addressed. The EQC team provides technical and scientific quality information on the CDS datasets via a set of homogeneous Quality Assurance Reports (QARs), helping to set the minimum requirements and baseline criteria for including new datasets in the CDS Catalogue. A QAR is a filled-in template called a Quality Assurance Template (QAT). Consistency across the QATs is obtained through the adoption of a vocabulary of homogeneous concepts and common practices.
The general strategy for assessing the CDS datasets consists of five steps:
The production of the QARs calls for setting up the procedures to initiate, develop, and update the QARs (e.g., the workflow), developing the software tools to support the assessments (e.g., the data checker), and engaging with a wide range of stakeholders to choose the most suitable options for the QARs. These steps lead to the creation of QARs, which provide users with comprehensive information about the technical and scientific quality of the datasets. The different sections of the QARs are made accessible to users in the CDS web portal through a synthesis table. The synthesis table is devised as a tool to organise and homogenise the EQC information, which is made of atomic elements corresponding to the different entries of this table. These entries contain links leading the user to the respective subsection of the QAR, where the user can find the EQC information of interest.
The overall EQC framework is guided by two principles: homogeneity and scalability. The former leads to consistency of the EQC information across the CDS dataset categories; the latter leads to the integration of automatic tools to produce timely and sustainable data assessments in an operational environment. In particular, the EQC framework is driven by:
Finally, the guidelines provided in Peng et al. (2021) have been followed when developing the EQC framework, as shown in Table 1.
Table 1
Mapping of the Peng et al. (2021) guidelines to the EQC framework characteristics described here.
FAIR-DQI GUIDELINES (PENG ET AL. 2021) | EQC FRAMEWORK DESCRIBED IN THIS PAPER |
---|---|
Guideline 1: dataset | The dataset is described with a comprehensive online page providing information that includes the DOI, rich metadata, and licence. |
Guideline 2: assessment model | The assessment method is available online together with the quality information. This paper itself details further the assessment model used. The assessment model is versioned and publicly retrievable. |
Guideline 3: quality metadata | The assessments are captured into a structured schema/template (QAT). The quality information is standardised in a machine-readable (in our case using the CMS) and reusable form. |
Guideline 4: assessment report | The quality information is structured in a template and is accessible online, versioned, and human-readable. |
Guideline 5: reporting | The assessments are disseminated in an organised way via a web interface including the quality aspects assessed, the evaluation method, and how to understand and use the quality information. |
The QAT is the tool used to gather information on the most relevant aspects of the CDS datasets, informing the user more quickly than by accessing and reading several documents (e.g., user guides, peer-reviewed papers, dataset descriptions). The QAT includes all the relevant quality information, in a concise and standardised form, with references and links leading to further details.
The general strategy is to provide seamless QATs, which are as homogeneous as possible across all dataset categories. The QATs for each dataset category (i.e., satellite and in-situ observations, reanalysis, seasonal forecasts, climate projections) are available as supplementary material. The QATs are regularly reviewed to gradually converge towards harmonisation. Considerable improvement has been achieved by adopting a common terminology (see section 6) and common minimum-requirement fields. The homogenisation of the QATs of different dataset categories ideally tends towards adopting one single QAT for all datasets. However, this goal is not fully achievable due to the diverse nature of the CDS dataset categories (concepts like 'processing level' or 'quality flag' are relevant for observational datasets, but not for other categories; along the same lines, the concept of 'ensemble size of the hindcast' is mostly relevant for seasonal forecasts). This homogenisation effort was pragmatically addressed by mapping the different QATs, one for each category, onto a general table agnostic of the dataset type.
In practice, all the QAT fields were grouped under main sections and subsections with names common to all dataset categories. An excerpt of the resulting QAT is reported in Figure 1. Having section and subsection names common to all the QATs makes it possible to organise and homogenise the EQC information in a general table named the synthesis table (Figure 2).
Figure 1. QAT excerpt: the cells of the synthesis table (Figure 2) correspond to the subsections of the QAT (yellow text), while the column titles of the table correspond to the QAT sections (white text). The fields with an asterisk indicate the minimum requirements (see section 3.1). The QAT questions are in the grey area (left column). The middle column defines the data type; the rightmost column reports guidance about the type of content expected. Text in cyan appears as a tooltip when hovering the mouse over the web-form of the synthesis table.
Figure 2. Synthesis table, conceived as a tool to organise and homogenise the EQC information, as well as to guide users through the documentation. The table fields, each identifying an aspect of the dataset, are grouped into columns. Note the correspondence between the field 'dataset overview' of the column 'Introduction' and Figure 1.
The synthesis table entries contain links leading the user to the respective subsection of the QAR (i.e., filled QAT), where the user can find the EQC information of interest. Therefore, the structure of the synthesis table is agnostic of the dataset category, while the QAT fields, within each subsection (the information that is displayed when clicking on a cell in the synthesis table), depend on the dataset category.
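To make the mapping concrete, the following sketch models how category-specific QAT fields hang off a category-agnostic synthesis-table skeleton. It is a minimal illustration: the section, subsection, and field names are examples taken from the text, while the actual QATs and synthesis table are richer and are managed in the CMS.

```python
# Category-agnostic skeleton of the synthesis table: sections (columns) and
# subsections (cells) share their names across all dataset categories.
SYNTHESIS_TABLE = {
    "Introduction": ["dataset overview", "data provider"],
    "User documentation": ["user guide", "algorithm description"],
    "Access": ["Toolbox compliance", "archiving practices"],
    "Independent assessment": ["technical assessment", "scientific assessment"],
}

# Category-specific QAT fields attached under the common subsections
# (illustrative examples only, echoing the text above).
QAT_FIELDS = {
    "satellite observations": {"dataset overview": ["processing level", "quality flag"]},
    "seasonal forecasts": {"dataset overview": ["ensemble size of the hindcast"]},
}

def qat_for(category: str) -> dict:
    """Assemble a QAT: the common skeleton plus the category-specific fields."""
    specific = QAT_FIELDS.get(category, {})
    return {section: {sub: specific.get(sub, []) for sub in subs}
            for section, subs in SYNTHESIS_TABLE.items()}

# e.g. qat_for("seasonal forecasts")["Introduction"]["dataset overview"]
# -> ["ensemble size of the hindcast"]
```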
The synthesis table offers an effective approach to guide users through the documentation and to homogenise access to it. It addresses a typical user requirement: 'most of the time, the problems with the documentation are not due to the lack of it, but to the difficulty in finding it' (extracted from the C3S User Requirement Analysis Document 09/2019) or 'all documentation should be easy-to-access' (Nightingale et al. 2019). For instance, a non-expert user might not know the meaning of ATBD (Algorithm Theoretical Baseline Document); the synthesis table overcomes this complication by guiding the user through questions (i.e., QAT fields) answered with high-level information, further detailed in the referenced document (the ATBD in this case). Moreover, the synthesis table offers an extra level of assurance through independent assessments and guarantees the user that all the information made available through this table is traceable and quality-controlled, because the information given by the provider is double-checked by the EQC team and versioned in the CMS. An extra advantage of the synthesis table is the possibility to track which EQC material the user is interested in by recording the user's actions in the table. These actions can be analysed at a later stage to steer future decisions of the EQC function and of C3S in general.
The information accessible through the synthesis table may be grouped into two categories:
The table is characterised by fields grouped into columns (Figure 2). The column with the header 'introduction' gives a quick overview of the data characteristics (e.g., name, provider, time resolution), as inspired by the WIGOS guide on metadata standards (WMO/WIGOS 2017). The column 'user documentation' provides the essential documentation for the effective use and understanding of the dataset (e.g., the user guide). The column 'access' describes whether the dataset variable can be served by the CDS Toolbox and which archiving practices are followed for the dataset. Finally, the column 'independent assessment' is more articulated and is explained in detail in Appendix I.
Some of the QAT entries are considered mandatory and some optional in the EQC framework. The content of the mandatory entries is considered so fundamental that, when it is missing, the dataset is not usable or understandable and thus cannot be served. These mandatory fields define the minimum requirements (MRs) for a dataset variable to be published (or withdrawn) by the CDS. The identification of these fields probably represents the first systematic effort towards the inclusion and development of an operational check of MRs encompassing a wide range of dataset categories. Indeed, the identification of a suitable set of MRs was indicated among the 'Science Gaps' in assessing climate data quality by Nightingale et al. (2019).
The list of MRs is specifically designed to facilitate a timely publication of a dataset in the CDS, ensuring, at the same time, a sufficient (but not necessarily optimal) quality of the dataset. The MRs cover several aspects, ranging from the dataset documentation to the compliance of metadata with community standards. The fields were identified as a result of the interaction between the EQC team, data providers, C3S, and users. The analysis of the MRs leads to recommendations to the C3S governance board on whether a dataset shall be made public on (or withdrawn from) the CDS.
To guarantee the maintainability of the MRs, they are an integral part of the QATs and are updated with the same frequency. See the supplementary material for a complete list of the MRs, indicated by an asterisk next to the QAT entry. Typical examples of MRs are ‘data format’, ‘physical quantity name’, ‘user guide documentation’ or ‘validation activity description’. Beyond the mandatory text necessary to fill in the QAT entries, a number of documents are also requested to be linked in the QATs as minimum requirements. These are:
Before publication in the CDS, it is essential that the documents listed above are made available alongside the datasets they refer to.
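As a concrete illustration of how such mandatory entries could be verified programmatically, consider the minimal sketch below. The field names are only the examples quoted in the text; the operational MR list is maintained within the QATs themselves.

```python
# Minimal sketch of a minimum-requirements (MR) completeness check.
# The field names are illustrative examples from the text, not the full MR list.
MANDATORY_FIELDS = [
    "data format",
    "physical quantity name",
    "user guide documentation",
    "validation activity description",
]

def unmet_minimum_requirements(qat: dict) -> list:
    """Return the mandatory fields that are missing or empty in a filled QAT;
    an empty result means the dataset variable meets the MRs and can be served."""
    return [field for field in MANDATORY_FIELDS if not qat.get(field)]
```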
The current version of the MRs fits the existing technology infrastructure as well as the available human resources. Ideally, the list should be extended to include basic technical checks of the data and metadata, such as time and space consistency, completeness, and physical plausibility. However, this would require a technical infrastructure that was not available on the CDS at the time: in particular, it requires setting up automatic tools (a data checker available for all the dataset categories) and tackling technical challenges (downloading and queuing time, disk space, enforcement of common metadata standards). Solving these technical limitations will help to extend the MR list homogeneously across all the dataset categories.
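The basic technical checks mentioned above could take a form similar to the following sketch, assuming NetCDF files readable with xarray. The variable name and plausibility bounds are illustrative assumptions, not the actual C3S data checker.

```python
import numpy as np
import xarray as xr

# Hypothetical plausibility bounds (Kelvin) for an example variable.
VALID_RANGES = {"t2m": (180.0, 340.0)}

def check_variable(path: str, var: str) -> dict:
    """Run basic technical checks on one variable of a NetCDF file."""
    ds = xr.open_dataset(path)
    da = ds[var]
    report = {}
    # Time consistency: a strictly increasing time axis (hence no duplicates).
    steps = np.diff(ds["time"].values)
    report["time_monotonic"] = bool(np.all(steps > np.timedelta64(0, "s")))
    # Completeness: fraction of missing values in the data content.
    report["missing_fraction"] = float(da.isnull().mean())
    # Physical plausibility: values within the declared valid range.
    lo, hi = VALID_RANGES.get(var, (-np.inf, np.inf))
    report["within_valid_range"] = bool(da.min() >= lo) and bool(da.max() <= hi)
    return report
```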
Building the framework of the EQC requires designing protocols, software tools, QATs, and workflows for the QAR production, as well as following common vocabularies and practices (e.g., the TRUST Principles for Data Repositories; Lin et al. 2020). Among these, we focus now on the technical solutions underlying the EQC framework. Substantial technical developments have been undertaken during the onset of the operational phase of the EQC, and more will be needed as it matures over time:
At the heart of the EQC assessments is the Content Management System (CMS), an application used to manage content stored in a database and displayed in a presentation layer based on a set of templates, i.e. the QATs. Its objective is to ease the collaborative definition of the QAT structure and to facilitate and manage the creation of the QAT content. Creation of the QAT content is partially automated, as detailed in section 5.2.
The CMS facilitates the QAR production following a workflow that involves several roles, described in Table 2, that access the CMS sequentially.
Table 2
Roles involved in the QAR production workflow.
ROLE | RESPONSIBILITY |
---|---|
EQC main contact | One EQC team member who acts as the main contact for a specific dataset category. As QAR production is a multi-actor process, it is important that there is a central person, the EQC main contact, to coordinate the QAR production. This member contacts the data providers to agree on when they are available to fill in the QAT. Once the link with the providers is established, the EQC main contact defines the QAR name, fills in the QAT entries that identify the QAR uniquely, and selects the team involved in the QAR production. Finally, the EQC main contact lets the actors involved know when there is a potential issue before it impacts the production. |
Data provider | Typically, a member of the team that provided the CDS with the dataset under evaluation. The providers fill in the information requested in the QAT because they are considered the best source to fully describe their datasets and so are the preferred choice for this task. |
Evaluator | An EQC member who vets the QAR content and fills in the independent assessment fields. This role interacts with the provider for guidance about the amount and type of content expected and for any clarification needed. |
Reviewer | An EQC member who scrutinises the whole QAR content for completeness and understandability. The reviewer is fundamental: she/he checks and verifies the correctness and consistency of all the information introduced, interacting with the evaluator to address any issue encountered. |
Approver | Role covered by one C3S governance board member, who makes decisions about the publication of the dataset, (also) based on the QAR, and conducts a final check of the QAR before making it public together with the dataset. If the QAR requires further review, it is sent back to the EQC team with comments about what is still needed. Otherwise, it is published in the CDS. |
Figure 3 sketches the interaction between the roles involved in the generation of the QARs. The implementation of this workflow into the QAR production requires further refinement: it needs to distinguish between fast and in-depth assessment cycles, as well as between common and non-common QAT fields. Details are provided in the next section.
Figure 3. Sketch showing the basic roles, their interactions, and their responsibilities during the QAR production within the CMS. Note the iteration loop between roles to allow refinement of the content. The sketch gives a grasp of the more complex workflow shown in the next figures.
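The roles of Table 2 and the iteration loop of Figure 3 can be read as a simple state machine. The sketch below is one simplified encoding of that reading, not the actual CMS implementation.

```python
from enum import Enum, auto

class QARState(Enum):
    INITIATED = auto()       # EQC main contact creates the QAR and selects the team
    PROVIDER_INPUT = auto()  # data provider fills in the QAT entries
    EVALUATED = auto()       # evaluator vets the content, adds independent assessment
    REVIEWED = auto()        # reviewer checks completeness and consistency
    RETURNED = auto()        # approver sends the QAR back with comments
    PUBLISHED = auto()       # approver publishes the QAR with the dataset

# Allowed transitions, including the refinement loops sketched in Figure 3.
TRANSITIONS = {
    QARState.INITIATED: {QARState.PROVIDER_INPUT},
    QARState.PROVIDER_INPUT: {QARState.EVALUATED},
    QARState.EVALUATED: {QARState.PROVIDER_INPUT, QARState.REVIEWED},
    QARState.REVIEWED: {QARState.EVALUATED, QARState.RETURNED, QARState.PUBLISHED},
    QARState.RETURNED: {QARState.EVALUATED},
}

def advance(state: QARState, target: QARState) -> QARState:
    """Move the QAR to the next state, enforcing the workflow of Figure 3."""
    if target not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition: {state.name} -> {target.name}")
    return target
```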
To complete the list of roles involved in the CMS, two additional roles are considered but they are not directly part of the QAR production workflow:
The trade-off between timely and detailed assessment is tackled by splitting the QAR production workflow into two phases:
An additional element that makes the QAR production sustainable is the identification of QAT fields associated with common content across several QARs. More details are reported in Appendix II. In the following section, it is shown how these two elements, fast/in-depth assessment and common/non-common fields, come into play during the QAR production.
In a nutshell, the process for the QAR production, for datasets both already published and submitted for publication, may be summarised as follows:
The trigger for a new assessment is associated with two main causes:
Throughout the QAR production process, the user-engagement team of the EQC iterates with the users to harvest their feedback about the different steps or improvements taken, in a co-production process, making sure to advance in the direction of fulfilling users' needs. These user requests result in reports to be discussed at regular EQC meetings, where they are further investigated and may eventually trigger framework and QAR updates. User requirements also help to refine the QATs and to prioritise the performance metrics to be employed during the independent assessment. User-engagement outcomes are thus the basis to conduct a gap analysis of the information made available to users and to steer the evolution of the EQC design in terms of framework and dissemination activities. This virtuous feedback loop is crucial for a user-oriented service such as C3S.
The EQC function has been implemented after many datasets were already published in the CDS. As a consequence, a workflow needed to be envisaged to produce the necessary QARs. In this case, the trigger of the QAR initiation is a QAR release calendar, defined by the EQC team together with the C3S governance board. Once the QARs are triggered, the workflow is managed in the CMS, as shown in Figure 4.
Figure 4. Sketch showing the interaction across roles during the QAR production within the CMS. Compared to Figure 3, here the distinctions between fast/in-depth assessment cycles and common/non-common fields are explicit, and it is clarified at which stage the QAR is published.
Using the same roles identified in section 4, the QAT is filled in the private domain during the fast assessment cycle and then published. Once public, the QAR is completed with the independent assessment during the in-depth cycle and finally updated in the public domain. Each assessment cycle distinguishes between common and non-common fields:
Once published, a QAR might need to be updated. More details about the procedure in this case are given in Appendix II.
In the future, the evolution of EQC will need to consider a workflow for datasets ready to be published. So far, this workflow has not been implemented. Several options are considered based on the lessons learned during the ramp-up phase of EQC. Here we give some recommendations.
A new dataset is a dataset the provider considers ready to be served through the CDS. At this stage, the provider and C3S officers iterate to ingest data information, like documentation or the location where data are stored. Much of this information could be collected in the CMS (or a tool connected with the EQC CMS), which facilitates the completion of part of the QARs shortly after. Instead of EQC asking for similar information again, we can leverage the content already stored in the CMS to streamline the flow of information exchanged among the various authors involved, by introducing a workflow starting with the fast assessment explained in the previous section. Once the fast assessment cycle is complete, the dataset could be either rejected because, for instance, the minimum documentation required is not complete, or accepted for publication. When the dataset and the initial QARs are public, the in-depth assessment cycle starts. The logical flow, illustrated in Figure 5, may be summarised as follows:
Figure 5. Scheme of the preliminary workflow showing how the EQC may be engaged in the fast assessment of datasets not yet published in the CDS. Once a dataset is published (identified with 'end' in the figure), the in-depth assessment cycle starts as usual. The part associated with the EQC is in light blue, while the part associated with the EQC user-engagement team is in green.
The independent assessment is part of the in-depth cycle, which always starts after the dataset is published with its QAR. However, for new datasets it would be convenient for the data provider to perform the technical assessment, that is, the data checks, and report the evidence logs to the EQC team. The EQC team then verifies that evidence is available for the entire dataset and performs random checks autonomously on a subset of it. The reason for this logical flow is that there are technical limitations that make the data checks timely only when done by the provider. Indeed, the downloading and queuing time and the disk-storage requirements are technical limitations that would demand more resources for the EQC, while these resources are likely already allocated on the provider's side. The strategy described would also reduce duplication of efforts and optimise resources, while guaranteeing independent checks.
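In code, this delegated-checking strategy amounts to verifying evidence coverage first and then independently re-running checks on a random sample. A minimal sketch, where the checker callable is a stand-in for a data checker such as the one sketched earlier:

```python
import random
from typing import Callable

def spot_check(files: list, evidence: dict,
               checker: Callable[[str], dict], fraction: float = 0.05) -> dict:
    """Verify that provider evidence covers every file, then independently
    re-check a random subset with the given checker."""
    missing = [f for f in files if f not in evidence]
    if missing:
        raise ValueError(f"evidence logs missing for {len(missing)} files")
    sample = random.sample(files, max(1, int(len(files) * fraction)))
    return {f: checker(f) for f in sample}
```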
Besides the QAR production, the EQC function for the CDS is completed by additional protocols that make it a solid building block of C3S. Here follows a brief list of the protocols and practices considered.
Communication channels have been established to provide C3S with recommendations to avoid gaps, address drawbacks and shortcomings, and identify limitations. These issues are reported via the EQC communication channels in the form of tickets sent to the rest of C3S. The tickets are sorted by resolution timing and priority as follows:
The different issues are analysed by C3S and may trigger internal processes to deal with them. In this respect, the EQC team supports the evolution of C3S through gap analysis of the current capabilities of the CDS and formulates recommendations.
One key common practice to ensure consistency across dataset categories is to define a common vocabulary. The definition of shared vocabularies and common practices provides a foundation for interoperability, reduces interpretation ambiguities, and fosters efficient communication. The effort to harmonise existing terminologies in a structured vocabulary aims to facilitate the usage of C3S products by downstream and upstream users; it is also beneficial for the coordination with the rest of the C3S activities, ensuring consistency when referring to specific CDS elements. It shall be noted that the lack of an overarching, consistent EQC vocabulary was identified as one of the priority gaps in climate data quality (Nightingale et al. 2019).
Agreeing on a common terminology is by no means a simple task, as it is time-consuming and comes with a variety of challenges, especially in the case of C3S, where datasets come from different communities adopting different conventions. For instance, numerous terms are interpreted differently across data communities. What is defined as 'product' in the satellite-observations community differs substantially from what the seasonal-forecasts community or the ECMWF MARS (Meteorological Archival and Retrieval System) archive considers it to mean. Some terms are very general (e.g., 'observation') and lead to long discussions to reach an agreement. For these cases, a practical solution has been to include mostly CDS-related terms, leaving out general terms as much as possible. The definitions are continuously monitored and improved.
According to the FAIR principles (Wilkinson et al. 2016), it is critical to use controlled vocabularies to describe and structure (meta)data in order to ensure findability and interoperability. A common vocabulary also refers to a set of common standards for data and metadata (formats and conventions) to be enforced by C3S. Indeed, gathering the metadata in a single system, such as the CDS, with a common format requires standardisation: data providers need to be encouraged to convert their metadata inventories into formatted inventories that can be transferred to the C3S service. Including metadata in a consolidated and centralised system requires and/or encourages providers to agree to share the information with the community at large (Aguilar et al. 2003; Brunet et al. 2020). Having a single common metadata standard for the many communities gathered by C3S is not realistic, because these communities have their own standards. Thus, a first practical approach is that each dataset category follows a community-recognised standard, as identified by the EQC team. Examples are the CMIP, CORDEX, ESA-CCI, and obs4MIPs metadata conventions.
Finally, consistency of the EQC framework is also achieved by a commitment to transparency, following the TRUST Principles for Data Repositories (Lin et al. 2020). In particular, the methodology, the software, and the assessments are made available to the users through the QARs published in the CDS. This helps to increase the transparency and verifiability of the assessments, as well as the resilience of the processes considered, which remain open to further improvement.
A few more practices have been identified to ensure consistency across the QATs, among these are the following:
The evolution of the EQC function would benefit from optimising the protocols, the templates and the workflow implemented so far, while tackling the gaps identified. The main issues encountered and more general considerations follow:
The current framework developed for the ramp-up operational phase of the Evaluation and Quality Control (EQC) function for the C3S CDS was presented. The framework considers the tasks, protocols, and tools required to ensure that data are reliable and usable. It is inspired by the WMO GFCS guidelines, ISO 14090 and ISO 14091, and previous EU FP7/C3S projects. The framework is driven by a holistic approach aiming at homogenising the type of information made available across different climate datasets in a way that is both human- and machine-readable. It is characterised by a two-tier review system to assure the quality of the dataset information released to the public. On the one hand, this approach enables fair and consistent comparison across datasets and facilitates guidance on the best use of data for the intended user application; on the other hand, it makes the assessments sustainable and maintainable in an operational environment. In doing so, the framework explored optimal mechanisms (e.g., fast vs in-depth assessment and common vs non-common dataset information) for setting up a sustainable delivery of EQC information, meeting the guidelines of the European Roadmap for Climate Services (EC 2015) and Peng et al. (2021).
The establishment of a quality management framework demonstrated benefits to the many actors involved:
As mentioned in the introduction, it is the first time that the methodologies, the metrics, the evaluation framework, and the way to present all this information are being developed in an operational service that disseminates the majority of the climate dataset categories (including in-situ and satellite observations, seasonal forecasts, reanalysis, and climate projections). Building the underlying technical solutions makes the C3S EQC unique and requires pragmatic decisions for its implementation. The first part of the EQC framework design focused on ensuring the robustness of the baselines and processes to collect the information required for the QATs, keeping their coherence and their comparability across all the datasets available in the CDS. Having a set of QATs covering all the dataset categories consistently is a unique endeavour. These activities needed continuous improvement to fine-tune the EQC framework, benefitting from the operational assessments (QARs) and the user-engagement process. During the second part of the EQC framework design, activities focused on defining the level of content required in the QARs and its homogeneity across dataset categories. Optimisation of the QAR updates by means of automation tools and workflow streamlining has also played an increasing role.
The EQC framework was developed and implemented for all datasets published in the CDS at the beginning of 2020. QARs have been generated at the granularity of the variable for each dataset and made available to the users via the CDS web platform. The implementation of the EQC function addressed most of the recommendations that arose during the pre-operational phase (see Nightingale et al. 2019): (i) designing of dataset category specific QATs, (ii) enhancement of the CMS functionalities, (iii) identification of minimum requirements to publish a CDS dataset, (iv) establishment of an overarching consistent EQC vocabulary, (v) creation of guidance documents for evaluators and reviewers to guarantee consistency in the QAR production, (vi) regular benchmarking activities brought into the operational process, (vii) ability to track changes in the QAR content.
Several constructive pieces of feedback from data providers, downstream users and C3S officers made the dissemination of the EQC information more robust over time. Orchestrating the different elements involved requires considerable coordination efforts and a continuous improvement approach to integrate the inputs regularly emerging from stakeholders and technical constraints. A number of lessons learned and science knowledge gaps were identified during the development of the EQC function and are detailed in the paper. These warrant further investment to comprehensively address the quality dimension of climate datasets in an operational environment.
The additional file for this article can be found as follows:
QATs: Consistent set of QAT designs for a variety of dataset categories supported by the CDS: in-situ observations, satellite observations, seasonal forecasts, global and regional climate projections, global and regional reanalyses. DOI: https://doi.org/10.5334/dsj-2022-010.s1
7 http://datastore.copernicus-climate.eu/documents/in-situ/GRUAN_product_traceability_chain_GAIA_CLIM_Humidity.pdf.
12 The results of the metadata checks (i.e., data format, Toolbox compliance, and standard-convention compliance) cannot be structured in precompiled tables; these checks are produced by automatic software validated in advance.
The independent assessment is a fundamental piece of the quality assessments. Indeed, evidence that the dataset has been independently validated is a key criterion for most data users (Nightingale et al. 2019). Applying the same approach and tools, regardless of the supplier source, guarantees a uniform and impartial audit. The independent assessment is part of the QAT and is designed to accommodate information on the following topics:
An additional element that makes the QAR production sustainable is the identification of QAT fields associated with content common across several QARs. It shall be taken into account that the QAR granularity is at the variable level. Thus, given, for instance, the same model, version, model run, and Catalogue entry, all the variables share some information that is exactly the same. The concept is clarified by the following example: for a given Catalogue entry = 'CMIP5 monthly data on single levels', model = 'inmcm4', experiment = 'amip', some answers to the QAT for climate projections have the same content, for instance 'description of the model' or 'horizontal resolution'. In this respect, instead of repeating the same answers, these are written once in the common part of the QAR, while the specific answers, unique to the specific variable, enter the non-common part of the QAR. Continuing with the example above, let us suppose that we want to produce one QAR for the variable '2m temperature' of this Catalogue entry, model, and experiment.
When creating the QAR, the EQC main contact fills in these four fields that make the QAR unique and selects the common fields associated with model 'inmcm4', like data format or the model-components description. These fields are filled in once in the CMS and then propagated automatically to all variables and experiments for the same model. One extra consideration that reduces the manual intervention is recognising that the non-common fields are typically the variable name, units, and description, and these are the same independent of the specific model, given the same Catalogue entry. Thus, the associated information can be extracted from precompiled tables to fit each QAR. For instance, the definition of '2m temperature' is the same for model 'inmcm4' and for model 'access1-3', but the common fields (e.g., the model-components description) are very different. As a result, the CMS makes it possible to fill in each QAR by extracting the non-common information from precompiled tables. Benefitting from this functionality, any change to these tables can be quickly propagated to hundreds of QARs. These tables need to be validated by the data provider. Note that all this also makes the approval process of thousands of QARs sustainable in the fast assessment cycle. Indeed, while the common part of a QAR needs the usual CMS workflow, the fast assessment of the remaining non-common part requires only automatic extraction of content from tables agreed with the data provider. The merging of agreed common fields and agreed tables for non-common fields avoids the need for an approval stage for each single QAR in the fast assessment. The EQC team spent time identifying which QAT fields are common and which are non-common; the results are shown in the supplementary material.
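A minimal sketch of this mechanism, using the example above; the keys and table content are illustrative, whereas in the real CMS they live in a database with automatic propagation.

```python
# Common fields: shared by every variable of the same Catalogue entry, model
# and experiment; filled in once and propagated to all related QARs.
COMMON_FIELDS = {
    ("CMIP5 monthly data on single levels", "inmcm4", "amip"): {
        "model_components_description": "...",  # filled once by the provider
        "horizontal_resolution": "...",
        "data_format": "NetCDF",
    },
}

# Non-common (variable-level) fields: identical across models for the same
# Catalogue entry, so they can sit in precompiled, provider-validated tables.
VARIABLE_TABLE = {
    "2m temperature": {"units": "K", "description": "..."},
}

def build_qar(catalogue_entry: str, model: str, experiment: str, variable: str) -> dict:
    """Assemble a variable-level QAR from common and non-common content."""
    qar = {"catalogue_entry": catalogue_entry, "model": model,
           "experiment": experiment, "variable": variable}
    qar.update(COMMON_FIELDS[(catalogue_entry, model, experiment)])
    qar.update(VARIABLE_TABLE[variable])
    return qar

# e.g. build_qar("CMIP5 monthly data on single levels", "inmcm4", "amip",
#                "2m temperature")
```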
Once published, a QAR might need to be updated until a new version of the same dataset is published in the CDS. Moreover, an update can also be caused by:
It shall be noted that in case a new version of a dataset is available, a completely new QAR has to be produced. In this respect, a new dataset version is not considered a trigger for QAR updates, but the workflow follows the usual QAR production.
Throughout the paper, the authors use the concepts of 'scientific' and 'technical' assessments. To avoid any misunderstanding, it is worth clarifying these notions in this context. The scientific assessment consists of data-content and cross-data-content checks, as opposed to the technical assessment, which concerns checks of the data and metadata files (Stockhause et al. 2012). As an example, when the evaluator plots the dataset variable and checks for the reproducibility of El Niño events against skill metrics, the scientific soundness of the data content is considered (e.g., Haiden et al. 2019). When the evaluator looks for metadata-standard compliance (e.g., the ACDD convention), the evaluator is checking that the attributes describing the files conform to a set of community-recognised metadata characteristics (e.g., a specific date-time format). There is no check of the data content, no scientific evaluation in this case, but a metadata-conformity check; it is a purely technical assessment (Evans et al. 2017; Stockhause et al. 2012). The physical-consistency checks are borderline in this distinction. On the one hand, they are technical checks when the analyses regard the match between the valid ranges reported in the metadata and the max/min values in the file content. On the other hand, the check becomes more complex if a plot with associated analyses is necessary to identify suspicious outliers. This operation becomes necessary when the metadata do not contain the valid ranges, which are good practice to include but at times very challenging to define for exotic variables. In this paper, the physical-plausibility checks are considered part of the technical assessment, even though automatic statistical analyses of the data content are necessary.
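As an example of a purely technical check in the sense just described, the sketch below verifies that a file carries a handful of ACDD-style global attributes. The attribute subset is an assumption for illustration, not the full convention.

```python
import xarray as xr

# Illustrative subset of ACDD global attributes (not the full convention).
ACDD_REQUIRED = ["title", "summary", "keywords",
                 "time_coverage_start", "time_coverage_end", "date_created"]

def check_acdd(path: str) -> dict:
    """Report, per required attribute, whether the file declares it.
    No data content is inspected: this is a metadata-conformity check only."""
    attrs = xr.open_dataset(path).attrs
    return {name: name in attrs for name in ACDD_REQUIRED}
```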
As far as the scientific assessment is concerned, this refers to scientific analyses of the physical content described by the dataset to check its scientific soundness. Given the nature of this assessment, it is typically carried out by domain experts. Analyses may include uncertainty estimation, validation against reference datasets, and the reproducibility of temporal/spatial patterns. Having clarified what is considered technical and what scientific, part of the assessments described in this paper concern documentation (availability and completeness) and file accessibility, archiving, and compatibility within the service (i.e., with the Toolbox). All these analyses are neither purely technical nor scientific, because they are not about the files per se, but about the associated material needed to access and understand the dataset. These analyses enter the group of stewardship assessments. Strictly speaking, stewardship regards all the aspects of the distribution of the dataset, so technical and scientific assessments could potentially also enter this category. However, by stewardship assessments we here mean any assessment that guarantees the accessibility and understandability of the dataset distributed, and so anything of relevance to dataset quality that is not associated with the data and metadata file content, such as documents accompanying the dataset describing how to use it. Typical examples regard the description of the algorithms or models used to produce and process the data, the provision of the DOI and licence of use, the grid description, a verified network address to access the data, and information about the archiving procedures. The goal is to ensure that the dataset is well documented, the processing chain is visible, and the data are readily obtainable and usable.
At times, the assessments described above are accompanied by maturity assessment models. These are formal approaches to support compliance verification, usually defined in discrete stages, to evaluate practices applied in organisations, services, or products. Maturity is meant as a desired or anticipated evolution from a more ad hoc approach to a more managed process (Peng 2018). Datasets associated with high maturity are produced following the best practices of the community and in a more managed fashion, increasing user trust in the data record provided. It should be noted that a low maturity rating does not necessarily imply a low scientific value for a dataset. This can happen especially for datasets managed by a single investigator, which may be flagged as having low maturity due to poor quality in metadata, documentation, and accessibility.
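A toy illustration of a discrete-stage maturity assessment in this spirit follows; the categories and the 1-5 scale are assumptions for illustration, not the definition of any particular maturity matrix.

```python
# Hypothetical discrete maturity scale: 1 = ad hoc ... 5 = fully managed.
MATURITY_SCALE = range(1, 6)

def maturity_summary(scores: dict) -> dict:
    """Validate per-category stage scores and compute a simple overall average."""
    for category, score in scores.items():
        if score not in MATURITY_SCALE:
            raise ValueError(f"{category}: score {score} outside the 1-5 scale")
    return {"per_category": scores,
            "overall": round(sum(scores.values()) / len(scores), 1)}

# e.g. maturity_summary({"metadata": 2, "documentation": 3, "accessibility": 4})
# -> {"per_category": {...}, "overall": 3.0}
```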
ACDD – Attribute Convention for Data Discovery
ATBD – Algorithm Theoretical Baseline Document
C3S – Copernicus Climate Change Service
CAMS – Copernicus Atmosphere Monitoring Service
CDS – Climate Data Store
CF – Climate and Forecast
CMEMS – Copernicus Marine Environment Monitoring Service
CMIP – Coupled Model Intercomparison Project
CMS – Content Management System
CORDEX – COordinated Regional climate Downscaling EXperiment
CORE-CLIMAX – COordinating Earth observation data validation for RE-analysis for CLIMAte ServiceS
DOI – Digital Object Identifier
ECMWF – European Centre for Medium-Range Weather Forecasts
ECV – Essential Climate Variables
EQC – Evaluation and Quality Control
ERA5/6 – Fifth/Sixth generation of ECMWF atmospheric Re-Analyses
ESA-CCI – European Space Agency Climate Change Initiative
ESMValTool – Earth System Model Evaluation Tool
FAIR – Findable, Accessible, Interoperable, Reusable
FP7 – Seventh Framework Programme
GCOS – Global Climate Observing System
GRUAN – GCOS Reference Upper-Air Network
ISO – International Organization for Standardization
MARS – Meteorological Archival and Retrieval System
MR – Minimum Requirement
Obs4MIPs – Observations for Model Intercomparisons Project
PQAD – Product Quality Assurance Document
PQAR – Product Quality Assessment Report
PUG – Product User Guide
QA4ECV – Quality Assurance for Essential Climate Variables
QAR – Quality Assurance Report
QAT – Quality Assurance Template
SEAS5 – Fifth generation of the ECMWF seasonal forecasting system
SMM – System Maturity Matrix
TRUST – Transparency, Responsibility, User focus, Sustainability and Technology
UERRA – Uncertainties in Ensembles of Regional ReAnalysis
WIGOS – WMO Integrated Global Observing System
WMO – World Meteorological Organization
This study is based on work carried out in the C3S_512 contract, funded by the Copernicus Programme and operated by ECMWF on behalf of the European Commission (Service Contract number: ECMWF/COPERNICUS/2018/C3S_512_BSC). We would like to acknowledge the work of colleagues from several European institutions, the data providers, and C3S, who contributed to the development of the EQC framework as well as to the QAR production. We would also like to acknowledge the focus-group users, who took time to review and provide valuable feedback on the QARs, QATs, minimum requirements, and the CDS quality assessment tab. The authors are grateful to the anonymous reviewers for their constructive comments, which have helped to improve this paper.
The authors have no competing interests to declare.
CL is the main contributor to conceptualization, project planning, and writing of the original draft. FD contributed significantly to conceptualization, project planning, reviewing, and editing. GL contributed significantly to conceptualization, writing of the original draft, reviewing, and editing. CB contributed significantly to conceptualization, reviewing, and editing. AO and CS contributed significantly to conceptualization, project planning, reviewing, and editing. MC, DS, PB, SP, VR, DP, FS, AL, AP, DC, OM, PC, NP, FM, MR, AR, and MG contributed to conceptualization, reviewing, and editing.
Aguilar, E, Auer, I, Brunet, M, Peterson, TC and Wieringa, J. 2003. Guidelines on climate metadata and homogenization, WCDMP-No. 53, WMO-TD No. 1186. Geneva: World Meteorological Organization.
Brunet, M, Brugnara, Y, Noone, S, Stephens, A, Valente, M A, Ventura, C, Jones, P, Gilabert, A, Brönnimann, S, Luterbacher, J, Allan, R, Brohan, P and Compo, GP. 2020. Best Practice Guidelines for Climate Data and Metadata Formatting, Quality Control and Submission. Reading, UK: Copernicus Climate Change Service.
Buontempo, C, Hanlon, HM, Bruno Soares, M, Christel, I, Soubeyroux, J-M, Viel, C, Calmanti, S, Bosi, L, Falloon, P, Palin, EJ, Vanvyve, E, Torralba, V, Gonzalez-Reviriego, N, Doblas-Reyes, F, Pope, ECD, Newton, P and Liggins, F. 2018. What have we learnt from EUPORIAS climate service prototypes? Climate Services, 9: 21–32. DOI: https://doi.org/10.1016/j.cliser.2017.06.003
Callahan, T, Barnard, J, Helmkamp, L, Maertens, J and Kahn, M. 2017. Reporting data quality assessment results: Identifying individual and organizational barriers and solutions. eGEMs, 5(1). DOI: https://doi.org/10.5334/egems.214
European Commission (EC). 2020. Copernicus and Earth observation in support of EU policies. Part I: Copernicus uptake in the European Commission. DOI: https://doi.org/10.2760/024084
European Commission (EC), Directorate-General for Research and Innovation. 2015. European Union: A European Research and Innovation Roadmap for Climate Services. DOI: https://doi.org/10.2777/702151
European Organization for the Exploitation of Meteorological Satellites (EUMETSAT). 2014. CORE-CLIMAX System Maturity Matrix Instruction Manual (Doc. No. CC/EUM/MAN/13/002). Available at https://masif.eumetsat.int/website/wcm/idc/idcplg?IdcService=GET_FILE&dDocName=PDF_CORE_CLIMAX_MANUAL&RevisionSelectionMethod=LatestReleased&Rendition=Web.
Evans, B, Druken, K, Wang, J, Yang, R, Richards, C and Wyborn, L. 2017. A data quality strategy to enable fair, programmatic access across large, diverse data collections for high performance data analysis. Informatics, 4(4): 45. DOI: https://doi.org/10.3390/informatics4040045
Hewitt, CD, Allis, E, Mason, SJ, Muth, M, Pulwarty, R, Shumake-Guillemot, J, Bucher, A, Brunet, M, Fischer, AM, Hama, AM, Kolli, RK, Lucio, F, Ndiaye, O and Tapia, B. 2020. Making society climate resilient: International progress under the global framework for climate services. Bulletin of the American Meteorological Society, 101(2): E237–E252. DOI: https://doi.org/10.1175/BAMS-D-18-0211.1
ISO 14090:2019. Adaptation to climate change — Principles, requirements and guidelines. Geneva, Switzerland. https://www.iso.org/standard/68507.html.
ISO 14091:2021. Adaptation to climate change — Guidelines on vulnerability, impacts and risk assessment. Geneva, Switzerland. https://www.iso.org/standard/68508.html.
ISO 19157:2013. Geographic information — Data quality. Geneva, Switzerland. https://www.iso.org/standard/32575.html.
Lawrence, B, Jones, C, Matthews, B, Pepler, S and Callaghan, S. 2011. Citation and peer review of data: Moving towards formal data publication. International Journal of Digital Curation, 6(2): 4–37. DOI: https://doi.org/10.2218/ijdc.v6i2.205
Leadbetter, A, Carr, R, Flynn, S, Meaney, W, Moran, S, Bogan, Y, Brophy, L, Lyons, K, Stokes, D and Thomas, R. 2020. Implementation of a data management quality management framework at the marine institute, Ireland. Earth Science Informatics, 13(2): 509–521. DOI: https://doi.org/10.1007/s12145-019-00432-w
Lin, D, Crabtree, J, Dillo, I, Downs, RR, Edmunds, R, Giaretta, D, De Giusti, M, L’Hours, H, Hugo, W, Jenkyns, R, Khodiyar, V, Martone, ME, Mokrane, M, Navale, V, Petters, J, Sierman, B, Sokolova, DV, Stockhause, M and Westbrook, J. 2020. The TRUST Principles for digital repositories. Scientific Data, 7(1): 144. DOI: https://doi.org/10.1038/s41597-020-0486-7
Medri, S, Banos de Guisasola, E and Gualdi, S. 2012. Overview of the main international climate services. Social Science Research Network. SSRN Scholarly Paper ID 2194841. DOI: https://doi.org/10.2139/ssrn.2194841
Nightingale, J, Boersma, KF, Muller, J-P, Compernolle, S, Lambert, J-C, Blessing, S, Giering, R, Gobron, N, De Smedt, I, Coheur, P, George, M, Schulz, J and Wood, A. 2018. Quality assurance framework development based on six new ecv data products to enhance user confidence for climate applications. Remote Sensing, 10(8): 1254. DOI: https://doi.org/10.3390/rs10081254
Nightingale, J, Mittaz, JPD, Douglas, S, Dee, D, Ryder, J, Taylor, M, Old, C, Dieval, C, Fouron, C, Duveau, G and Merchant, C. 2019. Ten priority science gaps in assessing climate data record quality. Remote Sensing, 11(8): 986. DOI: https://doi.org/10.3390/rs11080986
Peng, G. 2018. The state of assessing data stewardship maturity – an overview. Data Science Journal, 17: 7. DOI: https://doi.org/10.5334/dsj-2018-007
Peng, G, Lacagnina, C, Downs, RR, Ramapriyan, H, Ivánová, I, Ganske, A, le Roux, J, et al. 2021. International Community Guidelines for Sharing and Reusing Quality Information of Individual Earth Science Datasets. OSF Preprints, 16 April 2021. DOI: https://doi.org/10.31219/osf.io/xsu4p
RfII (German Council for Scientific Information Infrastructures). 2020. The Data Quality Challenge: Recommendations for Sustainable Research in the Digital Turn. Göttingen.
Stockhause, M, Höck, H, Toussaint, F and Lautenschlager, M. 2012. Quality assessment concept of the World Data Center for Climate and its application to CMIP5 data. Geoscientific Model Development, 5(4): 1023–1032. DOI: https://doi.org/10.5194/gmd-5-1023-2012
Thépaut, J, Dee, D, Engelen, R and Pinty, B. 2018. The Copernicus Programme and its Climate Change Service. IGARSS 2018 IEEE International Geoscience and Remote Sensing Symposium, 1591–1593. DOI: https://doi.org/10.1109/IGARSS.2018.8518067
Wilkinson, MD, Dumontier, M, Aalbersberg, IJ, Appleton, G, Axton, M, Baak, A, Blomberg, N, Boiten, J-W, da Silva Santos, LB, Bourne, PE, Bouwman, J, Brookes, AJ, Clark, T, Crosas, M, Dillo, I, Dumon, O, Edmunds, S, Evelo, CT, Finkers, R, Mons, B, et al. 2016. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3(1): 160018. DOI: https://doi.org/10.1038/sdata.2016.18
WMO/WIGOS. 2017. WIGOS Metadata Standard. Geneva: World Meteorological Organization, WMO no. 1192.
Zeng, Y, Su, Z, Barmpadimos, I, Perrels, A, Poli, P, Boersma, F, Frey, A, Ma, X, Bruin, K de, Goosen, H, John, VO, Roebeling, R, Schulz, J and Timmermans, WJ. 2019. Towards a traceable climate service: Assessment of quality and usability of essential climate variables. Remote Sensing, 11(10): 1–28. DOI: https://doi.org/10.3390/rs11101186