1 Introduction
The Arctic region surrounds the Earth’s North Pole, although its geographic limit varies depending on the criteria followed (Figure 1). The Arctic is undergoing unprecedented changes as a result of global warming. Sea ice extent has declined since the late 1970s in all months of the year, with the retreat speeding up since the early 1990s. The observed increase in mass loss from the Greenland ice sheet during the last decade is also of great concern because of its future contribution to sea level: if the ice sheet melted completely, global mean sea level would rise by 7.3 m. The causes of such changes and their impacts on the environment and society are not yet well understood, which limits our ability to predict future climate challenges. In particular, it is essential to improve the performance of global climate models, including their treatment of many processes and their interactions within the atmosphere, ocean, sea ice, ice sheet and biosphere systems. Process-based studies combining analysis of available observations with models of varying complexity and scale are needed to make climate models more realistic, which is an important task for future predictions of climate scenarios.
The IPSL (http://www.ipsl.fr/) is focussed on research topics concerning the global environment and particularly on the Arctic, which has recently been highlighted as a research priority within the institute and at national level. This priority motivated the creation in 2010 of the new French Arctic initiative Chantier arctique (http://www.chantier-arctique.fr/en/), which aimed at mobilising the existing multidisciplinary scientific community focussed on Arctic research to help identify the key scientific issues. The IPSL focuses on the environmental research part, which has identified the need to increase observations in the Arctic across different scientific communities and to identify existing datasets. The latter motivated the development of the LABEX L-IPSL Arctic metadata portal, presented here as a tool to identify Arctic data within the IPSL and linked institutes. LABEX (Laboratoires d’Excellence) L-IPSL (http://labex.ipsl.fr/) is a scientific program that focusses on the study of climate change as part of the IPSL. The goal of the project was to provide an assessment of the potential consequences of climate change at different time and spatial scales, which are important for political or economic issues. The project is strongly focussed on regional scales, for instance the Arctic. The creation of the LABEX L-IPSL Arctic metadata portal was part of one of the work packages within the project.
A lot of effort can be put into unifying data formats; however, if this is not accompanied by comprehensive metadata, the visibility and accessibility of the information are compromised. This could mean that valuable data remain untapped on local computers or servers. Starting a metadata compilation is a long process that cannot be automated, but once it is in place, it acts as a driving force for a wide scientific community, facilitating data sharing and collaborations. The main objective of this paper is to showcase the different observations that exist as part of the IPSL, compiled in the metadata portal, without elaborating on the technical aspects of the metadata interface itself. This metadata portal is an important starting point for Arctic research at the IPSL level, but also for the development of national collaborations and links with other international efforts and shared resources.
The portal represents all the Arctic datasets created by the IPSL researchers, with a total of 32 datasets identified (the complete dataset list can be found in the supplementary material). Some of the existing datasets at the IPSL are already archived through various data centres, such as the ICARE (Cloud Aerosol Water Radiation Interactions) Thematic Centre (http://www.icare.univ-lille1.fr/) and Ether (http://www.pole-ether.fr/etherTypo/index.php?id=1450&L=1), which is an Atmospheric Chemistry Data Centre. However, these data centres are not specifically focussed on the Arctic. The purpose of the portal was to provide metadata information that could be easily accessed by users for their own research and to create a tool that gathered all relevant metadata information for each dataset. Because of the multidisciplinary research focus of the IPSL, the metadata template was designed to be comprehensive and to serve a wide scientific community, which facilitates the compilation of different types of observations, from satellite to buoy measurements. The portal contains standardised information about each of the datasets as part of the metadata, together with links to relevant publications, principal investigators (PIs) and the data distribution sources. Plots illustrating potential uses of the data are also shown. This article compiles all the datasets that form the LABEX L-IPSL Arctic metadata portal, including the description of the metadata format development as well as the schematic content of the dataset template.
2 Methodology
The objective of this paper is to highlight why the portal is a useful tool for the IPSL and also for the scientific community focussed on Arctic research. The idea was not to develop an innovative technique to design a metadata portal, therefore the reader should not expect a technical paper showing the code of the portal interface development. Nevertheless, the motivation for the chosen metadata format is explained in Subsection 2.1.
The development of the portal started in September 2013 and was initially planned as a one-year task within the LABEX L-IPSL program. It was finally launched in December 2014. The portal is now publicly visible (http://climserv.ipsl.polytechnique.fr/arcticportal/). The portal interface was based on the LABEX L-IPSL Climatology data portal, which is still in preparation, hence the delay with some technical aspects. Despite these delays, all existing datasets collected prior to December 2014 are included in the LABEX L-IPSL Arctic data portal.
The process of dataset integration was challenging. First, all the potential groups and researchers involved with Arctic observations were identified and a list of contact details was created. Because of the different formats and levels of data processing, as well as the common lack of a unified way of storing metadata, a standardised data template (Table 1) was created and distributed to the contacts on the list; the development of the metadata template is explained below (Subsection 2.1).
| Dataset | Projects | Contacts | Parameters |
|---|---|---|---|
| Dataset Title | Short Name | Role | Category |
| Purpose | Long Name | Full Name | Parameter |
| Abstract | URL | | Units |
| Acquisition Methodology | Description | Phone | |
| DOI | URL Type | Fax | |
| Access Constraints | | Organism short Name | |
| Use Constraints | | Organism long Name | |
| Keywords | | URL | |
| Status | | Description | |
| Temporal Coverage | | URL Type | |
| Start Date | | Address | |
| Stop Date | | City | |

| Datacenter | References | Multimedia sample | Distribution |
|---|---|---|---|
| Short name | Title | Multimedia file | Distribution Format |
| Long name | Authors | Caption | Media |
| Address | Publication Date | Description | Size |
| City | Series | | Fees |
| Postal Code | Edition | | |
| Country | Volume | | |
| URL | Issue | | |
| Description | Report number | | |
| URL Type | Publication Place | | |
| | Publisher | | |
| | Pages | | |
| | ISBN | | |
| | DOI | | |
| | Other details | | |
| | URL | | |
| | Description | | |
| | URL Type | | |

| Data resolution | Instrument | Spatial Coverage | Paleo Coverage |
|---|---|---|---|
| Latitude Resolution | Site name | Southern latitude | Start date |
| Longitude Resolution | Location | Northern latitude | Stop date |
| Horizontal Resolution Range | Detailed location | Western longitude | Eon |
| Vertical Resolution | Site longitude | Eastern longitude | Era |
| Vertical Resolution Range | Site latitude | Minimum altitude | Period |
| Temporal Resolution | Site altitude | Maximum altitude | Epoch |
| Temporal Resolution Range | Site depth | Minimum depth | Stage |
| | Platform short name | Maximum depth | |
| | Platform long name | | |
| | Instrument longitude | | |
| | Instrument latitude | | |
| | Network coordinates | | |
| | Instrument altitude | | |
| | Instrument depth | | |
| | Instrument description | | |
It is worth pointing out that not all the fields from the template were filled out (Table 1): some of them are not relevant to specific datasets, and some metadata information could not be obtained either by interviewing the contacts or by searching online. Once the portal interface is finalised, future observations and dataset updates will be entered directly by the PIs or data coordinators. This is one of the final aims of the portal, i.e. to be a public tool for people linked to the IPSL.
2.1 Development of the metadata scheme
When embarking on the development of a metadata portal, one starts by looking for a standard metadata format. However, it quickly becomes clear that there is not one standard format, but many different ones. Some are built to meet generic needs, such as the International Organization for Standardization (ISO) standards; others are designed for a specific community (see, for example, the metadata standards for marine metadata, https://marinemetadata.org/conventions/vocabularies). In Earth science, many metadata standards exist to describe observations. The ISO 19115 standard defines a general schema to provide information on the identification, extent, quality, spatial and temporal aspects, content, spatial reference, data representation, distribution and other properties of digital geographic data and services. To document a dataset including the description of the platform or the acquisition sensor, it is necessary to include other ISO schemes, such as ISO 19115-2.
For the LABEX L-IPSL Arctic portal, it was decided to rely on an already widely used standard format, the Global Change Master Directory (GCMD) Directory Interchange Format (DIF), which provides metadata lists including both the elements necessary for the description of the dataset and those useful to describe the acquisition sensors and platforms (Table 1). DIF was chosen over ISO 19115-1 (and -2) firstly because it had already been used for IPSL meta-catalogue projects, which facilitated and sped up the metadata building process. Because of time limitations (the whole portal had to be developed in one year) and because DIF is a simpler package than ISO, the former appeared to be more suitable for the needs of the portal. Another important point is that the datasets gathered for the portal differ in granularity and the observations are clearly heterogeneous, which made the process of defining datasets more complex. The flexibility of DIF in dataset definition helped with this last point.
The metadata scheme used to document the different datasets is built upon the DIF (DIF Writer’s Guide, 2014 Global Change Master Directory, NASA, http://gcmd.nasa.gov/add/difguide/). Some metadata were added to the list provided by the DIF, such as a field to document the Digital Object Identifier (DOI) of the dataset or the ability to describe a network of sensors. The DIF is a metadata format used to create directory entries that describe scientific data sets. A DIF holds a collection of fields, with specific information about the data. The DIF defines three groups of metadata (required, highly recommended and recommended), which gives relatively large freedom to document a dataset. The required metadata are the minimum information needed to identify and access a dataset. This includes the title and summary of the dataset and a link to the data centre hosting the dataset. The DIF is compliant with the ISO 19115 metadata standard, i.e. the information included in a DIF file covers that required by the ISO 19115 standard. The GCMD also provides predefined name lists for several of the DIF metadata fields. These lists allow us to limit the choices for these fields, avoiding different names or acronyms for the same object (for example, the list of categories of geophysical parameters or the list of instrumental platform types). For the geophysical parameters, the standard names defined by the Climate and Forecast standard (http://cfconventions.org/) were also used to complement the predefined name lists proposed by the GCMD. This standard is widely used by the climate research community and was recently incorporated as an Open Geospatial Consortium (OGC) standard in connection with the Network Common Data Form (NetCDF) file format.
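To make the idea of required metadata concrete, the following is a minimal Python sketch of a completeness check on a dataset record. It is not the portal's implementation: the class `DatasetRecord`, the function `missing_required` and the field names are hypothetical, chosen only to mirror the three required items named above (title, summary and a link to the hosting data centre) plus the DOI field added for the portal.

```python
from dataclasses import dataclass, field

# Hypothetical field names for illustration; they mirror the required items
# named in the text (title, summary, link to the hosting data centre).
REQUIRED_FIELDS = ("title", "summary", "data_centre_url")


@dataclass
class DatasetRecord:
    """Simplified, assumed stand-in for one DIF-style metadata entry."""
    title: str = ""
    summary: str = ""
    data_centre_url: str = ""
    doi: str = ""  # optional field added to the DIF list for the portal
    keywords: list = field(default_factory=list)


def missing_required(record):
    """Return the names of required fields that are empty in the record."""
    return [name for name in REQUIRED_FIELDS
            if not getattr(record, name).strip()]
```

A record with only a title would be reported as lacking `summary` and `data_centre_url`, which is the kind of feedback a data coordinator might want before submitting an entry.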
Following the recommendations of the Infrastructure for Spatial Information in the European Community directive (INSPIRE, http://inspire.ec.europa.eu/) regarding the metadata access (Table 1), the portal aims to facilitate access to the documentation of the datasets. The INSPIRE directive, a European Union initiative, enables the sharing of environmental spatial information among public sector organizations and better facilitates public access to spatial information across Europe. One of the improvements of the portal interface since its creation is the auto-completion tool with the DIF predefined name list. This tool will allow future users to add a new dataset or update an existing one without needing the DIF or further background information or knowledge; this will allow a wider community to have access to the metadata template.
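The auto-completion idea can be sketched as a case-insensitive prefix match against a controlled vocabulary. The Python example below is an illustration under stated assumptions, not the portal's code: the function `suggest` is hypothetical, and `PLATFORM_TYPES` is a short made-up list standing in for the much longer predefined name lists provided by the GCMD.

```python
# Illustrative vocabulary only (an assumption); it is not the GCMD list.
PLATFORM_TYPES = ["Aircraft", "Balloon", "Buoy", "Ground station",
                  "Satellite", "Ship"]


def suggest(prefix, vocabulary, limit=5):
    """Return up to `limit` terms that start with `prefix`, ignoring case."""
    wanted = prefix.strip().lower()
    if not wanted:
        # Nothing typed yet, so offer no suggestions.
        return []
    matches = [term for term in vocabulary if term.lower().startswith(wanted)]
    return sorted(matches)[:limit]
```

Restricting input to terms returned by such a function is one simple way to avoid different names or acronyms being entered for the same object.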
3 Dataset Description
Due to the large number of datasets included in the portal, it was decided to gather them in different categories, summarised in Table 2. The datasets are first divided into three main categories (atmosphere, ocean and land) based on where the observations are carried out, although some datasets include observations in more than one category. The second categorisation is made according to the type of measurement: in situ observations or remotely sensed observations from satellites and aircraft campaigns.
| Category | Type of measurement | Description/name | Main parameters | Spatial coverage |
|---|---|---|---|---|
| Atmosphere | In situ | CLIMSLIP-NyA campaigns: chemical composition over the snow pack | BC | Svalbard |
| | | NDACC-SAOZ balloons and ground-based long-term observations of atmospheric composition change | O3, NO2, O4, H2O, Colour Index | Arctic stations |
| | | ICOS-ATC station | CO2, CO, CH4, meteorological parameters, water vapour isotopic content | West Greenland |
| | | Water vapour isotopes dataset | δD, d-excess, meteorological parameters | Global network |
| | | RMR lidar observations of the atmosphere from Andoya station | T, aerosol properties | Andoya (Norway) |
| | | IAOOS buoys network (real time observations and radiative budget from OPTIMISM) | Aerosol properties, cloud properties, T, meteorological parameters, radiative budget | Arctic ocean |
| | Satellite | Global climatological data from microwave radiometers (AMSU-A/B and MHS instruments) | Brightness T, precipitation, convection, humidity, surface T | Global |
| | | Atmospheric vertical profiles: CALIPSO | Lidar attenuated backscatter, reflectance, BT, aerosol and cloud properties | Global |
| | | Atmospheric composition: IASI on board MetOp A/B | CO, skin and atmospheric T, RH, cloud properties | Global |
| | | Greenhouse gases observations: GOSAT | CO2, CH4, cloud cover, H2O | Global |
| | | Atmospheric chemistry dataset: GOMOS | O2, OClO, NO3, NO2, O3 | Global |
| | | Ice cloud properties measurements: DARDAR | | Global |
| | Aircraft campaigns | Aerosol, cloud and radiative properties: ASTAR, RACEPAC, SoRPIC campaigns (as part of CLIMSLIP+IPEV) and RALI campaigns | Scattering phase function, extinction coefficient, asymmetry parameter, LWC, TWC, aerosol and cloud properties, lidar backscatter | Svalbard, Canada |
| | | Atmospheric chemical composition YAK-AEROSIB: Siberia | Equivalent BC, O3, CO2, aerosol concentration, water vapour concentration, wind speed, WBPT | Siberia |
| | | Aerosol measurements over the Arctic during POLARCAT project | CO, total particle concentration, aerosol concentration, O3 | Sweden, Greenland |
| | Emissions inventories | MACCity: global anthropogenic emissions inventory | NOx, OC, ethane, CO2, acetone | Global |
| Ocean | Buoys | OPTIMISM: sea ice, ocean and meteorological observations from an Arctic network | Currents, SST, SSS, sea ice thickness and T | Arctic ocean |
| | | IAOOS: sea ice, snow and oceanographic real time measurements in the Arctic | Sea ice concentration and T, SST, ice drift | Arctic ocean |
| | Ships | TARA 2013 Expedition; CORIOLIS project; sea surface salinity drifters (OVIDE mission); R/V Polarstern | SST, SSS, pCO2, fCO2, CTD profiles, CO, non-methane hydrocarbon concentrations, chlorophyll | Arctic ocean and sub-polar North Atlantic |
| Land | In situ | Permafrost observations: study of the influence of fluvial thermal erosion during the ice breakup of the Lena River | Erosion rate, T of the frozen layer, geographical distribution of the thaw slump | Siberia |
| | Satellites | Plant Functional Type (PFT) maps based on GlobCover 2005, surface temperature maps over the Arctic (SSM/I and SSMIS) | PFT classification, land surface T | Arctic |
| | Ice core observations | NEEM, north-west Greenland; GRIP, isotopic composition from the last interglacial period to the present at the summit of Greenland; NGRIP, North Greenland | δ18O, ice and gas age, air content, δKr/Ar, δXe/Ar, CH4, ice sheet elevation change, δ15N, d-excess, δD | Greenland |
As part of the metadata portal we have also added links to additional observations that are carried out in collaboration with other institutes. This is the case of the Climate impacts of short-lived pollutants and methane in the Arctic-Agence Nationale de Recherche (CLIMSLIP-ANR) project, which examines the roles of these short-lived pollutants in the Arctic and their impacts on the regional climate. The project includes data collection and analysis as well as regional and global modelling. The datasets linked with the CLIMSLIP-ANR project are the CLIMSLIP-NyA, ASTAR, RACEPAC, SoRPIC, YAK-AEROSIB and POLARCAT field campaigns (Table 2). One important feature of the portal is the search tool that enables the user to locate datasets using the categories from Table 2 as well as keywords (http://climserv.ipsl.polytechnique.fr/arcticdatadb/Datasets/search). The search tool available at the moment is just a preliminary sample where one can search by specific category, for example by variable. The final aim is an open keyword search that will not prevent the multidisciplinary public from accessing any kind of metadata, even if it is outside their area of expertise.
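The combined category and keyword search described above can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not the portal's search tool: the function `search` and the in-memory `DATASETS` list are hypothetical, with names, categories and a few parameters borrowed from Table 2.

```python
# Hypothetical in-memory records; names and parameters reuse Table 2 entries.
DATASETS = [
    {"name": "OPTIMISM", "category": "Ocean", "type": "Buoys",
     "keywords": ["currents", "SST", "SSS", "sea ice thickness"]},
    {"name": "CALIPSO", "category": "Atmosphere", "type": "Satellite",
     "keywords": ["lidar attenuated backscatter", "aerosol properties",
                  "cloud properties"]},
    {"name": "NEEM", "category": "Land", "type": "Ice core observations",
     "keywords": ["d-excess", "CH4", "air content"]},
]


def search(datasets, category=None, keyword=None):
    """Return names of datasets matching the category and/or keyword.

    Both filters are optional and case-insensitive; the keyword filter
    matches any keyword that contains the given text.
    """
    results = []
    for ds in datasets:
        if category and ds["category"].lower() != category.lower():
            continue
        if keyword and not any(keyword.lower() in k.lower()
                               for k in ds["keywords"]):
            continue
        results.append(ds["name"])
    return results
```

Matching keywords by substring rather than by exact term is a deliberate choice in this sketch: it lets a user from another discipline find a dataset without knowing the exact vocabulary used in its metadata.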
4 Dataset Availability
The metadata associated with this paper is dedicated to the public domain and is available through the IPSL Mesocentre, a data and computation service of the LABEX L-IPSL (http://climserv.ipsl.polytechnique.fr/arcticportal/). As mentioned above, some of the data are already available for scientific use; Table 3 lists the data centres through which they can be accessed.
Data centre | Dataset | URL link | Data access |
---|---|---|---|
ICARE | CALIPSO (2); DARDAR (5); AMSU-A, B and MHS (6); IAOOS (11) | http://www.icare.univ-lille1.fr/drupal/ | Data access through registration: http://www.icare.univ-lille1.fr/drupal/register. Some datasets may require prior authorization from the PI |
ETHER | GOMOS (7); GOSAT (8); MACCity global anthropogenic emissions inventory (16); NDACC-France SAOZ Balloons (17); NDACC-France ground based (18); IASI (12) | http://www.pole-ether.fr/ | Data access free within the scientific framework. Login request is required: http://www.pole-ether.fr/etherTypo/index.php?id=1553&L=1 |
NOAA Paleoclimatology | GRIP (9); NGRIP (10); NEEM (19) | http://www.ncdc.noaa.gov/paleo/icecore/greenland/greenland.html | Access to online data is free of charge. Some orders could be subject to a certification, consultation fee or handling charge. |
Water Isotopes database | Atmospheric water vapour isotopes (13) | http://waterisotopes.lsce.ipsl.fr/ | Data directly available through website (plots or tabulated). Database completion: work in progress. |
ICOS-ATC | Ivittuut Observatory (Greenland) (14) | https://icos-atc-demo.lsce.ipsl.fr/ | Data available through data centre as plots. Data points available through email. |
Pangaea | Land surface temperature maps (15); PFT maps over Siberia (22) | http://www.pangaea.de/ | Data access directly through website as tab-delimited text or HTML format |
CORIOLIS | CORIOLIS (4) | http://www.coriolis.eu.org/ | Data downloadable in different ways through website. A user desk allows customers to communicate with the CORIOLIS team. |
The datasets linked to the CLIMSLIP-ANR project mentioned above (1, 3, 23, 24, 26, 30 and 32) are not yet associated with a data centre. The same is true of OPTIMISM (20), permafrost investigations in central Yakutia/Siberia (21), R/V POLARSTERN (25), RALI (27), RMR lidar observations in Andoya/Norway (28), OVIDE mission (29) and TARA Expedition 2013 (31).
The fact that many of the datasets are not yet stored in public data centres highlights the importance of the creation of the LABEX L-IPSL Arctic data portal, which makes these observations publicly visible. The portal has also helped to gather, for the first time, the metadata information in a standardised format, which is crucial, for example, for climate model evaluation.
5 Applications of the Arctic Metadata Portal
The different observations gathered in the metadata portal will help to improve current knowledge about processes in the Arctic, as well as to improve regional and global climate models through evaluation against observations. In this section, four examples of scientific applications of the portal are presented.
5.1 Land cover mapping from satellite observations
It is known that high-latitude ecosystems play an important role in the global carbon cycle and also in the climate system. Moreover, these ecosystems have experienced rapid environmental change, which shows the need for more accurate land cover observations, both to monitor these changes and to improve the initialisation of current Earth system models. These models require specific land cover classification systems based on Plant Functional Types (PFTs).
The dataset presented here comprises PFT maps for the Organising Carbon and Hydrology In Dynamic Ecosystems (ORCHIDEE) model, the land surface model of the IPSL Earth system model (http://labex.ipsl.fr/orchidee/), produced across Siberia at one kilometre resolution (see Figure 2). A complete description of the ORCHIDEE model was first given by Krinner et al. These PFT maps are based on the land cover product GlobCover Land Cover maps 2005 (European Space Agency initiative) with an updated cross-walking approach to link land cover classes to the 16 PFT classes in ORCHIDEE. Ottlé et al., whose first author is the PI of this dataset (dataset number 22 in the supplementary material), compare multiple land cover datasets over Siberia against one another and with auxiliary data to identify key uncertainties that contribute to variability in PFT classifications and that would introduce errors in Earth system modelling. This dataset highlights the importance of accurate observations to improve current climate models.
5.2 Arctic clouds: models versus observations
Clouds are also an important source of uncertainty when climate models are used to estimate climate sensitivity, since they are the primary modulators of the Earth’s radiation budget. Focussing on the Arctic, Figure 3 shows the Arctic annual mean low-level cloud cover observed by the General Circulation Model (GCM)-Oriented Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) Cloud Product (GOCCP), which was designed to evaluate the cloudiness simulated by GCMs, compared with that simulated by the Coupled Model Intercomparison Project Phase 5 (CMIP5) climate models. The annual mean low-level cloud cover (z < 3.36 km) observed by CALIPSO-GOCCP in the Arctic shows that the atmosphere contains a small but significant amount of low clouds (30% to 45%), with the exception of Greenland and high regions. Above the ocean, the moister atmosphere produces a larger low-level cloud cover (typically from 60% to 80%). The significantly asymmetric distribution of these clouds is linked to the sea surface temperature, with larger cloud coverage above the warmest ocean (Barents and Greenland Seas) and smaller coverage above the cold Beaufort Sea. All CMIP5 models except the Max Planck Institute-Earth System Model (MPI-ESM) reproduce this asymmetry, but they do not reproduce the correct fraction of low cloud cover and show a large inter-model spread, which highlights the difficulty models have in representing clouds.
Dataset (2) (supplementary material) from the LABEX L-IPSL Arctic metadata portal includes global satellite observations of cloud and aerosol vertical profiles by CALIPSO (http://smsc.cnes.fr/CALIPSO/), a Franco-American mission launched by NASA to provide vertical profiles of the atmosphere, useful for learning more about the vertical distribution of the properties of aerosols and thin clouds. There are three instruments on board CALIPSO: the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP), the Imaging Infrared Radiometer (IIR) and a Wide Field Camera (WFC). Of particular interest for the portal, CALIPSO output robustly documents the frequent presence of low-level clouds over the Arctic, which is of great importance for Arctic research. This dataset is also stored at the ICARE data centre (Table 3).
5.3 Arctic aerosols: satellite observations and model output comparison
As mentioned in the previous subsection, aerosol vertical profiles from CALIPSO are also included in the portal (Dataset (2) from the supplementary material). Ancellet et al. showed that the CALIOP lidar Level 1 uncorrected product is a useful tool for mapping the vertical and horizontal distribution of aerosols. Understanding the sources of aerosols in the Arctic is important because, although there are few local pollution sources, there is long-range transport of anthropogenic and biomass burning emissions from lower latitudes, mostly from Europe and Asia. Figure 4 shows the distribution of the CALIPSO aerosol backscatter ratio, defined as the backscattering by particles versus total scattering, during spring 2008 for two altitude ranges.
As well as improving observations of aerosol distribution and its sources, CALIOP observations can help to validate and improve current climate models, since global climate models tend to underestimate aerosol concentrations in the Arctic.
5.4 Arctic sea ice monitoring from in situ observations
The final example of scientific applications of the portal draws on data from the Observing processes impacting the sea ice mass balance from in situ measurements (OPTIMISM) project (Dataset (20) from the supplementary material). This is an ongoing effort launched in 2009, which consists of a network of automated buoys providing real-time measurements of sea-ice thickness and fluxes at the interfaces in the Arctic ocean. There are no publications yet; animations of the buoys’ trajectories and preliminary observations are displayed on the project website (http://optimism.locean-ipsl.upmc.fr/). These include thermal profiles across the air/ice/ocean interface and ice thickness measurements.
6 Concluding Remarks
The LABEX L-IPSL Arctic metadata portal presented here improves the visibility of the different observations carried out within the IPSL and links with other institutes as well as new activities related to the French Chantier arctique. It will facilitate the use of the observations for the evaluation of theoretical models, especially the global IPSL climate model and regional models focussed on the Arctic. In the future the datasets will be updated directly by the researchers involved. It will also be possible to include new datasets with a specific interactive tool, which will include an auto-completion tool to facilitate the task. The search catalogue tool (http://climserv.ipsl.polytechnique.fr/arcticdatadb/Datasets/search), which is currently under development, will allow the public to search by the categories shown in Table 2 as well as by the keywords listed in each dataset. There is scope to add climate model outputs and climatological data to the existing datasets, as well as the potential to expand to other national and international projects. Currently no global initiative (e.g. GCMD) harvests the metadata; however, as part of the French Chantier arctique there is scope to achieve this in the future.
7 Competing Interest
The authors declare that they have no competing interests.