Progress in Activities of WDS-China Data Centers

The World Data System (WDS) plays an important role in promoting global scientific data management, exchange, and sharing. There are 8 WDS data centers in mainland China, covering astronomy, space science, global change, renewable resources and environment, cold and arid regions, microbiology, geophysics, and the ocean. This paper summarizes the current status of the WDS China data centers, along with their major progress in recent years. This progress includes a clearinghouse for metadata exchange, research data archiving, historical data saving, international data exchange, data publishing models, CoreTrustSeal certification, open repositories for the scientific community, science popularization services, international training workshops, and awards. We conclude with a discussion of the opportunities and challenges faced by WDS China data centers.


Introduction
In this information age, scientific data is both an integral part of scientific and technological infrastructure and a national strategic resource. Scientific data centers are among the most important vehicles for scientific data management (Wang et al. 2017). The World Data System (WDS) is an Interdisciplinary Body of the International Science Council (ISC; formerly ICSU). The WDS builds on the 50+ year legacy of the World Data Centres and the Federation of Astronomical and Geophysical data analysis Services, established by ICSU to manage data generated by the International Geophysical Year (1957-1958). In 1988, the former ICSU Panel on World Data Centers established the following 9 WDCs in China: WDC for Astronomy, Beijing; WDC for Space Science, Beijing; WDC for Geophysics, Beijing; WDC for Meteorology, Beijing; WDC for Oceanography, Tianjin; WDC for Glaciology and Geocryology, Lanzhou; WDC for Seismology, Beijing; WDC for Geology, Beijing; and WDC for Renewable Resources and Environment. These centers were reviewed successfully by the WDC Panel team in 2005. WDS China has 8 data centers in the mainland at present. This paper summarizes the activities and progress of these centers in recent years.

Status and Activities
WDS China Data Centers are organized around the framework of the earth system and solar-terrestrial space, as shown in Figure 1. Outer space is covered by the WDS China Astronomical Data Center (CAsDC) in Beijing, while solar-terrestrial space is covered by the WDS China Space Science Data Center (CSSDC) in Beijing. For the earth system, there are data centers across the sub-disciplines of atmosphere, biosphere, lithosphere, hydrosphere (cryosphere), and anthroposphere. The CAsDC is supported by the National Astronomical Observatories, CAS (http://explore.china-vo.org/). Data received from the Guo Shoujing Telescope (Large Sky Area Multi-Object Fiber Spectroscopic Telescope) is archived and released by the CAsDC. As a distribution platform, the CAsDC provides global services and connects all the astronomical observatories in China with the Alibaba global cloud facility. It has been a member of the International Virtual Observatory Alliance since 2002.
The CSSDC is supported by the National Space Science Center, CAS (http://www.cssdc.ac.cn/). It integrates and optimizes space science data, focusing on complete, systematic, and standardized data management. Its aims are to ensure the permanent security and long-term availability of space science data, to improve the level and efficiency of data application, and to exchange and share space science data internationally.
The GCdataPR is supported by the Institute of Geographic Sciences and Natural Resources Research, CAS and the Geographical Society of China (http://www.geodoi.ac.cn). It establishes a set of mechanisms and management methods for data publishing, preservation and sharing in the field of global change scientific research. GCdataPR advocates an innovative data sharing approach that integrates metadata, data products, and data papers.
The WDC-RRE is supported by the Institute of Geographic Sciences and Natural Resources Research, CAS (http://eng.wdc.cn/). This repository's principal databases cover basic geography, natural resources, population and social economy, disaster risk reduction, land use and land cover, Loess Plateau agriculture and environment, temperature products, and special regional or thematic topics. It is also one of the subcenters of the National Earth System Science Data Sharing Infrastructure in China.
The WDS for Cold Dry Area Science Data Center is supported by the Cold and Arid Regions Environmental and Engineering Research Institute, CAS (http://card.westgis.ac.cn). Its holdings mainly comprise scientific data on the cryosphere (glaciers, snow, frozen soil), deserts and desertification, and continental river basins in arid regions, together with critical scientific data sets on land surface processes in cold and arid regions. It advocates publishing scientific data with unique digital object identifiers.
The WDCM is supported by the Institute of Microbiology, CAS (http://www.wdcm.org) and was set up as a data center of the World Federation for Culture Collections (WFCC). It is a vehicle for networking microbial resource centers holding various types of microbes. The WDCM is constructing a data management system and a global catalog to help organize, discover, and explore the data resources of its member collections (Wu et al. 2013).
The WDS for Geophysical Scientific Data Center (Beijing) is supported by the Institute of Geology and Geophysics, CAS (http://www.geophys.ac.cn). The basic tasks of this data center are to collect, handle, and store scientific data and to provide access to it for scientific research. It obtains global space environment and solid earth observation data through its own observations in China and through participation in international joint observation and data exchange projects.
The WDS for Ocean Data Center (Tianjin) is supported by the China Oceanic Information Network (http://www.cmoc-china.cn). It is responsible for managing national marine data and information resources and for providing guidance on their scientific management. It also provides information and technical support for the marine economy, sustainable marine development, marine environmental protection, and public services, and carries out related research.

Clearinghouse for metadata exchange
A preliminary framework for the WDS China Common Clearinghouse was put forward, and a prototype system was built using pycsw, a Python implementation of the OGC Catalogue Service for the Web (CSW) standard. The pycsw technical framework was used to establish the metadata management systems and to keep their metadata standards compatible with other international and national standards. A metadata capture module based on data harvesting was built so that all metadata can be accessed across the data centers in China. The initial progress can be seen in the exchange system on the WDS China website (http://www.wds-china.org/). In 2019, this work became part of the activities of the WDS Harvestable Metadata Working Group.
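
The harvesting step described above can be illustrated with a minimal sketch. It assumes a CSW 2.0.2 endpoint that accepts key-value-pair GetRecords requests and returns Dublin Core summary records, which pycsw supports. The endpoint URL, the sample response, the record identifiers, and the function names are hypothetical and are not taken from the WDS China prototype; the network call is passed in as a `fetch` function so that the paging logic can be followed without a live server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces defined by CSW 2.0.2 and Dublin Core.
NS = {
    "csw": "http://www.opengis.net/cat/csw/2.0.2",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical response page, used only to illustrate parsing.
SAMPLE_PAGE = (
    '<csw:GetRecordsResponse xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<csw:SearchResults numberOfRecordsMatched="2" '
    'numberOfRecordsReturned="2" nextRecord="0">'
    "<csw:SummaryRecord><dc:identifier>demo-001</dc:identifier>"
    "<dc:title>Demo glacier inventory</dc:title></csw:SummaryRecord>"
    "<csw:SummaryRecord><dc:identifier>demo-002</dc:identifier>"
    "<dc:title>Demo ionosphere parameters</dc:title></csw:SummaryRecord>"
    "</csw:SearchResults></csw:GetRecordsResponse>"
)


def build_getrecords_url(endpoint, start=1, page_size=10):
    """Build a CSW 2.0.2 GetRecords request in key-value-pair form."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "elementSetName": "summary",
        "resultType": "results",
        "startPosition": start,
        "maxRecords": page_size,
    }
    return endpoint + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (records, next_record) for one GetRecords response page."""
    root = ET.fromstring(xml_text)
    results = root.find("csw:SearchResults", NS)
    records = [
        {
            "identifier": rec.findtext("dc:identifier", default="", namespaces=NS),
            "title": rec.findtext("dc:title", default="", namespaces=NS),
        }
        for rec in results.findall("csw:SummaryRecord", NS)
    ]
    # In CSW 2.0.2, nextRecord="0" signals that no further pages remain.
    return records, int(results.get("nextRecord", "0"))


def harvest(endpoint, fetch, page_size=10):
    """Page through a catalogue; `fetch` maps a URL to response text."""
    records, start = [], 1
    while start > 0:
        page, start = parse_page(fetch(build_getrecords_url(endpoint, start, page_size)))
        if not page:  # guard against servers that never report nextRecord="0"
            break
        records.extend(page)
    return records


if __name__ == "__main__":
    # The lambda stands in for an HTTP GET against a hypothetical endpoint.
    for record in harvest("https://example.org/csw", lambda url: SAMPLE_PAGE):
        print(record["identifier"], record["title"])
```

In a real deployment, `fetch` would issue an HTTP request to each data center's catalogue, and the harvested records would be loaded into the central clearinghouse catalogue.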

Research data archiving
The CSSDC has progressively archived and released data from major projects, such as the Space-based multi-band astronomical Variable Objects Monitor (SVOM), the Strategic Priority Program on Space Science of CAS, the International Space Weather Meridian Circle Program, and others. The WDS for Cold Dry Area Science Data Center has made rapid advances in archiving data from major science and technology projects in China. Datasets of the Heihe Watershed Allied Telemetry Experimental Research were archived and released after a series of careful quality control procedures covering sensor calibration, data collection, data processing, and dataset generation.

International cooperation data exchange
The WDCM is establishing a Global Catalog of Microorganisms (GCM), which is expected to be a robust, reliable, and user-friendly system to help culture collections manage, disseminate, and share information related to their holdings. The GCM includes information on strains, taxonomy, isolation, applications, papers, patents, sequences, and proteins. To date, the GCM covers 447,695 strains from 118 institutes in 48 countries. The WDS for Ocean Data Center has established formal marine data exchange relationships with over 130 marine institutions in more than 60 countries and maintains close data exchange relationships with over 30 major national oceanographic data centers. The data center is involved in global collaboration projects that include the Global Ocean Observing System (GOOS), the Joint WMO/IOC Technical Commission for Oceanography and Marine Meteorology (JCOMM), the Global Sea Level Observing System (GLOSS), the Global Temperature and Salinity Profile Programme (GTSPP), the Array for Real-time Geostrophic Oceanography (ARGO), the North Pacific Marine Science Organization (PICES), and more. It also holds records for 485 tide prediction sites around the world. The CSSDC cooperated with the European Space Agency and the French Space Agency on the Solar wind Magnetosphere Ionosphere Link Explorer (SMILE) and SVOM missions. During the past two years, the CSSDC was in charge of constructing the joint observational network of China and Brazil supported by CAS.

Historical data saving
Historical ionospheric data collected by the WDS for Geophysical Scientific Data Center (Beijing) comes from decades of manually created records, including around 50 years of photographic film records and about 20 years of digital records. The ionospheric characteristic parameter database covers continuous observations spanning more than one solar activity cycle (11 years) in and around China, most notably 70 years of continuous observations in Wuhan. This is the longest record of ionospheric characteristic parameter observations in China.

Data publishing model
GCdataPR is becoming known as a new model for demonstration data centers worldwide (Ma et al. 2019). It advocates an innovative data sharing approach that integrates metadata, data products, and data papers. Statistics on the data sets it has published since June 2014 are shown in Table 1. It has become the open repository for 59 academic journals published by Chinese and American institutes.

Open repository
Since August 1, 2019, the American Geophysical Union (AGU) has required all of its academic journals to publish the original data when a paper is published. AGU announced 203 data repositories around the world that it recognizes (listed on the Repository Finder website), including several WDS China data centers such as GCdataPR, WDC-RRE, the Virtual Space Science Observatory, and the WDC for Geophysics.

CoreTrustSeal certification
CoreTrustSeal (https://www.coretrustseal.org/) is a certification system newly released by WDS and the Data Seal of Approval (DSA), based on three principal groups of criteria (organizational infrastructure, data management, and technical capability). The CAsDC and WDC-RRE in China achieved the certification at the end of 2018 and the beginning of 2019 (Wang et al. 2019). They are the first two centers in Asia to obtain CoreTrustSeal certification, and they presented their experiences in achieving it at the WDS Asia-Oceania Conference in Beijing in 2019.

International training
On August 10, 2015, the WDC-RRE hosted an international training workshop in Beijing on resource and environmental data sharing in Northeast Asia and Central Asia. Since then, the WDC-RRE has hosted training workshops for developing countries annually. More than 100 young data scientists have been trained in these workshops. The WDCM has also provided training courses on microbial data analysis every year. Meanwhile, WDS China has supported symposiums and conferences for the Asia-Oceania region, e.g., the WDS-AO conference in 2019.

Science popularization service
CAsDC has been a member of the International Virtual Observatory Alliance (IVOA) since 2002. The World Wide Telescope (WWT) is a visualization environment that aggregates scientific data from major telescopes, observatories, and institutions around the world. Since 2008, CAsDC/China-VO has been promoting its application in education and science popularization in China. Hundreds of students took part in the WWT Tour Contest and created nearly 300 tours about the universe and astronomy. In February 2018, the CAsDC/China-VO team released the first Chinese version of WWT and an online resource sharing platform.

Awards
GCdataPR was awarded the 2018 WSIS Prize (champion of the electronic science group) by the United Nations in March 2018, the honorary title of "leading scientific and technological achievement award: shortlisted excellent project" by the China International Big Data Expo in May 2018, and the "innovation project" award at the eighth China Digital Publishing Expo in July 2018. Linhuan Wu, a principal data scientist working at the