THE UNESCO-IOC-IODE “GLOBAL OCEANOGRAPHIC DATA ARCHAEOLOGY AND RESCUE” (GODAR) PROJECT AND “WORLD OCEAN DATABASE” PROJECTS

We document the history and progress of two international ocean data management projects. The “Global Oceanographic Data Archaeology and Rescue” project was initiated in 1993 under the auspices of the UNESCO Intergovernmental Oceanographic Commission (IOC). The project has the goal of locating (archaeology) and digitizing or copying to modern electronic media (rescuing) historical (pre-1992) oceanographic data that exist in manuscript or electronic media form that are at risk of loss due to media decay. The IOC “World Ocean Database” project initiated in 2001 focuses on encouraging international data exchange for the post-1991 period and the development of regional atlases.


INTRODUCTION
To determine the role of the world ocean as part of earth's climate system and to develop climate system forecast and assessment capability for periods from months to decades, the international scientific community needs the most complete databases of historical oceanographic data possible. Many historical oceanographic vertical profile and plankton data sets are not available to the international community because they exist only in manuscript form. They are at risk of loss to media decay and damage. There are also data in electronic form that are not generally available and are also at risk of loss due to media degradation. This article describes a project initiated in 1993 to locate such data sets and to incorporate them into a global, comprehensive, integrated, scientifically quality-controlled database with all data in one uniform format. The "Global Oceanographic Data Archaeology and Rescue" (GODAR) project was initiated in 1993 under the auspices of the UNESCO Intergovernmental Oceanographic Commission (IOC). The project has the goal of locating (archaeology) and digitizing or copying to modern electronic media (rescuing) historical (pre-1992) oceanographic data that exist in manuscript or electronic media form and that are at risk of loss due to media decay or neglect. This project has resulted in a tripling of the availability of some ocean profile data types for the pre-1992 period. The IOC World Ocean Database project (initiated in 2001) focuses on enhancing the incorporation of modern data into a global ocean profile-plankton database and the development of regional climatologies. Together, these two projects support the development of the most comprehensive global ocean profile-plankton databases possible. Such a database is crucial for enabling scientific progress in determining the role of the ocean as part of earth's climate system. These databases, and scientific products based on these databases, represent the infrastructure on which much ocean and climate research and assessments are now based. Specifically:
a) Objective analyses of the data in these databases provide gridded climatologies that are used as initial and boundary conditions for ocean climate simulations and to verify simulations of the climate system;
b) The data are used to prepare diagnostic studies, particularly for identification of interannual-to-decadal ocean variability (e.g., Levitus et al., 2000);
c) More recently these data have been used as the input for ocean data assimilation efforts (Carton & Giese, 2008);
d) The international scientific community advises national and international bodies on such issues as climate change, e.g. the Intergovernmental Panel on Climate Change (IPCC). Hence, the international oceanographic and climate communities should have access to the most complete electronic oceanographic databases possible. Regardless of one's views about the origins of observed changes of the earth's climate system (anthropogenic, internal, or natural), the scientific community needs the best scientific databases possible to perform scientific research on this topic (Solomon et al., 2007);
e) Substantial resources have been, and continue to be, allocated for national and international ocean and climate programs such as Tropical Ocean and Global Atmosphere (TOGA), World Ocean Circulation Experiment (WOCE), Global Ocean Ecosystems Dynamics (GLOBEC), Joint Global Ocean Flux Study (JGOFS), Climate and Global Change, Climate Variability and Prediction (CLIVAR) and for the establishment of a Global Ocean Observing System (GOOS). Planners of such programs should have access to all historical oceanographic data in order to optimize measurement strategies for these programs. Scientists analyzing data from such programs need historical data in order to study interannual-to-decadal variability. Operational forecast centers need historical data in order to perform quality control of synoptic data;
f) The data are needed to understand fisheries variability and to manage fisheries and other marine resources.
This article is an update of earlier project reports, Levitus et al. (1994) and Levitus et al. (2005).

A BRIEF HISTORY OF OCEAN DATA MANAGEMENT
The modern science of physical oceanography might be considered to have started around 1902 with the formation of the International Council for the Exploration of the Sea (ICES). ICES started as a consortium of Western European countries with an interest in understanding the fluctuation of fisheries of the Northeast Atlantic. The titration technique used to determine salinity was developed around this time, as was the equation of state for seawater. ICES coordinated and planned the work of oceanographic data collection. Less recognized, in my view, is that ICES formally encouraged the international exchange of oceanographic data. This was done through establishment of a formal data publication series (Smed, 1968; Went, 1972). As time progressed, oceanographic institutions established data archives. For example, in the United States the Woods Hole Oceanographic Institution and the Scripps Institution of Oceanography became archive centers for Mechanical Bathythermograph Temperature (MBT) profile data beginning around World War II, when the MBT instrument was first developed. Data were stored in the form of handwritten cards containing both data and metadata. ICES began storing data on punched cards in 1957 using nearly the same format as the U.S. Navy Hydrographic Office (Smed, 1968).
National support for the development of oceanographic sciences grew in many countries after 1900, which led to the development of catalogues of oceanographic data. In the United States, the National Academy of Sciences supported development of a catalogue, international in scope, of historical oceanographic data and of oceanographic institutions (Vaughn et al., 1937) in order to encourage the development of the science of oceanography.
Establishment of the World Data Center (WDC) system (Allen, 1988; Rishbeth, 1991; Ruttenberg & Rishbeth, 1994) occurred during the International Geophysical Year (IGY) in 1957-58 in order that scientific data gathered as part of the IGY be safely archived and accessible internationally without restriction. The World Data Center (WDC) System (Secretariat of the ICSU Panel on World Data Centres, 1996) operated under the auspices of the International Council of Scientific Unions, a non-governmental organization of scientific unions. ICSU is now known as the International Council for Science. The World Data Center system has now been incorporated into the ICSU World Data Service (http://www.icsu-wds.org/).
During the 1960s countries began establishing formal national oceanographic data centers to archive ocean data and provide services. Some of these centers were organized internationally under the auspices of the UNESCO/IOC (Roll, 1979), which was founded in 1961 to encourage the international exchange of oceanographic data and to support capacity transfer from developed to less-developed nations. Glover et al. (2010) present a history of IOC contributions to ocean data exchange. As indicated by their names, the IOC is an intergovernmental organization, whereas ICSU, CODATA, and the WDCs are nongovernmental.
In 1993 the World Data Center for Oceanography-Silver Spring (then known as WDC-A for Oceanography and hereafter referred to as WDC-Silver Spring or just WDC) began, under the auspices of the IOC, the Global Oceanographic Data Archaeology and Rescue (GODAR) project. In 2001 the IOC initiated a World Ocean Database (WOD) project, which is also led by WDC-Silver Spring. In the remainder of this paper we document the history and progress of these and related projects.

HISTORY OF THE GODAR PROJECT
Based on the knowledge of oceanographic scientists and data managers from the international community, it was clear in the late 1980s that substantial amounts of historical oceanographic data existed only in manuscript or analog form or on obsolete electronic media. In addition, there were data in electronic form that were not available to scientists other than the principal investigator who made the original measurements. Such data were, and still are, at risk of being lost due to:
a) Media degradation, such as fading ink or weakening magnetic fields;
b) Obsolescence of devices to read such data from old media;
c) Environmental catastrophes such as fires and floods;
d) The retirement of individuals who know how to access these data or know the metadata associated with these data that make the data usable to other scientists;
e) Simple neglect.
The idea of digitizing historical oceanographic data from manuscripts did not originate with the GODAR project. Several such efforts began with the advent of electronic computers. For example, the U.S. National Oceanographic Data Center (NODC) digitized manuscript data after its establishment in 1961. What the GODAR project represents is the establishment of a formal, internationally organized effort to support such activities and to make available all data in a single, integrated database. The existence of a formal international project, such as GODAR, sponsored by a recognized intergovernmental body such as the IOC, not only provides coordination, which helps avoid duplication of effort, but also can help in attracting national funding for participating countries.
In September, 1988, a workshop sponsored by the NOAA NODC and the NOAA Environmental Research Laboratories was held at NODC in Washington, D.C. Scientists and data managers from several countries held discussions on ocean data archiving and distribution. Following a suggestion by Sydney Levitus (Anonymous, 1988), one of the meeting recommendations was the establishment of a "Historical Data Validation Project" to "recover as much historical data as possible". A second meeting was held at NODC in September, 1990. This meeting was supported by the U.S. Climate and Global Change (CGC) Program. Specifically, a report to the NOAA Panel on Climate and Global Change by the CGC Working Group on Data Management (chaired by Francis Bretherton, University of Wisconsin) had emphasized the importance of "data archaeology" for Global Change Research. In that report "Data Archaeology" was defined as "the reconstruction of past climate and other aspects of global change from existing data". It involves a mix of seeking out, restoring, evaluating, correcting, and interpreting past data sets.
The word "Rescue" in this context refers to the effort to save data at risk of being lost to the science community by digitizing data along with accompanying metadata, incorporating these data into internationally available databases, and distributing these databases without restriction. The Panel Report recognized that researchers trying to study long-term ocean variability would have to wait decades for enough data to accumulate from new ocean observing systems to study decadal variability. At the 1990 meeting held at NODC, scientists and data managers from several countries and international centers, including the Soviet Union, Republic of Korea, Japan, Chile, Australia, United States, and the International Council for the Exploration of the Sea (ICES), met to discuss the state of historical oceanographic data and in particular to discuss the loss of data due to media degradation. The meeting made a rough estimate that "approximately 50% of all the temperature profile data ever taken is not in the world's data centers". The results of the meeting led to the establishment of various national and international projects known generically as "Oceanographic Data Archaeology and Rescue" projects.
An international meeting known as the "Workshop on Ocean Climate Data" was hosted by NASA and NOAA at Greenbelt, Maryland, U.S.A. (Churgin, 1992). The meeting was sponsored by the Commission of the European Communities (CEC), International Council of Scientific Unions (ICSU), World Meteorological Organization (WMO), International Council for the Exploration of the Sea (ICES), and the Intergovernmental Oceanographic Commission (IOC). As a result of the demonstrated progress of various national data archaeology and rescue projects, the workshop recommended that these projects be expanded and brought together under the umbrella of an existing international organization.
As a result of the "Workshop on Ocean Climate Data", a proposal for a "Global Oceanographic Data Archaeology and Rescue" (GODAR) project was submitted by Sydney Levitus to the Fourteenth Session of the IOC International Oceanographic Data and Information Exchange Committee (IODE) meeting held in Paris, France, during December, 1992. The IODE recommended to the IOC that this project be adopted as an IOC project. During the March, 1993 IOC Assembly meeting, the IOC adopted the proposal for a GODAR project. Sydney Levitus was invited to be Project Director, an invitation which he accepted.

WHY ARE OCEAN DATA MANAGEMENT PROJECTS NEEDED?
As this article documents, many oceanographic data were gathered before the advent of electronic computers. Earlier projects to digitize these data were frequently incomplete. Also, oceanographic data have often been gathered for one specific purpose and then ignored. Recognition of the importance of the role of the ocean as part of earth's climate system has resulted in a demand for historical ocean profile and plankton databases.
Even modern oceanographic measurements from a particular cruise or set of cruises may not be managed in what we consider to be an optimal fashion. What we believe is needed to facilitate oceanographic and climate research is a global, comprehensive, integrated, scientifically quality-controlled ocean profile-plankton database with all data in one uniform format. Scientific data measured at great cost to societies need to be available in perpetuity for a variety of scientific studies including global change, global warming, and fisheries research studies, among others.

GOALS OF THE GODAR PROJECT
The GODAR project emphasizes:
a) Digitization of data now known to exist only in manuscript and/or analog form. This effort has the highest priority of all activities;
b) Rescue of electronic data that are at risk of being lost due to media decay or neglect;
c) Ensuring that all oceanographic data available for international exchange are archived at two or more international data centers in electronic form;
d) Preparation of catalogues (inventories) of: i) data now available only in manuscript form, ii) data now available only in analog form, iii) digital data not presently available to the international scientific community;
e) Performing quality control on all data and making all data accessible via modern electronic media, which now include the Internet, CD-ROMs, and DVDs.

IMPLEMENTATION
From the inception of national and international ocean data archaeology and rescue projects at various centers, efforts were coordinated to avoid duplication of effort and to maximize the use of scarce resources. Joint activities include the exchange of data, data distribution plots, catalogue information about data holdings, and the exchange of scientists and data managers between centers. Rescue and exchange of data were pursued simultaneously for two reasons: a) Some data are at risk of being lost forever if not saved immediately; b) In order to establish credibility, the project needed to demonstrate how quickly it could act to make previously unavailable data accessible in electronic format.
Perhaps the most valuable technique to quickly describe data holdings is to produce data distribution plots and tables of the number of profiles on a year-by-year basis for each major measurement type. Levitus and Gelfeld (1992) did this for each of the major NODC digital archives. This work showed the distributions of NODC holdings for all countries combined. The GODAR project prepared similar summaries on a country-by-country basis and distributed these electronically to data centers, scientists, and institutions in many countries. These summaries generated much interest, which resulted in the exchange of more information and data.
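The year-by-year tables of profile counts described above can be illustrated with a minimal Python sketch. The record layout (a dictionary with hypothetical "year" and "type" keys) and the function names are assumptions made for illustration; they are not the format or software used by NODC or the GODAR project.

```python
from collections import Counter

def profile_counts(profiles):
    """Count profiles per (year, measurement type).

    Each profile is assumed to be a dict with hypothetical keys
    "year" (int) and "type" (str, e.g. "MBT" or "OSD").
    """
    counts = Counter()
    for p in profiles:
        counts[(p["year"], p["type"])] += 1
    return counts

def format_table(counts):
    """Render the counts as a plain-text table, one row per year."""
    years = sorted({year for year, _ in counts})
    types = sorted({kind for _, kind in counts})
    lines = ["year" + "".join(f"{kind:>8}" for kind in types)]
    for year in years:
        row = "".join(f"{counts.get((year, kind), 0):>8}" for kind in types)
        lines.append(f"{year:<4}" + row)
    return "\n".join(lines)
```

Grouping the same records by an additional country field would give the country-by-country summaries mentioned in the text.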
Physical, chemical, and plankton oceanographic data, as well as ancillary surface marine meteorological observations, are the specific types of data that the GODAR project focuses on. Initially, most data digitized, or otherwise rescued, have been physical parameters. The rescue of sea level data from tide gauges has also become part of the GODAR project (Caldwell, 2003).
A series of six regional meetings was held during the first several years of the GODAR project to survey the oceanographic data held internationally in both manuscript and electronic form. The first regional GODAR workshop was held in Obninsk, Russia, during May 1993. This meeting focused on datasets and activities in eastern and northern Europe. This region was chosen in particular because of the possibility of the loss of substantial data sets due to economic conditions in Eastern Europe. The report of this first regional GODAR workshop (IOC, 1993) gives some indication of the amounts of data that exist in manuscript form. For example, the Russian delegation reported the existence of data for approximately 450,000 Mechanical Bathythermograph (MBT) profiles and 800,000 Oceanographic Station (OSD) casts in manuscript form. Reports have been produced for each subsequent regional workshop (IOC, 1994a; IOC, 1994b; IOC, 1995; IOC, 1997; IOC, 1999a) describing results of the workshop and in particular describing the amount of data held in manuscript and electronic form in each participating member state. The first phase of the GODAR Project culminated in an International Review Conference held in Silver Spring, Maryland, on July 10-13, 1999. Seventy-five scientists and data center managers from twenty-five countries attended. The first phase of GODAR was deemed a success (IOC, 2003), and the workshop recommended the expansion of the project to formally include sea level data and geophysical data.
In 1999 a workshop (IOC, 1999b) sponsored by the IOC International Oceanographic Data and Information Exchange Committee (IODE) recommended formation of a special project to act as a focus for GODAR for countries bordering the western Pacific under the title "GODAR/WESTPAC", with the project office being located at the Japan Oceanographic Data Center (JODC). Table 1 shows a list of all GODAR Workshops and the reports generated.
In 1994, the European Union supported two pilot projects for developing concerted data management activities in the Mediterranean and Black Seas. These projects focused on the rescue and analysis of temperature and salinity profiles. These projects successively released the Mediterranean Ocean Data Base, MODB (The MODB Group, 1996; Brasseur et al., 1996), and MEDATLAS (MEDATLAS Consortium, 1997; Fichaut et al., 1999). The latter database included the MODB data set with further quality checks of the observations. A new concerted action, MEDAR/MEDATLAS II, started in 1998 for a 3-year period funded by the European Union's MAST program and was endorsed by the IOC as a GODAR project. This project aimed to develop regional sustainable data management capacity for the Mediterranean and Black Sea scientific and operational programs through data and information exchange, job training, and workshops. The database and products resulting from the cooperation of 25 participants have been published on CD-ROMs (The MEDAR Group, 2002a, b). The long-term archiving of these databases is assured by several national data management systems. The released set of observations doubled the volume of available data compared to the previous project. It includes profiles for 12 bio-chemical variables in addition to temperature and salinity profiles, all fully checked for quality both automatically and visually, according to a common protocol based on international standards (MEDAR Group, 2001). It is notable that several countries that previously did not participate in the GODAR project were involved and that data not previously available were released for public use. Gridded fields and maps were produced using objective analysis methodology (Rixen et al., 2001). These results have been presented in many conferences and workshops.
For a dataset to be processed as part of the GODAR project, it must be accompanied by latitude, longitude, and the date of observation for each profile. Initially, mechanical bathythermograph data were sought out, but because so much more accurate (and frequently deeper-measuring) Ocean Station Data (OSD) and High-resolution Conductivity-Temperature-Depth (HCTD) data were being found, the GODAR project shifted its focus to these two data types. One priority for inclusion of data in the GODAR project was acquisition of data from relatively data-void regions. As much metadata as possible were acquired along with the data so that the data would be as useful as possible for present and future generations of scientists and decision makers.
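The minimum metadata requirement stated above (latitude, longitude, and date of observation for each profile) can be expressed as a simple acceptance check. The following Python sketch is illustrative only: the field names, the dictionary layout, and the longitude convention of -180 to 180 degrees are assumptions, not the project's actual ingest rules.

```python
from datetime import date

def missing_required_metadata(profile):
    """Return the names of required fields that are absent or implausible.

    `profile` is assumed to be a dict with hypothetical keys "latitude",
    "longitude" (degrees, -180..180 assumed) and "date" (datetime.date).
    An empty list means the profile meets the minimum requirement.
    """
    problems = []
    lat = profile.get("latitude")
    lon = profile.get("longitude")
    obs = profile.get("date")
    if not isinstance(lat, (int, float)) or not -90 <= lat <= 90:
        problems.append("latitude")
    if not isinstance(lon, (int, float)) or not -180 <= lon <= 180:
        problems.append("longitude")
    if not isinstance(obs, date):
        problems.append("date")
    return problems
```

A profile that fails this check would need manual follow-up with the data originator before it could be processed further.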
Data acquired as part of the projects described in this article were subjected to quality control as described by Boyer and Levitus (1994) and Conkright et al. (1994). Some specific details described in these reports have been updated since they were first published. For example, the results of statistical checks can change as additional data are acquired, in particular for data-sparse regions in space and/or time. It should be recognized that quality control procedures can be expected to improve with increasing amounts of data gathered with the passage of time and with improved instrumentation and data coverage. For example, the establishment of the Argo profiling float project has greatly improved data coverage in many parts of the world ocean (Roemmich et al., 2009). The global historical database of temperature data has increased by approximately 3.4 million profiles as a result of GODAR, and the GODAR workshops have identified data from many more profiles that are in manuscript form or are not part of any publicly available database. Thus it is clear that the international scientific community now has access to a much more comprehensive ocean profile database than previously thought possible. There will be additional historical data added in the future. Figure 5 is an example of one of the data sets acquired as part of the GODAR project. It shows a submission by Japan of data from approximately 270,000 ocean station data profiles. These profiles represent data taken by the Japanese Fisheries Agency and Japanese Prefectural Fisheries Experimental Stations.

THE WORLD OCEAN DATABASE (WOD) PROJECT
In 2001 the IOC initiated a World Ocean Database project with Sydney Levitus appointed as project leader. The purposes of this project are to encourage the exchange of modern ocean profile-plankton data, develop regional and global databases and atlases, and to develop quality-control procedures for ocean profile-plankton data.
Modern data (post-1990) have been acquired as a result of this project, and data from new instrument types have been added to the data available as part of the World Ocean Database series (Boyer et al., 2009). Plankton data as well as data from moored buoys, drifting buoys, profiling floats, undulating ocean recorders (e.g., towed CTDs), and instrumented marine mammals (Boehlert et al., 2001) are included in the World Ocean Database 2009. This includes many upper ocean temperature profiles that will help determine interannual-to-decadal variability of ocean thermal structure (Levitus et al., 2000; 2009). For example, Lysne and Deser (2002) compare the variability of the upper ocean thermal structure from an ocean general circulation model simulation of the Pacific Ocean with analyses of historical upper ocean thermal data. Figure 6 shows a time history of the number of ocean temperature profiles available from NODC/WDC and includes the results of all projects described in this article. As the size of the integrated global and regional databases increases, it becomes feasible to begin characterizing the frequency distributions of ocean variables, which is important for development of statistical quality control procedures. The state of quality control of historical oceanographic and synoptic ocean data needs to be improved. Only a small amount of work has been published on this subject, notably the works by Oguma and Nagata (2002), Levitus and Sychev (2002), and Oguma et al. (2003).
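To illustrate why statistical quality control depends on the amount of data available, the following Python sketch applies a generic standard-deviation outlier check to a set of values (for example, temperatures at one depth in one region). This is a deliberately simple, assumed technique chosen for illustration; it is not the procedure of Boyer and Levitus (1994) or of any other work cited here, and the function name and the three-sigma threshold are hypothetical.

```python
from statistics import mean, stdev

def flag_outliers(values, n_sigma=3.0):
    """Flag values lying more than n_sigma sample standard deviations
    from the sample mean. Returns one boolean per input value.

    With fewer than three values, or zero spread, nothing is flagged
    because the statistics are not meaningful.
    """
    if len(values) < 3:
        return [False] * len(values)
    center = mean(values)
    spread = stdev(values)
    if spread == 0:
        return [False] * len(values)
    return [abs(v - center) > n_sigma * spread for v in values]
```

In a sparse sample a single extreme value inflates the standard deviation enough to escape the check, whereas the same value is flagged once more observations are available. This mirrors the point that the outcome of statistical checks can change as databases grow.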

BIOLOGICAL DATA
NODC/WDC has received many requests for ocean biological data during the past decade. These requests are to support specific missions such as missions to estimate sea surface chlorophyll from space-based platforms (e.g., SeaWiFS), determination of ocean biogeochemical cycles, and studies of marine biodiversity. Prior to 1998, NODC integrated databases included neither chlorophyll data nor plankton data. Data for these variables are now included in the World Ocean Database series. Approximately 132,000 OSD casts now contain chlorophyll profiles, and there are 44,000 surface observations of chlorophyll from the French Ship-of-Opportunity Program, SURTROPAC (Dandonneau, 1992). More of these data are available and need to be added to the WOD. The geographical distributions of these data are shown in Figure 8. The sea surface chlorophyll data have been used as internal boundary conditions for the analysis of surface chlorophyll estimates from the NASA Coastal Zone Color Scanner and SeaWiFS projects (Gregg & Conkright, 2002). Their results indicate that there are statistically significant gyre and basin scale changes in the distribution of plankton in the world ocean. WOD09 contains over 106,000 plankton biomass observations and over 700,000 taxonomic observations. As work continues to expand the plankton database, attention is focused on metadata requirements, quality assurance, and methods for incorporating data from different sampling techniques. In co-operation with numerous scientists and international groups (NOAA, 1997), NODC/WDC has identified key metadata requirements necessary for usefulness of plankton data and is developing quality control and analytical techniques for these data.

PROBLEMS IN BUILDING A GLOBAL OCEAN PROFILE-PLANKTON DATABASE
The World Ocean Database (WOD) is a global, comprehensive, integrated, scientifically quality-controlled ocean profile-plankton database with all data in one uniform format. WOD is produced by NODC/WDC-Silver Spring.
The WOD has replaced earlier NODC profile databases. The latest version of WOD is known as World Ocean Database 2009 (WOD09) and comprises data from 94 countries, representing 760 institutes, gathered by 7,418 platforms (ships, buoys, etc.) during 186,000 cruises. For this reason we characterize WOD as a "heterogeneous" database. Although exact definitions are not possible, we characterize a "homogeneous" database as being composed of data from relatively few sensors, platforms, etc., for example, data from several satellites measuring the same variables.
Many problems are encountered in developing a database such as the WOD. These include: a) Incorrect data or metadata submitted; b) Lack of critical metadata; c) Improperly formatted data submitted; d) Data submitted in many different formats.
It would not be an exaggeration to state that nearly every data set we receive has some type of problem associated with it. This requires manual intervention during processing, and so the development of integrated profile-plankton databases is relatively labor intensive. The ocean profile-plankton databases we deal with are relatively small in size (<20 Gb) but require substantial effort to develop. This effort is justified by the effort it cost to make the original measurements and by the scientific utility of these databases and of products based on these data, as we document in the following two sections.
We are not implying that a "heterogeneous" database is any less difficult to construct or maintain than a "homogeneous" database. We simply mean that the problems in developing such databases may be different.

SEA LEVEL DATA
As noted above, since 1999 sea level data have been included in the GODAR project as part of the Global Sea Level Observing System (GLOSS). Substantial progress has occurred with respect to the location of, and digitization of, historical sea level data. Results of these efforts (and earlier efforts by Dr. Mark Luther) have been described by Caldwell (2003), which we summarize briefly. The goal of this GODAR sub-project has been to locate and digitize historical hourly records of sea level data measured by tide gauges. The hourly data rescued to date include 372 years of data from 34 locations in 15 countries. The Japan Oceanographic Data Center has contributed data from tide gauges in the western Pacific. However, the majority of data are from countries in South and Central America. Many of these countries have contributed data, and many data are from the archives of the former U.S. Coast and Geodetic Survey, which is now part of the NOAA National Ocean Service.

SURFACE-ONLY MARINE OBSERVATIONS
Although surface marine observations are not a formal part of the GODAR project, some data centers have digitized such observations from merchant ships and contributed them as part of the GODAR project. These data have been sent to the groups responsible for constructing the International Comprehensive Ocean-Atmosphere Data Set (ICOADS) (Woodruff et al., 2011), where they have been incorporated into this database. For example, the People's Republic of China has digitized approximately 420,000 sets of surface marine observations from Chinese ships and contributed these to the GODAR project.

DECLASSIFICATION OF NAVAL DATA
Ocean observations are made by numerous navies of the world. Many countries routinely submit all or part of their naval data to the World Data Center system. The IOC has issued Circular Letters requesting the declassification of naval oceanographic data. Argentina, Turkey, United Kingdom, United States, and Russia have declassified and made available additional oceanographic data in response to the IOC request.

ECONOMIC VALUE OF RECOVERED DATA
Oceanography is an observational science, and it is not possible to replace historical data that have been lost. From this point of view, historical measurements of the ocean are priceless. However, in order to provide input to a "cost-benefit" analysis of the activities of oceanographic data centers and specialized data rescue projects, we can estimate the costs incurred if we wanted to resurvey the world ocean today, in the same manner as represented by the WOD09 Ocean Station Data (OSD) profile archive.
The computation we describe was first performed in 1982 by Mr. Rene Cuzon du Rest of NODC, but here we use up-to-date figures. We use an average operating cost of $20,000 per day for a medium-sized U.S. research ship (NSF, personal communication) with a capability to make two "deep" casts per day or 10 "shallow" casts per day. We define a "deep" cast as extending to a depth of more than 1000 m and a "shallow" cast as extending to less than 1000 m. This is an arbitrary definition, but we are only trying to provide a crude estimate of replacement costs for this database. Using this definition, WOD09 contains approximately 2,560,000 shallow casts so that the cost of the ship time to perform these measurements is approximately $3.6 billion. In addition WOD09 contains approximately 405,000 profiles deeper than 1000 m depth, so the cost in ship time to make these "deep" measurements is approximately $3.2 billion. Thus, the total replacement cost of the OSD archive is about $6.8 billion, a figure based only on ship-time operating costs, not salaries for scientists or any other costs. As previously noted, approximately 610,000 XBT profiles have been recovered as part of the GODAR project. Assuming a present-day replacement cost of $40/profile, this represents about $24 million in data recovered.
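The structure of this replacement-cost estimate (number of casts, divided by casts per ship day, multiplied by the daily operating cost) can be written as a short Python sketch. The function and variable names are hypothetical. Applied literally to the cast counts and rates quoted above, the formula gives larger ship-time totals than the dollar figures in the text, which suggests that the published estimates rest on inputs or rounding not fully stated here; the sketch is therefore only an illustration of the method.

```python
def ship_time_cost(n_casts, casts_per_day, cost_per_day=20_000):
    """Ship-time cost in dollars to repeat n_casts casts, given how many
    casts a ship can make per day and its daily operating cost."""
    return n_casts / casts_per_day * cost_per_day

# Inputs as quoted in the text (approximate cast counts).
shallow_cost = ship_time_cost(2_560_000, casts_per_day=10)
deep_cost = ship_time_cost(405_000, casts_per_day=2)

# XBT replacement value: profiles recovered times assumed cost per profile.
xbt_value = 610_000 * 40

print(f"shallow casts: ${shallow_cost / 1e9:.2f} billion")
print(f"deep casts:    ${deep_cost / 1e9:.2f} billion")
print(f"XBT profiles:  ${xbt_value / 1e6:.1f} million")
```

Because ship time dominates the estimate, the result scales linearly with the assumed daily cost and inversely with the assumed number of casts per day, so modest changes in either assumption shift the total by billions of dollars.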

SCIENTIFIC UTILITY OF OCEAN PROFILE DATA
We often receive questions about the importance and utility of historical oceanographic profile data. To respond to such questions, we present Figure 10, which shows the history of scientific citations for some NODC/WDC ocean profile databases and products based on these databases. Based on the number of citations, it is very clear that ocean profile databases and products based on these databases have had a substantial impact on scientific research during the past twenty years and can be expected to do so in the future.

SCIENTIFIC PROGRAM SUPPORT FOR DATA ARCHAEOLOGY AND RESCUE
Data archaeology and rescue efforts have received widespread scientific support not only for ocean data but for many other geophysical data (WCRP, 1995; WCRP, 1998). Brunet and Jones (2011) describe some of these projects.
A recent data archaeology and rescue project known as DARTG (Data at Risk Task Group) has begun under the aegis of ICSU/CODATA. This project defines "data at risk" as "scientific data which are not in a format that permits full electronic access to the information which they contain". Such data may be inherently non-digital (handwritten or photographic), on near-obsolete digital media (such as magnetic tapes), or insufficiently described (lacking metadata). Some born-digital data can also be considered "at risk" if they cannot be ingested into managed databases because they lack adequate formatting or metadata. Data which are regarded as unusable tend to be regarded as useless and then risk being destroyed. Most of the non-electronic data in question pre-date the digital era, and where they complement more modern ones by offering a much longer time-base they are essential, sometimes vital, for studies of long-term trends.
The goals and objectives of DARTG are to "create an Inventory of data that are at risk, and whose unique scientific information is in danger of being lost to posterity. (The Inventory will become the foundation for a Phase II project to design a series of missions to rescue that information.) DARTG will thus accentuate the need to be protective of the scientific content of fragile data, and will illustrate this broader objective by compiling literature describing new science which has emanated from analyses of rescued, historic data. By working through the steps to achieve our objective, DARTG will demonstrate an approach, process, and practices for building an extensible inventory of scientific data which risk being lost or destroyed and whose information content is therefore seriously endangered."

DATA AVAILABILITY AND ACCESS
As part of its commitment to the scientists, institutions, and countries that have made these oceanographic data available, the GODAR project through NODC/WDC has made all data available on CD-ROM and DVD media as well as on-line via the Internet from the NODC/WDC website (www.nodc.noaa.gov). Beginning with World Ocean Database 1998, all data have been made available on-line. The most recent version of the World Ocean Database is World Ocean Database 2009 (WOD09) (Boyer et al., 2009). The on-line version of the World Ocean Database is updated every three months with additional data and with corrections to data and metadata found to have been in error. We actively seek out guidance from scientists and data managers regarding possible problems with the data and metadata in the WOD. Conversely, we inform data originators of such problems when we encounter them. Each data profile in the WOD is identified by a unique identification number to make communication with colleagues easier.
The World Ocean Database products come with software conversion routines so that users of software packages, databases, and programming languages such as MATLAB, IDL, PC-Surfer, C, and FORTRAN can access the data.
In response to user requests, we have defined the WOD format to be as 'self-defining' as possible so as to eliminate, or at least minimize, the need for any structural changes to the format when new data types are added. All code tables, documentation, and software containing metadata are available on-line as well as on the CD-ROMs which are used to distribute the WOD series. When a new database is released (every 3-4 years), users can acquire the new database or simply acquire data for those ocean stations that have been added or modified since the previous release.
In addition, as corrections are made to the database after a release of WOD, users can acquire any modified data several days after the end of every month. There is a "Help Desk" and "Frequently Asked Questions" for the databases available on-line.

EDUCATION
The World Ocean Database 2009 has been used to produce objectively analyzed climatological fields of the variables contained in WOD09. This product is known as World Ocean Atlas 2009. The analyses are performed on a one-degree grid at thirty-three standard depth levels for the world ocean between the sea surface and 5500 m depth. In addition to these fields, observed one-degree square statistics (the number of observations, mean, standard deviation, and standard error of the mean) have been computed for each variable and compositing period. All of these fields are available on-line as digital fields and in the form of color figures (approximately 44,000 color figures available on-line at www.nodc.noaa.gov). The collection of figures is known as World Ocean Atlas 2009 Figures. Numerous colleagues have informed us that these figures have proven valuable in teaching oceanography.
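The observed one-degree square statistics mentioned above can be illustrated with a short sketch. This is not the World Ocean Atlas software: the function names and the input layout (latitude, longitude, value) are hypothetical, a single variable at a single depth level and compositing period is assumed, and the sample (n - 1) standard deviation is used because the text does not state the convention.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def one_degree_square(lat: float, lon: float) -> tuple:
    """Index a position by the south-west corner of its one-degree square."""
    return (math.floor(lat), math.floor(lon))


def square_statistics(observations):
    """Group (lat, lon, value) tuples by one-degree square and summarize each.

    Returns a dict mapping each square to the number of observations, mean,
    standard deviation, and standard error of the mean. The last two are
    None when a square holds a single observation.
    """
    groups = defaultdict(list)
    for lat, lon, value in observations:
        groups[one_degree_square(lat, lon)].append(value)

    stats = {}
    for square, values in groups.items():
        n = len(values)
        sd = stdev(values) if n > 1 else None  # sample standard deviation (assumed)
        sem = sd / math.sqrt(n) if sd is not None else None
        stats[square] = {"n": n, "mean": mean(values), "std": sd, "sem": sem}
    return stats


# Three made-up temperature values; the first two fall in the same square.
example = [(10.2, -30.5, 20.0), (10.8, -30.1, 22.0), (11.5, -30.5, 18.0)]
result = square_statistics(example)
for square, summary in sorted(result.items()):
    print(square, summary)
```

Indexing by the floor of latitude and longitude keeps the grouping simple; an operational implementation would also need to handle depth levels, compositing periods, and quality-control flags.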
Ocean Data View (ODV) is a software package for the interactive exploration, analysis, and visualization of oceanographic and other geo-referenced profile or sequence data. ODV runs on Windows (7, Vista, XP, 9x, Me, NT, 2000), Mac OS X, Linux, and UNIX (Solaris, Irix, AIX) systems. ODV data and configuration files are platform-independent and can be exchanged between different systems. It can be acquired at: http://odv.awi.de/. Dr. Reiner Schlitzer and his colleagues have developed ODV and make it freely available.

CONCLUSIONS
We have described the results of two ocean data management projects led by the author that have resulted in the creation of the most comprehensive global ocean profile-plankton database made available internationally without restriction. International cooperation has been outstanding and the outlook for continued international cooperation is excellent.
It is clear to us that countries recognize the importance of building global ocean databases. This is evidenced by the amount of data downloaded from WOD and the high number of citations of WOD and products based on it in the scientific literature. The challenge to the GODAR and WOD projects, and others like them, is to maintain continuing support. Scientists and decision makers may take the existence of scientific databases for granted, but such databases require long-term dedicated efforts and financial support. Additional oceanographic data have been identified as part of the GODAR project that still need to be digitized and/or transferred from aging electronic media to fresher media and incorporated into regional and global databases. Efforts will continue to digitize these data and make them available in future databases. The WOD project continues incorporating as much modern oceanographic data as possible. We believe one of the key reasons for the success of the two projects we have described is that we have made all data received as part of these projects available to the international scientific community in a single format with quality control flags assigned to each data value. I believe this should be a component of any "data archaeology and rescue" project.
In order to develop climate-system quality records, and by this we mean in particular building databases that are free of systematic biases, it is necessary for: 1) data to be made available without restriction and 2) such data to be used in climate system time series. A good example of this is the work by Gouretski and Koltermann (2007). Seeing what they believed to be an unrealistic feature in the time series of global ocean heat content during the 1970s (Levitus et al., 2005), Gouretski and Koltermann used data from the World Ocean Database to identify time-varying biases in bathythermograph data, which caused this unrealistic feature. This has led to further studies describing this problem (Levitus et al., 2009; Domingues et al., 2008; Wijffels et al., 2008; Ishii & Kimoto, 2009) and to improved estimates of ocean heat content. Willis et al. (2009) were able to identify a systematic bias in data from some profiling float instruments deployed by the Woods Hole Oceanographic Institution because the data were available without restriction. The examples we have provided here document that building a high-quality global climate observing system and global climate databases requires that all data that are part of such a system and part of such databases be available without restriction.

Figure 1. Comparison of the number of Ocean Station Data (OSD) casts (bottle data) available from NODC/WDC in 1991, as a function of year for the 1900-1990 period, with the OSD casts available as part of the World Ocean Database 2009. In 1991 NODC/WDC held data from 783,912 OSD casts for the pre-1991 period, with the GODAR project adding an additional 1,050,509 casts. Most of the data acquired are from the post-World War II period when oceanographic expeditions greatly expanded in number.

Figure 5. Japanese Ocean Station Data associated with the Japanese Fisheries Agency and Japanese Prefectural Fisheries Experimental Stations acquired as part of the GODAR project. A red dot indicates a one-degree square containing 41 or more surface observations, orange indicates 21-40, yellow indicates 6-20, green 2-5, and blue indicates a one-degree square containing 1 observation.

Figure 6. Location of Ocean Station Data casts (5016 in total) digitized by the Bundesamt fuer Seeschifffahrt und Hydrographie (BSH), Hamburg, Germany. All data were measured prior to 1946. Data from some of the main contributing countries are indicated by color-coded dots. Other countries' data are indicated by black dots.

Figure 7. Growth of the NODC/WDC archive of temperature and salinity profiles as a function of time. The acceleration of the growth of the archive in the early 1990s is due to the GODAR and WOD projects described in this paper. Data from moored buoys such as the TAO/Triton, PIRATA, and RAMA arrays and from Argo profiling floats have also made substantial contributions.

Figure 8. Distribution of historical chlorophyll profile data recovered by the GODAR project and modern surface-only chlorophyll data made available by the French Ship-of-Opportunity Program (SURTROPAC). A red dot indicates a one-degree square containing 41 or more surface chlorophyll observations, orange indicates 21-40, yellow indicates 6-20, green 2-5, and blue indicates a one-degree square containing 1 observation.
Figure 9 shows the state of the plankton recovery effort. More detailed information about the biological data can be found in the works by O'Brien et al. (2002a, b), Conkright et al. (2002b), and Baranova et al. (2009).

Figure 9. Distribution of plankton tows recovered by the GODAR project. A red dot indicates a one-degree square containing 41 or more plankton tows, orange indicates 21-40, yellow indicates 6-20, green 2-5, and blue indicates a one-degree square containing 1 tow.

Table 1. GODAR Workshops and the reports generated.
* Selection software (WODselect), developed by Mr. Tim Boyer and Ms. Olga Baranova, allows users to access data by specifying geographic area, observation dates, instrument type, measured variables, deepest measurement, country, ship/platform, project name, and institute. Data are made available in a Comma-Separated-Values (CSV) format. WODselect supports the goals of the IOC, ICSU WDC, and United States data exchange systems to promote open access to scientific data. Additionally, it supports the United Nations Framework Convention on Climate Change to "promote and cooperate in the full, open and prompt exchange of relevant scientific, technological, technical, socio-economic and legal information related to the climate system and climate change".