A MATURITY MODEL FOR DIGITAL DATA CENTERS

Digital data and service centers, such as those envisaged by the ICSU World Data System (WDS), are subject to a wide-ranging collection of requirements and constraints. Many of these requirements are traditionally difficult to assess and to measure objectively and consistently. As a solution to this problem, an approach based on a maturity model is proposed. This adds significant value not only in respect to objective assessment but also in assisting with evaluation of overlapping and competing criteria, planning of continuous improvement, and progress towards formal evaluation by accreditation authorities.


INTRODUCTION AND PROBLEM STATEMENT
Digital data and service centers, such as those envisaged by the ICSU World Data System (WDS), face key performance issues in their operations, planning, and management, and these issues derive from a variety of sources. The sources may include organizational objectives, user requirements, constraints and requirements imposed by funding agencies, and, of course, the criteria set by the WDS for different categories of membership. In addition, local legislation may require compliance in respect to preservation and archiving, while technical constraints could include standards for interoperability, cataloguing, processing, and the like.
There are several management problems associated with this wide variety of requirements imposed on a center, for example:
• There is an overlap, though sometimes a subtle difference, in requirements derived from multiple sources.
• Many of the requirements imposed on a center cannot be measured objectively, and different observers may come to different conclusions about the current performance of an organization or center.
• Knowledge concerning successful approaches is not easily disseminated or transferred.

PROPOSED SOLUTION
A solution to these and several other smaller management challenges may be provided by applying the principles of a 'Maturity Model' (Humphrey, 1987), analogous to the approach first proposed by the Software Engineering Institute at Carnegie Mellon University for the assessment and management of organizations involved in software creation and delivery. A maturity model provides a framework that addresses many of the management challenges described thus far and serves as a repeatable and less subjective measuring instrument for assessing the performance of digital data and service centers.

REQUIREMENTS PLACED ON DIGITAL DATA CENTERS
We will be using a hypothetical data center in the field of Earth and environmental sciences to develop our solution. We assume that the data center will be distributed physically (which is increasingly the norm and adds to the complexity of management) and that it needs to comply with typical interoperability requirements. Such a center might typically expect to
1. Derive strategic and management objectives from a business planning process, which, in turn, is subject to financial and other resource constraints while presumably serving the need of one or more communities. These communities may not all be scientists and could include the wider public, decision makers, and private enterprise;
2. Link to a Community of Practice that imposes constraints and requirements, with the constraints including aspects of mandate and scope of operations and the requirements often aimed at ensuring interoperability and trouble-free access to the center's resources. The latter aspect may include data access policies. The center also needs to ensure that it meets the requirements of the Communities of Practice that it serves, defining appropriate products and services and service level agreements in the process;
3. Make provision for physical and software infrastructure to support its products and services, which may include functions of access, preservation, and processing requirements as well as measures whereby interruption of service and risk to assets are minimized. This requirement becomes quite complex in the case of a physically distributed system and may require the separation of archiving/preservation arrangements from those aimed at operational data and services;
4. Apply due diligence and sound governance in respect to its operations, covering aspects such as independent oversight, risk management, adequate planning for long-term feasibility, and proper liaison with relevant stakeholders.
There may be multiple jurisdictions that impose legal requirements and policy constraints on the center.
Data Science Journal, Volume 12, 30 April 2013 WDS189
The large number of requirements and constraints deriving from the above can be arranged into an objective hierarchy (or network because some of the objectives have multiple links to others), and each of these objectives can theoretically have a goal and current level of performance as a minimum (Brehmer, 2005). This process is not new but is routinely performed in many private and public organizations as performance management.
The main difficulty lies with the measurement of the performance, which, for many of the typical requirements and constraints described above, is often performed arbitrarily and subjectively. The main purpose of this paper is to promote the use of maturity models to assist with an objective performance measurement.
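The objective hierarchy described in this section can be illustrated with a small data structure. The following is a minimal sketch only, not a structure defined in this paper: the class name `Objective`, the helper functions, the objective names, and the integer scores are all hypothetical. It shows each objective carrying a goal and a current level of performance, and one objective supporting more than one parent, which is what turns the hierarchy into a network.

```python
from dataclasses import dataclass, field


@dataclass
class Objective:
    """One node in the objective hierarchy (illustrative names only)."""
    name: str
    goal: int       # target level of performance (hypothetical scale)
    current: int    # currently assessed level of performance
    supports: list[str] = field(default_factory=list)  # parent objectives

    def gap(self) -> int:
        """Distance between goal and current performance, never negative."""
        return max(self.goal - self.current, 0)


def build_hierarchy(objectives):
    """Index objectives by name so that links can be followed."""
    return {o.name: o for o in objectives}


def contributors(hierarchy, name):
    """Names of the objectives that directly support the named objective."""
    return sorted(o.name for o in hierarchy.values() if name in o.supports)


# Hypothetical example: one objective supports two others (a network).
hierarchy = build_hierarchy([
    Objective("Interoperability", goal=4, current=3),
    Objective("Data access", goal=3, current=3),
    Objective("Metadata interoperability", goal=4, current=2,
              supports=["Interoperability", "Data access"]),
])
```

The scores in such a structure are only as reliable as the way they are assigned, which is the measurement problem that the maturity model is meant to address.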

MATURITY MODELS APPLIED TO DATA CENTER OBJECTIVE HIERARCHIES
The common definition of a maturity model is "a (framework) that describes how well the behaviors, practices and processes of an organization can reliably and sustainably produce required outcomes" (SEI, 2012). By creating such a framework, several side benefits can be obtained that will be discussed in detail later on, but the obvious structure of the framework is the descriptions associated with predefined levels of performance. These levels of performance are typically designated as follows:
• Level 1 (Initial): Usually associated with ad-hoc approaches, undocumented processes, and little guarantee that a given outcome can be achieved. Knowledge and capacity are centered in individuals. The organization is often ignorant of best practice and of applicable or useful standards and specifications.
• Level 2 (Repeatable): Processes are documented in sufficient detail to ensure continuity and allow reliable execution by a number of participants.
• Level 3 (Defined): Not only are processes documented, but they are also standardized and aligned where applicable to national or international standards and specifications.
• Level 4 (Managed and Auditable): Performance metrics are collected in respect to achievement of objectives and compliance with standards. Independent audits are performed from time to time to confirm such compliance.
• Level 5 (Optimized): Deliberate process optimization is undertaken, and a regime of continuous improvement is possible.
These levels of performance are, of course, generic and need to be translated into corresponding descriptions for each of the objective hierarchy elements applicable to a data or service center. The example in Figure 1 deals with 'Meta-Data Interoperability'. Deriving similar descriptions for each performance level across all relevant objectives in the hierarchy leads to a comprehensive 'Maturity Matrix'.
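A maturity matrix of this kind can be sketched as a mapping from objectives to level descriptions. The code below is an illustrative sketch under stated assumptions: the names `MaturityLevel`, `GENERIC_CRITERIA`, and `assess` are hypothetical, the descriptions are shortened paraphrases of the generic levels above rather than the entries of Figure 1, and the rule that a level counts only when every lower level is also satisfied is an assumption about how the levels build on one another.

```python
from enum import IntEnum


class MaturityLevel(IntEnum):
    INITIAL = 1
    REPEATABLE = 2
    DEFINED = 3
    MANAGED = 4      # "Managed and Auditable"
    OPTIMIZED = 5


# Generic descriptions, paraphrased from the level definitions above.
GENERIC_CRITERIA = {
    MaturityLevel.INITIAL: "Ad-hoc, undocumented processes",
    MaturityLevel.REPEATABLE: "Processes documented for reliable execution",
    MaturityLevel.DEFINED: "Processes standardized and aligned to standards",
    MaturityLevel.MANAGED: "Metrics collected and independently audited",
    MaturityLevel.OPTIMIZED: "Deliberate optimization, continuous improvement",
}

# One row per objective. Each row starts from the generic wording and
# would be replaced by objective-specific descriptions in a real matrix.
maturity_matrix = {
    "Meta-Data Interoperability": dict(GENERIC_CRITERIA),
}


def assess(criteria_met):
    """Return the highest level whose criteria, and those of all lower
    levels, are met. Level 1 is treated as the floor (an assumption)."""
    level = MaturityLevel.INITIAL
    for candidate in list(MaturityLevel)[1:]:
        if not criteria_met.get(candidate, False):
            break
        level = candidate
    return level


# Hypothetical assessment: Level 5 is claimed, but Level 4 is not met.
example = assess({
    MaturityLevel.REPEATABLE: True,
    MaturityLevel.DEFINED: True,
    MaturityLevel.MANAGED: False,
    MaturityLevel.OPTIMIZED: True,
})
```

Under the cumulative rule assumed here, two assessors who agree on which descriptions are satisfied will arrive at the same level, which is the repeatability the approach aims for.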
There are several side benefits and additional uses of this approach in addition to identifying the level that most closely matches current performance (and in the act of doing so, making an objective and repeatable assessment):
• Organizations often do not know where to start. By having access to a maturity matrix, it is possible to evaluate a feasible entry point.
• The matrix can, and should, contain the benefit of prior experience, and each entry may be supported by best practice, standards, guidelines, and specifications.
• It can assist multiple organizations with roughly the same objective hierarchy to align and pursue a shared vision (for example, in the ICSU WDS).
• It assists with planning the next level of performance as a set of explicit, measurable objectives and with prioritizing the actions that may be needed to achieve it.
• It serves to define a level of performance across a collection of objectives and as such can be used to envision the requirements imposed by certification or audit authorities, for example, by defining the level of performance required to be certified as a 'trusted digital repository'.
• It provides, in a relatively objective way, a means of comparing the performance of organizations, should the need arise to do so.
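The planning and certification uses listed above amount to comparing assessed levels against a required profile. The following is a minimal sketch, assuming levels are recorded as integers from 1 to 5; the function names, the objective names, and the required profile are hypothetical and do not represent the criteria of any actual certification authority.

```python
def gap_report(current, required):
    """List (objective, current level, required level) for every objective
    that falls short of the required profile, largest shortfall first."""
    gaps = []
    for objective, needed in required.items():
        have = current.get(objective, 1)  # unassessed counts as Level 1
        if have < needed:
            gaps.append((objective, have, needed))
    # Sort by shortfall (descending), then by name for a stable order.
    gaps.sort(key=lambda g: (g[1] - g[2], g[0]))
    return gaps


def meets_profile(current, required):
    """True when no objective falls short of the required profile."""
    return not gap_report(current, required)


# Hypothetical assessed levels for one center.
current_levels = {
    "Meta-Data Interoperability": 2,
    "Preservation": 3,
    "Governance": 1,
}

# Hypothetical profile that a certifying body might require.
required_levels = {
    "Meta-Data Interoperability": 3,
    "Preservation": 3,
    "Governance": 3,
}

shortfalls = gap_report(current_levels, required_levels)
```

Ordering the shortfalls by size is only one possible way to prioritize; a center might equally weight objectives by cost or by stakeholder importance.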

CONCLUSION
Hence, such an approach can be useful to establish all of the following:
1. Current level of performance;
2. A set of internal objectives and self-assessment against these objectives;
3. A set of future goals and milestones to support a process of continuous improvement;
4. A quality assurance program;
5. Accreditation and external audit mechanisms.
Current work will be extended in the near future to develop specific matrix entries for a wide variety of input requirements based on the scope discussed in the paper. The intention is to establish this as a community resource that can be edited by any number of collaborators with a view to its refinement, validation, and extension, thereby serving the ICSU WDS specifically and scientific data systems and services in general.