ANALYSIS OF THE PARADIGM EVOLUTION OF DIGITAL LIBRARIES IN CHINA

The authors analyze the developmental framework of digital libraries in China and point out their current demand characteristics, development requirements, and developmental period. They then conclude that it is necessary to start up a new paradigm evolution of a digital library, from a traditional digital library to a virtual digital library. On that basis, they describe in detail several problems and developmental approaches that developing a virtual digital library must deal with, drawing lessons from the prototype DILIGENT.

and research processes of users under the new service environment e-Science, moving towards being more interactive, dynamic, and personalized, eventually aiding its paradigm to evolve.

Features of the first generation of digital libraries in China
Digitalization of resources is the main feature of the first generation of digital libraries in China, of which a typical example is the China-US Million Book Digital Library Project (QiuShiNews, 2002). The project, which aims to collect one million books within four to five years and make them accessible on the Internet to users all over the world, has developed a leading digital library technology platform through cooperation between China and the U.S. to support the processing, management, and service of the digital books. During this process, Chinese digital library projects and their system architectures have followed a construction model in which a digital information resource system is set up by digitizing certain documentation resources. This system is then embedded, as an independent system, into the system of a traditional library or into the upper layer of an information resource system. The service, retrieval, and delivery of certain digital information are the main missions of the first generation of digital libraries. Resource databases are used to store digital information resources, and an information resource management and retrieval system and a user interface have been developed.

Features of the second generation of digital libraries in China
With the development of computer technology, Chinese digital libraries have been changing rapidly. Because of the rapid digitalization of documentation resources, users expect digital libraries to provide more services, especially integrated services, to satisfy their information needs. Consequently, a new kind of digital library, exemplified by the Chinese Science Digital Library (CSDL), has appeared, with integrated service as its main characteristic. CSDL (CSDL, 2003) was started in 2001 by the Chinese Academy of Sciences (CAS) to develop a digital information service platform for its more than 90 research institutes across the country. Presently, CAS researchers can access full-text scientific, technical, and medical (STM) journals, conference proceedings, electronic theses and dissertations (ETDs), patents, reference books, and eBooks from the CSDL portal. Taking e-journals as an example, CSDL now provides users with 4,000 core Western STM journals and about 10,000 Chinese ones.
Meanwhile, CSDL organizes a wide range of networked services, such as union catalogs, cross searching, document delivery, digital reference, MyLibrary customization, and remote authentication (Figure 1).
Although digital libraries in China have passed through two generations and are now developing user-centered models, there is still a gap between China and the developed countries in the infrastructure and functions of digital libraries.
Representative digital libraries such as CSDL, and their system architectures, show that the construction model of the second generation of digital libraries in China concentrates on the interoperability of distributed information systems. Building on the earlier resource construction, it aims to provide seamless information interchange and sharing among interoperable systems and to create a system of integrated information service. In this stage, the paradigm of a digital library comprises all kinds of distributed information resources, tools that provide integrated service, and the integrated service itself, which is the distinguishing feature of the digital library.

What is a virtual digital library?
Digital information resources were at the root of the survival and development of digital libraries before the emergence of e-Science. However, the e-Science environment has completely changed the situation.
Representatives of e-Science technologies such as the grid have allowed digital libraries to collect, in a short period of time, all kinds of resources available all over the world, including information resources, equipment resources, instrument resources, etc. Consequently, digital libraries aid users in establishing their personal digital library very swiftly by collecting resources automatically. The type of digital library with the features and functions mentioned above is called a virtual digital library. Although owning resources locally is not the main construction goal for a virtual digital library under the e-Science environment, it can dispatch resources if there is permission to do so.
The main service of a virtual digital library is to provide rules for resource collection, organization, and management, as well as a cooperation mechanism, in order to respond to users' needs as soon as possible and eventually help users set up their own virtual digital library according to their research projects, study requirements, and virtual organizations. This is the key point on which a virtual digital library concentrates once the resource bases have been constructed. In practice, a personal virtual digital library, which will have much stronger functionality and be much more user-friendly in the e-Science environment, is mainly composed of a name system, an indexing and searching system, a digital library generator, a user interface, and rule bases. The rule bases include rules for the creation, management, and maintenance of the personal virtual digital library, along with rules and mechanisms for resource collection, unifying services, and request response. Providing the services that generate a digital library is the hallmark of the virtual digital library in this stage.
The ultimate goal of a virtual digital library is to help users, especially scientific workers, to set up their MyLibrary. Nevertheless, this is a different MyLibrary from that which appears currently. The difference is that the new MyLibrary can conveniently set up a personal virtual digital library on a user's desktop according to the standards, rules, and mechanisms defined by users themselves. In addition, there is a dynamic integration of resources in a virtual digital library, which combines with the e-Science environment and then forms a digital information environment based on the user's information activity and suitable to the user profile.

The functions of a virtual digital library
(1) The enhancement of interactivity. The virtual digital library is not only the core component of the information service but also the object of the information service. That is to say, the virtual digital library will collect information resources provided by users or virtual organizations in the process of information service to enrich its resource repository. While being served by a virtual digital library, users can participate in the resource construction of the digital library. Before e-Science appeared, users could interact with a digital library only via virtual reference or email; under the e-Science environment, the interactive method described above can to some degree enhance the interactivity both between the digital library and its users and between the digital library and virtual organizations.
(2) The enhancement of personalization. A virtual digital library provides integrated service by collecting information resources from many distributed sources and then offers users customization of resources and services through an interface that supports customization. Subsequently, the virtual digital library will include the wrapped services that its users have customized. With the application of grid technology, the personalization service of the virtual digital library will become stronger and stronger. As an example, DILIGENT, short for a Digital Library Infrastructure on Grid-Enabled Technology, aims to establish an infrastructure based on grid technology that enables users to set up their personal virtual digital library easily and effectively according to their scientific research interests within some specific period.
(3) The enhancement of dynamic integration. The construction of integrated service in a virtual digital library is an enormous information project with many aspects. Only cooperation can make a virtual digital library work. In the e-Science environment, by virtue of its core technology, the grid, it is possible to integrate resources distributed in different places via the Internet and then offer a resource service capability that integrates high-performance computing resources, resource management, and service. Grid technology can precisely locate the data sets that satisfy users in a distributed and heterogeneous environment and can support further functionality. A user of electronic resources does not need to pay attention to the source of a resource or to the load balancing of the system. In addition, grid technology has become a vital approach to the linkage and unification of all kinds of remote and heterogeneous information resources because of its ability to organize remote resources reasonably and effectively and to set up grid virtual computers with strong service capability. Because of these advantages, when a virtual digital library is built, management functions, a knowledge base, and all sorts of services can be pooled together, and resources can be dispatched by grid technology. As a result, users can access the resources and receive service through a uniform interface, which can provide service from one virtual digital library or from virtual digital library federations.
(4) The enhancement of knowledge production. Compared with the Internet, grid technology has the feature of knowledge production. That is to say, the Internet cannot itself produce knowledge; knowledge can only be diffused on the Internet and accessed by users after it has been created by other means. In contrast, the grid can automatically produce knowledge according to the users' information needs. High-performance computers, with strong service capability, play an important role in this process. For example, these computers can transform raw data collected from all kinds of sources into information and knowledge with the aid of specific programs, such as data mining. Moreover, grid technology can automatically find sources of data related to users' needs and then form new information or knowledge via comprehensive analysis and knowledge discovery. Therefore, with the development of grid technology, the knowledge production of the virtual digital library is greatly enhanced. When a request or query is submitted, the virtual digital library, with its foundation of grid technology, will process and analyze the request or query automatically so that the resulting information satisfies the user and is delivered to the node where the user has logged in. This completes the knowledge service of the virtual digital library.
(5) The shortening of response time. Before e-Science appeared, the service of digital libraries was detached from the process of scientific research, so that the digital library served users reactively, acting only after a need had been expressed. In other words, the digital library responded to users' requirements only after it had received the information need from scientific workers. However, in the e-Science environment, because scientific workers and virtual organizations demand a faster response, the virtual digital library must provide service in a more precise and more expeditious way. Accordingly, the digital library has to change its methods of service to anticipate users' requirements. That is to say, a digital library in the e-Science environment serves users according to their profiles, such as education level, major, and research field, with long-term tracking of users' information. If something new and noteworthy appears on the Internet, the virtual digital library can push it to users on its own initiative, which allows it to respond and offer the proper service within a short time when a user's information need arises.
(6) The enhancement of sharing competency. First, the scope of both the subjects of sharing and the objects of sharing has been extended. The basic conditions for information sharing include diversity in information delivery; simultaneous receipt and use of information by multiple users without any influence upon redelivery and reuse of the information; and diversity in the methods of information duplication, processing, and use. Before e-Science appeared, it was necessary to represent information in digital form to ensure that these three conditions were met.
As for the digital library, of which the core is digital information resources, information resource sharing is its inherent advantage. With e-Science, the resources that can be shared among digital libraries and scientific research organizations have broadened from information resources alone, before the appearance of e-Science, to equipment resources, instrument resources, computing resources, communication resources, and specialist resources after its emergence.
Second, the virtual digital library will fully carry out knowledge sharing. Information is made up of the information carrier, signal, and content, while information sharing is dependent on information technology and information systems. Information processing technologies, such as information indexing, information organization, information query, and information delivery, can improve the effectiveness and results of information sharing. However, changes in information carriers and signal systems cannot affect the information content. Knowledge sharing, which has grown out of developments in information sharing, is based on the affinity of users and their mutual communication and depends on thought, worldview, culture, social system, and environment. Thus, the implementation of knowledge sharing is more difficult than that of information sharing. Nevertheless, in the e-Science environment, as the social division of labor deepens, people prefer to obtain knowledge directly rather than share raw information and process it into knowledge by themselves. This desire for knowledge has contributed to the development of knowledge sharing. Meanwhile, the virtual digital library, which does well in information sharing, has seized the opportunity brought by e-Science and advances knowledge sharing through grid technology that can create knowledge immediately. With tools that work in real time and provide much clearer pictures, and with the guarantee of agreements, members of the same virtual organization or of different virtual organizations can trust each other more deeply than ever before. In short, the convenience of knowledge creation and the trust among users offer an extraordinarily favorable precondition for knowledge sharing.
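The uniform interface described under dynamic integration, through which a user is served either by one virtual digital library or by a federation of them, can be illustrated with a minimal sketch. It assumes nothing about DILIGENT or any real grid middleware: the names Record, InMemoryLibrary, and Federation are hypothetical, and an in-memory list of titles stands in for a remote, heterogeneous resource.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class Record:
    """One search hit, tagged with the library that supplied it."""
    source: str
    title: str


class Library(Protocol):
    """The uniform interface: anything with a name and a search method."""
    name: str

    def search(self, term: str) -> List[Record]: ...


class InMemoryLibrary:
    """Hypothetical stand-in for a single (remote) digital library."""

    def __init__(self, name: str, titles: List[str]) -> None:
        self.name = name
        self._titles = titles

    def search(self, term: str) -> List[Record]:
        # Case-insensitive substring match over the stored titles.
        needle = term.lower()
        return [Record(self.name, t) for t in self._titles if needle in t.lower()]


class Federation:
    """A federation exposes the same search method as a single library."""

    def __init__(self, name: str, members: List[Library]) -> None:
        self.name = name
        self._members = members

    def search(self, term: str) -> List[Record]:
        # Query every member and merge the hits into one sorted list,
        # so the caller never needs to know where a record came from.
        hits: List[Record] = []
        for member in self._members:
            hits.extend(member.search(term))
        return sorted(hits, key=lambda r: (r.title.lower(), r.source))


lib_a = InMemoryLibrary("A", ["Grid Computing Basics", "Soil Chemistry"])
lib_b = InMemoryLibrary("B", ["Data Grid Design"])
federation = Federation("A+B", [lib_a, lib_b])
hits = federation.search("grid")
```

Because Federation satisfies the same interface as a single library, a federation could itself be a member of a larger federation; real systems would add remote calls, authorization, and failure handling, which this sketch omits.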

Promotion of the need from scientific research
Science is increasingly developing in a digital fashion (Atkins, et al., 2003), with e-Science, or cyber-infrastructure that creates a networked infrastructure rich in data grids, computing grids, digital libraries, and collaboration. Here all kinds of scientific objects, including people, programs, facilities, data and documents, procedures and workflows, and even policies and strategies, can be and will be digitally represented, accessed, interconnected, and invoked through networked interaction. This calls for a new definition of information resources, information organization, and information service integration (Zhang, 2005).
Science is increasingly based on interactive virtual knowledge communities where networked information becomes interactive research tools, online collaboration acts as the organizing mechanism (Atkins, 2004), and virtual organizations dynamically support knowledge, discovery, and exchange. Effective services far beyond simple search and delivery of known items are in demand to help users mine knowledge and knowledge relations among the seas of various scientific objects and do so proactively along research workflows and among research interactions. A service system diffusing into and interacting with users' knowledge processes is essential for success in such a situation (Zhang, 2005).
Data Science Journal, Volume 6, Supplement, 6 October 2007

The development of digital libraries throughout the world
The techniques for solving typical digital library issues, such as interoperability, ontology integration, workflow optimization, and content-based automatic selection, will certainly provide useful experience for the development of grid services. The high-performance computing capability offered by grid technology will promote the development of many new functions for digital library users, particularly functions related to multimedia documents. Functions such as content feature extraction, summarization, and automatic content source description for video, images, and sound, which are based on complex and time-consuming algorithms, will become feasible with reasonable performance. Furthermore, the techniques for data replication and security handling developed in the grid area will contribute greatly to the definition of new digital library preservation techniques.
The DILIGENT project is creating an advanced test-bed that will allow virtual e-Science communities to share knowledge and collaborate in a secure, coordinated, dynamic, and cost-effective way (http://www.diligentproject.org/). This test-bed will be built by integrating grid and digital library technologies. The merging of these two different technologies will result in an unprecedented level of functionality and will lay the foundations for a next generation of e-Science knowledge infrastructure. The DILIGENT infrastructure, which is based on high bandwidth networks and empowered by OGSA-compliant grid services, will be able to serve many different research and industrial applications. The test-bed will be demonstrated and validated by two complementary real-life application scenarios: one from the cultural heritage domain and one from the environmental e-Science domain.
On the basis of an analysis of current digital library projects that are related to e-Science technologies such as the grid, we can conclude that there are at least three features in digital library trends:
(1) Considering the development of grid technology, such as the grid developed by the EGEE project. The EGEE project, funded by the European Union with 30 million euros, assembled specialists from more than 27 countries and regions, with the goal of developing a service grid infrastructure in Europe, based on the latest grid technology, that provides highly effective service to scientific workers day and night. It is the largest project among all projects relating to grid technology up to the present time. The project has focused on three aspects: Setting up a consistent and substantial grid infrastructure and pooling other computing resources; Constantly improving and maintaining middleware to afford reliable service; and Attracting new users from the scientific research community and industry and making high-standard training and support available to them. In many digital library projects now in operation, such as DILIGENT, the infrastructure is based on grid technology.
(2) Following the OGSA specification. OGSA (Open Grid Services Architecture) is a service-oriented architecture for the e-Science environment that brings technology standards such as web services, database access, and J2EE into the grid, so that grid resources can be accessed uniformly as "services". In the infrastructure constructed by DILIGENT, OGSA-compatible middleware is an important part of the resource layer. As a result, DILIGENT can ensure that its services abide by OGSA and that a third party can integrate the services provided by DILIGENT so that they can be utilized by users.
(3) Aiming at the trend of digital libraries in the e-Science environment. In the future, both personal study and scientific research will depend on digital libraries (Warner, 2005). Given the development trend of digital libraries, researchers from ISTI-CNR, the Institute of Information Science and Technologies of the National Research Council (Cannataro & Talia, 2003), have contended that three aspects will characterize future digital libraries: Supporting the sharing of content and resources; Owning key service and management functions that ensure that all services of a digital library can be supplied smoothly; and Being an important component of a general infrastructure for sharing. The above analysis of current digital libraries related to the grid or other e-Science technologies, such as DILIGENT, shows that in the future both prototypes of digital libraries and practical projects will focus on technologies brought by e-Science, such as grid technology, and apply them to heighten service capability.

The embryo of e-Science in China
The Ministry of Science and Technology, the National Commission of Development and Reform, the Ministry of Education, and the Ministry of Finance have jointly established The Outline for Construction of the Platform of Science and Technology Foundation in China (2004-2010) (XinHuaNet, 2004), which has clearly set out the framework of the national platform for the foundation of science and technology. It also covers the guiding idea, principles, goal, mission, construction emphasis, and safeguard measures for advancing the construction of the platform.
As set forth in the Outline, the goal is to lay a substantial platform of science and technology by 2010, with the platform ultimately possessing a reasonable layout, complete functions, a well-developed system, and highly effective sharing. The goal comprises three aspects: (1) initially setting up a support system for the science and technology foundation that is suited to the needs of S&T innovation and development; (2) establishing a management system with sharing at its core; and (3) forming teams of specialists and research service organizations suited to the construction and development of the platform.
In short, with the construction of a platform for the science and technology foundation, there is an embryo of e-Science in China. In the first stage, China will strengthen the construction of foundations such as the grid, high-performance computing, data specimens, and digital libraries.

The first stage of construction of the grid infrastructure
There is an affinity between e-Science and grid technology. Generally speaking, the needs of the e-Science environment can only be satisfied by grid technology. For the Chinese government, scientific and technological innovation is the groundwork for the sustained development of the country. Improving the scientific and technological environment, as well as promoting scientific and technological innovation, deserves priority attention. As a wholly new model of scientific research, e-Science has played a vital role in the informatization of scientific research in China.
Considering the current development of e-Science in China, we can conclude that there is as yet no large-scale construction of e-Science. However, since 1999 China has invested a great deal of funding and human resources in the research and development of grid technology, the core technology of the e-Science environment, and has achieved good results. The National 863 plan, the National Natural Science Foundation of China (NSFC), the Ministry of Education, and the Shanghai government have deployed high-performance computing and grid technology in education and industry. At the same time, many widely applicable key e-Science technologies, as well as industrial application demonstrations, have been researched and developed. In China there are five large-scale projects concerning grid technology: CNGrid, 863 Grid, NSFC Grid, ChinaGrid, and ShanghaiGrid. Because of the achievements of the present grid projects, the infrastructure of digital libraries, especially the basic technology needed to boost service capability, will be completed.
The Outline has declared that natural resources, scientific instruments, data, networks, and S&T documentation are indispensable parts of the national platform for the foundation of science and technology. Accordingly, the construction plan drawn up by the Chinese Academy of Sciences (CAS) attaches importance to the support that digital libraries provide in the e-Science environment and brings them into the construction scope: four of its ten construction goals, or 40% of the plan, relate to the digital library. Thus, it can be seen that CAS regards digital libraries as being as significant as the high-performance computing grid, large-scale databases, and high-speed networks in the construction process of e-Science. Because of the cooperation among these modules of e-Science, digital libraries can play a vital part in the integration of resource platforms, the sharing of scientific and technological facilities, and the construction of an application support environment.

Making the best of the platform construction of the scientific and technological foundation
To achieve the goal of constructing the platform of the science and technology foundation, the Outline has clarified three missions (XinHuaNet, 2004):
(1) Creating and implementing the substance and information assurance system. This includes: Developing scientific, reasonable, and uniform technology standards and criteria; Researching and developing related technology; Integrating, combining, and optimizing the present large-scale scientific instruments, equipment and facilities, scientific data, and scientific and technological documentation, as well as natural S&T resources; Improving the development of resource informatization and networking via international resources; Building resource deployment patterns with proper integration and moderate distribution.
(2) Establishing systems with a core of sharing. Enacting and implementing the management of S&T resource law; Advancing the amendment and enactment of laws, legislation, regulations and standards that enhance the development of science and technology as the rights and obligations become clear; Establishing and implementing encouragement mechanisms, as well as assessment and monitoring mechanisms; Advancing the innovation of management methods; Creating a legal system environment for the fair use of public resources.
(3) Training specialists and organizations. Deepening reform of personnel systems in non-profit research institutions; Improving the assessment system; Setting up a mechanism for attracting talented people; Training personnel who are good at managing the scientific and technological foundation and providing technology support.
By making the best of the scientific and technological platform, what has hindered the development of digital library can be dealt with effectively. As a result, in the process of adjusting itself to confront the challenge brought by e-Science, the digital library, also called the virtual digital library in the e-Science environment, has to make use of achievements attained by the construction of the platform of scientific and technological foundation.

Self-reconstruction and improvement
At present, constructing a digital library that operates, improves, and expands its functions after the emergence of e-Science is a leading-edge problem. Nevertheless, the DILIGENT project, initiated in 2004 with funding from the European Union, has probed into the problem. Drawing on the many grid-based services developed and designed by the EGEE project, the DILIGENT project expects to set up the infrastructure of a virtual digital library that is based on grid technology and compatible with OGSA, and that allows users to establish their own personal virtual digital libraries as desired. After more than a year, both the design of the system's functions and the research and development of key technologies have been successfully accomplished. The construction model of the DILIGENT project may point to the correct path for constructing a digital library in the e-Science environment.

Understanding the status of digital libraries in the e-Science environment
The construction of the DCC (Digital Curation Centre) and the eBank project has shown that the digital library holds an important place in the e-Science environment by virtue of its particular service advantages. Alongside communication and cooperation among scientific workers and the use of remote instruments, the digital library, which supports communication and cooperation between people and information, is one of the most important factors in the e-Science environment. At the same time, digital libraries and remote instruments are joined together. Consequently, the digital library not only supplies data to remote instruments but also stores the data received from them. Furthermore, as a vital characteristic of e-Science, remote dispatch and instrument sharing demonstrate the interactivity between people and instruments. Communication and cooperation among scientific workers, the digital library, and remote instruments together form a closed loop, ensuring the free flow of information and knowledge in the e-Science environment.

Rebuilding the structure of digital libraries
Taking a broad view of the current development of e-Science in Britain, America, and the European Union, we can conclude that, by and large, there is a uniform technological architecture in these countries and areas. Generally speaking, the architecture can be described as a five-layer structure. As shown in Figure 2, there is an application layer, a layer of application development environment and tools, a grid middleware layer, a grid infrastructure layer, and a resource layer. On the basis of the analysis of this five-layer structure, the infrastructure of e-Science can provide dynamic resource processing and resource access to large-scale equipment, scientific data, computing resources, and scientific and technological documentation. Furthermore, it offers information management and services during the process of cooperative research. What the grid middleware and grid infrastructure provide is the main goal of the digital library in the e-Science environment. Thus, the virtual digital library must rebuild the structure of the digital library, using the DILIGENT prototype for reference.
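As a rough illustration of the five-layer structure, the layers can be written down as an ordered enumeration. The sketch below is only a reading aid: the identifier names are invented here, and the assumption that a request travels strictly downward from one layer to the next is a simplification that the text does not state.

```python
from enum import IntEnum
from typing import List


class Layer(IntEnum):
    """The five layers named in the text, numbered from bottom to top."""
    RESOURCE = 1
    GRID_INFRASTRUCTURE = 2
    GRID_MIDDLEWARE = 3
    DEVELOPMENT_ENVIRONMENT_AND_TOOLS = 4
    APPLICATION = 5


def path_down(start: Layer) -> List[str]:
    """List the layers a request would cross from `start` down to the
    resource layer, assuming strictly layered (top-to-bottom) delegation."""
    return [Layer(level).name for level in range(start, Layer.RESOURCE - 1, -1)]


full_path = path_down(Layer.APPLICATION)
middleware_path = path_down(Layer.GRID_MIDDLEWARE)
```

Under this reading, the services that the virtual digital library relies on sit in the grid middleware and grid infrastructure layers, between the user-facing layers and the resources themselves.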

Improving current digital libraries
The resource layer of the virtual digital library is composed of all kinds of resources and services that depend on those resources. When the resource layer is built, the present digital library will be transplanted into the grid environment, which will become one part of the resource layer in the virtual digital library. On this basis, the resource monitoring service, information service, generator of agent and matching, along with support services of virtual organizations will be enhanced.

Reconstructing the digital library layer of current digital libraries
According to the prototype of the DILIGENT project, the digital library layer will focus on the service that generates virtual digital libraries and on workflow management, content management, metadata management, indexing, and search management. The generator service allows individuals and organizations to create personalized virtual digital libraries by specifying a list of standard services and resources, which describes the features of the digital library to be constructed according to their needs. On this basis, the service will select among all types of services and information resources to match users' specific needs and offer them the most suitable content and services.
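The generator idea, in which a user declares the services and resources wanted and receives a library assembled to match, can be sketched as follows. This is not the DILIGENT interface: LibrarySpec, Resource, VirtualLibrary, generate_library, and the AVAILABLE_SERVICES set are all hypothetical names, and matching resources by shared subject labels is an assumption made only to keep the example small.

```python
from dataclasses import dataclass
from typing import List, Set

# Hypothetical set of standard services the generator can offer.
AVAILABLE_SERVICES = {"search", "indexing", "annotation", "metadata"}


@dataclass
class Resource:
    """A resource registered with the infrastructure, with subject labels."""
    name: str
    subjects: Set[str]


@dataclass
class LibrarySpec:
    """What a user or virtual organization asks for."""
    owner: str
    subjects: Set[str]
    services: List[str]


@dataclass
class VirtualLibrary:
    """The generated library: chosen services plus matching resources."""
    owner: str
    services: List[str]
    resources: List[str]


def generate_library(spec: LibrarySpec, catalog: List[Resource]) -> VirtualLibrary:
    # Reject requests for services the infrastructure does not provide.
    unknown = sorted(set(spec.services) - AVAILABLE_SERVICES)
    if unknown:
        raise ValueError(f"unsupported services: {unknown}")
    # Keep every resource that shares at least one subject with the spec,
    # preserving catalog order.
    selected = [r.name for r in catalog if r.subjects & spec.subjects]
    return VirtualLibrary(spec.owner, list(spec.services), selected)


catalog = [
    Resource("ocean-data", {"environment"}),
    Resource("manuscripts", {"heritage"}),
    Resource("climate-models", {"environment", "climate"}),
]
spec = LibrarySpec("env-team", {"environment"}, ["search", "indexing"])
library = generate_library(spec, catalog)
```

The two example subjects echo the cultural heritage and environmental scenarios mentioned for the test-bed; a real generator would also have to handle access rights and the lifetime of the generated library.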

Construction of the application layer of virtual digital libraries
The application layer of virtual digital libraries involves a portal generator and a visualization service. In the process of construction, the portal generator and visualization services should be adapted as conditions change. Under the e-Science environment at present, there are three main kinds of visualization service: scientific computing visualization, data visualization, and information visualization. Depending on the demand level of scientific research, visualization can also be classified into seven kinds: one-dimensional, two-dimensional, three-dimensional, multi-dimensional, time sequence, layered information, and network information visualization.
Meanwhile, when the application layer of a virtual digital library is set up, personal portal generators have to be provided according to the individual needs of scientific workers and virtual organizations so that the primary resources and the services based on these resources are available to all. In addition, given the system architecture of the DILIGENT project, some other aspects must be considered when constructing the application layer so that its parts work together: Annotation in the management of content and metadata; Content clustering; Storage monitoring; The personalization of the management of indexing and searching; Feature extraction.

CONCLUSION
In the environment of e-Science and e-Communities, digital libraries in China have received an opportunity while facing significant challenges. However, e-Science is having a definite effect on the digital library. During this dynamic and changing process, not only the research community but also the developers of digital libraries can use the newest developments of e-Science and pay more attention to how to satisfy users.