Research Papers
An IDA-Based Parallel Storage Scheme in the Scientific Data Grid
Authors:
Weizhong Lu,
Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
Graduate University of Chinese Academy of Sciences, Beijing, China
Yuanchun Zhou,
Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
Lei Liu,
Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
CODATA-China Secretariat, Beijing, China
Baoping Yan
Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
Abstract
Improving data reliability and data access efficiency is important for data-intensive applications in a data grid environment. In this paper, we propose an Information Dispersal Algorithm (IDA)-based parallel storage scheme for massive data distribution and parallel access in the Scientific Data Grid. The scheme partitions a data file into unrecognizable blocks and distributes them across many target storage nodes according to the user profile and system conditions. Only a subset of the blocks, which remote clients can download in parallel, is needed to reconstruct the data file. The scheme can be deployed on top of current grid middleware. A demonstration and experimental analysis show that the IDA-based parallel storage scheme offers better data reliability and data access performance than existing data replication methods. Furthermore, the scheme has the potential to considerably reduce storage requirements for large-scale databases on a data grid.
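The dispersal idea summarized in the abstract can be illustrated with a minimal sketch of a Rabin-style IDA. This is not the authors' implementation, which the abstract does not describe. The function names (`encode`, `decode`, `invert`), the parameters `n` (fragments stored) and `m` (fragments needed), and the choice of arithmetic modulo the prime 257 with a Vandermonde matrix are all assumptions made for illustration.

```python
# Illustrative sketch only: a Rabin-style IDA over the integers mod 257.
# All names and parameter choices here are assumptions, not the paper's code.
P = 257  # smallest prime above 255, so every byte value is a field element


def encode(data: bytes, n: int, m: int) -> list[list[int]]:
    """Disperse data into n fragments; any m of them can rebuild it."""
    padded = data + bytes(-len(data) % m)  # zero-pad to a multiple of m
    # Row i of a Vandermonde matrix built on the distinct points 1..n.
    rows = [[pow(x, j, P) for j in range(m)] for x in range(1, n + 1)]
    chunks = [padded[k:k + m] for k in range(0, len(padded), m)]
    return [[sum(a * b for a, b in zip(row, chunk)) % P for chunk in chunks]
            for row in rows]


def invert(matrix: list[list[int]]) -> list[list[int]]:
    """Invert a square matrix mod P by Gauss-Jordan elimination."""
    m = len(matrix)
    aug = [list(row) + [int(i == j) for j in range(m)]
           for i, row in enumerate(matrix)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col] % P)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)  # modular inverse (Python 3.8+)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(v - f * w) % P for v, w in zip(aug[r], aug[col])]
    return [row[m:] for row in aug]


def decode(fragments: dict[int, list[int]], m: int, length: int) -> bytes:
    """Rebuild the data from m fragments keyed by 0-based fragment index."""
    ids = sorted(fragments)[:m]
    inv = invert([[pow(i + 1, j, P) for j in range(m)] for i in ids])
    out = bytearray()
    for k in range(len(fragments[ids[0]])):
        col = [fragments[i][k] for i in ids]
        out.extend(sum(a * b for a, b in zip(row, col)) % P for row in inv)
    return bytes(out[:length])  # drop the zero padding


data = b"Scientific Data Grid"
frags = encode(data, n=5, m=3)
# Pretend fragments 0 and 2 are on unreachable nodes.
restored = decode({1: frags[1], 3: frags[3], 4: frags[4]}, m=3, length=len(data))
```

In this sketch each fragment holds about 1/m of the original symbols, so storing n = 5 fragments with m = 3 costs roughly 5/3 of the file size while tolerating the loss of any two fragments. Keeping three full replicas to tolerate the same two losses would cost three times the file size. Fragment symbols range over 0..256, so a real encoding would need slightly more than one byte per symbol or a field such as GF(2^8).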
How to Cite:
Lu, W., Zhou, Y., Liu, L. and Yan, B., 2010. An IDA-Based Parallel Storage Scheme in the Scientific Data Grid. Data Science Journal, 9, pp.29–41. DOI: http://doi.org/10.2481/dsj.009-006
Published on 19 May 2010.
Peer Reviewed