An IDA-Based Parallel Storage Scheme in the Scientific Data Grid

Weizhong Lu; Yuanchun Zhou; Lei Liu; Baoping Yan

Research Papers

Abstract

Data-intensive applications in a data grid environment depend on both data reliability and efficient data access. In this paper, we propose an Information Dispersal Algorithm (IDA)-based parallel storage scheme for massive data distribution and parallel access in the Scientific Data Grid. The scheme partitions a data file into unrecognizable blocks and distributes them across many target storage nodes according to the user profile and system conditions. Only a subset of the blocks, which remote clients can download in parallel, is required to reconstruct the data file. The scheme can be deployed on top of current grid middleware. A demonstration and experimental analysis show that the IDA-based parallel storage scheme provides better data reliability and data access performance than existing data replication methods. Furthermore, the scheme has the potential to considerably reduce storage requirements for large-scale databases on a data grid.
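The dispersal and reconstruction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, which this page does not show. It assumes a Rabin-style dispersal over the prime field GF(257), where each group of m bytes is treated as the coefficients of a polynomial evaluated at n distinct points. The names `disperse`, `reconstruct`, `solve_mod`, and `P`, and the parameters n = 5 and m = 3, are hypothetical choices made for illustration.

```python
P = 257  # assumed field size: smallest prime above 255, so every byte fits


def disperse(data: bytes, n: int, m: int) -> list[list[int]]:
    """Split data into n blocks so that any m of them can rebuild it."""
    padded = data + bytes(-len(data) % m)  # zero-pad to a multiple of m
    blocks = [[] for _ in range(n)]
    for start in range(0, len(padded), m):
        group = padded[start:start + m]
        for i in range(n):
            x = i + 1  # block i is the polynomial evaluated at x = i + 1
            blocks[i].append(
                sum(c * pow(x, j, P) for j, c in enumerate(group)) % P
            )
    return blocks


def solve_mod(matrix: list[list[int]], rhs: list[int]) -> list[int]:
    """Solve matrix * v = rhs over GF(P) by Gauss-Jordan elimination."""
    m = len(matrix)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col] % P)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)  # modular inverse (Python 3.8+)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(a - f * b) % P for a, b in zip(aug[r], aug[col])]
    return [aug[r][m] for r in range(m)]


def reconstruct(available: dict[int, list[int]], m: int, length: int) -> bytes:
    """Rebuild the original bytes from any m blocks, keyed by block index."""
    idx = sorted(available)[:m]
    vander = [[pow(i + 1, j, P) for j in range(m)] for i in idx]
    out = bytearray()
    for k in range(len(available[idx[0]])):
        out.extend(solve_mod(vander, [available[i][k] for i in idx]))
    return bytes(out[:length])


data = b"scientific data grid"
blocks = disperse(data, n=5, m=3)
# Pretend blocks 1 and 3 are unreachable; three survivors are enough.
survivors = {0: blocks[0], 2: blocks[2], 4: blocks[4]}
restored = reconstruct(survivors, m=3, length=len(data))
```

In this sketch each block holds about one m-th as many symbols as the input has bytes, so all n blocks together hold roughly n/m times the input's symbol count, compared with n times for n full replicas. Block symbols can take the value 256, so they are kept as integers rather than bytes. Re-solving the linear system for every group is done for clarity, not speed.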

Year: 2010

Volume 9

Page/Article: 29-41

DOI: 10.2481/dsj.009-006

Published on May 19, 2010

Peer Reviewed

CC BY 4.0