Research Papers

Measuring Data Quality of Geoscience Datasets Using Data Mining Techniques

Authors:

Cuo Cai,

Center for Information Science, Peking University, Beijing 100871, China

Kunqing Xie

Center for Information Science, Peking University, Beijing 100871, China

Abstract

Geoscience data are now collected by many methods, including station observations, satellite images, and sensor networks. Research today routinely combines these sources across different regions and time intervals. Mixing data sources can be beneficial, but it can also cause severe data quality problems, such as inconsistent data and missing values. There have been efforts to produce more consistent datasets from multiple sources. However, because data quality differs greatly among the sources, the resulting datasets still show unequal data quality across regions and time intervals. Because the methods used to construct these datasets are quite complicated, it is difficult for users to know the overall data quality of a dataset, let alone the quality at a specified location or in a given time interval. In this paper, the authors address the problem by generating a data quality measure for every region and time interval of a dataset. The measure is computed by comparing the constructed datasets with their sources or other relevant data, using data mining techniques. The paper also demonstrates how data mining techniques can handle major quality problems, such as outliers and missing values, in geoscience data, especially global climate data.
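As a rough illustration of the idea of a per-region, per-interval quality measure, the sketch below scores one cell of a constructed dataset against its source and flags outliers. It is not the authors' method, which the abstract does not specify. The function names `quality_score` and `flag_outliers`, the completeness-times-agreement formula, and the modified z-score threshold of 3.5 are all assumptions chosen for the example.

```python
from statistics import median


def quality_score(constructed, source):
    """Score one (region, time interval) cell in [0, 1].

    Both arguments are equal-length lists of floats, with None marking
    a missing value. The formula is an illustrative assumption:
    completeness (share of non-missing constructed values) multiplied
    by agreement (1 / (1 + mean absolute difference from the source)).
    """
    if not constructed:
        return 0.0
    # Keep only positions where the constructed dataset has a value.
    present = [(c, s) for c, s in zip(constructed, source) if c is not None]
    completeness = len(present) / len(constructed)
    # Agreement can only be judged where the source also has a value.
    paired = [(c, s) for c, s in present if s is not None]
    if not paired:
        return 0.0
    mean_abs_diff = sum(abs(c - s) for c, s in paired) / len(paired)
    agreement = 1.0 / (1.0 + mean_abs_diff)
    return completeness * agreement


def flag_outliers(values, threshold=3.5):
    """Flag outliers with a modified z-score based on the median
    absolute deviation (MAD). Missing values (None) are never flagged."""
    observed = [v for v in values if v is not None]
    if not observed:
        return [False] * len(values)
    med = median(observed)
    mad = median([abs(v - med) for v in observed])
    if mad == 0:
        # All observed values sit at the median; nothing to flag.
        return [False] * len(values)
    return [
        v is not None and 0.6745 * abs(v - med) / mad > threshold
        for v in values
    ]


# Hypothetical cells keyed by (region, time interval).
constructed = {("R1", "2000-01"): [10.0, None, 12.0, 11.0]}
source = {("R1", "2000-01"): [10.0, 11.0, 12.0, 11.0]}
scores = {
    cell: quality_score(values, source[cell])
    for cell, values in constructed.items()
}
```

Here the single cell has one missing value out of four and otherwise matches its source, so its score is driven purely by completeness. A real measure would need a scale-aware notion of agreement, since an absolute difference means different things for different variables.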
DOI: http://doi.org/10.2481/dsj.6.S738
How to Cite: Cai, C. & Xie, K., (2007). Measuring Data Quality of Geoscience Datasets Using Data Mining Techniques. Data Science Journal. 6, pp.S738–S742. DOI: http://doi.org/10.2481/dsj.6.S738
Published on 23 Oct 2007.
Peer Reviewed
