
Department of Computer Science

University of California, Santa Barbara

Abstract

Applying the Golden Rule of Sampling for Query Estimation

by: Yi-Leh Wu, Divyakant Agrawal, Amr El Abbadi

Abstract:

Query size estimation is crucial for many database system components. In particular, query optimizers need efficient and accurate query size estimation when deciding among alternative query plans. In this paper we propose a novel sampling technique for estimating range queries, based on the golden rule of sampling introduced by von Neumann in 1947. The proposed technique randomly samples the frequency domain using the cumulative frequency distribution and yields good estimates without any a priori knowledge of the actual underlying distribution of spatial objects. We show experimentally that the proposed sampling technique gives smaller approximation error than the Min-Skew histogram-based and wavelet-based approaches for both synthetic and real datasets. Moreover, the proposed technique can be easily extended to higher-dimensional datasets.
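
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it names: pick uniform positions in the frequency domain and invert the cumulative frequency distribution to obtain sample points, then scale the fraction of sample points that fall in a range. It assumes a one-dimensional domain split into buckets with integer counts. The function names (`build_cdf`, `draw_sample`, `estimate_range_count`) and the example data are hypothetical and are not taken from the paper.

```python
import bisect
import random
from itertools import accumulate


def build_cdf(freqs):
    """Return cumulative counts for a list of per-bucket integer frequencies."""
    return list(accumulate(freqs))


def draw_sample(cum, k, rng):
    """Draw k bucket indices by inverting the cumulative frequency distribution.

    Each draw picks a uniform integer position in the frequency domain
    [0, total) and maps it to the first bucket whose cumulative count
    exceeds that position, so buckets are chosen in proportion to their
    frequency and empty buckets are never chosen.
    """
    total = cum[-1]
    return [bisect.bisect_right(cum, rng.randrange(total)) for _ in range(k)]


def estimate_range_count(cum, sample, lo, hi):
    """Estimate how many objects lie in buckets lo..hi (inclusive)."""
    hits = sum(1 for b in sample if lo <= b <= hi)
    return cum[-1] * hits / len(sample)


# Hypothetical data: 100 objects spread over 5 buckets.
freqs = [5, 0, 20, 50, 25]
cum = build_cdf(freqs)
rng = random.Random(0)  # fixed seed so the run is repeatable
sample = draw_sample(cum, 2000, rng)
estimate = estimate_range_count(cum, sample, 2, 3)  # true count is 70
```

Only the sample needs to be kept at query time in this sketch. The cumulative counts are used once to draw it and to supply the total used for scaling.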

Keywords:

random sampling, cumulative frequency distribution, query estimation, range query

Date:

March 2001

Document: 2001-05

Updated 14-Nov-2005