Requirement-based data cube schema design
|Authors||Cheung, David W.; Lam, Tak Wah; Ting, Hing Fung|
|Source||Proceedings of the Eighth International Conference on Information and Knowledge Management : CIKM '99, Kansas City, MO, USA, ACM, New York, USA, 2-6 Nov. 1999, p. 162-169|
|Summary||On-line analytical processing (OLAP) requires efficient processing of complex decision support queries over very large databases. It is well accepted that pre-computed data cubes can help reduce the response time of such queries dramatically. A very important design issue of an efficient OLAP system is therefore the choice of the right data cubes to materialize. We call this problem the data cube schema design problem. In this paper we show that the problem of finding an optimal data cube schema for an OLAP system with limited memory is NP-hard. As a more computationally efficient alternative, we propose a greedy approximation algorithm cMP and its variants. Algorithm cMP consists of two phases. In the first phase, an initial schema consisting of all the cubes required to efficiently answer the user queries is formed. In the second phase, cubes in the initial schema are selectively merged to satisfy the memory constraint. We show that cMP is very effective in pruning the search space for an optimal schema. This leads to a highly efficient algorithm. We report the efficiency and the effectiveness of cMP via an empirical study using the TPC-D benchmark. Our results show that the data cube schemas generated by cMP enable very efficient OLAP query processing.|
|Rights||© ACM, 1999. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the Eighth International Conference on Information and Knowledge Management : CIKM '99, Kansas City, MO, USA, 2-6 Nov. 1999, ACM, New York, USA, 1999, p. 162-169|
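
The two-phase idea described in the summary (build one cube per query, then merge cubes until the schema fits in memory) can be illustrated with a small sketch. This is not the paper's cMP algorithm, whose merge criterion the summary does not specify. It is a simplified greedy stand-in under stated assumptions: a cube is a set of dimension attributes, its size is the product of attribute cardinalities capped by the fact table row count, a query is answered by the smallest cube containing its attributes, and each step picks the merge that leaves the smallest total schema size. All names (`cube_size`, `query_cost`, `greedy_merge`, the example attributes and cardinalities) are hypothetical.

```python
from itertools import combinations
from math import prod


def cube_size(cube, card, fact_rows):
    # Assumed size model: product of attribute cardinalities,
    # capped by the number of rows in the fact table.
    return min(prod(card[a] for a in cube), fact_rows)


def schema_size(schema, card, fact_rows):
    return sum(cube_size(c, card, fact_rows) for c in schema)


def query_cost(schema, queries, card, fact_rows):
    # Assumed cost model: each query scans the smallest cube
    # that contains all of its attributes.
    return sum(
        min(cube_size(c, card, fact_rows) for c in schema if q <= c)
        for q in queries
    )


def greedy_merge(queries, card, fact_rows, budget):
    # Phase 1: initial schema with one cube per distinct query.
    schema = set(queries)
    # Phase 2: merge pairs of cubes until the schema fits the budget.
    while schema_size(schema, card, fact_rows) > budget and len(schema) > 1:
        best = None
        for a, b in combinations(schema, 2):
            candidate = (schema - {a, b}) | {a | b}
            key = (
                schema_size(candidate, card, fact_rows),
                query_cost(candidate, queries, card, fact_rows),
            )
            if best is None or key < best[0]:
                best = (key, candidate)
        schema = best[1]
    if schema_size(schema, card, fact_rows) > budget:
        raise ValueError("no schema fits the memory budget under this model")
    return schema


# Hypothetical example data, not taken from the paper or from TPC-D.
card = {"part": 10, "supplier": 5, "month": 12}
fact_rows = 200
queries = [
    frozenset({"part", "supplier"}),
    frozenset({"part", "month"}),
    frozenset({"supplier"}),
]
result = greedy_merge(queries, card, fact_rows, budget=170)
```

In this example the initial schema needs 175 units (50 + 120 + 5), so the sketch folds the `supplier` cube into the `part, supplier` cube, leaving two cubes totalling 170 units. Every merged cube is a superset of the cubes it replaces, so every query remains answerable after each merge. A real design would replace the size and cost models with measured or estimated cube statistics.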