Aggregate Query Result Correctness using Pattern Tables
Abstract
State-of-the-art techniques for checking the correctness of aggregate query results work well only when a reference table is available. We propose a technique that works well even when the reference table is absent. It uses pattern tables to check the correctness of aggregate queries. The technique is demonstrated on the Sofia Air Quality Dataset, where complete clusters for aggregate queries are identified. The results show a 71% reduction in average query execution time for the Pattern Table Method (PTM) over the Data Table Method (DTM). Further, results on data scaled up to 5X show that PTM scales linearly, while DTM scales linearly only up to 3X. The algorithm execution time is analyzed with respect to data scale, the number of levels, and the number of NULLs. The algorithm is well behaved for data scaled up to 5X. Its behavior beyond level 3 needs further investigation. Ensuring the correctness of aggregate queries helps ensure the correctness of the analytics built on them.
Collections
- M Tech Dissertations [923]