Hot and cold data identification using query aware hybrid partitioning
Abstract
The falling price of main memory makes it feasible to hold large amounts of data in memory. OLTP applications have large databases, and some of them exhibit a skewed access pattern: not every record is accessed each time, and older data is less likely to be accessed than recent data. The objective is to store data so that memory is used optimally and queries execute faster. To identify hot and cold data, we propose a Query Aware Approach using Hybrid Partitioning (QAA-HP). For a given query workload, QAA-HP identifies the hot schema and the hot data corresponding to it. The hot and cold data can be configured differently so that the queries directed at each are accelerated. Several configuration techniques are presented for this purpose: vertically partitioned (binary) tables, n-ary tables, and horizontally partitioned tables. Our experiments use the TPC-C benchmark, which is an OLTP workload. Tables are first partitioned vertically for the hot schema and then partitioned horizontally for the hot data. Metrics for performance analysis are based on query analysis and query execution time. The results show that when 9% of the TPC-C data is placed in clusters, 79% of the hottest query workload is answered. The percentage time gain for the hottest queries run on hot clusters is 37% for cold runs and 31% for hot runs.
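The two-stage idea in the abstract (vertical partitioning for the hot schema, then horizontal partitioning for the hot data) can be illustrated with a minimal sketch. This is not the dissertation's QAA-HP algorithm: the table, column names, the frequency threshold, and the "recent rows are hot" predicate are all hypothetical, chosen only to show the shape of a hybrid partition.

```python
from collections import Counter

def hot_columns(workload, threshold):
    """Columns referenced by at least `threshold` fraction of queries (assumed rule)."""
    counts = Counter(col for query in workload for col in set(query["columns"]))
    total = len(workload)
    return {col for col, n in counts.items() if n / total >= threshold}

def hybrid_partition(rows, key, hot_cols, is_hot_row):
    """Vertical split by hot columns, then horizontal split of the hot-column part."""
    hot_cluster = []  # hot columns of hot rows
    cold_rows = []    # hot columns of cold rows
    cold_cols = []    # remaining columns of every row; key kept for rejoining
    for row in rows:
        hot_part = {c: v for c, v in row.items() if c == key or c in hot_cols}
        cold_part = {c: v for c, v in row.items() if c == key or c not in hot_cols}
        cold_cols.append(cold_part)
        (hot_cluster if is_hot_row(row) else cold_rows).append(hot_part)
    return hot_cluster, cold_rows, cold_cols

# Hypothetical workload and table, for illustration only.
workload = [
    {"columns": ["o_id", "o_status"]},
    {"columns": ["o_id", "o_status"]},
    {"columns": ["o_id", "o_date"]},
    {"columns": ["o_comment"]},
]
rows = [
    {"o_id": 1, "o_date": 2019, "o_status": "done", "o_comment": "a"},
    {"o_id": 2, "o_date": 2023, "o_status": "open", "o_comment": "b"},
    {"o_id": 3, "o_date": 2024, "o_status": "open", "o_comment": "c"},
]

hot = hot_columns(workload, threshold=0.5)
hot_cluster, cold_rows, cold_cols = hybrid_partition(
    rows, key="o_id", hot_cols=hot, is_hot_row=lambda r: r["o_date"] >= 2023
)
```

In this sketch only `hot_cluster` would need to stay in the fast configuration; `cold_rows` and `cold_cols` keep the key column so that a full row can still be reassembled when a query needs it.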
Collections
- M Tech Dissertations [923]