A Comparative Study of Time Series Databases
Abstract
Big data sectors such as the Internet of Things (IoT) generate enormous volumes of data. Because IoT devices produce vast amounts of time-series data, the popularity of Time Series Databases (TSDBs) has grown alongside the rise of IoT. Several large firms, such as Facebook and eBay, use time series databases rather than relational databases. Time series databases are designed to manage and analyze huge amounts of time series data. However, choosing the appropriate one among them is not easy. The most popular benchmarks compare the performance of different databases against each other but use random or synthetic data that applies to only one domain. A comprehensive comparison of time series database performance on a real-world dataset is therefore needed. We have experimented with real-world applications and summarised all the fundamental query types. Workloads are categorized into data loading, space consumption, and historical data access using basic queries. This study compares three TSDB systems: TimescaleDB, Druid, and InfluxDB. We also compare the performance of Cassandra (a non-relational database) against these three TSDB systems to see whether the results reveal anything of interest. Our experiments show significant differences in data ingestion time and query execution time between real and synthetic datasets. The results are reported and analyzed.
Collections
- M Tech Dissertations [923]