Report Open Access
The goal of this project is to prepare tools for comparing the performance of Kudu and Impala with that of the current Oracle schema across various data retrieval scenarios. To streamline the benchmarks and make them more reliable and repeatable, two tools were developed: DataPump and QueryBenchmark. DataPump transfers data from existing Oracle archives to Kudu, ensuring that the tests are executed on the same representative data sets. It also makes it possible to measure the highest achievable write rate to Kudu. QueryBenchmark measures the readout performance of Oracle and Kudu: it executes sets of queries specified in a configuration file and writes the results to report files, which can later be processed to generate performance statistics and plots.
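
The query-benchmark workflow described above (run a configured set of queries, record timings to a report, then derive statistics) can be sketched roughly as follows. This is only an illustrative sketch, not the actual QueryBenchmark code: the function names (`run_benchmark`, `write_report`, `summarize`), the CSV report layout, and the table and queries are all hypothetical, and an in-memory SQLite database stands in for the Oracle and Kudu/Impala connections so that the example runs on its own.

```python
import csv
import io
import sqlite3
import statistics
import time


def run_benchmark(conn, queries, repetitions=3):
    """Time each named query `repetitions` times; return one record per run.

    `queries` plays the role of the configuration file: a mapping from a
    query name to its SQL text (hypothetical format).
    """
    records = []
    for name, sql in queries.items():
        for run in range(repetitions):
            start = time.perf_counter()
            # Fetch all rows so that readout time is part of the measurement.
            result = conn.execute(sql).fetchall()
            elapsed = time.perf_counter() - start
            records.append(
                {"query": name, "run": run, "seconds": elapsed, "rows": len(result)}
            )
    return records


def write_report(records, out):
    """Write the raw measurements as CSV (assumed report format)."""
    writer = csv.DictWriter(out, fieldnames=["query", "run", "seconds", "rows"])
    writer.writeheader()
    writer.writerows(records)


def summarize(records):
    """Post-process the measurements: median elapsed time per query."""
    by_query = {}
    for rec in records:
        by_query.setdefault(rec["query"], []).append(rec["seconds"])
    return {name: statistics.median(times) for name, times in by_query.items()}


# Stand-in database; a real run would connect to Oracle or Impala instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (ts INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)", [(i, i * 0.5) for i in range(1000)]
)

# Hypothetical query set standing in for the configuration file.
queries = {
    "range_scan": "SELECT * FROM samples WHERE ts BETWEEN 100 AND 199",
    "aggregate": "SELECT AVG(value) FROM samples",
}

results = run_benchmark(conn, queries, repetitions=3)
report = io.StringIO()
write_report(results, report)
medians = summarize(results)
```

Keeping the raw per-run measurements in the report and computing statistics in a separate step mirrors the split described above, where report files are processed later to produce statistics and plots.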