How to Make your Duck Fly: Advanced Floating Point Compression to the Rescue
Creators
Description
The massive volumes of data generated in diverse domains, such
as scientific computing, finance and environmental monitoring,
hinder our ability to perform multidimensional analysis at high
speeds and also incur significant storage and egress costs.
Applying compression algorithms to reduce these costs is particularly
suitable for column-oriented DBMSs, as the values of an individual
column are usually similar and thus allow for effective compression.
However, this has not been the case for binary floating point
numbers, as the space savings achieved by existing floating point
compression algorithms are usually very modest. We present
here two lossless compression algorithms for floating point data,
termed Chimp and Patas, that attain impressive compression ratios
and greatly outperform state-of-the-art approaches. We focus
on how these two algorithms impact the performance of DuckDB,
a purpose-built embeddable database for interactive analytics.
Our demonstration will showcase how our novel compression
approaches a) reduce storage requirements, and b) improve the
time needed to load and query data using DuckDB.
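The description does not spell out how Chimp or Patas work internally, so the following is only a minimal sketch of the observation that similar neighbouring values in a column can be exploited for floating point data. It assumes an XOR-based scheme of the kind used by earlier time-series compressors such as Gorilla, and it is not the Chimp or Patas implementation. The helper names `double_bits` and `xor_zero_runs` and the sample `readings` are hypothetical.

```python
import struct


def double_bits(x: float) -> int:
    """Return the IEEE 754 binary64 bit pattern of x as an unsigned 64-bit int."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_zero_runs(prev: float, curr: float) -> tuple[int, int]:
    """Count leading and trailing zero bits in the XOR of two doubles.

    Long zero runs mean only a few "meaningful" bits would have to be stored.
    """
    x = double_bits(prev) ^ double_bits(curr)
    if x == 0:
        return 64, 64  # identical values: every bit of the XOR is zero
    leading = 64 - x.bit_length()
    trailing = (x & -x).bit_length() - 1  # index of the lowest set bit
    return leading, trailing


if __name__ == "__main__":
    # Hypothetical column of similar sensor readings (illustrative data only).
    readings = [20.5, 20.5, 20.75, 21.0, 20.25]
    for prev, curr in zip(readings, readings[1:]):
        lead, trail = xor_zero_runs(prev, curr)
        print(f"{prev} -> {curr}: {lead} leading, {trail} trailing zero bits")
```

In this sketch, a repeated value XORs to zero, and the step from 20.5 to 20.75 leaves a single set bit out of 64. Values with long, noisy fractional parts leave far fewer zero bits, which is one reason floating point columns are harder to compress than integer or string columns.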
Files
Name | Size | Checksum |
---|---|---|
Chimp_Patas_Demo.pdf | 983.1 kB | md5:c1ffabff47c4acedd53a3ed76f81bcef |