krassowski/data-vault: v0.4.1
Description
IPython magic for simple, organized, compressed and encrypted storage & transfer of files between notebooks.
The %vault magic provides a reproducible caching mechanism for exchanging variables between notebooks. The cache is compressed, persistent and safe.
Unlike the builtin %store magic, the variables are stored in plain sight, in a zipped archive, so that they can be easily accessed for manual inspection or used by other tools.
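Because the vault is an ordinary zip archive, it can be inspected with standard tools. The following is a minimal sketch using Python's zipfile module; the helper list_archive_members and the member name datasets/salaries are hypothetical, since the internal layout of the archive is not described here, and a small stand-in archive is built so the snippet runs on its own:

```python
import tempfile
import zipfile
from pathlib import Path


def list_archive_members(path):
    """Return (name, uncompressed size in bytes, CRC32 as hex) for each file in a zip archive."""
    with zipfile.ZipFile(path) as archive:
        return [
            (info.filename, info.file_size, format(info.CRC, '08X'))
            for info in archive.infolist()
        ]


# Build a stand-in archive; the member name is an assumption, not the
# layout that data-vault is known to use.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / 'storage.zip'
    with zipfile.ZipFile(demo_path, 'w', compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr('datasets/salaries', 'salary\tcity\n42\tWarsaw\n')
    members = list_archive_members(demo_path)

for name, size, crc in members:
    print(name, size, crc)
```

Pointing the same helper at a real vault file (for example data/storage.zip from the demonstration) should list whatever the vault has stored.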
Usage demonstration:
Let's open the vault (it will be created if it does not exist yet):
%open_vault -p data/storage.zip
Generate a dummy dataset:
from pandas import DataFrame
from random import choice, randint

cities = ['London', 'Delhi', 'Tokyo', 'Lagos', 'Warsaw', 'Chongqing']
# 10,000 rows, each with a random salary (0-100) and a random city
salaries = DataFrame([
    {'salary': randint(0, 100), 'city': choice(cities)}
    for i in range(10000)
])
Store variable in a module
And store it in the vault:
%vault store salaries in datasets
Stored salaries (None → 40CA7812) at Sunday, 08. Dec 2019 11:58
A short description (including a CRC32 hashsum and a timestamp) is printed out by default, but can be disabled by passing --timestamp False to the %open_vault magic. Even more information enhancing the reproducibility is stored in the cell metadata.
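To illustrate what an eight-digit CRC32 hashsum such as 40CA7812 looks like, here is a minimal sketch using zlib.crc32 from the standard library. The helper crc32_hex is hypothetical, and how data-vault serializes a variable before hashing it is an assumption not covered here, so the value printed below will not match the one in the demonstration:

```python
import zlib


def crc32_hex(payload: bytes) -> str:
    """Format the CRC32 checksum of `payload` as eight uppercase hex digits."""
    return format(zlib.crc32(payload) & 0xFFFFFFFF, '08X')


# Illustrative payload only; the bytes data-vault actually hashes are not specified here.
payload = 'salary,city\n42,Warsaw\n'.encode('utf-8')
print(crc32_hex(payload))
```

Identical bytes always yield the same checksum, which is what makes it possible to check that the object imported in one notebook is the same one that was stored in another.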
Import variable from a module
We can now load the stored DataFrame in another (or the same) notebook:
%vault import salaries from datasets
Imported salaries (40CA7812) at Sunday, 08. Dec 2019 12:02
Thanks to (optional) memory optimizations we saved some RAM (87% as compared to the unoptimized pd.read_csv() result). To track how many MB were saved, use the --report_memory_gain setting, which displays memory optimization results below imports, for example:
Reduced memory usage by 87.28%, from 0.79 MB to 0.10 MB.
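The kind of saving reported above can be reproduced with plain pandas. The sketch below is an assumption about what such optimizations might involve (downcasting integers and converting repetitive strings to categories); it is not taken from the data-vault implementation, and the helper optimize_memory is hypothetical:

```python
from random import Random

import pandas as pd


def optimize_memory(df):
    """Return a copy with integers downcast and repetitive string columns as categories."""
    out = df.copy()
    for name in out.columns:
        col = out[name]
        if pd.api.types.is_integer_dtype(col):
            # e.g. int64 -> int8 when all values fit
            out[name] = pd.to_numeric(col, downcast='integer')
        elif pd.api.types.is_string_dtype(col) and col.nunique() < 0.5 * len(col):
            # few distinct values: store each string once and keep small integer codes
            out[name] = col.astype('category')
    return out


rng = Random(0)
cities = ['London', 'Delhi', 'Tokyo', 'Lagos', 'Warsaw', 'Chongqing']
salaries = pd.DataFrame([
    {'salary': rng.randint(0, 100), 'city': rng.choice(cities)}
    for _ in range(10000)
])

before = salaries.memory_usage(deep=True).sum()
optimized = optimize_memory(salaries)
after = optimized.memory_usage(deep=True).sum()
print(f'Reduced memory usage by {100 * (before - after) / before:.2f}%')
```

The exact percentage depends on the pandas version and the data, so it need not match the figure printed by the magic.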
Files
krassowski/data-vault-v0.4.1.zip (28.2 kB)
md5:40dab86258c231abbf713aab456f24f4
Additional details
Related works
- Is supplement to
- https://github.com/krassowski/data-vault/tree/v0.4.1 (URL)