esowc/ecPoint-Calibrate: v0.11.0
Description
This release primarily addresses #112 and #116 by letting the user choose a storage format for the PDT that is highly optimized and more likely to fit in memory than ASCII files. This is achieved through a combination of the Parquet data format, lazy evaluation, and categorical types.
The release also includes fixes for some bugs in Milestone 1.
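As a rough illustration of why categorical types help with memory, the sketch below dictionary-encodes a column in plain Python: each distinct label is stored once, and every row keeps only a small integer code. This shows the general idea only and is not the project's implementation; the function names and the sample column are made up for the example.

```python
from array import array


def encode_categorical(values):
    """Dictionary-encode a column: unique labels plus one integer code per row."""
    categories = []     # each distinct label is stored exactly once
    index = {}          # label -> position in `categories`
    codes = array("H")  # compact unsigned integer codes, one per row
    for value in values:
        if value not in index:
            index[value] = len(categories)
            categories.append(value)
        codes.append(index[value])
    return categories, codes


def decode_categorical(categories, codes):
    """Rebuild the original column from the labels and codes."""
    return [categories[code] for code in codes]


# Hypothetical column with few distinct values repeated many times.
column = ["rain", "snow", "rain", "rain", "snow"] * 1000
categories, codes = encode_categorical(column)
restored = decode_categorical(categories, codes)
```

The saving grows with the ratio of rows to distinct values, which is why the technique suits columns that repeat a small set of labels.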
#### User-facing updates
- Date fields in PDT are formatted as `YYYY-MM-DD` (previously `YYYYMMDD`), to prevent automatic type-cast to `int64`. (5c1580e)
- Implement a high-performance loader based on Apache Parquet. (fcc7ee8)
- Allow selecting between ASCII and Parquet containers for storing PDT, on the GUI. (91193c1)
- Support reading from Parquet files in postprocessing. The container type will be automatically inferred. (c8c519c)
- Improve memory utilization of PDT preloader during postprocessing, using lazy evaluation. (1d569b9)
- Fix metadata in PDT for Forecast Error. (#93, 06c8ae7)
- Add loader in postprocessing workflow for expensive actions. (edfa101)
- Make FE(R) CSV dumps more legible. (#96, d9d8af4)
- Allow negative addition factor in computations. (#83, 0cad6b8)

#### Technical improvements
- Significantly reduce Docker build context for the core service.
- Implement an interface for PDT loaders. (3cfccf9)
- Allow selecting specific columns from a PDT, without loading the whole file in memory. (3cfccf9)
- Implement iterator protocol for point data table loaders. This will allow us to perform lazy operations on PDT in future. (a0179fa)
- Cache dataframe and column properties in PDT loaders.
- Other typing improvements to loader classes.
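
The column selection and iterator protocol items above can be sketched as follows. This is a minimal stdlib stand-in, not the project's loader interface: the class `ChunkedTableLoader`, its `select` method, and the sample column names are all hypothetical. Selecting columns only records the request; data is read chunk by chunk when the loader is iterated.

```python
import csv
import io
from itertools import islice


class ChunkedTableLoader:
    """Hypothetical loader: yields a table in fixed-size chunks of selected columns."""

    def __init__(self, open_source, columns=None, chunk_size=2):
        self.open_source = open_source  # callable returning a fresh text stream
        self.columns = columns          # None means "keep every column"
        self.chunk_size = chunk_size

    def select(self, *columns):
        """Return a new loader restricted to the given columns; reads nothing yet."""
        return ChunkedTableLoader(self.open_source, list(columns), self.chunk_size)

    def __iter__(self):
        with self.open_source() as stream:
            reader = csv.DictReader(stream)
            keep = self.columns or reader.fieldnames
            while True:
                # Hold at most `chunk_size` rows in memory at a time.
                rows = list(islice(reader, self.chunk_size))
                if not rows:
                    return
                yield [{name: row[name] for name in keep} for row in rows]


# Made-up sample data standing in for a PDT file on disk.
SAMPLE = "date,fer,cape\n2019-01-01,0.5,120\n2019-01-02,-0.25,80\n2019-01-03,1.0,300\n"
loader = ChunkedTableLoader(lambda: io.StringIO(SAMPLE))
chunks = list(loader.select("date", "fer"))
```

This CSV stand-in still parses every row in full before discarding unselected columns. A columnar format such as Parquet can skip unselected columns at read time, which is where most of the memory benefit comes from.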
Note: PDT = Point Data Table
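
The date format change listed under the user-facing updates can be illustrated with a small sketch. It assumes a reader that infers types by trying an integer cast first, which is a simplification of how tabular readers typically guess column types; `infer_value` is a hypothetical helper written for this example, not part of the project.

```python
from datetime import date


def infer_value(text):
    """Naive inference for this sketch: integer if possible, else keep the string."""
    try:
        return int(text)
    except ValueError:
        return text


old_style = infer_value("20190131")    # digits only, so it becomes an integer
new_style = infer_value("2019-01-31")  # hyphens block the integer cast

# The hyphenated form is also valid ISO 8601, so it parses directly as a date.
parsed = date.fromisoformat(new_style)
```

With the old format, the date silently turns into a number and has to be converted back before any date arithmetic. The hyphenated format stays a string and can be parsed as a date without extra handling.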
Files
Name | Size | MD5 |
---|---|---|
esowc/ecPoint-Calibrate-v0.11.0.zip | 38.2 MB | 7331a6f08f76634baacb0db764d8ad69 |
Additional details
Related works
- Is supplement to
- https://github.com/esowc/ecPoint-Calibrate/tree/v0.11.0 (URL)