Perform quality assurance on a raw IEA data file
When starting to work with an IEA data file,
it is important to verify its integrity.
This function performs some validation tests on .iea_file.
Usage
iea_file_OK(
  .iea_file = NULL,
  text = NULL,
  expected_1st_line_start = ",,TIME",
  expected_2nd_line_start = "COUNTRY,FLOW,PRODUCT",
  expected_simple_start = expected_2nd_line_start,
  .slurped_iea_df = NULL,
  country = "COUNTRY",
  flow = "FLOW",
  product = "PRODUCT",
  rowid = "rowid"
)
Arguments
- .iea_file
the path to the raw IEA data file for which quality assurance is desired
- text
a string containing text to be parsed as an IEA file.
- expected_1st_line_start
the expected start of the first line of
.iea_file. Default is ",,TIME".
- expected_2nd_line_start
the expected start of the second line of
.iea_file. Default is "COUNTRY,FLOW,PRODUCT".
- expected_simple_start
an alternative expected start of the first line of
.iea_file. Default is the value of expected_2nd_line_start. Note that expected_simple_start is sometimes encountered in data supplied by the IEA. Furthermore, expected_simple_start could be the format of the file when somebody "helpfully" fiddles with the raw data from the IEA.
- .slurped_iea_df
a data frame created by
slurp_iea_to_raw_df()
- country
the name of the country column. Default is "COUNTRY".
- flow
the name of the flow column. Default is "FLOW".
- product
the name of the product column. Default is "PRODUCT".
- rowid
the name of a per-country row number column added internally to
the data read from .iea_file. Default is "rowid".
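To illustrate how the three expected-start arguments relate to one another, here is a minimal base R sketch. The helper header_layout() and the sample strings are hypothetical and are not part of the package; the sketch only assumes that a header is recognized by comparing the start of each line with the expected prefix.

```r
# Hypothetical helper, NOT part of the package: classify the header layout
# of IEA-style text by comparing line starts with the expected prefixes.
header_layout <- function(text,
                          expected_1st_line_start = ",,TIME",
                          expected_2nd_line_start = "COUNTRY,FLOW,PRODUCT",
                          expected_simple_start = expected_2nd_line_start) {
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]
  # Two-line header: both the first and the second line match.
  if (length(lines) >= 2 &&
      startsWith(lines[1], expected_1st_line_start) &&
      startsWith(lines[2], expected_2nd_line_start)) {
    return("two-line header")
  }
  # Simple header: the first line already starts with the column names.
  if (length(lines) >= 1 && startsWith(lines[1], expected_simple_start)) {
    return("simple header")
  }
  "unrecognized"
}

# Made-up sample text, for illustration only.
two_line <- ",,TIME,1960,1961\nCOUNTRY,FLOW,PRODUCT\nWorld,Production,Hard coal,42,43"
simple <- "COUNTRY,FLOW,PRODUCT,1960,1961\nWorld,Production,Hard coal,42,43"

header_layout(two_line)
header_layout(simple)
```

Because expected_simple_start defaults to expected_2nd_line_start, a file whose first line begins with the column names is treated as the simple layout unless a different prefix is supplied.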
Details
At this time, the only verification step performed by this function is confirming that every country has the same flow and product rows in the same order. The approach is to add a per-country row number column to the data frame and delete all the data in the year columns. The resulting data frame is then queried for duplicate row numbers. If none are found, the check passes; the example below returns TRUE.
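The row-order check described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helper same_row_order() and the toy data frames are hypothetical, and the sketch de-duplicates the country-free rows before looking for repeated row numbers.

```r
# Made-up data: two countries with identical flow/product rows in the same order.
df <- data.frame(
  COUNTRY = c("A", "A", "B", "B"),
  FLOW    = c("Production", "Imports", "Production", "Imports"),
  PRODUCT = c("Hard coal", "Hard coal", "Hard coal", "Hard coal"),
  `1960`  = c(1, 2, 3, 4),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper, NOT the package's implementation.
same_row_order <- function(df, country = "COUNTRY", flow = "FLOW",
                           product = "PRODUCT", rowid = "rowid") {
  # Keep only the identifying columns, which discards the year data.
  ids <- df[, c(country, flow, product), drop = FALSE]
  # Number the rows within each country, in file order.
  ids[[rowid]] <- ave(seq_len(nrow(ids)), ids[[country]], FUN = seq_along)
  # Drop the country column and collapse identical rows.
  layout <- unique(ids[, c(rowid, flow, product), drop = FALSE])
  # A row number that still occurs more than once is paired with different
  # flow/product combinations in different countries.
  !any(duplicated(layout[[rowid]]))
}

# Same rows, but country B lists Imports before Production.
df_bad <- df[c(1, 2, 4, 3), ]

same_row_order(df)
same_row_order(df_bad)
```

In this sketch a mismatch only produces FALSE; a stricter checker might instead report which row numbers and countries disagree.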
Note that .iea_file is read internally with data.table::fread() without stripping white space.
If .slurped_iea_df is supplied, the .iea_file and text arguments are ignored.
If .slurped_iea_df is absent,
either .iea_file or text is required, and
the helper function slurp_iea_to_raw_df() is called internally
to load the raw data frame.
Examples
library(magrittr)
sample_iea_data_path() %>%
  iea_file_OK()
#> [1] TRUE