1 Goal of the script

This script reads all CSV files exported by the GOM Inspect scripts and combines them into a single XLSX file and a single R object.
The script will:

  1. Read in the original CSV-files and combine them (sections 3 and 4)
  2. Write an XLSX-file and save an R object ready for further analysis in R (section 6)

dir_in <- "analysis/raw_data/BU-072"
dir_out <- "analysis/derived_data/"

Raw data must be located in ~/analysis/raw_data/BU-072.
Formatted data will be saved in ~/analysis/derived_data/.

The knit directory for this script is the project directory.
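
As a minimal sketch of how the two directories could be validated before any file is read, the hypothetical helper check_dirs() below (not part of the original script) stops if the input directory is missing and creates the output directory if needed. The temporary folders are only for demonstration.

```r
# Hypothetical helper, not part of the original script
check_dirs <- function(dir_in, dir_out) {
  # Drop trailing path separators so that dir.exists() behaves the same on all platforms
  dir_in  <- sub("[/\\\\]+$", "", dir_in)
  dir_out <- sub("[/\\\\]+$", "", dir_out)
  if (!dir.exists(dir_in)) {
    stop("Input directory not found: ", dir_in)
  }
  if (!dir.exists(dir_out)) {
    dir.create(dir_out, recursive = TRUE)
  }
  invisible(TRUE)
}

# Demonstration in a temporary folder (paths are illustrative only)
demo_root <- file.path(tempdir(), "check_dirs_demo")
demo_in   <- file.path(demo_root, "raw_data")
demo_out  <- file.path(demo_root, "derived_data/")
dir.create(demo_in, recursive = TRUE, showWarnings = FALSE)
check_dirs(demo_in, demo_out)
```

In the script itself, the call would be check_dirs(dir_in, dir_out), placed right after the two paths are defined.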


2 Load packages

pack_to_load <- c("openxlsx", "R.utils")
sapply(pack_to_load, library, character.only = TRUE, logical.return = TRUE)
Warning: package 'openxlsx' was built under R version 4.1.3
Warning: package 'R.utils' was built under R version 4.1.3
openxlsx  R.utils 
    TRUE     TRUE 

3 Read in original CSV-files

3.1 List all CSV-files

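# List all CSV-files whose names contain "_w_"
Angles <- list.files(dir_in, pattern = "_w_.*\\.csv$", full.names = TRUE)

# Collect the file information (size, modification time, etc.) of each file
Angles.infos <- vector(mode = "list", length = length(Angles))
for (i in seq_along(Angles.infos)){
  Angles.infos[[i]] <- file.info(Angles[i])
}

# Combine into one data.frame, with the file name as first column
Angles.infos <- do.call(rbind, Angles.infos)
Angles.infos <- data.frame(file = basename(row.names(Angles.infos)), Angles.infos) 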
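
Since file.info() accepts a vector of paths, the loop above could probably be replaced by a single call. The sketch below illustrates this on two toy files created in a temporary folder; the file names and the object names toy_dir, toy_files, files and infos are made up for the example.

```r
# Create two toy CSV-files in a temporary folder (names are illustrative only)
toy_dir <- file.path(tempdir(), "fileinfo_demo")
dir.create(toy_dir, showWarnings = FALSE)
toy_files <- file.path(toy_dir, c("a_w_1.csv", "b_w_2.csv"))
for (f in toy_files) writeLines("x,y", f)

# Same pattern as in the script
files <- list.files(toy_dir, pattern = "_w_.*\\.csv$", full.names = TRUE)

# file.info() is vectorized: one row per file, row names are the paths
infos <- file.info(files)
infos <- data.frame(file = basename(row.names(infos)), infos)
```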

4 Read in the CSV-files

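# Read each CSV-file into one element of a list
Angles.data <- vector(mode = "list", length = length(Angles))
for (i in seq_along(Angles.data)){
  Angles.data[[i]] <- read.csv(Angles[i])
  # Step size = difference between the first two values of the 3rd column (distance to intersection)
  Angles.data[[i]][["steps"]] <- Angles.data[[i]][2,3] - Angles.data[[i]][1,3]
}

# Combine all files, rename the columns and move "steps" to the 3rd position
Angles.data <- do.call(rbind, Angles.data)
names(Angles.data) <- c("section", "angle_number", "dist_intersection", "segment_length", "3points", "2lines", "best_fit", "steps")
Angles.data <- Angles.data[c(1:2, 8, 3:7)]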
str(Angles.data)
'data.frame':   1416 obs. of  8 variables:
 $ section          : chr  "BU-072_1_E1_RE_SEC-01_local" "BU-072_1_E1_RE_SEC-01_local" "BU-072_1_E1_RE_SEC-01_local" "BU-072_1_E1_RE_SEC-01_local" ...
 $ angle_number     : int  1 2 3 4 5 6 7 8 9 10 ...
 $ steps            : num  0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 ...
 $ dist_intersection: num  0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 ...
 $ segment_length   : num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
 $ 3points          : num  76.3 63.6 55.6 50.9 47.3 45.2 41.6 39.1 38.4 37.3 ...
 $ 2lines           : num  68.6 46.2 39.9 35.5 34.3 27.4 25.1 27 29.2 26.2 ...
 $ best_fit         : num  95.9 46.1 39.3 37.1 33.2 28.8 22.1 27.7 30 26.3 ...
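
The step size is derived from the first two rows of each file only, which assumes that the distances are evenly spaced. A possible sanity check, sketched here on made-up data (the object toy and its values are illustrative, not taken from the real files), compares all consecutive differences with that step size.

```r
# Toy data mimicking the first three columns of one exported file
toy <- data.frame(section = "S1",
                  angle_number = 1:4,
                  dist_intersection = c(0.2, 0.4, 0.6, 0.8))

# Step size as computed in the script: row 2 minus row 1 of the 3rd column
step <- toy[2, 3] - toy[1, 3]

# Check that every consecutive difference equals the step size
# (all.equal() tolerates small floating-point differences)
steps_constant <- isTRUE(all.equal(diff(toy[[3]]), rep(step, nrow(toy) - 1)))
```

If steps_constant were FALSE for a file, a single step value per file would not describe its spacing correctly.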

5 Extract units of all variables

The units are taken from the header of the first CSV-file and attached to the data object Angles.data as a comment. The derived variable steps is given the same unit as dist_intersection.

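# Column headers of the first CSV-file
headers <- unlist(strsplit(readLines(Angles[1], n = 1), ","))
# Keep only the text within square brackets; drop the first two columns (section and angle number)
units.var <- sub(pattern = ".*\\[(.+)\\]", "\\1", headers)[-(1:2)]
# "steps" gets the same unit as "dist_intersection"
units.var <- c(rep(units.var[1],2), units.var[-1])
names(units.var) <- names(Angles.data)[-(1:2)]
# Attach the units to the data as a comment and build a table for the XLSX-file
comment(Angles.data) <- units.var
units.var.table <- data.frame(variable = names(units.var), unit = units.var)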
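
To illustrate how the regular expression extracts the units, here is a small sketch on a made-up header line. The column labels and units in header_line are assumptions for the example; the real GOM Inspect headers may look different.

```r
# Hypothetical header line (labels and units are illustrative only)
header_line <- "Section,Angle,Distance [mm],Length [mm],Angle 3 points [deg]"
headers_demo <- unlist(strsplit(header_line, ","))

# Replace the whole header by the text within square brackets;
# headers without brackets are returned unchanged
units_demo <- sub(pattern = ".*\\[(.+)\\]", "\\1", headers_demo)

# Drop the first two columns, as in the script
units_demo <- units_demo[-(1:2)]

# Attach the units to a toy data.frame as a comment and read them back
toy_df <- data.frame(dist = 1, len = 2, ang = 3)
comment(toy_df) <- units_demo
units_back <- unname(comment(toy_df))
```

Later in an analysis, comment(Angles.data) would return the stored units in the same way.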

6 Save data

6.1 Format name of output file

file_out <- "BU-072"

The files will be saved as “~/analysis/derived_data/BU-072.[ext]”.

6.2 Write to XLSX

write.xlsx(list(data = Angles.data, units = units.var.table, CSV_infos = Angles.infos), 
           file = paste0(dir_out, file_out, ".xlsx"))

6.3 Save R object

saveObject(Angles.data, file = paste0(dir_out, file_out, ".Rbin"))
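
For further analysis, the saved object can be read back with loadObject() from R.utils. The sketch below shows a round trip on a toy data.frame written to a temporary file; the objects toy_obj, tmp_file and restored are made up for the example, and the code is skipped if R.utils is not installed.

```r
if (requireNamespace("R.utils", quietly = TRUE)) {
  # Toy object with a unit stored as comment (illustrative only)
  toy_obj <- data.frame(x = 1:3)
  comment(toy_obj) <- "mm"

  # Save to a temporary file and load it again
  tmp_file <- tempfile(fileext = ".Rbin")
  R.utils::saveObject(toy_obj, file = tmp_file)
  restored <- R.utils::loadObject(tmp_file)

  # The restored object should match the original, including the comment
  roundtrip_ok <- identical(restored, toy_obj)
} else {
  roundtrip_ok <- NA
}
```

With the paths used in this script, the call would be loadObject(paste0(dir_out, file_out, ".Rbin")).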

7 sessionInfo() and RStudio version

sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] R.utils_2.11.0    R.oo_1.24.0       R.methodsS3_1.8.1 openxlsx_4.2.5   

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.8.3    digest_0.6.29   R6_2.5.1        jsonlite_1.8.0 
 [5] magrittr_2.0.3  evaluate_0.15   zip_2.2.0       stringi_1.7.6  
 [9] rlang_1.0.2     cli_3.3.0       rstudioapi_0.13 jquerylib_0.1.4
[13] bslib_0.3.1     rmarkdown_2.14  tools_4.1.0     stringr_1.4.0  
[17] xfun_0.30       yaml_2.3.5      fastmap_1.1.0   compiler_4.1.0 
[21] htmltools_0.5.2 knitr_1.39      sass_0.4.1     

RStudio version 1.4.1717.

8 Cite R packages used

for (i in pack_to_load) print(citation(i), bibtex = FALSE)

To cite package 'openxlsx' in publications use:

  Philipp Schauberger and Alexander Walker (2021). openxlsx: Read,
  Write and Edit xlsx Files. R package version 4.2.5.
  https://CRAN.R-project.org/package=openxlsx


To cite package 'R.utils' in publications use:

  Henrik Bengtsson (2021). R.utils: Various Programming Utilities. R
  package version 2.11.0. https://CRAN.R-project.org/package=R.utils

END OF SCRIPT