This R Markdown document transforms data that are not in CWP format into CWP format. It first restructures the data, then maps the codes to CWP standards. The document is generated automatically from the function https://raw.githubusercontent.com/eblondel/geoflow-tunaatlas/master/R/tunaatlas_scripts/pre-harmonization/east_pacific_ocean_catch_1deg_1m_bb_tunaatlasiattc_level0__byflag.R, and the documentation keeps the format of the roxygen2 skeleton.
A summary of the mapping process is provided, and the path to the dataset is specified. The first line of each dataset is available in this same GitHub repository. Datasets are named after the historical name given by the tRFMOs at export time, and these names may change. The information in this Rmd identifies which dataset should be used with this markdown.
Additional operations are performed next to verify other aspects of the data, such as the consistency of the geolocation, the values, and the reported catches in numbers and tons.
If you are interested in further details, the results and code are available for review.
Each .Rmd script must be knitted with the dataset path set at the
beginning of the script so that the harmonization process runs
correctly. The code can also be run chunk by chunk, provided the
working directory is the one containing the .Rmd.
path_to_raw_dataset <- here::here('R/tunaatlas_scripts/pre-harmonization', 'iattc', 'catch', 'data', 'PublicLPTunaFlag.csv')
Harmonize data structure of IATTC LP (Pole-and-line) catch datasets
This function harmonizes the structure of IATTC catch-and-effort datasets specifically for LP (Pole-and-line) catches under the ‘LPTunaFlag’ designation. The function transforms raw dataset inputs into a harmonized format suitable for integration into the Tuna Atlas database. The function assumes specific initial data columns and outputs a structured dataset with additional metadata and code lists as needed.
@return This function does not return a value but outputs harmonized datasets and related files specified by the process for integration within the Tuna Atlas database.
@importFrom dplyr %>% select mutate
@import reshape
@seealso to convert IATTC nominal catch data structure, to convert IATTC LLTunaBillfish and LLShark data structure,
@keywords IATTC, tuna, fisheries, data harmonization
@export
@author Paul Taconet, IRD
@author Bastien Grasset, IRD
# Input data sample:
# Year Month Flag LatC1 LonC1 NumSets ALB BET BKJ BZX PBF SKJ TUN YFT
# 1978 1 USA 3.5 -79.5 2 0 0 0 0 0 6.05 0 4.74
# 1978 1 USA 20.5 -114.5 2 0 0 0 0 0 3.53 0 2.76
# 1978 1 USA 23.5 -111.5 2 0 0 0 0 0 20.80 0 4.50
# 1978 1 USA 23.5 -109.5 1 0 0 0 0 0 0.00 0 0.90
# 1978 1 USA 24.5 -111.5 1 0 0 0 0 0 1.51 0 1.18
# 1978 1 USA 25.5 -114.5 2 0 0 0 0 0 5.00 0 3.60
# Catch: final data sample:
# FishingFleet Gear time_start time_end AreaName School Species CatchType CatchUnits Catch
# USA LL 1992-07-01 1992-08-01 6425135 ALL BSH ALL NO 4
# USA LL 1993-04-01 1993-05-01 6425135 ALL BSH ALL NO 75
# USA LL 1993-04-01 1993-05-01 6430135 ALL BSH ALL NO 15
# USA LL 1993-05-01 1993-06-01 6425135 ALL BSH ALL NO 24
# USA LL 1994-03-01 1994-04-01 6425135 ALL BSH ALL NO 14
# USA LL 1994-03-01 1994-04-01 6430135 ALL BSH ALL NO 4
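The move from the wide input layout (one column per species) to the long layout of the final sample can be sketched as follows. This is a minimal base R illustration on two rows copied from the input sample, not the code of the harmonization function itself; the names `raw`, `long`, `species_cols` and `id_cols` are hypothetical, and only two species columns are kept for brevity.

```r
# Two rows copied from the input data sample (only SKJ and YFT kept).
raw <- data.frame(
  Year = c(1978, 1978),
  Month = c(1, 1),
  Flag = c("USA", "USA"),
  LatC1 = c(3.5, 20.5),
  LonC1 = c(-79.5, -114.5),
  NumSets = c(2, 2),
  SKJ = c(6.05, 3.53),
  YFT = c(4.74, 2.76),
  stringsAsFactors = FALSE
)

species_cols <- c("SKJ", "YFT")
id_cols <- setdiff(names(raw), c(species_cols, "NumSets"))

# Stack the species columns: one row per stratum and species.
long <- do.call(rbind, lapply(species_cols, function(sp) {
  cbind(raw[id_cols], Species = sp, Catch = raw[[sp]], stringsAsFactors = FALSE)
}))

# time_start is the first day of the month, time_end the first day of the next month.
long$time_start <- as.Date(sprintf("%d-%02d-01", long$Year, long$Month))
next_month <- long$Month %% 12 + 1
next_year <- long$Year + (long$Month == 12)
long$time_end <- as.Date(sprintf("%d-%02d-01", next_year, next_month))

print(long[, c("Flag", "time_start", "time_end", "Species", "Catch")])
```

The modulo step handles December, where the period ends on 1 January of the following year.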
packages
if (!require(data.table)) {
  install.packages("data.table")
  require(data.table)
}
if (!require(dplyr)) {
  install.packages("dplyr")
  require(dplyr)
}
if (!require(reshape2)) {
  install.packages("reshape2")
  require(reshape2)
}
Historical name for the dataset at source PublicLPTunaFlag.csv
opts <- options()
options(encoding = "UTF-8")
# Catches

Reach the catches pivot DSD using a function stored in IATTC_functions.R

``` r
source("https://raw.githubusercontent.com/firms-gta/geoflow-tunaatlas/master/R/sardara_functions/FUN_catches_IATTC_CE_Flag_or_SetType.R")
catches_pivot_IATTC <- FUN_catches_IATTC_CE_Flag_or_SetType(path_to_raw_dataset, "Flag", "LP")
catches_pivot_IATTC$NumSets <- NULL

colToKeep_captures <- c("FishingFleet", "Gear", "time_start", "time_end", "AreaName", "School", "Species", "CatchType", "CatchUnits", "Catch")
source("https://raw.githubusercontent.com/firms-gta/geoflow-tunaatlas/master/R/sardara_functions/IATTC_CE_catches_pivotDSD_to_harmonizedDSD.R")
catches <- IATTC_CE_catches_pivotDSD_to_harmonizedDSD(catches_pivot_IATTC, colToKeep_captures)

colnames(catches) <- c("fishing_fleet", "gear_type", "time_start", "time_end", "geographic_identifier", "fishing_mode", "species", "measurement_type", "measurement_unit", "measurement_value")
catches$source_authority <- "IATTC"
catches$measurement_type <- "RC" # Retained catches
catches$measurement <- "catch"
catches$measurement_processing_level <- "unknown"

catches$time_start <- as.Date(catches$time_start)
catches$time_end <- as.Date(catches$time_end)
dataset_temporal_extent <- paste(
  paste0(format(min(catches$time_start), "%Y"), "-01-01"),
  paste0(format(max(catches$time_end), "%Y"), "-12-31"),
  sep = "/"
)
```

output in same folder as path_to_raw_dataset

``` r
output_name_dataset <- here::here('R/tunaatlas_scripts/pre-harmonization', 'iattc', 'catch', 'data', 'PublicLPTunaFlag_harmonized.csv')

write.csv(catches, output_name_dataset, row.names = FALSE)
georef_dataset <- catches
```
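The temporal extent computed in this step is rounded to whole calendar years: it runs from 1 January of the earliest start year to 31 December of the latest end year. A small sketch on toy dates illustrates this; the vectors `time_start` and `time_end` below are hypothetical stand-ins for the corresponding columns of `catches`.

```r
# Toy dates standing in for catches$time_start and catches$time_end.
time_start <- as.Date(c("1978-01-01", "1993-04-01"))
time_end <- as.Date(c("1978-02-01", "1993-05-01"))

# Same expression as in the script, applied to the toy vectors.
extent <- paste(
  paste0(format(min(time_start), "%Y"), "-01-01"),
  paste0(format(max(time_end), "%Y"), "-12-31"),
  sep = "/"
)
print(extent)
# [1] "1978-01-01/1993-12-31"
```

Because a December record has a `time_end` on 1 January of the following year, the extent built this way reaches the end of that following year.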
Load pre-harmonization scripts and apply mappings
download.file('https://raw.githubusercontent.com/firms-gta/geoflow-tunaatlas/master/R/tunaatlas_scripts/pre-harmonization/map_codelists_no_DB.R', destfile = 'local_map_codelists_no_DB.R')
source('local_map_codelists_no_DB.R')
fact <- "catch"
mapping_codelist <- map_codelists_no_DB(
  fact,
  mapping_dataset = "https://raw.githubusercontent.com/fdiwg/fdi-mappings/main/global/firms/gta/codelist_mapping_rfmos_to_global.csv",
  dataset_to_map = georef_dataset,
  mapping_keep_src_code = FALSE,
  summary_mapping = TRUE,
  source_authority_to_map = c("IATTC", "CCSBT", "WCPFC")
)
##
## mapping dimension gear_type with code list mapping
##
## mapping dimension species with code list mapping
##
## mapping dimension fishing_fleet with code list mapping
##
## mapping dimension fishing_mode with code list mapping
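Conceptually, mapping one dimension against a code list is a lookup from source codes to target codes. The sketch below shows that idea in base R on toy data. It is not the implementation of `map_codelists_no_DB`: the objects `mapping`, `toy`, `mapped` and `not_mapped` are hypothetical, and labelling unmatched codes as `"UNK"` is an assumption based on the values tested for in the step that follows.

```r
# Hypothetical mapping table using the src_code / trg_code column names.
mapping <- data.frame(
  src_code = c("USA", "MEX"),
  trg_code = c("USA", "MEX"),
  stringsAsFactors = FALSE
)

# Toy dataset with one code ("XXX") that is absent from the mapping table.
toy <- data.frame(
  fishing_fleet = c("USA", "MEX", "XXX"),
  measurement_value = c(6.05, 3.53, 1.00),
  stringsAsFactors = FALSE
)

# Look up each source code; codes without a match come back as NA.
mapped <- mapping$trg_code[match(toy$fishing_fleet, mapping$src_code)]

# Keep track of codes that could not be mapped.
not_mapped <- unique(toy$fishing_fleet[is.na(mapped)])

# Assumption: unmatched codes are labelled "UNK".
toy$fishing_fleet <- ifelse(is.na(mapped), "UNK", mapped)

print(toy)
print(not_mapped)
```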
Handle unmapped values and save the results
georef_dataset <- mapping_codelist$dataset_mapped %>%
  dplyr::mutate(
    fishing_fleet = ifelse(fishing_fleet == 'UNK', 'NEI', fishing_fleet),
    gear_type = ifelse(gear_type == 'UNK', '99.9', gear_type)
  )
data.table::fwrite(mapping_codelist$recap_mapping, here::here('R/tunaatlas_scripts/pre-harmonization', 'iattc', 'catch', 'data', 'PublicLPTunaFlag_recap_mapping.csv'))
data.table::fwrite(mapping_codelist$not_mapped_total, here::here('R/tunaatlas_scripts/pre-harmonization', 'iattc', 'catch', 'data', 'PublicLPTunaFlag_not_mapped_total.csv'))
data.table::fwrite(georef_dataset, here::here('R/tunaatlas_scripts/pre-harmonization', 'iattc', 'catch', 'data', 'PublicLPTunaFlag_CWP_dataset.csv'))
Display the first few rows of the mapping summaries
print(head(mapping_codelist$recap_mapping))
## # A tibble: 6 × 5
## src_code trg_code src_codingsystem trg_codingsystem source_authority
## <chr> <chr> <chr> <chr> <chr>
## 1 MEX MEX flag_iattc fishingfleet_firms IATTC
## 2 USA USA flag_iattc fishingfleet_firms IATTC
## 3 ALB ALB species_iattc species_asfis IATTC
## 4 BKJ BKJ species_iattc species_asfis IATTC
## 5 BZX BZX species_iattc species_asfis IATTC
## 6 PBF PBF species_iattc species_asfis IATTC
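A quick way to read such a summary is to list the codes that the mapping actually changed and to count mapped codes per source coding system. The sketch below does this on the six rows printed above, rebuilt by hand as a data frame; the names `recap`, `recoded` and `counts` are hypothetical.

```r
# The six rows shown in the printed recap_mapping, rebuilt by hand.
recap <- data.frame(
  src_code = c("MEX", "USA", "ALB", "BKJ", "BZX", "PBF"),
  trg_code = c("MEX", "USA", "ALB", "BKJ", "BZX", "PBF"),
  src_codingsystem = c(rep("flag_iattc", 2), rep("species_iattc", 4)),
  trg_codingsystem = c(rep("fishingfleet_firms", 2), rep("species_asfis", 4)),
  source_authority = "IATTC",
  stringsAsFactors = FALSE
)

# Rows where the target code differs from the source code.
recoded <- subset(recap, src_code != trg_code)

# Number of mapped codes per source coding system.
counts <- table(recap$src_codingsystem)

print(nrow(recoded))
print(counts)
```

For these six rows every source code maps to an identical target code, so `recoded` is empty.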