Introduction

This R Markdown document transforms data that are not in CWP format into CWP format. It first restructures the data, then maps the codes so that they adhere to CWP standards. The document is generated automatically from the function at https://raw.githubusercontent.com/eblondel/geoflow-tunaatlas/master/R/tunaatlas_scripts/pre-harmonization/east_pacific_ocean_catch_1deg_1m_bb_tunaatlasiattc_level0__byflag.R, and its documentation keeps the roxygen2 skeleton format.

A summary of the mapping process is provided, and the path to the dataset is specified. The first line of each dataset is available in this same GitHub repository. Datasets are named after the historical file names used by the tRFMOs at export time, and these names may change. The information given in this Rmd should make clear which dataset is to be used with it.

Additional operations are performed next to verify other aspects of the data, such as the consistency of the geolocation, the values, and the reported catches in numbers and tons.
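
The kind of verification mentioned above can be sketched as follows. This is a minimal illustration in base R, not the checks actually run by the workflow: the helper name `check_raw_sanity` and its rules (coordinates within valid ranges, non-negative values) are assumptions made for this example, and the second toy row is deliberately invalid.

```r
# Hypothetical helper: flags rows with implausible coordinates or negative values.
# Column names LatC1 and LonC1 follow the IATTC raw file layout.
check_raw_sanity <- function(df, value_cols) {
  bad_lat <- is.na(df$LatC1) | abs(df$LatC1) > 90
  bad_lon <- is.na(df$LonC1) | abs(df$LonC1) > 180
  # A row is flagged if any of its catch columns is missing or negative
  vals <- as.matrix(df[, value_cols, drop = FALSE])
  bad_val <- rowSums(is.na(vals) | vals < 0) > 0
  data.frame(bad_lat = bad_lat, bad_lon = bad_lon, bad_value = bad_val)
}

# Toy data: row 1 is plausible, row 2 has an invalid latitude and a negative catch
toy <- data.frame(
  LatC1 = c(3.5, 95.0),
  LonC1 = c(-79.5, -114.5),
  SKJ   = c(6.05, -1),
  YFT   = c(4.74, 2.76)
)
flags <- check_raw_sanity(toy, c("SKJ", "YFT"))
print(flags)
```

Returning one logical column per rule, rather than a single pass/fail value, makes it easier to see which check a given row fails.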

If you are interested in further details, the results and code are available for review.

Each .Rmd script requires the user to knit the dataset at the beginning of the script in order to execute the harmonization process correctly. It is also possible to run the code chunk by chunk, but make sure the working directory is the correct one (i.e., the directory containing the .Rmd).

path_to_raw_dataset <- here::here('R/tunaatlas_scripts/pre-harmonization', 'iattc', 'catch', 'data', 'PublicLPTunaFlag.csv')

Harmonize data structure of IATTC LP (Pole-and-line) catch datasets

This function harmonizes the structure of IATTC catch-and-effort datasets specifically for LP (Pole-and-line) catches under the ‘LPTunaFlag’ designation. The function transforms raw dataset inputs into a harmonized format suitable for integration into the Tuna Atlas database. The function assumes specific initial data columns and outputs a structured dataset with additional metadata and code lists as needed.

@return This function does not return a value but outputs harmonized datasets and related files specified by the process for integration within the Tuna Atlas database.

@importFrom dplyr %>% select mutate

@import reshape

@seealso to convert IATTC nominal catch data structure, to convert IATTC LLTunaBillfish and LLShark data structure,

@keywords IATTC, tuna, fisheries, data harmonization

@export

@author Paul Taconet, IRD

@author Bastien Grasset, IRD

  # Input data sample:
  # Year Month Flag LatC1  LonC1 NumSets ALB BET BKJ BZX PBF   SKJ TUN  YFT
  # 1978     1  USA   3.5  -79.5       2   0   0   0   0   0  6.05   0 4.74
  # 1978     1  USA  20.5 -114.5       2   0   0   0   0   0  3.53   0 2.76
  # 1978     1  USA  23.5 -111.5       2   0   0   0   0   0 20.80   0 4.50
  # 1978     1  USA  23.5 -109.5       1   0   0   0   0   0  0.00   0 0.90
  # 1978     1  USA  24.5 -111.5       1   0   0   0   0   0  1.51   0 1.18
  # 1978     1  USA  25.5 -114.5       2   0   0   0   0   0  5.00   0 3.60
  
  # Catch: final data sample:
  # FishingFleet Gear time_start   time_end AreaName School Species CatchType CatchUnits Catch
  #  USA   LL 1992-07-01 1992-08-01  6425135    ALL     BSH       ALL         NO     4
  #  USA   LL 1993-04-01 1993-05-01  6425135    ALL     BSH       ALL         NO    75
  #  USA   LL 1993-04-01 1993-05-01  6430135    ALL     BSH       ALL         NO    15
  #  USA   LL 1993-05-01 1993-06-01  6425135    ALL     BSH       ALL         NO    24
  #  USA   LL 1994-03-01 1994-04-01  6425135    ALL     BSH       ALL         NO    14
  #  USA   LL 1994-03-01 1994-04-01  6430135    ALL     BSH       ALL         NO     4
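
The reshaping from the wide input layout to the long layout can be sketched in base R as below. This is only an illustration under stated assumptions, not the code of `FUN_catches_IATTC_CE_Flag_or_SetType`: the two rows are copied from the input sample above and restricted to two species columns, dropping zero catches is an assumption, and the conversion of `LatC1`/`LonC1` into a CWP grid code (`AreaName`) is left out.

```r
# Two rows copied from the input sample, restricted to the SKJ and YFT columns
raw <- data.frame(
  Year = c(1978L, 1978L), Month = c(1L, 1L), Flag = c("USA", "USA"),
  LatC1 = c(3.5, 23.5), LonC1 = c(-79.5, -109.5), NumSets = c(2L, 1L),
  SKJ = c(6.05, 0.00), YFT = c(4.74, 0.90),
  stringsAsFactors = FALSE
)

species_cols <- c("SKJ", "YFT")
id_cols <- c("Year", "Month", "Flag", "LatC1", "LonC1")

# Wide -> long: stack one block of rows per species column
long <- do.call(rbind, lapply(species_cols, function(sp) {
  data.frame(raw[, id_cols], Species = sp, Catch = raw[[sp]],
             stringsAsFactors = FALSE)
}))

# Assumption: rows with zero catch are not kept in the long format
long <- long[long$Catch > 0, ]
rownames(long) <- NULL

# time_start is the first day of the month, time_end the first day of the next month
next_month <- long$Month %% 12L + 1L
next_year  <- long$Year + (long$Month == 12L)
long$time_start <- as.Date(sprintf("%04d-%02d-01", long$Year, long$Month))
long$time_end   <- as.Date(sprintf("%04d-%02d-01", next_year, next_month))

print(long[, c("Flag", "time_start", "time_end", "Species", "Catch")])
```

The `time_end` convention (first day of the following month) is inferred from the final data sample, where a record starting on 1992-07-01 ends on 1992-08-01.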

packages

if(!require(data.table)){
  install.packages("data.table")
  require(data.table)
}
if(!require(dplyr)){
  install.packages("dplyr")
  require(dplyr)
}
if(!require(reshape2)){
  install.packages("reshape2")
  require(reshape2)
}

Historical name for the dataset at source PublicLPTunaFlag.csv

opts <- options()
options(encoding = "UTF-8")
# Catches
Reach the catches pivot DSD using a function stored in IATTC_functions.R
``` r
source("https://raw.githubusercontent.com/firms-gta/geoflow-tunaatlas/master/R/sardara_functions/FUN_catches_IATTC_CE_Flag_or_SetType.R")
catches_pivot_IATTC <- FUN_catches_IATTC_CE_Flag_or_SetType(path_to_raw_dataset, "Flag", "LP")
catches_pivot_IATTC$NumSets <- NULL

colToKeep_captures <- c("FishingFleet", "Gear", "time_start", "time_end", "AreaName", "School", "Species", "CatchType", "CatchUnits", "Catch")
source("https://raw.githubusercontent.com/firms-gta/geoflow-tunaatlas/master/R/sardara_functions/IATTC_CE_catches_pivotDSD_to_harmonizedDSD.R")
catches <- IATTC_CE_catches_pivotDSD_to_harmonizedDSD(catches_pivot_IATTC, colToKeep_captures)

colnames(catches) <- c("fishing_fleet", "gear_type", "time_start", "time_end", "geographic_identifier", "fishing_mode", "species", "measurement_type", "measurement_unit", "measurement_value")
catches$source_authority <- "IATTC"
catches$measurement_type <- "RC" # Retained catches
catches$measurement <- "catch"
catches$measurement_processing_level <- "unknown"

catches$time_start <- as.Date(catches$time_start)
catches$time_end <- as.Date(catches$time_end)
dataset_temporal_extent <- paste(
  paste0(format(min(catches$time_start), "%Y"), "-01-01"),
  paste0(format(max(catches$time_end), "%Y"), "-12-31"),
  sep = "/"
)
```
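
To make the temporal extent computation concrete, here is the same expression applied to a few toy dates. The dates are hypothetical and stand in for `catches$time_start` and `catches$time_end`; only base R is used.

```r
# Toy dates (hypothetical), standing in for catches$time_start and catches$time_end
time_start <- as.Date(c("1978-01-01", "1978-06-01", "1979-02-01"))
time_end   <- as.Date(c("1978-02-01", "1978-07-01", "1979-03-01"))

# Extent runs from 1 January of the earliest year to 31 December of the latest year
dataset_temporal_extent <- paste(
  paste0(format(min(time_start), "%Y"), "-01-01"),
  paste0(format(max(time_end), "%Y"), "-12-31"),
  sep = "/"
)
print(dataset_temporal_extent)
# [1] "1978-01-01/1979-12-31"
```

One point that may be worth checking on real data: if `time_end` is the first day of the following month, a record for December ends on 1 January of the next year, so the upper bound computed this way would extend one year past the last month with data.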
Output in same folder as path_to_raw_dataset
``` r
output_name_dataset <- here::here('R/tunaatlas_scripts/pre-harmonization', 'iattc', 'catch', 'data', 'PublicLPTunaFlag_harmonized.csv')
write.csv(catches, output_name_dataset, row.names = FALSE)
georef_dataset <- catches
```

Load pre-harmonization scripts and apply mappings

download.file('https://raw.githubusercontent.com/firms-gta/geoflow-tunaatlas/master/R/tunaatlas_scripts/pre-harmonization/map_codelists_no_DB.R', destfile = 'local_map_codelists_no_DB.R')
source('local_map_codelists_no_DB.R')
fact <- "catch"
mapping_codelist <- map_codelists_no_DB(
  fact,
  mapping_dataset = "https://raw.githubusercontent.com/fdiwg/fdi-mappings/main/global/firms/gta/codelist_mapping_rfmos_to_global.csv",
  dataset_to_map = georef_dataset,
  mapping_keep_src_code = FALSE,
  summary_mapping = TRUE,
  source_authority_to_map = c("IATTC", "CCSBT", "WCPFC")
)
## 
##  mapping dimension gear_type with code list mapping
## 
##  mapping dimension species with code list mapping
## 
##  mapping dimension fishing_fleet with code list mapping
## 
##  mapping dimension fishing_mode with code list mapping
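
The idea behind a code list mapping can be sketched for a single dimension as a lookup from source codes to target codes. This is a simplified illustration in base R, not the implementation of `map_codelists_no_DB`: the mapping rows mirror the `src_code`/`trg_code` pairs printed in the mapping summary at the end of this document, the code `ZZZ` is a hypothetical unmapped value, and replacing unmapped codes with `UNK` is an assumption based on the handling of `UNK` in the next step.

```r
# Mapping table for one dimension (source code -> target code)
mapping <- data.frame(
  src_code = c("MEX", "USA"),
  trg_code = c("MEX", "USA"),
  stringsAsFactors = FALSE
)

# Toy dataset; "ZZZ" is a hypothetical code that is absent from the mapping
dataset <- data.frame(
  fishing_fleet = c("USA", "MEX", "ZZZ"),
  measurement_value = c(6.05, 3.53, 1.00),
  stringsAsFactors = FALSE
)

# Look up each source code; NA in idx means no mapping was found
idx <- match(dataset$fishing_fleet, mapping$src_code)
not_mapped <- unique(dataset$fishing_fleet[is.na(idx)])

# Assumption: unmapped codes are replaced by "UNK"
mapped <- mapping$trg_code[idx]
mapped[is.na(idx)] <- "UNK"
dataset$fishing_fleet <- mapped

print(dataset)
print(not_mapped)
```

Keeping the list of unmapped source codes separately makes it possible to report them, in the spirit of the `not_mapped_total` output written by the workflow.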

Handle unmapped values and save the results

georef_dataset <- mapping_codelist$dataset_mapped %>%
  dplyr::mutate(
    fishing_fleet = ifelse(fishing_fleet == 'UNK', 'NEI', fishing_fleet),
    gear_type = ifelse(gear_type == 'UNK', '99.9', gear_type)
  )
data.table::fwrite(mapping_codelist$recap_mapping, here::here('R/tunaatlas_scripts/pre-harmonization', 'iattc', 'catch', 'data', 'PublicLPTunaFlag_recap_mapping.csv'))
data.table::fwrite(mapping_codelist$not_mapped_total, here::here('R/tunaatlas_scripts/pre-harmonization', 'iattc', 'catch', 'data', 'PublicLPTunaFlag_not_mapped_total.csv'))
data.table::fwrite(georef_dataset, here::here('R/tunaatlas_scripts/pre-harmonization', 'iattc', 'catch', 'data', 'PublicLPTunaFlag_CWP_dataset.csv'))

Display the first few rows of the mapping summaries

print(head(mapping_codelist$recap_mapping))
## # A tibble: 6 × 5
##   src_code trg_code src_codingsystem trg_codingsystem   source_authority
##   <chr>    <chr>    <chr>            <chr>              <chr>           
## 1 MEX      MEX      flag_iattc       fishingfleet_firms IATTC           
## 2 USA      USA      flag_iattc       fishingfleet_firms IATTC           
## 3 ALB      ALB      species_iattc    species_asfis      IATTC           
## 4 BKJ      BKJ      species_iattc    species_asfis      IATTC           
## 5 BZX      BZX      species_iattc    species_asfis      IATTC           
## 6 PBF      PBF      species_iattc    species_asfis      IATTC