This R Markdown document is designed to transform data that is not in CWP format into CWP format. Initially, it changes the format of the data; subsequently, it maps the data to adhere to CWP standards. This markdown is created from a function so the documentation keep the format of roxygen2 skeleton A summary of the mapping process is provided. The path to the dataset is specified, you will find on this same repository on github the first line of each dataset. The datasets are named after the historical name provided by tRFMOs while exporting and may change. The information provided in the Rmd allows to understand correctly which dataset should be used in this markdown. Additional operations are performed next to verify other aspects of the data, such as the consistency of the geolocation, the values, and the reported catches in numbers and tons. If you are interested in further details, the results and codes are available for review.
path_to_raw_dataset <- here::here('tunaatlas_scripts/pre-harmonization', 'iotc', 'effort', 'data', 'IOTC-DATASETS-2023-04-24-CE-Longline_1950-2021.csv')
Harmonize IOTC Coastal and Longline Effort Datasets
This function processes and harmonizes Indian Ocean Tuna Commission (IOTC) effort datasets for coastal and longline fisheries. It prepares the data for integration into the Tuna Atlas database, ensuring compliance with data standardization requirements and optionally including metadata if the dataset is intended for database loading.
@return None; the function outputs files directly, including harmonized datasets, optional metadata, and code lists for integration within the Tuna Atlas database.
@details This function modifies the dataset to include only essential fields, performs any necessary calculations for effort units, and standardizes the format for date fields and geographical identifiers. Metadata integration is contingent on the final use of the dataset within the Tuna Atlas database.
@importFrom dplyr filter mutate @importFrom readr read_csv write_csv @seealso for initial effort data processing, for converting effort data to a standardized structure. @export @keywords IOTC, tuna, fisheries, data harmonization, effort data @author Paul Taconet, IRD @author Bastien Grasset, IRD
if(!require(dplyr)){
install.packages("dplyr")
require(dplyr)
}
Input data sample: Fleet Gear Year MonthStart MonthEnd iGrid Grid Effort EffortUnits QualityCode Source YFT.NO YFT.MT BET.NO BET.MT SKJ.NO SKJ.MT ALB.NO ALB.MT SBF.NO SBF.MT LOT.NO LOT.MT FRZ.NO FRZ.MT KAW.NO KAW.MT MDV LLCO 2014 12 12 5100067 5100067 6000 HOOKS 0 LO NA 0.25 NA 0.55 NA NA NA NA NA NA NA NA NA NA NA NA MDV HAND 2013 2 2 5100069 5100069 2 FDAYS 0 LO NA 74.60 NA NA NA NA NA NA NA NA NA NA NA NA NA NA MDV HAND 2013 3 3 5100069 5100069 1 FDAYS 0 LO NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA MDV HAND 2013 4 4 5100069 5100069 5 FDAYS 0 LO NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA MDV HAND 2013 5 5 5100069 5100069 1 FDAYS 0 LO NA 0.18 NA NA NA NA NA NA NA NA NA NA NA NA NA NA MDV HAND 2013 7 7 5100069 5100069 4 FDAYS 0 LO NA 3.14 NA NA NA NA NA NA NA NA NA NA NA NA NA NA COM.NO COM.MT SWO.NO SWO.MT BILL.NO BILL.MT TUX.NO TUX.MT SKH.NO SKH.MT NTAD.NO NTAD.MT NA NA NA 1.07 NA 0.02 NA NA NA NA NA 0.01 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA Effort: final data sample: Flag Gear time_start time_end AreaName School EffortUnits Effort AUS HAND 2001-04-01 2001-05-01 5221114 IND DAYS 4 AUS HAND 2001-04-01 2001-05-01 5229114 IND DAYS 1 AUS HAND 2001-05-01 2001-06-01 5221114 IND DAYS 3 AUS HAND 2001-06-01 2001-07-01 5221114 IND DAYS 9 AUS HAND 2001-07-01 2001-08-01 5221114 IND DAYS 3 AUS HAND 2001-07-01 2001-08-01 5225113 IND DAYS 1 Historical name for the dataset at source IOTC-DATASETS-2023-04-24-CE-Coastal_1950-2021.csv
opts <- options()
options(encoding = "UTF-8")
#Efforts
colToKeep_efforts <- c("FishingFleet","Gear","time_start","time_end","AreaName","School","EffortUnits","Effort")
source("https://raw.githubusercontent.com/firms-gta/geoflow-tunaatlas/master/R/sardara_functions/FUN_efforts_IOTC_CE.R")
efforts_pivot_IOTC<-FUN_efforts_IOTC_CE(path_to_raw_dataset,11)
source("https://raw.githubusercontent.com/firms-gta/geoflow-tunaatlas/master/R/sardara_functions/IOTC_CE_effort_pivotDSD_to_harmonizedDSD.R")
efforts<-IOTC_CE_effort_pivotDSD_to_harmonizedDSD(efforts_pivot_IOTC,colToKeep_efforts)
colnames(efforts)<-c("fishing_fleet","gear_type","time_start","time_end","geographic_identifier","fishing_mode","measurement_unit","measurement_value")
efforts$source_authority<-"IOTC"
efforts$measurement<-"effort"
efforts$time_start <- as.Date(efforts$time_start)
efforts$time_end <- as.Date(efforts$time_end)
dataset_temporal_extent <- paste(
paste0(format(min(efforts$time_start), "%Y"), "-01-01"),
paste0(format(max(efforts$time_end), "%Y"), "-12-31"),
sep = "/"
)
output_name_dataset <- "Dataset_harmonized.csv"
write.csv(efforts, output_name_dataset, row.names = FALSE)
georef_dataset <- efforts
@ Load pre-harmonization scripts and apply mappings
download.file('https://raw.githubusercontent.com/firms-gta/geoflow-tunaatlas/master/R/tunaatlas_scripts/pre-harmonization/map_codelists_no_DB.R', destfile = 'local_map_codelists_no_DB.R')
source('local_map_codelists_no_DB.R')
fact <- "effort"
mapping_codelist <- map_codelists_no_DB(fact, mapping_dataset = "https://raw.githubusercontent.com/fdiwg/fdi-mappings/main/global/firms/gta/codelist_mapping_rfmos_to_global.csv", dataset_to_map = georef_dataset, mapping_keep_src_code = FALSE, summary_mapping = TRUE, source_authority_to_map = c("IATTC", "CCSBT", "WCPFC", "ICCAT", "IOTC"))
##
## mapping dimension gear_type with code list mapping
## mapping dimension fishing_fleet with code list mapping
## mapping dimension fishing_mode with code list mapping
## mapping dimension measurement_unit with code list mapping
@ Handle unmapped values and save the results
georef_dataset <- mapping_codelist$dataset_mapped %>% dplyr::mutate(fishing_fleet = ifelse(fishing_fleet == 'UNK', 'NEI', fishing_fleet), gear_type = ifelse(gear_type == 'UNK', '99.9', gear_type))
fwrite(mapping_codelist$recap_mapping, 'recap_mapping.csv')
fwrite(mapping_codelist$not_mapped_total, 'not_mapped_total.csv')
fwrite(georef_dataset, 'CWP_dataset.csv')
Display the first few rows of the mapping summaries
print(head(mapping_codelist$recap_mapping))
## # A tibble: 6 × 5
## src_code trg_code src_codingsystem trg_codingsystem source_authority
## <chr> <chr> <chr> <chr> <chr>
## 1 DAYS DAYS effortunit_iotc effortunit_rfmos IOTC
## 2 FDAYS FDAYS effortunit_iotc effortunit_rfmos IOTC
## 3 HOOKS HOOKS effortunit_iotc effortunit_rfmos IOTC
## 4 SETS SETS effortunit_iotc effortunit_rfmos IOTC
## 5 TRIPS TRIPS effortunit_iotc effortunit_rfmos IOTC
## 6 DAYS FDAYS effortunit_wcpfc effortunit_rfmos WCPFC
print(head(mapping_codelist$not_mapped_total))
## Value source_authority Dimension
## 1 AUS IOTC fishing_fleet
## 1645 CHN IOTC fishing_fleet
## 5336 EUESP IOTC fishing_fleet
## 7166 EUFRA IOTC fishing_fleet
## 7177 EUMYT IOTC fishing_fleet
## 7183 EUPRT IOTC fishing_fleet