
# graphMutationTimemaps.pl
## Interactive SVG SARS-CoV-2 mutation timemaps 
### 2021
### email: rwarren [at] bcgsc [dot] ca

### Description

A Perl script that generates interactive SARS-CoV-2 mutation timemaps as scalable vector graphics (SVG), as seen here: https://bcgsc.github.io/SARS2/

Users may adjust parameters to create custom maps of different regions of the SARS-CoV-2 genome, and to control the filters on the minimum number of variants per day/year, datapoint opacity, jurisdictions, year, etc. to suit their specific needs.


### Reference and Documentation 
-----------

If you use our data/maps, please cite:

Warren RL and Birol I. Interactive SARS-CoV-2 mutation timemaps [version 1; peer review: awaiting peer review]. F1000Research 2021, 10:68 (https://doi.org/10.12688/f1000research.50857.1)

The article has detailed explanations on the features of the SVG timemaps.


### Map availability
-----------

Maps are available at:
https://bcgsc.github.io/SARS2/


### Data availability
-----------

To generate custom maps, you will need to download the variant data (updated weekly) from the address below. See "Requirements" below for the specific files.

https://www.bcgsc.ca/downloads/btl/SARS-CoV-2/mutations/


### Running the code
-----------

Make sure you have Perl installed on your system. The script has no library dependencies, but it does require additional files (see "Requirements" below).

Usage: ./graphMutationTimemaps.pl
	<variation.txt (required)>
	<variant effect .tsv (required)>
	<annotation .gff (required)>
	<"continent name", or NA for all (required)>
	<nucleotide/a.a. variant(s) (e.g. "C22227T D614G"), or NA for none (required)>
	<scaling factor (required)>
	<mutations to plot: 1 or 2 = all, 0 = missense only (1=byJurisdiction, 2=byType, default=1)>
	<opacity 0-1 (more to less transparent, default=0.75)>
	<year - optional(default=2020)>
	<basename - optional>
	<buffer (axis start from centre - optional, default=725 pixels)>
	<min. total genome support - optional, default=10>
	<min. daily genome support - optional, default=2>
	<plot data as % (1=yes, 0=no/default - optional)>
	<genome start coordinate (optional, default=1)>
	<genome end coordinate (optional, default=29903)>
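
Because every argument is positional, a long command line is easy to get out of order. The sketch below is one way to keep track of them: it assigns each value to a named shell variable (the variable names are our own, not part of the script) and then assembles the command in the order given in the usage message. The values shown reproduce the "SPIKE MISSENSE 10, 5" example further down.

```shell
# Illustrative variable names (not defined by graphMutationTimemaps.pl itself),
# listed in the same order as the usage message.
variation="totalNEW.txt"             # variation .txt
effect="totalNEWvareffect.tsv"       # variant effect .tsv
annotation="wuhan-hu1-3cleaner.gff3" # annotation .gff
continent="NA"                       # continent name, or NA for all
variants="NA"                        # variant(s) to highlight, or NA for none
scale="4.49"                         # scaling factor
mode="0"                             # 1/2 = all mutations, 0 = missense
opacity="0.25"                       # datapoint opacity, 0-1
year="2020"
out_base="2020SARS2spike_missense"   # basename
buffer="700"                         # axis start from centre, in pixels
min_total="10"                       # min. total genome support
min_daily="5"                        # min. daily genome support
as_percent="0"                       # 1 = plot data as %, 0 = no
start="21500"                        # genome start coordinate
end="25500"                          # genome end coordinate

# Load the command into the positional parameters so that values
# containing spaces (e.g. "North America") stay single arguments.
set -- ./graphMutationTimemaps.pl "$variation" "$effect" "$annotation" \
    "$continent" "$variants" "$scale" "$mode" "$opacity" "$year" \
    "$out_base" "$buffer" "$min_total" "$min_daily" "$as_percent" \
    "$start" "$end"

# Print the command for review (quoting is not shown in this printout).
printf '%s ' "$@"; printf '\n'
```

Once the printed command looks right, replacing the final `printf` line with `"$@"` runs it.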


### Requirements
-----------

When running "graphMutationTimemaps.pl", make sure the following files are in your working directory:

1) country.tsv [included]
2) daynumber2020.tsv (or daynumber2021.tsv) [included]
3) wuhan-hu1-3cleaner.gff3 [included]
4) SARS-CoV-2_gisaid-ntedit-mutation_count-effect.tsv  (unzipped) [not included, must download, see "Data availability"]
5) SARS-CoV-2_gisaid-ntedit-mutation_nonredundantlist.txt (unzipped) [not included, must download, see "Data availability"]
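
A quick pre-flight check can catch a missing input before a run. The following is a minimal sketch, not part of the distributed script: the function name `check_timemap_inputs` is hypothetical, and it assumes that the day-number file needed is the one matching the year being plotted.

```shell
# Hypothetical helper: report required files missing from the current directory.
# Usage: check_timemap_inputs [year]   (year defaults to 2020)
# Returns 0 if every file is present, 1 otherwise.
check_timemap_inputs() {
    year="${1:-2020}"
    missing=0
    # Assumption: daynumber<year>.tsv is the day-number file for the plotted year.
    for f in \
        country.tsv \
        "daynumber${year}.tsv" \
        wuhan-hu1-3cleaner.gff3 \
        SARS-CoV-2_gisaid-ntedit-mutation_count-effect.tsv \
        SARS-CoV-2_gisaid-ntedit-mutation_nonredundantlist.txt
    do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

check_timemap_inputs 2020 || echo "Add the files listed above before running graphMutationTimemaps.pl" >&2
```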


### Notes on rendering SVGs
-----------

Interactive SVGs are best viewed in Firefox (fast) or Chrome (slower).
Safari renders the maps almost instantly, though the mouse-hover highlight decoration may be missing (no information is lost).
User experience may vary with your system.


### Examples
-----------

The examples below use short file names:
1) totalNEW.txt refers to "SARS-CoV-2_gisaid-ntedit-mutation_nonredundantlist.txt"
2) totalNEWvareffect.tsv refers to "SARS-CoV-2_gisaid-ntedit-mutation_count-effect.tsv"
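
One way to get these short names without renaming the downloads is to create symbolic links. This is only a sketch: the helper name `link_short_names` is our own, and it assumes the two unzipped files sit in the directory passed to it.

```shell
# Hypothetical helper: create the short-name symlinks used in the examples.
# Usage: link_short_names [directory]   (defaults to the current directory)
link_short_names() {
    dir="${1:-.}"
    # Link targets are relative, so they resolve inside "$dir" itself.
    ln -sf SARS-CoV-2_gisaid-ntedit-mutation_nonredundantlist.txt "$dir/totalNEW.txt"
    ln -sf SARS-CoV-2_gisaid-ntedit-mutation_count-effect.tsv "$dir/totalNEWvareffect.tsv"
}

# Example call, run from the directory holding the downloaded files:
# link_short_names .
```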


#### Example commands to generate interactive SARS-CoV-2 SVG plots for year 2020

```
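# SVG PLOTS
# Number pairs in the labels below (e.g. "10, 5") are the min. total and min. daily genome support passed to the script
#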
### GENOME ALL 10, 5
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 32.5 1 0.15 2020 2020SARS2all 4300 10 5
### GENOME MISSENSE 10, 5
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 32.5 0 0.15 2020 2020SARS2missense 4300 10 5
### GENOME TYPE 10, 5
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 32.5 2 0.15 2020 2020SARS2type 4300 10 5
### INDELS RELAX
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffectINDELS.tsv wuhan-hu1-3cleaner.gff3 NA NA 32.5 0 0.15 2020 2020SARS2missenseINDELS 4300 10 2 0
### SPIKE ALL 10, 5
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 4.49 1 0.25 2020 2020SARS2spike_all 700 10 5 0 21500 25500
### SPIKE MISSENSE 10, 5
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 4.49 0 0.25 2020 2020SARS2spike_missense 700 10 5 0 21500 25500
### SPIKE MISSENSE CUSTOM
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA "D614G D936Y L54F N439K D253G Q675H S477N A1020S T723I A626S A222V A262S P272L A520S G1167V E583D L18F D215H Q23H V143F E484K L452R W152C D1118H S982A A570D Y145H T716I P681H N501Y N501T N501S" 4.49 0 0.25 2020 2020SARS2spike_custom 700 10 1 0 21500 25500
### SPIKE TYPE
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 4.49 2 0.25 2020 2020SARS2spike_type 700 10 5 0 21500 25500
### SPIKE INDELS RELAX
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffectINDELS.tsv wuhan-hu1-3cleaner.gff3 NA NA 4.49 0 0.25 2020 2020SARS2spike_missenseINDELS 700 10 2 0 21500 25500
#
### GENOME MISSENSE 100, 3
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 32.5 0 0.15 2020 2020SARS2missensevarCov100 4300 100 3
### SPIKE MISSENSE 100, 2
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 4.49 0 0.25 2020 2020SARS2spike_missensevarCov100 725 100 2 0 21500 25500
### SPIKE MISSENSE 10, 2
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 4.49 0 0.25 2020 2020SARS2spike_missensevarCov10 725 10 2 0 21500 25500
### EACH CONTINENT
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 "North America" NA 4.49 0 0.25 2020 2020SARS2spike_missensevarCov10northamerica 725 10 2 0 21500 25500
###
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 "South America" NA 4.49 0 0.25 2020 2020SARS2spike_missensevarCov10southamerica 725 10 2 0 21500 25500
###
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 Europe NA 4.49 0 0.25 2020 2020SARS2spike_missensevarCov10europe 725 10 2 0 21500 25500
###
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 Asia NA 4.49 0 0.25 2020 2020SARS2spike_missensevarCov10asia 725 10 2 0 21500 25500
###
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 Africa NA 4.49 0 0.25 2020 2020SARS2spike_missensevarCov10africa 725 10 2 0 21500 25500
###
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 Oceania NA 4.49 0 0.25 2020 2020SARS2spike_missensevarCov10oceania 725 10 2 0 21500 25500
#####
#
# PERCENTAGE
#
#####
### GENOME ALL 10, 5
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 32.5 1 0.15 2020 2020ratioSARS2all 4300 10 5 1
### GENOME MISSENSE 10, 5
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 32.5 0 0.15 2020 2020ratioSARS2missense 4300 10 5 1
### GENOME TYPE
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 32.5 2 0.15 2020 2020ratioSARS2type 4300 10 5 1
### INDELS RELAX
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffectINDELS.tsv wuhan-hu1-3cleaner.gff3 NA NA 32.5 0 0.15 2020 2020ratioSARS2missenseINDELS 4300 10 2 1
### SPIKE ALL 10, 5
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 4.49 1 0.25 2020 2020ratioSARS2spike_all 700 10 5 1 21500 25500
### SPIKE MISSENSE 10, 5
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 4.49 0 0.25 2020 2020ratioSARS2spike_missense 700 10 5 1 21500 25500
### SPIKE MISSENSE CUSTOM
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA "D614G D936Y L54F N439K D253G Q675H S477N A1020S T723I A626S A222V A262S P272L A520S G1167V E583D L18F D215H Q23H V143F E484K L452R W152C D1118H S982A A570D Y145H T716I P681H N501Y N501T N501S" 4.49 0 0.25 2020 2020ratioSARS2spike_custom 700 10 1 1 21500 25500
### SPIKE TYPE
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 4.49 2 0.25 2020 2020ratioSARS2spike_type 700 10 5 1 21500 25500
### GENOME MISSENSE 100, 3
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 32.5 0 0.15 2020 2020ratioSARS2missensevarCov100 4300 100 3 1
### SPIKE MISSENSE 100, 2
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 4.49 0 0.25 2020 2020ratioSARS2spike_missensevarCov100 725 100 2 1 21500 25500
### SPIKE MISSENSE 10, 2
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 NA NA 4.49 0 0.25 2020 2020ratioSARS2spike_missensevarCov10 725 10 2 1 21500 25500
### SPIKE INDELS RELAX
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffectINDELS.tsv wuhan-hu1-3cleaner.gff3 NA NA 4.49 0 0.25 2020 2020ratioSARS2spike_missenseINDELS 700 10 2 1 21500 25500
### EACH CONTINENT
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 "North America" NA 4.49 0 0.25 2020 2020ratioSARS2spike_missensevarCov10northamerica 725 10 2 1 21500 25500
###
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 "South America" NA 4.49 0 0.25 2020 2020ratioSARS2spike_missensevarCov10southamerica 725 10 2 1 21500 25500
###
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 Europe NA 4.49 0 0.25 2020 2020ratioSARS2spike_missensevarCov10europe 725 10 2 1 21500 25500
###
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 Asia NA 4.49 0 0.25 2020 2020ratioSARS2spike_missensevarCov10asia 725 10 2 1 21500 25500
###
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 Africa NA 4.49 0 0.25 2020 2020ratioSARS2spike_missensevarCov10africa 725 10 2 1 21500 25500
###
./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv wuhan-hu1-3cleaner.gff3 Oceania NA 4.49 0 0.25 2020 2020ratioSARS2spike_missensevarCov10oceania 725 10 2 1 21500 25500
```

Adjust the parameters to create custom maps of different regions of the SARS-CoV-2 genome, different filters on the minimum number of variants per day/year, datapoint opacity, jurisdictions, year, etc.
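
The per-continent commands above differ only in the continent name and the output basename, so they can also be generated in a loop. The sketch below assumes the basename convention seen in the examples (continent name lower-cased with spaces removed); the helper `continent_suffix` and the `run` variable are our own names, not part of the script.

```shell
# Hypothetical helper: turn a continent name into a basename suffix,
# e.g. "North America" -> northamerica.
continent_suffix() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d ' '
}

run="echo"   # dry run: print each command; set run="" to execute instead

for continent in "North America" "South America" Europe Asia Africa Oceania; do
    out="2020SARS2spike_missensevarCov10$(continent_suffix "$continent")"
    # "$continent" is quoted so two-word names stay a single argument
    # (the dry-run printout does not show the quotes).
    $run ./graphMutationTimemaps.pl totalNEW.txt totalNEWvareffect.tsv \
        wuhan-hu1-3cleaner.gff3 "$continent" NA 4.49 0 0.25 2020 "$out" \
        725 10 2 0 21500 25500
done
```

Changing the fourteenth argument from 0 to 1 and the basename prefix to `2020ratioSARS2...` would give the percentage variants of the same plots.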


### License
-------

graphMutationTimemaps.pl Copyright (c) 2021 Rene L Warren.  All rights reserved.

graphMutationTimemaps.pl is released under the GNU General Public License v3. The maps it generates are licensed under CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/).

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, version 3.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.

We hope you enjoy and contribute to this free software.
