1. Install the genome and gene data (see https://github.com/STOmics/SAW/tree/main/Scripts/pre_buildIndexedRef).
2. Create the reference with reference_build.sh.
3. Create the job with data_preprocess.ipynb.
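
Before running steps 2 and 3, it may help to confirm that both files are where you expect them. The sketch below is a minimal pre-flight check, not part of the original workflow: the helper `missing_files` is hypothetical, and it assumes that `reference_build.sh` and `data_preprocess.ipynb` sit together in one working directory.

```python
from pathlib import Path

# File names come from the steps above; keeping them together in a
# single working directory is an assumption made for this sketch.
REQUIRED_FILES = ("reference_build.sh", "data_preprocess.ipynb")


def missing_files(workdir="."):
    """Return the required files that are not present in workdir."""
    root = Path(workdir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required files found.")
```

If the check reports nothing missing, continue with step 2. The arguments that reference_build.sh expects are not covered here, so check the script itself before running it.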