Published February 27, 2021 | Version v2
Software Open

PPPredSS Code

  • 1. Montana State University
  • 2. University of North Florida

Description

Dependencies

Here is the list of packages required for running the Python code:

pandas
torch
keras
numpy
tqdm
gensim
sklearn
nltk
spacy
networkx
en_core_web_sm
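
Before running the code, it can help to confirm that everything in the list above is importable. The following is a minimal sketch, not part of the repository: the function name missing_packages is hypothetical. It assumes the names above are import names (sklearn is the import name of the scikit-learn distribution) and that en_core_web_sm, the small English spaCy model, is installed as a regular package, which is typically done with python -m spacy download en_core_web_sm.

```python
import importlib.util

# Import names taken from the dependency list above.
REQUIRED = [
    "pandas", "torch", "keras", "numpy", "tqdm", "gensim",
    "sklearn", "nltk", "spacy", "networkx", "en_core_web_sm",
]


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies were found.")
```

The check only locates the packages; it does not import them or verify their versions, since the list above does not pin any.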

How to Run

  1. Unzip the sequences_labels.zip, word2vec_100_10_5.zip, propheno_scoms.zip, and propheno_masks.zip files located in the data directory.
  2. Run the main.py file using the following command:
python main.py \
	--path           path_to_data_folder \
	--epochs         20 \
	--bert_epochs    4 \
	--seeds          15
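
Step 1 can also be done from Python instead of by hand. The sketch below is an illustration under stated assumptions, not part of the repository: it assumes the four archives sit directly in the data directory and should be extracted in place, and the helper name unzip_archives is hypothetical.

```python
import zipfile
from pathlib import Path

# Archive names from step 1 of the instructions.
ARCHIVES = [
    "sequences_labels.zip",
    "word2vec_100_10_5.zip",
    "propheno_scoms.zip",
    "propheno_masks.zip",
]


def unzip_archives(data_dir, names=ARCHIVES):
    """Extract each listed archive found in data_dir into data_dir.

    Returns the names of the archives that were actually extracted.
    """
    data_dir = Path(data_dir)
    extracted = []
    for name in names:
        archive = data_dir / name
        if not archive.is_file():
            continue  # skip archives that are not present
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_dir)
        extracted.append(name)
    return extracted


if __name__ == "__main__":
    # "data" is an assumed location; pass the actual path to the data folder.
    print(unzip_archives("data"))
```

Comparing the returned list with ARCHIVES shows whether any archive was missing before main.py is run.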

Input Parameters

  1. path -> the path to the data folder where the training and test sets are located.
  2. epochs -> the number of epochs for training the CNN and RNN models.
  3. bert_epochs -> the number of epochs for fine-tuning the BERT model.
  4. seeds -> the number of times training is repeated; the results are averaged over the runs.
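
To make the parameters concrete, here is a hedged sketch of how such a command line could be parsed and how results could be averaged over repeated runs. It is not the actual main.py: build_parser and repeat_and_average are hypothetical names, the defaults are simply the values from the example command above, and using the run index as the random seed is an assumption.

```python
import argparse
import random
import statistics


def build_parser():
    """Hypothetical parser mirroring the four documented parameters."""
    parser = argparse.ArgumentParser(description="Sketch of the main.py command line")
    parser.add_argument("--path", required=True,
                        help="data folder with the training and test sets")
    parser.add_argument("--epochs", type=int, default=20,
                        help="training epochs for the CNN and RNN models")
    parser.add_argument("--bert_epochs", type=int, default=4,
                        help="fine-tuning epochs for the BERT model")
    parser.add_argument("--seeds", type=int, default=15,
                        help="number of repeated training runs")
    return parser


def repeat_and_average(train_and_score, n_runs):
    """Call train_and_score(seed) once per run and return the mean score."""
    # Assumption: the run index doubles as the random seed.
    scores = [train_and_score(seed) for seed in range(n_runs)]
    return statistics.mean(scores)


if __name__ == "__main__":
    args = build_parser().parse_args(["--path", "path_to_data_folder"])

    def placeholder_score(seed):
        # Stand-in for real training; returns a seed-dependent number.
        return random.Random(seed).random()

    print(args.epochs, args.bert_epochs, args.seeds)
    print(repeat_and_average(placeholder_score, args.seeds))
```

Passing the scoring function as an argument keeps the averaging logic independent of any particular model, so the same helper could wrap the CNN, RNN, or BERT training loop.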

Jupyter Notebook

A Jupyter notebook version of the code is also available in the root folder and can be used as an alternative to main.py.

Files

PPPredSS-main.zip (472.5 MB)
md5:7ecf7f2f81e642db61fed328bdf20417