====================================================
CONTENT OF ARCHIVE "Cherri_models_data_v4.zip"
====================================================

Cherri's model data, containing precomputed models and their training data.

Cherri is a tool for classifying RNA-RNA interaction regions. 
This model collection contains 8 different trained models and can be directly used to evaluate human or mouse data. 
The folder structure splits the data into two parts: models trained with graph features and models trained without. On the next level, the models are named after their original dataset source. Each of these subfolders contains one folder for the feature file and another folder for the trained model.
Additionally, the data used to compute the base MFE model is stored in the folder MFE_data.


Description of folder contents:


First level folders:
====================

MFE_data
Model_with_graph_features
Model_without_graph_features

Model_with_graph_features and Model_without_graph_features each contain 4 different models: the 4 in the first folder are trained with graph features and the 4 in the second folder are trained without.


Second level folders:
=====================
Full
PARIS_human
PARIS_human_RBP
PARIS_human_mouse

The folder names correspond to the dataset source used to train the respective model. 
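
The layout described above can be checked after unpacking the archive. The following is a minimal sketch, assuming the archive was extracted into a directory that you pass as root. The helper names expected_model_dirs and missing_model_dirs are hypothetical and not part of Cherri.

```python
from pathlib import Path

# Folder names taken from the listings above.
FIRST_LEVEL = ["Model_with_graph_features", "Model_without_graph_features"]
SECOND_LEVEL = ["Full", "PARIS_human", "PARIS_human_RBP", "PARIS_human_mouse"]


def expected_model_dirs(root):
    """Return the 8 model folders expected below root (hypothetical helper)."""
    root = Path(root)
    return [root / first / second for first in FIRST_LEVEL for second in SECOND_LEVEL]


def missing_model_dirs(root):
    """Return the expected model folders that do not exist below root."""
    return [path for path in expected_model_dirs(root) if not path.is_dir()]
```

An empty list returned by missing_model_dirs indicates that all 8 model folders were found.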

In the following, "model_name" (written as model_[name] in the file names below) is used as a placeholder to describe the general data structure.


Third level files in Model_[with|without]_graph_features folders:
=====================================================================

feature_filtered_model_[name]_context_150_pos_occ_neg.csv: negative training instances
feature_filtered_model_[name]_context_150_pos_occ_pos.csv: positive training instances
training_data_model_[name]_context_150.npz: positive and negative instances including the graph features, in the Model_with_graph_features folder. 

full_model_[name]_context_150.model: final estimator, trained on the full dataset
 
occupied_regions.obj: pickled InterLap dictionary storing occupied positions 

To use Cherri in evaluation mode, provide full_model_[name]_context_150.model as the model and training_data_model_[name]_context_150.npz as the feature file, and set the structure feature parameter to 'on' or 'off' depending on the chosen input (a model trained with or without graph features).
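
The pairing of model, feature file and structure setting described above can be sketched as follows. This is a minimal sketch under assumptions: evaluation_inputs is a hypothetical helper, not a Cherri function. The value for [name] is supplied by the caller, the files are searched recursively because the names of the feature and model subfolders are not listed here, and the mapping of graph-feature models to 'on' is an assumption based on the folder naming.

```python
from pathlib import Path


def evaluation_inputs(root, dataset, name, with_graph_features=True, context=150):
    """Locate the model and feature file for one model (hypothetical helper).

    root:    directory the archive was extracted into (assumption)
    dataset: second level folder, e.g. "PARIS_human"
    name:    value that replaces [name] in the file names
    """
    kind = "with" if with_graph_features else "without"
    model_dir = Path(root) / f"Model_{kind}_graph_features" / dataset
    wanted = {
        "model": f"full_model_{name}_context_{context}.model",
        "feature_file": f"training_data_model_{name}_context_{context}.npz",
    }
    found = {}
    for key, file_name in wanted.items():
        # Search below the dataset folder; subfolder names are not assumed.
        matches = sorted(model_dir.rglob(file_name))
        if not matches:
            raise FileNotFoundError(f"{file_name} not found below {model_dir}")
        found[key] = matches[0]
    # Assumed mapping: 'on' for models with graph features, otherwise 'off'.
    found["structure"] = "on" if with_graph_features else "off"
    return found
```

The returned paths and the 'on'/'off' value can then be passed to Cherri's evaluation mode by hand.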

If you want to create a mixed model or re-evaluate the current model, use the positive and negative training instances. Set the input RRI directory to the location of model_name and provide model_name as the replicate.


Third level files in the MFE_data folder:
=====================================================================

feature_filtered_model_[name]_MFE_context_150_pos_occ_neg.csv: negative training instances
feature_filtered_model_[name]_MFE_context_150_pos_occ_pos.csv: positive training instances
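
Before retraining, it can help to see how many positive and negative instances a file pair holds. The following is a minimal sketch, assuming each CSV file stores one instance per row below a single header row. Both the header assumption and the helper names count_instances and class_balance are hypothetical, so set has_header to False if your files have no header.

```python
import csv


def count_instances(csv_path, has_header=True):
    """Count data rows in one feature file (assumes one instance per row)."""
    with open(csv_path, newline="") as handle:
        # Skip completely empty lines so trailing newlines are not counted.
        rows = sum(1 for row in csv.reader(handle) if row)
    return max(rows - 1, 0) if has_header else rows


def class_balance(pos_csv, neg_csv, has_header=True):
    """Return (n_pos, n_neg, fraction of positives) for a pos/neg file pair."""
    n_pos = count_instances(pos_csv, has_header)
    n_neg = count_instances(neg_csv, has_header)
    total = n_pos + n_neg
    return n_pos, n_neg, (n_pos / total if total else 0.0)
```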