# README

Isoform Interpretation by Expectation Maximization (isopretEM) is a method for inferring isoform-specific functions based on the relationship between sequence similarity and functional similarity of isoforms.

This Zenodo repository archives the versions of the scripts, predictions, and input datasets used for the original publication of the manuscript "An expectation-maximization framework for comprehensive prediction of isoform-specific functions" by Karlebach et al. (preprint available here: https://www.biorxiv.org/content/10.1101/2022.05.13.491897v1).

Most users will want to use the GitHub repository: https://github.com/TheJacksonLaboratory/isopretEM

The original publication corresponds to the tagged release 1.0.0 in the GitHub repository.

## Scripts

- translate_isoforms.R
- predict.R
- combine_tables.R

## Gold-standard isoform-specific dataset

- isoform-literature-curation.tsv

## Predictions

- isoform_function_list_bp.txt (biological process)
- isoform_function_list_cc.txt (cellular component)
- isoform_function_list_mf.txt (molecular function)

## Versions of the input datasets used for the publication

- hgnc_complete_set.txt
- goa_human.gaf
- interpro_domains.txt

Note that users will additionally require the file Homo_sapiens.GRCh38.91.gtf, which is not included in this archive. For the original publication, we used the following version:

- genome-build: GRCh38.p10
- genome-date: 2013-12
- genome-build-accession: NCBI:GCA_000001405.25
- genebuild-last-updated: 2017-06

The file is available here: https://www.ncbi.nlm.nih.gov/data-hub/genome/GCF_000001405.40/
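
To confirm that a local copy of the GTF file matches the version listed above, its header values can be compared programmatically. The following is a minimal Python sketch, not part of isopretEM. It assumes the GTF file begins with Ensembl-style `#!key value` comment lines, and the helper names `read_gtf_header` and `header_mismatches` are hypothetical.

```python
import io

# Header values quoted in this README for the GTF file used in the publication.
EXPECTED = {
    "genome-build": "GRCh38.p10",
    "genome-date": "2013-12",
    "genome-build-accession": "NCBI:GCA_000001405.25",
    "genebuild-last-updated": "2017-06",
}


def read_gtf_header(handle):
    """Collect '#!key value' pairs from the leading comment lines of a GTF file.

    Assumption: header lines come first and each starts with '#!'.
    """
    header = {}
    for line in handle:
        if not line.startswith("#!"):
            break  # stop at the first line that is not a header line
        key, _, value = line[2:].strip().partition(" ")
        header[key] = value
    return header


def header_mismatches(header, expected=EXPECTED):
    """Return {key: (expected, found)} for each key that differs or is missing."""
    return {
        key: (value, header.get(key))
        for key, value in expected.items()
        if header.get(key) != value
    }


# In-memory demonstration with an illustrative header (not read from the real file).
sample = io.StringIO(
    "#!genome-build GRCh38.p10\n"
    "#!genome-version GRCh38\n"
    "#!genome-date 2013-12\n"
    "#!genome-build-accession NCBI:GCA_000001405.25\n"
    "#!genebuild-last-updated 2017-06\n"
    "1\thavana\tgene\t11869\t14409\t.\t+\t.\tgene_id \"ENSG00000223972\";\n"
)
sample_header = read_gtf_header(sample)
sample_result = header_mismatches(sample_header)
```

For a real file, open it with `open("Homo_sapiens.GRCh38.91.gtf")` and pass the handle to `read_gtf_header`. An empty result from `header_mismatches` means all four header values match those listed in this README.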