Published November 18, 2021 | Version v1
Journal article Open

Positional SHAP (PoSHAP) for Interpretation of Machine Learning Models Trained from Biological Sequences

  • 1. Medical College of Wisconsin

Description

Machine learning with multi-layered artificial neural networks, also known as “deep learning,” is effective for making biological predictions. However, model interpretation is challenging, especially for sequential input data used with recurrent neural network architectures. Here, we introduce a framework called “Positional SHAP” (PoSHAP) to interpret models trained from biological sequences by utilizing SHapley Additive exPlanations (SHAP) to generate positional model interpretations. We demonstrate this using three long short-term memory (LSTM) regression models that predict peptide properties, including binding affinity to major histocompatibility complexes (MHC) and collisional cross section (CCS) measured by ion mobility spectrometry. Interpretation of these models with PoSHAP reproduced MHC class I (rhesus macaque Mamu-A1*001 and human A*11:01) peptide binding motifs, reflected known properties of peptide CCS, and provided new insights into interpositional dependencies of amino acid interactions. PoSHAP should have widespread utility for interpreting a variety of models trained from biological sequences.
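The idea of a positional interpretation can be illustrated with a minimal sketch. It assumes that a SHAP explainer has already produced one attribution value per residue for each peptide, and it simply averages those values by amino acid and sequence position. The function name `positional_mean_shap` and the toy numbers are hypothetical and are not taken from the deposited files; this is one plausible way to tabulate positional attributions, not necessarily the authors' implementation.

```python
from collections import defaultdict


def positional_mean_shap(peptides, shap_values):
    """Average per-residue SHAP values by (amino acid, position).

    peptides:    list of equal-indexed peptide strings
    shap_values: list of lists, one SHAP value per residue of each peptide
    Returns a dict mapping (residue, zero-based position) -> mean SHAP value.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for peptide, row in zip(peptides, shap_values):
        if len(peptide) != len(row):
            raise ValueError("each peptide needs one SHAP value per residue")
        for position, (residue, value) in enumerate(zip(peptide, row)):
            sums[(residue, position)] += value
            counts[(residue, position)] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Toy inputs (hypothetical values, for illustration only).
peptides = ["AKT", "AKS", "GKT"]
shap_values = [
    [0.5, 1.0, -0.5],
    [1.5, 2.0, 0.25],
    [-1.0, 3.0, -1.5],
]

table = positional_mean_shap(peptides, shap_values)
for (residue, position), mean_value in sorted(table.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(f"position {position} residue {residue}: mean SHAP {mean_value:+.2f}")
```

Arranged as a residue-by-position grid, such a table could be drawn as a heatmap in which consistently high or low cells suggest positions where a given amino acid pushes the prediction up or down.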

Files

20201230_all5_xtest.txt

Files (47.6 MB)

Checksum                                Size
md5:434265fe7ee111f2839fd968dbd1e833    1.5 MB
md5:295432e7a3af2be458ad458a1ae257df    11.0 MB
md5:ca4ee3715ffa2ac2dbf069596cb5b277    2.8 MB
md5:d745482ee6cfda2de97cbea6153662eb    770.8 kB
md5:0b41fbe1d3706bd4bd83bdc0725e0eb3    5.5 MB
md5:b8f2861f2c2042519c9685f7198d0e28    1.4 MB
md5:0870360be500cdba5b3c7017d5fd8b20    1.2 MB
md5:2059ff7d70f0b27a1cf8d2822fd401aa    8.3 MB
md5:306bd3bf71d17c8bfbf32f97859401d4    2.1 MB
md5:89f102c52ab5c18dc57931ef9076604d    119.6 kB
md5:b04e33f92775727619d6412b839deefb    861.5 kB
md5:2286086a5d8047aa985af84bd601bbc6    214.7 kB
md5:28c796dd0d648f0914cd5ce06670fd3f    157.9 kB
md5:8e974e2e35a54f85ab72e92c88bd40e0    1.1 MB
md5:d1cd559583cbe9984216fc95d516f8d8    284.1 kB
md5:8db0e0167d3df1bdbe23279cedf91405    16.4 kB
md5:7969d526321f333c50d083c1a43b9e8f    118.2 kB
md5:14d56488884621cd2c9c1bf627ec449c    29.6 kB
md5:f3c0d84ebabc5199887ae66aea1de664    1.2 MB
md5:a46c034b2291d04a08b47556a4a72928    27.3 kB
md5:be5e470f769384099e85d3339de48985    13.6 kB
md5:ee4a672024aa311eb6537adb8a74e04d    1.2 MB
md5:207db7395aed8dd31e07d569268eba11    475.5 kB
md5:fa39b9ddd5fc577267eb23a4b777658c    20.5 kB
md5:46d79236894faaa8b9ec916739d7592f    528.0 kB
md5:7b647400d4772f0e753f601be06fe926    379.8 kB
md5:9f8fd29ca0fce414722aedde342f342b    597 Bytes
md5:5245b1e8361c9bea1c5b1643c13dd653    6.3 MB
md5:5d4765f192cc466048535c83d6e705f6    8.2 kB