mllm-shap: A Shapley Value Explainability Platform for Text-Audio Multimodal Large Language Models

Pozorski, Paweł Dominik; Muszyński, Jakub Miłosz; Ganzha, Maria

doi:10.5281/zenodo.19678283

Published April 21, 2026 | Version v1

Publication Open

1. Warsaw University of Technology
2. Systems Research Institute

We present mllm-shap, an open-source Python platform for researchers and ML practitioners that extends Shapley value (SV) explainability from text-only large language models to multimodal LLMs (MLLMs) that jointly process text and audio. Building on the token-level SV framework introduced by TokenSHAP, mllm-shap addresses three challenges absent in the text-only setting: (1) modality-aware coalition masking that handles the coexistence of text tokens and dense audio encoder frames within a single input, (2) multi-turn conversation tracking with per-token role and modality metadata, and (3) audio token grouping via phonetic alignment that reduces the coalition space by 10–50×. The platform ships as a pip-installable package implementing five SV estimation strategies – including a Complementary Contributions estimator with Neyman-optimal allocation that outperforms Monte Carlo baselines – together with an interactive web GUI for real-time attribution visualization. To our knowledge, mllm-shap is the first publicly available framework for complete, reproducible SV-based explainability of text-audio MLLMs. The package is MIT-licensed with full source code on GitHub and a demonstration video included as supplementary material.
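To make the coalition-based estimation described above concrete, the following is a minimal sketch of a plain Monte Carlo (permutation-sampling) Shapley estimator, the kind of baseline the abstract compares against. It is not the mllm-shap API: the names `shapley_permutation` and `toy_value` are hypothetical, and the value function is a toy stand-in for what would in practice be a model score computed on an input in which absent players (text tokens or grouped audio segments) are masked.

```python
import random
from typing import Callable, FrozenSet, List


def shapley_permutation(
    n_players: int,
    value_fn: Callable[[FrozenSet[int]], float],
    n_permutations: int = 200,
    seed: int = 0,
) -> List[float]:
    """Estimate Shapley values by averaging each player's marginal
    contribution over randomly sampled player orderings."""
    rng = random.Random(seed)
    totals = [0.0] * n_players
    order = list(range(n_players))
    for _ in range(n_permutations):
        rng.shuffle(order)
        coalition = set()
        prev = value_fn(frozenset(coalition))  # value of the empty coalition
        for player in order:
            coalition.add(player)
            cur = value_fn(frozenset(coalition))
            totals[player] += cur - prev  # marginal contribution of `player`
            prev = cur
    return [t / n_permutations for t in totals]


# Hypothetical toy game with three players: two text tokens (0, 1) and one
# group of audio frames (2). Assumption: a real value function would query
# the model with every player outside the coalition masked out.
WEIGHTS = [2.0, 1.0, 4.0]


def toy_value(coalition: FrozenSet[int]) -> float:
    score = sum(WEIGHTS[p] for p in coalition)
    if 0 in coalition and 2 in coalition:
        score += 3.0  # interaction between text token 0 and the audio group
    return score


phi = shapley_permutation(3, toy_value, n_permutations=200, seed=0)
```

In this toy game the exact Shapley values are 3.5, 1.0, and 5.5, because the interaction bonus is split evenly between players 0 and 2. The estimates always sum to the value of the full coalition minus that of the empty one, since the marginal contributions telescope within each sampled ordering. Treating a whole run of audio frames as a single player, as done here for player 2, is also why grouping shrinks the coalition space: the number of possible coalitions is 2 to the power of the number of players.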

Files (1.6 MB)

Name	Size
Association_for_Computational_Linguistics__ACL__conference.pdf (md5:07e18aa032fef114c74bdbd6336621b5)	1.6 MB

Additional details

Repository URL: https://github.com/Pawlo77/MLLM-Shap
Programming language: Python
Development Status: Active

	All versions	This version
Views	50	50
Downloads	44	44
Data volume	83.1 MB	83.1 MB
