Published April 28, 2026 | Version v2
Technical note Open

Approximating the Shapley Value with Many Players: A Hybrid Exact/Monte Carlo Estimator

  • 1. CIRPEE & PEP

Description

Computing the Shapley value exactly requires evaluating v(·) over all 2^n coalitions, which is intractable for n ≳ 25. This paper asks how to draw coalitions efficiently when exact computation is infeasible. We establish a level-stratified framework that clarifies the variance ordering of all existing drawing strategies, and propose two new estimators. The Hybrid Exact/Sampling estimator enumerates exactly the coalition-size levels that contain few coalitions and samples only the large middle levels, achieving strictly lower variance than any purely stochastic method. The Neyman optimal-allocation variant minimises variance when within-level dispersion is heterogeneous. A unified comparative analysis covers the full spectrum of current methods: the permutation sampler (Castro et al., 2009), the level-stratified sampler (Maleki et al., 2013), the KernelSHAP family (Lundberg & Lee, 2017; Covert & Lee, 2021; Olsen & Jullum, 2025), Owen/antithetic samplers (KhademSohi et al., 2025), and the non-asymptotic framework of Chen et al. (NeurIPS 2025). The bounded marginal impact index C(v) precisely characterises the conditions under which each method dominates. For the broad class of pseudo-continuous functions (C ≪ n), the hybrid reduces RMSE by 36–92% relative to the permutation sampler and by one to two orders of magnitude relative to KernelSHAP at equal computational cost. A scaling experiment at n = 60, using a characteristic function with a closed-form Shapley value, confirms that the advantage is preserved at larger n and establishes a practical threshold rule: setting τ = C(n−1, 2), where C(n−1, 2) here denotes the binomial coefficient "n−1 choose 2" rather than the index C(v), maintains six exact levels regardless of n.
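The hybrid idea described above can be illustrated with a minimal Python sketch. It is not the implementation from the note. It assumes the standard level decomposition of the Shapley value (the average, over coalition sizes s = 0, ..., n−1, of the mean marginal contribution at size s), treats τ as a cap on the number of coalitions in a level, and draws the same number of coalitions from every sampled level (no Neyman allocation). The names hybrid_shapley, count_exact_levels, samples_per_level, and the quadratic test game are hypothetical choices made for illustration only.

```python
import itertools
import math
import random


def count_exact_levels(n, tau):
    """Number of coalition-size levels s with at most tau coalitions."""
    return sum(math.comb(n - 1, s) <= tau for s in range(n))


def hybrid_shapley(v, n, i, tau, samples_per_level, rng=None):
    """Estimate player i's Shapley value in an n-player game v.

    Levels with at most tau coalitions are enumerated exactly; the
    remaining levels are estimated by uniform sampling (an assumed,
    equal-allocation scheme, not necessarily the one in the note).
    """
    rng = rng or random.Random(0)
    others = [j for j in range(n) if j != i]
    total = 0.0
    for s in range(n):  # s = size of the coalition that player i joins
        n_coalitions = math.comb(n - 1, s)
        if n_coalitions <= tau:
            # Small level: exact mean marginal contribution.
            coalitions = (frozenset(c) for c in itertools.combinations(others, s))
            level_mean = sum(v(S | {i}) - v(S) for S in coalitions) / n_coalitions
        else:
            # Large level: Monte Carlo mean over uniformly drawn coalitions.
            draws = (frozenset(rng.sample(others, s)) for _ in range(samples_per_level))
            level_mean = sum(v(S | {i}) - v(S) for S in draws) / samples_per_level
        total += level_mean
    return total / n  # each level carries weight 1/n


if __name__ == "__main__":
    # Hypothetical test game: v(S) = (sum of weights in S)^2, whose
    # Shapley value for player i is w_i times the total weight.
    weights = list(range(1, 11))
    game = lambda S: sum(weights[j] for j in S) ** 2
    n, i = len(weights), 3
    tau = math.comb(n - 1, 2)
    print("exact levels:", count_exact_levels(n, tau))
    print("estimate:", hybrid_shapley(game, n, i, tau, samples_per_level=200))
    print("closed form:", weights[i] * sum(weights))
```

Under these assumptions, only the middle levels contribute sampling noise, so raising τ moves more levels into the exactly enumerated set at the cost of more evaluations of v.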

Files

Shapley_Hybrid_Araar_April_2026.pdf (412.1 kB)
md5:41f96d2a8ee8dceb407a2862707a1507

Additional details

Dates

Created
2026-04-16