Structural predictors of analytical fragility in social science: evidence from SCORE Multi100
Description
The SCORE Multi100 programme assigned four to seven independent analysts to each of 100 published social and behavioural science papers, producing 508 analyst-paper rows. The mean paper-level strict agreement rate is 0.36, and 64 papers are fragile (agreement a < 0.5). The paper-level distribution of agreement is starkly bimodal: the 64 fragile papers coincide exactly with the 64 papers satisfying a ≤ 0.2, and no paper lies in the open interval (0.2, 0.8). A two-component Gaussian mixture beats a unimodal fit by ΔBIC ≈ 1190.
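The fragility definition above can be illustrated with a minimal sketch. It assumes, hypothetically, that each analyst-paper row carries a boolean flag for whether the analyst's result agrees with the original, and that paper-level agreement a is the fraction of flagged rows; the record does not spell out the strict-agreement criterion, and the rows below are invented for illustration only.

```python
from collections import defaultdict

# Hypothetical analyst-paper rows: (paper_id, analyst_agrees_with_original).
# These are illustrative values, not data from the replication archive.
rows = [
    ("P01", True), ("P01", True), ("P01", True), ("P01", True), ("P01", False),
    ("P02", False), ("P02", False), ("P02", False), ("P02", False),
    ("P03", True), ("P03", False), ("P03", False), ("P03", False), ("P03", False),
]


def paper_agreement(rows):
    """Return {paper_id: fraction of analyst rows flagged as agreeing}."""
    counts = defaultdict(lambda: [0, 0])  # paper_id -> [agreeing, total]
    for paper_id, agrees in rows:
        counts[paper_id][0] += int(agrees)
        counts[paper_id][1] += 1
    return {paper: k / n for paper, (k, n) in counts.items()}


agreement = paper_agreement(rows)

# Fragile papers as defined in the description: a < 0.5.
fragile = {paper for paper, a in agreement.items() if a < 0.5}

# Papers falling in the open interval (0.2, 0.8), reported as empty in the study.
in_gap = {paper for paper, a in agreement.items() if 0.2 < a < 0.8}

print(agreement, fragile, in_gap)
```

With four to seven analysts per paper, a can only take a small set of fractional values, which is worth keeping in mind when reading the bimodality result.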
From a pre-specified predictor set, two structural correlates of fragility survive the paper's primary inferential standard. First, log sample size correlates positively with agreement at paper resolution (Spearman ρ = 0.22, p = 0.034, n = 93). Second, a transparency-fragility paradox emerges: high-badge papers are more fragile than low-badge papers (mean agreement 0.286 vs. 0.552, Mann-Whitney p = 0.013; paper-level fragility odds ratio OR = 2.86, 95% CI [1.18, 6.92], p = 0.020). The row-level Total_Hours effect (ρ = 0.11, p = 0.018) attenuates under analyst-clustered robust standard errors, under which no row-level predictor reaches α = 0.05.
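As a reading aid, the reported odds ratio, confidence interval, and p-value can be checked for mutual consistency. The sketch below assumes the interval is a Wald interval, symmetric on the log-odds scale; the record does not state how the interval was computed, so this is a plausibility check rather than a reconstruction of the author's method.

```python
import math

# Values as reported in the description above.
odds_ratio, ci_low, ci_high = 2.86, 1.18, 6.92
Z_95 = 1.96  # two-sided 95% normal quantile

# Assumption: the CI is exp(log(OR) +/- 1.96 * SE), i.e. a Wald interval.
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)

# For a Wald interval the geometric mean of the bounds recovers the OR.
centre = math.sqrt(ci_low * ci_high)

# Wald z statistic and two-sided normal p-value.
z = math.log(odds_ratio) / se_log_or
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"SE(log OR) = {se_log_or:.3f}, centre = {centre:.3f}, "
      f"z = {z:.2f}, p = {p_two_sided:.3f}")
```

Under that assumption the geometric mean of the bounds lands close to 2.86 and the implied p-value is approximately 0.020, so the three reported quantities are consistent with one another.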
A pre-specified null result for ||z| − 1.96| (ρ = −0.04, p = 0.69) guards against selective reporting. A cross-validated multivariate logistic classifier attains paper-level AUC = 0.667 ± 0.167 (n = 63) and row-level GroupKFold AUC = 0.625 (n = 474), well above chance but well below ceiling on 100 papers.
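The predictor ||z| − 1.96| measures how far a test statistic sits from the conventional two-sided 5% threshold. A minimal sketch follows, assuming z denotes a reported z-type test statistic (the record does not define it); the function name `distance_to_threshold` and the input values are hypothetical.

```python
def distance_to_threshold(z, critical=1.96):
    """Absolute distance of |z| from the two-sided 5% critical value."""
    return abs(abs(z) - critical)


# Hypothetical z statistics, for illustration only.
z_values = [2.01, -1.90, 3.50, 0.40]
distances = [distance_to_threshold(z) for z in z_values]

for z, d in zip(z_values, distances):
    print(f"z = {z:+.2f} -> distance = {d:.2f}")
```

Small distances mark results sitting just on either side of the significance threshold, regardless of the sign of z.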
Methodology note: the manuscript was produced end-to-end by an autonomous multi-agent research system, architected and operated by the sole human author, who takes full responsibility for all content. The replication archive (OSF DOI 10.17605/OSF.IO/TJEHY) contains the pre-specified analysis plan, evidence manifest, scripts, and SHA-256-anchored data.
Files
| Name | Size | Checksum |
|---|---|---|
| Chabane_2026_SCORE_Multi100.pdf | 681.7 kB | md5:1ca04fd8f5b5aab65497ef6d29b5afcb |
Additional details
Related works
- Is supplemented by
- Dataset: 10.17605/OSF.IO/TJEHY (DOI)