Published May 5, 2026 | Version v1
Preprint Open

Dissecting BERT Layers: FFN Dual Role, Separability-Guided Layer Skip, and Interpretable Classification via Charge-Flow Learning

Authors/Creators

Description

We present a layer-level analysis framework for BERT, evaluated on five GLUE tasks. Using RX (River XAI), a charge-flow-based interpretable learning framework, we replace BERT’s classifier with an interpretable network of 2 to 16 nodes and identify removable layers through separability analysis. Our key contributions are: (1) a separability-guided layer-skip method validated by actual BERT forward-pass experiments on all five tasks; (2) a quantitative decomposition of the FFN’s dual role, 92% structural (norm normalization) vs. 8% classification-relevant, which explains why FFN removal causes model collapse while individual layers appear “harmful” to classification; and (3) an error analysis revealing that 60–93% of misclassifications are high-confidence errors (margin > 0.3), indicating that BERT’s CLS representation itself is the bottleneck. RX is one application of a broader proprietary learning framework developed at River Lab; method specifics are subject to intellectual property protection.
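As a rough illustration of the high-confidence error statistic mentioned in contribution (3), the following sketch computes the share of misclassified examples whose margin exceeds 0.3. The description does not define the margin, so this sketch assumes it is the gap between the largest and second-largest predicted class probabilities; the function name `high_confidence_error_rate` and the toy data are hypothetical and are not taken from the paper.

```python
def high_confidence_error_rate(probs, labels, threshold=0.3):
    """Share of misclassified examples whose margin exceeds `threshold`.

    Assumption (not from the paper): margin = top-1 probability
    minus top-2 probability for each example.
    """
    wrong_margins = []
    for row, label in zip(probs, labels):
        # Predicted class is the index of the largest probability.
        pred = max(range(len(row)), key=row.__getitem__)
        if pred == label:
            continue  # only misclassified examples are counted
        top1, top2 = sorted(row, reverse=True)[:2]
        wrong_margins.append(top1 - top2)
    if not wrong_margins:
        return 0.0  # no errors, so no high-confidence errors
    return sum(m > threshold for m in wrong_margins) / len(wrong_margins)


# Toy binary-classification outputs (illustrative values only).
probs = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8], [0.7, 0.3]]
labels = [1, 1, 0, 0]
rate = high_confidence_error_rate(probs, labels)
```

In this toy data three of the four examples are misclassified, and two of those three have a margin above 0.3. A high value of this ratio on real data would mean that most errors are confidently wrong rather than borderline, which is the pattern the description attributes to the CLS representation.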

Files

river_7 (2).pdf

Files (295.4 kB)

Name: river_7 (2).pdf
Size: 295.4 kB
md5:53de7e3755af7d2e4edbba3d4237fb6d